Commit Graph

444 Commits

Author SHA1 Message Date
Willy Tarreau
655dce90d4 [MINOR] http: create new MSG_BODY sub-states
An HTTP message can be decomposed into several sub-states depending
on the transfer-encoding. We'll have to keep these state information
while parsing chunks, so we must extend the values. In order not to
change everything, we'll now consider that anything >= MSG_BODY is
the body, and that the value indicates the precise state. The
MSG_ERROR status which was greater than MSG_BODY was moved for this.
2009-11-08 13:10:58 +01:00
Krzysztof Piotr Oledzki
de71d16ec0 [MINOR] Collect & provide http response codes for frontends, fix backends
This patch extends and corrects the functionality introduced by
"Collect & provide http response codes received from servers":
 - responses are now also accounted for frontends
 - backend's and frontend's counters are incremented based
   on responses sent to client, not received from servers
2009-10-27 21:56:47 +01:00
Willy Tarreau
8e89b84848 [MINOR] http: remove the last call to stream_int_return
And remove the now unused function itself too.
2009-10-18 23:56:35 +02:00
Willy Tarreau
b50943e717 [MINOR] http response: update the TX_CLI_CONN_KA flag on rewrite
If we modify the "connection" header we send to the client,
update the TX_CLI_CONN_KA flag.
2009-10-18 23:53:19 +02:00
Willy Tarreau
b8c82c295b [MEDIUM] http response: check body length and set transaction flags
We also check the close status and terminate the server persistent
connection if appropriate. Note that since this change, we'll not
get any "Connection: close" headers added to HTTP/1.0 responses
anymore, which is good.
2009-10-18 23:45:12 +02:00
Willy Tarreau
75a5fef4d2 [MINOR] http: pre-set the persistent flags in the transaction
We should pre-set the persistent flags then try to clear them
instead of the opposite.
2009-10-18 23:43:57 +02:00
Willy Tarreau
b37c27e28f [MAJOR] http: create the analyser which waits for a response
The code part which waits for an HTTP response has been extracted
from the old function. We now have two analysers and the second one
may re-enable the first one when an 1xx response is encountered.
This has been tested and works.

The calls to stream_int_return() that were remaining in the wait
analyser have been converted to stream_int_retnclose().
2009-10-18 23:15:41 +02:00
Willy Tarreau
2225dd4421 [MEDIUM] http request: make use of pre-parsed transfer-encoding header
This change should go a bit further. We should have a dedicated analyser
to find and skip chunks.
2009-10-18 21:36:47 +02:00
Willy Tarreau
03a5633299 [MEDIUM] http request: simplify POST length detection
We can now rely on the pre-parsed content-length and transfer-encoding
to find what the supposed body length will be.
2009-10-18 21:28:29 +02:00
Willy Tarreau
349a0f62b5 [MINOR] http request: simplify the test of no-data
Now we can rely on (chunked && !hdr_content_len) to stop forwarding
data. We only do that for known methods that are not CONNECT though.
2009-10-18 21:19:33 +02:00
Willy Tarreau
4273664a1b [MINOR] http request: update the TX_SRV_CONN_KA flag on rewrite
If we modify the "connection" header we send to the server,
update the TX_SRV_CONN_KA flag.
2009-10-18 21:10:21 +02:00
Willy Tarreau
32b47f42a0 [MEDIUM] http request: parse connection, content-length and transfer-encoding
Store those elements in the transaction. RFC2616 is strictly followed.
Note that requests containing two different content-length fields are
discarded as invalid.
2009-10-18 20:59:20 +02:00
Cyril Bonté
bf47aeb946 [MEDIUM] appsession: add the "request-learn" option
This patch has 2 goals :

1. I wanted to test the appsession feature with a small PHP code,
using PHPSESSID. The problem is that when PHP gets an unknown session
id, it creates a new one with this ID. So, when sending an unknown
session to PHP, persistance is broken : haproxy won't see any new
cookie in the response and will never attach this session to a
specific server.

This also happens when you restart haproxy : the internal hash becomes
empty and all sessions loose their persistance (load balancing the
requests on all backend servers, creating a new session on each one).
For a user, it's like the service is unusable.

The patch modifies the code to make haproxy also learn the persistance
from the client : if no session is sent from the server, then the
session id found in the client part (using the URI or the client cookie)
is used to associated the server that gave the response.

As it's probably not a feature usable in all cases, I added an option
to enable it (by default it's disabled). The syntax of appsession becomes :

  appsession <cookie> len <length> timeout <holdtime> [request-learn]

This helps haproxy repair the persistance (with the risk of losing its
session at the next request, as the user will probably not be load
balanced to the same server the first time).

2. This patch also tries to reduce the memory usage.
Here is a little example to explain the current behaviour :
- Take a Tomcat server where /session.jsp is valid.
- Send a request using a cookie with an unknown value AND a path
  parameter with another unknown value :

  curl -b "JSESSIONID=12345678901234567890123456789012" http://<haproxy>/session.jsp;jsessionid=00000000000000000000000000000001

(I know, it's unexpected to have a request like that on a live service)
Here, haproxy finds the URI session ID and stores it in its internal
hash (with no server associated). But it also finds the cookie session
ID and stores it again.

- As a result, session.jsp sends a new session ID also stored in the
  internal hash, with a server associated.

=> For 1 request, haproxy has stored 3 entries, with only 1 which will be usable

The patch modifies the behaviour to store only 1 entry (maximum).
2009-10-18 11:56:26 +02:00
Willy Tarreau
f1ba4b3de5 [MAJOR] buffer: flag BF_DONT_READ to disable reads when not required
When processing a GET or HEAD request in close mode, we know we don't
need to read anything anymore on the socket, so we can disable it.
Doing this can save up to 40% of the recv calls, and half of the
epoll_ctl calls.

For this we need a buffer flag indicating that we're not interesting in
reading anymore. Right now, this flag also disables both polled reads.
We might benefit from disabling only speculative reads, but we will need
at least this flag when we want to support keepalive anyway.

Currently we don't disable the flag on completion, but it does not
matter as we close ASAP when performing the shutw().
2009-10-18 08:52:24 +02:00
Willy Tarreau
7859991dd7 [MINOR] http: detect connection: close earlier
Till now we would only set SN_CONN_CLOSED after rewriting it. Now we
set it just after checking the Connection header so that we can use
the result later if required.
2009-10-17 20:15:29 +02:00
Krzysztof Piotr Oledzki
5fb1882514 [MINOR] Collect & provide http response codes received from servers
Additional data is provided on both html & csv stats:
 - html: when passing a mouse over Sessions -> Total (servers, backends)
 - cvs: by 6 additional fields (hrsp_1xx, hrsp_2xx, hrsp_3xx, hrsp_4xx, hrsp_5xx, hspr_other)

Patch inspired by:
 http://www.formilux.org/archives/haproxy/0910/2528.html
 http://www.formilux.org/archives/haproxy/0910/2529.html
2009-10-14 21:49:53 +02:00
Krzysztof Piotr Oledzki
6f61b21524 [BUG] Fix NULL pointer dereference in stats_check_uri_auth(), v2
Recent "struct chunk rework" introduced a NULL pointer dereference
and now haproxy segfaults if auth is required for stats but not found.

The reason is that size_t cannot store negative values, but current
code assumes that "len < 0" == uninitialized.

This patch fixes it.
2009-10-04 23:44:45 +02:00
Krzysztof Piotr Oledzki
aeebf9ba65 [MEDIUM] Collect & provide separate statistics for sockets, v2
This patch allows to collect & provide separate statistics for each socket.
It can be very useful if you would like to distinguish between traffic
generate by local and remote users or between different types of remote
clients (peerings, domestic, foreign).

Currently no "Session rate" is supported, but adding it should be possible
if we found it useful.
2009-10-04 18:56:02 +02:00
Krzysztof Piotr Oledzki
052d4fd07d [CLEANUP] Move counters to dedicated structures
Move counters from "struct proxy" and "struct server"
to "struct pxcounters" and "struct svcounters".

This patch should make no functional change.
2009-10-04 18:32:39 +02:00
Willy Tarreau
b0c9bc4f95 [MEDIUM] stats: make HTTP stats use an I/O handler
Doing this, we can remove the last BF_HIJACK user and remove
produce_content(). s->data_source could also be removed but
it is currently used to detect if the stats or a server was
used.
2009-10-04 15:56:38 +02:00
Krzysztof Piotr Oledzki
78abe618a8 [MAJOR] struct chunk rework
Add size to struct chunk and simplify the code as there is
no longer required to pass sizeof in chunk_printf().
2009-10-01 10:17:37 +02:00
Willy Tarreau
2f9cc8ab52 [BUG] http stats: large outputs sometimes got some parts chopped off
Due to a misplaced call to stream_int_retnclose(), the stats output
buffer was erased before each call to produce_content(), resulting
in missing pieces in the stats output if the connection was not
fast enough between haproxy and the client.
2009-09-24 22:22:18 +02:00
Willy Tarreau
56a560aef4 [MEDIUM] stats: prepare the connection for closing before dumping
We will need to modify the stats dump functions so that they can
be used in interactive mode. For this, we want their caller to
prepare the connection for a close, not themselves to do it.
Let's simply move the stream_int_retnclose() out.
2009-09-23 23:52:16 +02:00
Willy Tarreau
520d95e42b [MAJOR] buffers: split BF_WRITE_ENA into BF_AUTO_CONNECT and BF_AUTO_CLOSE
The BF_WRITE_ENA buffer flag became very complex to deal with, because
it was used to :
  - enable automatic connection
  - enable close forwarding
  - enable data forwarding

The last point was not very true anymore since we introduced ->send_max,
but still the test remained everywhere. This was causing issues such as
impossibility to connect without forwarding data, impossibility to prevent
closing when data was forwarded, etc...

This patch clarifies the situation by getting rid of this multi-purpose
flag and replacing it with :
  - data forwarding based only on ->send_max || ->pipe ;
  - a new BF_AUTO_CONNECT flag to allow automatic connection and only
    that ;
  - ability to perform an automatic connection when ->send_max or ->pipe
    indicate that data is waiting to leave the buffer ;
  - a new BF_AUTO_CLOSE flag to let the producer automatically set the
    BF_SHUTW_NOW flag when it gets a BF_SHUTR.

During this cleanup, it was discovered that some tests were performed
twice, or that the BF_HIJACK flag was still tested, which is not needed
anymore since ->send_max replcaed it. These places have been fixed too.

These cleanups have also revealed a few areas where the other flags
such as BF_EMPTY are not cleanly used. This will be an opportunity for
a second patch.
2009-09-19 21:14:54 +02:00
Willy Tarreau
816b979977 [MAJOR] http: add support for HTTP 1xx informational responses
HTTP supports status codes 100 and 101 to report protocol indications,
which are followed by the requests's response. Till now, haproxy would
only see those responses without parsing subsequent ones. That means
that cookie additions were only performed on 1xx messages for instance,
which does not work since headers must be ignored with 1xx messages.
Also, logs were not terribly useful with the common 100 status code
in response to "Expect: 100-continue" during POST some requests.

This change adds support for such messages. Now haproxy sees them,
forwards them and skips them until it finds a correct response, which
it logs and processes. As an exception, header removal/rewriting still
work on 1xx responses in order to be able to strip out sensible
information that may have accidentely been left by another equipment
(possibly an older haproxy itself). But headers addition are disabled
however.

This change brings the ability to loop on response without data, which
is a starting point to support keepalive. The change is marked as major
as a few fixes had to be performed in the HTTP message parser.
2009-09-19 14:53:47 +02:00
Willy Tarreau
106f979bbd [MINOR] acl: add support for hdr_ip to match IP addresses in headers
For x-forwarded-for and such headers, it's sometimes needed to match
based on network addresses. Let's use hdr_ip() for that.
2009-09-19 14:47:49 +02:00
Willy Tarreau
c465fd7836 [BUG] tarpit did not work anymore
Tarpit was broken by recent splitting of analysers. It would still
let the connection go to the server due to a missing buffer_write_dis().
Also, it was performed too late (after content switching rules).
2009-08-31 00:17:18 +02:00
Willy Tarreau
a07a34eb24 [MEDIUM] replace BUFSIZE with buf->size in computations
The first step towards dynamic buffer size consists in removing
all static definitions of the buffer size. Instead, we store a
buffer's size in itself. Right now they're all preinitialized
to BUFSIZE, but we will change that.
2009-08-16 23:27:46 +02:00
Willy Tarreau
52a0c60845 [MINOR] set s->srv_error according to the analysers
s->srv_error was set depending on the frontend's protocol. Now it is
set by the HTTP analyser, so that even when switching from a TCP
frontend to an HTTP backend, we can have HTTP error messages.
2009-08-16 22:45:38 +02:00
Willy Tarreau
b55932ddaf [MEDIUM] remove old experimental tcpsplice option
This Linux-specific option was never really used in production and
has since been superseded by new splicing options brought by recent
Linux kernels.

It caused several particular cases in the code because the kernel
would take care of the session without haproxy being able to do
anything on it, which became hard to handle in the new architecture.

Let's simply get rid of it now that there is a replacement available.
2009-08-16 13:20:32 +02:00
Emeric Brun
3a058f3091 [MINOR] add a new CLF log format
Appending the "clf" word after "option httplog" turns the HTTP log
format into a CLF format, more suited for certain tools.
2009-07-14 12:50:40 +02:00
Willy Tarreau
cd7afc0a13 [MINOR] http: take http request timeout from the backend
Since we can now switch from TCP to HTTP, we need to be able to apply
the HTTP request timeout after switching. That means we need to take
it from the backend and not from the frontend. Since the backend points
to the frontend before switching, that changes nothing for the normal
case.
2009-07-12 10:03:17 +02:00
Willy Tarreau
51aecc76f8 [MEDIUM] allow a TCP frontend to switch to an HTTP backend
This patch allows a TCP frontend to switch to an HTTP backend.
During the switch, missing structures are automatically allocated.
The HTTP parser is enabled so that the backend first waits for a
full HTTP request.
2009-07-12 09:47:04 +02:00
Willy Tarreau
2492d5b4d6 [MINOR] acl: add HTTP protocol detection (req_proto_http)
Now that we can perform TCP-based content switching, it makes sense
to be able to detect HTTP traffic and act accordingly. We already
have an HTTP decoder, we just have to call it in order to detect HTTP
protocol. Note that since the decoder will automatically fill in the
interesting fields of the HTTP transaction, it would make sense to
use this parsing to extend HTTP matching to TCP.
2009-07-12 08:06:20 +02:00
Willy Tarreau
1d0dfb155d [MAJOR] http: complete splitting of the remaining stages
The HTTP processing has been splitted into 7 steps, one of which
is not anymore HTTP-specific (content-switching). That way, it
becomes possible to use "use_backend" rules in TCP mode. A new
"use_server" directive should follow soon.
2009-07-07 15:10:31 +02:00
Willy Tarreau
3a816293e9 [MEDIUM] session: tell analysers what bit they were called for
Some stream analysers might become generic enough to be called
for several bits. So we cannot have the analyser bit hard coded
into the analyser itself. Let's make the caller inform the callee.
2009-07-07 10:55:49 +02:00
Willy Tarreau
d787e6648c [MEDIUM] http: split request waiter from request processor
We want to split several steps in HTTP processing so that
we can call individual analysers depending on what processing
we want to perform. The first step consists in splitting the
part that waits for a request from the rest.
2009-07-07 10:14:51 +02:00
Willy Tarreau
571ec98baa [CLEANUP] remove unused DEBUG_PARSE_NO_SPEEDUP define
This one has become useless with the new HTTP parser.
2009-07-07 08:56:15 +02:00
Willy Tarreau
06b917c7ab [BUG] http: redirect rules were processed too early
redirect rules are documented as being processed last before
use_backend but were mistakenly processed before block rules.
Fortunately very few people use a mix of block and redirect
rules, so this bug has never been reported yet.
2009-07-06 16:34:52 +02:00
Willy Tarreau
c9bd0cc224 [MINOR] add options dontlog-normal and log-separate-errors
Some big traffic sites have trouble dealing with logs and tend to
disable them. Here are two new options to help cope with massive
logs.

  - dontlog-normal only disables logging for 100% successful
    connections, other ones will still be logged

  - log-separate-errors will cause non-100% successful connections
    to be logged at level "err" instead of level "info" so that a
    properly configured syslog daemon can send them to a different
    file for longer conservation.
2009-05-10 11:57:02 +02:00
Maik Broemme
2850cb42b6 [MINOR] add X-Original-To: header
I have attached a patch which will add on every http request a new
header 'X-Original-To'. If you have HAProxy running in transparent mode
with a big number of SQUID servers behind it, it is very nice to have
the original destination ip as a common header to make decisions based
on it.

The whole thing is configurable with a new option 'originalto'. I have
updated the sourcecode as well as the documentation. The 'haproxy-en.txt'
and 'haproxy-fr.txt' files are untouched, due to lack of my french
language knowledge. ;)

Also the patch adds this header for IPv4 only. I haven't any IPv6 test
environment running here and don't know if getsockopt() with SO_ORIGINAL_DST
will work on IPv6. If someone knows it and wants to test it I can modify
the diff. Feel free to ask me questions or things which should be changed. :)

--Maik
2009-05-01 16:22:33 +02:00
Willy Tarreau
2df8d713b3 [BUG] fix wrong pointer arithmetics in HTTP message captures
The pointer arithmetics was wrong in http_capture_bad_message().
This has no impact right now because the error only msg->som was
affected and right now it's always 0. But this was a bug waiting
for keepalive support to strike.
2009-05-01 11:33:17 +02:00
Willy Tarreau
1772ece025 [MINOR] fix several printf formats and missing arguments
Last patch revealed a number of mistakes in printf-like calls, mostly int/long
mismatches, and a few missing arguments.
2009-04-03 14:49:12 +02:00
Willy Tarreau
4076a15255 [MEDIUM] http: capture invalid requests/responses even if accepted
It's useful to be able to accept an invalid header name in a request
or response but still be able to monitor further such errors. Now,
when an invalid request/response is received and accepted due to
an "accept-invalid-http-{request|response}" option, the invalid
request will be captured for later analysis with "show errors" on
the stats socket.
2009-04-02 21:36:37 +02:00
Willy Tarreau
32a4ec0ed7 [MEDIUM] http: add options to ignore invalid header names
Sometimes it is required to let invalid requests pass because
applications sometimes take time to be fixed and other servers
do not care. Thus we provide two new options :

     option accept-invalid-http-request  (for the frontend)
     option accept-invalid-http-response (for the backend)

When those options are set, invalid requests or responses do
not cause a 403/502 error to be generated.
2009-04-02 21:36:34 +02:00
Willy Tarreau
2ab85e6fee [BUG] don't set an expiration date directly from now_ms
now_ms can be zero, don't set ->analyse_exp directly from it, we
must use tick_add() instead.
2009-03-29 10:24:15 +02:00
Willy Tarreau
1b194fe03e [OPTIM] buffer: new BF_READ_DONTWAIT flag reduces EAGAIN rates
When the reader does not expect to read lots of data, it can
set BF_READ_DONTWAIT on the request buffer. When it is set,
the stream_sock_read callback will not try to perform multiple
reads, it will return after only one, and clear the flag.
That way, we can immediately return when waiting for an HTTP
request without trying to read again.

On pure request/responses schemes such as monitor-uri or
redirects, this has completely eliminated the EAGAIN occurrences
and the epoll_ctl() calls, resulting in a performance increase of
about 10%. Similar effects should be observed once we support
HTTP keep-alive since we'll immediately disable reads once we
get a full request.
2009-03-21 21:57:30 +01:00
Willy Tarreau
8365f9335d [CLEANUP] http: remove some commented out obsolete code in process_response 2009-03-15 23:11:49 +01:00
Willy Tarreau
6bf1736fb1 [BUILD] proto_http did not build on gcc-2.95 (again)
move the DPRINTF below the local variable declarations.
(cherry picked from commit 7b92db4cd5)

The patch accidently got reverted.
2009-03-08 23:10:34 +01:00
Willy Tarreau
6f0aa476bd [CLEANUP] buffer_flush() was misleading, rename it as buffer_erase 2009-03-08 20:33:29 +01:00
Willy Tarreau
ec22b2c27a [CLEANUP] remove last references to term_trace
term_trace was very useful while reworking the lower layers but has almost
completely been removed from every place it was referenced. Even the few
remaining ones were not accurate, so it's better to completely remove those
references and re-add them from scratch later if needed.
2009-03-06 13:07:40 +01:00
Willy Tarreau
7f062c4193 [MEDIUM] measure and report session rate on frontend, backends and servers
With this change, all frontends, backends, and servers maintain a session
counter and a timer to compute a session rate over the last second. This
value will be very useful because it varies instantly and can be used to
check thresholds. This value is also reported in the stats in a new "rate"
column.
2009-03-05 18:43:00 +01:00
Willy Tarreau
defc52da95 [MINOR] errors dump must use user-visible date, not internal date. 2009-03-04 20:53:44 +01:00
Willy Tarreau
f073a83b1d [MEDIUM] store a complete dump of request and response errors in proxies
Each proxy instance, either frontend or backend, now has some room
dedicated to storing a complete dated request or response in case
of parsing error. This will make it possible to consult errors in
order to find the exact cause, which is particularly important for
troubleshooting faulty applications.
2009-03-04 10:26:38 +01:00
Willy Tarreau
7552c031c0 [MINOR] ensure that http_msg_analyzer updates pointer to invalid char
If an invalid character is encountered while parsing an HTTP message, we
want to get buf->lr updated to reflect it.

Along this change, a few useless __label__ declarations have been removed
because they caused gcc to consume stack space without putting anything
there.
2009-03-01 11:10:40 +01:00
Willy Tarreau
7b92db4cd5 [BUILD] proto_http did not build on gcc-2.95
move the DPRINTF below the local variable declarations.
2009-02-24 10:48:35 +01:00
Willy Tarreau
fe651a50d6 [MINOR] redirect: in prefix mode a "/" means not to change the URI
If the prefix is set to "/", it means the user does not want to alter
the original URI, so we don't want to insert a new slash before the
original URI.

(cherry-picked from commit 02a35c74942c1bce762e996698add1270e6a5030)
2008-12-07 23:48:39 +01:00
Willy Tarreau
0140f2553c [MINOR] redirect: add support for "set-cookie" and "clear-cookie"
It is now possible to set or clear a cookie during a redirection. This
is useful for logout pages, or for protecting against some DoSes. Check
the documentation for the options supported by the "redirect" keyword.

(cherry-picked from commit 4af993822e880d8c932f4ad6920db4c9242b0981)
2008-12-07 23:46:38 +01:00
Willy Tarreau
79da4697ca [MINOR] redirect: add support for the "drop-query" option
If "drop-query" is present on a "redirect" line using the "prefix" mode,
then the returned Location header will be the request URI without the
query-string. This may be used on some login/logout pages, or when it
must be decided to redirect the user to a non-secure server.

(cherry-picked from commit f2d361ccd73aa16538ce767c766362dd8f0a88fd)
2008-12-07 23:42:01 +01:00
Willy Tarreau
fd39ddaa3d [BUG] cookie capture is declared in the frontend but checked on the backend
Cookie capture would only work by pure luck on the request but did
never work on responses since only the backend was checked. The fix
consists in always checking frontend for cookie captures.
(cherry picked from commit a83c5ba9315a7c47cda2698280b7e49a9d3eb374)
2008-12-07 23:36:52 +01:00
Willy Tarreau
0a46489228 [MINOR] slightly rebalance stats_dump_{raw,http}
Both should process the response buffer equally. They now both
clear the hijack bit once done, and both receive a pointer to
the response buffer in their arguments.
2008-12-07 18:30:00 +01:00
Willy Tarreau
01bf8675ed [MEDIUM] reference the current hijack function in the buffer itself
Instead of calling a hard-coded function to produce data, let's
reference this function into the buffer and call it from there
when BF_HIJACK is set. This goes in the direction of more generic
session management code.
2008-12-07 18:03:29 +01:00
Willy Tarreau
59234e91c2 [MEDIUM] rename process_request to http_process_request
Now the function only does HTTP request and nothing else. Also pass
the request buffer to it.
2008-11-30 23:51:27 +01:00
Willy Tarreau
d34af78a34 [MEDIUM] move the HTTP request body analyser out of process_request().
A new function http_process_request_body() has been created to process
the request body. Next step is now to clean up process_request().
2008-11-30 23:36:37 +01:00
Willy Tarreau
60b85b0694 [MEDIUM] extract the HTTP tarpit code from process_request().
The tarpit is now an autonomous independant analyser.
2008-11-30 23:28:40 +01:00
Willy Tarreau
edcf6687d6 [MEDIUM] extract TCP request processing from HTTP
The TCP analyser has moved to proto_tcp.c. Breaking the function
has required finer use of the return value and adding some tests
to process_session().
2008-11-30 23:15:34 +01:00
Willy Tarreau
0cac36f415 [MEDIUM] make the http server error function a pointer in the session
It was a bit awkward to have session.c call return_srv_error() for
HTTP error messages related to servers. The function has been adapted
to be passed a pointer to the faulty stream interface, and is now a
pointer in the session. It is possible that in the future, it will
become a callback in the stream interface itself.
2008-11-30 20:44:17 +01:00
Willy Tarreau
2d3d94cf23 [MINOR] replace srv_close_with_err() with http_server_error()
The new function looks like the previous one except that it operates
at the stream interface level and assumes an already closed SI.

Also remove some old unused occurrences of srv_close_with_err().
2008-11-30 20:28:57 +01:00
Willy Tarreau
dded32defa [MINOR] replace client_retnclose() with stream_int_retnclose()
This makes more sense to return a message to a stream interface
than to a session.

senddata.{c,h} have been removed.
2008-11-30 19:48:07 +01:00
Willy Tarreau
81acfab4fd [MINOR] replace the ambiguous client_return function by stream_int_return
This one applies to a stream interface, which makes more sense.
2008-11-30 19:22:53 +01:00
Willy Tarreau
a5555ec68a [MINOR] call session->do_log() for logging
In order to avoid having to call per-protocol logging function directly
from session.c, it's better to assign the logging function when the session
is created. This also eliminates a test when the function is needed, and
opens the way to more complete logging functions.
2008-11-30 19:02:32 +01:00
Willy Tarreau
55a8d0e1bb [CLEANUP] move the session-related functions to session.c
proto_http.c was not suitable for session-related processing, it was
just convenient for the tranformation.

Some more splitting must occur: process_request/response in proto_http.c
must be split again per protocol, and the caller must run a list.

Some functions should be directly attached to the session or the buffer
(eg: perform_http_redirect, return_srv_error, http_sess_log).
2008-11-30 18:47:21 +01:00
Willy Tarreau
fe3718ab79 [MAJOR] complete layer4/7 separation
All the processing has now completely been split in layers. As of
now, everything is still in process_session() which is not the right
place, but the code sequence works. Timeouts, retries, errors, all
work.

The shutdown sequence has been strictly applied: BF_SHUTR/BF_SHUTW
are only assigned by lower layers. Upper layers can only indicate
their wish to close using BF_SHUTR_NOW and BF_SHUTW_NOW.

When a shutdown is performed on a stream interface, the buffer flags
are updated accordingly and re-checked by upper layers. A lot of care
has been taken to ensure that aborts during intermediate connection
setups are correctly handled and shutdowns correctly propagated to
both buffers.

A future evolution would consist in ensuring that BF_SHUT?_NOW may
be set at any time, and applies only when the buffer is empty. This
might help with error messages, but might complicate the processing
of data remaining in buffers.

Some useless buffer flag combinations have been removed.

Stat counters are still broken (eg: per-server total number of sessions).

Error messages should be delayed to the close instant and be produced by
protocol.

Many functions must now move to proper locations.
2008-11-30 18:14:12 +01:00
Willy Tarreau
99126c35c1 [MEDIUM] make the stream interface control the SHUT{R,W} bits
It's better that the stream interface controls the BF_SHUT* bits so
that they always reflect the real state of the interface.
2008-11-27 22:32:14 +01:00
Willy Tarreau
8bfa426cad [MEDIUM] process shutw during connection attempt
It sometimes happens that a connection is aborted at the exact same moment
it establishes. We have to close the socket and not only to shut it down
for writes.

Some corner cases remain. We have to handle the shutr/shutw at the stream
interface and only report the status to the buffer, not the opposite.
2008-11-27 09:25:45 +01:00
Willy Tarreau
0a5d5ddeb9 [MEDIUM] remove stream_sock_update_data()
Two new functions are used instead : buffer_check_{shutr,shutw}.
It is indeed more adequate to check for new closures only when the
buffer reports them.

Several remaining unclosed connections were detected after a test,
even before this patch, so a bug remains. To reproduce, try the
following during 30 seconds :

  inject30l4 -n 20000 -l -t 1000 -P 10 -o 4 -u 100 -s 100 -G 127.0.0.1:8000/
2008-11-23 19:31:35 +01:00
Willy Tarreau
74ab2ac7b0 [MEDIUM] stream_interface: added a DISconnected state between CON/EST and CLO
There were rare situations where it was not easy to detect that a failed
session attempt had occurred and needed some server cleanup. In particular,
client aborts sometimes lead to session leaks on the server side.

A new state "SI_ST_DIS" (disconnected) has been introduced for this. When
a session has been closed at a stream interface but the server cleanup has
not occurred, this state is entered instead of CLO. The cleanup is then
performed there and the state goes to CLO.

A new diagram has been added to show possible stream_interface state
transitions that can occur in a stream-sock. It makes debugging easier.
2008-11-23 17:23:07 +01:00
Willy Tarreau
4351b3a4ca [MEDIUM] continue layering cleanups.
The server sessions are now only decremented when entering SI_ST_CER
and SI_ST_CLO states. A state is clearly missing between EST and CLO,
or after CLO (eg: END), because many cleanups are performed upon CLO
and must rely on tricks to ensure being done only once.

The goal of next changes will be to improve what has been started.
Ideally, the FD should only notify the SI about the change, which
should itself only notify the session when it has some news or when
it needs help (eg: redispatch). The buffer's error processing should
not change the FD's status immediately, otherwise we risk race conds
between a pending connect and a shutw (for instance). Also, the new
connect attempt should only be made after layer 7 and all the crap
above buffers.
2008-11-12 01:51:41 +01:00
Willy Tarreau
1e62de615b [MEDIUM] add the SN_CURR_SESS flag to the session to track open sessions
It is quite hard to track when the current session has already been counted
or discounted from the server's total number of established sessions. For
this reason, we introduce a new session flag, SN_CURR_SESS, which indicates
if the current session is one of those reported by the server or not. It
simplifies session accounting and makes it far more robust. It also makes
it possible to perform a last-minute cleanup during session_free().

Right now, with this fix and a few more buffer transitions fixes, no session
were found to remain after a test.
2008-11-11 20:26:58 +01:00
Willy Tarreau
cff6411f9a [MAJOR] add a connection error state to the stream_interface
Tracking connection status changes was hard, and some code was
redundant. A new SI_ST_CER state was added to the stream interface
to indicate a past connection error, and an SI_FL_ERR flag was
added to report past I/O error. The stream_sock code does not set
the connection to SI_ST_CLO anymore in case of I/O error, it's
the upper layer which does it. This makes it possible to know
exactly when the file descriptors are allocated.

The new SI_ST_CER state permitted to split tcp_connection_status()
in two parts, one processing SI_ST_CON and the other one SI_ST_CER.
Synchronous connection errors now make use of this last state, hence
eliminating duplicate code.

Some ib<->ob copy paste errors were found and fixed, and all entities
setting SI_ST_CLO also shut the buffers down.

Some of these stream_interface specific functions and structures
have migrated to a new stream_interface.c file.

Some types of errors are still not detected by the buffers. For
instance, let's assume the following scenario in one single pass
of process_session: a connection sits in SI_ST_TAR state during
a retry. At TAR expiration, a new connection attempt is made, the
connection is obtained and srv->cur_sess is increased. Then the
buffer timeout is fires and everything is cleared, the new state
becomes SI_ST_CLO. The cleaning code checks that previous state
was either SI_ST_CON or SI_ST_EST to release the connection. But
that's wrong because last state is still SI_ST_TAR. So the
server's connection count does not get decreased.

This means that prev_state must not be used, and must be replaced
by some transition detection instead of level detection.

The following debugging line was useful to track state changes :

  fprintf(stderr, "%s:%d: cs=%d ss=%d(%d) rqf=0x%08x rpf=0x%08x\n", __FUNCTION__, __LINE__,
          s->si[0].state, s->si[1].state, s->si[1].err_type, s->req->flags, s-> rep->flags);
2008-11-03 06:26:53 +01:00
Willy Tarreau
efb453c259 [MAJOR] migrate the connection logic to stream interface
The connection setup code has been refactored in order to
make it run only on low level (stream interface). Several
complicated functions have been removed from backend.c,
and we now have sess_update_stream_int() to manage
an assigned connection, sess_prepare_conn_req() to assign a
server to a connection request, perform_http_redirect() to
redirect instead of connecting to server, and return_srv_error()
to return connection error status messages.

The stream_interface status changes are checked before adjusting
buffer flags, so that the buffers can be informed about this lower
level update.

A new connection is initiated by changing si->state from SI_ST_INI
to SI_ST_REQ.

The code seems to work but is awfully dirty. Some functions need
to be moved, and the layering is not yet quite clear.

A lot of dead old code has simply been removed.
2008-11-02 10:19:10 +01:00
Willy Tarreau
d7704b5343 [MINOR] add an expiration flag to the stream_sock_interface
This expiration flag is used to indicate that the timer has
expired without having to check it everywhere.
2008-11-02 10:19:10 +01:00
Willy Tarreau
3c6ab2e28d [MEDIUM] use buffer_check_timeouts instead of stream_sock_check_timeouts()
It's more appropriate to use buffer_check_timeouts() to check for buffer
timeouts and si->shutw/shutr to shutdown the stream interfaces.
2008-11-02 10:19:10 +01:00
Willy Tarreau
3537467679 [MEDIUM] move QUEUE and TAR timers to stream interfaces
It was not practical to have QUEUE and TAR timers in buffers, as they caused
triggering of the timeout flags. Move them to the stream interface where they
belong.
2008-11-02 10:19:09 +01:00
Willy Tarreau
a37095b96f [CLEANUP] process_session: move debug outputs out of the critical loop
The if(debug&closed) printfs have moved outside of the loop. It also
permitted to merge several of them.
2008-11-02 10:19:09 +01:00
Willy Tarreau
4ffd51a848 [MEDIUM] process_session: make use of the new buffer flags
Now we have almost two distinct parts between tcp and http.
Only the connection establishment code still requires some
resynchronization, the rest does not.
2008-11-02 10:19:09 +01:00
Willy Tarreau
9a2d15429d [MEDIUM] buffers: add BF_READ_ATTACHED and BF_ANA_TIMEOUT
Those two flags will be used to wake up analysers only when
needed.
2008-11-02 10:19:09 +01:00
Willy Tarreau
48adac5db9 [MEDIUM] stream interface: add the ->shutw method as well as in and out buffers
Those entries were really needed for cleaner and better code. Using them
has permitted to automatically close a file descriptor during a shut write,
reducing by 20% the number of calls to process_session() and derived
functions.

Process_session() does not need to know the file descriptor anymore, though
it still remains very complicated due to the special case for the connect
mode.
2008-11-02 10:19:08 +01:00
Willy Tarreau
e5ed406715 [MAJOR] make stream sockets aware of the stream interface
As of now, a stream socket does not directly wake up the task
but it does contact the stream interface which itself knows the
task. This allows us to perform a few cleanups upon errors and
shutdowns, which reduces the number of calls to data_update()
from 8 per session to 2 per session, and make all the functions
called in the process_session() loop completely swappable.

Some improvements are required. We need to provide a shutw()
function on stream interfaces so that one side which closes
its read part on an empty buffer can propagate the close to
the remote side.
2008-11-02 10:19:08 +01:00
Willy Tarreau
cb651251f9 [OPTIM] ev_sepoll: detect newly created FDs and check them once
When an accept() creates a new FD, it is already marked as set for
reads. But the task will be woken up without first checking if the
socket could be read.

The speculative I/O gives us a chance to either read the FD if there
are data pending on it, or immediately mark it for poll mode if
nothing is pending.

Simply doing this reduces the number of calls to process_session
from 6 to 5 per session, 2 to 1 calls to process_request, 10% less
calls to epoll_ctl, fd_clr, fd_set, stream_sock_data_update, 20%
less eb32_insert/eb_delete, etc... General performance increase
seems to be around 3%.
2008-11-02 10:19:07 +01:00
Willy Tarreau
3da77c5abd [MINOR] re-arrange buffer flags and rename some of them
The buffer flags became a big bazaar. Re-arrange them
so that their names are more explicit and so that they
are more easily readable in hex form. Some aggregates
have also been adjusted.
2008-11-02 10:19:07 +01:00
Willy Tarreau
72b179a53c [MEDIUM] reintroduce BF_HIJACK with produce_content
The stats dump are back. Even very large config files with
5000 servers work fast and well. The SN_SELF_GEN flag has
completely been removed.
2008-11-02 10:19:06 +01:00
Willy Tarreau
36e6a41bc8 [MINOR] only call flow analysers when their read side is connected.
It's useless to call flow analysers when their read side has not
seen a connection yet.
2008-11-02 10:19:06 +01:00
Willy Tarreau
3a16b2c9cd [MEDIUM] split stream_sock_process_data
It was a waste to constantly update the file descriptor's status
and timeouts during a flags update. So stream_sock_process_data
has been slit in two parts :
  stream_sock_data_update()  => computes updated flags
  stream_sock_data_finish()  => computes timeouts

Only the first one is called during flag updates. The second one
is only called upon completion. The number of calls to fd_set/fd_clr
has now significantly dropped.

Also, it's useless to check for errors and timeouts in the
process_session() loop, it's enough to check for them at the
beginning.
2008-11-02 10:19:06 +01:00
Willy Tarreau
f9839bdffe [MAJOR] make the client side use stream_sock_process_data()
The client side now relies on stream_sock_process_data(). One
part has not yet been re-implemented, it concerns the calls
to produce_content().

process_session() has been adjusted to correctly check for
changing bits in order not to call useless functions too many
times.

It already appears that stream_sock_process_data() should be
split so that the timeout computations are only performed at
the exit of process_session().
2008-11-02 10:19:06 +01:00
Willy Tarreau
2d2127989c [MEDIUM] stream_sock_process_data moved to stream_sock.c
The old temporary process_srv_data function moved to stream_sock.c.
2008-11-02 10:19:05 +01:00
Willy Tarreau
8a8188301b [MEDIUM] process_srv_data: ensure that we always correctly re-arm timeouts
We really want to ensure that we don't miss a timeout update and do not
update them for nothing. So the code takes care of updating the timeout
in the two following circumstances :
  - it was not set
  - some I/O has been performed

Maybe we'll be able to remove that from stream_sock_{read|write}, or
we'll find a way to ensure that we never have to re-enable this.
2008-11-02 10:19:05 +01:00
Willy Tarreau
2ac679d9aa [MEDIUM] third cleanup and optimization of process_srv_data()
Some repeated tests were factored out.
Now the code makes sense and is fully understandable.
2008-11-02 10:19:05 +01:00
Willy Tarreau
8fbd3b4ce7 [MEDIUM] second level of code cleanup for process_srv_data
Now the function is 100% server-independant. Next step will
consist in using the same function for the client side too.
2008-11-02 10:19:05 +01:00
Willy Tarreau
376580a873 [MEDIUM] massive cleanup of process_srv()
Server-specific calls were extracted and moved to the caller.
The function is now nearly server-agnostic.
2008-11-02 10:19:05 +01:00
Willy Tarreau
8b46aa01ac [OPTIM] remove useless fd_set(read) upon shutdown(write)
Those old tricks are no longer needed and are overwritten
anyway. Remove them.
2008-11-02 10:19:05 +01:00
Willy Tarreau
fa7e10251d [MAJOR] rework of the server FSM
srv_state has been removed from HTTP state machines, and states
have been split in either TCP states or analyzers. For instance,
the TARPIT state has just become a simple analyzer.

New flags have been added to the struct buffer to compensate this.
The high-level stream processors sometimes need to force a disconnection
without touching a file-descriptor (eg: report an error). But if
they touched BF_SHUTW or BF_SHUTR, the file descriptor would not
be closed. Thus, the two SHUT?_NOW flags have been added so that
an application can request a forced close which the stream interface
will be forced to obey.

During this change, a new BF_HIJACK flag was added. It will
be used for data generation, eg during a stats dump. It
prevents the producer on a buffer from sending data into it.

  BF_SHUTR_NOW  /* the producer must shut down for reads ASAP  */
  BF_SHUTW_NOW  /* the consumer must shut down for writes ASAP */
  BF_HIJACK     /* the producer is temporarily replaced        */

BF_SHUTW_NOW has precedence over BF_HIJACK. BF_HIJACK has
precedence over BF_MAY_FORWARD (so that it does not need it).

New functions buffer_shutr_now(), buffer_shutw_now(), buffer_abort()
are provided to manipulate BF_SHUT* flags.

A new type "stream_interface" has been added to describe both
sides of a buffer. A stream interface has states and error
reporting. The session now has two stream interfaces (one per
side). Each buffer has stream_interface pointers to both
consumer and producer sides.

The server-side file descriptor has moved to its stream interface,
so that even the buffer has access to it.

process_srv() has been split into three parts :
  - tcp_get_connection() obtains a connection to the server
  - tcp_connection_failed() tests if a previously attempted
    connection has succeeded or not.
  - process_srv_data() only manages the data phase, and in
    this sense should be roughly equivalent to process_cli.

Little code has been removed, and a lot of old code has been
left in comments for now.
2008-11-02 10:19:04 +01:00
Willy Tarreau
41f40ede3b [MEDIUM] make it possible for analysers to follow the whole session
Some analysers will need to remain present after connection is
established. Change the way BF_MAY_FORWARD is set to allow this.
2008-11-02 10:19:04 +01:00
Willy Tarreau
c52164a1a8 [BUG] process_request: HTTP body analysis must return zero if missing data
This missing return and timeout check caused an infinite loop too.
2008-08-17 19:27:11 +02:00
Willy Tarreau
2500981dc1 [BUG] process_cli/process_srv: don't call shutdown when already done
A few missing checks of BF_SHUTR and BF_SHUTW caused busy loops upon
some error paths.
2008-08-17 18:16:38 +02:00
Willy Tarreau
ffab5b4ab0 [MEDIUM] merge inspect_exp and txn->exp into request buffer
Since we may have several analysers on a buffer, it's more
convenient to have the analyser timeout attached to the
buffer itself.
2008-08-17 18:03:28 +02:00
Willy Tarreau
6d2889ba3d [OPTIM] process_cli/process_srv: reduce the number of tests
We can skip a number of tests by simply checking a few flags,
it saves a few CPU cycles in the fast path.
2008-08-17 16:25:06 +02:00
Willy Tarreau
2df28e8110 [MEDIUM] session: move the analysis bit field to the buffer
It makes more sense to store the list of analysers in the buffer
than in the session since they are precisely plugged onto one
buffer.
2008-08-17 15:20:19 +02:00
Willy Tarreau
f495ddf9d4 [MINOR] ensure the termination flags are set by process_xxx
When any processing remains on a buffer, it must be up to the
processing functions to set the termination flags, because they
are the only ones who know about higher levels.
2008-08-17 14:38:41 +02:00
Willy Tarreau
507385d0e1 [MEDIUM] centralize buffer timeout checks at the top of process_session
it's more efficient and easier to check all the timeouts at once and
always rely on the buffer flags than to check them everywhere.
2008-08-17 13:04:25 +02:00
Willy Tarreau
26ed74dadc [MEDIUM] use buffer->wex instead of buffer->cex for connect timeout
It's a shame not to use buffer->wex for connection timeouts since by
definition it cannot be used till the connection is not established.
Using it instead of ->cex also makes the buffer processing more
symmetric.
2008-08-17 12:11:14 +02:00
Willy Tarreau
dafde43410 [MAJOR] process_session: rely only on buffer flags
Instead of calling all functions in a loop, process_session now
calls them according to buffer flags changes. This ensures that
we almost never call functions for nothing. The flags settings
are still quite coarse, but the number of average functions
calls per session has dropped from 31 to 18 (the calls to
process_srv dropped from 13 to 7 and the calls to process_cli
dropped from 13 to 8).

This could still be improved by memorizing which flags each
function uses, but that would add a level of complexity which
is not desirable and maybe even not worth the small gain.
2008-08-17 01:15:41 +02:00
Willy Tarreau
e393fe224b [MEDIUM] buffers: add BF_EMPTY and BF_FULL to remove dependency on req/rep->l
It is not always convenient to run checks on req->l in functions to
check if a buffer is empty or full. Now the stream_sock functions
set flags BF_EMPTY and BF_FULL according to the buffer contents. Of
course, functions which touch the buffer contents adjust the flags
too.
2008-08-16 22:18:07 +02:00
Willy Tarreau
ba392cecf9 [CLEANUP] get rid of BF_SHUT*_PENDING
BF_SHUTR_PENDING and BF_SHUTW_PENDING were poor ideas because
BF_SHUTR is the pending of BF_SHUTW_DONE and BF_SHUTW is the
pending of BF_SHUTR_DONE. Remove those two useless and confusing
"pending" versions and rename buffer_shut{r,w}_* functions.
2008-08-16 21:13:23 +02:00
Willy Tarreau
a7c52761b4 [BUG] process_response: do not touch srv_state
process_response is not allowed to touch srv_state (this is an
incident which has survived the code migration). This bug was
causing connection exhaustion on frontend due to some closed
sockets marked SV_STDATA again.
2008-08-16 18:40:18 +02:00
Willy Tarreau
d9f483646d [BUG] buffers: remove BF_MAY_CONNECT and fix forwarding issue
It wasn't really wise to separate BF_MAY_CONNECT and BF_MAY_FORWARD,
as it caused trouble in TCP mode because the connection was allowed
but not the forwarding. Remove BF_MAY_CONNECT.
2008-08-16 16:39:26 +02:00
Willy Tarreau
9a8c5de375 [BUG] process_response must not enable the read FD
Since the separation of TCP and HTTP state machines, the HTTP
code must not play anymore with the file descriptor status
without checking if they are closed. Remains of such practice
have caused busy loops under some circumstances (mainly when
client closed during headers response).
2008-08-16 16:11:07 +02:00
Willy Tarreau
f853320b44 [MINOR] term_trace: add better instrumentations to trace the code
A new member has been added to the struct session. It keeps a trace
of what block of code performs a close or a shutdown on a socket, and
in what sequence. This is extremely convenient for post-mortem analysis
where flag combinations and states seem impossible. A new ABORT_NOW()
macro has also been added to make the code immediately segfault where
called.
2008-08-16 14:55:08 +02:00
Willy Tarreau
1ae3a057df [MEDIUM] remove unused references to {CL|SV}_STSHUT*
All references to CL_STSHUT* and SV_STSHUT* were removed where
possible. Some of them could not be removed because they are
still in use by the unix sockets.

A bug remains at this stage. Injecting with a very short timeout
sometimes leads to a client in close state and a server in data
state with all buffer flags indicating a shutdown but the server
fd still enable, thus causing a busy loop.
2008-08-16 10:56:30 +02:00
Willy Tarreau
461f662846 [MAJOR] clearly separate HTTP response processing from TCP server state
The HTTP response is now processed in its own function, regardless of
the TCP state. All FSMs have become fairly simpler and must still be
improved by removing useless CL_STSHUT* and SV_STSHUT* (still used by
proto_uxst). The number of calls to process_* is still huge though.

Next steps consist in :
  - removing useless assignments of CL_STSHUT* and SV_STSHUT*
  - add a BF_EMPTY flag to buffers to indicate an empty buffer
  - returning smarter values in process_* so that each callee
    may explicitly indicate whom needs to be called after it.
  - unify read and write timeouts for a same side. The way it
    is now is too complicated and error-prone
  - auditing code for regression testing

We're close to getting something which works fairly better now.
2008-08-15 23:43:19 +02:00
Willy Tarreau
cebf57e0bf [MAJOR] better separation of response processing and server state
TCP timeouts are not managed anymore by the response FSM. Warning,
the FORCE_CLOSE state does not work anymore for now. All remaining
bugs causing stale connections have been swept.
2008-08-15 18:16:37 +02:00
Willy Tarreau
f5483bf639 [MAJOR] get rid of the SV_STHEADERS state
The HTTP response code has been moved to a specific function
called "process_response" and the SV_STHEADERS state has been
removed and replaced with the flag AN_RTR_HTTP_HDR.
2008-08-14 18:35:40 +02:00
Willy Tarreau
e46ab5524f [BUG] fix recently introduced loop when client closes early
Due to a recent change in the FSMs, if the client closes with buffer
full, then the server loops waiting for headers. We can safely ignore
this case since the server FSM will have to be reworked too. Let's
fix the root cause for now.
2008-08-14 00:18:39 +02:00
Willy Tarreau
c65a3ba3d4 [MAJOR] completely separate HTTP and TCP states on the request path
For the first time, HTTP and TCP are not merged anymore. All request
processing has moved to process_request while the TCP processing of
the frontend remains in process_cli. The code is a lot cleaner,
simpler, smaller (1%) and slightly faster (1% too).

Right now, the HTTP state machine cannot easily command the TCP
state machine, but it does not cause that many difficulties.

The response processing has not yet been extracted, and the unix-stream
state machines have to be broken down that way too.

The CL_STDATA, CL_STSHUTR and CL_STSHUTW states still exist and are
exactly the sames. They will have to be all merged into CL_STDATA
once the work has stabilized. It is also possible that this single
state will disappear in favor of just buffer flags.
2008-08-14 00:18:39 +02:00
Willy Tarreau
7f875f6c8f [MEDIUM] simplify and centralize request timeout cancellation and request forwarding
Instead of playing with req->flags and request timeout everywhere,
tweak them only at precise locations.
2008-08-14 00:18:38 +02:00
Willy Tarreau
adfb8569f7 [MAJOR] get rid of SV_STANALYZE (step 2)
The SV_STANALYZE state was installed on the server side but was really
meant to be processed with the rest of the request on the client side.
It suffered from several issues, mostly related to the way timeouts were
handled while waiting for data.

All known issues related to timeouts during a request - and specifically
a request involving body processing - have been raised and fixed. At this
point, the code is a bit dirty but works fine, so next steps might be
cleanups with an ability to come back to the current state in case of
trouble.
2008-08-14 00:18:38 +02:00
Willy Tarreau
67f0eead22 [MAJOR] kill CL_STINSPECT and CL_STHEADERS (step 1)
This is a first attempt at separating data processing from the
TCP state machine. Those two states have been replaced with flags
in the session indicating what needs to be analyzed. The corresponding
code is still called before and in lieu of TCP states.

Next change should get rid of the specific SV_STANALYZE which is in
fact a client state.

Then next change should consist in making it possible to analyze
TCP contents while being in CL_STDATA (or CL_STSHUT*).
2008-08-14 00:18:38 +02:00
Aleksandar Lazic
697bbb0106 [PATCH] appsessions: cleanup DEBUG_HASH and initialize request_counter
This patch cleanup the -DDEBUG=DEBUG_HASH output setting and initialize
the request_counter for the appsessions.
2008-08-13 23:43:26 +02:00
Willy Tarreau
9f1f24bb7f [BUG] client timeout incorrectly rearmed while waiting for server
Client timeout could be refreshed in stream_sock_*, but this is
undesired when the timeout is already set to eternity. The effect
is that a session could still be aborted if client timeout was
smaller than server timeout. A second effect is that sessions
expired on the server side would expire with "cD" flags.

The fix consists in not updating it if it was not previously set.
A cleaner method might consist in updating the buffer timeout. This
is probably what will be done later when the state machines only
deal with the buffers.
2008-08-11 11:34:18 +02:00
Willy Tarreau
ce09c52187 [BUG] server timeout was not considered in some circumstances
Due to a copy-paste typo, the client timeout was refreshed instead
of the server's when waiting for server response. This means that
the server's timeout remained eternity.
2008-08-11 11:34:16 +02:00
Willy Tarreau
fb0528bd56 [BUG] fix segfault with url_param + check_post
If an HTTP/0.9-like POST request is sent to haproxy while
configured with url_param + check_post, it will crash. The
reason is that the total buffer length was computed based
on req->total (which equals the number of bytes read) and
not req->l (number of bytes in the buffer), thus leading
to wrong size calculations when calling memchr().

The affected code does not look like it could have been
exploited to run arbitrary code, only reads were performed
at wrong locations.
2008-08-11 11:34:01 +02:00
Willy Tarreau
718f0ef129 [MEDIUM] process_cli: don't rely at all on server state
A new buffer flag BF_MAY_FORWARD has been added so that the client
FSM can check whether it is allowed to forward the response to the
client. The client FSM does not have to monitor the server state
anymore.
2008-08-10 16:21:32 +02:00
Willy Tarreau
dc0a6a0dea [MEDIUM] process_srv: don't rely at all on client state
A new buffer flag BF_MAY_CONNECT has been added so that the server
FSM can check whether it is allowed to establish a connection or
not. That way, the client FSM only has to move this flag and the
server side does not need to monitor client state anymore.
2008-08-03 22:47:10 +02:00
Willy Tarreau
6468d924ea [MEDIUM] process_srv: rely on buffer flags for client shutdown
The open/close nature of each half of the client side is known
to the buffer, so let the server state machine rely on this
instead of checking the client state for CL_STSHUT* or
CL_STCLOSE.
2008-08-03 20:48:51 +02:00
Willy Tarreau
89edf5e629 [MEDIUM] buffers: ensure buffer_shut* are properly called upon shutdowns
It is important that buffer states reflect the state of both sides so
that we can remove client and server state inter-dependencies.
2008-08-03 20:48:50 +02:00
Ross West
af72a1d8ec [MINOR] permit renaming of x-forwarded-for header
Because I needed it in my situation - here's a quick patch to
allow changing of the "x-forwarded-for" header by using a suboption to
"option forwardfor".

Suboption "header XYZ" will set the header from "x-forwarded-for" to "XYZ".

Default is still "x-forwarded-for" if the header value isn't defined.
Also the suboption 'except a.b.c.d/z' still works on the same line.

So it's now: option forwardfor [except a.b.c.d[/z]] [header XYZ]
2008-08-03 10:51:45 +02:00
Willy Tarreau
0ceba5af74 [MEDIUM] acl: set types on all currently known ACL verbs
All currently known ACL verbs have been assigned a type which makes
it possible to detect inconsistencies, such as response values used
in request rules.
2008-07-25 19:31:03 +02:00
Willy Tarreau
ec6c5df018 [CLEANUP] remove many #include <types/xxx> from C files
It should be stated as a rule that a C file should never
include types/xxx.h when proto/xxx.h exists, as it gives
less exposure to declaration conflicts (one of which was
caught and fixed here) and it complicates the file headers
for nothing.

Only types/global.h, types/capture.h and types/polling.h
have been found to be valid includes from C files.
2008-07-16 10:30:42 +02:00
Willy Tarreau
284648e079 [CLEANUP] remove unused include/types/client.h
This file is not used anymore.
2008-07-16 10:30:40 +02:00
Willy Tarreau
b686644ad8 [MAJOR] implement tcp request content inspection
Some people need to inspect contents of TCP requests before
deciding to forward a connection or not. A future extension
of this demand might consist in selecting a server farm
depending on the protocol detected in the request.

For this reason, a new state CL_STINSPECT has been added on
the client side. It is immediately entered upon accept() if
the statement "tcp-request inspect-delay <xxx>" is found in
the frontend configuration. Haproxy will then wait up to
this amount of time trying to find a matching ACL, and will
either accept or reject the connection depending on the
"tcp-request content <action> {if|unless}" rules, where
<action> is either "accept" or "reject".

Note that it only waits that long if no definitive verdict
can be found earlier. That generally implies calling a fetch()
function which does not have enough information to decode
some contents, or a match() function which only finds the
beginning of what it's looking for.

It is only at the ACL level that partial data may be processed
as such, because we need to distinguish between MISS and FAIL
*before* applying the term negation.

Thus it is enough to add "| ACL_PARTIAL" to the last argument
when calling acl_exec_cond() to indicate that we expect
ACL_PAT_MISS to be returned if some data is missing (for
fetch() or match()). This is the only case we may return
this value. For this reason, the ACL check in process_cli()
has become a lot simpler.

A new ACL "req_len" of type "int" has been added. Right now
it is already possible to drop requests which talk too early
(eg: for SMTP) or which don't talk at all (eg: HTTP/SSL).

Also, the acl fetch() functions have been extended in order
to permit reporting of missing data in case of fetch failure,
using the ACL_TEST_F_MAY_CHANGE flag.

The default behaviour is unchanged, and if no rule matches,
the request is accepted.

As a side effect, all layer 7 fetching functions have been
cleaned up so that they now check for the validity of the
layer 7 pointer before dereferencing it.
2008-07-16 10:29:07 +02:00
Willy Tarreau
11382813a1 [TESTS] added test-acl.cfg to test some ACL combinations
various rules constructions can be tested with this test case.
2008-07-09 16:18:21 +02:00
Willy Tarreau
a8cfa34a9c [BUG] use_backend would not correctly consider "unless"
A copy-paste typo made use_backend not correctly consider the "unless"
case, depending on the previous "block" rule.
2008-07-09 11:23:31 +02:00
Willy Tarreau
0c303eec87 [MAJOR] convert all expiration timers from timeval to ticks
This is the first attempt at moving all internal parts from
using struct timeval to integer ticks. Those provides simpler
and faster code due to simplified operations, and this change
also saved about 64 bytes per session.

A new header file has been added : include/common/ticks.h.

It is possible that some functions should finally not be inlined
because they're used quite a lot (eg: tick_first, tick_add_ifset
and tick_is_expired). More measurements are required in order to
decide whether this is interesting or not.

Some function and variable names are still subject to change for
a better overall logics.
2008-07-07 00:09:58 +02:00
Willy Tarreau
91e99931b7 [MEDIUM] introduce task->nice and boot access to statistics
The run queue scheduler now considers task->nice to queue a task and
to pick a task out of the queue. This makes it possible to boost the
access to statistics (both via HTTP and UNIX socket). The UNIX socket
receives twice as much a boost as the HTTP socket because it is more
sensible.
2008-06-30 07:51:00 +02:00
Willy Tarreau
284c7b3195 [BUG] disable buffer read timeout when reading stats
The buffer read timeouts were not reset when stats were produced. This
caused unneeded wakeups.
2008-06-29 16:38:43 +02:00
Willy Tarreau
b7f694f20e [MEDIUM] implement a monotonic internal clock
If the system date is set backwards while haproxy is running,
some scheduled events are delayed by the amount of time the
clock went backwards. This is particularly problematic on
systems where the date is set at boot, because it seldom
happens that health-checks do not get sent for a few hours.

Before switching to use clock_gettime() on systems which
provide it, we can at least ensure that the clock is not
going backwards and maintain two clocks : the "date" which
represents what the user wants to see (mostly for logs),
and an internal date stored in "now", used for scheduled
events.
2008-06-22 17:18:02 +02:00
Willy Tarreau
7c669d7e0f [BUG] fix the dequeuing logic to ensure that all requests get served
The dequeuing logic was completely wrong. First, a task was assigned
to all servers to process the queue, but this task was never scheduled
and was only woken up on session free. Second, there was no reservation
of server entries when a task was assigned a server. This means that
as long as the task was not connected to the server, its presence was
not accounted for. This was causing trouble when detecting whether or
not a server had reached maxconn. Third, during a redispatch, a session
could lose its place at the server's and get blocked because another
session at the same moment would have stolen the entry. Fourth, the
redispatch option did not work when maxqueue was reached for a server,
and it was not possible to do so without indefinitely hanging a session.

The root cause of all those problems was the lack of pre-reservation of
connections at the server's, and the lack of tracking of servers during
a redispatch. Everything relied on combinations of flags which could
appear similarly in quite distinct situations.

This patch is a major rework but there was no other solution, as the
internal logic was deeply flawed. The resulting code is cleaner, more
understandable, uses less magics and is overall more robust.

As an added bonus, "option redispatch" now works when maxqueue has
been reached on a server.
2008-06-20 15:08:06 +02:00
Willy Tarreau
7008987813 [BUG] queue management: wake oldest request in queues
When a server terminates a connection, the next session in its
own queue was immediately processed. Because of this, if all
server queues are always filled, then no new anonymous request
will be processed. Consider oldest request between global and
server queues to choose from which to pick the request.

An improvement over this will consist in adding a configurable
offset when comparing expiration dates, so that cookie-less
requests can get either less or more priority.
2008-06-20 15:07:40 +02:00
Willy Tarreau
b463dfb2de [MEDIUM] add support for conditional HTTP redirection
A new "redirect" keyword adds the ability to send an HTTP 301/302/303
redirection to either an absolute location or to a prefix followed by
the original URI. The redirection is conditionned by ACL rules, so it
becomes very easy to move parts of a site to another site using this.

This work was almost entirely done at Exceliance by Emeric Brun.

A test-case has been added in the tests/ directory.
2008-06-07 23:08:56 +02:00
Krzysztof Piotr Oledzki
1acf217366 [BUG/CLEANUP] cookiedomain -> cookie_domain rename + free(p->cookie_domain)
Rename cookiedomain -> cookie_domain to be consistent with current
naming scheme. Also make sure cookie_domain is deallocated at deinit()
2008-05-30 07:03:22 +02:00
Krzysztof Piotr Oledzki
efe3b6f524 [MINOR] Allow to specify a domain for a cookie
This patch allows to specify a domain used when inserting a cookie
providing a session stickiness. Usefull for example with wildcard domains.

The patch adds one new variable to the struct proxy: cookiedomain.
When set the domain is appended to a Set-Cookie header.

Domain name is validated using the new invalid_domainchar() function.
It is basically invalid_char() limited to [A-Za-z0-9_.-]. Yes, the test
is too trivial and does not cover all wrong situations, but the main
purpose is to detect most common mistakes, not intentional abuses.

The underscore ("_") character is not RFC-valid but as it is
often (mis)used so I decided to allow it.
2008-05-25 10:09:02 +02:00
matt.farnsworth@nokia.com
1c2ab96be5 [MAJOR] implement parameter hashing for POST requests
This patch extends the "url_param" load balancing method by introducing
the "check_post" option. Using this option enables analysis of the beginning
of POST requests to search for the specified URL parameter.

The patch also fixes a few minor typos in comments that were discovered
during code review.
2008-04-15 15:30:41 +02:00
Willy Tarreau
f899b94e63 [BUG] fix double-decrement of server connections
If a client does a sudden dirty close (CL_STCLOSE) during a server
connect turn-around, then the number of server connections is
decremented twice. This causes huge problems on the affected
server because when its connection number becomes negative, it
overflows and prevents the server from accepting new connections
due to an apparent saturation.

The fix consists in not decrementing the counter if the server is
in a turn-around state.
2008-03-28 18:19:05 +01:00
Willy Tarreau
39f7e6d516 [MEDIUM] fix stats socket limitation to 16 kB
Due to the way the stats socket work, it was not possible to
maintain the information related to the command entered, so
after filling a whole buffer, the request was lost and it was
considered that there was nothing to write anymore.

The major reason was that some flags were passed directly
during the first call to stats_dump_raw() instead of being
stored persistently in the session.

To definitely fix this problem, flags were added to the stats
member of the session structure.

A second problem appeared. When the stats were produced, a first
call to client_retnclose() was performed, then one or multiple
subsequent calls to buffer_write_chunks() were done. But once the
stats buffer was full and a reschedule operated, the buffer was
flushed, the write flag cleared from the buffer and nothing was
done to re-arm it.

For this reason, a check was added in the proto_uxst_stats()
function in order to re-call the client FSM when data were added
by stats_dump_raw(). Finally, the whole unix stats dump FSM was
rewritten to avoid all the magics it depended on. It is now
simpler and looks more like the HTTP one.
2008-03-17 22:08:01 +01:00
Willy Tarreau
51406233bb [MAJOR] implementation of the "leastconn" load balancing algorithm
The new "leastconn" LB algorithm selects the server which has the
least established or pending connections. The weights are considered,
so that a server with a weight of 20 will get twice as many connections
as the server with a weight of 10.

The algorithm respects the minconn/maxconn settings, as well as the
slowstart since it is a dynamic algorithm. It also correctly supports
backup servers (one and all).

It is generally suited for protocols with long sessions (such as remote
terminals and databases), as it will ensure that upon restart, a server
with no connection will take all new ones until its load is balanced
with others.

A test configuration has been added in order to ease regression testing.
2008-03-10 22:04:30 +01:00
Krzysztof Piotr Oledzki
5a329cf017 [MEDIUM]: Prevent redispatcher from selecting the same server, version #3
When haproxy decides that session needs to be redispatched it chose a server,
but there is no guarantee for it to be a different one. So, it often
happens that selected server is exactly the same that it was previously, so
a client ends up with a 503 error anyway, especially when one sever has
much bigger weight than others.

Changes from the previous version:
 - drop stupid and unnecessary SN_DIRECT changes

 - assign_server(): use srvtoavoid to keep the old server and clear s->srv
    so SRV_STATUS_NOSRV guarantees that t->srv == NULL (again)
    and get_server_rr_with_conns has chances to work (previously
    we were passing a NULL here)

 - srv_redispatch_connect(): remove t->srv->cum_sess and t->srv->failed_conns
   incrementing as t->srv was guaranteed to be NULL

 - add avoididx to get_server_rr_with_conns. I hope I correctly understand this code.

 - fix http_flush_cookie_flags() and move it to assign_server_and_queue()
   directly. The code here was supposed to set CK_DOWN and clear CK_VALID,
   but: (TX_CK_VALID | TX_CK_DOWN) == TX_CK_VALID == TX_CK_MASK so:
	if ((txn->flags & TX_CK_MASK) == TX_CK_VALID)
		txn->flags ^= (TX_CK_VALID | TX_CK_DOWN);
   was really a:
	if ((txn->flags & TX_CK_MASK) == TX_CK_VALID)
		txn->flags &= TX_CK_VALID

   Now haproxy logs "--DI" after redispatching connection.

 - defer srv->redispatches++ and s->be->redispatches++ so there
   are called only if a conenction was redispatched, not only
   supposed to.

 - don't increment lbconn if redispatcher selected the same sarver

 - don't count unsuccessfully redispatched connections as redispatched
   connections

 - don't count redispatched connections as errors, so:

 - the number of connections effectively served by a server is:
 srv->cum_sess - srv->failed_conns - srv->retries - srv->redispatches
   and
 SUM(servers->failed_conns) == be->failed_conns

 - requires the "Don't increment server connections too much + fix retries" patch

 - needs little more testing and probably some discussion so reverting to the RFC state

Tests #1:
 retries 4
 redispatch

i) 1 server(s): b (wght=1, down)
  b) sessions=5, lbtot=1, err_conn=1, retr=4, redis=0
  -> request failed

ii) server(s): b (wght=1, down), u (wght=1, down)
  b) sessions=4, lbtot=1, err_conn=0, retr=3, redis=1
  u) sessions=1, lbtot=1, err_conn=1, retr=0, redis=0
  -> request FAILED

iii) 2 server(s): b (wght=1, down), u (wght=1, up)
  b) sessions=4, lbtot=1, err_conn=0, retr=3, redis=1
  u) sessions=1, lbtot=1, err_conn=0, retr=0, redis=0
  -> request OK

iv) 2 server(s): b (wght=100, down), u (wght=1, up)
  b) sessions=4, lbtot=1, err_conn=0, retr=3, redis=1
  u) sessions=1, lbtot=1, err_conn=0, retr=0, redis=0
  -> request OK

v) 1 server(s): b (down for first 4 SYNS)
  b) sessions=5, lbtot=1, err_conn=0, retr=4, redis=0
  -> request OK

Tests #2:
 retries 4

i) 1 server(s): b (down)
  b) sessions=5, lbtot=1, err_conn=1, retr=4, redis=0
  -> request FAILED
2008-03-04 06:16:37 +01:00
Krzysztof Piotr Oledzki
626a19b66f [BUG] Don't increment server connections too much + fix retries
Commit 98937b8757 while fixing
one bug introduced another one. With "retries 4" and
"option redispatch" haproxy tries to connect 4 times to
one server server and 1 time to a second one. However
logs showed 5 connections to the first server (the
last one was counted twice) and 2 to the second.

This patch also fixes srv->retries and be->retries increments.

Now I get: 3 retries and 1 error in a first server (4 cum_sess)
and 1 error in a second server (1 cum_sess) with:
 retries 4
 option redispatch

and: 4 retries and 1 error (5 cum_sess) with:
 retries 4

So, the number of connections effectively served by a server is:
 srv->cum_sess - srv->failed_conns - srv->retries
2008-03-04 06:11:17 +01:00
Ryan Warnick
6d0b1fac23 [BUG] appsession lookup in URL does not work
We've been trying to use the latest release (1.3.14.2) of haproxy  to do
sticky sessions.  Cookie insertion is not an option for us, although we
would much rather use it, as we are trying to work around a problem where
cookies are unreliable.  The appsession functionality only partially worked
(it wouldn't read the session id out of a query string) until we made the
following code change to the get_srv_from_appsession function in
proto_http.c.
2008-02-17 11:24:35 +01:00
Willy Tarreau
50fd1e1e3b [BUG] failed conns were sometimes incremented in the frontend! 2008-02-15 10:09:15 +01:00
Willy Tarreau
e69eada057 [OPTIM] used unsigned ints for HTTP state and message offsets
State and offsets within http_msg were incorrectly set to signed int.
Turning them into unsigned slightly improved performance while reducing
code size.
2008-02-14 23:14:30 +01:00
Willy Tarreau
21d2af3e9f Revert "[BUILD] backend.c and checks.c did not build without tproxy !"
This reverts commit 3c3c0122f8.
This commit was buggy as it also removed previous tproxy changes !
2008-02-14 20:25:24 +01:00
Willy Tarreau
3c3c0122f8 [BUILD] backend.c and checks.c did not build without tproxy !
missing #ifdefs.
2008-02-13 22:22:56 +01:00
Willy Tarreau
9c33612f53 [MEDIUM] completely implement the server redirection method
Now when a server has "redir <prefix>" on its config line, any HEAD or GET
request addressing it will lead to a 302 with Location set to "<prefix>"
immediately followed by the relative URI of the incoming request. This makes
it very easy to send redirect to browsers to check remote static servers, as
well as to provide redirection for remote sites when the local one is down.
2008-02-13 00:55:49 +01:00
Krzysztof Piotr Oledzki
f1e1cb463f [BUG]: Restore clearing t->logs.bytes
Commit 8b3977ffe3 removed "t->logs.bytes_in = 0;"
but instead it should change it into "t->logs.bytes_out = 0;" as since
583bc96606 counters are incremented not set.

It should be incremented in session_process_counters while sending data to a
client:
        bytes = s->rep->total - s->logs.bytes_out;
        s->logs.bytes_out = s->rep->total;

However, if we increment (set) s->logs.bytes_out while processing
"logasap", statistics get wrong values added for headers: 0 or even
negative if haproxy adds some headers itself.

To test it, please enable logasap and download one empty file and look at
stats. Without my fix information available on that page are invalid, for
example:

# pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,dreq,dresp,ereq,econ,eresp,wretr,wredis,status,weight,act,bck,chkfail,chkdown,lastchg,downtime,qlimit,pid,iid,sid,throttle,lbtot,
www,b,0,0,0,1,,1,24,-92,,0,,0,0,0,,UP,1,1,0,0,0,3121,0,,1,2,1,,1,
www,BACKEND,0,0,0,1,0,1,24,-92,0,0,,0,0,0,0,UP,1,1,0,,0,3121,0,,1,2,0,,1,
2008-01-22 10:30:26 +01:00
Willy Tarreau
b881608e57 [BUILD] code did not build in full debug mode 2008-01-18 12:18:15 +01:00
Willy Tarreau
8b3977ffe3 [BUG] log response byte count, not request
Due to a shameless copy-paste typo, the number of bytes logged was
from the request and not the response. This bug has been present
for a long time.
2008-01-18 11:16:32 +01:00
Willy Tarreau
00559e7117 [BUG] fix typo in redispatched connection
a copy-paste typo was present in the reconnection code responsible
for respatching. The client's FSM would not be re-evaluated if an
error occurred. It looks harmless but better fix it.
2008-01-06 23:46:19 +01:00
Willy Tarreau
541b5c24ca [MEDIUM] add a turn-around state of one second after a connection failure
Several users have complained that when haproxy gets a connection
failure due to an active reject from a server, it immediately
retries, often leading to the same situation being repeated until
the retry counter reaches zero.

Now if a connection error shows up, a turn-around state of 1 second
is applied before retrying. This is performed by faking a connection
timeout in order not to touch much code. However, a cleaner method
would involve an extra state.
2008-01-06 23:34:21 +01:00
Krzysztof Piotr Oledzki
25b501a6b1 [MEDIUM]: Count retries and redispatches also for servers, fix redistribute_pending, extend logs, %d->%u cleanup
This patch extends a little previously added functionality to also
count retries and redispatches for servers. Now it is possible to know
which server causes redispatches as it is not always the same that takes
most retries.

While working with the code I found that redistribute_pending() does not increment
srv->redispatches && be->redispatches. I don't know how to test it but
I think the fix is correct. If not I can withdraw it.

I also extended logs to show how many retries were done and if redispatching
was necessary ('+'). I'm using an additional session flag SN_REDISP to match
redispatched connections. I had to rearrange all defines in session.h to make
more room for it.

The documentation about logs was also fixed a little (sorry, english only),
as current version uses totally different format. BTW: examples are still
outdated, maybe next time...

Finally, I changed %d -> %u for retries/redispatches as those variables
are declared as unsigned.
2008-01-06 16:43:05 +01:00
Willy Tarreau
98937b8757 [BUG] increment server connections for each connect()
It was abnormal to see more connect errors than connect attempts.
This was caused by the fact that the server's connection count was
not incremented for failed connect() attempts.

Now the per-server connections are correctly incremented for each
connect() attempt. This includes the retries too. The number of
connections effectively served by a server will then be :

   srv->cum_sess - srv->errors - srv->warnings
2008-01-06 15:43:38 +01:00
Willy Tarreau
036fae0ec9 [MEDIUM] introduce "timeout http-request" in frontends
In order to offer DoS protection, it may be required to lower the maximum
accepted time to receive a complete HTTP request without affecting the client
timeout. This helps protecting against established connections on which
nothing is sent. The client timeout cannot offer a good protection against
this abuse because it is an inactivity timeout, which means that if the
attacker sends one character every now and then, the timeout will not
trigger. With the HTTP request timeout, no matter what speed the client
types, the request will be aborted if it does not complete in time.
2008-01-06 13:24:40 +01:00
Krzysztof Oledzki
336d475d13 [MEDIUM]: Inversion for options
This patch adds a possibility to invert most of available options by
introducing the "no" keyword, available as an additional prefix.
If it is found arguments are shifted left and an additional flag (inv)
is set.

It allows to use all options from a current defaults section, except
the selected ones, for example:

-- cut here --
defaults
        contimeout      4200
        clitimeout      50000
        srvtimeout      40000
        option contstats

listen stats 1.2.3.4:80
	no option contstats
-- cut here --

Currenly inversion works only with the "option" keyword.

The patch also moves last_checks calculation at the end of the readcfgfile()
function and changes "PR_O_FORCE_CLO | PR_O_HTTP_CLOSE" into "PR_O_FORCE_CLO"
in cfg_opts so it is possible to invert forceclose without breaking httpclose
(and vice versa) and to invert tcpsplice in one proxy but to keep a proper
last_checks value when tcpsplice is used in another proxy. Now, the code
checks for PR_O_FORCE_CLO everywhere it checks for PR_O_HTTP_CLOSE.

I also decided to depreciate "redisp" and "redispatch" keywords as it is IMHO
better to use "option redispatch" which can be inverted.

Some useful documentation were added and at the same time I sorted
(alfabetically) all valid options both in the code and the documentation.
2007-12-27 11:52:06 +01:00
Willy Tarreau
d7c30f9a8c [CLEANUP] grouped all timeouts in one structure
All known timeouts in a proxy have been grouped into a
"timeout" sub-structure.
2007-12-03 01:38:36 +01:00
Willy Tarreau
1fa3126ec4 [MEDIUM] introduce separation between contimeout, and tarpit + queue
Now the connect timeout, tarpit timeout and queue timeout are
distinct. In order to retain compatibility with older versions,
if either queue or tarpit is left unset both in the proxy and
in the default proxy, then it is inherited from the connect
timeout as before.
2007-12-03 00:36:16 +01:00
Willy Tarreau
b80c230f41 [MEDIUM] add the "fail" condition to monitor requests
Under certain circumstances, it is very useful to be able to fail some
monitor requests. One specific case is when the number of servers in
the backend falls below a certain level. The new "monitor fail" construct
followed by either "if"/"unless" <condition> makes it possible to specify
ACL-based conditions which will make the monitor return 503 instead of
200. Any number of conditions can be passed. Another use may be to limit
the requests to local networks only.
2007-11-30 20:51:32 +01:00
Alexandre Cassen
5eb1a9033a [MEDIUM] New option http_proxy
Hello,

You will find attached an updated release of previously submitted patch.
It polish some part and extend ACL engine to match IP and PORT parsed in
HTTP request. (and take care of comments made by Willy ! ;))

Best regards,
Alexandre
2007-11-29 15:43:32 +01:00
Krzysztof Piotr Oledzki
583bc96606 [MEDIUM] continous statistics
By default, counters used for statistics calculation are incremented
only when a session finishes. It works quite well when serving small
objects, but with big ones (for example large images or archives) or
with A/V streaming, a graph generated from haproxy counters looks like
a hedgehog.

This patch implements a contstats (continous statistics) option.
When set counters get incremented continuously, during a whole session.
Recounting touches a hotpath directly so it is not enabled by default,
as it has small performance impact (~0.5%).
2007-11-26 20:21:47 +01:00
Willy Tarreau
5df518788d [BUG] fix missing parenthesis in check_response_for_cacheability
Parenthesis were missed when code was moved to this function.
This results in non-cacheable transactions not being ignored.
2007-11-26 20:16:53 +01:00
Willy Tarreau
396d2c6782 [MINOR] avoid calling some layer7 functions if not needed
Small optimization: in some cases, it's not interesting to call
functions which are dedicated to checking the cache headers or
cookies. Avoid calling them when not necessary.
2007-11-04 22:41:49 +01:00
Willy Tarreau
fe94460d53 [BUG] fix calls to localtime()
localtime() was called with pointers to tv_sec, which is time_t on
some platforms and long on others. A problem was encountered on
Sparc64 under OpenBSD where tv_sec is long (64 bits) and time_t is
32 bits. Since this architecture is big-endian, it exhibited the
bug because localtime() always worked with the high part of the
value which is always zero. This problem was identified and debugged
by Thierry Fournier.

The correct solution is to pass the date by value and not by pointer,
through an intermediate function. The use of localtime_r() instead of
localtime() also made it possible to get rid of the first call to
localtime() since it does not need to allocate memory anymore.
2007-10-25 10:34:16 +02:00
Krzysztof Oledzki
1cf36ba3ae [MEDIUM] stats: count server retries and redispatches
It is important to know how your installation performs. Haproxy masks
connection errors, which is extremely good for a client but it is bad for
an administrator (except people believing that "ignorance is a bless").

Attached patch adds retries and redispatches counters, so now haproxy:

1. For server:
 - counts retried connections (masked or not)

2. For backends:
 - counts retried connections (masked or not) that happened to
    a slave server
 - counts redispatched connections
 - does not count successfully redispatched connections as backend errors.
    Errors are increased only when client does not get a valid response,
    in other words: with failed redispatch or when this function is not
    enabled.

3. For statistics:
 - display Retr (retries) and Redis (redispatches) as a "Warning"
   information.
2007-10-18 19:12:30 +02:00
Willy Tarreau
55bb8450c0 [MEDIUM] implement the CSV output for the statistics
It is now possible to get CSV ouput from the statistics by
simply appending ";csv" to the HTTP request sent to get the
stats. The fields keep the same ordering as in the HTML page,
and a field "pxname" has been prepended at the beginning of
the line.
2007-10-18 14:12:28 +02:00
Willy Tarreau
9186126e1c [MEDIUM] moved stats and buffer generic functions to new files
Neither the primitives used to write data to a buffer, nor the stats
dump functions are HTTP-specific anymore. Move them to dedicated
files
2007-10-18 14:12:21 +02:00
Krzysztof Oledzki
d9db9274fe [MINOR] report haproxy's version by default on the stats page
For people who manage many haproxies, it is sometimes convenient
to be informed of their version. This patch adds this, with the
option to disable this report by specifying "stats hide-version".

Also, the feature may be permanently disabled by setting the
STATS_VERSION_STRING to "" (empty string), or the format can
simply be adjusted.
2007-10-15 10:05:11 +02:00
Krzysztof Oledzki
9198ab5e7c [MEDIUM] do not add a cache-control: header when on non-cacheable responses
I noticed that haproxy, with "cookie (...) nocache" option, always adds
"Cache-control: private" at the end of a header list received from this
server:

Cache-Control: no-cache
(...)
Set-Cookie: SERVERID=s6; path=/
Cache-control: private

or:

Set-Cookie: ASPSESSIONIDCSRCTSSB=HCCBGGACGBHDHMMKIOILPHNG; path=/
Cache-control: private
Set-Cookie: SERVERID=s5; path=/
Cache-control: private

It may be just redundant (two "Cache-control: private"), but sometimes it
may be quite confused as we may end with two different, more and less
restricted directions (no-cache & private) and even quite conflicting
directions (eg. public & private):

So, I added and rearranged a code, so now haproxy adds a "Cache-control:
private" header only when there is no the same (private) or more
restrictive (no-cache) one. It was done in three steps:

1. Use check_response_for_cacheability to check if response is
not cacheable. I simply moved this call before http_header_add_tail2.

2. Use TX_CACHEABLE (not TX_CACHE_COOK - apache <= 1.3.26) to check if we
need to add a Cache-control header. If we add it, clear TX_CACHEABLE and
TX_CACHE_COOK.

3. Check cacheability not only with PR_O_CHK_CACHE but also with
PR_O_COOK_NOC, so:

-                           unlikely(t->be->options & PR_O_CHK_CACHE))
+                           (t->be->options & (PR_O_CHK_CACHE|PR_O_COOK_NOC)))
                                txn->flags |= TX_CACHEABLE | TX_CACHE_COOK;

I removed this unlikely since I believe that now it is not so unlikely.

The patch is definitely not perfect, proxy should probably also remove
"Cache-control: public". Unfortunately, I do not know the code good enough
to do in myself, yet. ;)

Anyway, I think that even now, it should be very useful.
2007-10-15 09:33:02 +02:00
Willy Tarreau
6e4261ee2f [MAJOR] timeouts and retries could be ignored when switching backend
When switching from a frontend to a backend, the "retries" parameter
was not kept, resulting in the impossibility to reconnect after the
first connection failure. This problem was reported and analyzed by
Krzysztof Oledzki.

While fixing the code, it appeared that some of the backend's timeouts
were not updated in the session when using "use_backend" or "default_backend".
It seems this had no impact but just in case, it's better to set them as
they should have been.
2007-10-15 09:32:19 +02:00
Willy Tarreau
51041c737c [MAJOR] remove files distributed under an obscure license
src/chtbl.c, src/hashpjw.c and src/list.c are distributed under
an obscure license. While Aleks and I believe that this license
is OK for haproxy, other people think it is not compatible with
the GPL.

Whether it is or not is not the problem. The fact that it rises
a doubt is sufficient for this problem to be addressed. Arnaud
Cornet rewrote the unclear parts with clean GPLv2 and LGPL code.
The hash algorithm has changed too and the code has been slightly
simplified in the process. A lot of care has been taken in order
to respect the original API as much as possible, including the
LGPL for the exportable parts.

The new code has not been thoroughly tested but it looks OK now.
2007-09-09 21:56:53 +02:00
Willy Tarreau
e7150cdcfa [MEDIUM] stats page: added links for 'refresh' and 'hide down'
The stats page now supports an option to hide servers which are DOWN
and to enable/disable automatic refresh. It is also possible to ask
for an immediate refresh.
2007-09-09 21:09:29 +02:00
Willy Tarreau
bbd42123e1 [MINOR] add support for "stats refresh <interval>"
Sometimes it may be desirable to automatically refresh the
stats page. Most browsers support the "Refresh:" header with
an interval in seconds. Specifying "stats refresh xxx" will
automatically add this header.
2007-09-09 21:09:28 +02:00
Willy Tarreau
4b946c8564 [MINOR] fix backend's weight in the stats page.
The GCD used when computing the servers' weights causes the total
weight of the backend to appear lower than expected because it is
divided by the GCD. Easy solution consists in recomputing the GCD
from the first server and apply it to the global weight.
2007-09-09 21:09:28 +02:00
Willy Tarreau
5af3a694f5 [MEDIUM] improve behaviour with large number of servers per proxy
When a very large number of servers is configured (thousands),
shutting down many of them at once could lead to large number
of calls to recalc_server_map() which already takes some time.
This would result in an O(N^3) computation time, leading to
noticeable pauses on slow embedded CPUs on test platforms.

Instead, mark the map as dirty and recalc it only when needed.
2007-09-09 21:09:28 +02:00
Willy Tarreau
8f8e645066 [CLEANUP] shut warnings 'is*' macros from ctype.h on solaris
Solaris visibly uses an array for is*, which returns warnings
about the use of signed chars as indexes. Good opportunity to
put casts everywhere.
2007-06-17 21:51:38 +02:00
Willy Tarreau
55ea7579d7 [MAJOR] added the 'use_backend' keyword for full content-switching
The new "use_backend" keyword permits full content switching by the
use of ACLs. Its usage is simple :

   use_backend <backend_name> {if|unless} <acl_cond>
2007-06-17 19:56:27 +02:00
Willy Tarreau
c11416f22f [MEDIUM] acl: distinguish between request and response headers
hdr(x) will now still be used for request headers, and shdr(x) for
server headers (response).
2007-06-17 16:58:38 +02:00
Willy Tarreau
c8d7c96b26 [MEDIUM] acl: support '-i' to ignore case when matching
Implemented the "-i" option on ACLs to state that the matching
will have to be performed for all patterns ignoring case. The
usage is :

   acl <aclname> <aclsubject> -i pattern1 ...

If a pattern must begin with "-", either it must not be the first one,
or the "--" option should be specified first.
2007-06-17 08:20:33 +02:00
Willy Tarreau
1ad7c6dd85 [MINOR] acl: permit to return any header when no name specified
Having the ability to match on hdr_xxx in addition to hdr_xxx(yyy)
makes it possible to match any value or to count the headers easily.
2007-06-10 21:42:55 +02:00
Willy Tarreau
737b0c12a6 [MEDIUM] acl: support maching on 'path' component
'path', 'path_reg', 'path_beg', 'path_end', 'path_sub', 'path_dir'
and 'path_dom' have been implemented to process the path component
of the URI. It starts after the host part, and stops before the
question mark.
2007-06-10 21:28:46 +02:00
Willy Tarreau
33a7e6901f [MEDIUM] acl: implement matching on header values
hdr(x), hdr_reg(x), hdr_beg(x), hdr_end(x), hdr_sub(x), hdr_dir(x),
hdr_dom(x), hdr_cnt(x) and hdr_val(x) have been implemented. They
apply to any of the possibly multiple values of header <x>.

Right now, hdr_val() is limited to integer matching, but it should
reasonably be upgraded to match long long ints.
2007-06-10 19:45:56 +02:00
Willy Tarreau
97be145991 [MINOR] acl: provide a reference to the expr to fetch()
The fetch() functions may need to access the full expr to get
their args. Turn the void *arg into a struct acl_expr *expr.
2007-06-10 11:47:14 +02:00
Willy Tarreau
d41f8d85e8 [MINOR] acl: specify the direction during fetches
Some fetches such as 'line' or 'hdr' need to know the direction of
the test (request or response). A new 'dir' parameter is now
propagated from the caller to achieve this.
2007-06-10 10:06:18 +02:00