6665 Commits

Author SHA1 Message Date
Willy Tarreau
3b88d441e9 [MINOR] switch all stat counters to 64-bit
The byte counters have long been 64-bit to avoid overflows. But with
several sites nowadays, we see session counters wrap around every 10-days
or so. So it was the moment to switch counters to 64-bit, including
error and warning counters which can theorically rise as fast as session
counters even if in practice there is very low risk.

The performance impact should not be noticeable since those counters are
only updated once per session. The stats output have been carefully checked
for proper types on both 32- and 64-bit platforms.
2009-04-11 20:44:08 +02:00
Willy Tarreau
5e4a6f13f4 [MINOR] fix a few remaining printf-like formats on 64-bit platforms
Mainly two sizeof() returning size_t which is not the size of an int
on 64-bit platforms.
2009-04-11 19:42:49 +02:00
Willy Tarreau
0bba5a8f6c [BUG] stats: total and lbtot are unsigned
Some big users are seeing negative numbers in the CSV stats. This patch
needs to be backported to 1.3.15 and extended to the HTML part.
2009-04-07 13:27:40 +02:00
Jeffrey 'jf' Lim
af403fc59d [CLEANUP] give a little bit more information in error message
Indicate the error is about redirection.
2009-04-03 15:01:48 +02:00
Willy Tarreau
1772ece025 [MINOR] fix several printf formats and missing arguments
Last patch revealed a number of mistakes in printf-like calls, mostly int/long
mismatches, and a few missing arguments.
2009-04-03 14:49:12 +02:00
Willy Tarreau
4076a15255 [MEDIUM] http: capture invalid requests/responses even if accepted
It's useful to be able to accept an invalid header name in a request
or response but still be able to monitor further such errors. Now,
when an invalid request/response is received and accepted due to
an "accept-invalid-http-{request|response}" option, the invalid
request will be captured for later analysis with "show errors" on
the stats socket.
2009-04-02 21:36:37 +02:00
Willy Tarreau
32a4ec0ed7 [MEDIUM] http: add options to ignore invalid header names
Sometimes it is required to let invalid requests pass because
applications sometimes take time to be fixed and other servers
do not care. Thus we provide two new options :

     option accept-invalid-http-request  (for the frontend)
     option accept-invalid-http-response (for the backend)

When those options are set, invalid requests or responses do
not cause a 403/502 error to be generated.
2009-04-02 21:36:34 +02:00
Willy Tarreau
61d188920e [MINOR] improve reporting of misplaced acl/reqxxx rules
Now we can detect improper ordering of "block", "reqxxx", "reqadd",
"redirect" and "use_backend", and warn the user accordingly.
2009-03-31 10:49:21 +02:00
Willy Tarreau
0a6d2efc45 [MINOR] stats/html: group digits by 3 to clarify numbers
Large stats numbers are more readable if there is a small space
between groups of 3 digits.
2009-03-29 14:46:01 +02:00
Willy Tarreau
e7239b5152 [MINOR] implement ulltoh() to write HTML-formatted numbers
This function sets CSS letter spacing after each 3rd digit. The page must
create a class "rls" (right letter spacing) with style "letter-spacing: 0.3em"
in order to use it.
2009-03-29 13:41:58 +02:00
Willy Tarreau
2ab85e6fee [BUG] don't set an expiration date directly from now_ms
now_ms can be zero, don't set ->analyse_exp directly from it, we
must use tick_add() instead.
2009-03-29 10:24:15 +02:00
Willy Tarreau
d06e71179a [BUG] stream_sock: check for shut{r,w} before refreshing some timeouts
Under some circumstances, it appears possible to refresh a timeout
just after a side has been shut. For instance, if poll() plans to
call both read and write, and the read side calls chk_snd() which
in turn causes a shutw to occur, then stream_sock_write could update
its write timeout. The same problem happens the other way.

The timeout checks will then not catch these cases because they
ignore timeouts in case of shut{r,w}.

This is very likely to be the major cause of the 100% CPU usages
reported by Bart Bobrowski.

The fix consists in always ensuring that a side is not shut before
updating its timeout.
2009-03-29 10:18:41 +02:00
Willy Tarreau
c6dcad6e74 [MINOR] show sess: report a lot more information about sessions
For complex troubleshooting, it's sometimes useful to be able to
completely dump all the states and flags related to a session.
Now "show sess" will report the stream interfaces and buffers
status for each session.
2009-03-29 10:09:16 +02:00
Willy Tarreau
6574519c23 [MINOR] sepoll: don't count two events on the same FD.
sepoll counts the number of speculative events it has processed in
order to remain fair with epoll_wait(). If a same FD is processed
both for read and for write, it is counted twice. Fix this.
2009-03-28 23:42:55 +01:00
Willy Tarreau
1714e0ffda [BUG] stream_sock: disable I/O on fds reporting an error
Upon read or write error, we cannot immediately close the FD because
we want to first report the error to the upper layer which will do it
itself. However, we want to prevent any further I/O from being performed
on the FD. This is especially important in case of speculative I/O where
nothing else could stop the FD from still being polled until the upper
layer takes care of the condition.
2009-03-28 23:42:30 +01:00
Willy Tarreau
1eead503da [BUG] don't call epoll_ctl() on closed sockets
Some I/O callbacks are able to close their socket themselves. We
want to check this before calling epoll_ctl(EPOLL_CTL_DEL), otherwise
we get a -1 EBADF. Right now is looks like this could not cause any
trouble but the case is racy enough to fix it.
2009-03-28 19:43:06 +01:00
Willy Tarreau
3884cbaae6 [MINOR] show sess: report number of calls to each task
For debugging purposes, it can be useful to know how many times each
task has been called.
2009-03-28 17:54:35 +01:00
Willy Tarreau
2d045597f7 [BUG] reject unix accepts when connection limit is reached
unix sockets are not attached to a real frontend, so there is
no way to disable/enable the listener depending on the global
session count. For this reason, if the global maxconn is reached
and a unix socket comes in, it will just be ignored and remain
in the poll list, which will call again indefinitely.

So we need to accept then drop incoming unix connections when
the table is full.

This should not happen with clean configurations since the global
maxconn should provide enough room for unix sockets.
2009-03-28 11:02:18 +01:00
Willy Tarreau
127334e89b [BUG] reset the stream_interface connect timeout upon connect or error
The stream_interface timeout was not reset upon a connect success or
error, leading to busy loops when requeuing tasks in the past.

Thanks to Bart Bobrowski for reporting the issue.
2009-03-28 11:01:20 +01:00
Willy Tarreau
573fd806ed [OPTIM] sepoll: do not re-check whole list upon accepts
There is already an optimisation in the speculative poller which
causes newly created FDs to be checked immediately after being
created. Unfortunately, this optimisation causes the whole spec
list to be re-checked while we're only interested in the new FDs.

Doing this minor change causes performance gains of up to 6% on
medium-sized objects with a few hundreds concurrent connections.
2009-03-22 19:25:46 +01:00
Willy Tarreau
b00f9c456c [BUG] check for global.maxconn before doing accept()
If the accept() is done before checking for global.maxconn, we can
accept too many connections and encounter a lack of file descriptors
when trying to connect to the server. This is the cause of the
"cannot get a server socket" message  encountered in debug mode
during injections with low timeouts.
2009-03-21 22:43:12 +01:00
Willy Tarreau
06bea94266 [MEDIUM] session: don't resync FSMs on non-interesting changes
While processing the session, we used to resync the FSMs when buffer
flags changed. But since BF_KERN_SPLICING and BF_READ_DONTWAIT were
introduced, sometimes we could resync after they were set, which is
not what we want. This was because there were some old checks left
which did not mask changes with BF_MASK_STATIC before checking.
2009-03-21 22:09:29 +01:00
Willy Tarreau
1b194fe03e [OPTIM] buffer: new BF_READ_DONTWAIT flag reduces EAGAIN rates
When the reader does not expect to read lots of data, it can
set BF_READ_DONTWAIT on the request buffer. When it is set,
the stream_sock_read callback will not try to perform multiple
reads, it will return after only one, and clear the flag.
That way, we can immediately return when waiting for an HTTP
request without trying to read again.

On pure request/responses schemes such as monitor-uri or
redirects, this has completely eliminated the EAGAIN occurrences
and the epoll_ctl() calls, resulting in a performance increase of
about 10%. Similar effects should be observed once we support
HTTP keep-alive since we'll immediately disable reads once we
get a full request.
2009-03-21 21:57:30 +01:00
Willy Tarreau
6f4a82c7af [OPTIM] stream_sock: don't retry to read after a large read
If we get very large data at once, it's almost certain that it's
worthless trying to read again, because we got everything we could
get.

Doing this has made all -EAGAIN disappear from splice reads. The
threshold has been put in the global tunable structures so that if
we one day want to make it accessible from user config, it will be
easy to do so.
2009-03-21 20:43:57 +01:00
Willy Tarreau
e38388033f [BUG] server check intervals must not be null
If server check interval is null, we might end up looping in
process_srv_chk().

Prevent those values from being zero and add some control in
process_srv_chk() against infinite loops.
2009-03-21 18:58:32 +01:00
Willy Tarreau
c7bdf09f9f [MINOR] stats: report number of tasks (active and running)
It may be useful for statistics purposes to report the number of
tasks.
2009-03-21 18:33:52 +01:00
Willy Tarreau
a461318f97 [MINOR] task: keep a task count and clean up task creators
It's sometimes useful at least for statistics to keep a task count.
It's easy to do by forcing the rare task creators to always use the
same functions to create/destroy a task.
2009-03-21 18:13:21 +01:00
Willy Tarreau
135a113e36 [MINOR] sched: permit a task to stay up between calls
If a task wants to stay in the run queue, it is possible. It just
needs to wake itself up. We just want to ensure that a reniced
task will be processed at the right instant.
2009-03-21 13:26:05 +01:00
Willy Tarreau
26ca34e66e [BUG] scheduler: fix improper handling of duplicates __task_queue()
The top of a duplicate tree is not where bit == -1 but at the most
negative bit. This was causing tasks to be queued in reverse order
within duplicates. While this is not dramatic, it's incorrect and
might lead to longer than expected duplicate depths under some
circumstances.
2009-03-21 12:57:06 +01:00
Willy Tarreau
218859ad6c [BUG] sched: don't leave 3 lasts tasks unprocessed when niced tasks are present
When there are niced tasks, we would only process #tasks/4 per
turn, without taking care of running #tasks when #tasks was below
4, leaving those tasks waiting for a few other tasks to push them.

The fix simply consists in checking (#tasks+3)/4.
2009-03-21 11:53:09 +01:00
Willy Tarreau
e35c94a748 [MEDIUM] scheduler: get rid of the 4 trees thanks and use ebtree v4.1
Since we're now able to search from a precise expiration date in
the timer tree using ebtree 4.1, we don't need to maintain 4 trees
anymore. Not only does this simplify the code a lot, but it also
ensures that we can always look 24 days back and ahead, which
doubles the ability of the previous scheduler. Indeed, while based
on absolute values, the timer tree is now relative to <now> as we
can always search from <now>-31 bits.

The run queue uses the exact same principle now, and is now simpler
and a bit faster to process. With these changes alone, an overall
0.5% performance gain was observed.

Tests were performed on the few wrapping cases and everything works
as expected.
2009-03-21 10:25:14 +01:00
Willy Tarreau
5804434a0f [MINOR] update ebtree to version 4.1
Ebtree version 4.1 brings lookup by ranges. This will be useful for
the scheduler.
2009-03-21 10:23:36 +01:00
Willy Tarreau
8365f9335d [CLEANUP] http: remove some commented out obsolete code in process_response 2009-03-15 23:11:49 +01:00
Willy Tarreau
86ef7dc98d [MINOR] tcp_request: let the caller take care of errors and timeouts
tcp_request is not meant to decide how an error or a timeout has to
be handled. It must just apply it rules. Now that the error checks
have been added to the session, we don't need to check them anymore
in tcp_request_inspect(), which will only consider the shutdown which
may be the result of such an error.

That makes a lot more sense since tcp_request is not really waiting
for a request.
2009-03-15 22:55:47 +01:00
Willy Tarreau
844553303d [BUG] session: errors were not reported in termination flags in TCP mode
In order to get termination flags properly updated, the session was
relying a bit too much on http_return_srv_error() which is http-centric.

A generic srv_error function was implemented in the session in order to
catch all connection abort situations. It was then noticed that a request
abort during a connection attempt was not reported, which is now fixed.

Read and write errors/timeouts were not logged either. It was necessary
to add those tests at 4 new locations.

Now it looks like everything is correctly logged. Most likely some error
checking code could now be removed from some analysers.
2009-03-15 22:34:05 +01:00
Willy Tarreau
a3780f2db8 [BUG] connect timeout is in the stream interface, not the buffer
The connect timeout was not properly detected due to the fact that
it was not correctly initialized. It must be set as the stream interface
timeout, not the buffer's write timeout.
2009-03-15 21:49:00 +01:00
Willy Tarreau
5af24efee9 [CLEANUP] config: catch and report some possibly wrong rule ordering
There are some configurations in which redirect rules are declared
after use_backend rules. We can also find "block" rules after any
of these ones. The processing sequence is :
  - block
  - redirect
  - use_backend

So as of now we try to detect wrong ordering to warn the user about
a possibly undesired behaviour.
2009-03-15 15:23:16 +01:00
Willy Tarreau
55bc0f8eb7 [MEDIUM] reverse internal proxy declaration order to match configuration
People are regularly complaining that proxies are linked in reverse
order when reading the stats. This is now definitely fixed because
the proxy order is now fixed to match configuration order.
2009-03-15 14:51:53 +01:00
Willy Tarreau
d869b24119 [MINOR] tcp-inspect: permit the use of no-delay inspection
Sometimes it may make sense to be able to immediately apply a verdict
without waiting at all. It was not possible because no inspect-delay
meant no inspection at all. This is now fixed.
2009-03-15 14:43:58 +01:00
Willy Tarreau
3cd9af228f [MINOR] cfgparse: set backends to "balance roundrobin" by default
When a backend has no LB algo specified and is not in dispatch, proxy
nor transparent mode, use "balance roundrobin" by default instead of
complaining. This will be particularly useful with stats and redirects.
2009-03-15 14:11:27 +01:00
Willy Tarreau
ff01a21ebe [MINOR] cfgparse: some cleanups in the consistency checks
Check for servers in health mode, for health mode in pure-backends.
Some code have been refactored for better organization.
2009-03-15 13:46:16 +01:00
Willy Tarreau
787bbd9b7a [MINOR] show errors: encode backslash as well as non-ascii characters
These ones were not properly encoded, causing confusion on the output.
2009-03-12 08:18:33 +01:00
Willy Tarreau
c9619468ea [BUG] stream_sock: write timeout must be updated when forwarding !
When data are forwarded between socket, we must update the output
socket's write timeout. This was forgotten, causing sessions to
unexpectedly expire during long posts.
2009-03-09 22:40:57 +01:00
Willy Tarreau
6bf1736fb1 [BUILD] proto_http did not build on gcc-2.95 (again)
move the DPRINTF below the local variable declarations.
(cherry picked from commit 7b92db4cd5c106f5110c871503de11aabd0776eb)

The patch accidently got reverted.
2009-03-08 23:10:34 +01:00
Willy Tarreau
87bed62a92 [BUILD] build fixes for Solaris
One build error in stream_sock.c when MSG_NOSIGNAL is not defined,
and a warning in task.c.
2009-03-08 22:25:28 +01:00
Willy Tarreau
7c84bab879 [MEDIUM] rearrange forwarding condition to enable splice during analysis
The forwarding condition was not very clear. We would only enable
forwarding when send_max is zero, and we would only splice when no
analyser is installed. In fact we want to enable forward when there
is no analyser and we want to splice at soon as there is data to
forward, regardless of the analysers.
2009-03-08 21:38:23 +01:00
Willy Tarreau
6f0aa476bd [CLEANUP] buffer_flush() was misleading, rename it as buffer_erase 2009-03-08 20:33:29 +01:00
Willy Tarreau
ed066fae25 [CLEANUP] don't enable kernel splicing when socket is closed
Splicing will not be used when the source socket is closed. Don't
enable it uselessly.
2009-03-08 19:44:29 +01:00
Willy Tarreau
0be0ef9604 [OPTIM] do not re-check req buffer when only response has changed
In process_session(), we used to re-run through all the evaluation
loop when only the response had changed. Now we carefully check in
this order :
  - changes to the stream interfaces (only SI_ST_DIS)
  - changes to the request buffer flags
  - changes to the response buffer flags

And we branch to the appropriate section. This saves significant
CPU cycles, which is important since process_session() is one of
the major CPU eaters.

The same changes have been applied to uxst_process_session().
2009-03-08 19:20:25 +01:00
Willy Tarreau
531cf0cf8d [OPTIM] task: reduce the number of calls to task_queue()
Most of the time, task_queue() will immediately return. By extracting
the preliminary checks and putting them in an inline function, we can
significantly reduce the number of calls to the function itself, and
most of the tests can be optimized away due to the caller's context.

Another minor improvement in process_runnable_tasks() consisted in
taking benefit from the processor's branch prediction unit by making
a special case of the process_session() callback which is by far the
most common one.

All this improved performance by about 1%, mainly during the call
from process_runnable_tasks().
2009-03-08 16:35:27 +01:00