81 Commits

Author SHA1 Message Date
Willy Tarreau
babd05a6c6 MEDIUM: fd: add fd_poll_{recv,send} for use when explicit polling is required
The old EV_FD_SET() macro was confusing, as it would enable receipt but there
was no way to indicate that EAGAIN was received, hence the recently added
FD_WAIT_* flags. They're not enough as we're still facing a conflict between
EV_FD_* and FD_WAIT_*. So let's offer I/O functions what they need to explicitly
request polling.
2012-09-02 21:53:11 +02:00
Willy Tarreau
3788e4c874 MEDIUM: fd: remove the EV_FD_COND_* primitives
These primitives were initially introduced so that callers were able to
conditionally set/disable polling on a file descriptor and check in return
what the state was. It's been long since we last had an "if" on this, and
all pollers' functions were the same for cond_* and their systematic
counterparts, except that this required a check and a specific return
value that are not always necessary.

So let's simplify the FD API by removing this now unused distinction and
by making all specific functions return void.
2012-09-02 21:53:10 +02:00
Willy Tarreau
076be25ab8 CLEANUP: remove the now unused fdtab direct I/O callbacks
They were all left to NULL since last commit so we can safely remove them
all now and remove the temporary dual polling logic in pollers.
2012-09-02 21:51:29 +02:00
Willy Tarreau
9845e75d23 MEDIUM: polling: prepare to call the iocb() function when defined.
We will need this to centralize I/O callbacks. Nobody sets it right
now so the code should have no impact.
2012-09-02 21:51:27 +02:00
Willy Tarreau
db3b32610f REORG/MEDIUM: fd: remove FD_STCLOSE from struct fdtab
In an attempt to get rid of fdtab[].state, and to move the relevant
parts to the connection struct, we remove the FD_STCLOSE state which
can easily be deduced from the <owner> pointer as there is a 1:1 match.
2012-09-02 21:51:25 +02:00
Willy Tarreau
491c498d97 BUG/MINOR: polling: some events were not set in various pollers
fdtab[].ev was only set in ev_sepoll. Unfortunately, some I/O handling
functions now rely on this, so depending on the polling mechanism, some
useless operations might have been performed, such as calling recv()
after a HUP was reported.

This is a very old issue: the flags were only added to the fdtab and not
propagated into any poller. Then they were used in ev_sepoll which needed
them for the cache. It is unclear whether a backport to 1.4 is appropriate
or not.
2012-07-31 07:55:31 +02:00
Willy Tarreau
45a1251515 [MEDIUM] poll: add a measurement of idle vs work time
We now measure the work and idle times in order to report the idle
time in the stats. It's expected that we'll be able to use it at
other places later.
2011-09-10 18:01:41 +02:00
Willy Tarreau
43d8fb2d3a [REORG] build: move syscall redefinition to specific places
Some older libc don't define splice() and don't define _syscall*()
either, which causes build errors if splicing is enabled.

To solve this, we now split the syscall redefinition into two layers :
  - one file per syscall (epoll, splice)
  - one common file to declare the _syscall*() macros

The code is cleaner because files using the syscalls just have to include
their respective file. It's not advised to merge multiple syscall families
into the same file if all are not intended to be used simultaneously, because
defining unused static functions causes warnings to be emitted during build.

As a result, the new USE_MY_SPLICE parameter was added in order to be able
to define the splice() syscall separately.
2011-08-23 00:11:25 +02:00
Willy Tarreau
d79e79b436 [BUG] O(1) pollers should check their FD before closing it
epoll, sepoll and kqueue pollers should check that their fd is not
closed before attempting to close it, otherwise we can end up with
multiple closes of fd #0 upon exit, which is harmless but dirty.
2009-05-10 10:18:54 +02:00
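The guard described in this entry can be sketched as follows. This is a minimal illustration, assuming the poller keeps its descriptor in an integer where -1 means "not open"; `sketch_poller_term` is a hypothetical name and not the function used in the pollers.

```c
#include <assert.h>
#include <unistd.h>

/* Close a poller descriptor at most once (hypothetical helper).
 * *fd is expected to be -1 when the poller holds no descriptor.
 * Returns 1 if close() was actually issued, 0 if there was nothing to do. */
static int sketch_poller_term(int *fd)
{
    assert(fd != NULL);
    if (*fd < 0)
        return 0;       /* never opened or already closed: skip close() */
    close(*fd);
    *fd = -1;           /* remember that it is closed */
    return 1;
}
```

Calling the helper twice on the same variable issues a single close(), so an uninitialised or already released poller can no longer close an unrelated descriptor such as fd #0.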
Willy Tarreau
332740dab2 [MEDIUM] pollers: don't wait if a signal is pending
If an asynchronous signal is received outside of the poller, we don't
want the poller to wait for a timeout to occur before processing it,
so we set its timeout to zero, just like we do with pending tasks in
the run queue.
2009-05-10 09:57:21 +02:00
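As a rough sketch of the rule in this entry, the poller's wait time can be forced to zero whenever a signal is pending or the run queue is not empty. The function and parameter names below are hypothetical and only illustrate the decision, not the actual poller code.

```c
#include <assert.h>

/* Decide how long the poller may sleep, in milliseconds.
 * next_timer_ms:  delay until the next timer expires (assumed input)
 * run_queue_len:  number of tasks already waiting to run (assumed input)
 * signal_pending: non-zero if a signal was caught outside the poller */
static int sketch_wait_time(int next_timer_ms, int run_queue_len, int signal_pending)
{
    assert(next_timer_ms >= 0 && run_queue_len >= 0);
    /* Work is already waiting: still poll for events, but do not sleep. */
    if (run_queue_len > 0 || signal_pending)
        return 0;
    return next_timer_ms;
}
```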
Willy Tarreau
a534fea478 [CLEANUP] remove 65 useless NULL checks before free
The C specification clearly states that free(NULL) is a no-op,
so remove the useless checks before calling free().
2008-08-03 20:48:50 +02:00
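A short illustration of the pattern this cleanup relies on. The structure and function names are made up for the example; the only fact used is that free(NULL) does nothing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical structure owning two optional heap strings. */
struct sketch_conf {
    char *name;
    char *path;
};

/* Release both members. No "if (c->name)" test is needed before
 * free(), since free(NULL) is defined to be a no-op. */
static void sketch_conf_release(struct sketch_conf *c)
{
    assert(c != NULL);
    free(c->name);
    free(c->path);
    c->name = NULL;
    c->path = NULL;
}
```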
Willy Tarreau
ec6c5df018 [CLEANUP] remove many #include <types/xxx> from C files
It should be stated as a rule that a C file should never
include types/xxx.h when proto/xxx.h exists, as it gives
less exposure to declaration conflicts (one of which was
caught and fixed here) and it complicates the file headers
for nothing.

Only types/global.h, types/capture.h and types/polling.h
have been found to be valid includes from C files.
2008-07-16 10:30:42 +02:00
Willy Tarreau
0c303eec87 [MAJOR] convert all expiration timers from timeval to ticks
This is the first attempt at moving all internal parts from
using struct timeval to integer ticks. Ticks provide simpler
and faster code due to simplified operations, and this change
also saved about 64 bytes per session.

A new header file has been added : include/common/ticks.h.

It is possible that some functions should finally not be inlined
because they're used quite a lot (eg: tick_first, tick_add_ifset
and tick_is_expired). More measurements are required in order to
decide whether this is interesting or not.

Some function and variable names are still subject to change for
better overall logic.
2008-07-07 00:09:58 +02:00
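To show why integer ticks make timer operations cheap, here is a minimal sketch of wrapping millisecond ticks. It does not reproduce include/common/ticks.h: the `sketch_` names, the choice of 0 as the "not set" sentinel and the half-range limit on timeouts are all assumptions made for the example.

```c
#include <assert.h>

#define SKETCH_TICK_ETERNITY 0u   /* assumed sentinel meaning "no timer set" */

static int sketch_tick_isset(unsigned int t)
{
    return t != SKETCH_TICK_ETERNITY;
}

/* Return now + timeout_ms. Unsigned arithmetic wraps around; if the
 * sum lands exactly on the sentinel, move it one tick later. */
static unsigned int sketch_tick_add(unsigned int now, unsigned int timeout_ms)
{
    unsigned int t;

    assert(timeout_ms <= 0x7fffffffu);  /* must fit in half the tick range */
    t = now + timeout_ms;
    return t == SKETCH_TICK_ETERNITY ? t + 1 : t;
}

/* A timer is expired when it is set and not later than now. The
 * difference is read as a signed value so that the comparison still
 * works across the wrap (relies on the usual two's complement
 * conversion done by common compilers). */
static int sketch_tick_is_expired(unsigned int timer, unsigned int now)
{
    if (!sketch_tick_isset(timer))
        return 0;
    return (int)(timer - now) <= 0;
}
```

Each operation is a single addition or subtraction followed by a comparison, compared with the two-field handling that struct timeval requires.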
Willy Tarreau
b0b37bcd65 [MEDIUM] further improve monotonic clock by checking forward jumps
The first implementation of the monotonic clock did not verify
forward jumps. The consequence is that a fast changing time may
expire a lot of tasks. While it does seem minor, in fact it is
problematic because most machines which boot with a wrong date
are in the past and suddenly see their time jump by several
years in the future.

The solution is to check if we spent more apparent time in
a poller than allowed (with a margin applied). The margin
is currently set to 1000 ms. It should be large enough for
any poll() to complete.

Tests with randomly jumping clock show that the result is quite
accurate (error less than 1 second at every change of more than
one second).
2008-06-23 14:00:57 +02:00
Willy Tarreau
b7f694f20e [MEDIUM] implement a monotonic internal clock
If the system date is set backwards while haproxy is running,
some scheduled events are delayed by the amount of time the
clock went backwards. This is particularly problematic on
systems where the date is set at boot, because it seldom
happens that health-checks do not get sent for a few hours.

Before switching to use clock_gettime() on systems which
provide it, we can at least ensure that the clock is not
going backwards and maintain two clocks : the "date" which
represents what the user wants to see (mostly for logs),
and an internal date stored in "now", used for scheduled
events.
2008-06-22 17:18:02 +02:00
Willy Tarreau
3a6281199a [BUG] event pollers must not wait if a task exists in the run queue
Under some circumstances, a task may already lie in the run queue
(eg: inter-task wakeup). It is disastrous to wait for an event in
this case because some processing gets delayed.
2008-06-20 15:05:56 +02:00
Willy Tarreau
70bcfb77a7 [OPTIM] GCC4's builtin_expect() is suboptimal
GCC4 is stupid (unbelievable news!).

When some code uses __builtin_expect(x != 0, 1), it really performs
the check of x != 0 then tests that the result is not zero! This is
a double check when only one was expected. Some performance drops
of 10% in the HTTP parser code have been observed due to this bug.

GCC 3.4 is fine though.

A solution consists in expecting that the tested value is 1. In
this case, it emits the correct code, but it's still not optimal
it seems. Finally the best solution is to ignore likely() and to
pray for the compiler to emit correct code. However, we still have
to fix unlikely() to remove the test there too, and to fix all
code which passed pointers there to pass integers instead.
2008-02-14 23:14:33 +01:00
Willy Tarreau
1db37710dc [MEDIUM] limit the number of events returned by *poll*
By default, epoll/kqueue used to return as many events as possible.
This could sometimes cause huge latencies (latencies of up to 400 ms
have been observed with many thousands of fds at once). Limiting the
number of events returned also reduces the latency by avoiding too
much blind processing. The value is set to 200 by default and can be
changed in the global section using the tune.maxpollevents parameter.
2007-06-03 17:16:49 +02:00
Willy Tarreau
fb8983f21b [BUG] the epoll FD must not be shared between processes
Recreate the epoll file descriptor after a fork(). This ensures
that processes do not share their epoll_fd. Some side effects
were encountered because of this, such as epoll_wait() returning an
FD which was previously deleted, in multi-process mode.
2007-06-03 16:40:44 +02:00
Willy Tarreau
bdefc513a0 [BUG] fix null timeouts in *poll-based pollers
Introduction of timeval timers broke *poll-based pollers, because the call to
tv_ms_remain may return 0 while the event is not elapsed yet. Now we carefully
check for those cases and round the result up by 1 ms.
2007-05-14 02:02:04 +02:00
Willy Tarreau
d825eef9c5 [MAJOR] replaced all timeouts with struct timeval
The timeout functions were difficult to manipulate because they were
rounding results to the millisecond. Thus, it was difficult to compare
and to check what expired and what did not. Also, the comparison
functions were heavy with multiplies and divides by 1000. Now, all
timeouts are stored in timevals, reducing the number of operations
for updates and leading to cleaner and more efficient code.
2007-05-12 22:35:00 +02:00
Willy Tarreau
ef1d1f859b [MAJOR] auto-registering of pollers at load time
Gcc provides __attribute__((constructor)) which is very convenient
to execute functions at startup right before main(). All the pollers
have been converted to have their register() function declared like
this, so that it is not necessary anymore to call them from a centralized
file.
2007-04-16 00:25:25 +02:00
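A minimal sketch of constructor-based registration, assuming a GCC-compatible compiler. The registry layout, the names and the preference values are illustrative and are not taken from the pollers themselves.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_POLLERS 8

/* Hypothetical registry entry: a name and a preference score. */
struct sketch_poller {
    const char *name;
    int pref;
};

static struct sketch_poller sketch_pollers[SKETCH_MAX_POLLERS];
static int sketch_nb_pollers;

static void sketch_register(const char *name, int pref)
{
    assert(sketch_nb_pollers < SKETCH_MAX_POLLERS);
    sketch_pollers[sketch_nb_pollers].name = name;
    sketch_pollers[sketch_nb_pollers].pref = pref;
    sketch_nb_pollers++;
}

/* Each poller announces itself; these functions run before main(). */
__attribute__((constructor))
static void sketch_select_register(void)
{
    sketch_register("select", 150);
}

__attribute__((constructor))
static void sketch_poll_register(void)
{
    sketch_register("poll", 200);
}

/* Pick the registered poller with the highest preference. */
static const struct sketch_poller *sketch_best_poller(void)
{
    const struct sketch_poller *best = NULL;
    int i;

    for (i = 0; i < sketch_nb_pollers; i++)
        if (best == NULL || sketch_pollers[i].pref > best->pref)
            best = &sketch_pollers[i];
    return best;
}
```

Adding a poller then only requires linking its file; no central list has to be edited.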
Willy Tarreau
b40d42006c [BUILD] declare epoll_* as static when using our own functions
We will have to share this code among several implementations.
2007-04-15 23:57:41 +02:00
Willy Tarreau
58094f2fd9 [MAJOR] ev_epoll: do not rely on fd_sets anymore
The new epoll-based poller uses a list of changes in order to
process only the fds which have changed.
2007-04-10 01:43:43 +02:00
Willy Tarreau
2ff7622c0c [MAJOR] delay registering of listener sockets at startup
Some pollers such as kqueue lose their FD across fork(), meaning that
the registered file descriptors are lost too. Now when the proxies are
started by start_proxies(), the file descriptors are not registered yet,
leaving enough time for the fork() to take place and to get a new pollfd.
It will be the first call to maintain_proxies that will register them.
2007-04-09 19:29:56 +02:00
Willy Tarreau
63455a9be5 [MINOR] use 'is_set' instead of 'isset' in struct poller
'isset' was defined as a macro in /usr/include/sys/param.h, and
it breaks build on at least OpenBSD.
2007-04-09 15:34:49 +02:00
Willy Tarreau
69801b8e77 [MINOR] removed proto/polling.h which was not used anymore
2007-04-09 15:28:51 +02:00
Willy Tarreau
e54e9176a3 [MINOR] ev_* : moved the poll function closer to fd_*
2007-04-09 09:23:31 +02:00
Willy Tarreau
97129b5408 [MINOR] changed fd_set*/fd_clr* functions to return ints
The fd_* functions now return ints so that they can be
factored when appropriate.
2007-04-09 00:54:46 +02:00
Willy Tarreau
28d86862bc [MEDIUM] pollers: store the events in arrays
Instead of managing StaticReadEvent/StaticWriteEvent, use evts[dir]
2007-04-08 17:42:27 +02:00
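The idea of indexing event sets by direction can be sketched like this. The names and the use of a static fd_set array are assumptions for illustration; the message above only states that evts[dir] replaces the two separate read and write sets.

```c
#include <assert.h>
#include <sys/select.h>

/* Hypothetical direction indexes. */
enum { SKETCH_DIR_RD = 0, SKETCH_DIR_WR = 1, SKETCH_DIR_SIZE = 2 };

/* One set per direction instead of two separately named variables. */
static fd_set sketch_evts[SKETCH_DIR_SIZE];

static void sketch_evts_init(void)
{
    int dir;

    for (dir = 0; dir < SKETCH_DIR_SIZE; dir++)
        FD_ZERO(&sketch_evts[dir]);
}

static void sketch_fd_set(int fd, int dir)
{
    assert(fd >= 0 && fd < FD_SETSIZE && dir >= 0 && dir < SKETCH_DIR_SIZE);
    FD_SET(fd, &sketch_evts[dir]);
}

static void sketch_fd_clr(int fd, int dir)
{
    assert(fd >= 0 && fd < FD_SETSIZE && dir >= 0 && dir < SKETCH_DIR_SIZE);
    FD_CLR(fd, &sketch_evts[dir]);
}

static int sketch_fd_isset(int fd, int dir)
{
    assert(fd >= 0 && fd < FD_SETSIZE && dir >= 0 && dir < SKETCH_DIR_SIZE);
    return FD_ISSET(fd, &sketch_evts[dir]) != 0;
}
```

Because the direction is a parameter, one helper serves both reads and writes instead of a duplicated pair of functions.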
Willy Tarreau
4f60f16dd3 [MAJOR] modularize the polling mechanisms
select, poll and epoll now have their dedicated functions and have
been split into distinct files. Several FD manipulation primitives
have been provided with each poller.

The rest of the code needs to be cleaned to remove traces of
StaticReadEvent/StaticWriteEvent. A trick involving a macro has
temporarily been used right now. Some work needs to be done to
factorize tests and sets everywhere.
2007-04-08 16:39:58 +02:00