226 Commits

Author SHA1 Message Date
Willy Tarreau
5d53634f36 [MINOR] unix socket: report the socket path in case of bind error
When an error occurs during binding of the stats unix socket, messages
are far from clear for the user !
2009-10-14 20:37:00 +02:00
Willy Tarreau
65671abd32 [MINOR] remove now obsolete ana_state from the session struct
This one is not used anymore.
2009-10-04 14:24:59 +02:00
Willy Tarreau
f5a885fd28 [MEDIUM] stats: don't use s->ana_state anymore
The stats handler used to store internal states in s->ana_state. Now
we only rely on si->st0 in which we can store as many states as we
have possible outputs. This cleans up the stats code a lot and makes
it more maintainable. It has also reduced code size by a few hundred
bytes.
2009-10-04 14:22:18 +02:00
Willy Tarreau
f27b5ea8dc [MEDIUM] new option "independant-streams" to stop updating read timeout on writes
By default, when data is sent over a socket, both the write timeout and the
read timeout for that socket are refreshed, because we consider that there is
activity on that socket, and we have no other means of guessing if we should
receive data or not.

While this default behaviour is desirable for almost all applications, there
exists a situation where it is desirable to disable it, and only refresh the
read timeout if there are incoming data. This happens on sessions with large
timeouts and low amounts of exchanged data such as telnet session. If the
server suddenly disappears, the output data accumulates in the system's
socket buffers, both timeouts are correctly refreshed, and there is no way
to know the server does not receive them, so we don't timeout. However, when
the underlying protocol always echoes sent data, it would be enough by itself
to detect the issue using the read timeout. Note that this problem does not
happen with more verbose protocols because data won't accumulate long in the
socket buffers.

When this option is set on the frontend, it will disable read timeout updates
on data sent to the client. There probably is little use of this case. When
the option is set on the backend, it will disable read timeout updates on
data sent to the server. Doing so will typically break large HTTP posts from
slow lines, so use it with caution.
2009-10-03 22:01:18 +02:00
Willy Tarreau
9a42c0d771 [MEDIUM] stats: replace the stats socket analyser with an SI applet
We can get rid of the stats analyser by moving all the stats code
to a stream interface applet. Above being cleaner, it provides new
advantages such as the ability to process requests and responses
from the same function and work only with simple state machines.
There's no need for any hijack hack anymore.

The direct advantage for the user are the interactive mode and the
ability to chain several commands delimited by a semi-colon. Now if
the user types "prompt", he gets a prompt from which he can send
as many requests as he wants. All outputs are terminated by a
blank line followed by a new prompt, so this can be used from
external tools too.

The code is not very clean, it needs some rework, but some part
of the dirty parts are due to the remnants of the hijack mode used
in the old functions we call.

The old AN_REQ_STATS_SOCK analyser flag is now unused and has been
removed.
2009-09-23 23:52:17 +02:00
Willy Tarreau
b029f8cd7d [MINOR] stream_interface: add iohandler callback
When stream interfaces will embedded applets running as part as their
holding task, we'll need a new callback to process them from the
session processor.
2009-09-23 23:52:15 +02:00
Willy Tarreau
dc85b39db7 [MEDIUM] stream_interface: add and use ->update function to resync
We used to call stream_sock_data_finish() directly at the end of
a session update, but if we want to support non-socket interfaces,
we need to have this function configurable. Now we access it via
->update().
2009-08-18 07:38:19 +02:00
Willy Tarreau
27a674efb8 [MEDIUM] make it possible to change the buffer size in the configuration
The new tune.bufsize and tune.maxrewrite global directives allow one to
change the buffer size and the maxrewrite size. Right now, setting bufsize
too low will block stats sockets which will not be able to write at all.
An error checking must be added to buffer_write_chunk() so that if it
cannot write its message to an empty buffer, it causes the caller to abort.
2009-08-17 22:56:56 +02:00
Willy Tarreau
a07a34eb24 [MEDIUM] replace BUFSIZE with buf->size in computations
The first step towards dynamic buffer size consists in removing
all static definitions of the buffer size. Instead, we store a
buffer's size in itself. Right now they're all preinitialized
to BUFSIZE, but we will change that.
2009-08-16 23:27:46 +02:00
Willy Tarreau
2c9f5b130f [MINOR] move the initial task's nice value to the listener
Since the listener is the one indicating what analyser and session
handlers to call, it makes sense that it also sets the task's nice
value. This also helps getting rid of the last trace of the stats
in the proto_uxst file.
2009-08-16 19:36:56 +02:00
Willy Tarreau
5ca791da8d [CLEANUP] move remaining stats sockets code to dumpstats
The remains of the stats socket code has nothing to do in proto_uxst
anymore and must move to dumpstats. The code is much cleaner and more
structured. It was also an opportunity to rename AN_REQ_UNIX_STATS
as AN_REQ_STATS_SOCK as the stats socket is no longer unix-specific
either.

The last item refering to stats in proto_uxst is the setting of the
task's nice value which should in fact come from the listener.
2009-08-16 19:35:36 +02:00
Willy Tarreau
8e13d7492d [CLEANUP] unix: remove uxst_process_session()
This one is not used anymore.
2009-08-16 19:34:23 +02:00
Willy Tarreau
104eb36f26 [MEDIUM] make the unix stats sockets use the generic session handler
process_session() is now ready to handle unix stats sockets. This
first step works and old code has not been removed. A cleanup is
required. The stats handler is not unix socket-centric anymore and
should move to dumpstats.c.
2009-08-16 19:33:51 +02:00
Willy Tarreau
89a6313c34 [MEDIUM] make the global stats socket part of a frontend
Creating a frontend for the global stats socket will help merge
unix sockets management with the other socket management. Since
frontends are huge structs, we only allocate it if required.
2009-08-16 19:31:51 +02:00
Willy Tarreau
9650f37628 [MEDIUM] move connection establishment from backend to the SI.
The connection establishment was completely handled by backend.c which
normally just handles LB algos. Since it's purely TCP, it must move to
proto_tcp.c. Also, instead of calling it directly, we now call it via
the stream interface, which will later help us unify session handling.
2009-08-16 17:46:15 +02:00
Willy Tarreau
43e0e39978 [MINOR] print usage on the stats sockets upon invalid commands
When issuing commands on the unix socket, there's no way to
know if the result is empty or if the command is wrong. This
patch makes invalid command return a help message.
2009-07-26 18:16:43 +02:00
Willy Tarreau
3a816293e9 [MEDIUM] session: tell analysers what bit they were called for
Some stream analysers might become generic enough to be called
for several bits. So we cannot have the analyser bit hard coded
into the analyser itself. Let's make the caller inform the callee.
2009-07-07 10:55:49 +02:00
Willy Tarreau
2d045597f7 [BUG] reject unix accepts when connection limit is reached
unix sockets are not attached to a real frontend, so there is
no way to disable/enable the listener depending on the global
session count. For this reason, if the global maxconn is reached
and a unix socket comes in, it will just be ignored and remain
in the poll list, which will call again indefinitely.

So we need to accept then drop incoming unix connections when
the table is full.

This should not happen with clean configurations since the global
maxconn should provide enough room for unix sockets.
2009-03-28 11:02:18 +01:00
Willy Tarreau
b00f9c456c [BUG] check for global.maxconn before doing accept()
If the accept() is done before checking for global.maxconn, we can
accept too many connections and encounter a lack of file descriptors
when trying to connect to the server. This is the cause of the
"cannot get a server socket" message  encountered in debug mode
during injections with low timeouts.
2009-03-21 22:43:12 +01:00
Willy Tarreau
06bea94266 [MEDIUM] session: don't resync FSMs on non-interesting changes
While processing the session, we used to resync the FSMs when buffer
flags changed. But since BF_KERN_SPLICING and BF_READ_DONTWAIT were
introduced, sometimes we could resync after they were set, which is
not what we want. This was because there were some old checks left
which did not mask changes with BF_MASK_STATIC before checking.
2009-03-21 22:09:29 +01:00
Willy Tarreau
1b194fe03e [OPTIM] buffer: new BF_READ_DONTWAIT flag reduces EAGAIN rates
When the reader does not expect to read lots of data, it can
set BF_READ_DONTWAIT on the request buffer. When it is set,
the stream_sock_read callback will not try to perform multiple
reads, it will return after only one, and clear the flag.
That way, we can immediately return when waiting for an HTTP
request without trying to read again.

On pure request/responses schemes such as monitor-uri or
redirects, this has completely eliminated the EAGAIN occurrences
and the epoll_ctl() calls, resulting in a performance increase of
about 10%. Similar effects should be observed once we support
HTTP keep-alive since we'll immediately disable reads once we
get a full request.
2009-03-21 21:57:30 +01:00
Willy Tarreau
a461318f97 [MINOR] task: keep a task count and clean up task creators
It's sometimes useful at least for statistics to keep a task count.
It's easy to do by forcing the rare task creators to always use the
same functions to create/destroy a task.
2009-03-21 18:13:21 +01:00
Willy Tarreau
7c84bab879 [MEDIUM] rearrange forwarding condition to enable splice during analysis
The forwarding condition was not very clear. We would only enable
forwarding when send_max is zero, and we would only splice when no
analyser is installed. In fact we want to enable forward when there
is no analyser and we want to splice at soon as there is data to
forward, regardless of the analysers.
2009-03-08 21:38:23 +01:00
Willy Tarreau
0be0ef9604 [OPTIM] do not re-check req buffer when only response has changed
In process_session(), we used to re-run through all the evaluation
loop when only the response had changed. Now we carefully check in
this order :
  - changes to the stream interfaces (only SI_ST_DIS)
  - changes to the request buffer flags
  - changes to the response buffer flags

And we branch to the appropriate section. This saves significant
CPU cycles, which is important since process_session() is one of
the major CPU eaters.

The same changes have been applied to uxst_process_session().
2009-03-08 19:20:25 +01:00
Willy Tarreau
26c250683f [MEDIUM] minor update to the task api: let the scheduler queue itself
All the tasks callbacks had to requeue the task themselves, and update
a global timeout. This was not convenient at all. Now the API has been
simplified. The tasks callbacks only have to update their expire timer,
and return either a pointer to the task or NULL if the task has been
deleted. The scheduler will take care of requeuing the task at the
proper place in the wait queue.
2009-03-08 09:38:41 +01:00
Willy Tarreau
74808cb907 [MEDIUM] implement error dump on unix socket with "show errors"
The new "show errors" command sent on a unix socket will dump
all captured request and response errors for all proxies. It is
also possible to bound the log to frontends and backends whose
ID is passed as an optional parameter.

The output provides information about frontend, backend, server,
session ID, source address, error type, and error position along
with a complete dump of the request or response which has caused
the error.

If a new error scratches the one currently being reported, then
the dump is aborted with a warning message, and processing goes
on to next error.
2009-03-04 15:53:18 +01:00
Willy Tarreau
38c99bcb98 [BUG] fix unix socket processing of interrupted output
Unix socket processing was still quite buggy. It did not properly
handle interrupted output due to a full response buffer. The fix
mainly consists in not trying to prematurely enable write on the
response buffer, just like the standard session works. This also
gets the unix socket code closer to the standard session code
handling.
2009-02-22 15:58:45 +01:00
Willy Tarreau
fd3828e263 [BUG] fix random memory corruption using "show sess"
Commit 8a5c626e73bac905d150185e45110525588d7b4c introduced the sessions
dump on the unix socket. This implementation is buggy because it may try
to link to the sessions list's head after the last session is removed
with a backref. Also, for the LIST_ISEMPTY test to succeed, we have to
proceed with LIST_INIT after LIST_DEL.
2009-02-22 15:17:24 +01:00
Willy Tarreau
0abebcc0fb [MEDIUM] i/o: rework ->to_forward and ->send_max
The way the buffers and stream interfaces handled ->to_forward was
really not handy for multiple reasons. Now we've moved its control
to the receive-side of the buffer, which is also responsible for
keeping send_max up to date. This makes more sense as it now becomes
possible to send some pre-formatted data followed by forwarded data.

The following explanation has also been added to buffer.h to clarify
the situation. Right now, tests show that the I/O is behaving extremely
well. Some work will have to be done to adapt existing splice code
though.

/* Note about the buffer structure

   The buffer contains two length indicators, one to_forward counter and one
   send_max limit. First, it must be understood that the buffer is in fact
   split in two parts :
     - the visible data (->data, for ->l bytes)
     - the invisible data, typically in kernel buffers forwarded directly from
       the source stream sock to the destination stream sock (->splice_len
       bytes). Those are used only during forward.

   In order not to mix data streams, the producer may only feed the invisible
   data with data to forward, and only when the visible buffer is empty. The
   consumer may not always be able to feed the invisible buffer due to platform
   limitations (lack of kernel support).

   Conversely, the consumer must always take data from the invisible data first
   before ever considering visible data. There is no limit to the size of data
   to consume from the invisible buffer, as platform-specific implementations
   will rarely leave enough control on this. So any byte fed into the invisible
   buffer is expected to reach the destination file descriptor, by any means.
   However, it's the consumer's responsibility to ensure that the invisible
   data has been entirely consumed before consuming visible data. This must be
   reflected by ->splice_len. This is very important as this and only this can
   ensure strict ordering of data between buffers.

   The producer is responsible for decreasing ->to_forward and increasing
   ->send_max. The ->to_forward parameter indicates how many bytes may be fed
   into either data buffer without waking the parent up. The ->send_max
   parameter says how many bytes may be read from the visible buffer. Thus it
   may never exceed ->l. This parameter is updated by any buffer_write() as
   well as any data forwarded through the visible buffer.

   The consumer is responsible for decreasing ->send_max when it sends data
   from the visible buffer, and ->splice_len when it sends data from the
   invisible buffer.

   A real-world example consists in part in an HTTP response waiting in a
   buffer to be forwarded. We know the header length (300) and the amount of
   data to forward (content-length=9000). The buffer already contains 1000
   bytes of data after the 300 bytes of headers. Thus the caller will set
   ->send_max to 300 indicating that it explicitly wants to send those data,
   and set ->to_forward to 9000 (content-length). This value must be normalised
   immediately after updating ->to_forward : since there are already 1300 bytes
   in the buffer, 300 of which are already counted in ->send_max, and that size
   is smaller than ->to_forward, we must update ->send_max to 1300 to flush the
   whole buffer, and reduce ->to_forward to 8000. After that, the producer may
   try to feed the additional data through the invisible buffer using a
   platform-specific method such as splice().
 */
2009-01-09 10:15:03 +01:00
Willy Tarreau
6b66f3e4f6 [MAJOR] implement autonomous inter-socket forwarding
If an analyser sets buf->to_forward to a given value, that many
data will be forwarded between the two stream interfaces attached
to a buffer without waking the task up. The same applies once all
analysers have been released. This saves a large amount of calls
to process_session() and a number of task_dequeue/queue.
2009-01-09 10:15:02 +01:00
Willy Tarreau
3ffeba1f67 [MEDIUM] enable inter-stream_interface wakeup calls
By letting the producer tell the consumer there is data to check,
and the consumer tell the producer there is some space left again,
we can cut in half the number of session wakeups.

This is also an important starting point for future splicing support.
2008-12-28 11:09:02 +01:00
Willy Tarreau
b0ef735c71 [MINOR] add flags to indicate when a stream interface is waiting for space/data
It will soon be required to know when a stream interface is waiting for
buffer data or buffer room. Let's add two flags for that.
2008-12-28 11:08:03 +01:00
Willy Tarreau
86491c3164 [MEDIUM] indicate when we don't care about read timeout
Sometimes we don't care about a read timeout, for instance, from the
client when waiting for the server, but we still want the client to
be able to read.

Till now it was done by articially forcing the read timeout to ETERNITY.
But this will cause trouble when we want the low level stream sock to
communicate without waking the session up. So we add a BF_READ_NOEXP
flag to indicate that when the read timeout is to be set, it might
have to be set to ETERNITY.

Since BF_READ_ENA was not used, we replaced this flag.
2008-12-28 11:06:40 +01:00
Willy Tarreau
f890dc9003 [MEDIUM] add a send limit to a buffer
For keep-alive, line-mode protocols and splicing, we will need to
limit the sender to process a certain amount of bytes. The limit
is automatically set to the buffer size when analysers are detached
from the buffer.
2008-12-28 10:58:52 +01:00
Willy Tarreau
3dfe6cd095 [MEDIUM] add support for "show sess" in unix stats socket
It is now possible to list all known sessions by issuing "show sess"
on the unix stats socket. The format is not much evolved but it is
very useful for debugging.

The doc has been updated to reflect the new keyword.
2008-12-07 22:41:17 +01:00
Willy Tarreau
62e4f1dedd [MINOR] add back-references to sessions for later use by a dumper.
This is the first step in implementing a session dump tool.
A session dump will need restart points. It will be necessary for
it to get references to sessions which can be moved when the session
dies.

The principle is not that complex : when a session ends, it looks for
any potential back-references. If it finds any, then it moves them to
the next session in the list. The dump function will of course have
to restart from that new point.
2008-12-07 21:57:02 +01:00
Willy Tarreau
01bf8675ed [MEDIUM] reference the current hijack function in the buffer itself
Instead of calling a hard-coded function to produce data, let's
reference this function into the buffer and call it from there
when BF_HIJACK is set. This goes in the direction of more generic
session management code.
2008-12-07 18:03:29 +01:00
Willy Tarreau
b5654f6ff4 [MINOR] move the listener reference from fd to session
The listener referenced in the fd was only used to check the
listener state upon session termination. There was no guarantee
that the FD had not been reassigned by the moment it was processed,
so this was a bit racy. Having it in the session is more robust.
2008-12-07 16:45:10 +01:00
Willy Tarreau
7e5067d459 [MEDIUM] remove cli_fd, srv_fd, cli_state and srv_state from the session
Those were previously used by the unix sockets only, and could be
removed.
2008-12-07 16:27:56 +01:00
Willy Tarreau
b1356cf4e4 [MAJOR] make unix sockets work again with stats
The unix protocol handler had not been updated during the last
stream_sock changes. This has been done now. There is still a
lot of duplicated code between session.c and proto_uxst.c due
to the way the session is handled. Session.c relies on the existence
of a frontend while it does not exist here.

It is easier to see the difference between the stats part (placed
in dumpstats.c) and the unix-stream part (in proto_uxst.c).

The hijacking function still needs to be dynamically set into the
response buffer, and some cleanup is still required, then all those
changes should be forward-ported to the HTTP part. Adding support
for new keywords should not cause trouble now.
2008-12-07 16:06:43 +01:00
Willy Tarreau
a11e976163 [MEDIUM] first pass of lifting to proto_uxst.c:uxst_event_accept()
The accept function must be adapted to the new framework. It is
still broken, and calling it will still result in a segfault. But
this cleanup is needed anyway.
2008-12-01 01:44:25 +01:00
Willy Tarreau
dded32defa [MINOR] replace client_retnclose() with stream_int_retnclose()
This makes more sense to return a message to a stream interface
than to a session.

senddata.{c,h} have been removed.
2008-11-30 19:48:07 +01:00
Willy Tarreau
f54f8bdd8d [MINOR] maintain a global session list in order to ease debugging
Now the global variable 'sessions' will be a dual-linked list of all
known sessions. The list element is set at the beginning of the session
so that it's easier to follow them all with gdb.
2008-11-23 19:53:55 +01:00
Willy Tarreau
eabf313df2 [MINOR] change type of fdtab[]->owner to void*
The owner of an fd was initially a task but this was sometimes
casted to a (struct listener *). We'll soon need more types,
so void* is more appropriate.
2008-11-02 10:19:08 +01:00
Willy Tarreau
fdccded0e8 [MEDIUM] indicate a reason for a task wakeup
It's very frequent to require some information about the
reason why a task is running. Some flags have been added
so that a task now knows if it got woken up due to I/O
completion, timeout, etc...
2008-11-02 10:19:08 +01:00
Willy Tarreau
3da77c5abd [MINOR] re-arrange buffer flags and rename some of them
The buffer flags became a big bazaar. Re-arrange them
so that their names are more explicit and so that they
are more easily readable in hex form. Some aggregates
have also been adjusted.
2008-11-02 10:19:07 +01:00
Willy Tarreau
fa7e10251d [MAJOR] rework of the server FSM
srv_state has been removed from HTTP state machines, and states
have been split in either TCP states or analyzers. For instance,
the TARPIT state has just become a simple analyzer.

New flags have been added to the struct buffer to compensate this.
The high-level stream processors sometimes need to force a disconnection
without touching a file-descriptor (eg: report an error). But if
they touched BF_SHUTW or BF_SHUTR, the file descriptor would not
be closed. Thus, the two SHUT?_NOW flags have been added so that
an application can request a forced close which the stream interface
will be forced to obey.

During this change, a new BF_HIJACK flag was added. It will
be used for data generation, eg during a stats dump. It
prevents the producer on a buffer from sending data into it.

  BF_SHUTR_NOW  /* the producer must shut down for reads ASAP  */
  BF_SHUTW_NOW  /* the consumer must shut down for writes ASAP */
  BF_HIJACK     /* the producer is temporarily replaced        */

BF_SHUTW_NOW has precedence over BF_HIJACK. BF_HIJACK has
precedence over BF_MAY_FORWARD (so that it does not need it).

New functions buffer_shutr_now(), buffer_shutw_now(), buffer_abort()
are provided to manipulate BF_SHUT* flags.

A new type "stream_interface" has been added to describe both
sides of a buffer. A stream interface has states and error
reporting. The session now has two stream interfaces (one per
side). Each buffer has stream_interface pointers to both
consumer and producer sides.

The server-side file descriptor has moved to its stream interface,
so that even the buffer has access to it.

process_srv() has been split into three parts :
  - tcp_get_connection() obtains a connection to the server
  - tcp_connection_failed() tests if a previously attempted
    connection has succeeded or not.
  - process_srv_data() only manages the data phase, and in
    this sense should be roughly equivalent to process_cli.

Little code has been removed, and a lot of old code has been
left in comments for now.
2008-11-02 10:19:04 +01:00
Willy Tarreau
ffab5b4ab0 [MEDIUM] merge inspect_exp and txn->exp into request buffer
Since we may have several analysers on a buffer, it's more
convenient to have the analyser timeout attached to the
buffer itself.
2008-08-17 18:03:28 +02:00
Willy Tarreau
2df28e8110 [MEDIUM] session: move the analysis bit field to the buffer
It makes more sense to store the list of analysers in the buffer
than in the session since they are precisely plugged onto one
buffer.
2008-08-17 15:20:19 +02:00
Willy Tarreau
26ed74dadc [MEDIUM] use buffer->wex instead of buffer->cex for connect timeout
It's a shame not to use buffer->wex for connection timeouts since by
definition it cannot be used till the connection is not established.
Using it instead of ->cex also makes the buffer processing more
symmetric.
2008-08-17 12:11:14 +02:00