175 Commits

Author SHA1 Message Date
Willy Tarreau
8f46ccad27 BUG/MEDIUM: checks: don't call connect() on unsupported address families
At the moment, all address families supported on a "server" statement support
a connect() method, but this will soon change with the generalization of
str2sa_range(). Checks currently call ->connect() unconditionally so let's
add a check for this.
2013-03-08 14:04:53 +01:00
Willy Tarreau
8b4c376288 BUILD: fix a warning emitted by isblank() on non-c99 compilers
Commit a2b9dad introduced use of isblank() which is not present everywhere
and requires -std=c99 or various other defines. Here on gcc-3.4 with glibc
2.3 it emits a warning though it works :

  src/checks.c: In function 'event_srv_chk_r':
  src/checks.c:1007: warning: implicit declaration of function 'isblank'

This macro matches only 2 values, better replace it with the explicit match.
2013-02-13 12:49:46 +01:00
Simon Horman
a2b9dadedd MEDIUM: checks: Add agent health check
Support a agent health check performed by opening a TCP socket to a
pre-defined port and reading an ASCII string. The string should have one of
the following forms:

* An ASCII representation of an positive integer percentage.
  e.g. "75%"

  Values in this format will set the weight proportional to the initial
  weight of a server as configured when haproxy starts.

* The string "drain".

  This will cause the weight of a server to be set to 0, and thus it will
  not accept any new connections other than those that are accepted via
  persistence.

* The string "down", optionally followed by a description string.

  Mark the server as down and log the description string as the reason.

* The string "stopped", optionally followed by a description string.

  This currently has the same behaviour as down (iii).

* The string "fail", optionally followed by a description string.

  This currently has the same behaviour as down (iii).

A agent health check may be configured using "option lb-agent-chk".
The use of an alternate check-port, used to obtain agent heath check
information described above as opposed to the port of the service,
may be useful in conjunction with this option.

e.g.

    option  lb-agent-chk
    server  http1_1 10.0.0.10:80 check port 10000 weight 100

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-02-13 11:03:28 +01:00
Simon Horman
007f2a2d24 CLEANUP: checks: Make desc argument to set_server_check_status const
This parameter is not modified by set_server_check_status() and
thus may be const.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-02-13 10:53:16 +01:00
Willy Tarreau
5ba04f6cf9 BUG/MEDIUM: checks: fix a race condition between checks and observe layer7
When observe layer7 is enabled on a server, a response may cause a server
to be marked down while a check is in progress. When the check finally
completes, the connection is not properly released in process_chk() because
the server states makes it think that no check was in progress due to the
lastly reported failure.

When a new check gets scheduled, it reuses the same connection structure
which is reinitialized. When the server finally closes the previous
connection, epoll_wait() notifies conn_fd_handler() which sees that the
old connection is still referenced by fdtab[fd], but it can not do anything
with this fd which does not match conn->t.sock.fd. So epoll_wait() keeps
reporting this fd forever.

The solution is to always make process_chk() always take care of closing
the connection and not make it rely on the connection layer to so.

Special thanks go to James Cole and Finn Arne Gangstad who encountered
the issue almost at the same time and took care of reporting a very
detailed analysis with rich information to help understand the issue.
2013-02-12 16:04:47 +01:00
Willy Tarreau
bb95666bac BUG/MEDIUM: checks: ensure the health_status is always within bounds
health_adjust() checks for incorrect bounds for the status argument.
With current code, the argument is always a constant from the valid
enum so there is no impact and the check is basically a NOP. However
users running local patches (eg: new checks) might want to recheck
their code.

This fix should be backported to 1.4 which introduced the issue.

Reported-by: Dinko Korunic <dkorunic@reflected.net>
2013-01-24 16:19:18 +01:00
Willy Tarreau
c5c61fcf45 BUG/MEDIUM: checks: ignore late resets after valid responses
Reinout Verkerk from Trilex reported an issue with servers recently
flapping after an haproxy upgrade. Haproxy checks a simple agent
returning an HTTP response. The issue is that if the request packet
is lost but the simple agent responds before reading the HTTP request
and closes, the server will emit a TCP RST once the request finally
reaches it.

The way checks have been ported to use connections makes the error
flag show up as a failure after the success, reporting a stupid case
where the server is said to be down with a correct response.

In order to fix this, let's ignore the connection's error flag if a
successful check has already been reported. Reinout could verify that
a patched server did not exhibit the problem anymore.
2012-12-30 01:44:24 +01:00
Willy Tarreau
14cba4b0b1 MEDIUM: connection: add an error code in connections
This will be needed to improve error reporting, especially for SSL.
2012-12-03 14:22:13 +01:00
Willy Tarreau
6c560da279 BUG/MEDIUM: checks: report handshake failures
Up to now, only data layer failures were reported to the task, but
if a handshake failed from the beginning, the error was not reported
as a failure.
2012-11-24 11:14:45 +01:00
Willy Tarreau
f0837b259b MEDIUM: tcp: add explicit support for delayed ACK in connect()
Commit 24db47e0 tried to improve support for delayed ACK upon connect
but it was incomplete, because checks with the proxy protocol would
always enable polling for data receive and there was no way of
distinguishing data polling and delayed ack.

So we add a distinct delack flag to the connect() function so that
the caller decides whether or not to use a delayed ack regardless
of pending data (eg: when send-proxy is in use). Doing so covers all
combinations of { (check with data), (sendproxy), (smart-connect) }.
2012-11-24 10:24:27 +01:00
Willy Tarreau
cfd97c6f04 BUG/MEDIUM: checks: prevent TIME_WAITs from appearing also on timeouts
We need to disable lingering before closing on timeout too, otherwise
we accumulate TIME_WAITs.
2012-11-23 17:35:59 +01:00
Willy Tarreau
2b199c9ac3 MEDIUM: connection: provide a common conn_full_close() function
Several places got the connection close sequence wrong because it
was not obvious. In practice we always need the same sequence when
aborting, so let's have a common function for this.
2012-11-23 17:32:21 +01:00
Willy Tarreau
db3b4a2891 MINOR: checks: fix recv polling after connect()
Commit a522f801 moved a call to __conn_data_want_recv() just after the
connect() call, which is not 100% correct. First, it does not take errors
into account, eventhough this is harmless. Second, this change will only
be taken into account after next call do conn_data_polling_update(), which
is not necessarily what is expected (eg: if an error is only reported on
the recv side).

So let's use conn_data_poll_recv() instead, which directly subscribes
the event to polling.
2012-11-23 16:32:33 +01:00
Willy Tarreau
b63b59641e BUG/MAJOR: checks: close FD on all timeouts
Since last commit, some timeouts were converted into an error to report
the status, and as a result, the socket was not closed because it was
supposed to have been done during the wake() call.

Close the socket as soon as the timeout is detected to fix the issue.
Also we now ensure to first initialize the connection flags.
2012-11-23 16:22:08 +01:00
Willy Tarreau
74fa7fbec9 MEDIUM: checks: close the socket as soon as we have a response
Until now, the check socked was closed in the task which handles the
check, which can sometimes be substantially later when many tasks are
running. It's much cleaner to close() in the wake call, which also
helps removing some FD management from the task itself.

The code is faster and smaller, and fast health checks show a more
predictable behaviour.
2012-11-23 14:43:49 +01:00
Willy Tarreau
24db47e0cc MEDIUM: checks: avoid waking the application up for pure TCP checks
Pure TCP checks only use the SYN/ACK in return to a SYN. By forcing
the system to use delayed ACKs, it is possible to send an RST instead
of the ACK and thus ensure that the application will never be needlessly
woken up. This avoids error logs or counters on checked components since
the application is never made aware of this connection which dies in the
network stack.
2012-11-23 14:18:39 +01:00
Willy Tarreau
acbdc7a760 BUG/MINOR: checks: slightly clean the state machine up
The process_chk() function still did not consider the the timeout when
it was woken up, so a spurious wakeup could trigger a false timeout. Some
checks were now redundant or could not be triggered (eg: L7 timeout).
So remove them and rearrange the timeout detection.
2012-11-23 14:02:57 +01:00
Willy Tarreau
5a78f36db3 MAJOR: checks: rework completely bogus state machine
The porting of checks to using connections was totally bogus. Some checks
were considered successful as soon as the connection was established,
regardless of any response. Some errors would be triggered upon recv
if polling was enabled for send or if the send channel was shut down.

Now the behaviour is much better. It would be cleaner to perform the
fd_delete() in wake_srv_chk() and to process failures and timeouts
separately, but this is already a good start.
2012-11-23 12:47:05 +01:00
Willy Tarreau
d3aac7088e CLEANUP: checks: rename some server check flags
Some server check flag names were not properly choosen and cause
analysis trouble, especially the CHK_RUNNING one which does not
mean that a check is running but that the server is running...

Here's the rename :
  CHK_RUNNING -> CHK_PASSED
  CHK_ERROR   -> CHK_FAILED
2012-11-23 11:32:12 +01:00
Willy Tarreau
fd29cc537b MEDIUM: checks: avoid accumulating TIME_WAITs during checks
Some checks which do not induce a close from the server accumulate
local TIME_WAIT sockets because they're cleanly shut down. Typically
TCP probes cause this. This is very problematic when there are many
servers, when the checks are fast or when local source ports are rare.

So now we'll disable lingering on the socket instead of sending a
shutdown. Before doing this we try to drain any possibly pending data.
That way we avoid sending an RST when the server has closed first.

This change means that some servers will see more RSTs, but this is
needed to avoid local source port starvation.
2012-11-23 09:18:20 +01:00
Willy Tarreau
ef8a719f70 BUG/MINOR: checks: don't mark the FD as closed before transport close
Some future transport layers might need the connection's file descriptor
on ->close(), so we must not destroy it before we're finished with it.
2012-11-23 09:05:05 +01:00
Willy Tarreau
a522f801fb BUG/MEDIUM: checks: ensure we completely disable polling upon success
When a check succeeds, it used to only disable receive events while
it should disable both directions. The problem is that if the send
event was reported too, it could re-enable the recv event. In theory
this is not a problem as the task is going to be woken up, but if
there are many tasks in the queue and this task is not processed
immediately, we could theorically face a storm of unprocessed events
(typically POLL_HUP).

So better stop both directions, prevent the send side from enabling
recv and have the process_chk() code enable both directions. This
will also help detecting closes before the check is sent.

Note that all this mess has been inherited from the old code that used
the fd as a flag to report if a check was running. We should have a
dedicated flag and perform the fd_delete() in wake_srv_chk() instead.
2012-11-23 09:03:59 +01:00
Willy Tarreau
6b0a850503 BUG/MEDIUM: checks: mark the check as stopped after a connect error
Health checks currently still use the connection's fd to know whether
a check is running (this needs to change). When a health check
immediately fails during connect() because of a lack of local resource
(eg: port), we failed to unset the fd, so each time the process_chk
woken up after such an error, it believed a check was still running
and used to close the fd again instead of starting a new check. This
could result in other connections being closed because they were
assigned the same fd value.

The bug is only marked medium because when this happens, the system
is already in a bad state.

A comment was added above tcp_connect_server() to clarify that the
fd is *not* valid on error.
2012-11-23 09:03:29 +01:00
Willy Tarreau
3fdb366885 MAJOR: connection: replace struct target with a pointer to an enum
Instead of storing a couple of (int, ptr) in the struct connection
and the struct session, we use a different method : we only store a
pointer to an integer which is stored inside the target object and
which contains a unique type identifier. That way, the pointer allows
us to retrieve the object type (by dereferencing it) and the object's
address (by computing the displacement in the target structure). The
NULL pointer always corresponds to OBJ_TYPE_NONE.

This reduces the size of the connection and session structs. It also
simplifies target assignment and compare.

In order to improve the generated code, we try to put the obj_type
element at the beginning of all the structs (listener, server, proxy,
si_applet), so that the original and target pointers are always equal.

A lot of code was touched by massive replaces, but the changes are not
that important.
2012-11-12 00:42:33 +01:00
Willy Tarreau
19d14ef104 MEDIUM: make the trash be a chunk instead of a char *
The trash is used everywhere to store the results of temporary strings
built out of s(n)printf, or as a storage for a chunk when chunks are
needed.

Using global.tune.bufsize is not the most convenient thing either.

So let's replace trash with a chunk and directly use it as such. We can
then use trash.size as the natural way to get its size, and get rid of
many intermediary chunks that were previously used.

The patch is huge because it touches many areas but it makes the code
a lot more clear and even outlines places where trash was used without
being that obvious.
2012-10-29 16:57:30 +01:00
Willy Tarreau
7780473c3b CLEANUP: replace chunk_printf() with chunk_appendf()
This function's naming was misleading as it is used to append data
at the end of a string, causing some surprizes when used for the
first time!

Add a chunk_printf() function which does what its name suggests.
2012-10-29 16:14:26 +01:00
Willy Tarreau
c919dc66a3 CLEANUP: remove trashlen
trashlen is a copy of global.tune.bufsize, so let's stop using it as
a duplicate, fall back to the original bufsize, it's less confusing
this way.
2012-10-26 20:04:27 +02:00
Willy Tarreau
5f2877a7dd BUG/MEDIUM: tcp: transparent bind to the source only when address is set
Thomas Heil reported that health checks did not work anymore when a backend
or server has "usesrc clientip". This is because the source address is not
set and tcp_bind_socket() tries to bind to that address anyway.

The solution consists in explicitly clearing the source address in the checks
and to make tcp_bind_socket() avoid binding when the address is not set. This
also has an indirect benefit that a useless bind() syscall will be avoided
when using "source 0.0.0.0 usesrc clientip" in health checks.
2012-10-26 20:04:27 +02:00
Willy Tarreau
392e9390e6 CLEANUP: checks: remove minor warnings for assigned but not used variables
We don't use the return value from snd_buf/rcv_buf anymore since we only
rely on the connection flags.
2012-10-05 14:54:30 +02:00
Willy Tarreau
6c16adc661 MEDIUM: checks: enable the PROXY protocol with health checks
When health checks are configured on a server which has the send-proxy
directive and no "port" nor "addr" settings, the health check connections
will automatically use the PROXY protocol. If "port" or "addr" are set,
the "check-send-proxy" directive may be used to force the protocol.
2012-10-05 00:33:14 +02:00
Willy Tarreau
f150317671 MAJOR: checks: completely use the connection transport layer
With this change, we now use the connection's transport layer to receive
and send data during health checks. It even becomes possible to send data
in multiple times, which was not possible before.

The transport layer used is the same as the one used for the traffic, unless
a specific address and/or port is specified for the checks using "port" or
"addr", in which case the transport layer defaults to raw_sock. An option
will be provided to force SSL checks on different IP/ports later.

Connection errors and timeouts are still reported.

Some situations where strerror() was able to report a precise error after
a failed connect() in the past might not be reported with as much precision
anymore, but the error message was already meaningless. During the tests,
no situation was found where a message became less precise.
2012-10-05 00:33:14 +02:00
Willy Tarreau
f4288ee4ba MEDIUM: check: add the ctrl and transport layers in the server check structure
Since it's possible for the checks to use a different protocol or transport layer
than the prod traffic, we need to have them referenced in the server. The
SSL checks are not enabled yet, but the transport layers are completely used.
2012-10-05 00:33:14 +02:00
Willy Tarreau
1ae1b7b53c MEDIUM: checks: use real buffers to store requests and responses
Till now the request was made in the trash and sent to the network at
once, and the response was read into a preallocated char[]. Now we
allocate a full buffer for both the request and the response, and make
use of it.

Some of the operations will probably be replaced later with buffer macros
but the point was to ensure we could migrate to use the data layers soon.

One nice improvement caused by this change is that requests are now formed
at the beginning of the check and may safely be sent in multiple chunks if
needed.
2012-10-05 00:33:14 +02:00
Willy Tarreau
5b3a202f78 REORG: server: move the check-specific parts into a check subsection
The health checks in the servers are becoming a real mess, move them
into their own subsection. We'll soon need to have a struct buffer to
replace the char * as well as check-specific protocol and transport
layers.
2012-10-05 00:33:14 +02:00
Willy Tarreau
fb56aab443 MAJOR: checks: make use of the connection layer to send checks
This is a first step, we now use the connection layer without the data
layers (send/recv are still used by hand). The connection is established
using tcp_connect_server() and raw_sock is assumed and forced for now.

fdtab is not manipulated anymore and polling is managed via the connection
layer.

It becomes quite clear that the server needs a second ->ctrl and ->xprt
dedicated to the checks.
2012-10-05 00:33:14 +02:00
Willy Tarreau
40ff59d820 CLEANUP: fd: remove fdtab->flags
These flags were added for TCP_CORK. They were only set at various places
but never checked by any user since TCP_CORK was replaced with MSG_MORE.
Simply get rid of this now.
2012-09-03 20:49:14 +02:00
Willy Tarreau
c7e4238df0 REORG: buffers: split buffers into chunk,buffer,channel
Many parts of the channel definition still make use of the "buffer" word.
2012-09-03 20:47:32 +02:00
Willy Tarreau
3267d36c84 MEDIUM: checks: don't use FD_WAIT_* anymore
make use of fd_poll_* instead in preparation for a later adoption by the
connection subsystem.
2012-09-02 21:53:12 +02:00
Willy Tarreau
49b046dddf MAJOR: fd: replace all EV_FD_* macros with new fd_*_* inline calls
These functions have a more explicity meaning and will offer provisions
for explicit polling.

EV_FD_ISSET() has been left for now as it is still in use in checks.
2012-09-02 21:53:11 +02:00
Willy Tarreau
076be25ab8 CLEANUP: remove the now unused fdtab direct I/O callbacks
They were all left to NULL since last commit so we can safely remove them
all now and remove the temporary dual polling logic in pollers.
2012-09-02 21:51:29 +02:00
Willy Tarreau
20bea42a95 MEDIUM: checks: make use of fdtab->iocb instead of cb[]
Use the single I/O callback to handle the checks. This should soon be
replaced by the common connection handler.
2012-09-02 21:51:27 +02:00
Willy Tarreau
4e6049e553 MINOR: fd: add a new I/O handler to fdtab
This one will eventually replace both cb[] handlers. At the moment it
is not used yet.
2012-09-02 21:51:27 +02:00
Willy Tarreau
505e34a36d MAJOR: get rid of fdtab[].state and use connection->flags instead
fdtab[].state was only used to know whether a connection was in progress
or an error was encountered. Instead we now use connection->flags to store
a flag for both. This way, connection management will be able to update the
connection status on I/O.
2012-09-02 21:51:26 +02:00
Jamie Gloudon
801a0a353a DOC: fix name for "option independant-streams"
The correct spelling is "independent", not "independant". This patch
fixes the doc and the configuration parser to accept the correct form.
The config parser still allows the old naming for backwards compatibility.
2012-09-02 21:51:07 +02:00
Willy Tarreau
96596aeead MEDIUM: fd/si: move peeraddr from struct fdinfo to struct connection
The destination address is purely a connection thing and not an fd thing.
It's also likely that later the address will be stored into the connection
and linked to by the SI.

struct fdinfo only keeps the pointer to the port range and the local port
for now. All of this also needs to move to the connection but before this
the release of the port range must move from fd_delete() to a new function
dedicated to the connection.
2012-06-08 22:59:52 +02:00
Justin Karneges
eb2c24ae2a MINOR: checks: add on-marked-up option
This implements the feature discussed in the earlier thread of killing
connections on backup servers when a non-backup server comes back up. For
example, you can use this to route to a mysql master & slave and ensure
clients don't stay on the slave after the master goes from down->up. I've done
some minimal testing and it seems to work.

[WT: added session flag & doc, moved the killing after logging the server UP,
 and ensured that the new server is really usable]
2012-06-03 23:48:42 +02:00
Willy Tarreau
1e44a49c89 BUG/MINOR: checks: expire on timeout.check if smaller than timeout.connect
It happens that haproxy doesn't displace the task in the wait queue when
validating a connection, so if the check timeout is set to a smaller value
than timeout.connect, it will not strike before timeout.connect.

The bug is present at least in 1.4.15..1.4.21, so the fix must be backported.
2012-05-25 07:42:37 +02:00
Willy Tarreau
9580d16e40 BUG/MAJOR: checks: don't call set_server_status_* when no LB algo is set
David Touzeau reported that haproxy dies when a server is checked and is
used in a farm with only "option transparent" and no LB algo. This is
because the LB params are NULL, the functions should be checked before
being called.

The same bug is present in 1.4 so this patch must be backported.
2012-05-19 19:09:46 +02:00
David du Colombier
7af4605ef7 BUG/MAJOR: trash must always be the size of a buffer
Before it was possible to resize the buffers using global.tune.bufsize,
the trash has always been the size of a buffer by design. Unfortunately,
the recent buffer sizing at runtime forgot to adjust the trash, resulting
in it being too short for content rewriting if buffers were enlarged from
the default value.

The bug was encountered in 1.4 so the fix must be backported there.
2012-05-16 14:21:55 +02:00
Willy Tarreau
b147a8382a CLEANUP: fd: remove unused cb->b pointers in the struct fdtab
These pointers were used to hold pointers to buffers in the past, but
since we introduced the stream interface, they're no longer used but
they were still sometimes set.

Removing them shrink the struct fdtab from 32 to 24 bytes on 32-bit machines,
and from 52 to 36 bytes on 64-bit machines, which is a significant saving. A
quick tests shows a steady 0.5% performance gain, probably due to the better
cache efficiency.
2012-05-13 00:35:44 +02:00