3107 Commits

Author SHA1 Message Date
Willy Tarreau
3986b9c140 MEDIUM: config: report it when tcp-request rules are misplaced
A config where a tcp-request rule appears after an http-request rule
might seem valid but it is not. So let's report a warning about this
since this case is hard to detect by the naked eye.
2014-09-16 15:43:24 +02:00
Willy Tarreau
9dc1c61c43 BUG/CRITICAL: http: don't update msg->sov once data start to leave the buffer
Commit bb2e669 ("BUG/MAJOR: http: correctly rewind the request body
after start of forwarding") was incorrect/incomplete. It used to rely on
CF_READ_ATTACHED to stop updating msg->sov once data start to leave the
buffer, but this is unreliable because since commit a6eebb3 ("[BUG]
session: clear BF_READ_ATTACHED before next I/O") merged in 1.5-dev1,
this flag is only ephemeral and is cleared once all analysers have
seen it. So we can start updating msg->sov again each time we pass
through this place with new data. With a sufficiently large amount of
data, it is possible to make msg->sov wrap and validate the if()
condition at the top, causing the buffer to advance by about 2GB and
crash the process.

Note that the offset cannot be controlled by the attacker because it is
a sum of millions of small random sizes depending on how many bytes were
read by the server and how many were left in the buffer, only because
of the speed difference between reading and writing. Also, nothing is
written, the invalid pointer resulting from this operation is only read.

Many thanks to James Dempsey for reporting this bug and to Chris Forbes for
narrowing down the faulty area enough to make its root cause analysable.

This fix must be backported to haproxy 1.5.
2014-09-02 16:48:54 +02:00
Willy Tarreau
7346acb6f1 MINOR: log: add a new field "%lc" to implement a per-frontend log counter
Sometimes it would be convenient to have a log counter so that from a log
server we know whether some logs were lost or not. The frontend's log counter
serves exactly this purpose. It's incremented each time a traffic log is
produced. If a log is disabled using "http-request set-log-level silent",
the counter will not be incremented. However, admin logs are not accounted
for. Also, if logs are filtered out before being sent to the server because
of a minimum level set on the log line, the counter will be increased anyway.

The counter is 32-bit, so it will wrap, but that's not an issue considering
that 4 billion logs are rarely in the same file, let alone close to each
other.
2014-08-28 15:08:14 +02:00
Willy Tarreau
4edd6836fc OPTIM/MINOR: proxy: reduce struct proxy by 48 bytes on 64-bit archs
Just by moving a few struct members around, we can avoid 32-bit holes
between 64-bit pointers and shrink the struct size by 48 bytes. That's
not huge but that's for free, so let's do it.
2014-08-28 15:08:14 +02:00
Dave McCowan
328fb58d74 MEDIUM: connection: add new bit in Proxy Protocol V2
There are two sample commands to get information about the presence of a
client certificate.
ssl_fc_has_crt is true if there is a certificate present in the current
connection
ssl_c_used is true if there is a certificate present in the session.
If a session has stopped and resumed, then ssl_c_used could be true, while
ssl_fc_has_crt is false.

In the client byte of the TLS TLV of Proxy Protocol V2, there is only one
bit to indicate whether a certificate is present on the connection.  The
attached patch adds a second bit to indicate the presence for the session.

This maintains backward compatibility.

[wt: this should be backported to 1.5 to help maintain compatibility
 between versions]
2014-08-23 07:35:29 +02:00
Lukas Tribus
656c5fa7e8 BUILD: ssl: disable OCSP when using boringssl
Google's boringssl doesn't currently support OCSP, so
disable it if detected.

OCSP support may be reintroduced as per:
https://code.google.com/p/chromium/issues/detail?id=398677

In that case we can simply revert this commit.

Signed-off-by: Lukas Tribus <luky-37@hotmail.com>
2014-08-18 14:33:48 +02:00
Godbach
e468d55998 BUG/MINOR: server: move the directive #endif to the end of file
If a source file includes proto/server.h twice or more, redefinition errors will
be triggered for such inline functions as server_throttle_rate(),
server_is_draining(), srv_adm_set_maint() and so on. Just move #endif directive
to the end of file to solve this issue.

Signed-off-by: Godbach <nylzhaowei@gmail.com>
2014-07-29 11:03:14 +02:00
Willy Tarreau
09448f7d7c MEDIUM: http: add the track-sc* actions to http-request rules
Add support for http-request track-sc, similar to what is done in
tcp-request for backends. A new act_prm field was added to HTTP
request rules to store the track params (table, counter). Just
like for TCP rules, the table is resolved while checking for
config validity. The code was mostly copied from the TCP code
with the exception that here we also count the HTTP request count
and rate by hand. Probably that something could be factored out in
the future.

It seems like tracking flags should be improved to mark each hook
which tracks a key so that we can have some check points where to
increase counters of the past if not done yet, a bit like is done
for TRACK_BACKEND.
2014-07-16 17:26:40 +02:00
Willy Tarreau
5ed1bbfc75 CLEANUP: session: move the stick counters declarations to stick_table.h
They're really not appropriate in session.h as they always require a
stick table, and I'm having a hard time finding them each time I need
to.
2014-07-16 17:26:40 +02:00
Willy Tarreau
edee1d60b7 MEDIUM: stick-table: make it easier to register extra data types
Some users want to add their own data types to stick tables. We don't
want to use a linked list here for performance reasons, so we need to
continue to use an indexed array. This patch allows one to reserve a
compile-time-defined number of extra data types by setting the new
macro STKTABLE_EXTRA_DATA_TYPES to anything greater than zero, keeping
in mind that anything larger will slightly inflate the memory consumed
by stick tables (not per entry though).

Then calling stktable_register_data_store() with the new keyword will
either register a new keyword or fail if the desired entry was already
taken or the keyword already registered.

Note that this patch does not dictate how the data will be used, it only
offers the possibility to create new keywords and have an index to
reference them in the config and in the tables. The caller will not be
able to use stktable_data_cast() and will have to explicitly cast the
stable pointers to the expected types. It can be used for experimentation
as well.
2014-07-15 19:14:52 +02:00
Willy Tarreau
e12704bfc7 MINOR: session: export the function 'smp_fetch_sc_stkctr'
This one is sometimes useful outside of this file.
2014-07-15 19:09:56 +02:00
Thierry FOURNIER
055b9d5c63 MINOR: http: export the function 'smp_fetch_base32'
It's sometimes useful outside of proto_http.c.
2014-07-15 19:09:36 +02:00
Willy Tarreau
65d805fdfc BUILD: fix dependencies between config and compat.h
compat.h only depends on the system, and config needs compat, not the
opposite. global.h was fixed to explicitly include standard.h for LONGBITS.
2014-07-15 19:09:36 +02:00
Willy Tarreau
bb2e669f9e BUG/MAJOR: http: correctly rewind the request body after start of forwarding
Daniel Dubovik reported an interesting bug showing that the request body
processing was still not 100% fixed. If a POST request contained short
enough data to be forwarded at once before trying to establish the
connection to the server, we had no way to correctly rewind the body.

The first visible case is that balancing on a header does not always work
on such POST requests since the header cannot be found. But there are even
nastier implications which are that http-send-name-header would apply to
the wrong location and possibly even affect part of the request's body
due to an incorrect rewinding.

There are two options to fix the problem :
  - first one is to force the HTTP_MSG_F_WAIT_CONN flag on all hash-based
    balancing algorithms and http-send-name-header, but there's always a
    risk that any new algorithm forgets to set it ;

  - the second option is to account for the amount of skipped data before
    the connection establishes so that we always know the position of the
    request's body relative to the buffer's origin.

The second option is much more reliable and fits very well in the spirit
of the past changes to fix forwarding. Indeed, at the moment we have
msg->sov which points to the start of the body before headers are forwarded
and which equals zero afterwards (so it still points to the start of the
body before forwarding data). A minor change consists in always making it
point to the start of the body even after data have been forwarded. It means
that it can get a negative value (so we need to change its type to signed)..

In order to avoid wrapping, we only do this as long as the other side of
the buffer is not connected yet.

Doing this definitely fixes the issues above for the requests. Since the
response cannot be rewound we don't need to perform any change there.

This bug was introduced/remained unfixed in 1.5-dev23 so the fix must be
backported to 1.5.
2014-07-10 19:29:45 +02:00
Willy Tarreau
8fed9037cd MEDIUM: stick-table: implement lookup from a sample fetch
Currently we have stktable_fetch_key() which fetches a sample according
to an expression and returns a stick table key, but we also need a function
which does only the second half of it from a known sample. So let's cut the
function in two and introduce smp_to_stkey() to perform this lookup. The
first function was adapted to make use of it in order to avoid code
duplication.
2014-07-10 16:43:44 +02:00
Dan Dubovik
bd57a9f977 BUG/MEDIUM: backend: Update hash to use unsigned int throughout
When we were generating a hash, it was done using an unsigned long.  When the hash was used
to select a backend, it was sent as an unsigned int.  This made it difficult to predict which
backend would be selected.

This patch updates get_hash, and the hash methods to use an unsigned int, to remain consistent
throughout the codebase.

This fix should be backported to 1.5 and probably in part to 1.4.
2014-07-08 22:00:21 +02:00
Willy Tarreau
fd0e008d9d BUG/MEDIUM: unix: completely unbind abstract sockets during a pause()
Abstract namespace sockets ignore the shutdown() call and do not make
it possible to temporarily stop listening. The issue it causes is that
during a soft reload, the new process cannot bind, complaining that the
address is already in use.

This change registers a new pause() function for unix sockets and
completely unbinds the abstract ones since it's possible to rebind
them later. It requires the two previous patches as well as preceeding
fixes.

This fix should be backported into 1.5 since the issue apperas there.
2014-07-08 01:13:35 +02:00
Willy Tarreau
092d865c53 MEDIUM: listener: implement a per-protocol pause() function
In order to fix the abstact socket pause mechanism during soft restarts,
we'll need to proceed differently depending on the socket protocol. The
pause_listener() function already supports some protocol-specific handling
for the TCP case.

This commit makes this cleaner by adding a new ->pause() function to the
protocol struct, which, if defined, may be used to pause a listener of a
given protocol.

For now, only TCP has been adapted, with the specific code moved from
pause_listener() to tcp_pause_listener().
2014-07-08 01:13:34 +02:00
Willy Tarreau
18324f574f MEDIUM: log: support a user-configurable max log line length
With all the goodies supported by logformat, people find that the limit
of 1024 chars for log lines is too short. Some servers do not support
larger lines and can simply drop them, so changing the default value is
not always the best choice.

This patch takes a different approach. Log line length is specified per
log server on the "log" line, with a value between 80 and 65535. That
way it's possibly to satisfy all needs, even with some fat local servers
and small remote ones.
2014-06-27 18:13:53 +02:00
Willy Tarreau
4e957907aa MINOR: log: make MAX_SYSLOG_LEN overridable at build time
This value was set in log.h without any #ifndef around, so when one
wanted to change it, a patch was needed. Let's move it to defaults.h
with the usual #ifndef so that it's easier to change it.
2014-06-27 18:13:53 +02:00
Willy Tarreau
b5975defba MINOR: stick-table: make stktable_fetch_key() indicate why it failed
stktable_fetch_key() does not indicate whether it returns NULL because
the input sample was not found or because it's unstable. It causes trouble
with track-sc* rules. Just like with sample_fetch_string(), we want it to
be able to give more information to the caller about what it found. Thus,
now we use the pointer to a sample passed by the caller, and fill it with
the information we have about the sample. That way, even if we return NULL,
the caller has the ability to check whether a sample was found and if it is
still changing or not.
2014-06-25 17:17:53 +02:00
Emeric Brun
0abf836ecb BUG/MINOR: ssl: Fix external function in order not to return a pointer on an internal trash buffer.
'ssl_sock_get_common_name' applied to a connection was also renamed
'ssl_sock_get_remote_common_name'. Currently, this function is only used
with protocol PROXYv2 to retrieve the client certificate's common name.
A further usage could be to retrieve the server certificate's common name
on an outgoing connection.
2014-06-24 22:39:16 +02:00
Simon Horman
98637e5bff MEDIUM: Add external check
Add an external check which makes use of an external process to
check the status of a server.
2014-06-20 07:10:07 +02:00
Emeric Brun
c8b27b6c68 MEDIUM: ssl: add 300s supported time skew on OCSP response update.
OCSP_MAX_RESPONSE_TIME_SKEW can be set to a different value at
compilation (default is 300 seconds).
2014-06-19 14:37:30 +02:00
Thierry FOURNIER
f4e6129e30 MINOR: missing regex.h include 2014-06-19 14:29:32 +02:00
Emeric Brun
4147b2ef10 MEDIUM: ssl: basic OCSP stapling support.
The support is all based on static responses. This doesn't add any
request / response logic to HAProxy, but allows a way to update
information through the socket interface.

Currently certificates specified using "crt" or "crt-list" on "bind" lines
are loaded as PEM files.
For each PEM file, haproxy checks for the presence of file at the same path
suffixed by ".ocsp". If such file is found, support for the TLS Certificate
Status Request extension (also known as "OCSP stapling") is automatically
enabled. The content of this file is optional. If not empty, it must contain
a valid OCSP Response in DER format. In order to be valid an OCSP Response
must comply with the following rules: it has to indicate a good status,
it has to be a single response for the certificate of the PEM file, and it
has to be valid at the moment of addition. If these rules are not respected
the OCSP Response is ignored and a warning is emitted. In order to  identify
which certificate an OCSP Response applies to, the issuer's certificate is
necessary. If the issuer's certificate is not found in the PEM file, it will
be loaded from a file at the same path as the PEM file suffixed by ".issuer"
if it exists otherwise it will fail with an error.

It is possible to update an OCSP Response from the unix socket using:

  set ssl ocsp-response <response>

This command is used to update an OCSP Response for a certificate (see "crt"
on "bind" lines). Same controls are performed as during the initial loading of
the response. The <response> must be passed as a base64 encoded string of the
DER encoded response from the OCSP server.

Example:
  openssl ocsp -issuer issuer.pem -cert server.pem \
               -host ocsp.issuer.com:80 -respout resp.der
  echo "set ssl ocsp-response $(base64 -w 10000 resp.der)" | \
               socat stdio /var/run/haproxy.stat

This feature is automatically enabled on openssl 0.9.8h and above.

This work was performed jointly by Dirkjan Bussink of GitHub and
Emeric Brun of HAProxy Technologies.
2014-06-18 18:28:56 +02:00
Thierry FOURNIER
26202760a4 MINOR: regex: Use native PCRE API.
The pcreposix layer (in the pcre projetc) execute strlen to find
thlength of the string. When we are using the function "regex_exex*2",
the length is used to add a final \0, when pcreposix is executed a
strlen is executed to compute the length.

If we are using a native PCRE api, the length is provided as an
argument, and these operations disappear.

This is useful because PCRE regex are more used than POSIC regex.
2014-06-18 15:14:00 +02:00
Thierry FOURNIER
09af0d6d43 MEDIUM: regex: replace all standard regex function by own functions
This patch remove all references of standard regex in haproxy. The last
remaining references are only in the regex.[ch] files.

In the file src/checks.c, the original function uses a "pmatch" array.
In fact this array is unused. This patch remove it.
2014-06-18 15:07:57 +02:00
Thierry FOURNIER
b8f980cc19 MINOR: regex: Create JIT compatible function that return match strings
This patchs rename the "regex_exec" to "regex_exec2". It add a new
"regex_exec", "regex_exec_match" and "regex_exec_match2" function. This
function can match regex and return array containing matching parts.
Otherwise, this function use the compiled method (JIT or PCRE or POSIX).

JIT require a subject with length. PCREPOSIX and native POSIX regex
require a null terminted subject. The regex_exec* function are splited
in two version. The first version take a null terminated string, but it
execute strlen() on the subject if it is compiled with JIT. The second
version (terminated by "2") take the subject and the length. This
version adds a null character in the subject if it is compiled with
PCREPOSIX or native POSIX functions.

The documentation of posix regex and pcreposix says that the function
returns 0 if the string matche otherwise it returns REG_NOMATCH. The
REG_NOMATCH macro take the value 1 with posix regex and the value 17
with the pcreposix. The documentaion of the native pcre API (used with
JIT) returns a negative number if no match, otherwise, it returns 0 or a
positive number.

This patch fix also the return codes of the regex_exec* functions. Now,
these function returns true if the string match, otherwise it returns
false.
2014-06-18 15:07:50 +02:00
Sasha Pachev
218f064f55 MEDIUM: http: add actions "replace-header" and "replace-values" in http-req/resp
This patch adds two new actions to http-request and http-response rulesets :
  - replace-header : replace a whole header line, suited for headers
                     which might contain commas
  - replace-value  : replace a single header value, suited for headers
                     defined as lists.

The match consists in a regex, and the replacement string takes a log-format
and supports back-references.
2014-06-17 18:34:32 +02:00
Willy Tarreau
4bfc580dd3 MEDIUM: session: maintain per-backend and per-server time statistics
Using the last rate counters, we now compute the queue, connect, response
and total times per server and per backend with a 95% accuracy over the last
1024 samples. The operation is cheap so we don't need to condition it.
2014-06-17 17:15:56 +02:00
Willy Tarreau
2438f2b984 MINOR: freq_ctr: introduce a new averaging method
While the current functions report average event counts per period, we are
also interested in average values per event. For this we use a different
method. The principle is to rely on a long tail which sums the new value
with a fraction of the previous value, resulting in a sliding window of
infinite length depending on the precision we're interested in.

The idea is that we always keep (N-1)/N of the sum and add the new sampled
value. The sum over N values can be computed with a simple program for a
constant value 1 at each iteration :

    N
  ,---
   \       N - 1              e - 1
    >  ( --------- )^x ~= N * -----
   /         N                  e
  '---
  x = 1

Note: I'm not sure how to demonstrate this but at least this is easily
verified with a simple program, the sum equals N * 0.632120 for any N
moderately large (tens to hundreds).

Inserting a constant sample value V here simply results in :

   sum = V * N * (e - 1) / e

But we don't want to integrate over a small period, but infinitely. Let's
cut the infinity in P periods of N values. Each period M is exactly the same
as period M-1 with a factor of ((N-1)/N)^N applied. A test shows that given a
large N :

     N - 1           1
  ( ------- )^N ~=  ---
       N             e

Our sum is now a sum of each factor times  :

   N*P                                     P
  ,---                                   ,---
   \         N - 1               e - 1    \     1
    >  v ( --------- )^x ~= VN * -----  *  >   ---
   /           N                   e      /    e^x
  '---                                   '---
  x = 1                                  x = 0

For P "large enough", in tests we get this :

    P
  ,---
   \     1        e
    >   --- ~=  -----
   /    e^x     e - 1
  '---
  x = 0

This simplifies the sum above :

   N*P
  ,---
   \         N - 1
    >  v ( --------- )^x = VN
   /           N
  '---
  x = 1

So basically by summing values and applying the last result an (N-1)/N factor
we just get N times the values over the long term, so we can recover the
constant value V by dividing by N.

A value added at the entry of the sliding window of N values will thus be
reduced to 1/e or 36.7% after N terms have been added. After a second batch,
it will only be 1/e^2, or 13.5%, and so on. So practically speaking, each
old period of N values represents only a quickly fading ratio of the global
sum :

  period    ratio
    1       36.7%
    2       13.5%
    3       4.98%
    4       1.83%
    5       0.67%
    6       0.25%
    7       0.09%
    8       0.033%
    9       0.012%
   10       0.0045%

So after 10N samples, the initial value has already faded out by a factor of
22026, which is quite fast. If the sliding window is 1024 samples wide, it
means that a sample will only count for 1/22k of its initial value after 10k
samples went after it, which results in half of the value it would represent
using an arithmetic mean. The benefit of this method is that it's very cheap
in terms of computations when N is a power of two. This is very well suited
to record response times as large values will fade out faster than with an
arithmetic mean and will depend on sample count and not time.

Demonstrating all the above assumptions with maths instead of a program is
left as an exercise for the reader.
2014-06-17 17:15:51 +02:00
Willy Tarreau
588297f2f9 MINOR: tools: add new functions to quote-encode strings
qstr() and cstr() will be used to quote-encode strings. The first one
does it unconditionally. The second one is aimed at CSV files where the
quote-encoding is only needed when the field contains a quote or a comma.
2014-06-16 18:20:14 +02:00
Simon Horman
75ab8bdb83 MEDIUM: Add port_to_str helper
This helper is similar to addr_to_str but
tries to convert the port rather than the address
of a struct sockaddr_storage.

This is in preparation for supporting
an external agent check.

Signed-off-by: Simon Horman <horms@verge.net.au>
2014-06-16 10:10:33 +02:00
Willy Tarreau
8fccfa256e CLEANUP: connection: merge proxy proto v2 header and address block
This is in order to simplify the PPv2 header parsing code to look more
like the one provided as an example in the spec. No code change was
performed beyond just merging the proxy_addr union into the proxy_hdr_v2
struct.
2014-06-14 11:46:02 +02:00
Willy Tarreau
3a4ac422ce MINOR: tcp: prepare support for the "capture" action
A few minor entries will be needed to capture sample fetches in requests
or responses. This patch just prepares the code for this.
2014-06-13 16:32:48 +02:00
Willy Tarreau
54da8db40b MINOR: capture: extend the captures to support non-header keys
This patch adds support for captures with no header name. The purpose
is to allow extra captures to be defined and logged along with the
header captures.
2014-06-13 16:32:48 +02:00
Remi Gacogne
f46cd6e4ec MEDIUM: ssl: Add the option to use standardized DH parameters >= 1024 bits
When no static DH parameters are specified, this patch makes haproxy
use standardized (rfc 2409 / rfc 3526) DH parameters with prime lenghts
of 1024, 2048, 4096 or 8192 bits for DHE key exchange. The size of the
temporary/ephemeral DH key is computed as the minimum of the RSA/DSA server
key size and the value of a new option named tune.ssl.default-dh-param.
2014-06-12 16:12:23 +02:00
Willy Tarreau
c874653bb4 BUILD: don't use type "uint" which is not portable
Dmitry Sivachenko reported that "uint" doesn't build on FreeBSD 10.
On Linux it's defined in sys/types.h and indicated as "old". Just
get rid of the very few occurrences.
2014-05-28 23:05:07 +02:00
Willy Tarreau
ce3f913e48 MINOR: stats: add counters for SSL cache lookups and misses
One important aspect of SSL performance tuning is the cache size,
but there's no metric to know whether it's large enough or not. This
commit introduces two counters, one for the cache lookups and another
one for cache misses. These counters are reported on "show info" on
the stats socket. This way, it suffices to see the cache misses
counter constantly grow to know that a larger cache could possibly
help.
2014-05-28 16:53:04 +02:00
Willy Tarreau
0c9c2720dc MINOR: stats: report SSL key computations per second
It's commonly needed to know how many SSL asymmetric keys are computed
per second on either side (frontend or backend), and to know the SSL
session reuse ratio. Now we compute these values and report them in
"show info".
2014-05-28 12:28:58 +02:00
Sasha Pachev
c600204ddf BUG/MEDIUM: regex: fix risk of buffer overrun in exp_replace()
Currently exp_replace() (which is used in reqrep/reqirep) is
vulnerable to a buffer overrun. I have been able to reproduce it using
the attached configuration file and issuing the following command:

  wget -O - -S -q http://localhost:8000/`perl -e 'print "a"x4000'`/cookie.php

Str was being checked only in in while (str) and it was possible to
read past that when more than one character was being accessed in the
loop.

WT:
   Note that this bug is only marked MEDIUM because configurations
   capable of triggering this bug are very unlikely to exist at all due
   to the fact that most rewrites consist in static string additions
   that largely fit into the reserved area (8kB by default).

   This fix should also be backported to 1.4 and possibly even 1.3 since
   it seems to have been present since 1.1 or so.

Config:
-------

  global
        maxconn         500
        stats socket /tmp/haproxy.sock mode 600

  defaults
        timeout client      1000
        timeout connect      5000
        timeout server      5000
        retries         1
        option redispatch

  listen stats
        bind :8080
        mode http
        stats enable
        stats uri /stats
        stats show-legends

  listen  tcp_1
        bind :8000
        mode            http
        maxconn 400
        balance roundrobin
        reqrep ^([^\ :]*)\ /(.*)/(.*)\.php(.*) \1\ /\3.php?arg=\2\2\2\2\2\2\2\2\2\2\2\2\2\4
        server  srv1 127.0.0.1:9000 check port 9000 inter 1000 fall 1
        server  srv2 127.0.0.1:9001 check port 9001 inter 1000 fall 1
2014-05-27 14:36:06 +02:00
Willy Tarreau
23964187ae MINOR: checks: support a neutral check result
Agent will have the ability to return a weight without indicating an
up/down status. Currently this is not possible, so let's add a 5th
result CHK_RES_NEUTRAL for this purpose. It has been mapped to the
unused HCHK_STATUS_CHECKED which already serves as a neutral delimitor
between initiated checks and those returning a result.
2014-05-23 15:42:49 +02:00
Willy Tarreau
ed7df90068 MEDIUM: stats: introduce new actions to simplify admin status management
Instead of enabling/disabling maintenance mode and drain mode separately
using 4 actions, we now offer 3 simplified actions :
  - set state to READY
  - set state to DRAIN
  - set state to MAINT

They have the benefit of reporting the same state as displayed on the page,
and of doing the double-switch atomically eg when switching from drain to
maint.

Note that the old actions are still supported for users running scripts.
2014-05-23 14:29:11 +02:00
Willy Tarreau
bfc7b7acd8 MAJOR: checks: add support for a new "drain" administrative mode
This patch adds support for a new "drain" mode. So now we have 3 admin
modes for a server :
  - READY
  - DRAIN
  - MAINT

The drain mode disables load balancing but leaves the server up. It can
coexist with maint, except that maint has precedence. It is also inherited
from tracked servers, so just like maint, it's represented with 2 bits.

New functions were designed to set/clear each flag and to propagate the
changes to tracking servers when relevant, and to log the changes. Existing
functions srv_set_adm_maint() and srv_set_adm_ready() were replaced to make
use of the new functions.

Currently the drain mode is not yet used, however the whole logic was tested
with all combinations of set/clear of both flags in various orders to catch
all corner cases.
2014-05-23 14:29:11 +02:00
Willy Tarreau
8eb7784634 MINOR: server: implement srv_set_stopping()
This function was taken from check_set_server_drain(). It does not
consider health checks at all and only sets a server to stopping
provided it's not in maintenance and is not currently stopped. The
resulting state will be STOPPING. The state change is propagated
to tracked servers.

For now the function is not used, but the goal is to split health
checks status from server status and to be able to change a server's
state regardless of health checks statuses.
2014-05-23 14:29:11 +02:00
Willy Tarreau
dbd5e78f5b MINOR: server: implement srv_set_running()
This function was taken from check_set_server_up(). It does not consider
health checks at all and only sets a server up provided it's not in
maintenance. The resulting state may be either RUNNING or STARTING
depending on the presence of a slowstart or not. The state change is
propagated to tracked servers.

For now the function is not used, but the goal is to split health
checks status from server status and to be able to change a server's
state regardless of health checks statuses.
2014-05-23 14:29:11 +02:00
Willy Tarreau
e7d1ef16bf MINOR: server: implement srv_set_stopped()
This function was extracted from check_set_server_down(). In only
manipulates the server state and does not consider the health checks
at all, nor does it modify their status. It takes a reason message to
report in logs, however it passes NULL when recursing through the
trackers chain.

For now the function is not used, but the goal is to split health
checks status from server status and to be able to change a server's
state regardless of health checks statuses.
2014-05-23 14:29:11 +02:00
Willy Tarreau
bda92271e6 MINOR: server: make the status reporting function support a reason
srv_adm_append_status() was renamed srv_append_status() since it's no
more dedicated to maintenance mode. It now supports a reason which if
not null is appended to the output string.
2014-05-23 14:29:11 +02:00
Willy Tarreau
af54958d72 MEDIUM: checks: simplify server up/down/nolb transitions
We don't have to handle the maintenance transition here anymore so we
can simplify the functions and conditions. This also means that we don't
need the disable/enable functions but only a function to switch to each
new state.

It's worth mentionning that at this stage there are still confusions
between the server state and the checks states. For example, the health
check's state is adjusted from tracked servers changing state, while it
should not be.
2014-05-23 14:29:11 +02:00