234 Commits

Author SHA1 Message Date
Willy Tarreau
dded32defa [MINOR] replace client_retnclose() with stream_int_retnclose()
This makes more sense to return a message to a stream interface
than to a session.

senddata.{c,h} have been removed.
2008-11-30 19:48:07 +01:00
Willy Tarreau
72b179a53c [MEDIUM] reintroduce BF_HIJACK with produce_content
The stats dump are back. Even very large config files with
5000 servers work fast and well. The SN_SELF_GEN flag has
completely been removed.
2008-11-02 10:19:06 +01:00
Willy Tarreau
1ae3a057df [MEDIUM] remove unused references to {CL|SV}_STSHUT*
All references to CL_STSHUT* and SV_STSHUT* were removed where
possible. Some of them could not be removed because they are
still in use by the unix sockets.

A bug remains at this stage. Injecting with a very short timeout
sometimes leads to a client in close state and a server in data
state with all buffer flags indicating a shutdown but the server
fd still enable, thus causing a busy loop.
2008-08-16 10:56:30 +02:00
Willy Tarreau
ec6c5df018 [CLEANUP] remove many #include <types/xxx> from C files
It should be stated as a rule that a C file should never
include types/xxx.h when proto/xxx.h exists, as it gives
less exposure to declaration conflicts (one of which was
caught and fixed here) and it complicates the file headers
for nothing.

Only types/global.h, types/capture.h and types/polling.h
have been found to be valid includes from C files.
2008-07-16 10:30:42 +02:00
Willy Tarreau
284648e079 [CLEANUP] remove unused include/types/client.h
This file is not used anymore.
2008-07-16 10:30:40 +02:00
Willy Tarreau
10522fd113 [MEDIUM] modularize the global "stats" keyword configuration parser
The "stats" keyword already relied on an external parser, let's
make use of the new keyword registration mechanism.
2008-07-09 20:12:41 +02:00
Willy Tarreau
0c303eec87 [MAJOR] convert all expiration timers from timeval to ticks
This is the first attempt at moving all internal parts from
using struct timeval to integer ticks. Those provides simpler
and faster code due to simplified operations, and this change
also saved about 64 bytes per session.

A new header file has been added : include/common/ticks.h.

It is possible that some functions should finally not be inlined
because they're used quite a lot (eg: tick_first, tick_add_ifset
and tick_is_expired). More measurements are required in order to
decide whether this is interesting or not.

Some function and variable names are still subject to change for
a better overall logics.
2008-07-07 00:09:58 +02:00
Krzysztof Piotr Oledzki
8e4b21d5eb [BUG] Flush buffers also where there are exactly 0 bytes left
I noticed it was possible to get truncated http/csv stats. Sometimes.
Usually the problem disappeared as fast as it appeared, but once it
happend that my http-stats page was truncated for about one hour.
It was quite weird as it happened independently for csv and http
output and it took me some time to track & fix this bug.

Both buffer_write & buffer_write_chunk used to return 0 in two
situations: is case of success or where there was exactly 0 bytes
left. The first one is intentional but I believe the second one
is not as it was not possible to distinguish between successful
write and unsuccessful one, which means that if the buffer was 100%
filled, it was never flushed and it was not possible to write
more data.

This patch fixes this problem.
2008-04-21 07:22:33 +02:00
Willy Tarreau
39f7e6d516 [MEDIUM] fix stats socket limitation to 16 kB
Due to the way the stats socket work, it was not possible to
maintain the information related to the command entered, so
after filling a whole buffer, the request was lost and it was
considered that there was nothing to write anymore.

The major reason was that some flags were passed directly
during the first call to stats_dump_raw() instead of being
stored persistently in the session.

To definitely fix this problem, flags were added to the stats
member of the session structure.

A second problem appeared. When the stats were produced, a first
call to client_retnclose() was performed, then one or multiple
subsequent calls to buffer_write_chunks() were done. But once the
stats buffer was full and a reschedule operated, the buffer was
flushed, the write flag cleared from the buffer and nothing was
done to re-arm it.

For this reason, a check was added in the proto_uxst_stats()
function in order to re-call the client FSM when data were added
by stats_dump_raw(). Finally, the whole unix stats dump FSM was
rewritten to avoid all the magics it depended on. It is now
simpler and looks more like the HTTP one.
2008-03-17 22:08:01 +01:00
Krzysztof Piotr Oledzki
2c6962c3c0 [MAJOR] proto_uxst rework -> SNMP support
Currently there is a ~16KB limit for a data size passed via unix socket.
It is caused by a trivial bug ttat is going to fixed soon, however
in most cases there is no need to dump a full stats.

This patch makes possible to select a scope of dumped data by extending
current "show stat" to "show stat [<iid> <type> <sid>]":
 - iid is a proxy id, -1 to dump all proxies
 - type selects type of dumpable objects: 1 for frontend, 2 for backend, 4 for
   server, -1 for all types. Values can be ORed, for example:
     1+2=3   -> frontend+backend.
     1+2+4=7 -> frontend+backend+server.
 - sid is a service id, -1 to dump everything from the selected proxy.

To do this I implemented a new session flag (SN_STAT_BOUND), added three
variables in data_ctx.stats (iid, type, sid), modified dumpstats.c and
completely revorked the process_uxst_stats: now it waits for a "\n"
terminated string, splits args and uses them. BTW: It should be quite easy
to add new commands, for example to enable/disable servers, the only problem
I can see is a not very lucky config name (*stats* socket). :|

During the work I also fixed two bug:
 - s->flags were not initialized for proto_uxst
 - missing comma if throttling not enabled (caused by a stupid change in
     "Implement persistent id for proxies and servers")

Other changes:
 - No more magic type valuse, use STATS_TYPE_FE/STATS_TYPE_BE/STATS_TYPE_SV
 - Don't memset full s->data_ctx (it was clearing s->data_ctx.stats.{iid/type/sid},
    instead initialize stats.sv & stats.sv_st (stats.px and stats.px_st were already
    initialized)

With all that changes it was extremely easy to write a short perl plugin
for a perl-enabled net-snmp (also included in this patch).

29385 is my PEN (Private Enterprise Number) and I'm willing to donate
the SNMPv2-SMI::enterprises.29385.106.* OIDs for HAProxy if there
is nothing assigned already.
2008-03-04 06:32:16 +01:00
Krzysztof Piotr Oledzki
f58a962247 [MINOR] Implement persistent id for proxies and servers
This patch adds a possibility to set a persistent id for a proxy/server.
Now, even if some proxies/servers are inserted/deleted/moved, iids and
sids can be still used reliable.

Some people add servers with tricky names (BACKEND or FRONTEND for example).
So I also added one more field ('type') to distinguish between a
backend (0), frontend (1) and server (2) without complicated logic:
if name==BACKEND and sid==0 then type is BACKEND else type is SERVER,
etc for a FRONTEND. It also makes possible to have one frontend with more
than one IP (a patch coming soon) with independed stats - for example to
differs between remote and local traffic.

Finally, I added documentation about the CSV format.

This patch depends on '[MEDIUM] Implement "track [<backend>/]<server>"'
2008-02-28 17:23:59 +01:00
Krzysztof Piotr Oledzki
c8b16fc948 [MEDIUM] Implement "track [<backend>/]<server>"
This patch implements ability to set the current state of one server
by tracking another one. It:
 - adds two variables: *tracknext, *tracked to struct server
 - implements findserver(), similar to findproxy()
 - adds "track" keyword accepting both "proxy/server" and "server" (assuming current proxy)
 - verifies if both checks and tracking is not enabled at the same time
 - changes set_server_down() to notify tracking server
 - creates set_server_up(), set_server_disabled(), set_server_enabled() by
   moving the code from process_chk() and adding notifications
 - changes stats to show a name of tracked server instead of Chk/Dwn/Dwntime(html)
   or by adding new variable (csv)

Changes from the previuos version:
 - it is possibile to track independently of the declaration order
 - one extra comma bug is fixed
 - new condition to check if there is no disable-on-404 inconsistency
2008-02-27 10:39:53 +01:00
Krzysztof Piotr Oledzki
25b501a6b1 [MEDIUM]: Count retries and redispatches also for servers, fix redistribute_pending, extend logs, %d->%u cleanup
This patch extends a little previously added functionality to also
count retries and redispatches for servers. Now it is possible to know
which server causes redispatches as it is not always the same that takes
most retries.

While working with the code I found that redistribute_pending() does not increment
srv->redispatches && be->redispatches. I don't know how to test it but
I think the fix is correct. If not I can withdraw it.

I also extended logs to show how many retries were done and if redispatching
was necessary ('+'). I'm using an additional session flag SN_REDISP to match
redispatched connections. I had to rearrange all defines in session.h to make
more room for it.

The documentation about logs was also fixed a little (sorry, english only),
as current version uses totally different format. BTW: examples are still
outdated, maybe next time...

Finally, I changed %d -> %u for retries/redispatches as those variables
are declared as unsigned.
2008-01-06 16:43:05 +01:00
Willy Tarreau
a8efd362b2 [STATS] add support for "show info" on the unix socket
It is sometimes required to know some informations such as the
process uptime when consulting statistics. This patch adds the
"show info" command to query those informations on the UNIX
socket.
2008-01-03 10:19:15 +01:00
Willy Tarreau
ddbb82ff47 [STATS] report the number of times each server was selected
One user reported that an indicator was missing in the statistics:
the number of times each server was selected by load balancing. It
is in fact the total number of sessions assigned to a server by the
load balancing algorithm. It should directly reflect the weight for
"fair" algorithms such as round-robin, since it will not account for
persistant connections.

It should help a lot tuning each server's weight depending on the
load it receives.
2007-12-05 10:34:49 +01:00
Willy Tarreau
5542af65dc [MEDIUM] slowstart: ensure we don't start with a null weight
Because of a divide, it was possible to have a null weight during
a slowstart, which is pretty annoying, especially with a single
server and a long slowstart.

Also, fix the way we report the values in the stats page to avoid
confusion.
2007-12-03 02:04:00 +01:00
Willy Tarreau
b3f32f5f8a [MEDIUM] add support for time units in the configuration
It is not always handy to manipulate large values exprimed
in milliseconds for timeouts. Also, some values are entered
in seconds (such as the stats refresh interval). This patch
adds support for time units. It knows about 'us', 'ms', 's',
'm', 'h', and 'd'. It automatically converts each value into
the caller's expected unit. Unit-less values are still passed
unchanged.

The unit must be passed as a suffix to the number. For instance:

     clitimeout 15m

If any character is not understood, an error is returned.
2007-12-02 22:15:14 +01:00
Willy Tarreau
4bab24d955 [MINOR] stats: report the server warm up status in a "throttle" column
A new "throttle" column has been added to HTML and RAW stats to indicate
in percent, the level of throttling due to server warmup. The column is
empty at 100%.
2007-11-30 18:16:29 +01:00
Willy Tarreau
2ea81930e7 [MEDIUM] report disabled servers as "NOLB" when they are still UP
It's important to be able to distinguish between servers which are UP
and those which are UP but disabled via a 404 response. For this reason,
the status entries report "NOLB" instead of "UP", and the HTML page uses
darker colors. As a complement, write "DOWN" in bold red on the backend
if it has no server left for load balancing.
2007-11-30 12:04:38 +01:00
Willy Tarreau
5dc2fa660c [MINOR] add a weight divisor to the struct proxy
Under some circumstances, it will be useful to be able to have
a server's effective weight bigger than the user weight, and this
is particularly true for dynamic weight-based algorithms. In order
to support this, we add a "wdiv" member to the lbprm structure
which will always be used to divide the weights before reporting
them.
2007-11-28 14:23:13 +01:00
Willy Tarreau
2069704492 [MEDIUM] differentiate between generic LB params and map-specific ones
Since the introduction of server weights, all load balancing algorithms
relied on a pre-computed map. Incidently, quite a bunch of map-specific
parameters were used at random places in order to get the number of
servers or their total weight. It was not architecturally acceptable
that optimizations for the map computation had impact on external parts.
For instance, during this cleanup it was found that a backend weight was
seen as 1 when only the first backup server is used, whatever its weight.

This cleanup consists in differentiating between LB-generic parameters,
such as total weights, number of servers, etc... and map-specific ones.
The struct proxy has been enhanced in order to make it easier to later
support other algorithms. The recount_servers() function now also
updates generic values such as total weights so that it's not needed
anymore to call recalc_server_map() when weights are needed. This
permitted to simplify some code which does not need to know about map
internals anymore.
2007-11-28 14:23:10 +01:00
Willy Tarreau
1fbe4932fc [BUG] missing header names in raw stats output
qlimit, pid, iid and sid were missing from the raw stats output
2007-11-26 16:15:35 +01:00
Willy Tarreau
dcd4771b3d [MINOR] stats: report numerical process ID, proxy ID and server ID
It is very convenient for SNMP monitoring to have unique process ID,
proxy ID and server ID. Those have been added to the CSV outputs.
The numbers start at 1. 0 is reserved. For servers, 0 means that the
reported name is not a server name but half a proxy (FRONTEND/BACKEND).

A remaining hidden "-" in the CSV output has been eliminated too.
2007-11-04 23:35:08 +01:00
Willy Tarreau
6fb42e0694 [MINOR] add an options field to the listeners 2007-11-04 22:42:48 +01:00
Elijah Epifanov
acafc5f88c [MEDIUM] add support for "maxqueue" to limit server queue overload
This patch adds the "maxqueue" parameter to the server. This allows new
sessions to be immediately rebalanced when the server's queue is filled.
It's useful when session stickiness is just a performance boost (even a
huge one) but not a requirement.

This should only be used if session affinity isn't a hard functional
requirement but provides performance boost by keeping server-local
caches hot and compact).

Absence of 'maxqueue' option means unlimited queue. When queue gets filled
up to 'maxqueue' client session is moved from server-local queue to a global
one.
2007-10-25 20:15:38 +02:00
Krzysztof Oledzki
85130941e7 [MEDIUM] stats: report server and backend cumulated downtime
Hello,

This patch implements new statistics for SLA calculation by adding new
field 'Dwntime' with total down time since restart (both HTTP/CSV) and
extending status field (HTTP) or inserting a new one (CSV) with time
showing how long each server/backend is in a current state. Additionaly,
down transations are also calculated and displayed for backends, so it is
possible to know how many times selected backend was down, generating "No
server is available to handle this request." error.

New information are presentetd in two different ways:
   - for HTTP: a "human redable form", one of "100000d 23h", "23h 59m" or
      "59m 59s"
   - for CSV: seconds

I believe that seconds resolution is enough.

As there are more columns in the status page I decided to shrink some
names to make more space:
   - Weight -> Wght
   - Check -> Chk
   - Down -> Dwn

Making described changes I also made some improvements and fixed some
small bugs:
   - don't increment s->health above 's->rise + s->fall - 1'. Previously it
     was incremented an then (re)set to 's->rise + s->fall - 1'.
   - do not set server down if it is down already
   - do not set server up if it is up already
   - fix colspan in multiple places (mostly introduced by my previous patch)
   - add missing "status" header to CSV
   - fix order of retries/redispatches in server (CSV)
   - s/Tthen/Then/
   - s/server/backend/ in DATA_ST_PX_BE (dumpstats.c)

Changes from previous version:
  - deal with negative time intervales
  - don't relay on s->state (SRV_RUNNING)
  - little reworked human_time + compacted format (no spaces). If needed it
    can be used in the future for other purposes by optionally making "cnt"
    as an argument
  - leave set_server_down mostly unchanged
  - only little reworked "process_chk: 9"
  - additional fields in CSV are appended to the rigth
  - fix "SEC" macro
  - named arguments (human_time, be_downtime, srv_downtime)

Hope it is OK. If there are only cosmetic changes needed please fill free
to correct it, however if there are some bigger changes required I would
like to discuss it first or at last to know what exactly was changed
especially since I already put this patch into my production server. :)

Thank you,

Best regards,

 				Krzysztof Oledzki
2007-10-22 21:36:23 +02:00
Willy Tarreau
d4e1b5ffa5 [MINOR] stats: update the width of the table to 22 columns
Unfortunately, we forgot to increase the table from 20 to 22 cols when
we added retries and redisp.
2007-10-19 06:23:19 +02:00
Krzysztof Oledzki
1cf36ba3ae [MEDIUM] stats: count server retries and redispatches
It is important to know how your installation performs. Haproxy masks
connection errors, which is extremely good for a client but it is bad for
an administrator (except people believing that "ignorance is a bless").

Attached patch adds retries and redispatches counters, so now haproxy:

1. For server:
 - counts retried connections (masked or not)

2. For backends:
 - counts retried connections (masked or not) that happened to
    a slave server
 - counts redispatched connections
 - does not count successfully redispatched connections as backend errors.
    Errors are increased only when client does not get a valid response,
    in other words: with failed redispatch or when this function is not
    enabled.

3. For statistics:
 - display Retr (retries) and Redis (redispatches) as a "Warning"
   information.
2007-10-18 19:12:30 +02:00
Willy Tarreau
1388a3a8e8 [BUG] scope "." must match the backend and not the frontend 2007-10-18 16:38:37 +02:00
Willy Tarreau
fbee71331d [MEDIUM] introduce the "stats" keyword in global section
Removed old unused MODE_LOG and MODE_STATS, and replaced the "stats"
keyword in the global section. The new "stats" keyword in the global
section is used to create a UNIX socket on which the statistics will
be accessed.  The client must issue a "show stat\n" command in order
to get a CSV-formated output similar to the output on the HTTP socket
in CSV mode.
2007-10-18 14:16:11 +02:00
Willy Tarreau
3e76e728ce [MEDIUM] implement the statistics output on a unix socket
A unix socket can now access the statistics. It currently only
recognizes the "show stat\n" command at the beginning of the
input, then returns the statistics in CSV format.
2007-10-18 14:13:13 +02:00
Willy Tarreau
5031e6adf5 [MINOR] add a link to the CSV export on the stats page. 2007-10-18 14:12:30 +02:00
Willy Tarreau
55bb8450c0 [MEDIUM] implement the CSV output for the statistics
It is now possible to get CSV ouput from the statistics by
simply appending ";csv" to the HTTP request sent to get the
stats. The fields keep the same ordering as in the HTML page,
and a field "pxname" has been prepended at the beginning of
the line.
2007-10-18 14:12:28 +02:00
Willy Tarreau
9186126e1c [MEDIUM] moved stats and buffer generic functions to new files
Neither the primitives used to write data to a buffer, nor the stats
dump functions are HTTP-specific anymore. Move them to dedicated
files
2007-10-18 14:12:21 +02:00