It was disturbing to see a backend name associated with a bad request
when this "backend" was in fact the frontend. Instead, we now display
"backend <NONE>" if the "backend" has no backend capability :
> show errors
[25/Mar/2010:06:44:25.394] frontend fe (#1): invalid request
src 127.0.0.1, session #0, backend <NONE> (#-1), server <NONE> (#-1)
request length 45 bytes, error at position 0:
This warning was first reported by Ross West on FreeBSD, then by
Holger Just on OpenSolaris. It also happens on 64bit Linux. However,
fixing the format to use long int complains on 32bit Linux where
ptrdiff_t is apparently different. Better cast the pointer difference
to an int then.
When trying to spot some complex bugs, it's often needed to access
information on stuck sessions, which is quite difficult. This new
command helps one get detailed information about a session, with
flags, timers, states, etc... The buffer data are not dumped yet.
Often we need to understand why some transfers were aborted or what
constitutes server response errors. With those two counters, it is
now possible to detect an unexpected transfer abort during a data
phase (eg: too short HTTP response), and to know what part of the
server response errors may in fact be assigned to aborted transfers.
There are many information available in the stats page that can only
be seen when the mouse hovers over them. But it's hard to know where
those information are. Now with a discrete dotted underline it's easier
to spot those areas.
The current and max request rates are now reported when the mouse flies
over the session rate cur/max. The total requests is displayed with the
status codes over the total sessions cell.
It is wrong to merge FE and BE stats for a proxy because when we consult a
BE's stats, it reflects the FE's stats eventhough the BE has received no
traffic. The most common example happens with listen instances, where the
backend gets credited for all the trafic even when a use_backend rule makes
use of another backend.
This is a first attempt to add a maintenance mode on servers, using
the stat socket (in admin level).
It can be done with the following command :
- disable server <backend>/<server>
- enable server <backend>/<server>
In this mode, no more checks will be performed on the server and it
will be marked as a special DOWN state (MAINT).
If some servers were tracking it, they'll go DOWN until the server
leaves the maintenance mode. The stats page and the CSV export also
display this special state.
This can be used to disable the server in haproxy before doing some
operations on this server itself. This is a good complement to the
"http-check disable-on-404" keyword and works in TCP mode.
Supported informations, available via "tr/td title":
- cap: capabilities (proxy)
- mode: one of tcp, http or health (proxy)
- id: SNMP ID (proxy, socket, server)
- IP (socket, server)
- cookie (backend, server)
This patch adds add "a link" & "a href" html tags for sockets.
As sockets may have the same name like servers, I decided to
add "+" char (forbidden in names assigned to servers), as a prefix.
It is useless to report statistics if the feature was not enabled.
It also makes possible to distinguish if health analyses is
enabled or not only by looking at the stats page.
Implement decreasing health based on observing communication between
HAProxy and servers.
Changes in this version 2:
- documentation
- close race between a started check and health analysis event
- don't force fastinter if it is not set
- better names for options
- layer4 support
Changes in this version 3:
- add stats
- port to the current 1.4 tree
This patch extends and corrects the functionality introduced by
"Collect & provide http response codes received from servers":
- responses are now also accounted for frontends
- backend's and frontend's counters are incremented based
on responses sent to client, not received from servers
This patch adds <a href> html links for proxies, frontends, servers
and backends. Once located, can be clicked. Users no longer have to
manually add #anchor to stat's url.
This patch makes stats page about 30% smaller and
"CSS 2.1" + "HTML 4.01 Transitional" compliant.
There should be no visible differences.
Changes:
- add DOCTYPE for HTML 4.01 Transitional
- add missing </ul>
- remove cols=, AFAIK no modern browser support this property and
it prevents validation to pass.
- remove "align: center": there is no such property in css. There is
however "text-align: center" but it is definitely not what we would
like to see here.
- by default align .titre to center
- by default align .td to right
- remove all align=right, no longer necessary
- add class=ac (align center): shorter than "align=center" and use it when
necessary
- remove nowrap from td, instead use "white-space: nowrap" in css
Now stats page passes W3C validators for HTML & CSS. We may consider adding
"validated" icons from www.w3.org. ;)
This alone makes a typical HTML stats dump consume 10% CPU less,
because we avoid doing complex printf calls to drop them later.
Only a few common cases have been checked, those which are very
likely to run for nothing.
It is a bit expensive and complex to use to call buffer_feed()
directly from the request parser, and there are risks that some
output messages are lost in case of buffer full. Since most of
these messages are static, let's have a state dedicated to print
these messages and store them in a specific area shared with the
stats in the session. This both reduces code size and risks of
losing output data.
Capture & display more data from health checks, like
strerror(errno) for L4 failed checks or a first line
from a response for L7 successes/failed checks.
Non ascii or control characters are masked with
chunk_htmlencode() (html stats) or chunk_asciiencode() (logs).
The stats socket can now run at 3 different levels :
- user
- operator (default one)
- admin
These levels are used to restrict access to some information
and commands. Only the admin can clear all stats. A user cannot
clear anything nor access sensible data such as sessions or
errors.
The most common use of "clear counters" should be to only clear
max values without affecting cumulated values, for instance,
after an incident. So we change "clear counters" to only clear
max values, and add "clear counters all" to clear all counters.
If a frontend does not set 'option socket-stats', a 'clear counters'
on the stats socket could segfault because li->counters is NULL. The
correct fix is to check for NULL before as this is a valid situation.
There are a few remaining max values that need to move to counters.
Also, the counters are more often used than some config information,
so get them closer to the other useful struct members for better cache
efficiency.
This patch allows to collect & provide separate statistics for each socket.
It can be very useful if you would like to distinguish between traffic
generate by local and remote users or between different types of remote
clients (peerings, domestic, foreign).
Currently no "Session rate" is supported, but adding it should be possible
if we found it useful.
Doing this, we can remove the last BF_HIJACK user and remove
produce_content(). s->data_source could also be removed but
it is currently used to detect if the stats or a server was
used.
The stats handler used to store internal states in s->ana_state. Now
we only rely on si->st0 in which we can store as many states as we
have possible outputs. This cleans up the stats code a lot and makes
it more maintainable. It has also reduced code size by a few hundred
bytes.
We can simplify the code in the stats functions using buffer_feed_chunk()
instead of buffer_write_chunk(). Let's start with this function. This
patch also fixed an issue where we could dump past the end of the capture
buffer if it is shorter than the captured request.
Calling buffer_shutw() marks the buffer as closed but if it was already
closed in the other direction, the stream interface is not marked as
closed, causing infinite loops.
We took this opportunity to completely remove buffer_shutw() and buffer_shutr()
which have no reason to be used at all and which will always cause trouble
when directly called. The stats occurrence was the last one.