Just like the previous commit, we sometimes want to limit the rate of
incoming SSL connections. While it can be done for a frontend, it was
not possible for a whole process, which makes sense when multiple
processes are running on a system to serve multiple customers.
The new global "maxsslrate" setting can be used to set a limit on the
session rate going to the SSL frontends. The limit applies before the
SSL handshake and not after, so that it saves the SSL stack from
performing expensive key computations for connections that would be
aborted before being accounted for anyway.
The same setting may be changed at run time on the CLI using
"set rate-limit ssl-session global".
It's sometimes useful to be able to limit the connection rate on a machine
running many haproxy instances (e.g. per customer), but doing so removes the
ability for that machine to defend itself against a DoS. Thus, better also
provide a limit on the session rate, which does not include the connections
rejected by "tcp-request connection" rules. This makes it possible to have
much higher limits on the connection rate without having to raise the
session rate limit to insane values.
The limit can be changed on the CLI using "set rate-limit sessions global",
or in the global section using "maxsessrate".
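For example, combining a high connection rate limit with a lower session
rate limit (hypothetical values and socket path) :

  global
      maxconnrate 10000
      maxsessrate 1000

  $ echo "set rate-limit sessions global 1000" | socat stdio /var/run/haproxy.stat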
In addition to previous outputs, we also emit the cumulated number of
connections, the cumulated number of requests, the maximum allowed
SSL connection concurrency, the current number of SSL connections and
the cumulated number of SSL connections. This will help troubleshoot
systems which experience memory shortage due to SSL.
It's easier and safer to rely on conn_xprt_ready() everywhere than to
check the flag itself. It will also simplify adding extra checks later
if needed. Some useless controls for !xprt have been removed, as the
XPRT_READY flag itself guarantees xprt is set.
It's easier and safer to rely on conn_ctrl_ready() everywhere than to
check the flag itself. It will also simplify adding extra checks later
if needed. Some useless controls for !ctrl have been removed, as the
CTRL_READY flag itself guarantees ctrl is set.
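In practice both helpers boil down to a simple flag test ; a sketch of what
they may look like (the exact declarations in the code may differ) :

  /* returns true if the transport layer is ready */
  static inline int conn_xprt_ready(const struct connection *conn)
  {
      return (conn->flags & CO_FL_XPRT_READY);
  }

  /* returns true if the control layer is ready */
  static inline int conn_ctrl_ready(const struct connection *conn)
  {
      return (conn->flags & CO_FL_CTRL_READY);
  }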
We're completely changing the way FDs will be polled. There will be no
more speculative I/O since we'll know the exact FD state, so these will
only be cached events.
First, let's fix a few field names which become confusing. "spec_e" was
used to store a speculative I/O event state. Now we'll store the whole
R/W states for the FD there. "spec_p" was used to store a speculative
I/O cache position. Now let's clearly call it "cache".
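A sketch of the rename in the FD table entry (the new name for "spec_e" is
assumed here, only "cache" is named explicitly above) :

  /* before */
  unsigned char spec_e;  /* speculative I/O event state */
  unsigned int  spec_p;  /* speculative I/O cache position */

  /* after */
  unsigned char state;   /* full R/W states of the FD */
  unsigned int  cache;   /* position in the FD cache */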
We're completely changing the way FDs will be polled. First, let's fix
a few field names which become confusing. "spec_e" was used to store a
speculative I/O event state. Now we'll store the whole R/W states for
the FD there.
The value of the variable "appctx->ctx.map.ent" is used after the loop,
but its value has changed. The variable "value" is initialized and
contains the correct value.
This is a recent bug, no backport is needed.
Since commit 6b66f3e ([MAJOR] implement autonomous inter-socket forwarding)
introduced in 1.3.16-rc1, we've been relying on a stupid mechanism to wake
up the task after a write, which was an exact copy-paste of the reader side.
The principle was that if we empty a buffer and there's no forwarding
scheduled or if the *producer* is not in a connected state, then we wake
the task up.
That does not make any sense. It happens to wake up too late sometimes
(e.g. when the request analyser waits for some room in the buffer to start
working), and leads to unneeded wakeups in client-side keep-alive, because
the task is woken up when the response is sent, while the analysers are
simply waiting for a new request.
In order to fix this, we introduce a new channel flag : CF_WAKE_WRITE. It
is designed so that an analyser can explicitly request being notified when
some data were written. It is used only when the HTTP request or response
analysers need to wait for more room in the buffers. It is automatically
cleared upon wake up.
The flag is also automatically set by the functions which try to write into
a buffer from an applet when they fail (bi_putblk(), etc.).
That allows us to remove the stupid condition above and avoid some wakeups.
In http-server-close and in http-keep-alive modes, this reduces from 4 to 3
the average number of wakeups per request, and increases the overall
performance by about 1.5%.
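As a sketch, an analyser needing more room would proceed along these lines
(simplified ; the buffer-full test is illustrative) :

  /* HTTP request analyser : not enough room to proceed */
  if (buffer_full(req->buf)) {
      req->flags |= CF_WAKE_WRITE;  /* wake the task once data are written */
      return 0;                     /* stop analysing until then */
  }

The write path then wakes the task up only if this flag is set, clearing
it upon wake-up.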
This reverts commit f3221f99acdd792352d4ee648d987270d74ca38e.
Igor reported some very strange breakage of his stats page which is
clearly caused by the chunking, though I don't see at first glance
what could be wrong. Better revert it for now.
In theory the principle is simple as we just need to send HTTP chunks
if the client is 1.1 compatible. In practice it's harder because we
have to append a CR LF after each block of data and we're never sure
to have the room for this. In order not to have to deal with this, we
instead send the CR LF prior to each chunk size. The only issue is with
the first chunk, and for this reason we avoid sending the empty header
line when using chunked encoding.
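Schematically, the response is then emitted as follows (wire-format
sketch) :

  HTTP/1.1 200 OK
  Transfer-Encoding: chunked          <- no empty line sent here
  <CRLF><size1><CRLF><data1>
  <CRLF><size2><CRLF><data2>
  <CRLF>0<CRLF><CRLF>                 <- last chunk, end of message

The leading CRLF of the first chunk completes the header block, and each
following CRLF terminates the previous chunk's data.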
When enabling a tracked server via the web interface, we must first
check if the server tracks another one and the state of this tracked
server, just like the command line does.
Failure to do so causes incorrect logs to be emitted when the server
is enabled :
[WARNING] 361/212556 (2645) : Server bck2/srv3 is DOWN via bck2/srv2. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 361/212603 (2645) : Server bck2/srv3 is DOWN for maintenance.
--> enable server now
[WARNING] 361/212606 (2645) : Server bck2/srv3 is UP (leaving maintenance).
With this fix, it's correct now :
[WARNING] 361/212805 (2666) : Server bck2/srv3 is DOWN via bck2/srv2. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 361/212813 (2666) : Server bck2/srv3 is DOWN for maintenance.
--> enable server now
[WARNING] 361/212821 (2666) : Server bck2/srv3 is DOWN via bck2/srv2. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
It does not seem necessary to backport this fix : considering that it
depends on extremely fragile behaviours, a backport carries more risk of
breakage than the current inconvenience is worth.
On several browsers, the monospace font used to display numbers in tips
is not very readable. Since the numbers are aligned anyway, there is too
little benefit in using such a font.
Since the recent addition of map updates, haproxy does not build anymore
on Solaris because "s_addr" is a #define :
src/dumpstats.c: In function `stats_map_lookup':
src/dumpstats.c:4688: error: syntax error before '.' token
src/dumpstats.c:4781: error: `S_un' undeclared (first use in this function)
src/dumpstats.c:4781: error: (Each undeclared identifier is reported only once
src/dumpstats.c:4781: error: for each function it appears in.)
make: *** [src/dumpstats.o] Error 1
Simply rename the variable.
Health checks can now be paused. This is the status they get when the
server is put into maintenance mode, which is more logical than relying
on the server's state in various places. It will be needed to allow agent
checks to run when health checks are disabled (currently not possible).
Having the check state partially stored in the server doesn't help.
Some functions such as srv_getinter() rely on the server being checked
to decide what check frequency to use, instead of relying on the check
being configured. So let's get rid of SRV_CHECKED and SRV_AGENT_CHECKED
and only use the check's states instead.
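After the change, srv_getinter() may look like this (sketch ; flag and
field names follow the description above and may differ slightly) :

  static inline int srv_getinter(const struct check *check)
  {
      const struct server *s = check->server;

      if ((check->state & CHK_ST_CONFIGURED) &&
          (check->health == check->rise + check->fall - 1))
          return check->inter;

      if (!(s->state & SRV_RUNNING) && check->health == 0)
          return check->downinter ? check->downinter : check->inter;

      return check->fastinter ? check->fastinter : check->inter;
  }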
At the moment, health checks and agent checks are tied : no agent
check is emitted if no health check is enabled. Other parameters
are considered in the condition for letting checks run. This will
help us selectively enable checks (agent and regular checks) and
know whether they're enabled/disabled and configured or not. Now
we can already emit an error when trying to enable an unconfigured
agent.
The flag CHK_STATE_RUNNING is misleading as one may believe it means
the state is enabled (just like SRV_RUNNING). Let's rename these two
flags CHK_ST_INPROGRESS and CHK_ST_DISABLED.
This patch adds map manipulation commands to the socket interface.
add map <map> <key> <value>
Add the value <value> in the map <map>, at the entry corresponding to
the key <key>. This command does not verify if the entry already
exists.
clear map <map>
Remove entries from the map <map>.
del map <map> <key>
Delete all the map entries corresponding to the <key> value in the map
<map>.
set map <map> <key> <value>
Modify the value corresponding to each key <key> in a map <map>. The
new value is <value>.
show map [<map>]
Dump info about map converters. Without argument, the list of all
available maps is returned. If a <map> is specified, its content is
dumped.
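For example (hypothetical map file and stats socket paths) :

  $ echo "show map" | socat stdio /var/run/haproxy.stat
  $ echo "add map /etc/haproxy/urls.map example.org bk_static" | socat stdio /var/run/haproxy.stat
  $ echo "del map /etc/haproxy/urls.map example.org" | socat stdio /var/run/haproxy.stat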
We'll need to pass patterns on the CLI for lookups. Till now there was no
need for backslashes, so it's the right time to add support for them, just
like in the config file.
When dumping a session, it can be useful to know what applet it is
connected to instead of having just the appctx pointer. We also
report st0/st1/st2 to help debugging.
The task returned by stream_int_register_handler() is never used, however we
always need to access the appctx afterwards. So make it return the appctx
instead. We already plan for it to fail, which is the reason for the addition
of a few tests and the possibility for the HTTP analyser to return a status
code 500.
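Call sites then follow this pattern (sketch ; the applet name is only an
example) :

  appctx = stream_int_register_handler(si, &http_stats_applet);
  if (!appctx)
      goto return_int_500;  /* allocation failed, report a 500 */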
We're about to remove si->appctx, so first let's replace all occurrences
of its usage with a dynamic extract from si->end. A lot of code was changed
by search-n-replace, but the behaviour was intentionally not altered.
The code surrounding calls to stream_int_register_handler() was slightly
changed since we can only use si->end *after* the registration.
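The mechanical substitution looks like this (sketch) :

  /* before */
  struct appctx *appctx = si->appctx;

  /* after : extract the applet context from the endpoint */
  struct appctx *appctx = objt_appctx(si->end);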
The outgoing connection is now allocated dynamically upon the first attempt
to touch the connection's source or destination address. If this allocation
fails, we fail on SN_ERR_RESOURCE.
As we didn't use si->conn anymore, it was removed. The endpoints are released
upon session_free(), on the error path, and upon a new transaction. That way
we are able to carry the existing server's address across retries.
The stream interfaces are not initialized anymore before session_complete(),
so we could even think about allocating them dynamically as well, though
that would not provide much savings.
The session initialization now makes use of conn_new()/conn_free(). This
slightly simplifies the code and makes it more logical. The connection
initialization code is now shorter by about 120 bytes because it's done
at once, allowing the compiler to remove all redundant initializations.
The si_attach_applet() function now takes care of first detaching the
existing endpoint, and it is called from stream_int_register_handler(),
so we can safely remove the calls to si_release_endpoint() in the
application code around this call.
A call to si_detach() was made upon stream_int_unregister_handler() to
ensure we always free the allocated connection if one was allocated in
parallel to setting an applet (e.g. detecting an HTTP proxy request while
proceeding with stats).
Currently the control and transport layers of a connection are supposed
to be initialized when their respective pointers are not NULL. This will
not work anymore when we plan to reuse connections, because there is an
asymmetry between the accept() side and the connect() side :
- on accept() side, the fd is set first, then the ctrl layer then the
transport layer ; upon error, they must be undone in the reverse order,
then the FD must be closed. The FD must not be deleted if the control
layer was not yet initialized ;
- on the connect() side, the fd is set last and there is no reliable way
to know if it has been initialized or not. In practice it's initialized
to -1 first but this is hackish and supposes that local FDs only will
be used forever. Also, there are even less solutions for keeping trace
of the transport layer's state.
Also, it is possible to support delayed close() when something (e.g. logs)
tracks some information requiring the transport and/or control layers,
making it even more difficult to clean them up.
So the proposed solution is to add two flags to the connection :
- CO_FL_CTRL_READY is set when the control layer is initialized (fd_insert)
and cleared after it's released (fd_delete).
- CO_FL_XPRT_READY is set when the transport layer is initialized (xprt->init)
and cleared after it's released (xprt->close).
The functions have been adapted to rely on this and not on the pointers
anymore. conn_xprt_close() was unused and dangerous : it did not close
the control layer (e.g. the socket itself) but still marked the transport
layer as closed, preventing any future call to conn_full_close() from
finishing the job.
The problem comes from conn_full_close() in fact. It needs to close the
xprt and ctrl layers independently. After that we're still facing an issue :
we don't know based on ->ctrl alone whether the fd was registered or not.
For this we use the two new flags CO_FL_XPRT_READY and CO_FL_CTRL_READY. We
now rely on this and not on conn->xprt nor conn->ctrl anymore to decide what
remains to be done on the connection.
In order not to miss some flag assignments, we introduce conn_ctrl_init()
to initialize the control layer, register the fd using fd_insert() and set
the flag, and conn_ctrl_close() which unregisters the fd and removes the
flag, but only if the transport layer was closed.
Similarly, at the transport layer, conn_xprt_init() calls ->init and sets
the flag, while conn_xprt_close() checks the flag, calls ->close and clears
the flag, regardless of xprt_ctx or xprt_st. This also ensures that the ->init
and the ->close functions are called only once each and in the correct order.
Note that conn_xprt_close() does nothing if the transport layer is still
tracked.
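A sketch of the resulting pair, following the description above :

  static inline int conn_xprt_init(struct connection *conn)
  {
      int ret = 0;

      if (!conn_xprt_ready(conn) && conn->xprt && conn->xprt->init)
          ret = conn->xprt->init(conn);

      if (ret >= 0)
          conn->flags |= CO_FL_XPRT_READY;
      return ret;
  }

  static inline void conn_xprt_close(struct connection *conn)
  {
      if ((conn->flags & (CO_FL_XPRT_READY | CO_FL_XPRT_TRACKED)) == CO_FL_XPRT_READY) {
          if (conn->xprt && conn->xprt->close)
              conn->xprt->close(conn);
          conn->flags &= ~CO_FL_XPRT_READY;
      }
  }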
conn_full_close() now simply calls conn_xprt_close() then conn_ctrl_close()
in turn, which do nothing if CO_FL_XPRT_TRACKED is set.
In order to handle the error path, we also provide conn_force_close() which
ignores CO_FL_XPRT_TRACKED and closes the transport and the control layers
in turn. All relevant instances of fd_delete() have been replaced with
conn_force_close(). Now we always know what state the connection is in and
we can expect to split its initialization.
When we know we're not going to use a connection on a stream interface
because we're using an applet instead, do not allocate a connection, or
release the preallocated one. We do that for peers and CLI only at the
moment, and not for HTTP stats which in the future might be adapted to
support keep-alive.
The connection pointer is simply set to NULL, which pool_free2() already
supports.
The connection will only remain there as a pre-allocated entity whose
goal is to be placed in ->end when establishing an outgoing connection.
All connection initialization can be made on this connection, but all
information retrieved should be applied to the end point only.
This change is huge because there were many users of si->conn. Now the
only users are those who initialize the new connection. The difficulty
appears in a few places such as backend.c, proto_http.c, peers.c where
si->conn is used to hold the connection's target address before assigning
the connection to the stream interface. This is why we have to keep
si->conn for now. A future improvement might consist in dynamically
allocating the connection when it is needed.
The long-term goal is to have a context for applets as an alternative
to the connection and not as a complement. At the moment, the context
is still stored into the stream interface, and we only put a pointer
to the applet's context in si->end, initialize the context with object
type OBJ_TYPE_APPCTX, and this allows us not to allocate an entry when
deciding to switch to an applet.
Special care is taken never to dereference si->conn anymore when
dealing with an applet. That's why it's important that si->end is
always set to the proper type :
si->end == NULL => not connected to anything
*si->end == OBJ_TYPE_APPCTX => connected to an applet
*si->end == OBJ_TYPE_CONN => real connection (server, proxy, ...)
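Code dealing with the endpoint can then dispatch on its type, e.g. (sketch) :

  switch (obj_type(si->end)) {
  case OBJ_TYPE_APPCTX:
      /* connected to an applet */
      break;
  case OBJ_TYPE_CONN:
      /* real connection (server, proxy, ...) */
      break;
  default:
      /* si->end == NULL : not connected to anything */
      break;
  }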
The session management code used to check the applet from the connection's
target. Now it uses the stream interface's end point and does not touch the
connection at all. Similarly, we stop checking the connection's addresses
and file descriptors when reporting the applet's status in the stats dump.
Since this is the applet context, call it ->appctx to avoid the confusion
with the pointer to the applet. Many places were changed but it's only a
renaming.
There is a big trouble with the way POST is handled for the admin
stats page. The POST parameters are extracted from some http-request
rules, and if not found they return zero, hoping to be called again
when more data passes. This results in the HTTP analyser being called
several times and all the rules prior to the stats being executed
multiple times as well. That includes rewrite rules.
So instead of doing this, we now move all the processing of the stats
into the stats applet.
That way we just set the stats applet in the HTTP analyser when a stats
request is detected, and the applet takes the time it needs to read the
arguments and respond. We could even imagine improving the applet to
support requests larger than a single buffer.
The code was almost only moved and minimally changed. Several new HTTP
states were added to the stats applet to emit headers, redirects and
to read POST. It was necessary to do this because the headers sent
depend on the parsing of the POST request. In the end it's beneficial
because we removed two stream_int_retnclose() calls.
In preparation for moving the POST processing to the applet, we first
add new states to the HTTP I/O handler. Till now st0 was only 0/1 for
start/end. We now replace it with an enum.
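The states may be sketched as follows (names illustrative ; the exact set
in the code may differ) :

  /* appctx->st0 values for the HTTP stats I/O handler */
  enum {
      STAT_HTTP_DONE = 0,  /* finished */
      STAT_HTTP_HEAD,      /* send headers before dump */
      STAT_HTTP_DUMP,      /* dumping stats */
      STAT_HTTP_POST,      /* waiting for POST data */
      STAT_HTTP_LAST,      /* sending last chunk of response */
  };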
This field was used by dumpstats to retrieve a pointer to the current
session, which may already be found from ->owner. With this change,
the stats code doesn't need the connection at all anymore.
We're trying to move the applets out of the struct connection. So
let's remove the dependence on xprt_st and introduce si->applet.st2
to store the missing contextual data instead.
We now have to report 2 conflicting information on the stats page :
- NOLB = server which returns 404 and stops load balancing ;
- DRAIN = server with a weight forced to zero
The DRAIN state was previously detected from eweight==0 and represented in
blue so that a temporarily disabled server was noticed. This was done by
commit cc8bb92 (MINOR: stats: show soft-stopped servers in different color).
This choice suffered from a small defect however, which is that a server
with a zero weight was reported in this color whatever its state (even down
or switching).
Also, one of the motivations for the color above was because the NOLB state
is barely detectable as it's very close to the UP state.
Since commit 8c3d0be (MEDIUM: Add DRAIN state and report it on the stats page),
we have the new DRAIN state to show servers with a zero weight. The colors are
unfortunately very close to those of the MAINT state, and some users were
confused by the disappearance of the blue bars.
Additionally, the NOLB state had precedence over DRAIN, which could be an
issue since DRAIN is the only thing the admin can act on, so once NOLB was
shown, there was nothing to indicate that the weight was forced to zero.
By switching the two priorities we can report DRAIN (forced mode) before
NOLB (detected mode).
The best solution to fix all this is to reuse the previous blue color for
all cases where weight == 0, whether it's set by config / agent / cli (DRAIN)
or detected by a 404 response (NOLB). However we only use this color when the
server is 100% UP. If it's going down we switch to the usual yellow color
showing failed checks, and when it's down it keeps its usual red color.
That way, a blue bar on the display indicates a server not taking new
sessions but perfectly up. And other colors keep their usual meaning.
When a server tracks another one, its state on the stats page always reports
"via xx/yy". That's convenient to know what server to act on to change the
state. But it is also possible to force the tracking server itself into
maintenance mode and in this case we should not report "via xx/yy" because
the tracked server can't do anything to change the server's state, which
is confusing. In practice there is nothing wrong in leaving it as-is,
except that it's highly misleading when looking at the stats page.
Note that we only change the HTML output, not the CSV one. The states are
already different : "MAINT" vs "MAINT(via)" and we expect anyone coding a
monitoring system based on the CSV output to know the differences between
all possible states.
Add a DRAIN sub-state for a server which will be shown on the stats page
instead of UP if its effective weight is zero.

Also, log if a server enters or leaves the DRAIN state as the result of
an agent check.
Signed-off-by: Simon Horman <horms@verge.net.au>
The syntax of these new commands is:
enable agent <backend>/<server>
disable agent <backend>/<server>
These commands allow temporarily stopping and subsequently
re-starting an auxiliary agent check. The effect of this is as follows:
New checks are only initialised when the agent is in the enabled state.
Thus, "disable agent" will prevent any new agent checks from being
initiated until the agent is re-enabled using "enable agent".
When an agent is disabled the processing of an auxiliary agent check that
was initiated while the agent was set as enabled is as follows: All
results that would alter the weight, specifically "drain" or a weight
returned by the agent, are ignored. The processing of agent checks is
otherwise unchanged.
The motivation for this feature is to allow the weight changing effects
of the agent checks to be paused to allow the weight of a server to be
configured using set weight without being overridden by the agent.
Signed-off-by: Simon Horman <horms@verge.net.au>
This is achieved by moving rise and fall from struct server to struct check.
After this move the behaviour of the primary check, server->check, is
unchanged. However, the secondary agent check, server->agent, now has
independent rise and fall values, each of which is set to 1.
The result is that receiving "fail", "stopped" or "down" just once from the
agent will mark the server as down. And receiving a weight just once will
allow the server to be marked up if its primary check is in good health.
This opens up the scope to allow the rise and fall values of the agent
check to be configurable, however this has not been implemented at this
stage.
Signed-off-by: Simon Horman <horms@verge.net.au>
The column used to report the throttle percentage when a server is in
slowstart is based on the time only. This is wrong, because server weights
in slowstart are updated at most once a second, so the reported value is
wrong at least for one second during each step, which means all the time
when using short delays (< 20s).
The second point is that it's disturbing to see a weight < 100% without
any throttle at the end of the period (during the last second), because
the effective weight has not yet been updated.
Instead, we now compute the exact ratio between eweight and uweight and
report it. It's always accurate and describes the value being used instead
of using only the date.
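Conceptually, leaving internal weight scaling aside, the reported value
becomes :

  throttle% = 100 * eweight / uweight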
It can be backported to 1.4 though it's not particularly important.
This is in preparation for associating an agent check
with a server, which runs alongside the server's existing check.
Signed-off-by: Simon Horman <horms@verge.net.au>
Add state to struct check. This is currently used to store one bit,
CHK_RUNNING, which is set if a check is running and clear otherwise.
This bit was previously SRV_CHK_RUNNING of the state element of struct
server.
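A sketch of the addition (the exact flag value is illustrative) :

  #define CHK_RUNNING 0x0001  /* this check is currently running */

  struct check {
      int state;  /* CHK_RUNNING, ... (was SRV_CHK_RUNNING in server->state) */
      /* other existing fields unchanged */
  };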
This is in preparation for associating an agent check
with a server, which runs alongside the server's existing check.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Parametrise the following functions over the check of a server
* set_server_down
* set_server_up
* srv_getinter
* server_status_printf
* set_server_check_status
* set_server_disabled
* set_server_enabled
Generally the server parameter of these functions has been removed.
Where it is still needed it is obtained using check->server.
This is in preparation for associating an agent check
with a server, which runs alongside the server's existing check.
By parametrising these functions they may act on each of the checks
without further significant modification.
Explanation of the SSP_O_HCHK portion of this change:
* Prior to this patch SSP_O_HCHK served a single purpose, which
was to tell server_status_printf() whether it should print
the details of the check of a server or not.
With the parametrisation that this patch adds there are two cases.
1) Printing the details of the check, in which case a
valid check parameter is needed.
2) Not printing the details of the check, in which case
the contents of the check parameter are unused.
In case 1) we could pass SSP_O_HCHK and a valid check, and
in case 2) we could pass !SSP_O_HCHK and any value for check,
including NULL.
If NULL is used for case 2) then SSP_O_HCHK becomes superfluous,
and as NULL is used for case 2), SSP_O_HCHK has been removed.
Signed-off-by: Simon Horman <horms@verge.net.au>
Mark Brooks reported the following issue :
"My table looks like this -
0x24a8294: key=192.168.136.10 use=0 exp=1761492 server_id=3
0x24a8344: key=192.168.136.11 use=0 exp=1761506 server_id=2
0x24a83f4: key=192.168.136.12 use=0 exp=1761520 server_id=3
0x24a84a4: key=192.168.136.13 use=0 exp=1761534 server_id=2
0x24a8554: key=192.168.136.14 use=0 exp=1761548 server_id=3
0x24a8604: key=192.168.136.15 use=0 exp=1761563 server_id=2
0x24a86b4: key=192.168.136.16 use=0 exp=1761580 server_id=3
0x24a8764: key=192.168.136.17 use=0 exp=1761592 server_id=2
0x24a8814: key=192.168.136.18 use=0 exp=1761607 server_id=3
0x24a88c4: key=192.168.136.19 use=0 exp=1761622 server_id=2
0x24a8974: key=192.168.136.20 use=0 exp=1761636 server_id=3
0x24a8a24: key=192.168.136.21 use=0 exp=1761649 server_id=2
im running the command -
socat unix-connect:/var/run/haproxy.stat stdio <<< 'clear table VIP_Name-2 data.server_id eq 2'
Id assume that the entries with server_id = 2 would be removed but its
removing everything each time."
The cause of the issue is a missing test for skip_entry when deciding
whether to clear the key or not. The test was present when only the
last node is to be removed, so removing only the first node from a
list of two always did the right thing, explaining why it remained
unnoticed in basic unit tests.
The bug was introduced by commit 8fa52f4e which attempted to fix a
previous issue with this feature where only the last node was removed.
This bug is 1.5-specific and does not require any backport.
The "set table" statement allows to create new entries with their respective
values. Till now it was limited to a single data type per line, requiring as
many "set table" statements as the desired data types to be set. Since this
is only a parser limitation, this patch gets rid of it. It also allows the
creation of a key with no data types (all reset to their default values).
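For example (hypothetical table name and socket path ; the data types shown
are the usual stick-table ones) :

  $ echo "set table front key 10.0.0.1 data.conn_cnt 5 data.gpc0 3" \
      | socat stdio /var/run/haproxy.stat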
Since commit 654694e1, it has been possible to feed some data into
stick tables from the CLI. That commit considered that frequency
counters would only have their previous value set, so that they
progressively fade out. But this does not match any real world
use case in fact. The only reason for feeding a freq counter is
to pass some data learned outside. We certainly don't want to see
such data start to vanish immediately, otherwise it will force the
external scripts to loop very frequently to limit the losses.
So let's set the current value instead in order to guarantee that
the data remains stable over the full period, then starts to fade
out between 1x and 2x the period.
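For reference, such a rotating frequency counter is typically read as
(a sketch of the principle) :

  rate = curr_ctr + prev_ctr * (period - elapsed) / period

so feeding curr_ctr yields a stable reading until the period rotates,
while feeding only prev_ctr starts decaying immediately.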
Benoit Dolez reported a failure to start haproxy 1.5-dev19. The
process would immediately report an internal error with missing
fetches from some crap instead of ACL names.
The cause is that some versions of gcc seem to trim static structs
containing a variable array when moving them to BSS, and only keep
the fixed size, which is just a list head for all ACL and sample
fetch keywords. This was confirmed at least with gcc 3.4.6. And we
can't move these structs to const because they contain a list element
which is needed to link all of them together during the parsing.
The bug indeed appeared with 1.5-dev19 because it's the first one
to have some empty ACL keyword lists.
One solution is to impose -fno-zero-initialized-in-bss to everyone
but this is not really nice. Another solution consists in ensuring
the struct is never empty so that it does not move there. The easy
solution consists in having a non-null list head since it's not yet
initialized.
A new "ILH" list head type was thus created for this purpose : create
an Initialized List Head so that gcc cannot move the struct to BSS.
This fixes the issue for this version of gcc and does not create any
burden for the declarations.
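A sketch of the idea (the pointer values only need to be non-null) :

  /* Initialized List Head : a non-empty initializer prevents gcc from
   * moving the struct to BSS.
   */
  #define ILH { .n = (struct list *)1, .p = (struct list *)2 }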
Bryan Talbot reported that a config with only "stats bind-process 1" in
the global section would crash during parsing. This bug was introduced
with this new statement in 1.5-dev13. The stats frontend must be allocated
if it was not already.
No backport is needed. The workaround consists in having previously declared
any other stats keyword (generally stats socket).