Both of them are rare and are detected from the same flags source, so
let's detect errors in the handshake loop and remove two tests in the
fast path. This seems to improve overall performance by less than 0.5%
on connection-bound workloads.
Add a DRAIN sub-state for a server which
will be shown on the stats page instead of UP if
its effective weight is zero.
Also, log if a server enters or leaves the DRAIN state
as the result of an agent check.
Signed-off-by: Simon Horman <horms@verge.net.au>
The syntax of these new commands is:
enable agent <backend>/<server>
disable agent <backend>/<server>
These commands allow temporarily stopping and subsequently
re-starting an auxiliary agent check. The effect of this is as follows:
New checks are only initialised when the agent is in the enabled state. Thus,
disable agent will prevent any new agent checks from being initiated until
the agent is re-enabled using enable agent.
When an agent is disabled, the processing of an auxiliary agent check that
was initiated while the agent was still enabled is as follows: all
results that would alter the weight, specifically "drain" or a weight
returned by the agent, are ignored. The processing of the agent check is
otherwise unchanged.
The motivation for this feature is to allow the weight-changing effects
of the agent checks to be paused so that the weight of a server can be
configured using set weight without being overridden by the agent.
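A hypothetical CLI session (backend and server names are illustrative),
using the same access method as shown elsewhere in this log:
socat unix-connect:/var/run/haproxy.stat stdio <<< 'disable agent bk/web1'
socat unix-connect:/var/run/haproxy.stat stdio <<< 'set weight bk/web1 50'
socat unix-connect:/var/run/haproxy.stat stdio <<< 'enable agent bk/web1'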
Signed-off-by: Simon Horman <horms@verge.net.au>
This is achieved by moving rise and fall from struct server to struct check.
After this move the behaviour of the primary check, server->check, is
unchanged. However, the secondary agent check, server->agent, now has
independent rise and fall values, each of which is set to 1.
The result is that receiving "fail", "stopped" or "down" just once from the
agent will mark the server as down. And receiving a weight just once will
allow the server to be marked up if its primary check is in good health.
This opens up the scope for making the rise and fall values of the agent
check configurable; however, this has not been implemented at this
stage.
Signed-off-by: Simon Horman <horms@verge.net.au>
In the case where agent-port is used and the agent
check is a secondary check, do not mark a server as down
if the agent becomes unavailable.
In this configuration the agent should only cause a server to be marked
as down if the agent returns "fail", "stopped" or "down".
Signed-off-by: Simon Horman <horms@verge.net.au>
Allow an auxiliary agent check to be run independently of the
regular health check. This is enabled by the agent-check
server setting.
The agent-port, which specifies the TCP port to use for the agent's
connections, is required.
The agent-inter, which specifies the interval between agent checks and
timeout of agent checks, is optional. If not set, the value for regular
checks is used.
e.g.
server web1_1 127.0.0.1:80 check agent-port 10000
If either the health or agent check determines that a server is down
then it is marked as being down, otherwise it is marked as being up.
An agent health check is performed by opening a TCP socket and reading an
ASCII string. The string should have one of the following forms:
* An ASCII representation of a positive integer percentage.
e.g. "75%"
Values in this format will set the weight proportional to the initial
weight of a server as configured when haproxy starts.
* The string "drain".
This will cause the weight of a server to be set to 0, and thus it
will not accept any new connections other than those that are
accepted via persistence.
* The string "down", optionally followed by a description string.
Mark the server as down and log the description string as the reason.
* The string "stopped", optionally followed by a description string.
This currently has the same behaviour as "down".
* The string "fail", optionally followed by a description string.
This currently has the same behaviour as "down".
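As an illustration, a trivial agent matching the forms above could look like
the sketch below (the port and the fixed reply are illustrative, and error
handling is reduced to a minimum):
#include <arpa/inet.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	struct sockaddr_in addr;
	int lfd, cfd, one = 1;

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(10000);             /* must match agent-port */
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 || listen(lfd, 10) < 0)
		return 1;
	while ((cfd = accept(lfd, NULL, NULL)) >= 0) {
		write(cfd, "75%\n", 4);           /* or "drain\n", "down\n", ... */
		close(cfd);
	}
	return 0;
}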
Signed-off-by: Simon Horman <horms@verge.net.au>
Remove option lb-agent-chk and thus the facility to configure
a stand-alone agent health check. This feature was added by
"MEDIUM: checks: Add agent health check". It will be replaced
by subsequent patches with features that allow an agent check
to be run as either a secondary check, along with any of the existing
checks, or as part of an http check with the status returned
in an HTTP header.
This patch does not entirely revert "MEDIUM: checks: Add agent health
check". The infrastructure it provides to parse the results of an
agent health check remains and will be re-used by the planned features
that are mentioned above.
Signed-off-by: Simon Horman <horms@verge.net.au>
In the case where an agent check returns fail, stopped or down,
log this as info when logging the server status along with any
trailing message returned by the agent after fail, stopped or down.
Previously only the trailing message was logged as info, and
if it was omitted, no info was logged.
Signed-off-by: Simon Horman <horms@verge.net.au>
Send SIGINT to child processes when killed. This ensures that
the haproxy process managed by the systemd-wrapper is stopped
when "systemctl stop haproxy.service" is called.
The column used to report the throttle percentage when a server is in
slowstart is based on the time only. This is wrong, because server weights
in slowstart are updated at most once a second, so the reported value is
wrong for at least one second during each step, which means all the time
when using short delays (< 20s).
The second point is that it's disturbing to see a weight < 100% without
any throttle at the end of the period (during the last second), because
the effective weight has not yet been updated.
Instead, we now compute the exact ratio between eweight and uweight and
report it. It's always accurate and describes the value being used instead
of using only the date.
It can be backported to 1.4 though it's not particularly important.
A crash was reported by Igor at owind when changing a server's weight
on the CLI. Lukas Tribus could reproduce a related bug where setting
a server's weight would result in the new weight being multiplied by
the initial one. The two bugs are the same.
The incorrect weight calculation results in the total farm weight being
larger than what was initially allocated, causing the map index to be out
of bounds on some hashes. It's easy to reproduce using "balance url_param"
with a variable param, or with "balance static-rr".
It appears that the calculation is made at many places and is not always
right and not always wrong the same way. Thus, this patch introduces a
new function "server_recalc_eweight()" which is dedicated to this task
of computing ->eweight from many other elements including uweight and
current time (for slowstart), and all users now switch to use this
function.
The patch is a bit large but the code was not trivially fixable in a way
that could guarantee this situation would not occur anymore. The fix is
much more readable and has been verified to work with all algorithms,
with both consistent and map-based hashes, and even with static-rr.
Slowstart was tested as well, just like enable/disable server.
The same bug is very likely present in 1.4 as well, so the patch will
probably need to be backported even though it will not apply as-is.
Thanks to Lukas and Igor for the information they provided to reproduce it.
Commit 2e99390 (BUG/MEDIUM: checks: fix slowstart behaviour when server
tracking is in use) moved the slowstart task initialization within the
health check code and leaves it unset when checks are disabled. The
problem is that it's possible to trigger slowstart from the CLI by
issuing "disable server XXX / enable server XXX" even when checks are
disabled. The result is a crash when trying to wake up the slowstart
task of that server.
Move the task initialization earlier so that it is done even if the
checks are disabled.
This patch should be backported to 1.4 since the commit above was
backported there.
Commit c08057c does the alignment job for buffer_dump(), but it did not fix
the issue that fewer than 8 characters may be left on the last line, as below:
Dumping contents from byte 0 to byte 119
0 1 2 3 4 5 6 7 8 9 a b c d e f
0000: 47 45 54 20 2f 69 6e 64 - 65 78 2e 68 74 6d 20 48 GET /index.htm H
0010: 54 54 50 2f 31 2e 30 0d - 0a 55 73 65 72 2d 41 67 TTP/1.0..User-Ag
...
0060: 6e 65 63 74 69 6f 6e 3a - 20 4b 65 65 70 2d 41 6c nection: Keep-Al
0070: 69 76 65 0d 0a 0d 0a ive....
The last line of the hex column is still overlapped by the text column. Since
the additional "- " separator is only printed for output lines holding no
fewer than 8 bytes, two additional spaces are needed for alignment when fewer
than 8 bytes remain. The result after the fix is as below:
Dumping contents from byte 0 to byte 119
0 1 2 3 4 5 6 7 8 9 a b c d e f
0000: 47 45 54 20 2f 69 6e 64 - 65 78 2e 68 74 6d 20 48 GET /index.htm H
0010: 54 54 50 2f 31 2e 30 0d - 0a 55 73 65 72 2d 41 67 TTP/1.0..User-Ag
...
0060: 6e 65 63 74 69 6f 6e 3a - 20 4b 65 65 70 2d 41 6c nection: Keep-Al
0070: 69 76 65 0d 0a 0d 0a ive....
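The padding logic can be sketched as follows (a simplified illustration of
the rule described above, not the actual buffer_dump() code):
#include <stdio.h>

/* Pad the hex column of the last dump line so the text column stays
 * aligned: each byte takes 3 characters ("xx "), and the 2-character
 * "- " separator only appears on lines holding at least 8 bytes.
 */
static void pad_hex_column(FILE *out, int bytes_on_line)
{
	int pad = (16 - bytes_on_line) * 3;   /* missing "xx " triplets */

	if (bytes_on_line < 8)
		pad += 2;                     /* no "- " separator was printed */
	fprintf(out, "%*s", pad, "");
}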
Signed-off-by: Godbach <nylzhaowei@gmail.com>
This is in preparation for associating an agent check
with a server, to run in addition to the server's existing check.
Signed-off-by: Simon Horman <horms@verge.net.au>
Add state to struct check. This is currently used to store one bit,
CHK_RUNNING, which is set if a check is running and clear otherwise.
This bit was previously SRV_CHK_RUNNING of the state element of struct
server.
This is in preparation for associating an agent check
with a server, to run in addition to the server's existing check.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Parametrise the following functions over the check of a server
* set_server_down
* set_server_up
* srv_getinter
* server_status_printf
* set_server_check_status
* set_server_disabled
* set_server_enabled
Generally the server parameter of these functions has been removed.
Where it is still needed it is obtained using check->server.
This is in preparation for associating an agent check
with a server, to run in addition to the server's existing check.
By parametrising these functions they may act on each of the checks
without further significant modification.
Explanation of the SSP_O_HCHK portion of this change:
* Prior to this patch SSP_O_HCHK serves a single purpose, which
is to tell server_status_printf() whether it should print
the details of the check of a server or not.
With the parametrisation that this patch adds there are two cases:
1) Printing the details of the check, in which case a
valid check parameter is needed.
2) Not printing the details of the check, in which case
the contents of the check parameter are unused.
In case 1) we could pass SSP_O_HCHK and a valid check;
in case 2) we could pass !SSP_O_HCHK and any value for check,
including NULL.
As NULL is used for case 2), SSP_O_HCHK becomes superfluous
and has been removed.
Signed-off-by: Simon Horman <horms@verge.net.au>
Move result element from struct server to struct check
This allows check results to be independent of the check's server.
This is in preparation for associating an agent check
with a server, to run in addition to the server's existing check.
Signed-off-by: Simon Horman <horms@verge.net.au>
This is in preparation for associating an agent check
with a server, to run in addition to the server's existing check.
The split has been made by:
* Moving elements of struct server's check element that will
be shared by both checks into a new check_common element
of struct server.
* Moving the remaining elements to a new struct check and
making struct server's check element a struct check.
* Adding a server element to struct check, a back-pointer
to the server element it is a member of.
- At this time the server could be obtained using
container_of, however, this will not be so easy
once a second struct check element is added to struct server
to accommodate an agent health check.
Signed-off-by: Simon Horman <horms@verge.net.au>
commit 39c63c5 "url32+src - like base32+src but whole url including parameters"
was missing the last argument "const char *kw", resulting in the build warning
below :
src/proto_http.c:10351:2: warning: initialization from incompatible pointer type [enabled by default]
src/proto_http.c:10351:2: warning: (near initialization for 'sample_fetch_keywords.kw[50].process') [enabled by default]
src/proto_http.c:10352:2: warning: initialization from incompatible pointer type [enabled by default]
src/proto_http.c:10352:2: warning: (near initialization for 'sample_fetch_keywords.kw[51].process') [enabled by default]
It's harmless since it's not needed there anyway.
Baptiste Assmann reported a bug affecting the "http-request redirect"
parser. It may randomly crash when reporting an error message if the
syntax is not OK. It happens that this is caused by the output error
message pointer which was not initialized to NULL.
This bug is 1.5-specific (introduced in dev17), no backport is needed.
I have a need to limit traffic to each url from each source address, much
like base32+src but for the whole url including parameters (this came from
looking at the recent 'Haproxy rate limit per matching request' thread).
Attached is a patch that seems to do the job; it's a copy-and-paste job of
the base32 functions.
The url32 function seems to work too: using 2 machines to request the
same url locks me out of both if I abuse from either with the url32 key
function, and only the one if I use url32_src.
Neil
The reqdeny/reqtarpit and http-request deny/tarpit were using
a copy-paste of the error handling code because originally the
req* actions used to maintain their own stats. This is not the
case anymore so we can use the same error blocks for both.
The http-request ruleset still has precedence over req* so no
functionality was changed.
The reqdeny/reqideny and reqtarpit/reqitarpit rules used to maintain
the stats counters themselves while http-request deny/tarpit and
rspdeny/rspideny used to centralize them at the point where the
error is processed.
Thus, let's do the same for reqdeny/reqtarpit so that the functions
which iterate over the rules do not have to deal with these counters
anymore.
When a connection is tarpitted, a denied req is counted once when the
action is applied, and then a failed req is counted when the tarpit
timeout expires. This is completely wrong as the tarpit is exactly
equivalent to a deny since it's a disguised deny.
So let's not increment the failed req anymore.
This fix may be backported to 1.4 which has the same issue.
Commit 986a9d2d12 moved the source address from the stream interface
to the session, but it did not set the flag on the connection to
report that the source address is known. Thus when logs are enabled,
we had a call to getpeername() which is redundant with the result
from accept(). This patch simply sets the flag.
This function was designed for haproxy while testing other functions
in the past. Initially it was not planned to be used given the not
very interesting numbers it showed on real URL data : it is not as
smooth as the other ones. But later tests showed that the other ones
are extremely sensitive to the server count and the type of input data,
especially DJB2 which must not be used on numeric input. So in fact
this function is still a generally average performer and it can make
sense to merge it in the end, as it can provide an alternative to
sdbm+avalanche or djb2+avalanche for consistent hashing or when hashing
on numeric data such as a source IP address or a visitor identifier in
a URL parameter.
Summary:
Avalanche is supported not as a native hashing choice, but as a modifier
on the hashing function. Note that this means that possible configs
written after 1.5-dev4 using "hash-type avalanche" will get an informative
error instead. But as discussed on the mailing list it seems nobody ever
used it anyway, so let's fix it before the final 1.5 release.
The default values were selected for backward compatibility with previous
releases, as discussed on the mailing list, which means that the consistent
hashing will still apply the avalanche hash by default when no explicit
algorithm is specified.
Examples
(default) hash-type map-based
Map based hashing using sdbm without avalanche
(default) hash-type consistent
Consistent hashing using sdbm with avalanche
Additional Examples:
(a) hash-type map-based sdbm
Same as default for map-based above
(b) hash-type map-based sdbm avalanche
Map based hashing using sdbm with avalanche
(c) hash-type map-based djb2
Map based hashing using djb2 without avalanche
(d) hash-type map-based djb2 avalanche
Map based hashing using djb2 with avalanche
(e) hash-type consistent sdbm avalanche
Same as default for consistent above
(f) hash-type consistent sdbm
Consistent hashing using sdbm without avalanche
(g) hash-type consistent djb2
Consistent hashing using djb2 without avalanche
(h) hash-type consistent djb2 avalanche
Consistent hashing using djb2 with avalanche
Summary:
In testing at tumblr, we found that using djb2 hashing instead of the
default sdbm hashing resulted in better workload distribution to our backends.
This commit implements a change, that allows the user to specify the hash
function they want to use. It does not limit itself to consistent hashing
scenarios.
The supported hash functions are sdbm (default), and djb2.
For a discussion of the feature and analysis, see mailing list thread
"Consistent hashing alternative to sdbm" :
http://marc.info/?l=haproxy&m=138213693909219
Note: This change does NOT alter existing behaviour; for instance, an
avalanche hash is still always applied before the consistent hash.
A call to free_pattern_tree() upon exit() is made to free all ACL
patterns allocated in a tree (strings or IP addresses). Unfortunately
it happens that this function has been bogus from the beginning, it
walks over the whole tree, frees the nodes but forgets to remove them
from the tree prior to freeing them. So after visiting a leaf, the
next eb_next() call will require to revisit some of the upper nodes
that were just freed. This can remain unnoticed for a long time because
free() often just marks the area as free. But in cases of aggressive
memory freeing, the location will not be mapped anymore and the process
segfaults.
Note that the bug has no impact other than polluting kernel logs and
frightening sysadmins, since it happens just before exit().
Simply adding the debug code below makes it easier to reproduce the
same bug :
    while (node) {
            next = eb_next(node);
    +       node->node_p = (void *)-1;
            free(node);
            node = next;
    }
Many thanks to the StackExchange team for their very detailed bug report
that permitted to quickly understand this non-obvious bug!
This fix should be backported to 1.4 which introduced the bug.
If the dumped buffer length is not a multiple of 16, the last output line can
be seen as below:
Dumping contents from byte 0 to byte 125
0 1 2 3 4 5 6 7 8 9 a b c d e f
0000: 47 45 54 20 2f 69 6e 64 - 65 78 2e 68 74 6d 20 48 GET /index.htm H
0010: 54 54 50 2f 31 2e 30 0d - 0a 55 73 65 72 2d 41 67 TTP/1.0..User-Ag
...
0060: 30 0d 0a 43 6f 6e 6e 65 - 63 74 69 6f 6e 3a 20 4b 0..Connection: K
0070: 65 65 70 2d 41 6c 69 76 - 65 0d 0a 0d 0a eep-Alive....
The hex column is overlapped by the text column. Both the hex and
text columns should be aligned in their own areas, as below:
Dumping contents from byte 0 to byte 125
0 1 2 3 4 5 6 7 8 9 a b c d e f
0000: 47 45 54 20 2f 69 6e 64 - 65 78 2e 68 74 6d 20 48 GET /index.htm H
0010: 54 54 50 2f 31 2e 30 0d - 0a 55 73 65 72 2d 41 67 TTP/1.0..User-Ag
...
0060: 30 0d 0a 43 6f 6e 6e 65 - 63 74 69 6f 6e 3a 20 4b 0..Connection: K
0070: 65 65 70 2d 41 6c 69 76 - 65 0d 0a 0d 0a eep-Alive....
Signed-off-by: Godbach <nylzhaowei@gmail.com>
It's quite common to write directives like the following :
tcp-request reject if WAIT_END { sc0_inc_gpc0 }
This one will never reject, because sc0_inc_gpc0 is provided no value
to compare against. The proper form should have been something like this :
tcp-request reject if WAIT_END { sc0_inc_gpc0 gt 0 }
or :
tcp-request reject if WAIT_END { sc0_inc_gpc0 -m found }
Now we detect the absence of any argument on the command line and emit
a warning suggesting alternatives or the use of "--" to really avoid
matching anything (might be used when debugging).
When a condition does something like :
action if A B C || D E F
If B returns a miss (can't tell true or false), C must not
be evaluated. This is important when C has a side effect
(eg: sc*_inc_gpc0). However the second part after the ||
can still be evaluated.
The track-sc* tcp rules are bogus. The test to verify if the
tracked counter was already assigned is performed in the same
condition as the test for the action. The effect is that a
rule which tracks a counter that is already being tracked
is implicitly converted to an accept because the default
rule is an accept.
This bug only affects 1.5-dev releases.
In session_accept(), if we face a memory allocation error, we try to
emit an HTTP 500 error message in HTTP mode. The problem is that we
must not use http_error_message() for this since it dereferences the
session which can be NULL in this case.
We don't need the session to build the error message anyway since
this function only uses it to retrieve the backend and frontend to
get the most suited error message. Let's pick it ourselves, we're
at the beginning of the session, only the frontend is relevant.
This bug is 1.5-specific.
Currently url_decode returns 1 or 0 depending on whether it could decode
the string or not. For some future use cases, it will be needed to get the
decoded string length after a successful decoding, so let's make it return
that value, and fall back to a negative one in case of error.
If haproxy is compiled with the USE_PCRE_JIT option, the length of the
string is used. If it is compiled without this option the function doesn't
use the length and expects a null terminated string.
The prototype of the function is ambiguous, and depends on the
compilation option. A developer may assume that the length is always
used, which can introduce many bugs.
This patch makes sure that the length is used. The regex_exec function
adds the final '\0' if it is needed.
William Lallemand reported a bug which happens when an ACL keyword using an
implicit argument (eg: a proxy name) is used : the keyword is not properly
set in the arglist field, resulting in an error about the previous keyword
being returned, or "(null)" if the faulty ACL appears first.
The bug only affects error reporting and is 1.5-specific, so no backport is
needed.
Bertrand Jacquin reported a bug when using tcp_request content rules
on large POST HTTP requests. The issue is that smp_prefetch_http()
first tries to validate an input buffer, but only if the buffer is
not full. This test is wrong since it must only be performed after
the parsing has failed, otherwise we don't accept POST requests which
fill the buffer as valid HTTP requests.
This bug is 1.5-specific, no backport needed.
The current file "regex.h" define an abstraction for the regex. It
provides the same struct name and the same "regexec" function for the
3 regex types supported: standard libc, basic pcre and jit pcre.
The regex compilation function is not provided by this file. If the
developper wants to use regex, he must write regex compilation code
containing "#define *JIT*".
This patch provides a unique regex compilation function according to
the compilation options.
In addition, the "regex.h" file checks the presence of the "#define
PCRE_CONFIG_JIT" when "USE_PCRE_JIT" is enabled. If this flag is not
present, the pcre lib doesn't support JIT and "#error" is emitted.
Though si_conn_send_loop() does not loop over ->snd_buf() after commit ed7f836,
there is still some code left which uses `while` but only executes once. This
commit does the cleanup job and renames si_conn_send_loop() to si_conn_send().
Signed-off-by: Godbach <nylzhaowei@gmail.com>
RFC6125 does not specify whether a wildcard matches empty strings, but
classical browser implementations do.
After the fix, foo*bar.exemple.com matches foobar.exemple.com.
Both static-rr and hash with type map-based call init_server_map() to allocate
server map, so the server map should be freed while doing cleanup if one of
the above load balance algorithms is used.
Signed-off-by: Godbach <nylzhaowei@gmail.com>
[wt: removed the unneeded "if" before the free]
This minor bug was found using the coccinelle script "da.cocci". The
len was initialized twice instead of setting the size. It's harmless
since no operations are performed on this empty string but needs to
be fixed anyway.
At the moment, HTTP response time is computed after response headers are
processed. This can misleadingly assign to the server some heavy local
processing (eg: regex), and also prevents response headers from passing
information related to the response time (which can sometimes be useful
for stats).
Let's retrieve the response time before processing the headers instead.
Note that in order to remain compatible with what was previously done,
we disable the response time when we get a 502 or any bad response. This
should probably be changed in 1.6 since it does not make sense anymore
to lose this information.
health_adjust() should requeue the task after changing its expire timer.
I noticed it on devel servers without load. We have a long inter (10 seconds)
and a short fastinter (100ms). But according to the webserver logs, after a
failed request the next check request was issued with the same 10s interval.
This patch should probably be backported to 1.4 which has the same feature.
This fetch method returns the response buffer len, similarly
to req.len for the request. Previously it was only possible
to rely on "res.payload(0,size) -m found" to find if at least
that amount of data was available, which was a bit tricky.
This new action immediately closes the connection with the server
when the condition is met. The first such rule executed ends the
rules evaluation. The main purpose of this action is to force a
connection to be finished between a client and a server after an
exchange when the application protocol expects some long timeouts to
elapse first. The goal is to eliminate idle connections which
take significant resources on servers with certain protocols.
When a process with large stick tables is replaced by a new one and remains
present until the last connection finishes, it keeps these data in memory
for nothing since they will never be used anymore by incoming connections,
except during syncing with the new process. This is especially problematic
when dealing with long session protocols such as WebSocket as it becomes
possible to stack many processes and eat a lot of memory.
So the idea here is to know if a table still needs to be synced or not,
and to purge all unused entries once the sync is complete. This means that
after a few hundred milliseconds when everything has been synchronized with
the new process, only a few entries will remain allocated (only the ones
held by sessions during the restart) and all the remaining memory will be
freed.
Note that we carefully do that only after the grace period is expired so as
not to impact a possible proxy that needs to accept a few more connections
before leaving.
Doing this required to add a sync counter to the stick tables, to know how
many peer sync sessions are still in progress in order not to flush the entries
until all synchronizations are completed.
David Berard reported that send-proxy was broken on FreeBSD and tracked the
issue down to an error returned by send(). We already had the same issue in
the past in another area which was addressed by the following commit :
0ea0cf6 BUG: raw_sock: also consider ENOTCONN in addition to EAGAIN
In fact, on Linux send() returns EAGAIN when the connection is not yet
established while other OSes return ENOTCONN. Let's consider ENOTCONN for
send-proxy there as the same as EAGAIN.
David confirmed that this change properly fixed the issue.
Another place was affected as well (health checks with send-proxy), and
was fixed.
This fix does not need any backport since it only affects 1.5.
verifyhost allows you to specify a hostname that the remote server's
SSL certificate must match. Connections that don't match will be
closed with an SSL error.
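For example (addresses, file and certificate names are illustrative):
server static1 192.168.0.10:443 ssl verify required ca-file ca.pem verifyhost static.example.com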
With a facility of 1 or 2 digits, the send size was wrong and bytes with
unknown value were sent.
The size was calculated using the start of the buffer and not the start
of the data which varies with the number of digits of the facility.
This bug was reported by Samuel Stoller and Lukas Tribus.
When a request failed, the unique_id was allocated but not generated.
The string was not initialized and junk was printed in the log with %ID.
This patch changes the behavior of the unique_id. The unique_id is now
generated when a request fails.
This bug was reported by Patrick Hemmer.
The HTTP request counter is incremented non-atomically, which means that
many requests can log the same ID. Let's increment it when it is consumed
so that we avoid this case.
This bug was reported by Patrick Hemmer. It's 1.5-specific and does not
need to be backported.
Mathew Levett reported an issue which is a bit nasty and hard to track
down. RDP cookies contain both the IP and the port, and haproxy matches
them exactly. So if a server has no port specified (or a remapped port),
it will never match a port specified in a cookie. Better warn the user
when this is detected.
Apollon Oikonomopoulos reported a build failure on Hurd where PATH_MAX
is not defined. The only place where it is referenced is ssl_sock.c,
all other places use MAXPATHLEN instead, with a fallback to 128 when
the OS does not define it. So let's switch to MAXPATHLEN as well.
Mark Brooks reported the following issue :
"My table looks like this -
0x24a8294: key=192.168.136.10 use=0 exp=1761492 server_id=3
0x24a8344: key=192.168.136.11 use=0 exp=1761506 server_id=2
0x24a83f4: key=192.168.136.12 use=0 exp=1761520 server_id=3
0x24a84a4: key=192.168.136.13 use=0 exp=1761534 server_id=2
0x24a8554: key=192.168.136.14 use=0 exp=1761548 server_id=3
0x24a8604: key=192.168.136.15 use=0 exp=1761563 server_id=2
0x24a86b4: key=192.168.136.16 use=0 exp=1761580 server_id=3
0x24a8764: key=192.168.136.17 use=0 exp=1761592 server_id=2
0x24a8814: key=192.168.136.18 use=0 exp=1761607 server_id=3
0x24a88c4: key=192.168.136.19 use=0 exp=1761622 server_id=2
0x24a8974: key=192.168.136.20 use=0 exp=1761636 server_id=3
0x24a8a24: key=192.168.136.21 use=0 exp=1761649 server_id=2
I'm running the command -
socat unix-connect:/var/run/haproxy.stat stdio <<< 'clear table VIP_Name-2 data.server_id eq 2'
I'd assume that the entries with server_id = 2 would be removed but it's
removing everything each time."
The cause of the issue is a missing test for skip_entry when deciding
whether to clear the key or not. The test was present when only the
last node is to be removed, so removing only the first node from a
list of two always did the right thing, explaining why it remained
unnoticed in basic unit tests.
The bug was introduced by commit 8fa52f4e which attempted to fix a
previous issue with this feature where only the last node was removed.
This bug is 1.5-specific and does not require any backport.
Load balance algorithms such as roundrobin, leastconn and first check the
server after it has been selected, with the following condition:
if (!s->maxconn || (!s->nbpend && s->served < srv_dynamic_maxconn(s)))
But static-rr uses a different one in map_get_server_rr(), as below:
if (!srv->maxconn || srv->cur_sess < srv_dynamic_maxconn(srv))
Given this difference, it is better for static-rr to use the
same check condition as the other algorithms.
This change will only affect static-rr. Though all hash algorithms of type
map-based use the same server map as static-rr, they call another function,
map_get_server_hash(), to get the server.
Signed-off-by: Godbach <nylzhaowei@gmail.com>
When using req.payload and res.payload to look for specific content at an
arbitrary location, we're often facing the problem of not knowing the input
buffer length. If the length argument is larger than the buffer length, the
function did not match, and if they're smaller, there is a risk of not getting
the expected content. This is especially true when looking for data in SOAP
requests.
So let's make some provisions for scanning the whole buffer by specifying a
length of 0 bytes. This greatly simplifies the processing of random-sized
input data.
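For example (the marker and the action are illustrative), the whole buffered
request can be scanned for a fixed string regardless of its position:
tcp-request content accept if { req.payload(0,0) -m sub SOAPAction }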
The "set table" statement allows to create new entries with their respective
values. Till now it was limited to a single data type per line, requiring as
many "set table" statements as the desired data types to be set. Since this
is only a parser limitation, this patch gets rid of it. It also allows the
creation of a key with no data types (all reset to their default values).
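For example (table name and key are illustrative), several data types may
now be set in one statement:
set table front key 10.0.0.1 data.gpc0 5 data.conn_cnt 3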
Since commit 654694e1, it has been possible to feed some data into
stick tables from the CLI. That commit considered that frequency
counters would only have their previous value set, so that they
progressively fade out. But this does not match any real world
use case in fact. The only reason for feeding a freq counter is
to pass some data learned outside. We certainly don't want to see
such data start to vanish immediately, otherwise it will force the
external scripts to loop very frequently to limit the losses.
So let's set the current value instead in order to guarantee that
the data remains stable over the full period, then starts to fade
out between 1* and 2* the period.
sc_* sample fetches now take an optional parameter which allows to look
the key in an alternate table. This is convenient to pass multiple
information for the same key at once (eg: have multiple gpc0 for the
same key, or support being fed complementary information from the CLI).
Example :
listen front
bind :8000
tcp-request content track-sc0 src table local-ip
http-response set-header src-id %[sc0_get_gpc0]+%[sc0_get_gpc0(global-ip)]
server dummy 127.0.0.1:8001
backend local-ip
stick-table size 1k type ip store gpc0
backend global-ip
stick-table size 1k type ip store gpc0
One very annoying issue when trying to extend the sticky counters beyond
the current 3 counters is that it requires a massive copy-paste of fetch
functions (the code itself is no longer copy-pasted since the functions
were merged), just so that the fetch names exist.
So let's have an alternate form like "sc_*(num)" to allow passing the
counter number as an argument without having to redefine new fetch names.
The MAX_SESS_STKCTR macro defines the number of usable sticky counters,
which defaults to 3.
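For example, with the default of 3 counters, the two forms below are
equivalent (the ACL name is illustrative):
acl too_fast sc0_http_req_rate gt 100
acl too_fast sc_http_req_rate(0) gt 100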
In preparation of more flexibility in the stick counters, make their
number configurable. It still defaults to 3 which is the minimum
accepted value. Changing the value alone is not sufficient to get
more counters, some bitfields still need to be updated and the TCP
actions need to be updated as well, but this update tries to be
easier, which is nice for experimentation purposes.
smp_fetch_sc0_trackers, smp_fetch_sc1_trackers and smp_fetch_sc2_trackers
were merged into a single function which relies on the fetch name to decide
what to return.
This is also a bug fix for this feature, which has never worked since its bogus
introduction by commit "2406db4 MEDIUM: counters: add sc1_trackers/sc2_trackers"
(1.5-dev10).
Instead of returning the value in the sample, it was returned as the fetch
result!
There is no need to backport this fix anyway since it's 1.5-specific and
nobody uses the feature.
smp_fetch_sc0_bytes_out_rate, smp_fetch_sc1_bytes_out_rate, smp_fetch_sc2_bytes_out_rate,
smp_fetch_src_bytes_out_rate and smp_fetch_bytes_out_rate were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_kbytes_out, smp_fetch_sc1_kbytes_out, smp_fetch_sc2_kbytes_out,
smp_fetch_src_kbytes_out and smp_fetch_kbytes_out were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_bytes_in_rate, smp_fetch_sc1_bytes_in_rate, smp_fetch_sc2_bytes_in_rate,
smp_fetch_src_bytes_in_rate and smp_fetch_bytes_in_rate were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_kbytes_in, smp_fetch_sc1_kbytes_in, smp_fetch_sc2_kbytes_in,
smp_fetch_src_kbytes_in and smp_fetch_kbytes_in were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_http_err_rate, smp_fetch_sc1_http_err_rate, smp_fetch_sc2_http_err_rate,
smp_fetch_src_http_err_rate and smp_fetch_http_err_rate were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_http_err_cnt, smp_fetch_sc1_http_err_cnt, smp_fetch_sc2_http_err_cnt,
smp_fetch_src_http_err_cnt and smp_fetch_http_err_cnt were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_http_req_rate, smp_fetch_sc1_http_req_rate, smp_fetch_sc2_http_req_rate,
smp_fetch_src_http_req_rate and smp_fetch_http_req_rate were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_http_req_cnt, smp_fetch_sc1_http_req_cnt, smp_fetch_sc2_http_req_cnt,
smp_fetch_src_http_req_cnt and smp_fetch_http_req_cnt were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_sess_rate, smp_fetch_sc1_sess_rate, smp_fetch_sc2_sess_rate,
smp_fetch_src_sess_rate and smp_fetch_sess_rate were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_sess_cnt, smp_fetch_sc1_sess_cnt, smp_fetch_sc2_sess_cnt,
smp_fetch_src_sess_cnt and smp_fetch_sess_cnt were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_conn_cur, smp_fetch_sc1_conn_cur, smp_fetch_sc2_conn_cur,
smp_fetch_src_conn_cur and smp_fetch_conn_cur were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_conn_rate, smp_fetch_sc1_conn_rate, smp_fetch_sc2_conn_rate,
smp_fetch_src_conn_rate and smp_fetch_conn_rate were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_conn_cnt, smp_fetch_sc1_conn_cnt, smp_fetch_sc2_conn_cnt,
smp_fetch_src_conn_cnt and smp_fetch_conn_cnt were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_clr_gpc0, smp_fetch_sc1_clr_gpc0, smp_fetch_sc2_clr_gpc0,
smp_fetch_src_clr_gpc0 and smp_fetch_clr_gpc0 were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_inc_gpc0, smp_fetch_sc1_inc_gpc0, smp_fetch_sc2_inc_gpc0,
smp_fetch_src_inc_gpc0 and smp_fetch_inc_gpc0 were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_gpc0, smp_fetch_sc1_gpc0, smp_fetch_sc2_gpc0,
smp_fetch_src_gpc0 and smp_fetch_gpc0 were merged into a single
function which relies on the fetch name to decide what to return.
smp_fetch_sc0_get_gpc0, smp_fetch_sc1_get_gpc0, smp_fetch_sc2_get_gpc0,
smp_fetch_src_get_gpc0 and smp_fetch_get_gpc0 were merged into a single
function which relies on the fetch name to decide what to return.
This function aims at simplifying the prefetching of the table and entry
when using any of the session counters fetches. The principle is that the
src_* variant produces a stkctr that is used instead of the one from the
session. That way we can call the same function from all session counter
fetch functions and always have a single function to support sc[0-9]_/src_.
This function is also called directly from backend.c, so let's stop
building fake args to call it as a sample fetch, and have a lower
layer more generic function instead.
We're having a lot of duplicate code just because of minor variants between
fetch functions that could be dealt with if the functions had the pointer to
the original keyword, so let's pass it as the last argument. An earlier
version used to pass a pointer to the sample_fetch element, but this is not
the best solution for two reasons :
- fetch functions will solely rely on the keyword string
- some other smp_fetch_* users do not have the pointer to the original
keyword and were forced to pass NULL.
So finally we're passing a pointer to the keyword as a const char *, which
perfectly fits the original purpose.
Converts an integer supposed to contain a date since epoch to
a string representing this date in a format suitable for use
in HTTP header fields. If an offset value is specified, then
it is a number of seconds that is added to the date before the
conversion is performed. This is particularly useful to emit
Date header fields, Expires values in responses when combined
with a positive offset, or Last-Modified values when the
offset is negative.
Returns the current date as the epoch (number of seconds since 01/01/1970).
If an offset value is specified, then it is a number of seconds that is added
to the current date before returning the value. This is particularly useful
to compute relative dates, as both positive and negative offsets are allowed.
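As an illustration, assuming this fetch is registered as "date" and the
converter described above as "http_date" (the header choice is illustrative):
http-response set-header Expires %[date(3600),http_date]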
There is no more reason for having "always_true", "always_false" and "env"
in acl.c while they're the most basic sample fetch keywords, so let's move
them to sample.c where it's easier to find them.
sample_process() used to return NULL on changing data, regardless of the
SMP_OPT_FINAL flag. Let's change this so that it is now possible to
include such data in logs or HTTP headers. Also, one inconvenient
thing was that it used to always set the sample flags to zero, making
it incompatible with ACLs which may need to call it multiple times. Only
do this for locally-allocated samples.
We now support having a comma-delimited converter list, which can start
right after the fetch keyword. The immediate benefit is that it allows
to use converters in log-format expressions, for example :
set-header source-net %[src,ipmask(24)]
The parser is also slightly improved and should be more resilient against
configuration errors. Also, optional arguments in converters were mistakenly
not allowed till now, so this was fixed.
Splicing is avoided for small transfers because it's generally cheaper
to perform a couple of recv+send calls than pipe+splice+splice. This
has the consequence that the last chunk of a large transfer may be
transferred using recv+send if it's less than 4 kB. But when the pipe
is already set up, it's better to use splice() to read the pending data,
since they will get merged with the pending ones. This is what now
happens every time the reader is slower than the writer.
Note that this change alone could have fixed most of the CPU hog bug,
except at the end when only the close was pending.
As explained in previous patch, we incorrectly call chk_snd() when
performing a read even if the write event is already subscribed to
poll(). This is counter-productive because we're almost sure to get
an EAGAIN.
A quick test shows that this fix halves the number of failed splice()
calls without adding any extra work on other syscalls.
This could have been tagged as an improvement, but since this behaviour
made the analysis of previous bug more complex, it still qualifies as
a fix.
Mark Janssen reported an issue in 1.5-dev19 which was introduced
in 1.5-dev12 by commit 96199b10. From time to time, randomly, the
CPU usage spikes to 100% for seconds to minutes.
A deep analysis of the traces provided shows that it happens when
waiting for the response to a second pipelined HTTP request, or
when trying to handle the received shutdown advertised by epoll()
after the last block of data. Each time, splice() was involved with
data pending in the pipe.
The cause of this was that such events could not be taken into account
by splice nor by recv and were left pending :
- the transfer of the last block of data, optionally with a shutdown
was not handled by splice() because of the validation that to_forward
is higher than MIN_SPLICE_FORWARD ;
- the next recv() call was inhibited because of the test on presence
of data in the pipe. This is also what prevented the recv() call
from handling a response to a pipelined request until the client
had ACKed the previous response.
No fewer than 4 different methods were experimented with to fix this, and the
current one was finally chosen. The principle is that if an event is not
caught by splice(), then it MUST be caught by recv(). So we remove the
condition on the pipe's emptiness to perform a recv(), and in order to
prevent recv() from being used in the middle of a transfer, we mark
supposedly full pipes with CO_FL_WAIT_ROOM, which makes sense because
the reason for stopping a splice()-based receive is that the pipe is
supposed to be full.
The net effect is that we don't wake up and sleep in loops during these
transient states. This happened much more often than expected, sometimes
for a few cycles at end of transfers, but rarely long enough to be
noticed, unless a client timed out with data pending in the pipe. The
effect on CPU usage is visible even when transferring 1MB objects in
pipeline, where the CPU usage drops from 10 to 6% on a small machine at
medium bandwidth.
Some further improvements are needed :
- the last chunk of a splice() transfer is never done using splice due
to the test on to_forward. This is wrong and should be performed with
splice if the pipe has not yet been emptied ;
- si_chk_snd() should not be called when the write event is already being
polled, otherwise we're almost certain to get EAGAIN.
Many thanks to Mark for all the traces he cared to provide, they were
essential for understanding this issue which was not reproducible
without.
Only 1.5-dev is affected, no backport is needed.
The max weight of a server is 256 now, but SRV_UWGHT_MAX is still 255. As a result,
FWRR will not work well when a server's weight is 256. The description is as below:
There are some macros related to server's weight in include/types/server.h:
#define SRV_UWGHT_RANGE 256
#define SRV_UWGHT_MAX (SRV_UWGHT_RANGE - 1)
#define SRV_EWGHT_MAX (SRV_UWGHT_MAX * BE_WEIGHT_SCALE)
Since the weight of a server can reach 256 and BE_WEIGHT_SCALE equals 16,
the max eweight of a server should be 256*16 = 4096, which exceeds
SRV_EWGHT_MAX = SRV_UWGHT_MAX*BE_WEIGHT_SCALE = 255*16 = 4080. When a server
with weight 256 is inserted into the FWRR tree during initialization, the key
value of this server should be SRV_EWGHT_MAX - s->eweight = 4080 - 4096 = -16,
which is close to UINT_MAX in unsigned arithmetic, so the server with the
highest weight will not be elected as the first server to process requests.
In addition, it is better to compare the weight with SRV_UWGHT_MAX than with
the magic number 256 when checking it. The max number of servers for the
round-robin algorithm is also updated.
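A sketch of the corrected definitions (assuming the fix simply raises the
cap to match the real maximum weight):
#define SRV_UWGHT_RANGE 256
#define SRV_UWGHT_MAX   SRV_UWGHT_RANGE
#define SRV_EWGHT_MAX   (SRV_UWGHT_MAX * BE_WEIGHT_SCALE)  /* 256 * 16 = 4096 */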
Signed-off-by: Godbach <nylzhaowei@gmail.com>
When ACLs and samples were converged in 1.5-dev18, function
"acl_prefetch_http" was not properly converted after commit 8ed669b1.
It used to return -1 when contents did not match HTTP traffic, which
was considered as a "true" boolean result by the ACL execution code,
possibly causing crashes due to missing data when checking for HTTP
traffic in TCP rules.
Another issue is that when the function returned zero, it did not
set the SMP_F_MAY_CHANGE flag, so it could randomly exit on partial
requests before waiting for a complete one.
Last issue is that when it returned 1, it did not set smp->data.uint,
so this last one would retain a random value from a past execution.
This could randomly cause some matches to fail as well.
Thanks to Remo Eichenberger for reporting this issue with a detailed
explanation and configuration.
This bug is 1.5-specific, no backport is needed.
The checkcache option checks for cacheable responses with a set-cookie
header. Since the response processing code was refactored in 1.3.8
(commit a15645d4), the check was broken because the no-cache value
is only checked as no-cache="set-cookie", and not alone.
Thanks to Hervé Commowick for reporting this stupid bug!
The fix should be backported to 1.4 and 1.3.
Lukas Benes reported that http-send-name-header causes a segfault if no
server is available because we're dereferencing the session's target which
is NULL. The tiniest reproducer looks like this :
listen foo
bind :1234
mode http
http-send-name-header srv
This obvious fix must be backported to 1.4 which is affected as well.
Commit e25c917a introduced a third tracking counter but forgot
to check it when storing values at the end of the session. The
impact is that if neither the first nor the second one are
changed, none of them are saved.
Both fdinfo and fdtab are allocated memory in init() while haproxy is starting,
but only fdtab is freed in deinit(); fdinfo should also be freed.
Signed-off-by: Godbach <nylzhaowei@gmail.com>
As per RFC3260 #4 and BCP37 #4.2 and #5.2, the IPv6 counterpart of TOS
is "traffic class".
Add support for IPv6 traffic class in "set-tos" by moving the "set-tos"
related code to the new inline function inet_set_tos(), handling IPv4
(IP_TOS), IPv6 (IPV6_TCLASS) and IPv4-mapped sockets (IP_TOS, like
::ffff:127.0.0.1).
Also define - if missing - the IN6_IS_ADDR_V4MAPPED() macro in
include/common/compat.h for compatibility.
s->req->prod->conn->addr.to.ss_family contains only useful data if
conn_get_to_addr() is called early. If that's not the case (nothing in the
configuration needs the destination address like logs, transparent, ...)
then "set-tos" doesn't work.
Fix this by checking s->req->prod->conn->addr.from.ss_family instead.
Also fix a minor doc issue about set-tos in http-response.
Benoit Dolez reported a failure to start haproxy 1.5-dev19. The
process would immediately report an internal error with missing
fetches from some crap instead of ACL names.
The cause is that some versions of gcc seem to trim static structs
containing a variable array when moving them to BSS, and only keep
the fixed size, which is just a list head for all ACL and sample
fetch keywords. This was confirmed at least with gcc 3.4.6. And we
can't move these structs to const because they contain a list element
which is needed to link all of them together during the parsing.
The bug indeed appeared with 1.5-dev19 because it's the first one
to have some empty ACL keyword lists.
One solution is to impose -fno-zero-initialized-in-bss to everyone
but this is not really nice. Another solution consists in ensuring
the struct is never empty so that it does not move there. The easy
solution consists in having a non-null list head since it's not yet
initialized.
A new "ILH" list head type was thus created for this purpose : create
an Initialized List Head so that gcc cannot move the struct to BSS.
This fixes the issue for this version of gcc and does not create any
burden for the declarations.
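A sketch of the resulting construct (struct and variable names are
illustrative, only the ILH macro name comes from this patch):
/* An Initialized List Head: any non-null initializer keeps the enclosing
 * struct out of BSS; the values are overwritten at init time anyway.
 */
struct list { struct list *n, *p; };
#define ILH { .n = (struct list *)1, .p = (struct list *)2 }

struct keyword_list {
	struct list list;     /* links all keyword lists together */
	const char *kw[];     /* variable-size keyword array */
};

static struct keyword_list acl_keywords = { .list = ILH };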
When abortonclose is used and an error is detected on the client side,
better force an RST to the server. That way we propagate to the server
the same vision we got from the client, and we ensure that we won't keep
TIME_WAITs.
Remove event_accept() in include/proto/proto_http.h and use the correct
function name in the other two files instead of event_accept().
Signed-off-by: Godbach <nylzhaowei@gmail.com>
It was a bit inconsistent to have gpc start at 0 and sc start at 1,
so make sc start at zero like gpc. No previous release was issued
with sc3 anyway, so no existing setup should be affected.
When a config makes use of hdr_ip(x-forwarded-for,-1) or any such thing
involving a negative occurrence count, the header is still parsed in the
order it appears, and an array of up to MAX_HDR_HISTORY entries is created.
When more entries are used, the entries simply wrap and continue this way.
A problem happens when the incoming header field count exactly divides
MAX_HDR_HISTORY, because the computation removes the number of requested
occurrences from the count, but does not care about the risk of wrapping
with a negative number. Thus we can dereference the array with a negative
number and randomly crash the process.
The bug is located in http_get_hdr() in haproxy 1.5, and get_ip_from_hdr2()
in haproxy 1.4. It affects configurations making use of one of the following
functions with a negative <value> occurrence number :
- hdr_ip(<name>, <value>) (in 1.4)
- hdr_*(<name>, <value>) (in 1.5)
It also affects "source" statements involving "hdr_ip(<name>)" since that
statement implicitly uses -1 for <value> :
- source 0.0.0.0 usesrc hdr_ip(<name>)
A workaround consists in rejecting dangerous requests early using
hdr_cnt(<name>), which is available both in 1.4 and 1.5 :
block if { hdr_cnt(<name>) ge 10 }
This bug has been present since the introduction of the negative offset
count in 1.4.4 via commit bce70882. It has been reported by David Torgerson
who offered some debugging traces showing where the crash happened, thus
making it significantly easier to find the bug!
CVE-2013-2175 was assigned to this bug.
This fix must absolutely be backported to 1.4.
Commit 5adeda1 (acl: add option -m to change the pattern matching method)
was not completely correct with regards to boolean fetches. It only used
the sample type to determine if the test had to be performed as a boolean
instead of relying on the match function. Due to this, a test such as the
following would not correctly match as the pattern would be ignored :
acl srv_down srv_is_up(s2) -m int 0
No backport is needed as this was merged first in 1.5-dev18.
This is useful in order to take different actions across restarts without
touching the configuration (eg: soft-stop), or to pass some information
such as the local host name to the next hop.
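For example, assuming this refers to the env() sample fetch (variable and
header names are illustrative):
http-request set-header X-Node %[env(HOSTNAME)]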
This one is wrong, never matches and cannot work. It was brought by a blind
copy-paste from the url_* version in 1.5-dev9, but there is no underlying
fetch returning an IP type for this.
The following 15 ACLs were missed from previous review, and are not needed
either.
hdr_cnt, hdr_ip, hdr_val, rep_ssl_hello_type, req_len, req_ssl_hello_type,
scook_cnt, scook_val, shdr_cnt, shdr_ip, shdr_val, url_ip, url_port,
urlp_val, req_proto_http.
Commit bef91e71 added the possibility to automatically use some fetch
functions instead of ACL functions, but the fetch output type was
never used and setting the match method using -m was always mandatory.
Some fetch types are non-ambiguous and can intuitively be associated
with some ACL types :
SMP_T_BOOL -> bool
SMP_T_UINT/SINT -> int
SMP_T_IPV4/IPV6 -> ip
So let's have the ACL expression parser detect these ones automatically.
Other types are more ambiguous, especially everything related to strings,
as there are many string matching methods available and none of them is
the obvious standard matching method for any string. These ones will still
have to be specified using -m.
This configures the client-facing connection to receive a PROXY protocol
header before any byte is read from the socket. This is equivalent to
having the "accept-proxy" keyword on the "bind" line, except that using
the TCP rule allows the PROXY protocol to be accepted only for certain
IP address ranges using an ACL. This is convenient when multiple layers
of load balancers are passed through by traffic coming from public
hosts.
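For example, assuming the rule is the "expect-proxy" action on
"tcp-request connection" (the file name is illustrative):
tcp-request connection expect-proxy layer4 if { src -f proxies.lst }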
"set-mark" is used to set the Netfilter MARK on all packets sent to the
client to the value passed in <mark> on platforms which support it. This
value is an unsigned 32 bit value which can be matched by netfilter and
by the routing table. It can be expressed both in decimal or hexadecimal
format (prefixed by "0x"). This can be useful to force certain packets to
take a different route (for example a cheaper network path for bulk
downloads). This works on Linux kernels 2.6.32 and above and requires
admin privileges.
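For example (the mark value and the condition are illustrative):
http-request set-mark 0x2 if { path_beg /download/ }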
This manipulates the TOS field of the IP header of outgoing packets sent
to the client. This can be used to set a specific DSCP traffic class based
on some request or response information. See RFC2474, 2597, 3260 and 4594
for more information.
Some users want to disable logging for certain non-important requests such as
stats requests or health-checks coming from another equipment. Other users want
to log with a higher importance (eg: notice) some special traffic (POST requests,
authenticated requests, requests coming from suspicious IPs) or some abnormally
large responses.
This patch responds to all these needs at once by adding a "set-log-level" action
to http-request/http-response. The 8 syslog levels are supported, as well as "silent"
to disable logging.
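For example (the condition is illustrative):
http-request set-log-level silent if { path /health }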
Some actions were clearly missing to process response headers. This
patch adds a new "http-response" ruleset which provides the following
actions :
- allow : stop evaluating http-response rules
- deny : stop and reject the response with a 502
- add-header : add a header in log-format mode
- set-header : set a header in log-format mode
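A minimal sketch using these actions (header names and values are
illustrative) :
http-response set-header X-Frame-Options SAMEORIGIN
http-response add-header X-Served-By haproxy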
Since commit cfd97c6f was merged into 1.5-dev14 (BUG/MEDIUM: checks:
prevent TIME_WAITs from appearing also on timeouts), some valid health
checks sometimes used to show some TCP resets. For example, this HTTP
health check sent to a local server :
19:55:15.742818 IP 127.0.0.1.16568 > 127.0.0.1.8000: S 3355859679:3355859679(0) win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
19:55:15.742841 IP 127.0.0.1.8000 > 127.0.0.1.16568: S 1060952566:1060952566(0) ack 3355859680 win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
19:55:15.742863 IP 127.0.0.1.16568 > 127.0.0.1.8000: . ack 1 win 257
19:55:15.745402 IP 127.0.0.1.16568 > 127.0.0.1.8000: P 1:23(22) ack 1 win 257
19:55:15.745488 IP 127.0.0.1.8000 > 127.0.0.1.16568: FP 1:146(145) ack 23 win 257
19:55:15.747109 IP 127.0.0.1.16568 > 127.0.0.1.8000: R 23:23(0) ack 147 win 257
After some discussion with Chris Huang-Leaver, it appeared clear that
what we want is to only send the RST when we have no other choice, which
means when the server has not closed. So we still keep SYN/SYN-ACK/RST
for pure TCP checks, but don't want to see an RST emitted as above when
the server has already sent the FIN.
The solution against this consists in implementing a "drain" function at
the protocol layer, which, when defined, causes as much as possible of
the input socket buffer to be flushed to make recv() return zero so that
we know that the server's FIN was received and ACKed. On Linux, we can make
use of MSG_TRUNC on TCP sockets, which has the benefit of draining everything
at once without even copying data. On other platforms, we read up to one
buffer of data before the close. If recv() manages to get the final zero,
we don't disable lingering. Same for hard errors. Otherwise we do.
In practice, on HTTP health checks we generally find that the close was
pending and is returned upon first recv() call. The network trace becomes
cleaner :
19:55:23.650621 IP 127.0.0.1.16561 > 127.0.0.1.8000: S 3982804816:3982804816(0) win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
19:55:23.650644 IP 127.0.0.1.8000 > 127.0.0.1.16561: S 4082139313:4082139313(0) ack 3982804817 win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
19:55:23.650666 IP 127.0.0.1.16561 > 127.0.0.1.8000: . ack 1 win 257
19:55:23.651615 IP 127.0.0.1.16561 > 127.0.0.1.8000: P 1:23(22) ack 1 win 257
19:55:23.651696 IP 127.0.0.1.8000 > 127.0.0.1.16561: FP 1:146(145) ack 23 win 257
19:55:23.652628 IP 127.0.0.1.16561 > 127.0.0.1.8000: F 23:23(0) ack 147 win 257
19:55:23.652655 IP 127.0.0.1.8000 > 127.0.0.1.16561: . ack 24 win 257
This change should be backported to 1.4 which is where Chris encountered
this issue. The code is different, so probably the tcp_drain() function
will have to be put in the checks only.
The req.hdr and res.hdr fetch methods do not work well on headers which
are allowed to contain commas, such as User-Agent, Date or Expires.
More specifically, full-length matching is impossible if a comma is
present.
This patch introduces 4 new fetch functions which are designed to work
with these full-length headers :
- req.fhdr, req.fhdr_cnt
- res.fhdr, res.fhdr_cnt
These ones do not stop at commas and can return full-length header values.
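For example (a sketch; the header usage is illustrative) :
http-request set-header X-Full-UA %[req.fhdr(User-Agent)]
acl dup_ua req.fhdr_cnt(User-Agent) -m int gt 1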
People who use "option dontlog-normal" are bothered with redirects and
stats being logged and reported as errors in the logs ("PR" = proxy
blocked the request).
This patch introduces a new flag 'L' for requests which are locally
processed, so that they are not considered as errors by the log filters.
That way we know a request was intercepted and processed by haproxy
without logging the line when "option dontlog-normal" is in effect.
Since 1.5-dev12 and commit 3bf1b2b8 (MAJOR: channel: stop relying on
BF_FULL to take action), the HTTP parser switched to channel_full()
instead of BF_FULL to decide whether a buffer had enough room to start
parsing a request or response. The problem is that channel_full()
intentionally ignores outgoing data, so a corner case exists where a
large response might still be left in a response buffer with just a
few bytes left (much less than the reserve), enough to accept a second
response past the last data, but not enough to permit the HTTP processor
to add some headers. Since all the processing relies on this space being
available, we can get some random crashes when clients pipeline requests.
The analysis of a core from haproxy configured with 20480 bytes buffers
shows this : with enough "luck", when sending back the response for the
first request, the client is slow, the TCP window is congested, the socket
buffers are full, and haproxy's buffer fills up. We still have 20230 bytes
of response data in a 20480 response buffer. The second request is sent to
the server which returns 214 bytes which fit in the small 250 bytes left
in this buffer. And the buffer arrangement makes it possible to escape all
the controls in http_wait_for_response() :
|<------ response buffer = 20480 bytes ------>|
[ 2/2 | 3 | 4 | 1/2 ]
^ start of circular buffer
1/2 = beginning of previous response (18240)
2/2 = end of previous response (1990)
3 = current response (214)
4 = free space (36)
- channel_full() returns false (20230 bytes are going to leave)
- the response headers do not wrap at the end of the buffer
- the remaining linear room after the headers is larger than the
reserve, because it's the previous response which wraps :
=> response is processed
Header rewriting causes it to reach 260 bytes, 10 bytes larger than what
the buffer could hold. So all computations during header addition are
wrong and lead to the corruption we've observed.
All the conditions are very hard to meet (which explains why it took
almost one year for this bug to show up) and are almost impossible to
reproduce on purpose on a test platform. But the bug is clearly there.
This issue was reported by Dinko Korunic who kindly devoted a lot of
time to provide countless traces and cores, and to experiment with
troubleshooting patches to knock the bug down. Thanks Dinko!
No backport is needed, but all 1.5-dev versions between dev12 and dev18
included must be upgraded. A workaround consists in setting option
forceclose to prevent pipelined requests from being processed.
I left a mistake in my previous patch bringing the crt-list feature;
it breaks clients with no SNI support.
Also remove the useless wildp = NULL as per a previous discussion.
We were using "tune.ssl.maxrecord 2000" and discovered an interesting
problem: SSL data sent from the server to the client showed occasional
corruption of the payload data.
The root cause was:
When ssl_max_record is smaller than the requested send amount
the ring buffer wrapping wasn't properly adjusting the
number of bytes to send.
I solved this by selecting the initial size based on the number
of output bytes that can be sent without splitting _before_ checking
against ssl_max_record.
We're often missing a third counter to track base, src and base+src at
the same time. Here we introduce track_sc3 to have this third counter.
It would be wise not to add many more counters because that slightly
increases the session size and processing time, though the real issue
is more the declaration of the keywords in the code and in the doc.
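For example, assuming the resulting keywords follow the existing
track-sc1/track-sc2 naming (a sketch; the keys are illustrative) :
tcp-request content track-sc1 base
tcp-request content track-sc2 src
tcp-request content track-sc3 base32+src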
By properly affecting the flags and values, it becomes easier to add
more tracked counters, for example for experimentation. It also slightly
reduces the code and the number of tests. No counters were added with
this patch.
This new pattern fetch returns the client certificate's SHA-1 fingerprint
(i.e. SHA-1 hash of DER-encoded certificate) in a binary chunk.
This can be useful to pass it to a server in a header or to stick a client
to a server across multiple SSL connections.
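For example, assuming the fetch is exposed as "ssl_c_sha1", the binary value
can be made header-safe with the hex converter (a sketch) :
http-request set-header X-SSL-Client-SHA1 %[ssl_c_sha1,hex]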
FreeBSD uses (IPPROTO_IP, IP_BINDANY) and (IPPROTO_IPV6, IPV6_BINDANY)
to enable transparent proxy on a socket.
This patch adds support for the relevant setsockopt() calls.
This patch does not change the logic of the code, it only changes the
way OS-specific defines are tested.
At the moment the transparent proxy code heavily depends on Linux-specific
defines. This first patch introduces a new define "CONFIG_HAP_TRANSPARENT"
which is set every time the defines used by transparent proxy are present.
This also means that with an up-to-date libc, it should not be necessary
anymore to force CONFIG_HAP_LINUX_TPROXY during the build, as the flags
will automatically be detected.
The CTTPROXY flags still remain separate because this older API doesn't
work the same way.
A new line has been added in the version output for haproxy -vv to indicate
what transparent proxy support is available.
Improve the crt-list file format to allow a rule to negate a certain SNI :
<crtfile> [[!]<snifilter> ...]
This can be useful when a domain supports a wildcard but you don't want to
deliver the wildcard cert for certain specific domains.
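For example, a crt-list file could look like this (file and domain names are
illustrative) :
wildcard.example.com.pem *.example.com !secure.example.com
secure.example.com.pem secure.example.com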
Global compression settings (windowsize and memlevel) were only considered
for the gzip algorithm but not the deflate algorithm. Since a single allocator
is used for both algos, if gzip was initialized first with parameters smaller
than the defaults, then initializing deflate afterwards with default settings
would result in overrunning the small allocated areas.
To fix this, we make use of deflateInit2() for deflate_init() as well.
Thanks to Godbach for reporting this bug, introduced in 1.5-dev13 by commit
8b52bb38. No backport is needed.
It happens that openssl's API can differ between versions, causing some
serious trouble if the version used at runtime is not the same as used
for building.
Now we report the two versions separately along with a warning if the
version differs (except the patch version).
The source function does not work with the following line in the defaults
section:
source 0.0.0.0 usesrc clientip
even though the related iptables and ip rule settings have been configured
correctly, while the same line works well in a backend section.
The reason is the following operation at line 1815 in cfgparse.c:
curproxy->conn_src.opts = defproxy.conn_src.opts & ~CO_SRC_TPROXY_MASK;
It clears the three low bits of conn_src.opts, which store the 'usesrc'
configuration. Without the correct bits set, the source address function
cannot work; these bits should be copied to the backend proxy unmodified.
Also, since conn_src.tproxy_addr was not copied from defproxy to the backend
proxy during its initialization, the source function does not work with an
explicit source address set in the defaults section either.
Signed-off-by: Godbach <nylzhaowei@gmail.com>
Note: the bug was introduced in 1.5-dev16 with commit ef9a3605
In 1.5-dev17 (commit 1facd6d6), we reorganized the way HTTP stats
requests are handled. When moving the code, we dropped a "return 0"
which was performed upon an incomplete POST request, so we now end up
with the next "return 1" which causes processing to go on with the next
analyser. This causes incomplete POST requests to be forwarded to
servers, resulting in either a 404 or a 503 depending on the
configuration.
This patch fixes this regression to restore the previous behaviour.
It's not enough though, as it happens that the stats code is handled
after all http header processing but in the same function. The net
effect is that incomplete requests cause the header manipulation to
be performed multiple times, possibly resulting in duplicated headers
in the request buffer. Since the stats requests are not meant to be
forwarded, it's not an issue yet but this is something to take care
of later.
A remaining issue that's not handled yet is that if the client does
not send the complete POST headers, then the request is finally
forwarded. This is not a regression, it has always been there and
seems to be caused by the lack of timeout processing when waiting
for the POST body. The solution to this issue would be to move the
handling of stats requests into a dedicated analyser placed after
http_process_request_body().
Bug reported by Guillaume de Lafond.
Implements the "res.comp" ACL which is a boolean returning 1 when a
response has been compressed by HAProxy or 0 otherwise.
Implements the "res.comp_algo" fetch which contains the name of the
algorithm HAProxy used to compress the response.
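For example (a sketch; the header name is illustrative) :
acl comp_used res.comp
http-response add-header X-Comp-Algo %[res.comp_algo] if comp_used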
Bryan Talbot reported that a config with only "stats bind-process 1" in
the global section would crash during parsing. This bug was introduced
with this new statement in 1.5-dev13. The stats frontend must be allocated
if it was not yet.
No backport is needed. The workaround consists in having previously declared
any other stats keyword (generally stats socket).
A "soft-stopped" server (weight 0) can be missed easily in the
webinterface. Fix this by using a specific class (and color).
Signed-off-by: Lukas Tribus <luky-37@hotmail.com>
When a server state change happens, a status bar appears with a link. Since
1.5-dev16 (commit e7dbfc66), it was not visible anymore because of the CSS
properties set on the whole "div" tag used by the tooltips. Fix this by
using a specific class for the tooltips.
The confirmation link on the web interface forgets to set the displaying
options, so that when one clicks on "[X] Action processed successfully",
the auto-refresh, up/down and scope search settings are lost. We need to
pass all this information in the new link.
Commit 88c278fadf provided a new input field on the statistics page which was
focused by default. The autofocus prevented scrolling down the page with the
keyboard immediately after loading it without first leaving that field.
It also interfered with "stats refresh" by scrolling up to this field on each
refresh.
This small patch removes the autofocus keyword and adds a tabindex="1" as
suggested by Vincent Bernat. It also cleans up the generated html code.
This patch adds a "scope" box in the statistics page in order to
display only proxies with a name that contains the requested value.
The scope filter is preserved across all clicks on the page.
Since ea68d36 we show whether JIT is enabled, based on the USE-flag
(USE_PCRE_JIT). This is too naive; libpcre may be built without JIT
support (which is the default).
Fix this by calling pcre_config(), which has the accurate information
we are looking for.
Example of a libpcre without JIT support after this patch:
> ./haproxy -vv | grep PCRE
> OPTIONS = USE_STATIC_PCRE=1 USE_PCRE_JIT=1
> Built with PCRE version : 8.32 2012-11-30
> PCRE library supports JIT : no (libpcre build without JIT?)
The compression state machine happens to start work that it cannot undo
when there's no more data in the input buffer, and has trouble accounting
for it. Fixing it requires more than a few lines, as the confusion is
in part caused by the way the pointers to the various places in the
message are handled internally. So as a temporary fix, let's disable
compression on chunk-encoded responses. This will give us more time
to perform the required changes.
Will Glass-Husain reported this breakage which was caused by commit
dec9814e merged in 1.5-dev12, and which was incomplete, resulting in
both "clear table" and "show table" to do the same thing.
No backport is needed.
Improve error log reporting in the format parser by always giving the
file name, line number, and directive name instead of the hard-coded
"log-format".
Previously we got this when dealing with log-format errors:
[WARNING] 101/183012 (8561) : Warning: log-format variable name 'r' is not suited to HTTP mode
[WARNING] 101/183012 (8561) : log-format: sample fetch <hdr(a,1)> may not be reliably used with 'log-format' because it needs 'HTTP request headers,HTTP response headers' which is not available here.
[WARNING] 101/183012 (8561) : Warning: no such variable name 'k' in log-format
Now we have this :
[WARNING] 101/183016 (8593) : parsing [fmt.cfg:8] : 'log-format' variable name 'r' is reserved for HTTP mode
[WARNING] 101/183016 (8593) : parsing [fmt.cfg:8] : 'log-format' : sample fetch <hdr(a,1)> may not be reliably used here because it needs 'HTTP request headers,HTTP response headers' which is not available here.
[WARNING] 101/183016 (8593) : parsing [fmt.cfg:15] : no such variable name 'k' in 'unique-id-format'
Commit a4312fa2 merged into dev18 improved log-format management by
processing "log-format" and "unique-id-format" where they were declared,
so that the faulty args could be reported with their correct line numbers.
Unfortunately, the log-format parser considers the proxy mode (TCP/HTTP),
so if the directive is set before the "mode" statement, it can be
rejected and report spurious warnings.
So we really need to parse these directives at the end of a section at
least. Right now we do not have an "end of section" event, so we need
to store the file name and line number for each of these directives,
and take care of them at the end.
One of the benefits is that now the line numbers can be inherited from
the line passing "option httplog" even if it's in a defaults section.
Future improvements should be performed to report line numbers in every
log-format processed by the parser.
When the parameter passed to a consistent hash is not found, we fall back to
round-robin using chash_get_next_server(). This one stores the last visited
server in lbprm.chash.last, which can be NULL upon the first invocation or if
the only server was recently brought up.
The loop used to scan for a server is able to skip the previously attempted
server in case of a redispatch, by passing this previous server in srvtoavoid.
For this reason, the loop stops when the currently considered server is
different from srvtoavoid and different from the original chash.last.
A problem happens in a special sequence : if a connection to a server fails,
then all servers are removed from the farm, then the original server is added
again before the redispatch happens. In this case we have chash.last = NULL
and srvtoavoid set to the only server in the farm. Then this server is always
equal to srvtoavoid and never to NULL, and the loop never stops.
The fix consists in assigning the stop point to the first encountered node if
it was not yet set.
This issue cannot happen with the map-based algorithm since it's based on an
index and not a stop point.
This issue was reported by Henry Qian who kindly provided lots of critically
useful information to figure out the conditions to reproduce the issue.
The fix needs to be backported to 1.4 which is also affected.
Till now we used to call the function until the connection was established,
which means that the header rewriting was performed for nothing upon each
event (eg: uploaded contents) until the server responded or timed out.
Now we only call the function when we assign the server.
When syncing data between two haproxy instances, the received keys were
allocated on the stack, but no room was reserved for the string type. So in
practice all strings of 16 bytes or less were properly handled (due to
the union with in6_addr which reserves some room), but above this some
local variables were overwritten. Up to 8 additional bytes could be
written without impact, but random behaviours are expected above, including
crashes and memory corruption.
If large enough strings are allowed in the configuration, it is possible that
code execution could be triggered when the function's return pointer is
overwritten.
A quick workaround consists in ensuring that no stick-table uses a string
larger than 16 bytes (default is 32). Another workaround is to disable peer
sync when strings are in use.
Thanks to Will Glass-Husain for bringing up this issue with a backtrace to
understand the issue.
The bug only affects 1.5-dev3 and above. No backport is needed.
We'll need to centralize some pool_alloc()/pool_free() calls in the
upcoming fix so before that we need to reduce the number of points
by which we leave the critical code.
According to "man pcreapi", pcre_compile() does not accept being passed
a NULL pointer in errptr or erroffset. It immediately returns NULL, causing
any expression to fail. So let's pass real variables and make use of them
to improve error reporting.
When an ACL keyword needs a mandatory argument and this argument is of
type proxy or table, it is allowed not to specify it so that current
proxy is used by default.
In order to achieve this, the ACL expression parser builds a dummy
argument from scratch and marks it unresolved.
However, since recent changes on the ACL and samples, an unresolved
argument needs to be added to the unresolved list. This specific code
did not do it, resulting in random data being used as a proxy pointer
if no argument was passed for a proxy name, possibly even causing a
crash.
A quick workaround consists in explicitly naming proxies in ACLs.
Commit 96199b10 reintroduced the splice() mechanism in the new connection
system. However, it failed to account for the number of transferred bytes,
allowing more bytes than scheduled to be transferred to the client. This
can cause an issue with small-chunked responses, where each packet from
the server may contain several chunks, because a single splice() call may
succeed, then try to splice() a second time as the pipe is not full, thus
consuming the next chunk size.
This patch also reverts commit baf2a5 ("OPTIM: splice: detect shutdowns...")
because it introduced a related regression. The issue is that splice() may
return less data than available also if the pipe is full, so having EPOLLRDHUP
after splice() returns less than expected is not a sufficient indication that
the input is empty.
In both cases, the issue may be detected by the presence of "SD" termination
flags in the logs, and worked around by disabling splicing (using "-dS").
This problem was reported by Sander Klein, and no backport is needed.
haproxy -vv shows build information about USE flags and lib versions.
This patch introduces information about PCRE and the new JIT feature.
It also makes USE_PCRE_JIT=1 appear in the haproxy -vv "OPTIONS".
This is useful since, with the introduction of JIT, we will see
libpcre-related issues.
Sander Klein reported this bug. The test for the extra argument on these
rules prevents any condition from being added. The bug was introduced with
the feature itself in 1.5-dev16.
Released version 1.5-dev18 with the following main changes :
- DOCS: Add explanation of intermediate certs to crt paramater
- DOC: typo and minor fixes in compression paragraph
- MINOR: config: http-request configuration error message misses new keywords
- DOC: minor typo fix in documentation
- BUG/MEDIUM: ssl: ECDHE ciphers not usable without named curve configured.
- MEDIUM: ssl: add bind-option "strict-sni"
- MEDIUM: ssl: add mapping from SNI to cert file using "crt-list"
- MEDIUM: regex: Use PCRE JIT in acl
- DOC: simplify bind option "interface" explanation
- DOC: tfo: bump required kernel to linux-3.7
- BUILD: add explicit support for TFO with USE_TFO
- MEDIUM: New cli option -Ds for systemd compatibility
- MEDIUM: add haproxy-systemd-wrapper
- MEDIUM: add systemd service
- BUG/MEDIUM: systemd-wrapper: don't leak zombie processes
- BUG/MEDIUM: remove supplementary groups when changing gid
- BUG/MEDIUM: config: fix parser crash with bad bind or server address
- BUG/MINOR: Correct logic in cut_crlf()
- CLEANUP: checks: Make desc argument to set_server_check_status const
- CLEANUP: dumpstats: Make cli_release_handler() static
- MEDIUM: server: Break out set weight processing code
- MEDIUM: server: Allow relative weights greater than 100%
- MEDIUM: server: Tighten up parsing of weight string
- MEDIUM: checks: Add agent health check
- BUG/MEDIUM: ssl: openssl 0.9.8 doesn't open /dev/random before chroot
- BUG/MINOR: time: frequency counters are not totally accurate
- BUG/MINOR: http: don't process abortonclose when request was sent
- BUG/MEDIUM: stream_interface: don't close outgoing connections on shutw()
- BUG/MEDIUM: checks: ignore late resets after valid responses
- DOC: fix bogus recommendation on usage of gpc0 counter
- BUG/MINOR: http-compression: lookup Cache-Control in the response, not the request
- MINOR: signal: don't block SIGPROF by default
- OPTIM: epoll: make use of EPOLLRDHUP
- OPTIM: splice: detect shutdowns and avoid splice() == 0
- OPTIM: splice: assume by default that splice is working correctly
- BUG/MINOR: log: temporary fix for lost SSL info in some situations
- BUG/MEDIUM: peers: only the last peers section was used by tables
- BUG/MEDIUM: config: verbosely reject peers sections with multiple local peers
- BUG/MINOR: epoll: use a fix maxevents argument in epoll_wait()
- BUG/MINOR: config: fix improper check for failed memory alloc in ACL parser
- BUG/MINOR: config: free peer's address when exiting upon parsing error
- BUG/MINOR: config: check the proper variable when parsing log minlvl
- BUG/MEDIUM: checks: ensure the health_status is always within bounds
- BUG/MINOR: cli: show sess should always validate s->listener
- BUG/MINOR: log: improper NULL return check on utoa_pad()
- CLEANUP: http: remove a useless null check
- CLEANUP: tcp/unix: remove useless NULL check in {tcp,unix}_bind_listener()
- BUG/MEDIUM: signal: signal handler does not properly check for signal bounds
- BUG/MEDIUM: tools: off-by-one in quote_arg()
- BUG/MEDIUM: uri_auth: missing NULL check and memory leak on memory shortage
- BUG/MINOR: unix: remove the 'level' field from the ux struct
- CLEANUP: http: don't try to deinitialize http compression if it fails before init
- CLEANUP: config: slowstart is never negative
- CLEANUP: config: maxcompcpuusage is never negative
- BUG/MEDIUM: log: emit '-' for empty fields again
- BUG/MEDIUM: checks: fix a race condition between checks and observe layer7
- BUILD: fix a warning emitted by isblank() on non-c99 compilers
- BUILD: improve the makefile's support for libpcre
- MEDIUM: halog: add support for counting per source address (-ic)
- MEDIUM: tools: make str2sa_range support all address syntaxes
- MEDIUM: config: make use of str2sa_range() instead of str2sa()
- MEDIUM: config: use str2sa_range() to parse server addresses
- MEDIUM: config: use str2sa_range() to parse peers addresses
- MINOR: tests: add a config file to ease address parsing tests.
- MINOR: ssl: add a global tunable for the max SSL/TLS record size
- BUG/MINOR: syscall: fix NR_accept4 system call on sparc/linux
- BUILD/MINOR: syscall: add definition of NR_accept4 for ARM
- MINOR: config: report missing peers section name
- BUG/MEDIUM: tools: fix bad character handling in str2sa_range()
- BUG/MEDIUM: stats: never apply "unix-bind prefix" to the global stats socket
- MINOR: tools: prepare str2sa_range() to return an error message
- BUG/MEDIUM: checks: don't call connect() on unsupported address families
- MINOR: tools: prepare str2sa_range() to accept a prefix
- MEDIUM: tools: make str2sa_range() parse unix addresses too
- MEDIUM: config: make str2listener() use str2sa_range() to parse unix addresses
- MEDIUM: config: use a single str2sa_range() call to parse bind addresses
- MEDIUM: config: use str2sa_range() to parse log addresses
- CLEANUP: tools: remove str2sun() which is not used anymore.
- MEDIUM: config: add complete support for str2sa_range() in dispatch
- MEDIUM: config: add complete support for str2sa_range() in server addr
- MEDIUM: config: add complete support for str2sa_range() in 'server'
- MEDIUM: config: add complete support for str2sa_range() in 'peer'
- MEDIUM: config: add complete support for str2sa_range() in 'source' and 'usesrc'
- CLEANUP: minor cleanup in str2sa_range() and str2ip()
- CLEANUP: config: do not use multiple errmsg at once
- MEDIUM: tools: support specifying explicit address families in str2sa_range()
- MAJOR: listener: support inheriting a listening fd from the parent
- MAJOR: tools: support environment variables in addresses
- BUG/MEDIUM: http: add-header should not emit "-" for empty fields
- BUG/MEDIUM: config: ACL compatibility check on "redirect" was wrong
- BUG/MEDIUM: http: fix another issue caused by http-send-name-header
- DOC: mention the new HTTP 307 and 308 redirect statues
- MEDIUM: poll: do not use FD_* macros anymore
- BUG/MAJOR: ev_select: disable the select() poller if maxsock > FD_SETSIZE
- BUG/MINOR: acl: ssl_fc_{alg,use}_keysize must parse integers, not strings
- BUG/MINOR: acl: ssl_c_used, ssl_fc{,_has_crt,_has_sni} take no pattern
- BUILD: fix usual isdigit() warning on solaris
- BUG/MEDIUM: tools: vsnprintf() is not always reliable on Solaris
- OPTIM: buffer: remove one jump in buffer_count()
- OPTIM: http: improve branching in chunk size parser
- OPTIM: http: optimize the response forward state machine
- BUILD: enable poll() by default in the makefile
- BUILD: add explicit support for Mac OS/X
- BUG/MAJOR: http: use a static storage for sample fetch context
- BUG/MEDIUM: ssl: improve error processing and reporting in ssl_sock_load_cert_list_file()
- BUG/MAJOR: http: fix regression introduced by commit a890d072
- BUG/MAJOR: http: fix regression introduced by commit d655ffe
- BUG/CRITICAL: using HTTP information in tcp-request content may crash the process
- MEDIUM: acl: remove flag ACL_MAY_LOOKUP which is improperly used
- MEDIUM: samples: use new flags to describe compatibility between fetches and their usages
- MINOR: log: indicate it when some unreliable sample fetches are logged
- MEDIUM: samples: move payload-based fetches and ACLs to their own file
- MINOR: backend: rename sample fetch functions and declare the sample keywords
- MINOR: frontend: rename sample fetch functions and declare the sample keywords
- MINOR: listener: rename sample fetch functions and declare the sample keywords
- MEDIUM: http: unify acl and sample fetch functions
- MINOR: session: rename sample fetch functions and declare the sample keywords
- MAJOR: acl: make all ACLs reference the fetch function via a sample.
- MAJOR: acl: remove the arg_mask from the ACL definition and use the sample fetch's
- MAJOR: acl: remove fetch argument validation from the ACL struct
- MINOR: http: add new direction-explicit sample fetches for headers and cookies
- MINOR: payload: add new direction-explicit sample fetches
- CLEANUP: acl: remove ACL hooks which were never used
- MEDIUM: proxy: remove acl_requires and just keep a flag "http_needed"
- MINOR: sample: provide a function to report the name of a sample check point
- MAJOR: acl: convert all ACL requires to SMP use+val instead of ->requires
- CLEANUP: acl: remove unused references to ACL_USE_*
- MINOR: http: replace acl_parse_ver with acl_parse_str
- MEDIUM: acl: move the ->parse, ->match and ->smp fields to acl_expr
- MAJOR: acl: add option -m to change the pattern matching method
- MINOR: acl: remove the use_count in acl keywords
- MEDIUM: acl: have a pointer to the keyword name in acl_expr
- MEDIUM: acl: support using sample fetches directly in ACLs
- MEDIUM: http: remove val_usr() to validate user_lists
- MAJOR: sample: maintain a per-proxy list of the fetch args to resolve
- MINOR: ssl: add support for the "alpn" bind keyword
- MINOR: http: status code 303 is HTTP/1.1 only
- MEDIUM: http: implement redirect 307 and 308
- MINOR: http: status 301 should not be marked non-cacheable
The ALPN extension is meant to replace the now deprecated NPN extension.
This patch implements support for it. It requires a version of openssl
with support for this extension. Patches are available here right now :
http://html5labs.interopbridges.com/media/167447/alpn_patches.zip
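For example (a sketch; the certificate path and protocol list are
illustrative) :
bind :443 ssl crt /etc/haproxy/site.pem alpn http/1.1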
While ACL args were resolved after all the config was parsed, this was not
the case with sample fetch args, even though they're almost everywhere now.
The issue is that ACLs now solely rely on sample fetches, so their arg
resolution doesn't work anymore, and many fetches involving a server, a
proxy or a userlist don't work at all.
The real issue is that at the bottom layers we have no information about
proxies, line numbers, even ACLs in order to report understandable errors,
and that at the top layers we have no visibility over the locations where
fetches are referenced (think log node).
After multiple unsatisfying solution attempts, we now have a new
concept of args list. The principle is that every proxy has a list head
which contains a number of indications such as the config keyword, the
context where it's used, the file and line number, etc... and a list of
arguments. This list head is of the same type as the elements, so it
serves as a template for adding new elements. This way, it is filled from
top to bottom by the callers with the information they have (eg: line
numbers, ACL name, ...) and the lower layers just have to duplicate it and
add an element when they face an argument they cannot resolve yet.
Then at the end of the configuration parsing, a loop passes over each
proxy's list and resolves all the args in sequence. And this way there is
all necessary information to report verbose errors.
The first immediate benefit is that for the first time we got very precise
location of issues (arg number in a keyword in its context, ...). Second,
in order to do this we had to parse log-format and unique-id-format a bit
earlier, so that was a great opportunity for doing so when the directives
are encountered (unless it's a default section). This way, the recorded
line numbers for these args are the ones of the place where the log format
is declared, not the end of the file.
Userlists report slightly more information now. They're the only remaining
ones in the ACL resolving function.
Now it becomes possible to directly use sample fetches as the ACL fetch
methods. In this case, the matching method is mandatory. This makes it
possible to form more ACL combinations from existing fetches, and will limit
the need for new ACLs when everything needed to form them is available from
sample fetches and matches.
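For example, a fetch which has no ACL counterpart can now be used by naming
a matching method explicitly (a sketch) :
acl ua_curl req.fhdr(User-Agent) -m sub curl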
The acl_expr struct used to hold a pointer to the ACL keyword. But since
we now have all the relevant pointers, we don't need that anymore, we just
need the pointer to the keyword as a string in order to return warnings
and error messages.
So let's change this in order to remove the dependency on the acl_keyword
struct from acl_expr.
During this change, acl_cond_kw_conflicts() used to return a pointer to an
ACL keyword but had to be changed to return a const char* for the same reason.
ACL expressions now support "-m" in addition to "-i" and "-f". This new
option is followed by the name of the pattern matching method to be used
on the extracted pattern. This makes it possible to reuse existing sample
fetch methods with other matching methods (eg: regex). A "found" matching
method ignores any pattern and only verifies that the required sample was
found (useful for cookies).
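For example, to check that a cookie exists regardless of its value (a
sketch; the cookie name is illustrative) :
acl has_session req.cook(SESSIONID) -m found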
The HTTP version parser used in ACLs has long been a string and
still had its own parser. This makes no sense, switch it to use
the standard string parser.
The ACLs now use the fetch's ->use and ->val to decide upon compatibility
between the place where they are used and the place where the information is
fetched.
The code is capable of reporting warnings about very fine incompatibilities
between certain fetches and an exact usage location, so it is expected that
some new warnings will be emitted on some existing configurations.
Two degrees of detection are provided :
- detecting ACLs that never match
- detecting keywords that are ignored
All tests show that this seems to work well, though bugs are still possible.
Proxy's acl_requires was a copy of all bits taken from ACLs, but we'll
get rid of ACL flags and only rely on sample fetches soon. The proxy's
acl_requires was only used to allocate an HTTP context when needed, and
was even forced in HTTP mode. So better have a flag which exactly says
what it's supposed to be used for.
These hooks, which established the relation between ACL_USE_* and the location
where the ACLs were used, were never used because they were superseded by the
sample capabilities. Remove them now.
Similarly to the previous commit fixing "hdr" and "cookie" in HTTP, we have to
deal with "payload" and "payload_lv", which are request-only for ACLs but
req/resp for sample fetches depending on the context, and to a lesser extent
with other req_* and rep_* fetches. So let's add explicit "req." and "res."
variants and make the ACLs rely on those instead.
Since "hdr" and "cookie" were ambiguously referring to the request or response
depending on the context, we need a way to explicitly specify the direction.
By prefixing the fetch names with "req." and "res.", we can now restrict such
fetches to the appropriate direction. At the moment the fetches are explicitly
declared, but later we might think about having an automatic match when "req."
or "res." appears. These explicit fetches are now used by the relevant ACLs.