The req.hdr and res.hdr fetch methods do not work well on headers which
are allowed to contain commas, such as User-Agent, Date or Expires.
More specifically, full-length matching is impossible if a comma is
present.
This patch introduces 4 new fetch functions which are designed to work
with these full-length headers :
- req.fhdr, req.fhdr_cnt
- res.fhdr, res.fhdr_cnt
These ones do not stop at commas and permit to return full-length header
values.
People who use "option dontlog-normal" are bothered with redirects and
stats being logged and reported as errors in the logs ("PR" = proxy
blocked the request).
This patch introduces a new flag 'L' for when a request is locally
processed, that is not considered as an error by the log filters. That
way we know a request was intercepted and processed by haproxy without
logging the line when "option dontlog-normal" is in effect.
Since 1.5-dev12 and commit 3bf1b2b8 (MAJOR: channel: stop relying on
BF_FULL to take action), the HTTP parser switched to channel_full()
instead of BF_FULL to decide whether a buffer had enough room to start
parsing a request or response. The problem is that channel_full()
intentionally ignores outgoing data, so a corner case exists where a
large response might still be left in a response buffer with just a
few bytes left (much less than the reserve), enough to accept a second
response past the last data, but not enough to permit the HTTP processor
to add some headers. Since all the processing relies on this space being
available, we can get some random crashes when clients pipeline requests.
The analysis of a core from haproxy configured with 20480 bytes buffers
shows this : with enough "luck", when sending back the response for the
first request, the client is slow, the TCP window is congested, the socket
buffers are full, and haproxy's buffer fills up. We still have 20230 bytes
of response data in a 20480 response buffer. The second request is sent to
the server which returns 214 bytes which fit in the small 250 bytes left
in this buffer. And the buffer arrangement makes it possible to escape all
the controls in http_wait_for_response() :
|<------ response buffer = 20480 bytes ------>|
[ 2/2 | 3 | 4 | 1/2 ]
^ start of circular buffer
1/2 = beginning of previous response (18240)
2/2 = end of previous response (1990)
3 = current response (214)
4 = free space (36)
- channel_full() returns false (20230 bytes are going to leave)
- the response headers does not wrap at the end of the buffer
- the remaining linear room after the headers is larger than the
reserve, because it's the previous response which wraps :
=> response is processed
Header rewriting causes it to reach 260 bytes, 10 bytes larger than what
the buffer could hold. So all computations during header addition are
wrong and lead to the corruption we've observed.
All the conditions are very hard to meet (which explains why it took
almost one year for this bug to show up) and are almost impossible to
reproduce on purpose on a test platform. But the bug is clearly there.
This issue was reported by Dinko Korunic who kindly devoted a lot of
time to provide countless traces and cores, and to experiment with
troubleshooting patches to knock the bug down. Thanks Dinko!
No backport is needed, but all 1.5-dev versions between dev12 and dev18
included must be upgraded. A workaround consists in setting option
forceclose to prevent pipelined requests from being processed.
We're often missin a third counter to track base, src and base+src at
the same time. Here we introduce track_sc3 to have this third counter.
It would be wise not to add much more counters because that slightly
increases the session size and processing time though the real issue
is more the declaration of the keywords in the code and in the doc.
By properly affecting the flags and values, it becomes easier to add
more tracked counters, for example for experimentation. It also slightly
reduces the code and the number of tests. No counters were added with
this patch.
FreeBSD uses (IPPROTO_IP, IP_BINDANY) and (IPPROTO_IPV6, IPV6_BINDANY)
to enable transparent proxy on a socket.
This patch adds support for the relevant setsockopt() calls.
This patch does not change the logic of the code, it only changes the
way OS-specific defines are tested.
At the moment the transparent proxy code heavily depends on Linux-specific
defines. This first patch introduces a new define "CONFIG_HAP_TRANSPARENT"
which is set every time the defines used by transparent proxy are present.
This also means that with an up-to-date libc, it should not be necessary
anymore to force CONFIG_HAP_LINUX_TPROXY during the build, as the flags
will automatically be detected.
The CTTPROXY flags still remain separate because this older API doesn't
work the same way.
A new line has been added in the version output for haproxy -vv to indicate
what transparent proxy support is available.
When freeing ACL regex, we don't want to perform the free() in regex_free()
as it's already performed in free_pattern(). The double free only happens
when using PCRE_JIT when freeing everything during exit so it's harmless
but exhibits libc errors during a reload/restart.
Bug reported by Seri.
Improve the crt-list file format to allow a rule to negate a certain SNI :
<crtfile> [[!]<snifilter> ...]
This can be useful when a domain supports a wildcard but you don't want to
deliver the wildcard cert for certain specific domains.
This patch adds a "scope" box in the statistics page in order to
display only proxies with a name that contains the requested value.
The scope filter is preserved across all clicks on the page.
Commit a4312fa2 merged into dev18 improved log-format management by
processing "log-format" and "unique-id-format" where they were declared,
so that the faulty args could be reported with their correct line numbers.
Unfortunately, the log-format parser considers the proxy mode (TCP/HTTP)
and now if the directive is set before the "mode" statement, it can be
rejected and report warnings.
So we really need to parse these directives at the end of a section at
least. Right now we do not have an "end of section" event, so we need
to store the file name and line number for each of these directives,
and take care of them at the end.
One of the benefits is that now the line numbers can be inherited from
the line passing "option httplog" even if it's in a defaults section.
Future improvements should be performed to report line numbers in every
log-format processed by the parser.
The ALPN extension is meant to replace the now deprecated NPN extension.
This patch implements support for it. It requires a version of openssl
with support for this extension. Patches are available here right now :
http://html5labs.interopbridges.com/media/167447/alpn_patches.zip
While ACL args were resolved after all the config was parsed, it was not the
case with sample fetch args because they're almost everywhere now.
The issue is that ACLs now solely rely on sample fetches, so their args
resolving doesn't work anymore. And many fetches involving a server, a
proxy or a userlist don't work at all.
The real issue is that at the bottom layers we have no information about
proxies, line numbers, even ACLs in order to report understandable errors,
and that at the top layers we have no visibility over the locations where
fetches are referenced (think log node).
After failing multiple unsatisfying solutions attempts, we now have a new
concept of args list. The principle is that every proxy has a list head
which contains a number of indications such as the config keyword, the
context where it's used, the file and line number, etc... and a list of
arguments. This list head is of the same type as the elements, so it
serves as a template for adding new elements. This way, it is filled from
top to bottom by the callers with the information they have (eg: line
numbers, ACL name, ...) and the lower layers just have to duplicate it and
add an element when they face an argument they cannot resolve yet.
Then at the end of the configuration parsing, a loop passes over each
proxy's list and resolves all the args in sequence. And this way there is
all necessary information to report verbose errors.
The first immediate benefit is that for the first time we got very precise
location of issues (arg number in a keyword in its context, ...). Second,
in order to do this we had to parse log-format and unique-id-format a bit
earlier, so that was a great opportunity for doing so when the directives
are encountered (unless it's a default section). This way, the recorded
line numbers for these args are the ones of the place where the log format
is declared, not the end of the file.
Userlists report slightly more information now. They're the only remaining
ones in the ACL resolving function.
The acl_expr struct used to hold a pointer to the ACL keyword. But since
we now have all the relevant pointers, we don't need that anymore, we just
need the pointer to the keyword as a string in order to return warnings
and error messages.
So let's change this in order to remove the dependency on the acl_keyword
struct from acl_expr.
During this change, acl_cond_kw_conflicts() used to return a pointer to an
ACL keyword but had to be changed to return a const char* for the same reason.
ACL expressions now support "-m" in addition to "-i" and "-f". This new
option is followed by the name of the pattern matching method to be used
on the extracted pattern. This makes it possible to reuse existing sample
fetch methods with other matching methods (eg: regex). A "found" matching
method ignores any pattern and only verifies that the required sample was
found (useful for cookies).
The ACLs now use the fetch's ->use and ->val to decide upon compatibility
between the place where they are used and where the information are fetched.
The code is capable of reporting warnings about very fine incompatibilities
between certain fetches and an exact usage location, so it is expected that
some new warnings will be emitted on some existing configurations.
Two degrees of detection are provided :
- detecting ACLs that never match
- detecting keywords that are ignored
All tests show that this seems to work well, though bugs are still possible.
Proxy's acl_requires was a copy of all bits taken from ACLs, but we'll
get rid of ACL flags and only rely on sample fetches soon. The proxy's
acl_requires was only used to allocate an HTTP context when needed, and
was even forced in HTTP mode. So better have a flag which exactly says
what it's supposed to be used for.
These hooks, which established the relation between ACL_USE_* and the location
where the ACL were used, were never used because they were superseded with the
sample capabilities. Remove them now.
ACL fetch being inherited from the sample fetch keyword, we don't need
anymore to specify what function to use to validate the fetch arguments.
Note that the job is still done in the ACL parsing code based on elements
from the sample fetch structs.
Now that ACLs solely rely on sample fetch functions, make them use the
same arg mask. All inconsistencies have been fixed separately prior to
this patch, so this patch almost only adds a new pointer indirection
and removes all references to ARG*() in the definitions.
The parsing is still performed by the ACL code though.
ACL fetch functions used to directly reference a fetch function. Now
that all ACL fetches have their sample fetches equivalent, we can make
ACLs reference a sample fetch keyword instead.
In order to simplify the code, a sample keyword name may be NULL if it
is the same as the ACL's, which is the most common case.
A minor change appeared, http_auth always expects one argument though
the ACL allowed it to be missing and reported as such afterwards, so
fix the ACL to match this. This is not really a bug.
The file acl.c is a real mess, it both contains functions to parse and
process ACLs, and some sample extraction functions which act on buffers.
Some other payload analysers were arbitrarily dispatched to proto_tcp.c.
So now we're moving all payload-based fetches and ACLs to payload.c
which is capable of extracting data from buffers and rely on everything
that is protocol-independant. That way we can safely inflate this file
and only use the other ones when some fetches are really specific (eg:
HTTP, SSL, ...).
As a result of this cleanup, the following new sample fetches became
available even if they're not really useful :
always_false, always_true, rep_ssl_hello_type, rdp_cookie_cnt,
req_len, req_ssl_hello_type, req_ssl_sni, req_ssl_ver, wait_end
The function 'acl_fetch_nothing' was wrong and never used anywhere so it
was removed.
The "rdp_cookie" sample fetch used to have a mandatory argument while it
was optional in ACLs, which are supposed to iterate over RDP cookies. So
we're making it optional as a fetch too, and it will return the first one.
If a log-format involves some sample fetches that may not be present at
the logging instant, we can now report a warning.
Note that this is done both for log-format and for add-header and carefully
respects the original fetch keyword's capabilities.
Samples fetches were relying on two flags SMP_CAP_REQ/SMP_CAP_RES to describe
whether they were compatible with requests rules or with response rules. This
was never reliable because we need a finer granularity (eg: an HTTP request
method needs to parse an HTTP request, and is available past this point).
Some fetches are also dependant on the context (eg: "hdr" uses request or
response depending where it's involved, causing some abiguity).
In order to solve this, we need to precisely indicate in fetches what they
use, and their users will have to compare with what they have.
So now we have a bunch of bits indicating where the sample is fetched in the
processing chain, with a few variants indicating for some of them if it is
permanent or volatile (eg: an HTTP status is stored into the transaction so
it is permanent, despite being caught in the response contents).
The fetches also have a second mask indicating their validity domain. This one
is computed from a conversion table at registration time, so there is no need
for doing it by hand. This validity domain consists in a bitmask with one bit
set for each usage point in the processing chain. Some provisions were made
for upcoming controls such as connection-based TCP rules which apply on top of
the connection layer but before instantiating the session.
Then everywhere a fetch is used, the bit for the control point is checked in
the fetch's validity domain, and it becomes possible to finely ensure that a
fetch will work or not.
Note that we need these two separate bitfields because some fetches are usable
both in request and response (eg: "hdr", "payload"). So the keyword will have
a "use" field made of a combination of several SMP_USE_* values, which will be
converted into a wider list of SMP_VAL_* flags.
The knowledge of permanent vs dynamic information has disappeared for now, as
it was never used. Later we'll probably reintroduce it differently when
dealing with variables. Its only use at the moment could have been to avoid
caching a dynamic rate measurement, but nothing is cached as of now.
This flag is used on ACL matches that support being looking up patterns
in trees. At the moment, only strings and IPs support tree-based lookups,
but the flag is randomly set also on integers and binary data, and is not
even always set on strings nor IPs.
Better get rid of this mess by only relying on the matching function to
decide whether or not it supports tree-based lookups, this is safer and
easier to maintain.
TCP Fast Open is supported in server mode since Linux 3.7, but current
libc's don't define TCP_FASTOPEN=23. Introduce the new USE flag USE_TFO
to define it manually in compat.h. Also note this in the TFO related
documentation.
Now that all addresses are parsed using str2sa_range(), it becomes easy
to add support for environment variables and use them everywhere an address
is needed. Environment variables are used as $VAR or ${VAR} as in shell.
Any number of variables may compose an address, allowing various fantasies
such as "fd@${FD_HTTP}" or "${LAN_DC1}.1:80".
These ones are usable in logs, bind, servers, peers, stats socket, source,
dispatch, and check address.
This change allows one to force the address family in any address parsed
by str2sa_range() by specifying it as a prefix followed by '@' then the
address. Currently supported address prefixes are 'ipv4@', 'ipv6@', 'unix@'.
This also helps forcing resolving for host names (when getaddrinfo is used),
and force the family of the empty address (eg: 'ipv4@' = 0.0.0.0 while
'ipv6@' = ::).
The main benefits is that unix sockets can now get a local name without
being forced to begin with a slash. This is useful during development as
it is no longer necessary to have stats socket sent to /tmp.
Don't use a statically allocated address both for str2ip and str2sa_range,
use the same. The inet and unix code paths have been splitted a little
better to improve readability.
We'll need str2sa_range() to support a prefix for unix sockets. Since
we don't always want to use it (eg: stats socket), let's not take it
unconditionally from global but let the caller pass it.
An invalid copy-paste called it NR_splice instead of NR_accept4.
This does not lead to real issues because if this define is used,
then the code cannot compile since NR_accept4 is still missing.
Add new tunable "tune.ssl.maxrecord".
Over SSL/TLS, the client can decipher the data only once it has received
a full record. With large records, it means that clients might have to
download up to 16kB of data before starting to process them. Limiting the
record size can improve page load times on browsers located over high
latency or low bandwidth networks. It is suggested to find optimal values
which fit into 1 or 2 TCP segments (generally 1448 bytes over Ethernet
with TCP timestamps enabled, or 1460 when timestamps are disabled), keeping
in mind that SSL/TLS add some overhead. Typical values of 1419 and 2859
gave good results during tests. Use "strace -e trace=write" to find the
best value.
This trick was first suggested by Mike Belshe :
http://www.belshe.com/2010/12/17/performance-and-the-tls-record-size/
Then requested again by Ilya Grigorik who provides some hints here :
http://ofps.oreilly.com/titles/9781449344764/_transport_layer_security_tls.html#ch04_00000101
Right now we have multiple methods for parsing IP addresses in the
configuration. This is quite painful. This patch aims at adapting
str2sa_range() to make it support all formats, so that the callers
perform the appropriate tests on the return values. str2sa() was
changed to simply return str2sa_range().
The output values are now the following ones (taken from the comment
on top of the function).
Converts <str> to a locally allocated struct sockaddr_storage *, and a port
range or offset consisting in two integers that the caller will have to
check to find the relevant input format. The following format are supported :
String format | address | port | low | high
addr | <addr> | 0 | 0 | 0
addr: | <addr> | 0 | 0 | 0
addr:port | <addr> | <port> | <port> | <port>
addr:pl-ph | <addr> | <pl> | <pl> | <ph>
addr:+port | <addr> | <port> | 0 | <port>
addr:-port | <addr> |-<port> | <port> | 0
The detection of a port range or increment by the caller is made by
comparing <low> and <high>. If both are equal, then port 0 means no port
was specified. The caller may pass NULL for <low> and <high> if it is not
interested in retrieving port ranges.
Note that <addr> above may also be :
- empty ("") => family will be AF_INET and address will be INADDR_ANY
- "*" => family will be AF_INET and address will be INADDR_ANY
- "::" => family will be AF_INET6 and address will be IN6ADDR_ANY
- a host name => family and address will depend on host name resolving.
Support for server side TFO was actually introduced in linux-3.7,
linux-3.6 just has client support.
This patch fixes documentation and a code comment about the
kernel requirement. It also fixes a wrong tfo related code
comment in src/proto_tcp.c.
Support a agent health check performed by opening a TCP socket to a
pre-defined port and reading an ASCII string. The string should have one of
the following forms:
* An ASCII representation of an positive integer percentage.
e.g. "75%"
Values in this format will set the weight proportional to the initial
weight of a server as configured when haproxy starts.
* The string "drain".
This will cause the weight of a server to be set to 0, and thus it will
not accept any new connections other than those that are accepted via
persistence.
* The string "down", optionally followed by a description string.
Mark the server as down and log the description string as the reason.
* The string "stopped", optionally followed by a description string.
This currently has the same behaviour as down (iii).
* The string "fail", optionally followed by a description string.
This currently has the same behaviour as down (iii).
A agent health check may be configured using "option lb-agent-chk".
The use of an alternate check-port, used to obtain agent heath check
information described above as opposed to the port of the service,
may be useful in conjunction with this option.
e.g.
option lb-agent-chk
server http1_1 10.0.0.10:80 check port 10000 weight 100
Signed-off-by: Simon Horman <horms@verge.net.au>
Break out set weight processing code.
This is in preparation for reusing the code.
Also, remove duplicate check in nested if clauses.
{px->lbprm.algo & BE_LB_PROP_DYN) is checked by
the immediate outer if clause, so there is no need
to check it a second time.
Signed-off-by: Simon Horman <horms@verge.net.au>