Since IPv6 is a different type than IPv4, the pattern fetch functions
src6 and dst6 were added. IPv6 stick-tables can also fetch IPv4 addresses
with src and dst. In this case, the IPv4 addresses are mapped to their
IPv6 counterpart, according to RFC 4291.
Patch 5ab04ec47c was incomplete,
because if the first send() fails on an empty buffer, we fail
to rearm the polling and we can't establish the connection
anymore.
The issue was reported by Ben Timby who provided large amounts
of traces of various tests helping to reliably reproduce the issue.
Since the latest additions to buffer_forward(), it became too large for
inlining, so let's uninline it. The code size drops by 3kB. Should be
backported to 1.4 too.
Despite much care around handling the content-length as a 64-bit integer,
forwarding was broken on 32-bit platforms due to the 32-bit nature of
the ->to_forward member of the "buffer" struct. The issue is that this
member is declared as a long, so while it works OK on 64-bit platforms,
32-bit truncate the content-length to the lower 32-bits.
One solution could consist in turning to_forward to a long long, but it
is used a lot in the critical path, so it's not acceptable to perform
all buffer size computations on 64-bit there.
The fix consists in changing the to_forward member to a strict 32-bit
integer and ensure in buffer_forward() that only the amount of bytes
that can fit into it is considered. Callers of buffer_forward() are
responsible for checking that their data were taken into account. We
arbitrarily ensure we never consider more than 2G at once.
That's the way it was intended to work on 32-bit platforms except that
it did not.
This issue was tracked down hard at Exosec with Bertrand Jacquin,
Thierry Fournier and Julien Thomas. It remained undetected for a long
time because files larger than 4G are almost always transferred in
chunked-encoded format, and most platforms dealing with huge contents
these days run on 64-bit.
The bug affects all 1.5 and 1.4 versions, and must be backported.
Since we now have the copy of the target in the session, use it instead
of relying on the SI for it. The SI drops the target upon unregister()
so applets such as stats were logged as "NOSRV".
Johannes Smith reported some wrong retries count in logs associated with bad
requests. The cause was that the conn_retries field in the stream interface
was only initialized when attempting to connect, but is used when logging,
possibly with an uninitialized value holding last connection's conn_retries.
This could have been avoided by making use of a stream interface initializer.
This bug is 1.5-specific.
Function gethostbyname is deprecated since IEEE Std 1003.1-2008 and
was replaced by getaddrinfo (available since IEEE Std 1003.1-2004).
Contrary to gethostbyname, getaddrinfo is specified to support both
IPv4 and IPv4 addresses.
Since some libc doesn't handle getaddrinfo properly, constant
USE_GETADDRINFO must be defined at compile time to enable use of
getaddrinfo.
It's always been a mess to debug wrong listening addresses because
the parsing function does not indicate the file and line number. Now
it does. Since the code was almost a duplicate of str2sa_range, it
now makes use of it and has been sensibly reduced.
The parser now distinguishes between pure addresses and address:port. This
is useful for some config items where only an address is required.
Raw IPv6 addresses are now parsed, but IPv6 host name resolution is still not
handled (gethostbyname does not resolve IPv6 names to addresses).
This option enables use of the PROXY protocol with the server, which
allows haproxy to transport original client's address across multiple
architecture layers.
Upon connection establishment, stream_sock is now able to send a PROXY
line before sending any data. Since it's possible that the buffer is
already full, and we don't want to allocate a block for that line, we
compute it on-the-fly when we need it. We just store the offset from
which to (re-)send from the end of the line, since it's assumed that
multiple outputs of the same proxy line will be strictly equivalent. In
practice, one call is enough. We just make sure to handle the case where
the first send() would indicate an incomplete output, eventhough it's
very unlikely to ever happen.
If the check fails for a low-level socket error (eg: address family not
supportd), we currently ignore the status. We must report the error and
declare a failed health check in this case. The only real reason for this
would be when an IPv6 check is required on an IPv4-only system.
And also rename "req_acl_rule" "http_req_rule". At the beginning that
was a bit confusing to me, especially the "req_acl" list which in fact
holds what we call rules. After some digging, it appeared that some
part of the code is 100% HTTP and not just related to authentication
anymore, so let's move that part to HTTP and keep the auth-only code
in auth.c.
Right now, http-request rules are not evaluated if the URL matches the
stats request. This is quite unexpected. For instance, in the config
below, an abuser present in the abusers list will not be prevented access
to the stats.
listen pub
bind :8181
acl abuser src -f abusers.lst
http-request deny if abuser
stats uri /stats
It is not a big deal but it's not documented as such either. For 1.5, let's
have both lists be evaluated in turn, until one blocks. For 1.4 we'll simply
update the doc to indicate that.
Also instead of duplicating the code, the patch factors out the list walking
code. The HTTP auth has been moved slightly earlier, because it was set after
the header addition code, but we don't need to add headers to a request we're
dropping.
It's very annoying that frontend and backend stats are merged because we
don't know what we're observing. For instance, if a "listen" instance
makes use of a distinct backend, it's impossible to know what the bytes_out
means.
Some points take care of not updating counters twice if the backend points
to the frontend, indicating a "listen" instance. The thing becomes more
complex when we try to add support for server side keep-alive, because we
have to maintain a pointer to the backend used for last request, and to
update its stats. But we can't perform such comparisons anymore because
the counters will not match anymore.
So in order to get rid of this situation, let's have both frontend AND
backend stats in the "struct proxy". We simply update the relevant ones
during activity. Some of them are only accounted for in the backend,
while others are just for frontend. Maybe we can improve a bit on that
later, but the essential part is that those counters now reflect what
they really mean.
This patch turns internal server addresses to sockaddr_storage to
store IPv6 addresses, and makes the connect() function use it. This
code already works but some caveats with getaddrinfo/gethostbyname
still need to be sorted out while the changes had to be merged at
this stage of internal architecture changes. So for now the config
parser will not emit an IPv6 address yet so that user experience
remains unchanged.
This change should have absolutely zero user-visible effect, otherwise
it's a bug introduced during the merge, that should be reported ASAP.
This one has been removed and is now totally superseded by ->target.
To get the server, one must use target_srv(&s->target) instead of
s->srv now.
The function ensures that non-server targets still return NULL.
s->prev_srv is used by assign_server() only, but all code paths leading
to it now take s->prev_srv from the existing s->srv. So assign_server()
can do that copy into its own stack.
If at one point a different srv is needed, we still have a copy of the
last server on which we failed a connection attempt in s->target.
When dealing with HTTP keep-alive, we'll have to know if we can reuse
an existing connection. For that, we'll have to check if the current
connection was made on the exact same target (referenced in the stream
interface).
Thus, we need to first assign the next target to the session, then
copy it to the stream interface upon connect(). Later we'll check for
equivalence between those two operations.
Since all of them are defined as proxy options, it's better to ensure
that at most one of them is enabled at once. The priority has been set
according to what is already performed in the backend :
1) dispatch
2) http_proxy
3) transparent
Till now we used the fact that the dispatch address was not null to use
the dispatch mode. This is very unconvenient, so let's have a dedicated
option.
This is in fact where those parts belong to. The old data_state was replaced
by applet.state and is now initialized when the applet is registered. It's
worth noting that the applet does not need to know the session nor the
buffer anymore since everything is brought by the stream interface.
It is possible that having a separate applet struct would simplify the
code but that's not a big deal.
With HTTP keep-alive, logging the right server name will be quite
complex because the assigned server will possibly change before we log.
Also, when we want to log accesses to an applet, it's not easy because
the applet becomes NULL again before logging.
The logged server's name is now taken from the target stored in the
stream interface. That way we can log an applet, a server name, or we
could even log a proxy or anything else if we wanted to. Ideally the
session should contain a desired target which is the one which should
be logged.
Now that we have the target pointer and type in the stream interface,
we don't need the applet.handler pointer anymore. That makes the code
somewhat cleaner because we know we're dealing with an applet by checking
its type instead of checking the pointer is not null.
When doing a connect() on a stream interface, some information is needed
from the server and from the backend. In some situations, we don't have
a server and only a backend (eg: peers). In other cases, we know we have
an applet and we don't want to connect to anything, but we'd still like
to have the info about the applet being used.
For this, we now store a pointer to the "target" into the stream interface.
The target describes what's on the other side before trying to connect. It
can be a server, a proxy or an applet for now. Later we'll probably have
descriptors for multiple-stage chains so that the final information may
still be found.
This will help removing many specific cases in the code. It already made
it possible to remove the "srv" and "be" parameters to tcpv4_connect_server().
I/O handlers are still delicate to manipulate. They have no type, they're
just raw functions which have no knowledge of themselves. Let's have them
declared as applets once for all. That way we can have multiple applets
share the same handler functions and we can store their names there. When
we later need to add more parameters (eg: usage stats), we'll be able to
do so in the applets themselves.
The CLI functions has been prefixed with "cli" instead of "stats" as it's
clearly what is going on there.
The applet descriptor in the stream interface should get all the applet
specific data (st0, ...) but this will be done in the next patch so that
we don't pollute this one too much.
Both Hank A. Paulson and Rob at pixsense reported a crash when
loading ACLs from a pattern file which contains empty lines.
From the tests, it appears that only files that contain nothing
but empty lines are causing that (in the past they would have had
their line feeds loaded as patterns).
The crash happens in the free_pattern() call which doesn't like to
be called with a NULL pattern. Let's make it accept it so that it's
more in line with the standard uses of free() which ignores NULLs.
Similar to the stats socket bug, we must check that the proxy is not disabled
before trying to enable/disable a server.
Even if a disabled proxy is not displayed, someone can inject a faulty proxy
name in the POST parameters. So, we must ensure that no disabled proxy can be
used.
As reported by Bryan Talbot, enabling and disabling a server in a disabled
proxy causes a segfault.
Changing the weight can also cause a similar segfault.
Bryan Talbot reported that POST requests with a query string were not
correctly processed if the hash parameter was the first one, because
the delimiter that was looked for to trigger the parsing was '&' instead
of '?'.
Also, while checking the code, it became apparent that it was enough for
a query string to be present in the request for POST parameters to be
ignored, even if the url_param was in the body and not in the URL.
The code has then been fixed like this :
1) look for URL param. If found, return it.
2) if no URL param was found and method is POST, then look it up into
the body
The code now seems to pass all request combinations.
This patch must be backported to 1.4 since 1.4 is equally broken right now.
Till now, the forwarding code was making use of the hdr_content_len member
to hold the size of the last chunk parsed. As such, it was reset after being
scheduled for forwarding. The issue is that this entry was reset before the
data could be viewed by backend.c in order to parse a POST body, so the
"balance url_param check_post" did not work anymore.
In order to fix this, we need two things :
- the chunk size (reset upon every forward)
- the total body size (not reset)
hdr_content_len was thus replaced by the former (hence the size of the patch)
as it makes more sense to have it stored that way than the way around.
This patch should be backported to 1.4 with care, considering that it affects
the forwarding code.
It seems like if a response message is chunked and the chunk size wraps
at the end of the buffer and the crlf sequence is incomplete, then we
can forward a wrong chunk size due to incorrect handling of the wrapped
size. It seems extremely unlikely to occur on real traffic (no reason to
have half of the CRLF after a chunk) but nothing prevents it from being
possible.
This fix must be backported to 1.4.
As reported by the Loadbalancer.org team, it was not possible to bind
more than 1024 ports. This is because the process' limits were set after
trying to bind the sockets, which defeats their purpose.
This fix must be backported to 1.4 and 1.3.
We used to only count one socket instead of one per listener. This makes
the socket count wrong, preventing from automatically computing the proper
number of sockets to bind.
This fix must be backported to 1.4 and 1.3.
req_acl was used instead of req_acl_final. As a matter of luck, both
happen to be the same at this point, but this is not granted in the
future.
This fix should be backported to 1.4.
Some browsers send POST requests in several packets, which was not supported
by the "stats admin" function.
This patch allows to wait for more data when they are not fully received
(we are still limited to a certain size defined by the buffer size minus its
reserved space).
It also adds support for the "Expect: 100-Continue" header.
Stefan Behte reported a strange case where depending on the position of
the Connection header in the header list, some headers added after it
were or were not usable in "balance hdr()". The reason is that when the
last header is removed, the list's tail was not updated, so any header
added after that one was not visible from the list.
This fix must be backported to 1.4 and possibly 1.3.
while working further on the changes to allow for dynamic
adding/removing of backend servers we noticed a potential problem: the
path given for the 'stats socket' global option may get truncated when
copying it into the sockaddr_un.sun_path field.
Attached patch checks the length, and reports an error if truncation
would happen.
This issue was noticed by Joerg Sonnenberger <joerg@NetBSD.org>.
src/frontend.c: In function 'frontend_accept':
src/frontend.c:110: warning: pointer targets in passing argument 5 of 'getsockopt' differ in signedness
The argument should be socklen_t and not int.
I have written a small patch to enable a correct PostgreSQL health check
It works similar to mysql-check with the very same parameters.
E.g.:
listen pgsql 127.0.0.1:5432
mode tcp
option pgsql-check user pgsql
server masterdb pgsql.server.com:5432 check inter 10000
It's better to avoid sticking on empty parameter values, as this almost
always indicates a missing parameter. Otherwise it's easy to enter a
situation where all new visitors stick to the same server.
Revert commits 035da6d1b0 and
f18b5f21ba.
These fixes were wrong. They worked but they were fixing the symptom
instead of the root cause of the problem. The real issue was in the
ebtree lookup code and it has been fixed now so these patches are not
needed anymore. It's better not to copy memory blocks when we don't
need to, so let's revert them.
Commit 035da6d1b0 was incorrect as it
could modify a live buffer. We must first ensure that we're on the
private buffer or perform a copy before modifying the data.
Gabriel Sosa reported that haproxy unexpectedly reports an error
when a pattern file loaded by an ACL contains an empty line. The
test was present but inefficient as it did not consider the '\n'
as the end of the line. This fix relies on the line length instead.
It should be backported to 1.4.
If a key to be looked up is extracted from data without being padded
and if it matches the beginning of another stored key, it is not
found in subsequent lookups because it does not end with a zero.
This bug was discovered and diagnosed by David Cournapeau.
One of the requirements we have is to run multiple instances of haproxy on a
single host; this is so that we can split the responsibilities (and change
permissions) between product teams. An issue we ran up against is how we
would distinguish between the logs generated by each instance. The solution
we came up with (please let me know if there is a better way) is to override
the application tag written to syslog. We can then configure syslog to write
these to different files.
I have attached a patch adding a global option 'log-tag' to override the
default syslog tag 'haproxy' (actually defaults to argv[0]).
By passing a negative value to the "mss" argument of "bind" lines, it
becomes possible to subtract this value to the MSS advertised by the
client, which results in segments smaller than advertised. The effect
is useful with some TCP stacks which ACK less often when segments are
not full, because they only ACK every other full segment as suggested
by RFC1122.
NOTE: currently this has no effect on Linux kernel 2.6, a kernel patch
is still required to change the MSS of established connections.
Haproxy does not include the hostname rather the IP of the machine in
the syslog headers it sends. Unfortunately this means that for each log
line rsyslog does a reverse dns on the client IP and in the case of
non-routable IPs one gets the public hostname not the internal one.
While this is valid according to RFC3164 as one might imagine this is
troublsome if you have some machines with public IPs, internal IPs, no
reverse DNS entries, etc and you want a standardized hostname based log
directory structure. The rfc says the preferred value is the hostname.
This patch adds a global "log-send-hostname" statement which accepts an
optional string to force the host name. If unset, the local host name
is used.
Since haproxy 1.4.9, combining option httpclose and option
http-pretend-keepalive can leave the connections opened until the backend
keep-alive timeout is reached, providing bad performances.
The same can occur when the proxy is in tunnel mode.
This patch ensures that the server side connection is closed after the
response and ignore http-pretend-keepalive in tunnel mode.
When a connection error is encountered on a server and the server's
connection pool is full, pending connections are not woken up because
the current connection is still accounted for on the server, so it
still appears full. This becomes visible on a server which has
"maxconn 1" because the pending connections will only be able to
expire in the queue.
Now we take care of releasing our current connection before trying to
offer it to another pending request, so that the server can accept a
next connection.
This patch should be backported to 1.4.
When a client connection aborts while the server-side connection is in
turn-around after a failed connection attempt, the turn-around timeout
is reset in shutw() but the state is not changed. The session then
remains stuck in this state forever. Change the QUE and TAR states to
DIS just as we do for CER to fix this.
This patch should be backported to 1.4.
We've had several issues related to data transfers. First, if a
client aborted an upload before the server started to respond, it
would get a 502 followed by a 400. The same was true (in the other
way around) if the server suddenly aborted while the client was
uploading the data.
The flags reported in the logs were misleading. Request errors could
be reported while the transfer was stopped during the data phase. The
status codes could also be overwritten by a 400 eventhough the start
of the response was transferred to the client.
The stats were also wrong in case of data aborts. The server or the
client could sometimes be miscredited for being the author of the
abort depending on where the abort was detected. Some client aborts
could also be accounted as request errors and some server aborts as
response errors.
Now it seems like all such issues are fixed. Since we don't have a
specific state for data flowing from the client to the server
before the server responds, we're still counting the client aborted
transfers as "CH", and they become "CD" when the server starts to
respond. Ideally a "P" state would be desired.
This patch should be backported to 1.4.
HTTP pipelining currently needs to monitor the response buffer to wait
for some free space to be able to send a response. It was not possible
for the HTTP analyser to be called based on response buffer activity.
Now we introduce a new buffer flag BF_WAKE_ONCE which is set when the
HTTP request analyser is set on the response buffer and some activity
is detected. This is not clean at all but once of the only ways to fix
the issue before we make it possible to register events for analysers.
Also it appeared that one realign condition did not cover all cases.
Using haproxy in multi-process mode (nbproc > 1), some features can be
not fully compatible or not work at all. haproxy will now display a warning on
startup for :
- appsession
- sticking rules
- stats / stats admin
- stats socket
- peers (fatal error in that case)
This counter will help quickly spot whether there are new errors or not.
It is also assigned to each capture so that a script can keep trace of
which capture was taken when.
It is possible to block on incorrectly chunked requests or responses,
but this becomes very hard to debug when it happens once in a while.
This patch adds the ability to also capture incorrectly chunked requests
and responses. The chunk will appear in the error buffer and will be
verifiable with the usual "show errors". The incorrect byte will match
the error location.
Error captures did only support contiguous messages. This is annoying
for capturing chunking errors, so let's ensure the function is able to
copy wrapped messages.
When an error message is returned to a client, all buffer contents
were left intact. Since the analysers were removed, the potentially
invalid data that were read had a chance to be sent too.
Now we ensure we only keep the already scheduled data in the buffer
and we truncate it after that. That means that responses with data
that must be blocked will really be blocked, and that incorrectly
chunked data will be stopped at the point where the chunking fails.
When haproxy parses chunk-encoded data that are scheduled to be sent, it is
possible that the other end is closed (mainly due to a client abort returning
as an error). The message state thus changes to HTTP_MSG_ERROR and the error
is reported as a chunk parsing error ("PD--") while it is not. Detect this
case before setting the flags and set the appropriate flag in this case.
Debugging parsing errors can be greatly improved if we know what the parser
state was and what the buffer flags were (especially for closed inputs/outputs
and full buffers). Let's add that to the error snapshots.
When forwarding chunk-encoded data, each chunk gets a TCP PUSH flag when
going onto the wire simply because the send() function does not know that
some data remain after it (next chunk). Now we set the BF_EXPECT_MORE flag
on the buffer if the chunk size is not null. That way we can reduce the
number of packets sent, which is particularly noticeable when forwarding
compressed data, especially as it requires less ACKs from the client.
When the number of servers is a multiple of the size of the input set,
map-based hash can be inefficient. This typically happens with 64
servers when doing URI hashing. The "avalanche" hash-type applies an
avalanche hash before performing a map lookup in order to smooth the
distribution. The result is slightly less smooth than the map for small
numbers of servers, but still better than the consistent hashing.
We'll use this hash at other places, let's make it globally available.
The function has also been renamed because its "chash_hash" name was
not appropriate.
When a header is removed, the previous header's next pointer is updated
to reflect the next of the current header. However, when cycling through
the loop, we update the prev pointer to point to the deleted header, which
means that if we delete another header, it's the deleted header's next
pointer that will be updated, leaving the deleted header in the list with
a null length, which is forbidden.
We must just not update the prev pointer after a removal.
This bug was present when either "reqdel" and "rspdel" removed two consecutive
headers. It could also occur when removing cookies in either requests or
responses, but since headers were the last header processing, the issue
remained unnoticed.
Issue reported by Hank A. Paulson.
This fix must be ported to 1.4 and possibly 1.3.
Cookies in indirect mode are removed from the cookie header. Three pointers
ought to be updated when appsession cookies are processed next, but were not.
The result is that a memcpy() can be called with a negative value causing the
process to crash. It is not sure whether this can be remotely exploited or not.
(cherry picked from commit c5f3749aa3ccfdebc4992854ea79823d26f66213)
In out of memory conditions, the ->destroy function would free all
possibly allocated pools from the current appsession, including those
that were not yet allocated nor assigned, which used to point to a
previous allocation, obviously resulting in a segfault.
(cherry picked from commit 75eae485921d3a6ce197915c769673834ecbfa5c)
In case of out of memory, it was possible to write to a null pointer
when capturing response cookies due to a missing "else" block. The
request handling was fine though.
(cherry picked from commit 62e3604d7dd27741c0b4c9e27d9e7c73495dfc32)
When running with -vv or -V -d, the list of usable polling systems
is reported. The final selection did not take into account the
possible failures during the tests, which is misleading and could
make one think that a non-working poller will be used, while it is
not the case. Fix that to really report the correct ones.
(cherry picked from commit 6d0e354e0171f08b7b3868ad2882c3663bd068a7)
Since unix sockets are supported for bind, the default backlog size was not
enough to accept the traffic. The size is now inherited from the listener
to behave like the tcp listeners.
This also affects the "stats socket" backlog, which is now determined by
"stats maxconn".
Some distros' libc are built for CPUs earlier than i686 and as such do
not offer support for Linux kernel's faster vsyscalls. This code adds
a new build option USE_VSYSCALLS to bypass libc for most commonly used
system calls. A net gain of about 10% can be observed with this change
alone.
It only works when /proc/sys/abi/vsyscall32 equals exactly 2. When it's
set to 1, the VDSO is randomized and cannot be used.
Analysers were re-evaluated when some flags were still present in the
buffers, even if they had not changed since previous pass, resulting
in a waste of CPU cycles.
Ensuring that the flags have changed has saved some useless calls :
function min calls per session (before -> after)
http_request_forward_body 5 -> 4
http_response_forward_body 3 -> 2
http_sync_req_state 10 -> 8
http_sync_res_state 8 -> 6
http_resync_states 8 -> 6
The stream_sock's accept() used to close the FD upon error, but this
was also sometimes performed by the frontend's accept() called via the
session's accept(). Those interlaced calls were also responsible for the
spaghetti-looking error unrolling code in session.c and stream_sock.c.
Now the frontend must not close the FD anymore, the session is responsible
for that. It also takes care of just closing the FD or also removing from
the FD lists, depending on its state. The socket-level accept() does not
have to care about that anymore.
Some Alert() messages were remaining in the accept() path, which they
would have no chance to be detected. Remove some of them (the impossible
ones) and replace the relevant ones with send_log() so that the admin
has a chance to catch them.
Enhance pattern convs and fetch argument parsing, now fetchs and convs callbacks used typed args.
Add more details on error messages on parsing pattern expression function.
Update existing pattern convs and fetchs to new proto.
Create stick table key type "binary".
Manage Truncation and padding if pattern's fetch-converted result don't match table key size.
If a read shutdown is encountered on the first packet of a connection
right after the data and the last analyser is unplugged at the same
time, then that last data chunk may never be forwarded. In practice,
right now it cannot happen on requests due to the way they're scheduled,
nor can it happen on responses due to the way their analysers work.
But this behaviour has been observed with new response analysers being
developped.
The reason is that when the read shutdown is encountered and an analyser
is present, data cannot be forwarded but the BF_SHUTW_NOW flag is set.
After that, the analyser gets called and unplugs itself, hoping that
process_session() will automatically forward the data. This does not
happen due to BF_SHUTW_NOW.
Simply removing the test on this flag is not enough because then aborted
requests still get forwarded, due to the forwarding code undoing the
abort.
The solution here consists in checking BF_SHUTR_NOW instead of BF_SHUTW_NOW.
BF_SHUTR_NOW is only set on aborts and remains set until ->shutr() is called.
This is enough to catch recent aborts but not prevent forwarding in other
cases. Maybe a new special buffer flag "BF_ABORT" might be desirable in the
future.
This patch does not need to be backported because older versions don't
have the analyser which make the problem appear.
Some options depends on the target architecture or compilation options.
When such an option is used on a compiled version that doesn't support it,
it's probably better to identify it as an unsupported option due to
compilation options instead of an unknown option.
Edit: better check on the empty capability than on the option bits. -Willy
There were a lot of snprintf() everywhere in the UNIX bind code. Now we
proceed as for tcp and indicate the socket path at the end between square
brackets. The code is smaller and more readable.
Add the address and port to the error message of the proxy socket that caused
the error. This can be helpful when several listening addresses are used in a
proxy.
Edit: since we now also support unix sockets (which already report their
path), better move the address reporting to proto_tcp.c by analogy.
-Willy
MAXPATHLEN may be used at other places, it's unconvenient to have it
redefined in a few files. Also, since checking it requires including
sys/param.h, some versions of it cause a macro declaration conflict
with MIN/MAX which are defined in tools.h. The solution consists in
including sys/param.h in both files so that we ensure it's loaded
before the macros are defined and MAXPATHLEN is checked.
The introduction of a new PROXY protocol for proxied connections requires
an early analyser to decode the incoming connection and set the session
flags accordingly.
Some more work is needed, among which setting a flag on the session to
indicate it's proxied, and copying the original parameters for later
comparisons with new ACLs (eg: real_src, ...).
inetaddr_host_lim_ret() used to make use of const char** for some
args, but that make it impossible ot use char** due to the way
controls are made by gcc. So let's change that.
This option makes haproxy preserve any persistence cookie emitted by
the server, which allows the server to change it or to unset it, for
instance, after a logout request.
(cherry picked from commit 52e6d75374c7900c1fe691c5633b4ae029cae8d5)
When a backend defines a new cookie, it forgot to unset any params
that could have been set in a defaults section, resulting in configs
that would sometimes refuse to load or not work as expected.
(cherry picked from commit f80bf174ed905a29a3ed8ee91fcd528da6df174f)
This match returns true when the request calling it is the first one of
a connection.
(cherry picked from commit 922ca979c50653c415852531f36fe409190ad76b)
Pattern fetches relying on destination address must first fetch
the address if it has not been done yet.
(cherry picked from commit 21abf441feb318b2ccd7df590fd89e9e824627f6)
The MySQL check has been revamped to be able to send real MySQL data,
and to avoid Aborted connects on MySQL side.
It is however backward compatible with older version, but it is highly
recommended to use the new mode, by adding "user <username>" on the
"mysql-check" line.
The new check consists in sending two MySQL packet, one Client
Authentication packet, with "haproxy" username (by default), and one
QUIT packet, to correctly close MySQL session. We then parse the Mysql
Handshake Initialisation packet and/or Error packet. It is a basic but
useful test which does not produce error nor aborted connect on the
server.
(cherry picked from commit a1e4dcfe5718311b7653d7dabfad65c005d0439b)
Health checks were all pure ASCII, but we're going to have to support some
binary checks (eg: SQL). When they're inherited from the default section,
they will be truncated to the first \0 due to strdup(). Let's fix that with
a simple malloc.
(cherry picked from commit 98fc04a766bcff80f57db2b1cd865c91761b131b)
Keywords were changed just before the commit but not in the help message.
Spotted by Hank A. Paulson.
(cherry picked from commit fdd46a0766dccec704aa1bd5acb0ac99a801c549)
When we're enabling a server again (unix CLI or stats interface), we must not mark
it completely up because it can take a while before a failure is detected. So we
mark it one step above failure, which means it's up but will be marked down upon
first failure.
(cherry picked from commit 83c3e06452457ed5660fc814cbda5bf878bf19a2)
The stats web interface must be read-only by default to prevent security
holes. As it is now allowed to enable/disable servers, a new keyword
"stats admin" is introduced to activate this admin level, conditioned by ACLs.
(cherry picked from commit 5334bab92ca7debe36df69983c19c21b6dc63f78)
Based on a patch provided by Judd Montgomery, it is now possible to
enable/disable servers from the stats web interface. This allows to select
several servers in a backend and apply the action to them at the same time.
Currently, there are 2 known limitations :
- The POST data are limited to one packet
(don't alter too many servers at a time).
- Expect: 100-continue is not supported.
(cherry picked from commit 7693948766cb5647ac03b48e782cfee2b1f14491)
In a down backend, when a zero-weight server is lost, a new
"backend down" message was emitted and the down transition of that
backend was wrongly increased. This change ensures that we don't
count that transition again.
This patch should be backported to 1.3.
(cherry picked from commit 60efc5f745b5fa70d811f977727592e47e32a281)
If a maxidle or maxlife parameter is set on the persistence cookie in
insert mode and the client did not provide a recent enough cookie,
then we emit a new cookie with a new last_seen date and the same
first_seen (if maxlife is set). Recent enough here designates a
cookie that would be rounded to the same date. That way, we can
refresh a cookie when required without doing it in all responses.
If the request did not contain such parameters, they are set anyway.
This means that a monitoring request that is forced to a server will
get an expiration date anyway, but this should not be a problem given
that the client is able to set its cookie in this case. This also
permits to force an expiration date on visitors who previously did
not have one.
If a request comes with a dated cookie while no date check is performed,
then a new cookie is emitted with no date, so that we don't risk dropping
the user too fast due to a very old date when we re-enable the date check.
All requests that were targetting the correct server and which had their
expiration date added/updated/removed in the response cookie are logged
with the 'U' ("updated") flag instead of the 'I' ("inserted"). So very
often we'll see "VU" instead of "VN".
(cherry picked from commit 8b3c6ecab6d37be5f3655bc3a2d2c0f9f37325eb)
If a cookie comes in with a first or last date, and they are configured on
the backend, they're checked. If a date is expired or too far in the future,
then the cookie is ignored and the specific reason appears in the cookie
field of the logs.
(cherry picked from commit faa3019107eabe6b3ab76ffec9754f2f31aa24c6)
These functions only require 5 chars to encode 30 bits, and don't expect
any padding. They will be used to encode dates in cookies.
(cherry picked from commit a7e2b5fc4612994c7b13bcb103a4a2c3ecd6438a)
The set-cookie status flags were not very handy and limited. Reorder
them to save some room for additional values and add the "U" flags
(for Updated expiration date) that will be used with expirable cookies
in insert mode.
(cherry picked from commit 5bab52f821bb0fa99fc48ad1b400769e66196ece)
In all cookie persistence modes but prefix, we now support cookies whose
value is suffixed with some contents after a vertical bar ('|'). This will
be used to pass an optional expiration date. So as of now we only consider
the part of the cookie value which is used before the vertical bar.
(cherry picked from commit a4486bf4e5b03b5a980d03fef799f6407b2c992d)
Add two new arguments to the "cookie" keyword, to be able to
fix a max idle and max life on them. Right now only the parameter
parsing is implemented.
(cherry picked from commit 9ad5dec4c3bb8f29129f292cb22d3fc495fcc98a)
HTTP content-based health checks will be involved in searching text in pages.
Some pages may not fit in the default buffer (16kB) and sometimes it might be
desired to have larger buffers in order to find patterns. Running checks on
smaller URIs is always preferred of course.
(cherry picked from commit 043f44aeb835f3d0b57626c4276581a73600b6b1)
This patch adds the "http-check expect [r]{string,status}" statements
which enable health checks based on whether the response status or body
to an HTTP request contains a string or matches a regex.
This probably is one of the oldest patches that remained unmerged. Over
the time, several people have contributed to it, among which FinalBSD
(first and second implementations), Nick Chalk (port to 1.4), Anze
Skerlavaj (tests and fixes), Cyril Bont (general fixes), and of course
myself for the final fixes and doc during integration.
Some people already use an old version of this patch which has several
issues, among which the inability to search for a plain string that is
not at the beginning of the data, and the inability to look for response
contents that are provided in a second and subsequent recv() calls. But
since some configs are already deployed, it was quite important to ensure
a 100% compatible behaviour on the working cases.
Thus, that patch fixes the issues while maintaining config compatibility
with already deployed versions.
(cherry picked from commit b507c43a3ce9a8e8e4b770e52e4edc20cba4c37f)
This patch provides a new "option ldap-check" statement to enable
server health checks based on LDAPv3 bind requests.
(cherry picked from commit b76b44c6fed8a7ba6f0f565dd72a9cb77aaeca7c)
Some configs may involve httpclose in a frontend and http-pretend-keepalive
in a backend. httpclose used to take priority over keepalive, thus voiding
its effect. This change ensures that when both are combined, keepalive is
still announced to the server while close is announced to the client.
(cherry picked from commit 2be7ec90fa9caf66294f446423bbab2d00db9004)
Some broken browsers still happen to send a CRLF after a POST. Those which
send a CRLF in a second packet have it queued into the system's buffers,
which causes an RST to be emitted by some systems upon close of the response
(eg: Linux). The client may then receive the RST without the last response
segments, resulting in a truncated response.
This change leaves request polling enabled on a POST so that we can flush
any late data from the request buffers.
A more complete workaround would consist in reading from the request for a
long time, until we get confirmation that the close has been ACKed. This
is much more complex and should only be studied for newer versions.
(cherry picked from commit 12e316af4f0245fde12dbc224ebe33c8fea806b2)
Jozsef R.Nagy reported a reliability issue on FreeBSD. Sometimes an error
would be emitted, reporting the inability to switch a socket to non-blocking
mode and the listener would definitely not accept anything. Cyril Bont
narrowed this bug down to the call to EV_FD_CLR(l->fd, DIR_RD).
He was right because this call is wrong. It only disables input events on
the listening socket, without setting the listener to the LI_LISTEN state,
so any subsequent call to enable_listener() from maintain_proxies() is
ignored ! The correct fix consists in calling disable_listener() instead.
It is discutable whether we should keep such error path or just ignore the
event. The goal in earlier versions was to temporarily disable new activity
in order to let the system recover while releasing resources.
There was no consistency between all the functions used to exchange data
between a buffer and a stream interface. Also, the functions used to send
data to a buffer did not consider the possibility that the buffer was
shutdown for read.
Now the functions are called buffer_{put,get}_{char,block,chunk,string}.
The old buffer_feed* functions have been left available for existing code
but marked deprecated.
si->release() was called each time we closed one direction of a stream
interface, while it should only have been called when both sides are
closed. This bug is specific to 1.5 and only affects embedded tasks.
In deinit(), it is possible that we first free the listeners, then
unbind them all. Right now this situation can't happen because the
only way to call deinit() is to pass via a soft-stop which will
already unbind all protocols. But later this might become a problem.
This patch addresses exactly the same issues as the previous one, but
for responses this time. It also introduces implicit support for the
Set-Cookie2 header, for which there's almost nothing specific to do
since it is a clean header. This one allows multiple cookies in a
same header, by respecting the HTTP messaging semantics.
The new parser has been tested with insertion, rewrite, passive,
removal, prefixing and captures, and it looks OK. It's still able
to rewrite (or delete) multiple cookies at once. Just as with the
request parser, it tries hard to fix formating of the cookies it
displaces.
This patch too should be backported to 1.4 and possibly to 1.3.
The request cookie parser did not allow spaces to appear in cookie
values nor around the equal sign. The various RFCs on the subject
say different things, some suggesting that a space is allowed after
the equal sign and being worded in a way that lets one believe it
is allowed before too. Some spaces may appear inside values and be
part of the values. The quotes allow delimiters to be embedded in
values. The spaces before and after attributes should be trimmed.
The new parser addresses all those points and has been carefully tested.
It fixes misplaced spaces around equal signs before processing the cookies
or forwarding them. It also tries its best to perform clean removals by
always keeping the delimiter after the value being removed and leaving one
space after it.
The variable inside the parser have been renamed to make the code a lot
more understandable, and one multi-function pointer has been eliminated.
Since this patch fixes real possible issues, it should be backported to 1.4
and possibly 1.3, since one (single) case of wrong spaces has been reported
in 1.3.
The code handling the Set-Cookie has not been touched yet.
This counter is incremented for each incoming connection and each active
listener, and is used to prevent haproxy from stopping upon SIGUSR1. It
will thus be possible for some tasks in increment this counter in order
to prevent haproxy from dying until they have completed their job.
The header parser has a bug which causes commas to be matched within
quotes while it was not expected. The way the code was written could
make one think it was OK. The resulting effect is that the following
config would use the second IP address instead of the third when facing
this request :
source 0.0.0.0 usesrc hdr_ip(X-Forwarded-For,2)
GET / HTTP/1.0
X-Forwarded-for: "127.0.0.1, 127.0.0.2", 127.0.0.3
This fix must be backported to 1.4 and 1.3.
Fix 4fe4190278 was a bit too strong. It
has caused some chunked-encoded responses to be truncated when a recv()
call could return multiple chunks followed by a close. The reason is
that when a chunk is parsed, only its contents are scheduled to be
forwarded. Thus, the reader sees auto_close+shutr and sets shutw_now.
The sender in turn sends the last scheduled data and does shutw().
Another nasty effect is that it has reduced the keep-alive rate. If
a response did not completely fit into the buffer, then the auto_close
bit was left on and the sender would close upon completion.
The fix consists in not making use of auto_close when chunked encoding
is used nor when keep-alive is used, which makes sense. However it is
maintained on error processing.
Thanks to Cyril Bont for reporting the issue early.
Signal zero is never delivered by the system. However having a signal to
which functions and tasks can subscribe to be notified of a stopping event
is useful. So this patch does two things :
1) allow signal zero to be delivered from any function of signal handler
2) make soft_stop() deliver this signal so that tasks can be notified of
a stopping condition.
The two new functions below make it possible to register any number
of functions or tasks to a system signal. They will be called in the
registration order when the signal is received.
struct sig_handler *signal_register_fct(int sig, void (*fct)(struct sig_handler *), int arg);
struct sig_handler *signal_register_task(int sig, struct task *task, int reason);
In case of binding failure during startup, we wait for some time sending
signals to old pids so that they release the ports we need. But if there
aren't any old pids anymore, it's useless to wait, we prefer to fail fast.
Along with this change, we now have the number of old pids really found
in the nb_oldpids variable.
If the global stats timeout statement was found before the stats socket
(or without), the parser would crash because the stats frontend was not
initialized. Now we have an allocation function which solves the issue.
This bug was introduced with 1.4 so it does not need backporting.
(was commit 1c5819d2498ae3643c3880507847f948a53d2773 in 1.4)
If a server is disabled or tracking a disabled server, it must not
dequeue requests pending in the proxy queue, it must only dequeue
its own ones.
The problem that was caused is that if a backend always had requests
in its queue, a disabled server would continue to take traffic forever.
(was commit 09d02aaf02d1f21c0c02672888f3a36a14bdd299 in 1.4)
The statistics page (the HTML one) displays a garbage value on frontends using
"rate-limit session" in HTTP mode.
This is due to the usage of the same buffer for the macros converting the max
session rate and the limit.
Steps to reproduce :
Configuration file example :
listen bug :80
mode http
rate-limit sessions
stats enable
Then start refreshing the statistics page.
This bug was introduced just before the release of haproxy 1.4.0.
(was commit 6cfaf9e91969c87a9eab1d58a15d2d0a3f346c9b in 1.4)
While it's usually desired to wait for a server response even
when the client closes its request channel, it can be problematic
with long polling requests. In order to let the server decide what
to do in such a case, if option abortonclose is set, we simply
forward the shutdown to the server. That way, it can decide to
take the appropriate action. Most servers will still process the
request, while some will probably want to abort.
Obviously, this only works as long as the client has not sent
another pipelined request over the same connection.
(was commit 0e25d86da49827ff6aa3c94132c01292b5ba4854 in 1.4)
In case of HTTP keepalive processing, we want to release the counters tracked
by the backend. Till now only the second set of counters was released, while
it could have been assigned by the frontend, or the backend could also have
assigned the first set. Now we reuse to unused bits of the session flags to
mark which stick counters were assigned by the backend and to release them as
appropriate.
The assumption that there was a 1:1 relation between tracked counters and
the frontend/backend role was wrong. It is perfectly possible to track the
track-fe-counters from the backend and the track-be-counters from the
frontend. Thus, in order to reduce confusion, let's remove this useless
{fe,be} reference and simply use {1,2} instead. The keywords have also been
renamed in order to limit confusion. The ACL rule action now becomes
"track-sc{1,2}". The ACLs are now "sc{1,2}_*" instead of "trk{fe,be}_*".
That means that we can reasonably document "sc1" and "sc2" (sticky counters
1 and 2) as sort of patterns that are available during the whole session's
life and use them just like any other pattern.
It began to be problematic to have "tcp-request" followed by an
immediate action, as sometimes it was a keyword indicating a hook
or setting ("content" or "inspect-delay") and sometimes it was an
action.
Now the prefix for connection-level tcp-requests is "tcp-request connection"
and the ones processing contents remain "tcp-request contents".
This has allowed a nice simplification of the config parser and to
clean up the doc a bit. Also now it's a bit more clear why tcp-request
connection are not allowed in backends.
Doing so allows us to track counters from backends or depending on contents.
For instance, it now becomes possible to decide to track a connection based
on a Host header if enough time is granted to parse the HTTP request. It is
also possible to just track frontend counters in the frontend and unconditionally
track backend counters in the backend without having to write complex rules.
The first track-fe-counters rule executed is used to track counters for
the frontend, and the first track-be-counters rule executed is used to track
counters for the backend. Nothing prevents a frontend from setting a track-be
rule nor a backend from setting a track-fe rule. In fact these rules are
arbitrarily split between FE and BE with no dependencies.
Having a single tracking pointer for both frontend and backend counters
does not work. Instead let's have one for each. The keyword has changed
to "track-be-counters" and "track-fe-counters", and the ACL "trk_*"
changed to "trkfe_*" and "trkbe_*".
It is now possible to dump some select table entries based on criteria
which apply to the stored data. This is enabled by appending the following
options to the end of the "show table" statement :
data.<data_type> {eq|ne|lt|gt|le|ge} <value>
For intance :
show table http_proxy data.conn_rate gt 5
show table http_proxy data.gpc0 ne 0
The compare applies to the integer value as it would be displayed, and
operates on signed long long integers.
It's a bit cumbersome to have to know all possible storable types
from the stats interface. Instead, let's have generic types for
all data, which will facilitate their manipulation.
This feature will be required at some point, when the stick tables are
used to enforce security measures. For instance, some visitors may be
incorrectly flagged as abusers and would ask the site admins to remove
their entry from the table.
It is now possible to dump a table's contents with keys, expire,
use count, and various data using the command above on the stats
socket.
"show table" only shows main table stats, while "show table <name>"
dumps table contents, only if the socket level is admin.
This patch adds support for the following session counters :
- http_req_cnt : HTTP request count
- http_req_rate: HTTP request rate
- http_err_cnt : HTTP request error count
- http_err_rate: HTTP request error rate
The equivalent ACLs have been added to check the tracked counters
for the current session or the counters of the current source.
This counter may be used to track anything. Two sets of ACLs are available
to manage it, one gets its value, and the other one increments its value
and returns it. In the second case, the entry is created if it did not
exist.
Thus it is possible for example to mark a source as being an abuser and
to keep it marked as long as it does not wait for the entry to expire :
# The rules below use gpc0 to track abusers, and reject them if
# a source has been marked as such. The track-counters statement
# automatically refreshes the entry which will not expire until a
# 1-minute silence is respected from the source. The second rule
# evaluates the second part if the first one is true, so GPC0 will
# be increased once the conn_rate is above 100/5s.
stick-table type ip size 200k expire 1m store conn_rate(5s),gpc0
tcp-request track-counters src
tcp-request reject if { trk_get_gpc0 gt 0 }
tcp-request reject if { trk_conn_rate gt 100 } { trk_inc_gpc0 gt 0}
Alternatively, it is possible to let the entry expire even in presence of
traffic by swapping the check for gpc0 and the track-counters statement :
stick-table type ip size 200k expire 1m store conn_rate(5s),gpc0
tcp-request reject if { src_get_gpc0 gt 0 }
tcp-request track-counters src
tcp-request reject if { trk_conn_rate gt 100 } { trk_inc_gpc0 gt 0}
It is also possible not to track counters at all, but entry lookups will
then be performed more often :
stick-table type ip size 200k expire 1m store conn_rate(5s),gpc0
tcp-request reject if { src_get_gpc0 gt 0 }
tcp-request reject if { src_conn_rate gt 100 } { src_inc_gpc0 gt 0}
The '0' at the end of the counter name is there because if we find that more
counters may be useful, other ones will be added.
This function looks up a key, updates its expiration date, or creates
it if it was not found. acl_fetch_src_updt_conn_cnt() was updated to
make use of it.
These counters maintain incoming and outgoing byte rates in a stick-table,
over a period which is defined in the configuration (2 ms to 24 days).
They can be used to detect service abuse and enforce a certain bandwidth
limits per source address for instance, and block if the rate is passed
over. Since 32-bit counters are used to compute the rates, it is important
not to use too long periods so that we don't have to deal with rates above
4 GB per period.
Example :
# block if more than 5 Megs retrieved in 30 seconds from a source.
stick-table type ip size 200k expire 1m store bytes_out_rate(30s)
tcp-request track-counters src
tcp-request reject if { trk_bytes_out_rate gt 5000000 }
# cause a 15 seconds pause to requests from sources in excess of 2 megs/30s
tcp-request inspect-delay 15s
tcp-request content accept if { trk_bytes_out_rate gt 2000000 } WAIT_END
These counters maintain incoming connection rates and session rates
in a stick-table, over a period which is defined in the configuration
(2 ms to 24 days). They can be used to detect service abuse and
enforce a certain accept rate per source address for instance, and
block if the rate is passed over.
Example :
# block if more than 50 requests per 5 seconds from a source.
stick-table type ip size 200k expire 1m store conn_rate(5s),sess_rate(5s)
tcp-request track-counters src
tcp-request reject if { trk_conn_rate gt 50 }
# cause a 3 seconds pause to requests from sources in excess of 20 requests/5s
tcp-request inspect-delay 3s
tcp-request content accept if { trk_sess_rate gt 20 } WAIT_END
We're now able to return errors based on the validity of an argument
passed to a stick-table store data type. We also support ARG_T_DELAY
to pass delays to stored data types (eg: for rate counters).
Some data types will require arguments (eg: period for a rate counter).
This patch adds support for such arguments between parenthesis in the
"store" directive of the stick-table statement. Right now only integers
are supported.