Recent commit 62c3be broke maintenance mode by fixing srv_is_usable().
Enabling a disabled server would not re-introduce it into the farm.
The reason is that in set_server_up(), the SRV_MAINTAIN flag is still
present when recounting the servers. The flag was removed late only to
adjust a log message. Keep a copy of the old flag instead and update
SRV_MAINTAIN earlier.
This fix must also be backported to 1.4 (but no release got the regression).
Commits 5c6209 and 072930 were aimed at avoiding undesirable PUSH flags
when forwarding chunked data, but had the undesired effect of causing
data advertised by content-length to be affected by the delayed ACK too.
This can happen when the data to be forwarded are small enough to fit into
a single send() call, otherwise the BF_EXPECT_MORE flag would be removed.
Content-length data don't need the BF_EXPECT_MORE flag since the low-level
forwarder already knows it can safely rely on bf->to_forward to set the
appropriate TCP flags.
Note that the issue is only observed in requests at the moment, though the
later introduction of server-side keep-alive could trigger the issue on the
response path too.
Special thanks to Randy Shults for reporting this issue with a lot of
details helping to reproduce it.
The fix must be backported to 1.4.
When a request completes on a server and the server connection is closed
while the client connection stays open, the HTTP engine releases all server
connection slots and scans the queues to offer the connection slot to
another pending request.
An issue happens when the released connection allows other requests to be
dequeued : may_dequeue_tasks() relies on srv->served which is only decremented
by sess_change_server() which itself is only called after may_dequeue_tasks().
This results in no connection being woken up until another connection terminates
so that may_dequeue_tasks() is called again.
This fix is minimalist and only moves sess_change_server() earlier (which is
safe). It should be reworked and the code factored out so that the same occurrence
in session.c shares the same code.
This bug has been there since the introduction of option-http-server-close and
the fix must be backported to 1.4.
Since commit 115acb97, chunk size was limited to 256MB. There is no reason for
such a limit and the comment on the code suggests a missing zero. However,
increasing the limit past 2 GB causes trouble due to some 32-bit subtracts
in various computations becoming negative (eg: buffer_max_len). So let's limit
the chunk size to 2 GB - 1 max.
commit a1cc3811 introduced an undesirable \0\n ending on HTTP log messages. This
is because of an extra character count passed to __send_log() which causes the LF
to be appended past the \0. Some syslog daemons thus log an extra empty line. The
fix is obvious. Fix the function comments to remind what they expect on their input.
This is past 1.5-dev7 regression so there's no backport needed.
The principle behind this load balancing algorithm was first imagined
and modeled by Steen Larsen then iteratively refined through several
work sessions until it would totally address its original goal.
The purpose of this algorithm is to always use the smallest number of
servers so that extra servers can be powered off during non-intensive
hours. Additional tools may be used to do that work, possibly by
locally monitoring the servers' activity.
The first server with available connection slots receives the connection.
The servers are choosen from the lowest numeric identifier to the highest
(see server parameter "id"), which defaults to the server's position in
the farm. Once a server reaches its maxconn value, the next server is used.
It does not make sense to use this algorithm without setting maxconn. Note
that it can however make sense to use minconn so that servers are not used
at full load before starting new servers, and so that introduction of new
servers requires a progressively increasing load (the number of servers
would more or less follow the square root of the load until maxconn is
reached). This algorithm ignores the server weight, and is more beneficial
to long sessions such as RDP or IMAP than HTTP, though it can be useful
there too.
http_sess_log now use the logformat linked list to make the log
string, snprintf is not used for speed issue.
CLF mode also uses logformat.
NOTE: as of now, empty fields in CLF now are "" not "-" anymore.
parse_logformat_string: parse the string, detect the type: text,
separator or variable
parse_logformat_var: dectect variable name
parse_logformat_var_args: parse arguments and flags
add_to_logformat_list: add to the logformat linked list
send_log function is now splited in 3 functions
* hdr_log: generate the syslog header
* send_log: send a syslog message with a printf format string
* __send_log: send a syslog message
When checking a configuration file using "-c -f xxx", sometimes it is
reported that a config is valid while it will later fail (eg: no enabled
listener). Instead, let's improve the return values :
- return 0 if config is 100% OK
- return 1 if config has errors
- return 2 if config is OK but no listener nor peer is enabled
If the local host is not found as a peer in a "peers" section, we have a
double free, and possibly a use-after-free because the peers section is
freed since it's aliased as the table's name.
Marcello Gorlani reported that commit 5e205524ad
(BUG: http: re-enable TCP quick-ack upon incomplete HTTP requests) broke build
on FreeBSD.
Moving the include lower fixes the issue. This must be backported to 1.4 too.
It was reported that a server configured with a zero weight would
sometimes still take connections from the backend queue. This issue is
real, it happens this way :
1) the disabled server accepts a request with a cookie
2) many cookie-less requests accumulate in the backend queue
3) when the disabled server completes its request, it checks its own
queue and the backend's queue
4) the server takes a pending request from the backend queue and
processes it. In response, the server's cookie is assigned to
the client, which ensures that some requests will continue to
be served by this server, leading back to point 1 above.
The fix consists in preventing a zero-weight server from dequeuing pending
requests from the backend. Making use of srv_is_usable() in such tests makes
the tests more robust against future changes.
This fix must be backported to 1.4 and 1.3.
In a config where server "s1" is marked disabled and "s2" tracks "s1",
s2 appears disabled on the stats but is still inserted into the LB farm
because the tracking is resolved too late in the configuration process.
We now resolve tracked servers before building LB maps and we also mark
the tracking server in maintenance mode, which previously was not done,
causing half of the issue.
Last point is that we also protect srv_is_usable() against electing a
server marked for maintenance. This is not absolutely needed but is a
safe choice and makes a lot of sense.
This fix must be backported to 1.4.
I downloaded version 1.4.19 this morning. While merging the code changes
to a custom build that we have here for our project I noticed a typo in
'session.c', in the new code for inserting the server name in the HTTP
header. The fix that I did is shown in the patch below.
[WT: the bug is harmless, it is only suboptimal]
Joe Price reported that "clear table xxx" sent on the CLI would only clear
the last entry. This is true, some code was missing to remove an entry from
within the loop, and only the final condition was able to remove an entry.
The fix is obvious. No backport is needed.
These ones are invalid and blocked unless "option accept-invalid-http-request"
is specified in the frontend. In any case, the faulty request is logged.
Note that some of the remaining invalid chars are still not checked against,
those are the invalid ones between 32 and 127 :
34 ('"'), 60 ('<'), 62 ('>'), 92 ('\'), 94 ('^'),
96 ('`'), 123 ('{'), 124 ('|'), 125 ('}')
Using a lookup table might be better at some point.
The HTTP request parser was considering that any non-LWS char was
par of the URI. Unfortunately, this allows control chars to be sent
in the URI, sometimes resulting in backend servers misbehaving, for
instance when they interprete \0 as an end of string and respond
with plain HTTP/0.9 without headers, that haproxy blocks as invalid
responses.
RFC3986 clearly states the list of allowed characters in a URI. Even
non-ASCII chars are not allowed. Unfortunately, after having run 10
years with these chars allowed, we can't block them right now without
an optional workaround. So the first step consists in only blocking
control chars. A later patch will allow non-ASCII only when an appropriate
option is enabled in the frontend.
Control chars are 0..31 and 127, with the exception of 9, 10 and 13
(\t, \n, \r).
On Solaris/sparc, getpid() returns pid_t which is not an int :
src/peers.c: In function `peer_io_handler':
src/peers.c:508: warning: int format, pid_t arg (arg 6)
New option "http-send-name-header" specifies the name of a header which
will hold the server name in outgoing requests. This is the name of the
server the connection is really sent to, which means that upon redispatches,
the header's value is updated so that it always matches the server's name.
Using "halog -c" is still something quite common to perform on logs,
but unfortunately since the recent added controls, it was sensibly
slowed down due to the parsing of the accept date field.
Now we use a specific loop for the case where nothing is needed from
the input, and this sped up the line counting by 2.5x. A 2.4 GHz Xeon
now counts lines at a rate of 2 GB of logs per second.
This pattern previously was limited to type IP. With the new header
extraction function, it becomes possible to extract strings, so that
the header can be returned as a string. This will not change anything
to existing configs, as string will automatically be converted to IP
when needed. However, new configs will be able to use IPv6 addresses
from headers in stick-tables, as well as stick on any non-IP header
(eg: host, user-agent, ...).
The new function does not return IP addresses but header values instead,
so that the caller is free to make what it want of them. The conversion
is not quite clean yet, as the previous test which considered that address
0.0.0.0 meant "no address" is still used. A different IP parsing function
should be used to take this into account.
Now strings and data blocks are stored in the temp_pattern's chunk
and matched against this one.
The rdp_cookie currently makes extensive use of acl_fetch_rdp_cookie()
and will be a good candidate for the initial rework so that ACLs use
the patterns framework and not the other way around.
IPv4 and IPv6 addresses are now stored into temp_pattern instead of
the dirty hack consisting into storing them into the consumer's target
address.
Some refactoring should now be possible since the methods used to fetch
source and destination addresses are similar between patterns and ACLs.
All ACL fetches which return integer value now store the result into
the temporary pattern struct. All ACL matches which rely on integer
also get their value there.
Note: the pattern data types are not set right now.
Till now the pattern data integer type was unsigned without any
particular reason. In order to make ACLs use it, we must switch it
to signed int instead.
This function was only used to call chunk_init_len() from another chunk,
which in the end consists in simply assigning the source chunk to the
destination chunk. Let's remove this indirection to make the code clearer.
Anyway it was the only place such a function was used.
This is 1.5-specific. It causes issues with transparent source binding involving
hdr_ip. We must not try to bind() to a foreign address when the family is not set,
and we must set the family when an address is set.
By default we disable TCP quick-acking on HTTP requests so that we
avoid sending a pure ACK immediately followed by the HTTP response.
However, if the client sends an incomplete request in a short packet,
its TCP stack might wait for this packet to be ACKed before sending
the rest of the request, delaying incoming requests by up to 40-200ms.
We can detect this undesirable situation when parsing the request :
- if an incomplete request is received
- if a full request is received and uses chunked encoding or advertises
a content-length larger than the data available in the buffer
In these situations, we re-enable TCP quick-ack if we had previously
disabled it.
Server Name Indication (SNI) is a TLS extension which makes a client
present the name of the server it is connecting to in the client hello.
It allows a transparent proxy to take a decision based on the beginning
of an SSL/TLS stream without deciphering it.
The new ACL "req_ssl_sni" matches the name extracted from the TLS
handshake against a list of names which may be loaded from a file if
needed.
When splice() returns EAGAIN, on old kernels it could be caused by a read
shutdown which was not detected. Due to this behaviour, we had to fall
back to recv(), which in turn says if it's a real EAGAIN or a shutdown.
Since this behaviour was fixed in 2.6.27.14, on more recent kernels we'd
prefer to avoid the fallback to recv() when possible. For this, we set a
variable the first time splice() detects a shutdown, to indicate that it
works. We can then rely on this variable to adjust our behaviour.
Doing this alone increased the overall performance by about 1% on medium
sized objects.
First, it's a waste not to call chk_snd() when spliced data are available,
because the pipe can almost always be transferred into the outgoing socket
buffers. Starting from now, when we splice data in, we immediately try to
send them. This results in less pipes used, and possibly less kernel memory
in use at once.
Second, if a pipe cannot be transferred into the outgoing socket buffers,
it means this buffer is full. There's no point trying again then, as space
will almost never be available, resulting in a useless syscall returning
EAGAIN.