Ankit Malp reported a bug that we've had since binding to devices was
implemented. Haproxy wrongly checks that the process remains privileged
after startup when binding to a device is specified via the bind
keyword "interface". This is wrong, because after startup we're not
binding any socket anymore, and during startup any permission
issue is immediately reported ("permission denied"). More
importantly there's no way around it, as the process exits on startup
when facing such an option.
This fix should be backported to 1.7, 1.6 and 1.5.
This patch adds a 'no-agent-check' setting, supported by both 'default-server'
and 'server' directives, to disable an agent check for a specific server which
would otherwise have 'agent-check' set by default (inherited from a 'default-server'
'agent-check' setting), or, on 'default-server' lines, to disable 'agent-check'
as the default for any subsequent 'server' declarations.
For instance, provided this configuration:
default-server agent-check
server srv1
server srv2 no-agent-check
server srv3
default-server no-agent-check
server srv4
srv1 and srv3 would have an agent check enabled, contrary to srv2 and srv4.
Nothing is allocated anymore when parsing the 'default-server' 'agent-check'
setting.
Before this patch, only 'server' directives could support the 'disabled' setting.
This patch makes 'default-server' directives support this setting as well.
It is used to disable a list of servers declared after a 'default-server' directive.
A new keyword, 'enabled', has been added, supported as both a 'default-server' and a
'server' setting, to enable again a list of servers (declared after a
'default-server enabled' directive) or to explicitly enable a specific server declared
after a 'default-server disabled' directive.
For instance, provided this configuration:
default-server disabled
server srv1...
server srv2...
server srv3... enabled
server srv4... enabled
srv1 and srv2 are disabled and srv3 and srv4 enabled.
This is equivalent to this configuration:
default-server disabled
server srv1...
server srv2...
default-server enabled
server srv3...
server srv4...
even if it would have been preferable/shorter to declare:
server srv3...
server srv4...
default-server disabled
server srv1...
server srv2...
as 'enabled' is the default server state.
This patch makes 'default-server' support the 'addr' setting.
The code responsible for parsing the 'server' 'addr' setting
has been moved from parse_server() into a new parser
callable as both the 'default-server' and 'server' 'addr' setting parser.
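For instance (hypothetical addresses), a common health-check address can now be
declared once for all servers:
default-server check addr 192.168.0.50
server srv1 10.0.0.1:80
server srv2 10.0.0.2:80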
Should not break anything.
This patch makes 'default-server' directives support the 'sni' setting.
A field 'sni_expr' has been added to 'struct server' to temporarily
store SNI expressions as strings during the parsing of both 'default-server' and
'server' lines. So, to duplicate SNI expressions from the 'default-server' 'sni'
setting for new 'server' instances, we only have to strdup these strings, as is
often done for most of the 'server' settings.
Then, sample expressions are computed by calling sample_parse_expr() (only for
'server' instances).
A new function, display_parser_err(), has been added to produce the same error
output as before in case of any error during 'sni' setting parsing.
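For instance (a purely hypothetical setup), an SNI expression can now be set once
for all SSL servers of a backend:
default-server ssl verify none sni req.hdr(host)
server srv1 10.0.0.1:443
server srv2 10.0.0.2:443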
Should not break anything.
Before this patch, only 'server' directives could support the 'source' setting.
This patch makes 'default-server' directives support this setting as well.
To do so, we had to extract the code responsible for parsing 'source' setting
arguments from the parse_server() function and make it callable as
both the 'default-server' and 'server' 'source' setting parser. So the code is
mostly the same as before, except that before allocating anything for
'struct conn_src' members, we must free the memory previously allocated.
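For instance (hypothetical addresses), a source address can now be declared once
for all servers:
default-server source 192.168.255.254
server srv1 10.0.0.1:80
server srv2 10.0.0.2:80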
Should not break anything.
Before this patch, the 'cookie' setting was only supported by 'server' directives.
This patch makes 'default-server' directives also support this setting.
Should not break anything.
Before this patch, the 'observe' setting was only supported by 'server' directives.
This patch makes 'default-server' directives also support this setting.
Should not break anything.
Before this patch, only 'server' directives could support the 'redir' setting.
This patch makes 'default-server' directives also support this setting.
Should not break anything.
Before this patch, only 'server' directives could support the 'track' setting.
This patch makes 'default-server' directives also support this setting.
Should not break anything.
Before this patch, the 'check' setting was only supported by 'server' directives.
This patch makes 'default-server' directives also support this setting.
A new 'no-check' keyword parser has been implemented to disable this setting in
both 'default-server' and 'server' directives.
Should not break anything.
This patch makes the 'default-server' directive support the 'verifyhost' setting.
Note: there was a small memory leak when several 'verifyhost' arguments were
supplied on the same 'server' line.
This patch makes the 'default-server' directive support the 'send-proxy-v2-ssl'
(resp. 'send-proxy-v2-ssl-cn') setting.
A new keyword 'no-send-proxy-v2-ssl' (resp. 'no-send-proxy-v2-ssl-cn') has been
added to disable the 'send-proxy-v2-ssl' (resp. 'send-proxy-v2-ssl-cn') setting
in both 'server' and 'default-server' directives.
This patch makes the 'default-server' directive support the 'ssl' setting.
A new keyword 'no-ssl' has been added to disable this setting in both
'server' and 'default-server' directives.
This patch makes the 'default-server' directive support the 'no-sslv3' (resp.
'no-ssl-reuse', 'no-tlsv10', 'no-tlsv11', 'no-tlsv12', and 'no-tls-tickets')
settings.
New keywords 'sslv3' (resp. 'ssl-reuse', 'tlsv10', 'tlsv11', 'tlsv12', and
'tls-tickets') have been added to disable these settings in both 'server' and
'default-server' directives.
This patch makes the 'default-server' directive support the 'force-sslv3'
and 'force-tlsv1[0-2]' settings.
New keywords 'no-force-sslv3' (resp. 'no-force-tlsv1[0-2]') have been added
to disable the 'force-sslv3' (resp. 'force-tlsv1[0-2]') setting in both 'server'
and 'default-server' directives.
This patch makes the 'default-server' directive support the 'check-ssl' setting
to enable SSL for health checks.
A new keyword 'no-check-ssl' has been added to disable this setting in both
'server' and 'default-server' directives.
This patch makes the 'default-server' directive support the 'send-proxy'
(resp. 'send-proxy-v2') setting.
A new keyword 'no-send-proxy' (resp. 'no-send-proxy-v2') has been added
to disable the 'send-proxy' (resp. 'send-proxy-v2') setting in both 'server' and
'default-server' directives.
This patch makes the 'default-server' directive support the 'non-stick' setting.
A new keyword 'stick' has been added to disable the
'non-stick' setting in both 'server' and 'default-server' directives.
This patch makes the 'default-server' directive support the 'check-send-proxy'
setting. A new keyword 'no-check-send-proxy' has been added to disable the
'check-send-proxy' setting in both 'server' and 'default-server' directives.
Until now, only 'server' directives supported the 'backup' keyword.
This patch makes 'default-server' directives support it as well.
A new keyword 'no-backup' has been added to disable the 'backup' setting
in both 'server' and 'default-server' directives.
For instance, provided the following sequence of directives:
default-server backup
server srv1
server srv2 no-backup
default-server no-backup
server srv3
server srv4 backup
srv1 and srv4 are declared as backup servers,
srv2 and srv3 are declared as non-backup servers.
There is no reason to emit such an error message:
"'default-server' expects <name> and <addr>[:<port>] as arguments."
if fewer than two arguments are provided on 'default-server' lines.
This is a 'server'-specific error message.
There is a silly case where a loop is not detected in tracked servers lists:
when a server tracks itself.
Ex:
server srv1 127.0.0.1:8000 track srv1
Well, this never happens and this does not prevent haproxy from working.
But with the following configuration:
server srv1 127.0.0.1:8000 track srv2
server srv2 127.0.0.1:8000 track srv2
server srv3 127.0.0.1:8000 track srv2
the code in charge of detecting such loops never returns (and no error message
is emitted). haproxy becomes stuck in an infinite loop because of this statement
found in check_config_validity():
for (loop = srv->track; loop && loop != newsrv; loop = loop->track);
Again, such a configuration is unlikely to be used by accident.
This latter example seems silly, but as several 'default-server' directives may be
used in the same proxy section, and as 'default-server' settings are not reset each
time a new 'default-server' line is parsed, it will match the following
configuration, in the future, when the 'track' setting is supported by
'default-server':
default-server track srv3
server srv1 127.0.0.1:8000
server srv2 127.0.0.1:8000
.
.
.
default-server check
server srv3 127.0.0.1:8000
(cherry picked from commit 6528fc93d3c065fdac841f24e55cfe9674a67414)
A "weakness" exist in the first implementation of the parsing of the DNS
responses: HAProxy always choses the first IP available matching family
preference, or as a failover, the first IP.
It should be good enough, since most DNS servers do round robin on the
records they send back to clients.
That said, some servers does not do proper round robin, or we may be
unlucky too and deliver the same IP to all the servers sharing the same
hostname.
Let's take the simple configuration below:
backend bk
server s1 www:80 check resolvers R
server s2 www:80 check resolvers R
The DNS server is configured with 2 IPs for 'www'.
If you're unlucky, then HAProxy may apply the same IP to both servers.
The current patch improves this situation by weighting the decision
algorithm to ensure we'll prefer an IP found in the response
which is not already assigned to any server.
The new algorithm guarantees neither that the chosen IP is healthy,
nor a fair distribution of IPs amongst the servers in the farm,
etc...
It only guarantees that if the DNS server returns many records for a
hostname and that this hostname is being resolved by multiple servers in
the same backend, then we'll use as many records as possible.
If a server fails, HAProxy won't pick another record from the
response.
When SIGUSR1 is received, haproxy enters soft-stop and quits when no
connection remains.
It can happen that the instance remains alive for a long time, depending
on timeouts and traffic. This option ensures that soft-stop won't run
for too long.
Example:
global
hard-stop-after 30s # Once in soft-stop, the instance will remain
# alive for at most 30 seconds.
kevent() always sets EV_EOF with EVFILT_READ to notify of a read shutdown
and EV_EOF with EVFILT_WRITE to notify of a write error. Let's check this
flag to properly update the FD's polled status (FD_POLL_HUP and FD_POLL_ERR
respectively).
It's worth noting that this one can be coupled with a regular read event
to notify about a pending read followed by a shutdown, but for now we only
use this to set the relevant flags (HUP and ERR).
The poller now exhibits the flag HAP_POLL_F_RDHUP to indicate this new
capability.
An improvement may consist in not setting FD_POLL_IN when the "data"
field is null since it normally only reflects the amount of pending
data.
After commit e852545 ("MEDIUM: polling: centralize polled events processing")
all pollers stopped explicitly checking the FD's polled status, except the
kqueue poller, which is constructed a bit differently. It doesn't seem possible
for this to cause any issue such as missing an event, but anyway it's better to
definitely get rid of this since the event filter already provides the event
direction.
On Linux since 2.6.17 poll() supports POLLRDHUP to notify of an upcoming
hangup after pending data. Making use of it allows us to avoid a useless
recv() after short responses on keep-alive connections. Note that we
automatically enable the feature once this flag has been met first in a
poll() status. Till now it was only enabled on epoll.
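As an illustration only (this is not HAProxy's poller code, just a minimal
standalone sketch), detecting a pending read shutdown with POLLRDHUP looks
like this:
  #define _GNU_SOURCE          /* POLLRDHUP is Linux-specific (>= 2.6.17) */
  #include <poll.h>

  /* Returns 1 if the peer already shut down its writing side, 0 if only
   * data is pending, -1 on error or timeout.
   */
  static int check_rdhup(int fd, int timeout_ms)
  {
          struct pollfd pfd = { .fd = fd, .events = POLLIN | POLLRDHUP };

          if (poll(&pfd, 1, timeout_ms) <= 0)
                  return -1;
          if (pfd.revents & POLLRDHUP)
                  return 1; /* hangup after pending data: no extra recv() needed */
          return 0;
  }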
Curu Wong reported a case where haproxy used to send RST to a server
after receiving its FIN. The problem is caused by the fact that being
a server connection, its fd is marked with linger_risk=1, and that the
poller didn't report POLLRDHUP, making haproxy unaware of a pending
shutdown that came after the data, so it used to resort to nolinger
for closing.
However when pollers support RDHUP we're pretty certain whether or not
a shutdown comes after the data and we don't need to perform that extra
recv() call.
Similarly when we're dealing with an inbound connection we don't care
and don't want to perform this extra recv after a request for a very
unlikely case, as in any case we'll have to deal with the client-facing
TIME_WAIT socket.
So this patch ensures that only when it's known that there's no risk with
lingering data, as well as in cases where it's known that the poller would
have detected a pending RDHUP, we perform the fd_done_recv() otherwise we
continue, trying a subsequent recv() to try to detect a pending shutdown.
This effectively results in an extra recv() call for keep-alive sockets
connected to a server when POLLRDHUP isn't known to be supported, but
it's the only way to know whether they're still alive or closed.
This fix should be backported to 1.7, 1.6 and 1.5. It relies on the
previous patch bringing support for the HAP_POLL_F_RDHUP flag.
We'll need to differentiate pollers which can report hangup at
the same time as read (POLLRDHUP) from the other ones, because only
the former may benefit from the fd_done_recv() optimization. Epoll has
had support for EPOLLRDHUP since Linux 2.6.17 and has always been used
this way in haproxy, so now we only set the flag once we've observed it
once in a response. It means that some initial requests may try to
perform a second recv() call, but after the first closed connection it
will be enough to know that the second call is not needed anymore.
Later we may extend these flags to designate event-triggered pollers.
A tcp half connection can cause 100% CPU on expiration.
First reproduced with this haproxy configuration :
global
tune.bufsize 10485760
defaults
timeout server-fin 90s
timeout client-fin 90s
backend node2
mode tcp
timeout server 900s
timeout connect 10s
server def 127.0.0.1:3333
frontend fe_api
mode tcp
timeout client 900s
bind :1990
use_backend node2
I.e. with timeout server-fin shorter than timeout server: the backend server
sends data, this data is left in haproxy's buffer, then the backend
server sends a FIN packet, which haproxy receives. At this
point the session information is as follows:
0x2373470: proto=tcpv4 src=127.0.0.1:39513 fe=fe_api be=node2
srv=def ts=08 age=1s calls=3 rq[f=848000h,i=0,an=00h,rx=14m58s,wx=,ax=]
rp[f=8004c020h,i=0,an=00h,rx=,wx=14m58s,ax=] s0=[7,0h,fd=6,ex=]
s1=[7,18h,fd=7,ex=] exp=14m58s
rp has the CF_SHUTR flag set. Next, the client sends its FIN packet, and the
session information becomes:
0x2373470: proto=tcpv4 src=127.0.0.1:39513 fe=fe_api be=node2
srv=def ts=08 age=38s calls=4 rq[f=84a020h,i=0,an=00h,rx=,wx=,ax=]
rp[f=8004c020h,i=0,an=00h,rx=1m11s,wx=14m21s,ax=] s0=[7,0h,fd=6,ex=]
s1=[9,10h,fd=7,ex=] exp=1m11s
After waiting 90s, session information is as follows:
0x2373470: proto=tcpv4 src=127.0.0.1:39513 fe=fe_api be=node2
srv=def ts=04 age=4m11s calls=718074391 rq[f=84a020h,i=0,an=00h,rx=,wx=,ax=]
rp[f=8004c020h,i=0,an=00h,rx=?,wx=10m49s,ax=] s0=[7,0h,fd=6,ex=]
s1=[9,10h,fd=7,ex=] exp=? run(nice=0)
cpu information:
6899 root 20 0 112224 21408 4260 R 100.0 0.7 3:04.96 haproxy
Buffering is set up to ensure that there is data in the haproxy buffer so that
haproxy can receive the FIN packet and set the CF_SHUTR flag. If the CF_SHUTR
flag has been set, the following code does not clear the timeout, causing
100% CPU usage:
stream.c:process_stream:
if (unlikely((res->flags & (CF_SHUTR|CF_READ_TIMEOUT)) == CF_READ_TIMEOUT)) {
if (si_b->flags & SI_FL_NOHALF)
si_b->flags |= SI_FL_NOLINGER;
si_shutr(si_b);
}
If the read side has already been closed, setting the read timeout does not make
sense. Yet, with or without CF_SHUTR, the read timeout is set:
if (tick_isset(s->be->timeout.serverfin)) {
res->rto = s->be->timeout.serverfin;
res->rex = tick_add(now_ms, res->rto);
}
After discussion on the mailing list, setting half-closed timeouts the
hard way here doesn't make sense. They should be set only at the moment
the shutdown() is performed. It will also solve a special case which was
already reported of some half-closed timeouts not working when the shutw()
is performed directly at the stream-interface layer (no analyser involved).
Since the stream interface layer cannot know the timeout values, we'll have
to store them directly in the stream interface so that they are used upon
shutw(). This patch does this, fixing the problem.
An easier reproducer to validate the fix is to keep the huge buffer and
shorten all timeouts, then call it under tcploop server and client, and
wait 3 seconds to see haproxy run at 100% CPU :
global
tune.bufsize 10485760
listen px
bind :1990
timeout client 90s
timeout server 90s
timeout connect 1s
timeout server-fin 3s
timeout client-fin 3s
server def 127.0.0.1:3333
$ tcploop 3333 L W N20 A P100 F P10000 &
$ tcploop 127.0.0.1:1990 C S10000000 F
Because of this typo, AN_RES_FLT_END was never called when a redirect rule is
applied on a keep-alive connection. In almost all cases, this bug has no
effect. But it leads to a memory leak if a redirect is done by an http-response
rule when HTTP compression is enabled.
This patch should be backported in 1.7.
SSL_CTX_set_ecdh_auto is declared (when present) with #define. A simple #ifdef
avoids having to list all the cases of SSL libs. It's a placebo in new SSL libs.
It's OK with openssl 1.0.1, 1.0.2, 1.1.0, libressl and boringssl.
Thanks to Piotr Kubaj for proposing and testing with libressl.
This fixes a regression introduced in d7bdcb874b, that removed the
ability to use req.payload(0,0) to read the whole buffer content. The
offending commit is present starting in version 1.6, so the patch
should be backported to versions 1.6 and 1.7.
This flag is always set when we end up here, for each and every data
layer (idle, stream-interface, checks), and continuing to test it
leaves a big risk of forgetting to set it as happened once already
before 1.5-dev13.
It could make sense to backport this into stable branches as part of
the connection flag fixes, after some cool down period.
Despite the previous commit working fine on all tests, it's still not
sufficient to completely address the problem. If the connection handler
is called with an event validating an L4 connection but some handshakes
remain (eg: accept-proxy), it will still wake the function up, which
will not report the activity, and will not detect a change once the
handshake is complete, so it will not notify the ->wake() handler.
In fact the only reason why the ->wake() handler is still called here
is because after dropping the last handshake, we try to call ->recv()
and ->send() in turn and change the flags in order to detect a data
activity. But if for any reason the data layer is not interested in
reading nor writing, it will not get these events.
A cleaner way to address this is to call the ->wake() handler only
on definitive status changes (shut, error), on real data activity,
and on a complete connection setup, measured as CONNECTED with no
more handshake pending.
It could be argued that the handshake flags have to be made part of
the condition to set CO_FL_CONNECTED but that would currently break
a part of the health checks. Also a handshake could appear at any
moment even after a connection is established so we'd lose the
ability to detect a second end of handshake.
For now the situation around CO_FL_CONNECTED is not clean :
- session_accept() only sets CO_FL_CONNECTED if there's no pending
handshake ;
- conn_fd_handler() will set it once L4 and L6 are complete, which
will do what session_accept() above refrained from doing even if
an accept_proxy handshake is still pending ;
- ssl_sock_infocbk() and ssl_sock_handshake() consider that a
handshake performed with CO_FL_CONNECTED set is a renegotiation ;
=> they should instead filter on CO_FL_WAIT_L6_CONN
- all ssl_fc_* sample fetch functions wait for CO_FL_CONNECTED before
accepting to fetch information
=> they should also get rid of any pending handshake
- smp_fetch_fc_rcvd_proxy() uses !CO_FL_CONNECTED instead of
CO_FL_ACCEPT_PROXY
- health checks (standard and tcp-checks) don't check for HANDSHAKE
and may report a successful check based on CO_FL_CONNECTED while
not yet done (eg: send buffer full on send_proxy).
This patch aims at solving some of these side effects in a backportable
way before this is reworked in depth :
- we need to call ->wake() to report connection success, measure
connection time, notify that the data layer is ready and update
the data layer after activity ; this has to be done either if
we switch from pending {L4,L6}_CONN to nothing with no handshakes
left, or if we notice some handshakes were pending and are now
done.
- we document that CO_FL_CONNECTED exactly means "L4 connection
setup confirmed at least once, L6 connection setup confirmed
at least once or not necessary, all this regardless of any
possibly remaining handshakes or future L6 negotiations".
This patch also renames CO_FL_CONN_STATUS to the more explicit
CO_FL_NOTIFY_DATA, and works around the previous flag trick consisting
in setting an impossible combination of flags to notify the data layer,
by simply clearing the current flags.
This fix should be backported to 1.7, 1.6 and 1.5.
Recent fix 7bf3fa3 ("BUG/MAJOR: connection: update CO_FL_CONNECTED before
calling the data layer") marked an end to a fragile situation where the
absence of CO_FL_{CONNECTED,L4,L6}* flags is used to mark the completion
of a connection setup. The problem is that by setting the CO_FL_CONNECTED
flag earlier, we can indeed call the ->wake() function from conn_fd_handler
but the stream-interface's wake function needs to see CO_FL_CONNECTED unset
to detect that a connection has just been established, so if there's no
pending data in the buffer, the connection times out. The other ->wake()
functions (health checks and idle connections) don't do this though.
So instead of trying to detect a subtle change in connection flags,
let's simply rely on the stream-interface's state and validate that the
connection is properly established and that handshakes are completed
before reporting the WRITE_NULL indicating that a pending connection was
just completed.
This patch passed all tests of handshake and non-handshake combinations,
with synchronous and asynchronous connect() and should be safe for backport
to 1.7, 1.6 and 1.5 when the fix above is already present.
When a filter is used, there are 2 channel analyzers that surround all the
others, flt_start_analyze and flt_end_analyze. This is the right place to acquire
and release resources used by filters, when needed. In addition, the last one is
used to synchronize both channels, especially for HTTP streams. We must wait
until the analysis is finished on both channels of an HTTP transaction
before restarting it for the next one.
But this part was buggy, leading to unexpected behaviours. First, depending on
which channel ends first, the request or the response can be switched into a
"forward forever" mode. Then, the HTTP transaction can be cleaned up too early,
while processing is still in progress on a channel.
To fix the bug, the flag CF_FLT_ANALYZE has been added. It is set on channels in
flt_start_analyze and is kept as long as at least one filter is still analyzing
the channel. So, we can trigger the channel synchronization once this flag has
been removed on both channels. In addition, the flag TX_WAIT_CLEANUP has been
added on the transaction to know whether the transaction must be cleaned up or
not during channel synchronization. This way, we are sure to reset everything
once all the processing is finished.
This patch should be backported in 1.7.
When the "process" setting of a bind line limits the processes a
listening socket is enabled on, a "disable frontend" operation followed
by an "enable frontend" triggers a bug because all declared listeners
are attempted to be bound again regardless of their assigned processes.
This can at minimum create new sockets not receiving traffic, and at worst
prevent a frontend from being re-enabled if it's bound to a privileged port.
This bug was introduced by commit 1c4b814 ("MEDIUM: listener: support
rebinding during resume()") merged in 1.6-dev1, trying to perform the
bind() before checking the process list instead of after.
Just move the process check before the bind() operation to fix this.
This fix must be backported to 1.7 and 1.6.
Thanks to Pavlos for reporting this one.
Strict interpretation of TLS can cause SSL sessions to be thrown away
when the socket is shut down without sending a "close notify", resulting
in each check going through the complete handshake, eating more CPU on
the servers.
[wt: strictly speaking there's no guarantee that the close notify will
be delivered, it's only best effort, but that may be enough to ensure
that once at least one is received, next checks will be cheaper. This
should be backported to 1.7 and possibly 1.6]
This adds 3 new commands to the cli :
enable dynamic-cookie backend <backend> that enables dynamic cookies for a
specified backend
disable dynamic-cookie backend <backend> that disables dynamic cookies for a
specified backend
set dynamic-cookie-key backend <backend> that lets one change the dynamic
cookie secret key, for a specified backend.
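For example (assuming a backend named bk_web and the usual stats socket path):
$ echo "enable dynamic-cookie backend bk_web" | socat /var/run/haproxy.stat stdio
$ echo "set dynamic-cookie-key backend bk_web mynewkey" | socat /var/run/haproxy.stat stdio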
This adds a new "dynamic" keyword for the cookie option. If set, a cookie
will be generated for each server (assuming one isn't already provided on
the "server" line), from the IP of the server, the TCP port, and a secret
key provided. To provide the secret key, a new keyword has been added,
"dynamic-cookie-key", for backends.
Example :
backend bk_web
balance roundrobin
dynamic-cookie-key "bla"
cookie WEBSRV insert dynamic
server s1 127.0.0.1:80 check
server s2 192.168.56.1:80 check
This is a first step to be able to dynamically add and remove servers,
without modifying the configuration file, and still have all the load
balancers redirect the traffic to the right server.
Provide a way to generate session cookies, based on the IP address of the
server, the TCP port, and a secret key provided.
Matthias Fechner reported a regression in 1.7.3 brought by the backport
of commit 819efbf ("BUG/MEDIUM: tcp: don't poll for write when connect()
succeeds"), causing some connections to fail to establish once in a while.
While this commit itself was a fix for a bad sequencing of connection
events, it in fact unveiled a much deeper bug going back to the connection
rework era in v1.5-dev12 : 8f8c92f ("MAJOR: connection: add a new
CO_FL_CONNECTED flag").
It's worth noting that in a lab reproducing a similar environment as
Matthias', only about 1 in every 19000 connections exhibits this behaviour,
making the issue not so easy to observe. A trick to make the problem
more observable consists in disabling non-blocking mode on the socket
before calling connect() and re-enabling it later, so that connect()
always succeeds. Then it becomes 100% reproducible.
The problem is that this CO_FL_CONNECTED flag is tested after deciding to
call the data layer (typically the stream interface but might be a health
check as well), and that the decision to call the data layer relies on a
change of one of the flags covered by the CO_FL_CONN_STATE set, which is
made of CO_FL_CONNECTED among others.
Before the fix above, this bug couldn't appear with TCP but it could
appear with Unix sockets. Indeed, connect() was always considered
blocking so the CO_FL_WAIT_L4_CONN connection flag was always set, and
polling for write events was always enabled. This used to guarantee that
the conn_fd_handler() could detect a change among the CO_FL_CONN_STATE
flags.
Now with the fix above, if a connect() immediately succeeds for non-ssl
connection with send-proxy enabled, and no data in the buffer (thus TCP
mode only), the CO_FL_WAIT_L4_CONN flag is not set, the lack of data in
the buffer doesn't enable polling flags for the data layer, the
CO_FL_CONNECTED flag is not set due to send-proxy still being pending,
and once send-proxy is done, its completion doesn't cause the data layer
to be woken up due to the fact that CO_FL_CONNECTED is still not present
and that the CO_FL_SEND_PROXY flag is not watched in CO_FL_CONN_STATE.
Then no progress is made when data are received from the client (and
attempted to be forwarded), because a CF_WRITE_NULL (or CF_WRITE_PARTIAL)
flag is needed for the stream-interface state to turn from SI_ST_CON to
SI_ST_EST, allowing ->chk_snd() to be called when new data arrive. And
the only way to set this flag is to call the data layer of course.
After the connect timeout, the connection gets killed and if in the mean
time some data have accumulated in the buffer, the retry will succeed.
This patch fixes this situation by simply placing the update of
CO_FL_CONNECTED where it should have been, before the check for a flag
change needed to wake up the data layer and not after.
This fix must be backported to 1.7, 1.6 and 1.5. Versions not having
the patch above are still affected for unix sockets.
Special thanks to Matthias Fechner who provided a very detailed bug
report with a bisection designating the faulty patch, and to Olivier
Houchard for providing full access to a pretty similar environment where
the issue could first be reproduced.
This may be used to output the JSON schema which describes the output of
show info json and show stats json.
The JSON output is without any extra whitespace in order to reduce the
volume of output. For human consumption passing the output through a
pretty printer may be helpful.
e.g.:
$ echo "show schema json" | socat /var/run/haproxy.stat stdio | \
python -m json.tool
The implementation does not generate the schema. Some consideration could
be given to integrating the output of the schema with the output of
typed and json info and stats. In particular the types (u32, s64, etc...)
and tags.
A sample verification of show info json and show stats json using
the schema is as follows. It uses the jsonschema python module:
cat > jschema.py << __EOF__
import json
from jsonschema import validate
from jsonschema.validators import Draft3Validator
with open('schema.txt', 'r') as f:
schema = json.load(f)
Draft3Validator.check_schema(schema)
with open('instance.txt', 'r') as f:
instance = json.load(f)
validate(instance, schema, Draft3Validator)
__EOF__
$ echo "show schema json" | socat /var/run/haproxy.stat stdio > schema.txt
$ echo "show info json" | socat /var/run/haproxy.stat stdio > instance.txt
$ python ./jschema.py
$ echo "show stats json" | socat /var/run/haproxy.stat stdio > instance.txt
$ python ./jschema.py
Signed-off-by: Simon Horman <horms@verge.net.au>
Add a json parameter to show (info|stat) which will output information
in JSON format. A follow-up patch will add a JSON schema which describes
the format of the JSON output of these commands.
The JSON output is without any extra whitespace in order to reduce the
volume of output. For human consumption passing the output through a
pretty printer may be helpful.
e.g.:
$ echo "show info json" | socat /var/run/haproxy.stat stdio | \
python -m json.tool
STAT_STARTED has been added in order to track if show output has begun or
not. This is used in order to allow the JSON output routines to only insert
a "," between elements when needed. I would value any feedback on how this
might be done better.
Signed-off-by: Simon Horman <horms@verge.net.au>
Given that all call places except one had to set txn->status prior to
calling http_server_error(), it's simpler to make this function rely
on txn->status than have it store it from an argument.
This commit removes the second argument (msgnum) from http_error_message and
changes http_error_message to use s->txn->status/http_get_status_idx for
mapping status codes from 200..504 to HTTP_ERR_200..HTTP_ERR_504 (enum).
This is needed for http-request tarpit deny_status commit.
It adds "hostname" as a new sample fetch. It does exactly the same as
"%H" in a log format except that it can be used outside of log formats.
Signed-off-by: Nenad Merdanovic <nmerdan@haproxy.com>
2 places were using an open-coded implementation of this function to count
available servers. Note that the avg_queue_size() fetch didn't check that
the proxy was in STOPPED state so it would possibly return a wrong server
count here but that wouldn't impact the returned value.
Signed-off-by: Nenad Merdanovic <nmerdan@haproxy.com>
This is like the nbsrv() sample fetch function except that it works as
a converter so it can count the number of available servers of a backend
name retrieved using a sample fetch or an environment variable.
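For example (with a hypothetical backend name):
acl almost_down str(bk_web),nbsrv lt 2
http-request set-header X-Up-Servers %[str(bk_web),nbsrv]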
Signed-off-by: Nenad Merdanovic <nmerdan@haproxy.com>
The said form of the CLI command didn't return anything since commit
ad8be61c7.
This fix needs to be backported to 1.7.
Signed-off-by: Nenad Merdanovic <nmerdan@haproxy.com>
The memory is released by cli_release_mlook, which also properly sets the
pointer to NULL. This was introduced with a big code reorganization
involving moving to the new keyword registration form in commit ad8be61c7.
This fix needs to be backported to 1.7.
Signed-off-by: Nenad Merdanovic <nmerdan@haproxy.com>
An invalid OCSP file (for example an empty one, which can be used to enable
an OCSP response to be set dynamically later) causes errors that are
placed on the OpenSSL error stack. Those errors are not cleared, so
anything that checks this stack later will fail.
Following configuration:
bind :443 ssl crt crt1.pem crt crt2.pem
With following files:
crt1.pem
crt1.pem.ocsp - empty one
crt2.pem.rsa
crt2.pem.ecdsa
Will fail to load.
This patch should be backported to 1.7.
Surprisingly, http-request, http-response, block, redirect, and capture
rules did not cause a warning to be emitted when used in a TCP proxy, so
let's fix this.
This patch may be backported to older versions as it helps spotting
configuration issues.
As its name says, this statement customizes the maximum allowed size for frames
exchanged between HAProxy and SPOAs. It should be greater than or equal to 256
and less than or equal to (tune.bufsize - 4) (4 bytes are reserved for the frame
length).
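For instance (assuming this refers to the 'max-frame-size' statement of a
'spoe-agent' section, with a hypothetical agent name):
spoe-agent my-agent
    max-frame-size 16380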
This option can be used to enable or to disable (by prefixing the option line
with the "no" keyword) the sending of fragmented payloads to agents. By default,
this option is enabled.
These options can be used to enable or to disable (by prefixing the option line
with the "no" keyword), respectively, pipelined and asynchronous exchanges
between HAProxy and agents. By default, the pipelining and async options are
enabled.
Now, when a payload is fragmented, the first frame must define the frame type
and the following ones must use the special type SPOE_FRM_T_UNSET. This way, it
is easy to know whether a fragment is the first one or not. Of course, all frames
must still share the same stream-id and frame-id.
Update SPOA example accordingly.
If an agent wants to abort the processing of a fragmented NOTIFY frame before
receiving all fragments, it can send an ACK frame at any time with the ABORT bit
set (and of course, the FIN bit too).
Besides this change, the SPOE_FRM_ERR_FRAMEID_NOTFOUND error flag has been added.
It is set when an unknown ACK frame is received.
Now, agents can announce the support for the "fragmentation" capability during
the HELLO handshake. HAProxy will never announce it because fragmented frame
decoding is not implemented yet. But it can send such kind of frames. So, if an
agent supports this capability, payloads exceeding the frame size will be
split. A fragmented payload consists of several frames with the FIN bit clear,
terminated by a single frame with the FIN bit set. All these frames must
share the same STREAM-ID and FRAME-ID.
Note that an unfragmented payload consists of a single frame with the FIN bit
set. HELLO and DISCONNECT frames cannot be fragmented. This means that only
NOTIFY frames can transport fragmented payloads for now.
The max_frame_size value is negotiated between HAProxy and SPOE agents during
the HELLO handshake. It is a per-connection value. Different SPOE agents can
choose to use different max_frame_size values. So, now, we keep the minimum of
all known max_frame_size values. This minimum is updated when a new connection
to a SPOE agent is opened and when a connection is closed. We use this value as
a limit to encode messages in NOTIFY frames.
This happens when buffer allocation fails. In the SPOE context, buffers are
allocated by streams and SPOE applets at different times. First, by streams,
when messages need to be encoded before sending them in a NOTIFY frame. Then, by
SPOE applets, when an ACK frame is received.
The first case works as expected; we wake up the stream. But for the second one,
we must wake up the waiting SPOE applet.
Now, when option "set-on-error" is enabled, we set a status code representing
the error occurred instead of "true". For values under 256, it represents an
error coming from the engine. Below 256, it reports a SPOP error. In this case,
to retrieve the right SPOP status code, you must remove 256 to this value. Here
are possible values:
* 1: a timeout occurred during the event processing.
* 2: an error was triggered during the ressources allocation.
* 255: an unknown error occurred during the event processing.
* 256+N: a SPOP error occurred during the event processing.
Now, as for peers, we use an opaque pointer to store information related to the
SPOE filter in the appctx structure. This information is now stored in a
dedicated structure (spoe_appctx) and allocated, using a pool, when the applet
is created.
This removes the dependency between applets and the SPOE filter and avoids
needlessly inflating the appctx structure.
Now, HAProxy and agents can announce the support for "pipelining" and/or "async"
capabilities during the HELLO handshake. For now, HAProxy always announces the
support of both. In addition, in its HELLO frames, HAProxy adds the "engine-id"
key. It is a unique string that identifies a SPOE engine.
The "pipelining" capability is the ability for a peer to decouple NOTIFY and ACK
frames. This is a symmetrical capability. To be used, it must be supported by
both HAProxy and agents. Unlike HTTP pipelining, the ACK frames can be sent in
any order, but always on the same TCP connection as the one used for the
corresponding NOTIFY frame.
The "async" capability is similar to pipelining, but here any TCP connection
established between HAProxy and the agent can be used to send ACK frames. If an
agent accepts connections from multiple HAProxy instances, it can use the
"engine-id" value to group TCP connections.
The array of pointers passed to sample_parse_expr was not really an array but a
pointer to a pointer. So it could easily lead to a segfault during configuration
parsing.
During a soft stop, we need to wakeup all SPOE applets to stop them. So we loop
on all proxies, and for each proxy, on all filters. But we must be sure to only
handle SPOE filters here. To do so, we use a specific id.
Use the SSL_set_ex_data/SSL_get_ex_data standard API calls to store the capture.
We need to avoid using internal structures/undocumented calls to try to
control the beast and limit painful compatibility issues.
Bug introduced with "removes SSL_CTX_set_ssl_version call and cleanup CTX
creation": ssl_sock_new_ctx was called before the whole bind line was parsed.
The fix consists of separating the use of default_ctx as the initialization
context of the SSL connection via bind_conf->initial_ctx. initial_ctx contains
all the necessary parameters before performing the selection of the CTX:
default_ctx is processed like the other ctxs, without unnecessary parameters.
This new sample fetch captures the cipher list offered by the client
SSL connection during the client-hello phase. This is useful for
fingerprinting the SSL connection.
BoringSSL doesn't support SSL_CTX_set_ssl_version. To remove this call, the
CTX creation is cleaned up to clarify what is happening. SSL_CTX_new is used to
match the original behavior, in order: force-<method> according to the method
version, then the default method with no-<method> options.
The OPENSSL_NO_SSL3 error message is now in force-sslv3 parsing (as for
force-tls*).
For CTX creation in a bind environment, all CTX settings related to the initial
ctx are aggregated into the ssl_sock_new_ctx function for clarity.
Tests with crt-list have shown that server_method, options and mode are
linked to the initial CTX (default_ctx): all ssl-options are linked to each
bind line and must be removed from crt-list.
Extract from RFC 6066:
"If the server understood the ClientHello extension but does not recognize
the server name, the server SHOULD take one of two actions: either abort the
handshake by sending a fatal-level unrecognized_name(112) alert or continue the
handshake. It is NOT RECOMMENDED to send a warning-level unrecognized_name(112)
alert, because the client's behavior in response to warning-level alerts is
unpredictable. If there is a mismatch between the server name used by the
client application and the server name of the credential chosen by the server,
this mismatch will become apparent when the client application performs the
server endpoint identification, at which point the client application will have
to decide whether to proceed with the communication."
Thanks to Roberto Guimaraes for the bug report, spotted with openssl-1.1.0.
This fix must be backported.
SSL verify and client_CA inherit from the initial ctx (default_ctx).
When a certificate is found, the SSL connection environment must be replaced by
the certificate configuration (via SSL_set_verify and SSL_set_client_CA_list).
This patch uses BoringSSL's callback to analyse the ClientHello before any
handshake to extract key signature capabilities.
A certificate with a better signature (ECDSA before RSA) is chosen
transparently, if the client can support it. RSA and ECDSA certificates can
be declared in a row (without order). This makes it possible to set
different ssl and filter parameters with crt-list.
In 1.4-dev5 when we started to implement keep-alive, commit a9679ac
("[MINOR] http: make the conditional redirect support keep-alive")
added a specific check to support keep-alive on redirect
rules, but only when the location would start with a "/" indicating
the client would come back to the same server.
But nowadays most applications put http:// or https:// in front of
each and every location, and continuing to perform a close there is
counterproductive, especially when multiple objects are fetched at
once from the same origin which redirects them to the correct origin
(eg: after an http to https forced upgrade).
It's about time to get rid of this old trick as it causes more harm
than good in an era where persistent connections are omnipresent.
Special thanks to Ciprian Dorin Craciun for providing convincing
arguments with a pretty valid use case and proposing this draft
patch which addresses the issue he was facing.
This change, although not exactly a bug fix, should be backported
to 1.7 to adapt better to existing infrastructure.
Adrian Fitzpatrick reported that since commit f51658d ("MEDIUM: config:
relax use_backend check to make the condition optional"), typos like "of"
instead of "if" on use_backend rules are not properly detected. The reason
is that the parser only checks for "if" or "unless", otherwise it considers
there's no keyword, making the rule unconditional.
This patch fixes it. It may reveal some rare config bugs for some people,
but will not affect valid configurations.
This fix must be backported to 1.7, 1.6 and 1.5.
Error in the HTTP parser. The function http_get_path() can
return NULL and this case is not caught in the code. So, we
try to dereference a NULL pointer, and a segfault occurs.
These two lines are a useful workaround to prevent the bug:
acl prevent_bug path_beg /
http-request deny if !prevent_bug
This bug fix should be backported to 1.6 and 1.7.
Commit 405ff31 ("BUG/MINOR: ssl: assert on SSL_set_shutdown with BoringSSL")
introduced a regression causing some random crashes apparently due to
memory corruption. The issue is the use of SSL_CTX_set_quiet_shutdown()
instead of SSL_set_quiet_shutdown(), making it use a different structure
and causing the flag to be put who-knows-where.
Many thanks to Jarno Huuskonen who reported this bug early and who
bisected the issue to spot this patch. No backport is needed, this
is 1.8-specific.
Historically, http-response rules couldn't produce errors generating HTTP
responses during their evaluation. This possibility was "implicitly" added with
http-response redirect rules (51d861a4). But, at the time, replace-header rules
were kept untouched. When such a rule failed, the rules processing was just
stopped (like for an accept rule).
Conversely, when a replace-header rule fails on the request, it generates an
HTTP response (400 Bad Request).
With this patch, errors on replace-header rule are now handled in the same way
for HTTP requests and HTTP responses.
This patch should be backported in 1.7 and 1.6.
This is the same fix as the one concerning the redirect rules (0d94576c).
The buffer used to expand the <replace-fmt> argument must be protected to
prevent it being overwritten during build_logline() execution (the function used
to expand the format string).
This patch should be backported in 1.7, 1.6 and 1.5. It relies on commit b686afd
("MINOR: chunks: implement a simple dynamic allocator for trash buffers") for
the trash allocator, which has to be backported as well.
Some users have experienced some troubles using the compression filter when the
HTTP response body length is undefined. They complained about receiving
truncated responses.
In fact, the bug can be triggered if there is at least one filter attached to
the stream but none registered to analyze the HTTP response body. In this case,
when the body length is undefined, data should be forwarded without any
parsing. But, because of a wrong check, we were starting to parse them. Because
this was not expected, the end of the response was not correctly detected and
the response could be truncated. So now, we rely on the HAS_DATA_FILTER macro
instead of the HAS_FILTER one to decide whether to parse the HTTP response body
or not.
Furthermore, in http_response_forward_body, the test to not forward the server
closure to the client has been updated to reflect the conditions listed in the
associated comment.
And finally, in http_msg_forward_body, when the body length is undefined, we
continue parsing it until the server closes the connection, without any check
on filters. So filters can safely stop filtering data during the parsing.
This fix should be backported to 1.7
See 4b788f7d34
If we use the action "http-request redirect" with a Lua sample-fetch or
converter, and the Lua function calls one of the Lua log functions, the
header name is corrupted: it contains an extract of the last logged data.
This is due to an overwrite of the trash buffer, because its scope is not
respected in the "add-header" function. The scope of the trash buffer must
be limited to the function using it. The build_logline() function can
execute a lot of other functions which can use the trash buffer.
This patch fixes the usage of the trash buffer. It limits the scope of this
global buffer to the local function: we first build the header value using
build_logline(), and afterwards we store the header name.
Thanks to Jesse Schulman for the bug report.
This patch must be backported to 1.7, 1.6 and 1.5, and it relies
on commit b686afd ("MINOR: chunks: implement a simple dynamic allocator for
trash buffers") for the trash allocator, which has to be backported as well.
The trash buffers are becoming increasingly complex to deal with due to
the code's modularity allowing some functions to be chained and causing
the same chunk buffers to be used multiple times along the chain, possibly
corrupting each other. In fact the trash were designed from scratch for
explicitly not surviving a function call but string manipulation makes
this impossible most of the time while not fullfilling the need for
reliable temporary chunks.
Here we introduce the ability to allocate a temporary trash chunk which
is reserved, so that it will not conflict with the trash chunks other
functions use, and will even support reentrant calls (eg: build_logline).
For this, we create a new pool which is exactly the size of a usual chunk
buffer plus the size of the chunk struct, so that these chunks, when allocated,
are exactly the same size as the ones returned by get_trash_buffer(). Allocation
of these chunks may fail, so the caller must check them, and the caller is also
responsible for freeing them.
The code focuses on minimal changes and ease of reliable backporting
because it will be needed in stable versions in order to support next
patch.
UDP sockets used to send DNS queries are created before the fork happens, and
this is a big problem because all the processes (in case of a
configuration starting multiple processes) share the same socket. Some
processes may consume responses dedicated to another one, some servers
may be disabled, some IPs changed, etc...
As a workaround, this patch closes the existing socket and creates a new
one after the fork() has happened.
[wt: backport this to 1.7]
The function dns_init_resolvers() is used to initialize the sockets used to
send DNS queries.
This patch gives the function the ability to close a socket before
re-opening it.
[wt: this needs to be backported to 1.7 for next fix]
This patch changes the names, prefixing them with a "_". So "end" becomes "_end".
Backward compatibility with names without the "_" prefix is preserved.
Alternatively, the keyword "end" can still be used like this: Map['end'].
Thanks to Robin H. Johnson for the bug report.
This should be backported to versions 1.6 and 1.7
There's a test after a successful synchronous connect() consisting
in waking the data layer up asap if there's no more handshake.
Unfortunately this test is run before setting the CO_FL_SEND_PROXY
flag and before the transport layer adds its own flags, so it can
indicate a willingness to send data while it's not the case and it
will have to be handled later.
This has no visible effect except a useless call to a function in
case of health checks making use of the proxy protocol for example.
Additionally a corner case where EALREADY was returned and considered
equivalent to EISCONN was fixed so that it's considered equivalent to
EINPROGRESS given that the connection is not complete yet. But this
code should never return on the first call anyway so it's mostly a
cleanup.
This fix should be backported to 1.7 and 1.6 at least to avoid
headaches during some debugging.
While testing a tcp_fastopen related change, it appeared that in the rare
case where connect() can immediately succeed, we still subscribe to write
notifications on the socket, causing the conn_fd_handler() to immediately
be called and a second call to connect() to be attempted to double-check
the connection.
In fact this issue had already been met with unix sockets (which often
respond immediately) and partially but incorrectly addressed, so another
patch will follow. But for TCP nothing was done.
The fix consists in removing the WAIT_L4_CONN flag if connect() succeeds
and to subscribe for writes only if some handshakes or L4_CONN are still
needed. In addition in order not to fail raw TCP health checks, we have
to continue to enable polling for data when nothing is scheduled for
leaving and the connection is already established, otherwise the caller
will never be notified.
This fix should be backported to 1.7 and 1.6.
A recent patch to support BoringSSL caused this warning to appear on
OpenSSL 1.1.0 :
src/ssl_sock.c:3062:4: warning: statement with no effect [-Wunused-value]
It's caused by SSL_CTX_set_ecdh_auto() which is now only a macro testing
that the last argument is zero, and the result is not used here. Let's
just kill it for both versions.
Tested with 0.9.8, 1.0.0, 1.0.1, 1.0.2, 1.1.0. This fix may be backported
to 1.7 if the boringssl fix is as well.
Limitations:
. disable force-ssl/tls (need more work)
should be set earlier with SSL_CTX_new (SSL_CTX_set_ssl_version is removed)
. disable generate-certificates (need more work)
introduce SSL_NO_GENERATE_CERTIFICATES to disable generate-certificates.
Cleanup some #ifdef and type related to boringssl env.
This change adds the possibility to change the agent-addr and agent-send
directives via the CLI/socket. Now you can replace servers and their
configuration without reloading/restarting the whole haproxy, so it's a step
in the no-reload/no-restart direction.
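For example (assuming the 'set server' CLI syntax and a backend named bk_web):
$ echo "set server bk_web/srv1 agent-addr 192.168.0.5" | socat /var/run/haproxy.stat stdio
$ echo "set server bk_web/srv1 agent-send ping" | socat /var/run/haproxy.stat stdio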
Depends on #e9602af - agent-addr is implemented there.
Can be backported to 1.7.
This directive adds the possibility to set a different address for agent checks.
With this you can manage server status and weight from a central place.
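For instance (hypothetical addresses and port):
server srv1 10.0.0.1:80 check agent-check agent-port 10000 agent-addr 192.168.0.5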
Can be backported to 1.7.
crt-list is extended to support ssl configuration. You can now have
such a line in a crt-list <file>:
mycert.pem [npn h2,http/1.1]
Supported configuration includes "npn", "alpn", "verify", "ca_file", "crl_file",
"ecdhe", "ciphers" and ssl options.
"crt-base" is also supported to fetch certificates.
When the stream's backend was defined, the request's analyzers flag was always
set to 0 if the stream had no listener. This bug was introduced with the filter
API but never triggered (I think so).
Because of commit 5820a366, it is now possible to encounter it. For
example, this happens when the trace filter is enabled on a SPOE backend. The
fix is pretty trivial.
This fix must be backported to 1.7.
The previous version used an O(number of proxies)^2 algorithm to get the sum of
the number of maxconns of frontends which reference a backend at least once.
This new version adds the frontend's maxconn number to the backend's
struct proxy member 'tot_fe_maxconn' when the backend name is resolved
for switching rules or the default_backend statement. At the end, the final
backend's fullconn is computed looping only once over all proxies, O(n).
The load time of a configuration using a large number of backends (10 thousand)
without a configured fullconn was reduced from several minutes to a few seconds.
The output showing whether prefer-server-ciphers is supported by OpenSSL
actually always shows yes in 1.8, because SSL_OP_CIPHER_SERVER_PREFERENCE
is redefined before the actual check in src/ssl_sock.c, since it was
moved there from src/haproxy.c.
Since this is not really relevant anymore as we don't support OpenSSL
< 0.9.7 anyway, this change just removes this output.
When haproxy is compiled without zlib or slz, the output of
haproxy -vv shows (null).
Make haproxy -vv output great again by providing the proper
information (which is what we did before).
This is for 1.8 only.
With BoringSSL:
SSL_set_shutdown: Assertion `(SSL_get_shutdown(ssl) & mode) == SSL_get_shutdown(ssl)' failed.
"SSL_set_shutdown causes ssl to behave as if the shutdown bitmask (see SSL_get_shutdown)
were mode. This may be used to skip sending or receiving close_notify in SSL_shutdown by
causing the implementation to believe the events already happened.
It is an error to use SSL_set_shutdown to unset a bit that has already been set.
Doing so will trigger an assert in debug builds and otherwise be ignored.
Use SSL_CTX_set_quiet_shutdown instead."
Change logic to not notify on SSL_shutdown when connection is not clean.
"X509_get_pubkey() attempts to decode the public key for certificate x.
If successful it returns the public key as an EVP_PKEY pointer with its
reference count incremented: this means the returned key must be freed
up after use."
This prevents DNS from resolving IPv6-only servers in 1.7. Note, this
patch depends on the previous series :
1. BUG/MINOR: tools: fix off-by-one in port size check
2. BUG/MEDIUM: server: consider AF_UNSPEC as a valid address family
3. MEDIUM: server: split the address and the port into two different fields
4. MINOR: tools: make str2sa_range() return the port in a separate argument
5. MINOR: server: take the destination port from the port field, not the addr
6. MEDIUM: server: disable protocol validations when the server doesn't resolve
This fix (hence the whole series) must be backported to 1.7.
When a server doesn't resolve we don't know the address family so we
can't perform the basic protocol validations. However we know that we'll
ultimately resolve to AF_INET or AF_INET6 so the controls are OK. It is
important to proceed like this, otherwise it will not be possible to start
with unresolved addresses.
Next patch will cause the port to disappear from the address field when servers
do not resolve so we need to take it from the separate field provided by
str2sa_range().
Keeping the address and the port in the same field causes a lot of problems,
specifically on the DNS part where we're forced to cheat on the family to be
able to keep the port. This causes some issues such as some families not being
resolvable anymore.
This patch first moves the service port to a new field "svc_port" so that the
port field is never used anymore in the "addr" field (struct sockaddr_storage).
All call places were adapted (there aren't that many).
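As an illustration, here is a minimal sketch of the resulting layout (the
surrounding fields and the connect-time line are simplified assumptions,
only svc_port is the new field described above) :

    struct server {
        /* ... */
        struct sockaddr_storage addr;  /* address only, no port stored here anymore */
        unsigned int svc_port;         /* the configured service port */
        /* ... */
    };

    /* at connect time, the destination port now comes from svc_port : */
    set_host_port(&conn->addr.to, srv->svc_port);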
The DNS code is written so as to support AF_UNSPEC to decide on the
server family based on responses, but unfortunately snr_resolution_cb()
considers it as invalid causing a DNS storm to happen when a server
arrives with this family.
This situation is not supposed to happen as long as unresolved addresses
are forced to AF_INET, but this will change with the upcoming fixes and
it's possible that it's not granted already when changing an address on
the CLI.
This fix must be backported to 1.7 and 1.6.
port_to_str() checks that the port size is at least 5 characters instead
of at least 6. While in theory it could permit a buffer overflow, it's
harmless because all callers have at least 6 characters here.
This fix needs to be backported to 1.7, 1.6 and 1.5.
http-reuse should normally not be used in conjunction with the proxy
protocol or with "usesrc clientip". While there's nothing fundamentally
wrong with this, whenever these options are used, the server expects the
IP address to be the source address for all requests, which doesn't make
sense with http-reuse.
fc_rcvd_proxy : boolean
Returns true if the client initiated the connection with a PROXY protocol
header.
A flag is added on the struct connection if a PROXY header is successfully
parsed.
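A minimal sketch of how this fits together (the flag value and the fetch
body are illustrative, based on the description above) :

    /* connection flag set by the PROXY protocol parser on success;
     * the value below is only illustrative.
     */
    #define CO_FL_RCVD_PROXY 0x00000800

    /* fc_rcvd_proxy: true if the frontend connection carried a PROXY header */
    static int
    smp_fetch_fc_rcvd_proxy(const struct arg *args, struct sample *smp,
                            const char *kw, void *private)
    {
        struct connection *conn = objt_conn(smp->sess->origin);

        if (!conn)
            return 0;
        smp->data.type = SMP_T_BOOL;
        smp->data.u.sint = !!(conn->flags & CO_FL_RCVD_PROXY);
        return 1;
    }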
The older 'rsprep' directive allows modification of the status reason.
Extend 'http-response set-status' to take an optional string of the new
status reason.
http-response set-status 418 reason "I'm a coffeepot"
Matching updates in Lua code:
- AppletHTTP.set_status
- HTTP.res_set_status
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Commits 5f10ea3 ("OPTIM: http: improve parsing performance of long
URIs") and 0431f9d ("OPTIM: http: improve parsing performance of long
header lines") introduced a bug in the HTTP parser : when a partial
request is read, the first part ends up on an 8-byte boundary (or 4-byte
on 32-bit machines), the end lies in the header field value part, and
the buffer used to contain a CR character exactly after the last block,
then the parser could be confused and read this CR character as being
part of the current request, then switch to a new state waiting for an
LF character. Then when the next part of the request appeared, it would
read the character following what was erroneously mistaken for a CR,
see that it is not an LF and fail on a bad request. In some cases, it
can even be worse and the header following the hole can be improperly
indexed causing all sort of unexpected behaviours like a content-length
being ignored or a header appended at the wrong position. The reason is
that there's no check of the end of parsing just after breaking out of the
loop.
One way to reproduce it is with this config :
global
stats socket /tmp/sock1 mode 666 level admin
stats timeout 1d
frontend px
bind :8001
mode http
timeout client 10s
redirect location /
And sending requests this way :
$ tcploop 8001 C P S:"$(dd if=/dev/zero bs=16384 count=1 2>/dev/null | tr '\000' '\r')"
$ tcploop 8001 C P \
S:"GET / HTTP/1.0\r\nX-padding: 0123456789.123456789.123456789.123456789.123456789.123456789.1234567" P \
S:"89.123456789\r\n\r\n" P
Then a "show errors" on the socket will report :
$ echo "show errors" | socat - /tmp/sock1
Total events captured on [04/Jan/2017:15:09:15.755] : 32
[04/Jan/2017:15:09:13.050] frontend px (#2): invalid request
backend <NONE> (#-1), server <NONE> (#-1), event #31
src 127.0.0.1:59716, session #91, session flags 0x00000080
HTTP msg state 17, msg flags 0x00000000, tx flags 0x00000000
HTTP chunk len 0 bytes, HTTP body len 0 bytes
buffer flags 0x00808002, out 0 bytes, total 111 bytes
pending 111 bytes, wrapping at 16384, error at position 107:
00000 GET / HTTP/1.0\r\n
00016 X-padding: 0123456789.123456789.123456789.123456789.123456789.12345678
00086+ 9.123456789.123456789\r\n
00109 \r\n
This fix must be backported to 1.7.
Many thanks to Aleksey Gordeev and Axel Reinhold for providing detailed
network captures and configurations exhibiting the issue.
debug_hexdump() prints to the requested output stream (typically stdout
or stderr) a hex dump of the blob passed in argument. This is useful
to help debug binary protocols.
Error captures almost always report a state 26 (MSG_ERROR) making it
very hard to know what the parser was expecting. The reason is that
we have to switch to MSG_ERROR to trigger the dump, and then during
the dump we capture the current state which is already MSG_ERROR. With
this change we now copy the current state into an err_state field that
will be reported as the faulty state.
This patch looks a bit large because the parser doesn't update the
current state until it runs out of data so the current state is never
known when jumping to the error label! Thus the code had to be updated
to take copies of the current state before switching to MSG_ERROR based
on the switch/case values.
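Schematically, the error paths now do this (a sketch, not the literal diff) :

    /* before: the exact parser state was lost */
    msg->msg_state = HTTP_MSG_ERROR;

    /* after: remember where the parser really was when it failed */
    msg->err_state = msg->msg_state;
    msg->msg_state = HTTP_MSG_ERROR;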
As a bonus, it now shows the current state in human-readable form and
not only in numeric form ; in the past it was not an issue since it was
always 26 (MSG_ERROR).
At least now we can get usable invalid request/response reports :
[05/Jan/2017:19:28:57.095] frontend f (#2): invalid request
backend <NONE> (#-1), server <NONE> (#-1), event #1
src 127.0.0.1:39894, session #4, session flags 0x00000080
HTTP msg state MSG_RQURI(4), msg flags 0x00000000, tx flags 0x00000000
HTTP chunk len 0 bytes, HTTP body len 0 bytes
buffer flags 0x00908002, out 0 bytes, total 20 bytes
pending 20 bytes, wrapping at 16384, error at position 5:
00000 GET /\e HTTP/1.0\r\n
00017 \r\n
00019 \n
[05/Jan/2017:19:28:33.827] backend b (#3): invalid response
frontend f (#2), server s1 (#1), event #0
src 127.0.0.1:39718, session #0, session flags 0x000004ce
HTTP msg state MSG_HDR_NAME(17), msg flags 0x00000000, tx flags 0x08300000
HTTP chunk len 0 bytes, HTTP body len 0 bytes
buffer flags 0x80008002, out 0 bytes, total 59 bytes
pending 59 bytes, wrapping at 16384, error at position 31:
00000 HTTP/1.1 200 OK\r\n
00017 Content-length : 10\r\n
00038 \r\n
00040 0a\r\n
00044 0123456789\r\n
00056 0\r\n
This should be backported to 1.7 and 1.6 at least to help with bug
reports.
It is important to define analyzers (AN_REQ_* and AN_RES_*) in the same order
they are evaluated in process_stream. This order is really important because
during analyzers evaluation, we run them in the order of the lower bit to the
higher one. This way, when an analyzer adds/removes another one during its
evaluation, we know if it is located before or after it. So, when it adds an
analyzer which is located before it, we can switch to it immediately, even if it
has already been called once but removed since.
With time, and the introduction of new analyzers, this order was broken. The
main problems come from the filter analyzers: we used values unrelated to
their evaluation order. Furthermore, we used the same values for request and
response analyzers.
So, to fix the bug, filter analyzers have been split in 2 distinct lists to
have different analyzers for the request channel than those for the response
channel. And of course, we have moved them to the right place.
Some other analyzers have been reordered to respect the evaluation order:
* AN_REQ_HTTP_TARPIT has been moved just before AN_REQ_SRV_RULES
* AN_REQ_PRST_RDP_COOKIE has been moved just before AN_REQ_STICKING_RULES
* AN_RES_STORE_RULES has been moved just after AN_RES_WAIT_HTTP
Note that today we have 29 analyzers, all stored into a 32-bit bitfield, so we
can still add 3 more analyzers before having a problem. A good way to fend off the
problem for a while could be to have a different bitfield for request and
response analyzers.
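To see why the order matters, here is a simplified sketch of the evaluation
loop in process_stream() (not the literal code) :

    ana_list = req->analysers;
    while (ana_list) {
        /* one block per analyzer, in increasing bit order */
        if (ana_list & AN_REQ_INSPECT_FE) {
            if (!tcp_inspect_request(s, req, AN_REQ_INSPECT_FE))
                break;
            ana_list &= ~AN_REQ_INSPECT_FE;
        }
        /* ... higher bits follow; if an analyzer sets a bit lower
         * than its own, that bit is still in ana_list and will be
         * evaluated on the next iteration of the loop.
         */
    }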
[wt: all of this must be backported to 1.7, and part of it must be backported
to 1.6 and 1.5]
The registered output type for the sample fetches sc*_get_gpt0
is a boolean, but the value returned is an integer.
This patch fixes the default type to SINT in place of BOOL.
This patch should be backported in 1.6 and 1.7
When using "option prefer-last-server", we may not always stay on
the same backend if option balance told us otherwise.
For example, backend may change in the following cases:
balance hdr()
balance rdp-cookie
balance source
balance uri
balance url_param
[wt: backport this to 1.7 and 1.6]
This adds support for the newest PCRE2 library, which is
more secure than its older sibling at the cost of a
more complex API.
It works pretty similarly to the PCRE part to keep
the overall change smooth, except :
- we define the string class supported at compile time.
- after matching, the ovec data is properly sized, although
we do not take advantage of it here.
- the lack of JIT support is treated less 'dramatically'
as pcre2_jit_compile is in this case a 'no-op'.
In systemd mode (-Ds), the master haproxy process waits for each
child to exit in a specific order. If a process dies when it's not its
turn, it becomes a zombie process until all processes exit.
The master now waits for any process to exit in any order.
This patch should be backported to 1.7, 1.6 and 1.5.
Calling SSL_set_tlsext_host_name() on the current SSL ctx has no effect
if the session is being resumed because the hostname is already stored
in the session and is not advertised again in subsequent connections.
It's visible when enabling SNI and health checks at the same time because
checks do not send an SNI and regular traffic reuses the same connection,
resulting in no SNI being sent.
The only short-term solution is to reset the reused session when the
SNI changes compared to the previous one. It can make the server-side
performance suffer when SNIs are interleaved but it will work. A better
long-term solution would be to keep a small cache of a few contexts for
a few SNIs.
Now with SSL_set_session(ctx, NULL) it works. This needs to be double-
checked though. The man page says that SSL_set_session() frees any previously
existing context. Some people report a bit of breakage when calling
SSL_set_session(NULL) on openssl 1.1.0a (freed session not reusable at
all though it's not an issue for now).
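In short, when the SNI to be sent differs from the one attached to the reused
session, the session is dropped before the handshake. A sketch (variable
names are illustrative) :

    /* <old_sni> is what was recorded with the cached session */
    if (!old_sni || strcmp(old_sni, new_sni) != 0) {
        SSL_set_session(ssl, NULL);  /* forget the session, full handshake */
        SSL_set_tlsext_host_name(ssl, new_sni);
    }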
This needs to be backported to 1.7 and 1.6.
According to nbsrv() documentation this fetcher should return "an
integer value corresponding to the number of usable servers".
In case the backend is disabled, none of its servers is usable, so I believe
the fetcher should return 0.
This patch should be backported to 1.7, 1.6, 1.5.
Historically a lot of SSL global settings were stored into the global
struct, but we've reached a point where there are 3 ifdefs in it just
for this, and others in haproxy.c to initialize it.
This patch moves all the private fields to a new struct "global_ssl"
stored in ssl_sock.c. This includes :
char *crt_base;
char *ca_base;
char *listen_default_ciphers;
char *connect_default_ciphers;
int listen_default_ssloptions;
int connect_default_ssloptions;
int tune.sslprivatecache; /* Force to use a private session cache even if nbproc > 1 */
unsigned int tune.ssllifetime; /* SSL session lifetime in seconds */
unsigned int tune.ssl_max_record; /* SSL max record size */
unsigned int tune.ssl_default_dh_param; /* SSL maximum DH parameter size */
int tune.ssl_ctx_cache; /* max number of entries in the ssl_ctx cache. */
The "tune" part was removed (useless here) and the occasional "ssl"
prefixes were removed as well. Thus for example instead of
global.tune.ssl_default_dh_param
we now have :
global_ssl.default_dh_param
A few initializers were present in the constructor, they could be brought
back to the structure declaration.
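For reference, a sketch of the resulting structure, assuming the renaming
rule above is applied uniformly :

    struct global_ssl {
        char *crt_base;                /* base dir for certificates */
        char *ca_base;                 /* base dir for CAs */
        char *listen_default_ciphers;
        char *connect_default_ciphers;
        int listen_default_ssloptions;
        int connect_default_ssloptions;
        int private_cache;             /* was tune.sslprivatecache */
        unsigned int life_time;        /* was tune.ssllifetime */
        unsigned int max_record;       /* was tune.ssl_max_record */
        unsigned int default_dh_param; /* was tune.ssl_default_dh_param */
        int ctx_cache;                 /* was tune.ssl_ctx_cache */
    };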
A few other entries had to stay in global for now. They concern memory
calculation (used in haproxy.c) and stats (used in stats.c).
The code is already much cleaner now, especially for global.h and haproxy.c
which become readable.
tlskeys_finalize_config() was the only reason for haproxy.c to still
require ifdef and includes for ssl_sock. This one fits perfectly well
in the late initializers so it was changed to be registered with
hap_register_post_check().
Now we can simply check the transport layer at run time and decide
whether or not to initialize or destroy these entries. This removes
other ifdefs and includes from cfgparse.c, haproxy.c and hlua.c.
Now we exclusively use xprt_get(XPRT_RAW) instead of &raw_sock or
xprt_get(XPRT_SSL) for &ssl_sock. This removes a bunch of #ifdef and
include spread over a number of location including backend, cfgparse,
checks, cli, hlua, log, server and session.
There are still a lot of #ifdef USE_OPENSSL in the code (still 43
occurences) because we never know if we can directly access ssl_sock
or not. This patch attacks the problem differently by providing a
way for transport layers to register themselves and for users to
retrieve the pointer. Unregistered transport layers will point to NULL
so it will be easy to check if SSL is registered or not. The mechanism
is very inexpensive as it relies on a two-entries array of pointers,
so the performance will not be affected.
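A minimal sketch of the mechanism (the two-entries array mentioned above) :

    enum {
        XPRT_RAW = 0,
        XPRT_SSL = 1,
        XPRT_ENTRIES /* must be last */
    };

    static struct xprt_ops *registered_xprt[XPRT_ENTRIES];

    /* returns the transport layer, or NULL if not registered (eg: no SSL) */
    static inline struct xprt_ops *xprt_get(int id)
    {
        return registered_xprt[id];
    }

    static inline void xprt_register(int id, struct xprt_ops *xprt)
    {
        if (id < XPRT_ENTRIES)
            registered_xprt[id] = xprt;
    }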
Having it in the ifdef complicates certain operations which require
additional ifdefs just to access a member which could remain zero in
non-ssl cases. Let's move it out, it will not even increase the
struct size on 64-bit machines due to alignment.
Instead of hard-coding all SSL destruction in cfgparse.c and haproxy.c,
we now register this new function as the transport layer's destroy_bind_conf()
and call it only when defined. This removes some non-obvious SSL-specific
code and #ifdefs from cfgparse.c and haproxy.c
Instead of hard-coding all SSL preparation in cfgparse.c, we now register
this new function as the transport layer's prepare_bind_conf() and call it
only when defined. This removes some non-obvious SSL-specific code from
cfgparse.c as well as a #ifdef.
Most of the SSL functions used to have a proxy argument which was mostly
used to be able to emit clean errors using Alert(). First, many of them
were converted to memprintf() and don't require this pointer anymore.
Second, the rare which still need it also have either a bind_conf argument
or a server argument, both of which carry a pointer to the relevant proxy.
So let's now get rid of it, it needlessly complicates the API and certain
functions already have many arguments.
Historically, all listeners have a pointer to the frontend. But since
the introduction of SSL, we now have an intermediary layer called
bind_conf corresponding to a "bind" line. It makes no sense to have
the frontend on each listener given that it's the same for all
listeners belonging to a same bind_conf. Also certain parts like
SSL can only operate on bind_conf and need the frontend.
This patch fixes this by moving the frontend pointer from the listener
to the bind_conf. The extra indirection is quite cheap given that the
places where this is used are very scarce.
A mistake was made when the socket layer was cut into proto and
transport, the transport was attached to the listener while all
listeners in a single "bind" line always have exactly the same
transport. It doesn't seem obvious but this is the reason why there
are so many #ifdefs USE_OPENSSL in cfgparse : a lot of operations
have to be open-coded because cfgparse only manipulates bind_conf
and we don't have the information of the transport layer here.
Very little code makes use of the transport layer, mainly session
setup and log. These places can afford an extra pointer indirection
(the listener points to the bind_conf). This change is thus very small,
it saves a little bit of memory (8B per listener) and makes the code
more flexible.
The code currently creates a listener only to ensure that sess->li is
properly populated, and to retrieve the frontend (which is also available
directly from the session).
It turns out that the current infrastructure (for a large part) already
supports not having any listener on a session (since Lua does the same),
except for the following places which were not yet converted :
- session_count_new() : used by session_accept_fd, ie never for spoe
- session_accept_fd() : never used here, an applet initiates the session
- session_prepare_log_prefix() : embryonic sessions only, thus unused
- session_kill_embryonic() : same
- conn_complete_session() : same
- build_log_line() for fields %cp, %fp and %ft : unused here but may change
- http_wait_for_request() and subsequent functions : unused here
Thus for now it's as safe to run SPOE without listener as it is for Lua,
and this was an obstacle against some cleanups of the listener code. The
places above should be plugged so that it becomes safe over the long term
as well.
An alternative in the future might be to create a dummy listener that
outgoing connections could use just to avoid keeping a null here.
The tcp rules may be applied to a TCP stream initiated by applets (spoe,
lua, peers, later H2). These ones do not necessarily have a valid listener
so we must verify the field is not null before updating the stats. For now
there's no way to trigger this bug because lua and peers don't have analysers,
h2 is not implemented and spoe has a dummy listener. But this threatens to
break at any instant.
"scur" was typed as "limit" (FO_CONFIG) and "config value" (FN_LIMIT).
The real types of "scur" are "metric" (FO_METRIC) and "gauge"
(FN_GAUGE). FO_METRIC and FN_GAUGE both have the value 0.
ssl_sock functions don't mark pointers as NULL after freeing them. So
if a "bind" line specifies some SSL settings without the "ssl" keyword,
they will get freed at the end of check_config_validity(), then freed
a second time on exit. Simply mark the pointers as NULL to fix this.
This fix needs to be backported to 1.7 and 1.6.
We have a bug when SSL reuse is disabled on the server side : we reset
the context but do not set it to NULL, causing a multiple free of the
same entry. It seems like this bug cannot appear as-is with the current
code (or the conditions to get it are not obvious) but it did definitely
strike when trying to fix another bug with the SNI which forced a new
handshake.
This fix should be backported to 1.7, 1.6 and 1.5.
This finishes to clean up the zlib-specific parts. It also unbreaks recent
commit b97c6fb ("CLEANUP: compression: use the build options list to report
the algos") which broke USE_ZLIB due to MAXWBITS not being defined anymore
in haproxy.c.
The following keywords were still parsed in cfgparse and were moved
to ssl_sock to remove some #ifdefs :
"tune.ssl.cachesize", "tune.ssl.default-dh-param", "tune.ssl.force-private-cache",
"tune.ssl.lifetime", "tune.ssl.maxrecord", "tune.ssl.ssl-ctx-cache-size".
It's worth mentioning that some of them used to have incorrect sign
checks possibly resulting in some negative values being used. All of
them are now checked for being positive.
This removes 2 #ifdefs and makes the code much cleaner. The controls
are still there and the two parsers have been merged into a single
function ssl_parse_global_ca_crt_base().
It's worth noting that there's still a check to prevent a change when
the value was already specified. This test seems useless and possibly
counter-productive, it may have to be revisited later, but for now it
was implemented identically.
We already had alertif_too_many_args{,_idx}(), but these ones are
specifically designed for use in cfgparse. Outside of it we're
trying to avoid calling Alert() all the time so we need an
equivalent using a pointer to an error message.
These new functions, called too_many_args{,_idx}(), do exactly this.
They don't take the file name nor the line number which they have
no use for but instead they take an optional pointer to an error
message and the pointer to the error code is optional as well.
With (NULL, NULL) they'll simply check the validity and return a
verdict. They are quite convenient for use in isolated keyword
parsers.
These two new functions as well as the previous ones have all been
exported.
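The resulting prototypes look like this (argument names are illustrative) :

    /* Both return non-zero if args has more than <maxarg> arguments.
     * <msg> and <err_code> may be NULL, in which case only the verdict
     * is returned. <index> shifts the position of the first argument.
     */
    int too_many_args(int maxarg, char **args, char **msg, int *err_code);
    int too_many_args_idx(int maxarg, int index, char **args,
                          char **msg, int *err_code);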
We replaced global.deviceatlas with global_deviceatlas since there's no need
to store all this into the global section. This removes the last #ifdefs,
and now the code is 100% self-contained in da.c. The file da.h was now
removed because it was only used to load dac.h, which is more easily
loaded directly from da.c. It provides another good example of how to
integrate code in the future without touching the core parts.
We replaced global._51degrees with global_51degrees since there's no need
to store all this into the global section. This removes the last #ifdefs,
and now the code is 100% self-contained in 51d.c. The file 51d.h was now
removed because it was only used to load 51Degrees.h, which is more easily
loaded from 51d.c. It provides a good example of how to integrate code in
the future without touching the core parts.
We replaced global.wurfl with global_wurfl since there's no need to store
all this into the global section. This removes the last #ifdefs, and now
the code is 100% self-contained in wurfl.c. It provides a good example of
how to integrate code in the future without touching the core parts.
deinit_51degrees() is not called anymore from haproxy.c, removing
2 #ifdefs and one include. The function was made static. The include
file still includes 51Degrees.h which is needed by global.h and 51d.c
so it was not touched beyond this last function removal.
By registering the deinit function we avoid another #ifdef in haproxy.c.
The ha_wurfl_deinit() function has been made static and unexported. Now
proto/wurfl.h is totally empty, the code being self-contained in wurfl.c,
so the useless .h has been removed.
The 3 device detection engines stop at the same place in deinit()
with the usual #ifdefs. Similar to the other functions we can have
some late deinitialization functions. These functions do not return
anything however so we have to use a different type.
Instead of having a #ifdef in the main init code we now use the registered
init functions. Doing so also enables error checking as errors were previously
reported as alerts but ignored. Also they were incorrect as the 'status'
variable was hidden by a second one and was always reporting DA_SYS (which
is apparently an error) in every case including the case where no file was
loaded. The init_deviceatlas() function was unexported since it's not used
outside of this place anymore.
This removes some #ifdefs from the main haproxy code path. Function
init_51degrees() now returns ERR_* instead of exit(1) on error, and
this function was made static and is not exported anymore.
This removes some #ifdefs from the main haproxy code path and enables
error checking. The current code only makes use of warnings even for
some errors that look serious. While this choice is questionable, it
has been kept as-is, and only the return codes were adapted to ERR_WARN
to at least report that some warnings were emitted. ha_wurfl_init() was
unexported as it's not needed anymore.
Instead of calling the checks directly from the init code, we now
register the start_checks() function to be run at this point. This
also allows to unexport the check init function and to remove one
include from haproxy.c.
There's a significant amount of late initialization calls which are
performed after the point where we exit in check mode. These calls
are used to allocate resources and perform certain slow operations.
Let's have a way to register some functions which need to be called
there instead of having this multitude of #ifdef in the init path.
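A sketch of such a registration mechanism, in line with the other
hap_register_* helpers of this series (simplified) :

    struct post_check_fct {
        struct list list;
        int (*fct)();   /* returns a composition of ERR_* flags */
    };

    static struct list post_check_list = LIST_HEAD_INIT(post_check_list);

    /* used by constructors to queue a function to run after config checks */
    void hap_register_post_check(int (*fct)())
    {
        struct post_check_fct *pcf = calloc(1, sizeof(*pcf));

        if (!pcf) {
            fprintf(stderr, "out of memory\n");
            exit(1);
        }
        pcf->fct = fct;
        LIST_ADDQ(&post_check_list, &pcf->list);
    }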
This removes 2 #ifdef, an include, an ugly construct and a wild "extern"
declaration from haproxy.c. The message indicating that compression is
*not* enabled is not there anymore.
Many extensions now report some build options to ease debugging, but
this is now being done at the expense of code maintainability. Let's
provide a registration function to do this so that we can start to
remove most of the #ifdefs from haproxy.c (18 currently just for a
single function).
If the memory allocator fails, it returns an error code and the execution
continues. If the Lua/cli initializer fails, the allocated struct is not
released.
If the Lua/cli initializer fails, it returns an OK status and the
execution continues. This will probably cause a segfault.
This patch should be backported in 1.7
This is a regression from the commit a73e59b690.
When data are sent from a cosocket, the action is done in the context of the
applet running a Lua stack and not in the context of the applet owning the
cosocket. So we must take care, explicitly, that this last applet has a buffer
(the req buffer of the cosocket).
This patch must be backported in 1.7
In the case of an applet, the Lua context is taken from the session
when we get the private values. This patch just updates the comments
associated to this action because it is not obvious.
This one now migrates to the general purpose cli.p0 for the ref pointer,
cli.i0 for the dump_all flag and cli.i1 for the dump_keys_index. A few
comments were added.
The applet.h file doesn't depend on openssl anymore. It's worth noting
that the previous dependency was accidental and only used to work because
all files including this one used to have openssl included prior to
loading this file.
This one now migrates to the general purpose cli.p0 for the proxy pointer,
cli.p1 for the server pointer, and cli.i0 for the proxy's instance if only
one has to be dumped.
Most of the keywords don't need to have their own entry in the appctx
union, they just need to reuse some generic pointers like we've been
used to do in the appctx with st{0,1,2}. This patch adds p0, p1, i0, i1
and initializes them to zero before calling the parser. This way some
of the simplest existing keywords will be able to disappear from the
union.
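Schematically, the appctx now looks like this (simplified sketch) :

    struct appctx {
        /* ... st0, st1, st2, io_handler, ... */
        union {
            struct {
                void *p0, *p1; /* general purpose pointers for keywords */
                int i0, i1;    /* general purpose ints, zeroed before parsing */
            } cli;
            /* ... keyword-specific contexts not converted yet ... */
        } ctx;
    };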
It's worth noting that this is an extension to what was initially
attempted via the "private" member that I removed a few patches ago by
not understanding how it was supposed to be used. Here the fact that
we share the same union will force us to be stricter: the code either
uses the general purpose variables or it uses its own fields but not
both.
This is a leftover from the cleanup campaign, the stats scope was still
initialized by the CLI instead of being initialized by the stats keyword
parsers. This should probably be backported to 1.7 to make the code more
consistent.
Sometimes a registered keyword will not need any specific parsing nor
initialization, so it's annoying to have to write an empty parsing
function returning zero just for this.
This patch makes it possible to automatically call a keyword's I/O
handler when the parsing function is not defined, while still allowing
a parser to set the I/O handler itself.
The signals system embedded in Lua can be transformed into general purpose
signals code. To reach this goal, this patch removes the Lua part of the
signals.
This is an easy job, because Lua makes very little use of the signals code;
just two prototypes change.
The struct hlua size is 128 bytes, which is the biggest of all the elements
of the union embedded in the appctx struct. With HTTP/2, it is possible that
this appctx struct will be used many times for each connection, so the 128
bytes are a little bit heavy for the global memory consumption.
This patch replaces the embedded hlua struct with a pointer and an associated
memory pool. Now, the memory for Lua is allocated only if it is required.
[wt: the appctx is now down to 160 bytes]
Another small bug in "show cli sockets" made the last fix always report
process 64 due to a signedness issue in the shift operation when building
the mask.
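The pitfall is the classic one below (variable names are illustrative) :

    /* with a signed 32-bit constant, the shift overflows/sign-extends
     * for high process numbers and corrupts the mask :
     */
    mask = 1 << (proc - 1);   /* wrong */
    mask = 1UL << (proc - 1); /* right */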
Just like previous patch, this was the only other user of the "private"
field of the applet. It used to store a copy of the keyword's action.
Let's just put it into ->table->action and use it from there. It also
slightly simplifies the code by removing a few pointer to integer casts.
We have very few users of the appctx's private field which was introduced
prior to the split of the CLI. Unfortunately it was not removed after the
end. This commit simply introduces hlua_cli->fcn which is the pointer to
the Lua function that the Lua code used to store in this private pointer.
While developing an experimental applet performing only one read per full
line, it appeared that it would be woken up for the client's close, not
read all data (missing LF), then wait for a subsequent call, and would only
be woken up on client timeout to finish the read. The reason is that we
preset SI_FL_WAIT_DATA in the stream-interface's flags to avoid a fast loop,
but there's nothing which can remove this flag until there's a read operation.
We must definitely remove it in stream_int_notify() each time we're called
with CF_SHUTW_NOW because we know there will be no more subsequent read
and we don't want an applet which keeps the WANT_GET flag to block on this.
This fix should be backported to 1.7 and 1.6 though it's uncertain whether
cli, peers, lua or spoe really are affected there.
This problem is already detected here:
8dc7316a6f
Another case arises. Now HAProxy sends a final message (typically
with "http-request deny"). Once the message is sent, the response
channel flags are not modified.
HAProxy executes a Lua sample-fetch for building logs, and the
result is ignored because the response flag remains set to the value
HTTP_MSG_RPBEFORE. So the Lua function hlua_check_proto() wants to
guarantee the valid state of the buffer and asks for aborting the
request.
The function check_proto() is not the right way to ensure request
consistency. The real question is not "Is the message valid?", but
"Is the validity of the message unchanged?"
This patch memorizes the parser state before entering the Lua
code, and performs a check when leaving it. If the parser state
regressed, the request is aborted because the HTTP message is
degraded.
This patch should be backported in versions 1.6 and 1.7
Fixing the build using LibreSSL as OpenSSL implementation.
Currently, LibreSSL 2.4.4 provides the same API as OpenSSL 1.0.1x,
but it redefines the OpenSSL version number as 2.0.x, breaking all
checks for OpenSSL 1.1.x.
The patch solves the issue by checking the definition of the symbol
LIBRESSL_VERSION_NUMBER when OpenSSL 1.1.x features are requested.
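The check typically takes this shape (a sketch) :

    /* LibreSSL reports OPENSSL_VERSION_NUMBER as 2.0.x while providing a
     * 1.0.1-level API, so OpenSSL 1.1.x code paths must exclude it :
     */
    #if (OPENSSL_VERSION_NUMBER >= 0x1010000fL) && !defined(LIBRESSL_VERSION_NUMBER)
    /* OpenSSL >= 1.1.0 only */
    #endif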
When an entity tries to get a buffer, if it cannot be allocated, for example
because the number of buffers which may be allocated per process is limited,
this entity is added in a list (called <buffer_wq>) and waits for an available
buffer.
Historically, the <buffer_wq> list was logically attached to streams because
they were the only entities likely to be added in it. Now, applets can also be
waiting for a free buffer. And with filters, we could imagine having more
entities waiting for a buffer. So it makes sense to have a generic list.
Anyway, with the current design there is a bug. When an applet failed to get a
buffer, it would wait. But we added the stream attached to the applet in
<buffer_wq>, instead of the applet itself. So when a buffer became available,
we woke up the stream and not the waiting applet. So, it was possible to have
waiting applets that were never awakened.
So, now, <buffer_wq> is independent from streams. And we really add the waiting
entity in <buffer_wq>. To be generic, the entity is responsible for defining
the callback used to awaken it.
In addition, applets will still request an input buffer when they become
active. But they will not be put to sleep anymore if no buffer is available. It
is then the responsibility of the applet I/O handler to check whether this
buffer is allocated or not. This way, an applet can decide if this buffer is
required or not and can do additional processing if not.
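The generic waiter entry then looks like this (a sketch matching the
description above) :

    struct buffer_wait {
        void *target;              /* waiting entity: stream, applet, ... */
        int (*wakeup_cb)(void *);  /* callback used to awaken the entity */
        struct list list;          /* position in <buffer_wq> */
    };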
[wt: backport to 1.7 and 1.6]
A stream can be awakened for different reasons. During its processing, it can be
early stopped if no buffer is available. In this situation, the reason why the
stream was awakened is lost, because we rely on the task state, which is reset
after each processing loop.
In many cases, that's not a big deal. But it can be useful to accumulate the
task states if the stream processing is interrupted, especially if some filters
need to be called.
To be clearer, here is a simple example:
1) A stream is awakened with the reason TASK_WOKEN_MSG.
2) Because no buffer is available, the processing is interrupted, the stream
is back to sleep. And the task state is reset.
3) Some buffers become available, so the stream is awakened with the reason
TASK_WOKEN_RES. At this step, the previous reason (TASK_WOKEN_MSG) is lost.
Now, the task states are saved for a stream and reset only when the stream
processing is not interrupted. The corresponding bitfield represents the pending
events for a stream. And we use this one instead of the task state during the
stream processing.
Note that TASK_WOKEN_TIMER and TASK_WOKEN_RES are always removed because these
events are always handled during the stream processing.
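Schematically, in process_stream() (a simplified sketch) :

    /* on wakeup: accumulate the reasons instead of relying on t->state */
    s->pending_events |= (t->state & TASK_WOKEN_ANY);

    /* ... analyzers consult s->pending_events during processing ... */

    /* when the pass completes without interruption, reset for next time */
    s->pending_events = 0;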
[wt: backport to 1.7 and 1.6]
<run_queue> is used to track the number of tasks in the run queue and
<run_queue_cur> is a copy used for reporting purposes. These counters have
been renamed, respectively, <tasks_run_queue> and <tasks_run_queue_cur>. So the
naming is consistent between tasks and applets.
[wt: needed for next fixes, backport to 1.7 and 1.6]
As for tasks, 2 counters have been added to track :
* the total number of applets : nb_applets
* the number of active applets : applets_active_queue
[wt: needed for next fixes, to backport to 1.7 and 1.6]
[wt: while it could seem suspicious, the preceding call to
dump_servers_state() indeed flushes the trash in case anything is
emitted. No backport needed though.]
When the option "http-buffer-request" is set, HAProxy sends itself the
"HTTP/1.1 100 Continue" response in order to retrieve the POST content.
When HAProxy forwards the request, it sends the body directly after the
headers. The header "Expect: 100-continue" was sent with the headers.
This header is useless because the body will be sent in all cases, and
the server's response is not removed by haproxy.
This patch removes the header "Expect: 100-continue" if HAProxy sent it
itself.
These 2 patches add the ability to fetch the frontend/backend name in your
logic, so they can be used later to make routing decisions (fe_name) or
taking some actions based on backend which responded to request (be_name).
In our case we needed a fetcher to be able to extract information we
needed from frontend name.
A Lua (http|tcp)-(request|response) action cannot take arguments from the
configuration file. Arguments are useful for executing the action with
a special context.
This patch adds the possibility of passing arguments to an action. It
runs exactly like sample fetches and other Lua wrappers.
Note that this patch implements a 'TODO'.
Variable names are compared only as text; the final '\0' (or the
string length) is not checked. So, the variable name "txn.internal"
matches another one called "txn.int".
This patch fixes this behavior.
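The fix boils down to comparing the length as well as the bytes; an
illustrative sketch :

    /* wrong: prefix comparison only, "txn.int" matches "txn.internal" */
    if (memcmp(var->name, name, len) == 0)
        return var;

    /* right: the stored name must also have exactly the same length */
    if (strlen(var->name) == len && memcmp(var->name, name, len) == 0)
        return var;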
It must be backported in 1.6 and 1.7
By investigating a keep-alive issue with CloudFlare, we[1] found that
when using the 'set-cookie' option in a redirect (302) HAproxy is adding
an extra `\r\n`.
Triggering rule :
`http-request redirect location / set-cookie Cookie=value if [...]`
Expected result :
```
HTTP/1.1 302 Found
Cache-Control: no-cache
Content-length: 0
Location: /
Set-Cookie: Cookie=value; path=/;
Connection: close
```
Actual result :
```
HTTP/1.1 302 Found
Cache-Control: no-cache
Content-length: 0
Location: /
Set-Cookie: Cookie=value; path=/;

Connection: close
```
This extra `\r\n` seems to be harmless with another HAproxy instance in
front of it (sanitizing) or when using a browser. But we confirm that
the CloudFlare NGINX implementation is not able to handle this. It
seems that both 'Content-length: 0' and extra carriage return broke RFC
(to be confirmed).
When looking into the code, this carriage-return was already present in
1.3.X versions but just before closing the connection which was ok I
think. Then, with 1.4.X the keep-alive feature was added and this piece
of code remains unchanged.
[1] all credit for the bug finding goes to CloudFlare Support Team
[wt: the bug was indeed present since the Set-Cookie was introduced
in 1.3.16, by commit 0140f25 ("[MINOR] redirect: add support for
"set-cookie" and "clear-cookie"") so backporting to all supported
versions is desired]
The recent CLI reorganization managed to break these two commands
by having their parser return 1 (indicating an end of processing)
instead of 0 to indicate new calls to the io handler were needed.
Namely the faulty commits are :
69e9644 ("REORG: cli: move show stat resolvers to dns.c")
32af203 ("REORG: cli: move ssl CLI functions to ssl_sock.c")
The fix is trivial and there is no other loss of functionality. Thanks
to Dragan Dosen for reporting the issue and the faulty commits. The
backport is needed in 1.7.
In 1.5-dev20, commit 48bcfda ("MEDIUM: dumpstat: make the CLI parser
understand the backslash as an escape char") introduced support for
backslash on the CLI, but it strips all backslashes in all arguments
instead of only unescaping them, making it impossible to pass a
backslash in an argument.
This will allow us to use a backslash in a command over the socket, eg.
"add acl #0 ABC\\XYZ".
[wt: this should be backported to 1.7 and 1.6]
Commit 5fddab0 ("OPTIM: stream_interface: disable reading when
CF_READ_DONTWAIT is set") improved the connection layer's efficiency
back in 1.5-dev13 by avoiding successive read attempts on an active
FD. But by disabling this on a polled FD, it causes an unpleasant
side effect which is that the FD that was subscribed to polling is
suddenly stopped and may need to be re-enabled once the kernel
starts to slow down on data eviction (eg: saturated server at the
other end, bursty traffic caused by too large maxpollevents).
This behaviour is observable with persistent connections when there
is a large enough connection count so that there's no data in the
early connection and polling is required, because there are then
up to 4 epoll_ctl() calls per request. It's important that the
server is slower than haproxy to cause some delays when reading
response.
The current connection layer as designed in 1.6 with the FD cache
doesn't require this trick anymore, though it still benefits from
it when it saves an FD from being uselessly polled. But compared
to the increased cost of enabling and disabling poll all the time,
it's still better to disable it. In some cases it's possible to
observe a performance increase as high as 30% by avoiding this
epoll_ctl() dance.
In the end we only want to disable it when the FD is speculatively
read and not when it's polled. For this we introduce a new function
__conn_data_done_recv() which is used to indicate that we're done
with recv() and not interested in new attempts. If/when we later
support event-triggered epoll, this function will have to change
a bit to do the same even in the polled case.
A quick test with keep-alive requests run on a dual-core / dual-
thread Atom shows a significant improvement :
single process, 0 bytes :
before: Requests per second: 12243.20 [#/sec] (mean)
after: Requests per second: 13354.54 [#/sec] (mean)
single process, 4k :
before: Requests per second: 9639.81 [#/sec] (mean)
after: Requests per second: 10991.89 [#/sec] (mean)
dual process, 0 bytes (unstable) :
before: Requests per second: 16900-19800 ~ 17600 [#/sec] (mean)
after: Requests per second: 18600-21400 ~ 20500 [#/sec] (mean)
In 1.6-dev2, commit 32990b5 ("MEDIUM: session: remove the task pointer
from the session") introduced a bug which can sometimes crash the process
on resource shortage. When stream_complete() returns -1, it has already
reattached the connection to the stream, then kill_mini_session() is
called and still expects to find the task in conn->owner. Note that
since this commit, the code has moved a bit and is now in stream_new()
but the problem remains the same.
Given that we already know the task around these places, let's simply
pass the task to kill_mini_session().
The conditions currently at risk are :
- failure to initialize filters for the new stream (lack of memory or
any filter returning < 0 on attach())
- failure to attach filters (any filter returning < 0 on stream_start())
- frontend's accept() returning < 0 (allocation failure)
This fix is needed in 1.7 and 1.6.
This allows a filter to start to analyze data in HTTP and to fall back to TCP when
data are tunneled.
[wt: backport desired in 1.7 - no impact right now but may impact the ability
to backport future fixes]
These 2 analyzers are responsible for the data forwarding in, respectively, HTTP
mode and TCP mode. Now, the analyzer responsible of the HTTP data forwarding is
called before the one responsible of the TCP data forwarding. This will allow
the filtering of tunneled data in HTTP.
[wt: backport desired in 1.7 - no impact right now but may impact the ability
to backport future fixes]
In HAProxy 1.6, When "http-tunnel" option is enabled, HTTP transactions are
tunneled as soon as possible after the headers parsing/forwarding. When the
transfer length of the response can be determined, this happens when all data
are forwarded. But for responses with an undetermined transfer length this
happens when headers are forwarded. This behavior is questionable, but this is
not the purpose of this fix...
In HAProxy 1.7, the first use-case works like in 1.6. But the second one does
not, because of the data filtering. HAProxy was always trying to forward data
until the server closed the connection, so the transaction was never switched
to tunnel mode. This is the expected behavior when there is a data filter. But
in the default case (no data filter), it should work like in 1.6.
This patch fixes the bug. We analyze response data until the server closes the
connection only when there is a data filter.
[wt: backport needed in 1.7]
When a 2xx response to a CONNECT request is returned, the connection must be
switched to tunnel mode immediately after the headers, and Transfer-Encoding and
Content-Length headers must be ignored. So from the HTTP parser point of view,
there is no body.
The bug comes from the fact the flag HTTP_MSGF_XFER_LEN was not set on the
response (This flag means that the body size can be determined. In our case, it
can, it is 0). So, during data forwarding, the connection was never switched
to tunnel mode and we were blocked in a state where we were waiting for the
server to close the connection to end the response.
Setting the flag HTTP_MSGF_XFER_LEN on the response fixed the bug.
The code of http_wait_for_response has been slightly updated to be more
readable.
[wt: 1.7-only, this is not needed in 1.6]
When a backend doesn't use any known LB algorithm, backend_lb_algo_str()
returns NULL. It used to cause "nil" to be printed in the stats dump
since version 1.4 but causes 1.7 to try to parse this NULL to encode
it as a CSV string, causing a crash on "show stat" in this case.
The only situation where this can happen is when "transparent" or
"dispatch" are used in a proxy, in which case the LB algorithm is
BE_LB_ALGO_NONE. Thus now we explicitly report "none" when this
situation is detected, and we preventively report "unknown" if any
unknown algorithm is detected, which may happen if such an algo is
added in the future and the function is not updated.
This fix must be backported to 1.7 and may be backported as far as
1.4, though it has less impact there.
Historically we used to have the stick counters processing put into
session.c which became stream.c. But a big part of it is now in
stick-table.c (eg: converters) but despite this we still have all
the sample fetch functions in stream.c
These parts do not depend on the stream anymore, so let's move the
remaining chunks to stick-table.c and have cleaner files.
What remains in stream.c is everything needed to attach/detach
trackers to the stream and to update the counters while the stream
is being processed.
There's no more reason to keep tcp rules processing inside proto_tcp.c
given that there is nothing in common there except these 3 letters : tcp.
The tcp rules are in fact connection, session and content processing rules.
Let's move them to "tcp-rules" and let them live their life there.
There are 8 functions each repeating what another does and adding one
extra test. We used to have some copy-paste issues in the past due to
this. Instead we now make them simply rely on the previous one and add
the final test. It's much better and much safer. The functions could
be moved to inlines but they're used at a few other locations only,
it didn't make much sense in the end.
We used to have 3 types of counters with a huge overlap :
- listener counters : stats collected for each bind line
- proxy counters : union of the frontend and backend counters
- server counters : stats collected per server
It happens that quite a good part was common between listeners and
proxies due to the frontend counters being updated at the two locations,
and that similarly the server and proxy counters were overlapping and
being updated together.
This patch cleans this up to propose only two types of counters :
- fe_counters: used by frontends and listeners, related to
incoming connections activity
- be_counters: used by backends and servers, related to outgoing
connections activity
This allowed to remove some non-sensical counters from both parts. For
frontends, the following entries were removed :
cum_lbconn, last_sess, nbpend_max, failed_conns, failed_resp,
retries, redispatches, q_time, c_time, d_time, t_time
For backends, this one was removed : intercepted_req.
While doing this it was discovered that we used to incorrectly report
intercepted_req for backends in the HTML stats, which was always zero
since it's never updated.
Also it revealed a few inconsistencies (which were not fixed as they
are harmless). For example, backends count connections (cum_conn)
instead of sessions while servers count sessions and not connections.
Over the long term, some extra cleanups may be performed by having
some counters update functions touching both the server and backend
at the same time, as well as both the frontend and listener, to
ensure that all sides have all their stats properly filled. The stats
dump will also be able to factor the dump functions by counter types.
When dealing with many proxies, it's hard to spot response errors because
all internet-facing frontends constantly receive attacks. This patch now
makes it possible to demand that only request or response errors are dumped
by appending "request" or "response" to the show errors command.
The log format function emits its own error messages using Alert(). This
patch replaces this behavior and uses the standard HAProxy error system
(with memprintf).
The benefits are:
- cleaning the log system
- the logformat can ignore the caller (actually the caller must set
a flag designating the caller function).
- Make the usage of the logformat function easy for future components.
For tokenizing a string, standard Lua recommends using regexes.
The following example splits words:
for i in string.gmatch(example, "%S+") do
print(i)
end
This is a little bit overkill for simply splitting words. This patch
adds a tokenize function which is quick and does not use regexes.
gcc 3.4.6 noticed a possibly uninitialized variable in vars.c, and while it
cannot happen the way the function is used, it's surprising that newer
versions did not report it.
This fix may be backported to 1.6.
This patch takes into account the return code of the parse_logformat_string()
function. Now the configuration parser will fail if the log_format is not
strictly valid.
Until now, the function parse_logformat_string() never failed. It
sent warnings when it parsed a bad format, and returned a best-effort
expression.
This patch replaces the warnings with alerts and returns a failure code.
Maybe the warning mode was designed for compatibility with old
configuration versions. If that is the case, this compatibility is now
broken.
[wt: no, the reason is that an alert must cause a startup failure,
but this will be OK with next patch]
The log-format function parse_logformat_string() takes file and line
for building parsing logs. These two parameters are embedded in the
struct proxy curproxy, which is the current parsing context.
This patch removes these two unused arguments.
This patch changes the successful return code from 0 to 1 and the
error code from 1 to 0.
The return code of this function is actually unused, so this
patch does not modify the behaviour.
This patch changes the successful return code from 0 to 1 and the
error code from -1 to 0.
The return code of this function is actually unused, so this
patch does not modify the behaviour.
The caller must log location information, so this information is
provided two times in the log line. The error log is like this:
[ALERT] 327/011513 (14291) : parsing [o3.conf:38]: 'http-response
set-header': Sample fetch <method,json(rrr)> failed with : invalid
args in conv method 'json' : Unexpected input code type at file
'o3.conf', line 38. Allowed value are 'ascii', 'utf8', 'utf8s',
'utf8p' and 'utf8ps'.
This patch removes the second location indication, and the same error
becomes:
[ALERT] 327/011637 (14367) : parsing [o3.conf:38]: 'http-response
set-header': Sample fetch <method,json(rrr)> failed with : invalid
args in conv method 'json' : Unexpected input code type. Allowed
value are 'ascii', 'utf8', 'utf8s', 'utf8p' and 'utf8ps'.
stats_sock_parse_request() was renamed cli_parse_request(). It now takes
an appctx instead of a stream interface, and presets ->st2 to 0 so that
most handlers will not have to set it anymore. The io_handler is set by
default to the keyword's IO handler so that the parser can simply change
it without having to rewrite the new state.
All 4 rate-limit settings were handled at once given that exactly the
same checks are performed on them. In case of missing or incorrect
argument, the detailed supported options are printed with their use
case.
This was the last specific entry in the CLI parser, some additional
cleanup may still be done.
Also mention that "set server" is preferred now. Note that these
were the last enable/disable commands in cli.c. Also remove the
now unused expect_server_admin() function.
It could be argued that it's between server, stream and session but
at least due to the fact that it operates on streams, its best place
is in stream.c.
This way we don't have any more state specific to a given yieldable
command. The other commands should be easier to move as they only
involve a parser.
It really belongs to proto_http.c since it's a dump for HTTP request
and response errors. Note that it's possible that some parts do not
need to be exported anymore since it really is the only place where
errors are manipulated.
The table dump code was a horrible mess, with common parts interleaved
all the way to deal with the various actions (set/clear/show). A few
error messages were still incorrect, as the "set" operation did not
update them so they would still report "unknown action" (now fixed).
The action was now passed as a private argument to the CLI keyword
which itself is copied into the appctx private field. It's just an
int cast to a pointer.
Some minor issues were noticed while doing this, for example when dumping
an entry by key, if the key doesn't exist, nothing is printed, not even
the table's header. It's unclear whether this was intentional but it
doesn't really match what is done for data-based dumps. It was left
unchanged for now so that a later fix can be backported if needed.
Enum entries STAT_CLI_O_TAB, STAT_CLI_O_CLR and STAT_CLI_O_SET were
removed.
Move the "show info" command to stats.c using the CLI keyword API
to register it on the CLI. The stats_dump_info_to_buffer() function
is now static again. Note, we don't need proto_ssl anymore in cli.c.
Move the "show stat" command to stats.c using the CLI keyword API
to register it on the CLI. The stats_dump_stat_to_buffer() function
is now static again.
Move 'show sess' CLI functions to stream.c and use the cli keyword API
to register it on the CLI.
[wt: the choice of stream vs session makes sense because since 1.6 these
really are streams that we're dumping and not sessions anymore]
Several CLI commands require a frontend, so let's have a function to
look this one up and prepare the appropriate error message and the
appctx's state in case of failure.
Several CLI commands require a server, so let's have a function to
look this one up and prepare the appropriate error message and the
appctx's state in case of failure.
Move map and acl CLI functions to map.c and use the cli keyword API to
register actions on the CLI. Then remove the now unused individual
"add" and "del" keywords.
proto/dumpstats.h has been split in 4 files:
* proto/cli.h contains prototypes for the CLI
* proto/stats.h contains prototypes for the stats
* types/cli.h contains definition for the CLI
* types/stats.h contains definition for the stats
When the CLI's timeout is reduced, nothing was done to wake the task
up to update it. In the past it used to run inside process_stream()
so it used to be refreshed. This is not the case anymore since we have
the appctx so the task needs to be woken up in order to recompute the
new expiration date.
This fix needs to be backported to 1.6.
The "set maxconn frontend" statement on the CLI tries to dequeue possibly
pending requests, but due to a copy-paste error, they're dequeued on the
CLI's frontend instead of the one being changed.
The impact is very minor as it only means that possibly pending connections
will still have to wait for a previous one to complete before being accepted
when a limit is raised.
This fix has to be backported to 1.6 and 1.5.
In dumpstats.c we have get_conn_xprt_name() and get_conn_data_name() to
report the name of the data and transport layers used on a connection.
But when the name is not known, its pointer is reported instead. But the
static char used to report the pointer is too small as it doesn't leave
room for '0x'. Fortunately all subsystems are known so we never trigger
this case.
This fix needs to be backported to 1.6 and 1.5.
uint16_t instead of u_int16_t
Non-ISO fields of struct tm are not always present, but
by zeroing it, on GNU and BSD systems the tm_gmtoff
field will be set.
[wt: moved the memset into each of the date functions]
It defines the variable to set when an error occurs during event
processing. It will only be set when an error occurs in the scope of the
transaction. As for all other variables defined by the SPOE, it will be
prefixed. So, if your variable name is "error" and your prefix is "my_spoe_pfx",
the variable will be "txn.my_spoe_pfx.error".
When set, the variable is the boolean "true". Note that if "option
continue-on-error" is set, the variable is not automatically removed between
the processing of two events.
"maxconnrate" is the maximum number of connections per second. The SPOE will
stop to open new connections if the maximum is reached and will wait to acquire
an existing one.
"maxerrrate" is the maximum number of errors per second. The SPOE will stop its
processing if the maximum is reached.
These options replace hardcoded macros MAX_NEW_SPOE_APPLETS and
MAX_NEW_SPOE_APPLET_ERRS. We use them to limit SPOE activity, especially when
servers are down.
By default, for a specific stream, when an abnormal/unexpected error occurs,
the SPOE is disabled for the whole transaction. So if you have several events
configured, such an error on one event will disable all following ones. For TCP
streams, this will disable the SPOE for the whole session. For HTTP streams,
this will disable it for the transaction (request and response).
To bypass this behaviour, you can set 'continue-on-error' option in 'spoe-agent'
section. With this option, only the current event will be ignored.
It is a way to set the maximum time to wait for a stream to process an event,
i.e. to acquire a stream to talk with an agent, to encode all messages, to send
the NOTIFY frame, to receive the corresponding acknowledgement and to process
all actions. It is applied to the stream that handles the client and the
server sessions.
When:
- A Lua action returns data and closes the channel. The request status
is set to HTTP_MSG_CLOSED for the request and HTTP_MSG_DONE for the
response.
- HAProxy sets the state HTTP_MSG_ERROR. I don't know why, because
there are many lines which set this state.
- A Lua sample-fetch is executed, typically for building the log
line.
- When the Lua sample fetch exits, a control of the data is
executed. If HAProxy is currently parsing the request, the request
is aborted in order to prevent a segfault or sending corrupted
data.
This ast control is executed comparing the state HTTP_MSG_BODY. When
this state is reached, the request is parsed and no error are
possible. When the state is < than HTTP_MSG_BODY, the parser is
running.
Unfortunately, the code HTTP_MSG_ERROR is just < HTTP_MSG_BODY. When
we are in error, we want to terminate the execution of Lua without
error.
This patch changes the comparaison level.
This patch must be backported in 1.6
Gernot Pörner reported some constant leak of ref counts for stick tables
entries. It happens that this leak was not at all in the regular traffic
path but on the "show table" path. An extra ref count was taken during
the dump if the output had to be paused, and it was released upon clean
termination or an error detected in the I/O handler. But the release
handler didn't do it, while it used to properly do it for the sessions
dump.
This fix needs to be backported to 1.6.
Commit ef8f4fe ("BUG/MINOR: stick-table: handle out-of-memory condition
gracefully") unfortunately got trapped by a pointer operation. Replacing
ts = pool_alloc() + size;
with :
ts = pool_alloc();
ts += size;
doesn't give the same result, because pool_alloc() returns a void pointer
while ts is a struct stksess*. So now we don't access the same places, which
is visible in certain stick-table scenarios causing a crash.
This must be backported to 1.6 and 1.5.
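For reference, the difference is easy to reproduce outside of HAProxy (a
standalone illustration, not HAProxy code):
#include <stdio.h>
#include <stddef.h>

struct stksess { char pad[32]; };

int main(void)
{
    char mem[1024];
    size_t size = 16;

    /* byte-wise arithmetic: advances by 16 bytes */
    struct stksess *a = (struct stksess *)(mem + size);

    /* typed arithmetic: advances by 16 * sizeof(struct stksess) bytes */
    struct stksess *b = (struct stksess *)mem;
    b += size;

    printf("a offset=%td, b offset=%td\n", (char *)a - mem, (char *)b - mem);
    return 0;
}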
Setting an FD to -1 when closed isn't the most easily noticeable thing
to do when we're chasing accidental reuse of a stale file descriptor.
Instead, set it to a negative value large enough to overflow the
fdtab and provide an analysable core at the moment the issue happens.
Care was taken to ensure it doesn't overflow nor change sign on 32-bit
machines when multiplied by fdtab, and that it also remains negative for
the various checks that exist. The value equals 0xFDDEADFD which happens
to be easily spotted in a debugger.
The bug described in commit 568743a ("BUG/MEDIUM: stream-int: completely
detach connection on connect error") was not a stream-interface layer bug
but a connection layer bug. There was exactly one place in the code where
we could change a file descriptor's status without first checking whether
it is valid or not, it was in conn_stop_polling(). This one is called when
the polling status is changed after an update, and calls fd_stop_both even
if we had already closed the file descriptor :
1479388298.484240 ->->->->-> conn_fd_handler > conn_cond_update_polling
1479388298.484240 ->->->->->-> conn_cond_update_polling > conn_stop_polling
1479388298.484241 ->->->->->->-> conn_stop_polling > conn_ctrl_ready
1479388298.484241 conn_stop_polling < conn_ctrl_ready
1479388298.484241 ->->->->->->-> conn_stop_polling > fd_stop_both
1479388298.484242 ->->->->->->->-> fd_stop_both > fd_update_cache
1479388298.484242 ->->->->->->->->-> fd_update_cache > fd_release_cache_entry
1479388298.484242 fd_update_cache < fd_release_cache_entry
1479388298.484243 fd_stop_both < fd_update_cache
1479388298.484243 conn_stop_polling < fd_stop_both
1479388298.484243 conn_cond_update_polling < conn_stop_polling
1479388298.484243 conn_fd_handler < conn_cond_update_polling
The problem with the previous fix above is that it breaks the http_proxy mode
and possibly even some Lua parts and peers to a certain extent ; any outgoing
connection whose target address was initially copied into the outgoing
connection and which experiences a retry would use a random target address
after the retry, because closing and detaching the connection causes the
address to be lost. This was attempted to be addressed by commit 0857d7a
("BUG/MAJOR: stream: properly mark the server address as unset on connect
retry") but it used to only solve the most visible effect and not the root
cause.
Prior to this fix, it was possible to cause this config to keep CLOSE_WAIT
for as long as it takes to expire a client or server timeout (note the
missing client timeout) :
listen test
mode http
bind :8002
server s1 127.0.0.1:8001
$ tcploop 8001 L0 W N20 A R P100 S:"HTTP/1.1 200 OK\r\nContent-length: 0\r\n\r\n" &
$ tcploop 8002 N200 C T W S:"GET / HTTP/1.0\r\n\r\n" O P10000 K
With this patch, these CLOSE_WAIT properly vanish when both processes leave.
This commit reverts the two fixes above and replaces them with the proper
fix in connection.h. It must be backported to 1.6 and 1.5. Thanks to
Robson Roberto Souza Peixoto for providing very detailed traces showing
some obvious inconsistencies leading to finding this bug.
This pointer will be used for storing private context. With this,
the same executed function can handle more than one keyword. This
will be very useful for creating Lua CLI bindings.
The release function is called when the command is terminated (give
back the hand to the prompt) or when the session is broken (timeout
or client closed).
In case `pool_alloc2()` returns NULL, propagate the condition to the
caller. This could happen when limiting the amount of memory available
for HAProxy with `-m`.
[wt: backport to 1.6 and 1.5 needed]
Before this change, the trash buffer was used to build the certificate
filename to read in case Multi-Cert bundles are in use. But then
ssl_sock_load_ocsp() modifies trash, leading to potentially wrong information
given in later error messages.
This also blocks any further use of the certificate filename for other
usage, like the ongoing patch to support Certificate Transparency handling
in Multi-Cert bundles.
The unlikely macro doesn't take the condition into account, but only one
variable.
Must be backported to 1.6
[wt: with gcc 3.x, unlikely(x) is defined as __builtin_expect((x) != 0, 0)
so the condition is wrong for negative numbers, which correspond to the
case where bi_getblk_nc() has reached the end of the buffer and the
channel is already closed. With gcc 4.x, the output is cast to unsigned long
so the <=0 will not match negative values either. This is only used in Lua
for now so that may explain why it hasn't hit yet]
A new "option spop-check" statement has been added to enable server health
checks based on SPOP HELLO handshake. SPOP is the protocol used by SPOE filters
to talk to servers.
SPOE makes possible the communication with external components to retrieve some
info using an in-house binary protocol, the Stream Processing Offload Protocol
(SPOP). In the long term, its aim is to allow any kind of offloading on the
streams. This first version, besides being experimental, won't do a lot of
things. The most important today is to validate the protocol design and lay the
foundations of what will, one day, be a full offload engine for the stream
processing.
So, for now, the SPOE can offload the stream processing before "tcp-request
content", "tcp-response content", "http-request" and "http-response" rules. And
it only supports variable creation/removal. But, in spite of these limited
features, we can easily imagine implementing an SSO solution, an IP reputation
service or an IP geolocation service.
Internally, the SPOE is implemented as a filter. So, to use it, you must add
the following line in a proxy section:
frontend my-front
...
filter spoe [engine <name>] config <file>
...
It uses its own configuration file to keep the HAProxy configuration clean. It
is also an easy way to disable it by commenting out the filter line.
See "doc/SPOE.txt" for all details about the SPOE configuration.
It does the opposite of 'set-var' action/converter. It is really useful for
per-process variables. But, it can be used for any scope.
The lua function 'unset_var' has also been added.
Now it is possible to use variables attached to a process. The scope name is
'proc'. These variables are released only when HAProxy is stopped.
The 'tune.vars.proc-max-size' directive has been added to configure the maximum
amount of memory used by "proc" variables. And because memory accounting is
hierarchical for variables, memory for "proc" vars includes memory for "sess"
vars.
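As an illustration, both features above could be combined like this (a
sketch; the variable name and the condition are hypothetical):
global
    tune.vars.proc-max-size 1048576

frontend fe
    bind :80
    http-request set-var(proc.last_src) src
    http-request unset-var(proc.last_src) if { path_beg /reset }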
This function, unsurprisingly, sets a variable value only if it already
exists. In other words, this function will succeed only if the variable was
found somewhere in the configuration during HAProxy startup.
It will be used by the SPOE filter. So an agent will be able to set a value
only for existing variables. This prevents an agent from creating a very large
number of unused variables to flood HAProxy and exhaust the memory reserved for
variables.
This code has been moved from haproxy.c to sample.c and the function
release_sample_expr can now be called from anywhere to release a sample
expression. This function will be used by the stream processing offload engine
(SPOE).
A scope is a section name between square brackets, alone on its line, i.e.:
[scope-name]
...
The spaces at the beginning and at the end of the line are skipped. Comments at
the end of the line are also skipped.
When a scope is parsed, its name is saved in the global variable
cfg_scope. Initially, cfg_scope is NULL and it remains NULL until a valid scope
line is parsed.
This feature remains unused in the HAProxy configuration file and
undocumented. However, it will be used during SPOE configuration parsing.
This feature will be used by the stream processing offload engine (SPOE) to
parse dedicated configuration files without mixing HAProxy sections with SPOE
sections.
So, here we can back up all sections known by HAProxy, unregister all of them
and add new ones, dedicated to the SPOE. Once the SPOE configuration file is
parsed, we can roll back all changes by restoring the HAProxy sections.
Now, for TCP streams, backend filters are released when the stream is
destroyed. But, for HTTP streams, these filters are released when the
transaction analysis ends, in the flt_end_analyze callback.
New callbacks have been added to handle creation and destruction of filter
instances:
* 'attach' callback is called after a filter instance creation, when it is
attached to a stream. This happens when the stream is started for filters
defined on the stream's frontend and when the backend is set for filters
declared on the stream's backend. It is possible to ignore the filter, if
needed, by returning 0. This could be useful to have conditional filtering.
* 'detach' callback is called when a filter instance is detached from a stream,
before its destruction. This happens when the stream is stopped for filters
defined on the stream's frontend and when the analysis ends for filters defined
on the stream's backend.
In addition, the callback 'stream_set_backend' has been added to know when a
backend is set for a stream. It is only called when the frontend and the backend
are not the same. And it is called for all filters attached to a stream
(frontend and backend).
Finally, the TRACE filter has been updated.
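A simplified sketch of how a filter could use the new callbacks (the
prototypes follow the description above; the context structure and the
my_stream_needs_filtering() helper are hypothetical, and the HAProxy filter
headers plus <stdlib.h> are assumed):
struct my_flt_ctx { int hits; }; /* hypothetical per-stream context */

static int my_flt_attach(struct stream *s, struct filter *f)
{
    if (!my_stream_needs_filtering(s))
        return 0; /* ignore the filter for this stream */
    f->ctx = calloc(1, sizeof(struct my_flt_ctx));
    return 1;
}

static void my_flt_detach(struct stream *s, struct filter *f)
{
    /* release the per-stream context before the filter is destroyed */
    free(f->ctx);
    f->ctx = NULL;
}

struct flt_ops my_flt_ops = {
    .attach = my_flt_attach,
    .detach = my_flt_detach,
};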
The 'set-var' converter uses function smp_conv_store (vars.c). In this function,
we should use the first argument (index 0) to retrieve the variable name and its
scope. But because of a typo, we get the scope of the second argument (index
1). In this case, there is no second argument. So the scope used was always 0
(SCOPE_SESS), always setting the variable in the session scope.
So, due to this bug, the following rule
tcp-request content accept if { src,set-var(txn.foo) -m found }
always sets the variable 'sess.foo' instead of 'txn.foo'.
Now that it is possible to decide whether we prefer to use libc or the
state file to resolve the server's IP address and it is possible to change
a server's IP address at run time on the CLI, let's not restrict the reuse
of the address from the state file anymore to the DNS only.
The impact is that by default the state file will be considered first
(which matches its purpose) and only then the libc. This way any address
change performed at run time over the CLI will be preserved regardless
of DNS usage or not.
It is very common when validating a configuration out of production not to
have access to the same resolvers and to fail on server address resolution,
making it difficult to test a configuration. This option simply appends the
"none" method to the list of address resolution methods for all servers,
ensuring that even if the libc fails to resolve an address, the startup
sequence is not interrupted.
This will allow a server to automatically fall back to an explicit numeric
IP address when all other methods fail. The address is simply specified in
the address list.
Now that we have "init-addr none", it becomes possible to recover on
libc resolver's failures. Thus it's preferable not to alert nor fail
at the moment the libc is called, and instead process the failure at
the end of the list. This allows "none" to be set after libc to
provide a smooth fallback in case of resolver issues.
This new setting supports a comma-delimited list of methods used to
resolve the server's FQDN to an IP address. Currently supported methods
are "libc" (use the regular libc's resolver) and "last" (use the last
known valid address found in the state file).
The list is implemented in a 32-bit integer, because each init-addr
method only requires 3 bits. The last one must always be SRV_IADDR_END
(0), allowing to store up to 10 methods in a single 32 bit integer.
Note: the doc is provided at the end of this series.
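For illustration, the setting could be used like this (a sketch; the
hostnames and the fallback address are hypothetical):
defaults
    default-server init-addr last,libc,none

backend app
    # fall back to an explicit address if the state file and libc both fail
    server s1 app1.example.com:80 check init-addr last,libc,10.0.0.1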
The RMAINT state happens when a server doesn't get a valid DNS response
past the hold time. If the address is forced on the CLI, we must use it
and leave the RMAINT state.
WARNING: this is a MAJOR (and disruptive) change with previous HAProxy's
behavior: before, HAProxy never ever used to change a server administrative
status when the DNS resolution failed at run time.
This patch gives HAProxy the ability to change the administrative status
of a server to MAINT (RMAINT actually) when an error is encountered for
a period longer than its own allowed by the corresponding 'hold'
parameter.
I.e. if the configuration sets "hold nx 10s" and a server's hostname
points to an NX domain for more than 10s, then the server will be set to
RMAINT, hence in MAINTENANCE mode.
This adds new "hold" timers : nx, refused, timeout, other. These timers
will be used to tell HAProxy to keep an erroneous response as valid for
the corresponding period. For now they're only configured, not enforced.
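In a resolvers section, these timers are configured like this (a sketch; the
nameserver address and the periods are hypothetical):
resolvers mydns
    nameserver dns1 192.168.0.1:53
    hold nx      10s
    hold refused 10s
    hold timeout 10s
    hold other   30s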
To help debugging some DNS resolution issues, it will be important to
know why a server was marked down, so let's make the function support
a 3rd argument with an indication of the reason. Passing NULL will keep
the message as-is.
The server's state is now "MAINT (resolution)" just like we also have
"MAINT (via x/y)" when servers are tracked. The HTML stats page reports
"resolution" in the checks field similarly to what is done for the "via"
entry.
It's important to report in the server state change logs that RMAINT was
cleared, as it's not the regular maintenance mode, it's specific to name
resolution, and it's important to report the new state (which can be DRAIN
or READY).
Server addresses are not resolved anymore upon the first pass so that we
don't fail if an address cannot be resolved by the libc. Instead they are
processed all at once after the configuration is fully loaded, by the new
function srv_init_addr(). This function only acts on the server's address
if this address uses an FQDN, which appears in server->hostname.
For now the function does two things, to follow HAProxy's historical
default behavior:
1. apply server IP address found in server-state file if runtime DNS
resolution is enabled for this server
2. use the DNS resolver provided by the libc
If none of the 2 options above can find an IP address, then an error is
returned.
All of this will be needed to support the new server parameter "init-addr".
For now, the biggest user-visible change is that all server resolution errors
are dumped at once instead of causing a startup failure one by one.
Currently, the function which applies server states provided by the
"old" process is called after the configuration sanity check. This makes
it impossible to check the validity of the state file during a
regular config check, implying that a full start is required, which can
sometimes be a problem.
This patch moves the loading of server_state file before MODE_CHECK.
This will be needed to later postpone server address resolution. We need the
FQDN even when it doesn't resolve. The caller then needs to check if fqdn was
set when resolve is null to detect that the address couldn't be parsed and
needs later resolution.
Quite a lot of people have been complaining about option contstats not
working correctly anymore since about 1.4. The reason is that one of the
causes of the significant performance boost between 1.3 and 1.4 was the
ability to forward data between a server and a client without waking up the
stream manager. And we couldn't afford to force sessions to constantly wake
it up given that most of the people interested in contstats are also those
interested in high performance transmission.
An idea was experimented with in the past, consisting in limiting the
amount of transmissible data before waking it up, but it was not usable
on slow connections (eg: FTP over modem lines, RDP, SSH) as stats would
be updated too rarely if at all, so that idea was dropped.
During a discussion today another idea came up : ensure that stats are
updated once in a while, since it's the only thing that matters. It
happens that we have the request channel's analyse_exp timeout that is
used to wake the stream up after a configured delay, and that by
definition this timeout is not used when there's no more analyser
(otherwise the stream would wake up and the stats would be updated).
Thus here the idea is to reuse this timeout when there's no analyser
and set it to now+5 seconds so that a stream wakes up at least once
every 5 seconds to update its stats. It should be short enough to
provide smooth traffic graphs and to allow to debug outputs of "show
sess" more easily without inflicting too much load even for very large
number of concurrent connections.
This patch is simple enough and safe enough to be backportable to 1.6
if there is some demand.
In the last release a lot of the structures have become opaque for an
end user. This means the code using these needs to be changed to use the
proper functions to interact with these structures instead of trying to
manipulate them directly.
This does not fix any deprecations yet that are part of 1.1.0, it only
ensures that it can be compiled against that version and is still
compatible with older ones.
[wt: openssl-0.9.8 doesn't build with it, there are conflicts on certain
function prototypes which we declare as inline here and which are
defined differently there. But openssl-0.9.8 is not supported anymore
so probably it's OK to go without it for now and we'll see later if
some users still need it. Emeric has reviewed this change and didn't
spot anything obvious which requires special care. Let's try it for
real now]
The only reason wurfl/wurfl.h was needed outside of wurfl.c was to expose
wurfl_handle which is a pointer to a structure, referenced by global.h.
By just storing a void* there instead, we can confine all wurfl code to
wurfl.c, which is really nice.
WURFL is a high-performance and low-memory footprint mobile device
detection software component that can quickly and accurately detect
over 500 capabilities of visiting devices. It can differentiate between
portable mobile devices, desktop devices, SmartTVs and any other types
of devices on which a web browser can be installed.
In order to add WURFL device detection support, you would need to
download Scientiamobile InFuze C API and install it on your system.
Refer to www.scientiamobile.com to obtain a valid InFuze license.
Any useful information on how to configure HAProxy working with WURFL
may be found in:
doc/WURFL-device-detection.txt
doc/configuration.txt
examples/wurfl-example.cfg
Please find more information about WURFL device detection API detection
at https://docs.scientiamobile.com/documentation/infuze/infuze-c-api-user-guide
Right now there is an issue with the way the maintenance flags are
propagated upon startup. They are not propagated, just copied from the
tracked server. This implies that depending on the server's order, some
tracking servers may not be marked down. For example this configuration
does not work as expected :
server s1 1.1.1.1:8000 track s2
server s2 1.1.1.1:8000 track s3
server s3 1.1.1.1:8000 track s4
server s4 wtap:8000 check inter 1s disabled
It results in s1/s2 being up, and s3/s4 being down, while all of them
should be down.
The only clean way to process this is to run through all "root" servers
(those not tracking any other server), and to propagate their state down
to all their trackers. This is the same algorithm used to propagate the
state changes. It has to be done both to compute the IDRAIN flag and the
IMAINT flag. However, doing so requires that tracking servers are not
marked as inherited maintenance anymore while parsing the configuration
(and given that it is wrong, better drop it).
This fix also addresses another side effect of the bug above which is
that the IDRAIN/IMAINT flags are stored in the state files, and if
restored while the tracked server doesn't have the equivalent flag,
the servers may end up in a situation where it's impossible to remove
these flags. For example in the configuration above, after removing
"disabled" on server s4, the other servers would have remained down;
this is no longer the case with this fix. Similarly, the combination
of IMAINT or IDRAIN with their respective forced modes was not accepted
on reload, which is wrong as well.
This bug has been present at least since 1.5, maybe even 1.4 (it came
with tracking support). The fix needs to be backported there, though
the srv-state parts are irrelevant.
This commit relies on previous patch to silence warnings on startup.
We'll have to use srv_set_admin_flag() to propagate some server flags
during the startup, and we don't want the resulting actions to cause
warnings, logs nor e-mail alerts to be generated since we're just applying
the config or a state file. So let's condition these notifications to the
fact that we're starting.
CMAINT indicates that the server was *initially* disabled in the
configuration via the "disabled" keyword. FDRAIN indicates that the
server was switched to the DRAIN state from the CLI or the agent.
Thus it's perfectly valid to have both of them in the state file,
so the parser must not reject this combination.
This fix must be backported to 1.6.
There were several reports about the DRAIN state not being properly
restored upon reload.
It happens that the condition in the code does exactly the opposite
of what the comment says, and the comment is right so the code is
wrong.
It's worth noting that the conditions are complex here due to the 2
available methods to set the drain state (CLI/agent, and config's
weight). To paraphrase the updated comment in the code, there are
two possible reasons for FDRAIN to have been present :
- previous config weight was zero
- "set server b/s drain" was sent to the CLI
In the first case, we simply want to drop this drain state if the new
weight is not zero anymore, meaning the administrator has intentionally
turned the weight back to a positive value to enable the server again
after an operation. In the second case, the drain state was forced on
the CLI regardless of the config's weight so we don't want a change to
the config weight to lose this status. What this means is :
- if previous weight was 0 and new one is >0, drop the DRAIN state.
- if the previous weight was >0, keep it.
This fix must be backported to 1.6.
http_find_header2() relies on find_hdr_value_end() to find the comma
delimiting a header field value, which also properly handles double
quotes and backslashes within quotes. In fact double quotes are very
rare, and commas happen once every multiple characters, especially
with cookies where a full block can be found at once. So it makes
sense to optimize this function to speed up the lookup of the first
block before the quote.
This change increases the performance from 212k to 217k req/s when
requests contain a 1kB cookie (+2.5%). We don't care about going
back into the fast parser after the first quote, as it may
needlessly make the parser more complex for very marginal gains.
Searching the trailing space in long URIs takes some time. This can
happen especially on static files and some blogs. By skipping valid
character ranges by 32-bit blocks, it's possible to increase the
HTTP performance from 212k to 216k req/s on requests featuring a
100-character URI, which is an increase of 2%. This is done for
architectures supporting unaligned accesses (x86_64, x86, armv7a).
There's only a 32-bit version because URIs are rarely long and very
often short, so it's more efficient to limit the systematic overhead
than to try to optimize for the rarest requests.
A performance test with 1kB cookies was capping at 194k req/s. After
implementing multi-byte skipping, the performance increased to 212k req/s,
or 9.2% faster. This patch implements this for architectures supporting
unaligned accesses (x86_64, x86, armv7a). Maybe other architectures can
benefit from this but they were not tested yet.
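Both patches rely on the same multi-byte skipping idea, which can be
illustrated with the classic "has a byte less than N" trick (a standalone
sketch, not the exact HAProxy code):
#include <stdint.h>
#include <string.h>

/* Return a pointer to the first byte <= 0x20 (space or control char),
 * scanning 4 bytes at a time on the fast path.
 */
static const char *skip_token(const char *p, const char *end)
{
    while (p + 4 <= end) {
        uint32_t x;

        memcpy(&x, p, 4); /* fine on unaligned-capable targets */
        /* sets the high bit of each byte that is below 0x21 */
        if ((x - 0x21212121U) & ~x & 0x80808080U)
            break;
        p += 4;
    }
    while (p < end && (unsigned char)*p > 0x20)
        p++;
    return p;
}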
We used to have 7 different character classes, each 256 bytes long,
resulting in almost 2kB being used in the L1 cache. It's as cheap to
test a bit as to check that a byte is not null, so let's store a 7-bit
composite value and check for the respective bits there instead.
The executable is now 4 kB smaller and the performance on small
objects increased by about 1% to 222k requests/second with a config
involving 4 http-request rules including 1 header lookup, one header
replacement, and 2 variable assignments.
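The principle looks like this (a sketch with made-up flag names):
/* one 256-byte table of 7-bit composite classes replaces 7 tables */
#define CLASS_CTL    0x01
#define CLASS_SEP    0x02
#define CLASS_LWS    0x04
#define CLASS_SPHT   0x08
#define CLASS_CRLF   0x10
#define CLASS_TOKEN  0x20
#define CLASS_VER    0x40

extern const unsigned char char_classes[256];

/* testing one bit is as cheap as the old byte-is-not-zero test */
#define IS_TOKEN(c) (char_classes[(unsigned char)(c)] & CLASS_TOKEN)
#define IS_LWS(c)   (char_classes[(unsigned char)(c)] & CLASS_LWS)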
ipcpy() is used to replace an IP address with another one, but it
doesn't preserve the original port so all callers have to do it
manually while it's trivial to do there. Better do it inside the
function.
Often we need to call str2ip2() on an address which already contains a
port without replacing it, so let's ensure we preserve it even if the
family changes.
Gabriele Cerami reported that the exit codes of the systemd-wrapper are
wrong. In short, it directly returns the output of the wait syscall's
status, which is a composite value made of the error code and signal number.
In general it contains the signal number on the lower bits and the error
code on the higher bits, but exit() truncates it to the lowest 8 bits,
causing config validations to incorrectly report a success. Example :
$ ./haproxy-systemd-wrapper -c -f /dev/null
<7>haproxy-systemd-wrapper: executing /tmp/haproxy -c -f /dev/null -Ds
Configuration file has no error but will not start (no listener) => exit(2).
<5>haproxy-systemd-wrapper: exit, haproxy RC=512
$ echo $?
0
If the process is killed however, the signal number is directly reported
in the exit code.
Let's fix all this to ensure that the exit code matches what the shell does,
which means that codes 0..127 are for exit codes, codes 128..254 for signals,
and code 255 for unknown exit code. Now the return code is correct :
$ ./haproxy-systemd-wrapper -c -f /dev/null
<7>haproxy-systemd-wrapper: executing /tmp/haproxy -c -f /dev/null -Ds
Configuration file has no error but will not start (no listener) => exit(2).
<5>haproxy-systemd-wrapper: exit, haproxy RC=2
$ echo $?
2
$ ./haproxy-systemd-wrapper -f /tmp/cfg.conf
<7>haproxy-systemd-wrapper: executing /tmp/haproxy -f /dev/null -Ds
^C
<5>haproxy-systemd-wrapper: exit, haproxy RC=130
$ echo $?
130
This fix must be backported to 1.6 and 1.5.
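The conversion described above can be sketched like this:
#include <sys/wait.h>

/* Map a status from wait() to a shell-compatible exit code:
 * 0..127 for normal exits, 128+signal when killed, 255 otherwise.
 */
static int status_to_exit_code(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return 255;
}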
There's no reason to use the stream anymore, only the appctx should be
used by a peer. This was a leftover from the migration to appctx and it
caused some confusion, so let's totally drop it now. Note that half of
the patch is just comment updates.
It was inherited from initial code but we must only manipulate the appctx
and never the stream, otherwise we always risk shooting ourselves in the
foot.
In case of resource allocation error, peer_session_create() frees
everything allocated and returns a pointer to the stream/session that
was put back into the free pool. This stream/session is then assigned
to ps->{stream,session} with no error control. This means that it is
perfectly possible to have a new stream or session being both used for
a regular communication and for a peer at the same time.
In fact it is the only way (for now) to explain a CLOSE_WAIT on peers
connections that was caught in this dump with the stream interface in
SI_ST_CON state while the error field proves the state ought to have
been SI_ST_DIS, very likely indicating two concurrent accesses on the
same area :
0x7dbd50: [31/Oct/2016:17:53:41.267510] id=0 proto=tcpv4
flags=0x23006, conn_retries=0, srv_conn=(nil), pend_pos=(nil)
frontend=myhost2 (id=4294967295 mode=tcp), listener=? (id=0)
backend=<NONE> (id=-1 mode=-) addr=127.0.0.1:41432
server=<NONE> (id=-1) addr=127.0.0.1:8521
task=0x7dbcd8 (state=0x08 nice=0 calls=2 exp=<NEVER> age=1m5s)
si[0]=0x7dbf48 (state=CLO flags=0x4040 endp0=APPCTX:0x7d99c8 exp=<NEVER>, et=0x000)
si[1]=0x7dbf68 (state=CON flags=0x50 endp1=CONN:0x7dc0b8 exp=<NEVER>, et=0x020)
app0=0x7d99c8 st0=11 st1=0 st2=0 applet=<PEER>
co1=0x7dc0b8 ctrl=tcpv4 xprt=RAW data=STRM target=PROXY:0x7fe62028a010
flags=0x0020b310 fd=7 fd.state=22 fd.cache=0 updt=0
req=0x7dbd60 (f=0x80a020 an=0x0 pipe=0 tofwd=0 total=0)
an_exp=<NEVER> rex=<NEVER> wex=<NEVER>
buf=0x78a3c0 data=0x78a3d4 o=0 p=0 req.next=0 i=0 size=0
res=0x7dbda0 (f=0x80402020 an=0x0 pipe=0 tofwd=0 total=0)
an_exp=<NEVER> rex=<NEVER> wex=<NEVER>
buf=0x78a3c0 data=0x78a3d4 o=0 p=0 rsp.next=0 i=0 size=0
Special thanks to Arnaud Gavara who provided lots of valuable input and
ran some validation testing on this patch.
This fix must be backported to 1.6 and 1.5. Note that in 1.5 the
session is not assigned from within the function so some extra checks
may be needed in the callers.
This part was missed when peers were ported to the new applet
infrastructure in 1.6, the main stream is woken up instead of the
appctx. This creates a race condition by which it is possible to
wake the stream at the wrong moment and miss an event. This bug
might be at least partially responsible for some of the CLOSE_WAIT
that were reported on peers session upon reload in version 1.6.
This fix must be backported to 1.6.
Greetings,
Was recently working with a stick table storing URLs and one had an
equals sign in it (e.g. 127.0.0.1/f=ab) which made it difficult to
easily split the key and value without a regex.
This patch will change it so that the key looks like
"key=127.0.0.1/f\=ab" instead of "key=127.0.0.1/f=ab".
Not very important given that there are ways to work around it.
Thanks,
- Chad
The consistent hash lookup is done as normal, then if balancing is
enabled, we progress through the hash ring until we find a server that
doesn't have "too much" load. In the case of equal weights for all
servers, the allowed number of requests for a server is either the
floor or the ceil of (num_requests * hash-balance-factor / num_servers);
with unequal weights things are somewhat more complicated, but the
spirit is the same -- a server should not be able to go too far above
(its relative weight times) the average load. Using the hash ring to
make the second/third/etc. choice maintains as much locality as
possible given the load limit.
Signed-off-by: Andrew Rodland <andrewr@vimeo.com>
For active servers, this is the sum of the eweights of all active
servers before this one in the backend, and
[srv->cumulative_weight .. srv->cumulative_weight + srv->eweight) is a
space occupied by this server in the range [0 .. lbprm.tot_wact), and
likewise for backup servers with tot_wbck. This allows choosing a
server or a range of servers proportional to their weight, by simple
integer comparison.
Signed-off-by: Andrew Rodland <andrewr@vimeo.com>
0 will mean no balancing occurs; otherwise it represents the ratio
between the highest-loaded server and the average load, times 100 (i.e.
a value of 150 means a 1.5x ratio), assuming equal weights.
Signed-off-by: Andrew Rodland <andrewr@vimeo.com>
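With equal weights, the eligibility test described above boils down to
something like this (a sketch, not the exact HAProxy code):
/* Return true if a server may accept one more request, with <factor>
 * expressed in percent (150 means 1.5 times the average load).
 */
static int server_has_room(unsigned int srv_load, unsigned int total_load,
                           unsigned int nsrv, unsigned int factor)
{
    /* allowed load is ceil(total_load * factor / (100 * nsrv)) */
    unsigned long long max;

    max = ((unsigned long long)total_load * factor + 100ULL * nsrv - 1) /
          (100ULL * nsrv);
    return srv_load < max;
}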
Pierre Cheynier found that there's a persistent issue with the systemd
wrapper. Too fast reloads can lead to certain old processes not being
signaled at all and continuing to run. The problem was tracked down as
a race between the startup and the signal processing : nothing prevents
the wrapper from starting new processes while others are still starting,
and the resulting pid file will only contain the latest pids in this
case. This can happen with large configs and/or when a lot of SSL
certificates are involved.
In order to solve this we want the wrapper to wait for the new processes
to complete their startup. But we also want to ensure it doesn't wait
needlessly in case of error.
The solution found here is to create a pipe between the wrapper and the
sub-processes. The wrapper waits on the pipe and the sub-processes are
expected to close this pipe once they completed their startup. That way
we don't queue up new processes until the previous ones have registered
their pids to the pid file. And if anything goes wrong, the wrapper is
immediately released. The only thing is that we need the sub-processes
to know the pipe's file descriptor. We pass it in an environment variable
called HAPROXY_WRAPPER_FD.
It was confirmed both by Pierre and myself that this completely solves
the "zombie" process issue so that only the new processes continue to
listen on the sockets.
It seems that in the future this stuff could be moved to the haproxy
master process, also getting rid of an environment variable.
This fix needs to be backported to 1.6 and 1.5.
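The mechanism can be sketched like this (simplified, with error handling
mostly omitted; the real wrapper does more):
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void spawn_and_wait_for_startup(char **argv)
{
    int pfd[2];
    char fdstr[16], c;

    if (pipe(pfd) != 0)
        return;
    if (fork() == 0) {
        /* child: expose the write side via HAPROXY_WRAPPER_FD; the new
         * process is expected to close it once its startup is complete.
         */
        close(pfd[0]);
        snprintf(fdstr, sizeof(fdstr), "%d", pfd[1]);
        setenv("HAPROXY_WRAPPER_FD", fdstr, 1);
        execv(argv[0], argv);
        exit(1); /* execv() failed */
    }
    /* parent: block until all write sides have been closed */
    close(pfd[1]);
    while (read(pfd[0], &c, 1) > 0)
        ;
    close(pfd[0]);
}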
It's important to know that a signal sent to the wrapper had no effect
because something failed during execve(). Ideally more info (strerror)
should be reported. It would be nice to backport this to 1.6 and 1.5.
The wrapper is not the most reliable thing in the universe, so start
by adding at least the minimum expected controls :-/
To be backported to 1.5 and 1.6.
Since signals are inherited, we must restore them before calling execve()
and intercept them again after a failed execve(). In order to cleanly deal
with the SIGUSR2/SIGHUP loops where we re-exec the wrapper, we ignore these
two signals during a re-exec, and restore them to defaults when spawning
haproxy.
This should be backported to 1.6 and 1.5.
When execv() fails to execute the haproxy executable, it's important to
return an error instead of pretending everything is cool. This fix should
be backported to 1.6 and 1.5 in order to improve the overall reliability
under systemd.
Today, the certificates are indexed in the SNI tree using their CN and the
list of their AltNames. So, some certificates have the same name in the
CN and in one of the AltNames entries.
Typically Let's Encrypt duplicates the DNS name in the CN and the
AltName.
This patch prevents the creation of identical entries in the trees. It
checks for the same DNS name and the same SSL context.
If the same certificate is registered two times it will still be duplicated.
This patch should be backported to the 1.6 and 1.5 versions.
Add some debug traces when haproxy is configured in debug & verbose mode.
This is useful for openssl tests. Typically, the error "SSL handshake
failure" can be caused by a lot of protocol errors. This patch details
the encountered error. For example:
OpenSSL error 0x1408a0c1: ssl3_get_client_hello: no shared cipher
Note that my compiler (gcc-4.7) refuses to consider the function
ssl_sock_dump_errors() as inline. The "if" condition ensures that the
content of the function is not executed in the normal case. It would be
a pity to call a function just for testing its execution condition, so
I use the macro "forceinline".
This commit introduces "tcp-request session" rules. These are very
much like "tcp-request connection" rules except that they're processed
after the handshake, so it is possible to consider SSL information and
addresses rewritten by the proxy protocol header in actions. This is
particularly useful to track proxied sources as this was not possible
before, given that tcp-request content rules are processed after each
HTTP request. Similarly it is possible to assign the proxied source
address or the client's cert to a variable.
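For example (a sketch; the certificate path, table size and variable name
are hypothetical):
frontend fe
    bind :443 ssl crt /etc/haproxy/site.pem accept-proxy
    stick-table type ip size 100k expire 30m store conn_cur
    # track the PROXY-protocol source address, seen after the handshake
    tcp-request session track-sc0 src
    # keep the client certificate's CN for the whole session
    tcp-request session set-var(sess.client_cn) ssl_c_s_dn(cn)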
This is in order to make integration of tcp-request-session cleaner :
- tcp_exec_req_rules() was renamed tcp_exec_l4_rules()
- LI_O_TCP_RULES was renamed LI_O_TCP_L4_RULES
(LI_O_*'s horrible indent was also fixed and a provision was left
for L5 rules).
These are denied connections. Strangely this counter wasn't emitted even
though it has been available for a while. It corresponds to the number of
connections blocked by "tcp-request connection reject".
Thus the SMP_USE_HTTP_ANY dependency is incorrect, we have to depend on
SMP_USE_L5_CLI (the session). It's particularly important for session-wide
variables which are kept across HTTP requests. For now there is no impact
but it will make a difference with tcp-request session rules.
smp_fetch_var() may be called from everywhere since it just reads a
variable. It must ensure that the stream exists before trying to return
a stream-dependent variable. For now there is no impact but it will
cause trouble with tcp-request session rules.
When the tcp/http actions above were introduced in 1.7-dev4, we used to
proceed like this :
- set-src/set-dst would force the port to zero
- set-src-port/set-dst-port would not do anything if the address family is
neither AF_INET nor AF_INET6.
It was a stupid idea of mine to request this behaviour because it ensures
that these functions cannot be used in a wide number of situations. Because
of the first rule, it is necessary to save the source port one way or
another if only the address has to be changed (so you have to use a
variable). Due to the second rule, there's no way to set the source port
on a unix socket without first overwriting the address. And sometimes it's
really not convenient, especially when there's no way to guarantee that all
fields will properly be set.
In order to fix all this, this small change does the following :
- set-src/set-dst always preserve the original port even if the address
family changes. If the previous address family didn't have a port (eg:
AF_UNIX), then the port is set to zero ;
- set-src-port/set-dst-port always preserve the original address. If the
address doesn't have a port, then the family is forced to IPv4 and the
address to "0.0.0.0".
Thanks to this it now becomes possible to perform one action, the other or
both in any order.
To register a new cli keyword, you need to declare a cli_kw_list
structure in your source file:
static struct cli_kw_list cli_kws = {{ },{
{ { "test", "list", NULL }, "test list : do some tests on the cli", test_parsing, NULL },
{ { NULL }, NULL, NULL, NULL, NULL }
}};
And then register it:
cli_register_kw(&cli_kws);
The first field is an array of 5 elements, where you declare the
keywords combination which will match, it must be ended by a NULL
element.
The second field is used as a usage message, it will appear in the help
of the cli, you can set it to NULL if you don't want to show it, it's a
good idea if you want to overwrite some existing keywords.
The two last fields are callbacks.
The first one is used at parsing time, you can use it to parse the
arguments of your keywords and print small messages. The function must
return 1 in case of a failure, otherwise 0:
#include <proto/dumpstats.h>
static int test_parsing(char **args, struct appctx *appctx)
{
struct chunk out;
if (!*args[2]) {
appctx->ctx.cli.msg = "Error: the 3rd argument is mandatory !";
appctx->st0 = STAT_CLI_PRINT;
return 1;
}
chunk_reset(&trash);
chunk_printf(&trash, "arg[3]: %s\n", args[2]);
chunk_init(&out, NULL, 0);
chunk_dup(&out, &trash);
appctx->ctx.cli.err = out.str;
appctx->st0 = STAT_CLI_PRINT_FREE; /* print and free in the default cli_io_handler */
return 0;
}
The last field is the IO handler callback, it can be set to NULL if you
want to use the default cli_io_handler() otherwise you can write your
own. You can use the private pointer in the appctx if you need to store
a context or some data. stats_dump_sess_to_buffer() is a good example of
IO handler, IO handlers often use the appctx->st2 variable for the state
machine. The handler must return 0 in case it has to be called again
later, otherwise 1.
During the stick-table teaching process which occurs at reloading/restart time,
expiration dates of stick-tables entries were not synchronized between peers.
This patch adds two new stick-table messages to provide such a synchronization feature.
As these new messages are not supported by older haproxy peers protocol
versions, this patch increments the peers protocol version, from 2.0 to 2.1,
to help in detecting/supporting such older peers protocol implementations,
so that newer versions may still be able to transparently communicate with
older ones.
[wt: technically speaking it would be nice to have this backported into 1.6
as some people who reload often are affected by this design limitation, but
it's not a totally transparent change that may make certain users feel
reluctant to upgrade older versions. Let's let it cook in 1.7 first and
decide later]
The fe_req_rate is similar to fe_sess_rate, but fetches the number
of HTTP requests per second instead of connections/sessions per second.
Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>
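A typical use (the threshold is hypothetical):
frontend web
    bind :80
    # reject new requests when the frontend exceeds 1000 requests per second
    http-request deny if { fe_req_rate ge 1000 }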
dns_init_resolvers() tries to emit the current resolver's name in the
error message in case of out-of-memory condition. But it must not do
it when initializing the trash before even having such a resolver
otherwise the user is certain to get a dirty crash instead of the
error message. No backport is needed.
An apparent copy-paste error resulted in backend's avg connection time
to report the average queue time instead in the HTML dump. Backport to
1.6 is desired.
When compression is enabled on HTTP responses, the chunked data are copied into
a temporary buffer during the HTTP body parsing and then compressed when
everything is forwarded to the client. But the amount of data that can be
copied was not correctly calculated. In many cases it worked, except on the
edge case when the channel buffer was almost full.
[wt: bug introduced by b77c5c26 in 1.7-dev, no backport needed]
Enable IP_BIND_ADDRESS_NO_PORT on backend connections when the source
address is specified without port or port ranges. This is supported
since Linux 4.2/libc 2.23.
If the kernel supports it but the libc doesn't, we can define it at
build time:
make [...] DEFINE=-DIP_BIND_ADDRESS_NO_PORT=24
For more information about this feature, see Linux commit 90c337da
With Linux officially introducing SO_REUSEPORT support in 3.9 and
its mainstream adoption we have seen more people running into strange
SO_REUSEPORT related issues (a process management issue turning into
hard to diagnose problems because the kernel load-balances between the
new and an obsolete haproxy instance).
Also some people simply want the guarantee that the bind fails when
the old process is still bound.
This change makes SO_REUSEPORT configurable, introducing the command
line argument "-dR" and the noreuseport configuration directive.
A backport to 1.6 should be considered.
pcre_version() returns the running PCRE release, not the release
haproxy was built with.
This simple string fix should be backported to supported releases,
as the output may be confusing.
The analysis of the CNAME resolution and the request's domain name was
performed twice:
- when validating the response buffer
- when loading the right IP address from the response
Now that DNS responses are properly loaded into a DNS response structure, we
do the domain name validation when loading/validating the response in the DNS
structure, and the later processing of this task is now useless.
backport: no
DNS servers don't return an A or AAAA record if the query points to a CNAME
not resolving to the right type.
We know it because the last record of the response is a CNAME. We can
trigger a new query, switching to a new query type, handled by the layer
above.
New DNS response parser function which turns the DNS response from a
network buffer into a DNS structure, much easier for later analysis
by the upper layer.
Memory is pre-allocated at start-up in a chunk dedicated to DNS
response store.
New error code to report a wrong number of queries in a DNS response.
Enrichment of the 'set server <b>/<s> addr' cli directive to now also allow
changing a server's port.
The new syntax looks like:
set server <b>/<s> addr [port <port>]
This function can replace update_server_addr() where the need to change the
server's port as well as the IP address is required.
It performs some validation before performing each type of change.
Introduction of 3 new server flags to remember if some parameters were set
during configuration parsing.
* SRV_F_CHECKADDR: this server has a check addr configured
* SRV_F_CHECKPORT: this server has a check port configured
* SRV_F_AGENTADDR: this server has an agent addr configured
HAProxy used to deduce the port used for health checks when parsing the
configuration at startup time.
Because of this way of working, it is complicated to change the port at
run time.
The current patch changes this behavior and makes HAProxy choose the
port used for health checking when preparing the check task itself.
A new type of error is introduced and reported when no port can be found.
There won't be any impact on performance, since the process to find out the
port value is made of a few 'if' statements.
This patch also introduces a new check state CHK_ST_PORT_MISS: this flag is
used to report an error in the case when HAProxy needs to establish a TCP
connection to a server, to perform a health check but no TCP ports can be
found for it.
And last, it also introduces a new stream termination condition:
SF_ERR_CHK_PORT. Purpose of this flag is to report an error in the event when
HAProxy has to run a health check but no port can be found to perform it.
SOL_IPV6 is not defined on OSX, breaking the compile. Also libcrypt is
not available for installation either in Macports or as a Brew recipe,
so we're disabling this implicit dependency.
Signed-off-by: Dinko Korunic <dinko.korunic@gmail.com>
Introduction of a new CLI command "set server <srv> check-port <port>' to
allow admins to change a server's health check port at run time.
This changes the equivalent of the configuration server parameter
called 'port'.
Today I was working on an auto-update script for some ACLs, and found
that I couldn't load ACL entries with a semi-colon in them no matter
how I tried to escape it.
As such, I wrote this patch (this one is for 1.7dev, but it applies to
1.5 the same with just line numbers changed), which seems to allow me
to execute a command such as "add acl /etc/foo.lst foo\;bar" over the
socket. It's worth noting that stats_sock_parse_request() already uses
the backslash to escape spaces in words so it makes sense to use it as
well to escape the semi-colon.
A typo resulting from a copy-paste in the original req.ssl_ver code will
make certain SSLv2 hello messages not properly detected. The bug has been
present since the code was added in 1.3.16. In 1.3 and 1.4, this code was
in proto_tcp.c. In 1.5-dev0, it moved to acl.c, then later to payload.c.
This bug was tagged "minor" because SSLv2 is outdated and this encoding
was rarely (if at all) used, the shorter form starting with 0x80 being
more common.
This fix needs to be backported to all currently maintained branches.
In dns_send_query(), ret was set to 0 but always reassigned before
use, so this initialisation was useless.
The send_error variable was created, assigned to 0 but never used. So
this variable is just useless by itself. Removing it.
In dump_servers_state(), the srv_time_since_last_change, bk_f_forced_id and
srv_f_forced_id variables were first set to zero and immediately reassigned
another value without ever being accessed in between.
Sounds like a useless initialization, so let's only keep the useful assignment.
Bartosz Koninski reported that recent commit 568743a ("BUG/MEDIUM: stream-int:
completely detach connection on connect error") introduced a nasty side effect
during the connection retry, causing a reconnect attempt to be performed on a
random address, possibly another server from another backend as shown in his
reproducer.
The real reason is in fact that by calling si_release_endpoint() after a
failed connect attempt and not clearing the SN_ADDR_SET flag on the
stream, we indicate that we continue to trust the connection's current
address. So upon next connect attempt, a connection is picked from the
pool and the last target address is reused. It is not very easy to
reproduce it 100% reliably out of production traffic, but it's easier to
see when haproxy is started with -dM to force the pools to be filled with
junk, and where subsequent connections are not even attempted due to their
family being incorrect.
The correct fix consists in clearing the SN_ADDR_SET flag after calling
si_release_endpoint() during a retry. But it outlines a deeper issue which
is in fact that the target address is *stored* in the connection and its
validity is stored in the stream. Until we have a better solution to store
target addresses, it would be better to rely on the connection flags
(CO_FL_ADDR_TO_SET) for this purpose. But it also outlines the fact that
the same issue still exists in Lua sockets and in idle sockets, which
fortunately are not affected by this issue.
Thanks to Bartosz for providing all the elements needed to understand the
problem.
This fix needs to be backported to 1.6 and 1.5.
Trie now uses a dataset structure just like Pattern, so this has been
defined in includes/types/global.h for both Pattern and Trie where it
was just Pattern.
In src/51d.c all functions used by the Trie implementation which need a
dataset as an argument now use the global dataset. The
fiftyoneDegreesDestroy method has now been replaced with
fiftyoneDegreesDataSetFree which is common to Pattern and Trie. In
addition, two extra dataset init status' have been added to the switch
statement in init_51degrees.
Tq is the time between the instant the connection is accepted and a
complete valid request is received. This time includes the handshake
(SSL / Proxy-Protocol), the idle time when the browser does a preconnect
and the request reception.
This patch decomposes %Tq in 3 measurements names %Th, %Ti, and %TR
which returns respectively the handshake time, the idle time and the
duration of valid request reception. It also adds %Ta which reports
the request's active time, which is the total time without %Th nor %Ti.
It replaces %Tt as the total time, reporting accurate measurements for
HTTP persistent connections.
%Th is available for TCP and HTTP sessions, %Ti, %TR and %Ta are only
available for HTTP connections.
In addition to this, we have new timestamps %tr, %trg and %trl, which
log the date of start of receipt of the request, respectively in the
default format, in GMT time and in local time (by analogy with %t, %T
and %Tl). All of them are obviously only available for HTTP. These values
are more relevant as they more accurately represent the request date
without being skewed by a browser's preconnect nor a keep-alive idle
time.
The HTTP log format and the CLF log format have been modified to
use %tr, %TR, and %Ta respectively instead of %t, %Tq and %Tt. This
way the default log formats now produce the expected output for users
who don't want to manually fiddle with the log-format directive.
Example with the following log-format :
log-format "%ci:%cp [%tr] %ft %b/%s h=%Th/i=%Ti/R=%TR/w=%Tw/c=%Tc/r=%Tr/a=%Ta/t=%Tt %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r"
The request was sent by hand using "openssl s_client -connect" :
Aug 23 14:43:20 haproxy[25446]: 127.0.0.1:45636 [23/Aug/2016:14:43:20.221] test~ test/test h=6/i=2375/R=261/w=0/c=1/r=0/a=262/t=2643 200 145 - - ---- 1/1/0/0/0 0/0 "GET / HTTP/1.1"
=> 6 ms of SSL handshake, 2375 ms waiting before sending the first char (in
fact the time to type the first line), 261 ms before the end of the request,
no time spent in queue, 1 ms spent connecting to the server, immediate
response, total active time for this request = 262 ms. Total time from accept
to close : 2643 ms.
The timing now decomposes like this :
   first request                      2nd request
   |<-------------------------------->|<-------------- ...
   t         tr                       t    tr ...
---|----|----|----|----|----|----|----|----|--
   : Th   Ti   TR   Tw   Tc   Tr   Td : Ti   ...
   :<---- Tq ---->:                   :
   :<-------------- Tt -------------->:
             :<--------- Ta --------->:
Up to HAProxy 1.7-dev3, HAProxy used to use the first bind port from its
local 'listen' section when no port is configured on the server.
I.e., in the configuration below, the server port would be 25:
listen smtp
bind :25
server s1 1.0.0.1 check
This way of working is now obsolete and can be removed, furthermore it is not
documented!
This will make the possibility to change the server's port much easier.
The function ipcpy() simply duplicates the IP address found in one
struct sockaddr_storage into an other struct sockaddr_storage.
It also updates the family on the destination structure.
Memory of destination structure must be allocated and cleared by the
caller.
Bryan Talbot reported a very interesting bug. The sc_trackers() sample
fetch seems to have escaped the sanitization that was performed during 1.5
to ensure all dereferences of stkctr_entry() were safe. Here if a tracker
is set on a backend and is then checked against a different backend where
the entry doesn't exist, stkctr_entry() returns NULL and this is dereferenced
to retrieve the ref count. Thanks to Bryan for his detailed bug report featuring
a working config and reproducer.
This fix must be backported to 1.6 and 1.5.
After pushing the last update related to a resync process, the teacher resets
the re-connection's origin to the saved one (the pointer position when it
received the resync request). But the last acknowledgement overwrites this
pointer with an inconsistent value. In peersv2, this results in empty table
chunks being regularly pushed.
The fix consists in moving the confirm code to ensure that the confirm message
is always sent after the last acknowledgement related to the resync, and in
resetting the re-connection's origin to the saved one when a confirm message
is received.
This bug affects versions 1.6 and above.
For older versions (peersv1), this inconsistent state should not generate side
effects because chunks are not used and the next acknowledgement message resets
the pointer to a valid state.
Adding on to Thierry's work (http://git.haproxy.org/?p=haproxy.git;h=6310bef5)
I have added a few more fetchers for counters based on the tcp_info struct
maintained by the kernel :
fc_unacked, fc_sacked, fc_retrans, fc_fackets, fc_lost,
fc_reordering
Two fields were not added because they're version-dependent :
fc_rcv_rtt, fc_total_retrans
The field names depend on the operating system. FreeBSD and NetBSD prefix
all the field names with "__" so we have to rely on a few #ifdef for
portability.
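These fetches can be used like any other, for instance in a log-format
string (a sketch):
log-format "%ci:%cp -> %b/%s retrans=%[fc_retrans] lost=%[fc_lost] reord=%[fc_reordering]"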
Clang reports that this function is not used :
src/ev_poll.c:34:28: warning: unused function 'hap_fd_isset' [-Wunused-function]
static inline unsigned int hap_fd_isset(int fd, unsigned int *evts)
It's been true since the rework of the pollers in 1.5 and it's unlikely we'll
ever need it anymore, so better remove it now to provide clean builds.
This fix can be backported to 1.6 and 1.5 now.
This warning appears when building without any compression library :
src/compression.c:143:19: warning: unused function 'init_comp_ctx' [-Wunused-function]
static inline int init_comp_ctx(struct comp_ctx **comp_ctx)
^
src/compression.c:176:19: warning: unused function 'deinit_comp_ctx' [-Wunused-function]
static inline int deinit_comp_ctx(struct comp_ctx **comp_ctx)
No backport is needed.
OpenBSD emits warnings on usages of strcpy, strcat and sprintf (and
probably a few others). Here we have a single such warning in all the code
reintroduced by commit 0ba0e4a ("MEDIUM: Support sending email alerts") in
1.6-dev1. Let's get rid of it, the open-coding of strcat is as small as its
usage and the result is even more efficient.
This fix needs to be backported to 1.6.
On OpenBSD, netinet/ip.h fails unless in_systm.h is included. This
include was added by the silent-drop feature introduced with commit
2d392c2 ("MEDIUM: tcp: add new tcp action "silent-drop"") in 1.6-dev6,
but we don't need it, IP_TTL is defined in netinet/in.h, so let's drop
this useless include.
This fix needs to be backported to 1.6.
The following commit merged into 1.6-dev6 broke the build on OpenBSD :
609ac2a ("MEDIUM: log: replace sendto() with sendmsg() in __send_log()")
Including sys/uio.h is enough to fix this. This fix needs to be backported
to 1.6.
Building 1.6 and above on OpenBSD 5.2 fails due to protocol.c not
including sys/types.h before sys/socket.h :
In file included from src/protocol.c:14:
/usr/include/sys/socket.h:162: error: expected specifier-qualifier-list before 'u_int8_t'
This fix needs to be backported to 1.6.
This bug is due to a copy/paste error and was introduced
with peers-protocol v2:
The last_pushed pointer was not correctly reset to the teaching_origin at
the end of the teaching state but to the first update present in the tree.
The result: some updates were re-pushed after leaving the teaching state.
This fix needs to be backported to 1.6.
It is sometimes needed in application server environments to easily tell
if a source is local to the machine or a remote one, without necessarily
knowing all the local addresses (dhcp, vrrp, etc). Similarly in transparent
proxy configurations it is sometimes desired to tell the difference between
local and remote destination addresses.
This patch adds two new sample fetch functions for this :
dst_is_local : boolean
Returns true if the destination address of the incoming connection is local
to the system, or false if the address doesn't exist on the system, meaning
that it was intercepted in transparent mode. It can be useful to apply
certain rules by default to forwarded traffic and other rules to the traffic
targeting the real address of the machine. For example the stats page could
be delivered only on this address, or SSH access could be locally redirected.
Please note that the check involves a few system calls, so it's better to do
it only once per connection.
src_is_local : boolean
Returns true if the source address of the incoming connection is local to the
system, or false if the address doesn't exist on the system, meaning that it
comes from a remote machine. Note that UNIX addresses are considered local.
It can be useful to apply certain access restrictions based on where the
client comes from (eg: require auth or https for remote machines). Please
note that the check involves a few system calls, so it's better to do it only
once per connection.
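For instance, a hedged configuration example (not part of the patch) which
restricts the stats page to the machine's real address in a transparent
setup :

    frontend fe
        bind :80 transparent
        # deliver /stats only when the destination is a local address
        http-request deny if { path_beg /stats } !{ dst_is_local }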
There's no point in always duplicating the sample, just ensure it's
writable, as was done prior to the smp_dup() change. This should be
backported to 1.6 to avoid a performance regression caused by this
change (about 30% more time for upper/lower due to the copy).
The binary sample to stick-table key conversion is wrong. It doesn't
check that the binary sample is writable before padding it. After a
quick audit, it doesn't look like any existing sample fetch function
can trigger this bug. The correct fix consists in calling smp_make_rw()
prior to padding the sample.
This fix should be backported to 1.6.
When a stick-table key is derived from a string-based sample, it checks
if it's properly zero-terminated otherwise tries to do so. But the test
doesn't work for two reasons :
- the reported allocated size may be zero while the sample is marked as
not CONST (eg: certain sample fetch functions like smp_fetch_base()
do this), so smp_dup() prior to the recent changes will fail on this.
- the string might have been converted from a binary sample, where the
trailing zero is not appended. If the sample was writable, smp_dup()
would not modify it either and we would fail again here. This may
happen with req.payload or req.body_param for example.
The correct solution consists in calling smp_make_safe() to ensure the
sample is usable as a valid string.
This fix must be backported to 1.6.
The "sni" server directive does some bad stuff on many occasions because
it works on a sample of type string and limits len to size-1 by hand. The
problem is that size used to be zero on many occasions before the recent
changes to smp_dup() and that it effectively results in setting len to -1
and writing the zero byte *before* the string (and not terminating the
string).
This patch makes use of the recently introduced smp_make_safe() to address
this issue.
This fix must be backported to 1.6.
Vedran Furac reported a strange problem where the "base" sample fetch
would not always work for tracking purposes.
In fact, it happens that commit bc8c404 ("MAJOR: stick-tables: use sample
types in place of dedicated types") merged in 1.6 exposed a fundamental
bug related to the way samples use chunks as strings. The problem is that
chunks convey a base pointer, a length and an optional size, which may be
zero when unknown or when the chunk is allocated from a read-only location.
The sole purpose of this size is to know whether or not new data may be
appended to the chunk. This size causes some semantic issues in the sample,
which has its own SMP_F_CONST flag to indicate read-only contents.
The problem was emphasized by the commit above because it made use of new
calls to smp_dup() to convert a sample to a table key. And since smp_dup()
would only check the SMP_F_CONST flag, it would happily return read-write
samples indicating size=0.
So some tests were added upon smp_dup() return to ensure that the actual
length is smaller than size, but this in fact made things even worse. For
example, the "sni" server directive does some bad stuff on many occasions
because it limits len to size-1 and effectively sets it to -1 and writes
the zero byte before the beginning of the string!
It is therefore obvious that smp_dup() needs to be modified to take this
nature of the chunks into account. It's not enough but is needed. The core
of the problem comes from the fact that smp_dup() is called for 5 distinct
needs which are not always fulfilled :
1) duplicate a sample to keep a copy of it during some operations
2) ensure that the sample is rewritable for a converter like upper()
3) ensure that the sample is terminated with a \0
4) set a correct size on the sample
5) grow the sample in case it was extracted from a partial chunk
Case 1 is not used for now, so we can ignore it. Case 2 indicates the wish
to modify the sample, so its R/O status must be removed if any, but there's
no implied requirement that the chunk becomes larger. Case 3 is used when
the sample has to be made compatible with libc's str* functions. There's no
need to make it R/W nor to duplicate it if it is already correct. Case 4
can happen when the sample's size is required (eg: before performing some
changes that must fit in the buffer). Case 5 is more or less similar but
will happen when the sample may be grown and we want to ensure we're not
bound by the current small size.
So the proposal is to have different functions for various operations. One
will ensure a sample is safe for use with str* functions. Another one will
ensure it may be rewritten in place. And smp_dup() will have to perform an
unconditional duplication to guarantee at least #5 above, and implicitly
all other ones.
This patch only modifies smp_dup() to make the duplication unconditional. It
is enough to fix both the "base" sample fetch and the "sni" server directive,
and all use cases in general though not always optimally. More patches will
follow to address them more optimally and even better than the current
situation (eg: avoid a dup just to add a \0 when possible).
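To illustrate the idea, here is a minimal sketch (not haproxy's exact code)
of what the unconditional duplication looks like for string samples: the
contents are copied into a trash chunk, which is writable, carries a real
size, and is zero-terminated :

    int smp_dup(struct sample *smp)
    {
        struct chunk *trash;

        switch (smp->data.type) {
        case SMP_T_STR:
            trash = get_trash_chunk();
            trash->len = smp->data.u.str.len;
            if (trash->len > trash->size - 1)
                trash->len = trash->size - 1;
            memcpy(trash->str, smp->data.u.str.str, trash->len);
            trash->str[trash->len] = 0;
            smp->data.u.str = *trash;
            smp->flags &= ~SMP_F_CONST;
            break;
        default:
            /* other types are passed by value and need no copy */
            break;
        }
        return 1;
    }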
The bug comes from an ambiguous design, so its roots are old. 1.6 is affected
and a backport is needed. In 1.5, the function already existed but was only
used by two converters modifying the data in place, so the bug has no effect
there.
For quite some time, a few users have been experiencing random crashes
when compressing with zlib, from versions 1.2.3 to 1.2.8 included.
Upon thorough investigation in zlib's deflate.c, it appeared obvious
that avail_in and next_in are used during the flush operation and need
to be initialized, while admittedly it's not obvious in the documentation.
By simply forcing both values to -1 it's possible to immediately reproduce
the exact crash that these users have been experiencing :
(gdb) bt
#0 0x00007fa73ce10c00 in __memcpy_sse2 () from /lib64/libc.so.6
#1 0x00007fa73e0c5d49 in ?? () from /lib64/libz.so.1
#2 0x00007fa73e0c68e0 in ?? () from /lib64/libz.so.1
#3 0x00007fa73e0c73c7 in deflate () from /lib64/libz.so.1
#4 0x00000000004dca1c in deflate_flush_or_finish (comp_ctx=0x7b6580, out=0x7fa73e5bd010, flag=2) at src/compression.c:808
#5 0x00000000004dcb60 in deflate_flush (comp_ctx=0x7b6580, out=0x7fa73e5bd010) at src/compression.c:835
#6 0x00000000004dbc50 in http_compression_buffer_end (s=0x7c0050, in=0x7c00a8, out=0x78adf0 <tmpbuf.24662>, end=0) at src/compression.c:249
#7 0x000000000048bb5f in http_response_forward_body (s=0x7c0050, res=0x7c00a0, an_bit=1048576) at src/proto_http.c:7173
#8 0x00000000004cce54 in process_stream (t=0x7bffd8) at src/stream.c:1939
#9 0x0000000000427ddf in process_runnable_tasks () at src/task.c:238
#10 0x0000000000419892 in run_poll_loop () at src/haproxy.c:1573
#11 0x000000000041a4a5 in main (argc=4, argv=0x7fffcda38348) at src/haproxy.c:1933
Note that for all reports deflate_flush_or_finish() was always involved.
The crash is very hard to reproduce when using regular traffic because it
requires that the combination of avail_in and next_in is inadequate so
that the memcpy() call reads out of bounds. But this can very likely
happen when the input buffer points to an area reused by another stream
when the flush has been interrupted due to a full output buffer. This
also explains why this report is recent, as dynamic buffer allocation
was introduced in 1.6.
Anyway it's not acceptable to call a function with a randomly set input
buffer. The deflate() function explicitly checks for the case where both
avail_in and next_in are null and doesn't use them in this case during a
flush, so clearing both is the best solution.
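A minimal sketch of the fix, using zlib's own field names :

    #include <zlib.h>

    /* put the input state in a known, harmless position before
     * flushing: deflate() skips the input-reading path when both
     * fields are null/zero */
    strm->next_in  = NULL;
    strm->avail_in = 0;
    ret = deflate(strm, Z_SYNC_FLUSH);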
Special thanks to Sasha Litvak, James Hartshorn and Paul Bauer for
reporting very useful stack traces which were critical to finding the
root cause of this bug.
This fix must be backported into 1.6 and 1.5, though 1.5 is less likely to
trigger this case given that it keeps its own buffers allocated all along
the session's life.
Tim Butler reported a troubling issue affecting all versions since 1.5.
When a connection error occurs and a retry is performed on the same server,
the server connection first goes into the turn-around state (SI_ST_TAR) for
one second. During this time, the client may speak and try to push some
data. The tests in place confirm that the stream interface is in a state
<= SI_ST_EST and that a connection exists, so all ingredients are present
to try to perform a send() to forward data. The send() cannot be performed
since the connection's control layer is marked as not ready, but the
polling flags are changed, and due to the remaining error flag present
on the connection, the polling on the FD is disabled in both directions.
But if this FD was reassigned to another connection in the meantime, it
is this FD which is disabled, and it causes a timeout on another connection.
A configuration that allows the issue to be reproduced looks like this :
listen test
bind :8003
server s1 127.0.0.1:8001 # this one should be closed
listen victim
bind :8002
server s1 127.0.0.1:8000 # this one should respond slowly (~50ms)
Two parallel injections should be run with short timeouts (100ms). After
some time, some dead connections will appear in listener "victim" due to
their I/Os being disabled by some of the failed transfers on the "test"
instance. These will only be flushed on timeout. A dead connection
looks like this :
> show sess 0x7dcb70
0x7dcb70: [07/Aug/2016:08:58:40.120151] id=3771 proto=tcpv4 source=127.0.0.1:34682
flags=0xce, conn_retries=3, srv_conn=0x7da020, pend_pos=(nil)
frontend=victim (id=3 mode=tcp), listener=? (id=1) addr=127.0.0.1:8002
backend=victim (id=3 mode=tcp) addr=127.0.0.1:37736
server=s1 (id=1) addr=127.0.0.1:8000
task=0x7dcaf8 (state=0x08 nice=0 calls=2 exp=<NEVER> age=30s)
si[0]=0x7dcd68 (state=EST flags=0x08 endp0=CONN:0x7e2410 exp=<NEVER>, et=0x000)
si[1]=0x7dcd88 (state=EST flags=0x18 endp1=CONN:0x7e0cd0 exp=<NEVER>, et=0x000)
co0=0x7e2410 ctrl=tcpv4 xprt=RAW data=STRM target=LISTENER:0x7d9ea8
flags=0x0020b306 fd=122 fd.state=25 fd.cache=0 updt=0
co1=0x7e0cd0 ctrl=tcpv4 xprt=RAW data=STRM target=SERVER:0x7da020
flags=0x0020b306 fd=93 fd.state=20 fd.cache=0 updt=0
req=0x7dcb80 (f=0x848000 an=0x0 pipe=0 tofwd=-1 total=129)
an_exp=<NEVER> rex=<NEVER> wex=<NEVER>
buf=0x7893c0 data=0x7893d4 o=0 p=0 req.next=0 i=0 size=0
res=0x7dcbc0 (f=0x80008000 an=0x0 pipe=0 tofwd=-1 total=0)
an_exp=<NEVER> rex=<NEVER> wex=<NEVER>
buf=0x7893c0 data=0x7893d4 o=0 p=0 rsp.next=0 i=0 size=0
The solution against this issue is to completely detach the connection upon
error instead of only performing a forced close.
This fix should be backported to 1.6 and 1.5.
Special thanks to Tim who did all the troubleshooting work and provided a
lot of traces allowing to find the root cause of this problem.
Some HTTP manipulation functions are executed without valid and parsed
requests or responses. This causes a segmentation fault when the executed
code tries to modify an empty buffer.
This patch must be backported to 1.6.
This patch adds 4 new sample fetches which return the RTT of the
established connection and the RTT variance. The established connection
can be between the client and HAProxy, and between HAProxy and the
server. This is very useful for statistics. A great use case is the
estimation of the TCP connection time of the client. Note that the
RTT of the server side is not so interesting because we already have
the connect() time.
In function lf_text_len(), we used escape_chunk() to escape special
characters. There could be a problem if len is greater than the real src
string length (zero-terminated), eg. when calling lf_text_len() from
lf_text().
Similar to "escape_chunk", this function tries to prefix all characters
tagged in the <map> with the <escape> character. The specified <string>
contains the input to be escaped.
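A minimal sketch of such a function; the prototype is illustrative, not
necessarily the exact one :

    #include <sys/select.h>

    /* Copies <string> into the <start>..<stop> area, prefixing any
     * character tagged in <map> with <escape>. Returns the position
     * after the last written byte.
     */
    static char *escape_string(char *start, char *stop, char escape,
                               const fd_set *map, const char *string)
    {
        while (*string && start < stop) {
            if (FD_ISSET((unsigned char)*string, map)) {
                if (start + 2 > stop)
                    break;
                *start++ = escape;
            }
            *start++ = *string++;
        }
        if (start < stop)
            *start = '\0';
        return start;
    }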
Ruoshan Huang found that the call to session_inc_http_err_ctr() in the
recent http-response patch was not a good idea because it also increments
counters that are already tracked (eg: http-request track-sc or previous
http-response track-sc). Better open-code the update, it's simple.
This enables tracking of sticky counters from current response. The only
difference from "http-request track-sc" is the <key> sample expression
can only make use of samples in response (eg. res.*, status etc.) and
samples below Layer 6.
If an action wrapper stops the processing of the transaction
with the txn_done() function, the return code of the action is
"continue", so other processing, like adding headers, may still
be performed. However, the HTTP content has already been flushed
and a segfault occurs.
This patch adds a flag indicating that the Lua code wants to
stop the processing; this flag is forwarded to the haproxy core,
and other actions are ignored.
Must be backported to 1.6
The function txn_done() ends a transaction. It does not make
sense to call this function from a lua sample-fetch wrapper,
because the role of a sample-fetch is not to terminate a
transaction.
This patch modifies the role of the function txn_done() when it
is called from a sample-fetch wrapper: now it just ends the
execution of the Lua code, like the done() function.
Must be backported to 1.6
Alexander Lebedev reported that the response bit is set on SPARC when
DNS queries are sent. This has been tracked down to an endianness issue, so
this patch makes the code portable.
Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>
Alexander Lebedev reported that the DNS parser crashes in 1.6 with a bus
error on Sparc when it receives a response. This is obviously caused by
some alignment issues. The issue can also be reproduced on ARMv5 when
setting /proc/cpu/alignment to 4 (which helps debugging).
Two places cause this crash in turn, the first one is when the IP address
from the packet is compared to the current one, and the second place is
when the address is assigned because an unaligned address is passed to
update_server_addr().
This patch modifies these places to properly use memcpy() and memcmp()
to manipulate the unaligned data.
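The pattern is the usual one for unaligned data; the names here are
illustrative :

    #include <string.h>
    #include <netinet/in.h>

    struct in_addr ip;

    /* never dereference a potentially unaligned pointer directly:
     * copy the bytes to an aligned local variable first, then
     * compare or assign the aligned copies */
    memcpy(&ip, unaligned_ptr, sizeof(ip));
    if (memcmp(&ip, &cur_addr, sizeof(ip)) != 0)
        cur_addr = ip;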
Nenad Merdanovic found another set of places specific to 1.7 in functions
in_net_ipv4() and in_net_ipv6(), which are used to compare networks. 1.6
has the functions but does not use them. There we perform a temporary copy
to a local variable to fix the problem. The type of the function's argument
is wrong since it's not necessarily aligned, so we change it to a const
void * instead.
This fix must be backported to 1.6. Note that in 1.6 the code is slightly
different, there's no rec[] array, the pointer is used directly from the
buffer.
Roberto Guimaraes reported that Valgrind complains about a leak
in ssl_get_dh_1024().
This is caused by an oversight in ssl_sock_load_dh_params(),
where local_dh_1024 is always replaced by a new DH object even if
it already holds one. This patch simply checks whether local_dh_1024
is NULL before calling ssl_get_dh_1024().
Originally, the Linux tcphdr's "source" and "dest" fields were used to get
the source and destination ports, which led to a build issue on BSD OSes.
To avoid further network-related portability problems, we just use an
internal struct since we only need those two fields.
This reverts commit 0ea4c23ca7.
Certain very simple confs randomly segfault upon startup with openssl 1.0.2
with this patch, which seems to indicate a use after free. Better drop it
and let valgrind complain about the potential leak.
Also it's worth noting that the man page for SSL_CTX_set_tmp_dh() makes no
mention about whether or not the element should be freed, and the example
provided does not use it either.
This fix should be backported to 1.6 and 1.5 where the patch was just
included.
Changed all the cases where the pointer passed to realloc is overwritten
by the pointer returned by realloc. The new function my_realloc2 has
been used except in function register_name. If register_name fails to
add a new variable because of an "out of memory" error, all the existing
variables remain valid. If we had used my_realloc2, the array of variables
would have been freed.
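A minimal sketch of such a wrapper :

    #include <stdlib.h>

    /* like realloc(), except that on failure the original area is
     * freed so that overwriting the old pointer cannot leak it. Do
     * NOT use it when the old contents must survive a failed
     * reallocation (eg: register_name above).
     */
    void *my_realloc2(void *ptr, size_t size)
    {
        void *ret;

        ret = realloc(ptr, size);
        if (!ret && size)
            free(ptr);
        return ret;
    }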
In commit 9962f8fc (BUG/MEDIUM: http: unbreak uri/header/url_param hashing), we
take care to update 'msg->sov' value when the parser changes to state
HTTP_MSG_DONE. This works when no filter is used. But, if a filter is used and
if it loops on 'http_end' callback, the following block is evaluated two times
consecutively:
if (unlikely(!(chn->flags & CF_WROTE_DATA) || msg->sov > 0))
msg->sov -= ret;
Today, in practice, because this happens when all data are parsed and forwarded,
the second test always fails (after the first update, msg->sov is always lower
than or equal to 0). But it is useless and error-prone. So to avoid
misunderstanding, the code has been slightly changed. Now, in all cases, we try
to update msg->sov only once per iteration.
No backport is needed.
Vedran Furac reported that "balance uri" doesn't work anymore in recent
1.7-dev versions. Dragan Dosen found that the first faulty commit was
dbe34eb ("MEDIUM: filters/http: Move body parsing of HTTP messages in
dedicated functions"), merged in 1.7-dev2.
After this patch, the hashing is performed on uninitialized data,
indicating that the buffer is not correctly rewound. In fact, all forms
of content-based hashing are broken since the commit above. Upon code
inspection, it appears that the new functions http_msg_forward_chunked_body()
and http_msg_forward_body() forget to rewind the buffer in the success
case, when the parser changes to state HTTP_MSG_DONE. The rewinding code
was reinserted in both functions and the fix was confirmed by two tests,
with and without chunking.
No backport is needed.
Kay Fuchs reported that the error message is misleading in response
captures because it suggests that "len" is accepted while it's not.
This needs to be backported to 1.6.
Eric Webster reported that the state file wouldn't reload in 1.6.5
while it used to work in 1.6.4. The issue is that headers are now
missing from the output when a specific backend is dumped since
commit 4c1544d ("BUG/MEDIUM: stats: show servers state may show an
empty or incomplete result"). This patch fixes this by introducing
a dump state.
It must be backported to 1.6.
A filter can choose to loop on data forwarding. When this loop occurs in
HTTP_MSG_ENDING state, http_forward_data callbacks are called twice because of
a goto on the wrong label.
A filter can also choose to loop at the end of an HTTP message, in the
http_end callback. Here the goto is good but the label is not at the right
place. We must be sure to update the msg->sov value.
Filters can alter data during the parsing, i.e when http_data or tcp_data
callbacks are called. For now, the update must be done by hand. So we must
handle changes in the channel buffers, especially on the number of input bytes
pending (buf->i).
In addition, a filter can choose to switch channel buffers to do its
updates. So, during data filtering, we must always use the right buffer and
not use local variables to reference them.
Without this patch, filters cannot safely alter data during the data parsing.
There's no point in blocking/unblocking sigchld when removing entries
from the list since the code is called asynchronously.
Similarly the blocking/unblocking could be removed from the connect_proc_chk()
function but it happens that at high signal rates, fork() takes twice as much
time to execute as it is regularly interrupted by a signal, so in the end this
signal blocking is beneficial there for performance reasons.
The external checks code makes use of block_sigchld() and unblock_sigchld()
to ensure nobody modifies the signals list while they're being manipulated.
It happens that these functions clear the list of blocked signals, so they
can possibly have a side effect if other signals are blocked. For now no
other signal is blocked but it may very well change in the future, so let's
correctly use SIG_BLOCK/SIG_UNBLOCK instead of touching unrelated signals.
This fix should be backported to 1.6 for correctness.
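A minimal sketch of the corrected pattern :

    #include <signal.h>

    /* only add/remove SIGCHLD in the blocked set instead of
     * replacing the whole process mask with SIG_SETMASK */
    static void block_sigchld(void)
    {
        sigset_t set;

        sigemptyset(&set);
        sigaddset(&set, SIGCHLD);
        sigprocmask(SIG_BLOCK, &set, NULL);
    }

    static void unblock_sigchld(void)
    {
        sigset_t set;

        sigemptyset(&set);
        sigaddset(&set, SIGCHLD);
        sigprocmask(SIG_UNBLOCK, &set, NULL);
    }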
There are random segfaults occurring when using external checks. The
reason is that when receiving a SIGCHLD, a call to task_wakeup() is
performed. There are two situations where this causes trouble :
- the scheduler is in process_running_tasks(), since task_wakeup()
sets rq_next to NULL, when the former dereferences it to get the
next pointer, the program crashes ;
- when another task_wakeup() is being called and during eb_next()
in process_running_tasks(), because the structure of the run queue
tree changes while it is being processed.
The solution against this is to use asynchronous signal processing
thanks to the internal signal API. The signal handler is registered,
and upon delivery, the signal is added to the queue and processed
out of any other processing.
This code was stressed at 2500 forks/s and their respective signals
for quite some time and the segfault is now gone.
Lukas Erlacher reported an interesting problem : since we don't close
FDs after the fork() upon external checks, any script executed that
writes data on stdout/stderr will possibly send its data to wrong
places, very likely an existing connection.
After some analysis, the problem is even wider. It's not enough to
just close stdin/stdout/stderr, as all sockets are passed to the
sub-process, and as long as they're not closed, they are usable for
whatever mistake can be done. Furthermore with epoll, such FDs will
continue to be reported after a close() as the underlying file is
not closed yet.
CLOEXEC would be an acceptable workaround except that 1) it adds an
extra syscall on the fast path, and 2) we have no control over FDs
created by external libs (eg: openssl using /dev/crypto, libc using
/dev/random, lua using anything else), so in the end we still need
to close them all.
On some BSD systems there's a closefrom() syscall which could be
very useful for this.
Based on an insightful idea from Simon Horman, we don't close 0/1/2
when we're in verbose mode since they're properly connected to
stdin/stdout/stderr and can become quite useful during debugging
sessions to detect some script output errors or execve() failures.
This fix must be backported to 1.6.
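The child-side cleanup roughly looks like this (a minimal sketch; the
<verbose> flag is illustrative and the limit comes from the global
settings) :

    #include <unistd.h>

    int fd;

    /* in the child, right after fork(): close every inherited fd,
     * keeping stdin/stdout/stderr only in verbose mode */
    for (fd = verbose ? 3 : 0; fd < global.rlimit_nofile; fd++)
        close(fd);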
When the requested amount of FDs cannot be allocated, setrlimit() fails.
That's bad because if the limit is set to 1024 and we need 10000, we
stay on 1024 while we could possibly raise it to 4096 thanks to rlim_max.
This patch takes care of trying to assign rlim_cur to rlim_max on failure
so that we get as much as possible if we can't get all we need. The case
is particularly visible when starting haproxy as a non-privileged user
and a large maxconn is specified in the configuration.
Another point of doing this is that it is the only way to allow us to
close inherited FDs upon fork(), ie those between rlim_cur and rlim_max.
This patch may be backported to 1.6 and 1.5.
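A minimal sketch of the fallback :

    #include <sys/resource.h>

    struct rlimit limit;

    getrlimit(RLIMIT_NOFILE, &limit);
    limit.rlim_cur = global.rlimit_nofile;
    if (setrlimit(RLIMIT_NOFILE, &limit) == -1) {
        /* the requested value was refused: retry with the kernel's
         * hard limit to get as much as we can */
        getrlimit(RLIMIT_NOFILE, &limit);
        limit.rlim_cur = limit.rlim_max;
        if (setrlimit(RLIMIT_NOFILE, &limit) != -1)
            global.rlimit_nofile = limit.rlim_cur;
    }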
global.rlimit_nofile contains the maximum number of file descriptors that
can be allocated, except if the user is not allowed to reach this limit,
in which case it still contains the initially requested value. It is important
that this value always matches what is really configured so that it is
properly reported in the stats and that we can use it later to close
all FDs without wasting time closing impossible FDs.
This fix may be backported to 1.6 and 1.5.
This configures the client-facing connection to receive a NetScaler
Client IP insertion protocol header before any byte is read from the
socket. This is equivalent to having the "accept-netscaler-cip" keyword
on the "bind" line, except that using the TCP rule allows the NetScaler
CIP protocol to be accepted only for certain IP address ranges using an ACL.
This is convenient when multiple layers of load balancers are passed
through by traffic coming from public hosts.
When a NetScaler application switch is used as an L3+ switch, information
regarding the original IP and TCP headers is lost as a new TCP
connection is created between the NetScaler and the backend server.
NetScaler provides a feature to insert into the TCP data the original data,
which can then be consumed by the backend server.
Specifications and documentation from NetScaler :
https://support.citrix.com/article/CTX205670
https://www.citrix.com/blogs/2016/04/25/how-to-enable-client-ip-in-tcpip-option-of-netscaler/
When CIP is enabled on the NetScaler, then a TCP packet is inserted just after
the TCP handshake. This is composed as:
- CIP magic number : 4 bytes
Both sender and receiver have to agree on a magic number so that
they both handle the incoming data as a NetScaler Client IP insertion
packet.
- Header length : 4 bytes
Defines the length of the remaining data.
- IP header : >= 20 bytes if IPv4, 40 bytes if IPv6
Contains the header of the last IP packet sent by the client during TCP
handshake.
- TCP header : >= 20 bytes
Contains the header of the last TCP packet sent by the client during TCP
handshake.
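The fixed part of the preamble can thus be sketched as follows; the field
names are illustrative :

    #include <stdint.h>

    struct netscaler_cip {
        uint32_t magic;    /* agreed-upon CIP magic number */
        uint32_t hdr_len;  /* length of the remaining data */
        /* followed by the client's original IP header (IPv4 or
         * IPv6), then its original TCP header */
    };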
I just noticed that SSL wouldn't build anymore since this afternoon's patch :
src/ssl_sock.c: In function 'ssl_sock_load_multi_cert':
src/ssl_sock.c:1982:26: warning: left-hand operand of comma expression has no effect [-Wunused-value]
for (i = 0; i < fcount, i++)
^
src/ssl_sock.c:1982:31: error: expected ';' before ')' token
for (i = 0; i < fcount, i++)
^
Makefile:791: recipe for target 'src/ssl_sock.o' failed
make: *** [src/ssl_sock.o] Error 1
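The fix simply replaces the comma with the expected semicolon :

    for (i = 0; i < fcount; i++)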
SNI filters used to be ignored with multicerts (eg: those providing
ECDSA and RSA at the same time). This patch makes them work like
other certs.
Note: most of the changes in this patch are due to an extra level of
indent, read it with "git show -b".
In function smp_fetch_url32_src(), it's better to check the value of
cli_conn before we go any further.
This patch needs to be backported to 1.6 and 1.5.
This is similar to the commit 5ad6e1dc ("BUG/MINOR: http: base32+src
should use the big endian version of base32"). Now we convert url32 to big
endian when building the binary block.
This patch needs to be backported to 1.6 and 1.5.
The previous dump algorithm was not trying to yield when the buffer is
full. This is not a problem with TLS_TICKETS_NO set to its default of 3,
but it can become one if the buffer size is lowered or if TLS_TICKETS_NO
is increased.
The index of the latest ticket dumped is now stored to ensure we can
resume the dump after a yield.
The function stats_tlskeys_list() could hit undefined behaviour when
called with appctx->st2 == STAT_ST_LIST, since the ref pointer was used
uninitialized.
However, this function was using NULL in appctx->ctx.tlskeys.ref as a
flag to dump every ticket from every reference. A real flag,
appctx->ctx.tlskeys.dump_all, is now used for this behavior.
This patch deletes the 'ref' variable and uses appctx->ctx.tlskeys.ref
directly.
Valgrind reports that the memory allocated in ssl_get_dh_1024() was leaking.
Upon further inspection of openssl code, it seems that SSL_CTX_set_tmp_dh
makes a copy of the data, so calling DH_free afterwards makes sense.
If we use the action "http-request add-header" with a Lua sample-fetch or
converter, and the Lua function calls one of the Lua log functions, the
header name is corrupted: it contains an extract of the last logged data.
This is due to an overwrite of the trash buffer, because its scope is not
respected in the "add-header" function. The scope of the trash buffer must
be limited to the function using it. The build_logline() function can
execute a lot of other functions which can use the trash buffer.
This patch fixes the usage of the trash buffer. It limits the scope of this
global buffer to the local function: we first build the header value using
build_logline(), and then we store the header name.
Thanks to Michael Ezzell for reporting this.
This patch must be backported to 1.6
The header name was copied twice into the buffer. The first copy is a
printf-like function writing the name and the HTTP separators into the
buffer, and the second one is a memcpy. This seems to be inherited from some
changes. This patch removes the printf-like format.
This patch must be backported to 1.6 and 1.5
The number of arguments pushed on the stack is wrong, so we try to execute a
function outside of the stack. This function is always a nil pointer, so the
following message is displayed.
Lua converter 'testconv': runtime error: attempt to call a nil value.
Thanks to Michael Ezzell for reporting this.
This patch must be backported to 1.6.
When a stick table is tracked, and another one is used later in the
configuration, a segfault occurs.
The function "smp_create_src_stkctr" can return a NULL value, and
its return value is not tested, so another function tries to dereference
a NULL pointer. This patch just adds a check for the NULL
pointer.
The problem is reproduced with this configuration:
listen www
mode http
bind :12345
tcp-request content track-sc0 src table IPv4
http-request allow if { sc0_inc_gpc0(IPv6) gt 0 }
server dummy 127.0.0.1:80
backend IPv4
stick-table type ip size 10 expire 60s store gpc0
backend IPv6
stick-table type ipv6 size 10 expire 60s store gpc0
Thanks to kabefuna@gmail.com for the bug report.
This patch must be backported to 1.6 and 1.5.
The 'set-src' action was not available for tcp actions. The action code
has been converted into a function in proto_tcp.c to be used for both
'http-request' and 'tcp-request connection' actions.
Both http and tcp keywords are registered in proto_tcp.c
The reference to the tls_keys_ref was not deleted from the
tlskeys_reference linked list.
When SSL is misconfigured, it can lead to an access to freed memory
during a "show tls-keys" on the admin socket.
Olivier Doucet reported that "show servers state" was producing an invalid
output with some configurations where nbproc > 1.
Indeed, commit 76a99784f4 fixed some issues but unfortunately introduced a
regression when a backend is bound to the same process as the stats socket
while a previous backend is bound to another one.
For example :
global
daemon
nbproc 2
stats socket /var/run/haproxy-1.sock process 1
stats socket /var/run/haproxy-2.sock process 2
listen proc1
bind 127.0.0.1:9001
bind-process 1
server WRONG 127.0.0.1:80
listen proc2
bind 127.0.0.1:9002
bind-process 2
server RIGHT 127.0.0.1:80
Requesting "show servers state" on /var/run/haproxy-2.sock was producing a line
like :
3 proc2 1 WRONG 127.0.0.1 2 0 1 1 4 1 0 2 0 0 0 0
whereas the line below was awaited :
3 proc2 1 RIGHT 127.0.0.1 2 0 1 1 5 1 0 2 0 0 0 0
This was caused by initializing the server loop too early, before the
bind_proc filtering, whereas it should be done after.
This fix should be backported to 1.6, where the regression has unfortunately
been backported.
Ben Cabot reported that after commit 5e4261b ("CLEANUP: config:
detect double registration of a config section") recently introduced
in 1.7-dev, it's not possible anymore to load multiple configuration
files. Bryan Talbot provided a simple reproducer to exhibit the issue.
It turns out that function readcfgfile() registers new parsers for
section keywords for each new file. In addition to being useless, this
has the negative effect of wasting memory and slowing down the config
parser as the number of configuration files increases.
This fix only needs to be backported if/where the commit above is
backported.
htonll()/ntohll() already exist on Solaris 11 with a different declaration,
causing a build error as reported by Jonathan Fisher. They used to exist on
OSX with a #define which allowed us to detect them. It was a bad idea to give
these functions a name subject to conflicts like this. Simply rename them
my_htonll()/my_ntohll() to definitely get rid of the conflict.
This patch must be backported to 1.6.
The stick-table converters used to take a string on input because it
was the only type that could be casted to from any other type. This is
inefficient and possibly inaccurate sometimes. For example in order to
look up an IP address, it must first be converted to a string then
converted back to an IP address.
We've had SMP_T_ANY introduced long ago in 1.6, but unfortunately it
was not propagated to these converters, so let's do it now.
It's important to note that a few direct type conversions which already
would not make any sense are not possible (for example, converting a
boolean to an IP address or an HTTP method to an integer). While this
would have caused the lookup to be performed on the wrong key, now the
lookup will fail and the converter will return no data. While there
should not be any case where this happens, it's probably best to avoid
backporting this change before a longer observation period.
Baptiste reported that the table_conn_rate() converter would always
return zero in 1.6.5. In fact, commit bc8c404 ("MAJOR: stick-tables:
use sample types in place of dedicated types") broke all stick-table
converters because smp_to_stkey() now returns a pointer to the sample
instead of holding a copy of the key, and the converters used to
reinitialize the sample prior to performing the lookup. Only
"in_table()" continued to work.
The construct is still fragile, so some comments were added to a few
functions to clarify their impacts. It's also worth noting that there
is no point anymore in forcing these converters to take a string on
input, but that will be changed in another commit.
The bug was introduced in 1.6-dev4, this fix must be backported to 1.6.
Commit 108b1dd ("MEDIUM: http: configurable http result codes for
http-request deny") introduced in 1.6-dev2 was incomplete. It introduced
a new field "rule_deny_status" into struct http_txn, which is filled only
by actions "http-request deny" and "http-request tarpit". It's then used
in the deny code path to emit the proper error message, but is used
uninitialized when the deny comes from a "reqdeny" rule, causing random
behaviours ranging from returning a 200 or an empty response to crashing
the process. Often upon startup only a 200 was returned, but once the fields
are reused the crash happens. This can be sped up using -dM.
There's no need at all for storing this status in the http_txn struct
anyway since it's used immediately after being set. Let's store it in
a temporary variable instead which is passed as an argument to function
http_req_get_intercept_rule().
As an extra benefit, removing it from struct http_txn reduced the size
of this struct by 8 bytes.
This fix must be backported to 1.6 where the bug was detected. Special
thanks to Falco Schmutz for his detailed report including an exploitable
core and a reproducer.
When compiled with GCC 6, the IP address specified for a frontend was
ignored and HAProxy was listening on all addresses instead. This is
caused by an incomplete copy of a "struct sockaddr_storage".
With the GNU Libc, "struct sockaddr_storage" is defined as this:
struct sockaddr_storage
{
sa_family_t ss_family;
unsigned long int __ss_align;
char __ss_padding[(128 - (2 * sizeof (unsigned long int)))];
};
Doing an aggregate copy (ss1 = ss2) is different from using memcpy():
only the members of the aggregate have to be copied; notably, the padding
may or may not be copied. In GCC 6, some optimizations use this fact and if a
"struct sockaddr_storage" contains a "struct sockaddr_in", the port and
the address are part of the padding (between sa_family and __ss_align)
and may not be copied over.
Therefore, we replace any aggregate copy by a memcpy(). There is another
place using the same pattern. We also fix a function receiving a "struct
sockaddr_storage" by copy instead of by reference. Since it only needs a
read-only copy, the function is converted to request a reference.
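A minimal sketch of the difference :

    #include <string.h>
    #include <sys/socket.h>

    struct sockaddr_storage ss1, ss2; /* ss2 filled elsewhere */

    /* ss1 = ss2;  -- unsafe: GCC 6 may not copy the padding where
     *                sockaddr_in's port and address actually live */
    memcpy(&ss1, &ss2, sizeof(ss1));  /* safe: copies padding too */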
This patch removes setlocale from the main function. It was introduced
by commit 379d9c7 ("MEDIUM: init: allow directory as argument of -f")
in 1.7-dev a few commits ago after a discussion on the mailing list.
Some regexes may have different behaviours depending on the
locale. Some Lua scripts may change their behaviour too
(http://lua-users.org/wiki/LuaLocales).
Without this patch (haproxy is using setlocale) :
$ cat locale.cfg
defaults
mode http
frontend test
bind :9000
mode http
use_backend testbk if { hdr_reg(X-Test) ^\w+$ }
backend testbk
mode http
server s 127.0.0.1:80
$ LANG=fr_FR.UTF-8 ./haproxy -f locale.cfg
$ curl -i -H "X-Test: échec" localhost:9000
HTTP/1.1 200 OK
...
$ LANG=C ./haproxy -f locale.cfg
$ curl -i -H "X-Test: échec" localhost:9000
HTTP/1.0 503 Service Unavailable
...
'channel_analyze' callback has been removed. Now, there are 2 callbacks to
surround calls to analyzers:
* channel_pre_analyze: Called BEFORE all filterable analyzers. It can be
called many times for the same analyzer, once at each loop until the
analyzer finishes its processing. This callback is resumable, it returns a
negative value if an error occurs, 0 if it needs to wait, any other value
otherwise.
* channel_post_analyze: Called AFTER all filterable analyzers. Here, AFTER
means when an analyzer finishes its processing. This callback is NOT
resumable, it returns a negative value if an error occurs, any other value
otherwise.
Pre and post analyzer callbacks are not automatically called. 'pre_analyzers'
and 'post_analyzers' bit fields in the filter structure must be set to the right
value using AN_* flags (see include/types/channel.h).
The flag AN_RES_ALL has been added (AN_REQ_ALL already exists) to ease the life
of filter developers. AN_REQ_ALL and AN_RES_ALL include all filterable
analyzers.
Now, to call an analyzer in the 'process_stream' function, we should use the
FLT_ANALYZE or ANALYZE macros, depending on whether this is a filterable
analyzer or not.
Instead of calling 'channel_analyze' callback with the flag AN_FLT_HTTP_HDRS,
now we use the new callback 'http_headers'. This change is done because
'channel_analyze' callback will be removed in a next commit.
As suggested by Pavlos, it's too bad that we didn't have a %Td log
format tag given that there are a few mentions of Td corresponding
to the data transmission time already in the doc, so this is now done.
Just like the other specifiers, we report -1 if the connection failed
before reaching the data transmission state.
In an effort to make the config parser more robust, we should validate
that everything we register is not already registered. Most cfg_register_*
functions unfortunately return void and just perform a LIST_ADDQ(), so they
will have to change for this. At least cfg_register_section() does perform
a bit of checks and is easy to check for such errors, so let's start with
this one. Future patches will definitely have to focus on the remaining
functions and ensure unicity of all config parsers.
If the -f argument is a directory, add all the files (and only files) it
contains to the config files list.
These files are added in lexical order (respecting LC_COLLATE).
Only files with the ".cfg" extension are added.
Only non-hidden files (not prefixed with ".") are added.
Symlinks are followed.
The -f order is still respected:
$ tree -a rootdir
rootdir
|-- dir1
| |-- .6.cfg
| |-- 1.cfg
| |-- 2
| |-- 3.cfg
| |-- 4.cfg -> 1.cfg
| |-- 5 -> 1.cfg
| |-- 7.cfg -> .
| `-- dir4
| `-- 8.cfg
|-- dir2
| |-- 10.cfg
| `-- 9.cfg
|-- dir3
| `-- 11.cfg
|-- link -> dir3/
|-- root1
|-- root2
`-- root3
$ ./haproxy -C rootdir -f root2 -f dir2 -f root3 -f dir1 \
-f link -f root1
root2
dir2/10.cfg
dir2/9.cfg
root3
dir1/1.cfg
dir1/3.cfg
dir1/4.cfg
link/11.cfg
root1
This can be useful on systemd where you can't change the haproxy
command line options on service reload.
int list_append_word(struct list *li, const char *str, char **err)
Append a copy of string <str> (inside a wordlist) at the end of
the list <li>.
The caller is responsible for freeing the <err> area and the <str> copy
using free().
On failure : return 0 and <err> filled with an error message.
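A hedged usage example, assuming haproxy's usual list initializer :

    #include <stdio.h>
    #include <stdlib.h>

    struct list args = LIST_HEAD_INIT(args);
    char *err = NULL;

    if (!list_append_word(&args, "foo", &err)) {
        fprintf(stderr, "parse error: %s\n", err);
        free(err);
    }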
Released version 1.7-dev3 with the following main changes :
- MINOR: sample: Moves ARGS underlying type from 32 to 64 bits.
- BUG/MINOR: log: Don't use strftime() which can clobber timezone if chrooted
- BUILD: namespaces: fix a potential build warning in namespaces.c
- MINOR: da: Using ARG12 macro for the sample fetch and the convertor.
- DOC: add encoding to json converter example
- BUG/MINOR: conf: "listener id" expects integer, but its not checked
- DOC: Clarify tunes.vars.xxx-max-size settings
- CLEANUP: chunk: adding NULL check to chunk_dup allocation.
- CLEANUP: connection: fix double negation on memcmp()
- BUG/MEDIUM: peers: fix incorrect age in frequency counters
- BUG/MEDIUM: Fix RFC5077 resumption when more than TLS_TICKETS_NO are present
- BUG/MAJOR: Fix crash in http_get_fhdr with exactly MAX_HDR_HISTORY headers
- BUG/MINOR: lua: can't load external libraries
- BUG/MINOR: prevent the dump of uninitialized vars
- CLEANUP: map: it seems that the map were planed to be chained
- MINOR: lua: move class registration facilities
- MINOR: lua: remove some useless checks
- CLEANUP: lua: Remove two same functions
- MINOR: lua: refactor the Lua object registration
- MINOR: lua: precise message when a critical error is catched
- MINOR: lua: post initialization
- MINOR: lua: Add internal function which strip spaces
- MINOR: lua: convert field to lua type
- DOC: "addr" parameter applies to both health and agent checks
- DOC: timeout client: pointers to timeout http-request
- DOC: typo on stick-store response
- DOC: stick-table: amend paragraph blaming the loss of table upon reload
- DOC: typo: ACL subdir match
- DOC: typo: maxconn paragraph is wrong due to a wrong buffer size
- DOC: regsub: parser limitation about the inability to use closing square brackets
- DOC: typo: req.uri is now replaced by capture.req.uri
- DOC: name set-gpt0 mismatch with the expected keyword
- MINOR: http: sample fetch which returns unique-id
- MINOR: dumpstats: extract stats fields enum and names
- MINOR: dumpstats: split stats_dump_info_to_buffer() in two parts
- MINOR: dumpstats: split stats_dump_fe_stats() in two parts
- MINOR: dumpstats: split stats_dump_li_stats() in two parts
- MINOR: dumpstats: split stats_dump_sv_stats() in two parts
- MINOR: dumpstats: split stats_dump_be_stats() in two parts
- MINOR: lua: dump general info
- MINOR: lua: add class proxy
- MINOR: lua: add class server
- MINOR: lua: add class listener
- BUG/MEDIUM: stick-tables: some sample-fetch doesn't work in the connection state.
- MEDIUM: proxy: use dynamic allocation for error dumps
- CLEANUP: remove unneeded casts
- CLEANUP: uniformize last argument of malloc/calloc
- DOC: fix "needed" typo
- BUG/MINOR: dumpstats: fix write to global chunk
- BUG/MINOR: dns: inapropriate way out after a resolution timeout
- BUG/MINOR: dns: trigger a DNS query type change on resolution timeout
- CLEANUP: proto_http: few corrections for gcc warnings.
- BUG/MINOR: DNS: resolution structure change
- BUG/MINOR : allow to log cookie for tarpit and denied request
- BUG/MEDIUM: ssl: rewind the BIO when reading certificates
- OPTIM/MINOR: session: abort if possible before connecting to the backend
- DOC: http: rename the unique-id sample and add the documentation
- BUG/MEDIUM: trace.c: rdtsc() is defined in two files
- BUG/MEDIUM: channel: fix miscalculation of available buffer space (2nd try)
- BUG/MINOR: server: risk of over reading the pref_net array.
- BUG/MINOR: cfgparse: couple of small memory leaks.
- BUG/MEDIUM: sample: initialize the pointer before parse_binary call.
- DOC: fix discrepancy in the example for http-request redirect
- MINOR: acl: Add predefined METH_DELETE, METH_PUT
- CLEANUP: .gitignore cleanup
- DOC: Clarify IPv4 address / mask notation rules
- CLEANUP: fix inconsistency between fd->iocb, proto->accept and accept()
- BUG/MEDIUM: fix maxaccept computation on per-process listeners
- BUG/MINOR: listener: stop unbound listeners on startup
- BUG/MINOR: fix maxaccept computation according to the frontend process range
- TESTS: add blocksig.c to run tests with all signals blocked
- MEDIUM: unblock signals on startup.
- MINOR: filters: Print the list of existing filters during HA startup
- MINOR: filters: Typo in an error message
- MINOR: filters: Filters must define the callbacks struct during config parsing
- DOC: filters: Add filters documentation
- BUG/MEDIUM: channel: don't allow to overwrite the reserve until connected
- BUG/MEDIUM: channel: incorrect polling condition may delay event delivery
- BUG/MEDIUM: channel: fix miscalculation of available buffer space (3rd try)
- BUG/MEDIUM: log: fix risk of segfault when logging HTTP fields in TCP mode
- MINOR: Add ability for agent-check to set server maxconn
- CLEANUP: Use server_parse_maxconn_change_request for maxconn CLI updates
- MINOR: filters: add opaque data
- BUG/MEDIUM: lua: protects the upper boundary of the argument list for converters/fetches.
- MINOR: lua: migrate the argument mask to 64 bits type.
- BUG/MINOR: dumpstats: Fix the "Total bytes saved" counter in backends stats
- BUG/MINOR: log: fix a typo that would cause %HP to log <BADREQ>
- BUG/MEDIUM: http: fix incorrect reporting of server errors
- MINOR: channel: add new function channel_congested()
- BUG/MEDIUM: http: fix risk of CPU spikes with pipelined requests from dead client
- BUG/MAJOR: channel: fix miscalculation of available buffer space (4th try)
- BUG/MEDIUM: stream: ensure the SI_FL_DONT_WAKE flag is properly cleared
- BUG/MEDIUM: channel: fix inconsistent handling of 4GB-1 transfers
- BUG/MEDIUM: stats: show servers state may show an empty or incomplete result
- BUG/MEDIUM: stats: show backend may show an empty or incomplete result
- MINOR: stats: fix typo in help messages
- MINOR: stats: show stat resolvers missing in the help message
- BUG/MINOR: dns: fix DNS header definition
- BUG/MEDIUM: dns: fix alignment issue when building DNS queries
- CLEANUP: don't ignore scripts in .gitignore
- BUILD: add a few release and backport scripts in scripts/
On some architectures, unaligned access is not authorized. On most
architectures, it is just slower. Therefore, we have to use memcpy()
when an unaligned access is needed, specifically when writing the qinfo.
Also remove the unaligned access performed when reading the answer count
in the answer. It's likely that this instruction was optimized away by the
compiler since it is unneeded. Add a comment to explain why we use 7 as
an offset instead of 6: it is not an unaligned access since "resp" is
"unsigned char", which is then promoted to int.
The help message provided by the "help" command on the unix socket didn't
provide "show stat resolvers" in the list of supported commands.
This patch should be backported to 1.6
This is the same issue as "show servers state", where the result is incorrect
if the data can't fit in one buffer. A similar fix is applied, to restart
the data processing where it stopped as buffers are sent to the client.
This fix should be backported to haproxy 1.6
It was reported that the unix socket command "show servers state" returned an
empty response while "show servers state <backend>" worked.
In fact, both cases can reproduce the issue. It happens when the response can't
fit in one buffer.
The fix consists in processing the response in several steps, as it is done in
some others commands, by restarting where it was stopped after the buffer is
sent to the client.
This fix should be backported to haproxy 1.6
In 1.4-dev3, commit 31971e5 ("[MEDIUM] add support for infinite forwarding")
made it possible to configure the lower layer to forward data indefinitely
by setting the forward size to CHN_INFINITE_FORWARD (4GB-1). By then larger
chunk sizes were not supported so there was no confusion in the usage of the
function.
Since 1.5 we support 64-bit content-lengths and chunk sizes and the function
has grown to support 64-bit arguments, though it still limits a single pass
to 32-bit quantities (what fits in the channel's to_forward field). The issue
now becomes that a 4GB-1 content-length can be confused with infinite
forwarding (in fact it's 4GB-1+what was already in the buffer). It causes a
visible effect when transferring this exact size because the transfer rate
is lower than with other sizes due in part to the disabling of the Nagle
algorithm on the sendto() call.
In theory with keep-alive it should prevent a second request from being
processed after such a transfer, but since the analysers are still present,
the forwarding analyser properly counts down the remaining size to transfer
and ultimately the transaction gets correctly reset so there is no visible
effect.
Since the root cause of the issue is an API problem (lack of distinction
between a real valid length and a magic value), this patch modifies the API
to have a new dedicated function called channel_forward_forever() to program
a permanent forwarding. The existing function __channel_forward() was modified
to properly take care of the requested sizes and ensure it 1) never overflows
and 2) never reaches CHN_INFINITE_FORWARD by accident.
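A minimal sketch of the new API function (close to, though not necessarily
exactly, the final code) :

    /* schedule permanent forwarding explicitly instead of passing
     * the CHN_INFINITE_FORWARD magic value as a size */
    static inline void channel_forward_forever(struct channel *chn)
    {
        chn->to_forward = CHN_INFINITE_FORWARD;
    }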
It is worth noting that the function used to have a bug causing a 2GB
forward to be scheduled if it was called with less data than what is present
in buf->i. Fortunately this bug couldn't be triggered with existing code.
This fix should be backported to 1.6 and 1.5. While it also theoretically
affects 1.4, it's better not to backport it there, as the risk of breaking
large object transfers due to significant API differences is high, compared
to the fact that the largest supported objects (4GB-1) are just slower to
transfer.
The previous buffer space bug has revealed an issue causing some stalled
connections to remain orphaned forever, preventing an old process from
dying. The issue is that once in a while a task may be woken up because
a disabled expiration timer has been reached despite no timeout being
reached. In this case we exit very early but the SI_FL_DONT_WAKE flag
wasn't cleared, resulting in new events not waking the task up. It may
be one of the reasons why a few people have already observed some peers
connections stuck in CLOSE_WAIT state.
This bug was introduced in 1.5-dev13 by commit 798f432 ("OPTIM: session:
don't process the whole session when only timers need a refresh"), so
the fix must be backported to 1.6 and 1.5.
Since client-side HTTP keep-alive was introduced in 1.4-dev, a good
number of corner cases had to be dealt with. One of them remained and
caused some occasional CPU spikes that Cyril Bonté had the opportunity
to observe from time to time and even recently to capture for deeper
analysis.
What happens is the following scenario :
1) a client sends a first request which causes the server to respond
using chunked encoding
2) the server starts to respond with a large response that doesn't
fit into a single buffer
3) the response buffer fills up with the response
4) the client reads the response while it is being sent by the server.
5) haproxy eventually receives the end of the response from the server,
which occupies the whole response buffer (must at least override the
reserve), parses it and prepares to receive a second request while
sending the last data blocks to the client. It then reinitializes the
whole http_txn and calls http_wait_for_request(), which arms the
http-request timeout.
6) at this exact moment the client emits a second request, probably
asking for an object referenced in the first page while parsing it
on the fly, and silently disappears from the internet (internet
access cut or software having trouble with pipelined requests).
7) the second request arrives into the request buffer and the response
data stall in the response buffer, preventing the reserve from being
used for anything else.
8) haproxy calls http_wait_for_request() to parse the next request which
has just arrived, but since it sees the response buffer is full, it
refrains from doing so and waits for some data to leave or a timeout
to strike.
9) the http-request timeout strikes, causing http_wait_for_request() to
be called again, and the function immediately returns since it cannot
even produce an error message, and the loop is maintained here.
10) the client timeout strikes, aborting the loop.
At first glance a check for timeout would be needed *before* considering
the buffer in http_wait_for_request(), but the issue is not there in fact,
because when doing so we see in the logs a client timeout error while
waiting for a request, which is wrong. The real issue is that we must not
consider the first transaction terminated until at least the reserve is
released so that http_wait_for_request() has no problem starting to process
a new request. This allows the first response to be reported in timeout and
not the second request. A more restrictive control could consist in not
considering the request complete until the response buffer has no more
outgoing data but this brings no added value and needlessly increases the
number of wake-ups when dealing with pipelining.
Note that the same issue exists with the request, we must ensure that any
POST data are finished being forwarded to the server before accepting a new
request. This case is much harder to trigger however as servers rarely
disappear and if they do so, they impact all their sessions at once.
No specific reproducer is usable because the bug is very hard to reproduce
and depends on the system as well with settings varying across reboots. On
a test machine, socket buffers were reduced to 4096 (tcp_rmem and tcp_wmem),
haproxy's buffers were set to 16384 and tune.maxrewrite to 12288. The proxy
must work in http-server-close mode, with a request timeout smaller than the
client timeout. The test is run like this :
$ (printf "GET /15000 HTTP/1.1\r\n\r\n"; usleep 100000; \
printf "GET / HTTP/1.0\r\n\r\n"; sleep 15) | nc6 --send-only 0 8002
The server returns chunks of the requested size (15000 bytes here, but
78000 in a previous test was the only working value). Strace must show the
last recvfrom() succeed and the last sendto() being shorter than desired or
better, not being called.
This fix must be backported to 1.6, 1.5 and 1.4. It was made in a way that
should make it possible to backport it. It depends on channel_congested()
which also needs to be backported. Further cleanup of http_wait_for_request()
could be made given that some of the tests are now useless.
Commit dbe34eb ("MEDIUM: filters/http: Move body parsing of HTTP
messages in dedicated functions") introduced a bug in function
http_response_forward_body() by getting rid of the while(1) loop.
The code immediately following the loop was only reachable on missing
data but now it's also reachable under normal conditions, which used
to be dealt with by the skip_resync_state label returning zero. The
side effect is that in http_server_close situations, the channel's
SHUTR flag is seen and considered as a server error which is reported
if any other error happens (eg: client timeout).
This bug is specific to 1.7, no backport is needed.
Typo was introduced in 57bc891 ("BUG/MEDIUM: log: fix risk of
segfault when logging HTTP fields in TCP mode") which inverted the
condition in the test and caused <BADREQ> to be logged when using
%HP.
Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>
Instead of subtracting ST_F_COMP_OUT (Compression out) from ST_F_COMP_IN
(Compression in) in the backend stats, ST_F_COMP_BYP (Compression bypass) was used.
When a converter or sample fetch is called from within Lua code, there is
a risk of using invalid argument string data when the upper boundary is
reached.
It would be kind to port this to 1.6 if possible.
Add opaque data between the filter keyword registration and the parsing
function. This opaque data allows the same parser to be used with different
registered keywords; it carries the data which makes the difference
between the two keywords.
It will be used with Lua keyword registration.
This is very useful in complex architectures where HAProxy is balancing
DB connections, for example. We want to keep maxconn high in order to
avoid issues with queueing at the LB level when there is slowness in
another part of the system. An example is an architecture where each
thread opens multiple DB connections which, if they get stuck in the
queue, cause a snowball effect (old connections aren't closed, new ones
cannot be established). These connections are mostly idle and the DB
server has no problem handling thousands of them.
Allowing us to dynamically set maxconn depending on the backend usage
(LA, CPU, memory, etc.) enables us to have a high maxconn for situations
like the above, but to lower it in case of real issues where the backend
servers become overloaded (cache issues, DB gets hit hard).
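As an illustration only (this commit message doesn't name the command;
the runtime API syntax and socket path below are assumptions), such a
dynamic adjustment could be driven from the stats socket:
$ echo "set maxconn server bk_db/db1 500" | socat stdio /var/run/haproxy.sock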
David Torgerson faced an issue when using HTTP fields in log-format in TCP
sections. The txn is dereferenced while it's null, resulting in a crash of
the process. Such configurations are invalid and a warning is emitted, but
nevertheless the process must not crash. As found by Lukas Tribus, this is
a side effect of the split between the stream and the HTTP transaction that
happened in 1.6, making it possible to have txn==NULL there.
The fix consists in checking that txn is valid before using it. Fortunately
it's easy since almost all places already used to check for the existence
of a field (eg: txn->uri).
This patch should be backported to 1.6.
A problem was reported recently by some users of programs compiled
with Go 1.5 which by default blocks all signals before executing
processes, resulting in haproxy not receiving SIGUSR1 or even SIGTERM,
causing lots of zombie processes.
This problem was apparently observed by users of consul and kubernetes
(at least).
This patch is a workaround for this issue. It consists in unblocking
all signals on startup. Since they're normally not blocked in a regular
shell, it ensures haproxy always starts under the same conditions.
Quite useful information reported by both Matti Savolainen and REN
Xiaolei actually helped find the root cause of this problem and this
workaround. Thanks to them for this.
This patch must be backported to 1.6 and 1.5 where the problem is
observed.
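A minimal sketch of this kind of workaround (not necessarily haproxy's
exact code):
    sigset_t set;

    /* start from an empty mask so that no signal inherited from the
     * parent process remains blocked after startup */
    sigemptyset(&set);
    sigprocmask(SIG_SETMASK, &set, NULL);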
Commit 7c0ffd23 only considers the explicit use of the "process" keyword
on the listeners. But at this step, if it's not defined in the configuration,
the listener's bind_proc mask is set to 0. As a result, the code will compute
the maxaccept value based on only 1 process, which is not always true.
For example :
global
nbproc 4
frontend test
bind-process 1-2
bind :80
Here, the maxaccept value for the "test" frontend was set to the global
tune.maxaccept value (default 64), whereas it should consider that 2 processes
will accept connections. According to the documentation, the value should be
divided by twice the number of processes the listener is bound to.
To fix this, we can consider that if no mask is set to the listener, we take
the frontend mask.
This is not critical but it can introduce an unfair distribution of the
incoming connections across the processes.
It should be backported to the same branches as commit 7c0ffd23 (1.6 and 1.5
were in the scope).
When a listener is not bound to a process its frontend belongs to, it
is only paused and not stopped. This creates confusion from the outside
as "netstat -ltnp" for example will report only the parent process as
the listener instead of the effective one. "ss -lnp" will report that
all processes are listening to all sockets.
This is confusing enough to suggest a fix. Now we simply stop the unused
listeners. Example with this simple config :
global
nbproc 4
frontend haproxy_test
bind-process 1-40
bind :12345 process 1
bind :12345 process 2
bind :12345 process 3
bind :12345 process 4
Before the patch :
$ netstat -ltnp
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:12345 0.0.0.0:* LISTEN 30457/./haproxy
tcp 0 0 0.0.0.0:12345 0.0.0.0:* LISTEN 30457/./haproxy
tcp 0 0 0.0.0.0:12345 0.0.0.0:* LISTEN 30457/./haproxy
tcp 0 0 0.0.0.0:12345 0.0.0.0:* LISTEN 30457/./haproxy
After the patch :
$ netstat -ltnp
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:12345 0.0.0.0:* LISTEN 30504/./haproxy
tcp 0 0 0.0.0.0:12345 0.0.0.0:* LISTEN 30503/./haproxy
tcp 0 0 0.0.0.0:12345 0.0.0.0:* LISTEN 30502/./haproxy
tcp 0 0 0.0.0.0:12345 0.0.0.0:* LISTEN 30501/./haproxy
This patch may be backported to 1.6 and 1.5, but it relies on commit
7a798e5 ("CLEANUP: fix inconsistency between fd->iocb, proto->accept
and accept()") since it will expose an API inconsistency by including
listener.h in the .c.
Christian Ruppert reported a performance degradation when binding a
single frontend to many processes while only one bind line was being
used, bound to a single process.
The reason comes from the fact that whenever a listener is bound to
multiple processes, it is assigned a maxaccept value which equals
half the global maxaccept value divided by the number of processes the
frontend is bound to. The purpose is to ensure that no single process
will drain all the incoming requests at once and ensure a fair share
between all listeners. Usually this works pretty well, when a listener
is bound to all the processes of its frontend. But here we're in a
situation where the maxaccept of a listener which is bound to a single
process is still divided by a large value.
The fix consists in taking into account the number of processes the
listener is bound to and not only those of the frontend. This way it
is perfectly possible to benefit from nbproc and SO_REUSEPORT without
performance degradation.
1.6 and 1.5 normally suffer from the same issue.
There's quite some inconsistency in the internal API. listener_accept()
which is the main accept() function returns void but is declared as int
in the include file. It's assigned to proto->accept() for all stream
protocols where an int is expected but the result is never checked (nor
is it documented by the way). This proto->accept() is in turn assigned
to fd->iocb() which is supposed to return an int composed of FD_WAIT_*
flags, but which is never checked either.
So let's fix all this mess :
- nobody checks accept()'s return
- nobody checks iocb()'s return
- nobody sets a return value
=> let's mark all these functions void and keep the current ones intact.
Additionally we now include listener.h from listener.c to ensure we won't
silently hide this incoherency in the future.
Note that this patch could/should be backported to 1.6 and even 1.5 to
simplify debugging sessions.
parse_binary line 2025 checks the nullity of the binstr parameter.
Other callers of parse_binary properly zero this parameter.
[wt: this could result in random failures of the const parser]
The dns_option struct's pref_net field is an array of 5 entries. The issue
here shows that pref_net_nb can go up to 5 as well, which might lead
to a read outside the bounds of this array.
Depending on the path that led to sess_update_stream_int(), it's
possible that we had a read error on the frontend, but that we haven't
checked if we may abort the connection.
This was seen in particular with the following setup: tcp mode, with
abortonclose set, frontend using SSL. If the SSL connection had a first
successful read, but the second read failed, we would still try to open a
connection to the backend, although we had enough information to close
the connection early.
sess_update_stream_int() had some logic to handle that case in the
SI_ST_QUE and SI_ST_TAR, but that was missing in the SI_ST_ASS case.
This patch addresses the issue by verifying the state of the req
channel (and the abortonclose option) right before opening the
connection to the backend, so we have the opportunity to close the
connection there, and it factorizes the shared SI_ST_{QUE,TAR,ASS} code.
Emeric found that some certificate files that were valid with the old method
(the one with the explicit name involving SSL_CTX_use_PrivateKey_file()) do
not work anymore with the new one (the one trying to load multiple cert types
using PEM_read_bio_PrivateKey()). With the last one, the private key couldn't
be loaded.
The difference was related to the ordering of elements in the PEM file. The
old method would always work. The new method only works if the private key is
at the top, or if it appears as an "EC" private key. The cause in fact is that
we never rewind the BIO between the various calls. So this patch moves the
loading of the private key as the first step, then it rewinds the BIO, and
then it loads the cert and the chain. With this everything works.
No backport is needed, this issue came with the recent addition of the
multi-cert support.
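A rough sketch of the resulting loading order (simplified; variable
declarations and error handling omitted, the actual haproxy code differs):
    /* 1. load the private key first, wherever it sits in the file */
    key = PEM_read_bio_PrivateKey(in, NULL, NULL, NULL);

    /* 2. rewind the BIO so it can be parsed again from the start */
    BIO_reset(in);

    /* 3. then load the certificate and its chain */
    cert = PEM_read_bio_X509_AUX(in, NULL, NULL, NULL);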
The following patch allows logging the cookie for tarpitted and denied
requests.
This minor bug affects at least the 1.5, 1.6 and 1.7 branches.
The solution is not perfect: maybe the cookie processing
(manage_client_side_cookies) could be moved into http_process_req_common.
060e57301d introduced a bug, related to a
dns option structure change and an improper rebase.
Thanks Lukas Tribus for reporting it.
backport: 1.7 and above
First, we modify the signatures of http_msg_forward_body and
http_msg_forward_chunked_body as they are declared as inline
below. Second, we just verify the return values of the chunk initialization
which holds the Authorization Method (although it is unlikely to fail ...).
Both from gcc warnings.
After Cedric Jeanneret reported an issue with HAProxy and DNS resolution
when multiple servers are in use, I saw that the DNS query type update
on resolution timeout was not implemented, even though it is documented.
backport: 1.6 and above
A bug led HAProxy to stop DNS resolution when multiple servers were
configured and one was in timeout: the request was not resent.
The current code fixes this issue.
backport status: 1.6 and above
This just happens to work as it is the correct chunk, but it should be
whatever gets passed in as an argument.
Signed-off-by: Conrad Hoffmann <conrad@soundcloud.com>
Instead of repeating the type of the LHS argument (sizeof(struct ...))
in calls to malloc/calloc, we directly use the pointer
name (sizeof(*...)). The following Coccinelle patch was used:
@@
type T;
T *x;
@@
x = malloc(
- sizeof(T)
+ sizeof(*x)
)
@@
type T;
T *x;
@@
x = calloc(1,
- sizeof(T)
+ sizeof(*x)
)
When the LHS is not just a variable name, no change is made. Moreover,
the following patch was used to ensure that "1" is consistently used as
a first argument of calloc, not the last one:
@@
@@
calloc(
+ 1,
...
- ,1
)
In C89, "void *" is automatically promoted to any pointer type. Casting
the result of malloc/calloc to the type of the LHS variable is therefore
unneeded.
Most of this patch was built using this Coccinelle patch:
@@
type T;
@@
- (T *)
(\(lua_touserdata\|malloc\|calloc\|SSL_get_app_data\|hlua_checkudata\|lua_newuserdata\)(...))
@@
type T;
T *x;
void *data;
@@
x =
- (T *)
data
@@
type T;
T *x;
T *data;
@@
x =
- (T *)
data
Unfortunately, either Coccinelle or I is too limited to detect situations
where a complex RHS expression is of type "void *" and therefore casting
is not needed. Those cases were manually examined and corrected.
There are two issues with error captures. The first one is that the
capture size is still hard-coded to BUFSIZE regardless of any possible
tune.bufsize setting and of the fact that frontends only capture request
errors and that backends only capture response errors. The second is that
captures are allocated in both directions for all proxies, which start to
count a lot in configs using thousands of proxies.
This patch changes this so that error captures are allocated only when
needed, and of the proper size. It also refrains from dumping a buffer
that was not allocated, which still allows to emit all relevant info
such as flags and HTTP states. This way it is possible to save up to
32 kB of RAM per proxy in the default configuration.
The sc_* sample fetches can work without the struct strm, because the
tracked counters are also stored in the session. So, this patch
removes the check for the strm's existence.
This bug is recent and was introduced in 1.7-dev2 by commit 6204cd9
("BUG/MAJOR: vars: always retrieve the stream and session from the sample")
This bugfix must be backported in 1.6.
This patch splits the function stats_dump_be_stats() in two parts. The
extracted part is called stats_fill_be_stats(), and just fills the stats buffer.
This split allows the usage of preformatted stats in other parts of HAProxy
like the Lua.
This patch splits the function stats_dump_sv_stats() in two parts. The
extracted part is called stats_fill_sv_stats(), and just fills the stats buffer.
This split allows the usage of preformatted stats in other parts of HAProxy
like the Lua.
This patch splits the function stats_dump_li_stats() in two parts. The
extracted part is called stats_fill_li_stats(), and just fills the stats buffer.
This split allows the usage of preformatted stats in other parts of HAProxy
like the Lua.
This patch splits the function stats_dump_fe_stats() in two parts. The
extracted part is called stats_fill_fe_stats(), and just fills the stats buffer.
This split allows the usage of preformatted stats in other parts of HAProxy
like the Lua.
This patch splits the function stats_dump_info_to_buffer() in two parts. The
extracted part is called stats_fill_info(), and just fills the stats buffer.
This split allows the usage of preformatted stats in other parts of HAProxy
like the Lua.
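All five splits follow the same pattern; roughly (this signature is an
assumption based on the description above, not the verbatim code):
    /* fills <stats> with at most <len> raw frontend fields and returns
     * non-zero on success; formatting is left entirely to the callers */
    int stats_fill_fe_stats(struct proxy *px, struct field *stats, int len);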
This patch adds a sample fetch which returns the unique-id if it is
configured. If the unique-id is not yet generated, it builds it. If
the unique-id is not configured, it returns nothing.
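For example, assuming the new fetch is simply named "unique-id", it could
be used to propagate the ID to the server:
    unique-id-format %{+X}o\ %ci:%cp_%fi:%fp_%Ts_%rt:%pid
    http-request set-header X-Unique-ID %[unique-id]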
Some internal HAProxy error messages are provided with a final '\n'.
That's typically for the integration in the CLI. Sometimes, these messages
are returned as Lua strings. These strings must be without a final "\n" or
trailing spaces.
This patch adds a function which removes these unwanted characters.
This patch adds a Lua post-initialisation wrapper. It already exists for
pure Lua functions; now it also executes C functions. It is useful for
doing things when the configuration is ready to use. For example we can
browse and register all the proxies.
All the HAProxy Lua objects are declared with the same pattern:
- Add the function __tostring which dumps the object name
- Register the name in the Lua REGISTRY
- Register the reference ID
These actions are refactored into one function. This removes some
lines of code.
The modified functions are declared in the safe environment, so
they must be called from a safe environment. As the environment is
safe, it's useless to check the stack size.
The functions
- hlua_class_const_int()
- hlua_class_const_str()
- hlua_class_function()
are used for common class registration actions.
The function 'hlua_dump_object()' is a generic object-name dump function.
These functions can be used by all the HAProxy objects, so I moved
them into the safe functions file.
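For illustration, such a registration helper essentially boils down to
this (a sketch, not the verbatim code):
    /* expects the class (meta)table on top of the Lua stack */
    void hlua_class_function(lua_State *L, const char *name,
                             int (*function)(lua_State *L))
    {
        lua_pushstring(L, name);
        lua_pushcclosure(L, function, 0);
        lua_rawset(L, -3);
    }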
Similar issue was fixed in 67dad27, but the fix is incomplete. Crash still
happened when utilizing req.fhdr() and sending exactly MAX_HDR_HISTORY
headers.
This fix needs to be backported to 1.5 and 1.6.
Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>
Olivier Doucet reported the issue on the ML and tested that when using
more than TLS_TICKETS_NO keys in the file, the CPU usage is much higher
than expected.
Lukas Tribus then provided a test case which showed that resumption doesn't
work at all in that case.
This fix needs to be backported to 1.6.
Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>
The frequency counter's window start is sent as "now - freq.date",
which is a positive age compared to the current date. But on receipt,
this age was added to the current date instead of subtracted. So
since the date was always in the future, they were always expired if
the activity changed side in less than the counter's measuring period
(eg: 10s).
This bug was reported by Christian Ruppert who also provided an easy
reproducer.
It needs to be backported to 1.6.
Regarding the minor update introduced in the
cd6c3c7cb4 commit, the DeviceAtlas
module is now able to use up to 12 device properties via the
new ARG12 macro.
I just met this warning today, making me realize that haproxy's
headers were included prior to the system ones, so all #ifndefs
are taken first and then the system redefines them. Simply move
haproxy includes after the system's. This should be backported
to 1.6 as well.
In file included from /usr/include/bits/fcntl.h:61:0,
from /usr/include/fcntl.h:35,
from src/namespace.c:13:
/usr/include/bits/fcntl-linux.h:203:0: warning: "F_SETPIPE_SZ" redefined [enabled by default]
In file included from include/common/config.h:26:0,
from include/proto/log.h:29,
from src/namespace.c:7:
include/common/compat.h:81:0: note: this is the location of the previous definition
The strftime() function can call tzset() internally on some platforms.
When haproxy is chrooted, the /etc/localtime file is not found, and some
implementations will clobber the content of the current timezone.
The GMT offset is computed by diffing the times returned by gmtime_r() and
localtime_r(). These variants are guaranteed to not call tzset() and were
already used in haproxy while chrooted, so they should be safe.
This patch must be backported to 1.6 and 1.5.
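A minimal sketch of the approach (not haproxy's exact code):
    /* compute the local GMT offset in seconds without calling tzset() */
    static int gmt_offset(time_t t)
    {
        struct tm gmt, loc;
        int days;

        gmtime_r(&t, &gmt);
        localtime_r(&t, &loc);

        days = loc.tm_yday - gmt.tm_yday;
        if (days > 1 || days < -1)    /* year boundary wrap */
            days = days < 0 ? 1 : -1;

        return days * 86400
             + (loc.tm_hour - gmt.tm_hour) * 3600
             + (loc.tm_min  - gmt.tm_min)  * 60;
    }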
Released version 1.7-dev2 with the following main changes :
- DOC: lua: fix lua API
- DOC: mailers: typo in 'hostname' description
- DOC: compression: missing mention of libslz for compression algorithm
- BUILD/MINOR: regex: missing header
- BUG/MINOR: stream: bad return code
- DOC: lua: fix somme errors and add implicit types
- MINOR: lua: add set/get priv for applets
- BUG/MINOR: http: fix several off-by-one errors in the url_param parser
- BUG/MINOR: http: Be sure to process all the data received from a server
- MINOR: filters/http: Use a wrapper function instead of stream_int_retnclose
- BUG/MINOR: chunk: make chunk_dup() always check and set dst->size
- DOC: ssl: fixed some formatting errors in crt tag
- MINOR: chunks: ensure that chunk_strcpy() adds a trailing zero
- MINOR: chunks: add chunk_strcat() and chunk_newstr()
- MINOR: chunk: make chunk_initstr() take a const string
- MEDIUM: tools: add csv_enc_append() to preserve the original chunk
- MINOR: tools: make csv_enc_append() always start at the first byte of the chunk
- MINOR: lru: new function to delete <nb> least recently used keys
- DOC: add Ben Shillito as the maintainer of 51d
- BUG/MINOR: 51d: Ensures a unique domain for each configuration
- BUG/MINOR: 51d: Aligns Pattern cache implementation with HAProxy best practices.
- BUG/MINOR: 51d: Releases workset back to pool.
- BUG/MINOR: 51d: Aligned const pointers to changes in 51Degrees.
- CLEANUP: 51d: Aligned if statements with HAProxy best practices and removed casts from malloc.
- MINOR: rename master process name in -Ds (systemd mode)
- DOC: fix a few spelling mistakes
- DOC: fix "workaround" spelling
- BUG/MINOR: examples: Fixing haproxy.spec to remove references to .cfg files
- MINOR: fix the return type for dns_response_get_query_id() function
- MINOR: server state: missing LF (\n) on error message printed when parsing server state file
- BUG/MEDIUM: dns: no DNS resolution happens if no ports provided to the nameserver
- BUG/MAJOR: servers state: server port is erased when dns resolution is enabled on a server
- BUG/MEDIUM: servers state: server port is used uninitialized
- BUG/MEDIUM: config: Adding validation to stick-table expire value.
- BUG/MEDIUM: sample: http_date() doesn't provide the right day of the week
- BUG/MEDIUM: channel: fix miscalculation of available buffer space.
- MEDIUM: pools: add a new flag to avoid rounding pool size up
- BUG/MEDIUM: buffers: do not round up buffer size during allocation
- BUG/MINOR: stream: don't force retries if the server is DOWN
- BUG/MINOR: counters: make the sc-inc-gpc0 and sc-set-gpt0 touch the table
- MINOR: unix: don't mention free ports on EAGAIN
- BUG/CLEANUP: CLI: report the proper field states in "show sess"
- MINOR: stats: send content-length with the redirect to allow keep-alive
- BUG: stream_interface: Reuse connection even if the output channel is empty
- DOC: remove old tunnel mode assumptions
- BUG/MAJOR: http-reuse: fix risk of orphaned connections
- BUG/MEDIUM: http-reuse: do not share private connections across backends
- BUG/MINOR: ssl: Be sure to use unique serial for regenerated certificates
- BUG/MINOR: stats: fix missing comma in stats on agent drain
- MAJOR: filters: Add filters support
- MINOR: filters: Do not reset stream analyzers if the client is gone
- REORG: filters: Prepare creation of the HTTP compression filter
- MAJOR: filters/http: Rewrite the HTTP compression as a filter
- MEDIUM: filters: Use macros to call filters callbacks to speed-up processing
- MEDIUM: filters: remove http_start_chunk, http_last_chunk and http_chunk_end
- MEDIUM: filters: Replace filter_http_headers callback by an analyzer
- MEDIUM: filters/http: Move body parsing of HTTP messages in dedicated functions
- MINOR: filters: Add stream_filters structure to hide filters info
- MAJOR: filters: Require explicit registration to filter HTTP body and TCP data
- MINOR: filters: Remove unused or useless stuff and do small optimizations
- MEDIUM: filters: Optimize the HTTP compression for chunk encoded response
- MINOR: filters/http: Slightly update the parsing of chunks
- MINOR: filters/http: Forward remaining data when a channel has no "data" filters
- MINOR: filters: Add an filter example
- MINOR: filters: Extract proxy stuff from the struct filter
- MINOR: map: Add regex matching replacement
- BUG/MINOR: lua: unsafe initialization
- DOC: lua: fix somme errors
- MINOR: lua: file dedicated to unsafe functions
- MINOR: lua: add "now" time function
- MINOR: standard: add RFC HTTP date parser
- MINOR: lua: Add date functions
- MINOR: lua: move common function
- MINOR: lua: merge function
- MINOR: lua: Add concat class
- MINOR: standard: add function "escape_chunk"
- MEDIUM: log: add a new log format flag "E"
- DOC: add server name at rate-limit sessions example
- BUG/MEDIUM: ssl: fix off-by-one in ALPN list allocation
- BUG/MEDIUM: ssl: fix off-by-one in NPN list allocation
- DOC: LUA: fix some typos and syntax errors
- MINOR: cli: add a new "show env" command
- MEDIUM: config: allow to manipulate environment variables in the global section
- MEDIUM: cfgparse: reject incorrect 'timeout retry' keyword spelling in resolvers
- MINOR: mailers: increase default timeout to 10 seconds
- MINOR: mailers: use <CRLF> for all line endings
- BUG/MAJOR: lua: segfault using Concat object
- DOC: lua: copyrights
- MINOR: common: mask conversion
- MEDIUM: dns: extract options
- MEDIUM: dns: add a "resolve-net" option which allow to prefer an ip in a network
- MINOR: mailers: make it possible to configure the connection timeout
- BUG/MAJOR: lua: applets can't sleep.
- BUG/MINOR: server: some prototypes are renamed
- BUG/MINOR: lua: Useless copy
- BUG/MEDIUM: stats: stats bind-process doesn't propagate the process mask correctly
- BUG/MINOR: server: fix the format of the warning on address change
- CLEANUP: server: add "const" to some message strings
- MINOR: server: generalize the "updater" source
- BUG/MEDIUM: chunks: always reject negative-length chunks
- BUG/MINOR: systemd: ensure we don't miss signals
- BUG/MINOR: systemd: report the correct signal in debug message output
- BUG/MINOR: systemd: propagate the correct signal to haproxy
- MINOR: systemd: ensure a reload doesn't mask a stop
- BUG/MEDIUM: cfgparse: wrong argument offset after parsing server "sni" keyword
- CLEANUP: stats: Avoid computation with uninitialized bits.
- CLEANUP: pattern: Ignore unknown samples in pat_match_ip().
- CLEANUP: map: Avoid memory leak in out-of-memory condition.
- BUG/MINOR: tcpcheck: fix incorrect list usage resulting in failure to load certain configs
- BUG/MAJOR: samples: check smp->strm before using it
- MINOR: sample: add a new helper to initialize the owner of a sample
- MINOR: sample: always set a new sample's owner before evaluating it
- BUG/MAJOR: vars: always retrieve the stream and session from the sample
- CLEANUP: payload: remove useless and confusing nullity checks for channel buffer
- BUG/MINOR: ssl: fix usage of the various sample fetch functions
- MINOR: stats: create fields types suitable for all CSV output data
- MINOR: stats: add all the "show info" fields in a table
- MEDIUM: stats: fill all the show info elements prior to displaying them
- MINOR: stats: add a function to emit fields into a chunk
- MINOR: stats: add stats_dump_info_fields() to dump one field per line
- MEDIUM: stats: make use of stats_dump_info_fields() for "show info"
- MINOR: stats: add a declaration of all stats fields
- MINOR: stats: don't hard-code the CSV fields list anymore
- MINOR: stats: create stats fields storage and CSV dump function
- MEDIUM: stats: convert stats_dump_fe_stats() to use stats_dump_fields_csv()
- MEDIUM: stats: make stats_dump_fe_stats() use stats fields for HTML dump
- MEDIUM: stats: convert stats_dump_li_stats() to use stats_dump_fields_csv()
- MEDIUM: stats: make stats_dump_li_stats() use stats fields for HTML dump
- MEDIUM: stats: convert stats_dump_be_stats() to use stats_dump_fields_csv()
- MEDIUM: stats: make stats_dump_be_stats() use stats fields for HTML dump
- MEDIUM: stats: convert stats_dump_sv_stats() to use stats_dump_fields_csv()
- MEDIUM: stats: make stats_dump_sv_stats() use the stats field for HTML
- MEDIUM: stats: move the server state coloring logic to the server dump function
- MINOR: stats: do not use srv->admin & STATS_ADMF_MAINT in HTML dumps
- MINOR: stats: do not check srv->state for SRV_ST_STOPPED in HTML dumps
- MINOR: stats: make CSV report server check status only when enabled
- MINOR: stats: only report backend's down time if it has servers
- MINOR: stats: prepend '*' in front of the check status when in progress
- MINOR: stats: make HTML stats dump rely on the table for the check status
- MINOR: stats: add agent_status, agent_code, agent_duration to output
- MINOR: stats: add check_desc and agent_desc to the output fields
- MINOR: stats: add check and agent's health values in the output
- MEDIUM: stats: make the HTML server state dump use the CSV states
- MEDIUM: stats: only report observe errors when observe is set
- MEDIUM: stats: expose the same flags for CLI and HTTP accesses
- MEDIUM: stats: report server's address in the CSV output
- MEDIUM: stats: report the cookie value in the server & backend CSV dumps
- MEDIUM: stats: compute the color code only in the HTML form
- MEDIUM: stats: report the listeners' address in the CSV output
- MEDIUM: stats: make it possible to report the WAITING state for listeners
- REORG: stats: dump the frontend's HTML stats via a generic function
- REORG: stats: dump the socket stats via the generic function
- REORG: stats: dump the server stats via the generic function
- REORG: stats: dump the backend stats via the generic function
- MEDIUM: stats: add a new "mode" column to report the proxy mode
- MINOR: stats: report the load balancing algorithm in CSV output
- MINOR: stats: add 3 fields to report the frontend-specific connection stats
- MINOR: stats: report number of intercepted requests for frontend and backends
- MINOR: stats: introduce stats_dump_one_line() to dump one stats line
- CLEANUP: stats: make stats_dump_fields_html() not rely on proxy anymore
- MINOR: stats: add ST_SHOWADMIN to pass the admin info in the regular flags
- MINOR: stats: make stats_dump_fields_html() not use &trash by default
- MINOR: stats: add functions to emit typed fields into a chunk
- MEDIUM: stats: support "show info typed" on the CLI
- MEDIUM: stats: implement a typed output format for stats
- DOC: document the "show info typed" and "show stat typed" output formats
- MINOR: cfgparse: warn when uid parameter is not a number
- MINOR: cfgparse: warn when gid parameter is not a number
- BUG/MINOR: standard: Avoid free of non-allocated pointer
- BUG/MINOR: pattern: Avoid memory leak on out-of-memory condition
- CLEANUP: http: fix a build warning introduced by a recent fix
- BUG/MINOR: log: GMT offset not updated when entering/leaving DST
GMT offset used in local time formats was computed at startup, but was not updated when DST status changed while running.
For example these two RFC5424 syslog traces were emitted 5 seconds apart, just before and after DST changed:
<14>1 2016-03-27T01:59:58+01:00 bunch-VirtualBox haproxy 2098 - - Connect ...
<14>1 2016-03-27T03:00:03+01:00 bunch-VirtualBox haproxy 2098 - - Connect ...
It looked like they were emitted more than 1 hour apart, unlike with the fix:
<14>1 2016-03-27T01:59:58+01:00 bunch-VirtualBox haproxy 3381 - - Connect ...
<14>1 2016-03-27T03:00:03+02:00 bunch-VirtualBox haproxy 3381 - - Connect ...
This patch should be backported to 1.6 and partially to 1.5 (no fix needed in log.c).
Cyril reported that recent commit 320ec2a ("BUG/MEDIUM: chunks: always
reject negative-length chunks") introduced a build warning because gcc
cannot guess that we can't fall into the case where the auth_method
chunk is not initialized.
This patch addresses it, though for the long term it would be best
if chunk_initlen() would always initialize the result.
This fix must be backported to 1.6 and 1.5 where the aforementioned
fix was already backported.
pattern_new_expr() failed to free the allocated list element when an
out-of-memory error occurs during initialization of the element. As
this only happens when loading the configuration file or evaluating
commands via the CLI, it is unlikely for this leak to be relevant
unless the user makes automated, heavy use of the CLI.
Found in HAProxy 1.5.14.
The original author forgot to dereference the argument to free in
parse_binary. This may result in a crash on reading bad input from
the configuration file instead of a proper error message.
Found in HAProxy 1.5.14.
Currently, no warning is emitted when the gid is not a number.
The purpose of this warning is to let admins know that their configuration
won't be applied as expected.
Currently, no warning is emitted when the uid is not a number.
The purpose of this warning is to let admins know that their configuration
won't be applied as expected.
The output for each field is :
field:<origin><nature><scope>:type:value
where field reminds the type of the object being dumped as well as its
position (pid, iid, sid), field number and field name. This way a
monitoring utility may very well report all available information without
knowing new fields in advance.
This format is also supported in the HTTP version of the stats by adding
";typed" after the URI, instead of ";csv" for the CSV format.
The doc was not updated yet.
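For illustration only, a typed stats line would look roughly like this
(the line below is a made-up example of the shape, with hypothetical tag
letters):
    F.2.0.0.pxname.1:MGP:str:private-frontend
where "F" identifies a frontend, "2.0" the proxy and server IDs,
"0.pxname" the field number and name, and the final "1" the process ID.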
This emits the field positions, names and types. It is more convenient
than the default output for a parser that doesn't know all the fields. It
simply relies on stats_emit_typed_data_field() and stats_emit_field_tags()
added by previous patch for the output. A new stats format flag was added,
STAT_FMT_TYPED, which is set when the "typed" keyword is specified on the
CLI.
New function stats_emit_typed_data_field() does exactly like
stats_emit_raw_data_field() except that it also prints the data
type after a colon. This will be used to print using the typed
format.
And function stats_emit_field_tags() appends a 3-letter code
describing the origin, nature, and scope, followed by an optional
delimiter. This will be particularly convenient to dump typed
data.
This function must dump into the buffer it gets in argument, and should
not assume it's always trash. This was the last part of the rework, now
the CSV and HTML functions are compatible and the output format may easily
be extended.
It's easier to have a new flag in <flags> to indicate whether or not we
want to display the admin column in HTML dumps. We already have similar
flags to show the version or the legends.
This new function dumps the current stats line according to the
specified format (CSV or HTML for now), and returns these functions'
output code, which will serve later to indicate a failure (eg: buffer
full).
This further simplifies the code since all dumpers now just call this
function.
This was reported in HTML dumps already but not CSV. It reports the
number of monitor and stats requests. Ideally use-service and redirs
should be accounted for as well.
Frontends have extra information compared to other entities, they can
report some statistics at the connection level while the other ones
are limited to the session level. This patch adds 3 more fields for
this :
- conn_rate
- conn_rate_max
- conn_tot
It's worth noting that listeners theoretically have such statistics, except
that the distinction between connections and sessions is not clearly made
in the code, so that will have to be improved later.
This new function stats_dump_fields_html() checks the type of the object
being dumped from the stats table, and emits it in HTML format. It uses
an argument indicating if the HTML page is also used as an admin page,
and for now still takes the proxy in argument as a few entries still
need it.
The code was simply moved as-is to the new function. There's no
functional change.
HTML output used to have it but not the CSV output. It indicates that the
listener is not full but was forced to wait because the max connection
rate was reached.
The color code requires a complex logic, and we use it only in the
HTML part. So let's compute it there based on the server state, its
health and its weight. The thing is tricky but OK. There's a 1-to-1
mapping of down servers, but not of up servers, hence the need for
the weight and health.
The server's cookie value is now reported in the "cookie" column and
used as-is from the HTML dump. It was the last reference to the sv
pointer from this place.
The same was done for the backend's dump.
This new field "addr" presents the server's address:port if the client
is either enabled via "stats show legends" in case of HTTP dumps, or
has at least level operator on the CLI. The address formats might be :
- ipv4:port
- [ipv6]:port
- unix
- (error message)
The HTML dump over HTTP request may have several flags including
ST_SHLGNDS (to show legends), ST_SHNODE (to show node name),
ST_SHDESC (to show some descriptions).
There's no such thing over the CLI so we need to have an equivalent.
Let's compute the flags earlier so that we can make use of these flags
regardless of the call point.
Now instead of recomputing the state based on the health, rise etc,
we reuse the same state as in the CSV file, and optionally complete
it with a down or an up arrow if a change is occurring. We could
have parsed the strings to detect a '/' indicating a state change,
but it was easier to check the health against rise and fall.
This adds the following fields :
- check_rise [...S]: server's "rise" parameter used by checks
- check_fall [...S]: server's "fall" parameter used by checks
- check_health [...S]: server's health check value between 0 and rise+fall-1
- agent_rise [...S]: agent's "rise" parameter, normally 1
- agent_fall [...S]: agent's "fall" parameter, normally 1
- agent_health [...S]: agent's health parameter, between 0 and rise+fall-1
Added these two new fields to the CSV output :
- check_desc : short human-readable description of check_status
- agent_desc : short human-readable description of agent_status
Also factor two tests for enabled checks.
The agent check status is now reported :
- agent_status : status of last agent check
- agent_code : numeric code reported by agent if any (unused for now)
- agent_duration : time in ms taken to finish last check
There's no point in reporting a backend's up/down time if it has no
servers. The CSV output used to report "0" for a serverless backend
while the HTML version already removed the field. For servers, this
field is already omitted if checks are disabled. Let's uniformize
all of this and remove the field in CSV as well when irrelevant.
The HTML version doesn't report a check status when the server is in
maintenance since it can be quite old and irrelevant. The CSV forgot
to care about that, so let's do it here as well.
We don't want the HTML dump to rely on the server state. We
already have this piece of information in the status field by
checking that it starts with "DOWN".
It currently is really not convenient to have a state and a color detection
outside of the function and to use these ones inside. It makes it harder to
adjust the stats output based on the server state exactly. Let's move the
logic into the dump function itself.
Here we still have a huge amount of stuff to extract from the HTML code
and even from the caller. Indeed, the calling function computes the
server state and prepares a color code that will be used to determine
what style to use. The operations needed to decide what field to present
or not depend a lot on the server's state, admin state, health value,
rise and fall etc... all of which are not easily present in the table.
We also have to check the reference's values for all of the above.
There are also a number of differences between the CSV and HTML outputs :
- CSV always reports check duration, HTML only if not zero
- regarding last_change, CSV always report the server's while the HTML
considers either the server's or the reference based on the admin state.
- agent and health are separate in the CSV but mixed in the HTML.
- too few info on agent anyway.
After careful code inspection it happens that both sv->last_change and
ref->last_change are identical and can both derive from [LASTCHG].
Also, the following info are missing from the array to complete the HTML
code :
- cookie, address, status description, check-in-progress, srv->admin
At least for now it still works but a lot of info now need to be added.
This function now only fills the relevant fields with raw values and
calls stats_dump_fields_csv() for the CSV part. The output remains
exactly the same for now.
Some limits are only emitted if set, so the HTML part will have to
check for these being set.
A number of fields had to be built using printf-format strings, so
instead of allocating strings that risk not being freed, we use
chunk_newstr() and chunk_appendf().
Text strings are now copied verbatim in the stats fields so that only
the CSV dump encodes them as CSV. A single "out" chunk is used and cut
into multiple substrings using chunk_newstr() so that we don't need
distinct chunks for each stats field holding a string. The total amount
of data being emitted at once will never overflow the "out" chunk since
only a small part of it goes into the trash buffer which is the same size
and will thus overflow first.
One point of care is that failed_checks and down_trans were logged as
64-bit quantities on servers but they're 32-bit on proxies. This may
have to be changed later to unify them.
Some fields are still needed to complete the conversion :
- px->srv : used to take decisions when backend has no server (eg: print down or not)
- algo string (useful for CSV as well) // only if SHLGNDS
- cookie_name (useful for CSV as well) // only if SHLGNDS
- px->mode == HTTP (or px->mode as a string) // same for frontend
- px->be_counters.intercepted_req (stats and redirects ?)
The following field already has a place but was not presented in the
CSV output, so it should simply be added afterwards :
- px->be_counters.http.cum_req (was in HTML and missing from CSV)
This function now only fills the relevant fields with raw values and
calls stats_dump_fields_csv() for the CSV part. The output remains
exactly the same for now. It's worth noting that there are some
ambiguities between connections and sessions, for example cum_conn
is dumped into cum_sess. Additionally, there is a naming ambiguity
in that the internal "d_time" (time where the beginning of data
appeared) is called "rtime" in the output (response time) and they
actually are indeed the same.
The conversion still requires some elements which are not present in the
current fields :
- the HTML status may emit "WAITING"/"OPEN"/"FULL" while the CSV format
doesn't propose "WAITING", so this last one will have to be added.
- the HTML output emits the listening addresses when the ST_SHLGNDS flag
is set but this address field doesn't exist in the CSV format
- it's interesting to note that when the ST_SHLGNDS flag is not set, the
HTML output doesn't provide the listener's ID while it's present in the
CSV output accessible from the same interface.
This function now only fills the relevant fields with raw values and
calls stats_dump_fields_csv() for the CSV part. The output remains
exactly the same for now.
It is worth mentioning that l->cum_conn is being dumped into a cum_sess
field and that once we introduce an official cum_conn field we may have
to dump the same value at both places to maintain compatibility with the
existing stats.
Now we avoid directly accessing the proxy and instead we pick the values
from the stats fields. This unveils that only a few fields are missing to
complete the job :
- know whether or not the checkbox column needs to be displayed. This
is not directly relevant to the stats but rather to the fact that the
HTML dump is also a control interface. This doesn't need a field, just
a function argument.
- px->mode == HTTP (or px->mode as a string)
- px->fe_counters.intercepted_req (stats and redirects ?)
- px->fe_counters.cum_conn
- px->fe_counters.cps_max
- px->fe_conn_per_sec
All the last ones make sense in the CSV, so they'll have to be added as well.
This function now only fills the relevant fields with raw values and
calls stats_dump_fields_csv() for the CSV part. The output remains
exactly the same for now.
The new function stats_dump_fields_csv() currently walks over all CSV
fields and prints all non-empty ones as one line. Strings are csv-encoded
on the fly.
This is in preparation for unifying the stats output between the
multiple formats. The long-term goal is that HTML stats are built
from the array used to produce the CSV output, in order to ensure that
no information is missing in any format.
This function dumps non-empty fields, one per line with their name and
values, in the same format as is currently used by "show info". It relies
on previously added stats_emit_raw_data_field().
The table is completely filled with all relevant information. Only the
fields that should appear are presented. The description field is now
properly omitted if not set, instead of being reported as empty.
Technically speaking, many SSL sample fetch functions act on the
connection and depend on USE_L5CLI on the client side, which means
they're usable as soon as a handshake is completed on a connection.
This means that the test consisting in refusing to call them when
the stream is NULL will prevent them from working when we implement
the tcp-request session ruleset. Better fix this now. The fix consists
in using smp->sess->origin when they're called for the front connection,
and smp->strm->si[1].end when called for the back connection.
There is currently no known side effect for this issue, though it would
better be backported into 1.6 so that the code base remains consistent.
The buffer could not be null by definition since we moved the stream out
of the session. It's the stream which gained the ability to be NULL (hence
the recent fix for this case). The checks in place are useless and make one
think this situation can happen.
This is the continuation of previous patch called "BUG/MAJOR: samples:
check smp->strm before using it".
It happens that variables may have a session-wide scope, and that their
session is retrieved by dereferencing the stream. But nothing prevents them
from being used from a streamless context such as tcp-request connection,
thus crashing the process. Example :
tcp-request connection accept if { src,set-var(sess.foo) -m found }
In order to fix this, we have to always ensure that variable manipulation
only happens via the sample, which contains the correct owner and context,
and that we never use one from a different source. This results in quite a
large change since a lot of functions are indirectly involved in the call
chain, but the change is easy to follow.
This fix must be backported to 1.6, and requires the last two patches.
Some functions like sample_conv_var2smp(), var_get_byname(), and
var_set_byname() directly or indirectly need to access the current
stream and/or session and must find it in the sample itself and not
as a distinct argument. Thus we first need to call smp_set_owner()
prior to each such calls.
Since commit 6879ad3 ("MEDIUM: sample: fill the struct sample with the
session, proxy and stream pointers") merged in 1.6-dev2, the sample
contains the pointer to the stream and sample fetch functions as well
as converters use it heavily. This requires from a lot of call places
to initialize 4 fields, and it was even forgotten at a few places.
This patch provides a convenient helper to initialize all these fields
at once, making it easy to prepare a new sample from a previous one for
example.
A few call places were cleaned up to make use of it. It will be needed
by further fixes.
At one place in the Lua code, it was moved earlier because we used to
call sample casts with a not completely initialized sample, which is
not clean even though at the moment there are no consequences.
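The helper essentially has this shape (reconstructed from the description;
it may differ from the actual code):
    /* initializes the 4 owner fields of a sample at once */
    static inline void smp_set_owner(struct sample *smp, struct proxy *px,
                                     struct session *sess, struct stream *strm,
                                     int opt)
    {
        smp->px   = px;
        smp->sess = sess;
        smp->strm = strm;
        smp->opt  = opt;
    }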
Since commit 6879ad3 ("MEDIUM: sample: fill the struct sample with the
session, proxy and stream pointers") merged in 1.6-dev2, the sample
contains the pointer to the stream and sample fetch functions as well
as converters use it heavily.
The problem is that earlier commit 87b0966 ("REORG/MAJOR: session:
rename the "session" entity to "stream"") had split the session and
stream resulting in the possibility for smp->strm to be NULL before
the stream was initialized. This is what happens in tcp-request
connection rulesets, as discovered by Baptiste.
The sample fetch functions must now check that smp->strm is valid
before using it. An alternative could consist in using a dummy stream
with nothing in it to avoid some checks but it would only result in
deferring them to the next step anyway, and making it harder to detect
that a stream is valid or the dummy one.
There is still an issue with variables which requires a complete,
independent fix. They use strm->sess to find the session with strm
possibly NULL and passed as an argument. All call places indirectly
use smp->strm to build strm. So the problem is there but the API needs
to be changed to remove this duplicate argument that makes it much
harder to know what pointer to use.
This fix must be backported to 1.6, as well as the next one fixing
variables.
Commit baf9794 ("BUG/MINOR: tcpcheck: conf parsing error when no port
configured on server and first rule(s) is (are) COMMENT") was wrong, it
incorrectly implemented a list access by dereferencing a pointer of an
incorrect type resulting in checking the next element in the list. The
consequence is that it stops before the last comment instead of at the
last one and skips the first rule. In the end, rules starting with
comments are not affected, but if a sequence of checks directly starts
with connect, it is then skipped and this is visible when no port is
configured on the server line as the config refuses to load.
There was another occurrence of the same bug a few lines below; both
of them were fixed. Tests were made on different configs and confirm
the new fix is OK.
This fix must be backported to 1.6.
This memory leak of about 100 bytes occurs only if there is an error
condition during evaluation of a "map" directive in the configuration
file. This evaluation only happens once on startup because haproxy
does not have a mechanism for re-loading the configuration file during
run-time. The startup will be aborted anyway due to error conditions
raised.
Nevertheless fix it to silence warnings of static code analysis tools
and be safe against future revisions of the code.
Found in haproxy 1.5.14.
Ignore samples that are neither SMP_T_IPV4 nor SMP_T_IPV6 instead of
matching with an uninitialized value in this case.
This situation should not occur in the current codebase but triggers
warnings in static code analysis tools.
Found in haproxy 1.5.
stats_map_lookup() sets bit SMP_F_CONST in the uninitialized member
flags of a stack-allocated sample, leaving the other bits
uninitialized. All code paths that can access the struct only ever
check for this specific flag, so there is no risk of unintended
behavior.
Nevertheless fix it as it triggers warnings in static code analysis
tools and might become a problem on future revisions of the code.
Problem found in version 1.5.
Owen Marshall reported an issue depending on the server keywords order in the
configuration.
Working line :
server dev1 <ip>:<port> check inter 5000 ssl verify none sni req.hdr(Host)
Non working line :
server dev1 <ip>:<port> check inter 5000 ssl sni req.hdr(Host) verify none
Indeed, both parse_server() and srv_parse_sni() modified the current argument
offset at the same time. To fix the issue, srv_parse_sni() can work on a local
copy of the offset, leaving parse_server() responsible for the actual value.
This fix must be backported to 1.6.
If a SIGHUP/SIGUSR2 is sent immediately after a SIGTERM/SIGINT and
before wait() is notified, it will mask it since there's no queue,
only a copy of the last received signal. Let's add a special check
before overwriting the signal so that SIGTERM/SIGINT are not masked.
Some people report that sometimes there's a collection of old processes
after a restart of the systemd wrapper. It's not surprising when reading
the code: the SIGTERM is propagated as a SIGINT, which asks for a graceful
stop instead. If people ask for termination, we should terminate.
Commit c54bdd2 ("MINOR: Also accept SIGHUP/SIGTERM in systemd-wrapper")
added support for extra signals but did not adapt the debug message,
which continues to report incorrect signal types. Let's pass the signal
number to the do_restart() and do_shutdown() functions so that the output
matches the signal that was received.
There's a race condition in the systemd wrapper. People who restart it
too fast report old processes remaining there. Here if the signal is
caught before entering the "while" loop we block indefinitely on the
wait() call. Let's check the caught signal before calling wait(). It
doesn't completely close the race, a window still exists between the
test of the flag and the call to wait(), but it's much smaller. A
safer solution could involve pselect() to wait for the signal
delivery.
The recent addition of "show env" on the CLI has revealed an interesting
design bug. Chunks are supposed to support a negative length to indicate
that they carry no data. chunk_printf() sets this size to -1 if the string
is too large for the buffer. At a few places in the http engine we may end
up with trash.len = -1. But bi_putchk(), chunk_appendf() and a few other
chunks consumers don't consider this case as possible and will use such a
chunk, possibly restoring an invalid string or trying to copy -1 bytes.
This fix takes care of clarifying the situation in a backportable way
where such sizes are used, so that a negative length indicating an error
remains present until the chunk is reinitialized or overwritten. But a
cleaner design adjustment needs to be done so that there's a clear contract
on how to use these chunks. At first glance it doesn't seem *that* useful
to support negative sizes, so probably this is what should change.
This fix must be backported to 1.6 and 1.5.
The function server_parse_addr_change_request() contains a hardcoded
updater source, "stats command". This function can be called from other
sources than the "stats command", so this patch makes this argument
generic.
When the server address is changed, a message with an unrequired '\n'
before the '.' is displayed, like this:
[WARNING] 054/101137 (3229) : zzzz/s3 changed its IP from 127.0.0.1 to ::55 by stats command
.
This patch removes the '\n' which is sent before the '.'.
This patch must be backported to 1.6
With nbproc > 1, it is possible to specify on which process the stats socket
will be bound using "stats bind-process", but the behaviour was not correct,
ignoring the value in some configurations.
Example :
global
nbproc 4
stats bind-process 1
stats socket /var/run/haproxy.sock
With such a configuration, all the processes will listen on the stats socket.
As a workaround, it is also possible to declare a "process" keyword on
the "stats socket" line.
The patch must be applied to 1.7, 1.6 and 1.5
A value is copied twice onto the stack, but only one copy is useful. The
second copy stays unused on the stack and takes up room for nothing.
This patch removes the second copy.
Must be backported to 1.6
This patch must be backported in 1.6
The hlua_yield() function returns the required sleep time. The Lua core must
resume the execution after the required time. The cores dedicated to
the HTTP and TCP applets don't implement the wake-up function; it was
missing.
This patch fixes this.
This patch introduces a configurable connection timeout for mailers
with a new "timeout mail <time>" directive.
Acked-by: Simon Horman <horms@verge.net.au>
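For example (the mailer address is illustrative):
mailers mymailers
timeout mail 20s
mailer smtp1 192.168.0.1:25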
This option prioritizes the choice of an IP address matching a network. This is
useful with clouds to prefer a local IP. In some cases, a cloud high
availability service can be announced with many IP addresses on many
different datacenters. The latency between datacenters is not negligible, so
this patch permits preferring a local datacenter. If no address matches the
configured network, another address is selected.
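For instance (names and addresses are illustrative):
resolvers mydns
nameserver dns1 10.0.0.1:53
backend app
server srv1 app.example.com:80 resolvers mydns resolve-net 10.0.0.0/8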
DNS selection preferences are currently declared inline in the
struct server. They are copied from the server struct to the
dns_resolution struct for each resolution.
The next patches add new preference options, and it is not a good
way to copy all the configuration information before each dns
resolution.
This patch extracts the configuration preferences from the struct
server and declares a new dedicated struct. Only a pointer to this
new struct will be copied before each dns resolution.
Concat object is based on "luaL_Buffer". The luaL_Buffer documentation says:
During its normal operation, a string buffer uses a variable number of stack
slots. So, while using a buffer, you cannot assume that you know where the
top of the stack is. You can use the stack between successive calls to buffer
operations as long as that use is balanced; that is, when you call a buffer
operation, the stack is at the same level it was immediately after the
previous buffer operation. (The only exception to this rule is
luaL_addvalue.) After calling luaL_pushresult the stack is back to its level
when the buffer was initialized, plus the final string on its top.
So, the stack cannot be manipulated between the first call to the function
"luaL_buffinit()" and the last call to the function "luaL_pushresult()"
because we cannot know the stack status.
Moreover, the memory used by these functions seems to be collected by the GC,
so if the GC is triggered while the Concat object is being used, some
released memory could be used.
This patch rewrites the Concat class without the "luaL_Buffer" system. It uses
"userdata()" for the memory allocation of the buffer strings.
Not doing so causes issues with Exchange 2013 not processing the message
body of the email. The specification seems to specify that as correct
behavior : https://www.ietf.org/rfc/rfc2821.txt # 2.3.7 Lines
> SMTP client implementations MUST NOT transmit "bare" "CR" or "LF" characters.
This patch should be backported to 1.6.
Acked-by: Simon Horman <horms@verge.net.au>
This allows the TCP connection to send multiple SYN packets, so 1 lost
packet does not cause the mail to be lost. It changes the socket timeout
from 2 to 10 seconds; this allows 3 SYN packets to be sent while
waiting a little for their reply.
This patch should be backported to 1.6.
Acked-by: Simon Horman <horms@verge.net.au>
If for example it was written as 'timeout retri 1s' or 'timeout wrong 1s',
this would be used for the retry timeout value. The resolvers section's only
timeout setting currently is 'retry'; others are still parsed as before
this patch, so as not to break existing configurations.
A less strict version will be backported to 1.6.
With new init systems such as systemd, environment variables became a
real mess because they're only considered on startup but not on reload
since the init script's variables cannot be passed to the process that
is signaled to reload.
This commit introduces an alternative method consisting in making it
possible to modify the environment from the global section with directives
like "setenv", "unsetenv", "presetenv" and "resetenv".
Since haproxy supports loading multiple config files, it now becomes
possible to put the host-dependent variables in one file and to
distribute the rest of the configuration to all nodes, without having
to deal with the init system's deficiencies.
Environment changes take effect immediately when the directives are
processed, so it's possible to perform the same operations as are
usually performed in regular service config files.
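As a minimal sketch (variable names and values are purely illustrative):

    global
        setenv    RUN_MODE  production   # always set, overriding any inherited value
        presetenv STATS_IP  127.0.0.1    # only set if absent from the environment
        unsetenv  http_proxy https_proxy

    listen stats
        bind "${STATS_IP}:8404"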
Using environment variables in configuration files can make troubleshooting
complicated because there's no easy way to verify that the variables are
correct. This patch introduces a new "show env" command which displays the
whole environment on the CLI, one variable per line.
The socket must at least have the operator level to display the environment.
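A typical way to issue the command, assuming the stats socket is bound at
/var/run/haproxy.sock (path illustrative); it prints one VAR=value per line:

    $ echo "show env" | socat stdio unix-connect:/var/run/haproxy.sock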
After seeing previous ALPN fix, I suspected that NPN code was wrong
as well, and indeed it was since ALPN was copied from it. This fix
must be backported into 1.6 and 1.5.
The first time I tried it (1.6.3) I got a segmentation fault :(
After some investigation with gdb and valgrind I found the
problem. memcpy() copies past an allocated buffer in
"bind_parse_alpn". This patch fixes it.
[wt: this fix must be backported into 1.6 and 1.5]
The +E mode escapes characters '"', '\' and ']' with '\' as prefix. It
mostly makes sense to use it in the RFC5424 structured-data log formats.
Example:
log-format-sd %{+Q,+E}o\ [exampleSDID@1234\ header=%[capture.req.hdr(0)]]
This patch merges the last imported functions into one, because
the function hlua_metatype is only used by hlua_checkudata.
This patch also fixes the compilation error introduced by the
copy of the code.
This patch moves the function hlua_checkudata, which checks that
an object contains the expected class_reference as metatable.
This function is commonly used by all the lua functions.
The function hlua_metatype is also moved.
This parser takes a string containing an HTTP date. It returns
a broken-down time struct. We must consider this
time as GMT. Maybe later the timezone will be taken into account.
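A minimal sketch of the idea (not the actual parser; it only handles the
RFC 1123 form and relies on POSIX strptime()):

    #define _XOPEN_SOURCE 700   /* for strptime() on glibc */
    #include <time.h>
    #include <stddef.h>

    /* Parse e.g. "Fri, 22 Jan 2016 17:43:38 GMT" into a broken-down time.
     * The result must be interpreted as GMT. Returns 1 on success, 0 on error. */
    static int parse_http_date(const char *s, struct tm *tm)
    {
            return strptime(s, "%a, %d %b %Y %H:%M:%S GMT", tm) != NULL;
    }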
When Lua executes functions from its API, these can throw an error.
These functions must be executed in a special environment which catches
these errors, otherwise a critical error (like a segfault) can be raised.
This patch adds a C file called "hlua_fcn.c" which collects all the
Lua/C functions needing a safe environment for their execution.
Now, filter's configuration (.id, .conf and .ops fields) is stored in the
structure 'flt_conf'. So proxies own a flt_conf list instead of a filter
list. When a filter is attached to a stream, it gets a pointer on its
configuration. This avoids mixing the filter's context (owned by a stream) and
its configuration (owned by a proxy). It also saves 2 pointers per filter
instance.
The "trace" filter has been added. It defines all available callbacks and for
each one it prints a trace message. To enable it:
listen test
...
filter trace
...
This is an improvement, especially when the message body is big. Before this
patch, remaining data were forwarded when there was no filter on the stream. Now,
the forwarding is triggered when there is no "data" filter on the channel. When
no filter is used, there is no difference, but when at least one filter is used,
it can be really significant.
Now, http_parse_chunk_size and http_skip_chunk_crlf return the number of bytes
parsed on success. http_skip_chunk_crlf does not use msg->sol anymore.
On the other hand, http_forward_trailers is unchanged. It returns >0 if the end
of trailers is reached and 0 if not. In all cases (except if an error is
encountered), msg->sol contains the length of the last parsed part of the
trailer headers.
Internal doc and comments about msg->sol have been updated accordingly.
Instead of compressing all chunks as they come, we store them in a temporary
buffer. The compression happens during the forwarding phase. This change speeds
up the compression of chunked responses.
Before, functions to filter the HTTP body (and TCP data) were called from the
moment at least one filter was attached to the stream. If no filter is interested
in these data, this uselessly slows down data parsing.
A good example is the HTTP compression filter. Depending on request and response
headers, the response compression can be enabled or not. So it would be really
nice to call it only when enabled.
So, now, to filter HTTP/TCP data, a filter must use the function
register_data_filter. For TCP streams, this function can be called only
once. But for HTTP streams, when needed, it must be called for each HTTP request
or HTTP response.
Only registered filters will be called during data parsing. At any time, a
filter can be unregistered by calling the function unregister_data_filter.
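As a hedged illustration of the API (the exact prototype of
register_data_filter is assumed here rather than taken from the tree, and the
helper deciding whether to register is made up), a filter could opt in from
its header callback:

    /* Hypothetical "http_headers" callback: register for data filtering
     * only when this response is actually of interest to the filter. */
    static int my_flt_http_headers(struct stream *s, struct filter *filter,
                                   struct http_msg *msg)
    {
            if (my_filter_wants_body(msg))                /* helper assumed */
                    register_data_filter(s, msg->chn, filter);
            return 1; /* let the processing continue */
    }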
From the stream point of view, this new structure is opaque. It hides the
filters' implementation details. So, the impact of future optimizations will be
reduced (well, we hope so...).
Some small improvements have been made in filters.c to avoid useless checks.
Now body parsing is done in http_msg_forward_body and
http_msg_forward_chunked_body functions, regardless of whether we parse a
request or a response.
Parsing result is still handled in http_request_forward_body and
http_response_forward_body functions.
This patch will ease future optimizations, mainly on filters.
This new analyzer will be called for each HTTP request/response, before the
parsing of the body. It is identified by AN_FLT_HTTP_HDRS.
Special care was taken about the following condition :
* the frontend is a TCP proxy
* filters are defined in the frontend section
* the selected backend is an HTTP proxy
So, this patch explicitly adds the AN_FLT_HTTP_HDRS analyzer on the request and
the response channels when the backend is an HTTP proxy and when there are
filters attached to the stream.
This patch simplifies http_request_forward_body and http_response_forward_body
functions.
For chunked HTTP requests/responses, the body filtering can be really
expensive. In the worst case (many 1-byte chunks), the filter overhead is
3 calls per chunk. While the http_data callback is useful, the others are just
informative.
So these callbacks have been removed. Of course, existing filters (trace and
compression) have been updated accordingly. For the HTTP compression filter, the
update is quite large. Its implementation is now closer to the old one.
When no filter is attached to the stream, the CPU footprint due to the calls to
filters_* functions is huge, especially for chunk-encoded messages. Using macros
to check if we have some filters or not is a great improvement.
Furthermore, instead of checking the filter list emptiness, we introduce a flag
to know if filters are attached or not to a stream.
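A sketch of the kind of guard this describes (all names here are hypothetical,
not the actual macros):

    /* A stream flag avoids even walking the filter list. */
    #define STRM_FLT_ATTACHED   0x0001
    #define HAS_FILTERS(strm)   ((strm)->flt_flags & STRM_FLT_ATTACHED)

    /* Call into the filter framework only when filters are attached. */
    #define FLT_HTTP_DATA(strm, msg, ret)                         \
            do {                                                  \
                    if (HAS_FILTERS(strm))                        \
                            (ret) = flt_http_data((strm), (msg)); \
            } while (0)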
HTTP compression has been rewritten to use the filter API. This is more a PoC
than anything else for now. It allocates memory to work. So, if only for that,
it should be rewritten.
In the meantime, the implementation has been refactored to allow its use with
other filters. However, there are limitations that should be respected:
- No filter placed after the compression one is allowed to change input data
(in 'http_data' callback).
- No filter placed before the compression one is allowed to change forwarded
data (in 'http_forward_data' callback).
For now, these limitations are informal, so you should be careful when you use
several filters.
About the configuration, 'compression' keywords are still supported and must be
used to configure the HTTP compression behavior. In the absence of a 'filter'
line for the compression filter, it is added to the filter chain when the first
'compression' line is parsed. This is an easy way to proceed when you do not use
other filters. But if another filter exists, an error is reported so that the
user must explicitly declare the filter.
For example:
listen tst
...
compression algo gzip
compression offload
...
filter flt_1
filter compression
filter flt_2
...
HTTP compression will be moved into a true filter. To prepare the ground, some
functions have been moved into a dedicated file. The idea is to keep everything
about compression algos in compression.c and everything related to the filtering
in flt_http_comp.c.
For now, a header has been added to help during the transition. It will be
removed later.
Unused empty ACL keyword list was removed. The "compression" keyword
parser was moved from cfgparse.c to flt_http_comp.c.
When all callbacks have been called for all filters registered on a stream, if
we are waiting for the next HTTP request, we must reset stream analyzers. But it
is useless to do so if the client has already closed the connection.
This patch adds the support of filters in HAProxy. The main idea is to have a
way to "easily" extend HAProxy by adding some "modules", called filters, that
will be able to change HAProxy's behavior in a programmatic way.
To do so, many entry points have been added in the code to let filters hook into
different steps of the processing. A filter must define a flt_ops structure
(see include/types/filters.h for details). This structure contains all available
callbacks that a filter can define:
struct flt_ops {
/*
* Callbacks to manage the filter lifecycle
*/
int (*init) (struct proxy *p);
void (*deinit)(struct proxy *p);
int (*check) (struct proxy *p);
/*
* Stream callbacks
*/
void (*stream_start) (struct stream *s);
void (*stream_accept) (struct stream *s);
void (*session_establish)(struct stream *s);
void (*stream_stop) (struct stream *s);
/*
* HTTP callbacks
*/
int (*http_start) (struct stream *s, struct http_msg *msg);
int (*http_start_body) (struct stream *s, struct http_msg *msg);
int (*http_start_chunk) (struct stream *s, struct http_msg *msg);
int (*http_data) (struct stream *s, struct http_msg *msg);
int (*http_last_chunk) (struct stream *s, struct http_msg *msg);
int (*http_end_chunk) (struct stream *s, struct http_msg *msg);
int (*http_chunk_trailers)(struct stream *s, struct http_msg *msg);
int (*http_end_body) (struct stream *s, struct http_msg *msg);
void (*http_end) (struct stream *s, struct http_msg *msg);
void (*http_reset) (struct stream *s, struct http_msg *msg);
int (*http_pre_process) (struct stream *s, struct http_msg *msg);
int (*http_post_process) (struct stream *s, struct http_msg *msg);
void (*http_reply) (struct stream *s, short status,
const struct chunk *msg);
};
To declare and use a filter, in the configuration, the "filter" keyword must be
used in a listener/frontend section:
frontend test
...
filter <FILTER-NAME> [OPTIONS...]
The filter referenced by the <FILTER-NAME> must declare a configuration parser
on its own name to fill the flt_ops and filter_conf fields in the proxy's
structure. An example will be provided later to make it perfectly clear.
For now, filters cannot be used in a backend section. But this is only a matter
of time. Documentation will also be added later. This is the first commit of a
long list about filters.
It is possible to have several filters on the same listener/frontend. These
filters are stored in an array of at most MAX_FILTERS elements (defined in
include/types/filters.h). Again, this will be replaced later by a list of
filters.
The filter API has been highly refactored. Main changes are:
* Now, HA supports an infinite number of filters per proxy. To do so, filters
are stored in a list.
* Because filters are stored in a list, the filter state has been moved from the
channel structure to the filter structure. This is cleaner because there is no
more info about filters in the channel structure.
* It is possible to define filters on backends only. For such filters,
stream_start/stream_stop callbacks are not called. Of course, it is possible
to mix frontend and backend filters.
* Now, TCP streams are also filtered. All callbacks without the 'http_' prefix
are called for all kinds of streams. In addition, 2 new callbacks were added to
filter data exchanged through a TCP stream:
- tcp_data: it is called when new data are available or when old unprocessed
data are still waiting.
- tcp_forward_data: it is called when some data can be consumed.
* New callbacks attached to channel were added:
- channel_start_analyze: it is called when a filter is ready to process data
exchanged through a channel. 2 new analyzers (a frontend and a backend)
are attached to channels to call this callback. For a frontend filter, it
is called before any other analyzer. For a backend filter, it is called
when a backend is attached to a stream. So some processing cannot be
filtered in that case.
- channel_analyze: it is called before each analyzer attached to a channel,
except analyzers responsible for data sending.
- channel_end_analyze: it is called when all other analyzers have finished
their processing. A new analyzer is attached to channels to call this
callback. For a TCP stream, this is always the last one called. For an HTTP
one, the callback is called when a request/response ends, so it is called
once for each request/response.
* 'session_established' callback has been removed. Everything that is done in
this callback can be handled by 'channel_start_analyze' on the response
channel.
* 'http_pre_process' and 'http_post_process' callbacks have been replaced by
'channel_analyze'.
* 'http_start' callback has been replaced by 'http_headers'. This new one is
called just before headers sending and parsing of the body.
* 'http_end' callback has been replaced by 'channel_end_analyze'.
* It is possible to set a forwarder for TCP channels. It was already possible to
do it for HTTP ones.
* Forwarders can partially consume forwardable data. For this reason a new
HTTP message state was added before HTTP_MSG_DONE : HTTP_MSG_ENDING.
Now all filters can define corresponding callbacks (http_forward_data
and tcp_forward_data). Each filter owns 2 offsets relative to buf->p, next and
forward, to track, respectively, input data already parsed but not forwarded yet
by the filter and parsed data considered as forwarded by the filter. At any time,
we have the guarantee that a filter cannot parse or forward more input than
previous ones. And, of course, it cannot forward more input than it has
parsed. 2 macros have been added to retrieve these offsets: FLT_NXT and FLT_FWD.
In addition, 2 functions have been added to change the 'next size' and the
'forward size' of a filter. When a filter parses input data, it can alter these
data, so the size of these data can vary. This action has an effect on all
previous filters that must be handled. To do so, the function
'filter_change_next_size' must be called, passing the size variation. In the
same spirit, if a filter alters forwarded data, it must call the function
'filter_change_forward_size'. 'filter_change_next_size' can be called in
'http_data' and 'tcp_data' callbacks and only these ones. And
'filter_change_forward_size' can be called in 'http_forward_data' and
'tcp_forward_data' callbacks and only these ones. The data changes are the
filter's responsibility, but with some limitations. It must not change already
parsed/forwarded data or data that previous filters have not parsed/forwarded
yet.
Because filters can be used on backends, when the backend is set for a
stream, we add the filters defined for this backend to the filter list of the
stream. But we must only do that when the backend and the frontend of the stream
are not the same. Otherwise the same filters are added a second time, leading to
undefined behavior.
The HTTP compression code had to be moved.
This simplifies the http_response_forward_body function. To do so, the way the
data are forwarded has changed. Now, a filter (and only one) can forward data.
In a commit to come, this limitation will be removed to let all filters take
part in data forwarding. There are 2 new functions that filters should use to
deal with this feature:
* flt_set_http_data_forwarder: This function sets the filter (using its id)
that will forward data for the specified HTTP message. This is possible only if
it was not already set by another filter _AND_ if no data were forwarded yet
(msg->msg_state <= HTTP_MSG_BODY). It returns -1 if an error occurs.
* flt_http_data_forwarder: This function returns the filter id that will
forward data for the specified HTTP message. If there is no forwarder set, it
returns -1.
When an HTTP data forwarder is set for the response, the HTTP compression is
disabled. Of course, this is not definitive.
The csv stats format breaks when an agent changes the server state to drain.
Tools like hatop, metric or check agents will fail due to this. This
should be backported to 1.6.
The serial number for a generated certificate was computed using the requested
servername, without any variable/random part. It is not a problem as long as
the certificate is not regenerated.
But if the cache is disabled or when the certificate is evicted from the cache,
we may need to regenerate it. It is important not to reuse the same serial
number for the new certificate. Otherwise clients (especially browsers) trigger
a warning because 2 certificates issued by the same CA have the same serial
number.
So now, the serial is a static variable initialized with now_ms (internal date
in milliseconds) and incremented at each new certificate generation.
(Ref MPS-2031)
When working on the previous bug, it appeared that the case that was
triggering the bug could also happen between two backends, one of which
doesn't support http-reuse. The reason is that while the idle connection
is moved to the private pool, upon reuse we only check if it holds the
CO_FL_PRIVATE flag. And we don't set this flag when there's no reuse.
So let's always set it in this case, it will guarantee that no undesired
connection sharing may happen.
This fix must be backported to 1.6.
There is a bug in connect_server() : we use si_attach_conn() to offer
the current session's connection to the session we're stealing the
connection from. Unfortunately, si_attach_conn() uses the standard data
connection operations while here we need to use the idle connection
operations.
This results in a situation where when the server's idle timeout strikes,
the read0 is silently ignored, causes the response channel to be shut down
for reads, and the connection remains attached. Next attempt to send a
request when using this connection simply results in nothing being done
because we try to send over an already closed connection. Worse, if the
client aborts, then no timeout remains at all and the session waits
forever and remains assigned to the server.
A more-or-less easy way to reproduce this bug is to have two concurrent
streams each connecting to a different server with "http-reuse aggressive",
typically a cache farm using a URL hash :
stream1: GET /1 HTTP/1.1
stream2: GET /2 HTTP/1.1
stream1: GET /2 HTTP/1.1
wait for the server 1's connection to timeout
stream2: GET /1 HTTP/1.1
The connection hangs here, and "show sess all" shows a closed connection
with a SHUTR on the response channel.
The fix is very simple though not optimal. It consists in calling
si_idle_conn() again after attaching the connection. But in practice
it should not be done like this. The real issue is that there's no way
to cleanly attach a connection to a stream interface without changing
the connection's operations. So the API clearly needs to be revisited
to make such operations easier.
Many thanks to Yves Lafon from W3C for providing lots of useful dumps
and testing patches to help figure out the root cause!
This fix must be backported to 1.6.
After a POST on the stats admin page, a 303 is emitted. Unfortunately
this 303 doesn't contain a content-length, which forces the connection
to be closed and reopened. Let's simply add a content-length: 0 to solve
this.
Commit 16f649c ("REORG: polling: rename "fd_spec" to "fd_cache"")
missed the server-facing connection during the rename, so the old
names are still in use and add a bit of confusion during
debugging.
This should be backported to 1.6 and 1.5.
When a connect() to a unix socket returns EAGAIN we talk about
"no free ports" in the error/debug message, which only makes
sense when using TCP.
Explain the connect() failure and suggest troubleshooting the server's
backlog size.
These two actions don't touch the table so the entry will expire and
the values will not be pushed to other peers. Also in the case of gpc0,
the gpc0_rate counter must be updated. The issue was reported by
Ruoshan Huang.
This fix needs to be backported to 1.6.
Arkadiy Kulev noticed that if a server is marked down while a connection
is trying to establish, we still insist on performing retries on
the same server, which is absurd. Better perform the redispatch if we
already know the server is down. Because of this, it's likely that the
observe-l4 and sudden-death mechanisms are not optimal and cannot help
much the connection which was used to detect the problem.
The fix should be backported to 1.6 and 1.5 at least.
When users request 16384 bytes for a buffer, they get 16392 after
rounding up. This is problematic for SSL as it systematically
causes a small 8-bytes message to be appended after the first 16kB
message and costs about 15% of performance.
Let's add MEM_F_EXACT to use exactly the size we need. This requires
previous patch (MEDIUM: pools: add a new flag to avoid rounding pool
size up).
This issue was introduced in 1.6 and causes trouble there, so this
fix must be backported.
This issue was reported by Gary Barrueto and diagnosed by Cyril Bonté.
Usually it's desirable to merge similarly sized pools, which is the
reason why their size is rounded up to the next multiple of 16. But
for the buffers this is problematic because we add the size of
struct buffer to the user-requested size, and the rounding results
in 8 extra bytes that are usable in the end. So the user gets more
bytes than asked for, and in case of SSL it results in short writes
for the extra bytes that are sent above multiples of 16 kB.
So we add a new flag MEM_F_EXACT to request that the size is not
rounded up when creating the entry. Thus it doesn't disable merging.
Gregor Kovač reported that http_date() did not return the right day of the
week. For example "Sat, 22 Jan 2016 17:43:38 GMT" instead of "Fri, 22 Jan
2016 17:43:38 GMT". Indeed, gmtime() returns a 'struct tm' result, where
tm_wday begins on Sunday, whereas the code assumed it began on Monday.
This patch must be backported to haproxy 1.5 and 1.6.
The server state functions save and apply the server IP when DNS resolution is
enabled on a server.
The purpose is to prevent switching traffic from one server to another
when multiple IPs are returned by the DNS server for the A or AAAA
record.
That said, a bug in the current code led to erasing the service port while
copying the IP found in the file into the server structure in HAProxy's
memory.
This patch fixes this bug.
The bug was reported on the ML by Robert Samuel Newson and fix proposed
by Nenad Merdanovic.
Thank you both!!!
backport: can be backported to 1.6
Erez reported a bug on discourse.haproxy.org about DNS resolution not
occurring when no port is specified on the nameserver directive.
This patch prevents this behavior by returning an error explaining this
issue when parsing the configuration file.
That said, later, we may want to force port 53 when client did not
provide any.
backport: 1.6
This function should return a 16-bit type as that is the type of the
DNS header id.
Also because it is doing a big-endian uint16 unpack operation.
Backport: can be backported to 1.6
Signed-off-by: Thiago Farina <tfarina@chromium.org>
Signed-off-by: Baptiste Assmann <bedis9@gmail.com>
Changes to if statements do not affect code operation, just layout of
the code. Type casts from malloc returns have been removed as this cast
happens automatically from the void* type.
This may be backported to 1.6.
Parameters provided to 51Degrees methods that have changed to require
const pointers are now cast to avoid compiler warnings.
This should be backported to 1.6.
Malloc continues to be used for the creation of cache entries. The
implementation has been enhanced to be ready for production deployment. A new
method to free cache entries created in 51d.c has been added to ensure
memory is released correctly.
This should be backported to 1.6.
Args pointer is now used as the LRU cache domain to ensure the cache
distinguishes between multiple fetch and conv configurations.
This should be backported to 1.6.
Introduction of a new function in the LRU cache source file.
The purpose of this function is to delete a number of entries in
the cache. 'number' is defined by the caller and the keys removed are
taken from the tail of the tree.
csv_enc_append() returns a pointer to the beginning of the encoded
string, which makes it convenient to use in printf(). However it's not
convenient for use in chunks as it may leave an unused byte at the
beginning depending on the automatic quoting. Let's modify it to work
in two passes. First it looks for a character that requires escaping
using strpbrk(), and second it encodes the string. This way it
guarantees to always start at the first available byte of the chunk.
Additionally it made the code quite simpler.
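A minimal sketch of the two-pass idea (not HAProxy's actual implementation; it
assumes the output buffer is large enough, i.e. 2*strlen(in)+3 bytes in the
worst case):

    #include <string.h>

    /* Encode 'in' as a CSV field into 'out'. Pass 1 checks whether any
     * character requires quoting; pass 2 quotes and doubles the quotes. */
    static void csv_enc_sketch(const char *in, char *out)
    {
            if (!strpbrk(in, ",\"\n\r")) {  /* pass 1: nothing to escape */
                    strcpy(out, in);
                    return;
            }
            *out++ = '"';                   /* pass 2: quote the field */
            for (; *in; in++) {
                    if (*in == '"')
                            *out++ = '"';   /* double embedded quotes */
                    *out++ = *in;
            }
            *out++ = '"';
            *out = '\0';
    }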
We have csv_enc() but there's no way to append some CSV-encoded data
to an existing chunk, so here we modify the existing function for this
and create an inlined version of csv_enc() which first resets the output
chunk. It will be handy to append data to an existing chunk without
having to use an extra temporary chunk, or to encode multiple strings
into a single chunk with chunk_newstr().
The patch is quite small, in fact most changes are typo fixes in the
comments.
The function http_reply_and_close has been added in proto_http.c to wrap calls
to stream_int_retnclose. This function will be modified when the filters are
added.
When the response body is forwarded, if the server closes the input before the
end, an error is thrown. But if the data processing is too slow, all data could
already have been received and be pending in the input buffer. So it is a bug
to stop processing in this context. The server didn't really close the input
before the end.
As an example, this could happen when HAProxy is configured to do compression
offloading. If the server closes the connection explicitly after the response
(keep-alive disabled by the server) and if HAProxy receives the data faster than
they are compressed, then the response could be truncated.
This patch fixes the bug by checking if some pending data remain in the input
buffer before returning an error. If yes, the processing continues.
Several cases of "<=" instead of "<" were found in the url_param parser,
mostly affecting the case where the parameter is wrapping. They shouldn't
affect header operations, just body parsing in a wrapped pipelined request.
The code is a bit complicated, with certain operations done multiple times
in multiple functions, so it's not certain that no others remain. This code
must be re-audited.
It should only be backported to 1.6 once carefully tested, because it is
possible that other bugs relied on these ones.
The applet can't have access to the session's private data. This patch
fixes this problem. Now an applet can use private data stored by actions
and fetches.
INNER and XFERBODY analyzers were set in order to support HTTP applets
from TCP rulesets, but this does not work (cf previous patch).
Other cases already provide these analyzers, so their addition is
not needed. Furthermore if INNER was set it could cause some headers
to be rewritten (ex: connection) after headers were already forwarded,
resulting in a crash in buffer_insert_line2().
Special thanks to Bernd Helm for providing very detailed information,
captures and stack traces making it possible to spot the root cause
here.
This fix must be backported to 1.6.
HTTP applet requests require everything initialized by
"http_process_request" (analyzer flag AN_REQ_HTTP_INNER).
The applet is initialized immediately, but this happens before
the call of this analyzer.
Due to this problem HTTP applets could be called with an incompletely
initialized http_txn.
This fix must be backported to 1.6.
In certain circumstances (eg: Lua HTTP applet called from a
TCP ruleset before http_process_request()), the HTTP TXN is not
yet fully initialized so some information it contains cannot be
relied on. Such information includes the HTTP version, the state
of the expect: 100-continue header, the connection header and
the transfer-encoding header.
Here the bug only turns something which already doesn't work
into something wrong, but better avoid any references to the
http_txn from the Lua code to avoid future mistakes.
This patch should be backported into 1.6 for code consistency.
If a sample fetch needing http_txn is called from an HTTP Lua applet,
the result will be invalid and may even cause a crash because some HTTP
data can be forwarded and the HTTP txn is no longer valid.
Here the solution is to ensure that a fetch called from Lua never
needs http_txn. This is done thanks to a new flag HLUA_F_MAY_USE_HTTP
which indicates whether or not it is safe to call a fetch which needs
HTTP.
This fix needs to be backported to 1.6.
This patch converts a boolean "int" to a bitfield. The main
reason is to save space in the struct in case another flag is
required later.
Note that this patch is required for next fix and will need to be
backported to 1.6.
When a POST is processed by a Lua service, the HTTP headers are
potentially gone. So, we cannot retrieve their content using
the standard "hdr" sample fetches (which will soon become invalid
anyway) from an applet.
This patch adds an entry "headers" to the object applet_http. This
entry is an array containing all the headers. It makes it possible to use
the HTTP headers during the processing of the service.
Many thanks to Jan Bruder for reporting this issue with enough
details to reproduce it.
This patch will have to be backported to 1.6 since it will be the
only way to access headers from Lua applets.
New sticktable entries learned from a remote peer can be pushed to others after
a random delay because they are not inserted at the right position in the updates
tree.
When memmax is forced using "-m", the per-process memory limit is enforced
using setrlimit(), but this value is not used to compute the automatic
maxconn limit. In addition, the per-process memory limit didn't consider
the fact that the shared SSL cache only needs to be accounted once.
The doc was also fixed to clearly state that "-m" is global and not per
process. It makes sense because people who use -m want to protect the
system's resources regardless of whatever appears in the configuration.
This setting used to be assigned to a variable tunable from a constant
and for an unknown reason never made its way into the config parser.
tune.recv_enough <number>
Haproxy uses some hints to detect that a short read indicates the end of the
socket buffers. One of them is that a read returns more than <recv_enough>
bytes, which defaults to 10136 (7 segments of 1448 each). This default value
may be changed by this setting to better deal with workloads involving lots
of short messages such as telnet or SSH sessions.
Added support for loading multiple certs into shared contexts when they
are specified in a crt-list.
Note that it's not practical to support SNI filters with multicerts, so
any SNI filter that's provided to the crt-list is ignored if a
multi-cert operation is used.
Added the ability for users to specify multiple certificates that all relate
to a single server. Users do this by specifying certificate "cert_name.pem"
but having "cert_name.pem.rsa", "cert_name.pem.dsa" and/or
"cert_name.pem.ecdsa" in the directory.
HAProxy will now intelligently search for those 3 files and try to combine
them into as few SSL_CTXs as possible based on CN/SAN. This will allow
HAProxy to support multiple ciphersuite key algorithms off a single
SSL_CTX.
This change integrates into the existing architecture of SNI lookup and
multiple SNI's can point to the same SSL_CTX, which can support multiple
key_types.
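For instance (paths illustrative), a single bind line can serve both key types:

    frontend fe
        bind :443 ssl crt /etc/haproxy/certs/site.pem

    # with these files present on disk:
    #   /etc/haproxy/certs/site.pem.rsa
    #   /etc/haproxy/certs/site.pem.ecdsa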
Added cert_key_and_chain struct to ssl. This struct will store the
contents of a crt path (from the config file) into memory. This will
allow us to use the data stored in memory instead of reading the file
multiple times.
This will be used to support a later commit to load multiple pkeys/certs
into a single SSL_CTX
In order to properly enable sched_setaffinity(), on some versions of Linux it
is _GNU_SOURCE rather than __USE_GNU that must be defined (spotted on Alpine
Linux for instance). This is also done for the sake of consistency, as
__USE_GNU does not seem to be used anywhere else in the code, and last, on
Linux it seems to be the preferred way to enable non-portable code.
On glibc-based Linux versions, _GNU_SOURCE seems to define __USE_GNU, so
it should be safe enough.
Krishna Kumar reported that the following configuration doesn't permit
HTTP reuse between two clients :
frontend private-frontend
mode http
bind :8001
default_backend private-backend
backend private-backend
mode http
http-reuse always
server bck 127.0.0.1:8888
The reason for this is that in http_end_txn_clean_session() we check the
stream's backend's http-reuse option before deciding whether the
backend connection should be moved back to the server's pool or not. But
since we're doing this after the call to http_reset_txn(), the backend is
reset to match the frontend, which doesn't have the option. However it
will work fine in a setup involving a "listen" section.
We just need to keep a pointer to the current backend before calling
http_reset_txn(). The code does that and replaces the few remaining
references to s->be inside the same function so that if any part of
code were to be moved later, this trap doesn't happen again.
This fix must be backported to 1.6.
A small configuration parsing error exists when no port is set up on the
server IP:port statement and the server's parameter 'port' is not set
and if the first tcp-check rule is a comment, like in the example below:
backend b
option tcp-check
tcp-check comment blah
tcp-check connect 8444
server s 127.0.0.1 check
In such a case, an ALERT is improperly returned, even though this
configuration is valid and works.
The new code moves the pointer to the first tcp-check rule which isn't a
comment before checking the presence of the port.
backport status: 1.6 and above
Current configuration parsing is permissive in such a situation:
a server in a backend with no port configured on the IP address
statement, no 'port' parameter configured and the last rule of a tcp-check
being a CONNECT with no port.
The current code parses all the rules to validate that a port is
available, but it misses the last one, which means such a
configuration is considered valid:
backend b
option tcp-check
tcp-check connect port 8444
tcp-check connect
server s 127.0.0.1 check
the second connect attempt is sent to port '0'...
The current patch fixes this by parsing the list the right way, including
the last rule.
backport status: 1.6 and above
A segfault can occur during the initialization phase, when an unknown
"mailers" name is configured. This happens when "email-alert myhostname" is not
set, where a direct pointer to an array is used instead of copying the string,
causing the segfault when haproxy tries to free the memory.
This is a minor issue because the configuration is invalid and a fatal error
will remain, but it should be fixed to prevent reload issues.
Example of minimal configuration to reproduce the bug :
backend example
email-alert mailers NOT_FOUND
email-alert from foo@localhost
email-alert to bar@localhost
This fix must be backported to 1.6.
Tommy Atkinson and Sylvain Faivre reported that email alerts didn't work when
they were declared in the defaults section. This is due to the use of an
internal attribute which is set once an email-alert is at least partially
configured. But this attribute was not propagated to the current proxy during
the configuration parsing.
Note that the issue doesn't occur if "email-alert myhostname" is configured in
the defaults section.
This fix must be backported to 1.6.
Currently urlp fetching samples were able to find parameters with an empty
value, but the return code depended on the value length. The final result was
that acls using urlp couldn't match empty values.
Example of an acl which always returned "false":
acl MATCH_EMPTY urlp(foo) -m len 0
The fix consists in unconditionally returning 1 when the parameter is found.
This fix must be backported to 1.6 and 1.5.
Right now it's possible to change the global compression rate limiting
without the CLI being at the admin level.
This fix must be backported to 1.6 and 1.5.
It's pointless to reserve this amount of memory when zlib is not used.
Adding the condition will make build scripts easier to manage. This may
be backported to 1.6.
client-fin and server-fin are bogus. They are applied on the write
side after a SHUTR was seen. The immediate effect is that sometimes
if a SHUTR was seen after a SHUTW on the same side, the timeout is
enabled again regardless of the fact that the output is already
closed. This results in the timeout event not being processed and
a busy poll loop happening until another timeout on the stream gets
rid of it. Note that haproxy continues its job during this, it's just
that it eats all the CPU trying to handle an event that it ignores.
A reproducible case consists in having a client stop reading data from
a server to ensure data remain in the response buffer, then the client
sends a shutdown(write). If abortonclose is enabled on haproxy, the
shutdown is passed to the server side and the server responds with a
SHUTR that cannot immediately be forwarded to the client since the
buffer is full. During this time the event is ignored and the task is
woken again in loops.
It is worth noting that the timeout handling since 1.5 is a bit fragile
and that it might be possible that other similar conditions still exist,
so the timeout handling should be audited regarding this issue.
Many thanks to BaiYang for providing detailed information showing the
problem in action.
This bug also affects 1.5 thus the fix must be backported.
There is a bug where "option http-keep-alive" doesn't force a response
to stay in keep-alive if the server sends the FIN along with the response
on the second or subsequent response. The reason is that the auto-close
was forced enabled when recycling the HTTP transaction and it's never
disabled along the response processing chain before the SHUTR gets a
chance to be forwarded to the client side. The MSG_DONE state of the
HTTP response properly disables it but too late.
There's no more reason for enabling auto-close here, because either it
doesn't matter in non-keep-alive modes because the connection is closed,
or it is automatically enabled by process_stream() when it sees there's
no analyser on the stream.
This bug also affects 1.5 so a backport is desired.
Sander Klein reported an error messages about SSLv3 not
being supported on Debian 8, although he didn't force-sslv3.
Vincent Bernat tracked this down to the LUA initialization, which
actually does force-sslv3.
This patch removes force-sslv3 from the LUA initialization, so
the LUA SSL socket can actually use TLS and doesn't trigger
warnings when SSLv3 is not supported by libssl (such as in
Debian 8).
This should be backported to 1.6.
When a server timeout is detected on the second or nth request of a keep-alive
connection, HAProxy closes the connection without writing a response.
Some clients would fail with a remote disconnected exception and some
others would retry potentially unsafe requests.
This patch removes the special case and makes sure a 504 timeout is
written back whenever a server timeout is handled.
Signed-off-by: lsenta <laurent.senta@gmail.com>
When the txn.done() function is called, the output buffer is cleaned,
but the associated relative pointers on the HTTP request elements
are not reset.
This patch removes this cleanup, because the output buffer may contain
data to forward.
The initial record layer version in a SSL handshake may be set to TLSv1.0
or similar for compatibility reasons, this is allowed as per RFC5246
Appendix E.1 [1]. Some implementations are Openssl [2] and NSS [3].
A related issue has been fixed some time ago in commit 57d229747
("BUG/MINOR: acl: req_ssl_sni fails with SSLv3 record version").
Fix this by using the real client hello version instead of the record
layer version.
This was reported by Julien Vehent and analyzed by Cyril Bonté.
The initial patch is from Julien Vehent as well.
This should be backported to stable series, the req_ssl_ver keyword was
first introduced in 1.3.16.
[1] https://tools.ietf.org/html/rfc5246#appendix-E.1
[2] 4a1cf50187
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=774547
fgets() can return NULL on error or when EOF occurs. This patch adds a
check of fgets() return value and displays a warning if the first line of
the server state file can not be read. Additionally, we make sure to close
the previously opened file descriptor.
It is possible to create an http capture rule which points to a capture slot
id which does not exist.
The current patch prevents this when parsing the configuration and prevents
running a configuration which contains such rules.
This configuration is now invalid:
frontend f
bind :8080
http-request capture req.hdr(User-Agent) id 0
default_backend b
this one as well:
frontend f
bind :8080
declare capture request len 32 # implicit id is 0 here
http-request capture req.hdr(User-Agent) id 1
default_backend b
It applies of course to both http-request and http-response rules.
Causes HAProxy to emit a static string to the agent on every check,
so that you can independently control multiple services running
behind a single agent port.
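As an illustrative sketch (server name, addresses, port and the string sent
are all made up):

    backend app
        server web1 192.168.0.11:80 check agent-check agent-port 5555 agent-send "web1\n"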
The direction (request or response) is not propagated in the
sample fetches called through Lua. This patch adds the direction
status in some structs (hlua_txn and hlua_smp) to make sure that
the sample fetches will be called with all the information.
The converters cannot access a TXN object, so they are not
impacted by the direction. However, the samples used as input of the
Lua converter wrapper are initialized with the direction. Thereby,
the struct smp stays consistent.
[wt: needs to be backported to 1.6]
This patch cleans up the direction names. It replaces numeric values
with the associated defines. It ensures consistency with values found
elsewhere in HAProxy.
It is required by the bugfix patch which follows.
[wt: needs to be backported to 1.6]
The current resolvers section parsing function is permissive on nameserver
ids and two nameservers may have the same id.
It's a shame, since we don't know, for example, whose statistics belong
to which nameserver...
From now on, configurations with a duplicated nameserver id in a resolvers
section are considered broken and return a fatal error when parsing.
Cyril Bonté reported a reproducible sequence which can lead to a crash
when using backend connection reuse. The problem comes from the fact that
we systematically add the server connection to an idle pool at the end of
the HTTP transaction regardless of the fact that it might already be there.
This is possible for example when processing a request which doesn't use
a server connection (typically a redirect) after a request which used a
connection. Then after the first request, the connection was already in
the idle queue and we're putting it a second time at the end of the second
request, causing a corruption of the idle pool.
Interestingly, the memory debugger in 1.7 immediately detected a suspicious
double free on the connection, leading to a very early detection of the
cause instead of its consequences.
Thanks to Cyril for quickly providing a working reproducer.
This fix must be backported to 1.6 since connection reuse was introduced
there.
A bug lay in the parsing of DNS CNAME responses, leading HAProxy to
think the CNAME was improperly resolved in the response.
This should be backported into 1.6 branch
The status DNS_UPD_NAME_ERROR returned by dns_get_ip_from_response, and
which means the queried name can't be found in the response, was
improperly processed (it fell into the default case).
This led to a loop where HAProxy simply resent a new query as soon as
it got a response with this status, in the only case where such a type
of response is the very first one received by the process.
This should be backported into 1.6 branch
It was accidently discovered that limiting haproxy to 5000 MB leads to
an effective limit of 904 MB. This is because the computation for the
size limit is performed by multiplying rlimit_memmax by 1048576, and
doing so causes the operation to be performed on an int instead of a
long or long long. Just switch to 1048576ULL as is done at other places
to fix this.
This bug affects all supported versions, the backport is desired, though
it rarely affects users since few people apply memory limits.
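A minimal illustration of the arithmetic (signed overflow is formally
undefined; the wrapped value shown is what common platforms produce, and it
matches the 904 MB observed):

    #include <stdio.h>

    int main(void)
    {
            int mb = 5000;
            long long bad  = mb * 1048576;    /* int * int: wraps to 947912704, i.e. 904 MB */
            long long good = mb * 1048576ULL; /* 64-bit multiply: 5242880000, i.e. 5000 MB */
            printf("%lld %lld\n", bad, good);
            return 0;
    }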
When DEBUG_MEMORY_POOLS is used, we now use the link pointer at the end
of the pool to store a pointer to the pool, and to control it during
pool_free2() in order to serve four purposes :
- at any instant we can know what pool an object was allocated from
when examining memory, hence how we should possibly decode it ;
- it serves to detect double free when they happen, as the pointer
cannot be valid after the element is linked into the pool ;
- it serves to detect if an element is released in the wrong pool ;
- it serves as a canary, to detect if some buffers experienced an
overflow before being released.
All these elements will definitely help better troubleshoot strange
situations, or at least confirm that certain conditions did not happen.
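A minimal sketch of the trailing-tag idea (structure and names are
hypothetical, not HAProxy's actual pool code):

    #include <stdlib.h>

    struct pool { size_t item_size; void *free_list; };

    /* The last pointer-sized slot of each item is the free-list link while
     * the item is free, and holds a pointer to its pool while allocated
     * (the allocation path is expected to set it). */
    static void **pool_tag(const struct pool *p, void *item)
    {
            return (void **)((char *)item + p->item_size - sizeof(void *));
    }

    void pool_free_checked(struct pool *p, void *item)
    {
            if (*pool_tag(p, item) != (void *)p)
                    abort(); /* double free, foreign pool, or an overflow smashed the tag */
            *pool_tag(p, item) = p->free_list; /* the tag now becomes the link */
            p->free_list = item;
    }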
When debugging a core file, it's sometimes convenient to be able to
visit the released entries in the pools (typically last released
session). Unfortunately the first bytes of these entries are destroyed
by the link elements of the pool. And of course, most structures have
their most accessed elements at the beginning of the structure (typically
flags). Let's add a build-time option DEBUG_MEMORY_POOLS which allocates
an extra pointer in each pool to put the link at the end of each pool
item instead of the beginning.
This commit adds support for setting a per-server maxconn from the stats
socket. The only really notable part of this commit is that we need to
check if maxconn == minconn before changing things, as this indicates
that we are NOT using dynamic maxconn. When we are not using dynamic
maxconn, we should update maxconn/minconn in lockstep.
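For illustration (the exact CLI wording is assumed from the management
interface conventions; backend/server names and socket path are made up):

    $ echo "set maxconn server bk/srv1 50" | socat stdio unix-connect:/var/run/haproxy.sock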
The function 'EVP_PKEY_get_default_digest_nid()' was introduced in OpenSSL
1.0.0. So for older version of OpenSSL, compiled with the SNI support, the
HAProxy compilation fails with the following error:
src/ssl_sock.c: In function 'ssl_sock_do_create_cert':
src/ssl_sock.c:1096:7: warning: implicit declaration of function 'EVP_PKEY_get_default_digest_nid'
if (EVP_PKEY_get_default_digest_nid(capkey, &nid) <= 0)
[...]
src/ssl_sock.c:1096: undefined reference to `EVP_PKEY_get_default_digest_nid'
collect2: error: ld returned 1 exit status
Makefile:760: recipe for target 'haproxy' failed
make: *** [haproxy] Error 1
So we must add a #ifdef to check the OpenSSL version (>= 1.0.0) to use this
function. It is used to get the default signature digest associated to the
private key used to sign generated X509 certificates. It is called when the
private key differs from EVP_PKEY_RSA, EVP_PKEY_DSA and EVP_PKEY_EC. It should
be enough for most cases.
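A sketch of the guard (the version test is the conventional OpenSSL idiom; the
pre-1.0.0 fallback digest and the goto target are assumptions, not necessarily
what the patch does):

    #if OPENSSL_VERSION_NUMBER >= 0x1000000fL
            if (EVP_PKEY_get_default_digest_nid(capkey, &nid) <= 0)
                    goto mkcert_error;
    #else
            nid = NID_sha256; /* assumed fixed fallback for older libssl */
    #endif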
Basically, it's ill-defined and shouldn't really be used going forward.
We can't guarantee that resolvers will do the 'legwork' for us and
actually resolve CNAMES when we request the ANY query-type. Case in point
(obfuscated, clearly):
PRODUCTION! ahayworth@secret-hostname.com:~$
dig @10.11.12.53 ANY api.somestartup.io
; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> @10.11.12.53 ANY api.somestartup.io
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62454
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0
;; QUESTION SECTION:
;api.somestartup.io. IN ANY
;; ANSWER SECTION:
api.somestartup.io. 20 IN CNAME api-somestartup-production.ap-southeast-2.elb.amazonaws.com.
;; AUTHORITY SECTION:
somestartup.io. 166687 IN NS ns-1254.awsdns-28.org.
somestartup.io. 166687 IN NS ns-1884.awsdns-43.co.uk.
somestartup.io. 166687 IN NS ns-440.awsdns-55.com.
somestartup.io. 166687 IN NS ns-577.awsdns-08.net.
;; Query time: 1 msec
;; SERVER: 10.11.12.53#53(10.11.12.53)
;; WHEN: Mon Oct 19 22:02:29 2015
;; MSG SIZE rcvd: 242
HAProxy can't handle that response correctly.
Rather than try to build in support for resolving CNAMEs presented
without an A record in an answer section (which may be a valid
improvement further on), this change just skips ANY record types
altogether. A and AAAA are much more well-defined and predictable.
Notably, this commit preserves the implicit "prefer IPv6" behavior.
Furthermore, ANY query type by default is a bad idea: (from Robin on
HAProxy's ML):
Using ANY queries for this kind of stuff is considered by most people
to be a bad practice since besides all the things you named it can
lead to incomplete responses. Basically a resolver is allowed to just
return whatever it has in cache when it receives an ANY query instead
of actually doing an ANY query at the authoritative nameserver. Thus
if it only received queries for an A record before you do an ANY query
you will not get an AAAA record even if it is actually available since
the resolver doesn't have it in its cache. Even worse if before it
only got MX queries, you won't get either A or AAAA
Kim Seri reported that haproxy 1.6.0 crashes after a few requests
when a bind line has SSL enabled with more than one certificate. This
was caused by an insufficient condition to free generated certs during
ssl_sock_close() which can also catch other certs.
Christopher Faulet analysed the situation like this :
-------
First the LRU tree is only initialized when the SSL certs generation is
configured on a bind line. So, in the most of cases, it is NULL (it is
not the same thing than empty).
When the SSL certs generation is used, if the cache is not NULL, such a
certificate is pushed into the cache and there is no need to release it
when the connection is closed.
But it can be disabled in the configuration. So in that case, we must
free the generated certificate when the connection is closed.
Then here, we have really a bug. Here is the buggy part:
3125) if (conn->xprt_ctx) {
3126) #ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME
3127)         if (!ssl_ctx_lru_tree && objt_listener(conn->target)) {
3128)                 SSL_CTX *ctx = SSL_get_SSL_CTX(conn->xprt_ctx);
3129)                 if (ctx != default_ctx)
3130)                         SSL_CTX_free(ctx);
3131)         }
3132) #endif
3133)         SSL_free(conn->xprt_ctx);
3134)         conn->xprt_ctx = NULL;
3135)         sslconns--;
3136) }
The check on the line 3127 is not enough to determine if this is a
generated certificate or not. Because ssl_ctx_lru_tree is NULL,
generated certificates, if any, must be freed. But here ctx should also
be compared to all SNI certificates and not only to default_ctx. Because
of this bug, when a SNI certificate is used for a connection, it is
erroneously freed when this connection is closed.
-------
Christopher provided this reliable reproducer :
----------
global
tune.ssl.default-dh-param 2048
daemon
listen ssl_server
mode tcp
bind 127.0.0.1:4443 ssl crt srv1.test.com.pem crt srv2.test.com.pem
timeout connect 5000
timeout client 30000
timeout server 30000
server srv A.B.C.D:80
You just need to generate 2 SSL certificates with 2 CN (here
srv1.test.com and srv2.test.com).
Then, by doing SSL requests with the first CN, there is no problem. But
with the second CN, it should segfault on the 2nd request.
openssl s_client -connect 127.0.0.1:4443 -servername srv1.test.com // OK
openssl s_client -connect 127.0.0.1:4443 -servername srv1.test.com // OK
But,
openssl s_client -connect 127.0.0.1:4443 -servername srv2.test.com // OK
openssl s_client -connect 127.0.0.1:4443 -servername srv2.test.com // KO
-----------
A long discussion led to the following proposal which this patch implements :
- the cert is generated. It gets a refcount = 1.
- we assign it to the SSL. Its refcount becomes two.
- we try to insert it into the tree. The tree will handle its freeing
using SSL_CTX_free() during eviction.
- if we can't insert into the tree because the tree is disabled, then
we have to call SSL_CTX_free() ourselves, and then we'd rather do it
immediately. It will more closely mimic the case where the cert
is added to the tree and immediately evicted by concurrent activity
on the cache.
- we never have to call SSL_CTX_free() during ssl_sock_close() because
the SSL session only relies on openssl doing the right thing based on
the refcount only.
- thus we never need to know how the cert was created since the
SSL_CTX_free() is either guaranteed or already done for generated
certs, and this protects other ones against any accidental call to
SSL_CTX_free() without having to track where the cert comes from.
This patch also reduces the inter-dependence between the LRU tree and
the SSL stack, so it should cause less sweating to migrate to threads
later.
This bug is specific to 1.6.0, as it was introduced after dev7 by
this fix :
d2cab92 ("BUG/MINOR: ssl: fix management of the cache where forged certificates are stored")
Thus a backport to 1.6 is required, but not to 1.5.
Susheel Jalali reported a confusing bug in namespaces implementation.
If namespaces are enabled at build time (USE_NS=1) and *no* namespace
is used at all in the whole config file, my_socketat() returns -1 and
all socket bindings fail. This is because of a wrong condition in this
function. A possible workaround consists in creating some namespaces.
The function which parses a DNS response buffer did not properly move a
pointer when reading a packet whose records do not use DNS "message
compression" techniques.
Thanks to Øyvind Johnsen for the help provided during the troubleshooting
session.
This is equivalent to commit 2af207a ("MEDIUM: tcp: implement tcp-ut
bind option to set TCP_USER_TIMEOUT") except that this time it works
on the server side. The purpose is to detect dead server connections
even when checks are rare, disabled, or after a soft reload (since
checks are disabled there as well), and to ensure client connections
will get killed faster.
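An illustrative snippet (the server-side keyword is assumed to mirror the
bind-side "tcp-ut"; names and values are made up):

    backend app
        server srv1 192.168.0.10:443 tcp-ut 30s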
Baptiste reported a segfault when the "id" keyword was passed on the
"stats socket" line. The problem is related to the fact that the stats
parser stats_parse_global() passes curpx instead of global.stats_fe to
the keyword parser. Indeed, curpx being a pointer to the proxy in the
current section, it is not correct here since the global section does
not describe a proxy. It's just by pure luck that only bind_parse_id()
uses the proxy since any other keyword parser could use it as well.
The bug has no impact since the id specified here is not usable at all
and can be discarded from a faulty configuration.
This fix must be backported to 1.5.
Lua needs to know the direction of the HTTP data being processed (request or
response). It checked the flag SMP_OPT_DIR_REQ, but this flag is 0. This patch
correctly checks the flags after applying the SMP_OPT_DIR mask.
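A short sketch of why the mask is needed (the define values match the usual
convention where the request direction is 0):

    #define SMP_OPT_DIR_REQ  0x00  /* request direction: cannot be tested by AND */
    #define SMP_OPT_DIR_RES  0x01  /* response direction */
    #define SMP_OPT_DIR      (SMP_OPT_DIR_REQ | SMP_OPT_DIR_RES)

    /* Wrong: (opt & SMP_OPT_DIR_REQ) is always 0, so it never matches.
     * Right: mask first, then compare. */
    static inline int smp_is_req(unsigned int opt)
    {
            return (opt & SMP_OPT_DIR) == SMP_OPT_DIR_REQ;
    }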
A previous commit broke the interactive stats cli prompt. Specifically,
it was not clear that we could be in STAT_CLI_PROMPT when we get to
the output functions for the cli handler, and the switch statement did
not handle this case. We would then fall through to the default
statement, which was recently changed to set error flags on the socket.
This in turn causes the socket to be closed, which is not what we wanted
in this specific case.
To fix, we add a case for STAT_CLI_PROMPT, and simply break out of the
switch statement.
Testing:
- Connected to unix stats socket, issued 'prompt', observed that I
could issue multiple consecutive commands.
- Connected to unix stats socket, issued 'prompt', observed that socket
timed out after inactivity expired.
- Connected to unix stats socket, issued 'prompt' then 'set timeout cli
5', observed that socket timed out after 5 seconds expired.
- Connected to unix stats socket, issued invalid commands, received
usage output.
- Connected to unix stats socket, issued 'show info', received info
output and socket disconnected.
- Connected to unix stats socket, issued 'show stat', received stats
output and socket disconnected.
- Repeated above tests with TCP stats socket.
[wt: no backport needed, this was introduced during the applet rework in 1.6]
Now, a callback is defined for generated certificates to set DH parameters for
ephemeral key exchange when required.
In the same way, when possible, we also define Elliptic Curve DH (ECDH)
parameters.
This is done by adding the EVP_PKEY_EC type to the supported types for the CA
private key when we get the message digest used to sign a generated X509
certificate.
So now, we support DSA, RSA and EC private keys.
And to be sure, when the type of the private key is not directly supported, we
get its default message digest using the function
'EVP_PKEY_get_default_digest_nid'.
We also use the key of the default certificate instead of generating a new
one. So we are sure to use the same key type instead of always using an RSA
key.
The file specified by the SSL option 'ca-sign-file' can now contain the CA
certificate used to dynamically generate certificates and its private key in
any order.
Commit d2cab92 ("BUG/MINOR: ssl: fix management of the cache where forged
certificates are stored") removed some needed #ifdefs resulting in ssl not
building on older openssl versions where SSL_CTRL_SET_TLSEXT_HOSTNAME is
not defined :
src/ssl_sock.c: In function 'ssl_sock_load_ca':
src/ssl_sock.c:2504: error: 'ssl_ctx_lru_tree' undeclared (first use in this function)
src/ssl_sock.c:2504: error: (Each undeclared identifier is reported only once
src/ssl_sock.c:2504: error: for each function it appears in.)
src/ssl_sock.c:2505: error: 'ssl_ctx_lru_seed' undeclared (first use in this function)
src/ssl_sock.c: In function 'ssl_sock_close':
src/ssl_sock.c:3095: error: 'ssl_ctx_lru_tree' undeclared (first use in this function)
src/ssl_sock.c: In function '__ssl_sock_deinit':
src/ssl_sock.c:5367: error: 'ssl_ctx_lru_tree' undeclared (first use in this function)
make: *** [src/ssl_sock.o] Error 1
Reintroduce the ifdefs around the faulty areas.
First, the LRU cache must be initialized after the configuration parsing to
correctly set its size.
Next, the function 'ssl_sock_set_generated_cert' returns -1 when an error
occurs (0 on success). In that case, the caller is responsible for freeing the
memory allocated for the certificate.
Finally, when an SSL certificate is generated by HAProxy but cannot be
inserted in the cache, it must be freed when the SSL connection is closed.
This happens when 'tune.ssl.ssl-ctx-cache-size' is set to 0.
The 'OPTIONS' method was not in the list of supported HTTP methods, so
find_http_meth returned HTTP_METH_OTHER instead of HTTP_METH_OPTIONS.
[wt: this fix needs to be backported at least to 1.5, 1.4 and 1.3]
When debugging an issue, sometimes it can be useful to be able to use
byte 0 to poison memory areas, resulting in the same effect as a calloc().
This patch changes the default mem_poison_byte to -1 to disable it so that
all positive values are usable.
HAProxy could already support being passed a file list on the command
line, by passing multiple times "-f" followed by a file name. People
have been complaining that it made it hard to pass file lists from init
scripts.
This patch introduces an end of arguments using the common "--" tag,
after which only file names may appear. These files are then added to
the existing list of other files specified using -f and are loaded in
their declaration order. Thus it becomes possible to do something like
this :
haproxy -sf $(pidof haproxy) -- /etc/haproxy/global.cfg /etc/haproxy/customers/*.cfg
Given that all command line arguments start with a '-' and that
no pid number can start with this character, there's no constraint
to make the pid list the last argument. Let's relax this rule.
Thierry reported that keep-alive still didn't cope well with Lua
services. The reason is that for now applets have to be closed at
the end of a transaction so we want to work in server-close mode,
which isn't noticeable by the client since it still sees keep-alive.
Additionally we want to enable the request body transfer analyser
which will be needed to synchronize with the response analyser to
indicate the end of the transfer.
The release handler used to be called twice for some time and just by
pure luck we never ended up double-freeing the data there. Add a NULL
to ensure this can never happen should a future change permit this
situation again.
This commit adds support for dumping all resolver stats. Specifically
if a command 'show stats resolvers' is issued without a resolver section
id, we dump all known resolver sections. If none are configured, a
message is displayed indicating that.
When using the external-check command option HAProxy was failing to
start with a fatal error "'external-check' cannot handle unexpected
argument". When looking at the code it was looking for an incorrect
argument. Also corrects an Alert message text, as spotted by
PiBa-NL.
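For reference, a typical external check setup would look roughly like this
(the script path is a placeholder):
    backend app
        option external-check
        external-check command /usr/local/bin/healthcheck.sh
        server srv1 192.0.2.10:80 check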
When first parameter to getaddrinfo() is not NULL (it is always not NULL
in str2ip()), on Linux AI_PASSIVE value for ai_flags is ignored. On
FreeBSD, when AI_PASSIVE is specified and hostname parameter is not NULL,
getaddrinfo() ignores local address selection policy, always returning
AAAA record. Pass zero ai_flags to behave correctly on FreeBSD; this
change should be a no-op on Linux.
This fix should be backported to 1.5 as well, after some observation
period.
Michael Ezzell reported a bug causing haproxy to segfault during startup
when trying to send syslog message from Lua. The function __send_log() can
be called with *p that is NULL and/or when the configuration is not fully
parsed, as is the case with Lua.
This patch fixes this problem by using individual vectors instead of the
pre-generated strings log_htp and log_htp_rfc5424.
Also, this patch fixes a problem causing haproxy to write the wrong pid in
the logs -- the log_htp(_rfc5424) strings were generated at the haproxy
start, but "pid" value would be changed after haproxy is started in
daemon/systemd mode.
Only the main execution function can set the run flag, because it is
the last function called before the execution itself.
This patch removes the flag being set by another function. It will be used
by the new Lua timeout counter.
Commit e11cfcd ("MINOR: config: new backend directives:
load-server-state-from-file and server-state-file-name") introduced a bug
which can cause haproxy to crash upon startup by sending user-controlled
data in a format string when emitting a warning. Fix the way the warning
message is built to avoid this.
No backport is needed, this was introduced in 1.6-dev6 only.
Commit e11cfcd ("MINOR: config: new backend directives:
load-server-state-from-file and server-state-file-name") caused these
warnings when building with Clang :
src/server.c:1972:21: warning: comparison of unsigned expression < 0 is always false [-Wtautological-compare]
(srv_uweight < 0) || (srv_uweight > SRV_UWGHT_MAX))
~~~~~~~~~~~ ^ ~
src/server.c:1980:21: warning: comparison of unsigned expression < 0 is always false [-Wtautological-compare]
(srv_iweight < 0) || (srv_iweight > SRV_UWGHT_MAX))
~~~~~~~~~~~ ^ ~
Indeed, srv_iweight and srv_uweight are unsigned. Just drop the offending test.
The conn_sock_drain() call is only there to force the system to ACK
pending data in case of TCP_QUICKACK so that the client doesn't retransmit,
otherwise it leads to a real RST making the feature useless. There's no
point in draining the connection when quick ack cannot be disabled, so
let's move the call inside the ifdef part.
The silent-drop action is supposed to close with a TCP reset that is
either not sent or not too far. But since it's on the client-facing
side, the socket's lingering is enabled by default and the RST only
occurs if some pending unread data remain in the queue when closing.
This causes some clean shutdowns to occur with retransmits, which is
not good at all. Force linger_risk on the socket to flush all data
and destroy the socket.
No backport is needed, this was introduced in 1.6-dev6.
req.ssl_st_ext : integer
Returns 0 if the client didn't send a SessionTicket TLS Extension (RFC5077)
Returns 1 if the client sent SessionTicket TLS Extension
Returns 2 if the client also sent non-zero length TLS SessionTicket
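For instance, a TCP frontend could route on this fetch like below (backend
names are hypothetical):
    frontend fe_tls
        bind :443
        tcp-request inspect-delay 5s
        tcp-request content accept if { req.ssl_hello_type 1 }
        use_backend bk_resume if { req.ssl_st_ext gt 0 }
        default_backend bk_full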
This stops the evaluation of the rules and makes the client-facing
connection suddenly disappear using a system-dependent way that tries
to prevent the client from being notified. The effect is that the
client still sees an established connection while there's none on
HAProxy. The purpose is to achieve a comparable effect to "tarpit"
except that it doesn't use any local resource at all on the machine
running HAProxy. It can resist much higher loads than "tarpit", and
slow down stronger attackers. It is important to understand the impact
of using this mechanism. All stateful equipment placed between the
client and HAProxy (firewalls, proxies, load balancers) will also keep
the established connection for a long time and may suffer from this
action. On modern Linux systems running with enough privileges, the
TCP_REPAIR socket option is used to block the emission of a TCP
reset. On other systems, the socket's TTL is reduced to 1 so that the
TCP reset doesn't pass the first router, though it's still delivered to
local networks.
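For instance, a minimal sketch of its use (the blacklist file path is only
an example):
    frontend fe_main
        bind :80
        tcp-request connection silent-drop if { src -f /etc/haproxy/abusers.lst }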
tcp-request connection had an inverted condition on action_ptr, resulting
in no registered actions being usable since commit 4214873 ("MEDIUM: actions:
remove ACTION_STOP") merged in 1.6-dev5. Very few new actions were impacted.
No backport is needed.
When peers are stopped due to not being running on the appropriate
process, we want to completely release them and unregister their signals
and task in order to ensure there's no way they may be called in the
future.
Note: ideally we should have a list of all tables attached to a peers
section being disabled in order to unregister them and void their
sync_task. It doesn't appear to be *that* easy for now.
When performing a soft stop, we used to wake up every proxy's task and
each of their table's task. The problem is that since we're able to stop
proxies and peers not bound to a specific process, we may end up calling
random junk by doing so if the proxy we're waking up is already stopped.
This causes a segfault to appear during soft reloads for old processes
not bound to a peers section if such a section exists in other processes.
Let's only consider proxies that are not stopped when doing this.
This fix must be backported to 1.5 which also has the same issue.
Since commit f83d3fe ("MEDIUM: init: stop any peers section not bound
to the correct process"), it is possible to stop unused peers on certain
processes. The problem is that the pause/resume/stop functions are not
aware of this and will pass a NULL proxy pointer to the respective
functions, resulting in segfaults in unbound processes during soft
restarts.
Properly check that the peers' frontend is still valid before calling
them.
This bug also affects 1.5 so the fix must be backported. Note that this
fix is not enough to completely get rid of the segfault, the next one
is needed as well.
Introduces a new keyword for the client's cookie name via a new config
entry.
Splits the work between the original converter and the new fetch type.
The latter iterates through the whole list of HTTP headers, but first
makes sure the request data are ready to be processed and properly sets
the input to the string type.
The API is then able to filter by itself which headers are genuinely
useful, with special care for the client's cookie.
[WT: the immediate impact for the end user is that configs making use of
"da-csv" won't load anymore and will need to be modified to use either
"da-csv-conv" or "da-csv-fetch"]
This patch adds a new RFC5424-specific log-format for the structured-data
that is automatically sent by __send_log() when the sender is in RFC5424
mode.
A new statement "log-format-sd" should be used in order to set log-format
for the structured-data part in RFC5424 formatted syslog messages.
Example:
log-format-sd [exampleSDID@1234\ bytes=\"%B\"\ status=\"%ST\"]
The function __send_log() iterates over senders and passes the header as
the first vector to sendmsg(), thus it can send a logger-specific header
in each message.
A new logger arguments "format rfc5424" should be used in order to enable
RFC5424 header format. For example:
log 10.2.3.4:1234 len 2048 format rfc5424 local2 info
At the moment we have to call snprintf() for every log line just to
rebuild a constant. Thanks to sendmsg(), we send the message in 3 parts:
time-based header, proxy-specific hostname+log-tag+pid, session-specific
message.
static_table_key, get_http_auth_buff and swap_buffer static variables
are now freed during deinit, and the two functions introduced by the
previous patches are called as well. In addition, the 'trash' string
buffer is cleared.
A new function is introduced, meant to be called during the general deinit
phase. During configuration parsing, the section entries are all allocated.
This new function frees them.
The tune.maxrewrite parameter used to be pre-initialized to half of
the buffer size since the very early days when buffers were very small.
It has grown to absurdly large values over the years to reach 8kB for a
16kB buffer. This prevents large requests from being accepted, which is
the opposite of the initial goal.
Many users fix it to 1024 which is already quite large for header
addition.
So let's change the default setting policy :
- pre-initialize it to 1024
- let the user tweak it
- in any case, limit it to tune.bufsize / 2
This results in 15kB usable to buffer HTTP messages instead of 8kB, and
doesn't affect existing configurations which already force it.
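For instance, an explicit setting matching the new defaults with 16kB
buffers would be:
    global
        tune.bufsize 16384
        tune.maxrewrite 1024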
This new target can be called from the frontend or the backend. It
is evaluated just before the backend choice and just before the server
choice. So, the input stream or HTTP request can be forwarded to a
server or to an internal service.
This flag is used by custom actions to know that they're called for the
first time. The only case where it's not set is when they're resuming
from a yield. It will be needed to let them know when they have to
allocate some resources.
The garbage collector is a little bit heavy to run, and it was added
only for cosockets. This patch prevents useless executions when no
cosockets are used.
When the channel is down, the applet is woken up. Before this patch,
the applet closed the stream and the unread data were discarded.
After this patch, the stream is not closed except if the buffers are
empty.
It will be closed later by the close function, or by the garbage collector.
The HAProxy Lua socket respects the Lua Socket TCP specs. These specs
are a little bit limited and do not permit connecting to a UNIX socket.
This patch extends the specs a little: it accepts a syntax that begins
with "unix@", "ipv4@", "ipv6@", "fd@" or "abns@".
Actions may yield but must not do it during the final call from a ruleset
because it indicates there will be no more opportunity to complete or
clean up. This is indicated by ACT_FLAG_FINAL in the action's flags,
which must be passed to hlua_resume().
Thanks to this, an action called from a TCP ruleset is properly woken
up and possibly finished when the client disconnects.
In HTTP it's more difficult to know when to pass the flag or not
because all actions are supposed to be final and there's no inspection
delay. Also, the input channel may very well be closed without this
being an error. So we only set the flag when option abortonclose is
set and the input channel is closed, which is the only case where the
user explicitly wants to forward a close down the chain.
This new flag indicates to a custom action that it must not yield because
it will not be called anymore. This addresses an issue introduced by commit
bc4c1ac ("MEDIUM: http/tcp: permit to resume http and tcp custom actions"),
which made it possible to yield even after the last call and causes Lua
actions not to be stopped when the session closes. Note that the Lua issue
is not fixed yet at this point. Also only TCP rules were handled, for now
HTTP rules continue to let the action yield since we don't know whether or
not it is a final call.
Since commit bc4c1ac ("MEDIUM: http/tcp: permit to resume http and tcp
custom actions"), some actions may yield and be called back when new
information is available. Unfortunately some of them may continue to
yield because they simply don't know that it's the last call from the
rule set. For this reason we'll need to pass a flag to the custom
action to convey this information, and possibly others at the same time.
When a thread stops, this patch forces a garbage collection which
cleans up and frees the memory.
The HAProxy Lua Socket class contains an HAProxy session, so it
uses a large amount of memory; in addition, this Socket class uses a
file descriptor and keeps a connection open.
If the Socket class is stored in a global variable, the Socket stays
alive for the whole life of the process (except if it is closed by the
other side, or if a timeout is reached). If the Socket class is stored
in a local variable, it must die with this variable.
The socket is closed by a call from the garbage collector, and the
reachability and use of a variable are known by the same garbage collector.
So, running the GC just after the end of all Lua code ensures that
heavy resources like the Socket class are freed quickly.
Not having the target set on the connection causes it to be released
at the last moment, and the destination address to randomly be valid
depending on the data found in the memory at this moment. In practice
it works as long as memory poisoning is disabled. The deep reason is
that connect_server() doesn't expect to be called with SF_ADDR_SET and
an existing connection with !reuse. This causes the release of the
connection, its reallocation (!reuse), and taking the address from the
newly allocated connection. This should certainly be improved.
Commit d75cb0f ("BUG/MAJOR: lua: segfault after the channel data is
modified by some Lua action.") introduced a regression causing an
action run from a TCP rule in an HTTP proxy to end in HTTP error
if it terminated cleanly, because it didn't parse the HTTP request!
Relax the test so that it takes into account the opportunity for the
analysers to parse the message.
The Lua code must use the appropriate wakeup mechanism for cosockets.
It wakes them up from outside the stream and outside the applet, so it
shouldn't use the applet's wakeup callback which is designed only for
use from within the applet itself. For now it didn't cause any trouble
(yet).
When an action or a fetch modifies the channel data, the HTTP request parser
pointers become inconsistent. This patch detects the modification and calls
the parser again.
The function lua_settable uses the metatable for accessing data,
so in the C part, we want to index values in the real table and
not through the meta-methods. It's much faster this way.
When we call the function smp_prefetch_http(), if the txn is not initialized,
it doesn't work. This patch fixes this. Now, smp_prefetch_http() permits using
HTTP with any proxy mode.
While the SI_ST_DIS state is set *after* doing the close on a connection,
it was set *before* calling release on an applet. Applets have no internal
flags contrary to connections, so they have no way to detect they were
already released. Because of this it happened that applets were closed
twice, once via si_applet_release() and once via si_release_endpoint() at
the end of a transaction. The CLI applet could perform a double free in
this case, though the situation to cause it is quite hard because it
requires that the applet is stuck on output in states that produce very
few data.
In order to solve this, we now assign the SI_ST_DIS state *after* calling
->release, and we refrain from doing so if the state is already assigned.
This makes applets work much more like connections and definitely avoids
this double release.
In the future it might be worth making applets have their own flags like
connections to carry their own state regardless of the stream interface's
state, especially when dealing with connection reuse.
No backport is needed since this issue was caused by the rearchitecture
in 1.6.
It's dangerous to call this on an internal state error, because it risks
performing a double-free. This can only happen when a state is not
handled. Note that the switch/case currently doesn't offer any option
for missed states since they're all declared. Better fix this anyway.
The fix was tested by commenting out some entries in the switch/case.
Due to the code inherited from the early CLI mode with the non-interactive
code, the SHUTR status was only considered while waiting for a request,
which prevents the connection from properly being closed during a dump,
and the connection used to remain established. This issue didn't happen
in 1.5 because while this part was missed, the resynchronization performed
in process_session() would detect the situation and handle the cleanup.
No backport is needed.
If an error happens during a dump on the CLI, an explicit call to
cli_release_handler() is performed. This is not needed anymore since
we introduced ->release() in the applet which is called upon error.
Let's remove this confusing call which can even be risky in some
situations.
If an applet tries to write to a closed connection, it hangs forever.
This results in some "get map" commands on the CLI to leave orphaned
connections alive.
Now the applet wakeup function detects that the applet still wants to
write while the channel is closed for reads, which is the equivalent
to the common "broken pipe" situation. In this case, an error is
reported on the stream interface, just as it happens with connections
trying to perform a send() in a similar situation.
With this fix the stats socket is properly released.
This function is a callback made only for calls from the applet handler.
Rename it to remove confusion. It's currently called from the Lua code
but that's not correct, we should call the notify and update functions
instead otherwise it will not enable the applet again.
This one is not needed anymore as what it used to do is either
completely covered by the new stream_int_notify() function, or undesired
and inherited from the past as a side effect of introducing the
connections.
This update is theoretically never called since it's assigned only when
nothing is connected to the stream interface. However a test has been
added to si_update() to stay safe if some foreign code decides to call
si_update() in unsafe situations.
The code to report completion after a connection update or an applet update
was almost the same since applets stole it from the connection. But the
differences made them hard to maintain and prevented the creation of new
functions doing only one part of the work.
This patch replaces the common code from the si_conn_wake_cb() and
si_applet_wake_cb() with a single call to stream_int_notify() which only
notifies the stream (si+channels+task) from the outside.
No functional change was made beyond this.
stream_int_notify() was taken from the common part between si_conn_wake_cb()
and si_applet_done(). It is designed to report activity to a stream from
outside its handler. It'll generally be used by lower layers to report I/O
completion but may also be used by remote streams if the buffer processing
is shared.
The condition to release the SI_FL_WAIT_ROOM flag was abnormally
complicated because it was inherited from 6 years ago before we used
to check for the buffer's emptiness. The CF_READ_PARTIAL flag had to be
removed, and the complex test was replaced with a simpler one checking
if *some* data were moved out or not.
The reason behind this change is to have a condition compatible with
both connections and applets, as applets currently don't work very
well in this area. Specifically, some optimizations on the applet
side cause them not to release the flag above until the buffer is
empty, which may prevent applets from working together (eg: peers
over large haproxy buffers and small kernel buffers).
Now the call to stream_int_update() is moved to si_update(), which
is exclusively called from the stream, so that the socket layer may
be updated without updating the stream layer. This will later permit
calling it individually from other places (other tasks or applets for
example).
Now that we have a generic stream_int_update() function, we can
replace the equivalent part in stream_int_update_conn() and
stream_int_update_applet() to avoid code duplication.
There is no functional change, as the code is the same but split
in two functions for each call.
This function is designed to be called from within the stream handler to
update the channels' expiration timers and the stream interface's flags
based on the channels' flags. It needs to be called only once after the
channels' flags have settled down, and before they are cleared, though it
doesn't harm to call it as often as desired (it just slightly hurts
performance). It must not be called from outside of the stream handler,
as what it does will be used to compute the stream task's expiration.
The code was taken directly from stream_int_update_applet() and
stream_int_update_conn() which had exactly the same one except for
applet-specific or connection-specific status update.
The purpose is to separate the connection-specific parts so that the
stream-int specific one can be factored out. There's no functional
change here, only code displacement.
If an applet wakes up and causes the next one to sleep, the active list
is corrupted and cannot be scanned anymore, as the process then loops
over the next element.
In order to avoid this problem, we move the active applet list to a run
queue and reinit the active list. Only the first element of this queue
is checked, and if the element is not removed, it is removed and requeued
into the active list.
Since we're using a distinct list, if an applet wants to requeue another
applet into the active list, it properly gets added to the active list
and not to the run queue.
This stops the infinite loop issue that could be caused with Lua applets,
and in any future configuration where two applets could be attached
together.
This is not a real run queue and we're facing ugly bugs because
of this: if an applet removes another applet from the queue,
typically the next one after itself, the list iterator loops
forever because the list's backup pointer is not valid anymore.
Before creating a run queue, let's rename this list.
The pattern match "found" fails to parse on binary type samples. The
reason is that it presents itself as an integer type which bin cannot
be cast into. We must always accept this match since it validates
anything on input regardless of the type. Let's just relax the parser
to accept it.
This problem might also exist in 1.5.
(cherry picked from commit 91cc2368a73198bddc3e140d267cce4ee08cf20e)
Due to a check between offset+len and buf->size, an empty buffer returns
"will never match". Check against tune.bufsize instead.
(cherry picked from commit 43e4039fd5d208fd9d32157d20de93d3ddf9bc0d)
The current Lua actions are not registered. The executed function is
selected according to a function name written in the HAProxy configuration.
This patch adds an action registration function. The configuration mode
described above disappears.
This change introduces some incompatibilities with existing configuration
files for HAProxy 1.6-dev.
function 'peer_prepare_ackmsg' is designed to use the argument 'msg'
instead of 'trash.str'.
There is currently no bug because the caller passes 'trash.str' in
the 'msg' argument.
Some updates are pushed using an incremental update message after a
re-connection whereas the origin is forgotten by the peer.
These updates are never correctly acknowledged. So they are regularly
re-pushed after an idle timeout and a re-connect.
The fix consists in using an absolute update message in some cases.
If an entry is still not present in the update tree, we could fail to
schedule it for a push depending on an uninitialized value (upd.key remains
uninitialized for new sessions and isn't re-initialized for reused ones).
In the same way, if an entry is present in the tree but its update's tick
is far in the past (> 2^31), we could consider it still scheduled even if
it is not the case.
The fix consists in forcing the re-scheduling of an update if it was not
present in the updates tree or if the update is not in the scheduling window
of every peer.
PiBa-NL reported that HAProxy's DNS resolver can't "connect" its socket
on FreeBSD.
Remi Gacogne reported that we should use the function 'get_addr_len' to
get the addr structure size instead of sizeof.
Mailing list participant "mlist" reported negative conn_cur values in
stick tables as the result of "tcp-request connection track-sc". The
reason is that after the stick entry is copied from the session to the
stream, both the session and the stream grab a reference to the entry
and when the stream ends, it decrements one reference and one connection,
then the same is done for the session.
In fact this problem was already encountered slightly differently in the
past and addressed by Thierry using the patch below as it was believed by
then to be only a refcount issue since it was the observable symptom :
827752e "BUG/MEDIUM: stick-tables: refcount error after copying SC..."
In reality the problem is that the stream must touch neither the refcount
nor the connection count for entries it inherits from the session. While
we have no way to tell whether a track entry was inherited from the session
(since they're simply memcpy'd), it is possible to prevent the stream from
touching an entry that already exists in the session because that's a
guarantee that it was inherited from it.
Note that it may be a temporary fix. Maybe in the future when a session
gives birth to multiple streams we'll face a situation where a session may
be updated to add more tracked entries after instantiating some streams.
The correct long-term fix is to mark some tracked entries as shared or
private (or RO/RW). That will allow the session to track more entries
even after the same trackers are being used by early streams.
No backport is needed, this is only caused by the session/stream split in 1.6.
Pradeep Jindal reported and helped troubleshoot a bug causing haproxy to die
during startup on all processes not making use of a peers section. It
only happens with nbproc > 1 when peers are declared. Technically it's
when the peers task is stopped on processes that don't use it that the
crash occurred (a task_free() called on a NULL task pointer).
This only affects peers v2 in the dev branch, no backport is needed.
Trie device detection doesn't benefit from caching compared to Pattern.
As such the LRU cache has been removed from the Trie method.
A new fetch method has been added named 51d.all which uses all the
available HTTP headers for device device detection. The previous 51d
conv method has been changed to 51d.single where one HTTP header,
typically User-Agent, is used for detection. This method is marginally
faster but less accurate.
Three new properties are available with the Pattern method called
Method, Difference and Rank which provide insight into the validity of
the results returned.
A pool of worksets is used to avoid needing to create a new workset for
every request. The workset pool is thread safe, ready to support a future
multi-threaded version of HAProxy.
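For instance, both forms could be used like this (the 51Degrees property
names are examples only):
    # use all available headers
    http-request set-header X-51D-Device %[51d.all(DeviceType,IsMobile)]
    # use a single header (marginally faster, less accurate)
    http-request set-header X-51D-Device %[req.fhdr(User-Agent),51d.single(DeviceType)]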
Added the definition of CHECK_HTTP_MESSAGE_FIRST and the declaration of
smp_prefetch_http to the header.
Changed smp_prefetch_http implementation to remove the static qualifier.
This patch uses the startup of the health check task to also start
the warmup task when required.
This is executed only once: when HAProxy has just started up and can
be started only if the load-server-state-from-file feature is enabled
and the server was in the warmup state before a reload occurs.
This directive gives HAProxy the ability to use either the global
server-state-file directive or a local one using server-state-file-name to
load server states.
The state can be saved right before the reload by the init script, using
the "show servers state" command on the stats socket redirecting output into
a file.
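For instance, a minimal sketch combining these directives (paths are
illustrative):
    global
        server-state-file /var/lib/haproxy/server-state
    defaults
        load-server-state-from-file global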
This new global section directive is used to store the path to the file
where HAProxy will be able to retrieve server states across reloads.
The file pointed to by this path can contain the state of all servers from
all backends.
This new global directive can be used to provide a base directory from
which all the server state files can be loaded.
If a server state file name starts with a slash '/', then this directive
is not applied.
New command 'show servers state' which dumps all variable parameters
of a server over an HAProxy process's life.
The purpose is to dump the current server states at run time in order to
read them right after the reload.
The format of the output is versioned and we support version 1 for now.
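For instance, an init script could save the state right before the reload
with something like this (socket and file paths are examples only):
    echo "show servers state" | socat stdio /var/run/haproxy.sock > /var/lib/haproxy/server-state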
When a server is disabled in the configuration using the "disabled"
keyword, a single flag is positioned: SRV_ADMF_CMAINT (it used to be
SRV_ADMF_FMAINT).
That said, when providing the first version of this code, we also
changed the SRV_ADMF_MAINT mask to match any of the possible MAINT
cases: SRV_ADMF_FMAINT, SRV_ADMF_IMAINT, SRV_ADMF_CMAINT
Since SRV_ADMF_CMAINT is never (and is not supposed to be) altered at
run time, once a server has this flag set up, it can never ever be
enabled again using the stats socket.
In order to fix this, we should:
- consider SRV_ADMF_CMAINT as a simple flag to report the state in the
old configuration file (will be used after a reload to deduce the
state of the server in a new running process)
- enable both SRV_ADMF_CMAINT and SRV_ADMF_FMAINT when the keyword
"disabled" is in use in the configuration
- update the mask SRV_ADMF_MAINT as it was before, to only match
SRV_ADMF_FMAINT and SRV_ADMF_IMAINT.
The following patch perform the changes above.
It allows fixing the regression without breaking the way the upcoming
feature (seamless server state across reloads) is going to work.
Note: this is 1.6-only, no backport needed.
This couple of functions securely executes some Lua calls outside of
the Lua runtime environment. Each Lua call can return a longjmp
if it encounters a memory error.
Lua documentation extract:
If an error happens outside any protected environment, Lua calls
a panic function (see lua_atpanic) and then calls abort, thus
exiting the host application. Your panic function can avoid this
exit by never returning (e.g., doing a long jump to your own
recovery point outside Lua).
The panic function runs as if it were a message handler (see
§2.3); in particular, the error message is at the top of the
stack. However, there is no guarantee about stack space. To push
anything on the stack, the panic function must first check the
available space (see §4.2).
We must check all the Lua entry points. This includes:
- The include/proto/hlua.h exported functions
- the task wrapper function
- The action wrapper function
- The converters wrapper function
- The sample-fetch wrapper functions
It is tolerated that the initialisation function returns an abort.
Before each Lua abort, an error message is written to stderr.
The macro SET_SAFE_LJMP initialises the longjmp. The macro
RESET_SAFE_LJMP resets the longjmp. These functions must be macros
because they must exist in the program stack when the longjmp is
called.
All the code which emits error logs has the same pattern: send the
log through the syslog system, and if it is allowed, display the error
log on screen.
This patch replaces this pattern with a macro. This reduces the number
of lines.
There are two types of retries when performing a DNS resolution:
1. retry because of a timeout
2. retry of the full sequence of requests (query types failover)
Before this patch, the 'resolution->try' counter was incremented
after each send of a DNS request, which does not cover the 2 cases
above.
This patch fixes this behavior.
Some DNS responses may be valid from a protocol point of view but may not
contain any IP addresses.
This patch gives a new flag to the function dns_get_ip_from_response to
report such a case.
It's up to the upper layer to decide what to do with this information.
Some DNS responses may be valid from a protocol point of view, but may
not contain any information considered interesting by the requester.
The purpose of the flag DNS_RESP_NO_EXPECTED_RECORD introduced by this patch
is to allow reporting such a situation.
When this happens, a new DNS query is sent with a new query type.
For now, the function only expects A and AAAA query types, which is enough
to cover current cases.
In the near future, it will be up to the caller to tell the function which
query types are expected.
The send_log function needs a final \n.
This bug was reported by Michael Ezzell.
Minor bug: when writing to syslog from Lua scripts, the last character from
each log entry is truncated.
core.Alert("this is truncated");
Sep 7 15:07:56 localhost haproxy[7055]: this is truncate
This issue appears to be related to the fact that send_log() (in src/log.c)
is expecting a newline at the end of the message's format string:
/*
* This function adds a header to the message and sends the syslog message
* using a printf format string. It expects an LF-terminated message.
*/
void send_log(struct proxy *p, int level, const char *format, ...)
I believe the fix would be in src/hlua.c at line 760
<http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/hlua.c;h=1e4d47c31e66c16c837ff2aa5ef577f6cafdc7e7;hb=316e3196285b89a917c7d84794ced59a6a5b4eba#l760>,
where this...
send_log(px, level, "%s", trash.str);
...should be adding a newline into the format string to accommodate what
the code expects.
send_log(px, level, "%s\n", trash.str);
This change provides what seems to be the correct behavior:
Sep 7 15:08:30 localhost haproxy[7150]: this is truncated
All other uses of send_log() in hlua.c have a trailing dot "." in the
message that is masking the truncation issue because the output message
stops on a clean word boundary. I suspect these would also benefit from
"\n" appended to their format strings as well, since this appears to be the
pattern seen throughout the rest of the code base.
Reported-by: Michael Ezzell <michael@ezzell.net>
The server's host name picked for resolution was incorrect, it did not
skip the address family specifier, did not resolve environment variables,
and messed up with the optional trailing colon.
Instead, let's get the fqdn returned by str2sa_range() and use that
exclusively.
If an environment variable is used in an address, and is not set, it's
silently considered as ":" or "0.0.0.0:0" which is not correct as it
can hide environment issues and lead to unexpected behaviours. Let's
report this case when it happens.
This fix should be backported to 1.5.
The function does a bunch of things among which resolving environment
variables, skipping address family specifiers and trimming port ranges.
It is the only one which sees the complete host name before trying to
resolve it. The DNS resolving code needs to know the original hostname,
so we modify this function to optionally provide it to the caller.
Note that the function itself doesn't know if the host part was a host
or an address, but str2ip() knows that and can be asked not to try to
resolve. So we first try to parse the address without resolving and
try again with resolving enabled. This way we know if the address is
explicit or needs some kind of resolution.
In the first version of the DNS resolver, HAProxy sends an ANY query
type and in case of issue fails over to the type pointed by the
directive in 'resolve-prefer'.
This patch allows the following new failover management:
1. default query type is still ANY
2. if the response is truncated or in error because ANY is not supported by
the server, then a failover to a new query type is performed. The
new query type is the one pointed to by the directive 'resolve-prefer'.
3. if there is no response or errors still occur, then a query type failover
is performed to the remaining IP address family.
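For instance, an illustrative configuration using this failover logic
(names and addresses are hypothetical):
    resolvers mydns
        nameserver dns1 192.0.2.53:53
    backend app
        server app1 app1.example.com:80 check resolvers mydns resolve-prefer ipv4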
The first DNS client implementation simply ignored most of the DNS response
flags.
This patch changes the way the flags are parsed, using bit masks, and
also takes care of truncated responses.
Such responses are reported to the layer above, which can handle them
properly.
This patch introduces a new internal response state about the analysis
of a DNS response received by a server.
It is dedicated to reporting to the layer above that the response is
'truncated'.
This patch updates the dns_nameserver structure to integrate a counter
dedicated to 'truncated' response sent by servers.
Such responses are important to track, since HAProxy is supposed to
replay its request.
Under certain circumstances (a configuration with many servers relying on
DNS resolution and one of them triggering the replay of a request
because of a timeout or invalid response to an ANY query), HAProxy could
end up in an infinite loop over the currently supposed running DNS
queries.
This was caused because the FIFO list of running queries was improperly
updated in snr_resolution_error_cb. The head of the list was removed
instead of the resolution in error, when moving the resolution to the
end of the list.
In the meantime, a LIST_DEL statement is removed since it is useless. This
action is already performed by the dns_reset_resolution function.
Patch f046f11561 introduced a regression:
DNS resolution doesn't start anymore, while it was supposed to
start with the first health check.
The current patch fixes this issue by triggering a new DNS resolution if the
last_resolution time is not set.
A crash was reported when using the "famous" http-send-name-header
directive. This time it's a bit tricky, it requires a certain number of
conditions to be met including maxconn on a server, queuing, timeout in
the queue and cookie-based persistence.
The problem is that in stream.c, before calling http_send_name_header(),
we check a number of conditions to know if we have to replace the header
name. But prior to reaching this place, it's possible for
sess_update_stream_int() to fail and change the stream-int's state to
SI_ST_CLO, send an error 503 to the client, and flush all buffers. But
http_send_name_header() can only be called with valid buffer contents
matching the http_msg's description. So when it rewinds the stream to
modify the header, buf->o becomes negative by the size of the incoming
request and is used as the argument to memmove() which basically
displaces 4GB of memory off a few bytes to write the new name, resulting
in a core and a core file that's really not fun to play with.
The solution obviously consists in refraining from calling this nasty
function when the stream interface is already closed.
This bug also affects 1.5 and possibly 1.4, so the fix must be backported
there.
See commit id bdc97a8795
Michael Ezzell reported that the following Lua code fails in
dev4 when the TCP is not established immediately (due to a little
bit of latency):
function tricky_socket()
local sock = core.tcp();
sock:settimeout(3);
core.log(core.alert,"calling connect()\n");
local connected, con_err = sock:connect("x.x.x.x",80);
core.log(core.alert,"returned from connect()\n");
if con_err ~= nil then
core.log(core.alert,"connect() failed with error: '" .. con_err .. "'\n");
end
The problem is that the flags which want to wake up the applet are
reset before each applet call, so the applet must set the flags again
if the connection is not established.
When converting the "method" fetch to a string, we used to get an empty
string if the first character was not an upper case. This was caused by
the lookup function which returns HTTP_METH_NONE when a lookup is not
possible, and this method being mapped to an empty string in the array.
This is a totally stupid mechanism, there's no reason for having the
result depend on the first char. In fact the message parser already
checks that the syntax matches an HTTP token so we can only land there
with a valid token, hence only HTTP_METH_OTHER should be returned.
This fix should be backported to all actively supported branches.
Before this patch, two types of custom actions existed: ACT_ACTION_CONT and
ACT_ACTION_STOP. ACT_ACTION_CONT is a non-terminal action and ACT_ACTION_STOP
is a terminal action.
Note that ACT_ACTION_STOP is not used in HAProxy.
This patch removes this behavior. Only one type of custom action exists, and
it is called ACT_CUSTOM. Now, the custom action can return a code indicating
the required behavior. ACT_RET_CONT tells HAProxy to continue the current rule
list evaluation, and ACT_RET_STOP tells HAProxy to stop the current rule
list evaluation.
Jesse Hathaway reported a crash that Cyril Bonté diagnosed as being
caused by the manipulation of srv_conn after setting it to NULL. This
happens in http-server-close mode when the server returns either a 401
or a 407, because the connection was previously closed then it's being
assigned the CO_FL_PRIVATE flag.
This bug only affects 1.6-dev as it was introduced by connection reuse code
with commit 387ebf8 ("MINOR: connection: add a new flag CO_FL_PRIVATE").
The first DNS resolution is supposed to be triggered by the first health
check, which is not the case with the current code.
This patch fixes this behavior by setting the
resolution->last_resolution time to 0 instead of now_ms when parsing
server's configuration at startup.
When called from an http ruleset, txn:done() can still crash the process
because it closes the stream without consuming pending data resulting in
the transaction's buffer representation to differ from the real buffer.
This patch also adjusts the transaction's state to indicate that it's
closed to be consistent with what's already done in redirect rules.
When the Lua execution flow ends with the command done (core.done or
txn.done()), an error is used to divert the flow, and the stack is no longer
usable. This patch just reinitializes the stack if this case is detected.
The function txn:close() must be terminal because it demands the session
destruction. This patch renames this function to "done()" to be much
clearer about the fact that it is a final operation.
This patch is inspired by Bowen Ni's proposal and it is based on his first
implementation:
With Lua integration in HAProxy 1.6, one can change the request method,
path, uri, header, response header etc except response line.
I'd like to contribute the following methods to allow modification of the
response line.
[...]
There are two new keywords in 'http-response' that allows you to rewrite
them in the native HAProxy config. There are also two new APIs in Lua that
allows you to do the same rewriting in your Lua script.
Example:
Use it in HAProxy config:
http-response set-code 404
Or use it in Lua script:
txn.http:res_set_reason("Redirect")
I don't take the full patch because the manipulation of the "reason" is
useless: standard reasons are associated with each returned code, and unknown
codes can take a generic reason.
So, this patch can set the status code, and the reason is automatically
adapted.
The function didn't remove the remaining analysers and didn't update the
response channel timeout.
The fix is a copy of the behavior of the functions http_apply_redirect_rule()
and stream_int_retnclose().
Tsvetan Tsvetanov reported that the following Lua code fails in
dev2 and dev3 :
function hello(txn)
local request_msg = txn.req:dup()
local tsm_sock = core.tcp()
tsm_sock:connect("127.0.0.1", 7777)
local res = tsm_sock:send(request_msg)
local response = tsm_sock:receive('*l')
txn.res:send(response)
txn:close()
end
Thierry diagnosed that it was caused by commit 563cc37 ("MAJOR: stream:
use a regular ->update for all stream interfaces"). It broke lua's
ability to establish outgoing connections.
The reason is that the applet used to be notified about established
connections just after the stream analyser loop, and that's not the
case anymore. In peers, this issue didn't happen because peers use
a handshake so after sending data, the response is received and wakes
the applet up again. Here we have to indicate that we want to send or
receive data, this will cause the notification to happen as soon as
the connection is ready. This is similar to pretending that we're
working on a full buffer after all. In theory subscribing for reads
is not needed, but it's added here for completeness.
Reported-By: Tsvetan Tsvetanov <cpi.cecko@gmail.com>
The table definition message id was used instead of the update acknowledgement id.
This bug causes a malformed message and a protocol error and breaks the
connection.
After that, the updates remain unacknowledged.
This was the first transparent proxy technology supported by haproxy
circa 2005 but it was obsoleted in 2007 by Tproxy 4.0 which removed a
lot of the earlier versions' shortcomings and was finally merged into
the kernel. Since nobody has been using cttproxy for many years now
and nobody has even just tried to compile the files, it's time to
remove it. The doc was updated as well.
This patch removes the special stick-table type names and
uses the standard sample type names. This avoids the maintenance
of two sets of types and removes the switch/case for matching a sample
type for each stick-table type.
This patch is the first step for sample integration. Currently
the stick tables use their own data types, and some converters
must be called to convert sample types to stick-table types.
This patch removes the stick-table types and replaces them with
the sample types. This:
- avoids the maintenance of two sets of converters
- reduces the code using the sample converters
Now the prototypes of the actions of each section are the same, and
a discriminant is added to determine from which section we are called.
So, this patch removes the wrappers for the action functions called from
more than one section.
This patch removes 132 lines of useless code.
This patch normalizes the return codes of the configuration parsers. Before
these changes, the tcp action parser returned -1 on failure and 0 on
success, while the http action parser returned 0 on failure and 1 on
success.
The normalisation uses:
- ACT_RET_PRS_OK for success
- ACT_RET_PRS_ERR for failure
Each (http|tcp)-(request|response) action uses the same method
for looking up the action keyword during configuration parsing.
This patch mutualizes the code.
This patch merges the configuration keyword structs. Each declared
configuration keyword struct is similar to the others. This simplifies the
code.
Action functions can return 3 statuses:
- error, if the action encounters a fatal error (like out of memory)
- yield, if the action must finish its work later
- continue, in other cases
For performance considerations, some actions are not processed by a remote
function; they are processed directly inline. Some of these actions
do the same things but for different processing parts (request / response).
This patch gives the same name to the same actions, and changes the
normalization of the other action names.
This patch is ONLY a rename, it doesn't modify the code.
This patch groups the action names in one file. Some actions are called
many times and need an action embedded in the action caller. The main
goal is to have only one header file grouping all definitions.
This mark permits detecting whether the action tag is over the allowed range.
- Normally, this case doesn't appear
- If it appears, it is processed by the default case of the switch
This patch removes the generic opaque type for storing the configuration of
the action "set-src" (HTTP_REQ_ACT_SET_SRC), and uses the dedicated type
"struct expr".
The (http|tcp)-(request|response) action rules use a common
opaque type. For the HAProxy embedded features, the types are known,
so it is better to add these types to the action union and use them.
This patch is the first of a series which merges all the action structs. The
"tcp-request content", "tcp-response content", "http-request" and
"http-response" functions have the same values and the same processing for
some defined actions, but the structs and the prototypes of the declared
functions are different.
This patch tries to unify all of these entries.
A map can store and return various types as output. The only example is
the IPv4 and IPv6 types. The previous patch removed the type from the sample
storage struct and used the converter output type, expecting all entries of
the map to have the same type.
This will be wrong when maps support both IPv4 and IPv6 as output.
With the difference between the "struct sample" data and the
"struct sample_storage" data, it was not possible to write
data = data, and we did a memcpy. This patch removes some of
these memcpys.
I can't test this patch because the available 51degrees library is
"51Degrees-C-3.1.5.2" and HAProxy obviously builds against another version
(some defines and symbols have disappeared).
This patch is provided as-is, on a best-effort basis.
The union name "data" is a little bit heavy when reading the source
code because we end up with "data.data.sint". The rename from "data" to "u"
makes reading easier, as in "data.u.sint".
This patch removes the information stored both in the struct
sample_data and in the struct sample. Now, only the struct sample_data
contains data, and the struct sample uses the struct sample_data for storing
its own data.
During 1.5-dev20 there was some code refactoring to make the src_* fetch
function use the same code as sc_*. Unfortunately this introduced a
regression where src_* doesn't create an entry anymore if it does not
exist in the table. The reason is that smp_fetch_sc_stkctr() only calls
stktable_lookup_key() while src_inc_*/src_clr_* used to make use of
stktable_update_key() which additionally create the entry if it does
not exist.
There's no point modifying the common function for these two exceptions,
so instead we now have a function dedicated to the creation of this entry
for src_* only. It is called when the entry didn't exist, so that requires
minimal modifications to existing code.
Thanks to Thierry Fournier for helping diagnose the issue.
This fix must be backported to 1.5.
The newly created server flag SRV_ADMF_CMAINT means that the server is
in 'disabled' mode because of configuration statement 'disabled'.
The flag SRV_ADMF_FMAINT should not be set anymore in such a case and is
reserved for when the server is forced into maintenance mode from the
stats socket.
During the processing of tcp-request connection, the stream doesn't exist,
so the stick counters are stored in the session. When the stream is created,
it must inherit the session's stick counters.
This patch fixes this behavior.
[WT: this is specific to 1.6, no backport needed]
Next patch will remove sample_storage->type, and the only user is the
"show map" feature on the CLI which can use the map's output type instead.
Let's do that first.
RFC 4291 says that when an IPv6 address has the following form:
0000::ffff:a.b.c.d, it can be converted to an IPv4 address. This patch
enables this conversion in casts.
As sint can be cast to ipv4, and ipv4 can be cast to ipv6, we
can directly cast sint to ipv6 using RFC 4291.
Some functions are just wrappers. This patch reduces the size of
these wrappers to improve readability. One check is moved
from the wrapper to the main function, and some intermediate variables are
removed.
Commit c6678e2 ("MEDIUM: config: authorize frontend and listen without bind")
completely removed the test for bind lines in frontends in order to make it
easier for automated tools to generate configs (eg: replacing a bind with
another one passing via a temporary config without any bind line). The
problem is that some common mistakes are totally hidden now. For example,
this apparently valid entry is silently ignored :
listen 1.2.3.4:8000
server s1 127.0.0.1:8000
Hint: 1.2.3.4:8000 is mistakenly the proxy name here.
Thus instead we now emit a warning to indicate that a frontend was found
with no listener. This should be backported to 1.5 to help spot abnormal
configurations.
appsessions started to be deprecated with the introduction of stick
tables, and the latter are much more powerful and flexible, and in
addition they are replicated between nodes and maintained across
reloads. Let's now remove appsession completely.
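For reference, a cookie-based appsession setup can be replaced with a
stick-table sketch like this one (cookie name and sizes are illustrative):
    backend app
        stick-table type string len 52 size 100k expire 30m
        stick store-response res.cook(JSESSIONID)
        stick match req.cook(JSESSIONID)
        server srv1 192.0.2.10:80 check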
test conf:
global
tune.lua.session-timeout 0
lua-load lol.lua
debug
maxconn 4096
listen test
bind 0.0.0.0:10010
mode tcp
tcp-request content lua act_test
balance roundrobin
server test 127.0.0.1:3304
lua test:
function act_test(txn)
while true do
core.Alert("TEST")
end
end
The function "act_test()" is not executed because a zero timeout is not
considered as TICK_ETERNITY, but is considered as 0.
This path fix this behavior. This is the same problem than the bugfix
685c014e99.
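A minimal sketch of the fix's idea (TICK_ETERNITY's value matches haproxy's
tick.h; the function's shape is illustrative only):

#define TICK_ETERNITY 0 /* stand-in for the definition in tick.h */

/* Convert a configured timeout to an expiration tick: zero means
 * "no timeout" and must map to TICK_ETERNITY, not "expire now".
 */
unsigned int timeout_to_tick(unsigned int now_ms, unsigned int timeout_ms)
{
        unsigned int exp;

        if (!timeout_ms)
                return TICK_ETERNITY;
        exp = now_ms + timeout_ms;
        return exp ? exp : exp + 1; /* never collide with TICK_ETERNITY */
}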
I've been trying out 1.6 dev3 with lua support, and trying to start
lua tasks seems to not be working.
Using this configuration
global
lua-load /lua/lol.lua
debug
maxconn 4096
backend shard_b
server db01 mysql_shard_b:3306
backend shard_a
server db01 mysql_shard_a:3306
listen mysql-cluster
bind 0.0.0.0:8001
mode tcp
balance roundrobin
use_backend shard_b
And this lua function
core.register_task(function()
while true do
core.Alert("LOLOLOLOLOL")
end
end)
I'd always get a timeout error starting the registered function.
The problem, as far as I can tell, lies in the fact that it is possible
for now_ms to not change (is this maybe a problem with my config/system?)
until the expiration check happens in the resume function that
actually kickstarts the lua task, making HAProxy think that the expiration
time for the task is up. If I understand correctly, tasks are meant to
never really time out.
Since sample fetches are not always available in the response phase,
this patch implements %HQ such that:
GET /foo?bar=baz HTTP/1.0
...would be logged as:
?bar=baz
In some cases, parsing of the DNS response is broken and the response is
considered as invalid, despite being valid.
The current patch fixes this issue. It's a temporary solution until I
rework the response parsing to store the response buffer into a real DNS
packet structure.
This patch adds a few checks on "global._51degrees.data_file_path" and allows
haproxy to start even when the pattern or trie data file is not specified.
If the "51d" converter is used, a new function "_51d_conv_check" will check
"global._51degrees.data_file_path" and displays a warning if necessary.
In src/haproxy.c, the global 51Degrees "cache_size" has moved outside of the
FIFTYONEDEGREES_H_PATTERN_INCLUDED ifdef block.
This strategy is less extreme than "always": it only dispatches first
requests to validated reused connections, and moves a connection from
the idle list to the safe list once it has seen a second request, thus
proving that it can be reused.
These ones are considered safe as they have already been reused.
They will be useful in "aggressive" and "always" http-reuse modes
in order to place the first request of a connection with the least
risk.
The "safe" mode consists in picking existing connections only when
processing a request that's not the first one from a connection. This
ensures that in case where the server finally times out and closes, the
client can decide to replay idempotent requests.
Now instead of closing the existing connection attached to the
stream interface, we first check if the one we pick was attached to
another stream interface, in which case the connections are swapped
if possible (eg: if the current connection is not private). That way
the previous connection remains attached to an existing session, which
significantly increases the chances of it being reused.
In connect_server(), if we don't have a connection attached to the
stream-int, we first look into the server's idle_conns list and
pick the first one there, detaching it from its owner if it has one.
If we used to have a connection, we close it.
This mechanism works well but doesn't scale: as the number of servers
increases, so does the likelihood that the connection attached to the
stream interface doesn't match the server and gets closed.
For now it only supports "never", meaning that we never want to reuse a
shared connection, and "always", meaning that we can use any connection
that was not marked private. When "never" is set, this also implies that
no idle connection may become a shared one.
This flag is set on an outgoing connection when this connection gets
some properties that must not be shared with other connections, such
as dynamic transparent source binding, SNI or a proxy protocol header,
or an authentication challenge from the server. This will be needed
later to implement connection reuse.
This function is now dedicated to idle connections only, which means
that it must not be used without an endpoint nor with anything other
than a connection. The connection remains attached to the stream interface.
For now it's not populated but we have the list entry. It will carry
all idle connections that sessions don't want to share. They may be
used later to reclaim connections upon socket shortage for example.
Since we now always call this function with the reuse parameter cleared,
let's simplify the function's logic as it cannot return the existing
connection anymore. The savings on this inline function are appreciable
(240 bytes) :
$ size haproxy.old haproxy.new
text data bss dec hex filename
1020383 40816 36928 1098127 10c18f haproxy.old
1020143 40816 36928 1097887 10c09f haproxy.new
connect_server() already does most of the checks that are done again in
si_alloc_conn(), so let's simply reuse the existing connection instead
of calling the function again. It will also simplify the connection
reuse.
Indeed, for reuse to be set, it also requires srv_conn to be valid. In the
end, the only situation where we have to release the existing connection
and allocate a new one is when reuse == 0.
objt_server() is called multiple times at various places while some
places already make use of srv for this. Let's move the call at the
top of the function and use it all over the place.
Currently it is possible for the current_rule field to be evaluated before
being set, leading to valgrind complaining:
==16783== Conditional jump or move depends on uninitialised value(s)
==16783== at 0x44E662: http_res_get_intercept_rule (proto_http.c:3730)
==16783== by 0x44E662: http_process_res_common (proto_http.c:6528)
==16783== by 0x4797B7: process_stream (stream.c:1851)
==16783== by 0x414634: process_runnable_tasks (task.c:238)
==16783== by 0x40B02F: run_poll_loop (haproxy.c:1528)
==16783== by 0x407F25: main (haproxy.c:1887)
This was introduced by commit 152b81e7b2.
Jan A. Bruder reported that some very specific hostnames on server
lines were causing haproxy to crash on startup. Given that his
backtrace showed some heap corruption, it was obvious there was an
overflow somewhere. The bug in fact is a typo in dns_str_to_dn_label()
which mistakenly copies one extra byte from the host name into the
output value, thus effectively corrupting the structure.
The bug triggers while parsing the next server of similar length
after the corruption, which generally triggers at config time but
could theoretically crash at any moment during runtime depending on
what malloc sizes are needed next. This is why it's tagged major.
No backport is needed, this bug was introduced in 1.6-dev2.
This patch allows the existing operators to take a variable as parameter.
This is useful, for example, to add the contents of two variables. This
patch modifies the behavior of the operators.
This patch checks computations for overflow and returns capped values.
This permits protecting against integer overflow in certain operations
involving ratios, percentages, limits or anything else. That can sometimes
be critically important with some operations (eg: content-length < X).
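A self-contained example of the capping principle for addition (haproxy's
converters apply the same idea to their other operations):

#include <limits.h>

/* Add two signed 64-bit values, capping at the type's limits
 * instead of wrapping around on overflow.
 */
long long add_capped(long long a, long long b)
{
        if (b > 0 && a > LLONG_MAX - b)
                return LLONG_MAX; /* cap positive overflow */
        if (b < 0 && a < LLONG_MIN - b)
                return LLONG_MIN; /* cap negative overflow */
        return a + b;
}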
This patch removes the 32-bit unsigned integer and the 32-bit signed
integer types. It replaces them with a unique 64-bit signed type.
This makes integer usage easier and clarifies signed and unsigned use.
With the previous version, signed and unsigned types were used in place of
one another, and sometimes the converters lost the sign. For example,
divisions were processed as "unsigned": if one operand was negative, the
result was wrong.
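A self-contained illustration of the sign loss described above:

#include <stdio.h>

int main(void)
{
        unsigned int u = (unsigned int)-4; /* becomes 4294967292 */

        printf("%u\n", u / 2);      /* prints 2147483646: wrong */
        printf("%lld\n", -4LL / 2); /* prints -2: correct with signed 64-bit */
        return 0;
}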
Note that the integer pattern matching and dotted version pattern matching
are already working with signed 64 bits integer values.
There is one user-visible change : the "uint()" and "sint()" sample fetch
functions which used to return a constant integer have been replaced with
a new more natural, unified "int()" function. These functions were only
introduced in the latest 1.6-dev2 so there's no impact on regular
deployments.
This patch adds 3 functions for 64-bit integer conversion:
* lltoa_r : converts a signed 64-bit integer to a string
* read_uint64 : converts a string to an unsigned 64-bit integer with capping
* read_int64 : converts a string to a signed 64-bit integer with capping
This patch introduces three new functions which can be used to find a
server in a farm using different server information:
- server unique id (srv->puid)
- server name
- find best match using either name or unique id
When performing best matching, the following applies:
- use the server name first (if provided)
- use the server id if provided
In any case, the functions can report any mismatches encountered to the
caller.
This flag aims at reporting whether the server unique id (srv->puid) has
been forced by the administrator in HAProxy's configuration.
If not set, it means HAProxy has automatically generated the server's
unique id.
The function proxy_find_best_match() can inform the caller through an int
provided as argument.
For now, proxy_find_best_match() hardcodes the bit values 0x01, 0x02 and
0x04, which is not understandable when reading code that uses them.
This patch defines 3 macros with more explicit wording, so further
reading of code using these magic bit values will be easier to
understand.
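For illustration, such macros can look like this (the names are assumed
here; the values are the ones quoted above):

#define PR_FBM_MISMATCH_ID        0x01 /* the proxy's ID doesn't match */
#define PR_FBM_MISMATCH_NAME      0x02 /* the proxy's name doesn't match */
#define PR_FBM_MISMATCH_PROXYTYPE 0x04 /* the proxy's type doesn't match */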
The man page says that gmtime() and localtime() can return a NULL value;
this was not tested. It appears that all the values of a 32-bit integer
are valid, but it is better to check the return value of these functions
anyway. Moreover, if the integer moves from 32 bits to 64 bits, some
64-bit values may be unsupported.
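A self-contained sketch of the added check:

#include <stdio.h>
#include <time.h>

int main(void)
{
        time_t t = time(NULL);
        struct tm *tm = gmtime(&t); /* may return NULL on out-of-range input */

        if (!tm) {
                fprintf(stderr, "unrepresentable date\n");
                return 1;
        }
        printf("%04d-%02d-%02d\n",
               tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday);
        return 0;
}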
Madison May reported that the timeouts applied by the default
configuration are improperly set up.
This patch fixes this:
- hold valid defaults to 10s
- timeout retry defaults to 1s
The new "sni" server directive takes a sample fetch expression and
uses its return value as a hostname sent as the TLS SNI extension.
A typical use case consists in forwarding the front connection's SNI
value to the server in a bridged HTTPS forwarder :
sni ssl_fc_sni
ssl_sock_set_servername() is used to set the SNI hostname on an
outgoing connection. This function comes from code originally
provided by Christopher Faulet of Qualys.
When the HTTP forwarder is used, it resets msg->sov so that we know that
the parsing pointer has advanced by exactly (msg->eoh + msg->eol - msg->sov)
bytes which may have to be rewound in case we want to perform an HTTP fetch
after forwarding has started (eg: upon connect).
But when the backend is in TCP mode, there may be no HTTP forwarding
analyser installed, still we may want to perform these HTTP fetches in
case we have already ensured at the TCP layer that we have a properly
parsed HTTP transaction.
In order to solve this, we reset msg->sov before doing a channel_forward()
so that we can still compute http_rewind() on the pending data. That ensures
the buffer is always rewindable even in mixed TCP+HTTP mode.
ARGC_CAP was not added to fmt_directives() which is used to format
error messages when failing to parse log format expressions. The
whole switch/case has been reorganized to match the declaration
order making it easier to spot missing values. The default is not
the "log" directive anymore but "undefined" asking to report the
bug.
Backport to 1.5 is not strictly needed but is desirable at least
for code sanity.
Clients that support ECC cipher suites SHOULD send the specified extension
within the SSL ClientHello message according to RFC4492, section 5.1. We
can use this extension to chain-proxy requests so that, on the same IP
address, an ECC-compatible client gets an EC certificate and a
non-ECC-compatible client gets a regular RSA certificate. The main
advantage of this approach compared to the one presented by Dave Zhu on
the mailing list is that we can make it work with OpenSSL versions
before 1.0.2.
Example:
frontend ssl-relay
mode tcp
bind 0.0.0.0:443
use_backend ssl-ecc if { req.ssl_ec_ext 1 }
default_backend ssl-rsa
backend ssl-ecc
mode tcp
server ecc unix@/var/run/haproxy_ssl_ecc.sock send-proxy-v2 check
backend ssl-rsa
mode tcp
server rsa unix@/var/run/haproxy_ssl_rsa.sock send-proxy-v2 check
listen all-ssl
bind unix@/var/run/haproxy_ssl_ecc.sock accept-proxy ssl crt /usr/local/haproxy/ecc.foo.com.pem user nobody
bind unix@/var/run/haproxy_ssl_rsa.sock accept-proxy ssl crt /usr/local/haproxy/www.foo.com.pem user nobody
Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>
The current method of retrieving the incoming connection's destination
address to hash it is not compatible with IPv6 nor the proxy protocol
because it directly tries to get an IPv4 address from the socket. Instead
we must ask the connection. This is only used when no SNI is provided.
In src/51d.c, in the function _51d_conv(), a final '\0' is added into
smp->data.str.str, which can cause a problem if the SMP_F_CONST flag is
set in smp->flags or if smp->data.str.size is not available.
This patch adds a check on smp->flags and smp->data.str.size, and copies
the smp->data.str.str to another buffer by using smp_dup(). If necessary,
the "const" flag is set after device detection. Also, this patch removes
the unnecessary call to chunk_reset() on temp argument.
This option enables overriding the source IP address in an HTTP request.
It is useful when we want to set a custom source IP (e.g. a front proxy
rewrites the address but provides the correct one in headers), or when we
want to mask the source IP address for privacy or compliance reasons.
It acts on any expression which produces a correct IP address.
This modification makes it possible to use sample_fetch_string() in more
places where we might need to fetch sample values which are not plain
strings. This way we don't need to fetch a string and convert it into
another type afterwards.
When using aliased types, the caller should explicitly check which exact
type was returned (e.g. SMP_T_IPV4 or SMP_T_IPV6 for SMP_T_ADDR).
All usages of sample_fetch_string() are converted to use the new function.
Compression stats were not easy to read and could be confusing because
the saving ratio could be taken for global savings while it was only
relative to compressible input. Let's make that a bit clearer using
the new tooltips with a bit more details and also report the effective
ratio over all output bytes.
Commit cc87a11 ("MEDIUM: tcp: add register keyword system.") broke the
TCP ruleset by merging custom rules and accept. It was fixed a first time
by commit e91ffd0 ("BUG/MAJOR: tcp: only call registered actions when
they're registered") but the accept action still didn't work anymore
and was causing the matching rule to simply be ignored.
Since the code introduced a very fragile behaviour by not even mentioning
that accept and custom were silently merged, let's fix this once and for
all by adding an explicit check for the accept action. Nevertheless, as
previously mentioned, the action should be changed so that custom is the
only action and the continue vs break indication directly comes from the
callee.
No backport is needed, this bug only affects 1.6-dev.
Until now, the code assumed that it can get the offset to the first TLV
header just by subtracting the length of the TLV part from the length of
the complete buffer. However, if the buffer contains actual data after
the header, this computation is flawed and leads to haproxy trying to
parse TLV headers from the proxied data.
This change fixes this by making sure that the offset to the first TLV
header is calculated from the start of the buffer -- simply by
adding the size of the proxy protocol v2 header plus the address
family-dependent size of the address information block.
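A self-contained sketch of the corrected computation (sizes taken from the
proxy protocol v2 specification; the names are illustrative):

#include <stddef.h>

/* Offset of the first TLV, counted from the start of the buffer:
 * the fixed v2 header plus the family-dependent address block.
 */
size_t ppv2_tlv_offset(int is_ipv6)
{
        const size_t hdr_len  = 16;                /* fixed v2 header */
        const size_t addr_len = is_ipv6 ? 36 : 12; /* addresses + ports */

        return hdr_len + addr_len;
}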
The function buffer_slow_realign() was initially designed for requests
only and did not consider pending outgoing data. This causes a problem
when called on responses where data remain in the buffer, which may
happen with pipelined requests when the client is slow to read data.
The user-visible effect is that if less than <maxrewrite> bytes are
present in the buffer from a previous response and these bytes cross
the <maxrewrite> boundary close to the end of the buffer, then a new
response will cause a realign and will destroy these pending data and
move the pointer to what's believed to contain pending output data.
Thus the client receives the crap that lies in the buffer instead of
the original output bytes.
This new implementation now properly realigns everything including the
outgoing data which are moved to the end of the buffer while the input
data are moved to the beginning.
This implementation still uses a buffer-to-buffer copy which is not
optimal in terms of performance and which should be replaced by a
buffer switch later.
Prior to this patch, the following script would return different hashes
on each round when run from a 100 Mbps-connected machine :
i=0
while usleep 100000; do
echo round $((i++))
set -- $(nc6 0 8001 < 1kreq5k.txt | grep -v '^[0-9A-Z]' | md5sum)
if [ "$1" != "3861afbb6566cd48740ce01edc426020" ]; then echo $1;break;fi
done
The file contains 1000 times this request with "Connection: close" on the
last one :
GET /?s=5k&R=1 HTTP/1.1
The config is very simple :
global
tune.bufsize 16384
tune.maxrewrite 8192
defaults
mode http
timeout client 10s
timeout server 5s
timeout connect 3s
listen px
bind :8001
option http-server-close
server s1 127.0.0.1:8000
And httpterm-1.7.2 is used as the server on port 8000.
After the fix, 1 million requests were sent and all returned the same
contents.
Many thanks to Charlie Smurthwaite of atechmedia.com for his precious
help on this issue, which would not have been diagnosed without his
very detailed traces and numerous tests.
The patch must be backported to 1.5 which is where the bug was introduced.
This is in order to avoid conflicting with NetBSD's popcount* functions,
present since its 6.x releases; the final 'l' indicates that the argument
is a long, as NetBSD does.
This patch could be backported to 1.5 to fix the build issue there as well.
This cache is used by the 51d converter. The input User-Agent string, the
converter args and a random seed are used as the hashing key. The cached
entries contain a pointer to the resulting string for a specific
User-Agent string detection.
The cache size can be tuned using the 51degrees-cache-size parameter.
Moved the 51Degrees code from src/haproxy.c, src/sample.c and src/cfgparse.c
into separate files src/51d.c and include/import/51d.h.
Added two new functions, init_51degrees() and deinit_51degrees(), updated
the Makefile, and made other code reorganizations related to 51Degrees.
Commit 4834bc7 ("MEDIUM: vars: adds support of variables") brought a bug.
Setting a variable from an expression that doesn't resolve blocks the
processing indefinitely.
The internal actions API must be changed to let the caller pass the various
flags regarding the state of the analysis (SMP_OPT_FINAL).
For now we only fix the issue by making the action_store() function always
return 1 to prevent any blocking.
No backport is needed.
We'll need to move the session variables to the session. For this, the
accounting must not depend on the stream. Instead we pass the pointers
to the different lists.
Commit 7810ad7 ("BUG/MAJOR: lru: fix unconditional call to free due to
unexpected semi-colon") was not enough, it happens that the free() is
not performed at the right place because if the evicted node is recycled,
we must also release its data before it gets overwritten.
No backport is needed.
Dmitry Sivachenko reported the following build warning using Clang, which
is a real bug :
src/log.c:1538:22: warning: use of logical '&&' with constant operand
[-Wconstant-logical-operand]
if (tmp->options && LOG_OPT_QUOTE)
^ ~~~~~~~~~~~~~
The effect is that recent log tags related to HTTP method, path, uri,
query have a bug making them always use quotes.
This bug was introduced in 1.6-dev2 with commit 0ebc55f ("MEDIUM: logs:
Add HTTP request-line log format directives"), so no backport is needed.
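For reference, the intended test in the line quoted above uses the bitwise
AND operator so that only the flag bit is checked; the corrected line reads:

if (tmp->options & LOG_OPT_QUOTE)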
Dmitry Sivachenko reported the following build warning using Clang, which
is a real bug :
src/lru.c:133:32: warning: if statement has empty body [-Wempty-body]
if (old->data && old->free);
^
It results in calling old->free(old->data) even when old->free is NULL,
hence crashing on cached patterns.
The same bug appears a few lines below in lru64_destroy() :
src/lru.c:195:33: warning: if statement has empty body [-Wempty-body]
if (elem->data && elem->free);
^
Both were introduced in 1.6-dev2 with commit f90ac55 ("MINOR: lru: Add the
possibility to free data when an item is removed"), so no backport is needed.
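The fix, in both places, simply consists in removing the stray trailing
semicolon so that the condition actually guards the call:

if (old->data && old->free)
        old->free(old->data);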
Dmitry Sivachenko reported the following harmless build warning using Clang :
src/dumpstats.c:5196:48: warning: address of array 'strm_li(sess)->proto->name'
will always evaluate to 'true' [-Wpointer-bool-conversion]
...strm_li(sess) && strm_li(sess)->proto->name ? strm_li(sess)->proto->nam...
~~ ~~~~~~~~~~~~~~~~~~~~~~^~~~
proto->name cannot be null here as it's the protocol name which is stored
directly in the structure.
The same case is present in 1.5 though the code changed.
Dmitry Sivachenko reported the following build warning using Clang,
though it's harmless :
src/hlua.c:1911:13: warning: variable '_socket_info_expanded_form' is not needed
and will not be emitted [-Wunneeded-internal-declaration]
static char _socket_info_expanded_form[] = SOCKET_INFO_EXPANDED_FORM;
^
Indeed, the variable is not used except to compute a sizeof which is
taken from the string it is initialized from. It probably is a leftover
after various code refactorings. Let's get rid of it now since it's not
used anymore.
No backport is needed.
Dmitry Sivachenko reported the following build warning using Clang
which is a real bug :
src/ssl_sock.c:4104:44: warning: address of 'smp->data.str.len' will always
evaluate to 'true' [-Wpointer-bool-conversion]
if (!smp->data.str.str || !&smp->data.str.len)
The impact is very low, however: it will return an empty session_id
instead of no session id when none is found.
The fix should be backported to 1.5.
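The intended check tests the length's value rather than its address, which
is never NULL; the corrected line reads:

if (!smp->data.str.str || !smp->data.str.len)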
Released version 1.6-dev2 with the following main changes :
- BUG/MINOR: ssl: Display correct filename in error message
- MEDIUM: logs: Add HTTP request-line log format directives
- BUG/MEDIUM: check: tcpcheck regression introduced by e16c1b3f
- BUG/MINOR: check: fix tcpcheck error message
- MINOR: use an int instead of calling tcpcheck_get_step_id
- MINOR: tcpcheck_rule structure update
- MINOR: include comment in tcpcheck error log
- DOC: tcpcheck comment documentation
- MEDIUM: server: add support for changing a server's address
- MEDIUM: server: change server ip address from stats socket
- MEDIUM: protocol: add minimalist UDP protocol client
- MEDIUM: dns: implement a DNS resolver
- MAJOR: server: add DNS-based server name resolution
- DOC: server name resolution + proto DNS
- MINOR: dns: add DNS statistics
- MEDIUM: http: configurable http result codes for http-request deny
- BUILD: Compile clean when debug options defined
- MINOR: lru: Add the possibility to free data when an item is removed
- MINOR: lru: Add lru64_lookup function
- MEDIUM: ssl: Add options to forge SSL certificates
- MINOR: ssl: Export functions to manipulate generated certificates
- MEDIUM: config: add DeviceAtlas global keywords
- MEDIUM: global: add the DeviceAtlas required elements to struct global
- MEDIUM: sample: add the da-csv converter
- MEDIUM: init: DeviceAtlas initialization
- BUILD: Makefile: add options to build with DeviceAtlas
- DOC: README: explain how to build with DeviceAtlas
- BUG/MEDIUM: http: fix the url_param fetch
- BUG/MEDIUM: init: segfault if global._51d_property_names is not initialized
- MAJOR: peers: peers protocol version 2.0
- MINOR: peers: avoid re-scheduling of pending stick-table's updates still not pushed.
- MEDIUM: peers: re-schedule stick-table's entry for sync when data is modified.
- MEDIUM: peers: support of any stick-table data-types for sync
- BUG/MAJOR: sample: regression on sample cast to stick table types.
- CLEANUP: deinit: remove codes for cleaning p->block_rules
- DOC: Fix L4TOUT typo in documentation
- DOC: set-log-level in Logging section preamble
- BUG/MEDIUM: compat: fix segfault on FreeBSD
- MEDIUM: check: include server address and port in the send-state header
- MEDIUM: backend: Allow redispatch on retry intervals
- MINOR: Add TLS ticket keys reference and use it in the listener struct
- MEDIUM: Add support for updating TLS ticket keys via socket
- DOC: Document new socket commands "show tls-keys" and "set ssl tls-key"
- MINOR: Add sample fetch which identifies if the SSL session has been resumed
- DOC: Update doc about weight, act and bck fields in the statistics
- BUG/MEDIUM: ssl: fix tune.ssl.default-dh-param value being overwritten
- MINOR: ssl: add a destructor to free allocated SSL ressources
- MEDIUM: ssl: add the possibility to use a global DH parameters file
- MEDIUM: ssl: replace standards DH groups with custom ones
- MEDIUM: stats: Add enum srv_stats_state
- MEDIUM: stats: Separate server state and colour in stats
- MEDIUM: stats: Only report drain state in stats if server has SRV_ADMF_DRAIN set
- MEDIUM: stats: Differentiate between DRAIN and DRAIN (agent)
- MEDIUM: Lower priority of email alerts for log-health-checks messages
- MEDIUM: Send email alerts when servers are marked as UP or enter the drain state
- MEDIUM: Document when email-alerts are sent
- BUG/MEDIUM: lua: bad argument number in analyser and in error message
- MEDIUM: lua: automatically converts strings in proxy, tables, server and ip
- BUG/MINOR: utf8: remove compilator warning
- MEDIUM: map: uses HAProxy facilities to store default value
- BUG/MINOR: lua: error in detection of mandatory arguments
- BUG/MINOR: lua: set current proxy as default value if it is possible
- BUG/MEDIUM: http: the action set-{method|path|query|uri} doesn't run.
- BUG/MEDIUM: lua: undetected infinite loop
- BUG/MAJOR: http: don't read past buffer's end in http_replace_value
- BUG/MEDIUM: http: the function "(req|res)-replace-value" doesn't respect the HTTP syntax
- MEDIUM/CLEANUP: http: rewrite and lighten http_transform_header() prototype
- BUILD: lua: it miss the '-ldl' directive
- MEDIUM: http: allows 'R' and 'S' in the protocol alphabet
- MINOR: http: split the function http_action_set_req_line() in two parts
- MINOR: http: split http_transform_header() function in two parts.
- MINOR: http: export function inet_set_tos()
- MINOR: lua: txn: add function set_(loglevel|tos|mark)
- MINOR: lua: create and register HTTP class
- DOC: lua: fix some typos
- MINOR: lua: add log functions
- BUG/MINOR: lua: Fix SSL initialisation
- DOC: lua: some fixes
- MINOR: lua: (req|res)_get_headers return more than one header value
- MINOR: lua: map system integration in Lua
- BUG/MEDIUM: http: functions set-{path,query,method,uri} breaks the HTTP parser
- MINOR: sample: add url_dec converter
- MEDIUM: sample: fill the struct sample with the session, proxy and stream pointers
- MEDIUM: sample change the prototype of sample-fetches and converters functions
- MINOR: sample: fill the struct sample with the options.
- MEDIUM: sample: change the prototype of sample-fetches functions
- MINOR: http: split the url_param in two parts
- CLEANUP: http: bad indentation
- MINOR: http: add body_param fetch
- MEDIUM: http: url-encoded parsing function can run throught wrapped buffer
- DOC: http: req.body_param documentation
- MINOR: proxy: custom capture declaration
- MINOR: capture: add two "capture" converters
- MEDIUM: capture: Allow capture with slot identifier
- MINOR: http: add array of generic pointers in http_res_rules
- MEDIUM: capture: adds http-response capture
- MINOR: common: escape CSV strings
- MEDIUM: stats: escape some strings in the CSV dump
- MINOR: tcp: add custom actions that can continue tcp-(request|response) processing
- MINOR: lua: Lua tcp action are not final action
- DOC: lua: schematics about lua socket organization
- BUG/MINOR: debug: display (null) in place of "meth"
- DOC: mention the "lua action" in documentation
- MINOR: standard: add function that converts signed int to a string
- BUG/MINOR: sample: wrong conversion of signed values
- MEDIUM: sample: Add type any
- MINOR: debug: add a special converter which display its input sample content.
- MINOR: tcp: increase the opaque data array
- MINOR: tcp/http/conf: extends the keyword registration options
- MINOR: build: fix build dependency
- MEDIUM: vars: adds support of variables
- MINOR: vars: adds get and set functions
- MINOR: lua: Variable access
- MINOR: samples: add samples which returns constants
- BUG/MINOR: vars/compil: fix some warnings
- BUILD: add 51degrees options to makefile.
- MINOR: global: add several 51Degrees members to global
- MINOR: config: add 51Degrees config parsing.
- MINOR: init: add 51Degrees initialisation code
- MEDIUM: sample: add fiftyone_degrees converter.
- MEDIUM: deinit: add cleanup for 51Degrees to deinit
- MEDIUM: sample: add trie support to 51Degrees
- DOC: add 51Degrees notes to configuration.txt.
- DOC: add build indications for 51Degrees to README.
- MEDIUM: cfgparse: introduce weak and strong quoting
- BUG/MEDIUM: cfgparse: incorrect memmove in quotes management
- MINOR: cfgparse: remove line size limitation
- MEDIUM: cfgparse: expand environment variables
- BUG/MINOR: cfgparse: fix typo in 'option httplog' error message
- BUG/MEDIUM: cfgparse: segfault when userlist is misused
- CLEANUP: cfgparse: remove reference to 'ruleset' section
- MEDIUM: cfgparse: check section maximum number of arguments
- MEDIUM: cfgparse: max arguments check in the global section
- MEDIUM: cfgparse: check max arguments in the proxies sections
- CLEANUP: stream-int: remove a redundant clearing of the linger_risk flag
- MINOR: connection: make conn_sock_shutw() actually perform the shutdown() call
- MINOR: stream-int: use conn_sock_shutw() to shutdown a connection
- MINOR: connection: perform the call to xprt->shutw() in conn_data_shutw()
- MEDIUM: stream-int: replace xprt->shutw calls with conn_data_shutw()
- MINOR: checks: use conn_data_shutw_hard() instead of call via xprt
- MINOR: connection: implement conn_sock_send()
- MEDIUM: stream-int: make conn_si_send_proxy() use conn_sock_send()
- MEDIUM: connection: make conn_drain() perform more controls
- REORG: connection: move conn_drain() to connection.c and rename it
- CLEANUP: stream-int: remove inclusion of fd.h that is not used anymore
- MEDIUM: channel: don't always set CF_WAKE_WRITE on bi_put*
- CLEANUP: lua: don't use si_ic/si_oc on known stream-ints
- BUG/MEDIUM: peers: correctly configure the client timeout
- MINOR: peers: centralize configuration of the peers frontend
- MINOR: proxy: store the default target into the frontend's configuration
- MEDIUM: stats: use frontend_accept() as the accept function
- MEDIUM: peers: use frontend_accept() instead of peer_accept()
- CLEANUP: listeners: remove unused timeout
- MEDIUM: listener: store the default target per listener
- BUILD: fix automatic inclusion of libdl.
- MEDIUM: lua: implement a simple memory allocator
- MEDIUM: compression: postpone buffer adjustments after compression
- MEDIUM: compression: don't send leading zeroes with chunk size
- BUG/MINOR: compression: consider the expansion factor in init
- MINOR: http: check the algo name "identity" instead of the function pointer
- CLEANUP: compression: statify all algo-specific functions
- MEDIUM: compression: add a distinction between UA- and config- algorithms
- MEDIUM: compression: add new "raw-deflate" compression algorithm
- MEDIUM: compression: split deflate_flush() into flush and finish
- CLEANUP: compression: remove unused reset functions
- MAJOR: compression: integrate support for libslz
- BUG/MEDIUM: http: hdr_cnt would not count any header when called without name
- BUG/MAJOR: http: null-terminate the http actions keywords list
- CLEANUP: lua: remove the unused hlua_sleep memory pool
- BUG/MAJOR: lua: use correct object size when initializing a new converter
- CLEANUP: lua: remove hard-coded sizeof() in object creations and mallocs
- CLEANUP: lua: fix confusing local variable naming in hlua_txn_new()
- CLEANUP: hlua: stop using variable name "s" alternately for hlua_txn and hlua_smp
- CLEANUP: lua: get rid of the last "*ht" for struct hlua_txn.
- CLEANUP: lua: rename last occurrences of "*s" to "*htxn" for hlua_txn
- CLEANUP: lua: rename variable "sc" for struct hlua_smp
- CLEANUP: lua: get rid of the last two "*hs" for hlua_smp
- REORG/MAJOR: session: rename the "session" entity to "stream"
- REORG/MEDIUM: stream: rename stream flags from SN_* to SF_*
- MINOR: session: start to reintroduce struct session
- MEDIUM: stream: allocate the session when a stream is created
- MEDIUM: stream: move the listener's pointer to the session
- MEDIUM: stream: move the frontend's pointer to the session
- MINOR: session: add a pointer to the session's origin
- MEDIUM: session: use the pointer to the origin instead of s->si[0].end
- CLEANUP: sample: remove useless tests in fetch functions for l4 != NULL
- MEDIUM: http: move header captures from http_txn to struct stream
- MINOR: http: create a dedicated pool for http_txn
- MAJOR: http: move http_txn out of struct stream
- MAJOR: sample: don't pass l7 anymore to sample fetch functions
- CLEANUP: lua: remove unused hlua_smp->l7 and hlua_txn->l7
- MEDIUM: http: remove the now useless http_txn from {req/res} rules
- CLEANUP: lua: don't pass http_txn anymore to hlua_request_act_wrapper()
- MAJOR: sample: pass a pointer to the session to each sample fetch function
- MINOR: stream: provide a few helpers to retrieve frontend, listener and origin
- CLEANUP: stream: don't set ->target to the incoming connection anymore
- MINOR: stream: move session initialization before the stream's
- MINOR: session: store the session's accept date
- MINOR: session: don't rely on s->logs.logwait in embryonic sessions
- MINOR: session: implement session_free() and use it everywhere
- MINOR: session: add stick counters to the struct session
- REORG: stktable: move the stkctr_* functions from stream to sticktable
- MEDIUM: streams: support looking up stkctr in the session
- MEDIUM: session: update the session's stick counters upon session_free()
- MEDIUM: proto_tcp: track the session's counters in the connection ruleset
- MAJOR: tcp: make tcp_exec_req_rules() only rely on the session
- MEDIUM: stream: don't call stream_store_counters() in kill_mini_session() nor session_accept()
- MEDIUM: stream: move all the session-specific stuff of stream_accept() earlier
- MAJOR: stream: don't initialize the stream anymore in stream_accept
- MEDIUM: session: remove the task pointer from the session
- REORG: session: move the session parts out of stream.c
- MINOR: stream-int: make appctx_new() take the applet in argument
- MEDIUM: peers: move the appctx initialization earlier
- MINOR: session: introduce session_new()
- MINOR: session: make use of session_new() when creating a new session
- MINOR: peers: make use of session_new() when creating a new session
- MEDIUM: peers: initialize the task before the stream
- MINOR: session: set the CO_FL_CONNECTED flag on the connection once ready
- CLEANUP: stream.c: do not re-attach the connection to the stream
- MEDIUM: stream: isolate connection-specific initialization code
- MEDIUM: stream: also accept appctx as origin in stream_accept_session()
- MEDIUM: peers: make use of stream_accept_session()
- MEDIUM: frontend: make ->accept only return +/-1
- MEDIUM: stream: return the stream upon accept()
- MEDIUM: frontend: move some stream initialisation to stream_new()
- MEDIUM: frontend: move the fd-specific settings to session_accept_fd()
- MEDIUM: frontend: don't restrict frontend_accept() to connections anymore
- MEDIUM: frontend: move some remaining stream settings to stream_new()
- CLEANUP: frontend: remove one useless local variable
- MEDIUM: stream: don't rely on the session's listener anymore in stream_new()
- MEDIUM: lua: make use of stream_new() to create an outgoing connection
- MINOR: lua: minor cleanup in hlua_socket_new()
- MINOR: lua: no need for setting timeouts / conn_retries in hlua_socket_new()
- MINOR: peers: no need for setting timeouts / conn_retries in peer_session_create()
- CLEANUP: stream-int: swap stream-int and appctx declarations
- CLEANUP: namespaces: fix protection against multiple inclusions
- MINOR: session: maintain the session count stats in the session, not the stream
- MEDIUM: session: adjust the connection flags before stream_new()
- MINOR: stream: pass the pointer to the origin explicitly to stream_new()
- CLEANUP: poll: move the conditions for waiting out of the poll functions
- BUG/MEDIUM: listener: don't report an error when resuming unbound listeners
- BUG/MEDIUM: init: don't limit cpu-map to the first 32 processes only
- BUG/MAJOR: tcp/http: fix current_rule assignment when restarting over a ruleset
- BUG/MEDIUM: stream-int: always reset si->ops when si->end is nullified
- DOC: update the entities diagrams
- BUG/MEDIUM: http: properly retrieve the front connection
- MINOR: applet: add a new "owner" pointer in the appctx
- MEDIUM: applet: make the applet not depend on a stream interface anymore
- REORG: applet: move the applet definitions out of stream_interface
- CLEANUP: applet: rename struct si_applet to applet
- REORG: stream-int: create si_applet_ops dedicated to applets
- MEDIUM: applet: add basic support for an applet run queue
- MEDIUM: applet: implement a run queue for active appctx
- MEDIUM: stream-int: add a new function si_applet_done()
- MAJOR: applet: now call si_applet_done() instead of si_update() in I/O handlers
- MAJOR: stream: use a regular ->update for all stream interfaces
- MEDIUM: dumpstats: don't unregister the applet anymore
- MEDIUM: applet: centralize the call to si_applet_done() in the I/O handler
- MAJOR: stream: do not allocate request buffers anymore when the left side is an applet
- MINOR: stream-int: add two flags to indicate an applet's wishes regarding I/O
- MEDIUM: applet: make the applets only use si_applet_{cant|want|stop}_{get|put}
- MEDIUM: stream-int: pause the appctx if the task is woken up
- BUG/MAJOR: tcp: only call registered actions when they're registered
- BUG/MEDIUM: peers: fix applet scheduling
- BUG/MEDIUM: peers: recent applet changes broke peers updates scheduling
- MINOR: tools: provide an rdtsc() function for time comparisons
- IMPORT: lru: import simple ebtree-based LRU functions
- IMPORT: hash: import xxhash-r39
- MEDIUM: pattern: add a revision to all pattern expressions
- MAJOR: pattern: add LRU-based cache on pattern matching
- BUG/MEDIUM: http: remove content-length from chunked messages
- DOC: http: update the comments about the rules for determining transfer-length
- BUG/MEDIUM: http: do not restrict parsing of transfer-encoding to HTTP/1.1
- BUG/MEDIUM: http: incorrect transfer-coding in the request is a bad request
- BUG/MEDIUM: http: remove content-length form responses with bad transfer-encoding
- MEDIUM: http: restrict the HTTP version token to 1 digit as per RFC7230
- MEDIUM: http: disable support for HTTP/0.9 by default
- MEDIUM: http: add option-ignore-probes to get rid of the floods of 408
- BUG/MINOR: config: clear proxy->table.peers.p for disabled proxies
- MEDIUM: init: don't stop proxies in parent process when exiting
- MINOR: stick-table: don't attach to peers in stopped state
- MEDIUM: config: initialize stick-tables after peers, not before
- MEDIUM: peers: add the ability to disable a peers section
- MINOR: peers: store the pointer to the signal handler
- MEDIUM: peers: unregister peers that were never started
- MEDIUM: config: propagate the table's process list to the peers sections
- MEDIUM: init: stop any peers section not bound to the correct process
- MEDIUM: config: validate that peers sections are bound to exactly one process
- MAJOR: peers: allow peers section to be used with nbproc > 1
- DOC: relax the peers restriction to single-process
- DOC: document option http-ignore-probes
- DOC: fix the comments about the meaning of msg->sol in HTTP
- BUG/MEDIUM: http: wait for the exact amount of body bytes in wait_for_request_body
- BUG/MAJOR: http: prevent risk of reading past end with balance url_param
- MEDIUM: stream: move HTTP request body analyser before process_common
- MEDIUM: http: add a new option http-buffer-request
- MEDIUM: http: provide 3 fetches for the body
- DOC: update the doc on the proxy protocol
- BUILD: pattern: fix build warnings introduced in the LRU cache
- BUG/MEDIUM: stats: properly initialize the scope before dumping stats
- CLEANUP: config: fix misleading information in error message.
- MINOR: config: report the number of processes using a peers section in the error case
- BUG/MEDIUM: config: properly compute the default number of processes for a proxy
- MEDIUM: http: add new "capture" action for http-request
- BUG/MEDIUM: http: fix the http-request capture parser
- BUG/MEDIUM: http: don't forward client shutdown without NOLINGER except for tunnels
- BUILD/MINOR: ssl: fix build failure introduced by recent patch
- BUG/MAJOR: check: fix breakage of inverted tcp-check rules
- CLEANUP: checks: fix double usage of cur / current_step in tcp-checks
- BUG/MEDIUM: checks: do not dereference head of a tcp-check at the end
- CLEANUP: checks: simplify the loop processing of tcp-checks
- BUG/MAJOR: checks: always check for end of list before proceeding
- BUG/MEDIUM: checks: do not dereference a list as a tcpcheck struct
- BUG/MAJOR: checks: break infinite loops when tcp-checks starts with comment
- MEDIUM: http: make url_param iterate over multiple occurrences
- BUG/MEDIUM: peers: apply a random reconnection timeout
- MEDIUM: config: reject invalid config with name duplicates
- MEDIUM: config: reject conflicts in table names
- CLEANUP: proxy: make the proxy lookup functions more user-friendly
- MINOR: proxy: simply ignore duplicates in proxy name lookups
- MINOR: config: don't open-code proxy name lookups
- MEDIUM: config: clarify the conflicting modes detection for backend rules
- CLEANUP: proxy: remove now unused function findproxy_mode()
- MEDIUM: stick-table: remove the now duplicate find_stktable() function
- MAJOR: config: remove the deprecated reqsetbe / reqisetbe actions
- MINOR: proxy: add a new function proxy_find_by_id()
- MINOR: proxy: add a flag to memorize that the proxy's ID was forced
- MEDIUM: proxy: add a new proxy_find_best_match() function
- CLEANUP: http: explicitly reference request in http_apply_redirect_rules()
- MINOR: http: prepare support for parsing redirect actions on responses
- MEDIUM: http: implement http-response redirect rules
- MEDIUM: http: no need to close the request on redirect if data was parsed
- BUG/MEDIUM: http: fix body processing for the stats applet
- BUG/MINOR: da: fix log-level comparison to emove annoying warning
- CLEANUP: global: remove one ifdef USE_DEVICEATLAS
- CLEANUP: da: move the converter registration to da.c
- CLEANUP: da: register the config keywords in da.c
- CLEANUP: adjust the envelope name in da.h to reflect the file name
- CLEANUP: da: remove ifdef USE_DEVICEATLAS from da.c
- BUILD: make 51D easier to build by defaulting to 51DEGREES_SRC
- BUILD: fix build warning when not using 51degrees
- BUILD: make DeviceAtlas easier to build by defaulting to DEVICEATLAS_SRC
- BUILD: ssl: fix recent build breakage on older SSL libs
Commit 31af49d ("MEDIUM: ssl: Add options to forge SSL certificates")
introduced some dependencies on SSL_CTRL_SET_TLSEXT_HOSTNAME for which
a few checks were missing, breaking the build on openssl 0.9.8.
A switch statement doesn't have a default entry, and the compiler emits
a warning about an uninitialized variable:
warning: 'vars' may be used uninitialized in this function [-Wmaybe-uninitialized]
This regression was introduced by commit
9c627e84b2 (MEDIUM: sample: Add type any).
New sample type 'any' was not handled in the matrix used to cast
to stick-tables types.
It is possible to propagate entries of any data-types in stick-tables between
several haproxy instances over TCP connections in a multi-master fashion. Each
instance pushes its local updates and insertions to remote peers. The pushed
values overwrite remote ones without aggregation. Interrupted exchanges are
automatically detected and recovered from the last known point.
This patch adds two functions used for variable access using the
variable's full name. If the variable doesn't exist in the variable
name pool, it is created.
This patch adds support of variables during the processing of each stream.
A variable's scope can be set to 'session', 'transaction', 'request' or
'response'. The variable's type is the type returned by the assignment
expression; the type can change during processing.
The allocated memory can be controlled for each scope and each request,
and for the global process.
This patch permits registering a new keyword with the "tcp-request content",
"tcp-request connection", "tcp-response content", "http-request" and
"http-response" rulesets, identified only by matching the start of the
keyword.
For example, we register the keyword "set-var" with the option "match_pfx",
and the configuration keyword "set-var(var_name)" matches this entry.
This type is used to accept any type of sample as input and to prevent
any automatic "cast". It works like the type "ADDR", which accepts the
types "IPV4" and "IPV6".
Signed values were cast as unsigned before conversion. This patch uses
the proper converters according to the sample type.
Note: it depends on the previous patch to parse signed ints.
Implementation of a DNS client in HAProxy to perform name resolution to
IP addresses.
It relies on the freshly created UDP client to perform the DNS
resolution. For now, all UDP socket calls are performed in the
DNS layer, but this might change later when the protocols are
extended to be more suited to datagram mode.
A new section called 'resolvers' is introduced by this patch. It
is used to describe DNS servers' IP addresses as well as various
parameters.
Basic introduction of a UDP layer in HAProxy. It can be used as a
client only and manages UDP exchanges with servers.
It can't be used to load-balance UDP protocols; it is only used by
internal features such as DNS resolution.
Ability to change a server's IP address during HAProxy run time.
For now this is provided via the function update_server_addr(), which
currently is not called.
A log is emitted on each change. For now we do it unconditionally,
but later we'll want to do it only under certain circumstances, which
explains why the logging block is enclosed in if (1).
Following functions are now available in the SSL public API:
* ssl_sock_create_cert
* ssl_sock_get_generated_cert
* ssl_sock_set_generated_cert
* ssl_sock_generated_cert_serial
These functions could be used to create a certificate by hand, set it in the
cache used to store generated certificates and retrieve it. Here is an example
(pseudo code):
X509 *cacert = ...;
EVP_PKEY *capkey = ...;
char *servername = ...;
unsigned int serial;

serial = ssl_sock_generated_cert_serial(servername, strlen(servername));
if (!ssl_sock_get_generated_cert(serial, cacert)) {
        SSL_CTX *ctx = ssl_sock_create_cert(servername, serial, cacert, capkey);

        ssl_sock_set_generated_cert(ctx, serial, cacert);
}
With this patch, it is possible to configure HAProxy to forge the SSL
certificate sent to a client using the SNI servername. We do it in the SNI
callback.
To enable this feature, you must pass the following bind options:
* ca-sign-file <FILE> : This is the PEM file containing the CA certificate
and the CA private key used to create and sign the server's certificates.
* (optionally) ca-sign-pass <PASS>: This is the CA private key passphrase, if
any.
* generate-certificates: Enable the dynamic generation of certificates for a
listener.
Because generating certificates is expensive, there is an LRU cache to store
them. Its size can be customized by setting the global parameter
'tune.ssl.ssl-ctx-cache-size'.
It looks up a key in an LRU cache for use with the specified domain and
revision. It differs from lru64_get() in that it does not create missing
keys. The function returns NULL if an error or a cache miss occurs.
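A usage sketch (the exact signature and argument order are assumed here for
illustration):

struct lru64 *node = lru64_lookup(key, cache, domain, revision);

if (!node) {
        /* cache miss or error: unlike lru64_get(), nothing was inserted */
}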
Now, when an item is committed into an LRU tree, you can define a function
to free the data owned by this item. This function will be called when the
item is removed from the LRU tree or when the tree is destroyed.
When using the "51d" converter without specifying the list of 51Degrees
properties to detect (see parameter "51degrees-property-name-list"), the
"global._51d_property_names" could be left uninitialized which will lead to
segfault during init.
Since all rules listed in p->block_rules have been moved to the beginning of
the http-request rules in check_config_validity(), there is no need to clean
p->block_rules in deinit().
Signed-off-by: Godbach <nylzhaowei@gmail.com>
An ifdef was missing to avoid declaring these variables :
src/haproxy.c: In function 'deinit':
src/haproxy.c:1253:47: warning: unused variable '_51d_prop_nameb' [-Wunused-variable]
src/haproxy.c:1253:30: warning: unused variable '_51d_prop_name' [-Wunused-variable]
It takes up to 5 string arguments that are to be 51Degrees property names.
It will then create a chunk with values detected based on the request header
supplied (this should be the User-Agent).
When haproxy is run in the foreground with DeviceAtlas enabled, one
warning line is emitted for every test because the comparison is always
true even when the loglevel is zero:
willy@wtap:haproxy$ ./haproxy -db -f test-da.cfg
[WARNING] 151/150831 (25506) : deviceatlas : final memory image 7148029 bytes.
Deviceatlas module loaded.
[WARNING] 151/150832 (25506) : deviceatlas : .
[WARNING] 151/150833 (25506) : deviceatlas : .
[WARNING] 151/150833 (25506) : deviceatlas : .
^C
Don't emit a warning when loglevel is null.