6359 Commits

Author SHA1 Message Date
Emeric Brun
234fc3c31e BUG/MEDIUM: peers: table entries learned from a remote are pushed to others after a random delay.
New sticktable entries learned from a remote peer can be pushed to others after
a random delay because they are not inserted at the right position in the updates
tree.
2015-12-16 15:50:22 +01:00
Willy Tarreau
7006045e48 BUG/MEDIUM: config: properly adjust maxconn with nbproc when memmax is forced
When memmax is forced using "-m", the per-process memory limit is enforced
using setrlimit(), but this value is not used to compute the automatic
maxconn limit. In addition, the per-process memory limit didn't consider
the fact that the shared SSL cache only needs to be accounted once.

The doc was also fixed to clearly state that "-m" is global and not per
process. It makes sense because people who use -m want to protect the
system's resources regardless of whatever appears in the configuration.
2015-12-14 13:03:09 +01:00
Willy Tarreau
b22fc30aaa MINOR: config: make tune.recv_enough configurable
This setting used to be assigned to a variable tunable from a constant
and for an unknown reason never made its way into the config parser.

tune.recv_enough <number>
  Haproxy uses some hints to detect that a short read indicates the end of the
  socket buffers. One of them is that a read returns more than <recv_enough>
  bytes, which defaults to 10136 (7 segments of 1448 each). This default value
  may be changed by this setting to better deal with workloads involving lots
  of short messages such as telnet or SSH sessions.
2015-12-14 12:05:45 +01:00
Willy Tarreau
30da7ad809 BUILD: ssl: set SSL_SOCK_NUM_KEYTYPES with openssl < 1.0.2
Last patch unfortunately broke build with openssl older than 1.0.2.
Let's just define a single key type in this case.
2015-12-14 11:28:33 +01:00
yanbzhu
be2774dd35 MEDIUM: ssl: Added support for Multi-Cert OCSP Stapling
Enabled loading of OCSP staple responses (.ocsp files) when processing a
bundled certificate with multiple keytypes.
2015-12-14 11:22:29 +01:00
yanbzhu
63ea846399 MEDIUM: ssl: Added multi cert support for loading crt directories
Loading of multiple certs into shared contexts is now supported if a user
specifies a directory instead of a cert file.
2015-12-14 11:22:29 +01:00
yanbzhu
1b04e5b0e0 MINOR: ssl: Added multi cert support for crt-list config keyword
Added support for loading mutiple certs into shared contexts when they
are specified in a crt-list

Note that it's not practical to support SNI filters with multicerts, so
any SNI filters that's provided to the crt-list is ignored if a
multi-cert opertion is used.
2015-12-14 11:22:29 +01:00
yanbzhu
08ce6ab0c9 MEDIUM: ssl: Added support for creating SSL_CTX with multiple certs
Added ability for users to specify multiple certificates that all relate
a single server. Users do this by specifying certificate "cert_name.pem"
but having "cert_name.pem.rsa", "cert_name.pem.dsa" and/or
"cert_name.pem.ecdsa" in the directory.

HAProxy will now intelligently search for those 3 files and try combine
them into as few SSL_CTX's as possible based on CN/SAN. This will allow
HAProxy to support multiple ciphersuite key algorithms off a single
SSL_CTX.

This change integrates into the existing architecture of SNI lookup and
multiple SNI's can point to the same SSL_CTX, which can support multiple
key_types.
2015-12-14 11:22:29 +01:00
yanbzhu
488a4d2e75 MINOR: ssl: Added cert_key_and_chain struct
Added cert_key_and_chain struct to ssl. This struct will store the
contents of a crt path (from the config file) into memory. This will
allow us to use the data stored in memory instead of reading the file
multiple times.

This will be used to support a later commit to load multiple pkeys/certs
into a single SSL_CTX
2015-12-14 11:22:29 +01:00
David Carlier
7ece096767 CLEANUP: haproxy: using _GNU_SOURCE instead of __USE_GNU macro.
In order to properly enable sched_setaffinity, in some versions of Linux,
it is rather _GNU_SOURCE than __USE_GNU (spotted on Alpine Linux for instance),
also for the sake of consistency as __USE_GNU seems not used across the code and
for last, it seems on Linux it is the best way to enable non portable code.
On Linux glibc's based versions, it seems _GNU_SOURCE defines __USE_GNU
it should be safe enough.
2015-12-09 10:38:29 +01:00
Willy Tarreau
858b103631 BUG/MEDIUM: http: fix http-reuse when frontend and backend differ
Krishna Kumar reported that the following configuration doesn't permit
HTTP reuse between two clients :

    frontend private-frontend
        mode http
        bind :8001
        default_backend private-backend

    backend private-backend
        mode http
        http-reuse always
        server bck 127.0.0.1:8888

The reason for this is that in http_end_txn_clean_session() we check the
stream's backend backend's http-reuse option before deciding whether the
backend connection should be moved back to the server's pool or not. But
since we're doing this after the call to http_reset_txn(), the backend is
reset to match the frontend, which doesn't have the option. However it
will work fine in a setup involving a "listen" section.

We just need to keep a pointer to the current backend before calling
http_reset_txn(). The code does that and replaces the few remaining
references to s->be inside the same function so that if any part of
code were to be moved later, this trap doesn't happen again.

This fix must be backported to 1.6.
2015-12-07 17:04:59 +01:00
Baptiste Assmann
baf9794b4d BUG/MINOR: tcpcheck: conf parsing error when no port configured on server and first rule(s) is (are) COMMENT
A small configuration parsing error exists when no port is setup on the
server IP:port statement and the server's parameter 'port' is not set
and if the first tcp-check rule is a comment, like in the example below:

  backend b
   option tcp-check
   tcp-check comment blah
   tcp-check connect 8444
   server s 127.0.0.1 check

In such case, an ALERT is improperly returned, despite this
configuration is valid and works.

The new code move the pointer to the first tcp-check rule which isn't a
comment before checking the presence of the port.

backport status: 1.6 and above
2015-12-04 07:48:44 +01:00
Baptiste Assmann
3dd73bea64 BUG/MINOR: tcpcheck: conf parsing error when no port configured on server and last rule is a CONNECT with no port
Current configuration parsing is permissive in such situation:
A server in a backend with no port conigured on the IP address
statement, no 'port' parameter configured and last rule of a tcp-check
is a CONNECT with no port.

The current code currently parses all the rules to validate a port is
well available, but it misses the last one, which means such
configuration is valid:

  backend b
   option tcp-check
   tcp-check connect port 8444
   tcp-check connect
   server s 127.0.0.1 check

the second connect tentative is sent to port '0'...

Current patch fixes this by parsing the list the right way, including
the last rule.

backport status: 1.6 and above
2015-12-04 07:48:35 +01:00
Cyril Bonté
b65e0335d9 BUG/MINOR: checks: typo in an email-alert error message
When the email alert message couldn't be formatted, the logged error message
said the contrary.

This fix must be backported to 1.6.
2015-12-04 06:09:30 +01:00
Cyril Bonté
e22bfd61b1 BUG/MINOR: checks: email-alert causes a segfault when an unknown mailers section is configured
A segfault can occur during at the initialization phase, when an unknown
"mailers" name is configured. This happens when "email-alert myhostname" is not
set, where a direct pointer to an array is used instead of copying the string,
causing the segfault when haproxy tries to free the memory.

This is a minor issue because the configuration is invalid and a fatal error
will remain, but it should be fixed to prevent reload issues.

Example of minimal configuration to reproduce the bug :
    backend example
        email-alert mailers NOT_FOUND
        email-alert from foo@localhost
        email-alert to bar@localhost

This fix must be backported to 1.6.
2015-12-04 06:09:30 +01:00
Cyril Bonté
7e0847045a BUG/MEDIUM: checks: email-alert not working when declared in defaults
Tommy Atkinson and Sylvain Faivre reported that email alerts didn't work when
they were declared in the defaults section. This is due to the use of an
internal attribute which is set once an email-alert is at least partially
configured. But this attribute was not propagated to the current proxy during
the configuration parsing.

Not that the issue doesn't occur if "email-alert myhostname" is configured in
the defaults section.

This fix must be backported to 1.6.
2015-12-04 06:09:30 +01:00
David Carlier
3b7113836d BUG/MEDIUM: da: stop DeviceAtlas processing in the convertor if there is no input.
In case a HTTP header modifier, like req*del, is used, the User-Agent would be removed
and cause a segfault, hence the work is stopped in due time.
2015-12-03 11:37:01 +01:00
David Carlier
df3785fe2a MINOR: da: silent logging by default and displaying DeviceAtlas support if built. 2015-12-03 11:37:01 +01:00
David CARLIER
087ca283e4 CLEANUP: proxy: calloc call inverted arguments
Nothing major but a human typo mistake.
2015-12-03 11:37:01 +01:00
David Carlier
081b336f7d BUILD: dumpstats: silencing warning for printf format specifier / time_t
time_t is not necesseraly a long int (spotted in OpenBSD), so just an explicit cast to
avoid the compiler warning. should be safe enough.
2015-12-03 11:37:01 +01:00
Cyril Bonté
ce1ef4df01 BUG/MEDIUM: sample: urlp can't match an empty value
Currently urlp fetching samples were able to find parameters with an empty
value, but the return code depended on the value length. The final result was
that acls using urlp couldn't match empty values.

Example of acl which always returned "false":
  acl MATCH_EMPTY urlp(foo) -m len 0

The fix consists in unconditionally return 1 when the parameter is found.

This fix must be backported to 1.6 and 1.5.
2015-11-26 23:51:42 +01:00
Willy Tarreau
a1c2b2c4f3 BUG/MEDIUM: cli: changing compression rate-limiting must require admin level
Right now it's possible to change the global compression rate limiting
without the CLI being at the admin level.

This fix must be backported to 1.6 and 1.5.
2015-11-26 18:32:39 +01:00
Willy Tarreau
ed9dddd237 CLEANUP: compression: don't allocate DEFAULT_MAXZLIBMEM without USE_ZLIB
It's pointless to reserve this amount of memory when zlib is not used.
Adding the condition will make build scripts easier to manage. This may
be backported to 1.6.
2015-11-26 16:35:53 +01:00
Willy Tarreau
f25b3573d6 BUG/MEDIUM: stream: fix half-closed timeout handling
client-fin and server-fin are bogus. They are applied on the write
side after a SHUTR was seen. The immediate effect is that sometimes
if a SHUTR was seen after a SHUTW on the same side, the timeout is
enabled again regardless of the fact that the output is already
closed. This results in the timeout event not to be processed and
a busy poll loop to happen until another timeout on the stream gets
rid of it. Note that haproxy continues its job during this, it's just
that it eats all the CPU trying to handle an event that it ignores.

An reproducible case consists in having a client stop reading data from
a server to ensure data remain in the response buffer, then the client
sends a shutdown(write). If abortonclose is enabled on haproxy, the
shutdown is passed to the server side and the server responds with a
SHUTR that cannot immediately be forwarded to the client since the
buffer is full. During this time the event is ignored and the task is
woken again in loops.

It is worth noting that the timeout handling since 1.5 is a bit fragile
and that it might be possible that other similar conditions still exist,
so the timeout handling should be audited regarding this issue.

Many thanks to BaiYang for providing detailed information showing the
problem in action.

This bug also affects 1.5 thus the fix must be backported.
2015-11-26 10:33:47 +01:00
Willy Tarreau
714ea78c9a BUG/MEDIUM: http: don't enable auto-close on the response side
There is a bug where "option http-keep-alive" doesn't force a response
to stay in keep-alive if the server sends the FIN along with the response
on the second or subsequent response. The reason is that the auto-close
was forced enabled when recycling the HTTP transaction and it's never
disabled along the response processing chain before the SHUTR gets a
chance to be forwarded to the client side. The MSG_DONE state of the
HTTP response properly disables it but too late.

There's no more reason for enabling auto-close here, because either it
doesn't matter in non-keep-alive modes because the connection is closed,
or it is automatically enabled by process_stream() when it sees there's
no analyser on the stream.

This bug also affects 1.5 so a backport is desired.
2015-11-26 10:25:11 +01:00
Lukas Tribus
d334a2c843 BUG/MINOR: lua: don't force-sslv3 LUA's SSL socket
Sander Klein reported an error messages about SSLv3 not
being supported on Debian 8, although he didn't force-sslv3.

Vincent Bernat tracked this down to the LUA initialization, which
actually does force-sslv3.

This patch removes force-sslv3 from the LUA initialization, so
the LUA SSL socket can actually use TLS and doesn't trigger
warnings when SSLv3 is not supported by libssl (such as in
Debian 8).

This should be backported to 1.6.
2015-11-26 07:30:22 +01:00
Willy Tarreau
7f876a1eeb BUG/MEDIUM: http: switch the request channel to no-delay once done.
There's an issue when sending POST data that came in a second packet,
the CF_NEVER_WAIT flag is not always set on the request channel, while
the server is waiting for the request. We must always set this flag in
this case since we're not going to shut down after sending, contrary
to the response side.

Note that option http-no-delay works around this issue.

Reproducer :

listen  px
        mode http
        timeout client 10s
        timeout server 5s
        timeout connect 3s
        option http-server-close
        #option http-no-delay
        bind :8001
        server s1 127.0.0.1:8003

$ (printf "POST / HTTP/1.1\r\nTransfer-encoding: chunked\r\n\r\n"; sleep 0.01; printf "10\r\nAZERTYUIOPQSDFGH\r\n0\r\n\r\n") | nc6 0 8001

Before this fix :

12:03:31.946763 epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 200, 1000) = 1
12:03:32.634175 accept4(5, {sa_family=AF_INET, sin_port=htons(53849), sin_addr=inet_addr("127.0.0.1")}, [16], SOCK_NONBLOCK) = 6
12:03:32.634318 setsockopt(6, SOL_TCP, TCP_NODELAY, [1], 4) = 0
12:03:32.634434 accept4(5, 0x7ffccfbb2cf0, [128], SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
12:03:32.634574 recvfrom(6, "POST / HTTP/1.1\r\nTransfer-encodi"..., 8192, 0, NULL, NULL) = 47
12:03:32.634809 setsockopt(6, SOL_TCP, TCP_QUICKACK, [1], 4) = 0
12:03:32.634952 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 7
12:03:32.635031 fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
12:03:32.635089 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
12:03:32.635153 connect(7, {sa_family=AF_INET, sin_port=htons(8003), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
12:03:32.635315 epoll_wait(3, {}, 200, 0) = 0
12:03:32.635394 sendto(7, "POST / HTTP/1.1\r\nTransfer-encodi"..., 66, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 66
12:03:32.635527 recvfrom(6, 0x7f0224e66024, 8192, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
12:03:32.635651 epoll_ctl(3, EPOLL_CTL_ADD, 6, {EPOLLIN|0x2000, {u32=6, u64=6}}) = 0
12:03:32.635782 epoll_wait(3, {}, 200, 0) = 0
12:03:32.635842 recvfrom(7, 0x7f0224e66024, 8192, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
12:03:32.635924 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLIN|0x2000, {u32=7, u64=7}}) = 0
12:03:32.636027 epoll_wait(3, {{EPOLLIN, {u32=6, u64=6}}}, 200, 1000) = 1
12:03:32.644892 recvfrom(6, "10\r\nAZERTYUIOPQSDFGH\r\n0\r\n\r\n", 8192, 0, NULL, NULL) = 27
12:03:32.645016 epoll_wait(3, {}, 200, 0) = 0
12:03:32.645105 sendto(7, "10\r\nAZERTYUIOPQSDFGH\r\n0\r\n\r\n", 27, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_MORE, NULL, 0) = 27

After the fix :

11:59:12.538617 connect(7, {sa_family=AF_INET, sin_port=htons(8003), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
11:59:12.538787 epoll_wait(3, {}, 200, 0) = 0
11:59:12.538867 sendto(7, "POST / HTTP/1.1\r\nTransfer-encodi"..., 66, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 66
11:59:12.539031 recvfrom(6, 0x7f832ce45024, 8192, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
11:59:12.539161 epoll_ctl(3, EPOLL_CTL_ADD, 6, {EPOLLIN|0x2000, {u32=6, u64=6}}) = 0
11:59:12.539259 epoll_wait(3, {}, 200, 0) = 0
11:59:12.539337 recvfrom(7, 0x7f832ce45024, 8192, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
11:59:12.539421 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLIN|0x2000, {u32=7, u64=7}}) = 0
11:59:12.539499 epoll_wait(3, {{EPOLLIN, {u32=6, u64=6}}}, 200, 1000) = 1
11:59:12.548519 recvfrom(6, "10\r\nAZERTYUIOPQSDFGH\r\n0\r\n\r\n", 8192, 0, NULL, NULL) = 27
11:59:12.548844 epoll_wait(3, {}, 200, 0) = 0
11:59:12.549012 sendto(7, "10\r\nAZERTYUIOPQSDFGH\r\n0\r\n\r\n", 27, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 27
11:59:12.549454 epoll_wait(3, {}, 200, 1000) = 0

This fix must be backported to 1.6, 1.5 and 1.4.
2015-11-18 12:50:38 +01:00
lsenta
1e1f41d0f3 BUG: http: do not abort keep-alive connections on server timeout
When a server timeout is detected on the second or nth request of a keep-alive
connection, HAProxy closes the connection without writing a response.
Some clients would fail with a remote disconnected exception and some
others would retry potentially unsafe requests.

This patch removes the special case and makes sure a 504 timeout is
written back whenever a server timeout is handled.

Signed-off-by: lsenta <laurent.senta@gmail.com>
2015-11-13 14:41:51 +01:00
Daniel Jakots
54ffb918cb BUILD: check for libressl to be able to build against it
[wt: might be worth backporting it to 1.6]
2015-11-08 07:28:02 +01:00
Thierry FOURNIER
a3308fd8c1 BUG/MEDIUM: lua: clean output buffer
When the txn.done() fiunction is called, the ouput buffer is cleaned,
but the associated relative pointer on the HTTP requests elements
is not reseted.

This patch remove this cleanup, because the output buffer may contain
data to forward.
2015-11-06 01:15:34 +01:00
Lukas Tribus
c93242cab9 BUG/MINOR: acl: don't use record layer in req_ssl_ver
The initial record layer version in a SSL handshake may be set to TLSv1.0
or similar for compatibility reasons, this is allowed as per RFC5246
Appendix E.1 [1]. Some implementations are Openssl [2] and NSS [3].

A related issue has been fixed some time ago in commit 57d229747
("BUG/MINOR: acl: req_ssl_sni fails with SSLv3 record version").

Fix this by using the real client hello version instead of the record
layer version.

This was reported by Julien Vehent and analyzed by Cyril Bonté.
The initial patch is from Julien Vehent as well.

This should be backported to stable series, the req_ssl_ver keyword was
first introduced in 1.3.16.

[1] https://tools.ietf.org/html/rfc5246#appendix-E.1
[2] 4a1cf50187
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=774547
2015-11-05 14:10:06 +01:00
Dragan Dosen
cf4fb036a4 BUG/MINOR: server: check return value of fgets() in apply_server_state()
fgets() can return NULL on error or when EOF occurs. This patch adds a
check of fgets() return value and displays a warning if the first line of
the server state file can not be read. Additionally, we make sure to close
the previously opened file descriptor.
2015-11-05 10:39:09 +01:00
Baptiste Assmann
e9544935e8 BUG/MINOR: http rule: http capture 'id' rule points to a non existing id
It is possible to create a http capture rule which points to a capture slot
id which does not exist.

Current patch prevent this when parsing configuration and prevent running
configuration which contains such rules.

This configuration is now invalid:

  frontend f
   bind :8080
   http-request capture req.hdr(User-Agent) id 0
   default_backend b

this one as well:

  frontend f
   bind :8080
   declare capture request len 32 # implicit id is 0 here
   http-request capture req.hdr(User-Agent) id 1
   default_backend b

It applies of course to both http-request and http-response rules.
2015-11-04 08:47:55 +01:00
James Brown
55f9ff11b5 MINOR: check: add agent-send server parameter
Causes HAProxy to emit a static string to the agent on every check,
so that you can independently control multiple services running
behind a single agent port.
2015-11-04 07:26:51 +01:00
Thierry FOURNIER
c4eebc8157 BUG/MEDIUM: lua: sample fetches based on response doesn't work
The direction (request or response) is not propagated in the
sample fecthes called throught Lua. This patch adds the direction
status in some structs (hlua_txn and hlua_smp) to make sure that
the sample fetches will be called with all the information.

The converters can not access to a TXN object, so there are not
impacted the direction. However, the samples used as input of the
Lua converter wrapper are initiliazed with the direction. Thereby,
the struct smp stay consistent.
[wt: needs to be backported to 1.6]
2015-11-03 10:50:14 +01:00
Thierry FOURNIER
6e01f38e73 CLEANUP: use direction names in place of numeric values
This patch cleanups the direction names. It replaces numeric values,
by the associated defines. It ensure the compliance with values found
somwhere else in HAProxy.

It is required by the bugfix patch which is following.
[wt: needs to be backported to 1.6]
2015-11-03 10:48:00 +01:00
Baptiste Assmann
a315c5534e BUG/MINOR: dns: check for duplicate nameserver id in a resolvers section was missing
Current resolvers section parsing function is permissive on nameserver
id and two nameservers may have the same id.
It's a shame, since we don't know for example, whose statistics belong
to which nameserver...

From now, configuration with duplicated nameserver id in a resolvers
section are considered as broken and returns a fatal error when parsing.
2015-11-03 09:56:29 +01:00
Willy Tarreau
1c59bd5abc BUG/MAJOR: http: don't requeue an idle connection that is already queued
Cyril Bonté reported a reproduceable sequence which can lead to a crash
when using backend connection reuse. The problem comes from the fact that
we systematically add the server connection to an idle pool at the end of
the HTTP transaction regardless of the fact that it might already be there.

This is possible for example when processing a request which doesn't use
a server connection (typically a redirect) after a request which used a
connection. Then after the first request, the connection was already in
the idle queue and we're putting it a second time at the end of the second
request, causing a corruption of the idle pool.

Interestingly, the memory debugger in 1.7 immediately detected a suspicious
double free on the connection, leading to a very early detection of the
cause instead of its consequences.

Thanks to Cyril for quickly providing a working reproducer.

This fix must be backported to 1.6 since connection reuse was introduced
there.
2015-11-02 22:28:25 +01:00
mildis
ff5d510294 MINOR: config: allow IPv6 bracketed literals 2015-11-01 21:30:41 +01:00
Baptiste Assmann
e4c4b7dda6 BUG/MINOR: dns: unable to parse CNAMEs response
A bug lied in the parsing of DNS CNAME response, leading HAProxy to
think the CNAME was improperly resolved in the response.

This should be backported into 1.6 branch
2015-10-30 12:39:08 +01:00
Baptiste Assmann
fad0318c74 BUG/MAJOR: dns: first DNS response packet not matching queried hostname may lead to a loop
The status DNS_UPD_NAME_ERROR returned by dns_get_ip_from_response and
which means the queried name can't be found in the response was
improperly processed (fell into the default case).
This lead to a loop where HAProxy simply resend a new query as soon as
it got a response for this status and in the only case where such type
of response is the very first one received by the process.

This should be backported into 1.6 branch
2015-10-30 12:38:14 +01:00
Willy Tarreau
f2dd5e4159 BUG/MEDIUM: config: count memory limits on 64 bits, not 32
It was accidently discovered that limiting haproxy to 5000 MB leads to
an effective limit of 904 MB. This is because the computation for the
size limit is performed by multiplying rlimit_memmax by 1048576, and
doing so causes the operation to be performed on an int instead of a
long or long long. Just switch to 1048576ULL as is done at other places
to fix this.

This bug affects all supported versions, the backport is desired, though
it rarely affects users since few people apply memory limits.
2015-10-29 10:42:55 +01:00
Willy Tarreau
58102cf30b MEDIUM: memory: add accounting for failed allocations
We now keep a per-pool counter of failed memory allocations and
we report that, as well as the amount of memory allocated and used
on the CLI.
2015-10-28 16:24:21 +01:00
Willy Tarreau
de30a684ca DEBUG/MEDIUM: memory: add optional control pool memory operations
When DEBUG_MEMORY_POOLS is used, we now use the link pointer at the end
of the pool to store a pointer to the pool, and to control it during
pool_free2() in order to serve four purposes :
  - at any instant we can know what pool an object was allocated from
    when examining memory, hence how we should possibly decode it ;

  - it serves to detect double free when they happen, as the pointer
    cannot be valid after the element is linked into the pool ;

  - it serves to detect if an element is released in the wrong pool ;

  - it serves as a canary, to detect if some buffers experienced an
    overflow before being release.

All these elements will definitely help better troubleshoot strange
situations, or at least confirm that certain conditions did not happen.
2015-10-28 15:28:05 +01:00
Willy Tarreau
ac421118db DEBUG/MEDIUM: memory: optionally protect free data in pools
When debugging a core file, it's sometimes convenient to be able to
visit the released entries in the pools (typically last released
session). Unfortunately the first bytes of these entries are destroyed
by the link elements of the pool. And of course, most structures have
their most accessed elements at the beginning of the structure (typically
flags). Let's add a build-time option DEBUG_MEMORY_POOLS which allocates
an extra pointer in each pool to put the link at the end of each pool
item instead of the beginning.
2015-10-28 15:27:59 +01:00
Andrew Hayworth
edb93a7c28 MINOR: cli: ability to set per-server maxconn
This commit adds support for setting a per-server maxconn from the stats
socket. The only really notable part of this commit is that we need to
check if maxconn == minconn before changing things, as this indicates
that we are NOT using dynamic maxconn. When we are not using dynamic
maxconn, we should update maxconn/minconn in lockstep.
2015-10-28 08:01:56 +01:00
Christopher Faulet
e7db21693f BUILD: ssl: fix build error introduced in commit 7969a3 with OpenSSL < 1.0.0
The function 'EVP_PKEY_get_default_digest_nid()' was introduced in OpenSSL
1.0.0. So for older version of OpenSSL, compiled with the SNI support, the
HAProxy compilation fails with the following error:

src/ssl_sock.c: In function 'ssl_sock_do_create_cert':
src/ssl_sock.c:1096:7: warning: implicit declaration of function 'EVP_PKEY_get_default_digest_nid'
   if (EVP_PKEY_get_default_digest_nid(capkey, &nid) <= 0)
[...]
src/ssl_sock.c:1096: undefined reference to `EVP_PKEY_get_default_digest_nid'
collect2: error: ld returned 1 exit status
Makefile:760: recipe for target 'haproxy' failed
make: *** [haproxy] Error 1

So we must add a #ifdef to check the OpenSSL version (>= 1.0.0) to use this
function. It is used to get default signature digest associated to the private
key used to sign generated X509 certificates. It is called when the private key
differs than EVP_PKEY_RSA, EVP_PKEY_DSA and EVP_PKEY_EC. It should be enough for
most of cases.
2015-10-22 13:32:34 +02:00
Andrew Hayworth
e6a4a329b8 MEDIUM: dns: Don't use the ANY query type
Basically, it's ill-defined and shouldn't really be used going forward.
We can't guarantee that resolvers will do the 'legwork' for us and
actually resolve CNAMES when we request the ANY query-type. Case in point
(obfuscated, clearly):

  PRODUCTION! ahayworth@secret-hostname.com:~$
  dig @10.11.12.53 ANY api.somestartup.io

  ; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> @10.11.12.53 ANY api.somestartup.io
  ; (1 server found)
  ;; global options: +cmd
  ;; Got answer:
  ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62454
  ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0

  ;; QUESTION SECTION:
  ;api.somestartup.io.                        IN      ANY

  ;; ANSWER SECTION:
  api.somestartup.io.         20      IN      CNAME api-somestartup-production.ap-southeast-2.elb.amazonaws.com.

  ;; AUTHORITY SECTION:
  somestartup.io.               166687  IN      NS      ns-1254.awsdns-28.org.
  somestartup.io.               166687  IN      NS      ns-1884.awsdns-43.co.uk.
  somestartup.io.               166687  IN      NS      ns-440.awsdns-55.com.
  somestartup.io.               166687  IN      NS      ns-577.awsdns-08.net.

  ;; Query time: 1 msec
  ;; SERVER: 10.11.12.53#53(10.11.12.53)
  ;; WHEN: Mon Oct 19 22:02:29 2015
  ;; MSG SIZE  rcvd: 242

HAProxy can't handle that response correctly.

Rather than try to build in support for resolving CNAMEs presented
without an A record in an answer section (which may be a valid
improvement further on), this change just skips ANY record types
altogether. A and AAAA are much more well-defined and predictable.

Notably, this commit preserves the implicit "Prefer IPV6 behavior."

Furthermore, ANY query type by default is a bad idea: (from Robin on
HAProxy's ML):
  Using ANY queries for this kind of stuff is considered by most people
  to be a bad practice since besides all the things you named it can
  lead to incomplete responses. Basically a resolver is allowed to just
  return whatever it has in cache when it receives an ANY query instead
  of actually doing an ANY query at the authoritative nameserver. Thus
  if it only received queries for an A record before you do an ANY query
  you will not get an AAAA record even if it is actually available since
  the resolver doesn't have it in its cache. Even worse if before it
  only got MX queries, you won't get either A or AAAA
2015-10-20 22:31:01 +02:00
Willy Tarreau
2f63ef4d1c BUG/MAJOR: ssl: free the generated SSL_CTX if the LRU cache is disabled
Kim Seri reported that haproxy 1.6.0 crashes after a few requests
when a bind line has SSL enabled with more than one certificate. This
was caused by an insufficient condition to free generated certs during
ssl_sock_close() which can also catch other certs.

Christopher Faulet analysed the situation like this :

-------
First the LRU tree is only initialized when the SSL certs generation is
configured on a bind line. So, in the most of cases, it is NULL (it is
not the same thing than empty).
When the SSL certs generation is used, if the cache is not NULL, a such
certificate is pushed in the cache and there is no need to release it
when the connection is closed.
But it can be disabled in the configuration. So in that case, we must
free the generated certificate when the connection is closed.

Then here, we have really a bug. Here is the buggy part:

3125)      if (conn->xprt_ctx) {
3126) #ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME
3127)              if (!ssl_ctx_lru_tree && objt_listener(conn->target)) {
3128)                      SSL_CTX *ctx = SSL_get_SSL_CTX(conn->xprt_ctx);
3129)                      if (ctx != 3130)
 SSL_CTX_free(ctx);
3131)              }
3133)              SSL_free(conn->xprt_ctx);
3134)              conn->xprt_ctx = NULL;
3135)              sslconns--;
3136)      }

The check on the line 3127 is not enough to determine if this is a
generated certificate or not. Because ssl_ctx_lru_tree is NULL,
generated certificates, if any, must be freed. But here ctx should also
be compared to all SNI certificates and not only to default_ctx. Because
of this bug, when a SNI certificate is used for a connection, it is
erroneously freed when this connection is closed.
-------

Christopher provided this reliable reproducer :

----------
global
    tune.ssl.default-dh-param   2048
    daemon

listen ssl_server
    mode tcp
    bind 127.0.0.1:4443 ssl crt srv1.test.com.pem crt srv2.test.com.pem

    timeout connect 5000
    timeout client  30000
    timeout server  30000

    server srv A.B.C.D:80

You just need to generate 2 SSL certificates with 2 CN (here
srv1.test.com and srv2.test.com).

Then, by doing SSL requests with the first CN, there is no problem. But
with the second CN, it should segfault on the 2nd request.

openssl s_client -connect 127.0.0.1:4443 -servername srv1.test.com // OK
openssl s_client -connect 127.0.0.1:4443 -servername srv1.test.com // OK

But,

openssl s_client -connect 127.0.0.1:4443 -servername srv2.test.com // OK
openssl s_client -connect 127.0.0.1:4443 -servername srv2.test.com // KO
-----------

A long discussion led to the following proposal which this patch implements :

- the cert is generated. It gets a refcount = 1.
- we assign it to the SSL. Its refcount becomes two.
- we try to insert it into the tree. The tree will handle its freeing
  using SSL_CTX_free() during eviction.
- if we can't insert into the tree because the tree is disabled, then
  we have to call SSL_CTX_free() ourselves, then we'd rather do it
  immediately. It will more closely mimmick the case where the cert
  is added to the tree and immediately evicted by concurrent activity
  on the cache.
- we never have to call SSL_CTX_free() during ssl_sock_close() because
  the SSL session only relies on openssl doing the right thing based on
  the refcount only.
- thus we never need to know how the cert was created since the
  SSL_CTX_free() is either guaranteed or already done for generated
  certs, and this protects other ones against any accidental call to
  SSL_CTX_free() without having to track where the cert comes from.

This patch also reduces the inter-dependence between the LRU tree and
the SSL stack, so it should cause less sweating to migrate to threads
later.

This bug is specific to 1.6.0, as it was introduced after dev7 by
this fix :

   d2cab92 ("BUG/MINOR: ssl: fix management of the cache where forged certificates are stored")

Thus a backport to 1.6 is required, but not to 1.5.
2015-10-20 15:29:01 +02:00
Willy Tarreau
70f289cf8d BUG/MEDIUM: namespaces: don't fail if no namespace is used
Susheel Jalali reported a confusing bug in namespaces implementation.
If namespaces are enabled at build time (USE_NS=1) and *no* namespace
is used at all in the whole config file, my_socketat() returns -1 and
all socket bindings fail. This is because of a wrong condition in this
function. A possible workaround consists in creating some namespaces.
2015-10-20 15:29:00 +02:00