Before this change, trash is being used to create certificate filename
to read in care Mutli-Cert are in used. But then ssl_sock_load_ocsp()
modify trash leading to potential wrong information given in later error
message.
This also blocks any further use of certificate filename for other
usage, like ongoing patch to support Certificate Transparency handling
in Multi-Cert bundle.
In the last release a lot of the structures have become opaque for an
end user. This means the code using these needs to be changed to use the
proper functions to interact with these structures instead of trying to
manipulate them directly.
This does not fix any deprecations yet that are part of 1.1.0, it only
ensures that it can be compiled against that version and is still
compatible with older ones.
[wt: openssl-0.9.8 doesn't build with it, there are conflicts on certain
function prototypes which we declare as inline here and which are
defined differently there. But openssl-0.9.8 is not supported anymore
so probably it's OK to go without it for now and we'll see later if
some users still need it. Emeric has reviewed this change and didn't
spot anything obvious which requires special care. Let's try it for
real now]
Today, the certificate are indexed int he SNI tree using their CN and the
list of thier AltNames. So, Some certificates have the same names in the
CN and one of the AltNames entries.
Typically Let's Encrypt duplicate the the DNS name in the CN and the
AltName.
This patch prevents the creation of identical entries in the trees. It
checks the same DNS name and the same SSL context.
If the same certificate is registered two time it will be duplicated.
This patch should be backported in the 1.6 and 1.5 version.
Add some debug trace when haproxy is configured in debug & verbose mode.
This is useful for openssl tests. Typically, the error "SSL handshake
failure" can be caused by a lot of protocol error. This patch details
the encountered error. For exemple:
OpenSSL error 0x1408a0c1: ssl3_get_client_hello: no shared cipher
Note that my compilator (gcc-4.7) refuse to considers the function
ssl_sock_dump_errors() as inline. The condition "if" ensure that the
content of the function is not executed in normal case. It should be
a pity to call a function just for testing its execution condition, so
I use the macro "forceinline".
Roberto Guimaraes reported that Valgrind complains about a leak
in ssl_get_dh_1024().
This is caused caused by an oversight in ssl_sock_load_dh_params(),
where local_dh_1024 is always replaced by a new DH object even if
it already holds one. This patch simply checks whether local_dh_1024
is NULL before calling ssl_get_dh_1024().
This reverts commit 0ea4c23ca7.
Certain very simple confs randomly segfault upon startup with openssl 1.0.2
with this patch, which seems to indicate a use after free. Better drop it
and let valgrind complain about the potential leak.
Also it's worth noting that the man page for SSL_CTX_set_tmp_dh() makes no
mention about whether or not the element should be freed, and the example
provided does not use it either.
This fix should be backported to 1.6 and 1.5 where the patch was just
included.
I just noticed that SSL wouldn't build anymore since this afternoon's patch :
src/ssl_sock.c: In function 'ssl_sock_load_multi_cert':
src/ssl_sock.c:1982:26: warning: left-hand operand of comma expression has no effect [-Wunused-value]
for (i = 0; i < fcount, i++)
^
src/ssl_sock.c:1982:31: error: expected ';' before ')' token
for (i = 0; i < fcount, i++)
^
Makefile:791: recipe for target 'src/ssl_sock.o' failed
make: *** [src/ssl_sock.o] Error 1
SNI filters used to be ignored with multicerts (eg: those providing
ECDSA and RSA at the same time). This patch makes them work like
other certs.
Note: most of the changes in this patch are due to an extra level of
indent, read it with "git show -b".
Valgrind reports that the memory allocated in ssl_get_dh_1024() was leaking. Upon further inspection of openssl code, it seems that SSL_CTX_set_tmp_dh makes a copy of the data, so calling DH_free afterwards makes sense.
Emeric found that some certificate files that were valid with the old method
(the one with the explicit name involving SSL_CTX_use_PrivateKey_file()) do
not work anymore with the new one (the one trying to load multiple cert types
using PEM_read_bio_PrivateKey()). With the last one, the private key couldn't
be loaded.
The difference was related to the ordering in the PEM file was different. The
old method would always work. The new method only works if the private key is
at the top, or if it appears as an "EC" private key. The cause in fact is that
we never rewind the BIO between the various calls. So this patch moves the
loading of the private key as the first step, then it rewinds the BIO, and
then it loads the cert and the chain. With this everything works.
No backport is needed, this issue came with the recent addition of the
multi-cert support.
Instead of repeating the type of the LHS argument (sizeof(struct ...))
in calls to malloc/calloc, we directly use the pointer
name (sizeof(*...)). The following Coccinelle patch was used:
@@
type T;
T *x;
@@
x = malloc(
- sizeof(T)
+ sizeof(*x)
)
@@
type T;
T *x;
@@
x = calloc(1,
- sizeof(T)
+ sizeof(*x)
)
When the LHS is not just a variable name, no change is made. Moreover,
the following patch was used to ensure that "1" is consistently used as
a first argument of calloc, not the last one:
@@
@@
calloc(
+ 1,
...
- ,1
)
In C89, "void *" is automatically promoted to any pointer type. Casting
the result of malloc/calloc to the type of the LHS variable is therefore
unneeded.
Most of this patch was built using this Coccinelle patch:
@@
type T;
@@
- (T *)
(\(lua_touserdata\|malloc\|calloc\|SSL_get_app_data\|hlua_checkudata\|lua_newuserdata\)(...))
@@
type T;
T *x;
void *data;
@@
x =
- (T *)
data
@@
type T;
T *x;
T *data;
@@
x =
- (T *)
data
Unfortunately, either Coccinelle or I is too limited to detect situation
where a complex RHS expression is of type "void *" and therefore casting
is not needed. Those cases were manually examined and corrected.
Olivier Doucet reported the issue on the ML and tested that when using
more than TLS_TICKETS_NO keys in the file, the CPU usage is much higeher
than expected.
Lukas Tribus then provided a test case which showed that resumption doesn't
work at all in that case.
This fix needs to be backported to 1.6.
Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>
Technically speaking, many SSL sample fetch functions act on the
connection and depend on USE_L5CLI on the client side, which means
they're usable as soon as a handshake is completed on a connection.
This means that the test consisting in refusing to call them when
the stream is NULL will prevent them from working when we implement
the tcp-request session ruleset. Better fix this now. The fix consists
in using smp->sess->origin when they're called for the front connection,
and smp->strm->si[1].end when called for the back connection.
There is currently no known side effect for this issue, though it would
better be backported into 1.6 so that the code base remains consistend.
Since commit 6879ad3 ("MEDIUM: sample: fill the struct sample with the
session, proxy and stream pointers") merged in 1.6-dev2, the sample
contains the pointer to the stream and sample fetch functions as well
as converters use it heavily.
The problem is that earlier commit 87b0966 ("REORG/MAJOR: session:
rename the "session" entity to "stream"") had split the session and
stream resulting in the possibility for smp->strm to be NULL before
the stream was initialized. This is what happens in tcp-request
connection rulesets, as discovered by Baptiste.
The sample fetch functions must now check that smp->strm is valid
before using it. An alternative could consist in using a dummy stream
with nothing in it to avoid some checks but it would only result in
deferring them to the next step anyway, and making it harder to detect
that a stream is valid or the dummy one.
There is still an issue with variables which requires a complete
independant fix. They use strm->sess to find the session with strm
possibly NULL and passed as an argument. All call places indirectly
use smp->strm to build strm. So the problem is there but the API needs
to be changed to remove this duplicate argument that makes it much
harder to know what pointer to use.
This fix must be backported to 1.6, as well as the next one fixing
variables.
Owen Marshall reported an issue depending on the server keywords order in the
configuration.
Working line :
server dev1 <ip>:<port> check inter 5000 ssl verify none sni req.hdr(Host)
Non working line :
server dev1 <ip>:<port> check inter 5000 ssl sni req.hdr(Host) verify none
Indeed, both parse_server() and srv_parse_sni() modified the current argument
offset at the same time. To fix the issue, srv_parse_sni() can work on a local
copy ot the offset, leaving parse_server() responsible of the actual value.
This fix must be backported to 1.6.
After seeing previous ALPN fix, I suspected that NPN code was wrong
as well, and indeed it was since ALPN was copied from it. This fix
must be backported into 1.6 and 1.5.
The first time I tried it (1.6.3) I got a segmentation fault :(
After some investigation with gdb and valgrind I found the
problem. memcpy() copies past an allocated buffer in
"bind_parse_alpn". This patch fixes it.
[wt: this fix must be backported into 1.6 and 1.5]
The serial number for a generated certificate was computed using the requested
servername, without any variable/random part. It is not a problem from the
moment it is not regenerated.
But if the cache is disabled or when the certificate is evicted from the cache,
we may need to regenerate it. It is important to not reuse the same serial
number for the new certificate. Else clients (especially browsers) trigger a
warning because 2 certificates issued by the same CA have the same serial
number.
So now, the serial is a static variable initialized with now_ms (internal date
in milliseconds) and incremented at each new certificate generation.
(Ref MPS-2031)
Added support for loading mutiple certs into shared contexts when they
are specified in a crt-list
Note that it's not practical to support SNI filters with multicerts, so
any SNI filters that's provided to the crt-list is ignored if a
multi-cert opertion is used.
Added ability for users to specify multiple certificates that all relate
a single server. Users do this by specifying certificate "cert_name.pem"
but having "cert_name.pem.rsa", "cert_name.pem.dsa" and/or
"cert_name.pem.ecdsa" in the directory.
HAProxy will now intelligently search for those 3 files and try combine
them into as few SSL_CTX's as possible based on CN/SAN. This will allow
HAProxy to support multiple ciphersuite key algorithms off a single
SSL_CTX.
This change integrates into the existing architecture of SNI lookup and
multiple SNI's can point to the same SSL_CTX, which can support multiple
key_types.
Added cert_key_and_chain struct to ssl. This struct will store the
contents of a crt path (from the config file) into memory. This will
allow us to use the data stored in memory instead of reading the file
multiple times.
This will be used to support a later commit to load multiple pkeys/certs
into a single SSL_CTX
The function 'EVP_PKEY_get_default_digest_nid()' was introduced in OpenSSL
1.0.0. So for older version of OpenSSL, compiled with the SNI support, the
HAProxy compilation fails with the following error:
src/ssl_sock.c: In function 'ssl_sock_do_create_cert':
src/ssl_sock.c:1096:7: warning: implicit declaration of function 'EVP_PKEY_get_default_digest_nid'
if (EVP_PKEY_get_default_digest_nid(capkey, &nid) <= 0)
[...]
src/ssl_sock.c:1096: undefined reference to `EVP_PKEY_get_default_digest_nid'
collect2: error: ld returned 1 exit status
Makefile:760: recipe for target 'haproxy' failed
make: *** [haproxy] Error 1
So we must add a #ifdef to check the OpenSSL version (>= 1.0.0) to use this
function. It is used to get default signature digest associated to the private
key used to sign generated X509 certificates. It is called when the private key
differs than EVP_PKEY_RSA, EVP_PKEY_DSA and EVP_PKEY_EC. It should be enough for
most of cases.
Kim Seri reported that haproxy 1.6.0 crashes after a few requests
when a bind line has SSL enabled with more than one certificate. This
was caused by an insufficient condition to free generated certs during
ssl_sock_close() which can also catch other certs.
Christopher Faulet analysed the situation like this :
-------
First the LRU tree is only initialized when the SSL certs generation is
configured on a bind line. So, in the most of cases, it is NULL (it is
not the same thing than empty).
When the SSL certs generation is used, if the cache is not NULL, a such
certificate is pushed in the cache and there is no need to release it
when the connection is closed.
But it can be disabled in the configuration. So in that case, we must
free the generated certificate when the connection is closed.
Then here, we have really a bug. Here is the buggy part:
3125) if (conn->xprt_ctx) {
3126) #ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME
3127) if (!ssl_ctx_lru_tree && objt_listener(conn->target)) {
3128) SSL_CTX *ctx = SSL_get_SSL_CTX(conn->xprt_ctx);
3129) if (ctx != 3130)
SSL_CTX_free(ctx);
3131) }
3133) SSL_free(conn->xprt_ctx);
3134) conn->xprt_ctx = NULL;
3135) sslconns--;
3136) }
The check on the line 3127 is not enough to determine if this is a
generated certificate or not. Because ssl_ctx_lru_tree is NULL,
generated certificates, if any, must be freed. But here ctx should also
be compared to all SNI certificates and not only to default_ctx. Because
of this bug, when a SNI certificate is used for a connection, it is
erroneously freed when this connection is closed.
-------
Christopher provided this reliable reproducer :
----------
global
tune.ssl.default-dh-param 2048
daemon
listen ssl_server
mode tcp
bind 127.0.0.1:4443 ssl crt srv1.test.com.pem crt srv2.test.com.pem
timeout connect 5000
timeout client 30000
timeout server 30000
server srv A.B.C.D:80
You just need to generate 2 SSL certificates with 2 CN (here
srv1.test.com and srv2.test.com).
Then, by doing SSL requests with the first CN, there is no problem. But
with the second CN, it should segfault on the 2nd request.
openssl s_client -connect 127.0.0.1:4443 -servername srv1.test.com // OK
openssl s_client -connect 127.0.0.1:4443 -servername srv1.test.com // OK
But,
openssl s_client -connect 127.0.0.1:4443 -servername srv2.test.com // OK
openssl s_client -connect 127.0.0.1:4443 -servername srv2.test.com // KO
-----------
A long discussion led to the following proposal which this patch implements :
- the cert is generated. It gets a refcount = 1.
- we assign it to the SSL. Its refcount becomes two.
- we try to insert it into the tree. The tree will handle its freeing
using SSL_CTX_free() during eviction.
- if we can't insert into the tree because the tree is disabled, then
we have to call SSL_CTX_free() ourselves, then we'd rather do it
immediately. It will more closely mimmick the case where the cert
is added to the tree and immediately evicted by concurrent activity
on the cache.
- we never have to call SSL_CTX_free() during ssl_sock_close() because
the SSL session only relies on openssl doing the right thing based on
the refcount only.
- thus we never need to know how the cert was created since the
SSL_CTX_free() is either guaranteed or already done for generated
certs, and this protects other ones against any accidental call to
SSL_CTX_free() without having to track where the cert comes from.
This patch also reduces the inter-dependence between the LRU tree and
the SSL stack, so it should cause less sweating to migrate to threads
later.
This bug is specific to 1.6.0, as it was introduced after dev7 by
this fix :
d2cab92 ("BUG/MINOR: ssl: fix management of the cache where forged certificates are stored")
Thus a backport to 1.6 is required, but not to 1.5.
Now, A callback is defined for generated certificates to set DH parameters for
ephemeral key exchange when required.
In same way, when possible, we also defined Elliptic Curve DH (ECDH) parameters.
This is done by adding EVP_PKEY_EC type in supported types for the CA private
key when we get the message digest used to sign a generated X509 certificate.
So now, we support DSA, RSA and EC private keys.
And to be sure, when the type of the private key is not directly supported, we
get its default message digest using the function
'EVP_PKEY_get_default_digest_nid'.
We also use the key of the default certificate instead of generated it. So we
are sure to use the same key type instead of always using a RSA key.
the file specified by the SSL option 'ca-sign-file' can now contain the CA
certificate used to dynamically generate certificates and its private key in any
order.
Commit d2cab92 ("BUG/MINOR: ssl: fix management of the cache where forged
certificates are stored") removed some needed #ifdefs resulting in ssl not
building on older openssl versions where SSL_CTRL_SET_TLSEXT_HOSTNAME is
not defined :
src/ssl_sock.c: In function 'ssl_sock_load_ca':
src/ssl_sock.c:2504: error: 'ssl_ctx_lru_tree' undeclared (first use in this function)
src/ssl_sock.c:2504: error: (Each undeclared identifier is reported only once
src/ssl_sock.c:2504: error: for each function it appears in.)
src/ssl_sock.c:2505: error: 'ssl_ctx_lru_seed' undeclared (first use in this function)
src/ssl_sock.c: In function 'ssl_sock_close':
src/ssl_sock.c:3095: error: 'ssl_ctx_lru_tree' undeclared (first use in this function)
src/ssl_sock.c: In function '__ssl_sock_deinit':
src/ssl_sock.c:5367: error: 'ssl_ctx_lru_tree' undeclared (first use in this function)
make: *** [src/ssl_sock.o] Error 1
Reintroduce the ifdefs around the faulty areas.
First, the LRU cache must be initialized after the configuration parsing to
correctly set its size.
Next, the function 'ssl_sock_set_generated_cert' returns -1 when an error occurs
(0 if success). In that case, the caller is responsible to free the memory
allocated for the certificate.
Finally, when a SSL certificate is generated by HAProxy but cannot be inserted
in the cache, it must be freed when the SSL connection is closed. This happens
when 'tune.ssl.ssl-ctx-cache-size' is set to 0.
The union name "data" is a little bit heavy while we read the source
code because we can read "data.data.sint". The rename from "data" to "u"
makes the read easiest like "data.u.sint".
This patch remove the struct information stored both in the struct
sample_data and in the striuct sample. Now, only thestruct sample_data
contains data, and the struct sample use the struct sample_data for storing
his own data.
This patch removes the 32 bits unsigned integer and the 32 bit signed
integer. It replaces these types by a unique type 64 bit signed.
This makes easy the usage of integer and clarify signed and unsigned use.
With the previous version, signed and unsigned are used ones in place of
others, and sometimes the converter loose the sign. For example, divisions
are processed with "unsigned", if one entry is negative, the result is
wrong.
Note that the integer pattern matching and dotted version pattern matching
are already working with signed 64 bits integer values.
There is one user-visible change : the "uint()" and "sint()" sample fetch
functions which used to return a constant integer have been replaced with
a new more natural, unified "int()" function. These functions were only
introduced in the latest 1.6-dev2 so there's no impact on regular
deployments.
The new "sni" server directive takes a sample fetch expression and
uses its return value as a hostname sent as the TLS SNI extension.
A typical use case consists in forwarding the front connection's SNI
value to the server in a bridged HTTPS forwarder :
sni ssl_fc_sni