If we were to enable 'ocsp-update' on a certificate that does not have
an OCSP URI, we would exit ssl_sock_load_ocsp with a negative error code
which would raise a misleading error message ("<cert> has an OCSP URI
and OCSP auto-update is set to 'on' ..."). This patch simply fixes the
error message but an error is still raised.
This issue was raised in GitHub #2432.
It can be backported up to branch 2.8.
It is now possible to selectively retrieve extra counters from stats
modules. H1, H2, QUIC and H3 fill_stats() callback functions are updated to
return a specific counter.
It has been found that under some rare error circumstances,
SSL_do_handshake() could return with SSL_ERROR_WANT_READ without
even trying to call the read function, causing permanent wakeups
that prevent the process from sleeping.
It was established that this only happens if the retry flags are
not systematically cleared in both directions upon any I/O attempt,
but, given the lack of documentation on this topic, it is hard to
say if this rather strange behavior is expected or not, otherwise
why wouldn't the library always clear the flags by itself before
proceeding?
In addition, this only seems to affect OpenSSL 1.1.0 and above,
and does not affect wolfSSL nor aws-lc.
A bisection on haproxy showed that this issue was first triggered by
commit a8955d57ed ("MEDIUM: ssl: provide our own BIO."), which means
that OpenSSL's socket BIO does not have this problem. And this one
does always clear the flags before proceeding. So let's just proceed
the same way. It was verified that it properly fixes the problem,
does not affect other implementations, and doesn't cause any freeze
nor spurious wakeups either.
Many thanks to Valentín Gutiérrez for providing a network capture
showing the incident as well as a reproducer. This is GH issue #2403.
This patch needs to be backported to all versions that include the
commit above, i.e. as far as 2.0.
This patch impacts only haproxy when built against aws-lc TLS stack (OPENSSL_IS_AWSLC).
During the SSL_CTX switching from ssl_sock_switchctx_cbk() callback,
ssl_sock_switchctx_set() is called. This latter calls SSL_set_SSL_CTX()
whose aims is to change the SSL_CTX attached o an SSL object (TLS session).
But the aws-lc (or boringssl) implementation of this function copy
the "early data enabled" setting value (boolean) coming with the SSL_CTX object
into the SSL object. So, if not set in the SSL_CTX object this setting disabled
the one which has been set by configuration into the SSL object
(see qc_set_quic_early_data_enabled(), it calls SSL_set_early_data_enabled()
with an SSL object as parameter).
Fix this enabling the "early data enabled" setting into the SSL_CTX before
setting this latter into the SSL object.
This patch is required to make QUIC 0-RTT work with haproxy built against
aws-lc.
Note that, this patch should also help in early data support for TCP connections.
The 'generate-certificates' option does not need its dedicated SSL_CTX
*, it only needs the default SSL_CTX.
Use the default SSL_CTX found in the sni_ctx to generate certificates.
It allows to remove all the specific default_ctx initialization, as
well as the default_ssl_conf and 'default_inst'.
This patch follows the previous one about default certificate selection
("MEDIUM: ssl: allow multiple fallback certificate to allow ECDSA/RSA
selection").
This patch generates '*" SNI filters for the first certificate of a
bind line, it will be used to match default certificates. Instead of
setting the default_ctx pointer in the bind line.
Since the filters are in the SNI tree, it allows to have multiple
default certificate and restore the ecdsa/rsa selection with a
multi-cert bundle.
This configuration:
# foobar.pem.ecdsa and foobar.pem.rsa
bind *:8443 ssl crt foobar.pem crt next.pem
will use "foobar.pem.ecdsa" and "foobar.pem.rsa" as default
certificates.
Note: there is still cleanup needed around default_ctx.
This was discussed in github issue #2392.
This patch changes the default certificate mechanism.
Since the beginning of SSL in HAProxy, the default certificate was the first
certificate of a bind line. This allowed to fallback on this certificate
when no servername extension was sent by the server, or when no SAN nor
CN was available in the certificate.
When using a multi-certificate bundle (ecdsa+rsa), it was possible to
have both certificates as the fallback one, leting openssl chose the
right one. This was possible because a multi-certificate bundle
was generating a unique SSL_CTX for both certificates.
When the haproxy and openssl architecture evolved, we decided to
use multiple SSL_CTX for a multi-cert bundle, in order to simplify the
code and allow updates over the CLI.
However only one default_ctx was allowed, so we lost the ability to
chose between ECDSA and RSA for the default certificate.
This patch allows to use a '*' filter for a certificate, which allow to
lookup between multiple '*' filter, and have one in RSA and another one
in ECDSA. It replaces the default_ctx mechanism in the ClientHello
callback and use the standard algorithm to look for a default cert and
chose between ECDSA and RSA.
/!\ This patch breaks the automatic setting of the default certificate, which
will be introduce in the next patch. So the first certificate of a bind
line won't be used as a defaullt anymore.
To use this feature, one could use crt-list with '*' filters:
$ cat foo.crtlist
foobar.pem.rsa *
foobar.pem.ecdsa *
In order to test the feature, it's easy to send a request without
the servername extension and use ECDSA or RSA compatible ciphers:
$ openssl s_client -connect localhost:8443 -tls1_2 -cipher ECDHE-RSA-AES256-GCM-SHA384
$ openssl s_client -connect localhost:8443 -tls1_2 -cipher ECDHE-ECDSA-AES256-GCM-SHA384
This bug impacts only the QUIC OpenSSL compatibility module (USE_QUIC_OPENSSL_COMPAT).
The TLS capture of information from client hello enabled by
tune.ssl.capture-buffer-size could not work with USE_QUIC_OPENSSL_COMPAT. This
is due to the fact the callback set for this feature was replaced by
quic_tls_compat_msg_callback(). In fact this called must be registered by
ssl_sock_register_msg_callback() as this done for the TLS client hello capture.
A call to this function appends the function passed as parameter to a list of
callbacks to be called when the TLS stack parse a TLS message.
quic_tls_compat_msg_callback() had to be modified to return if it is called
for a non-QUIC TLS session.
Must be backported to 2.8.
The PR which allows to chose a certificate depending on the ciphers and
the signature algorithms was merged in WolfSSL. Let's activate this
code.
This could be backported in 2.9 only when the next WolfSSL release is
available (5.6.5). It will also need a check on the version.
This bug could be reproduced loading several certificated from "bind" line:
with "server_ocsp.pem" as argument to "crt" setting and updating
the CDSA certificate with the RSA as follows:
echo -e "set ssl cert reg-tests/ssl/ocsp_update/multicert/server_ocsp.pem.ecdsa \
<<\n$(cat reg-tests/ssl/ocsp_update/multicert/server_ocsp.pem.rsa)\n" | socat - /tmp/stats
followed by an "commit ssl cert reg-tests/ssl/ocsp_update/multicert/server_ocsp.pem.ecdsa"
command. This could be detected by libasan as follows:
=================================================================
==507223==ERROR: AddressSanitizer: attempting double-free on 0x60200007afb0 in thread T3:
#0 0x7fabc6fb5527 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x54527)
#1 0x7fabc6ae8f8c in ossl_asn1_string_embed_free (/opt/quictls/lib/libcrypto.so.81.3+0xd4f8c)
#2 0x7fabc6af54e9 in ossl_asn1_primitive_free (/opt/quictls/lib/libcrypto.so.81.3+0xe14e9)
#3 0x7fabc6af5960 in ossl_asn1_template_free (/opt/quictls/lib/libcrypto.so.81.3+0xe1960)
#4 0x7fabc6af569f in ossl_asn1_item_embed_free (/opt/quictls/lib/libcrypto.so.81.3+0xe169f)
#5 0x7fabc6af58a4 in ASN1_item_free (/opt/quictls/lib/libcrypto.so.81.3+0xe18a4)
#6 0x46a159 in ssl_sock_free_cert_key_and_chain_contents src/ssl_ckch.c:723
#7 0x46aa92 in ckch_store_free src/ssl_ckch.c:869
#8 0x4704ad in cli_release_commit_cert src/ssl_ckch.c:1981
#9 0x962e83 in cli_io_handler src/cli.c:1140
#10 0xc1edff in task_run_applet src/applet.c:454
#11 0xaf8be9 in run_tasks_from_lists src/task.c:634
#12 0xafa2ed in process_runnable_tasks src/task.c:876
#13 0xa23c72 in run_poll_loop src/haproxy.c:3024
#14 0xa24aa3 in run_thread_poll_loop src/haproxy.c:3226
#15 0x7fabc69e7ea6 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x7ea6)
#16 0x7fabc6907a2e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xfba2e)
0x60200007afb0 is located 0 bytes inside of 3-byte region [0x60200007afb0,0x60200007afb3)
freed by thread T3 here:
#0 0x7fabc6fb5527 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x54527)
#1 0x7fabc6ae8f8c in ossl_asn1_string_embed_free (/opt/quictls/lib/libcrypto.so.81.3+0xd4f8c)
previously allocated by thread T2 here:
#0 0x7fabc6fb573f in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x5473f)
#1 0x7fabc6ae8d77 in ASN1_STRING_set (/opt/quictls/lib/libcrypto.so.81.3+0xd4d77)
Thread T3 created by T0 here:
#0 0x7fabc6f84bba in pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x23bba)
#1 0xc04f36 in setup_extra_threads src/thread.c:252
#2 0xa2761f in main src/haproxy.c:3917
#3 0x7fabc682fd09 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d09)
Thread T2 created by T0 here:
#0 0x7fabc6f84bba in pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x23bba)
#1 0xc04f36 in setup_extra_threads src/thread.c:252
#2 0xa2761f in main src/haproxy.c:3917
#3 0x7fabc682fd09 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d09)
SUMMARY: AddressSanitizer: double-free ??:0 __interceptor_free
==507223==ABORTING
Aborted
The OCSP CID stored in the impacted ckch data were freed but not reset to NULL,
leading to a subsequent double free.
Must be backported to 2.8.
This patch removes the code which selects the SSL certificate in the
OpenSSL Client Hello callback, to use the ssl_sock_chose_sni_ctx()
function which does the same.
The bigger part of the function which remains is the extraction of the
servername, ciphers and sigalgs, because it's done manually by parsing
the TLS extensions.
This is not supposed to change anything functionally.
The certificate selection used in the WolfSSL cert_cb and in the OpenSSL
clienthello callback is the same, the function was duplicate to achieve
the same.
This patch move the selection code to a common function called
ssl_sock_chose_sni_ctx().
The servername string is still lowered in the callback, however the
search for the first dot in the string (wildp) is done in
ssl_sock_chose_sni_ctx()
The function uses the same certificate selection algorithm as before, it
needs to know if you need rsa or ecdsa, the bind_conf to achieve the
lookup, and the servername string.
This patch moves the code for WolSSL only.
PR https://github.com/wolfSSL/wolfssl/pull/6963 implements primitives to
extract ciphers and algorithm signatures.
It allows to chose a certificate depending on the sigals and
ciphers presented by the client (RSA or ECDSA).
Since WolfSSL does not implement the clienthello callback, the patch
uses the certificate callback (SSL_CTX_set_cert_cb())
The callback is inspired by our clienthello callback, however the
extraction of client ciphers and sigalgs is simpler,
wolfSSL_get_sigalg_info() and wolfSSL_get_ciphersuite_info() are used.
This is not enabled by default yet as the PR was not merged.
This global variable was used to avoid using locks on shared_contexts in
the unlikely case of nbthread==1. Since the locks do not do anything
when USE_THREAD is not defined, it will be more beneficial to simply
remove this variable and the systematic test on its value in the shared
context locking functions.
Descend the shctx_lock calls into the shctx_row_reserve_hot so that the
cases when we don't need to lock anything (enough space in the current
row or not enough space in the 'avail' list) do not take the lock at
all.
In sh_ssl_sess_new_cb the lock had to be descended into
sh_ssl_sess_store in order not to cover the shctx_row_reserve_hot call
anymore.
Since a lock on the cache tree was added in the latest cache changes, we
do not need to use the shared_context's lock to lock more than pure
shared_context related data anymore. This already existing lock will now
only cover the 'avail' list from the shared_context. It can then be
changed to a rwlock instead of a spinlock because we might want to only
run through the avail list sometimes.
Apart form changing the type of the shctx lock, the main modification
introduced by this patch is to limit the amount of code covered by the
shctx lock. This lock does not need to cover any code strictly related
to the cache tree anymore.
The "hot" list stored in a shared_context was used to keep a reference
to shared blocks that were currently being used and were thus removed
from the available list (so that they don't get reused for another cache
response). This 'hot' list does not ever need to be shared across
threads since every one of them only works on their current row.
The main need behind this 'hot' list was to detach the corresponding
blocks from the 'avail' list and to have a known list root when calling
list_for_each_entry_from in shctx_row_data_append (for instance).
Since we actually never need to iterate over all members of the 'hot'
list, we can remove it and replace the inc_hot/dec_hot logic by a
detach/reattach one.
Every use of the cache tree was covered by the shctx lock even when no
operations were performed on the shared_context lists (avail and hot).
This patch adds a dedicated RW lock for the cache so that blocks of code
that work on the cache tree only can use this lock instead of the
superseding shctx one. This is useful for operations during which the
concerned blocks are already in the hot list.
When the two locks need to be taken at the same time, in
http_action_req_cache_use and in shctx_row_reserve_hot, the shctx one
must be taken first.
A new parameter needed to be added to the shared_context's free_block
callback prototype so that cache_free_block can take the cache lock and
release it afterwards.
The patch which fixes the certificate selection uses
SSL_CIPHER_get_id() to skip the SCSV ciphers without checking if cipher
is NULL. This patch fixes the issue by skipping any NULL cipher in the
iteration.
Problem was reported in #2329.
Need to be backported where 23093c72f1 was
backported. No release was made with this patch so the severity is
MEDIUM.
When using TLSv1.3, the signature algorithms extension is used to chose
the right ECDSA or RSA certificate.
However there was an old test for previous version of TLS (< 1.3) which
was testing if the cipher is compatible with ECDSA when an ECDSA
signature algorithm is used. This test was relying on
SSL_CIPHER_get_auth_nid(cipher) == NID_auth_ecdsa to verify if the
cipher is still good.
Problem is, with TLSv1.3, all ciphersuites are compatible with any
authentication algorithm, but SSL_CIPHER_get_auth_nid(cipher) does not
return NID_auth_ecdsa, but NID_auth_any.
Because of this, with TLSv1.3 when both ECDSA and RSA certificates are
available for a domain, the ECDSA one is not chosen in priority.
This patch also introduces a test on the cipher IDs for the signaling
ciphersuites, because they would always return NID_auth_any, and are not
relevent for this selection.
This patch fixes issue #2300.
Must be backported in all stable versions.
Each time a new SSL context is allocated, global.sslconns is
incremented. If global.maxsslconn is reached, the allocation is
cancelled.
This procedure was not entirely thread-safe due to the check and
increment operations conducted at different stage. This could lead to
global.maxsslconn slightly exceeded when several threads allocate SSL
context while sslconns is near the limit.
To fix this, use a CAS operation in a do/while loop. This code is
similar to the actconn/maxconn increment for connection.
A new function increment_sslconn() is defined for this operation. For
the moment, only SSL code is using it. However, it is expected that QUIC
will also use it to count QUIC connections as SSL ones.
This should be backported to all stable releases. Note that prior to the
2.6, sslconns was outside of global struct, so this commit should be
slightly adjusted.
Since commit 5afcb686b ("MAJOR: connection: purge idle conn by last usage")
in 2.9-dev4, the test on conn->toremove_list added to conn_get_idle_flag()
in 2.8 by commit 3a7b539b1 ("BUG/MEDIUM: connection: Preserve flags when a
conn is removed from an idle list") becomes misleading. Indeed, now both
toremove_list and idle_list are shared by a union since the presence in
these lists is mutually exclusive. However, in conn_get_idle_flag() we
check for the presence in the toremove_list to decide whether or not to
delete the connection from the tree. This test now fails because instead
it sees the presence in the idle or safe list via the union, and concludes
the element must not be removed. Thus the element remains in the tree and
can be found later after the connection is released, causing crashes that
Tristan reported in issue #2292.
The following config is sufficient to reproduce it with 2 threads:
defaults
mode http
timeout client 5s
timeout server 5s
timeout connect 1s
listen front
bind :8001
server next 127.0.0.1:8002
frontend next
bind :8002
timeout http-keep-alive 1
http-request redirect location /
Sending traffic with a few concurrent connections and some short timeouts
suffices to instantly crash it after ~10k reqs:
$ h2load -t 4 -c 16 -n 10000 -m 1 -w 1 http://0:8001/
With Amaury we analyzed the conditions in which the function is called
in order to figure a better condition for the test and concluded that
->toremove_list is never filled there so we can safely remove that part
from the test and just move the flag retrieval back to what it was prior
to the 2.8 patch above. Note that the patch is not reverted though, as
the parts that would drop the unexpected flags removal are unchanged.
This patch must NOT be backported. The code in 2.8 works correctly, it's
only the change in 2.9 that makes it misbehave.
This patch implements the 'curves' keyword on server lines as well as
the 'ssl-default-server-curves' keyword in the global section.
It also add the keyword on the server line in the ssl_curves reg-test.
These keywords allow the configuration of the curves list for a server.
This adds a new option for the Makefile USE_OPENSSL_AWSLC, and
update the documentation with instructions to use HAProxy with
AWS-LC.
Update the type of the OCSP callback retrieved with
SSL_CTX_get_tlsext_status_cb with the actual type for
libcrypto versions greater than 1.0.2. This doesn't affect
OpenSSL which casts the callback to void* in SSL_CTX_ctrl.
The per-thread SSL context in servers causes a burst of connection
renegotiations on startup, both for the forwarded traffic and for the
health checks. Health checks have been seen to continue to cause SSL
rekeying for several minutes after a restart on large thread-count
machines. The reason is that the context is exlusively per-thread
and that the more threads there are, the more likely it is for a new
connection to start on a thread that doesn't have such a context yet.
In order to improve this situation, this commit ensures that a thread
starting an SSL connection to a server without a session will first
look at the last session that was updated by another thread, and will
try to use it. In order to minimize the contention, we're using a read
lock here to protect the data, and the first-level index is an integer
containing the thread number, that is always valid and may always be
dereferenced. This way the session retrieval algorithm becomes quite
simple:
- if the last thread index is valid, then try to use the same session
under a read lock ;
- if any error happens, then atomically nuke the index so that other
threads don't use it and the next one to update a connection updates
it again
And for the ssl_sess_new_srv_cb(), we have this:
- update the entry under a write lock if the new session is valid,
otherwise kill it if the session is not valid;
- atomically update the index if it was 0 and the new one is valid,
otherwise atomically nuke it if the session failed.
Note that even if only the pointer is destroyed, the element will be
re-allocated by the next thread during the sess_new_srv_sb().
Right now a session is picked even if the SNI doesn't match, because
we don't know the SNI yet during ssl_sock_init(), but that's essentially
a matter of API, since connect_server() figures the SNI very early, then
calls conn_prepare() which calls ssl_sock_init(). Thus in the future we
could easily imaging storing a number of SNI-based contexts instead of
storing contexts per thread.
It could be worth backporting this to one LTS version after some
observation, though this is not strictly necessary. the current commit
depends on the following ones:
BUG/MINOR: ssl_sock: fix possible memory leak on OOM
MINOR: ssl_sock: avoid iterating realloc(+1) on stored context
DOC: ssl: add some comments about the non-obvious session allocation stuff
CLEANUP: ssl: keep a pointer to the server in ssl_sock_init()
MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid
MEDIUM: server/ssl: place an rwlock in the per-thread ssl server session
MINOR: server/ssl: maintain an index of the last known valid SSL session
MINOR: server/ssl: clear the shared good session index on failure
MEDIUM: server/ssl: pick another thread's session when we have none yet
If we fail to set the session using SSL_set_session(), we want to quickly
erase our index from the shared one so that any other thread with a valid
session replaces it.
When a thread creates a new session for a server, if none was known yet,
we assign the thread id (hence the reused_sess index) to a shared variable
so that other threads will later be able to find it when they don't have
one yet. For now we only set and clear the pointer upon session creation,
we do not yet pick it.
Note that we could have done it per thread-group, so as to avoid any
cross-thread exchanges, but it's anticipated that this is essentially
used during startup, at a moment where the cost of inter-thread contention
is very low compared to the ability to restart at full speed, which
explains why instead we store a single entry.
The goal will be to permit a thread to update its session while having
it shared with other threads. For now we only place the lock and arrange
the code around it so that this is quite light. For now only the owner
thread uses this lock so there is no contention.
Note that there is a subtlety in the openssl API regarding
i2s_SSL_SESSION() in that it fills the area pointed to by its argument
with a dump of the session and returns a size that's equal to the
previously allocated one. As such, it does modify the shared area even
if that's not obvious at first glance.
In ssl_sock_set_servername(), we're retrieving the current server name
from the current thread, hoping it will not have changed. This is a
bit dangerous as strictly speaking it's not easy to prove that no other
connection had to use one between the moment it was retrieved in
ssl_sock_init() and the moment it's being read here. In addition, this
forces us to maintain one session per thread while this is not the real
need, in practice we only need one session per SNI. And the current model
prevents us from sharing sessions between threads.
This had been done in 2.5 via commit e18d4e828 ("BUG/MEDIUM: ssl: backend
TLS resumption with sni and TLSv1.3"), but as analyzed with William, it
turns out that a saner approach consists in keeping the call to
SSL_get_servername() there and instead to always assign the SNI to the
current SSL context via SSL_set_tlsext_host_name() immediately when the
session is retreived. This way the session and SNI are consulted atomically
and the host name is only checked from the session and not from possibly
changing elements.
As a bonus the rdlock that was added by that commit could now be removed,
though it didn't cost much.
The SSL session allocation/reuse part is far from being trivial, and
there are some necessary tricks such as allocating then immediately
freeing that are required by the API due to internal refcount. All of
this is particularly hard to grasp, even with the scarce man pages.
Let's document a little bit what's granted and expected along this path
to help the reader later.
The SSL context storage in servers is per-thread, and the contents are
allocated for a length that is determined from the session. It turns out
that placing some traces there revealed that the realloc() that is called
to grow the area can be called multiple times in a row even for just
health checks, to grow the area by just one or two bytes. Given that
malloc() allocates in multiples of 8 or 16 anyway, let's round the
allocated size up to the nearest multiple of 8 to avoid this unneeded
operation.
Define a new function _srv_add_idle(). This is a simple wrapper to
insert a connection in the server idle tree. This is reserved for simple
usage and require to idle_conns lock. In most cases,
srv_add_to_idle_list() should be used.
This patch does not have any functional change. However, it will help
with the next patch as idle connection will be always inserted in a list
as secondary storage along with idle/safe trees.
Small change of API for conn_delete_from_tree(). Now the connection
instance is taken as argument instead of its inner node.
No functional change introduced with this commit. This simplifies
slightly invocation of conn_delete_from_tree(). The most useful changes
is that this function will be extended in the next patch to be able to
remove the connection from its new idle list at the same time as in its
idle tree.
That's the classical realloc() issue: if it returns NULL, the old area
is not freed but we erase the pointer. It was brought by commit e18d4e828
("BUG/MEDIUM: ssl: backend TLS resumption with sni and TLSv1.3"), and
should be backported where this commit was backported.
If multiple SSL_CTXs use the same certificate that has an OCSP response
file on the filesystem, only the first one will have the OCSP callback
set. This bug was introduced by "cc346678d MEDIUM: ssl: Add ocsp_certid
in ckch structure and discard ocsp buffer early" which cleared the
ocsp_response from the ckch_data after it was inserted in the tree,
which prevented subsequent contexts from having the callback registered.
This patch should be backported to 2.8.
SSL_CTX_keylog() is the callback used when the TLS keylog feature is enabled with
tune.ssl.keylog configuration setting. But the QUIC openssl wrapper also needs
to use such a callback to receive the QUIC TLS secrets from the TLS stack.
Add a call to the keylog callback for the QUIC openssl wrapper to SSL_CTX_keylog()
to ensure that it will be called when the TLS keylog feature is enabled.
When the QUIC OpenSSL wrapper use is enabled, all the TLS contexts (SSL_CTX) must
be configured to support it. This is done calling quic_tls_compat_init() from
ssl_sock_prepare_ctx(). Note that quic_tls_compat_init() ignore the TLS context
which are not linked to non-QUIC TLS sessions/connections.
Required for the QUIC openssl wrapper support.
This patch introduces the "client-sigalgs" keyword for the server line,
which allows to configure the list of server signature algorithms
negociated during the handshake. Also available as
"ssl-default-server-client-sigalgs" in the global section.
This patch introduces the "sigalgs" keyword for the server line, which
allows to configure the list of server signature algorithms negociated
during the handshake. Also available as "ssl-default-server-sigalgs" in
the global section.
Return a more acurate error than the previous patch, CO_ER_SSL_EMPTY is
the code for "Connection closed during SSL handshake" which is more
precise than CO_ER_SSL_ABORT ("Connection error during SSL handshake").
No backport needed.
During a SSL_do_handshake(), SSL_ERROR_ZERO_RETURN can be returned in case
the remote peer sent a close_notify alert. Previously this would set the
connection error to CO_ER_SSL_HANDSHAKE, this patch sets it to
CO_ER_SSL_ABORT to have a more acurate error.
in __ssl_sock_init, BIO_meth_new may failed and return NULL if
OPENSSL_zalloc failed. in this case, ha_meth will be NULL, and then
crash happens in BIO_meth_set_write. So, we add a check for ha_meth.
GCC complains about swapping 2 heads list, one local and one global.
gcc -Iinclude -O2 -g -Wall -Wextra -Wundef -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wno-string-plus-int -Wno-atomic-alignment -Werror -DDEBUG_STRICT -DDEBUG_MEMORY_POOLS -DUSE_EPOLL -DUSE_NETFILTER -DUSE_POLL -DUSE_THREAD -DUSE_BACKTRACE -DUSE_TPROXY -DUSE_LINUX_TPROXY -DUSE_LINUX_SPLICE -DUSE_LIBCRYPT -DUSE_CRYPT_H -DUSE_GETADDRINFO -DUSE_OPENSSL -DUSE_SSL -DUSE_LUA -DUSE_ACCEPT4 -DUSE_ZLIB -DUSE_CPU_AFFINITY -DUSE_TFO -DUSE_NS -DUSE_DL -DUSE_RT -DUSE_MATH -DUSE_SYSTEMD -DUSE_PRCTL -DUSE_THREAD_DUMP -DUSE_QUIC -DUSE_SHM_OPEN -DUSE_PCRE -DUSE_PCRE_JIT -I/github/home/opt/include -I/usr/include -DCONFIG_HAPROXY_VERSION=\"2.8-dev8-7d23e8d1a6db\" -DCONFIG_HAPROXY_DATE=\"2023/04/24\" -c -o src/ssl_sample.o src/ssl_sample.c
In file included from include/haproxy/pool.h:29,
from include/haproxy/chunk.h:31,
from include/haproxy/dynbuf.h:33,
from include/haproxy/channel.h:27,
from include/haproxy/applet.h:29,
from src/ssl_sock.c:47:
src/ssl_sock.c: In function 'tlskeys_finalize_config':
include/haproxy/list.h:48:88: error: storing the address of local variable 'tkr' in 'tlskeys_reference.p' [-Werror=dangling-pointer=]
48 | #define LIST_INSERT(lh, el) ({ (el)->n = (lh)->n; (el)->n->p = (lh)->n = (el); (el)->p = (lh); (el); })
| ~~~~~~~~^~~~~~
src/ssl_sock.c:1086:9: note: in expansion of macro 'LIST_INSERT'
1086 | LIST_INSERT(&tkr, &tlskeys_reference);
| ^~~~~~~~~~~
compilation terminated due to -Wfatal-errors.
This appears with gcc 13.0.
The fix uses LIST_SPLICE() instead of inserting the head of the local
list in the global list.
Should fix issue #2136 .
This commit introduces the keyword "client-sigalgs" for the bind line,
which does the same as "sigalgs" but for the client authentication.
"ssl-default-bind-client-sigalgs" allows to set the default parameter
for all the bind lines.
This patch should fix issue #2081.
This patch introduces the "sigalgs" keyword for the bind line, which
allows to configure the list of server signature algorithms negociated
during the handshake. Also available as "ssl-default-bind-sigalgs" in
the default section.
This patch was originally written by Bruno Henc.
WolfSSL is enabling by default the CRL checks even if a CRL file wasn't
provided. This patch resets the default X509_STORE flags so this is
not checked by default.
While it does not have any effect, it's better not to try to setup an
ALPN callback nor to try to lookup algorithms when the configured ALPN
string is empty as a result of "no-alpn" being used.
strcpy() is quite nasty but tolerable to copy constants, but here
it copies a variable path into a node in a code path that's not
trivial to follow given that it takes the node as the result of
a tree lookup. Let's get rid of it and mention where the entry
is retrieved.
Previously performing a config check of `.github/h2spec.config` would report a
20 byte leak as reported in GitHub Issue #2082.
The leak was introduced in a6c0a59e9a, which is
dev only. No backport needed.
Despite having replaced the SSL BIOs to use our own raw_sock layer, we
still didn't exploit the CO_SFL_MSG_MORE flag which is pretty useful to
avoid sending incomplete packets. It's particularly important for SSL
since the extra overhead almost guarantees that each send() will be
followed by an incomplete (and often odd-sided) segment.
We already have an xprt_st set of flags to pass info to the various
layers, so let's just add a new one, SSL_SOCK_SEND_MORE, that is set
or cleared during ssl_sock_from_buf() to transfer the knowledge of
CO_SFL_MSG_MORE. This way we can recover this information and pass
it to raw_sock.
This alone is sufficient to increase by ~5-10% the H2 bandwidth over
SSL when multiple streams are used in parallel.
The commit 5e1b0e7bf ("BUG/MEDIUM: connection: Clear flags when a conn is
removed from an idle list") introduced a regression. CO_FL_SAFE_LIST and
CO_FL_IDLE_LIST flags are used when the connection is released to properly
decrement used/idle connection counters. if a connection is idle, these
flags must be preserved till the connection is really released. It may be
removed from the list but not immediately released. If these flags are lost
when it is finally released, the current number of used connections is
erroneously decremented. If means this counter may become negative and the
counters tracking the number of idle connecitons is not decremented,
suggesting a leak.
So, the above commit is reverted and instead we improve a bit the way to
detect an idle connection. The function conn_get_idle_flag() must now be
used to know if a connection is in an idle list. It returns the connection
flag corresponding to the idle list if the connection is idle
(CO_FL_SAFE_LIST or CO_FL_IDLE_LIST) or 0 otherwise. But if the connection
is scheduled to be removed, 0 is also returned, regardless the connection
flags.
This new function is used when the connection is temporarily removed from
the list to be used, mainly in muxes.
This patch should fix#2078 and #2057. It must be backported as far as 2.2.
Like for connection error code, when FD are dumps, the ssl error code is now
displayed. This may help to diagnose why a connection error occurred.
This patch may be backported to help debugging.
Instead of having a dedicated httpclient instance and its own code
decorrelated from the actual auto update one, the "update ssl
ocsp-response" will now use the update task in order to perform updates.
Since the cli command allows to update responses that were never
included in the auto update tree, a new flag was added to the
certificate_ocsp structure so that the said entry can be inserted into
the tree "by hand" and it won't be reinserted back into the tree after
the update process is performed. The 'update_once' flag "stole" a bit
from the 'fail_count' counter since it is the one less likely to reach
UINT_MAX among the ocsp counters of the certificate_ocsp structure.
This new logic required that every certificate_ocsp entry contained all
the ocsp-related information at all time since entries that are not
supposed to be configured automatically can still be updated through the
cli. The logic of the ssl_sock_load_ocsp was changed accordingly.
When adding a new certificate through the CLI and appending it to a
crt-list with the 'ocsp-update' option set, the new certificate would
not be added to the OCSP response update list.
The only thing that was missing was the copy of the ocsp_update mode
from the ssl_bind_conf into the ckch_store's object.
An extra wakeup of the update task also needed to happen in case the
newly inserted entry needs to be updated before the next wakeup of the
task.
This patch does not need to be backported.
The minimum and maximum delays between two automatic updates of a given
OCSP response can now be set via global options. It allows to limit the
update rate of OCSP responses for configurations that use many frontend
certificates with the ocsp-update option set if the updates are deemed
too costly.
In order to have some information about the frontend certificate when
dumping the contents of the ocsp update tree from the cli, we could
either keep a reference to a ckch_store in the certificate_ocsp
structure, which might cause some dangling reference problems, or
simply copy the path to the certificate in the ocsp response structure.
This latter solution was chosen because of its simplicity.
This is a bad idea to make the TLS ClientHello callback call qc_conn_finalize().
If this latter fails, this would generate a TLS alert and make the connection
send packet whereas it is not functional. But qc_conn_finalize() job was to
install the transport parameters sent by the QUIC listener. This installation
cannot be done at any time. This must be done after having possibly negotiated
the QUIC version and before sending the first Handshake packets. It seems
the better moment to do that in when the Handshake TX secrets are derived. This
has been found inspecting the ngtcp2 code. Calling SSL_set_quic_transport_params()
too late would make the ServerHello to be sent without the transport parameters.
The code for the connection update which was done from qc_conn_finalize() has
been moved to quic_transport_params_store(). So, this update is done as soon as
possible.
Add QUIC_FL_CONN_TX_TP_RECEIVED to flag the connection as having received the
peer transport parameters. Indeed this is required when the ClientHello message
is splitted between packets.
Add QUIC_FL_CONN_FINALIZED to protect the connection from calling qc_conn_finalize()
more than one time. This latter is called only when the connection has received
the transport parameters and after returning from SSL_do_hanshake() which is the
function which trigger the TLS ClientHello callback call.
Remove the calls to qc_conn_finalize() from from the TLS ClientHello callbacks.
Must be backported to 2.6. and 2.7.
When using WolfSSL, there are some cases were the SSL_CTX_sess_new_cb is
called with an existing session ID. These cases are not met with
OpenSSL.
When the ID is found in the session tree during the insertion, the
shared_block len is not set to 0 and is not used. However if later the
block is reused, since the len is not set to 0, the release callback
will be called an ebmb_delete will be tried on the block, even if it's
not in the tree, provoking a crash.
The code was buggy from the beginning, but the case never happen with
openssl which changes the ID.
Must be backported in every maintained branches.
CF_READ_NULL flag is not really useful and used. It is a transient event
used to wakeup the stream. As we will see, all read events on a channel may
be resumed to only one and are all used to wake up the stream.
In this patch, we introduce CF_READ_EVENT flag as a replacement to
CF_READ_NULL. There is no breaking change for now, it is just a
rename. Gradually, other read events will be merged with this one.
The ocsp_uri field of the certificate_ocsp structure was a 16k buffer
when it could be hand allocated to just the required size to store the
OCSP uri. This field is now behaving the same way as the sctl and
ocsp_response buffers of the ckch_store structure.
If a given certificate is used multiple times in a configuration, the
ocsp_cid field would have been overwritten during each
ssl_sock_load_ocsp call even if it was previously filled.
This patch does not need to be backported.
If the ocsp issuer certificate was actually taken from the certificate
chain in ssl_sock_load_ocsp, we don't need to keep an extra reference on
it since we already keep a reference to the full certificate chain.
This patch effectively enables the ocsp auto update mechanism. If a
least one ocsp-update option is enabled in a crt-list, then the ocsp
auto update task is created. It will look into the dedicated ocsp update
tree for the next update to be updated, use the http_client to send the
ocsp request to the proper responder, validate the received ocsp
response and update the ocsp response tree before finally reinserting
the entry in the ocsp update tree (with a next update time set to
now+1H).
The main task will then sleep until another entry needs to be updated.
The task gets scheduled after config check in order to avoid trying to
update ocsp responses while configuration is still being parsed (and
certificates and actual ocsp responses are loaded).
This patch contains the main function of the ocsp auto update mechanism
as well as an init and destroy function of the task used for this.
The task is not created in this patch but in a later one.
The function has two distinct parts and the branching to one or the
other is completely based on the fact that the cur_ocsp pointer of the
ssl_ocsp_task_ctx member is set.
If the pointer is not set, we need to look at the first item of the
update tree and see if it needs to be updated. If it does not we simply
wait until the time is right and let the task asleep. If it does need to
be updated, we simply build and send the corresponding ocsp request
thanks to the http_client. The task is then sent to sleep with an expire
time set to infinity. The http_client will wake it back up once the
response is received (or a timeout occurs). Just note that during this
whole process the cetificate_ocsp object corresponding to the entry
being updated is taken out of the update tree and only stored in the
ssl_ocsp_task_ctx context.
Once the task is waken up by the http_client, it branches on the
response processing part of the function which basically checks that the
response is valid and inserts it into the ocsp_response tree. The task
then goes back to sleep until another entry needs to be updated.
When 'ocsp-update' is enabled for a given certificate, we need to insert
the certificate_ocsp member of this certificate in the OCSP update tree
as well as the already existing OCSP response tree. For such an entry to
be created, the certificate needs to contain an "OCSP URI" field, and we
also need to know the certificate's issuer, that is used to build the
OCSP_CERTID. When no OCSP response is known for a given certificate, an
empty certificate_ocsp object gets created so that it can be inserted in
the ocsp update tree.
The entry is inserted on the first spot of the update tree since its
expire time is 0. Then whenever the update task is started, it will try
to get responses for those certificates first.
In order for the update process to work, we also need to store some
information relative to the main certificate into the certificate_ocsp
structure. This avoids having to keep a reference to a ckch in an ocsp
tree entry.
This patch adds a reference to the certificate chain as well as the ocsp
issuer that might have been filled during init into the certificate_ocsp
object. It also gets the ocsp uri at this time since it is contained in
the server's certificate. We only take the first uri that might be
contained in the certificate though.
Those fields are only filled when ocsp auto update is enabled for the
concerned certificate.
The OCSP update tree holds ocsp responses that will need to be updated
automatically. The entries are inserted in an eb64_tree where the keys
are the absolute time after which the entry will need to be updated.
The ocsp_response member of the cert_key_and_chain structure is only
used temporarily. During a standard init process where an ocsp response
is provided, this ocsp file is first copied into the ocsp_response
buffer without any ocsp-related parsing (see
ssl_sock_load_ocsp_response_from_file), and then the contents are
actually interpreted and inserted into the actual ocsp tree
(cert_ocsp_tree) later in the process (see ssl_sock_load_ocsp). If the
response was deemed valid, it is then copied into the actual
ocsp_response structure's 'response' field (see
ssl_sock_load_ocsp_response). From this point, the ocsp_response field
of the cert_key_and_chain object could be discarded since actual ocsp
operations will be based of the certificate_ocsp object.
The only remaining runtime use of the ckch's ocsp_response field was in
the CLI, and more precisely in the 'show ssl cert' mechanism.
This constraint could be removed by adding an OCSP_CERTID directly in
the ckch because the buffer was only used to get this id.
This patch then adds the OCSP_CERTID pointer in the ckch, it clears the
ocsp_response buffer early and simplifies the ckch_store_build_certid
function.
The new "update ssl ocsp-response <certfile>" CLI command allows to
update the stored OCSP response for a given certificate. It relies on
the http_client which is used to send an HTTP request to the OCSP
responder whose URI can be extracted from the certificate.
This command won't work for a certificate that did not have a stored
OCSP response yet.
This helper function will check that an OCSP response is valid, meaning
that the proper "Content-Type: application/ocsp-response" header is
present and the data itself is a proper OCSP_RESPONSE that can be
checked thanks to the issuer certificate.
The tree that contains OCSP responses is never locked despite being used
at runtime for OCSP stapling as well as the CLI through "set ssl cert"
and "set ssl ocsp-response" commands.
Everything works though because the certificate_ocsp structure is
refcounted and the tree's entries are cleaned up when SSL_CTXs are
destroyed (thanks to an ex_data entry in which the certificate_ocsp
pointer is stored).
This new lock will come to use when the OCSP auto update mechanism is
fully implemented because this new feature will be based on another tree
that stores the same certificate_ocsp members and updates their contents
periodically.
The certificate chain that gets passed in the SSL_CTX through
SSL_CTX_set1_chain has its reference counter increased by OpenSSL
itself. But since the ssl_sock_load_cert_chain function might create a
brand new certificate chain if none exists in the ckch_data
(sk_X509_new_null), then we ended up returning a new certificate chain
to the caller that was never destroyed.
This patch can be backported to all stable branches but it might need to
be reworked for branches older than 2.4 because of commit ec805a32b9
that refactorized the modified code.
When calling 'show ssl ocsp-response' from the CLI, a temporary buffer
was created in parse_binary when we could just use a local static buffer
instead. This does not change the behavior of the function, it just
simplifies it.
With internal proxies using the SSL activated (httpclient for example)
the automatic computation of the maxconn is wrong because these proxies
are always activated by default.
This patch fixes the issue by not counting these internal proxies during
the computation.
Must be backported as far as 2.5.
Rename the structure "cert_key_and_chain" to "ckch_data" in order to
avoid confusion with the store whcih often called "ckchs".
The "cert_key_and_chain *ckch" were renamed "ckch_data *data", so we now
have store->data instead of ckchs->ckch.
Marked medium because it changes the API.
This adds a USE_OPENSSL_WOLFSSL option, wolfSSL must be used with the
OpenSSL compatibility layer. This must be used with USE_OPENSSL=1.
WolfSSL build options:
./configure --prefix=/opt/wolfssl --enable-haproxy
HAProxy build options:
USE_OPENSSL=1 USE_OPENSSL_WOLFSSL=1 WOLFSSL_INC=/opt/wolfssl/include/ WOLFSSL_LIB=/opt/wolfssl/lib/ ADDLIB='-Wl,-rpath=/opt/wolfssl/lib'
Using at least the commit 54466b6 ("Merge pull request #5810 from
Uriah-wolfSSL/haproxy-integration") from WolfSSL. (2022-11-23).
This is still to be improved, reg-tests are not supported yet, and more
tests are to be done.
Signed-off-by: William Lallemand <wlallemand@haproxy.org>
In ssl_sock_bind_verifycbk(), when compiled without QUIC support, the
compiler may report an error during compilation about a possible NULL
dereference:
src/ssl_sock.c: In function ‘ssl_sock_bind_verifycbk’:
src/ssl_sock.c:1738:12: error: potential null pointer dereference [-Werror=null-dereference]
1738 | ctx->xprt_st |= SSL_SOCK_ST_FL_VERIFY_DONE;
| ~~~^~~~~~~~~
A BUG_ON() was addeded because it must never happen. But when compiled
without DEBUG_STRICT, there is nothing to help the compiler. Thus
ALREADY_CHECKED() macro is used. The ssl-sock context and the bind config
are concerned.
This patch must be backported to 2.6.
The registering of the keylog callback seems to provoke a loss of
performance. Disable the registration as well as the fetches if
tune.ssl.keylog is off.
Must be backported as far as 2.2.
Since commit 9b2598 ("BUG/MEDIUM: ssl: Verify error codes can exceed
63"), the ca_ignerr_bitfield and crt_ignerr_bietfield are incorrecly
accessed from __objt_listener(conn->target)->bind_conf which is not
avaiable from QUIC. The bind_conf variable was mistakenly replaced.
This patch fixes the issue by using again the bind_conf variable.
Must be backported where 9b2598 was backported.
The CRT and CA verify error codes were stored in 6 bits each in the
xprt_st field of the ssl_sock_ctx meaning that only error code up to 63
could be stored. Likewise, the ca-ignore-err and crt-ignore-err options
relied on two unsigned long longs that were used as bitfields for all
the ignored error codes. On the latest OpenSSL1.1.1 and with OpenSSLv3
and newer, verify errors have exceeded this value so these two storages
must be increased. The error codes will now be stored on 7 bits each and
the ignore-err bitfields are replaced by a big enough array and
dedicated bit get and set functions.
It can be backported on all stable branches.
[wla: let it be tested a little while before backport]
Signed-off-by: William Lallemand <wlallemand@haproxy.org>
In case of error, the ocsp item might already be in the ocsp certificate
tree but simply freed instead of destroyed through ssl_sock_free_ocsp.
This patch can be backported to all stable versions.
When calling ssl_get0_issuer_chain, if akid is not NULL but its keyid
is, then the AUTHORITY_KEYID is not freed.
This patch can be backported to all stable branches.
When running HAProxy with OpenSSLv3, the two BIGNUMs used to build our
own DH parameters are not freed. It was not necessary previously because
ownership of those parameters was transferred to OpenSSL through the
DH_set0_pqg call.
This patch should be backported to 2.6.