Empty connection is reported as handshake error
even if dont-log-null is specified.
This bug affect is a regression du to:
BUILD: ssl: fix to build (again) with boringssl
New openssl 1.1.1 defines OPENSSL_NO_HEARTBEATS as boring ssl
so the test was replaced by OPENSSL_IS_BORINGSSL
This fix should be backported on 1.8
Add a new "subscribe" method for connection, conn_stream and mux, so that
upper layer can subscribe to them, to be called when the event happens.
Right now, the only event implemented is "SUB_CAN_SEND", where the upper
layer can register to be called back when it is possible to send data.
The connection and conn_stream got a new "send_wait_list" entry, which
required to move a few struct members around to maintain an efficient
cache alignment (and actually this slightly improved performance).
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.
Most of the changes were made with spatch using this coccinelle script :
@rule_d1@
typedef chunk;
struct chunk chunk;
@@
- chunk.str
+ chunk.area
@rule_d2@
typedef chunk;
struct chunk chunk;
@@
- chunk.len
+ chunk.data
@rule_i1@
typedef chunk;
struct chunk *chunk;
@@
- chunk->str
+ chunk->area
@rule_i2@
typedef chunk;
struct chunk *chunk;
@@
- chunk->len
+ chunk->data
Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
For the same consistency reasons, let's use b_empty() at the few places
where an empty buffer is expected, or c_empty() if it's done on a channel.
Some of these places were there to realign the buffer so
{b,c}_realign_if_empty() was used instead.
The mux and transport rcv_buf() now takes a "flags" argument, just like
the snd_buf() one or like the equivalent syscall lower part. The upper
layers will use this to pass some information such as indicating whether
the buffer is free from outgoing data or if the lower layer may allocate
the buffer itself.
Just like we have a size_t for xprt->snd_buf(), we adjust to use size_t
for rcv_buf()'s count argument and return value. It also removes the
ambiguity related to the possibility to see a negative value there.
This way the senders don't need to modify the buffer's metadata anymore
nor to know about the output's split point. This way the functions can
take a const buffer and it's clearer who's in charge of updating the
buffer after a send. That's why the buffer realignment is now performed
by the caller of the transport's snd_buf() functions.
The return type was updated to return a size_t to comply with the count
argument.
Commit 200b0fa ("MEDIUM: Add support for updating TLS ticket keys via
socket") introduced support for updating TLS ticket keys from the CLI,
but missed a small corner case : if multiple bind lines reference the
same tls_keys file, the same reference is used (as expected), but during
the clean shutdown, it will lead to a double free when destroying the
bind_conf contexts since none of the lines knows if others still use
it. The impact is very low however, mostly a core and/or a message in
the system's log upon old process termination.
Let's introduce some basic refcounting to prevent this from happening,
so that only the last bind_conf frees it.
Thanks to Janusz Dziemidowicz and Thierry Fournier for both reporting
the same issue with an easy reproducer.
This fix needs to be backported from 1.6 to 1.8.
Bug from 96b7834e: pkinfo is stored on SSL_CTX ex_data and should
not be also stored on SSL ex_data without reservation.
Simply extract pkinfo from SSL_CTX in ssl_sock_get_pkey_algo.
No backport needed.
We never saw unexplicated crash with SSL, so I suppose that we are
luck, or the slot 0 is always reserved. Anyway the usage of the macro
SSL_get_app_data() and SSL_set_app_data() seem wrong. This patch change
the deprecated functions SSL_get_app_data() and SSL_set_app_data()
by the new functions SSL_get_ex_data() and SSL_set_ex_data(), and
it reserves the slot in the SSL memory space.
For information, this is the two declaration which seems wrong or
incomplete in the OpenSSL ssl.h file. We can see the usage of the
slot 0 whoch is hardcoded, but never reserved.
#define SSL_set_app_data(s,arg) (SSL_set_ex_data(s,0,(char *)arg))
#define SSL_get_app_data(s) (SSL_get_ex_data(s,0))
This patch must be backported at least in 1.8, maybe in other versions.
The cipher list capture struct is stored in the SSL memory space,
but the slot is reserved in the SSL_CTX memory space. This causes
ramdom crashes.
This patch should be backported to 1.8
Sets OpenSSL 1.1.1's SSL_OP_PRIORITIZE_CHACHA unconditionally, as per [1]:
When SSL_OP_CIPHER_SERVER_PREFERENCE is set, temporarily reprioritize
ChaCha20-Poly1305 ciphers to the top of the server cipher list if a
ChaCha20-Poly1305 cipher is at the top of the client cipher list. This
helps those clients (e.g. mobile) use ChaCha20-Poly1305 if that cipher
is anywhere in the server cipher list; but still allows other clients to
use AES and other ciphers. Requires SSL_OP_CIPHER_SERVER_PREFERENCE.
[1] https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_clear_options.html
Commit 821bb9b ("MAJOR: threads/ssl: Make SSL part thread-safe") added
insufficient locking to the cert lookup and generation code : it uses
lru64_lookup(), which will automatically remove and add a list element
to the LRU list. It cannot be simply read-locked.
A long-term improvement should consist in using a lockless mechanism
in lru64_lookup() to safely move the list element at the head. For now
let's simply use a write lock during the lookup. The effect will be
minimal since it's used only in conjunction with automatically generated
certificates, which are much more expensive and rarely used.
This fix must be backported to 1.8.
Previously these fetches would return empty results when HAProxy was
compiled
without the requisite SSL support. This results in confusion and problem
reports from people who unexpectedly encounter the behavior.
It is now possible to use a payload with the "set ssl ocsp-response"
command. These syntaxes will work the same way:
# echo "set ssl ocsp-response $(base64 -w 10000 ocsp.der)" | \
socat /tmp/sock1 -
# echo -e "set ssl ocsp-response <<\n$(base64 ocsp.der)\n" | \
socat /tmp/sock1 -
Signed-off-by: Aurlien Nephtali <aurelien.nephtali@corp.ovh.com>
In order to use arbitrary data in the CLI (multiple lines or group of words
that must be considered as a whole, for example), it is now possible to add a
payload to the commands. To do so, the first line needs to end with a special
pattern: <<\n. Everything that follows will be left untouched by the CLI parser
and will be passed to the commands parsers.
Per-command support will need to be added to take advantage of this
feature.
Signed-off-by: Aurlien Nephtali <aurelien.nephtali@corp.ovh.com>
When using the CLI_ST_PRINT_FREE state, always output something back
if the faulty function did not fill the 'err' variable.
The map/acl code could lead to a crash whereas the SSL code was silently
failing.
Signed-off-by: Aurlien Nephtali <aurelien.nephtali@corp.ovh.com>
OpenSSL can be built without NEXTPROTONEG support by passing
-no-npn to the configure script. This sets the
OPENSSL_NO_NEXTPROTONEG flag in opensslconf.h
Since NEXTPROTONEG is now considered deprecated, it is superseeded
by ALPN (Application Layer Protocol Next), HAProxy should allow
building withough NPN support.
Since 200b0fac ("MEDIUM: Add support for updating TLS ticket keys via
socket"), 4147b2ef ("MEDIUM: ssl: basic OCSP stapling support."),
4df59e9 ("MINOR: cli: add socket commands and config to prepend
informational messages with severity") and 654694e1 ("MEDIUM: stats/cli:
add support for "set table key" to enter values"), commands
'set ssl tls-key', 'set ssl ocsp-response', 'set severity-output' and
'set table' do not always send an extra LF at the end of their outputs.
This is required as mentioned in doc/management.txt:
"Since multiple commands may be issued at once, haproxy uses the empty
line as a delimiter to mark an end of output for each command"
Signed-off-by: Aurlien Nephtali <aurelien.nephtali@corp.ovh.com>
openssl/x509.h is included twice since commit fc0421fde ("MEDIUM: ssl:
add support for SNI and wildcard certificates").
Signed-off-by: Aurlien Nephtali <aurelien.nephtali@corp.ovh.com>
ssl_sock_get_pkey_algo can be used to report pkey algorithm to log
and ppv2 (RSA2048, EC256,...).
Extract pkey information is not free in ssl api (lock/alloc/free):
haproxy can use the pkey information computed in load_certificate.
Store and use this information in a SSL ex_data when available,
compute it if not (SSL multicert bundled and generated cert).
Private key information is used in switchctx to implement native multicert
selection (ecdsa/rsa/anonymous). This patch extract and store full pkey
information: dsa type and pkey size in bits. This can be used for switchctx
or to report pkey informations in ppv2 and log.
Returns true when the back connection was made over an SSL/TLS transport
layer and the newly created SSL session was resumed using a cached
session or a TLS ticket.
When SSL_read returns SSL_ERROR_SYSCALL and errno is unset or set to EAGAIN, the
connection must be shut down for reading. Else, the connection loops infinitly,
consuming all the CPU.
The bug was introduced in the commit 7e2e50500 ("BUG/MEDIUM: ssl: Don't always
treat SSL_ERROR_SYSCALL as unrecovarable."). This patch must be backported in
1.8 too.
A TLS ticket keys file can be updated on the CLI and used in same time. So we
need to protect it to be sure all accesses are thread-safe. Because updates are
infrequent, a R/W lock has been used.
This patch must be backported in 1.8
Bart Geesink reported some random errors appearing under the form of
termination flags SD in the logs for connections involving SSL traffic
to reach the servers.
Tomek Gacek and Mateusz Malek finally narrowed down the problem to commit
c2aae74 ("MEDIUM: ssl: Handle early data with OpenSSL 1.1.1"). It happens
that the special case of SSL_ERROR_SYSCALL isn't handled anymore since
this commit.
SSL_read() might return <= 0, and SSL_get_erro() return SSL_ERROR_SYSCALL,
without meaning the connection is gone. Before flagging the connection
as in error, check the errno value.
This should be backported to 1.8.
It may be useful to keep the CO_FL_EARLY_DATA flag, so that we know early
data were used, so instead of doing this, only add the Early-data header,
and have the sample fetch ssl_fc_has_early return 1, if CO_FL_EARLY_DATA is
set, and if the handshake isn't done yet.
Instead of looking for CO_FL_EARLY_DATA to know if we have to try to wake
up a stream, because it is waiting for a SSL handshake, instead add a new
conn_stream flag, CS_FL_WAIT_FOR_HS. This way we don't have to rely on
CO_FL_EARLY_DATA, and we will only wake streams that are actually waiting.
fd_insert() is currently called just after setting the owner and iocb,
but proceeding like this prevents the operation from being atomic and
requires a lock to protect the maxfd computation in another thread from
meeting an incompletely initialized FD and computing a wrong maxfd.
Fortunately for now all fdtab[].owner are set before calling fd_insert(),
and the first lock in fd_insert() enforces a memory barrier so the code
is safe.
This patch moves the initialization of the owner and iocb to fd_insert()
so that the function will be able to properly arrange its operations and
remain safe even when modified to become lockless. There's no other change
beyond the internal API.
Since the rework of the shctx with the hot list system, the ssl cache
was putting session inside the hot list, without removing them.
Once all block were used, they were all locked in the hot list, which
was forbiding to reuse them for new sessions.
Bug introduced by 4f45bb9 ("MEDIUM: shctx: separate ssl and shctx")
Thanks to Jeffrey J. Persch for reporting this bug.
Must be backported to 1.8.
The number of async fd is computed considering the maxconn, the number
of sides using ssl and the number of engines using async mode.
This patch should be backported on haproxy 1.8
Since the split of the shctx and the ssl cache, we lost the ability to
disable the cache with tune.ssl.cachesize 0.
Worst than that, when using this configuration, haproxy segfaults during
the configuration parsing.
Must be backported to 1.8.
The shctx_init() function does not check anymore if the pointer is not
NULL, this check must be done is the caller.
The consequence was to allocate one shctx per ssl bind.
Bug introduced by 4f45bb9 ("MEDIUM: shctx: separate ssl and shctx")
Thanks to Maciej Zdeb for reporting this bug.
Must be backported to 1.8.
During the migration to the second version of the pools, the new
functions and pool pointers were all called "pool_something2()" and
"pool2_something". Now there's no more pool v1 code and it's a real
pain to still have to deal with this. Let's clean this up now by
removing the "2" everywhere, and by renaming the pool heads
"pool_head_something".
BoringSSL early data differ from OpenSSL 1.1.1 implementation. When early
handshake is done, SSL_in_early_data report if SSL_read will be done on early
data. CO_FL_EARLY_SSL_HS and CO_FL_EARLY_DATA can be adjust accordingly.
It can happen that we want to read early data, write some, and then continue
reading them.
To do so, we can't reuse tmp_early_data to store the amount of data sent,
so introduce a new member.
If we read early data, then ssl_sock_to_buf() is now the only responsible
for getting back to the handshake, to make sure we don't miss any early data.