Commit Graph

927 Commits

Author SHA1 Message Date
Willy Tarreau
2b71810cb3 CLEANUP: lists/tree-wide: rename some list operations to avoid some confusion
The current "ADD" vs "ADDQ" is confusing because when thinking in terms
of appending at the end of a list, "ADD" naturally comes to mind, but
here it does the opposite, it inserts. Several times already it's been
incorrectly used where ADDQ was expected, the latest of which was a
fortunate accident explained in 6fa922562 ("CLEANUP: stream: explain
why we queue the stream at the head of the server list").

Let's use more explicit (but slightly longer) names now:

   LIST_ADD        ->       LIST_INSERT
   LIST_ADDQ       ->       LIST_APPEND
   LIST_ADDED      ->       LIST_INLIST
   LIST_DEL        ->       LIST_DELETE

The same is true for MT_LISTs, including their "TRY" variant.
LIST_DEL_INIT keeps its short name to encourage to use it instead of the
lazier LIST_DELETE which is often less safe.

The change is large (~674 non-comment entries) but is mechanical enough
to remain safe. No permutation was performed, so any out-of-tree code
can easily map older names to new ones.

The list doc was updated.
2021-04-21 09:20:17 +02:00
Willy Tarreau
ff88270ef9 MINOR: pool: move pool declarations to read_mostly
All pool heads are accessed via a pointer and should not be shared with
highly written variables. Move them to the read_mostly section.
2021-04-10 19:27:41 +02:00
Willy Tarreau
4781b1521a CLEANUP: atomic/tree-wide: replace single increments/decrements with inc/dec
This patch replaces roughly all occurrences of an HA_ATOMIC_ADD(&foo, 1)
or HA_ATOMIC_SUB(&foo, 1) with the equivalent HA_ATOMIC_INC(&foo) and
HA_ATOMIC_DEC(&foo) respectively. These are 507 changes over 45 files.
2021-04-07 18:18:37 +02:00
Willy Tarreau
1db427399c CLEANUP: atomic: add an explicit _FETCH variant for add/sub/and/or
Currently our atomic ops return a value but it's never known whether
the fetch is done before or after the operation, which causes some
confusion each time the value is desired. Let's create an explicit
variant of these operations suffixed with _FETCH to explicitly mention
that the fetch occurs after the operation, and make use of it at the
few call places.
2021-04-07 18:18:37 +02:00
Remi Tricot-Le Breton
8218aed90e BUG/MINOR: ssl: Fix update of default certificate
The default SSL_CTX used by a specific frontend is the one of the first
ckch instance created for this frontend. If this instance has SNIs, then
the SSL context is linked to the instance through the list of SNIs
contained in it. If the instance does not have any SNIs though, then the
SSL_CTX is only referenced by the bind_conf structure and the instance
itself has no link to it.
When trying to update a certificate used by the default instance through
a cli command, a new version of the default instance was rebuilt but the
default SSL context referenced in the bind_conf structure would not be
changed, resulting in a buggy behavior in which depending on the SNI
used by the client, he could either use the new version of the updated
certificate or the original one.

This patch adds a reference to the default SSL context in the default
ckch instances so that it can be hot swapped during a certificate
update.

This should fix GitHub issue #1143.

It can be backported as far as 2.2.
2021-03-26 13:06:29 +01:00
Remi Tricot-Le Breton
fb00f31af4 BUG/MINOR: ssl: Prevent disk access when using "add ssl crt-list"
If an unknown CA file was first mentioned in an "add ssl crt-list" CLI
command, it would result in a call to X509_STORE_load_locations which
performs a disk access which is forbidden during runtime. The same would
happen if a "ca-verify-file" or "crl-file" was specified. This was due
to the fact that the crt-list file parsing and the crt-list related CLI
commands parsing use the same functions.
The patch simply adds a new parameter to all the ssl_bind parsing
functions so that they know if the call is made during init or by the
CLI, and the ssl_store_load_locations function can then reject any new
cafile_entry creation coming from a CLI call.

It can be backported as far as 2.2.
2021-03-23 19:29:46 +01:00
Willy Tarreau
f208ac0616 CLEANUP: ssl: use pool_zalloc() in ssl_init_keylog()
This one used to alloc then zero the area, let's have the allocator do it.
2021-03-22 23:19:48 +01:00
Willy Tarreau
b454e908e5 MINOR: ssl: use pool_alloc(), not pool_alloc_dirty()
pool_alloc_dirty() is the version below pool_alloc() that never performs
the memory poisonning. It should only be called directly for very large
unstructured areas for which enabling memory poisonning would not bring
anything but could significantly hurt performance (e.g. buffers). Using
this function here will not provide any benefit and will hurt the ability
to debug.

It would be desirable to backport this, although it does not cause any
user-visible bug, it just complicates debugging.
2021-03-22 15:35:53 +01:00
Olivier Houchard
bc5ce9201a MEDIUM: connections: Implement a start() method in ssl_sock.
Add a start() method to ssl_sock. It is responsible with initiating the
SSL handshake, currently by just scheduling the tasklet, instead of doing
it in the init() method, when all the XPRT may not have been initialized.
2021-03-19 15:33:04 +01:00
Olivier Houchard
1b3c931bff MEDIUM: connections: Introduce a new XPRT method, start().
Introduce a new XPRT method, start(). The init() method will now only
initialize whatever is needed for the XPRT to run, but any action the XPRT
has to do before being ready, such as handshakes, will be done in the new
start() method. That way, we will be sure the full stack of xprt will be
initialized before attempting to do anything.
The init() call is also moved to conn_prepare(). There's no longer any reason
to wait for the ctrl to be ready, any action will be deferred until start(),
anyway. This means conn_xprt_init() is no longer needed.
2021-03-19 15:33:04 +01:00
Willy Tarreau
7416314145 CLEANUP: task: make sure tasklet handlers always indicate their statuses
When tasklets were derived from tasks, there was no immediate need for
the scheduler to know their status after execution, and in a spirit of
simplicity they just started to always return NULL. The problem is that
it simply prevents the scheduler from 1) accounting their execution time,
and 2) keeping track of their current execution status. Indeed, a remote
wake-up could very well end up manipulating a tasklet that's currently
being executed. And this is the reason why those handlers have to take
the idle lock before checking their context.

In 2.5 we'll take care of making tasklets and tasks work more similarly,
but trouble is to be expected if we continue to propagate the trend of
returning NULL everywhere, especially if some fixes relying on a stricter
model later need to be backported. For this reason this patch updates all
known tasklet handlers to make them return NULL only when the tasklet was
freed. It has no effect for now and isn't even guaranteed to always be
100% safe but it puts the code into the right direction for this.
2021-03-13 11:30:19 +01:00
Willy Tarreau
4c48edba4f BUG/MEDIUM: ssl: properly remove the TASK_HEAVY flag at end of handshake
Emeric found that SSL+keepalive traffic had dropped quite a bit in the
recent changes, which could be bisected to recent commit 9205ab31d
("MINOR: ssl: mark the SSL handshake tasklet as heavy"). Indeed, a
first incarnation of this commit made use of the TASK_SELF_WAKING
flag but the last version directly used TASK_HEAVY, but it would still
continue to remove the already absent TASK_SELF_WAKING one instead of
TASK_HEAVY. As such, the SSL traffic remained processed with low
granularity.

No backport is needed as this is only 2.4.
2021-03-09 17:58:02 +01:00
Willy Tarreau
430bf4a483 MINOR: server: allocate a per-thread struct for the per-thread connections stuff
There are multiple per-thread lists in the listeners, which isn't the
most efficient in terms of cache, and doesn't easily allow to store all
the per-thread stuff.

Now we introduce an srv_per_thread structure which the servers will have an
array of, and place the idle/safe/avail conns tree heads into. Overall this
was a fairly mechanical change, and the array is now always initialized for
all servers since we'll put more stuff there. It's worth noting that the Lua
code still has to deal with its own deinit by itself despite being in a
global list, because its server is not dynamically allocated.
2021-03-05 15:00:24 +01:00
Willy Tarreau
4149168255 MEDIUM: ssl: implement xprt_set_used and xprt_set_idle to relax context checks
Currently the SSL layer checks the validity of its tasklet's context just
in case it would have been stolen, had the connection been idle. Now it
will be able to be notified by the mux when this situation happens so as
not to have to grab the idle connection lock on each pass. This reuses the
TASK_F_USR1 flag just as the muxes do.
2021-03-05 08:30:08 +01:00
Willy Tarreau
144f84a09d MEDIUM: task: extend the state field to 32 bits
It's been too short for quite a while now and is now full. It's still
time to extend it to 32-bits since we have room for this without
wasting any space, so we now gained 16 new bits for future flags.

The values were not reassigned just in case there would be a few
hidden u16 or short somewhere in which these flags are placed (as
it used to be the case with stream->pending_events).

The patch is tagged MEDIUM because this required to update the task's
process() prototype to use an int instead of a short, that's quite a
bunch of places.
2021-03-05 08:30:08 +01:00
Willy Tarreau
566cebc1fc BUG/MINOR: ssl: don't truncate the file descriptor to 16 bits in debug mode
Errors reported by ssl_sock_dump_errors() to stderr would only report the
16 lower bits of the file descriptor because it used to be casted to ushort.
This can be backported to all versions but has really no importance in
practice since this is never seen.
2021-03-05 08:30:08 +01:00
Willy Tarreau
3bda3f422e CLEANUP: ssl: use realloc() instead of free()+malloc()
There was a free(ptr) followed by ptr=malloc(ptr, len), which is the
equivalent of ptr = realloc(ptr, len) but slower and less clean. Let's
replace this.
2021-02-26 21:27:33 +01:00
Willy Tarreau
e709e82173 CLEANUP: ssl: make ssl_sock_free_srv_ctx() zero the pointers after free
In ssl_sock_free_srv_ctx() there are some calls to free() which are not
followed by a zeroing of the pointers. For now this function is only used
during deinit but it could be used at run time in the near future, so
better secure this.
2021-02-26 21:23:06 +01:00
Willy Tarreau
01acf563a7 CLEANUP: ssl: remove a useless "if" before freeing an error message
Just an old "if (err) free(err)" that managed to escape cleanups.
2021-02-26 21:22:20 +01:00
Willy Tarreau
61cfdf4fd8 CLEANUP: tree-wide: replace free(x);x=NULL with ha_free(&x)
This makes the code more readable and less prone to copy-paste errors.
In addition, it allows to place some __builtin_constant_p() predicates
to trigger a link-time error in case the compiler knows that the freed
area is constant. It will also produce compile-time error if trying to
free something that is not a regular pointer (e.g. a function).

The DEBUG_MEM_STATS macro now also defines an instance for ha_free()
so that all these calls can be checked.

178 occurrences were converted. The vast majority of them were handled
by the following Coccinelle script, some slightly refined to better deal
with "&*x" or with long lines:

  @ rule @
  expression E;
  @@
  - free(E);
  - E = NULL;
  + ha_free(&E);

It was verified that the resulting code is the same, more or less a
handful of cases where the compiler optimized slightly differently
the temporary variable that holds the copy of the pointer.

A non-negligible amount of {free(str);str=NULL;str_len=0;} are still
present in the config part (mostly header names in proxies). These
ones should also be cleaned for the same reasons, and probably be
turned into ist strings.
2021-02-26 21:21:09 +01:00
Willy Tarreau
9205ab31d2 MINOR: ssl: mark the SSL handshake tasklet as heavy
There's a fairness issue between SSL and clear text. A full end-to-end
cleartext connection can require up to ~7.7 wakeups on average, plus 3.3
for the SSL tasklet, one of which is particularly expensive. So if we
accept to process many handshakes taking 1ms each, we significantly
increase the processing time of regular tasks just by adding an extra
delay between their calls. Ideally in order to be fair we should have a
1:18 call ratio, but this requires a bit more accounting. With very little
effort we can mark the SSL handshake tasklet as TASK_HEAVY until the
handshake completes, and remove it once done.

Doing so reduces from 14 to 3.0 ms the total response time experienced
by HTTP clients running in parallel to 1000 SSL clients doing full
handshakes in loops. Better, when tune.sched.low-latency is set to "on",
the latency further drops to 1.8 ms.

The tasks latency distribution explain pretty well what is happening:

Without the patch:
  $ socat - /tmp/sock1 <<< "show profiling"
  Per-task CPU profiling              : on      # set profiling tasks {on|auto|off}
  Tasks activity:
    function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
    ssl_sock_io_cb              2785375   19.35m    416.9us   5.401h    6.980ms
    h1_io_cb                    1868949   9.853s    5.271us   4.829h    9.302ms
    process_stream              1864066   7.582s    4.067us   2.058h    3.974ms
    si_cs_io_cb                 1733808   1.932s    1.114us   26.83m    928.5us
    h1_timeout_task              935760      -         -      1.033h    3.975ms
    accept_queue_process         303606   4.627s    15.24us   16.65m    3.291ms
    srv_cleanup_toremove_connections452   64.31ms   142.3us   2.447s    5.415ms
    task_run_applet                  47   5.149ms   109.6us   57.09ms   1.215ms
    srv_cleanup_idle_connections     34   2.210ms   65.00us   87.49ms   2.573ms

With the patch:
  $ socat - /tmp/sock1 <<< "show profiling"
  Per-task CPU profiling              : on      # set profiling tasks {on|auto|off}
  Tasks activity:
    function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
    ssl_sock_io_cb              3000365   21.08m    421.6us   20.30h    24.36ms
    h1_io_cb                    2031932   9.278s    4.565us   46.70m    1.379ms
    process_stream              2010682   7.391s    3.675us   22.83m    681.2us
    si_cs_io_cb                 1702070   1.571s    922.0ns   8.732m    307.8us
    h1_timeout_task             1009594      -         -      17.63m    1.048ms
    accept_queue_process         339595   4.792s    14.11us   3.714m    656.2us
    srv_cleanup_toremove_connections779   75.42ms   96.81us   438.3ms   562.6us
    srv_cleanup_idle_connections     48   2.498ms   52.05us   178.1us   3.709us
    task_run_applet                  17   1.738ms   102.3us   11.29ms   663.9us
    other                             1   947.8us   947.8us   202.6us   202.6us

  => h1_io_cb() and process_stream() are divided by 6 while ssl_sock_io_cb() is
     multipled by 4

And with low-latency on:
  $ socat - /tmp/sock1 <<< "show profiling"
  Per-task CPU profiling              : on      # set profiling tasks {on|auto|off}
  Tasks activity:
    function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
    ssl_sock_io_cb              3000565   20.96m    419.1us   20.74h    24.89ms
    h1_io_cb                    2019702   9.294s    4.601us   49.22m    1.462ms
    process_stream              2009755   6.570s    3.269us   1.493m    44.57us
    si_cs_io_cb                 1997820   1.566s    783.0ns   2.985m    89.66us
    h1_timeout_task             1009742      -         -      1.647m    97.86us
    accept_queue_process         494509   4.697s    9.498us   1.240m    150.4us
    srv_cleanup_toremove_connections1120   92.32ms   82.43us   463.0ms   413.4us
    srv_cleanup_idle_connections     70   2.703ms   38.61us   204.5us   2.921us
    task_run_applet                  13   1.303ms   100.3us   85.12us   6.548us

  => process_stream() is divided by 100 while ssl_sock_io_cb() is
     multipled by 4

Interestingly, the total HTTPS response time doesn't increase and even very
slightly decreases, with an overall ~1% higher request rate. The net effect
here is a redistribution of the CPU resources between internal tasks, and
in the case of SSL, handshakes wait bit more but everything after completes
faster.

This was made simple enough to be backportable if it helps some users
suffering from high latencies in mixed traffic.
2021-02-26 00:26:03 +01:00
Amaury Denoyelle
8990b010a0 MINOR: connection: allocate dynamically hash node for backend conns
Remove ebmb_node entry from struct connection and create a dedicated
struct conn_hash_node. struct connection contains now only a pointer to
a conn_hash_node, allocated only for connections where target is of type
OBJ_TYPE_SERVER. This will reduce memory footprints for every
connections that does not need http-reuse such as frontend connections.
2021-02-19 16:59:18 +01:00
Amaury Denoyelle
f232cb3e9b MEDIUM: connection: replace idle conn lists by eb trees
The server idle/safe/available connection lists are replaced with ebmb-
trees. This is used to store backend connections, with the new field
connection hash as the key. The hash is a 8-bytes size field, used to
reflect specific connection parameters.

This is a preliminary work to be able to reuse connection with SNI,
explicit src/dst address or PROXY protocol.
2021-02-12 12:33:05 +01:00
Amaury Denoyelle
5c7086f6b0 MEDIUM: connection: protect idle conn lists with locks
This is a preparation work for connection reuse with sni/proxy
protocol/specific src-dst addresses.

Protect every access to idle conn lists with a lock. This is currently
strictly not needed because the access to the list are made with atomic
operations. However, to be able to reuse connection with specific
parameters, the list storage will be converted to eb-trees. As this
structure does not have atomic operation, it is mandatory to protect it
with a lock.

For this, the takeover lock is reused. Its role was to protect during
connection takeover. As it is now extended to general idle conns usage,
it is renamed to idle_conns_lock. A new lock section is also
instantiated named IDLE_CONNS_LOCK to isolate its impact on performance.
2021-02-12 12:33:04 +01:00
William Lallemand
3ce6eedb37 MEDIUM: ssl: add a rwlock for SSL server session cache
When adding the server side support for certificate update over the CLI
we encountered a design problem with the SSL session cache which was not
locked.

Indeed, once a certificate is updated we need to flush the cache, but we
also need to ensure that the cache is not used during the update.
To prevent the use of the cache during an update, this patch introduce a
rwlock for the SSL server session cache.

In the SSL session part this patch only lock in read, even if it writes.
The reason behind this, is that in the session part, there is one cache
storage per thread so it is not a problem to write in the cache from
several threads. The problem is only when trying to write in the cache
from the CLI (which could be on any thread) when a session is trying to
access the cache. So there is a write lock in the CLI part to prevent
simultaneous access by a session and the CLI.

This patch also remove the thread_isolate attempt which is eating too
much CPU time and was not protecting from the use of a free ptr in the
session.
2021-02-09 09:43:44 +01:00
Ilya Shipitsin
7ff7747a17 BUILD: ssl: guard SSL_CTX_set_msg_callback with SSL_CTRL_SET_MSG_CALLBACK macro
both SSL_CTX_set_msg_callback and SSL_CTRL_SET_MSG_CALLBACK defined since
ea262260469e49149cb10b25a87dfd6ad3fbb4ba, we can safely switch to that guard
instead of OpenSSL version
2021-02-08 13:49:41 +01:00
Ilya Shipitsin
f00cdb1856 BUILD: ssl: guard SSL_CTX_add_server_custom_ext with special macro
special guard macros HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT was defined earlier
exactly for guarding SSL_CTX_add_server_custom_ext, let us use it wherever
appropriate
2021-02-08 00:11:43 +01:00
Ilya Shipitsin
7bbf5866e0 BUILD: ssl: fix typo in HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT macro
HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT was introduced in ec60909871
however it was defined as HAVE_SL_CTX_ADD_SERVER_CUSTOM_EXT (missing "S")
let us fix typo
2021-02-08 00:11:41 +01:00
Willy Tarreau
a84986ae4f BUG/MINOR: ssl: do not try to use early data if not configured
The CO_FL_EARLY_SSL_HS flag was inconditionally set on the connection,
resulting in SSL_read_early_data() always being used first in handshake
calculations. While this seems to work well (probably that there are
fallback paths inside openssl), it's particularly confusing and makes
the debugging quite complicated. It possibly is not optimal by the way.

This flag ought to be set only when early_data is configured on the bind
line. Apparently there used to be a good reason for doing it this way in
1.8 times, but it really does not make sense anymore. It may be OK to
backport this to 2.3 if this helps with troubleshooting, but better not
go too far as it's unlikely to fix any real issue while it could introduce
some in old versions.
2021-02-05 08:04:02 +01:00
Willy Tarreau
0630038e77 BUG/MEDIUM: ssl: check a connection's status before computing a handshake
As spotted in issue #822, we're having a problem with error detection in
the SSL layer. The problem is that on an overwhelmed machine, accepted
connections can start to pile up, each of them requiring a slow handshake,
and during all this time if the client aborts, the handshake will still be
calculated.

The error controls are properly placed, it's just that the SSL layer
reads records exactly of the advertised size, without having the ability
to encounter a pending connection error. As such if injecting many TLS
connections to a listener with a huge backlog, it's fairly possible to
meet this situation:

  12:50:48.236056 accept4(8, {sa_family=AF_INET, sin_port=htons(62794), sin_addr=inet_addr("127.0.0.1")}, [128->16], SOCK_NONBLOCK) = 1109
  12:50:48.236071 setsockopt(1109, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  (process other connections' handshakes)

  12:50:48.257270 getsockopt(1109, SOL_SOCKET, SO_ERROR, [ECONNRESET], [4]) = 0
  (proof that error was detectable there but this code was added for the PoC)

  12:50:48.257297 recvfrom(1109, "\26\3\1\2\0", 5, 0, NULL, NULL) = 5
  12:50:48.257310 recvfrom(1109, "\1\0\1\3"..., 512, 0, NULL, NULL) = 512

  (handshake calculation taking 700us)

  12:50:48.258004 sendto(1109, "\26\3\3\0z"..., 1421, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = -1 EPIPE (Broken pipe)
  12:50:48.258036 close(1109)             = 0

The situation was amplified by the multi-queue accept code, as it resulted
in many incoming connections to be accepted long before they could be
handled. Prior to this they would have been accepted and the handshake
immediately started, which would have resulted in most of the connections
waiting in the the system's accept queue, and dying there when the client
aborted, thus the error would have been detected before even trying to
pass them to the handshake code.

As a result, with a listener running on a very large backlog, it's possible
to quickly accept tens of thousands of connections and waste time slowly
running their handshakes while they get replaced by other ones.

This patch adds an SO_ERROR check on the connection's FD before starting
the handshake. This is not pretty as it requires to access the FD, but it
does the job.

Some improvements should be made over the long term so that the transport
layers can report extra information with their ->rcv_buf() call, or at the
very least, implement a ->get_conn_status() function to report various
flags such as shutr, shutw, error at various stages, allowing an upper
layer to inquire for the relevance of engaging into a long operation if
it's known the connection is not usable anymore. An even simpler step
could probably consist in implementing this in the control layer.

This patch is simple enough to be backported as far as 2.0.

Many thanks to @ngaugler for his numerous tests with detailed feedback.
2021-02-02 15:55:53 +01:00
William Lallemand
b8868498ed CLEANUP: ssl: remove dead code in ckch_inst_new_load_srv_store()
The new ckch_inst_new_load_srv_store() function which mimics the
ckch_inst_new_load_store() function includes some dead code which was
used only in the former function.

Fix issue #1081.
2021-01-27 14:44:59 +01:00
William Lallemand
db26e2b00e CLEANUP: ssl: make load_srv_{ckchs,cert} match their bind counterpart
This patch makes things more consistent between the bind_conf functions
and the server ones:

- ssl_sock_load_srv_ckchs() loads the SSL_CTX in the server
  (ssl_sock_load_ckchs() load the SNIs in the bind_conf)

- add the server parameter to ssl_sock_load_srv_ckchs()

- changes made to the ckch_inst are done in
  ckch_inst_new_load_srv_store()
2021-01-26 15:19:36 +01:00
William Lallemand
795bd9ba3a CLEANUP: ssl: remove SSL_CTX function parameter
Since the server SSL_CTX is now stored in the ckch_inst, it is not
needed anymore to pass an SSL_CTX to ckch_inst_new_load_srv_store() and
ssl_sock_load_srv_ckchs().
2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton
f3eedfe195 MEDIUM: ssl: Enable backend certificate hot update
When trying to update a backend certificate, we should find a
server-side ckch instance thanks to which we can rebuild a new ssl
context and a new ckch instance that replace the previous ones in the
server structure. This way any new ssl session will be built out of the
new ssl context and the newly updated certificate.

This resolves a subpart of GitHub issue #427 (the certificate part)
2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton
d817dc733e MEDIUM: ssl: Load client certificates in a ckch for backend servers
In order for the backend server's certificate to be hot-updatable, it
needs to fit into the implementation used for the "bind" certificates.
This patch follows the architecture implemented for the frontend
implementation and reuses its structures and general function calls
(adapted for the server side).
The ckch store logic is kept and a dedicated ckch instance is used (one
per server). The whole sni_ctx logic was not kept though because it is
not needed.
All the new functions added in this patch are basically server-side
copies of functions that already exist on the frontend side with all the
sni and bind_cond references removed.
The ckch_inst structure has a new 'is_server_instance' flag which is
used to distinguish regular instances from the server-side ones, and a
new pointer to the server's structure in case of backend instance.
Since the new server ckch instances are linked to a standard ckch_store,
a lookup in the ckch store table will succeed so the cli code used to
update bind certificates needs to be covered to manage those new server
side ckch instances.
2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton
ec805a32b9 MINOR: ssl: Certificate chain loading refactorization
Move the certificate chain loading code into a dedicated function that
will then be useable elsewhere.
2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton
442b7f2238 MINOR: ssl: Server ssl context prepare function refactoring
Split the server's ssl context initialization into the general ssl
related initializations and the actual initialization of a single
SSL_CTX structure. This way the context's initialization will be
usable by itself from elsewhere.
2021-01-26 15:19:36 +01:00
Ilya Shipitsin
1fc44d494a BUILD: ssl: guard Client Hello callbacks with HAVE_SSL_CLIENT_HELLO_CB macro instead of openssl version
let us introduce new macro HAVE_SSL_CLIENT_HELLO_CB and guard
callback functions with it
2021-01-22 20:45:24 +01:00
Willy Tarreau
4bd5d630ac MINOR: ssl/show_fd: report some FDs as suspicious when possible
If a subscriber's tasklet was called more than one million times, if
the ssl_ctx's connection doesn't match the current one, or if the
connection appears closed in one direction while the SSL stack is
still subscribed, the FD is reported as suspicious. The close cases
may occasionally trigger a false positive during very short and rare
windows. Similarly the 1M calls will trigger after 16GB are transferred
over a given connection. These are rare enough events to be reported as
suspicious.
2021-01-21 09:09:05 +01:00
Willy Tarreau
8050efeacb MINOR: cli: give the show_fd helpers the ability to report a suspicious entry
Now the show_fd helpers at the transport and mux levels return an integer
which indicates whether or not the inspected entry looks suspicious. When
an entry is reported as suspicious, "show fd" will suffix it with an
exclamation mark ('!') in the dump, that is supposed to help detecting
them.

For now, helpers were adjusted to adapt to the new API but none of them
reports any suspicious entry yet.
2021-01-21 08:58:15 +01:00
Willy Tarreau
691d503896 MINOR: xprt/mux: export all *_io_cb functions so that "show fd" resolves them
In FD dumps it's often very important to figure what upper layer function
is going to be called. Let's export the few I/O callbacks that appear as
tasklet functions so that "show fd" can resolve them instead of printing
a pointer relative to main. For example:

   1028 : st=0x21(R:rA W:Ra) ev=0x01(heopI) [lc] tmask=0x2 umask=0x2 owner=0x7f00b889b200 iocb=0x65b638(sock_conn_iocb) back=0 cflg=0x00001300 fe=recv mux=H2 ctx=0x7f00c8824de0 h2c.st0=FRH .err=0 .maxid=795 .lastid=-1 .flg=0x0000 .nbst=0 .nbcs=0 .fctl_cnt=0 .send_cnt=0 .tree_cnt=0 .orph_cnt=0 .sub=1 .dsi=795 .dbuf=0@(nil)+0/0 .msi=-1 .mbuf=[1..1|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] xprt=SSL xprt_ctx=0x7f00c86d0750 xctx.st=0 .xprt=RAW .wait.ev=1 .subs=0x7f00c88252e0(ev=1 tl=0x7f00a07d1aa0 tl.calls=1047 tl.ctx=0x7f00c8824de0 tl.fct=h2_io_cb) .sent_early=0 .early_in=0
2021-01-20 17:17:39 +01:00
Willy Tarreau
de5675a38c MINOR: ssl: provide a "show fd" helper to report important SSL information
The SSL context contains a lot of important details that are currently
missing from debug outputs. Now that we detect ssl_sock, we can perform
some sanity checks, print the next xprt, the subscriber callback's context,
handler and number of calls. The process function is also resolved. This
now gives for example on an H2 connection:

   1029 : st=0x21(R:rA W:Ra) ev=0x01(heopI) [lc] tmask=0x2 umask=0x2 owner=0x7fc714881700 iocb=0x65b528(sock_conn_iocb) back=0 cflg=0x00001300 fe=recv mux=H2 ctx=0x7fc734545e50 h2c.st0=FRH .err=0 .maxid=217 .lastid=-1 .flg=0x0000 .nbst=0 .nbcs=0 .fctl_cnt=0 .send_cnt=0 .tree_cnt=0 .orph_cnt=0 .sub=1 .dsi=217 .dbuf=0@(nil)+0/0 .msi=-1 .mbuf=[1..1|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] xprt=SSL xprt_ctx=0x7fc73478f230 xctx.st=0 .xprt=RAW .wait.ev=1 .subs=0x7fc734546350(ev=1 tl=0x7fc7346702e0 tl.calls=278 tl.ctx=0x7fc734545e50 tl.fct=main-0x144efa) .sent_early=0 .early_in=0
2021-01-20 17:17:39 +01:00
Ilya Shipitsin
761d64c7ae BUILD: ssl: guard openssl specific with SSL_READ_EARLY_DATA_SUCCESS
let us switch to SSL_READ_EARLY_DATA_SUCCESS instead of openssl versions
2021-01-07 10:20:04 +01:00
Ilya Shipitsin
ec36c91c69 BUILD: ssl: guard EVP_PKEY_get_default_digest_nid with ASN1_PKEY_CTRL_DEFAULT_MD_NID
let us switch to openssl specific macro instead of versions
2021-01-07 10:20:00 +01:00
Ilya Shipitsin
1e9a66603f CLEANUP: assorted typo fixes in the code and comments
This is 14th iteration of typo fixes
2021-01-06 16:26:50 +01:00
Willy Tarreau
b6fc524f05 MINOR: ssl: make tlskeys_list_get_next() take a list element
As reported in issue #1010, gcc-11 as of 2021-01-05 is overzealous in
its -Warray-bounds check as it considers that a cast of a global struct
accesses the entire struct even if only one specific element is accessed.
This instantly breaks all lists making use of container_of() to build
their iterators as soon as the starting point is known if the next
element is retrieved from the list head in a way that is visible to the
compiler's optimizer, because it decides that accessing the list's next
element dereferences the list as a larger struct (which it does not).

The temporary workaround consisted in disabling -Warray-bounds, but this
warning is traditionally quite effective at spotting real bugs, and we
actually have is a single occurrence of this issue in the whole code.

By changing the tlskeys_list_get_next() function to take a list element
as the starting point instead of the current element, we can avoid
the starting point issue but this requires to change all call places
to write hideous casts made of &((struct blah*)ref)->list. At the
moment we only have two such call places, the first one being used to
initialize the list (which is the one causing the warning) and which
is thus easy to simplify, and the second one for which we already have
an aliased pointer to the reference that is still valid at the call
place, and given the original pointer also remained unchanged, we can
safely use this alias, and this is safer than leaving a cast there.

Let's make this change now while it's still easy.

The generated code only changed in function cli_io_handler_tlskeys_files()
due to register allocation and the change of variable scope between the
old one and the new one.
2021-01-05 11:15:45 +01:00
Tim Duesterhus
cb8b281c02 CLEANUP: ssl: Remove useless local variable in tlskeys_list_get_next()
`getnext` was only used to fill `ref` at the beginning of the function. Both
have the same type. Replace the parameter name by `ref` to remove the useless
local variable.
2021-01-05 10:25:20 +01:00
Tim Duesterhus
2c7bb33144 CLEANUP: ssl: Remove useless loop in tlskeys_list_get_next()
This loop was always exited in the first iteration by `return`.
2021-01-05 10:24:36 +01:00
Tim Duesterhus
e5ff14100a CLEANUP: Compare the return value of XXXcmp() functions with zero
According to coding-style.txt it is recommended to use:

`strcmp(a, b) == 0` instead of `!strcmp(a, b)`

So let's do this.

The change was performed by running the following (very long) coccinelle patch
on src/:

    @@
    statement S;
    expression E;
    expression F;
    @@

      if (
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
      )
    (
      S
    |
      { ... }
    )

    @@
    statement S;
    expression E;
    expression F;
    @@

      if (
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
      )
    (
      S
    |
      { ... }
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    G &&
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    G ||
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
    && G
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
    || G
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    G &&
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    G ||
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    && G
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    || G
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    )
2021-01-04 10:09:02 +01:00
Frédéric Lécaille
e9473c7833 MINOR: ssl: QUIC transport parameters parsing.
This patch modifies the TLS ClientHello message callback so that to parse the QUIC
client transport parameters.
2020-12-23 11:57:26 +01:00