12588 Commits

Author SHA1 Message Date
Willy Tarreau
7893ae117f MEDIUM: resolvers: replace the answer_list with a (flat) tree
With SRV records, a huge amount of time is spent looking for records
by walking long lists. It is possible to reduce this by indexing values
in trees instead. However the whole code relies a lot on the list
ordering, and even implements some round-robin on it to distribute IP
addresses to servers.

This patch starts carefully by replacing the list with an eb32 tree
that is still used like a list, with a constant key 0. Since ebtrees
preserve insertion order for duplicates, the tree walk visits the nodes
in the exact same order it did with the lists. This makes it possible to
implement the required infrastructure without changing the behavior.
2021-10-21 08:02:08 +02:00
Willy Tarreau
a89c19127d BUG/MEDIUM: checks: fix the starting thread for external checks
When cleaning up the code to remove most explicit task masks in commit
beeabf531 ("MINOR: task: provide 3 task_new_* wrappers to simplify the
API"), a mistake was made with the external checks where the call does
task_new_on(1) instead of task_new_on(0) due to the confusion with the
previous mask 1.

No backport is needed as that's only 2.5-dev.
2021-10-20 18:43:30 +02:00
Willy Tarreau
6878f80427 MEDIUM: resolvers: remove the last occurrences of the "safe" argument
This one was used to indicate whether the callee had to follow particularly
safe code path when removing resolutions. Since the code now uses a kill
list, this is not needed anymore.
2021-10-20 17:54:27 +02:00
Willy Tarreau
f766ec6b53 MEDIUM: resolvers: use a kill list to preserve the list consistency
When scanning resolution.curr it's possible to try to free some
resolutions which will themselves result in freeing other ones. If
one of these other ones is exactly the next one in the list, the list
walk visits deleted nodes and causes memory corruption, double-frees
and so on. The approach taken using the "safe" argument to some
functions seems to work but it's extremely brittle as it is required
to carefully check all call paths from process_resolvers() and pass
the argument to 1 there to refrain from deleting entries, so the bug
is very likely to come back after some tiny changes to this code.

A variant was tried, checking at various places that the current task
corresponds to process_resolvers() but this is also quite brittle even
though a bit less.

This patch uses another approach which consists in carefully unlinking
elements from the list and deferring their removal by placing them in a
kill list instead of deleting them synchronously. The real benefit here
is that the complexity only has to be placed where the complications
are.

A thread-local list is fed with elements to be deleted before scanning
the resolutions, and it's flushed at the end by picking the first one
until the list is empty. This way we never dereference the next element
and do not care about its presence or not in the list. One function,
resolv_unlink_resolution(), is exported and used outside, so it had to
be modified to use this list as well. Internal code has to use
_resolv_unlink_resolution() instead.
2021-10-20 17:54:22 +02:00
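
The deferred-removal idea described in the commit above can be sketched as follows. This is only an illustration under stated assumptions, not HAProxy's code: struct resolution, resolution_kill(), kill_list_flush() and purge_expired() are hypothetical names, the kill list is a plain local variable rather than a thread-local one, and the scan simply restarts from the head after each unlink so that it never depends on a saved "next" pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Circular doubly-linked list in the spirit of HAProxy's struct list */
struct list {
	struct list *n; /* next */
	struct list *p; /* previous */
};

/* Hypothetical resolution; "list" stays first so that a struct list
 * pointer can be cast back to the enclosing structure.
 */
struct resolution {
	struct list list;
	int id;
	int expired;
};

static void list_init(struct list *l)
{
	l->n = l->p = l;
}

static void list_append(struct list *head, struct list *el)
{
	el->p = head->p;
	el->n = head;
	head->p->n = el;
	head->p = el;
}

static void list_del_init(struct list *el)
{
	el->n->p = el->p;
	el->p->n = el->n;
	list_init(el);
}

/* Deferred removal: unlink from the active list and queue for later */
static void resolution_kill(struct resolution *res, struct list *kill_list)
{
	list_del_init(&res->list);
	list_append(kill_list, &res->list);
}

/* Free queued elements by always picking the first one, so the "next"
 * pointer of a freed element is never read. Returns the number freed.
 */
static int kill_list_flush(struct list *kill_list)
{
	int freed = 0;

	while (kill_list->n != kill_list) {
		struct resolution *res = (struct resolution *)kill_list->n;

		list_del_init(&res->list);
		free(res);
		freed++;
	}
	return freed;
}

/* Scan the active list; nothing is freed until the scan is over */
static int purge_expired(struct list *curr)
{
	struct list kill_list;
	struct list *it;

	assert(curr != NULL && curr->n != NULL);
	list_init(&kill_list);
	it = curr->n;
	while (it != curr) {
		struct resolution *res = (struct resolution *)it;

		if (res->expired) {
			resolution_kill(res, &kill_list);
			it = curr->n; /* restart: other entries may have moved too */
			continue;
		}
		it = it->n;
	}
	return kill_list_flush(&kill_list);
}
```

Restarting from the head is quadratic in the worst case; it is used here only because it stays correct even if an unlink were to move several entries at once.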
Willy Tarreau
aae7320b0d CLEANUP: resolvers: replace all LIST_DELETE with LIST_DEL_INIT
The code as it is uses crossed lists between many elements, and at
many places the code relies on list iterators or emptiness checks,
which does not work with only LIST_DELETE. Further, it is quite
difficult to place debugging code and checks in the current situation,
and gdb is helpless.

This code replaces all LIST_DELETE calls with LIST_DEL_INIT so that
it becomes possible to trust the lists.
2021-10-20 17:54:14 +02:00
Willy Tarreau
239675e4a9 CLEANUP: resolvers: simplify resolv_link_resolution() regarding requesters
This function allocates requesters by hand for each and every type. This
is complex and error-prone, and it doesn't even initialize the list part,
leaving dangling pointers that complicate debugging.

This patch introduces a new function resolv_get_requester() that either
returns the current pointer if valid or tries to allocate a new one and
links it to its destination. Then it makes use of it in the function
above to clean it up quite a bit. This makes it possible to remove
complicated but unneeded tests.
2021-10-20 17:54:01 +02:00
Willy Tarreau
48664c048d CLEANUP: always initialize the answer_list
Similar to the previous patch, the answer's list was only initialized the
first time it was added to a list, causing bogus outdated pointers to
appear when debugging code is added around it to watch it. Let's make
sure it's always initialized upon allocation.
2021-10-20 17:53:54 +02:00
Willy Tarreau
25e010906a BUG/MEDIUM: resolvers: always check a valid item in query_list
The query_list is physically stored in the struct resolution itself,
so we have a list that contains a list to items stored in itself (and
there is a single item). But the list is first initialized in
resolv_validate_dns_response(), while it's scanned in
resolv_process_responses() later after calling the former. First,
this results in crashes as soon as the code is instrumented a little
bit for debugging, as elements from a previous incarnation can appear.

But in addition to this, the presence of an element is checked by
verifying that the return of LIST_NEXT() is not NULL, while it may
never be NULL even for an empty list, resulting in bugs or crashes
if the number of responses does not match the list's contents. This
is easily triggered by testing for the list non-emptiness outside of
the function.

Let's make sure the list is always correct, i.e. it's initialized to
an empty list when the structure is allocated, elements are checked by
first verifying the list is not empty, they are deleted once checked,
and in any case at end so that there are no dangling pointers.

This should be backported, but only as long as the patch fits without
modifications, as adaptations can be risky there given that bugs tend
to hide each other.
2021-10-20 17:53:35 +02:00
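
The NULL-check pitfall described in the commit above can be shown with a small sketch. The list helpers, struct query and first_query() are hypothetical names used for illustration; the point is only that on a circular list the successor of an empty head is the head itself, so emptiness has to be tested explicitly.

```c
#include <assert.h>
#include <stddef.h>

struct list {
	struct list *n; /* next */
	struct list *p; /* previous */
};

/* Hypothetical query item; "list" is first so the cast below is valid */
struct query {
	struct list list;
	int type;
};

static void list_init(struct list *l)
{
	l->n = l->p = l;
}

static void list_append(struct list *head, struct list *el)
{
	el->p = head->p;
	el->n = head;
	head->p->n = el;
	head->p = el;
}

/* Raw successor: for an empty circular list this is the head itself,
 * never NULL, so comparing it against NULL proves nothing.
 */
static struct list *list_next(struct list *l)
{
	return l->n;
}

/* Returns the first query, or NULL when the list is empty */
static struct query *first_query(struct list *head)
{
	assert(head != NULL);
	if (head->n == head)
		return NULL;
	return (struct query *)head->n;
}
```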
Willy Tarreau
10c1a8c3bd BUILD: resolvers: avoid a possible warning on null-deref
Depending on the code that precedes the loop, gcc may emit this warning:

  src/resolvers.c: In function 'resolv_process_responses':
  src/resolvers.c:1009:11: warning: potential null pointer dereference [-Wnull-dereference]
   1009 |  if (query->type != DNS_RTYPE_SRV && flags & DNS_FLAG_TRUNCATED) {
        |      ~~~~~^~~~~~

However after carefully checking, r_res->header.qdcount is exclusively 1
when reaching this place, which forces the for() loop to enter for at
least one iteration, and <query> to be set. Thus there's no code path
leading to a null deref. It's possibly just because the assignment is
too far and the compiler cannot figure that the condition is always OK.
Let's just mark it to please the compiler.
2021-10-20 17:53:35 +02:00
Willy Tarreau
2acc160c05 CLEANUP: resolvers: do not export resolv_purge_resolution_answer_records()
This code is dangerous enough that we certainly don't want external code
to ever approach it, let's not export unnecessary functions like this one.
It was made static and a comment was added about its purpose.
2021-10-20 17:52:50 +02:00
Willy Tarreau
2a67aa0a51 BUG/MAJOR: resolvers: add other missing references during resolution removal
There is a fundamental design bug in the resolvers code which is that
a list of active resolutions is being walked to try to delete outdated
entries, and that the code responsible for removing them also removes
other elements, including the next one which will be visited by the
list iterator. This randomly causes a use-after-free condition leading
to crashes, infinite loops and various other issues such as random memory
corruption.

A first fix for this was brought by commit 0efc0993e
("BUG/MEDIUM: resolvers: Don't release resolution from a requester
callbacks"). While preparing for more fixes, some code was factored by
commit 11c6c3965 ("MINOR: resolvers: Clean server in a dedicated function
when removing a SRV item"), which inadvertently passed "0" as the "safe"
argument all the time, missing one case of removal protection, instead
of always using "safe". This patch reintroduces the correct argument.

This must be backported with all fixes above.

Cc: Christopher Faulet <cfaulet@haproxy.com>
2021-10-20 17:52:36 +02:00
Willy Tarreau
62e467c667 DEBUG: dns: add a few more BUG_ON at sensitive places
A few places have been caught triggering late bugs recently, always cases
of use-after-free because a freed element was still found in one of the
lists. This patch adds a few checks for such elements in dns_session_free()
before the final pool_free() and dns_session_io_handler() before adding
elements to lists to make sure they remain consistent. They do not trigger
anymore now.
2021-10-20 17:52:17 +02:00
Willy Tarreau
b56a878950 CLEANUP: dns: always detach the appctx from the dns session on release
When dns_session_release() calls dns_session_free(), it was shown that
it might still be attached there:

  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  0x00000000006437d7 in dns_session_free (ds=0x7f895439e810) at src/dns.c:768
  768             BUG_ON(!LIST_ISEMPTY(&ds->ring.waiters));
  [Current thread is 1 (Thread 0x7f895bbe2700 (LWP 31792))]
  (gdb) bt
  #0  0x00000000006437d7 in dns_session_free (ds=0x7f895439e810) at src/dns.c:768
  #1  0x0000000000643ab8 in dns_session_release (appctx=0x7f89545a4ff0) at src/dns.c:805
  #2  0x000000000062e35a in si_applet_release (si=0x7f89545a5550) at include/haproxy/stream_interface.h:236
  #3  0x000000000063150f in stream_int_shutw_applet (si=0x7f89545a5550) at src/stream_interface.c:1697
  #4  0x0000000000640ab8 in si_shutw (si=0x7f89545a5550) at include/haproxy/stream_interface.h:437
  #5  0x0000000000643103 in dns_session_io_handler (appctx=0x7f89545a4ff0) at src/dns.c:725
  #6  0x00000000006d776f in task_run_applet (t=0x7f89545a5100, context=0x7f89545a4ff0, state=81924) at src/applet.c:90
  #7  0x000000000068b82b in run_tasks_from_lists (budgets=0x7f895bbbf5c0) at src/task.c:611
  #8  0x000000000068c258 in process_runnable_tasks () at src/task.c:850
  #9  0x0000000000621e61 in run_poll_loop () at src/haproxy.c:2636
  #10 0x0000000000622328 in run_thread_poll_loop (data=0x8d7440 <ha_thread_info+64>) at src/haproxy.c:2807
  #11 0x00007f895c54a06b in start_thread () from /lib64/libpthread.so.0
  #12 0x00007f895bf3772f in clone () from /lib64/libc.so.6
  (gdb) p &ds->ring.waiters
  $1 = (struct list *) 0x7f895439e8a8
  (gdb) p ds->ring.waiters
  $2 = {
    n = 0x7f89545a5078,
    p = 0x7f89545a5078
  }
  (gdb) p ds->ring.waiters->n
  $3 = (struct list *) 0x7f89545a5078
  (gdb) p *ds->ring.waiters->n
  $4 = {
    n = 0x7f895439e8a8,
    p = 0x7f895439e8a8
  }

Let's always detach it before freeing so that it remains possible to
check the dns_session's ring before releasing it, and possibly catch
bugs.
2021-10-20 17:52:13 +02:00
Emeric Brun
7045590d8a BUG/MAJOR: dns: attempt to lock globaly for msg waiter list instead of use barrier
The barrier is insufficient here to protect the waiters list as we can
definitely catch situations where ds->waiter shows an inconsistency
whereby the element is not attached when entering the "if" block and
is already attached when attaching it later.

This patch uses a larger lock to maintain consistency. Without it the
code would crash in 30-180 minutes under heavy stress, always showing
the same problem (ds->waiter->n->p != &ds->waiter). Now it seems to
always resist, suggesting that this was indeed the problem.

This will have to be backported to 2.4.
2021-10-20 17:52:07 +02:00
Emeric Brun
d20dc21eec BUG/MAJOR: dns: tcp session can remain attached to a list after a free
Using tcp, after a session release and free, the session can remain
attached to the list of sessions with a response message waiting for
a commit (ds->waiter). This results in a use-after-free of this
session.

Also, on some error path and after free, a session could remain attached
to the lists of available idle/free sessions (ds->list).

This patch ensures the session is removed from those external lists
before a free.

This patch should be backported to all versions that include
DNS over TCP (2.4).
2021-10-20 17:52:02 +02:00
Christopher Faulet
d16e7dd0e4 BUG/MEDIUM: tcpcheck: Properly catch early HTTP parsing errors
When an HTTP response is parsed, early parsing errors are not properly
handled. When this error is reported by the multiplexer, nothing is copied
into the input buffer. The HTX message remains empty but the
HTX_FL_PARSING_ERROR flag is set. In addition CS_FL_EOI is set on the
conn-stream. This last flag must be handled to prevent subscription for
receive events. Otherwise, in the best case, a L7 timeout error is
reported. But a transient loop is also possible if a shutdown is received
because the multiplexer notifies the check of the event while the check
never handles it and waits for more data.

Now, if CS_FL_EOI flag is set on the conn-stream, expect rules are
evaluated. Any error must be handled there.

Thanks to @kazeburo for his valuable report.

This patch should fix the issue #1420. It must be backported at least to
2.4. On 2.3 and 2.2, there is no loop but the wrong error is reported (empty
response instead of invalid one). Thus it may also be backported as far as
2.2.
2021-10-20 14:35:38 +02:00
William Lallemand
34b3a93655 MINOR: httpclient/cli: access should be only done from expert mode
Only enable the usage of the CLI HTTP client in expert mode.
2021-10-19 15:02:42 +02:00
Christopher Faulet
813f913444 BUG/MEDIUM: stream: Keep FLT_END analyzers if a stream detects a channel error
If a channel error (READ_ERROR|READ_TIMEOUT|WRITE_ERROR|WRITE_TIMEOUT) is
detected by the stream, in process_stream(), FLT_END analyzers must be
preserved. It is important to be sure to end filter analysis and be able to
release the stream.

First, filters may release some resources when FLT_END analyzers are
called. Then, the CF_FL_ANALYZE flag is used to sync end of analysis for the
request and the response. If FLT_END analyzer is ignored on a channel, this
may block the other side and freeze the stream.

This patch must be backported to all stable versions
2021-10-19 11:29:30 +02:00
Remi Tricot-Le Breton
8abed17a34 MINOR: jwt: Do not rely on enum order anymore
Replace the test based on the enum value of the algorithm by an explicit
switch statement in case someone reorders it for some reason (while
still managing not to break the regtest).
2021-10-18 16:02:31 +02:00
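
As a rough illustration of the change above, the sketch below contrasts a range test on enum values with an explicit switch. The enum, its members and their order are assumptions made for this example and do not reproduce HAProxy's JWT definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical algorithm list; names and order are assumptions */
enum jwt_alg_sketch {
	ALG_NONE,
	ALG_HS256,
	ALG_HS384,
	ALG_HS512,
	ALG_RS256,
	ALG_ES256,
};

/* Fragile: silently changes meaning if enum values are reordered or a
 * new value is inserted in the middle of the range.
 */
static bool is_hmac_by_range(enum jwt_alg_sketch alg)
{
	return alg >= ALG_HS256 && alg <= ALG_HS512;
}

/* Robust: every value is listed explicitly, whatever its position */
static bool is_hmac(enum jwt_alg_sketch alg)
{
	switch (alg) {
	case ALG_HS256:
	case ALG_HS384:
	case ALG_HS512:
		return true;
	case ALG_NONE:
	case ALG_RS256:
	case ALG_ES256:
		return false;
	}
	assert(!"unknown algorithm");
	return false;
}
```

Both functions agree with the order shown here; only the switch keeps working if the enum is later rearranged.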
Remi Tricot-Le Breton
0b24d2fa45 MINOR: jwt: Empty the certificate tree during deinit
The tree in which the JWT certificates are stored was not emptied. It is
now done during deinit.
2021-10-18 16:02:28 +02:00
Willy Tarreau
75cc65356f MEDIUM: resolvers: replace bogus resolv_hostname_cmp() with memcmp()
resolv_hostname_cmp() is bogus, it is applied on labels and not plain
names, but doesn't make any distinction between length prefixes and
characters, so it compares the labels lengths via tolower() as well.
The only reason for which it doesn't break is because labels cannot
be larger than 63 bytes, and that none of the common encoding systems
have upper case letters in the lower 63 bytes, that could be turned
into a different value via tolower().

Now that all labels are stored in lower case, we don't need to burn
CPU cycles in tolower() at run time and can use memcmp() instead of
resolv_hostname_cmp(). This results in a ~22% lower CPU usage on large
farms using SRV records:

before:
  18.33%  haproxy                   [.] resolv_validate_dns_response
  10.58%  haproxy                   [.] process_resolvers
  10.28%  haproxy                   [.] resolv_hostname_cmp
   7.50%  libc-2.30.so              [.] tolower
  46.69%  total

after:
  24.73%  haproxy                     [.] resolv_validate_dns_response
   7.78%  libc-2.30.so                [.] __memcmp_avx2_movbe
   3.65%  haproxy                     [.] process_resolvers
  36.16%  total
2021-10-18 10:47:36 +02:00
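
The principle behind the change above (normalize case once at conversion time, then compare raw bytes) can be sketched like this. name_to_labels_lower() and labels_equal() are hypothetical helpers written for this example; they are not resolv_str_to_dn_label() and make no claim about its exact API.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Converts a dotted name such as "www.example.com" into DNS label format
 * ("\3www\7example\3com" followed by a zero byte), lower-casing letters
 * on the fly. Returns the number of bytes written, not counting the
 * trailing zero, or -1 on error (empty label, label longer than 63 bytes,
 * or output buffer too small).
 */
static int name_to_labels_lower(const char *name, unsigned char *out, size_t out_size)
{
	size_t len, i, start = 0; /* start: index of the current length byte */

	assert(name != NULL && out != NULL);
	len = strlen(name);
	if (len == 0 || len > 253 || len + 2 > out_size)
		return -1;

	for (i = 0; i <= len; i++) {
		if (i == len || name[i] == '.') {
			size_t label_len = i - start;

			if (label_len == 0 || label_len > 63)
				return -1;
			out[start] = (unsigned char)label_len;
			start = i + 1;
		} else {
			out[i + 1] = (unsigned char)tolower((unsigned char)name[i]);
		}
	}
	out[len + 1] = 0;
	return (int)(len + 1);
}

/* With normalized labels, equality is a plain byte comparison */
static int labels_equal(const unsigned char *a, int alen,
                        const unsigned char *b, int blen)
{
	return alen == blen && memcmp(a, b, (size_t)alen) == 0;
}
```

Length prefixes are copied as-is and never pass through tolower(), which is the distinction the old comparison function did not make.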
Willy Tarreau
814889c28a MEDIUM: resolvers: lower-case labels when converting from/to DNS names
The whole code relies on performing case-insensitive comparison on
lookups, which is extremely inefficient.

Let's make sure that all labels to be looked up or sent are first
converted to lower case. Doing so is also the opportunity to eliminate
an inefficient memcpy() in resolv_dn_label_to_str() that essentially
runs over a few unaligned bytes at once. As a side note, that call
was dangerous because it relied on a sign-extended size taken from a
string that had to be sanitized first.

This is tagged medium because while this is 100% safe, it may cause
visible changes on the wire at the packet level and trigger bugs in
test programs.
2021-10-18 09:14:02 +02:00
Ilya Shipitsin
bd6b4be721 CLEANUP: assorted typo fixes in the code and comments
This is the 27th iteration of typo fixes.
2021-10-18 07:26:19 +02:00
Björn Jacke
20d0f50b00 MINOR: add ::1 to predefined LOCALHOST acl
The "LOCALHOST" ACL currently matches only 127.0.0.1/8. This adds the
IPv6 "::1" address to the supported patterns.
2021-10-18 07:21:28 +02:00
Tim Duesterhus
c5aa113d80 CLEANUP: Apply strcmp.cocci
This fixes the use of the various *cmp functions to use != 0 or == 0.
2021-10-18 07:17:04 +02:00
Willy Tarreau
6d19f0d837 CLEANUP: listeners: remove unreachable code in clone_listener()
Coverity reported in issue #1416 that label oom3 is not reachable in
function clone_listener() added by commit 59a877dfd ("MINOR: listeners:
add clone_listener() to duplicate listeners at boot time"). The code
leading to it was removed during the development of the function, but
not the label itself.
2021-10-16 14:58:30 +02:00
Willy Tarreau
7c4c830d04 BUG/MINOR: listener: add an error check for unallocatable trash
Coverity noticed in issue #1416 that a missing allocation error was
introduced in tcp_bind_listener() with the rework of error messages by
commit ed1748553 ("MINOR: proto_tcp: use chunk_appendf() to ouput socket
setup errors"). In practice nobody will ever face it but better address
it anyway.

No backport is needed.
2021-10-16 14:54:19 +02:00
Willy Tarreau
a146289d4f BUG/MINOR: listener: fix incorrect return on out-of-memory
When the clone_listener() function was added in commit 59a877dfd
("MINOR: listeners: add clone_listener() to duplicate listeners at
boot time"), a stupid bug was introduced when splitting the error
path because while the first case where calloc fails will leave NULL
in the output value, the other cases will return the pointer to a
freed area. This was reported by Coverity in issue #1416.

In practice nobody will face it (out-of-memory while checking config),
but let's fix it.

No backport is needed.
2021-10-16 14:45:29 +02:00
Willy Tarreau
b39e47a52b BUG/MINOR: sample: fix backend direction flags consecutive to last fix
Commit 7a06ffb85 ("BUG/MEDIUM: sample: Cumulate frontend and backend
sample validity flags") introduced a typo confusing the request and
the response direction when checking for validity of a rule applied
to a backend. This was reported by Coverity in issue #1417.

This needs to be backported where the patch above is backported.
2021-10-16 14:41:09 +02:00
Amaury Denoyelle
697cfde340 BUG/MEDIUM: cpuset: fix cpuset size for FreeBSD
Fix the macro used to retrieve the max number of cpus on FreeBSD. The
MAXCPU is not properly defined in userspace and always set to 1 despite
the machine architecture. Replace it with CPU_SETSIZE.

See https://freebsd-hackers.freebsd.narkive.com/gw4BeLum/smp-in-machine-params-h#post6

Without this, the following config file is rejected on FreeBSD even if
the machine is SMP:
  global
    cpu-map 1-2 0-1

This must be backported up to 2.4.
2021-10-15 17:16:11 +02:00
Christopher Faulet
6db9a97f61 BUG/MINOR: proxy: Release ACLs and TCP/HTTP rules of default proxies
It is now possible to have TCP/HTTP rules and ACLs defined in defaults
sections. So we must try to release corresponding lists when a default proxy
is destroyed.

No backport needed.
2021-10-15 14:33:35 +02:00
Christopher Faulet
7a06ffb854 BUG/MEDIUM: sample: Cumulate frontend and backend sample validity flags
When the sample validity flags are computed to check if a sample is used in
a valid scope, the flags depending on the proxy capabilities must be
cumulated. Historically, for a sample on the request, only the frontend
capability was used to set the sample validity flags while for a sample on
the response only the backend was used. But it is a problem for listen or
defaults proxies. For those proxies, all frontend and backend samples should
be valid. However, in many places, only frontend ones are possible.

For instance, it is impossible to set the backend name (be_name) into a
variable from a listen proxy.

This bug exists on all stable versions. Thus this patch should probably be
backported. But with some caution because the code has probably changed
several times. Note that nobody has ever noticed this issue. So the need to
backport this patch must be evaluated for each branch.
2021-10-15 14:12:19 +02:00
Christopher Faulet
d4150ad869 MEDIUM: http-ana: Eval HTTP rules defined in defaults sections
As for TCP rules, HTTP rules from defaults section are now evaluated. These
rules are evaluated before those of the proxy. The same default ruleset
cannot be attached to the frontend and the backend. However, at this stage,
we take care to not execute twice the same ruleset. So, in theory, a
frontend and a backend could use the same defaults section. In this case,
the default ruleset is executed before all others and only once.
2021-10-15 14:12:19 +02:00
Christopher Faulet
c8016d0f58 MEDIUM: tcp-rules: Eval TCP rules defined in defaults sections
TCP rules from defaults section are now evaluated. These rules are evaluated
before those of the proxy. For L7 TCP rules, the same default ruleset cannot
be attached to the frontend and the backend. However, at this stage, we take
care to not execute twice the same ruleset. So, in theory, a frontend and a
backend could use the same defaults section. In this case, the default
ruleset is executed before all others and only once.
2021-10-15 14:12:19 +02:00
Christopher Faulet
ee08d6cc74 MEDIUM: rules/acl: Parse TCP/HTTP rules and acls defined in defaults sections
TCP and HTTP rules can now be defined in defaults sections, but only those
with a name. Because these rules may use conditions based on ACLs, ACLs can
also be defined in defaults sections.

However there are some limitations:

  * A defaults section defining TCP/HTTP rules cannot be used by a defaults
    section
  * A defaults section defining TCP/HTTP rules cannot be used by a listen
    section
  * A defaults section defining TCP/HTTP rules cannot be used by frontends
    and backends at the same time
  * A defaults section defining 'tcp-request connection' or 'tcp-request
    session' rules cannot be used by backends
  * A defaults section defining 'tcp-response content' rules cannot be used
    by frontends

The TCP request/response inspect-delay of a proxy is now inherited from the
defaults section it uses. For now, these rules are only parsed. No evaluation is
performed.
2021-10-15 14:12:19 +02:00
Christopher Faulet
6ff7de5d64 MINOR: tcpcheck: Support 2-steps args resolution in defaults sections
With the commit eaba25dd9 ("BUG/MINOR: tcpcheck: Don't use arg list for
default proxies during parsing"), we restricted the use of sample fetch in
tcpcheck rules defined in a defaults section to those depending on explicit
arguments only. This means a tcpcheck rule defined in a defaults section
cannot rely on arguments left unresolved during the configuration parsing.

Thanks to recent changes, it is now possible again.

This patch is mandatory to support TCP/HTTP rules in defaults sections.
2021-10-15 14:12:19 +02:00
Christopher Faulet
52b8a43d4e MINOR: config: No longer remove previous anonymous defaults section
When the parsing of a defaults section is started, the previous anonymous
defaults section is removed. It may be a problem with referenced defaults
sections. And because all unused default proxies are removed after the
configuration parsing, it is not required to remove it so early.

This patch is mandatory to support TCP/HTTP rules in defaults sections.
2021-10-15 14:12:19 +02:00
Christopher Faulet
ff556276eb MINOR: config: Finish configuration for referenced default proxies
If a not-ready default proxy is referenced by a proxy during the
configuration validity check, its configuration is also finished and
PR_FL_READY flag is set on it.

For now, the arguments resolution is the only step performed.

This patch is mandatory to support TCP/HTTP rules in defaults sections.
2021-10-15 14:12:19 +02:00
Christopher Faulet
56717803e1 MINOR: proxy: Add PR_FL_READY flag on fully configured and usable proxies
The PR_FL_READY flags must now be set on a proxy at the end of the
configuration validity check to notify it is fully configured and may be
safely used.

For now there is no real usage of this flag. But it will be useful for
referenced default proxies to finish their configuration only once.

This patch is mandatory to support TCP/HTTP rules in defaults sections.
2021-10-15 14:12:19 +02:00
Christopher Faulet
27c8d20451 MINOR: proxy: Be able to reference the defaults section used by a proxy
A proxy may now reference the defaults section it uses. To do so, a
pointer on the default proxy was added in the proxy structure. And a
refcount must be used to track proxies using a default proxy. A default
proxy is destroyed iff its refcount is equal to zero and when it drops to
zero.

All this stuff must be performed during init/deinit stages for now. All
unreferenced default proxies are removed after the configuration parsing.

This patch is mandatory to support TCP/HTTP rules in defaults sections.
2021-10-15 14:12:19 +02:00
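
A minimal sketch of the reference counting described above, assuming a heavily simplified proxy structure. The struct layout, proxy_take_defaults(), proxy_drop_defaults() and the "destroyed" counter are all hypothetical and exist only to make the lifetime rule visible: the default proxy goes away exactly when its last user drops it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified proxy */
struct proxy {
	struct proxy *defpx;   /* defaults section used by this proxy, or NULL */
	unsigned int refcount; /* number of proxies using this default proxy */
};

static int destroyed; /* counts destroyed default proxies, for illustration */

/* Attach <px> to the default proxy <defpx> and account for the reference */
static void proxy_take_defaults(struct proxy *px, struct proxy *defpx)
{
	assert(px->defpx == NULL);
	px->defpx = defpx;
	defpx->refcount++;
}

/* Detach <px> from its default proxy; the default proxy is destroyed when
 * its refcount drops to zero.
 */
static void proxy_drop_defaults(struct proxy *px)
{
	struct proxy *defpx = px->defpx;

	if (!defpx)
		return;
	px->defpx = NULL;
	assert(defpx->refcount > 0);
	if (--defpx->refcount == 0) {
		free(defpx);
		destroyed++;
	}
}
```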
Christopher Faulet
b40542000d MEDIUM: proxy: Warn about ambiguous use of named defaults sections
It is now possible to designate the defaults section to use by adding a name
of the corresponding defaults section and referencing it in the desired
proxy section. However, this introduces an ambiguity. This named defaults
section may still be implicitly used by other proxies if it is the last one
defined. In this case for instance:

  defaults common
    ...

  defaults frt from common
    ...

  defaults bck from common
    ...

  frontend fe from frt
    ...

  backend be from bck
    ...

  listen stats
    ...

Here, it is not really obvious the last section will use the 'bck' defaults
section. And it is probably not the expected behaviour. To help users to
properly configure their haproxy, a warning is now emitted if a defaults
section is explicitly AND implicitly used. The configuration manual was
updated accordingly.

Because this patch adds a warning, it should probably not be backported to
2.4. However, if it is backported, it depends on commit "MINOR: proxy:
Introduce proxy flags to replace disabled bitfield".
2021-10-15 14:12:19 +02:00
Christopher Faulet
37a9e21a3a MINOR: sample/arg: Be able to resolve args found in defaults sections
It is not yet used but thanks to this patch, it will be possible to resolve
arguments found in defaults sections. However, there are some restrictions:

  * For FE (frontend) or BE (backend) arguments, if the proxy is explicitly
    defined, there is no change. But for an implicit proxy (not specified), the
    argument points to the default proxy. When a sample fetch using this
    kind of argument is evaluated, the default proxy is replaced by the current
    one.

  * For SRV (server) and TAB (stick-table) arguments, the proxy must always
    be specified. Otherwise an error is reported.

This patch is mandatory to support TCP/HTTP rules in defaults sections.
2021-10-15 14:12:19 +02:00
Christopher Faulet
dfd10ab5ee MINOR: proxy: Introduce proxy flags to replace disabled bitfield
This change is required to support TCP/HTTP rules in defaults sections. The
'disabled' bitfield in the proxy structure, used to know if a proxy is
disabled or stopped, is replaced by a generic bitfield named 'flags'.

PR_DISABLED and PR_STOPPED flags are renamed to PR_FL_DISABLED and
PR_FL_STOPPED respectively. In addition, everywhere there is a test to know
if a proxy is disabled or stopped, there is now a bitwise AND operation on
PR_FL_DISABLED and/or PR_FL_STOPPED flags.
2021-10-15 14:12:19 +02:00
Christopher Faulet
647a61cc4b BUG/MINOR: proxy: Use .disabled field as a bitfield as documented
.disabled field in the proxy structure is documented to be a bitfield. So
use it as a bitfield. This change was introduced to the 2.5, by commit
8e765b86f ("MINOR: proxy: disabled takes a stopping and a disabled state").

No backport is needed except if the above commit is backported.
2021-10-15 14:12:19 +02:00
Christopher Faulet
a5aa082742 BUG/MINOR: sample: Fix 'fix_tag_value' sample when waiting for more data
The test on the return value of fix_tag_value() function was inverted. To
wait for more data, the return value must be a valid empty string and not
IST_NULL.

This patch must be backported to 2.4.
2021-10-15 14:12:19 +02:00
Christopher Faulet
597909f4e6 BUG/MINOR: http-ana: Don't eval front after-response rules if stopped on back
http-after-response rules evaluation must be stopped after an "allow". It
means the frontend ruleset must not be evaluated if an "allow" was performed
in the backend ruleset. Internally, the evaluation must be stopped on the
HTTP_RULE_RES_STOP return value. Only the "allow" action is concerned by
this change.

Thanks to this patch, http-response and http-after-response behave in the
same way.

This patch should be backported as far as 2.2.
2021-10-15 14:12:19 +02:00
Willy Tarreau
e20e026033 BUG/MEDIUM: sample/jwt: fix another instance of base64 error detection
This is the same as for commit 468c000db ("BUG/MEDIUM: jwt: fix base64
decoding error detection"), but for function sample_conv_jwt_member_query()
that is used by sample converters jwt_header_query() and jwt_payload_query().
Thanks to Tim for the report. No backport is needed.
2021-10-15 12:14:16 +02:00
Willy Tarreau
ce16db4145 BUG/MINOR: jwt: use CRYPTO_memcmp() to compare HMACs
As Tim reported in github issue #1414, we ought to use a constant-time
memcmp() when comparing hashes to avoid time-based attacks. Let's use
CRYPTO_memcmp() since this code already depends on openssl.

No backport is needed, this was just merged into 2.5.
2021-10-15 11:54:04 +02:00
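
For readers unfamiliar with constant-time comparison, here is a portable sketch of the idea. ct_memcmp() is a hypothetical helper written for this example; it is not OpenSSL's implementation, and real code should keep using CRYPTO_memcmp() as the commit does.

```c
#include <assert.h>
#include <stddef.h>

/* Compares <len> bytes of <a> and <b>. Returns 0 if they are equal and a
 * non-zero value otherwise. Unlike memcmp(), the loop never exits early,
 * so the run time depends only on <len>, not on where the first
 * difference is located.
 */
static int ct_memcmp(const void *a, const void *b, size_t len)
{
	const volatile unsigned char *pa = a;
	const volatile unsigned char *pb = b;
	unsigned char diff = 0;
	size_t i;

	assert(len == 0 || (a != NULL && b != NULL));
	for (i = 0; i < len; i++)
		diff |= (unsigned char)(pa[i] ^ pb[i]);
	return diff;
}
```

Note that only equality is reported; the result carries no ordering information, which is all an HMAC check needs.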
Willy Tarreau
468c000db0 BUG/MEDIUM: jwt: fix base64 decoding error detection
Tim reported that a decoding error from the base64 function wouldn't
be matched in case of bad input, and could possibly cause trouble
with -1 being passed in decoded_sig->data. In the case of HMAC+SHA
it is harmless as the comparison is made using memcmp() after checking
for length equality, but in the case of RSA/ECDSA this result is passed
as a size_t to EVP_DigestVerifyFinal() and may depend on the lib's mood.

The fix simply consists in checking the intermediary result before
storing it.

That's precisely what happens with one of the regtests which returned
0 instead of 4 on the intentionally defective token, so the regtest
was fixed as well.

No backport is needed as this is new in this release.
2021-10-15 11:41:16 +02:00
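
The pattern fixed by this commit (check a signed decoder result before storing it into an unsigned length) can be sketched as follows. To stay short, a small hex decoder stands in for the base64 one; hex_val(), hex_decode(), struct buffer and decode_signature() are all hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer with an unsigned length field */
struct buffer {
	unsigned char area[64];
	size_t data; /* number of valid bytes in area */
};

static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Stand-in decoder: returns the number of decoded bytes, or -1 on error */
static int hex_decode(const char *in, size_t in_len, unsigned char *out, size_t out_size)
{
	size_t i;

	if (in_len % 2 || in_len / 2 > out_size)
		return -1;
	for (i = 0; i < in_len; i += 2) {
		int hi = hex_val(in[i]);
		int lo = hex_val(in[i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		out[i / 2] = (unsigned char)((hi << 4) | lo);
	}
	return (int)(in_len / 2);
}

/* Returns 0 on success, -1 on decoding error. The intermediary result is
 * checked before being stored: assigning -1 directly to buf->data would
 * turn it into a huge size_t.
 */
static int decode_signature(const char *in, size_t in_len, struct buffer *buf)
{
	int ret;

	assert(in != NULL && buf != NULL);
	ret = hex_decode(in, in_len, buf->area, sizeof(buf->area));
	if (ret < 0)
		return -1;
	buf->data = (size_t)ret;
	return 0;
}
```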
Willy Tarreau
7b232f132d BUG/MEDIUM: resolvers: fix truncated TLD consecutive to the API fix
A bug was introduced by previous commit bf9498a31 ("MINOR: resolvers:
fix the resolv_str_to_dn_label() API about trailing zero") as the code
is particularly contrived and hard to test. The output writes the last
char at [i+1] so the trailing zero and return value must be at i+1.

This will have to be backported where the patch above is backported
since it was needed for a fix.
2021-10-15 08:09:25 +02:00
Willy Tarreau
cc8fd4c040 MINOR: resolvers: merge address and target into a union "data"
These two fields are exclusive as they depend on the data type.
Let's move them into a union to save some precious bytes. This
reduces the struct resolv_answer_item size from 600 to 576 bytes.
2021-10-14 22:52:04 +02:00
Willy Tarreau
b4ca0195a9 BUG/MEDIUM: resolvers: use correct storage for the target address
The struct resolv_answer_item contains an address field of type
"sockaddr" which is only 16 bytes long, but which is used to store
either IPv4 or IPv6. Fortunately, the contents only overlap with
the "target" field that follows it and that is large enough to
absorb the extra bytes needed to store AAAA records. But this is
dangerous as just moving fields around could result in memory
corruption.

The fix uses a union and removes the casts that were used to hide
the problem.

Older versions need to be checked and possibly fixed. This needs
to be backported anyway.
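
A rough sketch of the idea, assuming a simplified item layout. The struct and
field names below are illustrative and are not the real resolv_answer_item
definition.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical, simplified answer item: the union is sized by its largest
 * member, so an AAAA record fits without spilling into the next field.
 */
struct demo_answer_item {
    int type;   /* decides which union member is valid */
    union {
        struct sockaddr_in  in4;    /* A records */
        struct sockaddr_in6 in6;    /* AAAA records */
    } data;
};

/* Returns the room available to store an address in the item. */
size_t demo_addr_room(void)
{
    return sizeof(((struct demo_answer_item *)0)->data);
}
```

With such a layout no cast is needed to store an IPv6 address, and moving
fields around in the struct cannot make the address overlap something else.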
2021-10-14 22:44:51 +02:00
Willy Tarreau
6dfbef4145 MEDIUM: listener: add the "shards" bind keyword
In multi-threaded mode, on operating systems supporting multiple listeners on
the same IP:port, this will automatically create this number of multiple
identical listeners for the same line, all bound to a fair share of the number
of the threads attached to this listener. This can sometimes be useful when
using very large thread counts where the in-kernel locking on a single socket
starts to cause a significant overhead. In this case the incoming traffic is
distributed over multiple sockets and the contention is reduced. Note that
doing this can easily increase the CPU usage by making more threads work a
little bit.
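
As an illustration of what a "fair share" could look like, the sketch below
splits a thread count as evenly as possible between shards. The function name
is hypothetical and the exact distribution used by the patch may differ. It
assumes 1 <= shards <= nbthreads.

```c
/* Returns the number of threads assigned to shard <idx> (0-based) when
 * <nbthreads> threads are split as evenly as possible over <shards> shards.
 * Assumption: 1 <= shards <= nbthreads and 0 <= idx < shards.
 */
int demo_shard_thread_count(int idx, int shards, int nbthreads)
{
    int base  = nbthreads / shards;   /* what every shard gets */
    int extra = nbthreads % shards;   /* the first <extra> shards get one more */

    return base + (idx < extra ? 1 : 0);
}
```

For example, 8 threads over 3 shards would give 3, 3 and 2 threads.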

If the number of shards is higher than the number of available threads, it
will automatically be trimmed to the number of threads. A special value
"by-thread" will automatically assign one shard per thread.
2021-10-14 21:27:48 +02:00
Willy Tarreau
59a877dfd9 MINOR: listeners: add clone_listener() to duplicate listeners at boot time
This function's purpose will be to duplicate a listener in INIT state.
This will be used to ease declaration of listeners spanning multiple
groups, which will thus require multiple FDs hence multiple receivers.
2021-10-14 21:27:48 +02:00
Willy Tarreau
01cac3f721 MEDIUM: listeners: split the thread mask between receiver and bind_conf
With groups at some point we'll have to have distinct masks/groups in the
receiver and the bind_conf, because a single bind_conf might require to
instantiate multiple receivers (one per group).

Let's split the thread mask and group to have one for the bind_conf and
another one for the receiver while it remains easy to do. This will later
allow to use different storage for the bind_conf if needed (e.g. support
multiple groups).
2021-10-14 21:27:48 +02:00
Willy Tarreau
875ee704dd MINOR: resolvers: fix the resolv_dn_label_to_str() API about trailing zero
This function suffers from the same API issue as its sibling that does the
opposite direction, it demands that the input string is zero-terminated
*and* that its length *including* the trailing zero is passed on input,
forcing callers to pass length + 1, and itself to use that length - 1
everywhere internally.

This patch addresses this. There is a single caller, which is the
location of the previous bug, so it should probably be backported at
least to keep the code consistent across versions. Note that the
function is called dns_dn_label_to_str() in 2.3 and earlier.
2021-10-14 21:24:18 +02:00
Willy Tarreau
85c15e6bff BUG/MINOR: resolvers: do not reject host names of length 255 in SRV records
An off-by-one issue in buffer size calculation used to limit the output
of resolv_dn_label_to_str() to 254 instead of 255.

This must be backported to 2.0.
2021-10-14 21:24:18 +02:00
Willy Tarreau
947ae125cc BUG/MEDIUM: resolver: make sure to always use the correct hostname length
In issue #1411, @jjiang-stripe reports that do-resolve() sometimes seems
to be trying to resolve crap from random memory contents.

The issue is that action_prepare_for_resolution() tries to measure the
input string by itself using strlen(), while resolv_action_do_resolve()
directly passes it a pointer to the sample, omitting the known length.
Thus of course any other headers present after the host in memory are
appended to the host value. It could theoretically crash if really
unlucky, with a buffer that does not contain any zero including in the
index at the end, and if the HTX buffer ends on an allocation boundary.
In practice it should be too low a probability to have ever been observed.

This patch modifies the action_prepare_for_resolution() function to take
the string length along with the host name on input and pass that down the
chain. This should be backported to 2.0 along with commit "MINOR:
resolvers: fix the resolv_str_to_dn_label() API about trailing zero".
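
To illustrate the class of bug, here is a small sketch in which the caller
passes the known length instead of letting the callee run strlen() on data
that is not zero-terminated. copy_hostname() is a hypothetical helper, not a
function from the resolvers code.

```c
#include <stddef.h>
#include <string.h>

/* Copies <len> bytes of host name from <ptr> into <out> and terminates the
 * copy. <ptr> does not need to be zero-terminated. Returns the copied
 * length, or -1 if <out> (of <size> bytes) is too small.
 */
int copy_hostname(const char *ptr, size_t len, char *out, size_t size)
{
    if (len + 1 > size)
        return -1;

    memcpy(out, ptr, len);
    out[len] = '\0';
    return (int)len;
}
```

If the host value is immediately followed in memory by other header data,
strlen() on the raw pointer would cover that data as well, while the explicit
length stops exactly at the end of the host name.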
2021-10-14 21:24:18 +02:00
Willy Tarreau
bf9498a31b MINOR: resolvers: fix the resolv_str_to_dn_label() API about trailing zero
This function is bogus at the API level: it demands that the input string
is zero-terminated *and* that its length *including* the trailing zero is
passed on input. While that already looks smelly, the trailing zero is
copied as-is, and is then explicitly replaced with a zero... Not only
all callers have to pass hostname_len+1 everywhere to work around this
absurdity, but this requirement causes a bug in the do-resolve() action
that passes random string lengths on input, and that will be fixed on a
subsequent patch.

Let's fix this API issue for now.

This patch will have to be backported, and in versions 2.3 and older,
the function is in dns.c and is called dns_str_to_dn_label().
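
For readers unfamiliar with the format, the sketch below shows one way to
convert a host name to DNS label format with the saner calling convention,
i.e. <str_len> does not count any trailing zero. It is a simplified example
under that assumption, not the actual function body: demo_str_to_dn_label()
is a hypothetical name and the 63-byte label limit is not enforced here.

```c
/* Converts <str> of <str_len> bytes (trailing zero not counted) to DNS label
 * format in <dn> of <dn_len> bytes, e.g. "a.bc" becomes "\1a\2bc\0".
 * Returns the number of bytes written before the trailing zero, or -1 on
 * error (output too small or empty label).
 */
int demo_str_to_dn_label(const char *str, int str_len, char *dn, int dn_len)
{
    int i, offset;

    /* one leading length byte plus the trailing zero */
    if (dn_len < str_len + 2)
        return -1;

    offset = 0;     /* position of the current label's length byte */
    for (i = 0; i < str_len; i++) {
        if (str[i] == '.') {
            if (i == offset)
                return -1;      /* empty label */
            dn[offset] = (char)(i - offset);
            offset = i + 1;
            continue;
        }
        dn[i + 1] = str[i];     /* chars are shifted by one position */
    }
    dn[offset] = (char)(i - offset);
    dn[i + 1] = '\0';
    return i + 1;
}
```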
2021-10-14 21:24:18 +02:00
Willy Tarreau
6823a3acee MINOR: protocol: uniformize protocol errors
Some protocols fail with "error blah [ip:port]" and other fail with
"[ip:port] error blah". All this already appears in a "starting" or
"binding" context after a proxy name. Let's choose a more universal
approach like below where the ip:port remains at the end of the line
prefixed with "for".

  [WARNING]  (18632) : Binding [binderr.cfg:10] for proxy http: cannot bind receiver to device 'eth2' (No such device) for [0.0.0.0:1080]
  [WARNING]  (18632) : Starting [binderr.cfg:10] for proxy http: cannot set MSS to 12 for [0.0.0.0:1080]
2021-10-14 21:22:52 +02:00
Willy Tarreau
37de553f1d MINOR: protocol: report the file and line number for binding/listening errors
Binding errors and late socket errors provide no information about
the file and line where the problem occurs. These are all done by
protocol_bind_all() and they only report "Starting proxy blah". Let's
change this a little bit so that:
  - the file name and line number of the faulty bind line is always mentioned
  - early binding errors are indicated with "Binding" instead of "Starting".

Now we can for example have this:
  [WARNING]  (18580) : Binding [binderr.cfg:10] for proxy http: cannot bind receiver to device 'eth2' (No such device) [0.0.0.0:1080]
2021-10-14 21:22:52 +02:00
Willy Tarreau
f78b52eb7d MINOR: inet: report the faulty interface name in "bind" errors
When a "bind ... interface foo" statement fails, let's report the
interface name in the error message to help locating it in the file.
2021-10-14 21:22:52 +02:00
Willy Tarreau
3cf05cb0b1 MINOR: proto_tcp: also report the attempted MSS values in error message
The MSS errors are the only ones not indicating what was attempted, let's
report the value that was tried, as it can help users spot them in the
config (particularly if a default value was used).
2021-10-14 21:22:52 +02:00
Bjoern Jacke
ed1748553a MINOR: proto_tcp: use chunk_appendf() to output socket setup errors
Right now only the last warning or error is reported from
tcp_bind_listener(), but it is useful to report all warnings and not only
the last one, so we now emit them delimited by commas. Previously we used
a fixed buffer of 100 bytes, which was too small to store more than one
message, so let's extend it.

Signed-off-by: Bjoern Jacke <bjacke@samba.org>
2021-10-14 21:22:52 +02:00
Remi Tricot-Le Breton
130e142ee2 MEDIUM: jwt: Add jwt_verify converter to verify JWT integrity
This new converter takes a JSON Web Token, an algorithm (among the ones
specified for JWS tokens in RFC 7518) and a public key or a secret, and
it returns a verdict about the signature contained in the token. It does
not simply return a boolean because some specific error cases can be
specified by returning an integer instead, such as unmanaged algorithms
or invalid tokens. This makes it possible to distinguish malformed tokens
from tampered ones, that would be valid format-wise but would have a bad
signature.
This converter does not perform a full JWT validation as described in
section 7.2 of RFC 7519. For instance it does not ensure that the header
and payload parts of the token are completely valid JSON objects because
it would need a complete JSON parser. It only focuses on the signature
and checks that it matches the token's contents.
2021-10-14 16:38:14 +02:00
Remi Tricot-Le Breton
0a72f5ee7c MINOR: jwt: jwt_header_query and jwt_payload_query converters
Those converters allow to extract a JSON value out of a JSON Web Token's
header part or payload part (the two first dot-separated base64url
encoded parts of a JWS in the Compact Serialization format).
They act as a json_query call on the corresponding decoded subpart when
given parameters, and they return the decoded JSON subpart when no
parameter is given.
2021-10-14 16:38:13 +02:00
Remi Tricot-Le Breton
864089e0a6 MINOR: jwt: Insert public certificates into dedicated JWT tree
A JWT signed with the RSXXX or ESXXX algorithm (RSA or ECDSA) requires a
public certificate to be verified and to ensure it is valid. Those
certificates must not be read on disk at runtime so we need a caching
mechanism into which those certificates will be loaded during init.
This is done through a dedicated ebtree that is filled during
configuration parsing. The path to the public certificates will need to
be explicitly mentioned in the configuration so that certificates can
be loaded as early as possible.
This tree is different from the ckch one because ckch entries are much
bigger than the public certificates used in JWT validation process.
2021-10-14 16:38:12 +02:00
Remi Tricot-Le Breton
e0d3c00086 MINOR: jwt: JWT tokenizing helper function
This helper function splits a JWT under Compact Serialization format
(dot-separated base64-url encoded strings) into its different sub
strings. Since we do not want to manage more than JWS for now, which can
only have at most three subparts, any JWT that has strictly more than
two dots is considered invalid.
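
A minimal sketch of such a split is shown below, assuming a simple
pointer/length representation for each part. The names demo_jwt_part and
demo_jwt_split() are hypothetical and the real helper's prototype may differ.

```c
#include <stddef.h>

/* Hypothetical descriptor of one dot-separated part of the token. */
struct demo_jwt_part {
    const char *start;
    size_t      length;
};

/* Splits <tok> of <len> bytes on dots into at most 3 parts. Returns the
 * number of parts found (1 to 3), or -1 if the token has more than two dots.
 */
int demo_jwt_split(const char *tok, size_t len, struct demo_jwt_part parts[3])
{
    size_t i, begin = 0;
    int n = 0;

    for (i = 0; i < len; i++) {
        if (tok[i] != '.')
            continue;
        if (n == 2)
            return -1;          /* third dot: not a JWS */
        parts[n].start  = tok + begin;
        parts[n].length = i - begin;
        n++;
        begin = i + 1;
    }
    parts[n].start  = tok + begin;
    parts[n].length = len - begin;
    return n + 1;
}
```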
2021-10-14 16:38:10 +02:00
Remi Tricot-Le Breton
7feb361776 MINOR: jwt: Parse JWT alg field
The full list of possible algorithms used to create a JWS signature is
defined in section 3.1 of RFC7518. This patch adds a helper function
that converts the "alg" strings into an enum member.
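
Such a conversion could look like the sketch below. The enum and function
names are made up for the example; only the "alg" strings themselves come
from section 3.1 of RFC 7518.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enum, one member per "alg" value of RFC 7518 section 3.1. */
enum demo_jwt_alg {
    DEMO_ALG_UNKNOWN = 0,
    DEMO_ALG_NONE,
    DEMO_ALG_HS256, DEMO_ALG_HS384, DEMO_ALG_HS512,
    DEMO_ALG_RS256, DEMO_ALG_RS384, DEMO_ALG_RS512,
    DEMO_ALG_ES256, DEMO_ALG_ES384, DEMO_ALG_ES512,
    DEMO_ALG_PS256, DEMO_ALG_PS384, DEMO_ALG_PS512,
};

static const struct {
    const char        *name;
    enum demo_jwt_alg  alg;
} demo_algs[] = {
    { "none",  DEMO_ALG_NONE  },
    { "HS256", DEMO_ALG_HS256 }, { "HS384", DEMO_ALG_HS384 }, { "HS512", DEMO_ALG_HS512 },
    { "RS256", DEMO_ALG_RS256 }, { "RS384", DEMO_ALG_RS384 }, { "RS512", DEMO_ALG_RS512 },
    { "ES256", DEMO_ALG_ES256 }, { "ES384", DEMO_ALG_ES384 }, { "ES512", DEMO_ALG_ES512 },
    { "PS256", DEMO_ALG_PS256 }, { "PS384", DEMO_ALG_PS384 }, { "PS512", DEMO_ALG_PS512 },
};

/* Converts the "alg" string <str> of <len> bytes (not necessarily
 * zero-terminated) to an enum member. The match is case-sensitive.
 */
enum demo_jwt_alg demo_parse_alg(const char *str, size_t len)
{
    size_t i;

    for (i = 0; i < sizeof(demo_algs) / sizeof(demo_algs[0]); i++) {
        if (strlen(demo_algs[i].name) == len &&
            memcmp(demo_algs[i].name, str, len) == 0)
            return demo_algs[i].alg;
    }
    return DEMO_ALG_UNKNOWN;
}
```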
2021-10-14 16:38:08 +02:00
Remi Tricot-Le Breton
f5dd337b12 MINOR: http: Add http_auth_bearer sample fetch
This fetch can be used to retrieve the data contained in an HTTP
Authorization header when the Bearer scheme is used. This is used when
transmitting JSON Web Tokens for instance.
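
The extraction itself can be pictured with the sketch below, which works on
the raw value of an Authorization header. demo_bearer_token() is a
hypothetical helper and not the sample fetch's implementation. It assumes the
scheme name is matched case-insensitively and followed by at least one space.

```c
#include <stddef.h>
#include <strings.h>    /* strncasecmp() (POSIX) */

/* Returns a pointer to the token inside <hdr> (of <len> bytes) and sets
 * <*tok_len> to its length, or returns NULL if the value does not use the
 * Bearer scheme or carries no token.
 */
const char *demo_bearer_token(const char *hdr, size_t len, size_t *tok_len)
{
    static const char scheme[] = "Bearer";
    const size_t slen = sizeof(scheme) - 1;
    size_t pos;

    if (len <= slen || strncasecmp(hdr, scheme, slen) != 0 || hdr[slen] != ' ')
        return NULL;

    /* skip the spaces separating the scheme from the token */
    pos = slen;
    while (pos < len && hdr[pos] == ' ')
        pos++;

    if (pos == len)
        return NULL;    /* scheme without any token */

    *tok_len = len - pos;
    return hdr + pos;
}
```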
2021-10-14 16:38:07 +02:00
William Lallemand
1d58b01316 MINOR: ssl: add ssl_fc_is_resumed to "option httpslog"
In order to trace which sessions were TLS-resumed, add
ssl_fc_is_resumed to the httpslog option.
2021-10-14 14:27:48 +02:00
Amaury Denoyelle
493bb1db10 MINOR: quic: handle CONNECTION_CLOSE frame
On receiving CONNECTION_CLOSE frame, the mux is flagged for immediate
connection close. A stream is closed even if there is data not ACKed
left if CONNECTION_CLOSE has been received.
2021-10-13 16:38:56 +02:00
Amaury Denoyelle
1e308ffc79 MINOR: mux: remove last occurrences of qcc ring buffer
The mux tx buffers have been rewritten with buffers attached to qcs
instances. qc_buf_available and qc_get_buf functions are updated to
manipulate qcs. All occurrences of the unused qcc ring buffer are
removed to ease the code maintenance.
2021-10-13 16:38:56 +02:00
Amaury Denoyelle
cae0791942 MEDIUM: mux-quic: defer stream shut if remaining tx data
Defer the shutting of a qcs if there is still data in its tx buffers. In
this case, the conn_stream is closed but the qcs is kept with a new flag
QC_SF_DETACH.

On ACK reception, the xprt wakes up the shut_tl tasklet if the stream is
flagged with QC_SF_DETACH. This tasklet is responsible for freeing the qcs
and possibly the qcc when all bidirectional streams are removed.
2021-10-13 16:38:56 +02:00
Amaury Denoyelle
ac8ee25659 MINOR: mux-quic: implement standard method to detect if qcc is dead
For the moment, a quic connection is considered dead if it has no
bidirectional streams left on it. This test is implemented via
qcc_is_dead function. It can be reused to properly close the connection
when needed.
2021-10-13 16:38:56 +02:00
Amaury Denoyelle
4fc8b1cb17 CLEANUP: h3: remove dead code
Remove unused function. This will simplify code maintenance.
2021-10-13 16:38:56 +02:00
Amaury Denoyelle
a587136c6f MINOR: mux-quic: standardize h3 settings sending
Use the same buffer management to send h3 settings as for streams. This
simplifies code maintenance, as an unused function is removed.
2021-10-13 16:38:56 +02:00
Amaury Denoyelle
a543eb1f6f MEDIUM: h3: properly manage tx buffers for large data
Properly handle tx buffers management in h3 data sending. If there is
not enough contiguous space, the buffer is first realigned. If this is
not enough, the stream is flagged with QC_SF_BLK_MROOM waiting for the
buffer to be emptied.

If a frame on a stream is successfully pushed for sending, the stream is
called if it was flagged with QC_SF_BLK_MROOM.
2021-10-13 16:38:56 +02:00
Amaury Denoyelle
d3d97c6ae7 MEDIUM: mux-quic: rationalize tx buffers between qcc/qcs
Remove the tx mux ring buffers in qcs, which should be in the qcc. For
the moment, use a simple architecture with 2 simple tx buffers in the
qcs.

The first buffer is used by the h3 layer to prepare the data. The mux
send operation transfers these into the 2nd buffer named xprt_buf. This
buffer is only freed when an ACK has been received.

This architecture is functional but not optimal for two reasons :
- it won't limit the buffer usage by connection
- each transfer on a new stream requires an allocation
2021-10-13 16:38:56 +02:00
Remi Tricot-Le Breton
b01179aa92 MINOR: ssl: Add ssllib_name_startswith precondition
This new ssllib_name_startswith precondition check can be used to
distinguish applications linked with OpenSSL from the ones linked with
other SSL libraries (LibreSSL or BoringSSL namely). This check takes a
string as input and returns 1 when the SSL library's name starts with
the given string. It is based on the OpenSSL_version function which
returns the same output as the "openssl version" command.
2021-10-13 11:28:08 +02:00
Tim Duesterhus
9e5e586e35 BUG/MINOR: lua: Fix lua error handling in hlua_config_prepend_path()
Set an `lua_atpanic()` handler before calling `hlua_prepend_path()` in
`hlua_config_prepend_path()`.

This prevents the process from abort()ing when `hlua_prepend_path()` fails
for some reason.

see GitHub Issue #1409

This is a very minor issue that can't happen in practice. No backport needed.
2021-10-12 11:28:57 +02:00
Christopher Faulet
8c67eceeca CLEANUP: stream: Properly indent current_rule line in "show sess all"
This line is not related to the response channel but to the stream. Thus it
must be indented at the same level as stream-interfaces, connections,
channels...
2021-10-12 11:27:24 +02:00
Christopher Faulet
d4762b8474 MINOR: stream: report the current filter in "show sess all" when known
Filters can block the stream on pre/post analysis for any reason and it can
be useful to report it in "show sess all". So now, a "current_filter" extra
line is reported for each channel if a filter is blocking the analysis. Note
that this does not catch the TCP/HTTP payload analysis because all
registered filters are always evaluated when more data are received.
2021-10-12 11:26:49 +02:00
Willy Tarreau
1274e10d5c MINOR: stream: report the current rule in "show sess all" when known
Sometimes an HTTP or TCP rule may take time to complete because it is
waiting for external data (e.g. "wait-for-body", "do-resolve"), and it
can be useful to report the action and the location of that rule in
"show sess all". Here for streams blocked on such a rule, there will
now be a "current_rule" extra line reporting this. Note that this does
not catch rulesets which are re-evaluated from the start on each change
(e.g. tcp-request content waiting for changes) but only when a specific
rule is being paused.
2021-10-12 07:38:30 +02:00
Willy Tarreau
c9e4868510 MINOR: rules: add a file name and line number to act_rules
These ones are passed on rule creation for the sole purpose of being
reported in "show sess", which is not done yet. For now the entries
are allocated upon rule creation and freed in free_act_rules().
2021-10-12 07:38:30 +02:00
Willy Tarreau
d535f807bb MINOR: rules: add a new function new_act_rule() to allocate act_rules
Rules are currently allocated using calloc() by their caller, which does
not make it very convenient to pass more information such as the file
name and line number.

This patch introduces new_act_rule() which performs the malloc() and
already takes in argument the ruleset (ACT_F_*), the file name and the
line number. This saves the caller from having to assign ->from, and
will allow to improve the internal storage with more info.
2021-10-12 07:38:30 +02:00
Willy Tarreau
db2ab8218c MEDIUM: stick-table: never learn the "conn_cur" value from peers
There have been a large number of issues reported with conn_cur
synchronization because the concept is wrong. In an active-passive
setup, pushing the local connections count from the active node to
the passive one will result in the passive node having a higher
counter than the real number of connections. Due to this, after a
switchover, it will never be able to close enough connections to
go down to zero. The same commonly happens on reloads since the new
process preloads its values from the old process, and if no connection
happens for a key after the value is learned, it is impossible to reset
the previous ones. In active-active setups it's a bit different, as the
number of connections reflects the number on the peer that pushed last.

This patch solves this by marking the "conn_cur" local and preventing
it from being learned from peers. It is still pushed, however, so that
any monitoring system that collects values from the peers will still
see it.

The patch is tiny and trivially backportable. While a change of behavior
in stable branches is never welcome, it remains possible to fix issues
if reports become frequent.
2021-10-08 17:53:12 +02:00
Willy Tarreau
e3f4d7496d MEDIUM: config: resolve relative threads on bind lines to absolute ones
Now thread ranges specified on bind lines will be turned to effective
ones that will lead to a usable thread mask and a group ID.
2021-10-08 17:22:26 +02:00
Willy Tarreau
627def9e50 MINOR: threads: add a new function to resolve config groups and masks
In the configuration sometimes we'll omit a thread group number to designate
a global thread number range, and sometimes we'll mention the group and
designate IDs within that group. The operation is more complex than it
seems due to the need to check for ranges spanning between multiple groups
and determining groups from threads from bit masks and remapping bit masks
between local/global.

This patch adds a function to perform this operation, it takes a group and
mask on input and updates them on output. It's designed to be used by "bind"
lines but will likely be usable at other places if needed.

For situations where specified threads do not exist in the group, we have
the choice in the code between silently fixing the thread set or failing
with a message. For now the better option seems to return an error, but if
it turns out to be an issue we can easily change that in the future. Note
that it should only happen with "x/even" when group x only has one thread.
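
To give an idea of the remapping involved, the sketch below converts a mask of
thread IDs local to a group into a mask of global thread IDs, and fails when a
designated thread does not exist in the group. The struct and function are
hypothetical simplifications, assuming at most 64 threads in total. They are
not the function added by this patch.

```c
/* Hypothetical, minimal description of a thread group. */
struct demo_tgroup {
    unsigned int base;      /* global ID of the group's first thread */
    unsigned int count;     /* number of threads in the group */
};

/* Converts <mask>, whose bit 0 designates the first thread of <tg>, to a
 * mask of global thread IDs stored into <*out>. Returns 0 on success, or -1
 * if the group is invalid or if <mask> designates a thread that does not
 * exist in the group.
 */
int demo_local_to_global(const struct demo_tgroup *tg,
                         unsigned long long mask, unsigned long long *out)
{
    unsigned long long valid;

    if (tg->count == 0 || tg->count > 64 || tg->base + tg->count > 64)
        return -1;

    /* bits of the threads that really exist in the group */
    valid = (tg->count == 64) ? ~0ULL : ((1ULL << tg->count) - 1);
    if (mask & ~valid)
        return -1;

    *out = mask << tg->base;
    return 0;
}
```

Returning an error instead of silently trimming the mask mirrors the choice
described above for threads that do not exist in the group.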
2021-10-08 17:22:26 +02:00
Willy Tarreau
d57b9ff7af MEDIUM: listeners: support the definition of thread groups on bind lines
This extends the "thread" statement of bind lines to support an optional
thread group number. When unspecified (0) it's an absolute thread range,
and when specified it's one relative to the thread group. Masks are still
used so no more than 64 threads may be specified at once, and a single
group is possible. The directive is not used for now.
2021-10-08 17:22:26 +02:00
Willy Tarreau
a3870b7952 MINOR: debug: report the group and thread ID in the thread dumps
Now thread dumps will report the thread group number and the ID within
this group. Note that this is still quite limited because some masks
are calculated based on the thread in argument while they have to be
performed against a group-level thread ID.
2021-10-08 17:22:26 +02:00
Willy Tarreau
b90935c908 MINOR: threads: add the current group ID in thread-local "tgid" variable
This is the equivalent of "tid" for ease of access. In the future if we
make th_cfg a pure thread-local array (not a pointer), it may make sense
to move it there.
2021-10-08 17:22:26 +02:00
Willy Tarreau
43ab05b3da MEDIUM: threads: replace ha_set_tid() with ha_set_thread()
ha_set_tid() was randomly used either to explicitly set thread 0 or to
set any possibly incomplete thread during boot. Let's replace it with
a pointer to a valid thread or NULL for any thread. This allows us to
check that the designated threads are always valid, and to ignore the
thread 0's mapping when setting it to NULL, and always use group 0 with
it during boot.

The initialization code is also cleaner, as we don't pass ugly casts
of a thread ID to a pointer anymore.
2021-10-08 17:22:26 +02:00
Willy Tarreau
cc7a11ee3b MINOR: threads: set the tid, ltid and their bit in thread_cfg
This will be a convenient way to communicate the thread ID and its
local ID in the group, as well as their respective bits when creating
the threads or when only a pointer is given.
2021-10-08 17:22:26 +02:00
Willy Tarreau
6eee85f887 MINOR: threads: set the group ID and its bit in the thread group
This will ease the reporting of the current thread group ID when coming
from the thread itself, especially since it returns the visible ID,
starting at 1.
2021-10-08 17:22:26 +02:00
Willy Tarreau
e6806ebecc MEDIUM: threads: automatically assign threads to groups
This takes care of unassigned thread groups and places unassigned
threads there, in a more or less balanced way. Too sparse allocations
may still fail though. For now with a maximum group number fixed to 1
nothing can really fail.
2021-10-08 17:22:26 +02:00
Willy Tarreau
d04bc3ac21 MINOR: global: add a new "thread-group" directive
This registers a mapping of threads to groups by enumerating for each thread
what group it belongs to, and marking the group as assigned. It takes care of
checking for redefinitions, overlaps, and holes. It supports both individual
numbers and ranges. The thread group is referenced from the thread config.
2021-10-08 17:22:26 +02:00
Willy Tarreau
c33b969e35 MINOR: global: add a new "thread-groups" directive
This is used to configure the number of thread groups. For now it can
only be 1.
2021-10-08 17:22:26 +02:00
Willy Tarreau
f9662848f2 MINOR: threads: introduce a minimalistic notion of thread-group
This creates a struct tgroup_info which knows the thread ID of the first
thread in a group, and the number of threads in it. For now there's only
one thread group supported in the configuration, but it may be forced to
other values for development purposes by defining MAX_TGROUPS, and it's
enabled even when threads are disabled and will need to remain accessible
during boot to keep a simple enough internal API.

For the purpose of easing the configurations which do not specify a thread
group, we're starting group numbering at 1 so that thread group 0 can be
"undefined" (i.e. for "bind" lines or when binding tasks).

The goal will be to later move there some global items that must be
made per-group.
2021-10-08 17:22:26 +02:00
Willy Tarreau
6036342f58 MINOR: thread: make "ti" a const pointer and clean up thread_info a bit
We want to make sure that the current thread_info accessed via "ti" will
remain constant, so that we don't accidentally place new variable parts
there and so that the compiler knows that info retrieved from there is
not expected to have changed between two function calls.

Only a few init locations had to be adjusted to use the array and the
rest is unaffected.
2021-10-08 17:22:26 +02:00
Willy Tarreau
b4e34766a3 REORG: thread/sched: move the last dynamic thread_info to thread_ctx
The last 3 fields were 3 list heads that are per-thread, and which are:
  - the pool's LRU head
  - the buffer_wq
  - the streams list head

Moving them into thread_ctx completes the removal of dynamic elements
from the struct thread_info. Now all these dynamic elements are packed
together at a single place for a thread.
2021-10-08 17:22:26 +02:00
Willy Tarreau
a0b99536c8 REORG: thread/sched: move the thread_info flags to the thread_ctx
The TI_FL_STUCK flag is manipulated by the watchdog and scheduler
and describes the apparent life/death of a thread so it changes
all the time and it makes sense to move it to the thread's context
for an active thread.
2021-10-08 17:22:26 +02:00
Willy Tarreau
45c38e22bf REORG: thread/clock: move the clock parts of thread_info to thread_ctx
The "thread_info" name was initially chosen to store all info about
threads but since we now have a separate per-thread context, there is
no point keeping some of its elements in the thread_info struct.

As such, this patch moves prev_cpu_time, prev_mono_time and idle_pct to
thread_ctx, into the thread context, with the scheduler parts. Instead
of accessing them via "ti->" we now access them via "th_ctx->", which
makes more sense as they're totally dynamic, and will be required for
future evolutions. There's no room problem for now, the structure still
has 84 bytes available at the end.
2021-10-08 17:22:26 +02:00
Willy Tarreau
1a9c922b53 REORG: thread/sched: move the task_per_thread stuff to thread_ctx
The scheduler contains a lot of stuff that is thread-local and not
exclusively tied to the scheduler. Other parts (namely thread_info)
contain similar thread-local context that ought to be merged with
it but that is even less related to the scheduler. However moving
more data into this structure isn't possible since task.h is high
level and cannot be included everywhere (e.g. activity) without
causing include loops.

In the end, it appears that the task_per_thread represents most of
the per-thread context defined with generic types and should simply
move to tinfo.h so that everyone can use them.

The struct was renamed to thread_ctx and the variable "sched" was
renamed to "th_ctx". "sched" used to be initialized manually from
run_thread_poll_loop(), now it's initialized by ha_set_tid() just
like ti, tid, tid_bit.

The memset() in init_task() was removed in favor of a bss initialization
of the array, so that other subsystems can put their stuff in this array.

Since the tasklet array has TL_CLASSES elements, the TL_* definitions
were moved there as well, but it's not a problem.

The vast majority of the change in this patch is caused by the
renaming of the structures.
2021-10-08 17:22:26 +02:00
Willy Tarreau
6414e4423c CLEANUP: wdt: do not remap SI_TKILL to SI_LWP, test the values directly
We used to remap SI_TKILL to SI_LWP when SI_TKILL was not available
(e.g. FreeBSD) but that's ugly and since we need this only in a single
switch/case block in wdt.c it's even simpler and cleaner to perform the
two tests there, so let's do this.
2021-10-08 17:22:26 +02:00
Willy Tarreau
b474f43816 MINOR: wdt: move wd_timer to wdt.c
The watchdog timer had no more reason for being shared with the struct
thread_info since the watchdog is the only user now. Let's remove it
from the struct and move it to a static array in wdt.c. This removes
some ifdefs and the need for the ugly mapping to empty_t that might be
subject to a cast to a long when compared to TIMER_INVALID. Now timer_t
is not known outside of wdt.c and clock.c anymore.
2021-10-08 17:22:26 +02:00
Willy Tarreau
2169498941 MINOR: clock: move the clock_ids to clock.c
This removes the knowledge of clockid_t from anywhere but clock.c, thus
eliminating a source of includes burden. The unused clock_id field was
removed from thread_info, and the definition setting of clockid_t was
removed from compat.h. The most visible change is that the function
now_cpu_time_thread() now takes the thread number instead of a tinfo
pointer.
2021-10-08 17:22:26 +02:00
Willy Tarreau
6cb0c391e7 REORG: clock/wdt: move wdt timer initialization to clock.c
The code that deals with timer creation for the WDT was moved to clock.c
and is called with the few relevant arguments. This removes the need for
awareness of clock_id from wdt.c and as such saves us from having to
share it outside. The timer_t is also known only from both ends but not
from the public API so that we don't have to create a fake timer_t
anymore on systems which do not support it (e.g. macos).
2021-10-08 17:22:26 +02:00
Willy Tarreau
44c58da52f REORG: clock: move the clock_id initialization to clock.c
This was previously open-coded in run_thread_poll_loop(). Now that
we have clock.c dedicated to such stuff, let's move the code there
so that we don't need to keep such ifdefs nor to depend on the
clock_id.
2021-10-08 17:22:26 +02:00
Willy Tarreau
2c6a998727 CLEANUP: clock: stop exporting before_poll and after_poll
We don't need to export them anymore so let's make them static.
2021-10-08 17:22:26 +02:00
Willy Tarreau
20adfde9c8 MINOR: activity: get the run_time from the clock updates
Instead of fiddling with before_poll and after_poll in
activity_count_runtime(), the function is now called by
clock_entering_poll() which passes it the number of microseconds
spent working. This allows to remove all calls to
activity_count_runtime() from the pollers.
2021-10-08 17:22:26 +02:00
Willy Tarreau
f9d5e1079c REORG: clock: move the updates of cpu/mono time to clock.c
The entering_poll/leaving_poll/measure_idle functions that were hard
to classify and used to move to various locations have now been placed
into clock.c since it's precisely about time-keeping. The functions
were renamed to clock_*. The samp_time and idle_time values are now
static since there is no reason for them to be read from outside.
2021-10-08 17:22:26 +02:00
Willy Tarreau
5554264f31 REORG: time: move time-keeping code and variables to clock.c
There is currently a problem related to time keeping. We're mixing
the functions to perform calculations with the os-dependent code
needed to retrieve and adjust the local time.

This patch extracts from time.{c,h} the parts that are solely dedicated
to time keeping. These are the "now" or "before_poll" variables for
example, as well as the various now_*() functions that make use of
gettimeofday() and clock_gettime() to retrieve the current time.

The "tv_*" functions moved there were also more appropriately renamed
to "clock_*".

Other parts used to compute stolen time are in other files, they will
have to be picked next.
2021-10-08 17:22:26 +02:00
Willy Tarreau
28345c6652 BUILD: init: avoid a build warning on FreeBSD with USE_PROCCTL
It was brought by a variable declared after some statements in commit
21185970c ("MINOR: proc: setting the process to produce a core dump on
FreeBSD."). It's worth noting that some versions of clang seem to ignore
-Wdeclaration-after-statement by default. No backport is needed.
2021-10-08 17:21:48 +02:00
Amaury Denoyelle
eb01f597eb BUG/MINOR: quic: fix includes for compilation
Fix missing includes in quic code following the general recent include
reorganization. This fixes the compilation error with QUIC enabled.
2021-10-08 15:59:02 +02:00
Amaury Denoyelle
769e9ffd94 CLEANUP: mux-quic: remove unused code
Remove unused code in mux-quic. This is mostly code related to the
backend side. This code is untested for the moment, its removal will
simplify the code maintenance.
2021-10-08 15:48:00 +02:00
Amaury Denoyelle
9c8c4fa3a2 MINOR: qpack: fix memory leak on huffman decoding
Remove an unneeded strdup invocation during QPACK huffman decoding. A
temporary storage buffer is passed by the function and exists after
decoding so no need to duplicate memory here.
2021-10-08 15:45:57 +02:00
Amaury Denoyelle
3a590c7ff2 MINOR: qpack: support non-indexed http status code encoding
If a HTTP status code is not present in the QPACK static table, encode
it with a literal field line with name reference.
2021-10-08 15:30:18 +02:00
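The encoding described in the commit above can be sketched as follows. This is a minimal standalone illustration, not HAProxy's encoder: the function name, the buffer handling and the "three digits" validity rule are assumptions. The wire format (pattern 01NT with a 4-bit prefixed index) and the static table index 24 for a ":status" entry come from RFC 9204.

```c
#include <stddef.h>
#include <stdint.h>

/* Index of one ":status" entry in the QPACK static table (RFC 9204,
 * Appendix A). Only its name is reused, the value is sent as a literal. */
#define SK_QPACK_IDX_STATUS 24

/* Hypothetical helper: encodes ":status: <code>" as a literal field line
 * with name reference (RFC 9204, section 4.5.4) into <out>.
 * Returns the number of bytes written, or 0 if <code> is not a three-digit
 * value (assumed validity rule) or if <out> is too small.
 */
size_t sk_qpack_encode_status(unsigned int code, uint8_t *out, size_t size)
{
	size_t pos = 0;

	if (code < 100 || code > 999)
		return 0; /* refuse to encode an invalid status code */
	if (size < 6)
		return 0;

	/* Pattern 0 1 N T + 4-bit index prefix, with N=0 and T=1 (static
	 * table). 24 does not fit in 4 bits, so the prefix is saturated
	 * (15) and the remainder (24 - 15 = 9) follows in one byte.
	 */
	out[pos++] = 0x40 | 0x10 | 0x0F;
	out[pos++] = SK_QPACK_IDX_STATUS - 15;

	/* H=0 (no Huffman) + 7-bit value length, then 3 ASCII digits */
	out[pos++] = 3;
	out[pos++] = (uint8_t)('0' + code / 100);
	out[pos++] = (uint8_t)('0' + (code / 10) % 10);
	out[pos++] = (uint8_t)('0' + code % 10);
	return pos;
}
```

With this sketch, a status such as 418, which has no dedicated static table entry, takes 6 bytes on the wire: 0x5F 0x09 0x03 followed by "418".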
Amaury Denoyelle
fccffe08b3 MINOR: qpack: do not encode invalid http status code
Ensure that the HTTP status code is valid before encoding with QPACK. An
error is returned if this is not the case.
2021-10-08 15:28:35 +02:00
Christopher Faulet
485da0b053 BUG/MEDIUM: mux_h2: Handle others remaining read0 cases on partial frames
We've found other places where the read0 is ignored because of an
incomplete frame parsing. This time, it happens during parsing of
CONTINUATION frames.

When frames are parsed, incomplete frames are properly handled and
H2_CF_DEM_SHORT_READ flag is set. It is also true for HEADERS
frames. However, for CONTINUATION frames, there is an exception. Besides
parsing the current frame, we try to peek the header of the next one to merge
the payloads of both frames, the current one and the next one. The idea is to
create a single HEADERS frame before parsing the payload. However, in this case, it
is possible to have an incomplete frame too, not the current one but the
next one. From the demux point of view, the current frame is complete. We
must go to the internal function h2c_decode_headers() to detect an
incomplete frame. And this case was not identified and fixed when
H2_CF_DEM_SHORT_READ flag was introduced in the commit b5f7b5296
("BUG/MEDIUM: mux-h2: Handle remaining read0 cases on partial frames")

This bug was reported in a comment of the issue #1362. The patch must be
backported as far as 2.0.
2021-10-08 09:17:27 +02:00
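The situation described in the commit above, where the current frame is complete but the next one is only partially received, can be sketched as follows. This is a simplified standalone model and not the h2c_decode_headers() code: all sk_* names and return values are hypothetical. Only the 9-byte frame header layout and the CONTINUATION type value (0x9) are taken from RFC 7540.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_H2_FRAME_HDR_LEN   9    /* RFC 7540, section 4.1 */
#define SK_H2_FT_CONTINUATION 0x9  /* RFC 7540, section 6.10 */

enum { SK_NOT_CONT = -1, SK_SHORT_READ = 0, SK_READY = 1 };

struct sk_h2_fh {
	uint32_t len;   /* 24-bit payload length */
	uint8_t  type;
	uint8_t  flags;
	uint32_t sid;   /* 31-bit stream ID */
};

/* Returns 1 and fills <fh> if a full frame header is present at offset
 * <ofs> of the <data> bytes held in <buf>, otherwise 0. */
int sk_h2_peek_header(const uint8_t *buf, size_t data, size_t ofs,
                      struct sk_h2_fh *fh)
{
	const uint8_t *p;

	if (ofs > data || data - ofs < SK_H2_FRAME_HDR_LEN)
		return 0;
	p = buf + ofs;
	fh->len   = (uint32_t)p[0] << 16 | (uint32_t)p[1] << 8 | p[2];
	fh->type  = p[3];
	fh->flags = p[4];
	fh->sid   = ((uint32_t)p[5] << 24 | (uint32_t)p[6] << 16 |
	             (uint32_t)p[7] << 8 | p[8]) & 0x7fffffff;
	return 1;
}

/* <buf> holds the payload of the current frame (<cur_len> bytes) followed
 * by whatever was received of the next frame. Tells whether the next
 * frame can be merged now, or if the caller has to report a short read
 * and wait for more data. */
int sk_check_next_frame(const uint8_t *buf, size_t data, size_t cur_len)
{
	struct sk_h2_fh next;

	if (!sk_h2_peek_header(buf, data, cur_len, &next))
		return SK_SHORT_READ; /* next header is incomplete */
	if (next.type != SK_H2_FT_CONTINUATION)
		return SK_NOT_CONT;
	if (data - cur_len - SK_H2_FRAME_HDR_LEN < next.len)
		return SK_SHORT_READ; /* next payload is incomplete */
	return SK_READY;
}
```

In this model the SK_SHORT_READ result is the point where a demuxer would set its short-read flag, so that a read0 received afterwards is not ignored.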
Amaury Denoyelle
2af1985af8 BUG/MAJOR: quic: remove qc from receiver cids tree on free
Remove the quic_conn from the receiver connection_ids tree on
quic_conn_free. This fixes a crash due to dangling references in the
tree after a quic connection release.

This operation must be conducted under the listener lock. For this
reason, the quic_conn now contains a reference to its attached listener.
2021-10-07 17:35:25 +02:00
Amaury Denoyelle
d595f108db MINOR: mux-quic: release connection if no more bidir streams
Use the count of bidirectional streams to call qc_release in qc_detach.
We cannot inspect the by_id tree because uni-streams are never removed
from it. This allows the connection to be properly freed.
2021-10-07 17:35:25 +02:00
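The release condition from the commit above can be sketched like this. It is a hedged standalone model: the structure and function names are hypothetical and the by_id tree is reduced to a simple counter. The only external fact used is that bit 0x2 of a QUIC stream ID marks a unidirectional stream (RFC 9000, section 2.1).

```c
#include <stdint.h>

/* RFC 9000, section 2.1: bit 0x2 is set for unidirectional streams */
static inline int sk_stream_is_uni(uint64_t id)
{
	return (id & 0x2) != 0;
}

struct sk_qcc {
	unsigned int nb_indexed; /* stands for the by_id tree population */
	unsigned int nb_bidi;    /* bidirectional streams still attached */
};

void sk_attach(struct sk_qcc *qcc, uint64_t id)
{
	qcc->nb_indexed++;
	if (!sk_stream_is_uni(id))
		qcc->nb_bidi++;
}

/* Returns 1 when the connection may be released. Unidirectional streams
 * stay indexed in this model, so testing nb_indexed == 0 would never
 * succeed, hence the dedicated bidirectional counter. */
int sk_detach(struct sk_qcc *qcc, uint64_t id)
{
	if (sk_stream_is_uni(id))
		return 0;
	qcc->nb_indexed--;
	qcc->nb_bidi--;
	return qcc->nb_bidi == 0;
}
```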
Amaury Denoyelle
336f6fd964 BUG/MAJOR: xprt-quic: do not queue qc timer if not set
Do not queue the pto/loss-detection timer if set to TICK_ETERNITY. This
usage is invalid with the scheduler and causes a BUG_ON trigger.
2021-10-07 17:35:25 +02:00
Amaury Denoyelle
139814a67a BUG/MEDIUM: mux-quic: reinsert all streams in by_id tree
It is required that all qcs streams are in the by_id tree for the xprt
to function correctly. Without this, some ACKs are not properly emitted
by xprt.

Note that this change breaks the free of the connection because the
condition eb_is_empty in qc_detach is always true. This will be fixed in
a following patch.
2021-10-07 17:35:25 +02:00
Frédéric Lécaille
75dd2b7987 MINOR: quic: Fix SSL error issues (do not use ssl_bio_and_sess_init())
It seems it was a bad idea to use ssl_bio_and_sess_init(), the same function
as for TCP SSL sockets, to initialize the SSL session objects for QUIC.
Indeed, this had the very bad side effect of generating SSL errors because
BIOs initialized this way for QUIC could not be controlled via the
BIO_ctrl*() API, especially the BIO_ctrl() function used by many other
internal OpenSSL functions (BIO_push(), BIO_pop() etc).
Other OpenSSL-based QUIC implementations do not use BIOs at all to configure
QUIC connections. So, we decided to proceed the same way as ngtcp2 for instance:
only initialize an SSL object and call SSL_set_quic_method() to set its
underlying method. Note that calling this function silently disables the
SSL_OP_ENABLE_MIDDLEBOX_COMPAT option.
We implement qc_ssl_sess_init() to initialize SSL sessions for QUIC connections
this way, with a retry in case of allocation failure as is done by
ssl_bio_and_sess_init(). We also modify the code part for haproxy servers.
2021-10-07 17:35:25 +02:00
Frédéric Lécaille
7c881bdab8 MINOR: quic: BUG_ON() SSL errors.
As this QUIC implementation is still experimental, let's BUG_ON()
very important SSL handshake errors.
Also dump the SSL errors before BUG_ON().
2021-10-07 17:35:25 +02:00
Frédéric Lécaille
6f0fadb5a7 MINOR: quic: Add a function to dump SSL stack errors
This has been very helpful to fix SSL related issues.
2021-10-07 17:35:25 +02:00
Frédéric Lécaille
57e6e9eef8 MINOR: quic: Distinguish packet and SSL read enc. level in traces
This is only to distinguish the encryption level of packet traces from
the TLS stack current read encryption level.
2021-10-07 17:35:25 +02:00
Willy Tarreau
1b4a714266 MINOR: pools: report the amount used by thread caches in "show pools"
The "show pools" command provides some "allocated" and "used" estimates
on the pools objects, but this applies to the shared pool and the "used"
includes what is currently assigned to thread-local caches. It's possible
to know how much each thread uses, so let's dump the total size allocated
by thread caches as an estimate. It's only done when pools are enabled,
which explains why the patch adds quite a lot of ifdefs.
2021-10-07 17:30:06 +02:00
Willy Tarreau
aa992761d8 CLEANUP: thread: uninline ha_tkill/ha_tkillall/ha_cpu_relax()
These ones are rarely used or only to waste CPU cycles waiting, and are
the last ones requiring system includes in thread.h. Let's uninline them
and move them to thread.c.
2021-10-07 01:41:15 +02:00
Willy Tarreau
5e03dfaaf6 MINOR: thread: use a dedicated static pthread_t array in thread.c
This removes the thread identifiers from struct thread_info and moves
them to a static array in thread.c since it's now the only file that
needs to touch them. It's also the only file that needs to include
pthread.h, beyond haproxy.c which needs it to start the poll loop. As
a result, far fewer system includes are needed and the LoC is reduced by
around 3%.
2021-10-07 01:41:15 +02:00
Willy Tarreau
4eeb88363c REORG: thread: move ha_get_pthread_id() to thread.c
It's the last function which directly accesses the pthread_t, let's move
it to thread.c and leave a static inline for non-thread.
2021-10-07 01:41:14 +02:00
Willy Tarreau
d10385ac4b REORG: thread: move the thread init/affinity/stop to thread.c
haproxy.c still has to deal with pthread-specific low-level stuff that
is OS-dependent. We should not have to deal with this there, and we do
not need to access pthread anywhere else.

Let's move these 3 functions to thread.c and keep empty inline ones for
when threads are disabled.
2021-10-07 01:41:14 +02:00
Willy Tarreau
19b18ad552 CLENAUP: wdt: use ha_tkill() instead of accessing pthread directly
Instead of calling pthread_kill() directly on the pthread_t let's
call ha_tkill() which does the same by itself. This will help isolate
pthread_t.
2021-10-07 01:41:14 +02:00
Willy Tarreau
b63888c67c REORG: fd: uninline compute_poll_timeout()
It's not needed to inline it at all (one call per loop) and it introduces
dependencies, let's move it to fd.c.

Removing the few remaining includes that came with it further reduced
by ~0.2% the LoC and the build time is now below 6s.
2021-10-07 01:41:14 +02:00
Willy Tarreau
d8b325c748 REORG: task: uninline the loop time measurement code
It's pointless to inline this, it's called exactly once per poll loop,
and it depends on time.h which is quite deep. Let's move that to task.c
along with sched_report_idle().
2021-10-07 01:41:14 +02:00
Willy Tarreau
8de90c71b3 REORG: connection: uninline the rest of the alloc/free stuff
The remaining large functions are those allocating/initializing and
occasionally freeing connections, conn_streams and sockaddr. Let's
move them to connection.c. In fact, cs_free() is the only one-liner
but let's move it along with the other ones since a call will be
small compared to the rest of the work done there.
2021-10-07 01:41:14 +02:00
Willy Tarreau
aac777f169 REORG: connection: move the largest inlines from connection.h to connection.c
The following inlined functions are particularly large (and probably not
inlined at all by the compiler), and together represent roughly half of
the file, while they're used at most once per connection. They were moved
to connection.c.

  conn_upgrade_mux_fe, conn_install_mux_fe, conn_install_mux_be,
  conn_install_mux_chk, conn_delete_from_tree, conn_init, conn_new,
  conn_free
2021-10-07 01:41:14 +02:00
Willy Tarreau
260f324c19 REORG: server: uninline the idle conns management functions
The following functions are quite heavy and have no reason to be kept
inlined:

   srv_release_conn, srv_lookup_conn, srv_lookup_conn_next,
   srv_add_to_idle_list

They were moved to server.c. It's worth noting that they're a bit
at the edge between server and connection and that maybe we could
create an idle-conn file for these in the near future.
2021-10-07 01:41:14 +02:00
Willy Tarreau
930428c0bf REORG: connection: uninline conn_notify_mux() and conn_delete_from_tree()
The former is far too huge to be inlined and the second is the only
one requiring an ebmb tree through all includes, let's move them to
connection.c.
2021-10-07 01:41:14 +02:00
Willy Tarreau
e5983ffb3a REORG: connection: move the hash-related stuff to connection.c
We do not really need to have them inlined, and having xxhash.h included
by connection.h results in this 4700-lines file being processed 101 times
over the whole project, which accounts for 13.5% of the total size!
Additionally, half of the functions are only needed from connection.c.
Let's move the functions there and get rid of the painful include.

The build time is now down to 6.2s just due to this.
2021-10-07 01:41:14 +02:00
Willy Tarreau
fd21c6c6fd MINOR: connection: use uint64_t for the hashes
The hash type stored everywhere is XXH64_hash_t, which annoyingly forces
everyone to include the huge xxhash file. We know it's an uint64_t because
that's its purpose and the type is only made to abstract it on machines
where uint64_t is not available. Let's switch the type to uint64_t
everywhere and avoid including xxhash from the type file.
2021-10-07 01:41:14 +02:00
Willy Tarreau
a26be37e20 REORG: acitvity: uninline sched_activity_entry()
This one is expensive in code size because it comes with xxhash.h at a
low level of dependency that's inherited at plenty of places, and for
a function that doesn't benefit from inlining and could possibly even
benefit from not being inline given that it's large and called from the
scheduler.

Moving it to activity.c reduces the LoC by 1.2% and the binary size by
~1kB.
2021-10-07 01:41:14 +02:00
Willy Tarreau
e0650224b8 REORG: activity: uninline activity_count_runtime()
This function has no reason for being inlined, it's called from non
critical places (once in pollers), is quite large and comes with
dependencies (time and freq_ctr). Let's move it to activity.c. That's
another 0.4% less LoC to build.
2021-10-07 01:41:14 +02:00
Willy Tarreau
9310f481ce CLEANUP: tree-wide: remove unneeded include time.h in ~20 files
20 files used to have haproxy/time.h included only for now_ms, and two
were missing it for other things but used to inherit from it via other
files.
2021-10-07 01:41:14 +02:00
Willy Tarreau
078c2573c2 REORG: sched: moved samp_time and idle_time to task.c as well
The idle time calculation stuff was moved to task.h by commit 6dfab112e
("REORG: sched: move idle time calculation from time.h to task.h") but
these two variables that are only maintained by task.{c,h} were still
left in time.{c,h}. They have to move as well.
2021-10-07 01:41:14 +02:00
Willy Tarreau
99ea188c0e REORG: sample: move the crypto samples to ssl_sample.c
These ones require openssl and are only built when it's enabled. There's
no point keeping them in sample.c when ssl_sample.c already deals with this
and the required includes. This also allows to remove openssl-compat.h
from sample.c and to further reduce the number of inclusions of openssl
includes, and the build time is now down to under 8 seconds.
2021-10-07 01:41:14 +02:00
Willy Tarreau
82531f6730 REORG: ssl-sock: move the sslconns/totalsslconns counters to global
These two counters were the only ones not in the global struct, while
the SSL freq counters or the req counts are already in it, this forces
stats.c to include ssl_sock just to know about them. Let's move them
over there with their friends. This reduces from 408 to 384 the number
of includes of opensslconf.h.
2021-10-07 01:41:14 +02:00
Willy Tarreau
a8a72c68d5 CLEANUP: ssl/server: move ssl_sock_set_srv() to srv_set_ssl() in server.c
This one has nothing to do with ssl_sock as it manipulates the struct
server only. Let's move it to server.c and remove unneeded dependencies
on ssl_sock.h. This further reduces by 10% the number of includes of
opensslconf.h and by 0.5% the number of compiled lines.
2021-10-07 01:41:06 +02:00
Willy Tarreau
d2ae3858e9 CLEANUP: mux_fcgi: remove dependency on ssl_sock
It's not needed anymore (used to be needed for ssl_sock_is_ssl()).
2021-10-07 01:36:51 +02:00
Willy Tarreau
1057beecda REORG: ssl: move ssl_sock_is_ssl() to connection.h and rename it
This one doesn't use anything from an SSL context, it only checks the
type of the transport layer of a connection, thus it belongs to
connection.h. This is particularly visible due to all the ifdefs
around it in various call places.
2021-10-07 01:36:51 +02:00
Willy Tarreau
dbf78025a0 REORG: listener: move bind_conf_alloc() and listener_state_str() to listener.c
These functions have no reason for being inlined, and they require some
includes with long dependencies. Let's move them to listener.c and trim
unused includes in listener.h.
2021-10-07 01:36:51 +02:00
Willy Tarreau
dced3ebb4a MINOR: thread/debug: replace nsec_now() with now_mono_time()
The two functions do exactly the same except that the second one
is already provided by time.h and still defined if not available.
2021-10-07 01:36:51 +02:00
Willy Tarreau
407ef893e7 REORG: thread: uninline the lock-debugging code
The lock-debugging code in thread.h has no reason to be inlined. The
functions are quite fat and perform a lot of operations so there's no
saving keeping them inlined. Worse, most of them are in fact not
inlined, resulting in a significantly bigger executable.

This patch moves all this part from thread.h to thread.c. The functions
are still exported in thread.h of course. This results in ~166kB less
code:

     text    data     bss     dec     hex filename
  3165938   99424  897376 4162738  3f84b2 haproxy-before
  2991987   99424  897376 3988787  3cdd33 haproxy-after

In addition the build time with thread debugging enabled has shrunk
from 19.2 to 17.7s thanks to much less code to be parsed in thread.h
that is included virtually everywhere.
2021-10-07 01:36:51 +02:00
Willy Tarreau
f14d19024b REORG: pools: uninline the UAF allocator and force-inline the rest
pool-os.h relies on a number of includes solely because the
pool_alloc_area() function was inlined, and this only because we want
the normal version to be inlined so that we can track the calling
places for the memory profiler. It's worth noting that it already
does not work at -O0, and that when UAF is enabled we don't care a
dime about profiling.

This patch does two things at once:
  - force-inline the functions so that pool_alloc_area() is still
    inlined at -O0 to help track malloc() users ;

  - uninline the UAF version of these (that rely on mmap/munmap)
    and move them to pools.c so that we can remove all unneeded
    includes.

Doing so reduces by ~270kB or 0.15% the total build size.
2021-10-07 01:36:51 +02:00
Willy Tarreau
5d9ddc5442 BUILD: tree-wide: add several missing activity.h
A number of files currently access activity counters but rely on their
definitions to be inherited from other files (task.c, backend.c, hlua.c,
sock.c, pool.c, stats.c, fd.c).
2021-10-07 01:36:51 +02:00
Willy Tarreau
410e2590e9 BUILD: mworker: mworker-prog needs time.h for the 'now' variable
It wasn't included and it used to get them through other includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
6cd007d078 BUILD: tcp_sample: include missing errors.h and session-t.h
Both are used without being defined as they were inherited from other
files.
2021-10-07 01:36:51 +02:00
Willy Tarreau
0d1dd0e894 BUILD: cfgparse-ssl: add missing errors.h
ha_warning(), ha_alert() and friends are in errors.h and it used
to be inherited via other files.
2021-10-07 01:36:51 +02:00
Willy Tarreau
b7fc4c4e9f BUILD: tree-wide: add missing http_ana.h from many places
At least 6 files make use of s->txn without including http_ana which
defines it. They used to get it from other includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
63617dbec6 BUILD: idleconns: include missing ebmbtree.h at several places
backend.c, all muxes, backend.c started manipulating ebmb_nodes with
the introduction of idle conns but the types were inherited through
other includes. Let's add ebmbtree.h there.
2021-10-07 01:36:51 +02:00
Willy Tarreau
74f2456c42 BUILD: ssl_ckch: include ebpttree.h in ssl_ckch.c
It's used but is only found through other includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
8db34cc974 BUILD: peers: need to include eb{32/mb/pt}tree.h
peers.c uses them all and used to only find them through other includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
b555eb1176 BUILD: vars: need to include xxhash
It's needed for XXH3(), and it used to get it through other includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
0ce6dc0107 BUILD: http_rules: requires http_ana-t.h for REDIRECT_*
It used to inherit it through other includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
286631a1a0 BUILD: sample: include openssl-compat
It's needed for EVP_*.
2021-10-07 01:36:51 +02:00
Willy Tarreau
1df20428f1 BUILD: httpclient: include missing ssl_sock-t
It's needed for SSL_SOCK_VERIFY_NONE.
2021-10-07 01:36:51 +02:00
Willy Tarreau
27539409fd BUILD: hlua: needs to include stream-t.h
It uses the SF_ERR_* error codes and currently gets them via
intermediary includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
397ad4135a BUILD: extcheck: needs to include stream-t.h
It uses the SF_ERR_* error codes and currently gets them via
intermediary includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
2476ff102f BUG/MEDIUM: sample: properly verify that variables cast to sample
The various variable-to-sample converters allow to turn a variable to
a sample of type string, sint or binary, but both the string one used
by strcmp() and the binary one used by secure_memcmp() are missing a
pointer check on the ability to cast, making them crash if a
variable of type addr is used with strcmp(), or if an addr or bool is
used with secure_memcmp().

Let's rely on the new sample_conv_var2smp() function to run the proper
checks.

This will need to be backported to all supported versions. It relies on
previous commits:

  CLEANUP: server: always include the storage for SSL settings
  CLEANUP: sample: rename sample_conv_var2smp() to *_sint
  CLEANUP: sample: uninline sample_conv_var2smp_str()
  MINOR: sample: provide a generic var-to-sample conversion function

For backports it's probably easier to check the sample_casts[] pointer
before calling it in sample_conv_strcmp() and sample_conv_secure_memcmp().
2021-10-07 01:36:51 +02:00
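The missing pointer check described in the commit above can be illustrated with a small standalone model of a cast table. Everything here is hypothetical (types, names, and which table entries are left NULL); it is not the sample_casts[] table nor the signature of sample_conv_var2smp(), only the general pattern of checking a cast pointer before calling it.

```c
#include <stddef.h>
#include <stdio.h>

enum sk_type { SK_T_BOOL, SK_T_SINT, SK_T_ADDR, SK_T_STR, SK_T_TYPES };

struct sk_sample {
	enum sk_type type;
	long sint;
	const char *str;
};

typedef int (*sk_cast_fn)(struct sk_sample *smp);

/* identity cast: nothing to do */
static int sk_cast_none(struct sk_sample *smp)
{
	(void)smp;
	return 1;
}

/* integer to string, using a static buffer to keep the sketch short */
static int sk_cast_sint2str(struct sk_sample *smp)
{
	static char buf[32];

	snprintf(buf, sizeof(buf), "%ld", smp->sint);
	smp->str = buf;
	smp->type = SK_T_STR;
	return 1;
}

/* Cast table indexed by [from][to]. A NULL entry means that the cast does
 * not exist. The choice of NULL entries is arbitrary in this sketch. */
static const sk_cast_fn sk_casts[SK_T_TYPES][SK_T_TYPES] = {
	[SK_T_SINT][SK_T_SINT] = sk_cast_none,
	[SK_T_SINT][SK_T_STR]  = sk_cast_sint2str,
	[SK_T_STR][SK_T_STR]   = sk_cast_none,
};

/* Generic conversion: checks that the cast exists before calling it,
 * instead of blindly dereferencing a possibly NULL function pointer.
 * Returns 1 on success, 0 if the cast is impossible or failed. */
int sk_cast_to(struct sk_sample *smp, enum sk_type to)
{
	sk_cast_fn fn = sk_casts[smp->type][to];

	if (!fn)
		return 0;
	return fn(smp);
}
```

Centralizing the check in one helper means every caller gets it for free, which is the point of relying on a single generic conversion function.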
Willy Tarreau
168e8de1d0 MINOR: sample: provide a generic var-to-sample conversion function
We're using variable-to-sample conversion at least 4 times in the code,
two of which are bogus. Let's introduce a generic conversion function
that performs the required checks.
2021-10-07 01:36:51 +02:00
Willy Tarreau
4034e2cb58 CLEANUP: sample: uninline sample_conv_var2smp_str()
There's no reason to limit this one to this file, it could be used in
other contexts.
2021-10-07 01:36:51 +02:00
Willy Tarreau
d9be599529 CLEANUP: sample: rename sample_conv_var2smp() to *_sint
This one only handles integers, contrary to its sibling with the suffix
_str that only handles strings. Let's rename it and uninline it since it
may well be used from outside.
2021-10-07 01:36:51 +02:00
Willy Tarreau
80527bcb9d CLEANUP: server: always include the storage for SSL settings
The SSL stuff in struct server takes less than 3% of it and requires
lots of annoying ifdefs in the code just to take care of the cases
where the field is absent. Let's get rid of this and stop including
openssl-compat from server.c to detect NPN and ALPN capabilities.

This reduces the total LoC by another 0.4%.
2021-10-07 01:36:51 +02:00
William Lallemand
746e6f3f8e MINOR: httpclient/lua: supports headers via named arguments
Migrate the httpclient:get() method to named arguments so we can
specify optional arguments.

This allows to pass headers as an optional argument as an array.

The () in the method call must be replaced by {}:

	local res = httpclient:get{url="http://127.0.0.1:9000/?s=99",
	            headers= {["X-foo"]  = { "salt" }, ["X-bar"] = {"pepper" }}}
2021-10-06 15:21:02 +02:00
William Lallemand
ef574b2101 BUG/MINOR: httpclient/lua: does not process headers when failed
Do not try to process the header list when it is NULL. This case can
occur when the request failed and did not return a response.
2021-10-06 15:15:03 +02:00
William Lallemand
2a879001b5 MINOR: httpclient: destroy checks if a client was started but not stopped
During httpclient_destroy, add a condition in the BUG_ON which checks
that the client was started before it has ended. A httpclient structure
could have been created without being started.
2021-10-06 15:15:03 +02:00
William Lallemand
4d60184887 BUG/MEDIUM: httpclient/lua: crash because of b_xfer and get_trash_chunk()
When using the lua httpclient, haproxy could crash because a b_xfer is
done in httpclient_xfer, which will do a zero-copy swap of the data in
the buffers. The ptr will then be freed by the pool.

However this can't work with a trash buffer, because the area was not
allocated from the pool buffer, so the pool is not supposed to free it
because it does not know this ptr. Using -DDEBUG_MEMORY_POOLS will
result in a crash during the free.

Fix the problem by using b_force_xfer() instead of b_xfer, which copies
the data instead. The problem still exists with the trash however, and
the trash API must be reworked.
2021-10-06 15:15:03 +02:00
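The difference between a zero-copy swap and a forced copy, as discussed in the commit above, can be sketched with a toy buffer type. This is an assumption-laden model: struct sk_buf and both functions are hypothetical and much simpler than the real buffer API (no wrapping, no conditions on when a swap is chosen). It only shows why a swap hands the destination a storage pointer it did not allocate.

```c
#include <stddef.h>
#include <string.h>

struct sk_buf {
	char  *area;  /* storage area */
	size_t size;  /* allocated size of <area> */
	size_t data;  /* bytes in use, starting at area[0] in this model */
};

/* Zero-copy transfer: the two buffers exchange their storage. After the
 * call, <dst> holds whatever pointer <src> had, wherever it came from. */
void sk_xfer_swap(struct sk_buf *dst, struct sk_buf *src)
{
	struct sk_buf tmp = *dst;

	*dst = *src;
	*src = tmp;
}

/* Copying transfer: moves as many bytes as fit into <dst>. Each buffer
 * keeps its own storage. Returns the number of bytes moved. */
size_t sk_xfer_copy(struct sk_buf *dst, struct sk_buf *src)
{
	size_t room = dst->size - dst->data;
	size_t n = src->data < room ? src->data : room;

	memcpy(dst->area + dst->data, src->area, n);
	memmove(src->area, src->area + n, src->data - n);
	dst->data += n;
	src->data -= n;
	return n;
}
```

If the source area belongs to a shared scratch buffer, only the copying variant leaves the destination with storage that its own allocator can safely free later.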
William Lallemand
f77f1de802 MINOR: httpclient/lua: implement garbage collection
Implement the garbage collector of the lua httpclient.

This patch declares the __gc method of the httpclient object which only
does a httpclient_stop_and_destroy().
2021-10-06 15:15:03 +02:00
William Lallemand
b8b1370307 MINOR: httpclient: test if started during stop_and_destroy()
If the httpclient was never started, it is safe to destroy completely
the httpclient.
2021-10-06 15:15:03 +02:00
William Lallemand
ecb83e13eb MINOR: httpclient: stop_and_destroy() ask the applet to autokill
httpclient_stop_and_destroy() tries to destroy the httpclient structure
if the client was stopped.

In the case the client wasn't stopped, it asks the client to stop itself
and to destroy the httpclient structure itself during the release of the
applet.
2021-10-06 15:15:03 +02:00
William Lallemand
739f90a6ef MINOR: httpclient: set HTTPCLIENT_F_ENDED only in release
Only set the HTTPCLIENT_F_ENDED flag in httpclient_applet_release()
function so we are sure that the appctx is not used anymore once the
flag is set.
2021-10-06 15:15:03 +02:00
William Lallemand
03f5a1c77d MINOR: httpclient: destroy() must free the headers and the ists
httpclient_destroy() must free all the ist in the httpclient structure,
the URL in the request, the vsn and reason in the response.

It also must free the list of headers of the response.
2021-10-06 15:15:03 +02:00
Christopher Faulet
d34758849e BUG/MEDIUM: http-ana: Clear request analyzers when applying redirect rule
A bug was introduced by the commit 2d5650082 ("BUG/MEDIUM: http-ana: Reset
channels analysers when returning an error").

The request analyzers must be cleared when a redirect rule is applied. It is
not a problem if the redirect rule is inside an http-request ruleset because
the analyzer takes care to clear it. However, when it comes from a redirect
ruleset (via the "redirect ..."  directive), because of the above commit,
the request analyzers are no longer cleared. It means some HTTP request
analyzers may be called while the request channel was already flushed. It is
totally unexpected and may lead to crash.

Thanks to Yves Lafon for reporting the problem.

This patch must be backported everywhere the above commit was backported.
2021-10-04 14:32:02 +02:00
Christopher Faulet
d28b2b2352 BUG/MEDIUM: filters: Fix a typo when a filter is attached blocking the release
When a filter is attached to a stream, the wrong FLT_END analyzer is added
on the request channel. AN_REQ_FLT_END must be added instead of
AN_RES_FLT_END. Because of this bug, the stream may hang on the filter
release stage.

It seems to be ok for HTTP filters (cache & compression) in HTTP mode. But
when enabled on a TCP proxy, the stream is blocked until the client or the
server timeout expire because data forwarding is blocked. The stream is then
prematurely aborted.

This bug was introduced by commit 26eb5ea35 ("BUG/MINOR: filters: Always set
FLT_END analyser when CF_FLT_ANALYZE flag is set"). The patch must be
backported in all stable versions.
2021-10-04 08:28:44 +02:00
Willy Tarreau
6dfab112e1 REORG: sched: move idle time calculation from time.h to task.h
time.h is a horrible place to put activity calculation, it's a
historical mistake because the functions were there. We already have
most of the parts in sched.{c,h} and these ones make an exception in
the middle, forcing time.h to include some thread stuff and to access
the before/after_poll and idle_pct values.

Let's move these 3 functions to task.h with the other ones. They were
prefixed with "sched_" instead of the historical "tv_" which already
made no sense anymore.
2021-10-01 18:37:51 +02:00
Willy Tarreau
6136989a22 MINOR: time: uninline report_idle() and move it to task.c
I don't know why I inlined this one, this makes no sense given that it's
only used for stats, and it starts a circular dependency on tinfo.h which
can be problematic in the future. In addition, all the stuff related to
idle time calculation should be with the rest of the scheduler, which
currently is in task.{c,h}, so let's move it there.
2021-10-01 18:37:50 +02:00
Willy Tarreau
beeabf5314 MINOR: task: provide 3 task_new_* wrappers to simplify the API
We'll need to improve the API to pass other arguments in the future, so
let's start to adapt better to the current use cases. task_new() is used:
  - 18 times as task_new(tid_bit)
  - 18 times as task_new(MAX_THREADS_MASK)
  - 2 times with a single bit (in a loop)
  - 1 in the debug code that uses a mask

This patch provides 3 new functions to achieve this:
  - task_new_here()     to create a task on the calling thread
  - task_new_anywhere() to create a task to be run anywhere
  - task_new_on()       to create a task to run on a specific thread

The change is trivial and will allow us to later concentrate the
required adaptations to these 3 functions only. It's still possible
to call task_new() if needed but a comment was added to encourage the
use of the new ones instead. The debug code was not changed and still
uses it.
2021-10-01 18:36:29 +02:00
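The three wrappers listed in the commit above can be modeled as thin layers over a mask-based constructor. The sketch below is standalone and hypothetical: the sk_* names, the reduced task structure and the sk_tid variable are stand-ins, not the real definitions. It mainly shows that a thread number and a thread mask are different things, since thread 0 corresponds to mask 1.

```c
#include <stdlib.h>

struct sk_task {
	unsigned long thread_mask; /* one bit per thread allowed to run it */
};

/* ID of the calling thread, 0-based (stand-in for a thread-local) */
static unsigned int sk_tid;

/* mask-based constructor, similar in spirit to task_new(mask) */
struct sk_task *sk_task_new(unsigned long mask)
{
	struct sk_task *t = malloc(sizeof(*t));

	if (t)
		t->thread_mask = mask;
	return t;
}

/* task bound to the calling thread */
struct sk_task *sk_task_new_here(void)
{
	return sk_task_new(1UL << sk_tid);
}

/* task allowed to run on any thread */
struct sk_task *sk_task_new_anywhere(void)
{
	return sk_task_new(~0UL);
}

/* task bound to thread number <thr>, NOT to mask <thr> */
struct sk_task *sk_task_new_on(unsigned int thr)
{
	return sk_task_new(1UL << thr);
}
```

When converting old callers, a former mask of 1 therefore becomes a thread number of 0 with the last wrapper.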
Willy Tarreau
6a2a912cb8 CLEANUP: tasks: remove the long-unused work_lists
Work lists were a mechanism introduced in 1.8 to asynchronously delegate
some work to be performed on another thread via a dedicated task.
The only user was the listeners, to deal with the queue. Nowadays
the tasklets have made this much more convenient, and have replaced
work_lists in the listeners. It seems there will be no valid use case
of work lists anymore, so better get rid of them entirely and keep the
scheduler code cleaner.
2021-10-01 18:30:14 +02:00
Willy Tarreau
7a9699916a MINOR: tasks: catch TICK_ETERNITY with BUG_ON() in __task_queue()
__task_queue() must absolutely not be called with TICK_ETERNITY or it
will place a never-expiring node upfront in the timers queue, preventing
any timer from expiring until the process is restarted. Code was found
to cause this using "task_schedule(task, now_ms)" which does this one
millisecond every 49.7 days, so let's add a condition against this. It
must never trigger since any process susceptible to trigger it would
already accumulate tasks until it dies.

An extra test was added in wake_expired_tasks() to detect tasks whose
timeout would have been changed after being queued.

An improvement over this could be in the future to use a non-scalar
type (union/struct) for expiration dates so as to avoid the risk of
using them directly like this. But now_ms is already such a valid
time and this specific construct would still not be caught.

This could even be backported to stable versions to help detect other
occurrences if any.
2021-09-30 17:09:39 +02:00
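The "one millisecond every 49.7 days" figure above follows from a 32-bit millisecond counter in which the value 0 is reserved to mean "never expires". The sketch below models this with hypothetical sk_* helpers. It is not the real tick API nor the real __task_queue() check, only an illustration of why a raw timestamp must not be used as an expiration date.

```c
#include <stdint.h>

#define SK_TICK_ETERNITY 0u /* reserved value: no expiration date */

/* Adds <timeout> to <now> and skips the reserved value, so that a valid
 * date never collides with SK_TICK_ETERNITY. */
static inline uint32_t sk_tick_add(uint32_t now, uint32_t timeout)
{
	now += timeout;
	if (now == SK_TICK_ETERNITY)
		now++;
	return now;
}

/* An unset date never expires, otherwise use a wrap-safe comparison */
static inline int sk_tick_is_expired(uint32_t exp, uint32_t now)
{
	if (exp == SK_TICK_ETERNITY)
		return 0;
	return (int32_t)(exp - now) <= 0;
}

/* Guard in the spirit of the added BUG_ON(): returns 0 if <expire> must
 * not be placed in a timer queue. */
int sk_may_queue(uint32_t expire)
{
	return expire != SK_TICK_ETERNITY;
}

/* Period of the 32-bit millisecond counter, in days */
double sk_wrap_days(void)
{
	return 4294967296.0 / 86400000.0;
}
```

Passing the raw counter as a date is only wrong during the single millisecond where it equals 0, which is why such a bug can stay unnoticed for a long time.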
Christopher Faulet
cb59e0bc3c BUG/MINOR: tcp-rules: Stop content rules eval on read error and end-of-input
For now, tcp-request and tcp-response content rules evaluation is
interrupted before the inspect-delay when the channel's buffer is full, the
RX path is blocked or when a shutdown for reads was received. To sum up, the
evaluation is interrupted when no more input data are expected. However, it
is not exhaustive. It also happens when end of input is reached (CF_EOI flag
set) or when a read error occurred (CF_READ_ERROR flag set).

Note that, AFAIK, it is only a problem on HAProxy 2.3 and prior when a H1 to
H2 upgrade is performed. On newer versions, it works as expected because the
stream is not created at this stage.

This patch must be backported as far as 2.0.
2021-09-30 16:37:29 +02:00
Christopher Faulet
eaba25dd97 BUG/MINOR: tcpcheck: Don't use arg list for default proxies during parsing
During tcp/http check rules parsing, when a sample fetch or a log-format
string is parsed, the proxy's argument list used to track unresolved
argument is no longer passed for default proxies. It means it is no longer
possible to rely on sample fetches depending on the execution context (for
instance 'nbsrv').

It is important to avoid HAProxy crashes because these arguments are
resolved during the configuration validity check. But, default proxies are
not evaluated during this stage. Thus, these arguments remain unresolved.

It will probably be possible to relax this rule. But to ease backports, it
is forbidden for now.

This patch must be backported as far as 2.2. It depends on the commit
"MINOR: arg: Be able to forbid unresolved args when building an argument
list".  It must be adapted for the 2.3 because PR_CAP_DEF capability was
introduced in the 2.4. A solution may be to test the proxy's id against NULL.
2021-09-30 16:37:05 +02:00
Christopher Faulet
35926a16ac MINOR: arg: Be able to forbid unresolved args when building an argument list
In make_arg_list() function, unresolved dependencies are pushed in an
argument list to be resolved later, during the configuration validity
check. It is now possible to forbid such unresolved dependencies by omitting
<al> parameter (setting it to NULL). It is useful when the parsing context
is not the same as the running context or when the parsing context is lost
after the startup stage. For instance, an argument may be defined in
defaults section during parsing and executed in a frontend/backend section.
2021-09-30 16:37:05 +02:00
Willy Tarreau
e3957f83e0 BUG/MAJOR: lua: use task_wakeup() to properly run a task once
The Lua tasks registered via core.register_task() use a dangerous
task_schedule(task, now_ms) to start them, that will most of the
time work by accident, except when the time wraps every 49.7 days,
if now_ms is 0, because it's not valid to queue a task with an
expiration date set to TICK_ETERNITY, as it will fail all wakeup
checks and prevent all subsequent timers from being seen as expired.
The only solution in this case is to restart the process.

Fortunately for the vast majority of users it is extremely unlikely
to ever be met (only one millisecond every 49.7 days is at risk), but
this can be systematic for a process dealing with 1000 req/s, hence
the major tag.

The bug was introduced in 1.6-dev with commit 24f335340 ("MEDIUM: lua:
add coroutine as tasks."), so the fix must be backported to all stable
branches.
2021-09-30 16:26:51 +02:00
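The time-wrapping hazard described in commit e3957f83e0 can be illustrated with a minimal sketch. It assumes a 32-bit millisecond counter and a tick value of 0 meaning "eternity", as the message implies. The names sketch_wrap_days(), sketch_expire_is_unsafe() and SKETCH_TICK_ETERNITY are hypothetical and are not taken from the HAProxy sources.

```c
#include <stdint.h>

/* Hypothetical stand-in: a tick of 0 means "never expires". */
#define SKETCH_TICK_ETERNITY 0u

/* Number of whole days before a 32-bit millisecond counter wraps. */
static inline unsigned sketch_wrap_days(void)
{
	uint64_t ms_per_day = 24ull * 60 * 60 * 1000;  /* 86400000 */

	/* 4294967296 / 86400000 = 49 in integer division (about 49.7 days) */
	return (unsigned)((1ull << 32) / ms_per_day);
}

/* Returns 1 when using <now_ms> directly as an expiration date would
 * collide with the "eternity" marker, 0 otherwise.
 */
static inline int sketch_expire_is_unsafe(uint32_t now_ms)
{
	return now_ms == SKETCH_TICK_ETERNITY;
}
```

Only one value out of 2^32 is unsafe, which matches the "one millisecond every 49.7 days" figure in the message.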
Willy Tarreau
12c02701d3 BUG/MEDIUM: lua: fix wakeup condition from sleep()
A time comparison was wrong in hlua_sleep_yield(), making the sleep()
code do nothing for periods of 24 days every 49 days. An arithmetic
comparison was performed on now_ms instead of using tick_is_expired().

This bug was added in 1.6-dev by commit 5b8608f1e ("MINOR: lua: core:
add sleep functions") so the fix should be backported to all stable
versions.
2021-09-30 16:26:51 +02:00
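The difference between an arithmetic comparison and a wrapping-aware one, as mentioned in commit 12c02701d3, may be sketched as follows. This is only an illustration assuming 32-bit millisecond dates. Both function names are hypothetical, and the second one is not the implementation of tick_is_expired().

```c
#include <stdint.h>

/* Naive check: gives wrong results when the counter wraps between
 * the current date and the wakeup date.
 */
static inline int sketch_naive_expired(uint32_t wake, uint32_t now)
{
	return now >= wake;
}

/* Wrapping-aware check: the signed difference remains meaningful as
 * long as both dates are less than 2^31 ms (about 24.8 days) apart.
 */
static inline int sketch_wrap_safe_expired(uint32_t wake, uint32_t now)
{
	return (int32_t)(wake - now) <= 0;
}
```

With now = 0xFFFFFFF0 and a wakeup date 32 ms later (0x10 after the wrap), the naive check reports the timer as already expired while the wrapping-aware one does not.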
Remi Tricot-Le Breton
9543d5ad5b MINOR: ssl: Store the last SSL error code in case of read or write failure
In case of error while calling a SSL_read or SSL_write, the
SSL_get_error function is called in order to know more about the error
that happened. If the error code is SSL_ERROR_SSL or SSL_ERROR_SYSCALL,
the error queue might contain more information on the error. This error
code was not used until now. But we now need to store it in order for
backend error fetches to catch all handshake related errors.

The change was required because the previous backend fetch would not
have raised anything if the client's certificate was rejected by the
server (and the connection interrupted). This happens because starting
from TLS1.3, the 'Finished' state on the client is reached before its
certificate is sent to the server (see the "Protocol Overview" part of
RFC 8446). The only place where we can detect that the server rejected the
certificate is after the first SSL_read call after the SSL_do_handshake
function.

This patch then adds an extra ERR_peek_error after the SSL_read and
SSL_write calls in ssl_sock_to_buf and ssl_sock_from_buf. This means
that it could set an error code in the SSL context a long time after the
handshake is over, hence the change in the error fetches.
2021-09-30 11:04:35 +02:00
Remi Tricot-Le Breton
1fe0fad88b MINOR: ssl: Rename ssl_bc_hsk_err to ssl_bc_err
The ssl_bc_hsk_err sample fetch will need to raise more errors than only
handshake related ones hence its renaming to a more generic ssl_bc_err.
This patch is required because some handshake failures that should have
been caught by this fetch (verify error on the server side for instance)
were missed. This is caused by a change in TLS1.3 in which the
'Finished' state on the client is reached before its certificate is sent
(and verified) on the server side (see the "Protocol Overview" part of
RFC 8446).
This means that the SSL_do_handshake call is finished long before the
server can verify and potentially reject the client certificate.

The ssl_bc_hsk_err will then need to be expanded to catch other types of
errors.

This change is also applied to the frontend fetches (ssl_fc_hsk_err
becomes ssl_fc_err) and to their string counterparts.
2021-09-30 11:04:35 +02:00
Remi Tricot-Le Breton
61944f7a73 MINOR: ssl: Set connection error code in case of SSL read or write fatal failure
In case of a connection error happening after the SSL handshake is
completed, the error code stored in the connection structure would not
always be set, hence having some connection failures being described as
successful in the fc_conn_err or bc_conn_err sample fetches.
The most common case in which it could happen is when the SSL server
rejects the client's certificate. The SSL_do_handshake call on the
client side would be successful because the client effectively sent its
client hello and certificate information to the server, but the next
call to SSL_read on the client side would raise an SSL_ERROR_SSL code
(through the SSL_get_error function) which is described in OpenSSL
documentation as a non-recoverable and fatal SSL error.
This patch ensures that in such a case, the connection's error code is
set to a special CO_ERR_SSL_FATAL value.
2021-09-30 11:04:35 +02:00
Christopher Faulet
da3adebd06 BUG/MEDIUM: mux-h1/mux-fcgi: Reject messages with unknown transfer encoding
HAProxy only handles "chunked" encoding internally. Because it is a gateway,
we stated it was not a problem if unknown encodings were applied on a
message because it is the recipient's responsibility to accept the message or
not. And indeed, it is not a problem if both the client and the server
connections are using H1. However, Transfer-Encoding headers are dropped
from H2 messages. It is not a problem for chunk-encoded payload because
dechunking is performed during H1 parsing. But, for any other encodings, the
xferred H2 message is invalid.

It is also a problem for internal payload manipulations (lua,
filters...). Because the TE request headers are now sanitized, unsupported
encoding should not be used by servers. Thus it is only a problem for the
request messages. For this reason, such messages are now rejected. And if a
server decides to use an unknown encoding, the response will also be
rejected.

Note that it is pretty uncommon to use other encoding than "chunked" on the
request payload. So it is not necessary to backport it.

This patch should fix the issue #1301. No backport is needed.
2021-09-28 16:39:47 +02:00
Christopher Faulet
545fbba273 MINOR: h1: Change T-E header parsing to fail if chunked encoding is found twice
According to the RFC7230, "chunked" encoding must not be applied more than
once to a message body. To handle this case, h1_parse_xfer_enc_header() is
now responsible to fail when a parsing error is found. It also fails if the
"chunked" encoding is not the last one for a request.

To help the parsing, two H1 parser flags have been added: H1_MF_TE_CHUNKED
and H1_MF_TE_OTHER. These flags are set, respectively, when "chunked"
encoding and any other encoding are found. H1_MF_CHNK flag is used when
"chunked" encoding is the last one.
2021-09-28 16:21:25 +02:00
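The parsing rules described in commit 545fbba273 can be sketched as below. This is not h1_parse_xfer_enc_header() itself. The SK_* flags only mirror the roles the message gives to H1_MF_TE_CHUNKED, H1_MF_TE_OTHER and H1_MF_CHNK, and the helper names are hypothetical. Coding parameters are not handled.

```c
#include <ctype.h>
#include <stddef.h>

#define SK_TE_CHUNKED 0x1  /* "chunked" was seen */
#define SK_TE_OTHER   0x2  /* another coding was seen */
#define SK_CHNK       0x4  /* "chunked" is the last coding */

/* Case-insensitive comparison of a <len>-byte token with "chunked". */
static int sk_is_chunked(const char *tok, size_t len)
{
	static const char ref[] = "chunked";
	size_t i;

	if (len != sizeof(ref) - 1)
		return 0;
	for (i = 0; i < len; i++)
		if (tolower((unsigned char)tok[i]) != ref[i])
			return 0;
	return 1;
}

/* Parses a comma-separated Transfer-Encoding value. Returns the flags
 * on success, or -1 if "chunked" appears twice or, for a request, if
 * "chunked" is present but is not the last coding.
 */
static int sk_parse_te(const char *value, int is_request)
{
	int flags = 0;
	const char *p = value;

	while (*p) {
		const char *start, *end;

		/* skip separators and leading whitespace */
		while (*p == ' ' || *p == '\t' || *p == ',')
			p++;
		start = p;
		while (*p && *p != ',')
			p++;
		end = p;
		/* trim trailing whitespace */
		while (end > start && (end[-1] == ' ' || end[-1] == '\t'))
			end--;
		if (end == start)
			continue;

		if (sk_is_chunked(start, (size_t)(end - start))) {
			if (flags & SK_TE_CHUNKED)
				return -1;      /* applied more than once */
			flags |= SK_TE_CHUNKED | SK_CHNK;
		} else {
			flags |= SK_TE_OTHER;
			flags &= ~SK_CHNK;  /* "chunked" is no longer last */
		}
	}
	if (is_request && (flags & SK_TE_CHUNKED) && !(flags & SK_CHNK))
		return -1;
	return flags;
}
```

For instance "gzip, chunked" is accepted with all three flags set, while "chunked, chunked" is always rejected and "chunked, gzip" is rejected for a request only.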
Christopher Faulet
92cafb39e7 MINOR: http: Add 422-Unprocessable-Content error message
The last HTTP/1.1 draft adds the 422 status code in the list of client
errors. It normalizes the WebDav specific one (422-Unprocessable-Entity).
2021-09-28 16:21:25 +02:00
Christopher Faulet
f56e8465f0 BUG/MINOR: mux-h1/mux-fcgi: Sanitize TE header to only send "trailers"
Only chunk-encoded response payloads are supported by HAProxy. All other
transfer encodings are not supported and will be an issue if the HTTP
compression is enabled. So be sure only "trailers" is sent in TE request
headers.

The patch is related to the issue #1301. It must be backported to all stable
versions. Be careful for 2.0 and lower because the HTTP legacy must also be
fixed.
2021-09-28 16:21:25 +02:00
Christopher Faulet
631c7e8665 MEDIUM: h1: Force close mode for invalid uses of T-E header
Transfer-Encoding header is not supported in HTTP/1.0. However, software
dealing with HTTP/1.0 and HTTP/1.1 messages may accept it and transfer
it. When a Content-Length header is also provided, it must be
ignored. Unfortunately, this may lead to vulnerabilities (request smuggling
or response splitting) if an intermediary is only implementing
HTTP/1.0. Because it may ignore Transfer-Encoding header and only handle
Content-Length one.

To avoid any security issues, when Transfer-Encoding and Content-Length
headers are found in a message, the close mode is forced. The same is
performed for HTTP/1.0 message with a Transfer-Encoding header only. This
change conforms to what is described in the last HTTP/1.1 draft. See
also httpwg/http-core#879.

Note that Content-Length header is also removed from any incoming messages
if a Transfer-Encoding header is found. However it is not true (not yet) for
responses generated by HAProxy.
2021-09-28 16:21:25 +02:00
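The decision described in commit 631c7e8665 can be summarized by a small sketch. The structure and function below are hypothetical and only model the rules stated in the message for incoming messages. They do not reflect the actual H1 parser state.

```c
#include <stdbool.h>

/* Hypothetical summary of what was seen while parsing a message. */
struct sk_msg {
	bool http_1_0;     /* message version is HTTP/1.0 */
	bool has_te;       /* a Transfer-Encoding header is present */
	bool has_cl;       /* a Content-Length header is present */
	bool force_close;  /* close the connection after this message */
};

/* Applies the rules for a message carrying Transfer-Encoding. */
static void sk_apply_te_rules(struct sk_msg *m)
{
	if (!m->has_te)
		return;

	if (m->has_cl) {
		/* Content-Length is ignored and removed, close mode is forced */
		m->has_cl = false;
		m->force_close = true;
	}
	if (m->http_1_0) {
		/* Transfer-Encoding is not supported in HTTP/1.0 */
		m->force_close = true;
	}
}
```

An HTTP/1.1 message with only a Transfer-Encoding header is left untouched by this sketch.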
Christopher Faulet
e136bd12a3 MEDIUM: mux-h1: Reject HTTP/1.0 GET/HEAD/DELETE requests with a payload
This kind of requests is now forbidden and rejected with a
413-Payload-Too-Large error.

It is unexpected to have a payload for GET/HEAD/DELETE requests. It is
explicitly allowed in HTTP/1.1 even if some servers may reject such
requests. However, HTTP/1.0 is not clear on this point and some old servers
don't expect any payload and never look for body length (via Content-Length
or Transfer-Encoding headers).

It means that some intermediaries may properly handle the payload for
HTTP/1.0 GET/HEAD/DELETE requests, while some others may totally ignore
it. That may lead to security issues because a request smuggling attack is
possible.

To prevent any issue, those requests are now rejected.

See also httpwg/http-core#904
2021-09-28 16:21:11 +02:00
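The check introduced by commit e136bd12a3 may be pictured with the following sketch. The function name and its arguments are hypothetical. It only returns the status the message says is used for the rejection (413), or 0 when the request is acceptable.

```c
#include <string.h>

/* Returns 413 if an HTTP/1.0 GET/HEAD/DELETE request carries a
 * payload, otherwise 0 (nothing to reject).
 */
static int sk_check_payload(const char *method, int http_1_0, int has_payload)
{
	if (!http_1_0 || !has_payload)
		return 0;

	if (strcmp(method, "GET") == 0 ||
	    strcmp(method, "HEAD") == 0 ||
	    strcmp(method, "DELETE") == 0)
		return 413;

	return 0;
}
```

The same request in HTTP/1.1 is not rejected by this sketch, since the message states a payload is explicitly allowed there.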
Christopher Faulet
b3230f76e8 MINOR: mux-h1: Be able to set custom status code on parsing error
When a parsing error is triggered, the status code may be customized by
setting H1C .errcode field. By default a 400-Bad-Request is returned. The
function h1_handle_bad_req() has been renamed to h1_handle_parsing_error()
to be more generic.
2021-09-28 16:18:17 +02:00
Christopher Faulet
36e46aa28c MINOR: mux-h1: Set error code if possible when MUX_EXIT_STATUS is returned
In h1_ctl(), if output parameter is provided when MUX_EXIT_STATUS is
returned, it is used to set the error code. In addition, any client errors
(4xx), except for 408 ones, are handled as invalid errors
(MUX_ES_INVALID_ERR). This way, it will be possible to customize the parsing
error code for request messages.
2021-09-28 16:17:59 +02:00
Christopher Faulet
a015b3ec8b MINOR: log: Try to get the status code when MUX_EXIT_STATUS is retrieved
The mux .ctl callback can provide some information about the mux to the
caller if the third parameter is provided. Thus, when MUX_EXIT_STATUS is
retrieved, a pointer on the status is now passed. The mux may fill it. It
will be pretty handy to provide custom error code from h1 mux instead of
default ones (400/408/500/501).
2021-09-28 13:52:25 +02:00
Willy Tarreau
2d5d4e0c3e MINOR: init: extract the setup and end of threads to their own functions
The startup code was still ugly with tons of unreadable nested ifdefs.
Let's just have one function to set up the extra threads and another one
to wait for their completion. The ifdefs are isolated into their own
functions now and are more readable, just like the end of main(), which
now uses the same statements to start thread 0 with and without threads.
2021-09-28 11:44:31 +02:00
Willy Tarreau
fb641d7af0 MEDIUM: init: de-uglify the per-thread affinity setting
Till now the threads startup was quite messy:
  - we would start all threads but one
  - then we would change all threads' CPU affinities
  - then we would manually start the poll loop for the current thread

Let's change this by moving the CPU affinity setting code to a function
set_thread_cpu_affinity() that does this job for the current thread only,
and that is called during the thread's initialization in the polling loop.

It takes care of not doing this for the master, and will result in all
threads to be properly bound earlier and with cleaner code. It also
removes some ugly nested ifdefs.
2021-09-28 11:42:19 +02:00
Willy Tarreau
2a30f4d87e CLEANUP: init: remove useless test against MAX_THREADS in affinity loop
The test i < MAX_THREADS is pointless since the loop boundary is bound
to global.nbthread which is already not greater.
2021-09-28 09:56:44 +02:00
Willy Tarreau
51ec03a61d MINOR: config: use a standard parser for the "nbthread" keyword
Probably because of some copy-paste from "nbproc", "nbthread" used to
be parsed in cfgparse instead of using a registered parser. Let's fix
this to clean up the code base now.
2021-09-27 09:47:40 +02:00
William Lallemand
614e68337d BUG/MEDIUM: httpclient: replace ist0 by istptr
ASAN reported a buffer overflow in the httpclient. This overflow is the
consequence of ist0() which is incorrect here.

Replace all occurrences of ist0() by istptr() which is more appropriate
here since all ist in the httpclient were created from strings.
2021-09-26 18:19:55 +02:00
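The difference exploited by commit 614e68337d can be sketched with a simplified replica of the ist type. It assumes that ist0() terminates the string in place by writing a NUL byte right after the last character, while istptr() only returns the start pointer. The sk_* names are hypothetical copies for illustration, not the real helpers.

```c
#include <stddef.h>

/* Simplified replica of an indirect string: pointer plus length. */
struct sk_ist {
	char *ptr;
	size_t len;
};

/* Returns the start of the string without touching memory. */
static inline char *sk_istptr(struct sk_ist s)
{
	return s.ptr;
}

/* Writes a NUL byte at ptr[len], then returns the start pointer.
 * This writes one byte past the string's own area.
 */
static inline char *sk_ist0(struct sk_ist s)
{
	s.ptr[s.len] = 0;
	return s.ptr;
}
```

If the area holding the string is exactly <len> bytes long, the write performed by the second helper lands outside of it, which is the kind of overflow a tool like ASAN reports.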
William Lallemand
4a4e663771 Revert "head-truc"
This reverts commit fe67e091859b07dca4622981a8d98a0b64de3cab.

Revert a development/test patch which was accidentally introduced.
2021-09-24 19:19:37 +02:00
William Lallemand
7d21836bc6 head-truc 2021-09-24 19:05:41 +02:00
Tim Duesterhus
eaf16fcb53 CLEANUP: slz: Mark reset_refs as static
This function has no prototype and is not used outside of slz.c.
2021-09-24 15:07:50 +02:00
William Lallemand
79416cbd7a BUG/MINOR: httpclient/lua: return an error on argument check
src/hlua.c:7074:6: error: variable 'url_str' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
        if (lua_type(L, -1) == LUA_TSTRING)
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
src/hlua.c:7079:36: note: uninitialized use occurs here
        hlua_hc->hc->req.url = istdup(ist(url_str));
                                          ^~~~~~~

Return an error on the stack if the argument is not a string.
2021-09-24 14:57:15 +02:00
William Lallemand
d7df73a114 MINOR: httpclient/lua: implement the headers in the response object
Provide a new field "headers" in the response of the HTTPClient, which
contains all headers of the response.

This field is a multi-dimensional table which could be represented this
way in lua:

    headers = {
       ["content-type"] = { "text/html" },
       ["cache-control"] = { "no-cache" }
    }
2021-09-24 14:29:36 +02:00
William Lallemand
3956c4ead2 MINOR: httpclient/lua: httpclient:get() API in lua
This commit provides an hlua_httpclient object which is a bridge between
the httpclient and the lua API.

The HTTPClient is callable in lua this way:

    local httpclient = core.httpclient()
    local response = httpclient:get("http://127.0.0.1:9000/?s=9999")
    core.Debug("Status: ".. response.status .. ", Reason : " .. response.reason .. ", Len:" .. string.len(response.body) .. "\n")

The resulting response object will provide a "status" field which
contains the status code, a "reason" string which contains the reason
string, and a "body" field which contains the response body.

The implementation uses the httpclient callback to wake up the lua task
which yield each time it pushes some data. The httpclient works in the
same thread as the lua task.
2021-09-24 14:29:36 +02:00
William Lallemand
1123dde6dd MINOR: httpclient: httpclient_ended() returns 1 if the client ended
httpclient_ended() returns 1 if there is no more data to collect,
because the client received everything or the connection ended.
2021-09-24 14:21:26 +02:00
William Lallemand
518878e007 MINOR: httpclient: httpclient_data() returns the available data
httpclient_data() returns the available data in the httpclient.
2021-09-24 14:21:26 +02:00
Thierry Fournier
b6b1cdeae4 CLEANUP: stats: Fix some alignment mistakes
This patch fixes some broken alignments. Code is not modified.
The command `git show -w` shows nothing.
2021-09-24 08:52:45 +02:00
Thierry Fournier
e9ed63e548 MINOR: stats: Enable dark mode on stat web page
According to the W3C CSS specification, media queries 5 allow
the browser to enable some CSS when dark mode is enabled. This
patch defines dark mode CSS for the stats page.

https://www.w3.org/TR/mediaqueries-5/#prefers-color-scheme
2021-09-24 08:27:40 +02:00
Dragan Dosen
9a006f9641 BUG/MINOR: http-ana: increment internal_errors counter on response error
A bug was introduced in the commit cff0f739e5 ("MINOR: counters: Review
conditions to increment counters from analysers"). The internal_errors
counter for the target server was incremented twice. The counter for the
session listener needs to be incremented instead.

This must be backported everywhere the commit cff0f739e5 is.
2021-09-23 16:25:47 +02:00
Christopher Faulet
564e39c4c6 MINOR: stream-int: Notify mux when the buffer is not stuck when calling rcv_buf
The transient flag CO_RFL_BUF_NOT_STUCK should now be set when the mux's
rcv_buf() function is called, in si_cs_recv(), to be sure the mux is able to
perform some optimisation during data copy. This flag is set when we are
sure the channel buffer is not stuck. Concretely, it happens when there are
data scheduled to be sent.

It is not a fix and this flag is not used for now. But it makes sense to have
this info to be sure to be able to do some optimisations if necessary.

This patch is related to the issue #1362. It may be backported to 2.4 to
ease future backports.
2021-09-23 16:25:47 +02:00
Christopher Faulet
2bc364c191 BUG/MEDIUM: stream-int: Defrag HTX message in si_cs_recv() if necessary
The stream interface is now responsible for defragmenting the HTX message of
the input channel if necessary, before calling the mux's .rcv_buf()
function. The defrag is performed if the underlying buffer contains only
input data while the HTX message free space is not contiguous.

The defrag is important here to be sure the mux and the app layer have the
same criteria to decide if a buffer is full or not. Otherwise, the app layer
may wait for more data because the buffer is not full while the mux is
blocked because it needs more space to proceed.

This patch depends on following commits:

  * MINOR: htx: Add an HTX flag to know when a message is fragmented
  * MINOR: htx: Add a function to know if the free space wraps

This patch is related to the issue #1362. It may be backported as far as 2.0
after some observation period (not sure it is required or not).
2021-09-23 16:25:16 +02:00
Christopher Faulet
4697c92c9d MINOR: htx: Add an HTX flag to know when a message is fragmented
HTX_FL_FRAGMENTED flag is now set on an HTX message when it is
fragmented. It happens when an HTX block is removed in the middle of the
message and flagged as unused. HTX_FL_FRAGMENTED flag is removed when all
data are removed from the message or when the message is defragmented.

Note that some optimisations are still possible because the flag can be
avoided in other situations. For instance when the last header of a bodyless
message is removed.
2021-09-23 16:19:36 +02:00
Christopher Faulet
68a14db573 MINOR: stream-int: Set CO_RFL transient/persistent flags apart in si_cs_rcv()
In si_cs_recv(), some CO_RFL flags are set when the mux's .rcv_buf()
function is called. Some are persistent inside si_cs_recv() scope, some
others must be computed at each call to rcv_buf(). This patch takes care of
distinguishing them.

Among others, CO_RFL_KEEP_RECV is a persistent flag while CO_RFL_BUF_WET is
transient.
2021-09-23 16:19:36 +02:00
Christopher Faulet
7833596ff4 BUG/MEDIUM: stream: Stop waiting for more data if SI is blocked on RXBLK_ROOM
If the stream-interface is waiting for more buffer room to store incoming
data, it is important at the stream level to stop to wait for more data to
continue. Thanks to the previous patch ("BUG/MEDIUM: stream-int: Notify
stream that the mux wants more room to xfer data"), the stream is woken up
when this happens. In this patch, we take care to interrupt the
corresponding tcp-content ruleset or to stop waiting for the HTTP message
payload.

To ease detection of the state, si_rx_blocked_room() helper function has
been added. It returns non-zero if the stream interface's Rx path is blocked
because of lack of room in the input buffer.

This patch is part of a series related to the issue #1362. It should be
backported as far as 2.0, probably with some adaptations. So be careful
during backports.
2021-09-23 16:18:07 +02:00
Christopher Faulet
df99408e0d BUG/MEDIUM: stream-int: Notify stream that the mux wants more room to xfer data
When the mux failed to transfer data to the upper layer because of a lack of
room, it is important to wake the stream up to let it handle this
event. Otherwise, if the stream is waiting for more data, both the stream
and the mux remain blocked waiting for each other.

When this happens, the mux sets the CS_FL_WANT_ROOM flag on the
conn-stream. Thus, in si_cs_recv() we are able to detect this event. Today,
the stream-interface is blocked. But, it is not enough to wake the stream
up. To fix the bug, CF_READ_PARTIAL flag is extended to also handle cases
where a read exception occurred. This flag should ideally be renamed. But for
now, it is good enough. By setting this flag, we are sure the stream will be
woken up.

This patch is part of a series related to the issue #1362. It should be
backported as far as 2.0, probably with some adaptations. So be careful
during backports.
2021-09-23 16:16:57 +02:00
Christopher Faulet
46e058dda5 BUG/MEDIUM: mux-h1: Adjust conditions to ask more space in the channel buffer
When a message is parsed and copied into the channel buffer, in
h1_process_demux(), more space is requested if some pending data remain
after the parsing while the channel buffer is not empty. To do so,
CS_FL_WANT_ROOM flag is set. It means the H1 parser needs more space in the
channel buffer to continue. In the stream-interface, when this flag is set,
the SI is considered as blocked on the RX path. It is only unblocked when
some data are sent.

However, it is not accurate because the parsing may be stopped because
there is not enough data to continue. For instance in the middle of a chunk
size. In this case, some data may have been already copied but the parser is
blocked because it must receive more data to continue. If the calling SI is
blocked on RX at this stage when the stream is waiting for the payload
(because http-buffer-request is set for instance), the stream remains stuck
infinitely.

To fix the bug, we must request more space to the app layer only when it is
not possible to copy more data. Actually, this happens when data remain in
the input buffer while the H1 parser is in states MSG_DATA or MSG_TUNNEL, or
when we are unable to copy headers or trailers into a non-empty buffer.

The first condition is quite easy to handle. The second one requires an API
refactoring. h1_parse_msg_hdrs() and h1_parse_msg_tlrs() functions have been
updated. Now it is possible to know when we need more space in the buffer to
copy headers or trailers (-2 is returned). In the H1 mux, a new H1S flag
(H1S_F_RX_CONGESTED) is used to track this state inside h1_process_demux().

This patch is part of a series related to the issue #1362. It should be
backported as far as 2.0, probably with some adaptations. So be careful
during backports.
2021-09-23 16:13:17 +02:00
Christopher Faulet
216d3352b1 BUG/MINOR: h1-htx: Fix a typo when request parser is reset
In h1_postparse_req_hdrs(), if we need more space to copy headers, the request
parser is reset. However, because of a typo, it was reset as a response parser
instead of a request one. h1m_init_req() must be called.

This patch must be backported as far as 2.2.
2021-09-23 16:10:36 +02:00
Amaury Denoyelle
cde911231e MINOR: quic: fix qcc subs initialization 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
cd28b27581 MEDIUM: quic: implement mux release/conn free 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
414cac5f9d MINOR: quic: define close handler 2021-09-23 15:27:25 +02:00
Frédéric Lécaille
865b07855e MINOR: quic: Crash upon too big packets receipt
This bug came with this commit:
    ("MINOR: quic: RX packets memory leak")
Too big packets were freed twice.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
3230bcfdc4 MINOR: quic: Possible endless loop in qc_treat_rx_pkts()
Ensure we do not endlessly treat the same encryption level
in qc_treat_rx_pkts().
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
310d1bd08f MINOR: quic: RX packets memory leak
Missing RX packet reference counter decrementation at the lowest level.
This caused the memory reserved for RX packets to never be released.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
ebc3fc1509 CLEANUP: quic: Remove useless inline functions
We want to track the packet reference counting more easily, so without
inline functions.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
8526f14acd MINOR: quic: Wake up the xprt from mux
We wake up the xprt as soon as STREAM frames have been pushed to
the TX mux buffer (->tx.buf).
We also make the mux subscribe() to the xprt layer if some data
remain in its ring buffer after having tried to transfer them to the
xprt layer (TX mux buffer for the stream full).
Also do not consider a buffer in the ring if not allocated (see b_size(buf))
condition in the for(;;) loop.
Make a call to qc_process_mux() if possible when entering qc_send() to
fill the mux with data from streams in the send or flow control lists.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
1d40240f25 MINOR: quic: Implement qc_process_mux()
At this time, we only add calls to qc_resume_each_sending_qcs()
which handle the flow control and send lists.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
d2ba0967b7 MINOR: quic: Stream FIN bit fix in qcs_push_frame()
The FIN bit of a STREAM frame to be built must be set if there is no more
data at all in the ring buffer.
Do not do anything if there is nothing to transfer to the ->tx.buf mux
buffer via b_force_xfer() (without zero copy).
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
1c482c665b MINOR: quic: Wake up the mux upon ACK receipt
When ACKs have been received by the xprt, it must wake up the
mux if this latter has subscribed to SEND events. This is the
role of qcs_try_to_consume() to detect such a situation. This
is the function which consumes the buffer filled by the mux.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
513b4f290a MINOR: quic: Implement quic_conn_subscribe()
We implement ->subscribe() xprt callback which should be used only by the mux.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
153194f47a MINOR: mux_quic: Export the mux related flags
These flags should be available from the xprt which must be able to
wake up the mux when blocked.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
acd43a597c MINOR: quic: Add useful trace about pktns discarding
It is important to know if the packet number spaces used during the
handshakes have really been discarded. If not, this may have a
significant impact on the packet loss detection.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
8c27de7d20 MINOR: quic: Initial packet number space not discarded
There were cases where the Initial packet number space was not discarded.
This led the packet loss detection to continue to take it into
consideration during the connection lifetime. Some Application level
packets could not be retransmitted.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
2cb130c980 MINOR: quic: Constantness fixes for frame builders/parsers.
This is to ensure we do not modify important static variables:
the QUIC frame builders and parsers.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
dc2593e460 MINOR: quic: Wrong packet flags settings during frame building
We flag the packet as being ack-eliciting when building the frame.
But a wrong variable was used to do so.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
156a59b7c9 MINOR: quic: Confusion between TX/RX for the frame builders
QUIC_FL_TX_PACKET_ACK_ELICITING was replaced by QUIC_FL_RX_PACKET_ACK_ELICITING
by this commit due to a copy and paste:
   e5b47b637 ("MINOR: quic: Add a mask for TX frame builders and their authorized packet types")
Furthermore the flags for the PADDING frame builder were not initialized.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
578a7898f2 MINOR: mux_quic: move qc_process() code to qc_send()
qc_process is supposed to be run for each I/O handler event, not
only for "send" events.
2021-09-23 15:27:25 +02:00