Commit Graph

5132 Commits

Author SHA1 Message Date
Amaury Denoyelle
d3a88c1c32 MEDIUM: connection: close front idling connection on soft-stop
Implement a safe mechanism to close front idling connection which
prevents the soft-stop to complete. Every h1/h2 front connection is
added in a new per-thread list instance. On shutdown, a new task is
waking up which calls wake mux operation on every connection still
present in the new list.

A new stopping_list attach point has been added in the connection
structure. As this member is only used for frontend connections, it
shared the same union as the session_list reserved for backend
connections.
2021-05-05 14:39:23 +02:00
Amaury Denoyelle
99cca08ecc MINOR: connection: move session_list member in a union
Move the session_list attach point in an anonymous union. This member is
only used for backend connections. This commit is in preparation for the
support of stopping frontend idling connections which will add another
member to the union.

This change means that a special care must be taken to be sure that only
backend connections manipulate the session_list. A few BUG_ON has been
added as special guard to prevent from misuse.
2021-05-05 14:35:36 +02:00
Willy Tarreau
1ab6c0bfd2 MINOR: pools/debug: slightly relax DEBUG_DONT_SHARE_POOLS
The purpose of this debugging option was to prevent certain pools from
masking other ones when they were shared. For example, task, http_txn,
h2s, h1s, h1c, session, fcgi_strm, and connection are all 192 bytes and
would normally be mergedi, but not with this option. The problem is that
certain pools are declared multiple times with various parameters, which
are often very close, and due to the way the option works, they're not
shared either. Good examples of this are captures and stick tables. Some
configurations have large numbers of stick-tables of pretty similar types
and it's very common to end up with the following when the option is
enabled:

  $ socat - /tmp/sock1  <<< "show pools" | grep stick
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753800=56
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753880=57
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753900=58
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753980=59
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753a00=60
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753a80=61
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753b00=62
    - Pool sticktables (224 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753780=55

In addition to not being convenient, it can have important effects on the
memory usage because these pools will not share their entries, so one stick
table cannot allocate from another one's pool.

This patch solves this by going back to the initial goal which was not to
have different pools in the same list. Instead of masking the MAP_F_SHARED
flag, it simply adds a test on the pool's name, and disables pool sharing
if the names differ. This way pools are not shared unless they're of the
same name and size, which doesn't hinder debugging. The same test above
now returns this:

  $ socat - /tmp/sock1  <<< "show pools" | grep stick
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 7 users, @0x3fadb30 [SHARED]
    - Pool sticktables (224 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x3facaa0 [SHARED]

This is much better. This should probably be backported, in order to limit
the side effects of DEBUG_DONT_SHARE_POOLS being enabled in production.
2021-05-05 07:47:29 +02:00
Amaury Denoyelle
d272b409d7 BUILD: compiler: do not use already defined __read_mostly on dragonfly
DragonflyBSD already has an attribute __read_mostly which serves the
same purpose as the one in compiler.h.

No need to be backported as it was added in the current 2.4-dev.
2021-04-30 17:16:36 +02:00
Willy Tarreau
a13afe6535 MINOR: pattern: support purging arbitrary ranges of generations
Instead of being able to purge only values older than a specific value,
let's support arbitrary ranges and make pat_ref_purge_older() just be
one special case of this one.
2021-04-30 15:36:31 +02:00
Willy Tarreau
b4476c6a8c CLEANUP: freq_ctr: make arguments of freq_ctr_total() const
freq_ctr_total() doesn't modify the freq counters, it should take a
const argument.
2021-04-28 17:44:37 +02:00
Christopher Faulet
8b604d1656 CLEANUP: channel: No longer notify the producer in co_skip()/co_htx_skip()
Thanks to the commit "BUG/MINOR: applet: Notify the other side if data were
consumed by an applet", it is no longer necessary to notify the producer when an
applet skips output data. Now, it is the default applet handler responsibility
to take care of that.
2021-04-28 11:08:35 +02:00
Christopher Faulet
260ec8e9a9 MINOR: htx: Limit length of headers name/value when a HTX message is dumped
In htx_dump() function, we now limit the length of the headers name and the
value to not fully print huge headers.
2021-04-28 10:51:08 +02:00
Christopher Faulet
2b78f0bfc4 CLEANUP: htx: Remove unsued hdrs_bytes field from the HTX start-line
Thanks to the htx_xfer_blks() refactoring, it is now possible to remove
hdrs_bytes field from the start-line because no function rely on it anymore.
2021-04-28 10:51:08 +02:00
Amaury Denoyelle
fc6ac53dca BUG/MAJOR: fix build on musl with cpu_set_t support
Move cpu_map structure outside of the global struct to a global
variable defined in cpuset.c compilation unit. This allows to reorganize
the includes without having to define _GNU_SOURCE everywhere for the
support of the cpu_set_t.

This fixes the compilation with musl libc, most notably used for the
alpine based docker image.

This fixes the github issue #1235.

No need to backport as this feature is new in the current
2.4-dev.
2021-04-27 14:11:26 +02:00
Amaury Denoyelle
9463f0e222 BUG/MINOR: cpuset: move include guard at the very beginning
The include guard in cpuset-t.h were misplaced and should be the first
directive of the file.

No need to backport.
2021-04-27 10:39:39 +02:00
Ilya Shipitsin
b2be9a1ea9 CLEANUP: assorted typo fixes in the code and comments
This is 22nd iteration of typo fixes
2021-04-26 10:42:58 +02:00
Christopher Faulet
df3db630e4 REORG: htx: Inline htx functions to add HTX blocks in a message
The HTX functions used to add new HTX blocks in a message have been moved to
the header file to inline them in calling functions. These functions are
small enough.
2021-04-26 10:24:57 +02:00
Christopher Faulet
fb38c910f8 BUG/MINOR: mux-fcgi: Don't send normalized uri to FCGI application
A normalized URI is the internal term used to specify an URI is stored using
the absolute format (scheme + authority + path). For now, it is only used
for H2 clients. It is the default and recommended format for H2 request.
However, it is unusual for H1 servers to receive such URI. So in this case,
we only send the path of the absolute URI. It is performed for H1 servers,
but not for FCGI applications. This patch fixes the difference.

Note that it is not a real bug, because FCGI applications should support
abosolute URI.

Note also a normalized URI is only detected for H2 clients when a request is
received. There is no such test on the H1 side. It means an absolute URI
received from an H1 client will be sent without modification to an H1 server
or a FCGI application.

To make it possible, a dedicated function has been added to get the H1
URI. This function is called by the H1 and the FCGI multiplexer when a
request is sent to a server.

This patch should fix the issue #1232. It must be backported as far as 2.2.
2021-04-26 10:23:18 +02:00
Tim Duesterhus
2e4a18e04a MINOR: uri_normalizer: Add a percent-decode-unreserved normalizer
This normalizer decodes percent encoded characters within the RFC 3986
unreserved set.

See GitHub Issue #714.
2021-04-23 19:43:45 +02:00
Emeric Brun
2cc201f97e BUG/MEDIUM: peers: re-work refcnt on table to protect against flush
In proxy.c, when process is stopping we try to flush tables content
using 'stktable_trash_oldest'. A check on a counter "table->syncing" was
made to verify if there is no pending resync in progress.
But using multiple threads this counter can be increased by an other thread
only after some delay, so the content of some tables can be trashed earlier and
won't be pushed to the new process (after reload, some tables appear reset and
others don't).

This patch re-names the counter "table->syncing" to "table->refcnt" and
the counter is increased during configuration parsing (registering a table to
a peer section) to protect tables during runtime and until resync of a new
process has succeeded or failed.

The inc/dec operations are now made using atomic operations
because multiple peer sections could refer to the same table in futur.

This fix addresses github #1216.

This patch should be backported on all branches multi-thread support (v >= 1.8)
2021-04-23 18:03:06 +02:00
Willy Tarreau
5020ffbe49 MINOR: time: avoid u64 needlessly expensive computations for the 32-bit now_ms
The compiler cannot guess that tv_sec or tv_usec might have unused
parts, so the multiply by 1000 and the divide by 1000 are both
performed using 64-bit constants to stick to the common type. This is
not needed since we only keep the final 32 bits, let's help the compiler
here by casting these fields to uint. The tv_update_date() code is much
cleaner (48 bytes smaller in the CAS loop) as it avoids some register
spilling at a location where that's really unwanted.
2021-04-23 18:03:06 +02:00
Amaury Denoyelle
a6f9c5d2a7 BUG/MINOR: cpuset: fix compilation on platform without cpu affinity
The compilation is currently broken on platform without USE_CPU_AFFINITY
set. An error has been reported by the cygwin build of the CI.

This does not need to be backported.

In file included from include/haproxy/global-t.h:27,
                 from include/haproxy/global.h:26,
                 from include/haproxy/fd.h:33,
                 from src/ev_poll.c:22:
include/haproxy/cpuset-t.h:32:3: error: #error "No cpuset support implemented on this platform"
   32 | # error "No cpuset support implemented on this platform"
      |   ^~~~~
include/haproxy/cpuset-t.h:37:2: error: unknown type name ‘CPUSET_REPR’
   37 |  CPUSET_REPR cpuset;
      |  ^~~~~~~~~~~
make: *** [Makefile:944: src/ev_poll.o] Error 1
make: *** Waiting for unfinished jobs....
In file included from include/haproxy/global-t.h:27,
                 from include/haproxy/global.h:26,
                 from include/haproxy/fd.h:33,
                 from include/haproxy/connection.h:30,
                 from include/haproxy/ssl_sock.h:27,
                 from src/ssl_sample.c:30:
include/haproxy/cpuset-t.h:32:3: error: #error "No cpuset support implemented on this platform"
   32 | # error "No cpuset support implemented on this platform"
      |   ^~~~~
include/haproxy/cpuset-t.h:37:2: error: unknown type name ‘CPUSET_REPR’
   37 |  CPUSET_REPR cpuset;
      |  ^~~~~~~~~~~
make: *** [Makefile:944: src/ssl_sample.o] Error 1
2021-04-23 17:04:24 +02:00
Amaury Denoyelle
0f50cb9c73 MINOR: global: add option to disable numa detection
Render numa detection optional with a global configuration statement
'no numa-cpu-mapping'. This can be used if the applied affinity of the
algorithm is not optimal. Also complete the documentation with this new
keyword.
2021-04-23 16:06:49 +02:00
Amaury Denoyelle
b56a7c89a8 MEDIUM: cfgparse: detect numa and set affinity if needed
On process startup, the CPU topology of the machine is inspected. If a
multi-socket CPU machine is detected, automatically define the process
affinity on the first node with active cpus. This is done to prevent an
impact on the overall performance of the process in case the topology of
the machine is unknown to the user.

This step is not executed in the following condition :
- a non-null nbthread statement is present
- a restrictive 'cpu-map' statement is present
- the process affinity is already restricted, for example via a taskset
  call

For the record, benchmarks were executed on a machine with 2 CPUs
Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz. In both clear and ssl
scenario, the performance were sub-optimal without the automatic
rebinding on a single node.
2021-04-23 16:06:49 +02:00
Amaury Denoyelle
a80823543c MINOR: cfgparse: support the comma separator on parse_cpu_set
Allow to specify multiple cpu ids/ranges in parse_cpu_set separated by a
comma. This is optional and must be activated by a parameter.

The comma support is disabled for the parsing of the 'cpu-map' config
statement. However, it will be useful to parse files in sysfs when
inspecting the cpus topology for NUMA automatic process binding.
2021-04-23 16:06:49 +02:00
Amaury Denoyelle
4c9efdecf5 MINOR: thread: implement the detection of forced cpu affinity
Create a function thread_cpu_mask_forced. Its purpose is to report if a
restrictive cpu mask is active for the current proces, for example due
to a taskset invocation. It is only implemented for the linux platform
currently.
2021-04-23 16:06:49 +02:00
Amaury Denoyelle
982fb53390 MEDIUM: config: use platform independent type hap_cpuset for cpu-map
Use the platform independent type hap_cpuset for the cpu-map statement
parsing. This allow to address CPU index greater than LONGBITS.

Update the documentation to reflect the removal of this limit except for
platforms without cpu_set_t type or equivalent.
2021-04-23 16:06:49 +02:00
Amaury Denoyelle
c90932bc8e MINOR: cfgparse: use hap_cpuset for parse_cpu_set
Replace the unsigned long parameter by a hap_cpuset. This allows to
address CPU with index greater than LONGBITS.

This function is used to parse the 'cpu-map' statement. However at the
moment, the result is casted back to a long to store it in the global
structure. The next step is to replace ulong in in cpu_map in the
global structure with hap_cpuset.
2021-04-23 16:06:49 +02:00
Amaury Denoyelle
f75c640f7b MINOR: cpuset: define a platform-independent cpuset type
This module can be used to manipulate a cpu sets in a platform agnostic
way. Use the type cpu_set_t/cpuset_t if available on the platform, or
fallback to unsigned long, which limits de facto the maximum cpu index
to LONGBITS.
2021-04-23 16:06:49 +02:00
Willy Tarreau
5e65f4276b CLEANUP: compression: remove calls to SLZ init functions
As we now embed the library we don't need to support the older 1.0 API
any more, so we can remove the explicit calls to slz_make_crc_table()
and slz_prepare_dist_table().
2021-04-22 16:11:19 +02:00
Willy Tarreau
12840be005 BUILD: compression: switch SLZ from out-of-tree to in-tree
Now that SLZ is merged, let's update the makefile and compression
files to use it. As a result, SLZ_INC and SLZ_LIB are neither defined
nor used anymore.

USE_SLZ is enabled by default ("USE_SLZ=default") and can be disabled
by passing "USE_SLZ=" or by enabling USE_ZLIB=1.

The doc was updated to reflect the changes.
2021-04-22 16:08:25 +02:00
Willy Tarreau
ab2b7828e2 IMPORT: slz: import slz into the tree
SLZ is rarely packaged by distros and there have been complaints about
the CPU and memory usage of ZLIB, leading to some suggestions to better
address the issue by simply integrating SLZ into the tree (just 3 files).
See discussions below:

   https://www.mail-archive.com/haproxy@formilux.org/msg38037.html
   https://www.mail-archive.com/haproxy@formilux.org/msg40079.html
   https://www.mail-archive.com/haproxy@formilux.org/msg40365.html

This patch does just this, after minor adjustments to these files:
  - tables.h was renamed to slz-tables.h
  - tables.h had the precomputed tables removed since not used here
  - slz.c uses includes <import/slz*> instead of "slz*.h"

The slz commit imported here was b06c172 ("slz: avoid a build warning
with -Wimplicit-fallthrough"). No other change was performed either to
SLZ nor to haproxy at this point so that this operation may be replicated
if needed for a future version.
2021-04-22 15:50:41 +02:00
Maximilian Mader
ff3bb8b609 MINOR: uri_normalizer: Add a strip-dot normalizer
This normalizer removes "/./" segments from the path component.
Usually the dot refers to the current directory which renders those segments redundant.

See GitHub Issue #714.
2021-04-21 12:15:14 +02:00
Willy Tarreau
2b71810cb3 CLEANUP: lists/tree-wide: rename some list operations to avoid some confusion
The current "ADD" vs "ADDQ" is confusing because when thinking in terms
of appending at the end of a list, "ADD" naturally comes to mind, but
here it does the opposite, it inserts. Several times already it's been
incorrectly used where ADDQ was expected, the latest of which was a
fortunate accident explained in 6fa922562 ("CLEANUP: stream: explain
why we queue the stream at the head of the server list").

Let's use more explicit (but slightly longer) names now:

   LIST_ADD        ->       LIST_INSERT
   LIST_ADDQ       ->       LIST_APPEND
   LIST_ADDED      ->       LIST_INLIST
   LIST_DEL        ->       LIST_DELETE

The same is true for MT_LISTs, including their "TRY" variant.
LIST_DEL_INIT keeps its short name to encourage to use it instead of the
lazier LIST_DELETE which is often less safe.

The change is large (~674 non-comment entries) but is mechanical enough
to remain safe. No permutation was performed, so any out-of-tree code
can easily map older names to new ones.

The list doc was updated.
2021-04-21 09:20:17 +02:00
Willy Tarreau
942b89f7dc BUILD: pools: fix build with DEBUG_FAIL_ALLOC
Amaury noticed that I managed to break the build of DEBUG_FAIL_ALLOC
for the second time with 207c09509 ("MINOR: pools: move the fault
injector to __pool_alloc()"). The joy of endlessly reworking patch
sets... No backport is needed, that was in the just merged cleanup
series.
2021-04-19 18:36:48 +02:00
Willy Tarreau
096b6cf581 CLEANUP: pools: declare dummy pool functions to remove some ifdefs
By having a pair of dummy pool_get_from_cache() and pool_put_to_cache()
we can remove some ugly ifdefs, so let's do this. We've already done it
for the shared cache.
2021-04-19 15:24:33 +02:00
Willy Tarreau
b2a853d5f0 CLEANUP: pools: uninline pool_put_to_cache()
This function has become too big (251 bytes) and is now hurting
performance a lot, with up to 4% request rate being lost over the last
pool changes. Let's move it to pool.c as a regular function. Other
attempts were made to cut it in half but it's still inefficient. Doing
this results in saving ~90kB of object code, and even 112kB since the
pool changes, with code that is even slightly faster!

Conversely, pool_get_from_cache(), which remains half of this size, is
still faster inlined, likely in part due to the immediate use of the
returned pointer afterwards.
2021-04-19 15:24:33 +02:00
Willy Tarreau
43d4ed548f CLEANUP: pools: merge pool_{get_from,put_to}_local_caches with generic ones
Since pool_get_from_cache() and pool_put_to_cache() were now only wrappers
to the local cache versions which do all the job, let's merge them together
so that there is no more local-cache specific function.
2021-04-19 15:24:33 +02:00
Willy Tarreau
d56db11447 CLEANUP: pools: make the local cache allocator fall back to the shared cache
Now when pool_get_from_local_cache() fails, it automatically falls back
to pool_get_from_shared_cache(), which used to always be done in
pool_get_from_cache(). Thus now the API is simpler as we always allocate
and free from/to the local caches.
2021-04-19 15:24:33 +02:00
Willy Tarreau
fa19d20ac4 MEDIUM: pools: make pool_put_to_cache() always call pool_put_to_local_cache()
Till now it used to call it only if there were not too many objects into
the local cache otherwise would send the latest one directly into the
shared cache. Now it always sends to the local cache and it's up to the
local cache to free its oldest objects. From a cache freshness perspective
it's better this way since we always evict cold objects instead of hot
ones. From an API perspective it's better because it will help make the
shared cache invisible to the public API.
2021-04-19 15:24:33 +02:00
Willy Tarreau
147e1fa385 MINOR: pools: create unified pool_{get_from,put_to}_cache()
These two functions are now responsible for allocating directly from
the cache and releasing to the cache.

Now the pool_alloc() function simply does this:

    if cache enabled
       return pool_alloc_from_cache() if no NULL
    return pool_alloc_nocache() otherwise

and the pool_free() function does this:

    if cache enabled
       pool_put_to_cache()
    else
       pool_free_nocache()

For now this only introduces these two functions without changing anything
else, but the goal is to soon allow to make them implementation-specific.
2021-04-19 15:24:33 +02:00
Willy Tarreau
b8498e961a MEDIUM: pools: make CONFIG_HAP_POOLS control both local and shared pools
Continuing the unification of local and shared pools, now the usage of
pools is governed by CONFIG_HAP_POOLS without which allocations and
releases are performed directly from the OS using pool_alloc_nocache()
and pool_free_nocache().
2021-04-19 15:24:33 +02:00
Willy Tarreau
45e4e28161 MINOR: pools: factor the release code into pool_put_to_os()
There are two levels of freeing to the OS:
  - code that wants to keep the pool's usage counters updated uses
    pool_free_area() and handles the counters itself. That's what
    pool_put_to_shared_cache() does in the no-global-pools case.
  - code that does not want to update the counters because they were
    already updated only calls pool_free_area().

Let's extract these calls to establish the symmetry with pool_get_from_os()
and pool_alloc_nocache(), resulting in pool_put_to_os() (which only updates
the allocated counter) and pool_free_nocache() (which also updates the used
counter). This will later allow to simplify the generic code.
2021-04-19 15:24:33 +02:00
Willy Tarreau
acf0c54491 MINOR: pools: move pool_free_area() out of the lock in the locked version
Calling pool_free_area() inside a lock in pool_put_to_shared_cache() is
a very bad idea. Fortunately this only happens on the lowest end platforms
which almost never use threads or in very small counts.

This change consists in zeroing the pointer once already released to the
cache in the first test so that the second stage knows if it needs to
pass it to the OS or not. This has slightly reduced the length of the
2021-04-19 15:24:33 +02:00
Willy Tarreau
2b5579f6da MINOR: pools: always use atomic ops to maintain counters
A part of the code cannot be factored out because it still uses non-atomic
inc/dec for pool->used and pool->allocated as these are located under the
pool's lock. While it can make sense in terms of bus cycles, it does not
make sense in terms of code normalization. Further, some operations were
still performed under a lock that could be totally removed via the use of
atomic ops.

There is still one occurrence in pool_put_to_shared_cache() in the locked
code where pool_free_area() is called under the lock, which must absolutely
be fixed.
2021-04-19 15:24:33 +02:00
Willy Tarreau
13843641e5 MINOR: pools: split the OS-based allocator in two
Now there's one part dealing with the allocation itself and keeping
counters up to date, and another one on top of it to return such an
allocated pointer to the user and update the use count and stats.

This is in anticipation for being able to group cache-related parts.
The release code is still done at once.
2021-04-19 15:24:33 +02:00
Willy Tarreau
207c095098 MINOR: pools: move the fault injector to __pool_alloc()
Till now it was limited to objects allocated from the OS which means
it had little use as soon as pools were enabled. Let's move it upper
in the layers so that any code can benefit from fault injection. In
addition this allows to pass a new flag POOL_F_NO_FAIL to disable it
if some callers prefer a no-failure approach.
2021-04-19 15:24:33 +02:00
Willy Tarreau
84ebfabf7f MINOR: tools: add statistical_prng_range() to get a random number over a range
This is simply a multiply and shift from statistical_prng() but it's made
easily accessible.
2021-04-19 15:24:33 +02:00
Willy Tarreau
635cced32f CLEANUP: pools: rename __pool_free() to pool_put_to_shared_cache()
Now the multi-level cache becomes more visible:

    pool_get_from_local_cache()
    pool_put_to_local_cache()
    pool_get_from_shared_cache()
    pool_put_to_shared_cache()
2021-04-19 15:24:33 +02:00
Willy Tarreau
8c77ee5ae5 CLEANUP: pools: rename pool_*_{from,to}_cache() to *_local_cache()
The functions were rightfully called from/to_cache when the thread-local
cache was considered as the only cache, but this is getting terribly
confusing. Let's call them from/to local_cache to make it clear that
it is not related with the shared cache.

As a side note, since pool_evict_from_cache() used not to work for a
particular pool but for all of them at once, it was renamed to
pool_evict_from_local_caches()  (plural form).
2021-04-19 15:24:33 +02:00
Willy Tarreau
2f03dcde91 CLEANUP: pools: rename __pool_get_first() to pool_get_from_shared_cache()
This is exactly what it is, the entry is retrieved from the shared
cache when it is defined. The implementation that is enabled with
CONFIG_HAP_NO_GLOBAL_POOLS continues to return NULL.
2021-04-19 15:24:33 +02:00
Willy Tarreau
2543211830 CLEANUP: pools: move the lock to the only __pool_get_first() that needs it
Now that __pool_alloc() only surrounds __pool_get_first() with the lock,
let's move it to the only variant that requires it and remove the ugly
ifdefs from the function. This is safe because nobody else calls this
function.
2021-04-19 15:24:33 +02:00
Willy Tarreau
8ee9df57db MINOR: pools: call pool_alloc_nocache() out of the pool's lock
In __pool_alloc(), historically we used to use factor out the
pool's lock between __pool_get_first() and __pool_refill_alloc(),
resulting in real malloc() or mmap() calls being performed under
the pool lock (for platforms using the locked shared pools).

As this is not needed anymore, let's move the call out of the
lock, it may improve allocation patterns on some platforms. This
also makes __pool_alloc() cleaner as we see a first attempt to
allocate from the local cache, then a second from the shared
cache then a reall allocation.
2021-04-19 15:24:33 +02:00
Willy Tarreau
8fe726f118 CLEANUP: pools: re-merge pool_refill_alloc() and __pool_refill_alloc()
They were strictly equivalent, let's remerge them and rename them to
pool_alloc_nocache() as it's the call which performs a real allocation
which does not check nor update the cache. The only difference in the
past was the former taking the lock and not the second but now the lock
is not needed anymore at this stage since the pool's list is not touched.

In addition, given that the "avail" argument is no longer used by the
function nor by its callers, let's drop it.
2021-04-19 15:24:33 +02:00
Willy Tarreau
64383b8181 MINOR: pools: make the basic pool_refill_alloc()/pool_free() update needed_avg
This is a first step towards unifying all the fallback code. Right now
these two functions are the only ones which do not update the needed_avg
rate counter since there's currently no shared pool kept when using them.
But their code is similar to what could be used everywhere except for
this one, so let's make them capable of maintaining usage statistics.

As a side effect the needed field in "show pools" will now be populated.
2021-04-19 15:24:33 +02:00
Willy Tarreau
53a7fe49aa MINOR: pools: enable the fault injector in all allocation modes
The mem_should_fail() call enabled by DEBUG_FAIL_ALLOC used to be placed
only in the no-cache version of the allocator. Now we can generalize it
to all modes and remove the exclusive test on CONFIG_HAP_NO_GLOBAL_POOLS.
2021-04-19 15:24:33 +02:00
Willy Tarreau
2d6f628d34 MINOR: pools: rename CONFIG_HAP_LOCAL_POOLS to CONFIG_HAP_POOLS
We're going to make the local pool always present unless pools are
completely disabled. This means that pools are always enabled by
default, regardless of the use of threads. Let's drop this notion
of "local" pools and make it just "pool". The equivalent debug
option becomes DEBUG_NO_POOLS instead of DEBUG_NO_LOCAL_POOLS.

For now this changes nothing except the option and dropping the
dependency on USE_THREAD.
2021-04-19 15:24:33 +02:00
Willy Tarreau
d5140e7c6f MINOR: pool: remove the size field from pool_cache_head
Everywhere we have access to the pool so we don't need to cache a copy
of the pool's size into the pool_cache_head. Let's remove it.
2021-04-19 15:24:33 +02:00
Willy Tarreau
9f3129e583 MEDIUM: pools: move the cache into the pool header
Initially per-thread pool caches were stored into a fixed-size array.
But this was a bit ugly because the last allocated pools were not able
to benefit from the cache at all. As a work around to preserve
performance, a size of 64 cacheable pools was set by default (there
are 51 pools at the moment, excluding any addon and debugging code),
so all in-tree pools were covered, at the expense of higher memory
usage.

In addition an index had to be calculated for each pool, and was used
to acces the pool cache head into that array. The pool index was not
even stored into the pools so it was required to determine it to access
the cache when the pool was already known.

This patch changes this by moving the pool cache head into the pool
head itself. This way it is certain that each pool will have its own
cache. This removes the need for index calculation.

The pool cache head is 32 bytes long so it was aligned to 64B to avoid
false sharing between threads. The extra cost is not huge (~2kB more
per pool than before), and we'll make better use of that space soon.
The pool cache head contains the size, which should probably be removed
since it's already in the pool's head.
2021-04-19 15:24:33 +02:00
Willy Tarreau
fff96b441f CLEANUP: pools: remove unused arguments to pool_evict_from_cache()
In commit fb117e6a8 ("MEDIUM: memory: don't let pool_put_to_cache() free
the objects itself") pool_evict_from_cache() was introduced with no
argument, yet the only call place passes it the pool, the pointer and
the index number!

Let's remove these as they even let the reader think that the function
does something specific to the current pool while it's not the case.
2021-04-19 15:24:33 +02:00
Tim Duesterhus
5be6ab269e MEDIUM: http_act: Rename uri-normalizers
This patch renames all existing uri-normalizers into a more consistent naming
scheme:

1. The part of the URI that is being touched.
2. The modification being performed as an explicit verb.
2021-04-19 09:05:57 +02:00
Tim Duesterhus
a407193376 MINOR: uri_normalizer: Add a percent-upper normalizer
This normalizer uppercases the hexadecimal characters used in percent-encoding.

See GitHub Issue #714.
2021-04-19 09:05:57 +02:00
Tim Duesterhus
d7b89be30a MINOR: uri_normalizer: Add a sort-query normalizer
This normalizer sorts the `&` delimited query parameters by parameter name.

See GitHub Issue #714.
2021-04-19 09:05:57 +02:00
Tim Duesterhus
560e1a6352 MINOR: uri_normalizer: Add support for supressing leading ../ for dotdot normalizer
This adds an option to supress `../` at the start of the resulting path.
2021-04-19 09:05:57 +02:00
Tim Duesterhus
9982fc2bbd MINOR: uri_normalizer: Add a dotdot normalizer to http-request normalize-uri
This normalizer merges `../` path segments with the predecing segment, removing
both the preceding segment and the `../`.

Empty segments do not receive special treatment. The `merge-slashes` normalizer
should be executed first.

See GitHub Issue #714.
2021-04-19 09:05:57 +02:00
Tim Duesterhus
d371e99d1c MINOR: uri_normalizer: Add a merge-slashes normalizer to http-request normalize-uri
This normalizer merges adjacent slashes into a single slash, thus removing
empty path segments.

See GitHub Issue #714.
2021-04-19 09:05:57 +02:00
Tim Duesterhus
d2bedcc4ab MINOR: uri_normalizer: Add http-request normalize-uri
This patch adds the `http-request normalize-uri` action that was requested in
GitHub issue #714.

Normalizers will be added in the next patches.
2021-04-19 09:05:57 +02:00
Tim Duesterhus
0ee1ad5675 MINOR: uri_normalizer: Add enum uri_normalizer_err
This enum will serve as the return type for each normalizer.
2021-04-19 09:05:57 +02:00
Tim Duesterhus
dbd25c34de MINOR: uri_normalizer: Add uri_normalizer module
This is in preparation for future patches.
2021-04-19 09:05:57 +02:00
Christopher Faulet
76b44195c9 MINOR: threads: Only consider running threads to end a thread harmeless period
When a thread ends its harmeless period, we must only consider running
threads when testing threads_want_rdv_mask mask. To do so, we reintroduce
all_threads_mask mask in the bitwise operation (It was removed to fix a
deadlock).

Note that for now it is useless because there is no way to stop threads or
to have threads reserved for another task. But it is safer this way to avoid
bugs in the future.
2021-04-17 11:14:58 +02:00
Christopher Faulet
f63a185500 BUG/MEDIUM: threads: Ignore current thread to end its harmless period
A previous patch was pushed to fix a deadlock when an isolated thread ends
its harmless period (a9a9e9aac ["BUG/MEDIUM: thread: Fix a deadlock if an
isolated thread is marked as harmless"]). But, unfortunately, the fix is
incomplete. The same must be done in the outer loop, in
thread_harmless_end() function. The current thread must be ignored when
threads_want_rdv_mask mask is tested.

This patch must also be backported as far as 2.0.
2021-04-17 11:14:58 +02:00
Alex
41007a6835 MINOR: sample: converter: Add mjson library.
This library is required for the subsequent patch which adds
the JSON query possibility.

It is necessary to change the include statement in "src/mjson.c"
because the imported includes in haproxy are in "include/import"

orig: #include "mjson.h"
new:  #include <import/mjson.h>
2021-04-15 17:05:38 +02:00
Tim Duesterhus
763342646f MINOR: ist: Add istclear(struct ist*)
istclear allows one to easily reset an ist to zero-size, while preserving the
previous size, indicating the length of the underlying buffer.
2021-04-14 19:49:33 +02:00
Moemen MHEDHBI
92f7d43c5d MINOR: sample: add ub64dec and ub64enc converters
ub64dec and ub64enc are the base64url equivalent of b64dec and base64
converters. base64url encoding is the "URL and Filename Safe Alphabet"
variant of base64 encoding. It is also used in in JWT (JSON Web Token)
standard.
RFC1421 mention in base64.c file is deprecated so it was replaced with
RFC4648 to which existing converters, base64/b64dec, still apply.

Example:
  HAProxy:
    http-request return content-type text/plain lf-string %[req.hdr(Authorization),word(2,.),ub64dec]
  Client:
    Token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiZm9vIiwia2V5IjoiY2hhZTZBaFhhaTZlIn0.5VsVj7mdxVvo1wP5c0dVHnr-S_khnIdFkThqvwukmdg
    $ curl -H "Authorization: Bearer ${TOKEN}" http://haproxy.local
    {"user":"foo","key":"chae6AhXai6e"}
2021-04-13 17:28:13 +02:00
Christopher Faulet
0c6d1dcf7d BUG/MINOR: listener: Handle allocation error when allocating a new bind_conf
Allocation error are now handled in bind_conf_alloc() functions. Thus
callers, when not already done, are also updated to catch NULL return value.

This patch may be backported (at least partially) to all stable
versions. However, it only fix errors durung configuration parsing. Thus it
is not mandatory.
2021-04-12 21:33:43 +02:00
Christopher Faulet
147b8c919c MINOIR: checks/trace: Register a new trace source with its events
Add the trace support for the checks. Only tcp-check based health-checks are
supported, including the agent-check.

In traces, the first argument is always a check object. So it is easy to get
all info related to the check. The tcp-check ruleset, the conn-stream and
the connection, the server state...
2021-04-12 12:09:36 +02:00
Christopher Faulet
6d80b63e3c MINOR: trace: Add the checks as a possible trace source
To be able to add the trace support for the checks, a new kind of source
must be added for this purpose.
2021-04-12 12:09:36 +02:00
Willy Tarreau
7b1425a91b MINOR: atomic: reimplement the relaxed version of x86 BTS/BTR
Olivier spotted that I messed up during a rebase of commit 92c059c2a
("MINOR: atomic: implement native BTS/BTR for x86"), losing the x86
version of the BTS/BTR and leaving the generic version for it instead
of having this block in the #else. Since this variant is not used for
now it was easy to overlook it. Let's re-implement it here.
2021-04-12 10:01:44 +02:00
Willy Tarreau
c4c80fb4ea MINOR: time: move the time initialization out of tv_update_date()
The time initialization was made a bit complex because we rely on a
dummy negative argument to reset all fields, leaving no distinction
between process-level initialization and thread-level initialization.
This patch changes this by introducing two functions, one for the
process and the second one for the threads. This removes ambigous
test and makes sure that the relevant fields are always initialized
exactly once. This also offers a better solution to the bug fixed in
commit b48e7c001 ("BUG/MEDIUM: time: make sure to always initialize
the global tick") as there is no more special values for global_now_ms.

It's simple enough to be backported if any other time-related issues
are encountered in stable versions in the future.
2021-04-11 23:45:48 +02:00
Willy Tarreau
61c72c366e CLEANUP: time: remove the now unused ms_left_scaled
It was only used by freq_ctr and is not used anymore. In addition the
local curr_sec_ms was removed, as well as the equivalent extern
definitions which did not exist anymore either.
2021-04-11 14:01:53 +02:00
Willy Tarreau
d46ed5c26b MINOR: freq_ctr: simplify and improve the update function
update_freq_ctr_period() was still not very clean and didn't wait for
the rotation lock to be dropped before trying again, thus maintaining
the contention at a high level. In addition, the rotation update was
made in three steps, which are not very efficient in terms of bus
cycles.

Here the wait loop was reworked so that the fast path remains short
and that the contended path waits for the lock to be dropped before
attempting another write, but it only waits a relax cycle before
attempting a read. The rotation block was simplified to remove a
test that was already validated by the first loop, and so that the
retrieval of the current period, its reset and its increment are all
performed in a single atomic op and the store to the previous period
is performed immediately after.

All this results in significantly smaller code for the inline function
(~1kB total) and a shorter critical path.
2021-04-11 14:01:53 +02:00
Willy Tarreau
6339c19cac MINOR: freq_ctr: add cpu_relax in the rotation loop of update_freq_ctr_period()
When counters are rotated, there is contention between the threads which
can slow down the operation of the thread performing the rotation. Let's
apply a cpu_relax there to let the first thread finish faster.
2021-04-11 11:12:57 +02:00
Willy Tarreau
fc6323ad82 MEDIUM: freq_ctr: replace the per-second counters with the generic ones
It remains cumbersome to preserve two versions of the freq counters and
two different internal clocks just for this. In addition, the savings
from using two different mechanisms are not that important as the only
saving is a divide that is replaced by a multiply, but now thanks to
the freq_ctr_total() unificaiton the code could also be simplified to
optimize it in case of constants.

This patch turns all non-period freq_ctr functions to static inlines
which call the period-based ones with a period of 1 second. A direct
benefit is that a single internal clock is now needed for any counter
and that they now all rely on ticks.

These 1-second counters are essentially used to report request rates
and to enforce a connection rate limitation in listeners. It was
verified that these continue to work like before.
2021-04-11 11:12:55 +02:00
Willy Tarreau
fa1258f02c MINOR: freq_ctr: unify freq_ctr and freq_ctr_period into freq_ctr
Both structures are identical except the name of the field starting
the period and its description. Let's call them all freq_ctr and the
period's start "curr_tick" which is generic.

This is only a temporary change and fields are expected to remain
the same with no code change (verified).
2021-04-11 11:11:27 +02:00
Willy Tarreau
d209c87142 MINOR: freq_ctr: add the missing next_event_delay_period()
There was still no function to compute a wait time for periods, let's
implement it on top of freq_ctr_total() as we'll soon need it for the
per-second one. The divide here is applied on the frequency so that it
will be replaced with a reciprocal multiply when constant.
2021-04-11 11:11:03 +02:00
Willy Tarreau
607be24a85 MEDIUM: freq_ctr: reimplement freq_ctr_remain_period() from freq_ctr_total()
Now the function becomes an inline one and only contains a divide and
a max. The divide will automatically go away with constant periods.
2021-04-11 11:11:03 +02:00
Willy Tarreau
a7a31b2602 MEDIUM: freq_ctr: make read_freq_ctr_period() use freq_ctr_total()
This one is the easiest to implement, it just requires a call and a
divide of the result. Anti-flapping correction for low-rates was
preserved.

Now calls using a constant period will be able to use a reciprocal
multiply for the period instead of a divide.
2021-04-11 11:11:03 +02:00
Willy Tarreau
f3a9f8dc5a MINOR: freq_ctr: add a generic function to report the total value
Most of the functions designed to read a counter over a period go through
the same complex loop and only differ in the way they use the returned
values, so it was worth implementing all this into freq_ctr_total() which
returns the total number of events over a period so that the caller can
finish its operation using a divide or a remaining time calculation. As
a special case, read_freq_ctr_period() doesn't take pending events but
requires to enable an anti-flapping correction at very low frequencies.
Thus the function implements it when pend<0.

Thanks to this function it will be possible to reimplement the other ones
as inline and merge the per-second ones with the arbitrary period ones
without always adding the cost of a 64 bit divide.
2021-04-11 11:10:57 +02:00
Willy Tarreau
ff88270ef9 MINOR: pool: move pool declarations to read_mostly
All pool heads are accessed via a pointer and should not be shared with
highly written variables. Move them to the read_mostly section.
2021-04-10 19:27:41 +02:00
Willy Tarreau
f459640ef6 MINOR: global: declare a read_mostly section
Some variables are mostly read (mostly pointers) but they tend to be
merged with other ones in the same cache line, slowing their access down
in multi-thread setups. This patch declares an empty, aligned variable
in a section called "read_mostly". This will force a cache-line alignment
on this section so that any variable declared in it will be certain to
avoid false sharing with other ones. The section will be eliminated at
link time if not used.

A __read_mostly attribute was added to compiler.h to ease use of this
section.
2021-04-10 19:27:41 +02:00
Willy Tarreau
ba386f6a8d CLEANUP: initcall: rely on HA_SECTION_* instead of defining its own
Now initcalls are defined using the regular section definitions from
compiler.h in order to ease maintenance.
2021-04-10 19:27:41 +02:00
Willy Tarreau
5bec4c42ed MINOR: compiler: add macros to declare section names
HA_SECTION() is used as an attribute to force a section name. This is
required because OSX prepends "__DATA, " in front of the declaration.
HA_SECTION_START() and HA_SECTION_STOP() are used as post-attribute on
variable declaration to designate the section start/end (needed only on
OSX, empty on others).

For platforms with an obsolete linker, all macros are left empty. It would
possibly still work on some of them but this will not be needed anyway.
2021-04-10 19:27:41 +02:00
Willy Tarreau
731f0c6502 CLEANUP: initcall: rename HA_SECTION to HA_INIT_SECTION
The HA_SECTION name is too generic and will be reused globally. Let's
rename this one.
2021-04-10 19:27:41 +02:00
Willy Tarreau
afa9bc0ec5 MINOR: initcall: uniformize the section names between MacOS and other unixes
Due to length restrictions on OSX the initcall sections are called "i_"
there while they're called "init_" on other OSes. However the start and
end of sections are still called "__start_init_" and "__stop_init_",
which forces to have distinct code between the OSes. Let's switch everyone
to "i_" and rename the symbols accordingly.
2021-04-10 19:27:41 +02:00
Willy Tarreau
ad14c2681b MINOR: trace: replace the trace() inline function with an equivalent macro
The trace() function is convenient to avoid calling trace() when traces
are not enabled, but there starts to be some callers which place complex
expressions in their trace calls, which results in all of them to be
evaluated before being passed as arguments to the trace() function. This
needlessly wastes precious CPU cycles.

Let's change the function for a macro, so that the arguments are now only
evaluated when the surce has traces enabled. However having a generic
macro being called "trace()" can easily cause conflicts with innocent
code so we rename it "_trace".

Just doing this has resulted in a 2.5% increase of the HTTP/1 request rate.
2021-04-10 19:27:41 +02:00
Willy Tarreau
9057a0026e CLEANUP: pattern: make all pattern tables read-only
Interestingly, all arrays used to declare patterns were read-write while
only hard-coded. Let's mark them const so that they move from data to
rodata and don't risk to experience false sharing.
2021-04-10 17:49:41 +02:00
Tim Duesterhus
403fd722ac CLEANUP: Remove useless malloc() casts
This is not C++.
2021-04-08 20:11:58 +02:00
Tim Duesterhus
fea59fcf79 CLEANUP: ist: Remove unused count argument from ist2str*
This argument is not being used inside the function (and the functions
themselves are unused as well) and not documented. Its purpose is not clear.
Just remove it.
2021-04-08 19:40:59 +02:00
Tim Duesterhus
b8ee894b66 CLEANUP: htx: Make http_get_stline take a const struct
Nothing is being modified there, so this can be `const`.
2021-04-08 19:40:59 +02:00
Tim Duesterhus
fbc2b79743 MINOR: ist: Rename istappend() to __istappend()
Indicate that this function is not inherently safe by adding two underscores as
a prefix.
2021-04-08 19:35:52 +02:00
Willy Tarreau
1197459e0a BUG/MAJOR: fd: switch temp values to uint in fd_stop_both()
With latest commit f50906519 ("MEDIUM: fd: merge fdtab[].ev and state
for FD_EV_* and FD_POLL_* into state") one occurrence of a pair of
chars was missed in fd_stop_both(), resulting in the operation to
fail if the upper flags were set. Interestingly it managed to fail
2 tests in all setups in the CI while all used to work fine on my
local machines. Probably that the reason is that the chars had enough
room above them for the CAS to fail then refill "old" overwriting the
upper parts of the stack, and that thanks to this the subsequent tests
worked. With ASAN being used on lots of tests, it very likely caught
it but used to only report failed tests with no more info.

No backport is needed, as this was never released nor backported.
2021-04-07 20:46:26 +02:00
Tim Duesterhus
8daf8dceb9 MINOR: ist: Add istsplit(struct ist*, char)
istsplit is a combination of iststop + istadv.
2021-04-07 19:50:43 +02:00
Tim Duesterhus
90aa8c7f02 MINOR: ist: Add istshift(struct ist*)
istshift() returns the first character and advances the ist by 1.
2021-04-07 19:50:43 +02:00
Tim Duesterhus
551eeaec91 MINOR: ist: Add istappend(struct ist, char)
This function appends the given char to the given `ist` and returns
the resulting `ist`.
2021-04-07 19:50:43 +02:00