13 Commits

Author SHA1 Message Date
Willy Tarreau
f14c7570d6 CLEANUP: cli: rename MAX_STATS_ARGS to MAX_CLI_ARGS
This is the number of args accepted on a command received on the CLI,
is has long been totally independent of stats and should not carry
this misleading "stats" name anymore.
2021-03-13 10:59:23 +01:00
Willy Tarreau
060a761248 OPTIM: task: automatically adjust the default runqueue-depth to the threads
The recent default runqueue size reduction appeared to have significantly
lowered performance on low-thread count configs. Testing various values
runqueue values on different workloads under thread counts ranging from
1 to 64, it appeared that lower values are more optimal for high thread
counts and conversely. It could even be drawn that the optimal value for
various workloads sits around 280/sqrt(nbthread), and probably has to do
with both the L3 cache usage and how to optimally interlace the threads'
activity to minimize contention. This is much easier to optimally
configure, so let's do this by default now.
2021-03-10 11:15:34 +01:00
Willy Tarreau
f587003fe9 MINOR: pools: double the local pool cache size to 1 MB
The reason is that H2 can already require 32 16kB buffers for the mux
output at once, which will deplete the local cache. Thus it makes sense
to go further to leave some time to other connection to release theirs.
In addition, the L2 cache on modern CPUs is already 1 MB, so this change
is welcome in any case.
2021-03-05 08:30:08 +01:00
Willy Tarreau
66161326fd MINOR: listener: refine the default MAX_ACCEPT from 64 to 4
The maximum number of connections accepted at once by a thread for a single
listener used to default to 64 divided by the number of processes but the
tasklet-based model is much more scalable and benefits from smaller values.
Experimentation has shown that 4 gives the highest accept rate for all
thread values, and that 3 and 5 come very close, as shown below (HTTP/1
connections forwarded per second at multi-accept 4 and 64):

 ac\thr|    1     2    4     8     16
 ------+------------------------------
      4|   80k  106k  168k  270k  336k
     64|   63k   89k  145k  230k  274k

Some tests were also conducted on SSL and absolutely no change was observed.

The value was placed into a define because it used to be spread all over the
code.

It might be useful at some point to backport this to 2.3 and 2.2 to help
those who observed some performance regressions from 1.6.
2021-02-19 16:02:04 +01:00
Willy Tarreau
4327d0ac00 MINOR: tasks: refine the default run queue depth
Since a lot of internal callbacks were turned to tasklets, the runqueue
depth had not been readjusted from the default 200 which was initially
used to favor batched processing. But nowadays it appears too large
already based on the following tests conducted on a 8c16t machine with
a simple config involving "balance leastconn" and one server. The setup
always involved the two threads of a same CPU core except for 1 thread,
and the client was running over 1000 concurrent H1 connections. The
number of requests per second is reported for each (runqueue-depth,
nbthread) couple:

 rq\thr|    1     2     4     8    16
 ------+------------------------------
     32|  120k  159k  276k  477k  698k
     40|  122k  160k  276k  478k  722k
     48|  121k  159k  274k  482k  720k
     64|  121k  160k  274k  469k  710k
    200|  114k  150k  247k  415k  613k  <-- default

It's possible to save up to about 18% performance by lowering the
default value to 40. One possible explanation to this is that checking
I/Os more frequently allows to flush buffers faster and to smooth the
I/O wait time over multiple operations instead of alternating phases
of processing, waiting for locks and waiting for new I/Os.

The total round trip time also fell from 1.62ms to 1.40ms on average,
among which at least 0.5ms is attributed to the testing tools since
this is the minimum attainable on the loopback.

After some observation it would be nice to backport this to 2.3 and
2.2 which observe similar improvements, since some users have already
observed some perf regressions between 1.6 and 2.2.
2021-02-19 16:01:55 +01:00
Dragan Dosen
6f7cc11e6d MEDIUM: xxhash: use the XXH_INLINE_ALL macro to inline all functions
This way we make all xxhash functions inline, with implementations being
directly included within xxhash.h.

Makefile is updated as well, since we don't need to compile and link
xxhash.o anymore.

Inlining should improve performance on small data inputs.
2020-12-23 06:39:21 +01:00
Willy Tarreau
ca8b069aa7 REORG: include: move MAX_THREADS to defaults.h
That's already where MAX_PROCS is set, and we already handle the case of
the default value so there is no reason for placing it in thread.h given
that most call places don't need the rest of the threads definitions. The
include was removed from global-t.h and activity.c.
2020-06-11 10:18:59 +02:00
Willy Tarreau
aeed4a85d6 REORG: include: move log.h to haproxy/log{,-t}.h
The current state of the logging is a real mess. The main problem is
that almost all files include log.h just in order to have access to
the alert/warning functions like ha_alert() etc, and don't care about
logs. But log.h also deals with real logging as well as log-format and
depends on stream.h and various other things. As such it forces a few
heavy files like stream.h to be loaded early and to hide missing
dependencies depending where it's loaded. Among the missing ones is
syslog.h which was often automatically included resulting in no less
than 3 users missing it.

Among 76 users, only 5 could be removed, and probably 70 don't need the
full set of dependencies.

A good approach would consist in splitting that file in 3 parts:
  - one for error output ("errors" ?).
  - one for log_format processing
  - and one for actual logging.
2020-06-11 10:18:58 +02:00
Willy Tarreau
ac13aeaa89 REORG: include: move auth.h to haproxy/auth{,-t}.h
The STATS_DEFAULT_REALM and STATS_DEFAULT_URI were moved to defaults.h.
It was required to include types/pattern.h and types/sample.h since they
are mentioned in function prototypes.

It would be wise to merge this with uri_auth.h later.
2020-06-11 10:18:57 +02:00
Willy Tarreau
0f6ffd652e REORG: include: move fd.h to haproxy/fd{,-t}.h
A few includes were missing in each file. A definition of
struct polled_mask was moved to fd-t.h. The MAX_POLLERS macro was
moved to defaults.h

Stdio used to be silently inherited from whatever path but it's needed
for list_pollers() which takes a FILE* and which can thus not be
forward-declared.
2020-06-11 10:18:57 +02:00
Willy Tarreau
fd4bffe7c0 REORG: include: move the base files from common/ to haproxy/
The files currently covered by api-t.h and api.h (compat, compiler,
defaults, initcall) are now located inside haproxy/.
2020-06-11 10:18:56 +02:00
Willy Tarreau
2dd0d4799e [CLEANUP] renamed include/haproxy to include/common 2006-06-29 17:53:05 +02:00
Willy Tarreau
baaee00406 [BIGMOVE] exploded the monolithic haproxy.c file into multiple files.
The files are now stored under :
  - include/haproxy for the generic includes
  - include/types.h for the structures needed within prototypes
  - include/proto.h for function prototypes and inline functions
  - src/*.c for the C files

Most include files are now covered by LGPL. A last move still needs
to be done to put inline functions under GPL and not LGPL.

Version has been set to 1.3.0 in the code but some control still
needs to be done before releasing.
2006-06-26 02:48:02 +02:00