MEDIUM: cpu-topo: switch to the "performance" cpu-policy by default

As mentioned during the NUMA series development, the goal is to use
all available cores in the most efficient way by default, which
normally corresponds to "cpu-policy performance". The previous default,
"cpu-policy first-usable-node", was only meant to keep the behavior 100%
identical to what existed before cpu-policy was introduced.

So let's switch the default cpu-policy to "performance" right now.
The doc was updated to reflect this.
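
For deployments that want to keep the previous behavior, the old policy
remains available and can be requested explicitly. A minimal sketch, following
the documented syntax of the "cpu-policy" global directive:

    global
        cpu-policy first-usable-node
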
commit b74336984d
parent 5128178256
Author: Willy Tarreau
Date:   2025-06-26 16:01:55 +02:00
2 changed files with 14 additions and 18 deletions

@@ -2174,7 +2174,7 @@ cpu-policy <policy>
 
   The "cpu-policy" directive chooses between a small number of allocation
   policies which one to use instead, when "cpu-map" is not used. The following
-  policies are currently supported:
+  policies are currently supported, with "performance" being the default one:
 
     - none                no particular post-selection is performed. All enabled
                           CPUs will be usable, and if the number of threads is
@@ -2202,8 +2202,7 @@ cpu-policy <policy>
                           node with enabled CPUs will be used, and this number of
                           CPUs will be used as the number of threads. A single
                           thread group will be enabled with all of them, within
-                          the limit of 32 or 64 depending on the system. This is
-                          the default policy.
+                          the limit of 32 or 64 depending on the system.
 
     - group-by-2-ccx      same as "group-by-ccx" below but create a group every
                           two CCX. This can make sense on CPUs having many CCX of
@@ -2299,7 +2298,7 @@ cpu-policy <policy>
                           such as network handling is much more effective. On
                           development systems, these can also be used to run
                           auxiliary tools such as load generators and monitoring
-                          tools.
+                          tools. This is the default policy.
 
     - resource            this is like "group-by-cluster" above, except that only
                           the smallest and most efficient CPU cluster will be
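
The policy names listed above are used verbatim as the directive's argument;
for example, a configuration preferring one thread group per CCX over the new
default could state (a minimal sketch of the documented syntax):

    global
        cpu-policy group-by-ccx
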
@@ -2904,18 +2903,15 @@ no-quic
   processed by haproxy. See also "quic_enabled" sample fetch.
 
 numa-cpu-mapping
-  When running on a NUMA-aware platform with the cpu-policy is set to
-  "first-usable-node" (the default one), HAProxy inspects on startup the CPU
-  topology of the machine. If a multi-socket machine is detected, the affinity
-  is automatically calculated to run on the CPUs of a single node. This is done
-  in order to not suffer from the performance penalties caused by the
-  inter-socket bus latency. However, if the applied binding is non optimal on a
-  particular architecture, it can be disabled with the statement 'no
-  numa-cpu-mapping'. This automatic binding is also not applied if a nbthread
-  statement is present in the configuration, if the affinity of the process is
-  already specified, for example via the 'cpu-map' directive or the taskset
-  utility, or if the cpu-policy is set to any other value. See also "cpu-map",
-  "cpu-policy", "cpu-set".
+  When running on a NUMA-aware platform, this enables the "cpu-policy"
+  directive to inspect the topology and figure the best set of CPUs to use and
+  the corresponding number of threads. However, if the applied binding is non
+  optimal on a particular architecture, it can be disabled with the statement
+  'no numa-cpu-mapping'. This automatic binding is also not applied if a
+  'nbthread' statement is present in the configuration, if the affinity of the
+  process is already specified, for example via the 'cpu-map' directive or the
+  taskset utility, or if the cpu-policy is set to any other value. See also
+  "cpu-map", "cpu-policy", "cpu-set".
 
 ocsp-update.disable [ on | off ]
   Disable completely the ocsp-update in HAProxy. Any ocsp-update configuration
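
Per the rewritten paragraph above, the automatic NUMA mapping can be disabled
when it proves suboptimal on a given machine. A minimal sketch using the
documented negation form:

    global
        no numa-cpu-mapping
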

@@ -60,7 +60,7 @@ static int cpu_policy_resource(int policy, int tmin, int tmax, int gmin, int gmax
 
 static struct ha_cpu_policy ha_cpu_policy[] = {
         { .name = "none",               .desc = "use all available CPUs",                           .fct = NULL                          },
-        { .name = "first-usable-node",  .desc = "use only first usable node if nbthreads not set",  .fct = cpu_policy_first_usable_node, .arg = 0 },
+        { .name = "performance",        .desc = "make one thread group per perf. core cluster",     .fct = cpu_policy_performance      , .arg = 0 },
         { .name = "group-by-ccx",       .desc = "make one thread group per CCX",                    .fct = cpu_policy_group_by_ccx     , .arg = 1 },
         { .name = "group-by-2-ccx",     .desc = "make one thread group per 2 CCX",                  .fct = cpu_policy_group_by_ccx     , .arg = 2 },
         { .name = "group-by-3-ccx",     .desc = "make one thread group per 3 CCX",                  .fct = cpu_policy_group_by_ccx     , .arg = 3 },
@@ -69,9 +69,9 @@ static struct ha_cpu_policy ha_cpu_policy[] = {
         { .name = "group-by-cluster",   .desc = "make one thread group per core cluster",           .fct = cpu_policy_group_by_cluster , .arg = 1 },
         { .name = "group-by-2-clusters",.desc = "make one thread group per 2 core clusters",        .fct = cpu_policy_group_by_cluster , .arg = 2 },
         { .name = "group-by-3-clusters",.desc = "make one thread group per 3 core clusters",        .fct = cpu_policy_group_by_cluster , .arg = 3 },
         { .name = "group-by-4-clusters",.desc = "make one thread group per 4 core clusters",        .fct = cpu_policy_group_by_cluster , .arg = 4 },
-        { .name = "performance",        .desc = "make one thread group per perf. core cluster",     .fct = cpu_policy_performance      , .arg = 0 },
         { .name = "efficiency",         .desc = "make one thread group per eff. core cluster",      .fct = cpu_policy_efficiency       , .arg = 0 },
         { .name = "resource",           .desc = "make one thread group from the smallest cluster",  .fct = cpu_policy_resource         , .arg = 0 },
+        { .name = "first-usable-node",  .desc = "use only first usable node if nbthreads not set",  .fct = cpu_policy_first_usable_node, .arg = 0 },
         { 0 } /* end */
 };
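
Swapping the two entries rather than editing a default-name string is
consistent with the default policy being stored as an index into this table,
so that moving "performance" into slot 1 changes the default without touching
the parser. The sketch below illustrates that mechanism under this assumption;
the cpu_policy variable and the cpu_policy_lookup() helper are hypothetical
stand-ins rather than HAProxy's actual selection code, and the .fct/.arg
fields are dropped for brevity.

  #include <stdio.h>
  #include <string.h>

  /* Table shaped like ha_cpu_policy[] in the diff, trimmed to two entries. */
  struct ha_cpu_policy {
      const char *name;
      const char *desc;
  };

  static const struct ha_cpu_policy ha_cpu_policy[] = {
      { .name = "none",        .desc = "use all available CPUs" },
      { .name = "performance", .desc = "make one thread group per perf. core cluster" },
      /* ... remaining policies ... */
      { 0 } /* end */
  };

  /* hypothetical: the active policy kept as an index; slot 1 is the default */
  static int cpu_policy = 1;

  /* hypothetical helper: resolve a "cpu-policy <name>" argument to an index */
  static int cpu_policy_lookup(const char *name)
  {
      int idx;

      for (idx = 0; ha_cpu_policy[idx].name; idx++)
          if (strcmp(ha_cpu_policy[idx].name, name) == 0)
              return idx;
      return -1; /* unknown policy name */
  }

  int main(void)
  {
      printf("default policy: %s\n", ha_cpu_policy[cpu_policy].name);
      printf("\"none\" resolves to index %d\n", cpu_policy_lookup("none"));
      return 0;
  }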