During the last call to the stats I/O handler, it is possible to have nothing
to dump. It may happen for many reasons. For instance, all remaining proxies
are disabled or they don't match the specified scope. In HTML or in JSON, it
is not really an issue because there is a footer, so there is still some data
to push into the response channel buffer. In CSV, it is a problem because
there is no footer. This means it is possible to finish the response with no
payload at all in the HTX message. Thus, the EOM flag may be added on an
empty message. When this happens, a shutdown is performed on an empty HTX
message. Because there is nothing to send, the mux on the client side is not
notified that the message was properly finished and interprets the shutdown
as an abort.
The response is chunked, so an abort at this stage means the last CRLF is
never sent to the client. All data were sent but the message is invalid
because the response chunking is not finished. If the response is compressed,
because of a similar bug in the compression filter, the compression is also
aborted and the content is truncated because some data are lost in the
compression filter.
It is a design issue with HTX that must be addressed. There is an opportunity
to do so with the recent conn-stream refactoring. It must be carefully
evaluated first, but it is possible. In the meantime, and to also fix stable
versions, the bug is worked around by systematically adding an end-of-trailer
HTX block at the end of the message when the EOM flag is set on an empty HTX
message. This way, there is always some data to send when the EOM flag is
set.
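A sketch of the workaround, as it could look in the applet's I/O handler
(htx_is_empty(), htx_add_endof() and HTX_BLK_EOT are names from the HTX API;
the exact call site in the stats applet is simplified here):

    if (htx_is_empty(htx)) {
        /* the message would otherwise carry the EOM flag with no
         * payload at all, so append a dummy end-of-trailers block to
         * make sure the mux always has something to forward
         */
        if (!htx_add_endof(htx, HTX_BLK_EOT))
            return 0;   /* no room left: try again on the next call */
    }
    htx->flags |= HTX_FL_EOM;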
Note that with an H2 client, it is only a problem when the response is
compressed.
This patch should fix issue #1478. It must be backported as far as 2.4. On
older versions the EOM block still exists.
This bug is the same as the one affecting the HTTP client. See "BUG/MINOR:
httpclient: Set conn-stream/channel EOI flags at the end of request" for
details.
This patch must be backported as far as 2.0, but only CF_EOI must be set
because applets are not attached to a conn-stream on older versions.
In commit e9ed63e548 dark mode support was added to the stats page. The
initial commit did not include dark mode color overrides for the
.socket CSS class. This commit colors socket rows the same way as
backends that are active but do not have a health check defined.
This fixes an issue where reading information from socket lines was really
hard in dark mode due to suboptimal coloring of the cell background and of
the font in it.
The unsafe conn-stream API (__cs_*) is now used when we are sure the right
endpoint or application is attached to the conn-stream. This avoids compiler
warnings about possible NULL dereferences. It also simplifies the code and
clears up any ambiguity about the manipulated entities.
Since recent changes related to the conn-stream/stream-interface
refactoring, GCC reports potential NULL pointer dereferences when we get the
appctx, the stream or the stream-interface from the conn-stream. Of course,
depending on the moment, these entities may be NULL. But at many places, we
know they are defined and it is safe to get them without any check. Thus, the
ALREADY_CHECKED() macro is used to silence these warnings.
Note that the refactoring is unfinished, so it is not a real issue for now.
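For illustration only, this is the kind of pattern involved (the cs_appctx()
getter name is an assumption here, not a reference to a precise call site):

    struct appctx *appctx = cs_appctx(cs);

    /* the caller guarantees an applet endpoint is attached to this
     * conn-stream, so mark the pointer as already validated to silence
     * GCC's null-deref warning instead of adding a useless check
     */
    ALREADY_CHECKED(appctx);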
Because the appctx is now an endpoint of the conn-stream, there is no reason
to keep the stream-interface as the appctx owner. Thus, the conn-stream is
now the appctx owner.
Thanks to previous changes, it is now possible to set an appctx as endpoint
for a conn-stream. This means the appctx is no longer linked to the
stream-interface but to the conn-stream. Thus, a pointer to the conn-stream
is explicitly stored in the stream-interface. The endpoint (connection or
appctx) can be retrieved via the conn-stream.
- add new metric: `haproxy_backend_agg_server_check_status`
It counts the number of servers matching a specific check status; this makes
it possible to exclude the per-server check status, as the usage is often to
rely on the total. Indeed, in large setups with thousands of servers per
backend, the memory impact of storing the per-server metric is not
negligible.
- realign promex_str_metrics array
The implementation is quite simple; it could be improved later by adding an
internal state to the prometheus exporter, to avoid counting at every dump.
This patch is an attempt to close github issue #1312. It may be backported
to 2.4 if requested.
Signed-off-by: William Dauchy <wdauchy@gmail.com>
This patch renames all dns extra counters and stats functions, types and
enums using the 'resolv' prefix/suffixes.
The dns extra counter domain id used on the CLI was replaced by "resolvers"
instead of "dns".
The typed extra counter prefix used when dumping the resolvers domain, "D.",
was also renamed to "N." because it points to counters on a nameserver.
This was done to finish the split between the "resolver" and "dns" layers
and to avoid further misunderstanding when haproxy handles dns load
balancing.
This should not be backported.
This change is required to support TCP/HTTP rules in defaults sections. The
'disabled' bitfield in the proxy structure, used to know if a proxy is
disabled or stopped, is replaced by a generic bitfield named 'flags'.
PR_DISABLED and PR_STOPPED flags are renamed to PR_FL_DISABLED and
PR_FL_STOPPED respectively. In addition, everywhere there is a test to know
if a proxy is disabled or stopped, there is now a bitwise AND operation on
PR_FL_DISABLED and/or PR_FL_STOPPED flags.
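For example, a test that used to read the dedicated bitfield becomes a
bitwise AND on the new flags (a sketch of the pattern, not a literal hunk
from the patch):

    /* before: dedicated bitfield */
    if (px->disabled)
        return;

    /* after: generic flags tested with a bitwise AND */
    if (px->flags & (PR_FL_DISABLED | PR_FL_STOPPED))
        return;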
The entering_poll/leaving_poll/measure_idle functions that were hard
to classify and used to move to various locations have now been placed
into clock.c since it's precisely about time-keeping. The functions
were renamed to clock_*. The samp_time and idle_time values are now
static since there is no reason for them to be read from outside.
There is currently a problem related to time keeping. We're mixing the
functions used to perform calculations with the OS-dependent code needed to
retrieve and adjust the local time.
This patch extracts from time.{c,h} the parts that are solely dedicated
to time keeping. These are the "now" or "before_poll" variables for
example, as well as the various now_*() functions that make use of
gettimeofday() and clock_gettime() to retrieve the current time.
The "tv_*" functions moved there were also more appropriately renamed
to "clock_*".
Other parts used to compute stolen time are in other files, they will
have to be picked next.
These two counters were the only ones not in the global struct, while the
SSL freq counters and the req counts already are. This forced stats.c to
include ssl_sock just to know about them. Let's move them over there with
their friends. This reduces the number of includes of opensslconf.h from 408
to 384.
A number of files currently access activity counters but rely on their
definitions to be inherited from other files (task.c, backend.c, hlua.c,
sock.c, pool.c, stats.c, fd.c).
I don't know why I inlined this one; it makes no sense given that it's only
used for stats, and it starts a circular dependency on tinfo.h which can be
problematic in the future. In addition, all the stuff related to idle time
calculation should live with the rest of the scheduler, which currently is
in task.{c,h}, so let's move it there.
According to the W3C Media Queries Level 5 specification, the browser can
enable some CSS when dark mode is enabled. This patch defines dark mode CSS
for the stats page.
https://www.w3.org/TR/mediaqueries-5/#prefers-color-scheme
Before threads were introduced in 1.8, idle_pct used to be a global
variable indicating the overall process idle time. Threads made it
thread-local, meaning that its reporting in the stats made little
sense, though this was not easy to spot. In 2.0, the idle_pct variable
moved to the struct thread_info via commit 81036f273 ("MINOR: time:
move the cpu, mono, and idle time to thread_info"). It made it more
obvious that the idle_pct was per thread, and also allowed measuring it
more accurately. But no more effort was made in that direction.
This patch introduces a new report_idle() function that accurately
averages the per-thread idle time over all running threads (i.e. it
should remain valid even if some threads are paused or stopped), and
makes use of it in the stats / "show info" reports.
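A rough sketch of the averaging idea; the per-thread bookkeeping names used
here (ha_thread_info[].idle_pct, all_threads_mask, MAX_THREADS) are
assumptions based on the description, not quoted from the patch:

    unsigned int report_idle(void)
    {
        unsigned int total = 0, rthr = 0;
        int thr;

        for (thr = 0; thr < MAX_THREADS; thr++) {
            if (!(all_threads_mask & (1UL << thr)))
                continue;   /* skip stopped or paused threads */
            total += ha_thread_info[thr].idle_pct;
            rthr++;
        }
        return rthr ? total / rthr : 100;
    }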
Sending traffic over only two connections of an 8-thread process
would previously show this erratic CPU usage pattern:
$ while :; do socat /tmp/sock1 - <<< "show info"|grep ^Idle;sleep 0.1;done
Idle_pct: 30
Idle_pct: 35
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Idle_pct: 35
Idle_pct: 33
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Idle_pct: 100
Now it shows this more accurate measurement:
$ while :; do socat /tmp/sock1 - <<< "show info"|grep ^Idle;sleep 0.1;done
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
Idle_pct: 83
This is not technically a bug but this lack of precision definitely affects
some users who rely on the idle_pct measurement. This should at least be
backported to 2.4, and might be to some older releases depending on users
demand.
In a future patch, it will be possible to remove every server at runtime,
both static and dynamic. This requires extending the server refcount to all
instances.
First, refcount manipulation functions have been renamed to better
express the API usage.
* srv_refcount_use -> srv_take
The refcount is always initialized to 1 on server creation in new_server.
It's also incremented for each check/agent configured on a server instance.
* free_server -> srv_drop
This decrements the refcount and, if it reaches zero, the server is freed,
so calling code must not use the server reference afterwards. As a bonus,
this function now returns the next server instance. This is useful when
iterating over the server list without having to save the next pointer
before each invocation.
In these functions, the checks that prevented refcounting on non-dynamic
servers are removed. Every reference to "dynamic" in variable/function
naming has been eliminated as well.
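For illustration, the renamed API boils down to this kind of pattern (a
simplified sketch; the real functions carry more bookkeeping and the actual
freeing is more involved than a bare free()):

    /* take a reference on a server that is being pointed to */
    void srv_take(struct server *srv)
    {
        srv->refcount++;
    }

    /* drop a reference; the server is freed once the count reaches zero,
     * and the next server is returned so callers can keep iterating
     * without saving the pointer first
     */
    struct server *srv_drop(struct server *srv)
    {
        struct server *next = srv->next;

        if (--srv->refcount == 0)
            free(srv);  /* placeholder for the real cleanup of the server */
        return next;
    }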
A dynamic server may be deleted at runtime at the very moment the stats
applet is pointing to it. Use the server refcount to prevent deletion in
this case.
This should be backported up to 2.4, with an observability period of 2
weeks. Note that it requires the dynamic server refcounting feature
which was implemented in 2.5; the following commits are required:
- MINOR: server: implement a refcount for dynamic servers
- BUG/MINOR: server: do not use refcount in free_server in stopping mode
- MINOR: server: return the next srv instance on free_server
Previous patch b5c0d65 ("MINOR: proxy: disabled takes a stopping and a
disabled state") allows us to set 2 states for a stopped or a disabled
proxy. With this patch we are now able to show the stats of all proxies
when the process is in a stopping state, not only when there is some
activity on a proxy.
This patch should fix issue #1307.
Disable the output of the statistics of internal proxies (PR_CAP_INT), so
we don't rely only on px->uuid > 0. This will allow hiding internal proxies
more cleanly in the stats.
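The check becomes a capability test along these lines (sketch):

    /* when dumping stats, skip proxies flagged as internal */
    if (px->cap & PR_CAP_INT)
        continue;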
Agent stats were lost during the stats refactoring performed in 2.4 to
simplify the Prometheus exporter. The stats_fill_sv_stats() function must
fill the ST_F_AGENT_* and ST_F_LAST_AGT stats.
This patch should fix the issue #1331. It must be backported to 2.4.
After reloading HAProxy, the old process may still hold active sessions.
Currently there is no way to gather information about how many sessions such
a process still holds. This patch no longer excludes disabled proxies from
the stats output when they hold at least one active session. This allows
sending `!@<PID> show stat` through the master socket to the disabled
process and having it return its stats data.
As part of the changes to support per-module stats data in 2.3-dev6
with commit ee63d4bd6 ("MEDIUM: stats: integrate static proxies stats
in new stats"), a small change resulted in the description field to
be replaced by the name field, making it pointless. Let's fix this
back.
This should fix issue #1291. Thanks to Nick Ramirez for reporting this
issue.
This patch can be backported to 2.3.
The relative_pid is always 1. In mworker mode we also have a
child->relative_pid which is always equal to relative_pid, except for the
master (0) or external processes (-1), but these types are usually tested
for, except for one place that was amended to carefully check for the
PROC_O_TYPE_WORKER option.
Changes were pretty limited as most usages of relative_pid were for
designating a process in stats output and peers protocol.
Commit d3a9a4992 ("MEDIUM: stats: allow to select one field in
`stats_fill_sv_stats`") left one occurrence of a direct assignment
of stats[] instead of placing it into the <metric> variable, and it
was on ST_F_CHECK_STATUS. This resulted in the field being overwritten
with an empty one immediately after being set in stats_fill_sv_stats(), and
in the field appearing empty on the stats page.
No backport is needed as this was only for 2.4.
Now "show info float" will also report SSL rates, connection rates and
key reuse ratios as floats. This can be convenient at very low rates.
Note that the SSL reuse ratio which used to commonly oscillate between
0 and 1 under load is now more often above zero with small values. It
indicates that for better stability we shouldn't be comparing a key rate
with a connection rate but instead we should measure the reuse rate at
its source.
We'll have to support reporting sub-second uptimes, so let's use the
appropriate function which will automatically adjust the tv_usec field.
In addition to this, it will also report a more accurate uptime thanks
to considering the sub-second part in the result.
This will allow some fields to be produced with a higher accuracy when
the requester indicates being able to parse floats. Rates and times are
among the elements which can make sense.
Currently the stats filling function knows nothing about the caller's
needs, so let's pass the STAT_* flags so that it can adapt to the
requester's constraints.
For the prometheus exporter, a new float type was added for the fields
and its conversion was added everywhere except for the HTML output.
Now that we have F2H() we can implement it for consistency.
When emitting stats, we don't need to have 6 zeroes after the decimal point
for each value, so let's trim floating point numbers to the longest needed
only.
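A standalone sketch of the trimming idea (not the helper actually added by
the patch):

    #include <string.h>

    /* strip meaningless trailing zeros from a printed float:
     * "1.500000" becomes "1.5" and "2.000000" becomes "2"
     */
    static void trim_float(char *s)
    {
        char *dot = strchr(s, '.');
        char *end;

        if (!dot)
            return;
        end = s + strlen(s);
        while (end > dot + 1 && end[-1] == '0')
            *--end = 0;
        if (end - 1 == dot)
            *--end = 0;   /* nothing left after the dot: drop it too */
    }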
There were 102 CLI commands whose help messages were zig-zagging all along
the dump, making them unreadable. This patch realigns all these messages so
that the
command now uses up to 40 characters before the delimiting colon. About a
third of the commands did not correctly list their arguments which were
added after the first version, so they were all updated. Some abuses of
the term "id" were fixed to use a more explanatory term. The
"set ssl ocsp-response" command was not listed because it lacked a help
message, this was fixed as well. The deprecated enable/disable commands
for agent/health/server were prominently written as deprecated. Whenever
possible, clearer explanations were provided.
The current "ADD" vs "ADDQ" is confusing because when thinking in terms
of appending at the end of a list, "ADD" naturally comes to mind, but
here it does the opposite, it inserts. Several times already it's been
incorrectly used where ADDQ was expected, the latest of which was a
fortunate accident explained in 6fa922562 ("CLEANUP: stream: explain
why we queue the stream at the head of the server list").
Let's use more explicit (but slightly longer) names now:
LIST_ADD -> LIST_INSERT
LIST_ADDQ -> LIST_APPEND
LIST_ADDED -> LIST_INLIST
LIST_DEL -> LIST_DELETE
The same is true for MT_LISTs, including their "TRY" variant.
LIST_DEL_INIT keeps its short name to encourage its use instead of the
lazier LIST_DELETE, which is often less safe.
The change is large (~674 non-comment entries) but is mechanical enough
to remain safe. No permutation was performed, so any out-of-tree code
can easily map older names to new ones.
The list doc was updated.
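The distinction the new names make explicit, in short (illustrative snippet;
&item->list stands for any struct list member):

    LIST_INSERT(&queue, &item->list);   /* item becomes the first element */
    LIST_APPEND(&queue, &item->list);   /* item becomes the last element */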
When a backend is in status DOWN and going UP it is currently displayed
as yellow ("active UP, going down") instead of orange ("active DOWN, going
UP"). This patches restyles the table rows to actually match the
legend.
This may be backported to any version, the issue appeared in 1.7-dev2
with commit 0c378efe8 ("MEDIUM: stats: compute the color code only in
the HTML form").
Remove static qualifier on stats_allocate_proxy_counters_internal. This
function will be used to allocate extra counters at runtime for dynamic
servers.
Refactoring performed with the following Coccinelle patch:
@@
char *s;
@@
(
- ist2(s, strlen(s))
+ ist(s)
|
- ist2(strdup(s), strlen(s))
+ ist(strdup(s))
)
Note that this replacement is safe even in the strdup() case, because `ist()`
will not call `strlen()` on a `NULL` pointer. Instead it inserts a length of
`0`, effectively resulting in `IST_NULL`.
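The guarantee relied upon can be summarized with this sketch (the real
definition lives in include/import/ist.h and may differ in detail):

    static inline struct ist ist(const char *str)
    {
        /* a NULL input yields a zero-length ist (IST_NULL) instead of
         * crashing in strlen()
         */
        return (struct ist){
            .ptr = (char *)str,
            .len = str ? strlen(str) : 0,
        };
    }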
This makes the code more readable and less prone to copy-paste errors.
In addition, it allows placing some __builtin_constant_p() predicates
to trigger a link-time error in case the compiler knows that the freed
area is constant. It will also produce a compile-time error if trying to
free something that is not a regular pointer (e.g. a function).
The DEBUG_MEM_STATS macro now also defines an instance for ha_free()
so that all these calls can be checked.
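In spirit, the macro boils down to the following (simplified sketch; the
real one also plugs into the checks described above):

    /* free the area pointed to by *ptr and clear the pointer so it
     * cannot be reused by accident
     */
    #define ha_free(ptr) do { free(*(ptr)); *(ptr) = NULL; } while (0)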
178 occurrences were converted. The vast majority of them were handled
by the following Coccinelle script, some slightly refined to better deal
with "&*x" or with long lines:
@ rule @
expression E;
@@
- free(E);
- E = NULL;
+ ha_free(&E);
It was verified that the resulting code is the same, give or take a handful
of cases where the compiler optimized slightly differently the temporary
variable that holds the copy of the pointer.
A non-negligible number of {free(str);str=NULL;str_len=0;} constructs are
still present in the config part (mostly header names in proxies). These
ones should also be cleaned for the same reasons, and probably be
turned into ist strings.
The nb_tasks counter was still global and gets incremented and decremented
for each task_new()/task_free(), and was read in process_runnable_tasks().
But it's only used for stats reporting, so doing this so often is
pointless and expensive. Let's move it to the task_per_thread struct and
have the stats sum it when needed.
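Summing for the stats then looks roughly like this (a sketch; the structure
and field names are taken from the description, not quoted from the patch):

    /* only computed when the stats are dumped, instead of bouncing a
     * global counter on every task_new()/task_free()
     */
    unsigned int total_tasks = 0;
    int thr;

    for (thr = 0; thr < global.nbthread; thr++)
        total_tasks += task_per_thread[thr].nb_tasks;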
This counter is solely used for reporting in the stats and is the hottest
thread contention point to date. Moving it to the scheduler and having a
separate one for the global run queue dramatically improves the performance,
showing a 12% boost on the request rate on 16 threads!
In addition, the thread debugging output which used to rely on rqueue_size
was not totally accurate as it would only report task counts. Now we can
return the exact thread's run queue length.
It is also interesting to note that there are still a few other task/tasklet
counters in the scheduler that are not efficiently updated because some cover
a single area and others cover multiple areas. It looks like having a distinct
counter for each of the following entries would help and would keep the code
a bit cleaner:
- global run queue (tree)
- per-thread run queue (tree)
- per-thread shared tasklets list
- per-thread local lists
Maybe even splitting the shared tasklets lists between pure tasklets and
tasks, instead of mixing them in a single list, would simplify the code
because there remain a number of places where several counters have to be
updated.
Move the listen status to a helper, defining both the status enum and the
string definitions.
This will be helpful for reuse in the prometheus code. It also removes a
hard-to-read nested ternary.
Signed-off-by: William Dauchy <wdauchy@gmail.com>
The prometheus approach requires outputting all values for a given metric
name, meaning we iterate through all metrics and then, in the inner loop,
iterate over all objects for this metric.
In order to allow more code reuse, adapt the stats API to be able to select
one field, or fill them all otherwise.
From this patch it should be possible to add support for listen stats in
prometheus.
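The resulting API shape, roughly (a sketch; parameter and enum names are
assumptions based on the description):

    int stats_fill_fe_stats(struct proxy *px, struct field *stats, int len,
                            enum stat_field *selected_field)
    {
        enum stat_field current_field = selected_field ? *selected_field : 0;

        for (; current_field < ST_F_TOTAL_FIELDS; current_field++) {
            struct field metric = { 0 };

            /* ... compute <metric> for <current_field> ... */
            stats[current_field] = metric;

            if (selected_field)
                break;   /* only this single field was requested */
        }
        return 1;
    }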
Signed-off-by: William Dauchy <wdauchy@gmail.com>
This patch splits the current dns.c into two files:
The first, dns.c, contains code related to DNS message exchange over UDP
and, in the future, over TCP. We try to remove dependencies on resolving to
make it usable by other features such as DNS load balancing.
The new resolvers.c inherits the code specific to the actual resolvers.
Note:
It was really difficult to obtain a clean diff due to the amount of moved
code.
Note2:
Counters and stuff related to stats are not cleanly separated because
counters for both layers are currently merged and hard to separate for now.