Commit Graph

38 Commits

Author SHA1 Message Date
Willy Tarreau
ae0f8be011 MINOR: stats: protect against future stats fields omissions
As seen in commits 33a4461fa ("BUG/MINOR: stats: Fix Lua's `get_stats`
function") and a46b142e8 ("BUG/MINOR: Missing stat_field_names (since
f21d17bb)") it seems frequent to omit to update stats_fields[] when
adding a new ST_F_xxx entry. This breaks Lua's get_stats() and shows
a "(null)" in the header of "show stat", but that one is not detectable
to the naked eye anymore.

Let's add a reminder above the enum declaration about this, and a small
reg tests checking for the absence of "(null)". It was verified to fail
before the last patch above.
2023-06-02 08:39:53 +02:00
Willy Tarreau
5723b382ed MINOR: stats: report the boot time in "show info"
Just like we have the uptime in "show info", let's add the boot time.
It's trivial to collect as it's just the difference between the ready
date and the start date, and will allow users to monitor this element
in order to take action before it starts becoming problematic. Here
the boot time is reported in milliseconds, so this allows to even
observe sub-second anomalies in startup delays.
2023-05-17 09:33:54 +02:00
Willy Tarreau
4cfb0019e6 MINOR: stats: report the listener's protocol along with the address in stats
When "optioon socket-stats" is used in a frontend, its listeners have
their own stats and will appear in the stats page. And when the stats
page has "stats show-legends", then a tooltip appears on each such
socket with ip:port and ID. The problem is that since QUIC arrived, it
was not possible to distinguish the TCP listeners from the QUIC ones
because no protocol indication was mentioned. Now we add a "proto"
legend there with the protocol name, so we can see "tcp4" or "quic6"
and figure how the socket is bound.
2023-05-11 14:52:56 +02:00
Willy Tarreau
9615102b01 MINOR: stats: report the number of times the global maxconn was reached
As discussed a few times over the years, it's quite difficult to know
how often we stop accepting connections because the global maxconn was
reached. This is not easy to know because when we reach the limit we
stop accepting but we don't know if incoming connections are pending,
so it's not possible to know how many were delayed just because of this.
However, an interesting equivalent metric consist in counting the number
of times an accepted incoming connection resulted in the limit being
reached. I.e. "we've accepted the last one for now". That doesn't imply
any other one got delayed but it's a factual indicator that something
might have been delayed. And by counting the number of such events, it
becomes easier to know whether some limits need to be adjusted because
they're reached often, or if it's exceptionally rare.

The metric is reported as a counter in show info and on the stats page
in the info section right next to "maxconn".
2023-05-11 13:51:31 +02:00
Willy Tarreau
3c4a297d2b MINOR: stats: report the total number of warnings issued
Now in "show info" we have a TotalWarnings field that reports the total
number of warnings issued since the process started. It's also reported
in the the stats page next to the uptime.
2023-05-11 12:02:21 +02:00
Frédéric Lécaille
9969adbcdc MINOR: stats: add by HTTP version cumulated number of sessions and requests
Add cum_sess_ver[] new array of counters to count the number of cumulated
HTTP sessions by version (h1, h2 or h3).
Implement proxy_inc_fe_cum_sess_ver_ctr() to increment these counter.
This function is called each a HTTP mux is correctly initialized. The QUIC
must before verify the application operations for the mux is for h3 before
calling proxy_inc_fe_cum_sess_ver_ctr().
ST_F_SESS_OTHER stat field for the cumulated of sessions others than
HTTP sessions is deduced from ->cum_sess_ver counter (for all the session,
not only HTTP sessions) from which the HTTP sessions counters are substracted.

Add cum_req[] new array of counters to count the number of cumulated HTTP
requests by version and others than HTTP requests. This new member replace ->cum_req.
Modify proxy_inc_fe_req_ctr() which increments these counters to pass an HTTP
version, 0 special values meaning "other than an HTTP request". This is the case
for instance for syslog.c from which proxy_inc_fe_req_ctr() is called with 0
as version parameter.
ST_F_REQ_TOT stat field compputing for the cumulated number of requests is modified
to count the sum of all the cum_req[] counters.

As this patch is useful for QUIC, it must be backported to 2.7.
2023-02-03 17:55:49 +01:00
Aurelien DARRAGON
5594184190 MINOR: stats: introduce stats field ctx
Add a new value in stats ctx: field.
Implement field support in line dumping parent functions
stats_print_proxy_field_json() and stats_dump_proxy_to_buffer().

This will allow child dumping functions to support partial line dumping
when needed. ie: when dumping buffer is exhausted: do a partial send and
wait for a new buffer to finish the dump. Thanks to field ctx, the function
can start dumping where it left off on previous (unterminated) invokation.
2022-12-15 16:53:49 +01:00
Cedric Paillet
e06e31ea3b MINOR: promex: introduce haproxy_backend_agg_check_status
This patch introduces haproxy_backend_agg_check_status metric
as we wanted in 42d7c402d but with the right data source.

This patch could be backported as far as 2.4.
2022-12-09 10:54:48 +01:00
Cedric Paillet
7d6644e689 BUG/MINOR: promex: create haproxy_backend_agg_server_status
haproxy_backend_agg_server_check_status currently aggregates
haproxy_server_status instead of haproxy_server_check_status.
We deprecate this and create a new one,
haproxy_backend_agg_server_status to clarify what it really
does.

This patch could be backported as far as 2.4.
2022-12-09 10:54:27 +01:00
Aurelien DARRAGON
745ce8e8ad MINOR: stats: add server revision id support
Make use of the new srv->rid value in stats.

Stat is referred as ST_F_SRID, it is now used in stats_fill_sv_stats
function in order to be included in csv and json stats dumps.

Moreover, "rid: $value" will be displayed next to server puid
in html stats page if "stats show-legend" is specified in the stats frontend.
(mouse hovering tooltip)

Depends on the following commit:
	"MINOR: server: add srv->rid (revision id) value"
2022-12-06 10:22:06 +01:00
Willy Tarreau
ecab71fbac BUILD: stats: conditionally mark obsolete stats states as deprecated
The obsolete stats states STAT_ST_* were marked as deprecated with recent
commit 6ef1648dc ("CLEANUP: stats: rename the stats state values an mark
the old ones deprecated"), except that this feature requires gcc 6 and
above. Let's use the macro that depends on this condition instead.

The issue appeared on 2.6-dev9 so no backport is needed.
2022-05-09 20:32:11 +02:00
Willy Tarreau
6ef1648dc2 CLEANUP: stats: rename the stats state values an mark the old ones deprecated
The STAT_ST_* values have been abused by virtually every applet and CLI
keyword handler, and this must not continue as it's a source of bugs and
of overly complicated code.

This patch renames the states to STAT_STATE_*, and keeps the previous
enum while marking each entry as deprecated. This should be sufficient to
catch out-of-tree code that might rely on them and to let them know what
to do with that.
2022-05-06 18:33:49 +02:00
Willy Tarreau
41f885241e CLEANUP: stats/cli: stop using appctx->st2
Instead, let's have the state as an enum inside the context. It's much
cleaner and safer as we know nobody else touches it.
2022-05-06 18:13:35 +02:00
Willy Tarreau
91cefcaba4 CLEANUP: stats/cli: take the "show stat" context definition out of the appctx
This makes use of the generic command context allocation so that the
appctx doesn't have to declare a specific one anymore. The context is
created during parsing (both in the CLI and HTTP).

The change looks large but it's particularly mechanical. The context
initialization appears in stats.c and http_ana.c. The context is used
in stats.c and resolvers.c since "show stat resolvers" points there.
That's the reason why the definition moved to stats.h. "show info"
and "show stat" continue to share the same state definition for now.

Nothing else was modified.
2022-05-06 18:13:35 +02:00
William Dauchy
42d7c402d5 MINOR: promex: backend aggregated server check status
- add new metric: `haproxy_backend_agg_server_check_status`
  it counts the number of servers matching a specific check status
  this permits to exclude per server check status as the usage is often
  to rely on the total. Indeed in large setup having thousands of
  servers per backend the memory impact is not neglible to store the per
  server metric.
- realign promex_str_metrics array

quite simple implementation - we could improve it later by adding an
internal state to the prometheus exporter, thus to avoid counting at
every dump.

this patch is an attempt to close github issue #1312. It may bebackported
to 2.4 if requested.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-11-09 10:51:08 +01:00
William Dauchy
d3141b1d37 DOC: stats: fix location of the text representation
`info_field_names` and `stat_field_names` no longer exist and have been
moved in stats.c
To avoid changing this comment, just mention the name of the new table
`info_fields` and `stat_fields`

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-11-08 13:46:02 +01:00
Emeric Brun
f8642ee826 MEDIUM: resolvers: rename dns extra counters to resolvers extra counters
This patch renames all dns extra counters and stats functions, types and
enums using the 'resolv' prefix/suffixes.

The dns extra counter domain id used on cli was replaced by "resolvers"
instead of "dns".

The typed extra counter prefix dumping resolvers domain "D." was
also renamed "N." because it points counters on a Nameserver.

This was done to finish the split between "resolver" and "dns" layers
and to avoid further misunderstanding when haproxy will handle dns
load balancing.

This should not be backported.
2021-11-03 17:16:46 +01:00
Willy Tarreau
2745620240 MINOR: stats: support an optional "float" option to "show info"
This will allow some fields to be produced with a higher accuracy when
the requester indicates being able to parse floats. Rates and times are
among the elements which can make sense.
2021-05-08 10:52:12 +02:00
Amaury Denoyelle
5dfdf3e5b0 MINOR: stats: report tainted on show info
Add a new info field ST_F_TAINTED to dump tainted status at the end of
the 'show info' output.
2021-05-07 14:35:02 +02:00
Willy Tarreau
5bbc676608 BUG/MINOR: stats: revert the change on ST_CONVDONE
In 2.1, commit ee4f5f83d ("MINOR: stats: get rid of the ST_CONVDONE flag")
introduced a subtle bug. By testing curproxy against defproxy in
check_config_validity(), it tried to eliminate the need for a flag
to indicate that stats authentication rules were already compiled,
but by doing so it left the issue opened for the case where a new
defaults section appears after the two proxies sharing the first
one:

      defaults
          mode http
          stats auth foo:bar

      listen l1
          bind :8080

      listen l2
          bind :8181

      defaults
          # just to break above

This config results in:
  [ALERT] 042/113725 (3121) : proxy 'f2': stats 'auth'/'realm' and 'http-request' can't be used at the same time.
  [ALERT] 042/113725 (3121) : Fatal errors found in configuration.

Removing the last defaults remains OK. It turns out that the cleanups
that followed that patch render it useless, so the best fix is to revert
the change (with the up-to-date flags instead). The flag was marked as
belonging to the config. It's not exact but it's the closest to the
reality, as it's not there to configure the behavior but ti mention
that the config parser did its job.

This could be backported as far as 2.1, but in practice it looks like
nobody ever hit it.
2021-02-12 16:23:45 +01:00
William Dauchy
defd15685e MINOR: stats: add new start time field
Another patch in order to try to reconciliate haproxy stats and
prometheus. Here I'm adding a proper start time field in order to make
proper use of uptime field.
That being done we can move the calculation in `fill_info`

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-01-21 18:59:30 +01:00
William Dauchy
a8766cfad1 MINOR: stats: duplicate 3 fields in bytes in info
in order to prepare a possible merge of fields between haproxy stats and
prometheus, duplicate 3 fields:
  INF_MEMMAX
  INF_POOL_ALLOC
  INF_POOL_USED
Those were specifically named in MB unit which is not what prometheus
recommends. We therefore used them but changed the unit while doing the
calculation. It created a specific case for that, up to the description.
This patch:
- removes some possible confusion, i.e. using MB field for bytes
- will permit an easier merge of fields such as description

First consequence for now, is that we can remove the calculation on
prometheus side and move it on `fill_info`.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-01-21 18:59:30 +01:00
William Dauchy
5a982a7165 MINOR: contrib/prometheus-exporter: export build_info
commit c55a626217 ("MINOR: contrib/prometheus-exporter: Add
missing global and per-server metrics") is renaming two metrics between
v2.2 and v2.3:
  server_idle_connections_current
  server_idle_connections_limit

It is breaking some tools which are making use of those metrics while
supporting several haproxy versions. This build_info will permit tools
which make use of metrics to be able to match the haproxy version and
change the list of expected metrics. This was possible using the haproxy
stats socket but not with prometheus export.

This patch follows prometheus best pratices to export specific software
informations. It is adding a new field `build_info` so we can extend it
to other parameters if needed in the future.

example output:
  # HELP haproxy_process_build_info HAProxy build info.
  # TYPE haproxy_process_build_info gauge
  haproxy_process_build_info{version="2.4-dev5-2e1a3f-5"} 1

Even though it is not a bugfix, this patch will make more sense when
backported up to >= 2.0

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-01-08 14:48:13 +01:00
Amaury Denoyelle
7f8f6cb926 BUG/MEDIUM: stats: prevent crash if counters not alloc with dummy one
Define a per-thread counters allocated with the greatest size of any
stat module counters. This variable is named trash_counters.

When using a proxy without allocated counters, return the trash counters
from EXTRA_COUNTERS_GET instead of a dangling pointer to prevent
segfault.

This is useful for all the proxies used internally and not
belonging to the global proxy list. As these objects does not appears on
the stat report, it does not matter to use the dummy counters.

For this fix to be functional, the extra counters are explicitly
initialized to NULL on proxy/server/listener init functions.

Most notably, the crash has already been detected with the following
vtc:
- reg-tests/lua/txn_get_priv.vtc
- reg-tests/peers/tls_basic_sync.vtc
- reg-tests/peers/tls_basic_sync_wo_stkt_backend.vtc
There is probably other parts that may be impacted (SPOE for example).

This bug was introduced in the current release and do not need to be
backported. The faulty commits are
"MINOR: ssl: count client hello for stats" and
"MINOR: ssl: add counters for ssl sessions".
2020-11-12 15:16:05 +01:00
Willy Tarreau
bd71510024 MINOR: stats: report server's user-configured weight next to effective weight
The "weight" column on the stats page is somewhat confusing when using
slowstart becaue it reports the effective weight, without being really
explicit about it. In some situations the user-configured weight is more
relevant (especially with long slowstarts where it's important to know
if the configured weight is correct).

This adds a new uweight stat which reports a server's user-configured
weight, and in a backend it receives the sum of all servers' uweights.
In addition it adds the mention of "effective" in a few descriptions
for the "weight" column (help and doc).

As a result, the list of servers in a backend is now always scanned
when dumping the stats. But this is not a problem given that these
servers are already scanned anyway and for way heavier processing.
2020-10-23 22:47:30 +02:00
Willy Tarreau
3e32036701 MINOR: stats: also support a "no-maint" show stat modifier
"no-maint" is a bit similar to "up" except that it will only hide
servers that are in maintenance (or disabled in the configuration), and
not those that are enabled but failed a check. One benefit here is to
significantly reduce the output of the "show stat" command when using
large server-templates containing entries that are not yet provisioned.

Note that the prometheus exporter also has such an option which does
the exact same.
2020-10-23 18:11:24 +02:00
Amaury Denoyelle
fbd0bc98fe MINOR: dns/stats: integrate dns counters in stats
Use the new stats module API to integrate the dns counters in the
standard stats. This is done in order to avoid code duplication, keep
the code related to cli out of dns and use the full possibility of the
stats function, allowing to print dns stats in csv or json format.
2020-10-05 12:02:14 +02:00
Amaury Denoyelle
0b70a8a314 MINOR: stats: add config "stats show modules"
By default, hide the extra statistics on the html page. Define a new
flag STAT_SHMODULES which is activated if the config "stats show
modules" is set.
2020-10-05 12:02:14 +02:00
Amaury Denoyelle
d3700a7fda MINOR: stats: support clear counters for dynamic stats
Add a boolean 'clearable' on stats module structure. If set, it forces
all the counters to be reset on 'clear counters' cli command. If not,
the counters are reset only when 'clear counters all' is used.
2020-10-05 12:02:14 +02:00
Amaury Denoyelle
730c727ea3 MEDIUM: stats: add abstract type to store counters
Implement a small API to easily add extra counters inside a structure
instance. This will be used to implement dynamic statistics linked on
every type of object as needed.

The counters are stored in a dynamic array inside the relevant objects.
2020-10-05 12:02:14 +02:00
Amaury Denoyelle
58d395e0d6 MEDIUM: stats: define an API to register stat modules
A stat module can be registered to quickly add new statistics on
haproxy. It must be attached to one of the available stats domain. The
register must be done using INITCALL on STG_REGISTER.

The stat module has a name which should be unique for each new module in
a domain. It also contains a statistics list with their name/desc and a
pointer to a function used to fill the stats from the module counters.

The module also provides the initial counters values used on
automatically allocated counters. The offset for these counters
are stored in the module structure.
2020-10-05 12:02:14 +02:00
Amaury Denoyelle
72b16e5173 MINOR: stats: define additional flag px cap on domain
This flag can be used to determine on what type of proxy object the
statistics should be relevant. It will be useful when adding dynamic
statistics. Currently, this flag is not used.
2020-10-05 12:02:14 +02:00
Amaury Denoyelle
072f97eddf MINOR: stats: define the concept of domain for statistics
The domain option will be used to have statistics attached to other
objects than proxies/listeners/servers. At the moment, only the PROXY
domain is available.

Add an argument 'domain' on the 'show stats' cli command to specify the
domain. Only 'domain proxy' is available now. If not specified, proxy
will be considered the default domain.

For HTML output, only proxy statistics will be displayed.
2020-10-05 12:02:14 +02:00
Emeric Brun
45c457a629 MINOR: log: adds counters on received syslog messages.
This patch adds a global counter of received syslog messages
and this one is exported on CLI "show info" as "CumRecvLogs".

This patch also updates internal conn counter and freq
of the listener and the proxy for each received log message to
prepare a further export on the "show stats".
2020-07-15 17:50:12 +02:00
Christopher Faulet
aaa70852d9 MINOR: raw_sock: Report the number of bytes emitted using the splicing
In the continuity of the commit 7cf0e4517 ("MINOR: raw_sock: report global
traffic statistics"), we are now able to report the global number of bytes
emitted using the splicing. It can be retrieved in "show info" output on the
CLI.

Note this counter is always declared, regardless the splicing support. This
eases the integration with monitoring tools plugged on the CLI.
2020-07-15 14:08:14 +02:00
Willy Tarreau
a9fcecbdf3 MINOR: stats: add the estimated need of concurrent connections per server
The max_used_conns value is used as an estimate of the needed number of
connections on a server to know how many to keep open. But this one is
not reported, making it hard to troubleshoot reuse issues. Let's export
it in the sessions/current column.
2020-06-29 16:29:11 +02:00
Willy Tarreau
3bb617cfe0 MINOR: stats: add 3 new output values for the per-server idle conn state
The servers have internal states describing the status of idle connections,
unfortunately these were not exported in the stats. This patch adds the 3
following gauges:

 - idle_conn_cur : Current number of unsafe idle connections
 - safe_conn_cur : Current number of safe idle connections
 - used_conn_cur : Current number of connections in use
2020-06-29 14:26:05 +02:00
Willy Tarreau
2eec9b5f95 REORG: include: move stats.h to haproxy/stats{,-t}.h
Just some minor reordering, and the usual cleanup of call places for
those which didn't need it. We don't include the whole tools.h into
stats-t anymore but just tools-t.h.
2020-06-11 10:18:58 +02:00