Command line argument -dt can be used to activate traces during startup.
Via its optional argument, it is possible to change settings for a
particular trace source. It is also possible to update every registered
sources by specifying an empty name.
Support the trace source alias "all". This is an alternative to the
empty name to update every sources.
Traces can be activated on startup either via -dt command line argument
or via the traces configuration section. This can caused confusion as it
may not be clear as trace source can be completed or overriden by one or
the other.
Fix the precedence to give the priority to the command line argument.
Now, each trace source configured via -dt is first resetted to a default
state before applying new settings. Then, it is impossible to change a
trace source via the configuration file if it was already targetted via
-dt argument.
Traces can be activated on startup via -dt command line argument. To
facilitate its usage, display a usage description and examples when
"help" is specified.
Define a set of functions to temporarily disable/reactivate tracing for
the current thread. This could be useful when wanting to quickly remove
tracing output for some code parts.
The API relies on a disable/resume set of functions, with a thread-local
counter. This counter is tested under __trace_enabled(). It is a
cumulative value so that the same count of resume must be issued after
several disable usage. There is also the possibility to force reset the
counter to 0 before restoring the old value.
This should be backported up to 3.1.
When using trace with -dt, the trace_parse_cmd() function is doing a
strtok which write \0 into the argv string.
When using the mworker mode, and reloading, argv was modified and the
trace won't work anymore because the first : is replaced by a '\0'.
This patch fixes the issue by allocating a temporary string so we don't
modify the source string directly. It also replace strtok by its
reentrant version strtok_r.
Must be backported as far as 2.9.
A previous known limitation about traces was that parsing was performed on
the fly, meaning that when using "sink" keyword, only sinks that were
either internal or previously defined in the config could be used. Indeed,
it was not possible to use a ring section defined AFTER the traces section
when using the 'sink' keyword from traces.
This limitation was also mentioned in the config file.
Let's get rid of that limitation by implementing proper postparsing for
the sink parameter in traces section. To do this, make use of the new
sink_find_early() helper to start referencing sink by their names even
if they don't exist yet (if they are about to be defined later in the
config)
Traces commands on the cli are not concerned by this change.
In the configuration file or on the CLI, configuring traces for a specific
source is a bit painful because this must be done in several lines. Thanks
to this patch, it is now possible to fully configure traces for a source in
one line. For instance, the following on the CLI:
trace h1 sink stderr; trace h1 level developer; trace h1 verbosity complete; trace h1 start now
can now be replaced by:
trace h1 sink stderr level developer verbosity complete start now
The same is true for the 'trace' directives in the configuration file.
It is no longer supported to declare debug traces, via 'trace' directive, in
a global section. A 'traces' directive must be used instead. The syntax of
the 'trace' directive in these sections remains the same. But it is no
longer experimental.
The main reason for this change is to avoid to have a ring section defined
before a global one. Indeed, for now, forward declarations of ring sections
are not supported. So to configure traces, you had to add a ring section
before the global one defining the traces. Most of time, that meant to have
two global sections :
global
[...] # global settings
ring <name>
[...]
global
[...] # trace config
In addition, it will be possible to easily extend the traces section by
adding some new directives.
We now have a trace_ctx to hold the sess, conn, qc, stream and so on.
This will allow us to pass it across layers so that other helpers can
help fill them.
Ideally it should be passed as an argument to __trace_enabled() by
__trace() so that it can be passed back to the trace callback. But
it seems that trace callbacks are smart enough to figure all their
info when they need them.
With "follow" from one source to another, it becomes possible for a
source to automatically follow another source's tracked pointer. The
best example is the session:
- the "session" source is enabled and has a "lockon session"
-> its lockon_ptr is equal to the session when valid
- other sources (h1,h2,h3 etc) are configured for "follow session"
and will then automatically check if session's lockon_ptr matches
its own session, in which case tracing will be enabled for that
trace (no state change).
It's not necessary to start/pause/stop traces when using this, only
"follow" followed by a source with lockon enabled is needed. Some
combinations might work better than others. At the moment the session
is almost never known from the backend, but this may improve.
The meta-source "all" is supported for the follower so that all sources
will follow the tracked one.
It's extremely painful to have to set "trace <src> sink buf1" for all
sources, then to do the same for "level developer" (for example). Let's
have a possibility via a meta-source "all" to apply the change to all
sources at once. This currently supports level and sink, which are not
dependent on the source, this is a good start.
The test was was performed but there's no way to set the option! Let's
just add "qconn" to select the quic conn when the source supports it.
This can be backported at least to 3.0, probably 2.6.
The doc clearly says that "start <evt>" should leave the trace in pause
mode until the indicated event appears. However it's not what's happening,
the state is not changed until one command uses "now", so it's typically
needed to configure the events with "start <evt>" then enable the waiting
mode using "pause now". This is counter-intuitive and does not match the
doc, so let's fix it so that "start <evt>" switches from stopped to waiting
as long as at least one event is enabled.
This can be backported to all versions.
When calling TRACE_ENABLED(), which is called by TRACE_PRINTF(), we pass
a NULL plockptr to __trace_enabled(). This argument is used when lockon
is active, and may update the pointer. This is an overlook which also
broke the lockon mechanism because now for calls from __trace(), it
dereferences a pointer pointing to NULL, and never updates it due to the
broken condition, so that trace() never sets up src->lockon_ptr.
The bug was introduced in 2.8 by commit 8f9a9704bb ("MINOR: trace: add a
TRACE_ENABLED() macro to determine if a trace is active"), so the fix must
be backported there.
These ones were not proposed in the list of trackable elements. Note
that this depends on previous commit:
BUG/MINOR: trace/quic: enable conn/session pointer recovery from quic_conn
This should be backported to at least 3.0, maybe even 2.6.
In __trace_enabled(), a quic_conn was detected, but it was not possible
to derive the connection nor the session from it, which was quite limiting
in terms of ability to track a same instance.
This should be backported to at least 3.0, maybe even 2.6.
Since a recent change on trace, the following compilation warning may
occur :
src/trace.c: In function ‘trace_parse_cmd’:
src/trace.c:865:33: error: potential null pointer dereference [-Werror=null-dereference]
865 | for (nd = src->decoding; nd->name && nd->desc; nd++)
| ~~~^~~~~~~~~~~~~~~
Fix this by rearranging code path to better highlight that only "quiet"
verbosity is allowed if no trace source is specified.
This was detected with GCC 14.1.
It's quite frustrating, particularly on the command line, not to have
access to the list of available levels and verbosities when one does
not exist for a given source, because there's no easy way to find them
except by starting without and connecting to the CLI. Let's enumerate
the list of supported levels and verbosities when a name does not match.
For example:
$ ./haproxy -db -f quic-repro.cfg -dt h2:help
[NOTICE] (9602) : haproxy version is 3.0-dev12-60496e-27
[NOTICE] (9602) : path to executable is ./haproxy
[ALERT] (9602) : -dt: no such trace level 'help', available levels are 'error', 'user', 'proto', 'state', 'data', and 'developer'.
$ ./haproxy -db -f quic-repro.cfg -dt h2:user:help
[NOTICE] (9604) : haproxy version is 3.0-dev12-60496e-27
[NOTICE] (9604) : path to executable is ./haproxy
[ALERT] (9604) : -dt: no such trace verbosity 'help' for source 'h2', available verbosities for this source are: 'quiet', 'clean', 'minimal', 'simple', 'advanced', and 'complete'.
The same is done for the CLI where the existing help message is always
displayed when entering an invalid verbosity or level.
By default, backend connections are attached to a server instance. This
allows to implement connection reuse. However, in some particular cases,
connection cannot be shared accross several clients. These connections
are considered and private and are attached to the session instance
instead.
These private connections are also indexed by the target server to not
mix them. All of this is implemented via a dedicated structure
previously named struct sess_srv_list.
Rename it to better reflect its usage to struct sess_priv_conns. Also
rename its internal members and all of the associated functions.
This commit is only a renaming, thus no functional impact is expected.
Add an optional argument for "-dt". This argument is interpreted as a
list of several trace statement separated by comma. For each statement,
a specific trace name can be specifed, or none to act on all sources.
Using double-colon separator, it is possible to add specifications on
the wanted level and verbosity.
Extract conversion of level string argument to integer value in a
dedicated internal function trace_parse_level(). This function is used
to for CLI trace parsing and will also be useful for "-dt" process
argument.
Add '-dt' haproxy process argument. This will automatically activate all
trace sources on stderr with the error level. This could be useful to
troubleshoot issues such as protocol violations.
Since traces were adapted to support being declared in the global section
in 2.7 with commit c11f1cdf4 ("MINOR: trace: split the CLI "trace" parser
in CLI vs statement"), the method used to return the error message was
unreliable. For example an invalid sink name in the global section would
produce:
[ALERT] (26685) : config : parsing [test-trace.cfg:51] : 'trace': No such sink
[ALERT] (26685) : config : parsing [test-trace.cfg:51] : (null)
[ALERT] (26685) : config : Error(s) found in configuration file : test-trace.cfg
[ALERT] (26685) : config : Fatal errors found in configuration.
The reason is that the trace is emitted manually using ha_error() in
cfg_parse_trace() and -1 is returned without setting the message, and
the caller also prints the empty message. That's quite awkward given
that the API originally comes from the CLI which does support dynamic
strings and that config keywords do as well.
This commit modifies both cli_parse_trace() and cfg_parse_trace() to
return a dynamically allocated message instead, and adapts the central
function trace_parse_statement() to do the same, replacing a few direct
assignments with strdup() or memprintf(). This way the alert is no
longer emitted by the parser function, it just passes the message to
the caller.
A few of the static messages switching to memprintf() also took this
opportunity to report the faulty word:
[ALERT] (26772) : config : parsing [test-trace.cfg:51] : No such trace sink 'stduot'
[ALERT] (26772) : config : Error(s) found in configuration file : test-trace.cfg
[ALERT] (26772) : config : Fatal errors found in configuration.
This may be backported to 2.8 and 2.7.
Introduce log_header struct to easily pass log header data between
functions and use that to simplify the logic around log header
handling.
While at it, some outdated comments were updated as well.
No change in behavior should be expected.
sink_write() currently relies on sink->maxlen to know when to stop
writing a given payload.
But it could be useful to pass a smaller, explicit value to sink_write()
to stop before the ring maxlen, for instance if the ring is shared between
multiple feeders.
sink_write() now takes an optional maxlen parameter:
if maxlen is > 0, then sink_write will stop writing at maxlen if maxlen
is smaller than ring->maxlen, else only ring->maxlen will be considered.
[for haproxy <= 2.7, patch must be applied by hand: that is:
__sink_write() and sink_write() should be patched to take maxlen into
account and function calls to sink_write() should use 0 as second argument
to keep original behavior]
By default, passing a NULL cb to the trace functions will result in the
source's default one to be used. For some cases we won't want to use any
callback at all, not event the default one. Let's define a trace_no_cb()
function for this, that does absolutely nothing.
Sometimes it would be necessary to prepare some messages, pre-process some
blocks or maybe duplicate some contents before they vanish for the purpose
of tracing them. However we don't want to do that for everything that is
submitted to the traces, it's important to do it only for what will really
be traced.
The __trace() function has all the knowledge for this, to the point of even
checking the lockon pointers. This commit splits the function in two, one
with the trace decision logic, and the other one for the trace production.
The first one is now usable through wrappers such as _trace_enabled() and
TRACE_ENABLED() which will indicate whether traces are going to be produced
for the current source, level, event mask, parameters and tracking.
There are ifdefs at several places to only define TRC_ARGS_QCON when
QUIC is defined, but nothing prevents this code from building without.
Let's just remove those ifdefs, the single "if" they avoid is not worth
the extra maintenance burden.
The exact same commands as those from the CLI may be pre-loaded at boot
time by passing them one per line after the "trace" keyword in the global
section; i.e. just copy-pasting all commands directly there will do the
job. Note that if a ring is mentioned, it needs to be declared before the
global section. Another option is to append another global section after
"ring".
For now the keyword is marked as experimental to discourage its broad
adoption by default. "expose-experimental-directives" needs to be placed
in the global section to expose it.
In order to be able to reuse the "trace" statements elsewhere (e.g.
global section), we'll first need to split its parser. It turns out
that the whole thing is self-contained inside a single function that
emits a single message on warning/error or nothing on success. That's
quite easy to split in two parts, the one that does the job and produces
the status message and the one that sends it to the CLI. That's what
this patch does.
Email alerts are based on health-checks but with no server. Thus, in
__trace() function, responsible to write a trace message, we must be
prepared to have no server and thus no proxy.
This patch must be backported as far as 2.4.
There's a rare race condition possible when trying to retrieve session from
a back connection's owner, that was fixed in 2.4 and described in commit
3aab17bd5 ("BUG/MAJOR: connection: reset conn->owner when detaching from
session list").
It also affects the trace code which does the same, so the same fix is
needed, i.e. check from conn->session_list that the connection is still
enlisted. It's visible when sending a few tens to hundreds of parallel
requests to an h2 backend and enabling traces in parallel.
This should be backported as far as 2.2 which is the oldest version
supporting traces.
This one is referenced in initcalls by its pointer, it makes no sense
to declare it inline. At best it causes function duplication, at worst
it doesn't build on older compilers.
Prepare trace support for quic_conn instances as argument. This will be
used by the xprt-quic layer in replacement of the connection.
This commit is part of the rearchitecture of xprt-quic layers and the
separation between xprt and connection instances.
There were 102 CLI commands whose help were zig-zagging all along the dump
making them unreadable. This patch realigns all these messages so that the
command now uses up to 40 characters before the delimiting colon. About a
third of the commands did not correctly list their arguments which were
added after the first version, so they were all updated. Some abuses of
the term "id" were fixed to use a more explanatory term. The
"set ssl ocsp-response" command was not listed because it lacked a help
message, this was fixed as well. The deprecated enable/disable commands
for agent/health/server were prominently written as deprecated. Whenever
possible, clearer explanations were provided.
With commit a1f12746b ("MINOR: traces: add a new level "error" below
the "user" level") a new trace level was inserted, resulting in
shifting all exiting ones by one. But the levels reported in the
__trace() function were not updated accordingly, resulting in the
TRACE_LEVEL_DEVELOPER not to be properly reported anymore. This
patch fixes it by extending the number of levels to 6.
No backport is needed.
Sometimes it would be nice to be able to only trace abnormal events such
as protocol errors. Let's add a new "error" level below the "user" level
for this. This will allow to add TRACE_ERROR() at various error points
and only see them.
This patch merges build message code between sink and log
and introduce a new API based on struct ist array to
prepare message header with zero copy, targeting the
log forwarding feature.
Log format 'iso' and 'timed' are now avalaible on logs line.
A new log format 'priority' is also added.
Most of the files dealing with error reports have to include log.h in order
to access ha_alert(), ha_warning() etc. But while these functions don't
depend on anything, log.h depends on a lot of stuff because it deals with
log-formats and samples. As a result it's impossible not to embark long
dependencies when using ha_warning() or qfprintf().
This patch moves these low-level functions to errors.h, which already
defines the error codes used at the same places. About half of the users
of log.h could be adjusted, sometimes revealing other issues such as
missing tools.h. Interestingly the total preprocessed size shrunk by
4%.
The current state of the logging is a real mess. The main problem is
that almost all files include log.h just in order to have access to
the alert/warning functions like ha_alert() etc, and don't care about
logs. But log.h also deals with real logging as well as log-format and
depends on stream.h and various other things. As such it forces a few
heavy files like stream.h to be loaded early and to hide missing
dependencies depending where it's loaded. Among the missing ones is
syslog.h which was often automatically included resulting in no less
than 3 users missing it.
Among 76 users, only 5 could be removed, and probably 70 don't need the
full set of dependencies.
A good approach would consist in splitting that file in 3 parts:
- one for error output ("errors" ?).
- one for log_format processing
- and one for actual logging.
Almost no change except moving the cli_kw struct definition after the
defines. Almost all users had both types&proto included, which is not
surprizing since this code is old and it used to be the norm a decade
ago. These places were cleaned.
The sink files could be moved with almost no change at since they
didn't rely on anything fancy. ssize_t required sys/types.h and
thread.h was needed for the locks.
The pretty confusing "buffer.h" was in fact not the place to look for
the definition of "struct buffer" but the one responsible for dynamic
buffer allocation. As such it defines the struct buffer_wait and the
few functions to allocate a buffer or wait for one.
This patch moves it renaming it to dynbuf.h. The type definition was
moved to its own file since it's included in a number of other structs.
Doing this cleanup revealed that a significant number of files used to
rely on this one to inherit struct buffer through it but didn't need
anything from this file at all.