634 Commits

Author SHA1 Message Date
Aurelien DARRAGON
3c15ee05e9 MINOR: log: introduce log_orig flags
Rename 'enum log_orig' to 'enum log_orig_id', since this enum specifically
contains the log origin ids.

Add 'struct log_orig' which wraps 'enum log_orig' with optional flags
(no flags defined for now).

Add log_orig() helper func that takes id and flags as parameter and
returns log_orig struct initialized with input arguments.

Update functions taking log origin as parameter so they explicitly take
log orig id or log orig wrapper as argument depending on the level of
context expected by the function.
2024-09-26 16:53:07 +02:00
Aurelien DARRAGON
6567e37680 MINOR: log: handle extra log origins in _process_send_log_override()
Thanks to the previous commit, it is now possible to register additional
log origins that may be used from log-profile section as 'on' steps.

As such, let's make _process_send_log_override() function aware of them
by trying to lookup in the tree of extra logging steps in the default
switch-case catchall. If the log origin id matches with the id of the
extra logging step, we use the associated log format instead of the
"any" log format.
2024-09-26 16:53:07 +02:00
Aurelien DARRAGON
818475c5cc MINOR: log: introduce extra log profile steps
add a way to register additional log origins using log_origin_register()
that may be used as log profile steps from log profile sections.

For now this does nothing as no extra origins are registered and extra log
origins are not yet considered for runtime logging paths.

When specifying an extra logging step for on <step> under log-profile
section, the logging step is stored within a binary tree for efficient
lookup during runtime. No performance impact should be expected if extra
log origins are not being used, and slight performance impact if extra
log origins are used.

Don't forget to update the documentation when new log origins are added
(both %OG log alias and on <step> log-profile keyword are concerned.
2024-09-26 16:53:07 +02:00
Aurelien DARRAGON
facf259d88 MINOR: log: fix indent in strm_log()
8f34320e15 ("MINOR: log: provide log origin in logformat expressions
using '%OG'") caused wrong indent in strm_log()
2024-09-26 16:53:07 +02:00
Willy Tarreau
b325453c36 MINOR: proxy: use the global file names for conf->file
Proxy file names are assigned a bit everywhere (resolvers, peers,
cli, logs, proxy). All these elements were enumerated and now use
copy_file_name(). The only ha_free() call was turned to drop_file_name().

As a bonus side effect, a 300k backend config saved 14 MB of RAM.
2024-09-19 15:38:19 +02:00
Willy Tarreau
1a38684fbc MEDIUM: cfgparse: detect collisions between defaults and log-forward
Sadly, when log-forward were introduced they took great care of avoiding
collision with regular proxies but defaults were missed (they need to be
explicitly checked for). So now we have to move them to a warning for 3.1
instead of rejecting them.
2024-09-18 18:08:15 +02:00
Aperence
a7b04e383a MINOR: tools: extend str2sa_range to add an alt parameter
Add a new parameter "alt" that will store wether this configuration
use an alternate protocol.

This alt pointer will contain a value that can be transparently
passed to protocol_lookup to obtain an appropriate protocol structure.

This change is needed to allow for example the servers to know if it
need to use an alternate protocol or not.
2024-08-30 18:53:49 +02:00
Willy Tarreau
d592ebdbeb MEDIUM: socket: always properly use the sock_domain for requested families
Now we make sure to always look up the protocol's domain for an address
family. Previously we would use it as-is, which prevented from properly
using custom addresses (which is when they differ).

This removes some hard-coded tests such as in log.c where UNIX vs UDP
was explicitly checked for example. It requires a bit of care, however,
so as to properly pass value 1 in the 3rd arg of the protocol_lookup()
for DGRAM stuff. Maybe one day we'll change these for defines or enums
to limit mistakes.
2024-08-21 17:36:58 +02:00
Nathan Wehrman
fd48b28315 MINOR: Implements new log format of option tcplog clf
Some systems require log formats in the CLF format and that meant that I
could not send my logs for proxies in mode tcp to those servers.  This
implements a format that uses log variables that are compatble with TCP
mode frontends and replaces traditional HTTP values in the CLF format
to make them stand out. Instead of logging method and URI like this
"GET /example HTTP/1.1" it will log "TCP " and for a response code I
used "000" so it would be easy to separate from legitimate HTTP
traffic. Now your log servers that require a CLF format can see the
timings for TCP traffic as well as HTTP.
2024-08-20 07:46:34 +02:00
Aurelien DARRAGON
f8299bc5ea MINOR: log: "drop" support for log-profile steps
It is now possible to use "drop" keyword for "on" lines under a
log-profile section to specify that no log at all should be emitted for
the specified step (setting an empty format was not sufficient to do so
because only the log payload would be empty, not the log header, thus the
log would still be emitted).

It may be useful to selectively disable logging at specific steps for a
given log target (since the log profile may be set on log directives):

log-profile myprof
  on request format "blabla" sd "custom sd"
  on response drop

New testcase was added to reg-tests/log/log_profiles.vtc
2024-08-19 18:53:01 +02:00
Aurelien DARRAGON
41ca89bc6f MEDIUM: log: relax some checks and emit diag warnings instead in lf_expr_postcheck()
With 7a21c3a ("MAJOR: log: implement proper postparsing for logformat
expressions") which finally made postparsing checks reliable, we started
to get report from users that couldn't start haproxy 3.0 with configs that
used to work in the past. The current situation is described in GH #2642.

While the checks are mostly relevant, it turns out there are not strictly
needed anymore from a technical point of view. Most of them were useful in
early logformat implementation to prevent runtime bugs due to the use of
an alias or fetch at runtime from an incompatible proxy. It's been a few
versions already that the code handling fetches and log aliases is robust
enough to support fetches/aliases used from the wrong context: all it
does is that the fetch/alias will silently fail if it's not available.

This can be proved by the fact that even if the postparsing checks were
partially broken in the past, it didn't cause runtime issues (at least
on recent haproxy versions).

Most of these checks can now be seen as configuration hints: when a check
triggers, it will indicate a configuration inconsistency in most cases,
but they are some corner cases where it is not possible to know at config
time if the conditions will be met for the alias/fetch to work properly..
so instead of failing with a hard error like we did so far, let's just be
more permissive and report our findings using "diag_warning": such
warnings are only emitted when haproxy is started with '-dD' cli option.

We also took this opportunity to improve messages clarity and make them
more precise (report the offending item instead of complaining about the
whole expression because of a single element).

With this patch, configs that used to start before 7a21c3a shouldn't
trigger hard errors anymore.

This may be backported in 3.0.
2024-08-16 14:25:10 +02:00
Aurelien DARRAGON
0030f722a2 BUG/MINOR: log: fix broken '+bin' logformat node option
In 12d08cf912 ("BUG/MEDIUM: log: don't ignore disabled node's options"),
while trying to restore historical node option inheritance behavior, I
broke the '+bin' logformat node option recently introduced in b7c3d8c87c
("MINOR: log: add +bin logformat node option").

Indeed, because of 12d08cf912, LOG_OPT_BIN is not set anymore on
individual nodes even if it was set globally, making the feature unusable.
('+bin' is also used for binary cbor encoding)

What I should have done instead is include LOG_OPT_BIN in the options
inherited from global ones. This is what's being done in this commit.
Misleading comment was adjusted.

It must be backported in 3.0 with 12d08cf912.
2024-06-14 18:25:21 +02:00
Aurelien DARRAGON
cc6fd2646b MEDIUM: log: handle log-profile in process_send_log()
In previous commit we implemented log-profile parsing logic. Now let's
actually make use of available log-profile information from logger struct
to decide whether we need to rebuild the logline under process_send_log()
according to log profile settings. Nothing is done if the logger didn't
specify a log-profile.
2024-06-13 15:43:09 +02:00
Aurelien DARRAGON
48d34b98e4 MINOR: log: add log profile buildlines
Now that we have log-profile parsing done, let's prepare for runtime
log-profile handling by adding the necessary string buffer required to
re-build log strings using sess_build_logline() on the fly without
altering regular loglines content.

Indeed, since a different log-profile may (or may not) be specified for
each logger, we must keep the original string and only rebuild a custom
one when required for the current logger (according to the selected log-
profile).
2024-06-13 15:43:09 +02:00
Aurelien DARRAGON
15e9c7da6b MINOR: log: add log-profile parsing logic
This patch implements prerequisite log-profile struct and parser logic.
It has no effect during runtime for now.

Logformat expressions provided in log-profile "steps" are postchecked
during postparsing for each proxy "log" directive that makes use of a
given profile. (this allows to ensure that the logformat expressions
used in the profile are compatible with proxy using them)
2024-06-13 15:43:09 +02:00
Aurelien DARRAGON
a6e38465fb MINOR: log: add __send_log_set_metadata_sd helper
Extract sd metadata assignment in __send_log() to make an inline helper
function out of it in order to be able to use it from other functions if
needed.
2024-06-13 15:43:09 +02:00
Aurelien DARRAGON
3102c89dde MINOR: log: provide proxy context to resolve_logger()
Prerequisite work for log-profiles, we need to know under which proxy
context the logger is being used. When the info is not available, (ie:
global section or log-forward section, <px> is set to NULL)
2024-06-13 15:43:09 +02:00
Aurelien DARRAGON
42139fa16e MINOR: log: make resolve_logger() static
There is no need to expose this internal function, let's make it static.
2024-06-13 15:43:09 +02:00
Aurelien DARRAGON
db47471155 MINOR: log/backend: always free parsing hints in resolve_logger()
Since resolve_logger() always resolves logger target (even when error
occurs), we must take care of freeing parsing hints because free_logger()
won't try to do it if target RESOLVED flag is set on the target.

This isn't considered as a bug because resolve_logger(), being a
postparsing check, will make haproxy immediately exit upon fatal error
in haproxy.c, but it's better to ensure that everything will be properly
freed if we decide to perform a clean exit upon postparsing checks error
in the future.
2024-06-13 15:43:09 +02:00
Aurelien DARRAGON
2a1bf99923 CLEANUP: log: remove ambiguous legacy comment for resolve_logger()
It is no longer relevant to say that <logger> is used for implicit
settings. In fact the function resolves <logger>, but currently
mainly focuses on loggers's target. However we could extend the
function to perform additional work on the logger itself in the future.

let's adjust the comment to prevent any confusion.
2024-06-13 15:43:09 +02:00
Aurelien DARRAGON
8f34320e15 MINOR: log: provide log origin in logformat expressions using '%OG'
'%OG' logformat alias may be used to report the log origin (when/where)
that triggered log generation using sess_build_logline().

Possible values are:
  - "sess_error": log was generated during session error handling
  - "sess_killed": log was generated during session abortion (killed
    embryonic session)
  - "txn_accept": log was generated right after frontend conn was accepted
  - "txn_request": log was generated after client request was received
  - "txn_connect": log was generated after backend connection establishment
  - "txn_response": log was generated during server response handling
  - "txn_close": log was generated at the final txn step, before closing
  - "unspec": unknown or not specified

Documentation was updated.
2024-06-13 15:43:09 +02:00
Aurelien DARRAGON
b52862d401 MINOR: log: add log_orig_to_str() function
Get human readable string from log_orig enum members.
2024-06-13 15:43:09 +02:00
Aurelien DARRAGON
2a91bd52ad MINOR: log: provide sending log context to process_send_log() when available
This is another prerequisite work in preparation for log-profiles: in this
patch we make process_send_log() aware of the log origin, primarily aiming
for sess and txn logging steps such as error, accept, connect, close, as
well as relevant sess and stream pointers.
2024-06-13 15:43:09 +02:00
Aurelien DARRAGON
0b7a5a64eb MEDIUM: log/session: handle embryonic session log within sess_log()
Move the embryonic session logging logic down to sess_log() in preparation
for log-profiles because then log preferences will be set per logger and
not per proxy. Indeed, as each logger may come with its own log-profile
that possibly overrides proxy logformat preferences, the check will need
to be performed at a central place by lower sending functions.

To ensure the change doesn't break existing behavior, a dedicated
sess_log_embryonic() wrapper was added and is exclusively used by
session_kill_embryonic() to indicate that a special logging logic must
be performed under sess_log().

Also, thanks to this change, log-format-sd will now be taken into account
for legacy embryonic session logging.
2024-06-13 15:43:09 +02:00
Aurelien DARRAGON
ee288a4eef REORG: log: reorder send log helpers by dependency order
This commit looks messy, but all it does is reorganize send_log() helpers
by dependency order to remove the need of forward-declaring some of them.

Also, since they're all internal helpers, let's explicitly mark them as
static to prevent any misuse.
2024-06-13 15:43:09 +02:00
Aurelien DARRAGON
e4f122f3f4 BUG/MEDIUM: log: fix lf_expr_postcheck() behavior with default section
Since 7a21c3a4ef ("MAJOR: log: implement proper postparsing for logformat
expressions"), logformat expressions stored in a default section are not
postchecked anymore. This is because the REGISTER_POST_PROXY_CHECK() only
evaluates regular proxies. Because of this, proxy options which are
automatically enabled on the proxy depending on the logformat expression
features in use are not set on the default proxy, which means such options
are not passed to the regular proxies that inherit from it (proxies that
and will actually be running the logformat expression during runtime).

Because of that, a logformat expression stored inside a default section
and executed by a regular proxy may not behave properly. Also, since
03ca16f38b ("OPTIM: log: resolve logformat options during postparsing"),
it's even worse because logformat node options postresoving is also
skipped, which may also alter logformat expression encoding feature.

To fix the issue, let's add a special case for default proxies in
parse_logformat_string() and lf_expr_postcheck() so that default proxies
are postchecked on the fly during parsing time in a "relaxed" way as we
cannot assume that the features involved in the logformat expression won't
be compatible with the proxy actually running it since we may have
different types of proxies inheriting from the same default section.

This bug was discovered while trying to address GH #2597.

It should be backported to 3.0 with 7a21c3a4ef and 03ca16f38b.
2024-06-11 11:00:05 +02:00
Aurelien DARRAGON
cbc8e1394d MINOR: log: change wording in lf_expr_postcheck() error message
logformat_node was referenced as "node" in the error message reported
to the user, but in fact it is referred to as "item" in user
documentation. Using "item" in the error message to better comply with
the doc.

Error message was introduced with 7a21c3a4ef ("MAJOR: log: implement
proper postparsing for logformat expressions")
2024-06-11 10:59:58 +02:00
Aurelien DARRAGON
e5ccfda9d3 MINOR: log: fix "http-send-name-header" ignore warning message
Warning message to indicate that the "http-send-name-header" option is
ignored for backend in "mode log" was referenced using its internal
struct wording instead of public name (as seen in the documentation).

Let's fix that.

It may be backported with c7783fb ("MINOR: log/backend: prevent
"http-send-name-header" use with LOG mode") in 2.9.
2024-06-11 10:58:55 +02:00
Aurelien DARRAGON
435a9da267 MINOR: log: rename 'log-format tag' to 'log-format alias'
In 2.9 we started to introduce an ambiguity in the documentation by
referring to historical log-format variables ('%var') as log-format
tags in 739c4e5b1e ("MINOR: sample: accept_date / request_date return
%Ts / %tr timestamp values") and 454c372b60 ("DOC: configuration: add
sample fetches for timing events").

In fact, we've had this confusion between log-format tag and log-format
var for more than 10 years now, but in 2.9 it was the first time the
confusion was exposed in the documentation.

Indeed, both 'log-format variable' and 'log-format tag' actually refer
to the same feature (that is: '%B' and friends that can be used for
direct access to some log-oriented predefined fetches instead of using
%[expr] with generic sample expressions).

This feature was first implemented in 723b73ad75 ("MINOR: config: Parse
the string of the log-format config keyword") and later documented in
4894040fa ("DOC: log-format documentation"). At that time, it was clear
that we used to name it 'log-format variable'.

But later the same year, 'log-format tag' naming started to appear in
some commit messages (while still referring to the same feature), for
instance with ffc3fcd6d ("MEDIUM: log: report SSL ciphers and version
in logs using logformat %sslc/%sslv").

Unfortunately in 2.9 when we added (and documented) new log-format
variables we officially started drifting to the misleading 'log-format
tag' naming (perhaps because it was the most recent naming found for
this feature in git log history, or because the confusion has always
been there)

Even worse, in 3.0 this confusion led us to rename all 'var' occurrences
to 'tag' in log-format related code to unify the code with the doc.

Hopefully William quickly noticed that we made a mistake there, but
instead of reverting to historical naming (log-format variable), it was
decided that we must use a different name that is less confusing than
'tags' or 'variables' (tags and variables are keywords that are already
used to designate other features in the code and that are not very
explicit under log-format context today).

Now we refer to '%B' and friends as a logformat alias, which is
essentially a handy way to print some log oriented information in the
log string instead of leveraging '%[expr]' with generic sample expressions
made of fetches and converters. Of course, there are some subtelties, such
as a few log-format aliases that still don't have sample fetch equivalent
for historical reasons, and some aliases that may be a little faster than
their generic sample expression equivalents because most aliases are
pretty much hardcoded in the log building function. But in general
logformat aliases should be simply considered as an alternative to using
expressions (with '%[expr']')

Also, under log-format context, when we want to refer to either an alias
('%alias') or an expression ('%[expr]'), we should use the generic term
'logformat item', which in fact designates a single item within the
logformat string provided by the user. Indeed, a logformat item (whether
is is an alias or an expression) always starts with '%' and may accept
optional flags / arguments

Both the code and the documentation were updated in that sense, hopefully
this will clarify things and prevent future confusions.
2024-05-27 17:03:48 +02:00
Aurelien DARRAGON
e84c8dee1a BUILD: log: get rid of non-portable strnlen() func
In c614fd3b9 ("MINOR: log: add +cbor encoding option"), I wrongly used
strnlen() without noticing that the function is not portable (requires
_POSIX_C_SOURCE >= 2008) and that it was the first occurrence in the
entire project. In fact it is not a hard requirement since it's a pretty
simple function. Thus to restore build compatibility with minimal/older
build systems, let's actually get rid of it and use an equivalent portable
code where needed (we cannot simply rely on strlen() because the string
might not be NULL terminated, we must take upstream len into account).

No backport needed (unless c614fd3b9 gets backported)
2024-05-17 15:24:53 +02:00
Aurelien DARRAGON
32f0cd3242 BUG/MINOR: log: smp_rgs array issues with inherited global log directives
When a log directive is defined in the global section, each time we use
"log global" in a proxy section, the global log directives are duplicated
for the current proxy. This works by creating a new proxy logger struct
and duplicating every members for each global one.

However, smp_rgs logger member is a special pointer member that is
allocated when "range" is used on a log directive. Currently, we simply
copy the array pointer (from the global one), instead of creating our own
copy. Because of that, range log sampling may not work properly in some
situations prior to 3f1284560 ("MINOR: log: remove the unused curr_idx in
struct smp_log_range") when used in global log directives, for instance:

  global
    log 127.0.0.1:5114 format raw sample 1-2,3:4 local0 info # should receive 75% of all proxy logs
    log 127.0.0.1:5115 format raw sample 4:4 local0 info     # should receive 25% of all proxy logs

  listen proxy1
    log global

  listen proxy2
    log global

May not work as expected, because curr_idx was stored within smp_rgs array
member prior to 3f1284560, and due to this bug, it happens to be shared
between every log directive inherited from a "global" one. The result is
that curr_idx counter will not behave properly because the index will be
increased globally instead of per-log directive, and it could even suffer
from concurrent thread accesses under load since we don't own the global
log directive's lock when manipulating it.

Another issue that was revealed because of this bug is that the smp_rgs
array allocated during config parsing is never freed in free_logger(),
resulting in small memory leak during clean exit.

To fix these issues all at once, let's properly duplicate smp_rgs logger
struct member in dup_logger() like we already do for other special members
so that every log directive have its own sms_rgs copy, and then
systematically free it in free_logger().

While this bug affects all stable versions (including 2.4), it's probably
best to not backport this beyond 2.6 because of 211ea252d
("BUG/MINOR: logs: fix logsrv leaks on clean exit") prerequisite that
first appears in 2.6.

[ada: for versions prior to 2.9, 969e212
 ("MINOR: log: add dup_logsrv() helper function") and 76acde91
 ("BUG/MINOR: log: keep the ref in dup_logger()") must be backported
 first.
 Note: Some ctx adjustments should be performed because 'logger' struct
 used to be named 'logsrv' in the past and 2.9 introduced logger target
 struct member. Thus it's probably easier to manually apply 76acde91 and
 the current bugfix by hand directly on top of 969e212.
]
2024-05-14 12:00:23 +02:00
Aurelien DARRAGON
9d4a44e713 BUG/MINOR: log: fix leak in add_sample_to_logformat_list() error path
If add_sample_to_logformat_list() fails to allocate new logformat_node,
then we directly jump to error_free label to cleanup the node using
free_logformat_node() before returning an error.

However if the node failed to allocate, then the sample expression that
was allocated just before (not yet assigned) isn't released
(free_logformat_node() is a no-op when NULL is provided). Thus if expr
wasn't assigned to the node during early failure, then it must be manually
released.

This bug was introduced by 2462e5bcc ("BUG/MINOR: log: fix potential
lf->name memory leak") which wasn't marked for backports. It only
affects 3.0.
2024-05-13 16:44:27 +02:00
Aurelien DARRAGON
fbbc2925d4 BUG/MEDIUM: log/ring: broken syslog octet counting
As reported by Tristan in GH #2561, syslog messages sent over rings are
malformed since commit 01aa0a05 ("MEDIUM: ring: change the ring reader
to use the new vector-based API now").

Indeed, take a look at the following log message produced prior to
01aa0a05:

  181 <134>1 2024-05-07T09:45:21.543263+02:00 - haproxy 113700 - - 127.0.0.1:56136 [07/May/2024:09:45:21.491] front front/s1 0/0/21/30/51 404 369 - - ---- 1/1/0/0/0 0/0   "GET / HTTP/1.1"

Starting with 01aa0a05, here's the equivalent log message:

  <134>1 2024-05-07T09:45:21.543263+02:00 - haproxy 112729 - - 127.0.0.1:56136 [07/May/2024:09:45:21.491] front front/s1 0/0/66/39/105 404 369 - - ---- 1/1/0/0/0 0/0   "GET / HTTP/1.1"-fwr

-> Message is missing octet counting header, and garbage bytes are found
at the end of the payload.

This bug is caused by a small mistake in syslog_applet_append_event():
when the function was refactored to use vector API instead of buffer
API, we used 'trash.area' as starting pointer to write the event instead
of 'trash.area + trash.data', causing existing octet counting prefix
(already written in trash) to be overwritten and trash.data to be
wrongly incremented.

No backport needed (01aa0a05 was introduced during 3.0 development)
2024-05-07 19:23:01 +02:00
Aurelien DARRAGON
03ca16f38b OPTIM: log: resolve logformat options during postparsing
In lf_buildctx_prepare(), we perform costly bitwise operations for every
nodes to resolve node options and check for incompatibilities with global
options.

In fact, all this logic may safely be performed during postparsing. This
is what we're doing in this commit. Doing so saves us from unnecessary
runtime checks and could help speedup sess_build_logline().

Since checks are not as costly as before (due to them being performed
during postparsing and not on log building path anymore), an complementary
check for OPT_HTTP vs OPT_ENCODE incompatibity was added:

  encoding is ignored if HTTP option is set, unless HTTP option wasn't
  set globally and encoding was set globally, which means encoding
  takes the precedence

Thanks to this patch, lf_buildctx_prepare() now only takes care of
assigning proper typecast and options settings depending if it's used
from global or per-node context, and prepares CBOR-specific structure
members when CBOR encode option is set.
2024-05-06 11:13:46 +02:00
Aurelien DARRAGON
d26a160133 OPTIM: log: speedup date printing in sess_build_logline() when no encoding is used
In sess_build_logline(), we have multiple fieds such as '%t' that build
a fixed-length string out of a date struct and then print it using
lf_rawtext(). In fact, printing it using lf_rawtext() is only mandatory
to deal with encoding options, but when no encoding is used we can output
the result to tmplog directly. Since most dates generate between 25 and 30
chars, doing so spares us from writing them twice and could help make
sess_build_logline() a bit faster when no encoding is used. (to match with
pre-encoding patch series performance).
2024-05-04 10:13:05 +02:00
Aurelien DARRAGON
bf3b4001ce OPTIM: log: use lf_buildctx's buffer instead of temporary stack buffers
Now that lf_buildctx isn't pushed on the stack anymore, let's take this
opportunity to store a small buffer of 256 bytes within it, and then use
this buffer as general purpose buffer to build fixed-length strings that
are then printed using lf_{raw}text() function. By doing so we stop
relying on temporary stack buffers.
2024-05-04 10:13:05 +02:00
Aurelien DARRAGON
ccc4341258 OPTIM: log: use thread local lf_buildctx to stop pushing it on the stack
Following previous commit's logic, let's move lf_buildctx ctx away from
sess_build_logline() to stop abusing from the stack to push large
structure each time sess_build_logline() is called. Also, don't memset
the structure for each invokation, but only reset members explicitly when
required.

For that we now declare one static lf_buildctx per thread (using
THREAD_LOCAL) and make sess_build_logline() refer to it using a pointer.
2024-05-04 10:13:05 +02:00
Aurelien DARRAGON
728b5aa835 OPTIM: log: declare empty buffer as global variable
'empty' buffer used in sess_build_logline() inside a loop, and since it
is only being read from and not modified, until recently it ended up being
cached most of the time and didn't cause overhead due to systematic push
on the stack.

However, due recent encoding work and new added variables on the stack,
we're starting to reach a stack limit and declaring 'empty' buffer within
the loop seems to cause non-negligible CPU overhead.

Since the variable isn't modified during log generation, let's declare
'empty' buffer as a global variable outside from sess_build_logline()
to prevent pushing it on the stack for each node evaluation.
2024-05-04 10:13:05 +02:00
Aurelien DARRAGON
cc2e94a948 BUG/MINOR: log: prevent double spaces emission in sess_build_logline()
Christian reported in GH #2556 that since 3.0-dev double spaces may be
found in log messages on some cases where it was not the case before.

As we were able to easily reproduce, a quick bisect led us to c6a7138
("MINOR: log: simplify last_isspace in sess_build_logline()"). While
it is true that all switch cases set the last_isspace variable to 0,
there was a subtelty for some fields such as '%hr', '%hrl', '%hs' or
'%hsl' and I overlooked it. Indeed, for '%hr', last_isspace was only set
to 0 if data was emitted, else the assignment didn't occur.

But with c6a7138, last_isspace is always set to 0 as long as the current
node type is not a separator. Because of that, if no data is emitted for
the current node value, and a space was already emitted prior to the
current node, then an extra space could be emitted after the node,
resulting in two spaces being emitted.

Note that while c6a7138 introduces a slight behavior regression regarding
last_isspace logic with the specific fields mentionned above, this
behavior could already be triggered with a failing or empty logformat
node sample expression. Consider this logformat expression:

  log-format "%{-M}o | %[str()] |"

str() will not print anything, and since we disabled mandatory option with
'-M', nothing gets printed for the node sample expression. As a result, we
have the following output:

  "|  |"

Instead of (when mandatory option is enabled):

  "| - |"

Thus in order to stick to the historical behavior, systematically set
last_isspace to 0 for EXPR nodes, and only set last_isspace to 0 when
data was written for TAG nodes. This way, '%hr', '%hrl', '%hs' or
'%hsl' should behave as before.

No backport needed.
2024-05-03 16:48:21 +02:00
Aurelien DARRAGON
48e0efb00b MEDIUM: log: optimizing tmp->type handling in sess_build_logline()
Instead of chaining 2 switchcases and performing encoding checks for all
nodes let's actually split the logic in 2: first handle simple node types
(text/separator), and then handle dynamic node types (tag, expr). Encoding
options are only evaluated for dynamic node types.

Also, last_isspace is always set to 0 after next_fmt label, since next_fmt
label is only used for dynamic nodes, thus != LOG_FMT_SEPARATOR.

Since LF_NODE_WITH_OPT() macro (which was introduced recently) is now
unused, let's get rid of it.

No functional change should be expected.

(Use diff -w to check patch changes since reindentation makes the patch
look heavy, but in fact it remains fairly small)
2024-05-03 16:48:21 +02:00
Amaury Denoyelle
634cc2a5d8 MINOR: counters: move last_change into counters struct
last_change was a member present in both proxy and server struct. It is
used as an age statistics to report the last update of the object.

Move last_change into fe_counters/be_counters. This is necessary to be
able to manipulate it through generic stat column and report it into
stats-file.

Note that there is a change for proxy structure with now 2 different
last_change values, on frontend and backend side. Special care was taken
to ensure that the value is initialized only on the proxy side. The
other value is set to 0 unless a listen proxy is instantiated. For the
moment, only backend counter is reported in stats. However, with now two
distinct values, stats could be extended to report it on both side.
2024-05-02 10:55:25 +02:00
Aurelien DARRAGON
12d08cf912 BUG/MEDIUM: log: don't ignore disabled node's options
In 3f2e8d0ed ("MEDIUM: log: lf_* build helpers now take a ctx argument")
I made a mistake, because starting with this commit it is no longer
possible from a node to disable global logformat options.
The result is that when an option is set globally, it cannot be disabled
anymore.

For instance, it is not possible to do this anymore:
  log-format "%{+X}o %{-X}Ts"

The original intent was to prevent encoding options from being
disabled once enabled globally, because when encoding is enabled globally
we start the object enumeration right away (ie: in CBOR and JSON we
announce dynamic map, and for each node we announce the key..), thus it
doesn't make sense to mix encoding types there, unless encoding is only
used per-node, in which case only the value gets encoded, thus it remains
possible to print a value in JSON/CBOR-compatible format while the next
one shouldn't be printed as-is.

Thus, to restore the original behavior, slightly change the logic in
lf_buildctx_prepare() so that only global encoding options take the
precedence over node's options (instead of all options).

No backport needed.
2024-04-30 18:45:07 +02:00
Aurelien DARRAGON
41d7e82e0f MINOR: log/cbor: _lf_cbor_encode_byte() explicitly requires non-NULL ctx (again)
The BUG_ON() statement that was added in 9bdea51 ("MINOR: log/cbor:
_lf_cbor_encode_byte() explicitly requires non-NULL ctx") isn't
sufficient as Coverity still thinks the lf_buildctx itself may be NULL
as shown in GH #2554. In fact the original reports complains about the
lf_buildctx itself and I didn't understand it properly, let's add another
check in the BUG_ON() to ensure both cbor_ctx and cbor_ctx->ctx are not
NULL since it is not expected if used properly.
2024-04-30 10:10:35 +02:00
Aurelien DARRAGON
9931a62c3f BUG/MINOR: log: fix global lf_expr node options behavior (2nd try)
In 98b44e8 ("BUG/MINOR: log: fix global lf_expr node options behavior"),
I properly restored global node options behavior for when encoding is
not used, however the fix is not optimal when encoding is involved:

Indeed, encoding logic in sess_build_logline() relies on global node
options to know if encoding must be handled expression-wide or
individually. However, because of the above fix, if an expression is
made of 1 or multiple nodes that all set an encoding option manually
(without '%o'), we consider that the option was set globally, but
that's probably not what the user intended. Instead we should only
evaluate global options from '%o', so that it remains possible to
skip global encoding when needed.

No backport needed.
2024-04-30 10:10:35 +02:00
Aurelien DARRAGON
97240d01b3 BUG/MINOR: log/encode: fix potential NULL-dereference in LOGCHAR()
When CBOR encoding was added in c614fd3b9 ("MINOR: log: add +cbor encoding
option"), in LOGCHAR(), we forgot to check that we don't assign the NULL
value to tmplog (as we assume that tmplog cannot be NULL at the end of
sess_build_logline())

No backport needed.
2024-04-30 10:10:35 +02:00
Aurelien DARRAGON
949ac95aa6 BUG/MINOR: log/encode: consider global options for key encoding
In sess_build_logline(), contrary to what's stated in the comment
"only consider global ctx for key encoding", we check for
LOG_OPT_ENCODE flag on the current ctx options instead of global
ones. Because of this, we could end up doing the wrong thing if the
previous node had encoding enabled but it isn't set globally for
instance.

To fix the issue, let's simply check the presence of the flag on
g_options before entering the "key encoding" block.

This bug was introduced with 3f7c8387 ("MINOR: log: add +json encoding
option"), no backport needed.
2024-04-30 10:10:35 +02:00
Aurelien DARRAGON
9bdce67585 CLEANUP: log: add a macro to know if a lf_node is configurable
LF_NODE_WITH_OPT(node) returns true if the node's option may be set and
thus should be considered. Logic is based on logformat node's type:
for now only TAG and FMT nodes can be configured.
2024-04-29 14:47:37 +02:00
Aurelien DARRAGON
98b44e8edb BUG/MINOR: log: fix global lf_expr node options behavior
In 507223d5 ("MINOR: log: global lf_expr node options"), a mistake was
made because it was assumed that only the last occurence of %o
(LOG_FMT_GLOBAL) should be kept as global node options.

However, although not documented, it is possible to have multiple %o
within a single logformat expression to change the global settings on the
fly.

For instance, consider this example:

  log-format "%{+X}o test1=%ms %{-X}o test2=%ms %{+X}o test3=%ms"

Prior to 3f2e8d0ed ("MEDIUM: log: lf_* build helpers now take a ctx
argument"), this would output something like this:

  test1=18B test2=395 test3=18B

This is because global options is properly updated as the lf_expr string
is parsed. But now due to 507223d5 and 3f2e8d0ed, only the last %o
occurence is considered. With the above example, this gives:

  test1=18B test2=18B test3=18B

To restore historical behavior, let's partially revert 507223d5: to
compute global node options, we now start with all options enabled and
then for each configurable node in lf_expr_postcheck(), we keep options
common to the current node and previous nodes using AND masking, this way
we really end up with options common to all nodes.

No backport needed.
2024-04-29 14:47:37 +02:00
Aurelien DARRAGON
9bdea51d7e MINOR: log/cbor: _lf_cbor_encode_byte() explicitly requires non-NULL ctx
As shown in GH #2550, Coverity is tempted to think that NULL-dereference
can occur in _lf_cbor_encode_byte() due to user-ctx being dereferenced
from cbor_ctx, while coverity thinks that cbor_ctx may be NULL.

In practise this cannot happen, because _lf_cbor_encode_byte() is
only leveraged through a function pointer that is set in conjunction with
the function pointer ctx (which ain't NULL). All this logic is done inside
lf_buildctx_prepare() when LOG_OPT_ENCODE_CBOR is set.

Since coverity doesn't seem to understand the logic properly, then it
might as well confuse humans, so let's make it clear in
_lf_cbor_encode_byte() that we expect non-NULL ctx by adding a BUG_ON()
2024-04-29 14:47:37 +02:00
Aurelien DARRAGON
0e2aea8224 CLEANUP: tools/cbor: rename cbor_encode_ctx struct members
Rename e_byte_fct to e_fct_byte and e_fct_byte_ctx to e_fct_ctx, and
adjust some comments to make it clear that e_fct_ctx is here to provide
additional user-ctx to the custom cbor encode function pointers.

For now, only e_fct_byte function may be provided, but we could imagine
having e_fct_int{16,32,64}() one day to speed up the encoding when we
know we can encode multiple bytes at a time, but for now it's not worth
the hassle.
2024-04-29 14:47:37 +02:00