Commit Graph

622 Commits

Author SHA1 Message Date
Aurelien DARRAGON
2e4cc517bf MEDIUM: log: use lf_rawtext for lf_ip() and lf_port() hex strings
Same as the previous commit, but for ip and port oriented values when
+X option is provided.

No functional change should be expected.

Because of this patch, we add a little overhead because we first generate
the text into a temporary variable and then use lf_rawtext() to print it.
Thus we have a double-copy, and this could have some performance
implications that were not yet evaluated. Due to the small number of bytes
that can end up being copied twice, we could be lucky and have no visible
performance impact, but if we happen to see a significant impact, it could
be useful to add a passthrough mechanism (to keep historical behavior)
when no encoding is involved.
2024-04-26 18:39:31 +02:00
Aurelien DARRAGON
3a3bdf1c76 MEDIUM: log: write raw strings using lf_rawtext()
Make use of the previous commit to print strings that should not be
modified.

For instance, when +X option is provided, we have to print numerical
values in ASCII HEX form. For that, we used snprintf() to output the
result to the log output buffer directly, but now we build the string in
a temporary buffer of fixed-size and then print it using lf_rawtext()
which will take care of encoding options.

Because of this patch, we add a little overhead because we first generate
the text into a temporary variable and then use lf_rawtext() to print it.
Thus we have a double-copy, and this could have some performance
implications that were not yet evaluated. Due to the small number of bytes
that can end up being copied twice, we could be lucky and have no visible
performance impact, but if we happen to see a significant impact, it could
be useful to add a passthrough mechanism (to keep historical behavior)
when no encoding is involved.
2024-04-26 18:39:31 +02:00
Aurelien DARRAGON
0d1e99c086 MEDIUM: log: pass date strings to lf_rawtext()
Don't directly call functions that take date as argument and output the
string representation to the log output buffer under sess_build_logline(),
and instead build the strings in temporary buffers of fixed size
(hopefully such functions, such as date2str_log() and gmt2str_log()
procuce strings of known size), and then print the result using
lf_rawtext() helper function. This way, we will be able to encode them
automatically as regular string/text when new encoding methods are added.

Because of this patch, we add a little overhead because we first generate
the text into a temporary variable and then use lf_rawtext() to print it.
Thus we have a double-copy, and this could have some performance
implications that were not yet evaluated. Due to the small number of bytes
that can end up being copied twice (< 30), we could be lucky and have no
visible performance impact, but if we happen to see a significant impact,
it could be useful to add a passthrough mechanism (to keep historical
behavior) when no encoding is involved.
2024-04-26 18:39:31 +02:00
Aurelien DARRAGON
fcb7e4beaa MINOR: log: add lf_rawtext{_len}() functions
similar to lf_text_{len}, except that quoting and mandatory options are
ignored. Use this to print the input string without any modification (
except for encoding logic).
2024-04-26 18:39:31 +02:00
Aurelien DARRAGON
1fa2da18cd MINOR: log: add lf_int() wrapper to print integers
Wrap ltoa(), lltoa(), ultoa() and utoa_pad() functions that are used by
sess_build_logline() to print numerical values by implementing a dedicated
helper named lf_int() that takes <dft_hld> as argument to know how to
write the integer by default (when no encoding is specified).

LF_INT_UTOA_PAD_4 is used to emulate utoa_pad(x, 4) since it's found only
once under sess_build_logline(), thus there is no need to pass an extra
parameter to lf_int() function.
2024-04-26 18:39:31 +02:00
Aurelien DARRAGON
d3c92a3a83 MINOR: log: skip custom logformat_node name if empty
Reminder:

Since 3.0-dev4, we can optionally give a name to logformat nodes:

  log-format "%(custom_name1)B %(custom_name2)[str(value)]"

But we may also optionally set the expected node type by appending
':type' after the name, type being either sint,str or bool, like this:

  log-format "%(string_as_int:sint)[str(14)]"

However, it is currently not possible to provide a type without providing
a name that is a least 1 char long. But it could be useful to provide a
type without setting a name, like this, for typecasting purposes only:

  log-format "%(:sint)[bool(true)]"

Thus in order to allow this usage, don't set node->name if node name is
not at least 1 character long. By doing so, node->name will remain NULL
and will not be considered, but the typecast setting will.
2024-04-26 18:39:31 +02:00
Aurelien DARRAGON
c584600083 CLEANUP: log: simplify complex values usages in sess_build_logline()
make sess_build_logline() switch case more readable by performing some
simplifications: complex values are first extracted in a temporary
variable so that it's easier to refer to them and at a single place.
2024-04-26 18:39:31 +02:00
Aurelien DARRAGON
507223d527 MINOR: log: global lf_expr node options
Add options to lf_expr->nodes to store global options (those that are
common to all node) for easier access.

No functional change should be expected.
2024-04-26 18:39:31 +02:00
Aurelien DARRAGON
7ff4f09e23 MINOR: log: store lf_expr nodes inside substruct
Add another struct level inside lf_expr struct to allow new information
to be stored alongside lf_expr nodes.
2024-04-26 18:39:31 +02:00
Aurelien DARRAGON
f8e1357a05 CLEANUP: log: remove unused checks for encode_{chunk,string}
Thanks to 8226e92eb ("BUG/MINOR: tools/log: invalid
encode_{chunk,string} usage"), we only need to check for NULL return
value from encode_{chunk,string}() and escape_string() to know if the
call failed.
2024-04-26 18:39:31 +02:00
Ilya Shipitsin
ab7f05daba CLEANUP: assorted typo fixes in the code and comments
This is 41st iteration of typo fixes
2024-04-17 11:14:44 +02:00
Aurelien DARRAGON
9420cfc0db CLEANUP: log: lf_text_len() returns a pointer not an integer
In c83684519 ("MEDIUM: log: add the ability to include samples in logs")
we checked the return value of lf_text_len() as an integer instead of
comparing the pointer with NULL explicitly. Since this may be confusing,
let's test the return value against NULL.

[ada: for backports, the patch needs to be applied manually because of
 c6a713842 ("MINOR: log: simplify last_isspace in sess_build_logline()")]
2024-04-09 17:35:53 +02:00
Aurelien DARRAGON
28548f812f BUG/MINOR: log: invalid snprintf() usage in sess_build_logline()
According to snprintf() man page:

       The  functions snprintf() and vsnprintf() do not write more than
       size bytes (including the terminating null byte ('\0')).  If the
       output was truncated due to this limit, then the return value is
       the number of  characters  (excluding  the  terminating null byte)
       which would have been written to the final string if enough space
       had been available.  Thus, a return value of size or more means
       that the output was truncated.

However, in sess_build_logline(), each time we need to check the return
value of snprintf(), here is how we proceed:

       iret = snprintf(tmplog, max, ...);
       if (iret < 0 || iret > max)
           // error
       // success
       tmplog += iret;

Here is the issue: if snprintf() lacks 1 byte space to write the
terminating NULL byte, it will return max. Which means in this case
that we fail to know that snprintf() truncated the output in reality,
and we still add iret to tmplog pointer. Considering sess_build_logline()
should NOT write more than <maxsize> bytes (including the terminating NULL
byte) as per the function description, in this case the function would
write <maxsize>+1 byte (to write the terminating NULL byte upon return),
which may lead to invalid write if <dst> was meant to hold <maxsize> bytes
at maximum.

Hopefully, this bug wasn't triggered so far because sess_build_logline()
is called with logline as <dst> argument and <global.max_syslog_len> as
<maxsize> argument, logline being initialized with 1 extra byte upon
startup.

But we better fix this to comply with the function description and prevent
any side-effect since some sess_build_logline() helpers may assume that
'tmplog-dst < maxsize' is always true. Also sess_build_logline() users
probably don't expect NULL-byte to be accounted for in the produced
logline length.

This should be backported to all stable versions.

[ada: for backports, the patch needs to be applied manually because of
 c6a713842 ("MINOR: log: simplify last_isspace in sess_build_logline()")]
2024-04-09 17:35:53 +02:00
Aurelien DARRAGON
8226e92eb0 BUG/MINOR: tools/log: invalid encode_{chunk,string} usage
encode_{chunk,string}() is often found to be used this way:

  ret = encode_{chunk,string}(start, stop...)
  if (ret == NULL || *ret != '\0') {
	//error
  }
  //success

Indeed, encode_{chunk,string} will always try to add terminating NULL byte
to the output string, unless no space is available for even 1 byte.
However, it means that for the caller to be able to spot an error, then it
must provide a buffer (here: start) which is already initialized.

But this is wrong: not only this is very tricky to use, but since those
functions don't return NULL on failure, then if the output buffer was not
properly initialized prior to calling the function, the caller will
perform invalid reads when checking for failure this way. Moreover, even
if the buffer is initialized, we cannot reliably tell if the function
actually failed this way because if the buffer was previously initialized
with NULL byte, then the caller might think that the call actually
succeeded (since the function didn't return NULL and didn't update the
buffer).

Also, sess_build_logline() relies lf_encode_{chunk,string}() functions
which are in fact wrappers for encode_{chunk,string}() functions and thus
exhibit the same error handling mechanism. It turns out that
sess_build_logline() makes unsafe use of those functions because it uses
the error-checking logic mentionned above while buffer (tmplog) is not
guaranteed to be initialized when entering the function. This may
ultimately cause malfunctions or invalid reads if the output buffer is
lacking space.

To fix the issue once and for all and prevent similar bugs from being
introduced, we make it so encode_{string, chunk} and escape_string()
(based on encode_string()) now explicitly return NULL on failure
(when the function failed to write at least the ending NULL byte)

lf_encode_{string,chunk}() helpers had to be patched as well due to code
duplication.

This should be backported to all stable versions.

[ada: for 2.4 and 2.6 the patch won't apply as-is, it might be helpful to
 backport ae1e14d65 ("CLEANUP: tools: removing escape_chunk() function")
 first, considering it's not very relevant to maintain a dead function]
2024-04-09 17:35:45 +02:00
Aurelien DARRAGON
b15f6dfae8 BUG/MINOR: log: fix lf_text_len() truncate inconsistency
In c5bff8e550 ("BUG/MINOR: log: improper behavior when escaping log data")
we fixed lf_text_len() behavior with +E (escape) option.

However we introduced an inconsistency if output buffer is too small to
hold the whole output and truncation occurs: indeed without +E option up
to <size> bytes (including NULL byte) will be used whereas with +E option
only <size-1> bytes will be used. Fixing the function and related comment
so that the function behaves the same in regards to truncation whether +E
option is used or not.

This should be backported to all stable versions.
2024-04-09 17:30:13 +02:00
Aurelien DARRAGON
e751eebfc6 MEDIUM: proxy/log: leverage lf_expr API for logformat preparsing
Currently, the way proxy-oriented logformat directives are handled is way
too complicated. Indeed, "log-format", "log-format-error", "log-format-sd"
and "unique-id-format" all rely on preparsing hints stored inside
proxy->conf member struct. Those preparsing hints include the original
string that should be compiled once the proxy parameters are known plus
the config file and line number where the string was found to generate
precise error messages in case of failure during the compiling process
that happens within check_config_validity().

Now that lf_expr API permits to compile a lf_expr struct that was
previously prepared (with original string and config hints), let's
leverage lf_expr_compile() from check_config_validity() and instead
of relying on individual proxy->conf hints for each logformat expression,
store string and config hints in the lf_expr struct directly and use
lf_expr helpers funcs to handle them when relevant (ie: original
logformat string freeing is now done at a central place inside
lf_expr_deinit(), which allows for some simplifications)

Doing so allows us to greatly simplify the preparsing logic for those 4
proxy directives, and to finally save some space in the proxy struct.

Also, since httpclient proxy has its "logformat" automatically compiled
in check_config_validity(), we now use the file hint from the logformat
expression struct to set an explicit name that will be reported in case
of error ("parsing [httpclient:0] : ...") and remove the extraneous check
in httpclient_precheck() (logformat was parsed twice previously..)
2024-04-04 19:10:01 +02:00
Aurelien DARRAGON
2b79457bc0 MEDIUM: log: add compiling logic to logformat expressions
split parse_logformat_string() into two functions:

parse_logformat_string() sticks to the same behavior, but now becomes an
helper for lf_expr_compile() which uses explicit arguments so that it
becomes possible to use lf_expr_compile() without a proxy, but also
compile an expression which was previously prepared for compiling (set
string and config hints within the logformat expression to avoid manually
storing string and config context if the compiling step happens later).

lf_expr_dup() may be used to duplicate an expression before it is
compiled, lf_expr_xfer() now makes sure that the input logformat is
already compiled.

This is some prerequisite works for log-profiles implementation, no
functional change should be expected.
2024-04-04 19:10:01 +02:00
Aurelien DARRAGON
7a21c3a4ef MAJOR: log: implement proper postparsing for logformat expressions
This patch tries to address a design flaw with how logformat expressions
are parsed from config. Indeed, some parse_logformat_string() calls are
performed during config parsing when the proxy mode is not yet known.

Here's a config example that illustrates the issue:

  defaults
     mode tcp

  listen test
     bind :8888
     http-response set-header custom-hdr "%trl" # needs http
     mode http

The above config should work, because the effective proxy mode is http,
yet haproxy fails with this error:

  [ALERT]    (99051) : config : parsing [repro.conf:6] : error detected in proxy 'test' while parsing 'http-response set-header' rule : format tag 'trl' is reserved for HTTP mode.

To fix the issue once and for all, let's implement smart postparsing for
logformat expressions encountered during config parsing:

  - split parse_logformat_string() (and subfonctions) in order to create a
    new lf_expr_postcheck() function that must be called to finish
    preparing and checking the logformat expression once the proxy type is
    known.
  - save some config hints info during parse_logformat_string() to
    generate more precise error messages during lf_expr_postcheck(), if
    needed, we rely on curpx->conf.args.{file,line} hints for that because
    parse_logformat_string() doesn't know about current file and line
    number.
  - lf_expr_postcheck() uses PR_FL_CHECKED proxy flag to know if the
    function may try to make the proxy compatible with the expression, or
    if it should simply fail as soon as an incompatibility is detected.
  - if parse_logformat_string() is called from an unchecked proxy, then
    schedule the expression for postparsing, else (ie: during runtime),
    run the postcheck right away.

This change will also allow for some logformat expression error handling
simplifications in the future.
2024-04-04 19:10:01 +02:00
Aurelien DARRAGON
6810c41f8e MEDIUM: tree-wide: add logformat expressions wrapper
log format expressions are broadly used within the code: once they are
parsed from input string, they are converted to a linked list of
logformat nodes.

We're starting to face some limitations because we're simply storing the
converted expression as a generic logformat_node list.

The first issue we're facing is that storing logformat expressions that
way doesn't allow us to add metadata alongside the list, which is part
of the prerequites for implementing log-profiles.

Another issue with storing logformat expressions as generic lists of
logformat_node elements is that it's starting to become really hard to
tell when we rely on logformat expressions or not in the code given that
there isn't always a comment near the list declaration or manipulation
to indicate that it's relying on logformat expressions under the hood,
so this adds some complexity for code maintenance.

This patch looks quite impressive due to changes in a lot of header and
source files (since logformat expressions are broadly used), but it does
a simple thing: it defines the lf_expr structure which itself holds a
generic list of logformat nodes, and then declares some helpers to
manipulate lf_expr elements and fixes the code so that we now exclusively
manipulate logformat_node lists as lf_expr elements outside of log.c.

For now, lf_expr struct only contains the list of logformat nodes (no
additional metadata), but now that we have dedicated type and helpers,
doing so in the future won't be problematic at all and won't require
extensive code changes.
2024-04-04 19:10:01 +02:00
Aurelien DARRAGON
7d8f45b647 MEDIUM: log: carry tag context in logformat node
This is a pretty simple patch despite requiring to make some visible
changes in the code:

When parsing a logformat string, log tags (ie: '%tag', AKA log tags) are
turned into logformat nodes with their type set to the type of the
corresponding logformat_tag element which was matched by name. Thus, when
"compiling" a logformat tag, we only keep a reference to the tag type
from the original logformat_tag.

For example, for "%B" log tag, we have the following logformat_tag
element:

  {
    .name = "B",
    .type = LOG_FMT_BYTES,
    .mode = PR_MODE_TCP,
    .lw = LW_BYTES,
    .config_callback = NULL
  }

When parsing "%B" string, we search for a matching logformat tag
inside logformat_tags[] array using the provided name, once we find a
matching element, we craft a logformat node whose type will be
LOG_FMT_BYTES, but from the node itself, we no longer have access to
other informations that are set in the logformat_tag struct element.

Thus from a logformat_node resulting from a log tag, with current
implementation, we cannot easily get back to matching logformat_tag
struct element as it would require us to scan the whole logformat_tags
array at runtime using node->type to find the matching element.

Let's take a simpler path and consider all tag-specific LOG_FMT_*
subtypes as being part of the same logformat node type: LOG_FMT_TAG.

Thanks to that, we're now able to distinguish logformat nodes made
from logformat tag from other logformat nodes, and link them to
their corresponding logformat_tag element from logformat_tags[] array. All
it costs is a simple indirection and an extra pointer in logformat_node
struct.

While at it, all LOG_FMT_* types related to logformat tags were moved
inside log.c as they have no use outside of it since they are simply
lookup indexes for sess_build_logline() and could even be replaced by
function pointers some day...
2024-04-04 19:10:01 +02:00
Aurelien DARRAGON
8cf5c3d7f0 MINOR: log: expose logformat_tag struct
rename logformat_type internal struct to logformat_tag to to make it less
confusing, then expose logformat_tag struct through header file so that it
can be referenced in other structs.

also rename logformat_keywords[] to logformat_tags[] for better
consistency.
2024-04-04 19:10:01 +02:00
Aurelien DARRAGON
c85cbc1061 MEDIUM: log: rename logformat var to logformat tag
What we use to call logformat variable in the code is referred as
log-format tag in the documentation. Having both 'var' and 'tag' labels
referring to the same thing is really confusing. Let's make the code
comply with the documentation by replacing all logformat var/variable/VAR
occurences with either tag or TAG.

No functional change should be expected, the only visible side-effect from
user point of view is that "variable" was replaced by "tag" in some error
messages.
2024-04-04 19:10:01 +02:00
Aurelien DARRAGON
3c6dfa618a MEDIUM: log/balance: leverage lbprm api for log load-balancing
log load-balancing implementation was not seamlessly integrated within
lbprm API. The consequence is that it could become harder to maintain
over time since it added some specific cases just for the log backend.
Moreover, it resulted in some code duplication since balance algorithms
that are common to logs and regular (tcp, http) backends were specifically
rewritten for log backends.

Thanks to the previous commit, we now have all the prerequisites to make
log load-balancing fully leverage lbprm logic. Thus in this patch we make
__do_send_log_backend() use existing lbprm algorithms, and we no longer
require log-specific lbprm initialization in cfgparse.c and in
postcheck_log_backend().

As a bonus, for log backends this allows weighed algorithms to properly
support weights (ie: roundrobin, random and log-hash) since we now
leverage the same lb algorithms that we use for tcp/http backends
(doc was updated).
2024-03-29 17:08:37 +01:00
Aurelien DARRAGON
9aea6df81f MINOR: lbprm: implement true "sticky" balance algo
As previously mentioned in cd352c0db ("MINOR: log/balance: rename
"log-sticky" to "sticky""), let's define a sticky algorithm that may be
used from any protocol. Sticky algorithm sticks on the same server as
long as it remains available.

The documentation was updated accordingly.
2024-03-29 17:08:37 +01:00
Aurelien DARRAGON
d0692d7019 BUG/MINOR: log/balance: detect if user tries to use unsupported algo
b61147fd ("MEDIUM: log/balance: merge tcp/http algo with log ones")
introduced some ambiguities, because while it shares some algos with the
ones from mode {tcp,http}, we forgot report an error when the user tries
to use an algorithm that is not available in this mode (as per the doc).

Because of that, haproxy would silently drop log messages during runtime.

To fix that, we ensure that algo is one of the supported ones during log
backend postparsing. If the algo is not supported, we raise an error.

This should be backported in 2.9 with b61147fd
2024-03-29 17:08:36 +01:00
Willy Tarreau
01aa0a057c MEDIUM: ring: change the ring reader to use the new vector-based API now
The code now looks cleaner and more easily shows what still needs to be
addressed. There are not that many changes in practice, these are mostly
mechanical, essentially hiding the buffer from the callers.
2024-03-25 17:34:19 +00:00
Willy Tarreau
0f611987da MINOR: ring: make the ring reader use only absolute offsets
The goal is to remove references to the buffer's head and tail in the
fast path so that we can release the lock during some reads. This means
no more comparisons with b_data() nor operations relative to b_head()
will be possible anymore. As a first step we need to have an absolute
offset in the buffer, and to use b_getblk_ofs() in the applet callbacks
to retrieve the data based on this.
2024-03-25 17:34:19 +00:00
Willy Tarreau
201c706330 MINOR: log/applet: add new function syslog_applet_append_event()
This function takes a buffer on input, and offset and a length, and
consumes the block from that buffer to send it to the appctx's output
buffer. Contrary to its sibling applet_append_line(), instead of just
appending an LF at the end of the line, it prepends the message size
in decimal and a space before the message, as expected by syslog TCP
implementaions. This will be used to simplify the ring reader code.
2024-03-25 17:34:19 +00:00
Aurelien DARRAGON
2df7e077c7 CLEANUP: log: fix obsolete comment for add_sample_to_logformat_list()
Since 833cc794 ("MEDIUM: sample: handle comma-delimited converter list")
logformat expressions now support having a comma-delimited converter list
right after the fetch. Let's remove a leftover comment from the initial
implementation that says otherwise.
2024-03-07 11:47:56 +01:00
Aurelien DARRAGON
2462e5bcca BUG/MINOR: log: fix potential lf->name memory leak
Recent commit 2ed6068 ("MINOR: log: custom name for logformat node")
introduced a potential memory leak because when custom name is provided,
lf->name value is allocated using strdup(), thus is expected to be freed
alongside the node when the node is released.

However lf->name was only freed in some common places within log.c
cleanups and helpers func, but in reality there are still cases where
lf nodes are manually freed without making use of freeing helpers.

So this is what this patch does, it makes sure all lf freeing places now
leverage the free_logformat_node() helper function that takes care of
freeing all known allocated elements within the node, including custom
name.

This commit depends on:
 - "MINOR: log: add free_logformat_node() helper function"

No backport needed unless 2ed6068 gets backported.
2024-02-22 15:32:42 +01:00
Aurelien DARRAGON
1c2e16ba8a MINOR: log: add free_logformat_node() helper function
Function may be used to free a single logformat node.
2024-02-22 15:32:42 +01:00
Aurelien DARRAGON
62121d5b90 CLEANUP: log: use free_logformat_list() in parse_logformat_string()
This is a follow up for 24a5e42db6 ("CLEANUP: log: deinitialization of
the log buffer in one function") as there was another opportunity to
make use of the new cleanup function.
2024-02-22 15:32:42 +01:00
Aurelien DARRAGON
e7aee6edd5 CLEANUP: log: fix process_send_log() indentation
Fix bad indentation for process_send_log() prototype (tab was used instead
of spaces)
2024-02-22 15:32:42 +01:00
Aurelien DARRAGON
ee88c4418f MINOR: log: automate string array construction in sess_build_logline()
make it so string array construction is performed by dedicated macro
helpers instead of manual char insertion between string members.

The goal is to easily be able to support multiple forms of array
construction depending on the data encoding format (raw, json..).

Only %hrl and %hsl logformats are concerned.
2024-02-20 15:49:55 +01:00
Aurelien DARRAGON
8d2b9e2acd MINOR: log: print metadata prefixes separately in sess_build_logline()
Some log variables may be prefixed with specific chars that represent
extra informations that are relevant with it but are are not directly
part of the "raw" value.

ie: '+' char is prepended before some values when "option logasap" is
used to indicate that the value has not yet reached its final value.

However, as those "metadata" are printed using the general purpose
LOGCHAR() printing helper, it's not easy to tell if they are part of the
base value or not.

In this patch we add the LOGMETACHAR() helper that is a wrapper for
LOGCHAR(). The goal is to prepare for adding some logic to prevent such
additional infos from being generated when not relevant or needed.
2024-02-20 15:49:55 +01:00
Aurelien DARRAGON
a2fc40bc28 MINOR: log: simplify quotes handling in sess_build_logline()
quotes building for some log formats is directly performed under each
switch case statement so it would become painful to add other conditions
to prevent the quotes from being generated when it's not supported by the
the data encoding format for instance (ie: JSON).

Let's centralize and simplify quotes handling by adding LOGQUOTE_START()
and LOGQUOTE_END() helper macros. If a quotation is started and not
explicitly ended, it will be automatically ended at the end of the current
logformat node:

LOGQUOTE_START() sets 'quote' variable to 1, this way LOGQUOTE_END() only
prints the ending quote when needed. LOGQUOTE_END() is systematically
called after each node switch-case (after each value). LOGQUOTE_START()
does nothing if LOG_OPT_QUOTE isn't set, so does LOGQUOTE_END().

Some rare cases such as %hsl (list of captured headers) required special
handling: in this case multiple quoted texts are generated for the same
field value so explicit LOGQUOTE_START() + LOGQUOTE_END() combination was
needed.
2024-02-20 15:49:55 +01:00
Aurelien DARRAGON
c6a7138420 MINOR: log: simplify last_isspace in sess_build_logline()
last_isspace variable is explicitly set to 0 in all cases except
LOG_FMT_SEPARATOR case. So we can actually simplify the code by setting
last_isspace to 0 by default and skipping the assignment for the
LOG_FMT_SEPARATOR case.
2024-02-20 15:49:55 +01:00
Aurelien DARRAGON
1448478d62 MINOR: log: explicit typecasting for logformat nodes
Add the ability to manually specify desired output type after a custom
field name for logformat nodes. Forcing the type can be useful to ensure
value is stored with the proper type representation. (i.e.: forcing
numerical to string to work around the limited resolution of JS number
types)

By default, type is set to SMP_T_SAME, which means the original type will
be preserved.

Currently supported types are: bool, str, sint
2024-02-20 15:49:54 +01:00
Aurelien DARRAGON
2ed6068f2a MINOR: log: custom name for logformat node
Add the ability to specify custom name (will be used for representation
in verbose output types such as json) to logformat nodes.

For now, a custom name should be composed by characters [a-zA-Z0-9-_]*
2024-02-20 15:18:39 +01:00
Christopher Faulet
dcd917d972 MINOR: applet: Remove uselelss test on SE_FL_SHR/SHW flags
These both flags are set after releasing the applet, in
appctx_shut(). Concretly, it means the applet is shutdown for reads and
writes. Once set, the applet's I/O handler was no longer called. Tests on
these flags are useless. There is no chance to match them.
2024-02-14 14:22:36 +01:00
Miroslav Zagorac
24a5e42db6 CLEANUP: log: deinitialization of the log buffer in one function
In several places in the source, there was the same block of code that was
used to deinitialize the log buffer.  There were even two functions that
did this, but they were called only from the code that is in the same
source file (free_tcpcheck_fmt() in src/tcpcheck.c and free_logformat_list()
in src/proxy.c - they were both static functions).

The function free_logformat_list() was moved from the file src/proxy.c to
src/log.c, and a check of the list before freeing the memory was added to
that function.
2024-01-30 08:27:26 +01:00
Christopher Faulet
d982a37e4c MINOR: muxes: Rename mux_ctl_type values to use MUX_CTL_ prefix
Instead of the generic MUX_, we now use MUX_CTL_ prefix for all mux_ctl_type
value. This will avoid any ambiguities with other enums, especially with a
new one that will be added to get information on mux streams.
2023-11-29 11:11:12 +01:00
Christopher Faulet
07691a2e7c CLEANUP: log: Fix %rc comment in sess_build_logline()
%rq was used instead of %rc.
2023-11-29 08:59:27 +01:00
Aurelien DARRAGON
2f2cb6d082 MEDIUM: log/balance: support FQDN for UDP log servers
In previous log backend implementation, we created a pseudo log target
for each declared log server, and we made the log target's address point
to the actual server address to save some time and prevent unecessary
copies.

But this was done without knowing that when FQDN is involved (more broadly
when dns/resolution is involved), the "port" part of server addr should
not be relied upon, and we should explicitly use ->svc_port for that
purpose.

With that in mind and thanks to the previous commit, some changes were
required: we allocate a dedicated addr within the log target when target
is in DGRAM mode. The addr is first initialized with known values and it
is then updated automatically by _srv_set_inetaddr() during runtime.
(the change is atomic so readers don't need to worry about it)

addr from server "log target" (INET/DGRAM mode) is made of the combination
of server's address (lacking the port part) and server's svc_port.
2023-11-29 08:59:27 +01:00
Aurelien DARRAGON
20437b3e32 MINOR: log/balance: set lbprm tot_weight on server on queue/dequeue
Maintain proper px->lbprm.tot_weight for log backends. server's weight is
considered as 1 as long as the server is usable.

This will allow the stats page to correctly display the proxy status since
the check currently relies on proxy's lbprm.tot_weight variable.
2023-11-24 16:27:55 +01:00
Aurelien DARRAGON
661c079bc5 MINOR: log/backend: prevent "use-server" rules use with LOG mode
server_rules declared using "use-server" keyword within a proxy are not
supported inside a log backend (with "mode log" set), so we report a
warning to the user and reset the setting.
2023-11-24 16:27:55 +01:00
Ilya Shipitsin
80813cdd2a CLEANUP: assorted typo fixes in the code and comments
This is 37th iteration of typo fixes
2023-11-23 16:23:14 +01:00
Willy Tarreau
8f9e94ecff BUILD: log: silence a build warning when threads are disabled
Building without threads emits two warnings because the proxy pointer
is no longer used (only serves for the lock) since 2.9 commit 9a74a6cb1
("MAJOR: log: introduce log backends"). No backport is needed.
2023-11-22 11:21:07 +01:00
Aurelien DARRAGON
82f4bcafae MINOR: log/backend: prevent "dynamic-cookie-key" use with LOG mode
It doesn't make sense to set "dynamic-cookie-key" inside a log backend,
thus we report a warning to the user and reset the setting.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
c7783fb32b MINOR: log/backend: prevent "http-send-name-header" use with LOG mode
It doesn't make sense to use the "http-send-name-header" directive inside
a log backend so we report a warning in with case and reset the setting.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
4b2616f784 MINOR: log/backend: prevent stick table and stick rules with LOG mode
Report a warning and prevent errors if user tries to declare a stick table
or use stick rules within a log backend.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
5335618967 MINOR: log/backend: prevent tcp-{request,response} use with LOG mode
We start implementing some postparsing compatibility checks for log
backends.

Here we report a warning if user tries to use tcp-{request,response} rules
with log backend, and we properly ignore such rules when inherited from
defaults section.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
b61147fd2a MEDIUM: log/balance: merge tcp/http algo with log ones
"log-balance" directive was recently introduced to configure the
balancing algorithm to use when in a log backend. However, it is
confusing and it causes issues when used in default section.

In this patch, we take another approach: first we remove the
"log-balance" directive, and instead we rely on existing "balance"
directive to configure log load balancing in log backend.

Some algorithms such as roundrobin can be used as-is in a log backend,
and for log-only algorithms, they are implemented as "log-$name" inside
the "backend" directive.

The documentation was updated accordingly.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
76acde9107 BUG/MINOR: log: keep the ref in dup_logger()
This bug was introduced with 969e212 ("MINOR: log: add dup_logsrv() helper
function")

When duplicating an existing log entry, we must take care to inherit from
its original ->ref if it is set, because not doing so would make 28ac0999
("MINOR: log: Keep the ref when a log server is copied to avoid duplicate entries")
ineffective given that global log directives will lose their original
reference when duplicated resursively (at least twice), which is what
happens when global log directives are first inherited to defaults which
are then inherited to a regular proxy at the end of the chain.

This can be easily reproduced using the following configuration:

   |global
   |  log stdout format raw local0
   |
   |defaults
   |  log global
   |
   |frontend test
   |  log global
   |  ...

Logs from "test" proxy will be duplicated because test incorrectly
inherited from global "log" directives twice, which 28ac0999 would
normally detect and prevent.

No backport needed unless 969e212 gets backported.
2023-11-13 11:06:05 +01:00
Aurelien DARRAGON
64e0b63442 BUG/MEDIUM: server: invalid address (post)parsing checks
This bug was introduced with 29b76ca ("BUG/MEDIUM: server/log: "mode log"
after server keyword causes crash ")

Indeed, we cannot safely rely on addr_proto being set when str2sa_range()
returns in parse_server() (even if SRV_PARSE_PARSE_ADDR is set), because
proto lookup might be bypassed when FQDN addresses are involved.

Unfortunately, the above patch wrongly assumed that proto would always
be set when SRV_PARSE_PARSE_ADDR was passed to parse_server() (so when
str2sa_range() was called), resulting in invalid postparsing checks being
performed, which could as well lead to crashes with log backends
("mode log" set) because some postparsing init was skipped as a result of
proto not being set and this wasn't expected later in the init code.

To fix this, we now make use of the previous patch to perform server's
address compatibility checks on hints that are always set when
str2sa_range() succesfully returns.

For log backend, we're also adding a complementary test to check if the
address family is of expected type, else we report an error, plus we're
moving the postinit logic in log api since _srv_check_proxy_mode() is
only meant to check proxy mode compatibility and we were abusing it.

This patch depends on:
 - "MINOR: tools: make str2sa_range() directly return type hints"

No backport required unless 29b76ca gets backported.
2023-11-10 17:49:57 +01:00
Aurelien DARRAGON
12582eb8e5 MINOR: tools: make str2sa_range() directly return type hints
str2sa_range() already allows the caller to provide <proto> in order to
get a pointer on the protocol matching with the string input thanks to
5fc9328a ("MINOR: tools: make str2sa_range() directly return the protocol")

However, as stated into the commit message, there is a trick:
   "we can fail to return a protocol in case the caller
    accepts an fqdn for use later. This is what servers do and in this
    case it is valid to return no protocol"

In this case, we're unable to return protocol because the protocol lookup
depends on both the [proto type + xprt type] and the [family type] to be
known.

While family type might not be directly resolved when fqdn is involved
(because family type might be discovered using DNS queries), proto type
and xprt type are already known. As such, the caller might be interested
in knowing those address related hints even if the address family type is
not yet resolved and thus the matching protocol cannot be looked up.

Thus in this patch we add the optional net_addr_type (custom type)
argument to str2sa_range to enable the caller to check the protocol type
and transport type when the function succeeds.
2023-11-10 17:49:57 +01:00
Tim Duesterhus
d7eaa0d553 CLEANUP: Re-apply xalloc_size.cocci (3)
This reapplies the xalloc_size.cocci patch across the whole `src/` tree.

see 16cc16dd82
see 63ee0e4c01
see 9fb57e8c17
2023-11-06 20:49:56 +01:00
Willy Tarreau
91ed52976c MINOR: dgram: allow to set rcv/sndbuf for dgram sockets as well
tune.rcvbuf.client and tune.rcvbuf.server are not suitable for shared
dgram sockets because they're per connection so their units are not the
same. However, QUIC's listener and log servers are not connected and
take per-thread or per-process traffic where a socket log buffer might
be too small, causing undesirable packet losses and retransmits in the
case of QUIC. This essentially manifests in listener mode with new
connections taking a lot of time to set up under heavy traffic due to
the small queues causing delays. Let's add a few new settings allowing
to set these shared socket sizes on the frontend and backend side (which
reminds that these are per-front/back and not per client/server hence
not per connection).
2023-10-18 17:01:19 +02:00
Aurelien DARRAGON
b30bd7adba MEDIUM: log/balance: support for the "hash" lb algorithm
hash lb algorithm can be configured with the "log-balance hash <cnv_list>"
directive. With this algorithm, the user specifies a converter list with
<cnv_list>.

The produced log message will be passed as-is to the provided converter
list, and the resulting hash will be used to select the log server that
will receive the log message.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
e0b4660015 MINOR: log/balance: support for the "random" lb algorithm
In this patch we add basic support for the random algorithm:

random algorithm picks a random server using the result of the
statistical_prng() function as if it was a hash key to then compute the
related server ID.

There is no support for the <draw> parameter (which is implemented for
tcp/http load-balancing), because we don't have the required metrics to
evaluate server's load in log backends for the moment. Plus it would add
more complexity to the __do_send_log_backend() function so we'll keep it
this way for now but this might be needed in the future.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
26f73dbcbb MINOR: log/balance: support for the "sticky" lb algorithm
sticky algorithm always tries to send log messages to the first server in
the farm. The server will stay in front during queue and dequeue
operations (no other server can steal its place), unless it becomes
unavailable, in which case it will be replaced by another server from
the tree.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
9a74a6cb17 MAJOR: log: introduce log backends
Using "mode log" in a backend section turns the proxy in a log backend
which can be used to log-balance logs between multiple log targets
(udp or tcp servers)

log backends can be used as regular log targets using the log directive
with "backend@be_name" prefix, like so:

  | log backend@mybackend local0

A log backend will distribute log messages to servers according to the
log load-balancing algorithm that can be set using the "log-balance"
option from the log backend section. For now, only the roundrobin
algorithm is supported and set by default.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
6dad0549a5 MEDIUM: log/sink: simplify log header handling
Introduce log_header struct to easily pass log header data between
functions and use that to simplify the logic around log header
handling.

While at it, some outdated comments were updated as well.

No change in behavior should be expected.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
ab914667da MINOR: log: remove the logger dependency in do_send_log()
do_send_log() now exlusively relies on explicit parameters to remove
logger dependency in low-level log sending chain.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
60c5821867 MINOR: log: support explicit log target as argument in __do_send_log()
__do_send_log() now takes an extra target parameter to pass an explicit
log target instead of getting it from logger->target.

This will allow __do_send_log() to be called multiple times within a
logger entry containing multiple log targets.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
cc3dfe89ed MEDIUM: sink/log: stop relying on AF_UNSPEC for rings
Since a5b325f92 ("MINOR: protocol: add a real family for existing FDs"),
we don't rely anymore on AF_UNSPEC for buffer rings in do_send_log.

But we kept it as a parsing hint to differentiate between implicit and
named rings during ring buffer postparsing.

However it is still a bit confusing and forces us to systematically rely
on target->addr, even for named buffer rings where it doesn't make much
sense anymore.

Now that target->addr was made a pointer in a recent commit, we can
choose not to initialize it when not needed (i.e.: named rings) and use
this as a hint to distinguish implicit rings during init since they rely
on the addr struct to temporarily store the ring's address until the ring
is actually created during postparsing step.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
a9b185f34e MEDIUM: log: introduce log target
log targets were immediately embedded in logger struct (previously
named logsrv) and could not be used outside of this context.

In this patch, we're introducing log_target type with the associated
helper functions so that it becomes possible to declare and use log
targets outside of loggers scope.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
18da35c123 MEDIUM: tree-wide: logsrv struct becomes logger
When 'log' directive was implemented, the internal representation was
named 'struct logsrv', because the 'log' directive would directly point
to the log target, which used to be a (UDP) log server exclusively at
that time, hence the name.

But things have become more complex, since today 'log' directive can point
to ring targets (implicit, or named) for example.

Indeed, a 'log' directive does no longer reference the "final" server to
which the log will be sent, but instead it describes which log API and
parameters to use for transporting the log messages to the proper log
destination.

So now the term 'logsrv' is rather confusing and prevents us from
introducing a new level of abstraction because they would be mixed
with logsrv.

So in order to better designate this 'log' directive, and make it more
generic, we chose the word 'logger' which now replaces logsrv everywhere
it was used in the code (including related comments).

This is internal rewording, so no functional change should be expected
on user-side.
2023-10-13 10:05:06 +02:00
Willy Tarreau
cec8b42cb3 MEDIUM: logs: atomically check and update the log sample index
The log server lock is pretty visible in perf top when using log samples
because it's taken for each server in turn while trying to validate and
update the log server's index. Let's change this for a CAS, since we have
the index and the range at hand now. This allow us to remove the logsrv
lock.

The test on 4 servers now shows a 3.7 times improvement thanks to much
lower contention. Without log sampling a test producing 4.4M logs/s
delivers 4.4M logs/s at 21 CPUs used, everything spent in the kernel.
After enabling 4 samples (1:4, 2:4, 3:4 and 4:4), the throughput would
previously drop to 1.13M log/s with 37 CPUs used and 75% spent in
process_send_log(). Now with this change, 4.25M logs/s are emitted,
using 26 CPUs and 22% in process_send_log(). That's a 3.7x throughput
improvement for a 30% global CPU usage reduction, but in practice it
mostly shows that the performance drop caused by having samples is much
less noticeable (each of the 4 servers has its index updated for each
log).

Note that in order to even avoid incrementing an index for each log srv
that is consulted, it would be more convenient to have a single index
per frontend and apply the modulus on each log server in turn to see if
the range has to be updated. It would then only perform one write per
range switch. However the place where this is done doesn't have access
to a frontend, so some changes would need to be performed for this, and
it would require to update the current range independently in each
logsrv, which is not necessarily easier since we don't know yet if we
can commit it.
2023-09-20 21:38:33 +02:00
Willy Tarreau
e00470378b MINOR: logs: use a single index to store the current range and index
By using a single long long to store both the current range and the
next index, we'll make it possible to perform atomic operations instead
of locking. Let's only regroup them for now under a new "curr_rg_idx".
The upper word is the range, the lower is the index.
2023-09-20 21:38:33 +02:00
Willy Tarreau
49ddc0138c CLEANUP: logs: rename a confusing local variable "curr_rg" to "smp_rg"
The variable curr_rg in process_send_log() is misleading because it is
not related to the integer curr_rg that's used to calculate it, instead
it's a pointer to the current smp_log_range from smp_rgs[], so let's call
it "smp_rg" as a singular for this "smp_rgs" and put an end to this
confusion.
2023-09-20 21:38:33 +02:00
Willy Tarreau
3f1284560f MINOR: log: remove the unused curr_idx in struct smp_log_range
This index is useless because it only serves to know when the global
index reached the end, while the global one already knows it. Let's
just drop it and perform the test on the global range.

It was verified with the following config that the first server continues
to take 1/10 of the traffic, the 2nd one 2/10, the 3rd one 3/10 and the
4th one 4/10:

    log 127.0.0.1:10001 sample 1:10 local0
    log 127.0.0.1:10002 sample 2,5:10 local0
    log 127.0.0.1:10003 sample 3,7,9:10 local0
    log 127.0.0.1:10004 sample 4,6,8,10:10 local0
2023-09-20 21:38:33 +02:00
Willy Tarreau
4351364700 MINOR: logs: clarify the check of the log range
The test of the log range is not very clear, in part due to the
reuse of the "curr_idx" name that happens at two levels. The call
to in_smp_log_range() applies to the smp_info's index to which 1 is
added: it verifies that the next index is still within the current
range.

Let's just have a local variable "next_index" in process_send_log()
that gets assigned the next index (current+1) and compare it to the
current range's boundaries. This makes the test much clearer. We can
then simply remove in_smp_log_range() that's no longer needed.
2023-09-20 21:38:33 +02:00
Aurelien DARRAGON
7a71801af6 CLEANUP: log: remove unnecessary trim in __do_send_log
Since both sink_write and fd_write_frag_line take the maxlen parameter
as argument, there is no added value for the trim before passing the
msg parameter to those functions.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
d9b81e5b49 MEDIUM: log/sink: make logsrv postparsing more generic
We previously had postparsing logic but only for logsrv sinks, but now we
need to make this operation on logsrv directly instead of sinks to prepare
for additional postparsing logic that is not sink-specific.

To do this, we migrated post_sink_resolve() and sink_postresolve_logsrvs()
to their postresolve_logsrvs() and postresolve_logsrv_list() equivalents.

Then, we split postresolve_logsrv_list() so that the sink-only logic stays
in sink.c (sink_resolve_logsrv_buffer() function), and the "generic"
target part stays in log.c as resolve_logsrv().

Error messages formatting was preserved as far as possible but some slight
variations are to be expected.
As for the functional aspect, no change should be expected.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
969e212c66 MINOR: log: add dup_logsrv() helper function
ease code maintenance by introducing dup_logsrv() helper function to
properly duplicate an existing logsrv struct.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
e187361b52 MINOR: log: move log-forwarders cleanup in log.c
Move the log-forwarded proxies cleanup from global deinit() function into
log dedicated deinit function.

No backport needed.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
9f9d557468 BUG/MINOR: log: free errmsg on error in cfg_parse_log_forward()
When leaving cfg_parse_log_forward() on error paths, errmsg which is local
to the function could still point to valid data, and it's our
responsibility to free it.

Instead of freeing it everywhere it is invoved, we free it prior to
leaving the function.

This should be backported as far as 2.4.
2023-07-10 18:28:08 +02:00
Aurelien DARRAGON
21cf42f579 BUG/MINOR: log: fix multiple error paths in cfg_parse_log_forward()
Multiple error paths were badly handled in cfg_parse_log_forward():
some errors were raised without interrupting the function execution,
resulting in undefined behavior.

Instead of fixing issues separately, let's fix the whole function at once.
This should be backported as far as 2.4.
2023-07-10 18:28:08 +02:00
Aurelien DARRAGON
d1af50c807 BUG/MINOR: log: fix missing name error message in cfg_parse_log_forward()
"missing name for ip-forward section" is generated instead of "missing
name name for log-forward section" in cfg_parse_log_forward().

This may be backported up to 2.4.
2023-07-10 18:28:08 +02:00
Aurelien DARRAGON
47ee036e5f BUG/MEDIUM: log: improper use of logsrv->maxlen for buffer targets
In e709e1e ("MEDIUM: logs: buffer targets now rely on new sink_write")
we started using the sink API instead of using the ring_write function
directly.

But as indicated in the commit message, the maxlen parameter of the log
directive now only applies to the message part and not the complete
payload. I don't know what the original intent was (maybe minimizing code
changes) but it seems wrong, because the doc doesn't mention this special
case, and the result is that the ring->buffer output can differ from all
other log output types, making it very confusing.

One last issue with this is that log messages can end up being dropped at
runtime, only for the buffer target, and even if logsrv->maxlen is
correctly set (including default: 1024) because depending on the generated
header size the payload can grow bigger than the accepted sink size (sink
maxlen is not mandatory) and we have no simple way to detect this at
configuration time.

First, we partially revert e709e1e:

  TARGET_BUFFER still leverages the proper sink API, but thanks to
  "MINOR: sink: pass explicit maxlen parameter to sink_write()" we now
  explicitly pass the logsrv->maxlen to the sink_write function in order
  to stop writing as soon as either sink->maxlen or logsrv->maxlen is
  reached.

This restores pre-e709e1e behavior with the added benefit from using the
high-level API, which includes automatically announcing dropped message
events.

Then, we also need to take the ending '\n' into account: it is not
explicitly set when generating the logline for TARGET_BUFFER, but it will
be forcefully added by the sink_forward_io_handler function from the tcp
handler applet when log messages from the buffer are forwarded to tcp
endpoints.

In current form, because the '\n' is added later in the chain, maxlen is
not being considered anymore, so the final log message could exceed maxlen
by 1 byte, which could make receiving servers unhappy in logging context.

To prevent this, we sacrifice 1 byte from the logsrv->maxlen to ensure
that the final message will never exceed log->maxlen, even if the '\n'
char is automatically appended later by the forwarding applet.

Thanks to this change TCP (over RING/BUFFER) target now behaves like
FD and UDP targets.

This commit depends on:
 - "MINOR: sink: pass explicit maxlen parameter to sink_write()"

It may be backported as far as 2.2

[For 2.2 and 2.4 the patch does not apply automatically, the sink_write()
call must be updated by hand]
2023-07-10 18:28:08 +02:00
Aurelien DARRAGON
b6e2d62fb3 MINOR: sink/api: pass explicit maxlen parameter to sink_write()
sink_write() currently relies on sink->maxlen to know when to stop
writing a given payload.

But it could be useful to pass a smaller, explicit value to sink_write()
to stop before the ring maxlen, for instance if the ring is shared between
multiple feeders.

sink_write() now takes an optional maxlen parameter:
  if maxlen is > 0, then sink_write will stop writing at maxlen if maxlen
  is smaller than ring->maxlen, else only ring->maxlen will be considered.

[for haproxy <= 2.7, patch must be applied by hand: that is:
__sink_write() and sink_write() should be patched to take maxlen into
account and function calls to sink_write() should use 0 as second argument
to keep original behavior]
2023-07-10 18:28:08 +02:00
Aurelien DARRAGON
901f31bc9a BUG/MINOR: log: LF upsets maxlen for UDP targets
A regression was introduced with 5464885 ("MEDIUM: log/sink: re-work
and merge of build message API.").

For UDP targets, a final '\n' is systematically inserted, but with the
rework of the build message API, it is inserted after the maxlen
limitation has been enforced, so this can lead to the final message
becoming maxlen+1. For strict syslog servers that only accept up to
maxlen characters, this could be a problem.

To fix the regression, we take the final '\n' into account prior to
building the message, like it was done before the rework of the API.

This should be backported up to 2.4.
2023-07-10 18:28:08 +02:00
Aurelien DARRAGON
256d581fbd BUG/MINOR: log: fix memory error handling in parse_logsrv()
A check was missing in parse_logsrv() to make sure that malloc-dependent
variable is checked for non-NULL before using it.

If malloc fails, the function raises an error and stops, like it's already
done at a few other places within the function.

This partially fixes GH #2130.

It should be backported to every stable versions.
2023-05-12 09:45:30 +02:00
Willy Tarreau
69530f59ae MEDIUM: clock: replace timeval "now" with integer "now_ns"
This puts an end to the occasional confusion between the "now" date
that is internal, monotonic and not synchronized with the system's
date, and "date" which is the system's date and not necessarily
monotonic. Variable "now" was removed and replaced with a 64-bit
integer "now_ns" which is a counter of nanoseconds. It wraps every
585 years, so if all goes well (i.e. if humanity does not need
haproxy anymore in 500 years), it will just never wrap. This implies
that now_ns is never nul and that the zero value can reliably be used
as "not set yet" for a timestamp if needed. This will also simplify
date checks where it becomes possible again to do "date1<date2".

All occurrences of "tv_to_ns(&now)" were simply replaced by "now_ns".
Due to the intricacies between now, global_now and now_offset, all 3
had to be turned to nanoseconds at once. It's not a problem since all
of them were solely used in 3 functions in clock.c, but they make the
patch look bigger than it really  is.

The clock_update_local_date() and clock_update_global_date() functions
are now much simpler as there's no need anymore to perform conversions
nor to round the timeval up or down.

The wrapping continues to happen by presetting the internal offset in
the short future so that the 32-bit now_ms continues to wrap 20 seconds
after boot.

The start_time used to calculate uptime can still be turned to
nanoseconds now. One interrogation concerns global_now_ms which is used
only for the freq counters. It's unclear whether there's more value in
using two variables that need to be synchronized sequentially like today
or to just use global_now_ns divided by 1 million. Both approaches will
work equally well on modern systems, the difference might come from
smaller ones. Better not change anyhting for now.

One benefit of the new approach is that we now have an internal date
with a resolution of the nanosecond and the precision of the microsecond,
which can be useful to extend some measurements given that timestamps
also have this resolution.
2023-04-28 16:08:08 +02:00
Willy Tarreau
eed5da1037 MINOR: clock: do not use now.tv_sec anymore
Instead we're using ns_to_sec(tv_to_ns(&now)) which allows the tv_sec
part to disappear. At this point, "now" is only used as a timeval in
clock.c where it is updated.
2023-04-28 16:08:08 +02:00
Willy Tarreau
ad5a5f6779 MEDIUM: tree-wide: replace timeval with nanoseconds in tv_accept and tv_request
Let's get rid of timeval in storage of internal timestamps so that they
are no longer mistaken for wall clock time. These were exclusively used
subtracted from each other or to/from "now" after being converted to ns,
so this patch removes the tv_to_ns() conversion to use them natively. Two
occurrences of tv_isge() were turned to a regular wrapping subtract.
2023-04-28 16:08:08 +02:00
Willy Tarreau
76d343d3d3 MINOR: time: replace calls to tv_ms_elapsed() with a linear subtract
Instead of operating on {sec, usec} now we convert both operands to
ns then subtract them and convert to ms. This is a first step towards
dropping timeval from these timestamps.

Interestingly, tv_ms_elapsed() and tv_ms_remain() are no longer used at
all and could be removed.
2023-04-28 16:08:08 +02:00
Christopher Faulet
285aa40d35 BUG/MEDIUM: log: Properly handle client aborts in syslog applet
In the syslog applet, when there is no output data, nothing is performed and
the applet leaves by requesting more data. But it is an issue because a
client abort is only handled if it reported with the last bytes of the
message. If the abort occurs after the message was handled, it is ignored.
The session remains opened and inactive until the client timeout is being
triggered. It no such timeout is configured, given that the default maxconn
is 10, all slots can be quickly busy and make the applet unresponsive.

To fix the issue, the best is to always try to read a message when the I/O
handle is called. This way, the abort can be handled. And if there is no
data, we leave as usual.

This patch should fix the issue #2112. It must be backported as far as 2.4.
2023-04-17 16:50:30 +02:00
Christopher Faulet
211452ef9a BUG/MEDIUM: log: Eat output data when waiting for appctx shutdown
When the log applet is executed while a shut is pending, the remaining
output data must always be consumed. Otherwise, this can prevent the stream
to exit, leading to a spinning loop on the applet.

It is 2.8-specific. No backport needed.
2023-04-11 08:19:06 +02:00
Christopher Faulet
22a88f06d4 MEDIUM: log: Use the sedesc to report and detect end of processing
Just like for other applets, we now use the SE descriptor instead of the
channel to report error and end-of-stream.

Here, the refactoring only reports errors by setting SE_FL_ERROR flag.
2023-04-05 08:57:05 +02:00
Christopher Faulet
3aeb36681c BUG/MINOR: syslog: Request for more data if message was not fully received
In the syslog applet, when a message was not fully received, we must request
for more data by calling appctx_need_more_data() and not by setting
CF_READ_DONTWAIT flag on the request channel. Indeed, this flag is only used
to only try a read at once.

This patch could be backported as far as 2.4. On 2.5 and 2.4,
applet_need_more_data() must be replaced by si_cant_get().
2023-03-24 09:24:03 +01:00
Christopher Faulet
b08c5259eb MINOR: stconn: Always report READ/WRITE event on shutr/shutw
It was done by hand by callers when a shutdown for read or write was
performed. It is now always handled by the functions performing the
shutdown. This way the callers don't take care of it. This will avoid some
bugs.
2023-02-22 15:59:16 +01:00
Willy Tarreau
d5983cef80 MINOR: listener: remove the useless ->default_target field
This field is used by stream_new() to optionally set the applet the
stream will connect to for simple proxies like the CLI for example.
But it has never been configurable to anything and is always strictly
equal to the frontend's ->default_target. Let's just drop it and make
stream_new() only use the frontend's. It makes more sense anyway as
we don't want the proxy to work differently based on the "bind" line.
This idea was brought in 1.6 hoping that the h2 implementation would
use applets for decoding (which was dropped after the very first
attempt in 1.8).
2023-02-03 18:00:20 +01:00
Willy Tarreau
3083615410 MINOR: listener: move the ->accept callback to the bind_conf
The accept callback directly derives from the upper layer, generally
it's session_accept_fd(). As such it's also defined per bind line
so it makes sense to move it there.
2023-02-03 18:00:20 +01:00
Willy Tarreau
882f2485a1 MINOR: listener: move maxaccept from listener to bind_conf
Like for previous values, maxaccept is really per-bind_conf, so let's
move it there. Some frontends (peers, log) set it to 1 so the assignment
was slightly moved.
2023-02-03 18:00:20 +01:00
Willy Tarreau
7866e8e50d MEDIUM: listener: move the analysers mask to the bind_conf
When bind_conf were created, some elements such as the analysers mask
ought to have moved there but that wasn't the case. Now that it's
getting clearer that bind_conf provides all binding parameters and
the listener is essentially a listener on an address, it's starting
to get really confusing to keep such parameters in the listener, so
let's move the mask to the bind_conf. We also take this opportunity
for pre-setting the mask to the frontend's upon initalization. Now
several loops have one less argument to take care of.
2023-02-03 18:00:20 +01:00
Frdric Lcaille
9969adbcdc MINOR: stats: add by HTTP version cumulated number of sessions and requests
Add cum_sess_ver[] new array of counters to count the number of cumulated
HTTP sessions by version (h1, h2 or h3).
Implement proxy_inc_fe_cum_sess_ver_ctr() to increment these counter.
This function is called each a HTTP mux is correctly initialized. The QUIC
must before verify the application operations for the mux is for h3 before
calling proxy_inc_fe_cum_sess_ver_ctr().
ST_F_SESS_OTHER stat field for the cumulated of sessions others than
HTTP sessions is deduced from ->cum_sess_ver counter (for all the session,
not only HTTP sessions) from which the HTTP sessions counters are substracted.

Add cum_req[] new array of counters to count the number of cumulated HTTP
requests by version and others than HTTP requests. This new member replace ->cum_req.
Modify proxy_inc_fe_req_ctr() which increments these counters to pass an HTTP
version, 0 special values meaning "other than an HTTP request". This is the case
for instance for syslog.c from which proxy_inc_fe_req_ctr() is called with 0
as version parameter.
ST_F_REQ_TOT stat field compputing for the cumulated number of requests is modified
to count the sum of all the cum_req[] counters.

As this patch is useful for QUIC, it must be backported to 2.7.
2023-02-03 17:55:49 +01:00
Christopher Faulet
6e1bbc446b REORG: channel: Rename CF_READ_NULL to CF_READ_EVENT
CF_READ_NULL flag is not really useful and used. It is a transient event
used to wakeup the stream. As we will see, all read events on a channel may
be resumed to only one and are all used to wake up the stream.

In this patch, we introduce CF_READ_EVENT flag as a replacement to
CF_READ_NULL. There is no breaking change for now, it is just a
rename. Gradually, other read events will be merged with this one.
2023-01-09 18:41:08 +01:00
William Lallemand
be6a873096 BUG/MINOR: httpclient/log: free of invalid ptr with httpclient_log_format
free_proxy() must check if the ptr is not httpclient_log_format before
trying to free p->conf.logformat_string.

No backport needed.
2022-12-22 15:39:31 +01:00
Aurelien DARRAGON
ab9efc25f0 BUG/MINOR: log: fix parse_log_message rfc5424 size check
In parse_log_message(), if log is rfc5424 compliant, p pointer
is incremented and size is not. However size is still used in further
checks as if p pointer was not incremented.

This could lead to logic error or buffer overflow if input buf is not
null-terminated.

Fixing this by making sure size is up to date where it is needed.

It could be backported up to 2.4.
2022-11-22 16:27:52 +01:00
Willy Tarreau
80f9a63184 BUILD: logs: use __fallthrough in build_log_header()
This avoids 4 build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Aurelien DARRAGON
7faffdc6ab BUG/MINOR: log: fixing bug in tcp syslog_io_handler Octet-Counting
syslog_io_handler does specific treatment to handle syslog tcp octet
counting:

Logic was good, but a sneaky mistake prevented
rfc-6587 octet counting from working properly.

trash.area was used as an input buffer.
It does not make sense here since it is uninitialized.
Compilation was unaffected because trash is a thread
local "global" variable.

buf->area should definitely be used instead.

This should be backported as far as 2.4.
2022-10-27 11:28:53 +02:00
Christopher Faulet
cc640e851a BUG/MINOR: log: Preserve message facility when the log target is a ring buffer
When a ring is used as log target, the original facility, if any, must be
preserved. The default facility must only be used if there no facility was
found in the incoming log message.

This patch should fix the issue #1901. It must be backported as far as 2.4.
2022-10-20 09:03:19 +02:00
Aurelien DARRAGON
c5bff8e550 BUG/MINOR: log: improper behavior when escaping log data
Patrick Hemmer reported an improper log behavior when using
log-format to escape log data (+E option):
Some bytes were truncated from the output:

- escape_string() function now takes an extra parameter that
  allow the caller to specify input string stop pointer in
  case the input string is not guaranteed to be zero-terminated.
- Minors checks were added into lf_text_len() to make sure dst
  string will not overflow.
- lf_text_len() now makes proper use of escape_string() function.

This should be backported as far as 1.8.
2022-09-20 16:25:30 +02:00
Emeric Brun
a8942cd9c4 BUG/MAJOR: log-forward: Fix ssl layer not initialized on bind even if configured
Since commit 2071a99df ("MINOR: listener/ssl: set the SSL xprt layer only
once the whole config is known") the xprt is initialized for ssl directly
from a generic funtion used to parse bind args.

But the 'bind' lines from 'log-forward' sections were forgotten in commit
55f0f7bb5 ("MINOR: config: use the new bind_parse_args_list() to parse a
"bind" line").

This patch re-works 'log-forward' section parsing to use the generic
function to parse bind args and fix the issue.

Since the generic way to parse was introduced in 2.6, this patch
should be backported as far as this version.
2022-08-19 16:09:06 +02:00
Christopher Faulet
a892b7f15f BUG/MINOR: log: Properly test connection retries to fix dontlog-normal option
The commit 731c8e6cf ("MINOR: stream: Simplify retries counter calculation")
introduced a regression. It broke the dontlog-normal option because the test
on the connection retries counter was not updated accordingly.

This patch should fix the issue #1754. It must be backported to 2.6.
2022-06-17 14:53:21 +02:00
Willy Tarreau
c12b321661 CLEANUP: applet: rename appctx_cs() to appctx_sc()
It returns a stream connector, not a conn_stream anymore, so let's
fix its name.
2022-05-27 19:33:35 +02:00
Willy Tarreau
270a4574a4 CLEANUP: log-forward: rename all occurrences of stconn "cs" to "sc"
In the log-forwarding applet, function arguments and local variables
called "cs" were renamed to "sc" to avoid future confusion.
2022-05-27 19:33:35 +02:00
Willy Tarreau
bde14ad499 CLEANUP: check: rename all occurrences of stconn "cs" to "sc"
The check struct had a "cs" field renamed to "sc", which also required
a tiny update to a few functions using it to distinguish a check from
a stream (log.c, payload.c, ssl_sample.c, tcp_sample.c, tcpcheck.c,
connection.c).

Function arguments and local variables called "cs" were renamed to "sc".
The presence of one "cs=" in the debugging traces was also turned to
"sc=" for consistency.
2022-05-27 19:33:35 +02:00
Willy Tarreau
cb086c6de1 REORG: stconn: rename conn_stream.{c,h} to stconn.{c,h}
There's no more reason for keepin the code and definitions in conn_stream,
let's move all that to stconn. The alphabetical ordering of include files
was adjusted.
2022-05-27 19:33:35 +02:00
Willy Tarreau
5edca2f0e1 REORG: rename cs_utils.h to sc_strm.h
This file contains all the stream-connector functions that are specific
to application layers of type stream. So let's name it accordingly so
that it's easier to figure what's located there.

The alphabetical ordering of include files was preserved.
2022-05-27 19:33:35 +02:00
Willy Tarreau
f61dd19284 CLEANUP: stconn: rename cs_{shut,chk}* to sc_*
This applies the following renaming:

cs_shutr() -> sc_shutr()
cs_shutw() -> sc_shutw()
cs_chk_rcv() -> sc_chk_rcv()
cs_chk_snd() -> sc_chk_snd()
cs_must_kill_conn() -> sc_must_kill_conn()
2022-05-27 19:33:35 +02:00
Willy Tarreau
d68ff018c5 CLEANUP: stconn: rename cs{,_get}_{src,dst} to sc_*
The following functions were renamed:

cs_src() -> sc_src()
cs_dst() -> sc_dst()
cs_get_src() -> sc_get_src()
cs_get_dst() -> sc_get_dst()
2022-05-27 19:33:35 +02:00
Willy Tarreau
fd9417ba3f CLEANUP: stconn: rename cs_conn() to sc_conn()
It's mostly used from upper layers. Both the checked and unchecked
functions were updated, or ~150 entries.
2022-05-27 19:33:34 +02:00
Willy Tarreau
ea27f48c5a CLEANUP: stconn: rename cs_{check,strm,strm_task} to sc_strm_*
These functions return the app-layer associated with an stconn, which
is a check, a stream or a stream's task. They're used a lot to access
channels, flags and for waking up tasks. Let's just name them
appropriately for the stream connector.
2022-05-27 19:33:34 +02:00
Willy Tarreau
40a9c32e3a CLEANUP: stconn: rename cs_{i,o}{b,c} to sc_{i,o}{b,c}
We're starting to propagate the stream connector's new name through the
API. Most call places of these functions that retrieve the channel or its
buffer are in applets. The local variable names are not changed in order
to keep the changes small and reviewable. There were ~92 uses of cs_ic(),
~96 of cs_oc() (due to co_get*() being less factorizable than ci_put*),
and ~5 accesses to the buffer itself.
2022-05-27 19:33:34 +02:00
Willy Tarreau
7cb9e6c6ba CLEANUP: stream: rename "csf" and "csb" to "scf" and "scb"
These are the stream connectors, let's give them consistent names. The
patch is large (405 locations) but totally trivial.
2022-05-27 19:33:34 +02:00
Willy Tarreau
4596fe20d9 CLEANUP: conn_stream: tree-wide rename to stconn (stream connector)
This renames the "struct conn_stream" to "struct stconn" and updates
the descriptions in all comments (and the rare help descriptions) to
"stream connector" or "connector". This touches a lot of files but
the change is minimal. The local variables were not even renamed, so
there's still a lot of "cs" everywhere.
2022-05-27 19:33:34 +02:00
Willy Tarreau
91b47263f7 MINOR: protocol: replace ctrl_type with xprt_type and clarify it
There's been some great confusion between proto_type, ctrl_type and
sock_type. It turns out that ctrl_type was improperly chosen because
it's not the control layer that is of this or that type, but the
transport layer, and it turns out that the transport layer doesn't
(normally) denaturate the underlying control layer, except for QUIC
which turns dgrams to streams. The fact that the SOCK_{DGRAM|STREAM}
set of values was used added to the confusion.

Let's replace it with xprt_type which reuses the later introduced
PROTO_TYPE_* values, and update the comments to explain which one
works at what level.
2022-05-20 18:39:43 +02:00
Willy Tarreau
0698c80a58 CLEANUP: applet: remove the unneeded appctx->owner
This one is the pointer to the conn_stream which is always in the
endpoint that is always present in the appctx, thus it's not needed.
This patch removes it and replaces it with appctx_cs() instead. A
few occurences that were using __cs_strm(appctx->owner) were moved
directly to appctx_strm() which does the equivalent.
2022-05-13 14:28:48 +02:00
Willy Tarreau
382474348c CLEANUP: tree-wide: use fd_set_nonblock() and fd_set_cloexec()
This gets rid of most open-coded fcntl() calls, some of which were passed
through DISGUISE() to avoid a useless test. The FD_CLOEXEC was most often
set without preserving previous flags, which could become a problem once
new flags are created. Now this will not happen anymore.
2022-04-26 10:59:48 +02:00
Willy Tarreau
acef5e27b0 MINOR: tree-wide: always consider EWOULDBLOCK in addition to EAGAIN
Some older systems may routinely return EWOULDBLOCK for some syscalls
while we tend to check only for EAGAIN nowadays. Modern systems define
EWOULDBLOCK as EAGAIN so that solves it, but on a few older ones (AIX,
VMS etc) both are different, and for portability we'd need to test for
both or we never know if we risk to confuse some status codes with
plain errors.

There were few entries, the most annoying ones are the switch/case
because they require to only add the entry when it differs, but the
other ones are really trivial.
2022-04-25 20:32:15 +02:00
Christopher Faulet
6b0a0fb2f9 CLEANUP: tree-wide: Remove any ref to stream-interfaces
Stream-interfaces are gone. Corresponding files can be safely be removed. In
addition, comments are updated accordingly.
2022-04-13 15:10:16 +02:00
Christopher Faulet
da098e6c17 MINOR: stream-int/conn-stream: Move si_shut* and si_chk* in conn-stream scope
si_shutr(), si_shutw(), si_chk_rcv() and si_chk_snd() are moved in the
conn-stream scope and renamed, respectively, cs_shutr(), cs_shutw(),
cs_chk_rcv(), cs_chk_snd() and manipulate a conn-stream instead of a
stream-interface.
2022-04-13 15:10:15 +02:00
Christopher Faulet
8da67aae3e MEDIUM: stream-int/conn-stream: Move src/dst addresses in the conn-stream
The source and destination addresses at the applicative layer are moved from
the stream-interface to the conn-stream. This simplifies a bit the code and
it is a logicial step to remove the stream-interface.
2022-04-13 15:10:14 +02:00
Christopher Faulet
731c8e6cf9 MINOR: stream: Simplify retries counter calculation
The conn_retries counter was set to the max value and decremented at each
connection retry. Thus the counter reflected the number of retries left and
not the real number of retries. All calculations of redispatch or reporting
of number of retries experienced were made using subtracts from the
configured retries, which was complicated and didn't bring any benefit.

Now, this counter is set to 0 and incremented at each retry. We know we've
reached the maximum allowed connection retries by comparing it to the
configured value. In all other cases, we directly use the counter.

This patch should address the feature request #1608.
2022-04-13 15:10:14 +02:00
Christopher Faulet
909f318259 MINOR: stream-int/stream: Move conn_retries counter in the stream
The conn_retries counter may be moved into the stream structure. It only
concerns the connection establishment. The frontend stream-interface does not
use it. So it is a logical change.
2022-04-13 15:10:14 +02:00
Christopher Faulet
908628c4c0 MEDIUM: tree-wide: Use CS util functions instead of SI ones
At many places, we now use the new CS functions to get a stream or a channel
from a conn-stream instead of using the stream-interface API. It is the
first step to reduce the scope of the stream-interfaces. The main change
here is about the applet I/O callback functions. Before the refactoring, the
stream-interface was the appctx owner. Thus, it was heavily used. Now, as
far as possible,the conn-stream is used. Of course, it remains many calls to
the stream-interface API.
2022-04-13 15:10:14 +02:00
Willy Tarreau
807a3a53bb MINOR: log: add '~' to frontend when the transport layer provides SSL
We used to check if the transport layer was ssl_sock to decide to log
"~" after a frontend's name. Now that QUIC is present, this doesn't work
anymore. Better rely on the transport layer's get_ssl_sock_ctx() method.
2022-04-12 08:08:33 +02:00
Christopher Faulet
b4f96eda56 BUG/MINOR: log: Initialize the list element when allocating a new log server
211ea252d ("BUG/MINOR: logs: fix logsrv leaks on clean exit") introduced a
regression because the list element of a new log server is not intialized. Thus
HAProxy crashes on error path when an invalid log server is released.

This patch shoud fix the issue #1636. It must be backported if the above commit
is backported. For now, it is 2.6-specific and no backport is needed.
2022-03-29 14:17:10 +02:00
Tim Duesterhus
7750850594 CLEANUP: Reapply ist.cocci with --include-headers-for-types --recursive-includes
Previous uses of `ist.cocci` did not add `--include-headers-for-types` and
`--recursive-includes` preventing Coccinelle seeing `struct ist` members of
other structs.

Reapply the patch with proper flags to further clean up the use of the ist API.

The command used was:

    spatch -sp_file dev/coccinelle/ist.cocci -in_place --include-headers --include-headers-for-types --recursive-includes --dir src/
2022-03-21 08:30:47 +01:00
Willy Tarreau
211ea252d9 BUG/MINOR: logs: fix logsrv leaks on clean exit
Log servers are a real mess because:
  - entries are duplicated using memcpy() without their strings being
    reallocated, which results in these ones not being freeable every
    time.

  - a new field, ring_name, was added in 2.2 by commit 99c453df9
    ("MEDIUM: ring: new section ring to declare custom ring buffers.")
    but it's never initialized during copies, causing the same issue

  - no attempt is made at freeing all that.

Of course, running "haproxy -c" under ASAN quickly notices that and
dumps a core.

This patch adds the missing strdup() and initialization where required,
adds a new free_logsrv() function to cleanly free() such a structure,
calls it from the proxy when iterating over logsrvs instead of silently
leaking their file names and ring names, and adds the same logsrv loop
to the proxy_free_defaults() function so that we don't leak defaults
sections on exit.

It looks a bit entangled, but it comes as a whole because all this stuff
is inter-dependent and was missing.

It's probably preferable not to backport this in the foreseable future
as it may reveal other jokes if some obscure parts continue to memcpy()
the logsrv struct.
2022-03-17 19:53:46 +01:00
Christopher Faulet
02fc86e8f6 MINOR: log: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the log part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
95a61e8a0e MINOR: stream: Add pointer to front/back conn-streams into stream struct
frontend and backend conn-streams are now directly accesible from the
stream. This way, and with some other changes, it will be possible to remove
the stream-interfaces from the stream structure.
2022-02-24 11:00:02 +01:00
Christopher Faulet
86e1c3381b MEDIUM: applet: Set the conn-stream as appctx owner instead of the stream-int
Because appctx is now an endpoint of the conn-stream, there is no reason to
still have the stream-interface as appctx owner. Thus, the conn-stream is
now the appctx owner.
2022-02-24 11:00:02 +01:00
Christopher Faulet
13a35e5752 MAJOR: conn_stream/stream-int: move the appctx to the conn-stream
Thanks to previous changes, it is now possible to set an appctx as endpoint
for a conn-stream. This means the appctx is no longer linked to the
stream-interface but to the conn-stream. Thus, a pointer to the conn-stream
is explicitly stored in the stream-interface. The endpoint (connection or
appctx) can be retrieved via the conn-stream.
2022-02-24 11:00:02 +01:00
Emeric Brun
2ad2b1c94c BUG/MAJOR: segfault using multiple log forward sections.
For each new log forward section, the proxy was added to the log forward
proxy list but the ref on the previous log forward section's proxy was
scratched using "init_new_proxy" which performs a memset. After configuration
parsing this list contains only the last section's proxy.

The post processing walk through this list to resolve "ring" names.
Since some section's proxies are missing in this list, the resolving
is not done for those ones and the pointer on the ring is kept to null
causing a segfault at runtime trying to write a log message
into the ring.

This patch shift the "init_new_proxy" before adding the ref on the
previous log forward section's proxy on currently parsed one.

This patch shoud fix github issue #1464

This patch should be backported to 2.3
2021-12-01 15:21:56 +01:00
Christopher Faulet
1ccbe12f4a DOC: log: Add comments to specify when session's listener is defined or not
When a log message is emitted, The session's listener is always defined when
the session's owner is an inbound connection while it is undefined for a
health-check. It is not obvious. So, comments have been added to make it
clear.

This patch is related to the issue #1434.
2021-11-15 11:31:09 +01:00
Tim Duesterhus
2471f5c2b2 CLEANUP: Apply ist.cocci
Make use of the new rules to use `isttrim()`.
2021-11-08 12:08:26 +01:00
Willy Tarreau
68574dd492 MEDIUM: log: add the client's SNI to the default HTTPS log format
During a troublehooting it came obvious that the SNI always ought to
be logged on httpslog, as it explains errors caused by selection of
the default certificate (or failure to do so in case of strict-sni).

This expectation was also confirmed on the mailing list.

Since the field may be empty it appeared important not to leave an
empty string in the current format, so it was decided to place the
field before a '/' preceding the SSL version and ciphers, so that
in the worst case a missing field leads to a field looking like
"/TLSv1.2/AES...", though usually a missing element still results
in a "-" in logs.

This will change the log format for users who already deployed the
2.5-dev versions (hence the medium level) but no released version
was using this format yet so there's no harm for stable deployments.
The reg-test was updated to check for "-" there since we don't send
SNI in reg-tests.

Link: https://www.mail-archive.com/haproxy@formilux.org/msg41410.html
Cc: William Lallemand <wlallemand@haproxy.org>
2021-11-06 09:20:07 +01:00
Willy Tarreau
6f7497616e MEDIUM: connection: rename fc_conn_err and bc_conn_err to fc_err and bc_err
Commit 3d2093af9 ("MINOR: connection: Add a connection error code sample
fetch") added these convenient sample-fetch functions but it appears that
due to a misunderstanding the redundant "conn" part was kept in their
name, causing confusion, since "fc" already stands for "front connection".

Let's simply call them "fc_err" and "bc_err" to match all other related
ones before they appear in a final release. The VTC they appeared in were
also updated, and the alpha sort in the keywords table updated.

Cc: William Lallemand <wlallemand@haproxy.org>
2021-11-06 09:20:07 +01:00
Christopher Faulet
52b28d2f30 BUILD: log: Fix compilation without SSL support
When compiled without SSL support, a variable is reported as not used by
GCC.

src/log.c: In function ‘sess_build_logline’:
src/log.c:2056:36: error: unused variable ‘conn’ [-Werror=unused-variable]
 2056 |                 struct connection *conn;
      |                                    ^~~~

This does not need to be backported.
2021-10-27 12:00:15 +02:00
Christopher Faulet
f9c4d8d5be MINOR: log: Rely on client addresses at the appropriate level to log messages
When a log message is emitted, if the stream exits, we use the frontend
stream-interface to retrieve the client source and destination
addresses. Otherwise, the session is used. For now, stream-interface or
session addresses are never set. So, thanks to the fallback mechanism, no
changes are expected with this patch. But its purpose is to rely on
addresses at the appropriate level when set instead of those at the
connection level.
2021-10-27 11:34:21 +02:00
Christopher Faulet
6ff7de5d64 MINOR: tcpcheck: Support 2-steps args resolution in defaults sections
With the commit eaba25dd9 ("BUG/MINOR: tcpcheck: Don't use arg list for
default proxies during parsing"), we restricted the use of sample fetch in
tcpcheck rules defined in a defaults section to those depending on explicit
arguments only. This means a tcpcheck rules defined in a defaults section
cannot rely on argument unresolved during the configuration parsing.

Thanks to recent changes, it is now possible again.

This patch is mandatory to support TCP/HTTP rules in defaults sections.
2021-10-15 14:12:19 +02:00
William Lallemand
1d58b01316 MINOR: ssl: add ssl_fc_is_resumed to "option httpslog"
In order to trace which session were TLS resumed, add the
ssl_fc_is_resumed in the httpslog option.
2021-10-14 14:27:48 +02:00
Willy Tarreau
5554264f31 REORG: time: move time-keeping code and variables to clock.c
There is currently a problem related to time keeping. We're mixing
the functions to perform calculations with the os-dependent code
needed to retrieve and adjust the local time.

This patch extracts from time.{c,h} the parts that are solely dedicated
to time keeping. These are the "now" or "before_poll" variables for
example, as well as the various now_*() functions that make use of
gettimeofday() and clock_gettime() to retrieve the current time.

The "tv_*" functions moved there were also more appropriately renamed
to "clock_*".

Other parts used to compute stolen time are in other files, they will
have to be picked next.
2021-10-08 17:22:26 +02:00
Willy Tarreau
b7fc4c4e9f BUILD: tree-wide: add missing http_ana.h from many places
At least 6 files make use of s->txn without including http_ana which
defines it. They used to get it from other includes.
2021-10-07 01:36:51 +02:00
Christopher Faulet
eaba25dd97 BUG/MINOR: tcpcheck: Don't use arg list for default proxies during parsing
During tcp/http check rules parsing, when a sample fetch or a log-format
string is parsed, the proxy's argument list used to track unresolved
argument is no longer passed for default proxies. It means it is no longer
possible to rely on sample fetches depending on the execution context (for
instance 'nbsrv').

It is important to avoid HAProxy crashes because these arguments are
resolved during the configuration validity check. But, default proxies are
not evaluated during this stage. Thus, these arguments remain unresolved.

It will probably be possible to relax this rule. But to ease backports, it
is forbidden for now.

This patch must be backported as far as 2.2. It depends on the commit
"MINOR: arg: Be able to forbid unresolved args when building an argument
list".  It must be adapted for the 2.3 because PR_CAP_DEF capability was
introduced in the 2.4. A solution may be to test The proxy's id agains NULL.
2021-09-30 16:37:05 +02:00
Remi Tricot-Le Breton
1fe0fad88b MINOR: ssl: Rename ssl_bc_hsk_err to ssl_bc_err
The ssl_bc_hsk_err sample fetch will need to raise more errors than only
handshake related ones hence its renaming to a more generic ssl_bc_err.
This patch is required because some handshake failures that should have
been caught by this fetch (verify error on the server side for instance)
were missed. This is caused by a change in TLS1.3 in which the
'Finished' state on the client is reached before its certificate is sent
(and verified) on the server side (see the "Protocol Overview" part of
RFC 8446).
This means that the SSL_do_handshake call is finished long before the
server can verify and potentially reject the client certificate.

The ssl_bc_hsk_err will then need to be expanded to catch other types of
errors.

This change is also applied to the frontend fetches (ssl_fc_hsk_err
becomes ssl_fc_err) and to their string counterparts.
2021-09-30 11:04:35 +02:00