CBOR in hex format as implemented in previous commit is convenient because
the produced output is portable and can easily be embedded in regular
syslog payloads.
However, one of the goals of the CBOR implementation is to be able to
produce "Concise Binary" object representation. Here is an excerpt from
the cbor.io website:
"Some applications also benefit from CBOR itself being encoded in
binary. This saves bulk and allows faster processing."
Currently we don't offer that with '+cbor', quite the opposite actually
since a text string encoded with '+cbor' option will be larger than a
text string encoded with '+json' or without encoding at all, because for
each CBOR binary byte, 2 characters will be emitted.
Fortunately, the sink/log API allows binary data to be passed as a
parameter, because all relevant functions in the chain don't rely on the
terminating NULL byte and take a string pointer + string length as
parameters. We can actually rely on this property to support the '+bin'
option when combined with '+cbor' to produce RAW binary CBOR output.
Be careful though, as this is only intended for use with set-var-fmt or to
send binary data to capable UDP/ring endpoints.
Example:
log-format "%{+cbor,+bin}o %(test)[bin(00AABB)]"
Will produce:
bf64746573745f4300aabbffff
(output was piped to `hexdump -ve '1/1 "%.2x"'` to dump raw bytes as HEX
characters)
With cbor.me pretty printer, it gives us:
BF # map(*)
64 # text(4)
74657374 # "test"
5F # bytes(*)
43 # bytes(3)
00AABB # "\u0000\xAA\xBB"
FF # primitive(*)
FF # primitive(*)
In this patch, we make use of the CBOR (RFC8949) encode helper functions
from the previous commit to implement the '+cbor' encoding option for
log-formats. The logic behind it is pretty similar to the '+json' encoding
option, except that the produced output is a CBOR payload written in HEX
format so that it remains compatible with regular syslog endpoints.
Example:
log-format "%{+cbor}o %[int(4)] test %(named_field)[str(ok)]"
Will produce:
BF6B6E616D65645F6669656C64626F6BFF
Detailed view (from cbor.me):
BF # map(*)
6B # text(11)
6E616D65645F6669656C64 # "named_field"
62 # text(2)
6F6B # "ok"
FF # primitive(*)
If the option isn't set globally, but on a specific node instead, then
only the value will be encoded according to CBOR specification.
Example:
log-format "test cbor bool: %{+cbor}[bool(true)]"
Will produce:
test cbor bool: F5
In this patch, we add the "+json" log format option that can be set
globally or per log format node.
What it does is set the LOG_OPT_ENCODE_JSON flag for the current context,
which is provided to all lf_* log building functions.
This way, all lf_* are now aware of this option and try to comply with
JSON specification when the option is set.
If the option is set globally, then sess_build_logline() will produce a
map-like object with key=val pairs for named logformat nodes.
(logformat nodes that don't have a name are simply ignored).
Example:
log-format "%{+json}o %[int(4)] test %(named_field)[str(ok)]"
Will produce:
{"named_field": "ok"}
If the option isn't set globally, but on a specific node instead, then
only the value will be encoded according to JSON specification.
Example:
log-format "{ \"manual_key\": %(named_field){+json}[bool(true)] }"
Will produce:
{"manual_key": true}
When the option is set, +E option will be ignored, and partial numerical
values (ie: because of logasap) will be encoded as-is.
Support '+bin' option argument on logformat nodes to try to preserve
binary output type with binary sample expressions.
For this, we rely on the log/sink API which is capable of conveying binary
data since all related functions don't search for a terminating NULL byte
in provided log payload as they take a string pointer and a string length
as argument.
Example:
log-format "%{+bin}o %[bin(00AABB)]"
Will produce:
00aabb
(output was piped to `hexdump -ve '1/1 "%.2x"'` to dump raw bytes as HEX
characters)
This should be used carefully, because many syslog endpoints don't expect
binary data (especially NULL bytes). This is mainly intended for use with
set-var-fmt actions or with ring/udp log endpoints that know how to deal
with such binary payloads.
Also, this option is only supported globally (for use with '%o'), it will
not have any effect when set on an individual node. (it makes no sense to
have binary data in the middle of log payload that was started without
binary data option)
Providing no_escape_map as <map> argument to _lf_encode_bytes() function
will make the function skip escaping since the map is empty.
This is for convenience, as it might be useful to call lf_encode_chunk()
to encode binary data without escaping it.
In sess_build_logline(), for sample expression nodes, instead of directly
calling sample_fetch_as_type(... SMP_T_STR), let's first process the
sample using sample_process(), and then proceed with the conversion to
str if required.
Doing so will allow us to implement type casting and preserving logic.
Add internal lf_buildctx struct that is only used inside
sess_build_logline() scope and is passed to lf_* log building helpers
to expose current building context. For now, node options and the in_text
counter are stored in the ctx struct. Thanks to this change, lf_* building
functions don't depend on a logformat_node struct pointer, and may be used
in a standalone manner as long as a build context is provided.
Also, global options are now handled explicitly in sess_build_logline() to
make sure that global options are always considered even if they were not
duplicated on every node.
No functional change should be expected.
lf_encode_string() and lf_encode_chunk() functions are pretty similar. The
only difference is the stopping behavior: encode_chunk stops at a given
position while encode_string stops when encountering '\0'. Moreover,
both functions leverage tools.c encode helpers, but because of the
LOG_OPT_ESC option, they reimplement those helpers with added logic.
Instead of having to deal with code duplication which makes both functions
harder to maintain, let's define a _lf_encode_bytes() helper function
which satisfies lf_encode_string() and lf_encode_chunk() needs while
keeping the function as simple as possible.
_lf_encode_bytes() itself is made of multiple static inline helper
functions, in an attempt to keep checks outside of the core loop for
better performance.
There is no need to expose such functions since they are only involved in
the log building process that occurs inside sess_build_logline().
Make these functions static and remove their public prototypes to ease
code maintenance.
Rename LOGQUOTE_{START,END} macros to more generic LOG_VARTEXT_{START,END}
in order to prepare for new encoding types that rely on specific treatment
for variable-length texts. No functional change should be expected.
Build fixed-length strings for %ts and %tsc to be able to print them
using lf_rawtext_len(); this way it will be easier to encode them
when new encoding options are added.
No functional change should be expected.
Same as the previous commit, but for ip and port oriented values when
+X option is provided.
No functional change should be expected.
Because of this patch, we add a little overhead because we first generate
the text into a temporary variable and then use lf_rawtext() to print it.
Thus we have a double-copy, and this could have some performance
implications that were not yet evaluated. Due to the small number of bytes
that can end up being copied twice, we could be lucky and have no visible
performance impact, but if we happen to see a significant impact, it could
be useful to add a passthrough mechanism (to keep historical behavior)
when no encoding is involved.
Make use of the previous commit to print strings that should not be
modified.
For instance, when +X option is provided, we have to print numerical
values in ASCII HEX form. For that, we used snprintf() to output the
result to the log output buffer directly, but now we build the string in
a temporary buffer of fixed-size and then print it using lf_rawtext()
which will take care of encoding options.
Because of this patch, we add a little overhead because we first generate
the text into a temporary variable and then use lf_rawtext() to print it.
Thus we have a double-copy, and this could have some performance
implications that were not yet evaluated. Due to the small number of bytes
that can end up being copied twice, we could be lucky and have no visible
performance impact, but if we happen to see a significant impact, it could
be useful to add a passthrough mechanism (to keep historical behavior)
when no encoding is involved.
Don't directly call functions that take date as argument and output the
string representation to the log output buffer under sess_build_logline(),
and instead build the strings in temporary buffers of fixed size
(fortunately, functions such as date2str_log() and gmt2str_log()
produce strings of known size), and then print the result using
lf_rawtext() helper function. This way, we will be able to encode them
automatically as regular string/text when new encoding methods are added.
Because of this patch, we add a little overhead because we first generate
the text into a temporary variable and then use lf_rawtext() to print it.
Thus we have a double-copy, and this could have some performance
implications that were not yet evaluated. Due to the small number of bytes
that can end up being copied twice (< 30), we could be lucky and have no
visible performance impact, but if we happen to see a significant impact,
it could be useful to add a passthrough mechanism (to keep historical
behavior) when no encoding is involved.
Similar to lf_text_{len}, except that quoting and mandatory options are
ignored. Use this to print the input string without any modification
(except for encoding logic).
Wrap ltoa(), lltoa(), ultoa() and utoa_pad() functions that are used by
sess_build_logline() to print numerical values by implementing a dedicated
helper named lf_int() that takes <dft_hld> as argument to know how to
write the integer by default (when no encoding is specified).
LF_INT_UTOA_PAD_4 is used to emulate utoa_pad(x, 4) since it's found only
once under sess_build_logline(), thus there is no need to pass an extra
parameter to lf_int() function.
Reminder:
Since 3.0-dev4, we can optionally give a name to logformat nodes:
log-format "%(custom_name1)B %(custom_name2)[str(value)]"
But we may also optionally set the expected node type by appending
':type' after the name, type being either sint, str or bool, like this:
log-format "%(string_as_int:sint)[str(14)]"
However, it is currently not possible to provide a type without providing
a name that is at least 1 char long. But it could be useful to provide a
type without setting a name, like this, for typecasting purposes only:
log-format "%(:sint)[bool(true)]"
Thus in order to allow this usage, don't set node->name if node name is
not at least 1 character long. By doing so, node->name will remain NULL
and will not be considered, but the typecast setting will.
Make sess_build_logline() switch case more readable by performing some
simplifications: complex values are first extracted in a temporary
variable so that they are easier to refer to, and in a single place.
Thanks to 8226e92eb ("BUG/MINOR: tools/log: invalid
encode_{chunk,string} usage"), we only need to check for NULL return
value from encode_{chunk,string}() and escape_string() to know if the
call failed.
In c83684519 ("MEDIUM: log: add the ability to include samples in logs")
we checked the return value of lf_text_len() as an integer instead of
comparing the pointer with NULL explicitly. Since this may be confusing,
let's test the return value against NULL.
[ada: for backports, the patch needs to be applied manually because of
c6a713842 ("MINOR: log: simplify last_isspace in sess_build_logline()")]
According to snprintf() man page:
The functions snprintf() and vsnprintf() do not write more than
size bytes (including the terminating null byte ('\0')). If the
output was truncated due to this limit, then the return value is
the number of characters (excluding the terminating null byte)
which would have been written to the final string if enough space
had been available. Thus, a return value of size or more means
that the output was truncated.
However, in sess_build_logline(), each time we need to check the return
value of snprintf(), here is how we proceed:
    iret = snprintf(tmplog, max, ...);
    if (iret < 0 || iret > max)
        // error
    // success
    tmplog += iret;
Here is the issue: if snprintf() lacks 1 byte space to write the
terminating NULL byte, it will return max. Which means in this case
that we fail to know that snprintf() truncated the output in reality,
and we still add iret to tmplog pointer. Considering sess_build_logline()
should NOT write more than <maxsize> bytes (including the terminating NULL
byte) as per the function description, in this case the function would
write <maxsize>+1 byte (to write the terminating NULL byte upon return),
which may lead to invalid write if <dst> was meant to hold <maxsize> bytes
at maximum.
Fortunately, this bug wasn't triggered so far because sess_build_logline()
is called with logline as <dst> argument and <global.max_syslog_len> as
<maxsize> argument, logline being initialized with 1 extra byte upon
startup.
But we had better fix this to comply with the function description and
prevent any side-effect since some sess_build_logline() helpers may assume
that 'tmplog-dst < maxsize' is always true. Also sess_build_logline()
users probably don't expect NULL-byte to be accounted for in the produced
logline length.
This should be backported to all stable versions.
[ada: for backports, the patch needs to be applied manually because of
c6a713842 ("MINOR: log: simplify last_isspace in sess_build_logline()")]
encode_{chunk,string}() is often found to be used this way:
    ret = encode_{chunk,string}(start, stop...)
    if (ret == NULL || *ret != '\0') {
        //error
    }
    //success
Indeed, encode_{chunk,string} will always try to add terminating NULL byte
to the output string, unless no space is available for even 1 byte.
However, it means that for the caller to be able to spot an error, then it
must provide a buffer (here: start) which is already initialized.
But this is wrong: not only is this very tricky to use, but since those
functions don't return NULL on failure, then if the output buffer was not
properly initialized prior to calling the function, the caller will
perform invalid reads when checking for failure this way. Moreover, even
if the buffer is initialized, we cannot reliably tell if the function
actually failed this way because if the buffer was previously initialized
with NULL byte, then the caller might think that the call actually
succeeded (since the function didn't return NULL and didn't update the
buffer).
Also, sess_build_logline() relies on lf_encode_{chunk,string}() functions
which are in fact wrappers for encode_{chunk,string}() functions and thus
exhibit the same error handling mechanism. It turns out that
sess_build_logline() makes unsafe use of those functions because it uses
the error-checking logic mentioned above while buffer (tmplog) is not
guaranteed to be initialized when entering the function. This may
ultimately cause malfunctions or invalid reads if the output buffer is
lacking space.
To fix the issue once and for all and prevent similar bugs from being
introduced, we make it so encode_{string,chunk}() and escape_string()
(based on encode_string()) now explicitly return NULL on failure
(when the function failed to write at least the ending NULL byte).
lf_encode_{string,chunk}() helpers had to be patched as well due to code
duplication.
This should be backported to all stable versions.
[ada: for 2.4 and 2.6 the patch won't apply as-is, it might be helpful to
backport ae1e14d65 ("CLEANUP: tools: removing escape_chunk() function")
first, considering it's not very relevant to maintain a dead function]
In c5bff8e550 ("BUG/MINOR: log: improper behavior when escaping log data")
we fixed lf_text_len() behavior with +E (escape) option.
However we introduced an inconsistency if output buffer is too small to
hold the whole output and truncation occurs: indeed without +E option up
to <size> bytes (including NULL byte) will be used whereas with +E option
only <size-1> bytes will be used. Fixing the function and related comment
so that the function behaves the same in regards to truncation whether +E
option is used or not.
This should be backported to all stable versions.
Currently, the way proxy-oriented logformat directives are handled is way
too complicated. Indeed, "log-format", "log-format-error", "log-format-sd"
and "unique-id-format" all rely on preparsing hints stored inside
proxy->conf member struct. Those preparsing hints include the original
string that should be compiled once the proxy parameters are known plus
the config file and line number where the string was found to generate
precise error messages in case of failure during the compiling process
that happens within check_config_validity().
Now that the lf_expr API allows compiling an lf_expr struct that was
previously prepared (with original string and config hints), let's
leverage lf_expr_compile() from check_config_validity() and instead
of relying on individual proxy->conf hints for each logformat expression,
store string and config hints in the lf_expr struct directly and use
lf_expr helper functions to handle them when relevant (ie: original
logformat string freeing is now done at a central place inside
lf_expr_deinit(), which allows for some simplifications)
Doing so allows us to greatly simplify the preparsing logic for those 4
proxy directives, and to finally save some space in the proxy struct.
Also, since httpclient proxy has its "logformat" automatically compiled
in check_config_validity(), we now use the file hint from the logformat
expression struct to set an explicit name that will be reported in case
of error ("parsing [httpclient:0] : ...") and remove the extraneous check
in httpclient_precheck() (logformat was parsed twice previously..)
Split parse_logformat_string() into two functions:
parse_logformat_string() sticks to the same behavior, but now becomes a
helper for lf_expr_compile() which uses explicit arguments so that it
becomes possible to use lf_expr_compile() without a proxy, but also
compile an expression which was previously prepared for compiling (set
string and config hints within the logformat expression to avoid manually
storing string and config context if the compiling step happens later).
lf_expr_dup() may be used to duplicate an expression before it is
compiled, lf_expr_xfer() now makes sure that the input logformat is
already compiled.
This is prerequisite work for the log-profiles implementation, no
functional change should be expected.
This patch tries to address a design flaw with how logformat expressions
are parsed from config. Indeed, some parse_logformat_string() calls are
performed during config parsing when the proxy mode is not yet known.
Here's a config example that illustrates the issue:
defaults
mode tcp
listen test
bind :8888
http-response set-header custom-hdr "%trl" # needs http
mode http
The above config should work, because the effective proxy mode is http,
yet haproxy fails with this error:
[ALERT] (99051) : config : parsing [repro.conf:6] : error detected in proxy 'test' while parsing 'http-response set-header' rule : format tag 'trl' is reserved for HTTP mode.
To fix the issue once and for all, let's implement smart postparsing for
logformat expressions encountered during config parsing:
- split parse_logformat_string() (and subfunctions) in order to create a
new lf_expr_postcheck() function that must be called to finish
preparing and checking the logformat expression once the proxy type is
known.
- save some config hints info during parse_logformat_string() to
generate more precise error messages during lf_expr_postcheck(), if
needed, we rely on curpx->conf.args.{file,line} hints for that because
parse_logformat_string() doesn't know about current file and line
number.
- lf_expr_postcheck() uses PR_FL_CHECKED proxy flag to know if the
function may try to make the proxy compatible with the expression, or
if it should simply fail as soon as an incompatibility is detected.
- if parse_logformat_string() is called from an unchecked proxy, then
schedule the expression for postparsing, else (ie: during runtime),
run the postcheck right away.
This change will also allow for some logformat expression error handling
simplifications in the future.
log format expressions are broadly used within the code: once they are
parsed from input string, they are converted to a linked list of
logformat nodes.
We're starting to face some limitations because we're simply storing the
converted expression as a generic logformat_node list.
The first issue we're facing is that storing logformat expressions that
way doesn't allow us to add metadata alongside the list, which is part
of the prerequisites for implementing log-profiles.
Another issue with storing logformat expressions as generic lists of
logformat_node elements is that it's starting to become really hard to
tell when we rely on logformat expressions or not in the code given that
there isn't always a comment near the list declaration or manipulation
to indicate that it's relying on logformat expressions under the hood,
so this adds some complexity for code maintenance.
This patch looks quite impressive due to changes in a lot of header and
source files (since logformat expressions are broadly used), but it does
a simple thing: it defines the lf_expr structure which itself holds a
generic list of logformat nodes, and then declares some helpers to
manipulate lf_expr elements and fixes the code so that we now exclusively
manipulate logformat_node lists as lf_expr elements outside of log.c.
For now, lf_expr struct only contains the list of logformat nodes (no
additional metadata), but now that we have dedicated type and helpers,
doing so in the future won't be problematic at all and won't require
extensive code changes.
This is a pretty simple patch despite requiring to make some visible
changes in the code:
When parsing a logformat string, logformat tags (ie: '%tag', AKA log tags)
are turned into logformat nodes with their type set to the type of the
corresponding logformat_tag element which was matched by name. Thus, when
"compiling" a logformat tag, we only keep a reference to the tag type
from the original logformat_tag.
For example, for "%B" log tag, we have the following logformat_tag
element:
{
.name = "B",
.type = LOG_FMT_BYTES,
.mode = PR_MODE_TCP,
.lw = LW_BYTES,
.config_callback = NULL
}
When parsing "%B" string, we search for a matching logformat tag
inside logformat_tags[] array using the provided name, once we find a
matching element, we craft a logformat node whose type will be
LOG_FMT_BYTES, but from the node itself, we no longer have access to
other information that is set in the logformat_tag struct element.
Thus from a logformat_node resulting from a log tag, with current
implementation, we cannot easily get back to matching logformat_tag
struct element as it would require us to scan the whole logformat_tags
array at runtime using node->type to find the matching element.
Let's take a simpler path and consider all tag-specific LOG_FMT_*
subtypes as being part of the same logformat node type: LOG_FMT_TAG.
Thanks to that, we're now able to distinguish logformat nodes made
from logformat tag from other logformat nodes, and link them to
their corresponding logformat_tag element from logformat_tags[] array. All
it costs is a simple indirection and an extra pointer in logformat_node
struct.
While at it, all LOG_FMT_* types related to logformat tags were moved
inside log.c as they have no use outside of it since they are simply
lookup indexes for sess_build_logline() and could even be replaced by
function pointers some day...
Rename logformat_type internal struct to logformat_tag to make it less
confusing, then expose logformat_tag struct through header file so that it
can be referenced in other structs.
Also rename logformat_keywords[] to logformat_tags[] for better
consistency.
What we used to call logformat variable in the code is referred to as
log-format tag in the documentation. Having both 'var' and 'tag' labels
referring to the same thing is really confusing. Let's make the code
comply with the documentation by replacing all logformat var/variable/VAR
occurrences with either tag or TAG.
No functional change should be expected, the only visible side-effect from
user point of view is that "variable" was replaced by "tag" in some error
messages.
log load-balancing implementation was not seamlessly integrated within
lbprm API. The consequence is that it could become harder to maintain
over time since it added some specific cases just for the log backend.
Moreover, it resulted in some code duplication since balance algorithms
that are common to logs and regular (tcp, http) backends were specifically
rewritten for log backends.
Thanks to the previous commit, we now have all the prerequisites to make
log load-balancing fully leverage lbprm logic. Thus in this patch we make
__do_send_log_backend() use existing lbprm algorithms, and we no longer
require log-specific lbprm initialization in cfgparse.c and in
postcheck_log_backend().
As a bonus, for log backends this allows weighted algorithms to properly
support weights (ie: roundrobin, random and log-hash) since we now
leverage the same lb algorithms that we use for tcp/http backends
(doc was updated).
As previously mentioned in cd352c0db ("MINOR: log/balance: rename
"log-sticky" to "sticky""), let's define a sticky algorithm that may be
used from any protocol. Sticky algorithm sticks on the same server as
long as it remains available.
The documentation was updated accordingly.
b61147fd ("MEDIUM: log/balance: merge tcp/http algo with log ones")
introduced some ambiguities, because while it shares some algos with the
ones from mode {tcp,http}, we forgot to report an error when the user
tries to use an algorithm that is not available in this mode (as per the
doc).
Because of that, haproxy would silently drop log messages during runtime.
To fix that, we ensure that algo is one of the supported ones during log
backend postparsing. If the algo is not supported, we raise an error.
This should be backported in 2.9 with b61147fd.
The code now looks cleaner and more easily shows what still needs to be
addressed. There are not that many changes in practice, these are mostly
mechanical, essentially hiding the buffer from the callers.
The goal is to remove references to the buffer's head and tail in the
fast path so that we can release the lock during some reads. This means
no more comparisons with b_data() nor operations relative to b_head()
will be possible anymore. As a first step we need to have an absolute
offset in the buffer, and to use b_getblk_ofs() in the applet callbacks
to retrieve the data based on this.
This function takes a buffer on input, an offset and a length, and
consumes the block from that buffer to send it to the appctx's output
buffer. Contrary to its sibling applet_append_line(), instead of just
appending an LF at the end of the line, it prepends the message size
in decimal and a space before the message, as expected by syslog TCP
implementations. This will be used to simplify the ring reader code.
Since 833cc794 ("MEDIUM: sample: handle comma-delimited converter list")
logformat expressions now support having a comma-delimited converter list
right after the fetch. Let's remove a leftover comment from the initial
implementation that says otherwise.
Recent commit 2ed6068 ("MINOR: log: custom name for logformat node")
introduced a potential memory leak because when custom name is provided,
lf->name value is allocated using strdup(), thus is expected to be freed
alongside the node when the node is released.
However lf->name was only freed in some common places within log.c
cleanups and helper functions, but in reality there are still cases where
lf nodes are manually freed without making use of freeing helpers.
This patch makes sure all lf freeing places now
leverage the free_logformat_node() helper function that takes care of
freeing all known allocated elements within the node, including custom
name.
This commit depends on:
- "MINOR: log: add free_logformat_node() helper function"
No backport needed unless 2ed6068 gets backported.
This is a follow up for 24a5e42db6 ("CLEANUP: log: deinitialization of
the log buffer in one function") as there was another opportunity to
make use of the new cleanup function.
make it so string array construction is performed by dedicated macro
helpers instead of manual char insertion between string members.
The goal is to easily be able to support multiple forms of array
construction depending on the data encoding format (raw, json..).
Only %hrl and %hsl logformats are concerned.
Some log variables may be prefixed with specific chars that represent
extra information that is relevant to them but is not directly part of
the "raw" value.
ie: '+' char is prepended before some values when "option logasap" is
used to indicate that the value has not yet reached its final value.
However, as those "metadata" are printed using the general purpose
LOGCHAR() printing helper, it's not easy to tell if they are part of the
base value or not.
In this patch we add the LOGMETACHAR() helper that is a wrapper for
LOGCHAR(). The goal is to prepare for adding some logic to prevent such
additional infos from being generated when not relevant or needed.
Quotes building for some log formats is directly performed under each
switch case statement so it would become painful to add other conditions
to prevent the quotes from being generated when it's not supported by the
data encoding format for instance (ie: JSON).
Let's centralize and simplify quotes handling by adding LOGQUOTE_START()
and LOGQUOTE_END() helper macros. If a quotation is started and not
explicitly ended, it will be automatically ended at the end of the current
logformat node:
LOGQUOTE_START() sets 'quote' variable to 1, this way LOGQUOTE_END() only
prints the ending quote when needed. LOGQUOTE_END() is systematically
called after each node switch-case (after each value). LOGQUOTE_START()
does nothing if LOG_OPT_QUOTE isn't set, so does LOGQUOTE_END().
Some rare cases such as %hsl (list of captured headers) required special
handling: in this case multiple quoted texts are generated for the same
field value so explicit LOGQUOTE_START() + LOGQUOTE_END() combination was
needed.
last_isspace variable is explicitly set to 0 in all cases except
LOG_FMT_SEPARATOR case. So we can actually simplify the code by setting
last_isspace to 0 by default and skipping the assignment for the
LOG_FMT_SEPARATOR case.
Add the ability to manually specify desired output type after a custom
field name for logformat nodes. Forcing the type can be useful to ensure
value is stored with the proper type representation. (i.e.: forcing
numerical to string to work around the limited resolution of JS number
types)
By default, type is set to SMP_T_SAME, which means the original type will
be preserved.
Currently supported types are: bool, str, sint