haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-10-27 14:41:28 +01:00

Author	SHA1	Message	Date
Aurelien DARRAGON	c33b857df9	MINOR: log: support true cbor binary encoding CBOR in hex format as implemented in previous commit is convenient because the produced output is portable and can easily be embedded in regular syslog payloads. However, one of the goals of the CBOR implementation is to be able to produce "Concise Binary" object representation. Here is an excerpt from cbor.io website: "Some applications also benefit from CBOR itself being encoded in binary. This saves bulk and allows faster processing." Currently we don't offer that with '+cbor', quite the opposite actually since a text string encoded with '+cbor' option will be larger than a text string encoded with '+json' or without encoding at all, because for each CBOR binary byte, 2 characters will be emitted. Fortunately, the sink/log API allows for binary data to be passed as parameter, this is because all relevant functions in the chain don't rely on the terminating NULL byte and take a string pointer + string length as parameter. We can actually rely on this property to support the '+bin' option when combined with '+cbor' to produce RAW binary CBOR output. Be careful though, as this is only intended for use with set-var-fmt or to send binary data to capable UDP/ring endpoints. Example: log-format "%{+cbor,+bin}o %(test)[bin(00AABB)]" Will produce: bf64746573745f4300aabbffff (output was piped to `hexdump -ve '1/1 "%.2x"'` to dump raw bytes as HEX characters) With cbor.me pretty printer, it gives us: BF # map() 64 # text(4) 74657374 # "test" 5F # bytes() 43 # bytes(3) 00AABB # "\u0000\xAA\xBB" FF # primitive() FF # primitive()	2024-04-26 18:39:32 +02:00
Aurelien DARRAGON	c614fd3b9f	MINOR: log: add +cbor encoding option In this patch, we make use of the CBOR (RFC8949) encode helper functions from the previous commit to implement '+cbor' encoding option for log- formats. The logic behind it is pretty similar to '+json' encoding option, except that the produced output is a CBOR payload written in HEX format so that it remains compatible to use this with regular syslog endpoints. Example: log-format "%{+cbor}o %[int(4)] test %(named_field)[str(ok)]" Will produce: BF6B6E616D65645F6669656C64626F6BFF Detailed view (from cbor.me): BF # map() 6B # text(11) 6E616D65645F6669656C64 # "named_field" 62 # text(2) 6F6B # "ok" FF # primitive() If the option isn't set globally, but on a specific node instead, then only the value will be encoded according to CBOR specification. Example: log-format "test cbor bool: %{+cbor}[bool(true)]" Will produce: test cbor bool: F5	2024-04-26 18:39:32 +02:00
Aurelien DARRAGON	3f7c8387c0	MINOR: log: add +json encoding option In this patch, we add the "+json" log format option that can be set globally or per log format node. What it does is that it sets the LOG_OPT_ENCODE_JSON flag for the current context which is provided to all lf_* log building functions. This way, all lf_* functions are now aware of this option and try to comply with JSON specification when the option is set. If the option is set globally, then sess_build_logline() will produce a map-like object with key=val pairs for named logformat nodes. (logformat nodes that don't have a name are simply ignored). Example: log-format "%{+json}o %[int(4)] test %(named_field)[str(ok)]" Will produce: {"named_field": "ok"} If the option isn't set globally, but on a specific node instead, then only the value will be encoded according to JSON specification. Example: log-format "{ \"manual_key\": %(named_field){+json}[bool(true)] }" Will produce: {"manual_key": true} When the option is set, +E option will be ignored, and partial numerical values (ie: because of logasap) will be encoded as-is.	2024-04-26 18:39:32 +02:00
Aurelien DARRAGON	b7c3d8c87c	MINOR: log: add +bin logformat node option Support '+bin' option argument on logformat nodes to try to preserve binary output type with binary sample expressions. For this, we rely on the log/sink API which is capable of conveying binary data since all related functions don't search for a terminating NULL byte in provided log payload as they take a string pointer and a string length as argument. Example: log-format "%{+bin}o %[bin(00AABB)]" Will produce: 00aabb (output was piped to `hexdump -ve '1/1 "%.2x"'` to dump raw bytes as HEX characters) This should be used carefully, because many syslog endpoints don't expect binary data (especially NULL bytes). This is mainly intended for use with set-var-fmt actions or with ring/udp log endpoints that know how to deal with such binary payloads. Also, this option is only supported globally (for use with '%o'), it will not have any effect when set on an individual node. (it makes no sense to have binary data in the middle of log payload that was started without binary data option)	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	162e311a0e	MINOR: log: add no_escape_map to bypass escape with _lf_encode_bytes() Providing no_escape_map as <map> argument to _lf_encode_bytes() function will make the function skip escaping since the map is empty. This is for convenience, as it might be useful to call lf_encode_chunk() to encode binary data without escaping it.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	fb8b47fed8	MINOR: log: postpone conversion for sample expressions in sess_build_logline() In sess_build_logline(), for sample expression nodes, instead of directly calling sample_fetch_as_type(... SMP_T_STR), let's first process the sample using sample_process(), and then proceed with the conversion to str if required. Doing so will allow us to implement type casting and preserving logic.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	84963fb743	MINOR: log: expose node typecast in lf_buildctx struct Store node->typecast setting inside lf_buildctx struct so that encoding functions may benefit from it.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	3f2e8d0ed2	MEDIUM: log: lf_* build helpers now take a ctx argument Add internal lf_buildctx struct that is only used inside sess_build_logline() scope and is passed to lf_* log building helpers to expose current building context. For now, node options and the in_text counter are stored in the ctx struct. Thanks to this change, lf_* building functions don't depend on a logformat_node struct pointer, and may be used in a standalone manner as long as a build context is provided. Also, global options are now handled explicitly in sess_build_logline() to make sure that global options are always considered even if they were not duplicated on every node. No functional change should be expected.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	f7cb384f1a	MINOR: log: merge lf_encode_string() and lf_encode_chunk() logic lf_encode_string() and lf_encode_chunk() functions are pretty similar. The only difference is the stopping behavior, encode_chunk stops at a given position while encode_string stops when encountering '\0'. Moreover, both functions leverage tools.c encode helpers, but because of the LOG_OPT_ESC option, they reimplement those helpers with added logic. Instead of having to deal with code duplication which makes both functions harder to maintain, let's define a _lf_encode_bytes() helper function which satisfies lf_encode_string() and lf_encode_chunk() needs while keeping the function as simple as possible. _lf_encode_bytes() itself is made of multiple static inline helper functions, in an attempt to keep checks outside of core loop for better performance.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	a1583ec7c7	MINOR: log: make all lf_* sess build helper static There is no need to expose such functions since they are only involved in the log building process that occurs inside sess_build_logline(). Making functions static and removing their public prototype to ease code maintenance.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	3b9096bd36	MINOR: log: use LOG_VARTEXT_{START,END} to enclose text strings Rename LOGQUOTE_{START,END} macros to more generic LOG_VARTEXT_{START,END} in order to prepare for new encoding types that rely on specific treatment for variable-length texts. No functional change should be expected.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	278d6c3379	MINOR: log: explicitly handle %ts and %tsc as text strings Build fixed-length strings for %ts and %tsc to be able to print them using lf_rawtext_len(), this way it will be easier to encode them when new encoding options will be added. No functional change should be expected.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	2e4cc517bf	MEDIUM: log: use lf_rawtext for lf_ip() and lf_port() hex strings Same as the previous commit, but for ip and port oriented values when +X option is provided. No functional change should be expected. Because of this patch, we add a little overhead because we first generate the text into a temporary variable and then use lf_rawtext() to print it. Thus we have a double-copy, and this could have some performance implications that were not yet evaluated. Due to the small number of bytes that can end up being copied twice, we could be lucky and have no visible performance impact, but if we happen to see a significant impact, it could be useful to add a passthrough mechanism (to keep historical behavior) when no encoding is involved.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	3a3bdf1c76	MEDIUM: log: write raw strings using lf_rawtext() Make use of the previous commit to print strings that should not be modified. For instance, when +X option is provided, we have to print numerical values in ASCII HEX form. For that, we used snprintf() to output the result to the log output buffer directly, but now we build the string in a temporary buffer of fixed-size and then print it using lf_rawtext() which will take care of encoding options. Because of this patch, we add a little overhead because we first generate the text into a temporary variable and then use lf_rawtext() to print it. Thus we have a double-copy, and this could have some performance implications that were not yet evaluated. Due to the small number of bytes that can end up being copied twice, we could be lucky and have no visible performance impact, but if we happen to see a significant impact, it could be useful to add a passthrough mechanism (to keep historical behavior) when no encoding is involved.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	0d1e99c086	MEDIUM: log: pass date strings to lf_rawtext() Don't directly call functions that take date as argument and output the string representation to the log output buffer under sess_build_logline(), and instead build the strings in temporary buffers of fixed size (fortunately such functions, such as date2str_log() and gmt2str_log(), produce strings of known size), and then print the result using lf_rawtext() helper function. This way, we will be able to encode them automatically as regular string/text when new encoding methods are added. Because of this patch, we add a little overhead because we first generate the text into a temporary variable and then use lf_rawtext() to print it. Thus we have a double-copy, and this could have some performance implications that were not yet evaluated. Due to the small number of bytes that can end up being copied twice (< 30), we could be lucky and have no visible performance impact, but if we happen to see a significant impact, it could be useful to add a passthrough mechanism (to keep historical behavior) when no encoding is involved.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	fcb7e4beaa	MINOR: log: add lf_rawtext{_len}() functions similar to lf_text_{len}, except that quoting and mandatory options are ignored. Use this to print the input string without any modification (except for encoding logic).	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	1fa2da18cd	MINOR: log: add lf_int() wrapper to print integers Wrap ltoa(), lltoa(), ultoa() and utoa_pad() functions that are used by sess_build_logline() to print numerical values by implementing a dedicated helper named lf_int() that takes <dft_hld> as argument to know how to write the integer by default (when no encoding is specified). LF_INT_UTOA_PAD_4 is used to emulate utoa_pad(x, 4) since it's found only once under sess_build_logline(), thus there is no need to pass an extra parameter to lf_int() function.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	d3c92a3a83	MINOR: log: skip custom logformat_node name if empty Reminder: Since 3.0-dev4, we can optionally give a name to logformat nodes: log-format "%(custom_name1)B %(custom_name2)[str(value)]" But we may also optionally set the expected node type by appending ':type' after the name, type being either sint, str or bool, like this: log-format "%(string_as_int:sint)[str(14)]" However, it is currently not possible to provide a type without providing a name that is at least 1 char long. But it could be useful to provide a type without setting a name, like this, for typecasting purposes only: log-format "%(:sint)[bool(true)]" Thus in order to allow this usage, don't set node->name if node name is not at least 1 character long. By doing so, node->name will remain NULL and will not be considered, but the typecast setting will.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	c584600083	CLEANUP: log: simplify complex values usages in sess_build_logline() make sess_build_logline() switch case more readable by performing some simplifications: complex values are first extracted in a temporary variable so that it's easier to refer to them and at a single place.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	507223d527	MINOR: log: global lf_expr node options Add options to lf_expr->nodes to store global options (those that are common to all nodes) for easier access. No functional change should be expected.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	7ff4f09e23	MINOR: log: store lf_expr nodes inside substruct Add another struct level inside lf_expr struct to allow new information to be stored alongside lf_expr nodes.	2024-04-26 18:39:31 +02:00
Aurelien DARRAGON	f8e1357a05	CLEANUP: log: remove unused checks for encode_{chunk,string} Thanks to 8226e92eb ("BUG/MINOR: tools/log: invalid encode_{chunk,string} usage"), we only need to check for NULL return value from encode_{chunk,string}() and escape_string() to know if the call failed.	2024-04-26 18:39:31 +02:00
Ilya Shipitsin	ab7f05daba	CLEANUP: assorted typo fixes in the code and comments This is 41st iteration of typo fixes	2024-04-17 11:14:44 +02:00
Aurelien DARRAGON	9420cfc0db	CLEANUP: log: lf_text_len() returns a pointer not an integer In c83684519 ("MEDIUM: log: add the ability to include samples in logs") we checked the return value of lf_text_len() as an integer instead of comparing the pointer with NULL explicitly. Since this may be confusing, let's test the return value against NULL. [ada: for backports, the patch needs to be applied manually because of c6a713842 ("MINOR: log: simplify last_isspace in sess_build_logline()")]	2024-04-09 17:35:53 +02:00
Aurelien DARRAGON	28548f812f	BUG/MINOR: log: invalid snprintf() usage in sess_build_logline() According to snprintf() man page: The functions snprintf() and vsnprintf() do not write more than size bytes (including the terminating null byte ('\0')). If the output was truncated due to this limit, then the return value is the number of characters (excluding the terminating null byte) which would have been written to the final string if enough space had been available. Thus, a return value of size or more means that the output was truncated. However, in sess_build_logline(), each time we need to check the return value of snprintf(), here is how we proceed: iret = snprintf(tmplog, max, ...); if (iret < 0 || iret > max) // error // success tmplog += iret; Here is the issue: if snprintf() lacks 1 byte space to write the terminating NULL byte, it will return max. Which means in this case that we fail to know that snprintf() truncated the output in reality, and we still add iret to tmplog pointer. Considering sess_build_logline() should NOT write more than <maxsize> bytes (including the terminating NULL byte) as per the function description, in this case the function would write <maxsize>+1 byte (to write the terminating NULL byte upon return), which may lead to invalid write if <dst> was meant to hold <maxsize> bytes at maximum. Hopefully, this bug wasn't triggered so far because sess_build_logline() is called with logline as <dst> argument and <global.max_syslog_len> as <maxsize> argument, logline being initialized with 1 extra byte upon startup. But we better fix this to comply with the function description and prevent any side-effect since some sess_build_logline() helpers may assume that 'tmplog-dst < maxsize' is always true. Also sess_build_logline() users probably don't expect NULL-byte to be accounted for in the produced logline length. This should be backported to all stable versions.
[ada: for backports, the patch needs to be applied manually because of c6a713842 ("MINOR: log: simplify last_isspace in sess_build_logline()")]	2024-04-09 17:35:53 +02:00
Aurelien DARRAGON	8226e92eb0	BUG/MINOR: tools/log: invalid encode_{chunk,string} usage encode_{chunk,string}() is often found to be used this way: ret = encode_{chunk,string}(start, stop...) if (ret == NULL || *ret != '\0') { //error } //success Indeed, encode_{chunk,string} will always try to add terminating NULL byte to the output string, unless no space is available for even 1 byte. However, it means that for the caller to be able to spot an error, then it must provide a buffer (here: start) which is already initialized. But this is wrong: not only this is very tricky to use, but since those functions don't return NULL on failure, then if the output buffer was not properly initialized prior to calling the function, the caller will perform invalid reads when checking for failure this way. Moreover, even if the buffer is initialized, we cannot reliably tell if the function actually failed this way because if the buffer was previously initialized with NULL byte, then the caller might think that the call actually succeeded (since the function didn't return NULL and didn't update the buffer). Also, sess_build_logline() relies on lf_encode_{chunk,string}() functions which are in fact wrappers for encode_{chunk,string}() functions and thus exhibit the same error handling mechanism. It turns out that sess_build_logline() makes unsafe use of those functions because it uses the error-checking logic mentioned above while buffer (tmplog) is not guaranteed to be initialized when entering the function. This may ultimately cause malfunctions or invalid reads if the output buffer is lacking space. To fix the issue once and for all and prevent similar bugs from being introduced, we make it so encode_{string, chunk} and escape_string() (based on encode_string()) now explicitly return NULL on failure (when the function failed to write at least the ending NULL byte) lf_encode_{string,chunk}() helpers had to be patched as well due to code duplication.
This should be backported to all stable versions. [ada: for 2.4 and 2.6 the patch won't apply as-is, it might be helpful to backport ae1e14d65 ("CLEANUP: tools: removing escape_chunk() function") first, considering it's not very relevant to maintain a dead function]	2024-04-09 17:35:45 +02:00
Aurelien DARRAGON	b15f6dfae8	BUG/MINOR: log: fix lf_text_len() truncate inconsistency In c5bff8e550 ("BUG/MINOR: log: improper behavior when escaping log data") we fixed lf_text_len() behavior with +E (escape) option. However we introduced an inconsistency if output buffer is too small to hold the whole output and truncation occurs: indeed without +E option up to <size> bytes (including NULL byte) will be used whereas with +E option only <size-1> bytes will be used. Fixing the function and related comment so that the function behaves the same in regards to truncation whether +E option is used or not. This should be backported to all stable versions.	2024-04-09 17:30:13 +02:00
Aurelien DARRAGON	e751eebfc6	MEDIUM: proxy/log: leverage lf_expr API for logformat preparsing Currently, the way proxy-oriented logformat directives are handled is way too complicated. Indeed, "log-format", "log-format-error", "log-format-sd" and "unique-id-format" all rely on preparsing hints stored inside proxy->conf member struct. Those preparsing hints include the original string that should be compiled once the proxy parameters are known plus the config file and line number where the string was found to generate precise error messages in case of failure during the compiling process that happens within check_config_validity(). Now that lf_expr API permits to compile a lf_expr struct that was previously prepared (with original string and config hints), let's leverage lf_expr_compile() from check_config_validity() and instead of relying on individual proxy->conf hints for each logformat expression, store string and config hints in the lf_expr struct directly and use lf_expr helpers funcs to handle them when relevant (ie: original logformat string freeing is now done at a central place inside lf_expr_deinit(), which allows for some simplifications) Doing so allows us to greatly simplify the preparsing logic for those 4 proxy directives, and to finally save some space in the proxy struct. Also, since httpclient proxy has its "logformat" automatically compiled in check_config_validity(), we now use the file hint from the logformat expression struct to set an explicit name that will be reported in case of error ("parsing [httpclient:0] : ...") and remove the extraneous check in httpclient_precheck() (logformat was parsed twice previously..)	2024-04-04 19:10:01 +02:00
Aurelien DARRAGON	2b79457bc0	MEDIUM: log: add compiling logic to logformat expressions split parse_logformat_string() into two functions: parse_logformat_string() sticks to the same behavior, but now becomes a helper for lf_expr_compile() which uses explicit arguments so that it becomes possible to use lf_expr_compile() without a proxy, but also compile an expression which was previously prepared for compiling (set string and config hints within the logformat expression to avoid manually storing string and config context if the compiling step happens later). lf_expr_dup() may be used to duplicate an expression before it is compiled, lf_expr_xfer() now makes sure that the input logformat is already compiled. This is some prerequisite work for log-profiles implementation, no functional change should be expected.	2024-04-04 19:10:01 +02:00
Aurelien DARRAGON	7a21c3a4ef	MAJOR: log: implement proper postparsing for logformat expressions This patch tries to address a design flaw with how logformat expressions are parsed from config. Indeed, some parse_logformat_string() calls are performed during config parsing when the proxy mode is not yet known. Here's a config example that illustrates the issue: defaults mode tcp listen test bind :8888 http-response set-header custom-hdr "%trl" # needs http mode http The above config should work, because the effective proxy mode is http, yet haproxy fails with this error: [ALERT] (99051) : config : parsing [repro.conf:6] : error detected in proxy 'test' while parsing 'http-response set-header' rule : format tag 'trl' is reserved for HTTP mode. To fix the issue once and for all, let's implement smart postparsing for logformat expressions encountered during config parsing: - split parse_logformat_string() (and subfunctions) in order to create a new lf_expr_postcheck() function that must be called to finish preparing and checking the logformat expression once the proxy type is known. - save some config hints info during parse_logformat_string() to generate more precise error messages during lf_expr_postcheck(), if needed, we rely on curpx->conf.args.{file,line} hints for that because parse_logformat_string() doesn't know about current file and line number. - lf_expr_postcheck() uses PR_FL_CHECKED proxy flag to know if the function may try to make the proxy compatible with the expression, or if it should simply fail as soon as an incompatibility is detected. - if parse_logformat_string() is called from an unchecked proxy, then schedule the expression for postparsing, else (ie: during runtime), run the postcheck right away. This change will also allow for some logformat expression error handling simplifications in the future.	2024-04-04 19:10:01 +02:00
Aurelien DARRAGON	6810c41f8e	MEDIUM: tree-wide: add logformat expressions wrapper log format expressions are broadly used within the code: once they are parsed from input string, they are converted to a linked list of logformat nodes. We're starting to face some limitations because we're simply storing the converted expression as a generic logformat_node list. The first issue we're facing is that storing logformat expressions that way doesn't allow us to add metadata alongside the list, which is part of the prerequisites for implementing log-profiles. Another issue with storing logformat expressions as generic lists of logformat_node elements is that it's starting to become really hard to tell when we rely on logformat expressions or not in the code given that there isn't always a comment near the list declaration or manipulation to indicate that it's relying on logformat expressions under the hood, so this adds some complexity for code maintenance. This patch looks quite impressive due to changes in a lot of header and source files (since logformat expressions are broadly used), but it does a simple thing: it defines the lf_expr structure which itself holds a generic list of logformat nodes, and then declares some helpers to manipulate lf_expr elements and fixes the code so that we now exclusively manipulate logformat_node lists as lf_expr elements outside of log.c. For now, lf_expr struct only contains the list of logformat nodes (no additional metadata), but now that we have dedicated type and helpers, doing so in the future won't be problematic at all and won't require extensive code changes.	2024-04-04 19:10:01 +02:00
Aurelien DARRAGON	7d8f45b647	MEDIUM: log: carry tag context in logformat node This is a pretty simple patch despite requiring to make some visible changes in the code: When parsing a logformat string, log tags (ie: '%tag') are turned into logformat nodes with their type set to the type of the corresponding logformat_tag element which was matched by name. Thus, when "compiling" a logformat tag, we only keep a reference to the tag type from the original logformat_tag. For example, for "%B" log tag, we have the following logformat_tag element: { .name = "B", .type = LOG_FMT_BYTES, .mode = PR_MODE_TCP, .lw = LW_BYTES, .config_callback = NULL } When parsing "%B" string, we search for a matching logformat tag inside logformat_tags[] array using the provided name, once we find a matching element, we craft a logformat node whose type will be LOG_FMT_BYTES, but from the node itself, we no longer have access to other information that is set in the logformat_tag struct element. Thus from a logformat_node resulting from a log tag, with current implementation, we cannot easily get back to matching logformat_tag struct element as it would require us to scan the whole logformat_tags array at runtime using node->type to find the matching element. Let's take a simpler path and consider all tag-specific LOG_FMT_* subtypes as being part of the same logformat node type: LOG_FMT_TAG. Thanks to that, we're now able to distinguish logformat nodes made from logformat tag from other logformat nodes, and link them to their corresponding logformat_tag element from logformat_tags[] array. All it costs is a simple indirection and an extra pointer in logformat_node struct. While at it, all LOG_FMT_* types related to logformat tags were moved inside log.c as they have no use outside of it since they are simply lookup indexes for sess_build_logline() and could even be replaced by function pointers some day...	2024-04-04 19:10:01 +02:00
Aurelien DARRAGON	8cf5c3d7f0	MINOR: log: expose logformat_tag struct rename logformat_type internal struct to logformat_tag to make it less confusing, then expose logformat_tag struct through header file so that it can be referenced in other structs. also rename logformat_keywords[] to logformat_tags[] for better consistency.	2024-04-04 19:10:01 +02:00
Aurelien DARRAGON	c85cbc1061	MEDIUM: log: rename logformat var to logformat tag What we used to call logformat variable in the code is referred to as log-format tag in the documentation. Having both 'var' and 'tag' labels referring to the same thing is really confusing. Let's make the code comply with the documentation by replacing all logformat var/variable/VAR occurrences with either tag or TAG. No functional change should be expected, the only visible side-effect from user point of view is that "variable" was replaced by "tag" in some error messages.	2024-04-04 19:10:01 +02:00
Aurelien DARRAGON	3c6dfa618a	MEDIUM: log/balance: leverage lbprm api for log load-balancing log load-balancing implementation was not seamlessly integrated within lbprm API. The consequence is that it could become harder to maintain over time since it added some specific cases just for the log backend. Moreover, it resulted in some code duplication since balance algorithms that are common to logs and regular (tcp, http) backends were specifically rewritten for log backends. Thanks to the previous commit, we now have all the prerequisites to make log load-balancing fully leverage lbprm logic. Thus in this patch we make __do_send_log_backend() use existing lbprm algorithms, and we no longer require log-specific lbprm initialization in cfgparse.c and in postcheck_log_backend(). As a bonus, for log backends this allows weighted algorithms to properly support weights (ie: roundrobin, random and log-hash) since we now leverage the same lb algorithms that we use for tcp/http backends (doc was updated).	2024-03-29 17:08:37 +01:00
Aurelien DARRAGON	9aea6df81f	MINOR: lbprm: implement true "sticky" balance algo As previously mentioned in cd352c0db ("MINOR: log/balance: rename "log-sticky" to "sticky""), let's define a sticky algorithm that may be used from any protocol. Sticky algorithm sticks on the same server as long as it remains available. The documentation was updated accordingly.	2024-03-29 17:08:37 +01:00
Aurelien DARRAGON	d0692d7019	BUG/MINOR: log/balance: detect if user tries to use unsupported algo b61147fd ("MEDIUM: log/balance: merge tcp/http algo with log ones") introduced some ambiguities, because while it shares some algos with the ones from mode {tcp,http}, we forgot to report an error when the user tries to use an algorithm that is not available in this mode (as per the doc). Because of that, haproxy would silently drop log messages during runtime. To fix that, we ensure that algo is one of the supported ones during log backend postparsing. If the algo is not supported, we raise an error. This should be backported in 2.9 with b61147fd	2024-03-29 17:08:36 +01:00
Willy Tarreau	01aa0a057c	MEDIUM: ring: change the ring reader to use the new vector-based API now The code now looks cleaner and more easily shows what still needs to be addressed. There are not that many changes in practice, these are mostly mechanical, essentially hiding the buffer from the callers.	2024-03-25 17:34:19 +00:00
Willy Tarreau	0f611987da	MINOR: ring: make the ring reader use only absolute offsets The goal is to remove references to the buffer's head and tail in the fast path so that we can release the lock during some reads. This means no more comparisons with b_data() nor operations relative to b_head() will be possible anymore. As a first step we need to have an absolute offset in the buffer, and to use b_getblk_ofs() in the applet callbacks to retrieve the data based on this.	2024-03-25 17:34:19 +00:00
Willy Tarreau	201c706330	MINOR: log/applet: add new function syslog_applet_append_event() This function takes a buffer on input, an offset and a length, and consumes the block from that buffer to send it to the appctx's output buffer. Contrary to its sibling applet_append_line(), instead of just appending an LF at the end of the line, it prepends the message size in decimal and a space before the message, as expected by syslog TCP implementations. This will be used to simplify the ring reader code.	2024-03-25 17:34:19 +00:00
Aurelien DARRAGON	2df7e077c7	CLEANUP: log: fix obsolete comment for add_sample_to_logformat_list() Since 833cc794 ("MEDIUM: sample: handle comma-delimited converter list") logformat expressions now support having a comma-delimited converter list right after the fetch. Let's remove a leftover comment from the initial implementation that says otherwise.	2024-03-07 11:47:56 +01:00
Aurelien DARRAGON	2462e5bcca	BUG/MINOR: log: fix potential lf->name memory leak Recent commit 2ed6068 ("MINOR: log: custom name for logformat node") introduced a potential memory leak because when custom name is provided, lf->name value is allocated using strdup(), thus is expected to be freed alongside the node when the node is released. However lf->name was only freed in some common places within log.c cleanups and helpers func, but in reality there are still cases where lf nodes are manually freed without making use of freeing helpers. So this is what this patch does, it makes sure all lf freeing places now leverage the free_logformat_node() helper function that takes care of freeing all known allocated elements within the node, including custom name. This commit depends on: - "MINOR: log: add free_logformat_node() helper function" No backport needed unless 2ed6068 gets backported.	2024-02-22 15:32:42 +01:00
Aurelien DARRAGON	1c2e16ba8a	MINOR: log: add free_logformat_node() helper function Function may be used to free a single logformat node.	2024-02-22 15:32:42 +01:00
Aurelien DARRAGON	62121d5b90	CLEANUP: log: use free_logformat_list() in parse_logformat_string() This is a follow up for 24a5e42db6 ("CLEANUP: log: deinitialization of the log buffer in one function") as there was another opportunity to make use of the new cleanup function.	2024-02-22 15:32:42 +01:00
Aurelien DARRAGON	e7aee6edd5	CLEANUP: log: fix process_send_log() indentation Fix bad indentation for process_send_log() prototype (tab was used instead of spaces)	2024-02-22 15:32:42 +01:00
Aurelien DARRAGON	ee88c4418f	MINOR: log: automate string array construction in sess_build_logline() make it so string array construction is performed by dedicated macro helpers instead of manual char insertion between string members. The goal is to easily be able to support multiple forms of array construction depending on the data encoding format (raw, json..). Only %hrl and %hsl logformats are concerned.	2024-02-20 15:49:55 +01:00
Aurelien DARRAGON	8d2b9e2acd	MINOR: log: print metadata prefixes separately in sess_build_logline() Some log variables may be prefixed with specific chars that carry extra information relevant to the value but not directly part of the "raw" value. e.g.: '+' char is prepended before some values when "option logasap" is used to indicate that the value has not yet reached its final value. However, as those "metadata" are printed using the general purpose LOGCHAR() printing helper, it's not easy to tell if they are part of the base value or not. In this patch we add the LOGMETACHAR() helper that is a wrapper for LOGCHAR(). The goal is to prepare for adding some logic to prevent such additional info from being generated when not relevant or needed.	2024-02-20 15:49:55 +01:00
Aurelien DARRAGON	a2fc40bc28	MINOR: log: simplify quotes handling in sess_build_logline() quotes building for some log formats is directly performed under each switch case statement so it would become painful to add other conditions to prevent the quotes from being generated when it's not supported by the data encoding format for instance (e.g.: JSON). Let's centralize and simplify quotes handling by adding LOGQUOTE_START() and LOGQUOTE_END() helper macros. If a quotation is started and not explicitly ended, it will be automatically ended at the end of the current logformat node: LOGQUOTE_START() sets 'quote' variable to 1, this way LOGQUOTE_END() only prints the ending quote when needed. LOGQUOTE_END() is systematically called after each node switch-case (after each value). LOGQUOTE_START() does nothing if LOG_OPT_QUOTE isn't set, so does LOGQUOTE_END(). Some rare cases such as %hsl (list of captured headers) required special handling: in this case multiple quoted texts are generated for the same field value so explicit LOGQUOTE_START() + LOGQUOTE_END() combination was needed.	2024-02-20 15:49:55 +01:00
Aurelien DARRAGON	c6a7138420	MINOR: log: simplify last_isspace in sess_build_logline() last_isspace variable is explicitly set to 0 in all cases except LOG_FMT_SEPARATOR case. So we can actually simplify the code by setting last_isspace to 0 by default and skipping the assignment for the LOG_FMT_SEPARATOR case.	2024-02-20 15:49:55 +01:00
Aurelien DARRAGON	1448478d62	MINOR: log: explicit typecasting for logformat nodes Add the ability to manually specify desired output type after a custom field name for logformat nodes. Forcing the type can be useful to ensure value is stored with the proper type representation. (i.e.: forcing numerical to string to work around the limited resolution of JS number types) By default, type is set to SMP_T_SAME, which means the original type will be preserved. Currently supported types are: bool, str, sint	2024-02-20 15:49:54 +01:00
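
Commit 201c706330 above describes prepending the message size in decimal and a space before each message for syslog over TCP. The sketch below illustrates that framing in isolation. It is not HAProxy's syslog_applet_append_event(): the function name frame_octet_counted(), its signature, and its plain output buffer are hypothetical and chosen only for illustration, whereas the real function works on HAProxy buffers and an appctx.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of HAProxy): writes "<len> <msg>" into out,
 * i.e. the message length in decimal, one space, then the message bytes.
 * The message is addressed by pointer + length, so it may contain NUL bytes.
 * The output is not NUL-terminated.
 * Returns the number of bytes written, or -1 if out is too small.
 */
static int frame_octet_counted(const char *msg, size_t len,
                               char *out, size_t outsz)
{
    /* Emit the decimal length followed by a single space. */
    int hdr = snprintf(out, outsz, "%zu ", len);

    if (hdr < 0 || (size_t)hdr >= outsz)
        return -1;                      /* header did not fit */
    if (len > outsz - (size_t)hdr)
        return -1;                      /* payload would overflow out */

    /* Copy the payload right after the header (overwrites snprintf's NUL). */
    memcpy(out + hdr, msg, len);
    return hdr + (int)len;
}
```

Under these assumptions, framing the 5-byte message "hello" into a 32-byte buffer writes the 7 bytes "5 hello", and a receiver can split the stream by reading the decimal count first instead of scanning for a trailing LF.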

634 Commits