When support for UUIDv7 was added in commit
aab6477b67
the specification was still a draft.
It has since been published as RFC 9562.
This patch updates all UUID references from the obsoleted RFC 4122 and the
draft for RFC 9562 to the published RFC 9562.
There were several places in grpc and its dependency protobuf where unaligned
accesses were done. Read accesses to 32-bit (resp. 64-bit) values should be
performed by read_u32() (resp. read_u64()).
Replace these unaligned read accesses by correct calls to these functions.
Same fixes for doubles and floats.
Such unaligned read accesses could lead to crashes with bus errors on CPU
architectures which do not fix them up at run time.
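As an illustration, such helpers are typically implemented with memcpy() so
that the compiler emits an access that is safe on strict-alignment
architectures. A minimal sketch, not the exact net_helper implementation:

    #include <stdint.h>
    #include <string.h>

    /* read a possibly unaligned 32-bit value from <p> */
    static inline uint32_t read_u32(const void *p)
    {
        uint32_t ret;

        memcpy(&ret, p, sizeof(ret));
        return ret;
    }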
This patch depends on this previous commit:
861199fa71 MINOR: net_helper: Add support for floats/doubles.
Must be backported as far as 2.6.
type_to_smp(type) does the reverse operation of smp_to_type[smp]: it takes
a type name as input string and tries to return the corresponding SMP_T_*
smp type or SMP_TYPES if not found.
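A minimal sketch of how the helper may look (the actual code in sample.c may
differ slightly):

    /* Returns the SMP_T_* type matching the <type> name, or SMP_TYPES if
     * the name matches no known sample type. */
    static int type_to_smp(const char *type)
    {
        int it;

        for (it = 0; it < SMP_TYPES; it++)
            if (strcmp(type, smp_to_type[it]) == 0)
                break;
        return it;
    }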
The jwt_verify converter was added in 2.5 with commit 130e142ee2
("MEDIUM: jwt: Add jwt_verify converter to verify JWT integrity"). It
takes a string on input and returns an integer. It turns out that by
presetting the return value to zero before processing contents, while
the sample data is a union, it overwrites the beginning of the buffer
struct passed on input. On a 64-bit arch it's not an issue because it's
where the allocated size is stored and it's not used in the operation,
which explains why the regtest works. But on 32-bit, both the size and
the pointer are overwritten, causing a NULL pointer to be passed to
jwt_tokenize() which is not designed to support this, hence crashes.
Let's just use a temporary variable to hold the result and move the
output sample initialization to the end of the function.
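A sketch of the fix pattern (only the sample fields are real, the rest is
approximate):

    int ret = 0;  /* temporary result, no longer preset in smp->data */

    /* ... tokenize and verify, reading smp->data.u.str as input ... */

    /* only overwrite the union once the input is no longer needed */
    smp->data.type = SMP_T_SINT;
    smp->data.u.sint = ret;
    return 1;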
This should be backported as far as 2.5.
Previously an expression like:
path,word(2,/) -m found
always returned `true`.
The bug has existed ever since the `word` converter was introduced. That is:
c9a0f6d023
The same bug was previously fixed for the `field` converter in commit
4381d26edc.
The fix should be backported to 1.6+.
When the bytes converter was improved to be able to use variables (915e48675
["MEDIUM: sample: Enhances converter "bytes" to take variable names as
arguments"]), the behavior of the sample slightly change. A failure is
reported if the given offset is bigger than the sample length. Before, a
empty binary sample was returned.
This patch fixes the converter to restore the original behavior. The
function was also refactored to properly handle failures by removing the
SMP_F_MAY_CHANGE flag. Because the converter now handles variables, the
conversion to an integer may fail. In this case the SMP_F_MAY_CHANGE flag
must be removed to be sure the caller will not retry.
This patch should fix the issue #2335. No backport needed except if commit
above is backported.
The method now returns the content of JSON arrays when one is specified in
the JSON path as a string. The start and end characters are square brackets.
Any complex object in the array is returned as JSON, so you might get arrays
of arrays or of objects. This is only recommended for arrays of simple types
(e.g., string or int), which are returned as a CSV string. The documentation
was also updated, and an issue with parentheses as well as other review
comments were addressed.
This patch was discussed in issue #2281.
Signed-off-by: William Lallemand <wlallemand@haproxy.com>
Split sample_process() into two parts in order to be able to only process
the converter part of a sample expression from an existing input sample
struct passed as parameter.
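The converter-only part may be exposed roughly as follows (prototype
approximate, assuming the new function is named sample_process_cnv() as in
the merged code):

    /* Applies only the converter chain of <expr> to the existing input
     * sample <p>, in place. Returns 1 on success, 0 on failure. */
    int sample_process_cnv(struct sample_expr *expr, struct sample *p);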
Introduce log_header struct to easily pass log header data between
functions and use that to simplify the logic around log header
handling.
While at it, some outdated comments were updated as well.
No change in behavior should be expected.
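A hypothetical shape for the new struct (the actual definition may differ):

    /* groups the log header data previously passed around as separate
     * arguments */
    struct log_header {
        int level;             /* syslog severity */
        int facility;          /* syslog facility */
        struct ist *metadata;  /* optional metadata fields */
    };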
This extra comment ensures that we do not try to pass an 'err' argument
to 'vars_check_arg' otherwise some warnings will be raised if an
operator is given an integer directly in the configuration file.
This reverts commit d897d7da87.
The "check_operator" function is used for all the operator converters
such as "and", "or", "add"...
With such a converter that accepts a variable name as well as an
integer, the "vars_check_arg" call is expected to fail when an integer
is provided. Passing an "err" variable has the unwanted side effect of
raising a warning during init for a configuration such as the following:
http-request set-query "s=%[rand,add(20)]"
which raises the following warning:
[WARNING] (33040) : config : parsing [hap.cfg:14] : invalid
variable name '20'. A variable name must be start by its scope. The
scope can be 'proc', 'sess', 'txn', 'req', 'res' or 'check'.
When function "check_operator" calls function "vars_check_arg" to decode
a variable, it passes in NULL value for pointer to the char array meant
for capturing the error message. This commit replaces NULL with the
pointer to the real char array. This should help in correct error
reporting.
Prior to this commit, converter "bytes" takes only integer values as
arguments. After this commit, it can take variable names as inputs.
This allows us to dynamically determine the offset/length and capture
them in variables. These variables can then be used with the converter.
Example use case: parsing a token present in a request header.
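Hypothetical example (header and variable names invented, exact argument
syntax per the converter documentation):

    http-request set-var(txn.off) req.hdr(x-token-offset)
    http-request set-var(txn.len) req.hdr(x-token-length)
    http-request set-var(txn.tok) req.hdr(authorization),bytes(txn.off,txn.len)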
This commit implements the following timer tags available in the log
format as sample fetches:
req.timer.idle (%Ti)
req.timer.tq (%Tq)
req.timer.hdr (%TR)
req.timer.queue (%Tw)
res.timer.hdr (%Tr)
res.timer.user (%Tu)
txn.timer.total (%Ta)
txn.timer.data (%Td)
bc.timer.connect (%Tc)
fc.timer.handshake (%Th)
fc.timer.total (%Tt)
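Hypothetical usage, mirroring what the log-format tags already provide:

    http-after-response set-header X-Total-Time %[txn.timer.total]
    http-after-response set-header X-Connect-Time %[bc.timer.connect]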
Check the trash chunk allocation in conv_time_common(), also remove the
data initialisation which is already done when allocating.
Fixes issue #2227.
No backport needed.
Implement 4 new fetches:
- ms_ltime
- ms_utime
- us_ltime
- us_utime
Which are the same as ltime and utime but with milliseconds and
microseconds input.
The converters also support the %N conversion specifier like in date(1).
Unfortunately since %N is not supported by strftime, the format string
is parsed twice, once manually to replace %N, and once by strftime.
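Hypothetical usage (assuming accept_date() accepts a unit argument as in the
companion commits of this series):

    # timestamp a response with millisecond resolution
    http-response set-header X-Accept-Date %[accept_date(ms),ms_ltime(%Y%m%d%H%M%S.%N)]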
When "add" or "sub" conveters are used, an overflow detection is performed.
When 2 negative integers are added (or a positive integer is substracted to
a positive one), we take care to not exceed the low limit (LLONG_MIN) and
when 2 positive integers are added, we take care to not exceed the high
limit (LLONG_MAX).
However, because of a missing 'else' statement, if there is no overflow in
the first case, we fall back on the second check (the one for positive adds)
and LLONG_MAX is returned. It means that most of time, when 2 negative
integers are added (or a positive integer is substracted to a negative one),
LLONG_MAX is returned.
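A sketch of the corrected logic (variable names approximate):

    if (arg < 0) {
        /* negative addend: clamp to the low limit */
        if (smp->data.u.sint < LLONG_MIN - arg)
            smp->data.u.sint = LLONG_MIN;
        else
            smp->data.u.sint += arg;
    }
    else {
        /* the previously missing 'else': without it, a non-overflowing
         * negative add always fell through here and returned LLONG_MAX */
        if (smp->data.u.sint > LLONG_MAX - arg)
            smp->data.u.sint = LLONG_MAX;
        else
            smp->data.u.sint += arg;
    }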
This patch should solve the issue #2216. It must be backported to all stable
versions.
sink_write() currently relies on sink->maxlen to know when to stop
writing a given payload.
But it could be useful to pass a smaller, explicit value to sink_write()
to stop before the ring maxlen, for instance if the ring is shared between
multiple feeders.
sink_write() now takes an optional maxlen parameter:
if maxlen is > 0, then sink_write will stop writing at maxlen if maxlen
is smaller than ring->maxlen, else only ring->maxlen will be considered.
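In other words, the effective limit inside sink_write() becomes, roughly (a
sketch, the real prototype carries more arguments):

    size_t limit = (maxlen && maxlen < sink->maxlen) ? maxlen : sink->maxlen;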
[for haproxy <= 2.7, patch must be applied by hand: that is:
__sink_write() and sink_write() should be patched to take maxlen into
account and function calls to sink_write() should use 0 as second argument
to keep original behavior]
Thierry Fournier reported an annoying side-effect when using the debug()
converter.
Consider the following examples:
[1] http-request set-var(txn.test) bool(true),ipmask(24)
[2] http-request redirect location /match if { bool(true),ipmask(32) }
When starting haproxy with [1] example we get:
config : parsing [test.conf:XX] : error detected in frontend 'fe' while parsing 'http-request set-var(txn.test)' rule : converter 'ipmask' cannot be applied.
With [2], we get:
config : parsing [test.conf:XX] : error detected in frontend 'fe' while parsing 'http-request redirect' rule : error in condition: converter 'ipmask' cannot be applied in ACL expression 'bool(true),ipmask(32)'.
Now consider the following examples which are based on [1] and [2]
but with the debug() sample conv inserted in-between those incompatible
sample types:
[1*] http-request set-var(txn.test) bool(true),debug,ipmask(24)
[2*] http-request redirect location /match if { bool(true),debug,ipmask(32) }
According to the documentation, "it is safe to insert the debug converter
anywhere in a chain, even with non-printable sample types".
Thus we don't expect any side-effect from using it within a chain.
However, in the current implementation, because debug() returns the
SMP_T_ANY type, which is a pseudo type (only resolved at runtime), the
sample compatibility checks performed at parsing time are completely
ineffective (haproxy will start and no warning will be emitted).
The undesirable effect of this is that debug() prevents haproxy from
warning you about impossible type conversions, hiding potential errors
in the configuration that could result in unexpected evaluation or logic
while serving live traffic. We had better let haproxy warn you about this
kind of error when it has the chance.
With our previous examples, this could cause some inconveniences. Let's
say for example that you are testing a configuration prior to deploying
it. When testing the config you might want to use debug() converter from
time to time to check how the conversion chain behaves.
Now after deploying the exact same conf, you might want to remove those
testing debug() statements since they are no longer relevant... but
removing them could "break" the config and suddenly prevent haproxy from
starting upon reload/restart. (This is the expected behavior, but it
comes a bit too late because debug() hid the errors during testing.)
To fix this, we introduce a new output type for sample expressions:
SMP_T_SAME - may only be used as "expected" out_type (parsing time)
for sample converters.
As its name implies, it is a way for developers to indicate that the
resulting converter's output type is guaranteed to match the type of the
sample presented on the converter's input side.
(The converter may alter the data content, but the data type must not be
changed.)
What it does is tell haproxy that if switching to the converter (by
looking at the converter's input only, since out_type is SAME) is
conversion-free, then the converter type can safely be ignored for type
compatibility checks within the chain.
debug()'s out_type is thus set to SMP_T_SAME instead of ANY, which allows
it to fully comply with the doc in the sense that it does not impact the
conversion chain when inserted between sample items.
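Concretely, the converter's keyword declaration may now look like this
(sketch; the argument spec is approximate):

    { "debug", sample_conv_debug, ARG2(0,STR,STR), smp_check_debug,
      SMP_T_ANY, SMP_T_SAME },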
Co-authored-by: Thierry Fournier <thierry.f.78@gmail.com>
Historically, the ADDR pseudo-type did not exist. So when IPV6 support
was added to existing IPV4 sample fetches (e.g.: src,dst,hdr_ip...) the
expected out_type in related sample definitions was left on IPV4 because
it was required to declare the out_type as the lowest common denominator
(the type that can be cast into all other ones) to make compatibility
checks at parse time work properly.
However, now that ADDR pseudo-type may safely be used as out_type since
("MEDIUM: sample: add missing ADDR=>? compatibility matrix entries"), we
can use ADDR for fetches that may output both IPV4 and IPV6 at runtime.
One added benefit on top of making the code less confusing is that
'haproxy -dKsmp' output will now show "addr" instead of "ipv4" for such
fetches, so the 'haproxy -dKsmp' output better complies with the fetches
signatures from the documentation.
out_ip fetch, which returns an ip according to the doc, was purposely
left as is (returning IPV4) since the smp_fetch_url_ip() implementation
forces output type to IPV4 anyway, and since this is an historical fetch
I prefer not to touch it to prevent any regression. However if
smp_fetch_url_ip() were to be fixed to also return IPV6 in the future,
then its expected out_type may be changed to ADDR as well.
Multiple notes in the code were updated to mention that the appropriate
pseudo-type may be used instead of the lowest common denominator for
out_type when available.
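For instance, a fetch that may emit either address family at runtime can now
be declared as follows (sketch; flags approximate):

    { "src", smp_fetch_src, 0, NULL, SMP_T_ADDR, SMP_USE_L4CLI },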
Since 1478aa7 ("MEDIUM: sample: Add IPv6 support to the ipmask converter")
ipmask converter may return IPV6 as well, so the sample definition must
be updated (until now the output type was explicitly set to IPV4).
This has no functional impact: the only visible effect is that the
converter's reported output type will change from "ipv4" to "addr" when
using 'haproxy -dKcnv'.
SMP_T_ADDR support was added in b805f71 ("MEDIUM: sample: let the cast
functions set their output type").
According to the above commit, it is made clear that the ADDR type is
a pseudo/generic type that may be used for compatibility checks but
that cannot be emitted from a fetch or converter.
With that in mind, all conversions from ADDR to other types were
explicitly disabled in the compatibility matrix.
But later, when map_*_ip functions were updated in b2f8f08 ("MINOR: map:
The map can return IPv4 and IPv6"), we started using ADDR as "expected"
output type for converters. This still complies with the original
description from b805f71, because it is used as the theoretical output
type, and is never emitted from the converters themselves (only "real"
types such as IPV4 or IPV6 are actually emitted at runtime).
But this introduced an ambiguity as well as a few bugs, because some
compatibility checks are being performed at config parse time, and thus
rely on the expected output type to check if the conversion from the
current element to the next element in the chain is theoretically
supported.
However, because the compatibility matrix doesn't support ADDR to other
types it is currently impossible to use map_*_ip converters in the middle
of a chain (the only supported usage is when map_*_ip converters are
at the very end of the chain).
To illustrate this, consider the following examples:
acl test str(ok),map_str_ip(test.map) -m found # this will work
acl test str(ok),map_str_ip(test.map),ipmask(24) -m found # this will raise an error
Likewise, the stktable_compatible_sample() check for stick tables also relies
on out_type[table_type] compatibility check, so map_*_ip cannot be used
with sticktables at the moment:
backend st_test
stick-table type string size 1m expire 10m store http_req_rate(10m)
frontend fe
bind localhost:8080
mode http
http-request track-sc0 str(test),map_str_ip(test.map) table st_test # raises an error
To fix this, and prevent future usage of ADDR as expected output type
(for either fetches or converters) from introducing new bugs, the
ADDR=>? part of the matrix should follow the ANY type logic.
That is, ADDR, which is a pseudo-type, must be compatible with itself,
and where IPV4 and IPV6 both support a backward conversion to a given
type, ADDR must support it as well. This is done by setting the relevant
ADDR entries to c_pseudo() in the compatibility matrix to indicate that
the operation is theoretically supported (c_pseudo() will never be
executed because ADDR should not be emitted: this only serves as a hint
for compatibility checks at parse time).
This is what's being done in this commit, thanks to this the broken
examples documented above should work as expected now, and future
usage of ADDR as out_type should not cause any issue.
This function is used for ANY=>!ANY conversions in the compatibility
matrix to help differentiate between real NOOPs (c_none) and pseudo
conversions that are theoretically supported at config parse time but can
never occur at runtime. That is, to make explicit the fact that the
related runtime operations (e.g.: ANY->IPV4) are not NOOPs since they
might require some conversion to be performed depending on the input type.
When checking the conf we don't know the effective out types so
cast[pseudo type][pseudo type] is allowed in the compatibility matrix,
but at runtime we only expect cast[real type][(real type || pseudo type)]
because fetches and converters may not emit pseudo types, thus using
c_none() everywhere was too ambiguous.
To help catch bugs, the process will crash if c_pseudo() is ever invoked:
crashing here means that a pseudo type has been encountered on a
converter's input at runtime (because it was emitted earlier in the
chain), which is not supported and results from a broken sample fetch
or converter implementation. (pseudo types may only be used as out_type
in sample definitions for compatibility checks at parsing time)
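A minimal sketch of c_pseudo() (the exact implementation may differ):

    /* must never run: being called means a pseudo type was emitted at
     * runtime by a broken sample fetch or converter */
    static int c_pseudo(struct sample *smp)
    {
        ABORT_NOW();
        return 0;
    }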
Both sample_parse_expr() and parse_acl_expr() implement some code
logic to parse sample conv list after respective fetch or acl keyword.
(Seems like the acl one was inspired by the sample one historically)
But there is clearly code duplication between the two functions, making
them hard to maintain.
Fortunately, the parsing logic between them has stayed pretty much the
same, thus the sample conv parsing part may be moved in a dedicated
helper parsing function.
This is what's being done in this commit, we're adding the new function
sample_parse_expr_cnv() which does a single thing: parse the converters
that are listed right after a sample fetch keyword and inject them into
an already existing sample expression.
Both sample_parse_expr() and parse_acl_expr() were adapted to now make
use of this specific parsing function and duplicated code parts were
cleaned up.
Although sample_parse_expr() remains quite complicated (numerous function
arguments due to contextual parsing data) the first goal was to get rid of
code duplication without impacting the current behavior, with the added
benefit that it may allow further code cleanups / simplification in the
future.
This patch adds the support for the PS algorithms when verifying JWT
signatures (rsa-pss). It was not managed during the first implementation
and previously raised an "Unmanaged algorithm" error.
The tests use the same rsa signature as the plain rsa tests (RS256 ...)
and the implementation simply adds a call to
EVP_PKEY_CTX_set_rsa_padding in the function that manages rsa and ecdsa
signatures.
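The added call is the standard OpenSSL way of selecting PSS padding (a
sketch; the surrounding verification code is omitted):

    /* after EVP_DigestVerifyInit() filled <pkey_ctx>, switch the RSA
     * padding to PSS for the PSxxx algorithms */
    if (EVP_PKEY_CTX_set_rsa_padding(pkey_ctx, RSA_PKCS1_PSS_PADDING) <= 0)
        goto error;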
The signatures in the reg-test were built thanks to the PyJWT python
library once again.
This sample fetch returns a boolean: true if support for the QUIC
transport protocol was built in and if this protocol was not disabled by
the "no-quic" global option.
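Hypothetical usage, assuming the fetch is named quic_enabled as in the
merged code:

    http-response set-header alt-svc "h3=\":443\"; ma=3600" if { quic_enabled }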
Must be backported to 2.7.
Emeric noticed that producing many randoms to fill a stick table was
saturating on the rand_lock. Since 2.4 we have had the statistical PRNG
for low-quality randoms like this one, so there is no point in using the
one that was originally implemented for the purpose of creating safe
UUIDs; the doc itself clearly states that these randoms are not secure,
and they have not been in the past either. With this change, locking
contention is completely gone.
It was a mistake to put these two fields in the struct task. This
was added in 1.9 via commit 9efd7456e ("MEDIUM: tasks: collect per-task
CPU time and latency"). These fields are used solely by streams in
order to report the measurements via the lat_ns* and cpu_ns* sample
fetch functions when task profiling is enabled. For the rest of the
tasks, this is pure CPU waste when profiling is enabled, and memory
waste 100% of the time, as the point where these latencies and usages
are measured is in the profiling array.
Let's move the fields to the stream instead, and have process_stream()
retrieve the relevant info from the thread's context.
The struct task is now back to 120 bytes, i.e. almost two cache lines,
with 32 bits still available.
Adjust the size of the sample buffer before we change the "area"
pointer. Otherwise, we end up not changing the size, because the area
pointer is already the same as "start" before we compute the difference
between the two.
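A sketch of the reordering (field names approximate):

    /* before (buggy): area moves first, so the difference is zero */
    smp->data.u.str.area = start;
    smp->data.u.str.size -= start - smp->data.u.str.area;  /* -= 0 */

    /* after: shrink the size while area still points to the old start */
    smp->data.u.str.size -= start - smp->data.u.str.area;
    smp->data.u.str.area = start;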
This is similar to the change in b28430591d
but for the word converter instead of field.
Almost all of our hash-based LB algorithms are implemented as special
cases of something that can now be achieved using sample expressions,
and some of them have adopted some options to adapt their behavior in
ways that could also be achieved using converters.
There are users who want to hash other parameters that are combined
into variables, and who set headers from these values and use
"balance hdr(name)" for this.
Instead of constantly implementing specific options and having users
hack around when they want a real hash, let's implement a native hash
mode that applies to a standard sample expression. This way, any
fetchable element (including variables) may be used to construct the
hash, even modified by any converter if desired.
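Hypothetical usage of the new mode (expression invented for the example):

    backend be_hash
        balance hash req.hdr(x-user-id),xxh32
        server s1 192.0.2.1:80
        server s2 192.0.2.2:80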
Any type except bool can be cast to bin, even though bool can be cast to
string. That's a bit inconsistent, and prevents a boolean from being used as
the entry of a hash function while any other type can. This is a
problem when passing via variable where someone could use:
... set-var(txn.bar) always_false
to temporarily disable something, but this would result in an empty
hash output when later doing:
... var(txn.bar),sdbm
Instead of using c_int2bin() as is done for the string output, better
enforce an input set of exactly 0 or 1 so that a poorly written sample
fetch function does not result in a difficult-to-debug hash output.
Surprisingly, while almost all calls to a sample cast function carefully
avoid calling c_none(), sample_fetch_as_type() makes no such effort, even
though by nature the function is most often called with an expected
output type equal to that of the expression. Let's add the check to
shorten the most common call path.
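The shortcut may look like this (a sketch based on the description above):

    /* skip the cast entirely when it would be a NOOP */
    if (smp->data.type != smp_type) {
        if (!sample_casts[smp->data.type][smp_type])
            return NULL;
        if (sample_casts[smp->data.type][smp_type] != c_none &&
            !sample_casts[smp->data.type][smp_type](smp))
            return NULL;
    }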
The use_backend and use-server contexts were not enumerated in
smp_resolve_args, and while use-server doesn't currently take an
expression, at least use_backend supports that, and both entries ought
to be listed for completeness. Now an error in a use_backend rule
becomes more precise, from:
[ALERT] (12373) : config : parsing [use-srv.cfg:33]: unable to find
backend 'foo' referenced in arg 1 of sample fetch
keyword 'nbsrv' in proxy 'echo'.
to:
[ALERT] (12307) : config : parsing [use-srv.cfg:33]: unable to find
backend 'foo' referenced in arg 1 of sample fetch
keyword 'nbsrv' in use_backend expression in proxy
'echo'.
This may be backported though this is totally harmless.
This new converter is similar to the concat converter and can be used to
build new variables made of a succession of other variables, but the main
difference is that it checks whether adding a delimiter makes sense, as
would not be the case if e.g. the current input sample is empty. That
situation would otherwise require 2 separate rules using the concat
converter, where the first rule would have to check if the current sample
string is empty before adding a delimiter.
This resolves GitHub issue #1621.
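Hypothetical usage, assuming the add_item converter merged for this issue
(the exact argument syntax may differ):

    # append a flag to a comma-separated list; the delimiter is only
    # added when the current value is not empty
    http-request set-var(txn.tags) "var(txn.tags),add_item(',',txn.flag)"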
It's much more convenient to sort these keywords on output to detect
changes, and it's easy to do. The patch looks big but most of it is
only caused by an indent change in the loop, as "git diff -b" shows.
Similar to the sample fetch keywords, let's also list the converter
keywords. They're much simpler since there's no compatibility matrix.
Instead the input and output types are listed. This is called by
dump_registered_keywords() for the "cnv" keywords class.
New function smp_dump_fetch_kw lists registered sample fetch keywords
with their compatibility matrix, mandatory and optional argument types,
and output types. It's called from dump_registered_keywords() with class
"smp".