This patch implements a couple of converters to validate and extract data from a
MQTT (Message Queuing Telemetry Transport) message. The validation consists of a
few checks as well as "packet size" validation. The extraction can get any field
from the variable header and the payload.
This is limited to CONNECT and CONNACK packet types only. All other messages are
considered as invalid. It is not a problem for now because only the first packet
on each side can be parsed (CONNECT for the client and CONNACK for the server).
MQTT 3.1.1 and 5.0 are supported.
Reviewed and Fixed by Christopher Faulet <cfaulet@haproxy.com>
This patch implements a couple of converters to validate and extract tag value
from a FIX (Financial Information eXchange) message. The validation consists in
a few checks such as mandatory fields and checksum computation. The extraction
can get any tag value based on a tag string or tag id.
This patch requires the istend() function. Thus it depends on "MINOR: ist: Add
istend() function to return a pointer to the end of the string".
Reviewed and Fixed by Christopher Faulet <cfaulet@haproxy.com>
iif() takes a boolean as input and returns one of the two argument
strings depending on whether the boolean is true.
This converter most likely is most useful to return the proper scheme
depending on the value returned by the `ssl_fc` fetch, e.g. for use within
the `x-forwarded-proto` request header.
However it can also be useful for use within a template that is sent to
the client using `http-request return` with a `lf-file`. It allows the
administrator to implement a simple condition, without needing to prefill
variables within the regular configuration using `http-request
set-var(req.foo)`.
This way, all fields of the buffer structure are reset when a string argument
(ARGT_STR) is released. It is also a good way to explicitly specify this kind
of argument is a chunk. So .data and .size fields must be set.
This patch may be backported to ease backports.
Some sample fetches or sample converters uses a validation functions for their
arguments. In these function, string arguments (ARGT_STR) may be converted to
another type (for instance a regex, a variable or a integer). Because these
strings are allocated when the argument list is built, they must be freed after
a conversion. Most of time, it is done. But not always. This patch fixes these
minor memory leaks (only on few strings, during the configuration parsing).
This patch may be backported to all supported versions, most probably as far as
2.1 only. If this commit is backported, the previous one 73292e9e6 ("BUG/MINOR:
lua: Duplicate map name to load it when a new Map object is created") must also
be backported. Note that some validation functions does not exists on old
version. It should be easy to resolve conflicts.
The debug() converter uses a string to reference the sink where to send debug
events. During the configuration parsing, this string is converted to a sink
object but it is still store as a string argument. It is a problem on deinit
because string arguments are released. So the sink pointer will be released
twice.
To fix the bug, we keep a reference on the sink using an ARGT_PTR argument. This
way, it will not be freed on the deinit.
This patch depends on the commit e02fc4d0d ("MINOR: arg: Add an argument type to
keep a reference on opaque data"). Both must be backported as far as 2.1.
This patch merges build message code between sink and log
and introduce a new API based on struct ist array to
prepare message header with zero copy, targeting the
log forwarding feature.
Log format 'iso' and 'timed' are now avalaible on logs line.
A new log format 'priority' is also added.
Given the following example configuration:
listen foo
mode http
bind *:8080
http-request set-var(txn.leak) meth(GET)
server x example.com:80
Running a configuration check with valgrind reports:
==25992== 4 bytes in 1 blocks are definitely lost in loss record 1 of 344
==25992== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==25992== by 0x4E239D: my_strndup (tools.c:2261)
==25992== by 0x581E20: make_arg_list (arg.c:253)
==25992== by 0x4DE91D: sample_parse_expr (sample.c:890)
==25992== by 0x58E304: parse_store (vars.c:772)
==25992== by 0x566A3F: parse_http_req_cond (http_rules.c:95)
==25992== by 0x4A4CE6: cfg_parse_listen (cfgparse-listen.c:1339)
==25992== by 0x494C59: readcfgfile (cfgparse.c:2049)
==25992== by 0x545145: init (haproxy.c:2029)
==25992== by 0x421E42: main (haproxy.c:3175)
After this patch is applied the leak is gone as expected.
This is a fairly minor leak, but it can add up for many uses of the `bool()`
sample fetch. The bug most likely exists since the `bool()` sample fetch was
introduced in commit cc103299c75c530ab3637a1698306145bdc85552. The fix may
be backported to HAProxy 1.6+.
Given the following example configuration:
listen foo
mode http
bind *:8080
http-request set-var(txn.leak) bool(1)
server x example.com:80
Running a configuration check with valgrind reports:
==24233== 2 bytes in 1 blocks are definitely lost in loss record 1 of 345
==24233== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24233== by 0x4E238D: my_strndup (tools.c:2261)
==24233== by 0x581E10: make_arg_list (arg.c:253)
==24233== by 0x4DE90D: sample_parse_expr (sample.c:890)
==24233== by 0x58E2F4: parse_store (vars.c:772)
==24233== by 0x566A2F: parse_http_req_cond (http_rules.c:95)
==24233== by 0x4A4CE6: cfg_parse_listen (cfgparse-listen.c:1339)
==24233== by 0x494C59: readcfgfile (cfgparse.c:2049)
==24233== by 0x545135: init (haproxy.c:2029)
==24233== by 0x421E42: main (haproxy.c:3175)
After this patch is applied the leak is gone as expected.
This is a fairly minor leak, but it can add up for many uses of the `bool()`
sample fetch. The bug most likely exists since the `bool()` sample fetch was
introduced in commit cc103299c75c530ab3637a1698306145bdc85552. The fix may
be backported to HAProxy 1.6+.
Instead of just calling release_sample_arg(conv_expr->arg_p) we also must
free() the conv_expr itself (after removing it from the list).
Given the following example configuration:
frontend foo
bind *:8080
mode http
http-request set-var(txn.foo) str(bar)
acl is_match str(foo),strcmp(txn.hash) -m bool
Running a configuration check within valgrind reports:
==1431== 32 bytes in 1 blocks are definitely lost in loss record 20 of 43
==1431== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1431== by 0x4C39B5: sample_parse_expr (sample.c:982)
==1431== by 0x56B410: parse_acl_expr (acl.c:319)
==1431== by 0x56BA7F: parse_acl (acl.c:697)
==1431== by 0x48D225: cfg_parse_listen (cfgparse-listen.c:816)
==1431== by 0x4797C3: readcfgfile (cfgparse.c:2167)
==1431== by 0x52943D: init (haproxy.c:2021)
==1431== by 0x41F382: main (haproxy.c:3133)
After this patch is applied the leak is gone as expected.
This is a fairly minor leak that can only be observed if samples need to be
freed, which is not something that should occur during normal processing and
most likely only during shut down. Thus no backport should be needed.
This patch fixes all the leftovers from the include cleanup campaign. There
were not that many (~400 entries in ~150 files) but it was definitely worth
doing it as it revealed a few duplicates.
Most of the files dealing with error reports have to include log.h in order
to access ha_alert(), ha_warning() etc. But while these functions don't
depend on anything, log.h depends on a lot of stuff because it deals with
log-formats and samples. As a result it's impossible not to embark long
dependencies when using ha_warning() or qfprintf().
This patch moves these low-level functions to errors.h, which already
defines the error codes used at the same places. About half of the users
of log.h could be adjusted, sometimes revealing other issues such as
missing tools.h. Interestingly the total preprocessed size shrunk by
4%.
This one is particularly difficult to split because it provides all the
functions used to manipulate a proxy state and to retrieve names or IDs
for error reporting, and as such, it was included in 73 files (down to
68 after cleanup). It would deserve a small cleanup though the cut points
are not obvious at the moment given the number of structs involved in
the struct proxy itself.
The current state of the logging is a real mess. The main problem is
that almost all files include log.h just in order to have access to
the alert/warning functions like ha_alert() etc, and don't care about
logs. But log.h also deals with real logging as well as log-format and
depends on stream.h and various other things. As such it forces a few
heavy files like stream.h to be loaded early and to hide missing
dependencies depending where it's loaded. Among the missing ones is
syslog.h which was often automatically included resulting in no less
than 3 users missing it.
Among 76 users, only 5 could be removed, and probably 70 don't need the
full set of dependencies.
A good approach would consist in splitting that file in 3 parts:
- one for error output ("errors" ?).
- one for log_format processing
- and one for actual logging.
Initially it looked like this could have been placed into auth.h or
stats.h but it's not the case as it's what makes the link between them
and the HTTP layer. However the file needed to be split in two. Quite
a number of call places were dropped because these were mostly leftovers
from the early days where the stats and cli were packed together.
The stktable_types[] array declaration was moved to the main file as
it had nothing to do in the types. A few declarations were reordered
in the types file so that defines were before the structs. Thread-t
was added since there are a few __decl_thread(). The loss of peers.h
revealed that cfgparse-listen needed it.
global.h was one of the messiest files, it has accumulated tons of
implicit dependencies and declares many globals that make almost all
other file include it. It managed to silence a dependency loop between
server.h and proxy.h by being well placed to pre-define the required
structs, forcing struct proxy and struct server to be forward-declared
in a significant number of files.
It was split in to, one which is the global struct definition and the
few macros and flags, and the rest containing the functions prototypes.
The UNIX_MAX_PATH definition was moved to compat.h.
There is no C file for this one, the code was placed into sample.c which
thus has a dependency on this file which itself includes sample.h. Probably
that it would be wise to split that later.
This one is particularly tricky to move because everyone uses it
and it depends on a lot of other types. For example it cannot include
arg-t.h and must absolutely only rely on forward declarations to avoid
dependency loops between vars -> sample_data -> arg. In order to address
this one, it would be nice to split the sample_data part out of sample.h.
The STATS_DEFAULT_REALM and STATS_DEFAULT_URI were moved to defaults.h.
It was required to include types/pattern.h and types/sample.h since they
are mentioned in function prototypes.
It would be wise to merge this with uri_auth.h later.
The sink files could be moved with almost no change at since they
didn't rely on anything fancy. ssize_t required sys/types.h and
thread.h was needed for the locks.
And also rename standard.c to tools.c. The original split between
tools.h and standard.h dates from version 1.3-dev and was mostly an
accident. This patch moves the files back to what they were expected
to be, and takes care of not changing anything else. However this
time tools.h was split between functions and types, because it contains
a small number of commonly used macros and structures (e.g. name_desc)
which in turn cause the massive list of includes of tools.h to conflict
with the callers.
They remain the ugliest files of the whole project and definitely need
to be cleaned and split apart. A few types are defined there only for
functions provided there, and some parts are even OS-specific and should
move somewhere else, such as the symbol resolution code.
Most of the file was a large set of HTX elements manipulation functions
and few types, so splitting them allowed to further reduce dependencies
and shrink the build time. Doing so revealed that a few files (h2.c,
mux_pt.c) needed haproxy/buf.h and were previously getting it through
htx.h. They were fixed.
So the enums and structs were placed into http-t.h and the functions
into http.h. This revealed that several files were dependeng on http.h
but not including it, as it was silently inherited via other files.
Regex are essentially included for myregex_t but it turns out that
several of the C files didn't include it directly, relying on the
one included by their own .h. This has been cleanly addressed so
that only the type is included by H files which need it, and adding
the missing includes for the other ones.
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:
- common/config.h
- common/compat.h
- common/compiler.h
- common/defaults.h
- common/initcall.h
- common/tools.h
The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.
In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.
No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
The http-error directive can now be used instead of errorfile to define an error
message in a proxy section (including default sections). This directive uses the
same syntax that http return rules. The only real difference is the limitation
on status code that may be specified. Only status codes supported by errorfile
directives are supported for this new directive. Parsing of errorfile directive
remains independent from http-error parsing. But functionally, it may be
expressed in terms of http-errors :
errorfile <status> <file> ==> http-errror status <status> errorfile <file>
This patch extends the sink_write prototype and code to
handle the rfc5424 and rfc3164 header.
It uses header building tools from log.c. Doing this some
functions/vars have been externalized.
facility and minlevel have been removed from the struct sink
and passed to args at sink_write because they depends of the log
and not of the sink (they remained unused by rest of the code
until now).
Make the digest and HMAC function of OpenSSL accessible to the user via
converters. They can be used to sign and validate content.
Reviewed-by: Tim Duesterhus <tim@bastelstu.be>
aes_gcm_dec is independent of the TLS implementation and fits better
in sample.c file with others hash functions.
[Cf: I slightly updated this patch to move aes_gcm_dec converter in sample.c
instead the new file crypto.c]
Reviewed-by: Tim Duesterhus <tim@bastelstu.be>
All sample fetches in the scope "check." have been removed. Response sample
fetches must be used instead. It avoids keyword duplication. So, for instance,
res.hdr() must be now used instead of check.hdr().
To do so, following sample fetches have been added on the response :
* res.body, res.body_len and res.body_size
* res.hdrs and res.hdrs_bin
Sample feches dealing with the response's body are only useful in the health
checks context. When called from a stream context, there is no warranty on the
body presence. There is no option to wait the response's body.
A binary sample data can be converted, implicitly or not, to a string by cutting
the buffer on the first null byte.
I guess this patch should be backported to all stable versions.
cpu_calls, cpu_ns_avg, cpu_ns_tot, lat_ns_avg and lat_ns_tot depend on the
stream to find the current task and must check for it or they may cause a
crash if misused or used in a log-format string after commit 5f940703b3
("MINOR: log: Don't depends on a stream to process samples in log-format
string").
This must be backported as far as 1.9.
This converter tranform a integer to its binary representation in the network
byte order. Integer are already automatically converted to binary during sample
expression evaluation. But because samples own 8-bytes integers, the conversion
produces 8 bytes. the htonl converter do the same but for 4-bytes integer.
Register the custom action rules "set-var" and "unset-var", that will
call the parse_store() command upon parsing.
These rules are thus built and integrated to the tcp-check ruleset, but
have no further effect for the moment.
We currently have two UUID generation functions, one for the sample
fetch and the other one in the SPOE filter. Both were a bit complicated
since they were made to support random() implementations returning an
arbitrary number of bits, and were throwing away 33 bits every 64. Now
we don't need this anymore, so let's have a generic function consuming
64 bits at once and use it as appropriate.
The rand() sample fetch supports being limited to a certain range, but
it only uses 31 bits and scales them as requested, which means that when
the requested output range is larger than 31 bits, the least significant
one is not random and may even be constant.
Let's make use of the whole 32 bits now that we have access ot them.
This is the replacement of failed attempt to add thread safety and
per-process sequences of random numbers initally tried with commit
1c306aa84d ("BUG/MEDIUM: random: implement per-thread and per-process
random sequences").
This new version takes a completely different approach and doesn't try
to work around the horrible OS-specific and non-portable random API
anymore. Instead it implements "xoroshiro128**", a reputedly high
quality random number generator, which is one of the many variants of
xorshift, which passes all quality tests and which is described here:
http://prng.di.unimi.it/
While not cryptographically secure, it is fast and features a 2^128-1
period. It supports fast jumps allowing to cut the period into smaller
non-overlapping sequences, which we use here to support up to 2^32
processes each having their own, non-overlapping sequence of 2^96
numbers (~7*10^28). This is enough to provide 1 billion randoms per
second and per process for 2200 billion years.
The implementation was made thread-safe either by using a double 64-bit
CAS on platforms supporting it (x86_64, aarch64) or by using a local
lock for the time needed to perform the shift operations. This ensures
that all threads pick numbers from the same pool so that it is not
needed to assign per-thread ranges. For processes we use the fast jump
method to advance the sequence by 2^96 for each process.
Before this patch, the following config:
global
nbproc 8
frontend f
bind :4445
mode http
log stdout format raw daemon
log-format "%[uuid] %pid"
redirect location /
Would produce this output:
a4d0ad64-2645-4b74-b894-48acce0669af 12987
a4d0ad64-2645-4b74-b894-48acce0669af 12992
a4d0ad64-2645-4b74-b894-48acce0669af 12986
a4d0ad64-2645-4b74-b894-48acce0669af 12988
a4d0ad64-2645-4b74-b894-48acce0669af 12991
a4d0ad64-2645-4b74-b894-48acce0669af 12989
a4d0ad64-2645-4b74-b894-48acce0669af 12990
82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12987
82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12992
82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12986
(...)
And now produces:
f94b29b3-da74-4e03-a0c5-a532c635bad9 13011
47470c02-4862-4c33-80e7-a952899570e5 13014
86332123-539a-47bf-853f-8c8ea8b2a2b5 13013
8f9efa99-3143-47b2-83cf-d618c8dea711 13012
3cc0f5c7-d790-496b-8d39-bec77647af5b 13015
3ec64915-8f95-4374-9e66-e777dc8791e0 13009
0f9bf894-dcde-408c-b094-6e0bb3255452 13011
49c7bfde-3ffb-40e9-9a8d-8084d650ed8f 13014
e23f6f2e-35c5-4433-a294-b790ab902653 13012
There are multiple benefits to using this method. First, it doesn't
depend anymore on a non-portable API. Second it's thread safe. Third it
is fast and more proven than any hack we could attempt to try to work
around the deficiencies of the various implementations around.
This commit depends on previous patches "MINOR: tools: add 64-bit rotate
operators" and "BUG/MEDIUM: random: initialize the random pool a bit
better", all of which will need to be backported at least as far as
version 2.0. It doesn't require to backport the build fixes for circular
include files dependecy anymore.
This reverts commit 1c306aa84d785b9c2240bf7767dcc1f2596cfcfd.
It breaks the build on all non-glibc platforms. I got confused by the
man page (which possibly is the most confusing man page I've ever read
about a standard libc function) and mistakenly understood that random_r
was portable, especially since it appears in latest freebsd source as
well but not in released versions, and with a slightly different API :-/
We need to find a different solution with a fallback. Among the
possibilities, we may reintroduce this one with a fallback relying on
locking around the standard functions, keeping fingers crossed for no
other library function to call them in parallel, or we may also provide
our own PRNG, which is not necessarily more difficult than working
around the totally broken up design of the portable API.