325 Commits

Author SHA1 Message Date
Christopher Faulet
4ccc12fc41 MINOR: sample: add htonl converter
This converter transforms an integer to its binary representation in network
byte order. Integers are already automatically converted to binary during sample
expression evaluation, but because samples hold 8-byte integers, that conversion
produces 8 bytes. The htonl converter does the same but for 4-byte integers.
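
The 4-byte conversion described above can be sketched in plain C as follows.
This is only an illustration: int_to_be32() is a hypothetical helper name,
not the function added by the patch, and it simply keeps the low 32 bits of
the 8-byte sample integer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): write the low 32 bits of <v>
 * into <out> in network (big-endian) byte order and return the number of
 * bytes produced.
 */
static size_t int_to_be32(int64_t v, unsigned char out[4])
{
    uint32_t n = (uint32_t)v;

    out[0] = (unsigned char)(n >> 24);
    out[1] = (unsigned char)(n >> 16);
    out[2] = (unsigned char)(n >> 8);
    out[3] = (unsigned char)n;
    return 4;
}
```

Using explicit shifts rather than a pointer cast keeps the result independent
of the host byte order.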
2020-04-27 09:39:37 +02:00
Gaetan Rivet
707b52f17e MEDIUM: checks: Parse custom action rules in tcp-checks
Register the custom action rules "set-var" and "unset-var", that will
call the parse_store() command upon parsing.

These rules are thus built and integrated to the tcp-check ruleset, but
have no further effect for the moment.
2020-04-27 09:39:37 +02:00
Willy Tarreau
ee3bcddef7 MINOR: tools: add a generic function to generate UUIDs
We currently have two UUID generation functions, one for the sample
fetch and the other one in the SPOE filter. Both were a bit complicated
since they were made to support random() implementations returning an
arbitrary number of bits, and were throwing away 33 bits every 64. Now
we don't need this anymore, so let's have a generic function consuming
64 bits at once and use it as appropriate.
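
As an illustration of consuming 64 random bits at once, a version-4 UUID can
be formatted from two 64-bit words as sketched below. format_uuid_v4() is a
hypothetical name and the random source is left to the caller; this is not
the generic function introduced by the patch.

```c
#include <stdint.h>
#include <stdio.h>

/* Sketch: format a version-4 UUID from two 64-bit random words <hi> and
 * <lo>, which are assumed to come from a 64-bit PRNG. <out> must hold at
 * least 37 bytes (36 characters plus the trailing zero).
 */
static void format_uuid_v4(uint64_t hi, uint64_t lo, char out[37])
{
    hi = (hi & ~0xF000ULL) | 0x4000ULL;            /* version field = 4 */
    lo = (lo & ~(0x3ULL << 62)) | (0x2ULL << 62);  /* variant bits = 10 */

    snprintf(out, 37, "%08x-%04x-%04x-%04x-%012llx",
             (unsigned int)(hi >> 32),
             (unsigned int)((hi >> 16) & 0xFFFF),
             (unsigned int)(hi & 0xFFFF),
             (unsigned int)(lo >> 48),
             (unsigned long long)(lo & 0xFFFFFFFFFFFFULL));
}
```

Only 6 of the 128 bits are fixed by the format, so two 64-bit draws are
enough and no random bits need to be thrown away piecemeal.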
2020-03-08 18:04:16 +01:00
Willy Tarreau
aa8bbc12dd MINOR: sample: make all bits random on the rand() sample fetch
The rand() sample fetch supports being limited to a certain range, but
it only uses 31 bits and scales them as requested, which means that when
the requested output range is larger than 31 bits, the least significant
one is not random and may even be constant.

Let's make use of the whole 32 bits now that we have access to them.
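
One common way to scale a full 32-bit random word to a range is to keep the
high half of a 64-bit product, sketched below. scale_rand32() is a
hypothetical name, and the exact expression used in the patch may differ.

```c
#include <stdint.h>

/* Sketch: scale the 32-bit random word <r> into [0, range) by keeping the
 * high 32 bits of the 64-bit product. With a full 32-bit input, every
 * output bit depends on random input bits.
 */
static uint32_t scale_rand32(uint32_t r, uint32_t range)
{
    return (uint32_t)(((uint64_t)r * range) >> 32);
}
```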
2020-03-08 18:04:16 +01:00
Willy Tarreau
52bf839394 BUG/MEDIUM: random: implement a thread-safe and process-safe PRNG
This is the replacement for the failed attempt to add thread safety and
per-process sequences of random numbers initially tried with commit
1c306aa84d ("BUG/MEDIUM: random: implement per-thread and per-process
random sequences").

This new version takes a completely different approach and doesn't try
to work around the horrible OS-specific and non-portable random API
anymore. Instead it implements "xoroshiro128**", a reputedly high
quality random number generator, which is one of the many variants of
xorshift, which passes all quality tests and which is described here:

   http://prng.di.unimi.it/

While not cryptographically secure, it is fast and features a 2^128-1
period. It supports fast jumps allowing to cut the period into smaller
non-overlapping sequences, which we use here to support up to 2^32
processes each having their own, non-overlapping sequence of 2^96
numbers (~7*10^28). This is enough to provide 1 billion randoms per
second and per process for 2200 billion years.

The implementation was made thread-safe either by using a double 64-bit
CAS on platforms supporting it (x86_64, aarch64) or by using a local
lock for the time needed to perform the shift operations. This ensures
that all threads pick numbers from the same pool so that it is not
needed to assign per-thread ranges. For processes we use the fast jump
method to advance the sequence by 2^96 for each process.
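
For reference, one step of xoroshiro128** as published by its authors looks
like the sketch below. It is single-threaded on purpose: the double-word CAS
or local lock described above, the seeding and the per-process jump are all
omitted, and the names prng_state, rotl64() and prng_next() are illustrative
rather than taken from the patch.

```c
#include <stdint.h>

/* Generator state; it must never be all zeroes. Seeding is left to the
 * caller in this sketch.
 */
static uint64_t prng_state[2] = { 1, 0 };

/* Rotate <x> left by <k> bits, with 0 < k < 64. */
static inline uint64_t rotl64(uint64_t x, int k)
{
    return (x << k) | (x >> (64 - k));
}

/* One step of xoroshiro128**. Not thread-safe as written: a real
 * implementation must update both state words atomically or under a lock.
 */
static uint64_t prng_next(void)
{
    const uint64_t s0 = prng_state[0];
    uint64_t s1 = prng_state[1];
    const uint64_t result = rotl64(s0 * 5, 7) * 9;

    s1 ^= s0;
    prng_state[0] = rotl64(s0, 24) ^ s1 ^ (s1 << 16);
    prng_state[1] = rotl64(s1, 37);
    return result;
}
```

Because the whole state is only two 64-bit words, updating it with a single
128-bit compare-and-swap is practical on platforms that provide one.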

Before this patch, the following config:
    global
        nbproc 8

    frontend f
        bind :4445
        mode http
        log stdout format raw daemon
        log-format "%[uuid] %pid"
        redirect location /

Would produce this output:
    a4d0ad64-2645-4b74-b894-48acce0669af 12987
    a4d0ad64-2645-4b74-b894-48acce0669af 12992
    a4d0ad64-2645-4b74-b894-48acce0669af 12986
    a4d0ad64-2645-4b74-b894-48acce0669af 12988
    a4d0ad64-2645-4b74-b894-48acce0669af 12991
    a4d0ad64-2645-4b74-b894-48acce0669af 12989
    a4d0ad64-2645-4b74-b894-48acce0669af 12990
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12987
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12992
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12986
    (...)

And now produces:
    f94b29b3-da74-4e03-a0c5-a532c635bad9 13011
    47470c02-4862-4c33-80e7-a952899570e5 13014
    86332123-539a-47bf-853f-8c8ea8b2a2b5 13013
    8f9efa99-3143-47b2-83cf-d618c8dea711 13012
    3cc0f5c7-d790-496b-8d39-bec77647af5b 13015
    3ec64915-8f95-4374-9e66-e777dc8791e0 13009
    0f9bf894-dcde-408c-b094-6e0bb3255452 13011
    49c7bfde-3ffb-40e9-9a8d-8084d650ed8f 13014
    e23f6f2e-35c5-4433-a294-b790ab902653 13012

There are multiple benefits to using this method. First, it doesn't
depend anymore on a non-portable API. Second it's thread safe. Third it
is fast and more proven than any hack we could attempt to try to work
around the deficiencies of the various implementations around.

This commit depends on previous patches "MINOR: tools: add 64-bit rotate
operators" and "BUG/MEDIUM: random: initialize the random pool a bit
better", all of which will need to be backported at least as far as
version 2.0. It no longer requires backporting the build fixes for the
circular include file dependency.
2020-03-08 10:09:02 +01:00
Willy Tarreau
0fbf28a05b Revert "BUG/MEDIUM: random: implement per-thread and per-process random sequences"
This reverts commit 1c306aa84d.

It breaks the build on all non-glibc platforms. I got confused by the
man page (which possibly is the most confusing man page I've ever read
about a standard libc function) and mistakenly understood that random_r
was portable, especially since it appears in latest freebsd source as
well but not in released versions, and with a slightly different API :-/

We need to find a different solution with a fallback. Among the
possibilities, we may reintroduce this one with a fallback relying on
locking around the standard functions, keeping fingers crossed for no
other library function to call them in parallel, or we may also provide
our own PRNG, which is not necessarily more difficult than working
around the totally broken up design of the portable API.
2020-03-07 11:24:39 +01:00
Willy Tarreau
1c306aa84d BUG/MEDIUM: random: implement per-thread and per-process random sequences
As mentioned in previous patch, the random number generator was never
made thread-safe, which used not to be a problem for health checks
spreading, until the uuid sample fetch function appeared. Currently
it is possible for two threads or processes to produce exactly the
same UUID. In fact it's extremely likely that this will happen for
processes, as can be seen with this config:

    global
        nbproc 8

    frontend f
        bind :4445
        mode http
        log stdout daemon format raw
        log-format "%[uuid] %pid"
        redirect location /

It typically produces this log:

  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30645
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30641
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30644
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30639
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30646
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30645
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30639
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30643
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30646
  b6773fdd-678f-4d04-96f2-4fb11ad15d6b 30646
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30642
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30642

What this patch does is to use a distinct per-thread and per-process
seed to make sure the same sequences will not appear, and will then
extend these seeds by "burning" a number of randoms that depends on
the global random seed, the thread ID and the process ID. This adds
roughly 20 extra bits of randomness, resulting in 52 bits total per
thread and per process.

It only takes a few milliseconds to burn these randoms and given
that threads start with a different seed, we know they will not
catch each other. So these random extra bits are essentially added
to ensure randomness between boots and cluster instances.

This replaces all uses of random() with ha_random() which uses the
thread-local state.

This must be backported as far as 2.0 or any version having the
UUID sample-fetch function since it's the main victim here.

It's important to note that this patch, in addition to depending on
the previous one "BUG/MEDIUM: init: initialize the random pool a bit
better", also depends on the preceding build fixes to address a
circular dependency issue in the include files that prevented it
from building. Part or all of these patches may need to be backported
or adapted as well.
2020-03-07 06:11:15 +01:00
Willy Tarreau
a8b7ecd4dc CLEANUP: sample: use read_u64() in ipmask() to apply an IPv6 mask
There were 8 strict aliasing warnings there due to the dereferences
casting to uint32_t of input and output. We can achieve the same using
two write_u64() and four read_u64() which do not cause this issue and
even let the compiler use 64-bit operations.
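
The idea can be sketched with memcpy-based accessors, which compilers
typically turn into plain 64-bit loads and stores. load_u64(), store_u64()
and apply_ipv6_mask() are local stand-ins written for this example; they are
not the project's read_u64()/write_u64() helpers.

```c
#include <stdint.h>
#include <string.h>

/* Local stand-ins for unaligned 64-bit accessors. Going through memcpy()
 * avoids dereferencing a cast pointer, hence no strict aliasing warning.
 */
static inline uint64_t load_u64(const void *p)
{
    uint64_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

static inline void store_u64(void *p, uint64_t v)
{
    memcpy(p, &v, sizeof(v));
}

/* AND a 16-byte IPv6 address with a 16-byte mask, in place: two writes
 * and four reads of 64 bits each.
 */
static void apply_ipv6_mask(unsigned char addr[16], const unsigned char mask[16])
{
    store_u64(addr,     load_u64(addr)     & load_u64(mask));
    store_u64(addr + 8, load_u64(addr + 8) & load_u64(mask + 8));
}
```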
2020-02-25 10:24:14 +01:00
Willy Tarreau
5715da269d BUG/MINOR: sample: fix the json converter's endian-sensitivity
About every time there's a pointer cast in the code, there's a hidden
bug, and this one was no exception, as it passes the first octet of the
native representation of an integer as a single-character string, which
obviously only works on little endian machines. On big-endian machines,
something as simple as "str(foo),json" only returns zeroes.

This bug was introduced with the JSON converter in 1.6-dev1 by commit
317e1c4f1e ("MINOR: sample: add "json" converter"), the fix may be
backported to all stable branches.
2020-02-25 08:47:45 +01:00
Willy Tarreau
908071171b BUILD: general: always pass unsigned chars to is* functions
The isalnum(), isalpha(), isdigit() etc functions from ctype.h are
supposed to take an int in argument which must either reflect an
unsigned char or EOF. In practice on some platforms they're implemented
as macros referencing an array, and when passed a char, they either cause
a warning "array subscript has type 'char'" when lucky, or cause random
segfaults when unlucky. It's quite inconvenient by the way since none of
them may return true for negative values. The recent introduction of
cygwin to the list of regularly tested build platforms revealed a lot
of breakage there due to the same issues again.

So this patch addresses the problem all over the code at once. It adds
unsigned char casts to every valid use case, and also drops the unneeded
double cast to int that was sometimes added on top of it.

It may be backported by dropping irrelevant changes if that helps better
support uncommon platforms. It's unlikely to fix bugs on platforms which
would already not emit any warning though.
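
A minimal example of the cast in question, assuming a plain char string as
input (count_alnum_prefix() is a hypothetical function written for this
illustration):

```c
#include <ctype.h>
#include <stddef.h>

/* Count the leading alphanumeric characters of <s>. The cast to unsigned
 * char ensures that bytes >= 0x80 are never passed to isalnum() as
 * negative values on platforms where char is signed.
 */
static size_t count_alnum_prefix(const char *s)
{
    size_t n = 0;

    while (isalnum((unsigned char)s[n]))
        n++;
    return n;
}
```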
2020-02-25 08:16:33 +01:00
Willy Tarreau
23997daf4e BUG/MINOR: sample: exit regsub() in case of trash allocation error
As reported in issue #507, since commit 07e1e3c93e ("MINOR: sample:
regsub now supports backreferences") we must not proceed in regsub()
if we fail to allocate a trash (which in practice never happens). No
backport needed.
2020-02-18 14:27:44 +01:00
Jerome Magnin
07e1e3c93e MINOR: sample: regsub now supports backreferences
Now that the configuration parser is more flexible with samples,
converters and their arguments, we can leverage this to enable
support for backreferences in regsub.
2020-02-16 19:48:54 +01:00
Willy Tarreau
e3b57bf92f MINOR: sample: make sample_parse_expr() able to return an end pointer
When an end pointer is passed, instead of complaining that a comma is
missing after a keyword, sample_parse_expr() will silently return the
pointer to the current location into this return pointer so that the
caller can continue its parsing. This will be used by more complex
expressions which embed sample expressions, and may even permit to
embed sample expressions into arguments of other expressions.
2020-02-14 19:02:06 +01:00
Willy Tarreau
80b53ffb1c MEDIUM: arg: make make_arg_list() stop after its own arguments
The main problem we're having with argument parsing is that at the
moment the caller looks for the first character looking like an end
of arguments (')') and calls make_arg_list() on the sub-string inside
the parentheses.

Let's first change the way it works so that make_arg_list() also
consumes the parentheses and returns the pointer to the first char not
consumed. This will later permit to refine each argument parsing.

For now there is no functional change.
2020-02-14 19:02:06 +01:00
Willy Tarreau
ed2c662b01 MINOR: sample/acl: use is_idchar() to locate the fetch/conv name
Instead of scanning a string looking for an end of line, ')' or ',',
let's only accept characters which are actually valid identifier
characters. This will let the parser know that in %[src], only "src"
is the sample fetch name, not "src]". This was done both for samples
and ACLs since they are the same here.
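
The scanning approach can be sketched as below. The identifier alphabet used
here (alphanumerics plus '.', '_', '-' and ':') is an assumption made for the
example, and is_ident_char() and keyword_len() are hypothetical names; the
real is_idchar() may accept a slightly different set.

```c
#include <ctype.h>
#include <stddef.h>

/* Assumed identifier alphabet for this sketch: alphanumerics plus
 * '.', '_', '-' and ':'.
 */
static int is_ident_char(char c)
{
    return isalnum((unsigned char)c) ||
           c == '.' || c == '_' || c == '-' || c == ':';
}

/* Return the length of the fetch/converter name starting at <s>, stopping
 * at the first character which cannot be part of an identifier.
 */
static size_t keyword_len(const char *s)
{
    size_t n = 0;

    while (is_ident_char(s[n]))
        n++;
    return n;
}
```

With this rule, scanning "src]" stops after "src" instead of running to the
end of the string.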
2020-02-14 19:02:06 +01:00
Willy Tarreau
0851fd5eef MINOR: debug: support logging to various sinks
As discussed in the thread below [1], the debug converter is currently
not of much use given that it's only built when DEBUG_EXPR is set, and
it is limited to stderr only.

This patch changes this to make it take an optional prefix and an optional
target sink so that it can log to stdout, stderr or a ring buffer. The
default output is the "buf0" ring buffer, that can be consulted from the
CLI.

[1] https://www.mail-archive.com/haproxy@formilux.org/msg35671.html

Note: if this patch is backported, it also requires the following commit to
work: 46dfd78cbf ("BUG/MINOR: sample: always check converters' arguments").
2019-12-19 09:19:13 +01:00
Tim Duesterhus
cd3732456b MINOR: sample: Validate the number of bits for the sha2 converter
Instead of failing the conversion when an invalid number of bits is
given the sha2 converter now fails with an appropriate error message
during startup.

The sha2 converter was introduced in d437630237,
which is in 2.1 and higher.
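
A startup-time check of this kind can be as simple as the sketch below, which
accepts the four sizes listed for the sha2 converter. sha2_bits_valid() is a
hypothetical name; the actual check function and its error reporting are not
shown in this log.

```c
#include <stdbool.h>

/* Sketch: return true if <bits> names one of the SHA-2 variants listed
 * for the converter (SHA-224, SHA-256, SHA-384, SHA-512).
 */
static bool sha2_bits_valid(int bits)
{
    switch (bits) {
    case 224:
    case 256:
    case 384:
    case 512:
        return true;
    default:
        return false;
    }
}
```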
2019-12-17 13:28:00 +01:00
Willy Tarreau
46dfd78cbf BUG/MINOR: sample: always check converters' arguments
In 1.5-dev20, sample-fetch arguments parsing was addressed by commit
689a1df0a1 ("BUG/MEDIUM: sample: simplify and fix the argument parsing").
The issue was that argument checks were not run for sample-fetches if
parentheses were not present. Surprisingly, the fix was made only for
sample-fetches and not for converters which suffer from the exact same
problem. There are even a few comments in the code mentioning that some
argument validation functions are not called when arguments are missing.

This fix applies the exact same method as the one above. The impact of
this bug is limited because over the years the code has learned to work
around this issue instead of fixing it.

This may be backported to all maintained versions.
2019-12-17 10:44:49 +01:00
Willy Tarreau
5060326798 BUG/MINOR: sample: fix the closing bracket and LF in the debug converter
The closing bracket was emitted for the "debug" converter even when the
opening one was not sent, and the new line was not always emitted. Let's
fix this. This is harmless since this converter is not built by default.
2019-12-17 09:04:38 +01:00
Damien Claisse
ae6f125c7b MINOR: sample: add us/ms support to date/http_date
It can sometimes be useful to have a timestamp with a
resolution finer than a second.
This is currently painful to obtain, because concatenating
date and date_us leads to a shorter timestamp during the first
100ms of a second, which is not parseable and needs ugly ACLs
in the configuration to prepend 0s when needed.
To improve this, add an optional <unit> parameter to the date sample
fetch to report an integer in the desired unit.
Also support this unit in the http_date converter to report
a date string with sub-second precision.
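
The padding problem disappears when the timestamp is computed as one integer
instead of being concatenated from two strings, as in this sketch
(to_millis() and to_micros() are hypothetical helpers, not code from the
patch):

```c
#include <stdint.h>

/* Combine seconds and microseconds into a single integer timestamp in
 * milliseconds. <usec> is expected to be in [0, 999999].
 */
static int64_t to_millis(int64_t sec, int32_t usec)
{
    return sec * 1000 + usec / 1000;
}

/* Same, with microsecond resolution. */
static int64_t to_micros(int64_t sec, int32_t usec)
{
    return sec * 1000000 + usec;
}
```

For instance, with 1572508051 seconds and 42 microseconds, naive
concatenation gives "157250805142" whereas the arithmetic form keeps the
leading zeroes of the sub-second part implicit in the value.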
2019-10-31 08:47:31 +01:00
Tim Duesterhus
4381d26edc BUG/MINOR: sample: Make the field converter compatible with -m found
Previously an expression like:

    path,field(2,/) -m found

always returned `true`.

The bug has existed since the `field` converter was introduced, that is, in:
f399b0debf

The fix should be backported to 1.6+.
2019-10-21 15:49:42 +02:00
Luca Schimweg
8a694b859c MINOR: sample: Add UUID-fetch
Adds the fetch uuid(int). It returns a UUID following the format of
version 4 in the RFC4122 standard.

New feature, but could be backported.
2019-09-13 04:43:33 +02:00
Frédéric Lécaille
be36793d1d BUG/MEDIUM: stick-table: Wrong stick-table backends parsing.
When parsing references to stick-tables declared as backends, they are added to
a list of proxies (they are proxies!) which refer to these stick-tables.
Before this patch we added them to these lists without checking whether they were
already present, making the silly assumption that the actions/samples were
checked/resolved in the same order the proxies are parsed.

This patch implements a simple inline function, in_proxies_list(), to test
the presence of a proxy in a list of proxies. We use this function when resolving
/checking samples/actions.

This bug was introduced by 015e4d7 commit.

Must be backported to 2.0.
2019-08-07 10:32:31 +02:00
Frédéric Lécaille
9417f4534a BUG/MAJOR: sample: Wrong stick-table name parsing in "if/unless" ACL condition.
This bug was introduced by commit 1b8e68e, which assumed the stick-table was always
stored in struct arg at parsing time. This is never the case with the usage of
"if/unless" conditions in stick-tables declared as backends. In this case, the
name of the proxy must be considered as the stick-table name.

This must be backported to 2.0.
2019-06-21 09:48:28 +02:00
Tim Duesterhus
d437630237 MINOR: sample: Add sha2([<bits>]) converter
This adds a converter for the SHA-2 family, supporting SHA-224, SHA-256,
SHA-384 and SHA-512.

The converter relies on the OpenSSL implementation, thus only being available
when HAProxy is compiled with USE_OPENSSL.

See GitHub issue #123. The hypothetical `ssl_?_sha256` fetch can then be
simulated using `ssl_?_der,sha2(256)`:

  http-response set-header Server-Cert-FP %[ssl_f_der,sha2(256),hex]
2019-06-17 13:36:42 +02:00
Dragan Dosen
2674303912 MEDIUM: regex: modify regex_comp() to atomically allocate/free the my_regex struct
Now we atomically allocate the my_regex struct within function
regex_comp() and compile the regex or free both in case of failure. The
pointer to the allocated my_regex struct is returned directly. The
my_regex* argument to regex_comp() is removed.

Function regex_free() was modified so that it systematically frees the
my_regex entry. The function does nothing when called with NULL as its
argument (like free()). This avoids the existing risk of not properly
freeing the initialized area.

Other structures are also updated in order to be compatible (the ones
related to Lua and action rules).
2019-05-07 06:58:15 +02:00
Frédéric Lécaille
015e4d7d93 MINOR: stick-tables: Add peers process binding computing.
Add a list of proxies for all the stick-tables (->proxies_list struct stktable
member) so as to be able to compute the process bindings of the peers after having
parsed the configuration file.
The proxies are added to the stick-tables they reference when parsing
stick-tables lines in proxy sections, when checking the actions in
check_trk_action() and when resolving samples args for stick-tables
without checking if they are duplicates. We only check that there is no loop.
Then, after having parsed everything, we add the proxy bindings to the
peers frontend bindings with stick-tables they reference.
2019-05-07 06:54:07 +02:00
Frédéric Lécaille
1b8e68e89a MEDIUM: stick-table: Stop handling stick-tables as proxies.
This patch adds the support for the "table" line parsing in "peers" sections
to declare stick-table in such sections. This also prevents the user from having
to declare dummy backends sections with a unique stick-table inside.
Even if still supported, this usage will become deprecated.

To do so, the ->table member of proxy struct which is a stktable struct is replaced
by a pointer to a stktable struct allocated at parsing time in src/cfgparse-listen.c
for the dummy stick-table backends and in src/cfgparse.c for "peers" sections.
This has an impact on the code for stick-table sample converters and on the stickiness
rules parsers which first store the name of the dummy before resolving the rules.
This patch replaces proxy_tbl_by_name() calls by stktable_find_by_name() calls
to lookup for stick-tables stored in "stktable_by_name" ebtree at parsing time.
There is only one remaining place where proxy_tbl_by_name() is used: src/hlua.c.

At several places in the code we relied on the fact that the ->size member of the
stick-table was equal to zero to consider that the stick-table was present but not
configured. This does not make sense anymore as the ->table member of struct proxy
is from now on a pointer.
These tests are replaced by a test on the ->table value itself.

In "peers" sections we do not have to temporarily store the name of the section the
stick-tables are attached to because this name is obviously already known just after
having entered this "peers" section.

About the CLI stick-table I/O handler, the pointer to proxy struct is replaced by
a pointer to a stktable struct.
2019-05-07 06:54:06 +02:00
Frédéric Lécaille
bfe6138150 MINOR: sample: Add a protocol buffers specific converter.
This patch adds a "protobuf" protocol buffers specific converter which
may be used in combination with "ungrpc" as the first converter to extract
a protocol buffers field value. It is simply implemented by reusing
protobuf_field_lookup(), the protocol buffers specific parser already
used by the "ungrpc" converter, which only parses a gRPC header in addition
to parsing the protocol buffers message.

Update the documentation for this new "protobuf" converter.
2019-03-06 15:36:02 +01:00
Frédéric Lécaille
5f33f85ce8 MINOR: sample: Extract some protocol buffers specific code.
We move the code responsible for parsing protocol buffers messages
inside gRPC messages from sample.c to include/proto/protocol_buffers.h
so as to reuse it to cascade the "ungrpc" converter.
2019-03-06 15:36:02 +01:00
Frédéric Lécaille
756d97f205 MINOR: sample: Rework gRPC converter code.
From now on, "ungrpc" may take a second optional argument to provide
the protocol buffers type used to encode the field value to be extracted.
When absent, the field value is extracted as a binary sample which may then
be followed by other converters like "hex" which take binary as input sample.
When this second argument is a type which does not match the one found by "ungrpc",
this field is considered as not found even if present.

With this patch we also remove the useless "varint" and "svarint" converters.

Update the documentation about "ungrpc" converters.
2019-03-05 11:04:23 +01:00
Frédéric Lécaille
7c93e88d0c MINOR: sample: Code factorization "ungrpc" converter.
Parsing protocol buffer fields always consists in skipping the field
if it is not found or storing the field value if found.
So, with this patch we factorize the code for the "ungrpc" converter a little bit.
2019-03-05 11:03:53 +01:00
Frédéric Lécaille
50290fbb42 MINOR: sample: Replace "req.ungrpc" smp fetch by a "ungrpc" converter.
This patch simply extracts the code of smp_fetch_req_ungrpc() for "req.ungrpc"
from http_fetch.c to move it to sample.c with very few modifications.
Furthermore smp_fetch_body_buf() used to fetch the body contents is no longer needed.

Update the documentation for gRPC.
2019-03-04 08:28:42 +01:00
Frédéric Lécaille
fd95c62f1b MINOR: sample: Add two sample converters for protocol buffers.
Add "varint" to convert all the protocol buffers binary varints except the signed
ones ("sint32" and "sint64") to an integer. The binary signed varints may be
converted to an integer with the "svarint" converter implemented by this patch.
These two new converters do not take any argument.
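
For readers unfamiliar with the encoding, the two decodings can be sketched
as follows: a base-128 varint carries 7 payload bits per byte with the high
bit as a continuation flag, and the signed "sint32"/"sint64" types add ZigZag
encoding on top. decode_varint() and zigzag_decode() are hypothetical names
written for this illustration, not the functions from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a base-128 varint from <buf> (at most <len> bytes) into <*val>.
 * Returns the number of bytes consumed, or 0 if the input is truncated or
 * longer than the 10 bytes a 64-bit value may need.
 */
static size_t decode_varint(const unsigned char *buf, size_t len, uint64_t *val)
{
    uint64_t v = 0;
    size_t i;

    for (i = 0; i < len && i < 10; i++) {
        v |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
        if (!(buf[i] & 0x80)) {
            *val = v;
            return i + 1;
        }
    }
    return 0;
}

/* ZigZag decoding used by signed varints: 0, 1, 2, 3 map to 0, -1, 1, -2. */
static int64_t zigzag_decode(uint64_t v)
{
    return (int64_t)(v >> 1) ^ -(int64_t)(v & 1);
}
```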
2019-02-26 16:27:05 +01:00
Willy Tarreau
1a0fe3becd BUG/MINOR: config: make sure to count the error on incorrect track-sc/stick rules
When commit 151e1ca98 ("BUG/MAJOR: config: verify that targets of track-sc
and stick rules are present") added a check for some process inconsistencies
between rules and their stick tables, some errors resulted in a "return 0"
statement, which is taken as "no error" in some cases. Let's fix this.

This must be backported to all versions using the above commit.
2019-02-06 10:25:07 +01:00
Willy Tarreau
151e1ca989 BUG/MAJOR: config: verify that targets of track-sc and stick rules are present
Stick and track-sc rules may optionally designate a table in a different
proxy. In this case, a number of verifications are made such as validating
that this proxy actually exists. However, in multi-process mode, the target
table might indeed exist but not be bound to the set of processes the rules
will execute on. This will definitely result in a random behaviour especially
if these tables do require peer synchronization, because some tasks will be
started to try to synchronize from uninitialized areas.

The typical issue looks like this :

    peers my-peers
         peer foo ...

    listen proxy
         bind-process 1
         stick on src table ip
         ...

    backend ip
         bind-process 2
         stick-table type ip size 1k peers my-peers

While it appears obvious that the example above will not work, there are
less obvious situations, such as having bind-process in a defaults section
and having a larger set of processes for the referencing proxy than the
referenced one.

The present patch adds checks for such situations by verifying that all
processes from the referencing proxy are present on the other one in all
track-sc* and stick-* rules, and in sample fetch / converters referencing
another table so that sc_inc_gpc0() and similar are safe as well.
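
Assuming process bindings are represented as bitmasks with one bit per
process, the containment test described above reduces to a single mask
operation. table_covers_rules() and its parameter names are hypothetical and
only illustrate the principle.

```c
#include <stdbool.h>

/* Sketch: <rules_procs> is the set of processes the referencing proxy runs
 * on, <table_procs> the set the referenced table is bound to, one bit per
 * process. The reference is safe only if no process of the former is
 * missing from the latter.
 */
static bool table_covers_rules(unsigned long rules_procs, unsigned long table_procs)
{
    return (rules_procs & ~table_procs) == 0;
}
```

In the configuration shown earlier, the referencing proxy is bound to
process 1 (mask 0x1) and the table to process 2 (mask 0x2), so the test
fails.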

This fix must be backported to all maintained versions. It may potentially
disrupt configurations which already randomly crash. There hardly is any
intermediary solution though, such configurations need to be fixed.
2019-02-05 11:54:49 +01:00
Olivier Houchard
4468f1cacb BUG/MEDIUM: sample: Don't treat SMP_T_METH as SMP_T_STR.
In smp_dup(), don't consider a SMP_T_METH with an unknown method the same as
SMP_T_STR. The string and string length aren't stored at the same place.

This should be backported to 1.8.
2018-12-07 15:31:43 +01:00
Willy Tarreau
0108d90c6c MEDIUM: init: convert all trivial registration calls to initcalls
This switches explicit calls to various trivial registration methods for
keywords, muxes or protocols from constructors to INITCALL1 at stage
STG_REGISTER. All these calls have in common to consume a single pointer
and return void. Doing this removes 26 constructors. The following calls
were addressed :

- acl_register_keywords
- bind_register_keywords
- cfg_register_keywords
- cli_register_kw
- flt_register_keywords
- http_req_keywords_register
- http_res_keywords_register
- protocol_register
- register_mux_proto
- sample_register_convs
- sample_register_fetches
- srv_register_keywords
- tcp_req_conn_keywords_register
- tcp_req_cont_keywords_register
- tcp_req_sess_keywords_register
- tcp_res_cont_keywords_register
- flt_register_keywords
2018-11-26 19:50:32 +01:00
Willy Tarreau
70fe94419c MINOR: sample: add cpu_calls, cpu_ns_avg, cpu_ns_tot, lat_ns_avg, lat_ns_tot
These sample fetch keywords report performance metrics about the task calling
them. They are useful to report in logs which requests consume too much CPU
time and what negative performance impact it has on other requests. Typically
logging cpu_ns_avg and lat_ns_avg will show culprits and victims.
2018-11-22 16:07:39 +01:00
Joseph Herlant
757f5ad73a CLEANUP: Fix typos in the sample subsystem
Fix some typos in the code comment of the sample subsystem.
2018-11-18 22:26:42 +01:00
Willy Tarreau
35b51c6e5b REORG: http: move the HTTP semantics definitions to http.h/http.c
It's a bit painful to have to deal with HTTP semantics for each protocol
version (H1 and H2), and working on the version-agnostic code further
emphasizes the problem.

This patch creates http.h and http.c which are agnostic to the version
in use, and which borrow a few parts from proto_http and from h1. For
example the once thought h1-specific h1_char_classes array is in fact
dictated by RFC7231 and is used to parse HTTP headers. A few changes
were made to a few files which were including proto_http.h while they
only needed http.h.

Certain string definitions pre-dated the introduction of indirect
strings (ist) so some were used to simplify the definition of the known
HTTP methods. The current lookup code saves 2 kB of a heavily used table
and is faster than the previous table based lookup (typ. 14 ns vs 16
before).
2018-09-11 10:30:25 +02:00
Willy Tarreau
83061a820e MAJOR: chunks: replace struct chunk with struct buffer
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
2018-07-19 16:23:43 +02:00
Willy Tarreau
843b7cbe9d MEDIUM: chunks: make the chunk struct's fields match the buffer struct
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.

Most of the changes were made with spatch using this coccinelle script :

  @rule_d1@
  typedef chunk;
  struct chunk chunk;
  @@
  - chunk.str
  + chunk.area

  @rule_d2@
  typedef chunk;
  struct chunk chunk;
  @@
  - chunk.len
  + chunk.data

  @rule_i1@
  typedef chunk;
  struct chunk *chunk;
  @@
  - chunk->str
  + chunk->area

  @rule_i2@
  typedef chunk;
  struct chunk *chunk;
  @@
  - chunk->len
  + chunk->data

Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
2018-07-19 16:23:43 +02:00
Tim Duesterhus
ca097c16a8 MINOR: sample: Add strcmp sample converter
This converter supplements the existing string matching by allowing
strings to be compared to a variable.

Example usage:

  http-request set-var(txn.host) hdr(host)
  # Check whether the client is attempting domain fronting.
  acl ssl_sni_http_host_match ssl_fc_sni,strcmp(txn.host) eq 0
2018-04-28 07:03:39 +02:00
Willy Tarreau
9eb2a4addf BUILD: sample: avoid build warning in sample.c
Recent commit 9631a28 ("MEDIUM: sample: Extend functionality for field/word
converters") introduced this minor build warning that this patch addresses :

 src/sample.c: In function 'sample_conv_word':
 src/sample.c:2108:8: warning: suggest explicit braces to avoid ambiguous 'else' [-Wparentheses]
 src/sample.c:2137:8: warning: suggest explicit braces to avoid ambiguous 'else' [-Wparentheses]

No backport is needed.
2018-04-19 10:33:28 +02:00
Marcin Deranek
9631a28275 MEDIUM: sample: Extend functionality for field/word converters
Extend functionality of field/word converters, so it's possible
to extract field(s)/word(s) counting from the beginning/end and/or
extract multiple fields/words (including separators) eg.

str(f1_f2_f3__f5),field(2,_,2)  # f2_f3
str(f1_f2_f3__f5),field(2,_,0)  # f2_f3__f5
str(f1_f2_f3__f5),field(-2,_,3) # f2_f3_
str(f1_f2_f3__f5),field(-3,_,0) # f1_f2_f3

str(w1_w2_w3___w4),word(3,_,2)  # w3___w4
str(w1_w2_w3___w4),word(2,_,0)  # w2_w3___w4
str(w1_w2_w3___w4),word(-2,_,3) # w1_w2_w3
str(w1_w2_w3___w4),word(-3,_,0) # w1_w2

Change is backward compatible.
2018-04-17 11:27:48 +02:00
Emmanuel Hocdet
50791a7df3 MINOR: samples: add crc32c converter
This patch adds the support of CRC32c (rfc4960).
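
For reference, the standard CRC-32C (Castagnoli) checksum can be computed bit
by bit as sketched below, using the reflected polynomial 0x82F63B78. This is
a slow illustrative version under the usual conventions (initial value and
final inversion of all ones); the patch itself may use a table-driven
implementation, and crc32c_bitwise() is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C over <len> bytes of <data>. */
static uint32_t crc32c_bitwise(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= data[i];
        for (bit = 0; bit < 8; bit++) {
            /* shift right, XOR the polynomial in when the low bit is set */
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```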
2018-03-21 16:17:00 +01:00
Willy Tarreau
280f42b99e MINOR: sample: add a new "concat" converter
It's always a pain not to be able to combine variables. This commit
introduces the "concat" converter, which appends a delimiter, a variable's
contents and another delimiter to an existing string. The result is a string.
This makes it easier to build composite variables made of other variables.
2018-02-19 15:34:12 +01:00
Tim Duesterhus
1478aa795e MEDIUM: sample: Add IPv6 support to the ipmask converter
Add an optional second parameter to the ipmask converter that specifies
the number of bits to mask off IPv6 addresses.

If the second parameter is not given, IPv6 addresses fail to mask (resulting
in an empty string), preserving backwards compatibility: previously,
a sample like `src,ipmask(24)` failed to give a result for IPv6 addresses.

This feature can be tested like this:

  defaults
  	log	global
  	mode	http
  	option	httplog
  	option	dontlognull
  	timeout connect 5000
  	timeout client  50000
  	timeout server  50000

  frontend fe
  	bind :::8080 v4v6

  	# Masked IPv4 for IPv4, empty for IPv6 (with and without this commit)
  	http-response set-header Test %[src,ipmask(24)]
  	# Correctly masked IP addresses for both IPv4 and IPv6
  	http-response set-header Test2 %[src,ipmask(24,ffff:ffff:ffff:ffff::)]
  	# Correctly masked IP addresses for both IPv4 and IPv6
  	http-response set-header Test3 %[src,ipmask(24,64)]

  	default_backend be

  backend be
  	server s example.com:80
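
The expected header values can be cross-checked with a small model of the
masking rule. This is a sketch, not HAProxy code: it only handles the
prefix-length form of the second argument (as in Test3), and the function
name and return convention are assumptions.

```python
import ipaddress


def ipmask(addr, v4_bits, v6_bits=None):
    """Mask addr with v4_bits (IPv4) or v6_bits (IPv6); "" if no IPv6 mask."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        bits = v4_bits
    elif v6_bits is None:
        # No second parameter: IPv6 fails to mask, as before the commit.
        return ""
    else:
        bits = v6_bits
    net = ipaddress.ip_network("%s/%d" % (ip, bits), strict=False)
    return str(net.network_address)
```

With `ipmask("2001:db8:1:2:3:4:5:6", 24, 64)` the model yields
`2001:db8:1:2::`, while omitting the third argument yields an empty string.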

Tested-By: Jarno Huuskonen <jarno.huuskonen@uef.fi>
2018-01-25 22:25:40 +01:00
Tim Duesterhus
bf5ce02eff BUG/MINOR: sample: Fix output type of c_ipv62ip
c_ipv62ip failed to set the output type of the cast to SMP_T_IPV4
even for a successful conversion.

This bug exists as of commit cc4d1716a2
which is the first commit adding this function.

v1.6-dev4 is the first tag containing this commit, the fix should
be backported to haproxy 1.6 and newer.
2018-01-25 22:25:40 +01:00
Tim Duesterhus
ec6b0a2d18 CLEANUP: sample: Fix outdated comment about sample casts functions
The cast functions modify their output type as of commit:
b805f71d1b

v1.5-dev20 is the first tag containing this comment, the fix
should be backported to haproxy 1.5 and newer.
2018-01-25 22:25:40 +01:00
Tim Duesterhus
c555ee0c45 CLEANUP: sample: Fix comment encoding of sample.c
The file contained an 'e' with a grave accent and thus was
not US-ASCII, but ISO-8859-1.

Also correct the spelling in the incorrect comment.

The incorrect character was introduced in commit:
4d9a1d1a5c

v1.6-dev1 is the first tag containing this comment, the fix
should be backported to haproxy 1.6 and newer.
2018-01-25 22:25:40 +01:00
Etienne Carriere
a792a0aa93 MINOR: sample: add date_us sample
Add the date_us sample, which returns the microsecond part of the timeval
structure representing the date. The "second" part of
the timeval can already be fetched by the "date" sample.
2018-01-21 07:56:42 +01:00
Willy Tarreau
60a2ee7945 MINOR: sample: rename the "len" converter to "length"
This converter was recently introduced by commit ed0d24e ("MINOR:
sample: add len converter").

As found by Cyril, it causes an issue in "http-request capture"
statements. The non-obvious problem is that an old syntax for sample
expressions and converters used to support a series of words, each
representing a converter. This used to be how the "stick" directives
were created initially. By having a converter called "len", a
statement such as "http-request capture foo len 10" considers "len"
as a converter and not as the capture length.

This obsolete syntax needs to be changed in 1.9 but it's too late
for other versions. It's worth noting that the same problem can
happen if converters are registered on the fly using Lua. Other
language keywords that currently have to be avoided in converters
include "id", "table", "if", "unless".
2017-12-15 07:13:48 +01:00
Etienne Carriere
ed0d24ebed MINOR: sample: add len converter
Add len converter that returns the length of a string
2017-12-14 14:36:10 +01:00
Christopher Faulet
767a84bcc0 CLEANUP: log: Rename Alert/Warning in ha_alert/ha_warning
2017-11-24 17:19:12 +01:00
Christopher Faulet
34adb2af96 MINOR: sample: Add "thread" sample fetch
It returns the id of the thread calling the function.
2017-11-23 16:33:13 +01:00
Emeric Brun
e5c918bcef MINOR: threads/sample: Change temp_smp into a thread local variable
2017-10-31 13:58:31 +01:00
Dragan Dosen
3f957b2f83 MINOR: sample: add the hex2i converter
Converts a hex string containing two hex digits per input byte to an
integer. If the input value can not be converted, then zero is returned.
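
The behaviour described above can be modelled as follows. This is only a
sketch: it ignores the width of HAProxy's integer type (no overflow
handling), and treating an empty string as zero is an assumption.

```python
def hex2i(text):
    """Convert a string of hex digits to an integer; 0 if it is not valid."""
    value = 0
    for ch in text:
        digit = "0123456789abcdef".find(ch.lower())
        if digit < 0:
            # Any non-hex character makes the whole conversion fail.
            return 0
        value = (value << 4) | digit
    return value
```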
2017-10-25 04:46:08 +02:00
Dragan Dosen
6e5a9ca948 MINOR: sample: add the sha1 converter
This converter can be used to generate a SHA1 digest from binary type
sample. The result is a binary sample with length of 20 bytes.
2017-10-25 04:45:58 +02:00
Christopher Faulet
ec10051349 MINOR: samples: Handle the type SMP_T_METH when we duplicate a sample in smp_dup
First, the type SMP_T_METH was not handled by the smp_dup function. It was never
called with this kind of sample, so it's not really a problem. But this could
be useful in the future.

For all known HTTP methods (GET, POST...), there is no extra space allocated for
a sample of type SMP_T_METH. But for unknown methods, it uses a chunk. So, like
for strings, we duplicate data, using a trash chunk.
2017-07-24 17:15:47 +02:00
Holger Just
1bfc24ba03 MINOR: sample: Add b64dec sample converter
Add "b64dec" as a new converter which can be used to decode a base64
encoded string into its binary representation. It performs the inverse
operation of the "base64" converter.
2017-05-12 15:56:52 +02:00
Nenad Merdanovic
50c8044423 CLEANUP: Remove comment that's no longer valid
Code was deleted in ad63582eb, but the comment remained.

Signed-off-by: Nenad Merdanovic <nmerdan@haproxy.com>
2017-03-13 18:26:05 +01:00
Nenad Merdanovic
807a6e7856 MINOR: Add hostname sample fetch
It adds "hostname" as a new sample fetch. It does exactly the same as
"%H" in a log format except that it can be used outside of log formats.

Signed-off-by: Nenad Merdanovic <nmerdan@haproxy.com>
2017-03-13 18:26:05 +01:00
Thierry FOURNIER
01e0974b5a MINOR: samples: add xx-hash functions
This patch adds support for the 32-bit and 64-bit xxHash functions.
2016-12-26 12:45:04 +01:00
Willy Tarreau
97108e08ce CLEANUP: sample: report "converter" instead of "conv method" in error messages
This was inherited from the very early stick-tables code but it's about
time to produce understandable error messages :-)
2016-11-25 07:36:22 +01:00
Thierry FOURNIER / OZON.IO
a69c912187 CLEANUP: log-format: useless file and line in json converter
The caller must log location information, so this information is
provided two times in the log line. The error log is like this:

   [ALERT] 327/011513 (14291) : parsing [o3.conf:38]: 'http-response
   set-header': Sample fetch <method,json(rrr)> failed with : invalid
   args in conv method 'json' : Unexpected input code type at file
   'o3.conf', line 38. Allowed value are 'ascii', 'utf8', 'utf8s',
   'utf8p' and 'utf8ps'.

This patch removes the second location indication, and the same error
becomes:

   [ALERT] 327/011637 (14367) : parsing [o3.conf:38]: 'http-response
   set-header': Sample fetch <method,json(rrr)> failed with : invalid
   args in conv method 'json' : Unexpected input code type. Allowed
   value are 'ascii', 'utf8', 'utf8s', 'utf8p' and 'utf8ps'.
2016-11-24 18:54:25 +01:00
Christopher Faulet
f7e4e7e096 MAJOR: spoe: Add an experimental Stream Processing Offload Engine
SPOE makes possible the communication with external components to retrieve some
info using an in-house binary protocol, the Stream Processing Offload Protocol
(SPOP). In the long term, its aim is to allow any kind of offloading on the
streams. This first version, besides being experimental, won't do a lot of
things. The most important today is to validate the protocol design and lay the
foundations of what will, one day, be a full offload engine for the stream
processing.

So, for now, the SPOE can offload the stream processing before "tcp-request
content", "tcp-response content", "http-request" and "http-response" rules. And
it only supports variable creation/removal. But, in spite of these limited
features, one can easily imagine implementing an SSO solution, an IP reputation
service or an IP geolocation service.

Internally, the SPOE is implemented as a filter. So, to use it, you must use
the following line in a proxy section:

  frontend my-front
      ...
      filter spoe [engine <name>] config <file>
      ...

It uses its own configuration file to keep the HAProxy configuration clean. It
is also an easy way to disable it by commenting out the filter line.

See "doc/SPOE.txt" for all details about the SPOE configuration.
2016-11-09 22:57:01 +01:00
Christopher Faulet
476e5d0e03 REORG: sample: move code to release a sample expression in sample.c
This code has been moved from haproxy.c to sample.c and the function
release_sample_expr can now be called from anywhere to release a sample
expression. This function will be used by the stream processing offload engine
(SPOE).
2016-11-09 22:57:00 +01:00
Willy Tarreau
2235b261b6 OPTIM: http: move all http character classes tables into a single one
We used to have 7 different character classes, each was 256 bytes long,
resulting in almost 2kB being used in the L1 cache. It's as cheap to
test a bit as to check that the byte is not null, so let's store a 7-bit
composite value and check for the respective bits there instead.

The executable is now 4 kB smaller and the performance on small
objects increased by about 1% to 222k requests/second with a config
involving 4 http-request rules including 1 header lookup, one header
replacement, and 2 variable assignments.
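
The idea of folding several 256-byte lookup tables into one table of bit
flags can be sketched as below. The three class names and their bit
positions are hypothetical; HAProxy defines its own set of seven classes.

```python
import string

# Hypothetical class bits, one bit per character class.
CLS_DIGIT = 1 << 0
CLS_ALPHA = 1 << 1
CLS_SPACE = 1 << 2


def build_table():
    """Build one 256-entry table holding a composite bit mask per byte."""
    table = bytearray(256)
    for ch in string.digits:
        table[ord(ch)] |= CLS_DIGIT
    for ch in string.ascii_letters:
        table[ord(ch)] |= CLS_ALPHA
    for ch in " \t":
        table[ord(ch)] |= CLS_SPACE
    return table


CLASSES = build_table()


def has_class(byte, mask):
    """Test class membership with a single lookup and a bit test."""
    return bool(CLASSES[byte] & mask)
```

A lookup such as `has_class(ord("7"), CLS_DIGIT)` touches the same 256-byte
table whichever class is tested, which is where the cache saving comes from.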
2016-11-05 15:58:08 +01:00
Willy Tarreau
f0645dce4f MINOR: sample: use smp_make_rw() in upper/lower converters
There's no point in always duplicating the sample, just ensure it's
writable, as was done prior to the smp_dup() change. This should be
backported to 1.6 to avoid a performance regression caused by this
change (about 30% more time for upper/lower due to the copy).
2016-08-09 14:31:25 +02:00
Willy Tarreau
ad63582eb9 BUG/MEDIUM: samples: make smp_dup() always duplicate the sample
Vedran Furac reported a strange problem where the "base" sample fetch
would not always work for tracking purposes.

In fact, it happens that commit bc8c404 ("MAJOR: stick-tables: use sample
types in place of dedicated types") merged in 1.6 exposed a fundamental
bug related to the way samples use chunks as strings. The problem is that
chunks convey a base pointer, a length and an optional size, which may be
zero when unknown or when the chunk is allocated from a read-only location.
The sole purpose of this size is to know whether or not new data may be
appended to the chunk. This size causes some semantic issues in the sample,
which has its own SMP_F_CONST flag to indicate read-only contents.

The problem was emphasized by the commit above because it made use of new
calls to smp_dup() to convert a sample to a table key. And since smp_dup()
would only check the SMP_F_CONST flag, it would happily return read-write
samples indicating size=0.

So some tests were added upon smp_dup() return to ensure that the actual
length is smaller than size, but this in fact made things even worse. For
example, the "sni" server directive does some bad stuff on many occasions
because it limits len to size-1 and effectively sets it to -1 and writes
the zero byte before the beginning of the string!

It is therefore obvious that smp_dup() needs to be modified to take this
nature of the chunks into account. It's not enough but is needed. The core
of the problem comes from the fact that smp_dup() is called for 5 distinct
needs which are not always fulfilled :

  1) duplicate a sample to keep a copy of it during some operations
  2) ensure that the sample is rewritable for a converter like upper()
  3) ensure that the sample is terminated with a \0
  4) set a correct size on the sample
  5) grow the sample in case it was extracted from a partial chunk

Case 1 is not used for now, so we can ignore it. Case 2 indicates the wish
to modify the sample, so its R/O status must be removed if any, but there's
no implied requirement that the chunk becomes larger. Case 3 is used when
the sample has to be made compatible with libc's str* functions. There's no
need to make it R/W nor to duplicate it if it is already correct. Case 4
can happen when the sample's size is required (eg: before performing some
changes that must fit in the buffer). Case 5 is more or less similar but
will happen when the sample may be grown but we want to ensure we're not
bound by the current small size.

So the proposal is to have different functions for various operations. One
will ensure a sample is safe for use with str* functions. Another one will
ensure it may be rewritten in place. And smp_dup() will have to perform an
unconditional duplication to guarantee at least #5 above, and implicitly
all other ones.

This patch only modifies smp_dup() to make the duplication unconditional. It
is enough to fix both the "base" sample fetch and the "sni" server directive,
and all use cases in general though not always optimally. More patches will
follow to address them more optimally and even better than the current
situation (eg: avoid a dup just to add a \0 when possible).

The bug comes from an ambiguous design, so its roots are old. 1.6 is affected
and a backport is needed. In 1.5, the function already existed but was only
used by two converters modifying the data in place, so the bug has no effect
there.
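
The size=0 problem described above can be illustrated with a toy model. It
is not HAProxy's chunk API: the `Chunk` class, its field names and both
functions are hypothetical, and only mirror the "base pointer, length,
optional size" description.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    area: bytearray  # backing storage
    length: int      # number of valid bytes
    size: int        # allocated size, 0 when unknown or read-only


def clamp_for_terminator(chunk):
    """The fragile pattern: limit the length to size-1 to leave room for \\0."""
    return min(chunk.length, chunk.size - 1)


def dup_chunk(chunk, trash_size):
    """Unconditionally copy into a fresh buffer whose size is known."""
    length = min(chunk.length, trash_size - 1)
    area = bytearray(trash_size)  # zero-filled, so the copy is terminated
    area[:length] = chunk.area[:length]
    return Chunk(area=area, length=length, size=trash_size)
```

For a read-only chunk with size 0, `clamp_for_terminator()` returns -1,
which is the invalid length mentioned for the "sni" directive; after
`dup_chunk()` the copy always carries a real size.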
2016-08-09 14:03:23 +02:00
Herve COMMOWICK
8dfe863fbf DOC: fix json converter example and error message 2016-08-07 08:08:18 +02:00
Willy Tarreau
5f6e9054b9 BUILD: fix build on Solaris 11
htonll()/ntohll() already exist on Solaris 11 with a different declaration,
causing a build error as reported by Jonathan Fisher. They used to exist on
OSX with a #define which allowed us to detect them. It was a bad idea to give
these functions a name subject to conflicts like this. Simply rename them
my_htonll()/my_ntohll() to definitely get rid of the conflict.

This patch must be backported to 1.6.
2016-05-26 07:15:57 +02:00
David Carlier
64a16ab19c BUG/MEDIUM: sample: initialize the pointer before parse_binary call.
parse_binary line 2025 checks the nullity of binstr parameter.
Other calls of parse_binary properly zeroify this parameter.
[wt: this could result in random failures of the const parser]
2016-04-12 11:08:24 +02:00
Vincent Bernat
02779b6263 CLEANUP: uniformize last argument of malloc/calloc
Instead of repeating the type of the LHS argument (sizeof(struct ...))
in calls to malloc/calloc, we directly use the pointer
name (sizeof(*...)). The following Coccinelle patch was used:

@@
type T;
T *x;
@@

  x = malloc(
- sizeof(T)
+ sizeof(*x)
  )

@@
type T;
T *x;
@@

  x = calloc(1,
- sizeof(T)
+ sizeof(*x)
  )

When the LHS is not just a variable name, no change is made. Moreover,
the following patch was used to ensure that "1" is consistently used as
a first argument of calloc, not the last one:

@@
@@

  calloc(
+ 1,
  ...
- ,1
  )
2016-04-03 14:17:42 +02:00
Willy Tarreau
6204cd9f27 BUG/MAJOR: vars: always retrieve the stream and session from the sample
This is the continuation of previous patch called "BUG/MAJOR: samples:
check smp->strm before using it".

It happens that variables may have a session-wide scope, and that their
session is retrieved by dereferencing the stream. But nothing prevents them
from being used from a streamless context such as tcp-request connection,
thus crashing the process. Example :

    tcp-request connection accept if { src,set-var(sess.foo) -m found }

In order to fix this, we have to always ensure that variable manipulation
only happens via the sample, which contains the correct owner and context,
and that we never use one from a different source. This results in quite a
large change since a lot of functions are indirectly involved in the call
chain, but the change is easy to follow.

This fix must be backported to 1.6, and requires the last two patches.
2016-03-10 17:28:04 +01:00
Willy Tarreau
7560dd4b6a MINOR: sample: always set a new sample's owner before evaluating it
Some functions like sample_conv_var2smp(), var_get_byname(), and
var_set_byname() directly or indirectly need to access the current
stream and/or session and must find it in the sample itself and not
as a distinct argument. Thus we first need to call smp_set_owner()
prior to each such call.
2016-03-10 16:42:58 +01:00
Willy Tarreau
1777ea63e0 MINOR: sample: add a new helper to initialize the owner of a sample
Since commit 6879ad3 ("MEDIUM: sample: fill the struct sample with the
session, proxy and stream pointers") merged in 1.6-dev2, the sample
contains the pointer to the stream and sample fetch functions as well
as converters use it heavily. This requires a lot of call places
to initialize 4 fields, and it was even forgotten at a few places.

This patch provides a convenient helper to initialize all these fields
at once, making it easy to prepare a new sample from a previous one for
example.

A few call places were cleaned up to make use of it. It will be needed
by further fixes.

At one place in the Lua code, it was moved earlier because we used to
call sample casts with a non completely initialized sample, which is
not clean even though at the moment there are no consequences.
2016-03-10 16:42:58 +01:00
Dragan Dosen
0b85ecee53 MEDIUM: logs: add a new RFC5424 log-format for the structured-data
This patch adds a new RFC5424-specific log-format for the structured-data
that is automatically sent by __send_log() when the sender is in RFC5424
mode.

A new statement "log-format-sd" should be used in order to set log-format
for the structured-data part in RFC5424 formatted syslog messages.
Example:

    log-format-sd [exampleSDID@1234\ bytes=\"%B\"\ status=\"%ST\"]
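
Once the log-format variables are expanded, the directive above produces
one RFC5424 SD-ELEMENT. The sketch below shows the shape of that element
and the escaping RFC5424 requires inside parameter values; the function
names are hypothetical and this is not how HAProxy builds the string.

```python
def sd_param_escape(value):
    """Escape '"', '\\' and ']' with a backslash, as RFC5424 requires."""
    out = []
    for ch in value:
        if ch in '"\\]':
            out.append("\\")
        out.append(ch)
    return "".join(out)


def sd_element(sd_id, params):
    """Format one SD-ELEMENT: [SD-ID name="value" ...]."""
    body = "".join(
        ' %s="%s"' % (name, sd_param_escape(val)) for name, val in params.items()
    )
    return "[" + sd_id + body + "]"
```

The backslashes in the configuration line above only protect spaces and
quotes from the configuration parser; they are not part of the emitted
element.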
2015-09-28 14:01:27 +02:00
Thierry FOURNIER
136f9d34a9 MINOR: samples: rename union from "data" to "u"
The union name "data" is a little bit heavy when reading the source
code because we end up reading "data.data.sint". Renaming "data" to "u"
makes it easier to read, as in "data.u.sint".
2015-08-20 17:13:46 +02:00
Thierry FOURNIER
8c542cac07 MEDIUM: samples: Use the "struct sample_data" in the "struct sample"
This patch removes the information that was stored both in the struct
sample_data and in the struct sample. Now, only the struct sample_data
contains data, and the struct sample uses the struct sample_data for storing
its own data.
2015-08-20 17:13:46 +02:00
Thierry FOURNIER
cc4d1716a2 MINOR: sample: Add ipv6 to ipv4 and sint to ipv6 casts
RFC4291 says that when an IPv6 address has the following form:
0000::ffff:a.b.c.d, it can be converted to an IPv4 address. This patch
enables this conversion in casts.

As the sint can be cast as ipv4, and ipv4 can be cast as ipv6, we
can directly cast sint as ipv6 using the RFC4291.
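
Both casts can be modelled with the standard ipaddress module. This is an
illustrative sketch only: the function names are hypothetical, and
interpreting the integer as the 32-bit address value in network order is an
assumption.

```python
import ipaddress


def ipv6_to_ipv4(addr):
    """Return the embedded IPv4 address of ::ffff:a.b.c.d, else None."""
    return ipaddress.IPv6Address(addr).ipv4_mapped


def sint_to_ipv6(value):
    """Chain the two casts: integer -> IPv4 -> IPv4-mapped IPv6."""
    v4 = ipaddress.IPv4Address(value & 0xFFFFFFFF)
    return ipaddress.IPv6Address("::ffff:" + str(v4))
```

An address outside the ::ffff:0:0/96 range has no IPv4 equivalent, so
`ipv6_to_ipv4("2001:db8::1")` returns None in this model.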
2015-08-11 14:14:10 +02:00
Thierry FOURNIER
5d86fae234 MEDIUM: vars/sample: operators can use variables as parameter
This patch allows the existing operators to take a variable as parameter.
This is useful to add the content of two variables. This patch modifies
the behavior of operators.
2015-07-22 00:48:24 +02:00
Thierry FOURNIER
00c005c726 MEDIUM: sample: switch to saturated arithmetic
This patch checks calculations for overflow and returns capped values.
This makes it possible to protect against integer overflow in certain operations
involving ratios, percentages, limits or anything. That can sometimes
be critically important with some operations (eg: content-length < X).
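
Saturated arithmetic means the result is clamped to the representable range
instead of wrapping around. A minimal sketch, assuming 64-bit signed
bounds (the width is an assumption taken from the neighbouring commits,
and the function names are hypothetical):

```python
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)


def saturate(value):
    """Clamp an exact result to the 64-bit signed range."""
    return max(INT64_MIN, min(INT64_MAX, value))


def sat_add(a, b):
    return saturate(a + b)


def sat_mul(a, b):
    return saturate(a * b)
```

With wrapping arithmetic, INT64_MAX + 1 would become a large negative
number and a test such as "content-length < X" could wrongly succeed; with
saturation it stays at INT64_MAX.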
2015-07-22 00:48:24 +02:00
Thierry FOURNIER
bf65cd4d77 MAJOR: arg: converts uint and sint in sint
This patch removes the 32-bit unsigned integer and the 32-bit signed
integer. It replaces these types with a single 64-bit signed type.
2015-07-22 00:48:23 +02:00
Thierry FOURNIER
07ee64ef4d MAJOR: sample: converts uint and sint in 64 bits signed integer
This patch removes the 32-bit unsigned integer and the 32-bit signed
integer. It replaces these types with a single 64-bit signed type.

This makes integers easier to use and clarifies signed and unsigned usage.
With the previous version, signed and unsigned were used interchangeably,
and sometimes the converter lost the sign. For example, divisions
were processed as "unsigned", so if one operand is negative, the result is
wrong.

Note that the integer pattern matching and dotted version pattern matching
are already working with signed 64 bits integer values.

There is one user-visible change : the "uint()" and "sint()" sample fetch
functions which used to return a constant integer have been replaced with
a new more natural, unified "int()" function. These functions were only
introduced in the latest 1.6-dev2 so there's no impact on regular
deployments.
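
The division problem mentioned above is easy to reproduce. The sketch below
reinterprets the operands as 32-bit unsigned values, as the old code
effectively did, and compares that with a signed division truncating
towards zero. Function names are hypothetical, and division by zero is
deliberately left out.

```python
MASK32 = 0xFFFFFFFF


def div_as_unsigned32(a, b):
    """Divide after reinterpreting both operands as 32-bit unsigned."""
    return (a & MASK32) // (b & MASK32)


def div_as_signed(a, b):
    """Signed division truncating towards zero, as C does."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q
```

Dividing -10 by 2 gives 2147483643 with the unsigned interpretation and -5
with the signed one.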
2015-07-22 00:48:23 +02:00
Thierry FOURNIER
fac9ccfb70 BUG/MINOR: http/sample: gmtime/localtime can fail
The man page says that gmtime() and localtime() can return a NULL value.
This is not tested. It appears that all the values of a 32 bit integer
are valid, but it is better to check the return of these functions.

However, if the integer moves from 32 bits to 64 bits, some 64-bit values
can be unsupported.
2015-07-20 12:21:35 +02:00
Willy Tarreau
28d976d5ee MINOR: args: add new context for servers
We'll have to support fetch expressions and args on server lines for
"usesrc", "usedst", "sni", etc...
2015-07-09 11:39:33 +02:00
Adis Nezirovic
79beb248b9 CLEANUP: sample: generalize sample_fetch_string() as sample_fetch_as_type()
This modification makes it possible to use sample_fetch_string() in more places,
where we might need to fetch sample values which are not plain strings. This
way we don't need to fetch string, and convert it into another type afterwards.

When using aliased types, the caller should explicitly check which exact type
was returned (e.g. SMP_T_IPV4 or SMP_T_IPV6 for SMP_T_ADDR).

All usages of sample_fetch_string() are converted to use new function.
2015-07-06 16:17:25 +02:00
Dragan Dosen
93b38d9191 MEDIUM: 51Degrees code refactoring and cleanup
Moved 51Degrees code from src/haproxy.c, src/sample.c and src/cfgparse.c
into separate files src/51d.c and include/import/51d.h.

Added two new functions init_51degrees() and deinit_51degrees(), updated
Makefile and other code reorganizations related to 51Degrees.
2015-06-30 10:43:03 +02:00
Thierry FOURNIER
cc103299c7 MINOR: samples: add samples which returns constants
This patch adds samples which return constant values. This is useful
for initialising variables.
2015-06-13 23:01:37 +02:00
Thierry FOURNIER
9687c77c91 MINOR: debug: add a special converter which display its input sample content.
This converter displays its input sample type and content. It is useful
for debugging some complex configurations.
2015-06-13 23:01:36 +02:00
Thierry FOURNIER
9c627e84b2 MEDIUM: sample: Add type any
This type is used to accept any type of sample as input, and prevent
any automatic "cast". It works like the type "ADDR" which accepts the
types "IPV4" and "IPV6".
2015-06-13 22:59:14 +02:00
Thierry FOURNIER
0f811440d5 BUG/MINOR: sample: wrong conversion of signed values
The signed values are cast as unsigned before conversion. This patch
uses the right converters according to the sample type.

Note: it depends on previous patch to parse signed ints.
2015-06-13 22:59:14 +02:00
Thierry FOURNIER
4c2479e1c4 BUG/MINOR: debug: display (null) in place of "meth"
The array which contains the names of types is missing the METH entry.

[wt: should be backported to 1.5 as well]
2015-06-09 10:58:14 +02:00
Thomas Holmes
4d441a759c MEDIUM: sample: add trie support to 51Degrees
Trie or pattern algorithm is used depending on what 51Degrees source
files are provided to MAKE.
2015-06-02 19:30:53 +02:00
Thomas Holmes
951d44d24d MEDIUM: sample: add fiftyone_degrees converter.
It takes up to 5 string arguments that are to be 51Degrees property names.
It will then create a chunk with values detected based on the request header
supplied (this should be the User-Agent).
2015-06-02 14:00:25 +02:00
Willy Tarreau
f63386ad27 CLEANUP: da: move the converter registration to da.c
There's no reason to put it into sample.c, it's better to register it
locally in da.c, it removes a number of ifdefs and exports.
2015-06-02 13:42:12 +02:00
David Carlier
4542b10ae1 MEDIUM: sample: add the da-csv converter
This diff declares the deviceatlas module and can accept up to 5
property names for the API lookup.

[wt: this should probably be moved to its own file using the keyword
      registration mechanism]
2015-06-02 13:24:50 +02:00
Willy Tarreau
e2dc1fa8ca MEDIUM: stick-table: remove the now duplicate find_stktable() function
Since proxy_tbl_by_name() already does the same job, let's not keep
duplicate functions and use this one only.
2015-05-26 12:08:07 +02:00
Willy Tarreau
9e0bb1013e CLEANUP: proxy: make the proxy lookup functions more user-friendly
First, findproxy() was renamed proxy_find_by_name() so that it's explicit
that a name is required for the lookup. Second, we give this function
the ability to search for tables if needed. Third we now provide inline
wrappers to pass the appropriate PR_CAP_* flags and to explicitly look
up a frontend, backend or table.
2015-05-26 11:24:42 +02:00
Thierry FOURNIER
0786d05a04 MEDIUM: sample: change the prototype of sample-fetches functions
This patch removes the "opt" entry from the prototype of the
sample-fetches functions. This makes the function call
prototype lighter.
2015-05-11 20:03:08 +02:00
Thierry FOURNIER
1d33b882d2 MINOR: sample: fill the struct sample with the options.
Options are relative to the sample. Each sample fetched is associated with
fetch options or fetch flags.

This patch adds the 'opt' value in the sample struct. This makes it possible
to reduce the sample-fetch function prototype. In addition, the converters
will have more detail about the origin of the sample.
2015-05-11 20:02:11 +02:00
Thierry FOURNIER
0a9a2b8cec MEDIUM: sample change the prototype of sample-fetches and converters functions
This patch removes the structs "session", "stream" and "proxy" from
the sample-fetches and converters function prototypes.

This makes the function call prototype lighter.
2015-05-11 20:01:42 +02:00
Thierry FOURNIER
6879ad31a5 MEDIUM: sample: fill the struct sample with the session, proxy and stream pointers
Some sample analyzers (sample-fetches or converters) need to know the proxy,
session and stream attached to the sample. The sample-fetches and the converters
function pointers cannot be called without these 3 pointers filled.

This patch makes it possible to reduce the sample-fetch and converter call
prototypes, and provides a new means to add information for this type of
functions.
2015-05-11 20:00:03 +02:00
Willy Tarreau
192252e2d8 MAJOR: sample: pass a pointer to the session to each sample fetch function
Many such functions need a session, and till now they used to dereference
the stream. Once we remove the stream from the embryonic session, this
will not be possible anymore.

So as of now, sample fetch functions will be called with this :

   - sess = NULL,  strm = NULL                     : never
   - sess = valid, strm = NULL                     : tcp-req connection
   - sess = valid, strm = valid, strm->txn = NULL  : tcp-req content
   - sess = valid, strm = valid, strm->txn = valid : http-req / http-res
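
The table above can be read as a decision procedure. The sketch below is a
hypothetical helper, not a HAProxy function; `txn` stands for strm->txn,
and any non-None object stands for a valid pointer.

```python
def calling_context(sess, strm, txn):
    """Map pointer validity to the rule set a sample fetch is called from."""
    if sess is None:
        # First row of the table: this combination never happens.
        raise ValueError("sample fetches are never called without a session")
    if strm is None:
        return "tcp-req connection"
    if txn is None:
        return "tcp-req content"
    return "http-req / http-res"
```

A fetch that needs the stream therefore has to check it first, since it may
legitimately be absent at the "tcp-req connection" stage.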
2015-04-06 11:37:25 +02:00
Willy Tarreau
15e91e1b36 MAJOR: sample: don't pass l7 anymore to sample fetch functions
All of them can now retrieve the HTTP transaction *if it exists* from
the stream and be sure to get NULL there when called with an embryonic
session.

The patch is a bit large because many locations were touched (all fetch
functions had to have their prototype adjusted). The opportunity was
taken to also uniformize the call names (the stream is now always "strm"
instead of "l4") and to fix indent where it was broken. This way when
we later introduce the session here there will be less confusion.
2015-04-06 11:35:53 +02:00
Willy Tarreau
87b09668be REORG/MAJOR: session: rename the "session" entity to "stream"
With HTTP/2, we'll have to support multiplexed streams. A stream is in
fact the largest part of what we currently call a session, it has buffers,
logs, etc.

In order to catch any error, this commit removes any reference to the
struct session and tries to rename most "session" occurrences in function
names to "stream" and "sess" to "strm" when that's related to a session.

The files stream.{c,h} were added and session.{c,h} removed.

The session will be reintroduced later and a few parts of the stream
will progressively be moved over there. It will more or less contain
only what we need in an embryonic session.

Sample fetch functions and converters will have to change a bit so
that they'll use an L5 (session) instead of what's currently called
"L4" which is in fact L6 for now.

Once all changes are completed, we should see approximately this :

   L7 - http_txn
   L6 - stream
   L5 - session
   L4 - connection | applet

There will be at most one http_txn per stream, and a same session will
possibly be referenced by multiple streams. A connection will point to
a session and to a stream. The session will hold all the information
we need to keep even when we don't yet have a stream.

Some more cleanup is needed because some code was already far from
being clean. The server queue management still refers to sessions at
many places while comments talk about connections. This will have to
be cleaned up once we have a server-side connection pool manager.
Stream flags "SN_*" still need to be renamed, it doesn't seem like
any of them will need to move to the session.
2015-04-06 11:23:56 +02:00
Thierry FOURNIER
8fd1376014 MINOR: converters: add function to browse converters
This patch adds a function to browse each converter. This
is used with Lua for using the converters with a wrapper.
2015-03-11 19:55:10 +01:00
Thierry FOURNIER
4d9a1d1a5c MINOR: sample: add function for browsing samples.
This function is useful with the incoming lua functions.
2015-02-28 23:12:32 +01:00
Thierry FOURNIER
f41a809dc9 MINOR: sample: add private argument to the struct sample_fetch
This private argument is added to prepare the integration
of the Lua fetches.
2015-02-28 23:12:31 +01:00
Thierry FOURNIER
68a556e282 MINOR: converters: give the session pointer as converter argument
Some usages of the converters need to know the attached session. Lua
needs the session for retrieving its running context. This patch adds
the "session" as an argument of the converters prototype.
2015-02-28 23:12:31 +01:00
Thierry FOURNIER
1edc971919 MINOR: converters: add a "void *private" argument to converters
This makes it possible to store a specific configuration pointer. It is useful
for the future Lua integration.
2015-02-28 23:12:31 +01:00
Willy Tarreau
9770787e70 MEDIUM: samples: provide basic arithmetic and bitwise operators
This commit introduces a new category of converters. They are bitwise and
arithmetic operators which support performing basic operations on integers.
Some bitwise operations are supported (and, or, xor, cpl) and some arithmetic
operations are supported (add, sub, mul, div, mod, neg). Some comparators
are provided (odd, even, not, bool) which make it possible to report a match
without having to write an ACL.

The detailed list of new operators as they appear in the doc is :

add(<value>)
  Adds <value> to the input value of type unsigned integer, and returns the
  result as an unsigned integer.

and(<value>)
  Performs a bitwise "AND" between <value> and the input value of type unsigned
  integer, and returns the result as an unsigned integer.

bool
  Returns a boolean TRUE if the input value of type unsigned integer is
  non-null, otherwise returns FALSE. Used in conjunction with and(), it can be
  used to report true/false for bit testing on input values (eg: verify the
  presence of a flag).

cpl
  Takes the input value of type unsigned integer, applies a ones-complement
  (flips all bits) and returns the result as an unsigned integer.

div(<value>)
  Divides the input value of type unsigned integer by <value>, and returns the
  result as an unsigned integer. If <value> is null, the largest unsigned
  integer is returned (typically 2^32-1).

even
  Returns a boolean TRUE if the input value of type unsigned integer is even
  otherwise returns FALSE. It is functionally equivalent to "not,and(1),bool".

mod(<value>)
  Divides the input value of type unsigned integer by <value>, and returns the
  remainder as an unsigned integer. If <value> is null, then zero is returned.

mul(<value>)
  Multiplies the input value of type unsigned integer by <value>, and returns
  the product as an unsigned integer. In case of overflow, the higher bits are
  lost, leading to seemingly strange values.

neg
  Takes the input value of type unsigned integer, computes the opposite value,
  and returns the result as an unsigned integer. 0 is identity. This
  operator is provided for reversed subtracts : in order to subtract the input
  from a constant, simply perform a "neg,add(value)".

not
  Returns a boolean FALSE if the input value of type unsigned integer is
  non-null, otherwise returns TRUE. Used in conjunction with and(), it can be
  used to report true/false for bit testing on input values (eg: verify the
  absence of a flag).

odd
  Returns a boolean TRUE if the input value of type unsigned integer is odd
  otherwise returns FALSE. It is functionally equivalent to "and(1),bool".

or(<value>)
  Performs a bitwise "OR" between <value> and the input value of type unsigned
  integer, and returns the result as an unsigned integer.

sub(<value>)
  Subtracts <value> from the input value of type unsigned integer, and returns
  the result as an unsigned integer. Note: in order to subtract the input from
  a constant, simply perform a "neg,add(value)".

xor(<value>)
  Performs a bitwise "XOR" (exclusive OR) between <value> and the input value
  of type unsigned integer, and returns the result as an unsigned integer.
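
The operator list above can be modelled in a few lines of Python. This is only a sketch of the documented semantics, not HAProxy's code: it assumes 32-bit unsigned samples (the text says "typically 2^32-1"), and the function names are illustrative.

```python
MASK = 0xFFFFFFFF  # assumption: samples are 32-bit unsigned integers

def add(x, v): return (x + v) & MASK
def sub(x, v): return (x - v) & MASK
def mul(x, v): return (x * v) & MASK              # higher bits are lost on overflow
def div(x, v): return MASK if v == 0 else x // v  # null divisor: largest integer
def mod(x, v): return 0 if v == 0 else x % v      # null divisor: zero
def neg(x):    return (-x) & MASK
def cpl(x):    return ~x & MASK                   # flips all bits
def and_(x, v): return x & v
def or_(x, v):  return x | v
def xor(x, v):  return x ^ v
def to_bool(x): return x != 0                     # the "bool" converter
def odd(x):     return to_bool(and_(x, 1))        # "and(1),bool"

# Reversed subtract as described for "neg": 10 - 3 via "neg,add(10)"
print(add(neg(3), 10))
```

The mask makes the wrap-around on overflow explicit, which is where the "seemingly strange values" mentioned for mul() come from.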
2015-01-27 15:41:13 +01:00
Willy Tarreau
d817e468bf BUG/MINOR: sample: fix case sensitivity for the regsub converter
Two commits ago in 7eda849 ("MEDIUM: samples: add a regsub converter to
perform regex-based transformations"), I got caught for the second time
with the inverted case sensitivity usage of regex_comp(). So by default
it is case insensitive and passing the "i" flag makes it case sensitive.
I forgot to recheck that case before committing the cleanup. No harm
anyway, nobody had the time to use it.
2015-01-23 20:27:41 +01:00
Willy Tarreau
7eda849dce MEDIUM: samples: add a regsub converter to perform regex-based transformations
We can now replace matching regex parts with a string, a la sed. Note
that there are at least 3 different behaviours for existing sed
implementations when matching 0-length strings. Here is the result
of the following operation on each implementation tested :

  echo 'xzxyz' | sed -e 's/x*y*/A/g'

  GNU sed 4.2.1       => AzAzA
  Perl's sed 5.16.1   => AAzAAzA
  Busybox v1.11.2 sed => AzAz

The psed behaviour was adopted because it causes the least exceptions
in the code and seems logical from a certain perspective :

  - "x"  matches x*y*  => add "A" and skip "x"
  - "z"  matches x*y*  => add "A" and keep "z", not part of the match
  - "xy" matches x*y*  => add "A" and skip "xy"
  - "z"  matches x*y*  => add "A" and keep "z", not part of the match
  - ""   matches x*y*  => add "A" and stop here

Anyway, given the incompatibilities between implementations, it's unlikely
that some processing will rely on this behaviour.
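
The matching steps listed above can be reproduced with a short Python loop. This is a sketch of the described behaviour only, not the converter's implementation; the helper name sub_all is hypothetical.

```python
import re

def sub_all(pattern, repl, s):
    """Replace every match, handling 0-length matches as described above."""
    pat = re.compile(pattern)
    out, pos = [], 0
    while True:
        m = pat.search(s, pos)
        if m is None:
            out.append(s[pos:])          # nothing left to match: keep the rest
            break
        out.append(s[pos:m.start()])     # text before the match is kept
        out.append(repl)                 # add the replacement
        if m.end() > m.start():
            pos = m.end()                # non-empty match: skip it
        elif m.start() >= len(s):
            break                        # empty match at the end: stop here
        else:
            out.append(s[m.start()])     # empty match: keep one char, move on
            pos = m.start() + 1
    return "".join(out)

print(sub_all("x*y*", "A", "xzxyz"))
```

Writing the loop by hand rather than calling re.sub() keeps the result independent of how a given Python version treats empty matches.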

There currently is one big limitation : the configuration parser makes it
impossible to pass commas or closing parentheses (or even closing brackets
in log formats). But that's still quite usable to replace certain characters
or character sequences. It will become more complete once the config parser
is reworked.
2015-01-22 14:24:53 +01:00
Willy Tarreau
469477879c MINOR: args: implement a new arg type for regex : ARGT_REG
This one will be used when a regex is expected. It is automatically
resolved after the parsing and compiled into a regex. Some optional
flags are supported in the type-specific flags that should be set by
the optional arg checker. One is used during the regex compilation :
ARGF_REG_ICASE to ignore case.
2015-01-22 14:24:53 +01:00
Willy Tarreau
8059977d3e MINOR: samples: provide a "crc32" converter
This converter hashes a binary input sample into an unsigned 32-bit quantity
using the CRC32 hash function. Optionally, it is possible to apply a full
avalanche hash function to the output if the optional <avalanche> argument
equals 1. This converter uses the same functions as used by the various hash-
based load balancing algorithms, so it will provide exactly the same results.
It is provided for compatibility with other software which want a CRC32 to be
computed on some input keys, so it follows the most common implementation as
found in Ethernet, Gzip, PNG, etc... It is slower than the other algorithms
but may provide a better or at least less predictable distribution.
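
As a point of comparison, Python's zlib.crc32() implements the same common CRC32 used by Gzip and PNG, so it can serve as a reference when checking values. The sketch below does not model the optional avalanche step, whose exact function is not described here.

```python
import zlib

def crc32(data: bytes) -> int:
    # Common CRC32 as used by Ethernet, Gzip and PNG; result is unsigned 32-bit.
    return zlib.crc32(data) & 0xFFFFFFFF

# "123456789" is the conventional check input for CRC algorithms.
print(hex(crc32(b"123456789")))
```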
2015-01-20 19:48:08 +01:00
Vincent Bernat
1228dc0e7a BUG/MEDIUM: sample: fix random number upper-bound
random() will generate a number between 0 and RAND_MAX. POSIX mandates
RAND_MAX to be at least 32767. GNU libc uses ((1<<31) - 1) as
RAND_MAX.

In smp_fetch_rand(), a reduction is done with a multiply and shift to
avoid skewing the results. However, the shift was always 32 and hence
the numbers were not distributed uniformly in the specified range. We
fix that by dividing by RAND_MAX+1. gcc is smart enough to turn that
into a shift:

    0x000000000046ecc8 <+40>:    shr    $0x1f,%rax
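
The effect of the wrong shift can be checked numerically. The sketch below assumes the GNU libc RAND_MAX quoted above; the two function names are illustrative and only model the scaling step, not smp_fetch_rand() itself.

```python
RAND_MAX = (1 << 31) - 1  # GNU libc value quoted above

def scale_shift32(r, rng):
    # Old behaviour: always shift by 32, although r only has 31 random bits.
    return (r * rng) >> 32

def scale_rand_max(r, rng):
    # Fixed behaviour: divide by RAND_MAX + 1 (a shift by 31 with this RAND_MAX).
    return (r * rng) // (RAND_MAX + 1)

# With the largest possible input and a requested range of 100:
print(scale_shift32(RAND_MAX, 100), scale_rand_max(RAND_MAX, 100))
```

With the 32-bit shift, only the lower half of the requested range can ever be produced.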
2014-12-10 22:45:34 +01:00
Emeric Brun
c9a0f6d023 MINOR: samples: add the word converter.
word(<index>,<delimiters>)
  Extracts the nth word considering given delimiters from an input string.
  Indexes start at 1 and delimiters are a string formatted list of chars.
2014-11-25 14:48:39 +01:00
Emeric Brun
f399b0debf MINOR: samples: adds the field converter.
field(<index>,<delimiters>)
  Extracts the substring at the given index considering given delimiters from
  an input string. Indexes start at 1 and delimiters are a string formatted
  list of chars.
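
A rough Python model of field(), shown next to the closely related word() converter. The handling of empty items is an assumption made for illustration (fields may be empty, empty words are skipped); the text above does not spell it out.

```python
def split_on(s, delims):
    """Split s on any single character found in delims."""
    parts, cur = [], []
    for ch in s:
        if ch in delims:
            parts.append("".join(cur))
            cur = []
        else:
            cur.append(ch)
    parts.append("".join(cur))
    return parts

def field(s, index, delims):
    parts = split_on(s, delims)          # assumption: empty fields are kept
    return parts[index - 1] if 1 <= index <= len(parts) else None

def word(s, index, delims):
    parts = [p for p in split_on(s, delims) if p]  # assumption: empties skipped
    return parts[index - 1] if 1 <= index <= len(parts) else None

print(repr(field("a,,b", 2, ",")), repr(word("a,,b", 2, ",")))
```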
2014-11-24 17:44:02 +01:00
Emeric Brun
54c4ac8417 MINOR: samples: adds the bytes converter.
bytes(<offset>[,<length>])
  Extracts some bytes from an input binary sample. The result is a
  binary sample starting at an offset (in bytes) of the original sample
  and optionally truncated at the given length.
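
In Python terms this is a plain slice. A minimal sketch, with an illustrative function name and no claim about how out-of-range offsets are handled by the converter:

```python
def bytes_conv(data: bytes, offset: int, length=None) -> bytes:
    # Start at `offset`; truncate to `length` bytes only when it is given.
    if length is None:
        return data[offset:]
    return data[offset:offset + length]

print(bytes_conv(b"GET / HTTP/1.0", 4, 1))
```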
2014-11-24 17:44:02 +01:00
Willy Tarreau
0f30d26dbf MINOR: sample: add a few basic internal fetches (nbproc, proc, stopping)
Sometimes, either for debugging or for logging we'd like to have a bit
of information about the running process. Here are 3 new fetches for this :

nbproc : integer
  Returns an integer value corresponding to the number of processes that were
  started (it equals the global "nbproc" setting). This is useful for logging
  and debugging purposes.

proc : integer
  Returns an integer value corresponding to the position of the process calling
  the function, between 1 and global.nbproc. This is useful for logging and
  debugging purposes.

stopping : boolean
  Returns TRUE if the process calling the function is currently stopping. This
  can be useful for logging, or for relaxing certain checks or helping close
  certain connections upon graceful shutdown.
2014-11-24 17:44:02 +01:00
Emeric Brun
4b9e80268e BUG/MINOR: samples: fix unnecessary memcopy converting binary to string.
2014-11-24 17:44:02 +01:00
Thierry FOURNIER
317e1c4f1e MINOR: sample: add "json" converter
This converter escapes a string so that it can be used as a json/ascii
escaped string. It can read UTF-8 with different behaviors on errors and
encode it in json/ascii.

json([<input-code>])
  Escapes the input string and produces an ASCII output string ready to use as a
  JSON string. The converter tries to decode the input string according to the
  <input-code> parameter. It can be "ascii", "utf8", "utf8s", "utf8p" or
  "utf8ps". The "ascii" decoder never fails. The "utf8" decoder detects 3 types
  of errors:
   - bad UTF-8 sequence (lone continuation byte, bad number of continuation
     bytes, ...)
   - invalid range (the decoded value is within a UTF-8 prohibited range),
   - code overlong (the value is encoded with more bytes than necessary).

  The UTF-8 JSON encoding can produce a "too long value" error when the UTF-8
  character is greater than 0xffff because the JSON string escape specification
  only authorizes 4 hex digits for the value encoding. The UTF-8 decoder exists
  in 4 variants designated by a combination of two suffix letters : "p" for
  "permissive" and "s" for "silently ignore". The behaviors of the decoders
  are :
   - "ascii"  : never fails ;
   - "utf8"   : fails on any detected errors ;
   - "utf8s"  : never fails, but removes characters corresponding to errors ;
   - "utf8p"  : accepts and fixes the overlong errors, but fails on any other
                error ;
   - "utf8ps" : never fails, accepts and fixes the overlong errors, but removes
                characters corresponding to the other errors.

  This converter is particularly useful for building properly escaped JSON for
  logging to servers which consume JSON-formatted traffic logs.

  Example:
     capture request header user-agent len 150
     capture request header Host len 15
     log-format {"ip":"%[src]","user-agent":"%[capture.req.hdr(1),json]"}

  Input request from client 127.0.0.1:
     GET / HTTP/1.0
     User-Agent: Very "Ugly" UA 1/2

  Output log:
     {"ip":"127.0.0.1","user-agent":"Very \"Ugly\" UA 1\/2"}
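
The escaping step of the example can be sketched in Python as follows. This only models the output side on an already-decoded string; the decoder variants ("utf8s", "utf8p", ...) are not modelled, and the choice to emit \uXXXX for control and non-ASCII characters is an assumption made for illustration.

```python
def json_escape(s: str) -> str:
    out = []
    for ch in s:
        code = ord(ch)
        if ch in ('"', '\\', '/'):
            out.append('\\' + ch)            # matches the escapes in the example
        elif code > 0xFFFF:
            # Only 4 hex digits are available: the "too long value" case.
            raise ValueError("too long value")
        elif code < 0x20 or code > 0x7E:
            out.append('\\u%04x' % code)     # assumption: \uXXXX escape form
        else:
            out.append(ch)
    return ''.join(out)

print(json_escape('Very "Ugly" UA 1/2'))
```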
2014-10-26 06:41:12 +01:00
Willy Tarreau
6bcb0a84e7 BUG/MAJOR: tcp: fix a possible busy spinning loop in content track-sc*
As a consequence of various recent changes on the sample conversion,
a corner case has emerged where it is possible to wait forever for a
sample in track-sc*.

The issue is caused by the fact that functions relying on sample_process()
don't all exactly work the same regarding the SMP_F_MAY_CHANGE flag and
the output result. Here it was possible to wait forever for an output
sample from stktable_fetch_key() without checking the SMP_OPT_FINAL flag.
As a result, if the client connects and closes without sending the data
and haproxy expects a sample which is capable of coming, it will ignore
this impossible case and will continue to wait.

This change adds control for SMP_OPT_FINAL before waiting for extra data.
The various relevant functions have been better documented regarding their
output values.

This fix must be backported to 1.5 since it appeared there.
2014-07-30 08:56:35 +02:00
Willy Tarreau
bbfd1a25ee MINOR: sample: allow integers to cast to binary
Doing so finally makes it possible to apply the hex converter to integers too.
Note that all integers are represented in 32-bit, big endian so that their
conversion remains human readable and portable. A later improvement to the
hex converter could be to make it trim leading zeroes, and/or to only report
a number of least significant bytes.
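
The representation described above can be illustrated with Python's struct module. This is a sketch only; the letter case of the hex digits is not specified here, so lower case is simply what Python produces.

```python
import struct

def int_to_bin(n: int) -> bytes:
    # 32-bit, big endian, as described above.
    return struct.pack(">I", n)

# Leading zeroes are kept, which is what the suggested trimming would address.
print(int_to_bin(1234).hex())
```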
2014-07-15 21:36:15 +02:00
Willy Tarreau
23ec4ca1bb MINOR: sample: add new converters to hash input
From time to time it's useful to hash input data (scramble input, or
reduce the space needed in a stick table). This patch provides 3 simple
converters allowing use of the available hash functions to hash input
data. The output is an unsigned integer which can be passed into a header,
a log or used as an index for a stick table. One nice usage is to scramble
source IP addresses before logging when there are requirements to hide them.
2014-07-15 21:36:15 +02:00
Willy Tarreau
9700e5c914 MINOR: sample: allow IP address to cast to binary
IP addresses are a perfect example of fixed size data which we could
cast to binary, still it was not allowed by lack of cast function,
even though the opposite was allowed in ACLs. Make that possible both
in sample expressions and in stick tables.
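
For illustration, the fixed-size binary form of an address can be obtained in Python with the ipaddress module. This is only an analogy for the cast, not HAProxy's cast function.

```python
import ipaddress

def ip_to_bin(text: str) -> bytes:
    # 4 bytes for IPv4, 16 bytes for IPv6, in network byte order.
    return ipaddress.ip_address(text).packed

print(ip_to_bin("192.168.0.1").hex(), len(ip_to_bin("::1")))
```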
2014-07-15 21:36:15 +02:00
Willy Tarreau
0dbfdbaef1 MINOR: samples: add two converters for the date format
This patch adds two converters :

   ltime(<format>[,<offset>])
   utime(<format>[,<offset>])

Both use strftime() to emit the output string from an input date. ltime()
provides local time, while utime() provides the UTC time.
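
A Python equivalent of the two converters, as a sketch: it assumes the input date is an epoch in seconds and that <offset> is a number of seconds added before formatting.

```python
import time

def utime(date: int, fmt: str, offset: int = 0) -> str:
    # UTC time, formatted with strftime().
    return time.strftime(fmt, time.gmtime(date + offset))

def ltime(date: int, fmt: str, offset: int = 0) -> str:
    # Local time of the machine running the code.
    return time.strftime(fmt, time.localtime(date + offset))

print(utime(0, "%Y-%m-%d %H:%M:%S"))
```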
2014-07-10 16:43:44 +02:00
Willy Tarreau
6c616e0b96 BUG/MAJOR: sample: correctly reinitialize sample fetch context before calling sample_process()
We used to only clear flags when reusing the static sample before calling
sample_process(), but that's not enough because there's a context in samples
that can be used by some fetch functions such as auth, headers and cookies,
and not reinitializing it risks that a pointer of a different type is used
in the wrong context.

An example configuration which triggers the case consists in mixing hdr()
and http_auth_group() which both make use of contexts :

     http-request add-header foo2 %[hdr(host)],%[http_auth_group(foo)]

The solution is simple, initialize all the sample and not just the flags.
This fix must be backported into 1.5 since it was introduced in 1.5-dev19.
2014-06-25 17:12:08 +02:00
Willy Tarreau
3a4ac422ce MINOR: tcp: prepare support for the "capture" action
A few minor entries will be needed to capture sample fetches in requests
or responses. This patch just prepares the code for this.
2014-06-13 16:32:48 +02:00
Willy Tarreau
5b4bf70a95 MINOR: sample: improve sample_fetch_string() to report partial contents
Currently, all callers to sample_fetch_string() call it with SMP_OPT_FINAL.
Now we improve it to support the case where this option is not set, and to
make it return the original sample as-is. The purpose is to let the caller
check the SMP_F_MAY_CHANGE flag in the result and know that it should wait
to get complete contents. Currently this has no effect on existing code.
2014-06-13 16:32:48 +02:00
Emeric Brun
53d1a98270 MINOR: ssl: adds sample converter base64 for binary type.
The new converter encodes a binary type sample into a base64 string.

e.g. : ssl_c_serial,base64
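
The same transformation in Python, for reference. The input bytes are made up for the example and do not come from a real certificate serial.

```python
import base64

def to_base64(data: bytes) -> str:
    return base64.b64encode(data).decode("ascii")

print(to_base64(b"\x01\x02\x03"))
```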
2014-04-30 22:31:11 +02:00
Thierry FOURNIER
eeaa951726 MINOR: configuration: File and line propagation
This patch makes it possible to pass the file and line of the
configuration file to the configuration parser.
2014-03-17 18:06:08 +01:00
Thierry FOURNIER
d437314979 MEDIUM: sample/http_proto: Add new type called method
Methods are actually stored using two types: an integer if the method is
known and a string if the method is not known. The fetch is declared as
UINT, but in some cases it can provide STR.

This patch creates a new type called METH. This type contains an integer
for known methods and a string for the other methods. It can be used with
automatic converters.

The pattern matching can expect a method.

During the free or prune function, the http_meth pattern is freed. This
patch initialises the freed pointer to NULL.
2014-03-17 18:06:07 +01:00
Thierry FOURNIER
7654c9ff44 MEDIUM: sample: Remove types SMP_T_CSTR and SMP_T_CBIN, replace it by SMP_F_CONST flags
The operations applied on types SMP_T_CSTR and SMP_T_STR are the same,
but the check code and the declarations are duplicated, because actions
must be declared for both SMP_T_C* and SMP_T_*. The declared actions and
checks are the same, which complicates the code. Only the "conv" functions
can change from "C*" to "*".

Now, if a function needs to modify the input string, it can call the new
function smp_dup(), which duplicates the data into a trash buffer.
2014-03-17 18:06:07 +01:00
Thierry FOURNIER
0e9af55700 MINOR: sample: dont call the sample cast function "c_none"
If the cast function to execute is c_none, don't execute it and return
true. The c_none function does nothing, so this saves a call.
2014-03-17 18:06:07 +01:00
Willy Tarreau
1cf8f08c17 MINOR: sample: move smp_to_type to sample.c
This way it can be exported and reused anywhere else to report type names.
2014-03-17 18:06:06 +01:00
Thierry FOURNIER
e87cac16cc MEDIUM: sample: change the behavior of the bin2str cast
The bin2str cast gives the hexadecimal representation of the binary
content when it is used as string. This was inherited from the
stick-table casts without realizing that it was a mistake. Indeed,
it breaks string processing on binary contents, preventing any _reg,
_beg, etc from working.

For example, with an HTTP GET request, the fetch "req.payload(0,3)"
returns the 3 bytes "G", "E", and "T" in binary. If this fetch is
used with regex, it is automatically converted to "474554" and the
regex is applied on this string, so it never matches.

This commit changes the cast so that bin2str does not convert the
contents anymore, and returns a string type. The contents can thus
be matched as is, and the NULL character continues to mark the end
of the string to avoid any issue with some string-based functions.

This commit could almost have been marked as a bug fix since it
does what the doc says.

Note that in case someone would rely on the hex encoding, then the
same behaviour could be achieved by appending ",hex" after the sample
fetch function (brought by previous patch).
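
The difference between the two behaviours can be reproduced in Python with the "GET" example above. The function names are illustrative; the second one models the NUL byte still marking the end of the string.

```python
import re

def old_bin2str(data: bytes) -> str:
    # Former behaviour: hexadecimal representation of the binary content.
    return data.hex().upper()

def new_bin2str(data: bytes) -> str:
    # New behaviour: contents returned as-is, stopping at the first NUL byte.
    return data.split(b"\x00", 1)[0].decode("latin-1")

payload = b"GET"
print(old_bin2str(payload), new_bin2str(payload))
print(bool(re.match("^GET", old_bin2str(payload))),
      bool(re.match("^GET", new_bin2str(payload))))
```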
2014-03-17 17:31:46 +01:00
Thierry FOURNIER
2f49d6d17b MINOR: sample: add hex converter
This new filter converts BIN type to its hexadecimal
representation in STR type. It is used to keep the
compatibility with the original bin2str cast.

It will be useful when bin2str changes to copy the
string as-is without encoding anymore.
2014-03-17 16:39:18 +01:00
Willy Tarreau
84310e2e73 MINOR: sample: add a rand() sample fetch to return a sample.
Sometimes it can be useful to generate a random value, at least
for debugging purposes, but also to take routing decisions or to
pass such a value to a backend server.
2014-02-14 11:59:04 +01:00
Thierry FOURNIER
60bb020d70 BUG/MINOR: sample: The c_str2int converter does not fail if the entry is not an integer
If the string does not start with a number, the converter fails. Otherwise,
it converts as many characters as possible to a number and stops at the
first character that does not match a number.
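
A small Python model of that rule, assuming unsigned decimal digits only; returning None stands in for the converter failing.

```python
def str2int(s: str):
    i = 0
    while i < len(s) and s[i] in "0123456789":
        i += 1                 # consume as many leading digits as possible
    if i == 0:
        return None            # does not start with a number: fail
    return int(s[:i])          # stop at the first non-digit character

print(str2int("123abc"), str2int("abc"))
```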
2014-01-27 19:04:18 +01:00
Willy Tarreau
689a1df0a1 BUG/MEDIUM: sample: simplify and fix the argument parsing
Some errors may be reported about missing mandatory arguments when some
sample fetch arguments are marked as mandatory and implicit (eg: proxy
names such as in table_cnt or be_conn).

In practice the argument parser already handles all the situations very
well, it's just that the sample fetch parser wants to go beyond its role
and starts some controls that it should not do. Simply removing these
useless controls lets make_arg_list() create the correct argument types
when such types are encountered.

This regression was introduced by the recent use of sample_parse_expr()
in ACLs which makes use of its own argument parser, while previously
the arguments were parsed in the ACL function itself. No backport is
needed.
2013-12-13 01:33:33 +01:00
Willy Tarreau
975c1784c8 MINOR: sample: make sample_parse_expr() use memprintf() to report parse errors
Doing so ensures that we're consistent between all the functions in the whole
chain. This is important so that we can extract the argument parsing from this
function.
2013-12-12 23:16:54 +01:00
Thierry FOURNIER
fd1399091e BUG/MEDIUM: sample: conversion from str to ipv6 may read data past end
Applying inet_pton() to input contents is not reliable because the
function requires a zero-terminated string. While inet_pton() will
stop when contents do not match an IPv6 address anymore, it could
theoretically read past the end of a buffer if the data to be converted
was at the end of a buffer (this cannot happen right now thanks to
the reserve at the end of the buffer). At least the conversion does
not work.

Fix this by using buf2ip6() instead, which copies the string into a
padded area.

This bug came with recent commit b805f71 (MEDIUM: sample: let the
cast functions set their output type), no backport is needed.
2013-12-11 22:03:00 +01:00
Willy Tarreau
3d536ac378 BUG/MINOR: acl: fix sample expression error reporting
ACL parse errors are not easy to understand since recent commit 348971e
(MEDIUM: acl: use the fetch syntax 'fetch(args),conv(),conv()' into the
ACL keyword) :

[ALERT] 339/154717 (26437) : parsing [check-bug.cfg:10] : error detected while parsing a 'stats admin' rule : unknown ACL or sample keyword 'env(a,b,c)': invalid arg 2 in fetch method 'env' : end of arguments expected at position 2, but got ',b,c'..

This error is only relevant to sample fetch keywords, so the new form is
a bit easier to understand :

[ALERT] 339/160011 (26626) : parsing [check-bug.cfg:12] : error detected while parsing a 'stats admin' rule : invalid arg 2 in fetch method 'env' : end of arguments expected at position 2, but got ',b,c' in sample expression 'env(a,b,c),upper'.

No backport is needed.
2013-12-06 16:02:46 +01:00
Willy Tarreau
67ff7e0af3 BUG/MEDIUM: acl: fix regression introduced by latest converters support
Since commit 348971e (MEDIUM: acl: use the fetch syntax
'fetch(args),conv(),conv()' into the ACL keyword), ACLs wait on input
that may change. This is visible in the configuration below :

        tcp-request inspect-delay 3s
        tcp-request content accept if REQ_CONTENT

Nothing will pass before the end of the timer. This is because
historically, sample_process() was dedicated to stick tables where
it was absolutely necessary to wait for a stable sample. Now samples
are used by many other things and we can't afford this. So let's move
this check to the stick tables after the call to sample_process()
instead.

This is post-1.5-dev19 work, no backport is required.
2013-12-05 02:23:13 +01:00
Thierry FOURNIER
d18cd0f110 MEDIUM: http: The redirect strings follows the log format rules.
We handle "http-request redirect" with a log-format string now, but we
leave "redirect" unaffected.

Note that the control of the special "/" case is moved from the runtime
execution to the configuration parsing. If the format rule list is
empty, the build_logline() function does nothing.
2013-12-02 23:31:33 +01:00
Thierry FOURNIER
b805f71d1b MEDIUM: sample: let the cast functions set their output type
This patch allows each sample cast function to specify the sample
output type. The goal is to be able to emit an output type IPv4 or
IPv6 depending on what is found in the input if the next converter
is able to process them both.

The patch also adds a new pseudo type called "ADDR". This type is an
alias for IPV4 and IPV6 which is only used as an input type by converters
who want to express their compatibility with both address formats. It may
not be emitted.

The goal is to unify as much as possible the processing of IPv4 and IPv6
in order not to add extra keywords for the maps which act as converters,
but will match samples like ACLs do with their patterns.
2013-12-02 23:31:33 +01:00
Thierry FOURNIER
9c1d67ecbd MINOR: sample: provide the original sample_conv descriptor struct to the argument checker function.
Note that this argument checker is still unused but will be used by
maps.
2013-12-02 23:31:32 +01:00
Thierry FOURNIER
348971ea28 MEDIUM: acl: use the fetch syntax 'fetch(args),conv(),conv()' into the ACL keyword
If the acl keyword is a "fetch", the dedicated parsing function
"sample_parse_expr()" is used. Otherwise, the acl parsing function
"parse_acl_expr()" is extended to understand the syntax of a series
of converters placed after the "fetch" keyword.

Before this patch, each acl uses a "struct sample_fetch" and executes
it with the "<fetch>->process()" function. Now, the dedicated function
"sample_process()" is called.

These syntaxes are now available:

   acl bad req.hdr(host),lower -m str www
   http-request redirect prefix /go-away if bad

   acl bad hdr_beg(host),lower www
   http-request redirect prefix /go-away if bad
2013-12-02 23:31:32 +01:00
Thierry FOURNIER
8af6ff12b5 MINOR: sample: export sample_casts
just export the sample cast matrix "sample_casts" to prepare the
generic sample conversion parser.
2013-12-02 23:31:32 +01:00
Thierry FOURNIER
1c0054fe83 BUG/MINOR: arg: fix error reporting for add-header/set-header sample fetch arguments
The 'add-header %[samples]' parsing errors associated to http-request
and http-response are displayed with the wrong keyword.

Configuration entry:

   http-request set-header mon-header %[res.hdr(user-agent)]

Original error message:

   [WARNING] 323/150920 (16559) : parsing [haproxy.conf:36] : 'log-format' : sample fetch <res.hdr ...

After commit error message:

   [WARNING] 323/150929 (16580) : parsing [haproxy.conf:36] : 'http-request' : sample fetch <res.hdr ...
2013-11-28 18:25:18 +01:00
Willy Tarreau
ef38c39287 MEDIUM: sample: systematically pass the keyword pointer to the keyword
We're having a lot of duplicate code just because of minor variants between
fetch functions that could be dealt with if the functions had the pointer to
the original keyword, so let's pass it as the last argument. An earlier
version used to pass a pointer to the sample_fetch element, but this is not
the best solution for two reasons :
  - fetch functions will solely rely on the keyword string
  - some other smp_fetch_* users do not have the pointer to the original
    keyword and were forced to pass NULL.

So finally we're passing a pointer to the keyword as a const char *, which
perfectly fits the original purpose.
2013-08-01 21:17:13 +02:00
Willy Tarreau
6236d3abe4 MINOR: sample: add a new "date" fetch to return the current date
Returns the current date as the epoch (number of seconds since 01/01/1970).
If an offset value is specified, then it is a number of seconds that is added
to the current date before returning the value. This is particularly useful
to compute relative dates, as both positive and negative offsets are allowed.
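
In Python terms, the fetch behaves roughly like the sketch below. The `now` parameter is hypothetical and only exists so the example can be checked with a fixed clock.

```python
import time

def date(offset: int = 0, now=None) -> int:
    # Epoch in seconds, plus an optional positive or negative offset.
    current = int(time.time()) if now is None else now
    return current + offset

print(date(-60, now=1000))
```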
2013-07-25 15:00:37 +02:00
Willy Tarreau
5b8ad22228 CLEANUP: acl: move the 3 remaining sample fetches to samples.c
There is no more reason for having "always_true", "always_false" and "env"
in acl.c while they're the most basic sample fetch keywords, so let's move
them to sample.c where it's easier to find them.
2013-07-25 15:00:37 +02:00
Willy Tarreau
18387e2e48 MINOR: sample: fix sample_process handling of unstable data
sample_process() used to return NULL on changing data, regardless of the
SMP_OPT_FINAL flag. Let's change this so that it is now possible to
include such data in logs or HTTP headers. Also, one inconvenient
thing was that it used to always set the sample flags to zero, making
it incompatible with ACLs which may need to call it multiple times. Only
do this for locally-allocated samples.
2013-07-25 15:00:37 +02:00
Willy Tarreau
833cc79434 MEDIUM: sample: handle comma-delimited converter list
We now support having a comma-delimited converter list, which can start
right after the fetch keyword. The immediate benefit is that it allows
to use converters in log-format expressions, for example :

   set-header source-net %[src,ipmask(24)]

The parser is also slightly improved and should be more resilient against
configuration errors. Also, optional arguments in converters were mistakenly
not allowed till now, so this was fixed.
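
To illustrate the syntax, here is a toy Python parser that splits an expression such as "src,ipmask(24)" into the fetch keyword followed by its converters. It is a sketch of the grammar only and performs none of the validation done by the real parser.

```python
def parse_sample_expr(expr: str):
    """Return [(name, [args...]), ...]: the fetch first, then the converters."""
    parts, cur, depth = [], [], 0
    for ch in expr:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:     # commas inside (...) separate arguments
            parts.append("".join(cur))
            cur = []
        else:
            cur.append(ch)
    parts.append("".join(cur))

    result = []
    for part in parts:
        name, paren, rest = part.partition("(")
        inner = rest[:-1] if paren else ""   # drop the closing parenthesis
        result.append((name, inner.split(",") if inner else []))
    return result

print(parse_sample_expr("src,ipmask(24)"))
```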
2013-07-25 15:00:37 +02:00
Willy Tarreau
dc13c11c1e BUG/MEDIUM: prevent gcc from moving empty keywords lists into BSS
Benoit Dolez reported a failure to start haproxy 1.5-dev19. The
process would immediately report an internal error with missing
fetches from some crap instead of ACL names.

The cause is that some versions of gcc seem to trim static structs
containing a variable array when moving them to BSS, and only keep
the fixed size, which is just a list head for all ACL and sample
fetch keywords. This was confirmed at least with gcc 3.4.6. And we
can't move these structs to const because they contain a list element
which is needed to link all of them together during the parsing.

The bug indeed appeared with 1.5-dev19 because it's the first one
to have some empty ACL keyword lists.

One solution is to impose -fno-zero-initialized-in-bss to everyone
but this is not really nice. Another solution consists in ensuring
the struct is never empty so that it does not move there. The easy
solution consists in having a non-null list head since it's not yet
initialized.

A new "ILH" list head type was thus created for this purpose : create
an Initialized List Head so that gcc cannot move the struct to BSS.
This fixes the issue for this version of gcc and does not create any
burden for the declarations.
2013-06-21 23:29:02 +02:00
Willy Tarreau
a4312fa28e MAJOR: sample: maintain a per-proxy list of the fetch args to resolve
While ACL args were resolved after all the config was parsed, it was not the
case with sample fetch args because they're almost everywhere now.

The issue is that ACLs now solely rely on sample fetches, so their args
resolving doesn't work anymore. And many fetches involving a server, a
proxy or a userlist don't work at all.

The real issue is that at the bottom layers we have no information about
proxies, line numbers, even ACLs in order to report understandable errors,
and that at the top layers we have no visibility over the locations where
fetches are referenced (think log node).

After multiple failed attempts at unsatisfying solutions, we now have a new
concept of args list. The principle is that every proxy has a list head
which contains a number of indications such as the config keyword, the
context where it's used, the file and line number, etc... and a list of
arguments. This list head is of the same type as the elements, so it
serves as a template for adding new elements. This way, it is filled from
top to bottom by the callers with the information they have (eg: line
numbers, ACL name, ...) and the lower layers just have to duplicate it and
add an element when they face an argument they cannot resolve yet.

Then at the end of the configuration parsing, a loop passes over each
proxy's list and resolves all the args in sequence. And this way there is
all necessary information to report verbose errors.

The first immediate benefit is that for the first time we got very precise
location of issues (arg number in a keyword in its context, ...). Second,
in order to do this we had to parse log-format and unique-id-format a bit
earlier, so that was a great opportunity for doing so when the directives
are encountered (unless it's a default section). This way, the recorded
line numbers for these args are the ones of the place where the log format
is declared, not the end of the file.

Userlists report slightly more information now. They're the only remaining
ones in the ACL resolving function.
2013-04-03 02:13:02 +02:00
Willy Tarreau
bf8e251077 MINOR: sample: provide a function to report the name of a sample check point
We need to put names on places where samples are used in order to emit warnings
and errors. Let's do that now.
2013-04-03 02:13:00 +02:00
Willy Tarreau
80aca90ad2 MEDIUM: samples: use new flags to describe compatibility between fetches and their usages
Samples fetches were relying on two flags SMP_CAP_REQ/SMP_CAP_RES to describe
whether they were compatible with requests rules or with response rules. This
was never reliable because we need a finer granularity (eg: an HTTP request
method needs to parse an HTTP request, and is available past this point).

Some fetches are also dependent on the context (eg: "hdr" uses request or
response depending where it's involved, causing some ambiguity).

In order to solve this, we need to precisely indicate in fetches what they
use, and their users will have to compare with what they have.

So now we have a bunch of bits indicating where the sample is fetched in the
processing chain, with a few variants indicating for some of them if it is
permanent or volatile (eg: an HTTP status is stored into the transaction so
it is permanent, despite being caught in the response contents).

The fetches also have a second mask indicating their validity domain. This one
is computed from a conversion table at registration time, so there is no need
for doing it by hand. This validity domain consists in a bitmask with one bit
set for each usage point in the processing chain. Some provisions were made
for upcoming controls such as connection-based TCP rules which apply on top of
the connection layer but before instantiating the session.

Then everywhere a fetch is used, the bit for the control point is checked in
the fetch's validity domain, which makes it possible to check precisely
whether a fetch will work at that point.

Note that we need these two separate bitfields because some fetches are usable
both in request and response (eg: "hdr", "payload"). So the keyword will have
a "use" field made of a combination of several SMP_USE_* values, which will be
converted into a wider list of SMP_VAL_* flags.

The knowledge of permanent vs dynamic information has disappeared for now, as
it was never used. Later we'll probably reintroduce it differently when
dealing with variables. Its only use at the moment could have been to avoid
caching a dynamic rate measurement, but nothing is cached as of now.
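
The two-mask scheme described above can be sketched as follows. This is only an illustration under stated assumptions: every EX_* name, the bit values, and the contents of the conversion table are hypothetical and are not the real SMP_USE_* / SMP_VAL_* definitions. It assumes the validity domain is the union of the table entries for each "use" bit that is set.

```c
#include <stdint.h>

/* Hypothetical "use" bits: where a fetch reads its data from. */
enum {
    EX_USE_L4CLI = 1u << 0, /* client connection information */
    EX_USE_HRQHV = 1u << 1, /* HTTP request headers */
    EX_USE_HRSHV = 1u << 2  /* HTTP response headers */
};
#define EX_USE_COUNT 3

/* Hypothetical "val" bits: one bit per control point in the chain. */
enum {
    EX_VAL_TCP_REQ  = 1u << 0, /* TCP request rules */
    EX_VAL_HTTP_REQ = 1u << 1, /* HTTP request rules */
    EX_VAL_HTTP_RES = 1u << 2  /* HTTP response rules */
};

/* Conversion table (assumed contents): for each "use" bit, the set of
 * control points at which that source of data is available. */
static const uint32_t ex_use_to_val[EX_USE_COUNT] = {
    EX_VAL_TCP_REQ | EX_VAL_HTTP_REQ | EX_VAL_HTTP_RES, /* L4CLI */
    EX_VAL_HTTP_REQ | EX_VAL_HTTP_RES,                  /* HRQHV */
    EX_VAL_HTTP_RES                                     /* HRSHV */
};

/* Computed once at registration time: OR together the validity
 * domains of every "use" bit set on the keyword. */
uint32_t ex_compute_val(uint32_t use)
{
    uint32_t val = 0;
    int bit;

    for (bit = 0; bit < EX_USE_COUNT; bit++)
        if (use & (1u << bit))
            val |= ex_use_to_val[bit];
    return val;
}

/* Checked wherever a fetch is used: is the control point's bit set
 * in the fetch's validity domain? */
int ex_fetch_usable_at(uint32_t val, uint32_t control_point)
{
    return (val & control_point) != 0;
}
```

With this table, a keyword declared with EX_USE_HRQHV | EX_USE_HRSHV (like "hdr") is accepted in both HTTP request and HTTP response rules, while a response-only keyword is rejected in request rules.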
2013-04-03 02:12:56 +02:00
Willy Tarreau
47ca54505c MINOR: chunks: centralize the trash chunk allocation
At the moment, we need trash chunks almost everywhere and the only
correctly implemented one is in the sample code. Let's move this to
the chunks so that all other places can use this allocator.

Additionally, the get_trash_chunk() function now really returns two
different chunks. Previously it used to always overwrite the same
chunk and point it to a different buffer, which was a bit tricky
because it's not obvious that two consecutive results do alias each
other.
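
A minimal sketch of an allocator that alternates between two distinct chunks, as described above. The structure layout, the EX_* names and the fixed buffer size are assumptions made for illustration, not the actual get_trash_chunk() code.

```c
#include <stddef.h>

#define EX_TRASH_SIZE 64 /* hypothetical fixed size for the example */

/* Hypothetical chunk layout: a buffer pointer, its size, bytes used. */
struct ex_chunk {
    char *str;
    size_t size;
    size_t len;
};

static char ex_buf1[EX_TRASH_SIZE];
static char ex_buf2[EX_TRASH_SIZE];
static struct ex_chunk ex_chunk1 = { ex_buf1, EX_TRASH_SIZE, 0 };
static struct ex_chunk ex_chunk2 = { ex_buf2, EX_TRASH_SIZE, 0 };

/* Returns one of two distinct chunks, alternating on each call, so
 * that two consecutive results never alias each other. */
struct ex_chunk *ex_get_trash_chunk(void)
{
    static int toggle;
    struct ex_chunk *chunk = toggle ? &ex_chunk2 : &ex_chunk1;

    toggle = !toggle;
    chunk->len = 0; /* hand out an empty chunk */
    return chunk;
}
```

Note that a third call hands back the first chunk again, so a caller may hold at most two results at a time.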
2012-12-23 21:46:07 +01:00
Willy Tarreau
e7ad4bb2f0 MINOR: samples: add a function to fetch and convert any sample to a string
Any sample type can now easily be converted to a string that can be used
anywhere. This will be used for logging and passing information in headers.
2012-12-21 17:57:24 +01:00
Willy Tarreau
d167e6d9fb MINOR: sample: support cast from bool to string
Samples could be converted from bool to int and from int to string but
not from bool to string. Let's add this.
2012-12-21 17:57:05 +01:00
Willy Tarreau
7e2c647ee7 MEDIUM: remove remains of BUFSIZE in HTTP auth and sample conversions
Sample conversions rely on two alternative buffers which were previously
allocated as static bufs of size BUFSIZE. Now they're initialized to the
global buffer size. It was the same for HTTP authentication. Note that it
seems that none of them was prone to any mistake when dealing with the
buffer size, but better stay on the safe side by maintaining the old
assumption that a trash buffer is always "large enough".
2012-10-29 20:44:36 +01:00
Emeric Brun
a068a2951d MINOR: sample: export 'sample_get_trash_chunk(void)'
This will be used on external fetch modules.
2012-10-22 18:54:24 +02:00
Emeric Brun
8ac33d99f2 MINOR: sample: manage binary to string type conversion in stick-table and samples.
Binary type is converted to a null-terminated hexadecimal string.
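
As an illustration of this kind of conversion, here is a sketch of a binary to hexadecimal string helper. The function name, its signature and the use of uppercase digits are assumptions; the commit itself does not specify them.

```c
#include <stddef.h>

/* Writes the hexadecimal form of <in> into <out>, followed by a
 * terminating NUL. Returns 0 on success, or -1 if <out> is too small
 * to hold 2 * in_len characters plus the terminator. */
int ex_bin2hex(const unsigned char *in, size_t in_len,
               char *out, size_t out_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t i;

    if (out_size == 0 || in_len > (out_size - 1) / 2)
        return -1;

    for (i = 0; i < in_len; i++) {
        out[2 * i]     = hex[in[i] >> 4];   /* high nibble */
        out[2 * i + 1] = hex[in[i] & 0x0F]; /* low nibble */
    }
    out[2 * in_len] = '\0';
    return 0;
}
```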
2012-10-22 18:54:15 +02:00
Willy Tarreau
2e845be249 MEDIUM: sample: pass an empty list instead of a null for fetch args
ACL and sample fetches use argument lists and it is really not convenient to
check for null args everywhere. Now for empty args we pass a constant
list of end of lists. It will allow us to remove many useless checks.
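
The idea can be sketched like this, assuming an argument list terminated by a "stop" entry. All EX_* names, the structure layout and the maximum argument count are hypothetical.

```c
#include <stddef.h>

#define EX_MAX_ARGS 8 /* hypothetical upper bound on arguments */

enum ex_arg_type { EX_ARGT_STOP = 0, EX_ARGT_INT, EX_ARGT_STR };

struct ex_arg {
    enum ex_arg_type type;
    long value;
};

/* Constant list made only of end-of-list markers (zero-initialized,
 * and EX_ARGT_STOP is 0). Any index below EX_MAX_ARGS can be read
 * safely and always reports "no argument here". */
static const struct ex_arg ex_empty_args[EX_MAX_ARGS];

/* Done once by the caller, so that fetch functions never see NULL. */
const struct ex_arg *ex_normalize_args(const struct ex_arg *args)
{
    return args ? args : ex_empty_args;
}

/* Example of a consumer that no longer needs a NULL check. */
size_t ex_count_args(const struct ex_arg *args)
{
    size_t n = 0;

    while (n < EX_MAX_ARGS && args[n].type != EX_ARGT_STOP)
        n++;
    return n;
}
```

Filling the whole list with stop markers, rather than using a single one, lets a fetch test a specific position such as args[1].type directly without walking the list first.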
2012-10-19 19:49:09 +02:00
Willy Tarreau
f22a50836d MINOR: sample: accept fetch keywords without parentheses
Fetch keywords which support arguments do not support being called
without parentheses even if all arguments are optional. Let's fix
this to allow fetch keywords without parentheses as is already done
in ACLs.
2012-10-19 16:47:23 +02:00
Willy Tarreau
dd2f85eb3b CLEANUP: includes: fix includes for a number of users of fd.h
It appears that fd.h includes a number of unneeded files and was
included from standard.h, and as such served as an intermediary
to provide almost everything to everyone.

By removing its useless includes, a long dependency chain broke
but could easily be fixed.
2012-09-03 20:49:14 +02:00
Willy Tarreau
c7e4238df0 REORG: buffers: split buffers into chunk,buffer,channel
Many parts of the channel definition still make use of the "buffer" word.
2012-09-03 20:47:32 +02:00
Willy Tarreau
cd3b094618 REORG: rename "pattern" files
They're now called "sample" everywhere to match their description.
2012-05-08 20:57:21 +02:00