Patrick Hemmer reported that http-request unset-var(foo) if ... fails to
parse. The reason is that it reuses the same parser as "set-var(foo)" which
makes a special case of the arguments, supposed to be a sample expression
for set-var, but which must not exist for unset-var. Unfortunately the
parser finds "if" or "unless" and believes it's an expression. Let's simply
drop the test so that the outer rule parser deals with potential extraneous
keywords.
This should be backported to all versions supporting unset-var().
Patrick Hemmer reported that a simple tcp rule involving a variable like
this is enough to crash haproxy :
frontend foo
bind :8001
tcp-request session set-var(txn.foo) src
The tests on the variables scopes is not strict enough, it needs to always
verify if the stream is valid when accessing a req/res/txn variable. This
patch does this by adding a new get_vars() function which does the job
instead of open-coding all the lookups everywhere.
It must be backported to all versions supporting set-var and
"tcp-request session" so at least 1.9 and 1.8.
vars_check_arg previously leaked the string containing the variable
name:
Consider this config:
frontend fe1
mode http
bind :8080
http-request set-header X %[var(txn.host)]
Starting HAProxy and immediately stopping it by sending a SIGINT makes
Valgrind report this leak:
==7795== 9 bytes in 1 blocks are definitely lost in loss record 15 of 71
==7795== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==7795== by 0x4AA2AD: my_strndup (standard.c:2227)
==7795== by 0x51FCC5: make_arg_list (arg.c:146)
==7795== by 0x4CF095: sample_parse_expr (sample.c:897)
==7795== by 0x4BA7D7: add_sample_to_logformat_list (log.c:495)
==7795== by 0x4BBB62: parse_logformat_string (log.c:688)
==7795== by 0x4E70A9: parse_http_req_cond (http_rules.c:239)
==7795== by 0x41CD7B: cfg_parse_listen (cfgparse-listen.c:1466)
==7795== by 0x480383: readcfgfile (cfgparse.c:2089)
==7795== by 0x47A081: init (haproxy.c:1581)
==7795== by 0x4049F2: main (haproxy.c:2591)
This leak can be detected even in HAProxy 1.6, this patch thus should
be backported to all supported branches
[Cf: This fix was reverted because the chunk's area was inconditionnaly
released, making haproxy to crash when spoe was enabled. Now the chunk is
released by calling chunk_destroy(). This function takes care of the
chunk's size to release it or not. It is the responsibility of callers to
set or not the chunk's size.]
This reverts commit 6ea00195c4.
As found by Christopher, this fix is not correct due to the way args
are built at various places. For example some config or runtime parsers
will place a substring pointer there, and calling free() on it will
immediately crash the program. A quick audit of the code shows that
there are not that many users, but the way it's done requires to
properly set the string as a regular chunk (size=0 if free not desired,
then call chunk_destroy() at release time), and given that the size is
currently set to len+1 in all parsers, a deeper audit needs to be done
to figure the impacts of not setting it anymore.
Thus for now better leave this harmless leak which impacts only the
config parsing time.
This fix must be backported to all branches containing the fix above.
vars_check_arg previously leaked the string containing the variable
name:
Consider this config:
frontend fe1
mode http
bind :8080
http-request set-header X %[var(txn.host)]
Starting HAProxy and immediately stopping it by sending a SIGINT makes
Valgrind report this leak:
==7795== 9 bytes in 1 blocks are definitely lost in loss record 15 of 71
==7795== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==7795== by 0x4AA2AD: my_strndup (standard.c:2227)
==7795== by 0x51FCC5: make_arg_list (arg.c:146)
==7795== by 0x4CF095: sample_parse_expr (sample.c:897)
==7795== by 0x4BA7D7: add_sample_to_logformat_list (log.c:495)
==7795== by 0x4BBB62: parse_logformat_string (log.c:688)
==7795== by 0x4E70A9: parse_http_req_cond (http_rules.c:239)
==7795== by 0x41CD7B: cfg_parse_listen (cfgparse-listen.c:1466)
==7795== by 0x480383: readcfgfile (cfgparse.c:2089)
==7795== by 0x47A081: init (haproxy.c:1581)
==7795== by 0x4049F2: main (haproxy.c:2591)
This leak can be detected even in HAProxy 1.6, this patch thus should
be backported to all supported branches.
This commit replaces the explicit pool creation that are made in
constructors with a pool registration. Not only this simplifies the
pools declaration (it can be done on a single line after the head is
declared), but it also removes references to pools from within
constructors. The only remaining create_pool() calls are those
performed in init functions after the config is parsed, so there
is no more user of potentially uninitialized pool now.
It has been the opportunity to remove no less than 12 constructors
and 6 init functions.
This patch replaces a number of __decl_hathread() followed by HA_SPIN_INIT
or HA_RWLOCK_INIT by the new __decl_spinlock() or __decl_rwlock() which
automatically registers the lock for initialization in during the STG_LOCK
init stage. A few static modifiers were lost in the process, but since they
were not essential at all it was not worth extending the API to provide such
a variant.
This switches explicit calls to various trivial registration methods for
keywords, muxes or protocols from constructors to INITCALL1 at stage
STG_REGISTER. All these calls have in common to consume a single pointer
and return void. Doing this removes 26 constructors. The following calls
were addressed :
- acl_register_keywords
- bind_register_keywords
- cfg_register_keywords
- cli_register_kw
- flt_register_keywords
- http_req_keywords_register
- http_res_keywords_register
- protocol_register
- register_mux_proto
- sample_register_convs
- sample_register_fetches
- srv_register_keywords
- tcp_req_conn_keywords_register
- tcp_req_cont_keywords_register
- tcp_req_sess_keywords_register
- tcp_res_cont_keywords_register
- flt_register_keywords
These ones are mostly called from cfgparse.c for the parsing and do
not depend on the HTTP representation. The functions's prototypes
were moved to proto/http_rules.h, making this file work exactly like
tcp_rules. Ideally we should stop calling these functions directly
from cfgparse and register keywords, but there are a few cases where
that wouldn't work (stats http-request) so it's probably not worth
trying to go this far.
It's a bit painful to have to deal with HTTP semantics for each protocol
version (H1 and H2), and working on the version-agnostic code further
emphasizes the problem.
This patch creates http.h and http.c which are agnostic to the version
in use, and which borrow a few parts from proto_http and from h1. For
example the once thought h1-specific h1_char_classes array is in fact
dictated by RFC7231 and is used to parse HTTP headers. A few changes
were made to a few files which were including proto_http.h while they
only needed http.h.
Certain string definitions pre-dated the introduction of indirect
strings (ist) so some were used to simplify the definition of the known
HTTP methods. The current lookup code saves 2 kB of a heavily used table
and is faster than the previous table based lookup (typ. 14 ns vs 16
before).
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.
Most of the changes were made with spatch using this coccinelle script :
@rule_d1@
typedef chunk;
struct chunk chunk;
@@
- chunk.str
+ chunk.area
@rule_d2@
typedef chunk;
struct chunk chunk;
@@
- chunk.len
+ chunk.data
@rule_i1@
typedef chunk;
struct chunk *chunk;
@@
- chunk->str
+ chunk->area
@rule_i2@
typedef chunk;
struct chunk *chunk;
@@
- chunk->len
+ chunk->data
Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
In register_name, before locking the var_names array, we check the variable name
validity. So if we try to register an invalid or empty name, we need to return
without unlocking it (because it was never locked).
This patch must be backported in 1.8.
During the migration to the second version of the pools, the new
functions and pool pointers were all called "pool_something2()" and
"pool2_something". Now there's no more pool v1 code and it's a real
pain to still have to deal with this. Let's clean this up now by
removing the "2" everywhere, and by renaming the pool heads
"pool_head_something".
This macro should be used to declare variables or struct members depending on
the USE_THREAD compile option. It avoids the encapsulation of such declarations
between #ifdef/#endif. It is used to declare all lock variables.
A RW lock has been added to the vars structure to protect each list of
variables. And a global RW lock is used to protect registered names.
When a varibable is fetched, we duplicate sample data because the variable could
be modified by another thread.
For known methods (GET,POST...), in samples, an enum is used instead of a chunk
to reference the method. So there is no needs to allocate memory when a variable
is stored with this kind of sample.
The variable are compared only using text, the final '\0' (or the
string length) are not checked. So, the variable name "txn.internal"
matchs other one call "txn.int".
This patch fix this behavior
It must be backported ni 1.6 and 1.7
There's no more reason to keep tcp rules processing inside proto_tcp.c
given that there is nothing in common there except these 3 letters : tcp.
The tcp rules are in fact connection, session and content processing rules.
Let's move them to "tcp-rules" and let them live their life there.
gcc 3.4.6 noticed a possibly unitialized variable in vars.c, and while it
cannot happen the way the function is used, it's surprizing that newer
versions did not report it.
This fix may be backported to 1.6.
It does the opposite of 'set-var' action/converter. It is really useful for
per-process variables. But, it can be used for any scope.
The lua function 'unset_var' has also been added.
Now it is possible to use variables attached to a process. The scope name is
'proc'. These variables are released only when HAProxy is stopped.
'tune.vars.proc-max-size' directive has been added to confiure the maximum
amount of memory used by "proc" variables. And because memory accounting is
hierachical for variables, memory for "proc" vars includes memory for "sess"
vars.
This function, unsurprisingly, sets a variable value only if it already
exists. In other words, this function will succeed only if the variable was
found somewhere in the configuration during HAProxy startup.
It will be used by SPOE filter. So an agent will be able to set a value only for
existing variables. This prevents an agent to create a very large number of
unused variables to flood HAProxy and exhaust the memory reserved to variables..
The 'set-var' converter uses function smp_conv_store (vars.c). In this function,
we should use the first argument (index 0) to retrieve the variable name and its
scope. But because of a typo, we get the scope of the second argument (index
1). In this case, there is no second argument. So the scope used was always 0
(SCOPE_SESS), always setting the variable in the session scope.
So, due to this bug, this rules
tcp-request content accept if { src,set-var(txn.foo) -m found }
always set the variable 'sess.foo' instead of 'txn.foo'.
This commit introduces "tcp-request session" rules. These are very
much like "tcp-request connection" rules except that they're processed
after the handshake, so it is possible to consider SSL information and
addresses rewritten by the proxy protocol header in actions. This is
particularly useful to track proxied sources as this was not possible
before, given that tcp-request content rules are processed after each
HTTP request. Similarly it is possible to assign the proxied source
address or the client's cert to a variable.
Thus the SMP_USE_HTTP_ANY dependency is incorrect, we have to depend on
SMP_USE_L5_CLI (the session). It's particularly important for session-wide
variables which are kept across HTTP requests. For now there is no impact
but it will make a difference with tcp-request session rules.
smp_fetch_var() may be called from everywhere since it just reads a
variable. It must ensure that the stream exists before trying to return
a stream-dependant variable. For now there is no impact but it will
cause trouble with tcp-request session rules.
Changed all the cases where the pointer passed to realloc is overwritten
by the pointer returned by realloc. The new function my_realloc2 has
been used except in function register_name. If register_name fails to
add a new variable because of an "out of memory" error, all the existing
variables remain valid. If we had used my_realloc2, the array of variables
would have been freed.
This is the continuation of previous patch called "BUG/MAJOR: samples:
check smp->strm before using it".
It happens that variables may have a session-wide scope, and that their
session is retrieved by dereferencing the stream. But nothing prevents them
from being used from a streamless context such as tcp-request connection,
thus crashing the process. Example :
tcp-request connection accept if { src,set-var(sess.foo) -m found }
In order to fix this, we have to always ensure that variable manipulation
only happens via the sample, which contains the correct owner and context,
and that we never use one from a different source. This results in quite a
large change since a lot of functions are inderctly involved in the call
chain, but the change is easy to follow.
This fix must be backported to 1.6, and requires the last two patches.
Since commit bc4c1ac ("MEDIUM: http/tcp: permit to resume http and tcp
custom actions"), some actions may yield and be called back when new
information are available. Unfortunately some of them may continue to
yield because they simply don't know that it's the last call from the
rule set. For this reason we'll need to pass a flag to the custom
action to pass such information and possibly other at the same time.
Before this patch, two type of custom actions exists: ACT_ACTION_CONT and
ACT_ACTION_STOP. ACT_ACTION_CONT is a non terminal action and ACT_ACTION_STOP is
a terminal action.
Note that ACT_ACTION_STOP is not used in HAProxy.
This patch remove this behavior. Only type type of custom action exists, and it
is called ACT_CUSTOM. Now, the custion action can return a code indicating the
required behavior. ACT_RET_CONT wants that HAProxy continue the current rule
list evaluation, and ACT_RET_STOP wants that HAPRoxy stops the the current rule
list evaluation.
Now the prototype for each action from each section are the same, and
a discriminant for determining for each section we are called are added.
So, this patch removes the wrappers for the action functions called from
more than one section.
This patch removes 132 lines of useless code.
This patch normalize the return code of the configuration parsers. Before
these changes, the tcp action parser returned -1 if fail and 0 for the
succes. The http action returned 0 if fail and 1 if succes.
The normalisation does:
- ACT_RET_PRS_OK for succes
- ACT_RET_PRS_ERR for failure
This patch merges the conguration keyword struct. Each declared configuration
keyword struct are similar with the others. This patch simplify the code.
Action function can return 3 status:
- error if the action encounter fatal error (like out of memory)
- yield if the action must terminate his work later
- continue in other cases
This patch group the action name in one file. Some action are called
many times and need an action embedded in the action caller. The main
goal is to have only one header file grouping all definitions.
The (http|tcp)-(request|response) action rules use common
opaque type. For the HAProxy embbedded feature, types are know,
it better to add this types in the action union and use it.
This patch is the first of a serie which merge all the action structs. The
function "tcp-request content", "tcp-response-content", "http-request" and
"http-response" have the same values and the same process for some defined
actions, but the struct and the prototype of the declared function are
different.
This patch try to unify all of these entries.
With the difference between the "struct sample" data and the
"struct sample_storage" data, it was not possible to write
data = data, and we did a memcpy. This patch remove some of
these memcpy.
The union name "data" is a little bit heavy while we read the source
code because we can read "data.data.sint". The rename from "data" to "u"
makes the read easiest like "data.u.sint".
This patch remove the struct information stored both in the struct
sample_data and in the striuct sample. Now, only thestruct sample_data
contains data, and the struct sample use the struct sample_data for storing
his own data.
Some function are just a wrappers. This patch reduce the size of
this wrapper for improving the readability. One check is moved
from the wrapper to the main function, and some middle vars are
removed.
This patch removes the 32 bits unsigned integer and the 32 bit signed
integer. It replaces these types by a unique type 64 bit signed.
This makes easy the usage of integer and clarify signed and unsigned use.
With the previous version, signed and unsigned are used ones in place of
others, and sometimes the converter loose the sign. For example, divisions
are processed with "unsigned", if one entry is negative, the result is
wrong.
Note that the integer pattern matching and dotted version pattern matching
are already working with signed 64 bits integer values.
There is one user-visible change : the "uint()" and "sint()" sample fetch
functions which used to return a constant integer have been replaced with
a new more natural, unified "int()" function. These functions were only
introduced in the latest 1.6-dev2 so there's no impact on regular
deployments.
Commit 4834bc7 ("MEDIUM: vars: adds support of variables") brought a bug.
Setting a variable from an expression that doesn't resolve infinitely
blocks the processing.
The internal actions API must be changed to let the caller pass the various
flags regarding the state of the analysis (SMP_OPT_FINAL).
For now we only fix the issue by making the action_store() function always
return 1 to prevent any blocking.
No backport is needed.
We'll need to move the session variables to the session. For this, the
accounting must not depend on the stream. Instead we pass the pointers
to the different lists.
A switch case doesn't have default entry, and the compilator sends
a warning about uninitilized var.
warning: 'vars' may be used uninitialized in this function [-Wmaybe-uninitialized]
This patch adds two functions used for variable acces using the
variable full name. If the variable doesn't exists in the variable
pool name, it is created.
This patch adds support of variables during the processing of each stream. The
variables scope can be set as 'session', 'transaction', 'request' or 'response'.
The variable type is the type returned by the assignment expression. The type
can change while the processing.
The allocated memory can be controlled for each scope and each request, and for
the global process.