As reported by Chris Staite in GH #3002, trying to yield from a Lua
action during a client disconnect causes the script to be interrupted
(which is expected) and an alert to be emitted with the error:
"Lua function '%s': yield not allowed".
While this error is well suited for cases where the yield is not expected
at all (ie: when context doesn't allow it) and results from a yield misuse
in the Lua script, it isn't the case when the yield is exceptionnally not
available due to an abort or error in the request/response processing.
Because of that we raise an alert but the user cannot do anything about it
(the script is correct), so it is confusing and polluting the logs.
In this patch we introduce the ACT_OPT_FINAL_EARLY flag which is a
complementary flag to ACT_OPT_FIRST. This flag is set when the
ACT_OPT_FIRST is set earlier than normal (due to error/abort).
hlua_action() then checks for this flag to decide whether an error (alert)
or a simple log message should be emitted when the yield is not available.
It should solve GH #3002. Thanks to Chris Staite (@chrisstaite-menlo) for
having reported the issue and suggested a solution.
Most fe and be counters are good candidates for being shared between
processes. They are now grouped inside "shared" struct sub member under
be_counters and fe_counters.
Now they are properly identified, they would greatly benefit from being
shared over thread groups to reduce the cost of atomic operations when
updating them. For this, we take the current tgid into account so each
thread group only updates its own counters. For this to work, it is
mandatory that the "shared" member from {fe,be}_counters is initialized
AFTER global.nbtgroups is known, because each shared counter causes the stat
to be allocated lobal.nbtgroups times. When updating a counter without
concurrency, the first counter from the array may be updated.
To consult the shared counters (which requires aggregation of per-tgid
individual counters), some helper functions were added to counter.h to
ease code maintenance and avoid computing errors.
Shareable counters are not tagged as shared counters and are dynamically
allocated in separate memory area as a prerequisite for being stored
in shared memory area. For now, GUID and threads groups are not taken into
account, this is only a first step.
also we ensure all counters are now manipulated using atomic operations,
namely, "last_change" counter is now read from and written to using atomic
ops.
Despite the numerous changes caused by the counters being moved away from
counters struct, no change of behavior should be expected.
It is not rare to see configurations with a large number of "tcp-request
content" or "http-request" rules for instance. A large number of rules
combined with cpu-demanding actions (e.g.: actions that work on content)
may create thread contention as all the rules from a given ruleset are
evaluated under the same polling loop if the evaluation is not interrupted
Thus, in this patch we add extra logic around "tcp-request content",
"tcp-response content", "http-request" and "http-response" rulesets, so
that when a certain number of rules are evaluated under the single polling
loop, we force the evaluating function to yield. As such, the rule which
was about to be evaluated is saved, and the function starts evaluating
rules from the save pointer when it returns (in the next polling loop).
We use task_wakeup(task, TASK_WOKEN_MSG) to explicitly wake the task so
that no time is wasted and the processing is resumed ASAP. TASK_WOKEN_MSG
is mandatory here because process_stream() expects TASK_WOKEN_MSG for
explicit analyzers re-evaluation.
rules_bcount stream's attribute was added to count how manu rules were
evaluated since last interruption (yield). Also, SF_RULE_FYIELD flag
was added to know that the s->current_rule was assigned due to forced
yield and not regular yield.
By default haproxy will enforce a yield every 50 rules, this behavior
can be configured using the "tune.max-rules-at-once" global keyword.
There is a limitation though: for now, if the ACT_OPT_FINAL flag is set
on act_opts, we consider it is not safe to yield (as it is already the
case for automatic yield). In this case instead of yielding an taking
the risk of not being called back, we skip the yield and hope it will
not create contention. This is something we should ideally try to
improve in order to yield in all conditions.
When the tcp-response content ruleset evaluation is delayed because of an
ACL condition, the close forwarding on the client side is not explicitly
blocked. So it is possible to close the client side before the end of the
response evaluation.
To fix the issue, this is now done in all cases where some data are
missing. Concretely, channel_dont_close() is called in "missing_data" goto
label.
Note it is only a theorical bug (or pending bug). It is not possible to
trigger it for now because an ACL cannot wait for more data when a close was
received. But the code remains a bit weak. It is safer this way. It is
especially mandatory for the "force yield" option that should be added soon.
This patch could be backported to all stable versions.
If it is the last patch to introduce dedicated termination events for each
location. In this one, events for the stream location are introcued. The old
enum is also removed because it is now unused.
Here, more accurate evets are added. The "intercepted" event was splitted.
To be able to add more accurate termination events for each location, the
enum will be splitted by location. Indeed, there are at most 16 possbile
events. It will be pretty confusing to use same termination events for the
different locations. So the best is to split them.
In this patch, the termination events for the fd, hs and xprt locations are
introduced. For now some holes are added to keep similar events aligned
across enums. But this may change in future.
In this patch, events for the stream location are reported. These events are
first reported on the corresponding stream-connector. So front events on scf
and back event on scb. Then all events are both merged in the stream. But
only 4 events are saved on the stream.
Several internal events are for now grouped with the type
"tevt_type_intercepted". More events will be added to have a better
resolution. But at least the place to report these events are identified.
For now, when a event is reported on a SC, it is also reported on the stream
and vice versa.
acl_match_cond() combines acl_exec_cond() + acl_pass() and a check on the
condition->pol (to check if the cond is inverted) in order to return
either 0 if the cond doesn't match or 1 if it matches (or NULL).
Thanks to this we can actually simplify some redundant constructs that
iterate over rules and evaluate if the condition matches or not.
Conditions for tcp-request inspect-content and tcp-response
inspect-content couldn't be simplified because they perform an extra
check for missing data, and thus still need to leverage acl_exec_cond()
It's best to display the patch using "-w", like "git show xxxx -w",
because some blocks had to be re-indented after the cleanup, which
makes the patch hard to review by default.
This only concerns functions emitting warnings about misplaced tcp-request
rules. The direction is now specified in the functions name. For instance
"warnif_misplaced_tcp_conn" is replaced by "warnif_misplaced_tcp_req_conn".
In warnings about misplaced rules, only the first keyword is mentionned. It
works well for http-request or quic-initial rules for instance. But it is a
bit confusing for tcp-request rules, because the layer is missing (session
or content).
To make it a bit systematic (and genric), the second argument can now be
provided. It can be set to NULL if there is no layer or scope. But
otherwise, it may be specified and it will be reported in the warning.
So the following snippet:
tcp-request content reject if FALSE
tcp-request session reject if FALSE
tcp-request connection reject if FALSE
Will now emit the following warnings:
a 'tcp-request session' rule placed after a 'tcp-request content' rule will still be processed before.
a 'tcp-request connection' rule placed after a 'tcp-request session' rule will still be processed before.
This patch should fix the issue #2596.
When a rule or a filter yields because it waits for something to be able to
continue its processing, this entity is saved in the stream. If an error or
a timeout occurred, info on this entity may be retrieved via the
"waiting_entity" sample fetch, for instance to dump it in the logs. This
info may be useful to found root cause of some bugs because it is a way to
know the processing was temporarily stopped. This may explain timeouts for
instance.
The sample fetch is not documented yet.
The last evaluated rule is now saved in a generic structure, named
last_entity, with a type to identify it. The idea is to be able to store
other kind of entity that may interrupt a specific processing.
The type of the last evaluated rule is set to 1. It will be replace later by
an enum to be more explicit. In addition, the pointer to the rule itself is
saved instead of its location.
The sample fetch "last_entity" was added to retrieve the information about
it. In this case, it is the rule localtion, the config file containing the
rule followed by the line where the rule is defined, separated by a
colon. This sample fetch is not documented yet.
When an action yields while it is not allowed, an internal error is
reported. This interrupts the processing. So info about the last evaluated
rule must be filled.
This patch may be bakcported if needed. If so, the commit ("MINOR: stream:
Save last evaluated rule on invalid yield") must be backported first.
We start implementing some postparsing compatibility checks for log
backends.
Here we report a warning if user tries to use tcp-{request,response} rules
with log backend, and we properly ignore such rules when inherited from
defaults section.
A regression was introduced with the commit cb59e0bc3 ("BUG/MINOR:
tcp-rules: Stop content rules eval on read error and end-of-input"). We
should not shorten the inspect-delay when the EOI flag is set on the SC.
Idea of the inspect-delay is to wait a TCP rule is matching. It is only
interrupted if an error occurs, on abort or if the peer shuts down. It is
also interrupted if the buffer is full. This last case is a bit ambiguous
and discutable. It could be good to add ACLS, like "wait_complete" and
"wait_full" to do so. But for now, we only remove the test on SC_FL_EOI
flag.
This patch must be backported to all stable versions.
SC_FL_EOS flag is added to report the end-of-stream at the SC level. It will
be used to distinguish end of stream reported by the endoint, via the
SE_FL_EOS flag, and the abort triggered by the stream, via the
SC_FL_ABRT_DONE flag.
In this patch, the flag is defined and is systematically tested everywhere
SC_FL_ABRT_DONE is tested. It should be safe because it is never set.
In the same way the previous commit, we stop to use SE_FL_ERROR flag from
analyzers and their sub-functions. We now fully rely on SC_FL_ERROR to do so.
From the stream, when SE_FL_ERROR flag is tested, we now also test the
SC_FL_ERROR flag. Idea is to stop to rely on the SE descriptor to detect
errors.
Here again, it is just a flag renaming. In SC flags, there is no longer
shutdown for reads but aborts. For now this flag is set when a read0 is
detected. It is of couse not accurate. This will be changed later.
The purpose of this patch is only a one-to-one replacement, as far as
possible.
CF_SHUTR(_NOW) and CF_SHUTW(_NOW) flags are now carried by the
stream-connecter. CF_ prefix is replaced by SC_FL_ one. Of course, it is not
so simple because at many places, we were testing if a channel was shut for
reads and writes in same time. To do the same, shut for reads must be tested
on one side on the SC and shut for writes on the other side on the opposite
SC. A special care was taken with process_stream(). flags of SCs must be
saved to be able to detect changes, just like for the channels.
This patch removes CF_READ_ERROR and CF_WRITE_ERROR flags. We now rely on
SE_FL_ERR_PENDING and SE_FL_ERROR flags. SE_FL_ERR_PENDING is used for write
errors and SE_FL_ERROR for read or unrecoverable errors.
When a connection error is reported, SE_FL_ERROR and SE_FL_EOS are now set and a
read event and a write event are reported to be sure the stream will properly
process the error. At the stream-connector level, it is similar. When an error
is reported during a send, a write event is triggered. On the read side, nothing
more is performed because an error at this stage is enough to wake the stream
up.
A major change is brought with this patch. We stop to check flags of the
ooposite channel to report abort or timeout. It also means when an read or
write error is reported on a side, we no longer update the other side. Thus
a read error on the server side does no long lead to a write error on the
client side. This should ease errors report.
When a read error (CF_READ_ERROR) is reported, a shutdown for reads is
always performed (CF_SHUTR). Thus, there is no reason to check if
CF_READ_ERROR is set if CF_SHUTR is also checked.
The number of stick-counter entries usable by track-sc rules is currently
set at build time. There is no good value for this since the vast majority
of users don't need any, most need only a few and rare users need more.
Adding more counters for everyone increases memory and CPU usages for no
reason.
This patch moves the per-session and per-stream arrays to a pool of a size
defined at boot time. This way it becomes possible to set the number of
entries at boot time via a new global setting "tune.stick-counters" that
sets the limit for the whole process. When not set, the MAX_SESS_STR_CTR
value still applies, or 3 if not set, as before.
It is also possible to lower the value to 0 to save a bit of memory if
not used at all.
Note that a few low-level sample-fetch functions had to be protected due
to the ability to use sample-fetches in the global section to set some
variables.
When a TCP content ruleset is evaluated, we stop waiting for more data if
the inspect-delay is reached, if there is a read error or if we know no more
data will be received. This last point is only valid for ACLs. An action may
decide to yield for another reason. For instance, in the SPOE, the
"send-spoe-group" action yields while the agent response is not
received. Thus, now, an action call is final only when the inspect-delay is
reached or if there is a read error. But it is possible for an action to
yield if the buffer is full or if CF_EOI flag is set.
This patch could be backported to all supported versions.
There's no more reason for keepin the code and definitions in conn_stream,
let's move all that to stconn. The alphabetical ordering of include files
was adjusted.
This file contains all the stream-connector functions that are specific
to application layers of type stream. So let's name it accordingly so
that it's easier to figure what's located there.
The alphabetical ordering of include files was preserved.
The new name mor eclearly indicates that a stream connector cannot make
any more progress because it needs room in the channel buffer, or that
it may be unblocked because the buffer now has more room available. The
testing function is sc_waiting_room(). This is mostly used by applets.
Note that the flags will change soon.
This renames the "struct conn_stream" to "struct stconn" and updates
the descriptions in all comments (and the rare help descriptions) to
"stream connector" or "connector". This touches a lot of files but
the change is minimal. The local variables were not even renamed, so
there's still a lot of "cs" everywhere.
Remaining flags and associated functions are move in the conn-stream
scope. These flags are added on the endpoint and not the conn-stream
itself. This way it will be possible to get them from the mux or the
applet. The functions to get or set these flags are renamed accordingly with
the "cs_" prefix and updated to manipualte a conn-stream instead of a
stream-interface.
si_shutr(), si_shutw(), si_chk_rcv() and si_chk_snd() are moved in the
conn-stream scope and renamed, respectively, cs_shutr(), cs_shutw(),
cs_chk_rcv(), cs_chk_snd() and manipulate a conn-stream instead of a
stream-interface.
Flags to disable lingering and half-close are now handled at the conn-stream
level. Thus SI_FL_NOLINGER and SI_FL_NOHALF stream-int flags are replaced by
CS_FL_NOLINGER and CS_FL_NOHALF conn-stream flags.
Instead of setting a stream-interface flag to then set the corresponding
conn-stream endpoint flag, we now only rely the conn-stream endoint. Thus
SI_FL_KILL_CON is replaced by CS_EP_KILL_CONN.
In addition si_must_kill_conn() is replaced by cs_must_kill_conn().
When a tcp-request or tcp-response rule fails to parse, we currently
free only the rule without its contents, which makes ASAN complain.
Now that we have a new function for this, let's completely free the
rule. Reg-tests are now completely OK with ASAN. This relies on this
commit:
MINOR: actions: add new function free_act_rule() to free a single rule
It's probably not needed to backport this since we're on the exit path
anyway.
When a tcp-{request,response} content or http-request/http-response
rule delivers a final verdict (deny, accept, redirect etc), the last
evaluated one will now be recorded in the stream. The purpose is to
permit to log the last one that performed a final action. For now
the log is not produced.
TCP rules from defaults section are now evaluated. These rules are evaluated
before those of the proxy. For L7 TCP rules, the same default ruleset cannot
be attached to the frontend and the backend. However, at this stage, we take
care to not execute twice the same ruleset. So, in theory, a frontend and a
backend could use the same defaults section. In this case, the default
ruleset is executed before all others and only once.