Historically this function would try to wake the most accurate number of
process_stream() waiters. But since the introduction of filters which could
also require buffers (e.g. for compression), things started not to be as
accurate anymore. Nowadays muxes and transport layers also use buffers, so
the runqueue size has nothing to do anymore with the number of supposed
users to come.
In addition to this, the threshold was compared to the number of free buffer
calculated as allocated minus used, but this didn't work anymore with local
pools since these counts are not updated upon alloc/free!
Let's clean this up and pass the number of released buffers instead, and
consider that each waiter successfully called counts as one buffer. This
is not rocket science and will not suddenly fix everything, but at least
it cannot be as wrong as it is today.
This could have been marked as a bug given that the current situation is
totally broken regarding this, but this probably doesn't completely fix
it, it only goes in a better direction. It is possible however that it
makes sense in the future to backport this as part of a larger series if
the situation significantly improves.
The buffer wait queue used to be global historically but this doest not
make any sense anymore given that the most common use case is to have
thread-local pools. Thus there's no point waking up waiters of other
threads after releasing an entry, as they won't benefit from it.
Let's move the queue head to the thread_info structure and use
ti->buffer_wq from now on.
The two algos defining these functions (first and leastconn) do not need the
server's lock. However it's already present in pendconn_process_next_strm()
so the API must be updated so that the functions may take it if needed and
that the callers indicate whether they already own it.
As such, the call places (backend.c and stream.c) now do not take it
anymore, queue.c was unchanged since it's already held, and both "first"
and "leastconn" were updated to take it if not already held.
A quick test on the "first" algo showed a jump from 432 to 565k rps by
just dropping the lock in stream.c!
The remaining contention on the server lock solely comes from
sess_change_server() which takes the lock to add and remove a
stream from the server's actconn list. This is both expensive
and pointless since we have mt-lists, and this list is only
used by the CLI's "shutdown server sessions" command!
Let's migrate to an mt-list and remove the need for this costly
lock. By doing so, the request rate increased by ~1.8%.
This patch splits current dns.c into two files:
The first dns.c contains code related to DNS message exchange over UDP
and in future other TCP. We try to remove depencies to resolving
to make it usable by other stuff as DNS load balancing.
The new resolvers.c inherit of the code specific to the actual
resolvers.
Note:
It was really difficult to obtain a clean diff dur to the amount
of moved code.
Note2:
Counters and stuff related to stats is not cleany separated because
currently counters for both layers are merged and hard to separate
for now.
The code dealing with the copy of requests in the L7-buffer and the
retransmits during L7 retries has been moved in the HTTP analysers. The copy
is now performed in the REQ_HTTP_XFER_BODY analyser and the L7 retries is
performed in the RES_WAIT_HTTP analyser. This way, si_cs_recv() and
si_cs_send() don't care of it anymore. It is much more natural to deal with
L7 retry in HTTP analysers.
TCP to H1 upgrades are buggy for now. When such upgrade is performed, a
crash is experienced. The bug is the result of the recent H1 mux
refactoring, and more specifically because of the commit c4bfa59f1 ("MAJOR:
mux-h1: Create the client stream as later as possible"). Indeed, now the H1
mux is responsible to create the frontend conn-stream once the request
headers are fully received. Thus the TCP to H1 upgrade is a problem because
the frontend conn-stream already exists.
To fix the bug, we must keep this conn-stream and the associate stream and
use it in the H1 mux. To do so, the upgrade will be performed in two
steps. First, the mux is upgraded from mux-pt to mux-h1. Then, the mux-h1
performs the stream upgrade, once the request headers are fully received and
parsed. To do so, stream_upgrade_from_cs() must be used. This function set
the SF_HTX flags to switch the stream to HTX mode, it removes the SF_IGNORE
flags and eventually it fills the request channel with some input data.
This patch is required to fix the TCP to H1 upgrades and is intimately
linked with the next commits.
When a TCP to H2 upgrade is performed, the SF_IGNORE flag is set on the
stream before killing it. This happens when a TCP/SSL client connection is
routed to a HTTP backend and the h2 alpn detected. The SF_IGNORE flag was
added for this purpose, to skip some processing when the stream is aborted
before a mux upgrade. Some counters updates were skipped this way. But some
others are still updated.
Now, all counters update at the end of process_stream(), before releasing
the stream, are ignored if SF_IGNORE flag is set. Note this stream is
aborted because we switch from a mono-stream to a multi-stream
multiplexer. It works differently for TCP to H1 upgrades.
This patch should be backported as far as 2.0 after some observation period.
I was looking at writing a simple first test for prometheus but I
realised there is no proper way to exclude it if haproxy was not built
with prometheus plugin.
Today we have `REQUIRE_OPTIONS` in reg-tests which is based on `Feature
list` from `haproxy -vv`. Those options are coming from the Makefile
itself.
A plugin is build this way:
EXTRA_OBJS="contrib/prometheus-exporter/service-prometheus.o"
It does register service actions through `service_keywords_register`.
Those are listed through `list_services` in `haproxy -vv`.
To facilitate parsing, I slightly changed the output to a single line
and integrate it in regtests shell script so that we can now specify a
dependency while writing a reg-test for prometheus, e.g:
#REQUIRE_SERVICE=prometheus-exporter
#REQUIRE_SERVICES=prometheus-exporter,foo
There might be other ways to handle this, but that's the cleanest I
found; I understand people might be concerned by this output change in
`haproxy -vv` which goes from:
Available services :
foo
bar
to:
Available services : foo bar
Signed-off-by: William Dauchy <wdauchy@gmail.com>
This allows using the address of the server rather than the name of the
server for keeping track of servers in a backend for stickiness.
The peers code was also extended to support feeding the dictionary using
this key instead of the name.
Fixes#814
Add cur_server_timeout and cur_tunnel_timeout.
These sample fetches return the current timeout value for a stream. This
is useful to retrieve the value of a timeout which was changed via a
set-timeout rule.
Prepare the possibility to register sample fetches on the stream.
This commit is necessary to implement sample fetches to retrieve the
current timeout values.
Allow the modification of the tunnel timeout on the stream side.
Use a new field in the stream for the tunnel timeout. It is initialized
by the tunnel timeout from backend unless it has already been set by a
set-timeout tunnel rule.
Allow the modification of the timeout server value on the stream side.
Do not apply the default backend server timeout in back_establish if it
is already defined. This is the case if a set-timeout server rule has
been executed.
At the end of stream_new(), once the input buffer is transfer to the request
channel, it must not be used anymore. The previous patch (16df178b6 "BUG/MEDIUM:
stream: Xfer the input buffer to a fully created stream") was pushed to quickly.
No backport needed.
The input buffer passed as argument to create a new stream must not be
transferred when the request channel is initialized because the channel
flags are not set at this stage. In addition, the API is a bit confusing
regarding the buffer owner when an error occurred. The caller remains the
owner, but reading the code it is not obvious.
So, first of all, to avoid any ambiguities, comments are added on the
calling chain to make it clear. The buffer owner is the caller if any error
occurred. And the ownership is transferred to the stream on success.
Then, to make things simple, the ownership is transferred at the end of
stream_new(), in case of success. And the input buffer is updated to point
on BUF_NULL. Thus, in all cases, if the caller try to release it calling
b_free() on it, it is not a problem. Of course, it remains the caller
responsibility to release it on error.
The patch fixes a bug introduced by the commit 26256f86e ("MINOR: stream:
Pass an optional input buffer when a stream is created"). No backport is
needed.
The cumulative numbers of http requests, http errors, bytes received and
sent and their respective rates for a tracked counters are now updated using
specific stream independent functions. These functions are used by the
stream but the aim is to allow the session to do so too. For now, there is
no reason to perform these updates from the session, except from the mux-h2
maybe. But, the mux-h1, on the frontend side, will be able to return some
errors to the client, before the stream creation. In this case, it will be
mandatory to update counters tracked at the session level.
It is now possible to set the buffer used by the channel request buffer when
a stream is created. It may be useful if input data are already received,
instead of waiting the first call to the mux rcv_buf() callback. This change
is mandatory to support H1 connection with no stream attached.
For now, the multiplexers don't pass any buffer. BUF_NULL is thus used to
call stream_create_from_cs().
These info are only provided by the mux-h1. But, thanks to previous patches,
we can get them from the session directly. There is no need to retrieve them
from the mux anymore.
Since the idle duration provided by the session is always up-to-date, there
is no more reason to rely on the multiplexer cs_info to set it to the
stream.
This function simply calls action_lookup() on the private service_keywords,
to look up a service name. This will be used to detect double registration
of a same service from Lua.
This will be needed by a next patch to fix a bug and will have to be
backported.
The server lock must be held when server_take_conn() and server_drop_conn()
lbprm callback functions are called. It is a documented prerequisite but it is
not always performed. It only affects leastconn and fas lb algorithm. Others
don't use these callback functions.
A race condition on the next pending effecive weight (next_eweight) may be
encountered with the leastconn lb algorithm. An agent check may set it to 0
while fwlc_srv_reposition() is called. The server is locked during the
next_eweight update. But because the server lock is not acquired when
fwlc_srv_reposition() is called, we may use it to recompute the server key,
leading to a division by 0.
This patch must be backported as far as 1.8.
The remaining proxy states were only used to distinguish an enabled
proxy from a disabled one. Due to the initialization order, both
PR_STNEW and PR_STREADY were equivalent after startup, and they
would only differ from PR_STSTOPPED when the proxy is disabled or
shutdown (which is effectively another way to disable it).
Now we just have a "disabled" field which allows to distinguish them.
It's becoming obvious that start_proxies() is only used to print a
greeting message now, that we'd rather get rid of. Probably that
zombify_proxy() and stop_proxy() should be merged once their
differences move to the right place.
The receiver is the one which depends on the protocol while the listener
relies on the receiver. Let's move the protocol there. Since there's also
a list element to get back to the listener from the proto list, this list
element (proto_list) was moved as well. For now when scanning protos, we
still see listeners which are linked by their rx.proto_list part.
A dedicated expiration date is now used to apply the inspect-delay of the
tcp-request or tcp-response rulesets. Before, the analyse expiratation date was
used but it may also be updated by the lua (at least). So a lua script may
extend or reduce the inspect-delay by side effect. This is not expected. If it
becomes necessary, a specific function will be added to do this. Because, for
now, it is a bit confusing.
The do-resolve HTTP action, performing a DNS resolution of a sample expression
output, is not thread-safe at all. The resolver object used to do the resolution
must be locked when the action is executed or when the stream is released
because its curr or wait resolution lists and the requester list inside a
resolution are updated. It is also important to not wake up a released stream
(with a destroyed task).
Of course, because of this bug, various kind of crashes may be observed.
This patch should fix the issue #236. It must be backported as far as 2.0.
The TRY_ADDQ there was not needed since the wait list is exclusively
owned by the caller. There's a preliminary test on MT_LIST_ADDED()
that might have been eliminated by keeping MT_LIST_TRY_ADDQ() but
it would have required two more expensive writes before testing so
better keep the test the way it is.
Initially when mt_lists were added, their purpose was to be used with
the scheduler, where anyone may concurrently add the same tasklet, so
it sounded natural to implement a check in MT_LIST_ADD{,Q}. Later their
usage was extended and MT_LIST_ADD{,Q} started to be used on situations
where the element to be added was exclusively owned by the one performing
the operation so a conflict was impossible. This became more obvious with
the idle connections and the new macro was called MT_LIST_ADDQ_NOCHECK.
But this remains confusing and at many places it's not expected that
an MT_LIST_ADD could possibly fail, and worse, at some places we start
by initializing it before adding (and the test is superflous) so let's
rename them to something more conventional to denote the presence of the
check or not:
MT_LIST_ADD{,Q} : inconditional operation, the caller owns the
element, and doesn't care about the element's
current state (exactly like LIST_ADD)
MT_LIST_TRY_ADD{,Q}: only perform the operation if the element is not
already added or in the process of being added.
This means that the previously "safe" MT_LIST_ADD{,Q} are not "safe"
anymore. This also means that in case of backport mistakes in the
future causing this to be overlooked, the slower and safer functions
will still be used by default.
Note that the missing unchecked MT_LIST_ADD macro was added.
The rest of the code will have to be reviewed so that a number of
callers of MT_LIST_TRY_ADDQ are changed to MT_LIST_ADDQ to remove
the unneeded test.
"show sess" and particularly "show sess all" can be very slow when dumping
lots of information, and while dumping, new sessions might appear, making
the output really endless. When threads are used, this causes a double
problem:
- all threads are paused during the dump, so an overly long dump degrades
the quality of service ;
- since all threads are paused, more events get postponed, possibly
resulting in more streams to be dumped on next invocation of the dump
function.
This patch addresses this long-lasting issue by doing something simple:
the CLI's stream is moved at the end of the steams list, serving as an
identifiable marker to end the dump, because all entries past it were
added after the command was entered. As a result, the CLI's stream always
appears as the last one.
It may make sense to backport this to stable branches where dumping live
streams is difficult as well.
This one was confusingly called, I thought it was the cumulated number
of streams but it's the number of calls to process_stream(). Let's make
this clearer.
This patch fixes all the leftovers from the include cleanup campaign. There
were not that many (~400 entries in ~150 files) but it was definitely worth
doing it as it revealed a few duplicates.
There's no point splitting the file in two since only cfgparse uses the
types defined there. A few call places were updated and cleaned up. All
of them were in C files which register keywords.
There is nothing left in common/ now so this directory must not be used
anymore.
This one was not easy because it was embarking many includes with it,
which other files would automatically find. At least global.h, arg.h
and tools.h were identified. 93 total locations were identified, 8
additional includes had to be added.
In the rare files where it was possible to finalize the sorting of
includes by adjusting only one or two extra lines, it was done. But
all files would need to be rechecked and cleaned up now.
It was the last set of files in types/ and proto/ and these directories
must not be reused anymore.
extern struct dict server_name_dict was moved from the type file to the
main file. A handful of inlined functions were moved at the bottom of
the file. Call places were updated to use server-t.h when relevant, or
to simply drop the entry when not needed.
The files remained mostly unchanged since they were OK. However, half of
the users didn't need to include them, and about as many actually needed
to have it and used to find functions like srv_currently_usable() through
a long chain that broke when moving the file.
This one is particularly difficult to split because it provides all the
functions used to manipulate a proxy state and to retrieve names or IDs
for error reporting, and as such, it was included in 73 files (down to
68 after cleanup). It would deserve a small cleanup though the cut points
are not obvious at the moment given the number of structs involved in
the struct proxy itself.
The current state of the logging is a real mess. The main problem is
that almost all files include log.h just in order to have access to
the alert/warning functions like ha_alert() etc, and don't care about
logs. But log.h also deals with real logging as well as log-format and
depends on stream.h and various other things. As such it forces a few
heavy files like stream.h to be loaded early and to hide missing
dependencies depending where it's loaded. Among the missing ones is
syslog.h which was often automatically included resulting in no less
than 3 users missing it.
Among 76 users, only 5 could be removed, and probably 70 don't need the
full set of dependencies.
A good approach would consist in splitting that file in 3 parts:
- one for error output ("errors" ?).
- one for log_format processing
- and one for actual logging.
It was moved without any change, however many callers didn't need it at
all. This was a consequence of the split of proto_http.c into several
parts that resulted in many locations to still reference it.
Almost no change except moving the cli_kw struct definition after the
defines. Almost all users had both types&proto included, which is not
surprizing since this code is old and it used to be the norm a decade
ago. These places were cleaned.
Just some minor reordering, and the usual cleanup of call places for
those which didn't need it. We don't include the whole tools.h into
stats-t anymore but just tools-t.h.
The type file was slightly tidied. The cli-specific APPCTX_CLI_ST1_* flag
definitions were moved to cli.h. The type file was adjusted to include
buf-t.h and not the huge buf.h. A few call places were fixed because they
did not need this include.
The files were moved almost as-is, just dropping arg-t and auth-t from
acl-t but keeping arg-t in acl.h. It was useful to revisit the call places
since a handful of files used to continue to include acl.h while they did
not need it at all. Struct stream was only made a forward declaration
since not otherwise needed.
The stktable_types[] array declaration was moved to the main file as
it had nothing to do in the types. A few declarations were reordered
in the types file so that defines were before the structs. Thread-t
was added since there are a few __decl_thread(). The loss of peers.h
revealed that cfgparse-listen needed it.
All includes that were not absolutely necessary were removed because
checks.h happens to very often be part of dependency loops. A warning
was added about this in check-t.h. The fields, enums and structs were
a bit tidied because it's particularly tedious to find anything there.
It would make sense to split this in two or more files (at least
extract tcp-checks).
The file was renamed to the singular because it was one of the rare
exceptions to have an "s" appended to its name compared to the struct
name.
The type file is becoming a mess, half of it is for the proxy protocol,
another good part describes conn_streams and mux ops, it would deserve
being split again. At least it was reordered so that elements are easier
to find, with the PP-stuff left at the end. The MAX_SEND_FD macro was moved
to compat.h as it's said to be the value for Linux.
The TASK_IS_TASKLET() macro was moved to the proto file instead of the
type one. The proto part was a bit reordered to remove a number of ugly
forward declaration of static inline functions. About a tens of C and H
files had their dependency dropped since they were not using anything
from task.h.
global.h was one of the messiest files, it has accumulated tons of
implicit dependencies and declares many globals that make almost all
other file include it. It managed to silence a dependency loop between
server.h and proxy.h by being well placed to pre-define the required
structs, forcing struct proxy and struct server to be forward-declared
in a significant number of files.
It was split in to, one which is the global struct definition and the
few macros and flags, and the rest containing the functions prototypes.
The UNIX_MAX_PATH definition was moved to compat.h.
This one is particularly tricky to move because everyone uses it
and it depends on a lot of other types. For example it cannot include
arg-t.h and must absolutely only rely on forward declarations to avoid
dependency loops between vars -> sample_data -> arg. In order to address
this one, it would be nice to split the sample_data part out of sample.h.
List.h was missing for LIST_ADDQ(). A few unneeded includes of action.h
were removed from certain files.
This one still relies on applet.h and stick-table.h.
A few includes were missing in each file. A definition of
struct polled_mask was moved to fd-t.h. The MAX_POLLERS macro was
moved to defaults.h
Stdio used to be silently inherited from whatever path but it's needed
for list_pollers() which takes a FILE* and which can thus not be
forward-declared.
Most of the file was a large set of HTX elements manipulation functions
and few types, so splitting them allowed to further reduce dependencies
and shrink the build time. Doing so revealed that a few files (h2.c,
mux_pt.c) needed haproxy/buf.h and were previously getting it through
htx.h. They were fixed.
The pretty confusing "buffer.h" was in fact not the place to look for
the definition of "struct buffer" but the one responsible for dynamic
buffer allocation. As such it defines the struct buffer_wait and the
few functions to allocate a buffer or wait for one.
This patch moves it renaming it to dynbuf.h. The type definition was
moved to its own file since it's included in a number of other structs.
Doing this cleanup revealed that a significant number of files used to
rely on this one to inherit struct buffer through it but didn't need
anything from this file at all.
This moves types/activity.h to haproxy/activity-t.h and
proto/activity.h to haproxy/activity.h.
The macros defining the bit field values for the profiling variable
were moved to the type file to be more future-proof.
Now the file is ready to be stored into its final destination. A few
minor reorderings were performed to keep the file properly organized,
making the various sections more visible (cache & lockless).
In addition and to stay consistent, memory.c was renamed to pool.c.
types/freq_ctr.h was moved to haproxy/freq_ctr-t.h and proto/freq_ctr.h
was moved to haproxy/freq_ctr.h. Files were updated accordingly, no other
change was applied.
This splits the hathreads.h file into types+macros and functions. Given
that most users of this file used to include it only to get the definition
of THREAD_LOCAL and MAXTHREADS, the bare minimum was placed into thread-t.h
(i.e. types and macros).
All the thread management was left to haproxy/thread.h. It's worth noting
the drop of the trailing "s" in the name, to remove the permanent confusion
that arises between this one and the system implementation (no "s") and the
makefile's option (no "s").
For consistency, src/hathreads.c was also renamed thread.c.
A number of files were updated to only include thread-t which is the one
they really needed.
Some future improvements are possible like replacing empty inlined
functions with macros for the thread-less case, as building at -O0 disables
inlining and causes these ones to be emitted. But this really is cosmetic.
This one used to be stored into debug.h but the debug tools got larger
and require a lot of other includes, which can't use BUG_ON() anymore
because of this. It does not make sense and instead this macro should
be placed into the lower includes and given its omnipresence, the best
solution is to create a new bug.h with the few surrounding macros needed
to trigger bugs and place assertions anywhere.
Another benefit is that it won't be required to add include <debug.h>
anymore to use BUG_ON, it will automatically be covered by api.h. No
less than 32 occurrences were dropped.
The FSM_PRINTF macro was dropped since not used at all anymore (probably
since 1.6 or so).
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:
- common/config.h
- common/compat.h
- common/compiler.h
- common/defaults.h
- common/initcall.h
- common/tools.h
The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.
In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.
No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
Only allow L7 retries when using HTTP, it only really makes sense for HTTP,
anyway, and as the L7 retries code assume the message will be HTX, it will
crash when used with mode TCP.
This should fix github issue #627.
This should be backported to 2.1 and 2.0.
Now we very rarely catch spinning streams, and whenever we catch one it
seems a filter is involved, but we currently report no info about them.
Let's print the list of enabled filters on the stream with such a crash
to help with the reports. A typical output will now look like this:
[ALERT] 121/165908 (1110) : A bogus STREAM [0x7fcaf4016a60] is spinning at 2 calls per second and refuses to die, aborting now! Please report this error to developers [strm=0x7fcaf4016a60 src=127.0.0.1 fe=l1 be=l1 dst=<CACHE> rqf=6dc42000 rqa=48000 rpf=a0040223 rpa=24000000 sif=EST,10008 sib=DIS,80110 af=(nil),0 csf=0x7fcaf4023c00,10c000 ab=0x7fcaf40235f0,4 csb=(nil),0 cof=0x7fcaf4016610,1300:H1(0x7fcaf4016840)/RAW((nil))/tcpv4(29) cob=(nil),0:NONE((nil))/NONE((nil))/NONE(0) filters={0x7fcaf4016fb0="cache store filter", 0x7fcaf4017080="compression filter"}]
This may be backported to 2.0.
With server-template was introduced the possibility to scale the
number of servers in a backend without needing a configuration change
and associated reload. On the other hand it became impractical to
write use-server rules for these servers as they would only accept
existing server labels as argument. This patch allows the use of
log-format notation to describe targets of a use-server rules, such
as in the example below:
listen test
bind *:1234
use-server %[hdr(srv)] if { hdr(srv) -m found }
use-server s1 if { path / }
server s1 127.0.0.1:18080
server s2 127.0.0.1:18081
If a use-server rule is applied because it was conditionned by an
ACL returning true, but the target of the use-server rule cannot be
resolved, no other use-server rule is evaluated and we fall back to
load balancing.
This feature was requested on the ML, and bumped with issue #563.
It's more generic and versatile than the previous shut_your_big_mouth_gcc()
that was used to silence annoying warnings as it's not limited to ignoring
syscalls returns only. This allows us to get rid of the aforementioned
function and the shut_your_big_mouth_gcc_int variable, that started to
look ugly in multi-threaded environments.
In the rare case of immediate connect() (unix sockets, socket pairs, and
occasionally TCP over the loopback), it is counter-productive to subscribe
for sending and then getting immediately back to process_stream() after
having passed through si_cs_process() just to update the connection. We
already know it is established and it doesn't have any handshake anymore
so we just have to complete it and return to process_stream() with the
stream_interface in the SI_ST_RDY state. In this case, process_stream will
simply loop back to the beginning to synchronize the state and turn it to
SI_ST_EST/ASS/CLO/TAR etc.
This will save us from having to needlessly subscribe in the connect()
code, something which in addition cannot work with edge-triggered pollers.
This lock was only needed to protect the buffer_wq list, but now we have
the mt_list for this. This patch simply turns the buffer_wq list to an
mt_list and gets rid of the lock.
It's worth noting that the whole buffer_wait thing still looks totally
wrong especially in a threaded context: the wakeup_cb() callback is
called synchronously from any thread and may end up calling some
connection code that was not expected to run on a given thread. The
whole thing should probably be reworked to use tasklets instead and be
a bit more centralized.
This counter is already incremented when a new request is received (or if an
error occurred waiting it). So it must not be incremented when the stream is
terminated, at the end of process_strem(). This bug was introduced by the commit
cff0f739e ("MINOR: counters: Review conditions to increment counters from
analysers").
No backport needed.
In process_stream(), when a client or a server abort is handled, the
corresponding listener's counter is incremented. But, we must be sure to have a
listener attached to the session. This bug was introduced by the commit
cff0f739e5.
Thanks to Fred to reporting me the bug.
No need to backport this patch, except if commit cff0f739e5 is backported.
The flags in the act_flag enum have been renamed act_opt. It means ACT_OPT
prefix is used instead of ACT_FLAG. The purpose of this patch is to reserve the
action flags for the actions configuration.
Now, for these counters, the following rules are followed to know if it must be
incremented or not:
* if it exists for a frontend, the counter is incremented
* if stats must be collected for the session's listener, if the counter exists
for this listener, it is incremented
* if the backend is already assigned, if the counter exists for this backend,
it is incremented
* if a server is attached to the stream, if the counter exists for this
server, it is incremented
It is not hardcoded rules. Some counters are still handled in a different
way. But many counters are incremented this way now.
For more than a decade we've kept all the sess_update_st_*() functions
in stream.c while they're only there to work in relation with what is
currently being done in backend.c (srv_redispatch_connect, connect_server,
etc). Let's move all this pollution over there and take this opportunity
to try to find slightly less confusing names for these old functions
whose role is only to handle transitions from one specific stream-int
state:
sess_update_st_rdy_tcp() -> back_handle_st_rdy()
sess_update_st_con_tcp() -> back_handle_st_con()
sess_update_st_cer() -> back_handle_st_cer()
sess_update_stream_int() -> back_try_conn_req()
sess_prepare_conn_req() -> back_handle_st_req()
sess_establish() -> back_establish()
The last one remained in stream.c because it's more or less a completion
function which does all the initialization expected on a connection
success or failure, can set analysers and emit logs.
The other ones could possibly slightly benefit from being modified to
take a stream-int instead since it's really what they're working with,
but it's unimportant here.
In process_sticking_rules() we only want to apply the first store-request
rule for a given table, but when doing so we need to make sure we only
count actual store-request rules when we list the sticking rules.
Failure to do so leads to not being able to write store-request and match
sticking rules in any order as a match rule after a store-request rule
will be ignored.
The following configuration reproduces the issue:
global
stats socket /tmp/foobar
defaults
mode http
frontend in
bind *:8080
default_backend bar
backend bar
server s1 127.0.0.1:21212
server s2 127.0.0.1:21211
stick store-request req.hdr(foo)
stick match req.hdr(foo)
stick-table type string size 10
listen foo
bind *:21212
bind *:21211
http-request deny deny_status 200 if { dst_port 21212 }
http-request deny
This patch fixes issue #448 and should be backported as far as 1.6.
Building on a 32-bit platform produces these warnings in trace code:
src/stream.c: In function 'strm_trace':
src/stream.c:226:29: warning: format '%lu' expects argument of type 'long unsigned int', but argument 9 has type 'size_t {aka const unsigned int}' [-Wformat=]
chunk_appendf(&trace_buf, " req=(%p .fl=0x%08x .ana=0x%08x .exp(r,w,a)=(%u,%u,%u) .o=%lu .tot=%llu .to_fwd=%u)",
^
src/stream.c:229:29: warning: format '%lu' expects argument of type 'long unsigned int', but argument 9 has type 'size_t {aka const unsigned int}' [-Wformat=]
chunk_appendf(&trace_buf, " res=(%p .fl=0x%08x .ana=0x%08x .exp(r,w,a)=(%u,%u,%u) .o=%lu .tot=%llu .to_fwd=%u)",
^
src/mux_fcgi.c: In function 'fcgi_trace':
src/mux_fcgi.c:443:29: warning: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'size_t {aka const unsigned int}' [-Wformat=]
chunk_appendf(&trace_buf, " - VAL=%lu", *val);
^
src/mux_h1.c: In function 'h1_trace':
src/mux_h1.c:290:29: warning: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'size_t {aka const unsigned int}' [-Wformat=]
chunk_appendf(&trace_buf, " - VAL=%lu", *val);
^
Let's just cast the type to long. This should be backported to 2.1.
For backends and servers, some average times for last 1024 connections are
already calculated. For the moment, the averages for the time passed in the
queue, the connect time, the response time (for HTTP session only) and the total
time are calculated. Now, in addition, the maximum time observed for these
values are also stored.
In addition, These new counters are cleared as all other max values with the CLI
command "clear counters".
This patch is related to #272.