This commit separes the "struct list" used for the chain the "struct
pattern" which contain the pattern data. Later, this change will permit
to manipulate lists ans trees with the same "struct pattern".
Each pattern parser take only one string. This change is reported to the
function prototype of the function "pattern_register()". Now, it is
called with just one string and no need to browse the array of args.
After the previous patches, the "pat_parse_strcat()" function disappear,
and the "pat_parse_int()" and "pat_parse_dotted_ver()" functions dont
use anymore the "opaque" argument, and take only one string on his
input.
So, after this patch, each pattern parser no longer use the opaque
variable and take only one string as input. This patch change the
prototype of the pattern parsing functions.
Now, the "char **args" is replaced by a "char *arg", the "int *opaque"
is removed and these functions return 1 in succes case, and 0 if fail.
This patch remove the limit of 32 groups. It also permit to use standard
"pat_parse_str()" function in place of "pat_parse_strcat()". The
"pat_parse_strcat()" is no longer used and its removed. Before this
patch, the groups are stored in a bitfield, now they are stored in a
list of strings. The matching is slower, but the number of groups is
low and generally the list of allowed groups is short.
The fetch function "smp_fetch_http_auth_grp()" used with the name
"http_auth_group" return valid username. It can be used as string for
displaying the username or with the acl "http_auth_group" for checking
the group of the user.
Maybe the names of the ACL and fetch methods are no longer suitable, but
I keep the current names for conserving the compatibility with existing
configurations.
The function "userlist_postinit()" is created from verification code
stored in the big function "check_config_validity()". The code is
adapted to the new authentication storage system and it is moved in the
"src/auth.c" file. This function is used to check the validity of the
users declared in groups and to check the validity of groups declared
on the "user" entries.
This resolve function is executed before the check of all proxy because
many acl needs solved users and groups.
Large configurations can take time to parse when thousands of backends
are in use. Let's store all the proxies in trees.
findproxy_mode() has been modified to use the tree for lookups, which
has divided the parsing time by about 2.5. But many lookups are still
present at many places and need to be dealt with.
We store the time stamp of last read in the channel in order to
be able to measure some bit rate and pause lengths. We only use
16 bits which were unused for this. We don't need more, as it
allows us to measure with a millisecond precision for up to 65s.
These ones are only reset during transfers. There is a low but non-null
risk that a first full read causes the previous value to be reused and
immediately to immediately set the CF_STREAMER flag. The impact is only
to increase earlier than expected the SSL record size and to use splice().
This bug was already present in 1.4, so a backport is possible.
Summary:
Track and report last session time on the stats page for each server
in every backend, as well as the backend.
This attempts to address the requirement in the ROADMAP
- add a last activity date for each server (req/resp) that will be
displayed in the stats. It will be useful with soft stop.
The stats page reports this as time elapsed since last session. This
change does not adequately address the requirement for long running
session (websocket, RDP... etc).
Till now, we had one flag per stick counter to indicate if it was
tracked in a backend or in a frontend. We just had to add another
flag per stick-counter to indicate if it relies on contents or just
connection. These flags are quite painful to maintain and tend to
easily conflict with other flags if their number is changed.
The correct solution consists in moving the flags to the stkctr struct
itself, but currently this struct is made of 2 pointers, so adding a
new entry there to store only two bits will cause at least 16 more bytes
to be eaten per counter due to alignment issues, and we definitely don't
want to waste tens to hundreds of bytes per session just for things that
most users don't use.
Since we only need to store two bits per counter, an intermediate
solution consists in replacing the entry pointer with a composite
value made of the original entry pointer and the two flags in the
2 unused lower bits. If later a need for other flags arises, we'll
have to store them in the struct.
A few inline functions have been added to abstract the retrieval
and assignment of the pointers and flags, resulting in very few
changes. That way there is no more dependence on the number of
stick-counters and their position in the session flags.
One year ago, commit 5d5b5d8 ("MEDIUM: proto_tcp: add support for tracking
L7 information") brought support for tracking L7 information in tcp-request
content rules. Two years earlier, commit 0a4838c ("[MEDIUM] session-counters:
correctly unbind the counters tracked by the backend") used to flush the
backend counters after processing a request.
While that earliest patch was correct at the time, it became wrong after
the second patch was merged. The code does what it says, but the concept
is flawed. "TCP request content" rules are evaluated for each HTTP request
over a single connection. So if such a rule in the frontend decides to
track any L7 information or to track L4 information when an L7 condition
matches, then it is applied to all requests over the same connection even
if they don't match. This means that a rule such as :
tcp-request content track-sc0 src if { path /index.html }
will count one request for index.html, and another one for each of the
objects present on this page that are fetched over the same connection
which sent the initial matching request.
Worse, it is possible to make the code do stupid things by using multiple
counters:
tcp-request content track-sc0 src if { path /foo }
tcp-request content track-sc1 src if { path /bar }
Just sending two requests first, one with /foo, one with /bar, shows
twice the number of requests for all subsequent requests. Just because
both of them persist after the end of the request.
So the decision to flush backend-tracked counters was not the correct
one. In practice, what is important is to flush countent-based rules
since they are the ones evaluated for each request.
Doing so requires new flags in the session however, to keep track of
which stick-counter was tracked by what ruleset. A later change might
make this easier to maintain over time.
This bug is 1.5-specific, no backport to stable is needed.
In addition to previous outputs, we also emit the cumulated number of
connections, the cumulated number of requests, the maximum allowed
SSL connection concurrency, the current number of SSL connections and
the cumulated number of SSL connections. This will help troubleshoot
systems which experience memory shortage due to SSL.
This function is used to compute the new polling state based on
the previous state. All pollers have to do this in their update
loop, so better centralize the logic for it.
Currently, each poll loop handles the polled events the same way,
resulting in a lot of duplicated, complex code. Additionally, epoll
was the only one to handle newly created FDs immediately.
So instead, let's move that code to fd.c in a new function dedicated
to this task : fd_process_polled_events(). All pollers now use this
function.
This is the reimplementation of the "done" action : when we experience
a short read, we're almost certain that we've exhausted the system's
buffers and that we'll meet an EAGAIN if we attempt to read again. If
the FD is not yet polled, the stream interface already takes care of
stopping the speculative read. When the FD is already being polled, we
have two options :
- either we're running from a level-triggered poller, in which case
we'd rather report that we've reached the end so that we don't
speculate over the poller and let it report next time data are
available ;
- or we're running from an edge-triggered poller in which case we
have no choice and have to see the EAGAIN to re-enable events.
At the moment we don't have any edge-triggered poller, so it's desirable
to avoid speculative I/O that we know will fail.
Note that this must not be ported to SSL since SSL hides the real
readiness of the file descriptor.
Thanks to this change, we observe no EAGAIN anymore during keep-alive
transfers, and failed recvfrom() are reduced by half in http-server-close
mode (the client-facing side is always being polled and the second recv
can be avoided). Doing so results in about 5% performance increase in
keep-alive mode. Similarly, we used to have up to about 1.6% of EAGAIN
on accept() (1/maxaccept), and these have completely disappeared under
high loads.
It's easier and safer to rely on conn_xprt_ready() everywhere than to
check the flag itself. It will also simplify adding extra checks later
if needed. Some useless controls for !xprt have been removed, as the
XPRT_READY flag itself guarantees xprt is set.
It's easier and safer to rely on conn_ctrl_ready() everywhere than to
check the flag itself. It will also simplify adding extra checks later
if needed. Some useless controls for !ctrl have been removed, as the
CTRL_READY flag itself guarantees ctrl is set.
We simply remove these functions and replace their calls with the
appropriate ones :
- if we're in the data phase, we can simply report wait on the FD
- if we're in the socket phase, we may also have to signal the
desire to read/write on the socket because it might not be
active yet.
These flags were used to report the readiness of the file descriptor.
Now this readiness is directly checked at the file descriptor itself.
This removes the need for constantly synchronizing updates between the
file descriptor and the connection and ensures that all layers share
the same level of information.
For now, the readiness is updated in conn_{sock,data}_poll_* by directly
touching the file descriptor. This must move to the lower layers instead
so that these functions can disappear as well. In this state, the change
works but is incomplete. It's sensible enough to avoid making it more
complex.
Now the sock/data updates become much simpler because they just have to
enable/disable access to a file descriptor and not to care anymore about
its readiness.
This commit heavily changes the polling system in order to definitely
fix the frequent breakage of SSL which needs to remember the last
EAGAIN before deciding whether to poll or not. Now we have a state per
direction for each FD, as opposed to a previous and current state
previously. An FD can have up to 8 different states for each direction,
each of which being the result of a 3-bit combination. These 3 bits
indicate a wish to access the FD, the readiness of the FD and the
subscription of the FD to the polling system.
This means that it will now be possible to remember the state of a
file descriptor across disable/enable sequences that generally happen
during forwarding, where enabling reading on a previously disabled FD
would result in forgetting the EAGAIN flag it met last time.
Several new state manipulation functions have been introduced or
adapted :
- fd_want_{recv,send} : enable receiving/sending on the FD regardless
of its state (sets the ACTIVE flag) ;
- fd_stop_{recv,send} : stop receiving/sending on the FD regardless
of its state (clears the ACTIVE flag) ;
- fd_cant_{recv,send} : report a failure to receive/send on the FD
corresponding to EAGAIN (clears the READY flag) ;
- fd_may_{recv,send} : report the ability to receive/send on the FD
as reported by poll() (sets the READY flag) ;
Some functions are used to report the current FD status :
- fd_{recv,send}_active
- fd_{recv,send}_ready
- fd_{recv,send}_polled
Some functions were removed :
- fd_ev_clr(), fd_ev_set(), fd_ev_rem(), fd_ev_wai()
The POLLHUP/POLLERR flags are now reported as ready so that the I/O layers
knows it can try to access the file descriptor to get this information.
In order to simplify the conditions to add/remove cache entries, a new
function fd_alloc_or_release_cache_entry() was created to be used from
pollers while scanning for updates.
The following pollers have been updated :
ev_select() : done, built, tested on Linux 3.10
ev_poll() : done, built, tested on Linux 3.10
ev_epoll() : done, built, tested on Linux 3.10 & 3.13
ev_kqueue() : done, built, tested on OpenBSD 5.2
We're completely changing the way FDs will be polled. There will be no
more speculative I/O since we'll know the exact FD state, so these will
only be cached events.
First, let's fix a few field names which become confusing. "spec_e" was
used to store a speculative I/O event state. Now we'll store the whole
R/W states for the FD there. "spec_p" was used to store a speculative
I/O cache position. Now let's clearly call it "cache".
We're completely changing the way FDs will be polled. First, let's fix
a few field names which become confusing. "spec_e" was used to store a
speculative I/O event state. Now we'll store the whole R/W states for
the FD there.
It is quite often that an connection error only reports "socket error" with
no more information. This is especially problematic with health checks where
many causes are possible, including resource exhaustion which do not lead to
a valid errno code. So let's add explicit codes to cover these cases.
Till now there was no way to know from a connection if a previous
call to drain() had done any change. This function is used to drain
incoming data and to update the connection's flags at the same time.
It also correctly sets the polling flags on the connection if the
drain function indicates inability to receive. This function will
be used preferably over ctrl->drain() when a connection is used.
This reverts commit 12082663561aa2189d243328060c399f2fd95860.
It randomly breaks SSL. What happens is that if the SSL response is
read at once by the SSL stack and is partially delivered to the buffer,
then there's no way to read the next parts because we wait for some
polling first.
So we'll fix this after the polling rework.
This function is called twice per request, and does almost always nothing.
Better use an inline version to avoid entering it when we can.
About 0.5% additional performance was gained this way.
si_connect() used to only return SI_ST_CON. But it already detect the
connection reuse and is the function which avoids calling connect().
So it already knows the connection is valid and reuse. Thus we make it
return SI_ST_EST when a connection is reused. This means that
connect_server() can return this state and sess_update_stream_int()
as well.
Thanks to this change, we don't need to leave process_session() in
SI_ST_CON state to immediately enter it again to switch to SI_ST_EST.
Implementing this removes one call to process_session() per request
in keep-alive mode. We're now at 2 calls per request, which is the
minimum (one for the request and another one for the response). The
number of calls to http_wait_for_response() has also dropped from 2
to one.
Tests indicate a performance gain of about 2.6% in request rate in
keep-alive mode. There should be no gain in http-server-close() since
we don't use this faster path.
This reverts commit f3221f99acdd792352d4ee648d987270d74ca38e.
Igor reported some very strange breakage of his stats page which is
clearly caused by the chunking, though I don't see at first glance
what could be wrong. Better revert it for now.
In theory the principle is simple as we just need to send HTTP chunks
if the client is 1.1 compatible. In practice it's harder because we
have to append a CR LF after each block of data and we're never sure
to have the room for this. In order not to have to deal with this, we
instead send the CR LF prior to each chunk size. The only issue is for
the first chunk and for this reason we avoid to send the empty header
line when using chunked encoding.
If a file descriptor is being polled, and it stopped (eg: buffer full
or end of response), then re-enabled, currently what happens is that
the polling is disabled, then the fd is enabled in speculative mode,
an I/O attempt is made, it loses (otherwise the FD would surely not
have been polled), and the polled is enabled again.
This is too bad, especially with HTTP keep-alive on the server side
where all operations are performed at once before going back to the
poll loop.
Now we improve the behaviour by ensuring that if an fd is still being
polled, when it's enabled after having been disabled, we re-enable the
polling. Doing so saves a number of syscalls and useless wakeups, and
results in a significant performance gain on HTTP keep-alive. A 11%
increase has been observed on the HTTP request rate in keep-alive
thanks to this.
It could be considered as a bug fix, but there was no harm with the
current behaviour, except extra syscalls.
Idle connections are not monitored right now. So if a server closes after
a response without advertising it, it won't be detected until a next
request wants to use the connection. This is a bit problematic because
it unnecessarily maintains file descriptors and sockets in an idle
state.
This patch implements a very simple idle connection manager for the stream
interface. It presents itself as an I/O callback. The HTTP engine enables
it when it recycles a connection. If a close or an error is detected on the
underlying socket, it tries to drain as much data as possible from the socket,
detect the close and responds with a close as well, then detaches from the
stream interface.
The throttling of low weight servers (<16) could mistakenly be reported
as > 100% due to a rounding that was performed before a multiply by 100
instead of after. This was introduced in 1.5-dev20 when fixing a previous
reporting issue by commit d32c399 (MINOR: stats: report correct throttling
percentage for servers in slowstart).
It should be backported if the patch above is backported.
This is the best place to reuse a connection. We centralize all
connection requests and we're at the best place to know exactly
what the current state of the underlying connection is. If the
connection is reused, we just enable polling for send() in order
to be able to emit the request.
When allocating a new connection, only the caller knows whether it's
acceptable to reuse the previous one or not. Let's pass this information
to si_alloc_conn() which will do the cleanup if the connection is not
acceptable.
Right now we see many places doing their own setsockopt(SO_LINGER).
Better only do it just before the close() in fd_delete(). For this
we add a new flag on the file descriptor, indicating if it's safe or
not to linger. If not (eg: after a connect()), then the setsockopt()
call is automatically performed before a close().
The flag automatically turns to safe when receiving a read0.
conn_xprt_ready() reports if the transport layer is ready.
conn_ctrl_ready() reports if the control layer is ready.
The stream interface uses si_conn_ready() to report that the
underlying connection is ready. This will be used for connection
reuse in keep-alive mode.
Doing so ensures that we're consistent between all the functions in the whole
chain. This is important so that we can extract the argument parsing from this
function.
This patch adds map manipulation commands to the socket interface.
add map <map> <key> <value>
Add the value <value> in the map <map>, at the entry corresponding to
the key <key>. This command does not verify if the entry already
exists.
clear map <map>
Remove entries from the map <map>
del map <map> <key>
Delete all the map entries corresponding to the <key> value in the map
<map>.
set map <map> <key> <value>
Modify the value corresponding to each key <key> in a map <map>. The
new value is <value>.
show map [<map>]
Dump info about map converters. Without argument, the list of all
available maps are returned. If a <map> is specified, is content is
dumped.
With this patch, patterns can be compiled for two modes :
- match
- lookup
The match mode is used for example in ACLs or maps. The lookup mode
is used to lookup a key for pattern maintenance. For example, looking
up a network is different from looking up one address belonging to
this network.
A special case is made for regex. In lookup mode they return the input
regex string and do not compile the regex.
Now, the pat_parse_*() functions parses the incoming data. The input
"pattern" struct can be preallocated. If the parser needs to add some
buffers, it allocates memory.
The function pattern_register() runs the call to the parser, process
the key indexation and associate the "sample_storage" used by maps.
This patch remove the compatibility check from the input type and the
match method. Now, it checks if a casts from the input type to output
type exists and the pattern_exec_match() function apply casts before
each pattern matching.
This is used later for increasing the compability with incoming
sample types. When multiple compatible types are supported, one
is arbitrarily used (eg: UINT).