The union name "data" is a little bit heavy while we read the source
code because we can read "data.data.sint". The rename from "data" to "u"
makes the read easiest like "data.u.sint".
This extract is not really required, but it maybe will be usefull later.
A comming soonpatch about simplification of stick table values will
use this union
This patch remove the struct information stored both in the struct
sample_data and in the striuct sample. Now, only thestruct sample_data
contains data, and the struct sample use the struct sample_data for storing
his own data.
This is useful to prevent cross includes. The header file sample.h
needs to include proto_http, stick_tables.h will need to include
sample.h and proto_http includes stick_tables.h.
I choose to move the known http method define because this enum is
mainly used in sample.h. This enum is used for the sample type method.
appsessions started to be deprecated with the introduction of stick
tables, and the latter are much more powerful and flexible, and in
addition they are replicated between nodes and maintained across
reloads. Let's now remove appsession completely.
Since sample fetches are not always available in the response phase,
this patch implements %HQ such that:
GET /foo?bar=baz HTTP/1.0
...would be logged as:
?bar=baz
These ones are considered safe as they have already been reused.
They will be useful in "aggressive" and "always" http-reuse modes
in order to place the first request of a connection with the least
risk.
For now it only supports "never", meaning that we never want to reuse a
shared connection, and "always", meaning that we can use any connection
that was not marked private. When "never" is set, this also implies that
no idle connection may become a shared one.
This flag is set on an outgoing connection when this connection gets
some properties that must not be shared with other connections, such
as dynamic transparent source binding, SNI or a proxy protocol header,
or an authentication challenge from the server. This will be needed
later to implement connection reuse.
This function is now dedicated to idle connections only, which means
that it must not be used without any endpoint nor anything not a
connection. The connection remains attached to the stream interface.
This list member will be used to attach a connection to a list of
idle, reusable or queued connections. It's unused for now. Given
that it's not expected to be used more than a few times per session,
the member was put after the target, in the area starting at the
second cache line of the structure.
For now it's not populated but we have the list entry. It will carry
all idle connections that sessions don't want to share. They may be
used later to reclaim connections upon socket shortage for example.
This function only detaches the endpoint from the stream-int and
optionally returns the original pointer. This will be needed to
steal idle connections from other connections.
Since we now always call this function with the reuse parameter cleared,
let's simplify the function's logic as it cannot return the existing
connection anymore. The savings on this inline function are appreciable
(240 bytes) :
$ size haproxy.old haproxy.new
text data bss dec hex filename
1020383 40816 36928 1098127 10c18f haproxy.old
1020143 40816 36928 1097887 10c09f haproxy.new
This patch removes the 32 bits unsigned integer and the 32 bit signed
integer. It replaces these types by a unique type 64 bit signed.
This makes easy the usage of integer and clarify signed and unsigned use.
With the previous version, signed and unsigned are used ones in place of
others, and sometimes the converter loose the sign. For example, divisions
are processed with "unsigned", if one entry is negative, the result is
wrong.
Note that the integer pattern matching and dotted version pattern matching
are already working with signed 64 bits integer values.
There is one user-visible change : the "uint()" and "sint()" sample fetch
functions which used to return a constant integer have been replaced with
a new more natural, unified "int()" function. These functions were only
introduced in the latest 1.6-dev2 so there's no impact on regular
deployments.
These are the 64-bit equivalent of htonl() and ntohl(). They're a bit
tricky in order to avoid expensive operations.
The principle consists in letting the compiler detect we're playing
with a union and simplify most or all operations. The asm-optimized
htonl() version involving bswap (x86) / rev (arm) / other is a single
operation on little endian, or a NOP on big-endian. In both cases,
this lets the compiler "see" that we're rebuilding a 64-bit word from
two 32-bit quantities that fit into a 32-bit register. In big endian,
the whole code is optimized out. In little endian, with a decent compiler,
a few bswap and 2 shifts are left, which is the minimum acceptable.
This patch adds 3 functions for 64 bit integer conversion.
* lltoa_r : converts signed 64 bit integer to string
* read_uint64 : converts from string to signed 64 bits integer with capping
* read_int64 : converts from string to unsigned 64 bits integer with capping
This patch introduces three new functions which can be used to find a
server in a farm using different server information:
- server unique id (srv->puid)
- server name
- find best match using either name or unique id
When performing best matching, the following applies:
- use the server name first (if provided)
- use the server id if provided
in any case, the function can update the caller about mismatches
encountered.
This flag aims at reporting whether the server unique id (srv->puid) has
been forced by the administrator in HAProxy's configuration.
If not set, it means HAProxy has generated automatically the server's
unique id.
function proxy_find_best_match can update the caller by updating an int
provided in argument.
For now, proxy_find_best_match hardcode bit values 0x01, 0x02 and 0x04,
which is not understandable when reading a code exploiting them.
This patch defines 3 macros with a more explicit wording, so further
reading of a code exploiting the magic bit values will be understandable
more easily.
Change si_alloc_conn() to call si_release_endpoint() instead of
open-coding the connection releasing code when reuse is disabled.
This fuses the code with the one already dealing with applets, makes
it shorter and helps centralizing the connection freeing logic at a
single place.
The commit "MEDIUM: vars: move the session variables to the session, not the stream" (ebcd4844e82a4198ea5d98fe491a46267da1d1ec")
moves the variables from the stream to the session. It forgot to remove
the stream definition of the "vars_sess".
The new "sni" server directive takes a sample fetch expression and
uses its return value as a hostname sent as the TLS SNI extension.
A typical use case consists in forwarding the front connection's SNI
value to the server in a bridged HTTPS forwarder :
sni ssl_fc_sni
ssl_sock_set_servername() is used to set the SNI hostname on an
outgoing connection. This function comes from code originally
provided by Christopher Faulet of Qualys.
This option enables overriding source IP address in a HTTP request. It is
useful when we want to set custom source IP (e.g. front proxy rewrites address,
but provides the correct one in headers) or we wan't to mask source IP address
for privacy or compliance.
It acts on any expression which produces correct IP address.
This modification makes possible to use sample_fetch_string() in more places,
where we might need to fetch sample values which are not plain strings. This
way we don't need to fetch string, and convert it into another type afterwards.
When using aliased types, the caller should explicitly check which exact type
was returned (e.g. SMP_T_IPV4 or SMP_T_IPV6 for SMP_T_ADDR).
All usages of sample_fetch_string() are converted to use new function.
This is in order to avoid conflicting with NetBSD popcount* functions
since 6.x release, the final l to mentions the argument is a long like
NetBSD does.
This patch could be backported to 1.5 to fix the build issue there as well.
This cache is used by 51d converter. The input User-Agent string, the
converter args and a random seed are used as a hashing key. The cached
entries contains a pointer to the resulting string for specific
User-Agent string detection.
The cache size can be tuned using 51degrees-cache-size parameter.
Moved 51Degrees code from src/haproxy.c, src/sample.c and src/cfgparse.c
into a separate files src/51d.c and include/import/51d.h.
Added two new functions init_51degrees() and deinit_51degrees(), updated
Makefile and other code reorganizations related to 51Degrees.
When Lua is disabled, the alternate functions must have the same
prototype as the original ones, otherwise we get such warnings :
src/stream.c:278:27: warning: too many arguments in call to 'hlua_ctx_destroy'
hlua_ctx_destroy(&s->hlua);
~~~~~~~~~~~~~~~~ ^
No backport is needed.
This patch adds two functions used for variable acces using the
variable full name. If the variable doesn't exists in the variable
pool name, it is created.
This patch adds support of variables during the processing of each stream. The
variables scope can be set as 'session', 'transaction', 'request' or 'response'.
The variable type is the type returned by the assignment expression. The type
can change while the processing.
The allocated memory can be controlled for each scope and each request, and for
the global process.
fix include dependency. The header file sample.h don't need to known
the content of the struct arg, so I remove the include, and replace
it by a simple pointer declaration.
This prevent an include dependecy issue with the next patch.
This patch permits to register a new keyword with the keyword "tcp-request content"
'tcp-request connection", tcp-response content", http-request" and "http-response"
which is identified only by matching the start of the keyword.
for example, we register the keyword "set-var" with the option "match_pfx"
and the configuration keyword "set-var(var_name)" matchs this entry.
This type is used to accept any type of sample as input, and prevent
any automatic "cast". It runs like the type "ADDR" which accept the
type "IPV4" and "IPV6".
Implementation of a DNS client in HAProxy to perform name resolution to
IP addresses.
It relies on the freshly created UDP client to perform the DNS
resolution. For now, all UDP socket calls are performed in the
DNS layer, but this might change later when the protocols are
extended to be more suited to datagram mode.
A new section called 'resolvers' is introduced thanks to this patch. It
is used to describe DNS servers IP address and also many parameters.
Basic introduction of a UDP layer in HAProxy. It can be used as a
client only and manages UDP exchanges with servers.
It can't be used to load-balance UDP protocols, but only used by
internal features such as DNS resolution.
Ability to change a server IP address during HAProxy run time.
For now this is provided via function update_server_addr() which
currently is not called.
A log is emitted on each change. For now we do it inconditionally,
but later we'll want to do it only on certain circumstances, which
explains why the logging block is enclosed in if(1).
Since commit 65d805fd witch removes standard.h from compat.h some
values were not properly set on FreeBSD. This caused a segfault
at startup when smp_resolve_args is called.
As FreeBSD have IP_BINDANY, CONFIG_HAP_TRANSPARENT is define. This
cause struct conn_src to be extended with some fields. The size of
this structure was incorrect. Including netinet/in.h fix this issue.
While diving in code preprocessing, I found that limits.h was require
to properly set MAX_HOSTNAME_LEN, ULONG_MAX, USHRT_MAX and others
system limits on FreeBSD.
Following functions are now available in the SSL public API:
* ssl_sock_create_cert
* ssl_sock_get_generated_cert
* ssl_sock_set_generated_cert
* ssl_sock_generated_cert_serial
These functions could be used to create a certificate by hand, set it in the
cache used to store generated certificates and retrieve it. Here is an example
(pseudo code):
X509 *cacert = ...;
EVP_PKEY *capkey = ...;
char *servername = ...;
unsigned int serial;
serial = ssl_sock_generated_cert_serial(servername, strlen(servername));
if (!ssl_sock_get_generated_cert(serial, cacert)) {
SSL_CTX *ctx = ssl_sock_create_cert(servername, serial, cacert, capkey);
ssl_sock_set_generated_cert(ctx, serial, cacert);
}
With this patch, it is possible to configure HAProxy to forge the SSL
certificate sent to a client using the SNI servername. We do it in the SNI
callback.
To enable this feature, you must pass following BIND options:
* ca-sign-file <FILE> : This is the PEM file containing the CA certitifacte and
the CA private key to create and sign server's certificates.
* (optionally) ca-sign-pass <PASS>: This is the CA private key passphrase, if
any.
* generate-certificates: Enable the dynamic generation of certificates for a
listener.
Because generating certificates is expensive, there is a LRU cache to store
them. Its size can be customized by setting the global parameter
'tune.ssl.ssl-ctx-cache-size'.
It lookup a key in a LRU cache for use with specified domain and revision. It
differs from lru64_get as it does not create missing keys. The function returns
NULL if an error or a cache miss occurs.
Now, When a item is committed in an LRU tree, you can define a function to free
data owned by this item. This function will be called when the item is removed
from the LRU tree or when the tree is destroyed..
This diff is the raw C struct definition of all DeviceAtlas module
data needed added to the main global struct haproxy configuration.
The three first members are needed for both init and deinit phases
as some dynamic memory allocations are done.
The useragentid serves to hold during the whole lifecycle of the
module the User-Agent HTTP Header identifier from the DeviceAtlas
data during the init process.
This diff is for the DeviceAtlas convertor.
This patch adds the following converters :
deviceatlas-json-file
deviceatlas-log-level
deviceatlas-property-separator
First, the configuration keywords handling (only the log
level configuration part does not end the haproxy process
if it is wrongly set, it fallbacks to the default level).
Furthermore, init, deinit phases and the API lookup phase,
the da_haproxy function which is fed by the input provided
and set all necessary properties chosen via the configuration
to the output, separated by the separator.
This patch adds the ssl-dh-param-file global setting. It sets the
default DH parameters that will be used during the SSL/TLS handshake when
ephemeral Diffie-Hellman (DHE) key exchange is used, for all "bind" lines
which do not explicitely define theirs.
Actually, the tcp-request and tcp-response custom ation are always final
actions. This patch create a new type of action that can permit to
continue the evaluation of tcp-request and tcp-response processing.
This patch does'nt add any new feature: the functional behavior
is the same than version 1.0.
Technical differences:
In this version all updates on different stick tables are
multiplexed on the same tcp session. There is only one established
tcp session per peer whereas in first version there was one established
tcp session per peer and per stick table.
Messages format was reviewed to be more evolutive and to support
further types of data exchange such as SSL sessions or other sticktable's
data types (currently only the sticktable's server id is supported).
This function checks a string for using it in a CSV output format. If
the string contains one of the following four char <">, <,>, CR or LF,
the string is encapsulated between <"> and the <"> are escaped by a <"">
sequence.
The rounding by <"> is optionnal. It can be canceled, forced or the
function choose automatically the right way.
Sometimes it's problematic not to have "http-response redirect" rules,
for example to perform a browser-based redirect based on certain server
conditions (eg: match of a header).
This patch adds "http-response redirect location <fmt>" which gives
enough flexibility for most imaginable operations. The connection to
the server is closed when this is performed so that we don't risk to
forward any pending data from the server.
Any pending response data are trimmed so that we don't risk to
forward anything pending to the client. It's harmless to also do that
for requests so we don't need to consider the direction.
In order to support http-response redirect, the parsing needs to be
adapted a little bit to only support the "location" type, and to
adjust the log-format parser so that it knows the direction of the
sample fetch calls.
This function tries to spot a proxy by its name, ID and type, and
in case some elements don't match, it tries to determine which ones
could be ignored and reports which ones were ignored so that the
caller can decide whether or not it wants to pick this proxy. This
will be used for maintaining the status across reloads where the
config might have changed a bit.
It does the same as the other one except that it only focuses on the
numeric ID and the capabilities. It's used by proxy_find_by_name()
for numeric names.
These ones were already obsoleted in 1.4, marked for removal in 1.5,
and not documented anymore. They used to emit warnings, and do still
require quite some code to stay in place. Let's remove them now.
First, findproxy() was renamed proxy_find_by_name() so that its explicit
that a name is required for the lookup. Second, we give this function
the ability to search for tables if needed. Third we now provide inline
wrappers to pass the appropriate PR_CAP_* flags and to explicitly look
up a frontend, backend or table.
For backend load balancing it sometimes makes sense to redispatch rather
than retrying against the same server. For example, when machines or routers
fail you may not want to waste time retrying against a dead server and
would instead prefer to immediately redispatch against other servers.
This patch allows backend sections to specify that they want to
redispatch on a particular interval. If the interval N is positive the
redispatch occurs on every Nth retry, and if the interval N is negative then
the redispatch occurs on the Nth retry prior to the last retry (-1 is the
default and maintains backwards compatibility). In low latency environments
tuning this setting can save a few hundred milliseconds when backends fail.
Until now, HAproxy needed to be restarted to change the TLS ticket
keys. With this patch, the TLS keys can be updated on a per-file
basis using the admin socket. Two new socket commands have been
introduced: "show tls-keys" and "set ssl tls-keys".
Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>
Within the listener struct we need to use a reference to the TLS
ticket keys which binds the actual keys with the filename. This will
make it possible to update the keys through the socket
Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>
tcpcheck error messages include the step id where the error occurs.
In some cases, this is not enough. Now, HAProxy also use the comment
field of the latest tcpcheck rule which has been run.
This commit allows HAProxy to parse a new directive in the tcpcheck
ruleset: 'comment'.
It is used to setup comments on the current tcpcheck rules.
A new field is added into the tcpcheck_rule structure.
This field will host a string used as a comment to describe the rule.
Then this comment can be used in logs to report a more user friendly
message on the step which failed during the tcpcheck ruleset.
Options are relative to the sample. Each sample fetched is associated with
fetch options or fetch flags.
This patch adds the 'opt' vaue in the sample struct. This permits to reduce
the sample-fetch function prototype. In other way, the converters will have
more detail about the origin of the sample.
This patch removes the structs "session", "stream" and "proxy" from
the sample-fetches and converters function prototypes.
This permits to remove some weight in the prototype call.
Some sample analyzer (sample-fetch or converters) needs to known the proxy,
session and stream attached to the sampel. The sample-fetches and the converters
function pointers cannot be called without these 3 pointers filled.
This patch permits to reduce the sample-fetch and the converters called
prototypes, and provides a new mean to add information for this type of
functions.
It is sometimes desirable to wait for the body of an HTTP request before
taking a decision. This is what is being done by "balance url_param" for
example. The first use case is to buffer requests from slow clients before
connecting to the server. Another use case consists in taking the routing
decision based on the request body's contents. This option placed in a
frontend or backend forces the HTTP processing to wait until either the whole
body is received, or the request buffer is full, or the first chunk is
complete in case of chunked encoding. It can have undesired side effects with
some applications abusing HTTP by expecting unbufferred transmissions between
the frontend and the backend, so this should definitely not be used by
default.
Note that it would not work for the response because we don't reset the
message state before starting to forward. For the response we need to
1) reset the message state to MSG_100_SENT or BODY , and 2) to reset
body_len in case of chunked encoding to avoid counting it twice.
Since 1.5, the request body analyser has become independant from any
other element and does not even disturb the message forwarder anymore.
And since it's disabled by default, we can place it before most
analysers so that it's can preempt any other one if an intermediary
one enables it.
Recently some browsers started to implement a "pre-connect" feature
consisting in speculatively connecting to some recently visited web sites
just in case the user would like to visit them. This results in many
connections being established to web sites, which end up in 408 Request
Timeout if the timeout strikes first, or 400 Bad Request when the browser
decides to close them first. These ones pollute the log and feed the error
counters. There was already "option dontlognull" but it's insufficient in
this case. Instead, this option does the following things :
- prevent any 400/408 message from being sent to the client if nothing
was received over a connection before it was closed ;
- prevent any log from being emitted in this situation ;
- prevent any error counter from being incremented
That way the empty connection is silently ignored. Note that it is better
not to use this unless it is clear that it is needed, because it will hide
real problems. The most common reason for not receiving a request and seeing
a 408 is due to an MTU inconsistency between the client and an intermediary
element such as a VPN, which blocks too large packets. These issues are
generally seen with POST requests as well as GET with large cookies. The logs
are often the only way to detect them.
This patch should be backported to 1.5 since it avoids false alerts and
makes it easier to monitor haproxy's status.
The principle of this cache is to have a global cache for all pattern
matching operations which rely on lists (reg, sub, dir, dom, ...). The
input data, the expression and a random seed are used as a hashing key.
The cached entries contains a pointer to the expression and a revision
number for that expression so that we don't accidently used obsolete
data after a pattern update or a very unlikely hash collision.
Regarding the risk of collisions, 10k entries at 10k req/s mean 1% risk
of a collision after 60 years, that's already much less than the memory's
reliability in most machines and more durable than most admin's life
expectancy. A collision will result in a valid result to be returned
for a different entry from the same list. If this is not acceptable,
the cache can be disabled using tune.pattern.cache-size.
A test on a file containing 10k small regex showed that the regex
matching was limited to 6k/s instead of 70k with regular strings.
When enabling the LRU cache, the performance was back to 70k/s.
This will be used to detect any change on the pattern list between
two operations, ultimately making it possible to implement a cache
which immediately invalidates obsolete keys after an update. The
revision is simply taken from the timestamp counter to ensure that
even upon a pointer reuse we cannot accidently come back to the
same (expr,revision) tuple.
The xxhash library provides a very fast and excellent hash algorithm
suitable for many purposes. It excels at hashing large blocks but is
also extremely fast on small ones. It's distributed under a 2-clause
BSD license (GPL-compatible) so it can be included here. Updates are
distributed here :
https://github.com/Cyan4973/xxHash
This will be usable to implement some maps/acl caches for heavy datasets
loaded from files (mostly regex-based but in general anything that cannot
be indexed in a tree).
This one returns a timestamp, either the one from the CPU or from
gettimeofday() in 64-bit format. The purpose is to be able to compare
timestamps on various entities to make it easier to detect updates.
It can also be used for benchmarking in certain situations during
development.
This commit adds 4 new log format variables that parse the
HTTP Request-Line for more specific logging than "%r" provides.
For example, we can parse the following HTTP Request-Line with
these new variables:
"GET /foo?bar=baz HTTP/1.1"
- %HM: HTTP Method ("GET")
- %HV: HTTP Version ("HTTP/1.1")
- %HU: HTTP Request-URI ("/foo?bar=baz")
- %HP: HTTP Request-URI without query string ("/foo")
Since appctx are scheduled out of streams, it's pointless to wake up
the task managing the stream to push updates, they won't be seen. In
fact unit tests work because silent sessions are restarted after 5s of
idle and the exchange is correctly scheduled during startup!
So we need to notify the appctx instead. For this we add a pointer to
the appctx in the peer session.
No backport is needed of course.
Currently we have a problem. There are some cases where a sleeping applet
is not woken up (eg: show sess during an injection). The reason is that
the applet is marked WAIT_DATA and is not woken up when WAIT_ROOM leaves,
because we wait for both flags to be cleared in order to call it.
And if we wait for either flag, then we have the opposite situation, which
is that we're not waiting for room in the output buffer so we're spinning
calling the applet to do nothing.
What is missing is an indication of what the applet needs. Since it only
manipulates the WAIT_ROOM/WAIT_DATA which are overwritten later, that cannot
work. In the case of connections, the problem doesn't happen because the
connection maintains these extra states. Ideally we'd need to have similar
states for each appctx and to store those information there. But it would
be overcomplicated given that an applet doesn't exist alone without a
stream-int, so we can safely put these information into the stream int and
make the code simpler.
With this patch we introduce two new flags in the stream interface :
- SI_FL_WANT_PUT : the applet wants to put something into the buffer
- SI_FL_WANT_GET : the applet wants to get something from the buffer
We also have the new functions si_applet_{stop|want|cant}_{get|put}
to make the code look similar to the connection code.
For now these flags are not used yet.
This is the equivalent of si_conn_wake() but for applets. It will be
called after changes to the stream interface are brought by the applet
I/O handler. Ultimately it will release buffers and may be even wake
the stream's task up if some important changes are detected.
It would be nice to be able to merge it with the connection's wake
function since it mostly manipulates the stream interface, but there
are minor differences (such as how to enable/disable polling on a fd
vs applet) and some specificities to applets (eg: don't wake the
applet up until the output is empty) which would require abstract
functions which would slow down everything.
The new function is called for each round of polling in order to call any
active appctx. For now we pick the stream interface from the appctx's
owner. At the moment there's no appctx queued yet, but we have everything
needed to queue them and remove them.
Now that applet's functions only take an appctx in argument, not a
stream interface. This slightly simplifies the code and will be needed
to take the appctx out of the stream interface.
It happened after changing the stream interface deinitialization
sequence that we got random crashes with si_shutw() being called
on NULL si->end. The reason was that si->ops was not reset after
a call to si_release_endpoint() which is sometimes called directly.
Thus we now move the resetting of si->ops just after any si->end
assignment. It happens that si_detach() is now just the same as
si_release_endpoint() and stream_int_unregister_handler(). Some
cleanup will have to be performed there.
It's not sure whether this problem can impact 1.5 since in 1.5
applets are part of the default embedded stream handler. The only
way it could cause some trouble is if it's used with a connection,
which doesn't seem possible at first glance.
Commit bc4c1ac ("MEDIUM: http/tcp: permit to resume http and tcp custom
actions") introduced the ability to interrupt and restart processing in
the middle of a TCP/HTTP ruleset. But it doesn't do it in a consistent
way : it checks current_rule_list, immediately dereferences current_rule,
which is only set in certain cases and never cleared. So that broke the
tcp-request content rules when the processing was interrupted due to
missing data, because current_rule was not yet set (segfault) or could
have been inherited from another ruleset if it was used in a backend
(random behaviour).
The proper way to do it is to always set current_rule before dereferencing
it. But we don't want to set it for all rules because we don't want any
action to provide a checkpointing mechanism. So current_rule is set to NULL
before entering the loop, and only used if not NULL and if current_rule_list
matches the current list. This way they both serve as a guard for the other
one. This fix also makes the current rule point to the rule instead of its
list element, as it's much easier to manipulate.
No backport is needed, this is 1.6-specific.
This patch adds support for error codes 429 and 405 to Haproxy and a
"deny_status XXX" option to "http-request deny" where you can specify which
code is returned with 403 being the default. We really want to do this the
"haproxy way" and hope to have this patch included in the mainline. We'll
be happy address any feedback on how this is implemented.
We don't pass sess->origin anymore but the pointer to the previous step. Now
it should be much easier to chain elements together once applets are moved out
of streams. Indeed, the session is only used for configuration and not for the
dynamic chaining anymore.
The include file did not protect correctly against multiple inclusions,
as it didn't define the file name after checking for it. That's currently
harmless as the file is only included from .c but that could change.
This patch cretes a new Map class that permits to do some lookup in
HAProxy maps. This Map class is integration in the HAProxy update
system, so we can modify the map throught the socket.
The function was called stream_accept_session(), let's rename it
stream_new() and make it return the newly allocated pointer. It's
more convenient for some callers who need it.
This concerns everythins related to accepting a new session and
expiring the embryonic session. There's still a hard-coded call
to stream_accept_session() which could be set somewhere in the
frontend, but for now it's not a problem.
Now that the previous changes were made, we can add a struct task
pointer to stream_complete() and get rid of it in struct session.
The new relation between connection, session and task are like this :
orig -- sess <-- context
| |
v |
conn -- owner ---> task
Some session-specific parts should now move away from stream.
The function now only initializes a session, calls the tcp req connection
rules, and calls stream_complete() to finish initialization. If a handshake
is needed, it is done without allocating the stream at all.
Temporarily, in order to limit the amount of changes, the task allocated
is put into sess->task, and it is used by the connection for the handshake
or is offered to the stream. At this point we set the relation between
sess/task/conn this way :
orig -- sess <-- context
| ^ +- task -+ |
v | v |
conn -- owner task
The task must not remain in the session and ultimately it is planned to
remove this task pointer from the session because it can be found by
having conn->owner = task, and looping back from sess to conn, and to
find the session from the connection via the task.
It passes a NULL wherever a stream was needed (acl_exec_cond() and
action_ptr mainly). It can still track the connection rate correctly
and block based on ACLs.
In order to support sessions tracking counters, we first ensure that there
is no overlap between streams' stkctr and sessions', and we allow an
automatic lookup into the session's counters when the stream doesn't
have a counter or when the stream doesn't exist during an access via
a sample fetch. The functions used to update the stream counters only
update them and not the session counters however.
The stick counters in the session will be used for everything not related
to contents, hence the connections / concurrent sessions / etc. They will
be usable by "tcp-request connection" rules even without a stream. For now
they're just allocated and initialized.
Doing so ensures we don't need to use the stream anymore to prepare the
log information to report a failed handshake on an embryonic session.
Thus, prepare_mini_sess_log_prefix() now takes a session in argument.
Now that we have sess->origin to carry that information along, we don't
need to put that into strm->target anymore, so we remove one dependence
on the stream in embryonic connections.
Many such function need a session, and till now they used to dereference
the stream. Once we remove the stream from the embryonic session, this
will not be possible anymore.
So as of now, sample fetch functions will be called with this :
- sess = NULL, strm = NULL : never
- sess = valid, strm = NULL : tcp-req connection
- sess = valid, strm = valid, strm->txn = NULL : tcp-req content
- sess = valid, strm = valid, strm->txn = valid : http-req / http-res
The registerable http_req_rules / http_res_rules used to require a
struct http_txn at the end. It's redundant with struct stream and
propagates very deep into some parts (ie: it was the reason for lua
requiring l7). Let's remove it now.
All of them can now retrieve the HTTP transaction *if it exists* from
the stream and be sure to get NULL there when called with an embryonic
session.
The patch is a bit large because many locations were touched (all fetch
functions had to have their prototype adjusted). The opportunity was
taken to also uniformize the call names (the stream is now always "strm"
instead of "l4") and to fix indent where it was broken. This way when
we later introduce the session here there will be less confusion.
Now this one is dynamically allocated. It means that 280 bytes of memory
are saved per TCP stream, but more importantly that it will become
possible to remove the l7 pointer from fetches and converters since
it will be deduced from the stream and will support being null.
A lot of care was taken because it's easy to forget a test somewhere,
and the previous code used to always trust s->txn for being valid, but
all places seem to have been visited.
All HTTP fetch functions check the txn first so we shouldn't have any
issue there even when called from TCP. When branching from a TCP frontend
to an HTTP backend, the txn is properly allocated at the same time as the
hdr_idx.
The header captures are now general purpose captures since tcp rules
can use them to capture various contents. That removes a dependency
on http_txn that appeared in some sample fetch functions and in the
order by which captures and http_txn were allocated.
Interestingly the reset of the header captures were done at too many
places as http_init_txn() used to do it while it was done previously
in every call place.
Just like for the listener, the frontend is session-wide so let's move
it to the session. There are a lot of places which were changed but the
changes are minimal in fact.
There is now a pointer to the session in the stream, which is NULL
for now. The session pool is created as well. Some parts will move
from the stream to the session now.
With HTTP/2, we'll have to support multiplexed streams. A stream is in
fact the largest part of what we currently call a session, it has buffers,
logs, etc.
In order to catch any error, this commit removes any reference to the
struct session and tries to rename most "session" occurrences in function
names to "stream" and "sess" to "strm" when that's related to a session.
The files stream.{c,h} were added and session.{c,h} removed.
The session will be reintroduced later and a few parts of the stream
will progressively be moved overthere. It will more or less contain
only what we need in an embryonic session.
Sample fetch functions and converters will have to change a bit so
that they'll use an L5 (session) instead of what's currently called
"L4" which is in fact L6 for now.
Once all changes are completed, we should see approximately this :
L7 - http_txn
L6 - stream
L5 - session
L4 - connection | applet
There will be at most one http_txn per stream, and a same session will
possibly be referenced by multiple streams. A connection will point to
a session and to a stream. The session will hold all the information
we need to keep even when we don't yet have a stream.
Some more cleanup is needed because some code was already far from
being clean. The server queue management still refers to sessions at
many places while comments talk about connections. This will have to
be cleaned up once we have a server-side connection pool manager.
Stream flags "SN_*" still need to be renamed, it doesn't seem like
any of them will need to move to the session.
This library is designed to emit a zlib-compatible stream with no
memory usage and to favor resource savings over compression ratio.
While zlib requires 256 kB of RAM per compression context (and can only
support 4000 connections per GB of RAM), the stateless compression
offered by libslz does not need to retain buffers between subsequent
calls. In theory this slightly reduces the compression ratio but in
practice it does not have that much of an effect since the zlib
window is limited to 32kB.
Libslz is available at :
http://git.1wt.eu/web?p=libslz.git
It was designed for web compression and provides a lot of savings
over zlib in haproxy. Here are the preliminary results on a single
core of a core2-quad 3.0 GHz in 32-bit for only 300 concurrent
sessions visiting the home page of www.haproxy.org (76 kB) with
the default 16kB buffers :
BW In BW Out BW Saved Ratio memory VSZ/RSS
zlib 237 Mbps 92 Mbps 145 Mbps 2.58 84M / 69M
slz 733 Mbps 380 Mbps 353 Mbps 1.93 5.9M / 4.2M
So while the compression ratio is lower, the bandwidth savings are
much more important due to the significantly lower compression cost
which allows to consume even more data from the servers. In the
example above, zlib became the bottleneck at 24% of the output
bandwidth. Also the difference in memory usage is obvious.
More tests run on a single core of a core i5-3320M, with 500 concurrent
users and the default 16kB buffers :
At 100% CPU (no limit) :
BW In BW Out BW Saved Ratio memory VSZ/RSS hits/s
zlib 480 Mbps 188 Mbps 292 Mbps 2.55 130M / 101M 744
slz 1700 Mbps 810 Mbps 890 Mbps 2.10 23.7M / 9.7M 2382
At 85% CPU (limited) :
BW In BW Out BW Saved Ratio memory VSZ/RSS hits/s
zlib 1240 Mbps 976 Mbps 264 Mbps 1.27 130M / 100M 1738
slz 1600 Mbps 976 Mbps 624 Mbps 1.64 23.7M / 9.7M 2210
The most important benefit really happens when the CPU usage is
limited by "maxcompcpuusage" or the BW limited by "maxcomprate" :
in order to preserve resources, haproxy throttles the compression
ratio until usage is within limits. Since slz is much cheaper, the
average compression ratio is much higher and the input bandwidth
is quite higher for one Gbps output.
Other tests made with some reference files :
BW In BW Out BW Saved Ratio hits/s
daniels.html zlib 1320 Mbps 163 Mbps 1157 Mbps 8.10 1925
slz 3600 Mbps 580 Mbps 3020 Mbps 6.20 5300
tv.com/listing zlib 980 Mbps 124 Mbps 856 Mbps 7.90 310
slz 3300 Mbps 553 Mbps 2747 Mbps 5.97 1100
jquery.min.js zlib 430 Mbps 180 Mbps 250 Mbps 2.39 547
slz 1470 Mbps 764 Mbps 706 Mbps 1.92 1815
bootstrap.min.css zlib 790 Mbps 165 Mbps 625 Mbps 4.79 777
slz 2450 Mbps 650 Mbps 1800 Mbps 3.77 2400
So on top of saving a lot of memory, slz is constantly 2.5-3.5 times
faster than zlib and results in providing more savings for a fixed CPU
usage. For links smaller than 100 Mbps, zlib still provides a better
compression ratio, at the expense of a much higher CPU usage.
Larger input files provide slightly higher bandwidth for both libs, at
the expense of a bit more memory usage for zlib (it converges to 256kB
per connection).
This function used to take a zlib-specific flag as argument to indicate
whether a buffer flush or end of contents was met, let's split it in two
so that we don't depend on zlib anymore.
Thanks to MSIE/IIS, the "deflate" name is ambigous. According to the RFC
it's a zlib-wrapped deflate stream, but IIS used to send only a raw deflate
stream, which is the only format MSIE understands for "deflate". The other
widely used browsers do support both formats. For this reason some people
prefer to emit a raw deflate stream on "deflate" to serve more users even
it that means violating the standards. Haproxy only follows the standard,
so they cannot do this.
This patch makes it possible to have one algorithm name in the configuration
and another one in the protocol. This will make it possible to have a new
configuration token to add a different algorithm so that users can decide if
they want a raw deflate or the standard one.
There's no reason for exporting identity_* nor deflate_*, they're only
used in the same file. Mark them static, it will make it easier to add
other algorithms.
Till now we used to rely on a fixed maximum chunk size. Thanks to last
commit we're now free to adjust the chunk's length before sending the
data, so we don't have to use 6 digits all the time anymore, and if
one wants buffers larger than 16 MB it is now possible.
This class is accessible via the TXN object. It is created only if
the attached proxy have HTTP mode. It contain all the HTTP
manipulation functions:
- req_get_headers
- req_del_header
- req_rep_header
- req_rep_value
- req_add_header
- req_set_header
- req_set_method
- req_set_path
- req_set_query
- req_set_uri
- res_get_headers
- res_del_header
- res_rep_header
- res_rep_value
- res_add_header
- res_set_header
This function is a callback for HTTP actions. This function
creates the replacement string from a build_logline() format
and transform the header.
This patch split this function in two part. With this modification,
the header transformation and the replacement string are separed.
We can now transform the header with another replacement string
source than a build_logline() format.
The first part is the replacement engine. It take a replacement action
number and a replacement string and process the action.
The second part is the function which is called by the 'http-request
action' to replace a request line part. This function makes the
string used as replacement.
This split permits to use the replacement engine in other parts of the
code than the request action. The Lua use it for his own http action.
This will be useful later to state that some listeners have to use
certain decoders (typically an HTTP/2 decoder) regardless of the
regular processing applied to other listeners. For now it simply
defaults to the frontend's default target, and it is used by the
session.
Listerner->timeout is a vestigal thing going back to 2007 or so. It
used to only be used by stats and peers frontends to hold a pointer
to the proxy's client timeout. Now that we use regular frontends, we
don't use it anymore.
Some services such as peers and CLI pre-set the target applet immediately
during accept(), and for this reason they're forced to have a dedicated
accept() function which does not even properly follow everything the regular
one does (eg: sndbuf/rcvbuf/linger/nodelay are not set, etc).
Let's store the default target when known into the frontend's config so that
it's session_accept() which automatically sets it.
The default value is stored in a special struct that describe the
map. This default value is parsed with special parser. This is
useless because HAProxy provides a space to store the default
value (the args) and HAProxy provides also standard parser for
the input types (args again).
This patch remove this special storage and replace it by an argument.
In other way the args of maps are declared as the expected type in
place of strings.
It's now called conn_sock_drain() to make it clear that it only reads
at the sock layer and not at the data layer. The function was too big
to remain inlined and it's used at a few places where size counts.
Currently si_idle_conn_null_cb() has to perform some low-level checks
over the file descriptor and the connection configuration that should
only belong to conn_drain(). Let's move these controls there. The
function now automatically checks for errors and hangups on the file
descriptor for example, and disables recv polling if there's no drain
function at the control layer.
This function is an equivalent to send() which operates over a connection
instead of a file descriptor. It checks that the control layer is ready
and that it's allowed to send. If automatically enables polling if it
cannot send. It simplifies the return checks by returning zero in all
cases where it cannot send so that the caller only has to care about
negative values indicating errors.
This will save callers from having to care about conn->xprt and xprt->shutw.
Note that shutw() takes a second argument indicating whether it's a clean or
a hard shutw. This is used by SSL which tries to close cleanly in most cases.
Here we provide two versions, conn_data_shutw() which performs the clean
close, and conn_data_shutw_hard() which does the unclean one.
This function was not used yet and was only supposed to mark the connection
as shutdown for write. Unfortunately at other places in stream_interface.c,
we're seeing a bit of layering violations with attempts to perform the shutdown
on the fd directly. Let's make this function call shutdown() itself so that
the callers only have to care about the connection.
This struct used to carry only a sample fetch function. Thanks to
lua_pushuserdata(), we don't need to have the Lua engine allocate
that struct for us and we can simply push our pointer onto the stack.
This makes the code more readable by removing several occurrences of
"f->f->". Just like the previous patch, it comes with the nice effect
of saving about 1.3% of performance when fetching samples from Lua.
Since last cleanups, this one was only used to carry a struct channel.
Removing it makes the code a bit cleaner (no more chn->chn) and easier
to follow (no more abstraction for a common type). Interestingly it
happens to also make the Lua code slightly faster (about 1.5%) when
using channels, probably thanks to less pointer dereferences and maybe
the use of lua_pushlightuserdata().
Now that we can get the session from the channel, let's simplify the
prototype of session_alloc_recv_buffer() to only require the channel.
Both the caller and the function are now simplified.
The purpose of these two macros will be to pass via the session to
find the relevant stream interfaces so that we don't need to store
the ->cons nor ->prod pointers anymore. Currently they're only defined
so that all references could be removed.
Note that many places need a second pass of clean up so that we don't
have any chn_prod(&s->req) anymore and only &s->si[0] instead, and
conversely for the 3 other cases.
At a few places we need to find one stream interface from the other one.
Instead of passing via the channel, we simply use the session as an
intermediary, which simply results in applying an offset to the pointer.
We go back to the session to get the owner. Here again it's very easy
and is just a matter of relative offsets. Since the owner always exists
and always points to the session's task, we can remove some unneeded
tests.
In order to plan removal of si->ib / si->ob, we now check the side of the
stream interface and find the session, then the requested channel. In
practice it's just an offset applied to the pointer based on the flag.
This new flag "SI_FL_ISBACK" is set only on the back SI and is cleared
on the front SI. That way it's possible only by looking at the SI to
know what side it is.
We'll soon remove direct references to the channels from the stream
interface since everything belongs to the same session, so let's
first not dereference si->ib / si->ob anymore and use macros instead.
The channels were pointers to outside structs and this is not needed
anymore since the buffers have moved, but this complicates operations.
Move them back into the session so that both channels and stream interfaces
are always allocated for a session. Some places (some early sample fetch
functions) used to validate that a channel was NULL prior to dereferencing
it. Now instead we check if chn->buf is NULL and we force it to remain NULL
until the channel is initialized.
In some cases we don't want to known if a fetch or converter
fails. We just want a valid string. After this patch, we
have two sets of fetches and two sets of converters. There are:
txn.f, txn.sf, txn.c, txn.sc. The version prefixed by 's' always
returns strings for any type, and returns an empty string in the
error case or when the data are not available. This is particularly
useful when manipulating headers or cookies.
This patch implements a wrapper to give access to the converters
in the Lua code. The converters are used with the transaction.
The automatically created function are prefixed by "conv_".
HAProxy proposes many sample fetches. It is possible that the
automatic registration of the sample fetches causes a collision
with an existing Lua function. This patch sets a namespace for
the sample fetches.
The function buffer_contig_space() returns the contiguous space avalaible
to add data (at the end of the input side) while the function
hlua_channel_send_yield() needs to insert data starting at p. Here we
introduce a new function bi_space_for_replace() which returns the amount
of space that can be inserted at the head of the input side with one of
the buffer_replace* functions.
This patch proposes a function that returns the space avalaible after buf->p.
If we are writing in the request buffer, we are not waked up
when the data are forwarded because it is useles. The request
analyzers are waked up only when data is incoming. So, if the
request buffer is full, we set the WAKE_ON_WRITE flag.
Before this patch, each yield in a Lua action set a flags to be
waked up when some activity were detected on the response channel.
This behavior causes loop in the analyzer process.
This patch set the wake up on response buffer activity only if we
really want to be waked up on this activity.