A scope is a section name between square bracket, alone on its line, ie:
[scope-name]
...
The spaces at the beginning and at the end of the line are skipped. Comments at
the end of the line are also skipped.
When a scope is parsed, its name is saved in the global variable
cfg_scope. Initially, cfg_scope is NULL and it remains NULL until a valid scope
line is parsed.
This feature remains unused in the HAProxy configuration file and
undocumented. However, it will be used during SPOE configuration parsing.
This feature will be used by the stream processing offload engine (SPOE) to
parse dedicated configuration files without mixing HAProxy sections with SPOE
sections.
So, here we can back up all sections known by HAProxy, unregister all of them
and add new ones, dedicted to the SPOE. Once the SPOE configuration file parsed,
we can roll back all changes by restoring HAProxy sections.
This adds new "hold" timers : nx, refused, timeout, other. This timers
will be used to tell HAProxy to keep an erroneous response as valid for
the corresponding period. For now they're only configured, not enforced.
Right now there is an issue with the way the maintenance flags are
propagated upon startup. They are not propagate, just copied from the
tracked server. This implies that depending on the server's order, some
tracking servers may not be marked down. For example this configuration
does not work as expected :
server s1 1.1.1.1:8000 track s2
server s2 1.1.1.1:8000 track s3
server s3 1.1.1.1:8000 track s4
server s4 wtap:8000 check inter 1s disabled
It results in s1/s2 being up, and s3/s4 being down, while all of them
should be down.
The only clean way to process this is to run through all "root" servers
(those not tracking any other server), and to propagate their state down
to all their trackers. This is the same algorithm used to propagate the
state changes. It has to be done both to compute the IDRAIN flag and the
IMAINT flag. However, doing so requires that tracking servers are not
marked as inherited maintenance anymore while parsing the configuration
(and given that it is wrong, better drop it).
This fix also addresses another side effect of the bug above which is
that the IDRAIN/IMAINT flags are stored in the state files, and if
restored while the tracked server doesn't have the equivalent flag,
the servers may end up in a situation where it's impossible to remove
these flags. For example in the configuration above, after removing
"disabled" on server s4, the other servers would have remained down,
and not anymore with this fix. Similarly, the combination of IMAINT
or IDRAIN with their respective forced modes was not accepted on
reload, which is wrong as well.
This bug has been present at least since 1.5, maybe even 1.4 (it came
with tracking support). The fix needs to be backported there, though
the srv-state parts are irrelevant.
This commit relies on previous patch to silence warnings on startup.
0 will mean no balancing occurs; otherwise it represents the ratio
between the highest-loaded server and the average load, times 100 (i.e.
a value of 150 means a 1.5x ratio), assuming equal weights.
Signed-off-by: Andrew Rodland <andrewr@vimeo.com>
This commit introduces "tcp-request session" rules. These are very
much like "tcp-request connection" rules except that they're processed
after the handshake, so it is possible to consider SSL information and
addresses rewritten by the proxy protocol header in actions. This is
particularly useful to track proxied sources as this was not possible
before, given that tcp-request content rules are processed after each
HTTP request. Similarly it is possible to assign the proxied source
address or the client's cert to a variable.
This is in order to make integration of tcp-request-session cleaner :
- tcp_exec_req_rules() was renamed tcp_exec_l4_rules()
- LI_O_TCP_RULES was renamed LI_O_TCP_L4_RULES
(LI_O_*'s horrible indent was also fixed and a provision was left
for L5 rules).
With Linux officially introducing SO_REUSEPORT support in 3.9 and
its mainstream adoption we have seen more people running into strange
SO_REUSEPORT related issues (a process management issue turning into
hard to diagnose problems because the kernel load-balances between the
new and an obsolete haproxy instance).
Also some people simply want the guarantee that the bind fails when
the old process is still bound.
This change makes SO_REUSEPORT configurable, introducing the command
line argument "-dR" and the noreuseport configuration directive.
A backport to 1.6 should be considered.
This enables tracking of sticky counters from current response. The only
difference from "http-request track-sc" is the <key> sample expression
can only make use of samples in response (eg. res.*, status etc.) and
samples below Layer 6.
Changed all the cases where the pointer passed to realloc is overwritten
by the pointer returned by realloc. The new function my_realloc2 has
been used except in function register_name. If register_name fails to
add a new variable because of an "out of memory" error, all the existing
variables remain valid. If we had used my_realloc2, the array of variables
would have been freed.
The reference to the tls_keys_ref was not deleted from the
tlskeys_reference linked list.
When the SSL is malconfigured, it can lead to an access to freed memory
during a "show tls-keys" on the admin socked.
Ben Cabot reported that after commit 5e4261b ("CLEANUP: config:
detect double registration of a config section") recently introduced
in 1.7-dev, it's not possible anymore to load multiple configuration
files. Bryan Talbot provided a simple reproducer to exhibit the issue.
It turns out that function readcfgfile() registers new parsers for
section keywords for each new file. In addition to being useless, this
has the negative effect of wasting memory and slowing down the config
parser as the number of configuration files increases.
This fix only needs to be backported if/where the commit above is
backported.
When compiled with GCC 6, the IP address specified for a frontend was
ignored and HAProxy was listening on all addresses instead. This is
caused by an incomplete copy of a "struct sockaddr_storage".
With the GNU Libc, "struct sockaddr_storage" is defined as this:
struct sockaddr_storage
{
sa_family_t ss_family;
unsigned long int __ss_align;
char __ss_padding[(128 - (2 * sizeof (unsigned long int)))];
};
Doing an aggregate copy (ss1 = ss2) is different than using memcpy():
only members of the aggregate have to be copied. Notably, padding can be
or not be copied. In GCC 6, some optimizations use this fact and if a
"struct sockaddr_storage" contains a "struct sockaddr_in", the port and
the address are part of the padding (between sa_family and __ss_align)
and can be not copied over.
Therefore, we replace any aggregate copy by a memcpy(). There is another
place using the same pattern. We also fix a function receiving a "struct
sockaddr_storage" by copy instead of by reference. Since it only needs a
read-only copy, the function is converted to request a reference.
In an effort to make the config parser more robust, we should validate
that everything we register is not already registered. Most cfg_register_*
functions unfortunately return void and just perform a LIST_ADDQ(), so they
will have to change for this. At least cfg_register_section() does perform
a bit of checks and is easy to check for such errors, so let's start with
this one. Future patches will definitely have to focus on the remaining
functions and ensure unicity of all config parsers.
commit 7c0ffd23 is only considering the explicit use of the "process" keyword
on the listeners. But at this step, if it's not defined in the configuration,
the listener bind_proc mask is set to 0. As a result, the code will compute
the maxaccept value based on only 1 process, which is not always true.
For example :
global
nbproc 4
frontend test
bind-process 1-2
bind :80
Here, the maxaccept value for the "test" frontend was set to the global
tune.maxaccept value (default to 64), whereas it should consider 2 processes
will accept connections. As of the documentation, the value should be divided
by twice the number of processes the listener is bound to.
To fix this, we can consider that if no mask is set to the listener, we take
the frontend mask.
This is not critical but it can introduce unfairness distribution of the
incoming connections across the processes.
It should be backported to the same branches as commit 7c0ffd23 (1.6 and 1.5
were in the scope).
Christian Ruppert reported a performance degradation when binding a
single frontend to many processes while only one bind line was being
used, bound to a single process.
The reason comes from the fact that whenever a listener is bound to
multiple processes, the it is assigned a maxaccept value which equals
half the global maxaccept value divided by the number of processes the
frontend is bound to. The purpose is to ensure that no single process
will drain all the incoming requests at once and ensure a fair share
between all listeners. Usually this works pretty well, when a listener
is bound to all the processes of its frontend. But here we're in a
situation where the maxaccept of a listener which is bound to a single
process is still divided by a large value.
The fix consists in taking into account the number of processes the
listener is bound do and not only those of the frontend. This way it
is perfectly possible to benefit from nbproc and SO_REUSEPORT without
performance degradation.
1.6 and 1.5 normally suffer from the same issue.
Instead of repeating the type of the LHS argument (sizeof(struct ...))
in calls to malloc/calloc, we directly use the pointer
name (sizeof(*...)). The following Coccinelle patch was used:
@@
type T;
T *x;
@@
x = malloc(
- sizeof(T)
+ sizeof(*x)
)
@@
type T;
T *x;
@@
x = calloc(1,
- sizeof(T)
+ sizeof(*x)
)
When the LHS is not just a variable name, no change is made. Moreover,
the following patch was used to ensure that "1" is consistently used as
a first argument of calloc, not the last one:
@@
@@
calloc(
+ 1,
...
- ,1
)
In C89, "void *" is automatically promoted to any pointer type. Casting
the result of malloc/calloc to the type of the LHS variable is therefore
unneeded.
Most of this patch was built using this Coccinelle patch:
@@
type T;
@@
- (T *)
(\(lua_touserdata\|malloc\|calloc\|SSL_get_app_data\|hlua_checkudata\|lua_newuserdata\)(...))
@@
type T;
T *x;
void *data;
@@
x =
- (T *)
data
@@
type T;
T *x;
T *data;
@@
x =
- (T *)
data
Unfortunately, either Coccinelle or I is too limited to detect situation
where a complex RHS expression is of type "void *" and therefore casting
is not needed. Those cases were manually examined and corrected.
Currently, no warning are emitted when the gid is not a number.
Purpose of this warning is to let admins know they their configuration
won't be applied as expected.
Currently, no warning are emitted when the uid is not a number.
Purpose of this warning is to let admins know they their configuration
won't be applied as expected.
With nbproc > 1, it is possible to specify on which process the stats socket
will be bound using "stats bind-process", but the behaviour was not correct,
ignoring the value in some configurations.
Example :
global
nbproc 4
stats bind-process 1
stats socket /var/run/haproxy.sock
With such a configuration, all the processes will listen on the stats socket.
As a workaround, it is also possible to declare a "process" keyword on
the "stats stocket" line.
The patch must be applied to 1.7, 1.6 and 1.5
This patch introduces a configurable connection timeout for mailers
with a new "timeout mail <time>" directive.
Acked-by: Simon Horman <horms@verge.net.au>
If for example it was written as 'timeout retri 1s' or 'timeout wrong 1s'
this would be used for the retry timeout value. Resolvers section only
timeout setting currently is 'retry', others are still parsed as before
this patch to not break existing configurations.
A less strict version will be backported to 1.6.
With new init systems such as systemd, environment variables became a
real mess because they're only considered on startup but not on reload
since the init script's variables cannot be passed to the process that
is signaled to reload.
This commit introduces an alternative method consisting in making it
possible to modify the environment from the global section with directives
like "setenv", "unsetenv", "presetenv" and "resetenv".
Since haproxy supports loading multiple config files, it now becomes
possible to put the host-dependant variables in one file and to
distribute the rest of the configuration to all nodes, without having
to deal with the init system's deficiencies.
Environment changes take effect immediately when the directives are
processed, so it's possible to do perform the same operations as are
usually performed in regular service config files.
Now, filter's configuration (.id, .conf and .ops fields) is stored in the
structure 'flt_conf'. So proxies own a flt_conf list instead of a filter
list. When a filter is attached to a stream, it gets a pointer on its
configuration. This avoids mixing the filter's context (owns by a stream) and
its configuration (owns by a proxy). It also saves 2 pointers per filter
instance.
This new analyzer will be called for each HTTP request/response, before the
parsing of the body. It is identified by AN_FLT_HTTP_HDRS.
Special care was taken about the following condition :
* the frontend is a TCP proxy
* filters are defined in the frontend section
* the selected backend is a HTTP proxy
So, this patch explicitly add AN_FLT_HTTP_HDRS analyzer on the request and the
response channels when the backend is a HTTP proxy and when there are filters
attatched on the stream.
This patch simplifies http_request_forward_body and http_response_forward_body
functions.
HTTP compression will be moved in a true filter. To prepare the ground, some
functions have been moved in a dedicated file. Idea is to keep everything about
compression algos in compression.c and everything related to the filtering in
flt_http_comp.c.
For now, a header has been added to help during the transition. It will be
removed later.
Unused empty ACL keyword list was removed. The "compression" keyword
parser was moved from cfgparse.c to flt_http_comp.c.
This patch adds the support of filters in HAProxy. The main idea is to have a
way to "easely" extend HAProxy by adding some "modules", called filters, that
will be able to change HAProxy behavior in a programmatic way.
To do so, many entry points has been added in code to let filters to hook up to
different steps of the processing. A filter must define a flt_ops sutrctures
(see include/types/filters.h for details). This structure contains all available
callbacks that a filter can define:
struct flt_ops {
/*
* Callbacks to manage the filter lifecycle
*/
int (*init) (struct proxy *p);
void (*deinit)(struct proxy *p);
int (*check) (struct proxy *p);
/*
* Stream callbacks
*/
void (*stream_start) (struct stream *s);
void (*stream_accept) (struct stream *s);
void (*session_establish)(struct stream *s);
void (*stream_stop) (struct stream *s);
/*
* HTTP callbacks
*/
int (*http_start) (struct stream *s, struct http_msg *msg);
int (*http_start_body) (struct stream *s, struct http_msg *msg);
int (*http_start_chunk) (struct stream *s, struct http_msg *msg);
int (*http_data) (struct stream *s, struct http_msg *msg);
int (*http_last_chunk) (struct stream *s, struct http_msg *msg);
int (*http_end_chunk) (struct stream *s, struct http_msg *msg);
int (*http_chunk_trailers)(struct stream *s, struct http_msg *msg);
int (*http_end_body) (struct stream *s, struct http_msg *msg);
void (*http_end) (struct stream *s, struct http_msg *msg);
void (*http_reset) (struct stream *s, struct http_msg *msg);
int (*http_pre_process) (struct stream *s, struct http_msg *msg);
int (*http_post_process) (struct stream *s, struct http_msg *msg);
void (*http_reply) (struct stream *s, short status,
const struct chunk *msg);
};
To declare and use a filter, in the configuration, the "filter" keyword must be
used in a listener/frontend section:
frontend test
...
filter <FILTER-NAME> [OPTIONS...]
The filter referenced by the <FILTER-NAME> must declare a configuration parser
on its own name to fill flt_ops and filter_conf field in the proxy's
structure. An exemple will be provided later to make it perfectly clear.
For now, filters cannot be used in backend section. But this is only a matter of
time. Documentation will also be added later. This is the first commit of a long
list about filters.
It is possible to have several filters on the same listener/frontend. These
filters are stored in an array of at most MAX_FILTERS elements (define in
include/types/filters.h). Again, this will be replaced later by a list of
filters.
The filter API has been highly refactored. Main changes are:
* Now, HA supports an infinite number of filters per proxy. To do so, filters
are stored in list.
* Because filters are stored in list, filters state has been moved from the
channel structure to the filter structure. This is cleaner because there is no
more info about filters in channel structure.
* It is possible to defined filters on backends only. For such filters,
stream_start/stream_stop callbacks are not called. Of course, it is possible
to mix frontend and backend filters.
* Now, TCP streams are also filtered. All callbacks without the 'http_' prefix
are called for all kind of streams. In addition, 2 new callbacks were added to
filter data exchanged through a TCP stream:
- tcp_data: it is called when new data are available or when old unprocessed
data are still waiting.
- tcp_forward_data: it is called when some data can be consumed.
* New callbacks attached to channel were added:
- channel_start_analyze: it is called when a filter is ready to process data
exchanged through a channel. 2 new analyzers (a frontend and a backend)
are attached to channels to call this callback. For a frontend filter, it
is called before any other analyzer. For a backend filter, it is called
when a backend is attached to a stream. So some processing cannot be
filtered in that case.
- channel_analyze: it is called before each analyzer attached to a channel,
expects analyzers responsible for data sending.
- channel_end_analyze: it is called when all other analyzers have finished
their processing. A new analyzers is attached to channels to call this
callback. For a TCP stream, this is always the last one called. For a HTTP
one, the callback is called when a request/response ends, so it is called
one time for each request/response.
* 'session_established' callback has been removed. Everything that is done in
this callback can be handled by 'channel_start_analyze' on the response
channel.
* 'http_pre_process' and 'http_post_process' callbacks have been replaced by
'channel_analyze'.
* 'http_start' callback has been replaced by 'http_headers'. This new one is
called just before headers sending and parsing of the body.
* 'http_end' callback has been replaced by 'channel_end_analyze'.
* It is possible to set a forwarder for TCP channels. It was already possible to
do it for HTTP ones.
* Forwarders can partially consumed forwardable data. For this reason a new
HTTP message state was added before HTTP_MSG_DONE : HTTP_MSG_ENDING.
Now all filters can define corresponding callbacks (http_forward_data
and tcp_forward_data). Each filter owns 2 offsets relative to buf->p, next and
forward, to track, respectively, input data already parsed but not forwarded yet
by the filter and parsed data considered as forwarded by the filter. A any time,
we have the warranty that a filter cannot parse or forward more input than
previous ones. And, of course, it cannot forward more input than it has
parsed. 2 macros has been added to retrieve these offets: FLT_NXT and FLT_FWD.
In addition, 2 functions has been added to change the 'next size' and the
'forward size' of a filter. When a filter parses input data, it can alter these
data, so the size of these data can vary. This action has an effet on all
previous filters that must be handled. To do so, the function
'filter_change_next_size' must be called, passing the size variation. In the
same spirit, if a filter alter forwarded data, it must call the function
'filter_change_forward_size'. 'filter_change_next_size' can be called in
'http_data' and 'tcp_data' callbacks and only these ones. And
'filter_change_forward_size' can be called in 'http_forward_data' and
'tcp_forward_data' callbacks and only these ones. The data changes are the
filter responsability, but with some limitation. It must not change already
parsed/forwarded data or data that previous filters have not parsed/forwarded
yet.
Because filters can be used on backends, when we the backend is set for a
stream, we add filters defined for this backend in the filter list of the
stream. But we must only do that when the backend and the frontend of the stream
are not the same. Else same filters are added a second time leading to undefined
behavior.
The HTTP compression code had to be moved.
So it simplifies http_response_forward_body function. To do so, the way the data
are forwarded has changed. Now, a filter (and only one) can forward data. In a
commit to come, this limitation will be removed to let all filters take part to
data forwarding. There are 2 new functions that filters should use to deal with
this feature:
* flt_set_http_data_forwarder: This function sets the filter (using its id)
that will forward data for the specified HTTP message. It is possible if it
was not already set by another filter _AND_ if no data was yet forwarded
(msg->msg_state <= HTTP_MSG_BODY). It returns -1 if an error occurs.
* flt_http_data_forwarder: This function returns the filter id that will
forward data for the specified HTTP message. If there is no forwarder set, it
returns -1.
When an HTTP data forwarder is set for the response, the HTTP compression is
disabled. Of course, this is not definitive.
Erez reported a bug on discourse.haproxy.org about DNS resolution not
occuring when no port is specified on the nameserver directive.
This patch prevent this behavior by returning an error explaining this
issue when parsing the configuration file.
That said, later, we may want to force port 53 when client did not
provide any.
backport: 1.6
This setting used to be assigned to a variable tunable from a constant
and for an unknown reason never made its way into the config parser.
tune.recv_enough <number>
Haproxy uses some hints to detect that a short read indicates the end of the
socket buffers. One of them is that a read returns more than <recv_enough>
bytes, which defaults to 10136 (7 segments of 1448 each). This default value
may be changed by this setting to better deal with workloads involving lots
of short messages such as telnet or SSH sessions.
A segfault can occur during at the initialization phase, when an unknown
"mailers" name is configured. This happens when "email-alert myhostname" is not
set, where a direct pointer to an array is used instead of copying the string,
causing the segfault when haproxy tries to free the memory.
This is a minor issue because the configuration is invalid and a fatal error
will remain, but it should be fixed to prevent reload issues.
Example of minimal configuration to reproduce the bug :
backend example
email-alert mailers NOT_FOUND
email-alert from foo@localhost
email-alert to bar@localhost
This fix must be backported to 1.6.
Tommy Atkinson and Sylvain Faivre reported that email alerts didn't work when
they were declared in the defaults section. This is due to the use of an
internal attribute which is set once an email-alert is at least partially
configured. But this attribute was not propagated to the current proxy during
the configuration parsing.
Not that the issue doesn't occur if "email-alert myhostname" is configured in
the defaults section.
This fix must be backported to 1.6.
It is possible to create a http capture rule which points to a capture slot
id which does not exist.
Current patch prevent this when parsing configuration and prevent running
configuration which contains such rules.
This configuration is now invalid:
frontend f
bind :8080
http-request capture req.hdr(User-Agent) id 0
default_backend b
this one as well:
frontend f
bind :8080
declare capture request len 32 # implicit id is 0 here
http-request capture req.hdr(User-Agent) id 1
default_backend b
It applies of course to both http-request and http-response rules.
Current resolvers section parsing function is permissive on nameserver
id and two nameservers may have the same id.
It's a shame, since we don't know for example, whose statistics belong
to which nameserver...
From now, configuration with duplicated nameserver id in a resolvers
section are considered as broken and returns a fatal error when parsing.
When using the external-check command option HAProxy was failing to
start with a fatal error "'external-check' cannot handle unexpected
argument". When looking at the code it was looking for an incorrect
argument. Also correcting an Alert message text as spotted by by
PiBa-NL.
Michael Ezzell reported a bug causing haproxy to segfault during startup
when trying to send syslog message from Lua. The function __send_log() can
be called with *p that is NULL and/or when the configuration is not fully
parsed, as is the case with Lua.
This patch fixes this problem by using individual vectors instead of the
pre-generated strings log_htp and log_htp_rfc5424.
Also, this patch fixes a problem causing haproxy to write the wrong pid in
the logs -- the log_htp(_rfc5424) strings were generated at the haproxy
start, but "pid" value would be changed after haproxy is started in
daemon/systemd mode.
This patch adds a new RFC5424-specific log-format for the structured-data
that is automatically send by __send_log() when the sender is in RFC5424
mode.
A new statement "log-format-sd" should be used in order to set log-format
for the structured-data part in RFC5424 formatted syslog messages.
Example:
log-format-sd [exampleSDID@1234\ bytes=\"%B\"\ status=\"%ST\"]
The function __send_log() iterates over senders and passes the header as
the first vector to sendmsg(), thus it can send a logger-specific header
in each message.
A new logger arguments "format rfc5424" should be used in order to enable
RFC5424 header format. For example:
log 10.2.3.4:1234 len 2048 format rfc5424 local2 info
At the moment we have to call snprintf() for every log line just to
rebuild a constant. Thanks to sendmsg(), we send the message in 3 parts:
time-based header, proxy-specific hostname+log-tag+pid, session-specific
message.
A new function introduced meant to be called during general deinit phase.
During the configuration parsing, the section entries are all allocated.
This new function free them.
The tune.maxrewrite parameter used to be pre-initialized to half of
the buffer size since the very early days when buffers were very small.
It has grown to absurdly large values over the years to reach 8kB for a
16kB buffer. This prevents large requests from being accepted, which is
the opposite of the initial goal.
Many users fix it to 1024 which is already quite large for header
addition.
So let's change the default setting policy :
- pre-initialize it to 1024
- let the user tweak it
- in any case, limit it to tune.bufsize / 2
This results in 15kB usable to buffer HTTP messages instead of 8kB, and
doesn't affect existing configurations which already force it.
This directive gives HAProxy the ability to use the either the global
server-state-file directive or a local one using server-state-file-name to
load server states.
The state can be saved right before the reload by the init script, using
the "show servers state" command on the stats socket redirecting output into
a file.
This new global section directive is used to store the path to the file
where HAProxy will be able to retrieve server states across reloads.
The file pointed by this path is used to store a file which can contains
state of all servers from all backends.
This new global directive can be used to provide a base directory where
all the server state files could be loaded.
If a server state file name starts with a slash '/', then this directive
must not be applied.
The function does a bunch of things among which resolving environment
variables, skipping address family specifiers and trimming port ranges.
It is the only one which sees the complete host name before trying to
resolve it. The DNS resolving code needs to know the original hostname,
so we modify this function to optionally provide it to the caller.
Note that the function itself doesn't know if the host part was a host
or an address, but str2ip() knows that and can be asked not to try to
resolve. So we first try to parse the address without resolving and
try again with resolving enabled. This way we know if the address is
explicit or needs some kind of resolution.
This was the first transparent proxy technology supported by haproxy
circa 2005 but it was obsoleted in 2007 by Tproxy 4.0 which removed a
lot of the earlier versions' shortcomings and was finally merged into
the kernel. Since nobody has been using cttproxy for many years now
and nobody has even just tried to compile the files, it's time to
remove it. The doc was updated as well.
For performances considerations, some actions are not processed by remote
function. They are directly processed by the function. Some of these actions
does the same things but for different processing part (request / response).
This patch give the same name for the same actions, and change the normalization
of the other actions names.
This patch is ONLY a rename, it doesn't modify the code.
This patch is the first of a serie which merge all the action structs. The
function "tcp-request content", "tcp-response-content", "http-request" and
"http-response" have the same values and the same process for some defined
actions, but the struct and the prototype of the declared function are
different.
This patch try to unify all of these entries.
Commit c6678e2 ("MEDIUM: config: authorize frontend and listen without bind")
completely removed the test for bind lines in frontends in order to make it
easier for automated tools to generate configs (eg: replacing a bind with
another one passing via a temporary config without any bind line). The
problem is that some common mistakes are totally hidden now. For example,
this apparently valid entry is silently ignored :
listen 1.2.3.4:8000
server s1 127.0.0.1:8000
Hint: 1.2.3.4:8000 is mistakenly the proxy name here.
Thus instead we now emit a warning to indicate that a frontend was found
with no listener. This should be backported to 1.5 to help spot abnormal
configurations.
This strategy is less extreme than "always", it only dispatches first
requests to validated reused connections, and moves a connection from
the idle list to the safe list once it has seen a second request, thus
proving that it could be reused.
The "safe" mode consists in picking existing connections only when
processing a request that's not the first one from a connection. This
ensures that in case where the server finally times out and closes, the
client can decide to replay idempotent requests.
For now it only supports "never", meaning that we never want to reuse a
shared connection, and "always", meaning that we can use any connection
that was not marked private. When "never" is set, this also implies that
no idle connection may become a shared one.
Madison May reported that the timeout applied by the default
configuration is inproperly set up.
This patch fix this:
- hold valid default to 10s
- timeout retry default to 1s
This is in order to avoid conflicting with NetBSD popcount* functions
since 6.x release, the final l to mentions the argument is a long like
NetBSD does.
This patch could be backported to 1.5 to fix the build issue there as well.
Moved 51Degrees code from src/haproxy.c, src/sample.c and src/cfgparse.c
into a separate files src/51d.c and include/import/51d.h.
Added two new functions init_51degrees() and deinit_51degrees(), updated
Makefile and other code reorganizations related to 51Degrees.
Implementation of a DNS client in HAProxy to perform name resolution to
IP addresses.
It relies on the freshly created UDP client to perform the DNS
resolution. For now, all UDP socket calls are performed in the
DNS layer, but this might change later when the protocols are
extended to be more suited to datagram mode.
A new section called 'resolvers' is introduced thanks to this patch. It
is used to describe DNS servers IP address and also many parameters.
With this patch, it is possible to configure HAProxy to forge the SSL
certificate sent to a client using the SNI servername. We do it in the SNI
callback.
To enable this feature, you must pass following BIND options:
* ca-sign-file <FILE> : This is the PEM file containing the CA certitifacte and
the CA private key to create and sign server's certificates.
* (optionally) ca-sign-pass <PASS>: This is the CA private key passphrase, if
any.
* generate-certificates: Enable the dynamic generation of certificates for a
listener.
Because generating certificates is expensive, there is a LRU cache to store
them. Its size can be customized by setting the global parameter
'tune.ssl.ssl-ctx-cache-size'.
This patch adds the ssl-dh-param-file global setting. It sets the
default DH parameters that will be used during the SSL/TLS handshake when
ephemeral Diffie-Hellman (DHE) key exchange is used, for all "bind" lines
which do not explicitely define theirs.
This patch does'nt add any new feature: the functional behavior
is the same than version 1.0.
Technical differences:
In this version all updates on different stick tables are
multiplexed on the same tcp session. There is only one established
tcp session per peer whereas in first version there was one established
tcp session per peer and per stick table.
Messages format was reviewed to be more evolutive and to support
further types of data exchange such as SSL sessions or other sticktable's
data types (currently only the sticktable's server id is supported).
Most of the keywords in the global section does not check the maximum
number of arguments. This leds sometines to unused and wrong arguments
in the configuration file. This patch add a maximum argument test in
many keywords of this section.
This patch checks the number of arguments of the keywords:
'global', 'defaults', 'listen', 'backend', 'frontend', 'peers' and
'userlist'
The 'global' section does not take any arguments.
Proxy sections does not support bind address as argument anymore. Those
sections supports only an <id> argument.
The 'defaults' section didn't had any check on its arguments. It takes
an optional <name> argument.
'peers' section takes a <peersect> argument.
'userlist' section takes a <listname> argument.
If the 'userlist' keyword parsing returns an error and no userlist were
previously created. The parsing of 'user' and 'group' leads to NULL
derefence.
The userlist pointer is now tested to prevent this issue.
In order to support http-response redirect, the parsing needs to be
adapted a little bit to only support the "location" type, and to
adjust the log-format parser so that it knows the direction of the
sample fetch calls.
These ones were already obsoleted in 1.4, marked for removal in 1.5,
and not documented anymore. They used to emit warnings, and do still
require quite some code to stay in place. Let's remove them now.
We don't use findproxy_mode() anymore so we can check the conflicting
modes and report the anomalies accordingly with line numbers and more
explicit details.
First, findproxy() was renamed proxy_find_by_name() so that its explicit
that a name is required for the lookup. Second, we give this function
the ability to search for tables if needed. Third we now provide inline
wrappers to pass the appropriate PR_CAP_* flags and to explicitly look
up a frontend, backend or table.
A nasty situation happens when two tables have the same name. Since it
is possible to declare a table in a frontend and another one in a backend,
this situation may happen and result in a random behaviour each time a
table is designated in a "stick" or "track" rule. Let's make sure this
is properly detected and stopped. Such a config will now report :
[ALERT] 145/104933 (31571) : parsing [prx.cfg:36] : stick-table name 't' conflicts with table declared in frontend 't' at prx.cfg:30.
[ALERT] 145/104933 (31571) : Error(s) found in configuration file : prx.cfg
[ALERT] 145/104933 (31571) : Fatal errors found in configuration.
Since 1.4 we used to emit a warning when two frontends or two backends
had the same name. In 1.5 we added the same warning for two peers sections.
In 1.6 we added the same warning for two mailers sections. It's about time
to reject such invalid configurations, the impact they have on the code
complexity is huge and it is becoming a real obstacle to some improvements
such as restoring servers check status across reloads.
Now these errors are reported as fatal errors and will need to be fixed.
Anyway, till now there was no guarantee that what was written was working
as expected since the behaviour is not defined (eg: use_backend with a
name used by two backends leads to undefined behaviour).
Example of output :
[ALERT] 145/104759 (31564) : Parsing [prx.cfg:12]: mailers section 'm' has the same name as another mailers section declared at prx.cfg:10.
[ALERT] 145/104759 (31564) : Parsing [prx.cfg:16]: peers section 'p' has the same name as another peers section declared at prx.cfg:14.
[ALERT] 145/104759 (31564) : Parsing [prx.cfg:21]: frontend 'f' has the same name as another frontend declared at prx.cfg:18.
[ALERT] 145/104759 (31564) : Parsing [prx.cfg:27]: backend 'b' has the same name as another backend declared at prx.cfg:24.
[ALERT] 145/104759 (31564) : Error(s) found in configuration file : prx.cfg
[ALERT] 145/104759 (31564) : Fatal errors found in configuration.
For backend load balancing it sometimes makes sense to redispatch rather
than retrying against the same server. For example, when machines or routers
fail you may not want to waste time retrying against a dead server and
would instead prefer to immediately redispatch against other servers.
This patch allows backend sections to specify that they want to
redispatch on a particular interval. If the interval N is positive the
redispatch occurs on every Nth retry, and if the interval N is negative then
the redispatch occurs on the Nth retry prior to the last retry (-1 is the
default and maintains backwards compatibility). In low latency environments
tuning this setting can save a few hundred milliseconds when backends fail.
Within the listener struct we need to use a reference to the TLS
ticket keys which binds the actual keys with the filename. This will
make it possible to update the keys through the socket
Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>
The method used to skip to next rule in the list is wrong, it assumes
that the list element starts at the same offset as the rule. It happens
to be true on most architectures since the list is the first element for
now but it's definitely wrong. Now the code doesn't crash anymore when
the struct list is moved anywhere else in the struct tcpcheck_rule.
This fix must be backported to 1.5.
This is the most important fix of this series. There's a risk of endless
loop and crashes caused by the fact that we go past the head of the list
when skipping to next rule, without checking if it's still a valid element.
Most of the time, the ->action field is checked, which points to the proxy's
check_req pointer (generally NULL), meaning the element is confused with a
TCPCHK_ACT_SEND action.
The situation was accidently made worse with the addition of tcp-check
comment since it also skips list elements. However, since the action that
makes it go forward is TCPCHK_ACT_COMMENT (3), there's little chance to
see this as a valid pointer, except on 64-bit machines where it can match
the end of a check_req string pointer.
This fix heavily depends on previous cleanup and both must be backported
to 1.5 where the bug is present.
Environment variables were expandables only in adresses.
Now there are expandables everywhere in the configuration file within
double quotes.
This patch breaks compatibility with the previous behavior of
environment variables in adresses, you must enclose adresses with double
quotes to make it work.
tcpcheck error messages include the step id where the error occurs.
In some cases, this is not enough. Now, HAProxy also use the comment
field of the latest tcpcheck rule which has been run.
This commit allows HAProxy to parse a new directive in the tcpcheck
ruleset: 'comment'.
It is used to setup comments on the current tcpcheck rules.
This patch introduces quoting which allows to write configuration string
including spaces without escaping them.
Strong (with single quotes) and weak (with double quotes) quoting are
supported. Weak quoting supports escaping and special characters when
strong quoting does not interpret anything.
This patch could break configuration files where ' and " where used.
Chad Lavoie reported an interesting regression caused by the latest
updates to automatically detect the processes a peers section runs on.
It turns out that if a config has neither nbproc nor a bind-process
statement and depending on the frontend->backend chaining, it is possible
to evade all bind_proc propagations, resulting in assigning only ~0UL (all
processes, which is 32 or 64) without ever restricting it to nbproc. It
was not visible in backends until they started to reference peers sections
which saw themselves with 64 processes at once.
This patch addresses this by replacing all those ~0UL with nbits(nbproc).
That way all "bind-process" settings *default* to the number of processes
defined in nbproc instead of 32 or 64.
This fix could possibly be backported into 1.5, though there is no indication
that this bug could have any effect there.
It is sometimes desirable to wait for the body of an HTTP request before
taking a decision. This is what is being done by "balance url_param" for
example. The first use case is to buffer requests from slow clients before
connecting to the server. Another use case consists in taking the routing
decision based on the request body's contents. This option placed in a
frontend or backend forces the HTTP processing to wait until either the whole
body is received, or the request buffer is full, or the first chunk is
complete in case of chunked encoding. It can have undesired side effects with
some applications abusing HTTP by expecting unbufferred transmissions between
the frontend and the backend, so this should definitely not be used by
default.
Note that it would not work for the response because we don't reset the
message state before starting to forward. For the response we need to
1) reset the message state to MSG_100_SENT or BODY , and 2) to reset
body_len in case of chunked encoding to avoid counting it twice.
If a peers section is bound to no process, it's silently discarded. If its
bound to multiple processes, an error is emitted and the process will not
start.
Sometimes it's very hard to disable the use of peers because an empty
section is not valid, so it is necessary to comment out all references
to the section, and not to forget to restore them in the same state
after the operation.
Let's add a "disabled" keyword just like for proxies. A ->state member
in the peers struct is even present for this purpose but was never used
at all.
Maybe it would make sense to backport this to 1.5 as it's really cumbersome
there.
It's dangerous to initialize stick-tables before peers because they
start a task that cannot be stopped before we know if the peers need
to be disabled and destroyed. Move this after.
If a table in a disabled proxy references a peers section, the peers
name is not resolved to a pointer to a table, but since it belongs to
a union, it can later be dereferenced. Right now it seems it cannot
happen, but it definitely will after the pending changes.
It doesn't cost anything to backport this into 1.5, it will make gdb
sessions less head-scratching.
Recently some browsers started to implement a "pre-connect" feature
consisting in speculatively connecting to some recently visited web sites
just in case the user would like to visit them. This results in many
connections being established to web sites, which end up in 408 Request
Timeout if the timeout strikes first, or 400 Bad Request when the browser
decides to close them first. These ones pollute the log and feed the error
counters. There was already "option dontlognull" but it's insufficient in
this case. Instead, this option does the following things :
- prevent any 400/408 message from being sent to the client if nothing
was received over a connection before it was closed ;
- prevent any log from being emitted in this situation ;
- prevent any error counter from being incremented
That way the empty connection is silently ignored. Note that it is better
not to use this unless it is clear that it is needed, because it will hide
real problems. The most common reason for not receiving a request and seeing
a 408 is due to an MTU inconsistency between the client and an intermediary
element such as a VPN, which blocks too large packets. These issues are
generally seen with POST requests as well as GET with large cookies. The logs
are often the only way to detect them.
This patch should be backported to 1.5 since it avoids false alerts and
makes it easier to monitor haproxy's status.
The principle of this cache is to have a global cache for all pattern
matching operations which rely on lists (reg, sub, dir, dom, ...). The
input data, the expression and a random seed are used as a hashing key.
The cached entries contains a pointer to the expression and a revision
number for that expression so that we don't accidently used obsolete
data after a pattern update or a very unlikely hash collision.
Regarding the risk of collisions, 10k entries at 10k req/s mean 1% risk
of a collision after 60 years, that's already much less than the memory's
reliability in most machines and more durable than most admin's life
expectancy. A collision will result in a valid result to be returned
for a different entry from the same list. If this is not acceptable,
the cache can be disabled using tune.pattern.cache-size.
A test on a file containing 10k small regex showed that the regex
matching was limited to 6k/s instead of 70k with regular strings.
When enabling the LRU cache, the performance was back to 70k/s.
This concerns everythins related to accepting a new session and
expiring the embryonic session. There's still a hard-coded call
to stream_accept_session() which could be set somewhere in the
frontend, but for now it's not a problem.
With HTTP/2, we'll have to support multiplexed streams. A stream is in
fact the largest part of what we currently call a session, it has buffers,
logs, etc.
In order to catch any error, this commit removes any reference to the
struct session and tries to rename most "session" occurrences in function
names to "stream" and "sess" to "strm" when that's related to a session.
The files stream.{c,h} were added and session.{c,h} removed.
The session will be reintroduced later and a few parts of the stream
will progressively be moved overthere. It will more or less contain
only what we need in an embryonic session.
Sample fetch functions and converters will have to change a bit so
that they'll use an L5 (session) instead of what's currently called
"L4" which is in fact L6 for now.
Once all changes are completed, we should see approximately this :
L7 - http_txn
L6 - stream
L5 - session
L4 - connection | applet
There will be at most one http_txn per stream, and a same session will
possibly be referenced by multiple streams. A connection will point to
a session and to a stream. The session will hold all the information
we need to keep even when we don't yet have a stream.
Some more cleanup is needed because some code was already far from
being clean. The server queue management still refers to sessions at
many places while comments talk about connections. This will have to
be cleaned up once we have a server-side connection pool manager.
Stream flags "SN_*" still need to be renamed, it doesn't seem like
any of them will need to move to the session.
This will be useful later to state that some listeners have to use
certain decoders (typically an HTTP/2 decoder) regardless of the
regular processing applied to other listeners. For now it simply
defaults to the frontend's default target, and it is used by the
session.
Listerner->timeout is a vestigal thing going back to 2007 or so. It
used to only be used by stats and peers frontends to hold a pointer
to the proxy's client timeout. Now that we use regular frontends, we
don't use it anymore.
The peers frontend timeout was mistakenly set on timeout.connect instead
of timeout.client, resulting in no timeout being applied to the peers
connections. The impact is just that peers can establish connections and
remain connected until they speak. Once they start speaking, only one of
them will still be accepted, and old sessions will be killed, so the
problem is limited. This fix should however be backported to 1.5 since
it was introduced in 1.5-dev3 with peers.
Until now, the TLS ticket keys couldn't have been configured and
shared between multiple instances or multiple servers running HAproxy.
The result was that if a request got a TLS ticket from one instance/server
and it hits another one afterwards, it will have to go through the full
SSL handshake and negotation.
This patch enables adding a ticket file to the bind line, which will be
used for all SSL contexts created from that bind line. We can use the
same file on all instances or servers to mitigate this issue and have
consistent TLS tickets assigned. Clients will no longer have to negotiate
every time they change the handling process.
Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>
This patch adds a new option which allows configuration of the maximum
log level of messages for which email alerts will be sent.
The default is alert which is more restrictive than
the current code which sends email alerts for all priorities.
That behaviour may be configured using the new configuration
option to set the maximum level to notice or greater.
email-alert level notice
Signed-off-by: Simon Horman <horms@verge.net.au>
This currently does nothing beyond parsing the configuration
and storing in the proxy as there is no implementation of email alerts.
Signed-off-by: Simon Horman <horms@verge.net.au>
As mailer and mailers structures and allow parsing of
a mailers section into those structures.
These structures will subsequently be freed as it is
not yet possible to use reference them in the configuration.
Signed-off-by: Simon Horman <horms@verge.net.au>
disable starts a server in the disabled state, however setting the health
of an agent implies that the agent is disabled as well as the server.
This is a problem because the state of the agent is not restored if
the state of the server is subsequently updated leading to an
unexpected state.
For example, if a server is started disabled and then the server
state is set to ready then without this change show stat indicates
that the server is "DOWN (agent)" when it is expected that the server
would be UP if its (non-agent) health check passes.
Reported-by: Mark Brooks <mark@loadbalancer.org>
Signed-off-by: Simon Horman <horms@verge.net.au>
This is equivalent to what was done in commit 48936af ("[MINOR] log:
ability to override the syslog tag") but this time instead of doing
this globally, it does it per proxy. The purpose is to be able to use
a separate log tag for various proxies (eg: make it easier to route
log messages depending on the customer).
This setting is used to limit memory usage without causing the alloc
failures caused by "-m". Unexpectedly, tests have shown a performance
boost of up to about 18% on HTTP traffic when limiting the number of
buffers to about 10% of the amount of concurrent connections.
tune.buffers.limit <number>
Sets a hard limit on the number of buffers which may be allocated per process.
The default value is zero which means unlimited. The minimum non-zero value
will always be greater than "tune.buffers.reserve" and should ideally always
be about twice as large. Forcing this value can be particularly useful to
limit the amount of memory a process may take, while retaining a sane
behaviour. When this limit is reached, sessions which need a buffer wait for
another one to be released by another session. Since buffers are dynamically
allocated and released, the waiting time is very short and not perceptible
provided that limits remain reasonable. In fact sometimes reducing the limit
may even increase performance by increasing the CPU cache's efficiency. Tests
have shown good results on average HTTP traffic with a limit to 1/10 of the
expected global maxconn setting, which also significantly reduces memory
usage. The memory savings come from the fact that a number of connections
will not allocate 2*tune.bufsize. It is best not to touch this value unless
advised to do so by an haproxy core developer.
Used in conjunction with the dynamic buffer allocator.
tune.buffers.reserve <number>
Sets the number of buffers which are pre-allocated and reserved for use only
during memory shortage conditions resulting in failed memory allocations. The
minimum value is 2 and is also the default. There is no reason a user would
want to change this value, it's mostly aimed at haproxy core developers.
Immo Goltz reported a case of segfault while parsing the config where
we try to propagate processes across stopped frontends (those with a
"disabled" statement). The fix is trivial. The workaround consists in
commenting out these frontends, although not always easy.
This fix must be backported to 1.5.
propagate_processes() has a typo in a condition :
if (!from->cap & PR_CAP_FE)
return;
The return is never taken because each proxy has at least one capability
so !from->cap always evaluates to zero. Most of the time the caller already
checks that <from> is a frontend. In the cases where it's not tested
(use_backend, reqsetbe), the rules have been checked for the context to
be a frontend as well, so in the end it had no nasty side effect.
This should be backported to 1.5.
Since during parsing stage, curproxy always represents a proxy to be operated,
it should be a mistake by referring proxy.
Signed-off-by: Godbach <nylzhaowei@gmail.com>
This patch makes it possible to create binds and servers in separate
namespaces. This can be used to proxy between multiple completely independent
virtual networks (with possibly overlapping IP addresses) and a
non-namespace-aware proxy implementation that supports the proxy protocol (v2).
The setup is something like this:
net1 on VLAN 1 (namespace 1) -\
net2 on VLAN 2 (namespace 2) -- haproxy ==== proxy (namespace 0)
net3 on VLAN 3 (namespace 3) -/
The proxy is configured to make server connections through haproxy and sending
the expected source/target addresses to haproxy using the proxy protocol.
The network namespace setup on the haproxy node is something like this:
= 8< =
$ cat setup.sh
ip netns add 1
ip link add link eth1 type vlan id 1
ip link set eth1.1 netns 1
ip netns exec 1 ip addr add 192.168.91.2/24 dev eth1.1
ip netns exec 1 ip link set eth1.$id up
...
= 8< =
= 8< =
$ cat haproxy.cfg
frontend clients
bind 127.0.0.1:50022 namespace 1 transparent
default_backend scb
backend server
mode tcp
server server1 192.168.122.4:2222 namespace 2 send-proxy-v2
= 8< =
A bind line creates the listener in the specified namespace, and connections
originating from that listener also have their network namespace set to
that of the listener.
A server line either forces the connection to be made in a specified
namespace or may use the namespace from the client-side connection if that
was set.
For more documentation please read the documentation included in the patch
itself.
Signed-off-by: KOVACS Tamas <ktamas@balabit.com>
Signed-off-by: Sarkozi Laszlo <laszlo.sarkozi@balabit.com>
Signed-off-by: KOVACS Krisztian <hidden@balabit.com>
Tom Limoncelli from Stack Exchange reported a minor bug : the frontend
inherits the LB parameters from the defaults sections. The impact is
that if a "balance" directive uses any L7 parameter in the defaults
sections and the frontend is in TCP mode, a warning is emitted about
their incompatibility. The warning is harmless but a valid, sane config
should never cause any warning to be reported.
This fix should be backported into 1.5 and possibly 1.4.
Paul Taylor and Bryan Talbot found that after commit 419ead8 ("MEDIUM:
config: compute the exact bind-process before listener's maxaccept"),
a backend marked "disabled" would cause the next backend to be skipped
and if it was the last one it would cause a segfault.
The reason is that the commit above changed the "while" loop for a "for"
loop but a "continue" statement still incrementing the current proxy was
left in the code for disabled proxies, causing the next one to be skipped
as well and the last one to try to dereference NULL when seeking ->next.
The quick workaround consists in not disabling backends, or adding an
empty dummy one after a disabled section.
This fix must be backported to 1.5.
A segfault was reported with the introduction of the propagate_processes()
function. It was caused when a use_backend rule was declared with a dynamic
name, using a log-format string. The backend is not resolved during the
configuration, which lead to the segfault.
The patch prevents the process binding propagation for such dynamic rules, it
should also be backported to 1.5.
propagate_processes() must not be called with unresolved proxies, but
nothing prevents it from being called in check_config_validity(). The
resulting effect is that an unresolved proxy can cause a recursion
loop if called in such a situation, ending with a segfault after the
fatal error report. There's no side effect beyond this.
This patch refrains from calling the function when any error was met.
This bug also affects 1.5, it should be backported.
If a frontend has any tcp-request content rule relying on request contents
without any inspect delay, we now emit a warning as this will randomly match.
This can be backported to 1.5 as it reduces the support effort.
A config where a tcp-request rule appears after an http-request rule
might seem valid but it is not. So let's report a warning about this
since this case is hard to detect by the naked eye.
Some users want to have a stats frontend with one line per process, but while
100% valid and safe, the config parser emits a warning. Relax this check to
ensure that the warning is only emitted if at least one of the listeners is
bound to multiple processes, or if the directive is placed in a backend called
from multiple processes (since in this case we don't know if it's safe).
This is a continuation of previous patch, the listener's maxaccept is divided
by the number of processes, so it's best if we can swap the two blocks so that
the number of processes is already known when computing the maxaccept value.
When a frontend does not have any bind-process directive, make it
automatically bind to the union of all of its listeners' processes
instead of binding to all processes. That will make it possible to
have the expected behaviour without having to explicitly specify a
bind-process directive.
Note that if the listeners are not bound to a specific process, the
default is still to bind to all processes.
This change could be backported to 1.5 as it simplifies process
management, and was planned to be done during the 1.5 development phase.
We now recursively propagate the bind-process values between frontends
and backends instead of doing it during name resolving. This ensures
that we're able to properly propagate all the bind-process directives
even across "listen" instances, which are not perfectly covered at the
moment, depending on the declaration order.
This basically reverts 3507d5d ("MEDIUM: proxy: only adjust the backend's
bind-process when already set"). It was needed during the transition to
the new process binding method but is causing trouble now because frontend
to backend binding is not properly propagated.
This fix should be backported to 1.5.
When an unknown encryption algorithm is used in userlists or the password is
not pasted correctly in the configuration, http authentication silently fails.
An initial check is now performed during the configuration parsing, in order to
verify that the encrypted password is supported. An unsupported password will
fail with a fatal error.
This patch should be backported to 1.4 and 1.5.
Add support for http-request track-sc, similar to what is done in
tcp-request for backends. A new act_prm field was added to HTTP
request rules to store the track params (table, counter). Just
like for TCP rules, the table is resolved while checking for
config validity. The code was mostly copied from the TCP code
with the exception that here we also count the HTTP request count
and rate by hand. Probably that something could be factored out in
the future.
It seems like tracking flags should be improved to mark each hook
which tracks a key so that we can have some check points where to
increase counters of the past if not done yet, a bit like is done
for TRACK_BACKEND.
With all the goodies supported by logformat, people find that the limit
of 1024 chars for log lines is too short. Some servers do not support
larger lines and can simply drop them, so changing the default value is
not always the best choice.
This patch takes a different approach. Log line length is specified per
log server on the "log" line, with a value between 80 and 65535. That
way it's possibly to satisfy all needs, even with some fat local servers
and small remote ones.
This patch remove all references of standard regex in haproxy. The last
remaining references are only in the regex.[ch] files.
In the file src/checks.c, the original function uses a "pmatch" array.
In fact this array is unused. This patch remove it.
Similar to previous patches, HTTP header captures are performed when
a TCP frontend switches to an HTTP backend, but are not possible to
report. So let's relax the check to explicitly allow them to be present
in TCP frontends.
When no static DH parameters are specified, this patch makes haproxy
use standardized (rfc 2409 / rfc 3526) DH parameters with prime lenghts
of 1024, 2048, 4096 or 8192 bits for DHE key exchange. The size of the
temporary/ephemeral DH key is computed as the minimum of the RSA/DSA server
key size and the value of a new option named tune.ssl.default-dh-param.
MySQL will in stop supporting pre-4.1 authentication packets in the future
and is already giving us a hard time regarding non-silencable warnings
which are logged on each health check. Warnings look like the following:
"[Warning] Client failed to provide its character set. 'latin1' will be used
as client character set."
This patch adds basic support for post-4.1 authentication by sending the proper
authentication packet with the character set, along with the QUIT command.
Now that it is possible to know whether a server is in forced maintenance
or inherits its maintenance status from another one, it is possible to
allow server tracking at more than one level. We still provide a loop
detection however.
Note that for the stats it's a bit trickier since we have to report the
check state which corresponds to the state of the server at the end of
the chain.
This change now involves a new flag SRV_ADMF_IMAINT to note that the
maintenance status of a server is inherited from another server. Thus,
we know at each server level in the chain if it's running, in forced
maintenance or in a maintenance status because it tracks another server,
or even in both states.
Disabling a server propagates this flag down to other servers. Enabling
a server flushes the flag down. A server becomes up again once both of
its flags are cleared.
Two new functions "srv_adm_set_maint()" and "srv_adm_set_ready()" are used to
manipulate this maintenance status. They're used by the CLI and the stats
page.
Now the stats page always says "MAINT" instead of "MAINT(via)" and it's
only the chk/down field which reports "via x/y" when the status is
inherited from another server, but it doesn't say it when a server was
forced into maintenance. The CSV output indicates "MAINT (via x/y)"
instead of only "MAINT(via)". This is the most accurate representation.
One important thing is that now entering/leaving maintenance for a
tracking server correctly follows the state of the tracked server.
Servers used to have 3 flags to store a state, now they have 4 states
instead. This avoids lots of confusion for the 4 remaining undefined
states.
The encoding from the previous to the new states can be represented
this way :
SRV_STF_RUNNING
| SRV_STF_GOINGDOWN
| | SRV_STF_WARMINGUP
| | |
0 x x SRV_ST_STOPPED
1 0 0 SRV_ST_RUNNING
1 0 1 SRV_ST_STARTING
1 1 x SRV_ST_STOPPING
Note that the case where all bits were set used to exist and was randomly
dealt with. For example, the task was not stopped, the throttle value was
still updated and reported in the stats and in the http_server_state header.
It was the same if the server was stopped by the agent or for maintenance.
It's worth noting that the internal function names are still quite confusing.
Now we introduce srv->admin and srv->prev_admin which are bitfields
containing one bit per source of administrative status (maintenance only
for now). For the sake of backwards compatibility we implement a single
source (ADMF_FMAINT) but the code already checks any source (ADMF_MAINT)
where the STF_MAINTAIN bit was previously checked. This will later allow
us to add ADMF_IMAINT for maintenance mode inherited from tracked servers.
Along doing these changes, it appeared that some places will need to be
revisited when implementing the inherited bit, this concerns all those
modifying the ADMF_FMAINT bit (enable/disable actions on the CLI or stats
page), and the checks to report "via" on the stats page. But currently
the code is harmless.
Till now, the server's state and flags were all saved as a single bit
field. It causes some difficulties because we'd like to have an enum
for the state and separate flags.
This commit starts by splitting them in two distinct fields. The first
one is srv->state (with its counter-part srv->prev_state) which are now
enums, but which still contain bits (SRV_STF_*).
The flags now lie in their own field (srv->flags).
The function srv_is_usable() was updated to use the enum as input, since
it already used to deal only with the state.
Note that currently, the maintenance mode is still in the state for
simplicity, but it must move as well.
Thomas Heil reported that previous commit 07fcaaa ("MINOR: fix a few
memory usage errors") make haproxy crash when req* rules are used. As
diagnosed by Cyril Bont, this commit introduced a regression which
makes haproxy free the memory areas allocated for regex even when
they're going to be used, resulting in the crashes.
This patch does three things :
- undo the free() on the valid path
- add regfree() on the error path but only when regcomp() succeeds
- rename err_code to ret_code to avoid confusing the valid return
path with an error path.
John-Paul Bader reported a stupid regression in 1.5-dev25, we
forget to check that global.stats_fe is initialized before visiting
its sockets, resulting in a crash.
No backport is needed.
We used to have is_addr() in place to validate sometimes the existence
of an address, sometimes a valid IPv4 or IPv6 address. Replace them
carefully so that is_inet_addr() is used wherever we can only use an
IPv4/IPv6 address.
Till now a warning was emitted if the "stats bind-process" was not
specified when nbproc was greater than 1. Now we can be much finer
and only emit a warning when at least of the stats socket is bound
to more than one process at a time.
When a process list is specified on either the proxy or the bind lines,
the latter is refined to the intersection of the two. A warning is emitted
if no intersection is found, and the situation is fixed by either falling
back to the first process of the proxy or to all processes.
When a bind-process setting is present in a frontend or backend, we
now verify that the specified process range at least shares one common
process with those defined globally by nbproc. Then if the value is
set, it is reduced to the one enforced by nbproc.
A warning is emitted if process count does not match, and the fix is
done the following way :
- if a single process was specified in the range, it's remapped to
process #1
- if more than one process was specified, the binding is removed
and all processes are usable.
Note that since backends may inherit their settings from frontends,
depending on the declaration order, they may or may not be reported
as warnings.
Some consistency checks cannot be performed between frontends, backends
and peers at the moment because there is no way to check for intersection
between processes bound to some processes when the number of processes is
higher than the number of bits in a word.
So first, let's limit the number of processes to the machine's word size.
This means nbproc will be limited to 32 on 32-bit machines and 64 on 64-bit
machines. This is far more than enough considering that configs rarely go
above 16 processes due to scalability and management issues, so 32 or 64
should be fine.
This way we'll ensure we can always build a mask of all the processes a
section is bound to.
By default, a proxy's bind_proc is zero, meaning "bind to all processes".
It's only when not zero that its process list is restricted. So we don't
want the frontends to enforce the value on the backends when the backends
are still set to zero.
Now, haproxy exit an error saying:
Unable to initialize the lock for the shared SSL session cache. You can retry using
the global statement 'tune.ssl.force-private-cache' but it could increase the CPU
usage due to renegotiation if nbproc > 1.
Process shared mutex seems not supported on some OSs (FreeBSD).
This patch checks errors on mutex lock init to fallback
on a private session cache (per process cache) in error cases.
Commit fc6c032 ("MEDIUM: global: add support for CPU binding on Linux ("cpu-map")")
merged into 1.5-dev13 involves a useless test that clang reports as a warning. The
"low" variable cannot be negative here. Issue reported by Charles Carter.
The "block" rules are redundant with http-request rules because they
are performed immediately before and do exactly the same thing as
"http-request deny". Moreover, this duplication has led to a few
minor stats accounting issues fixed lately.
Instead of keeping the two rule sets, we now build a list of "block"
rules that we compile as "http-request block" and that we later insert
at the beginning of the "http-request" rules.
The only user-visible change is that in case of a parsing error, the
config parser will now report "http-request block rule" instead of
"blocking condition".
Finn Arne Gangstad suggested that we should have the ability to break
keep-alive when the target server has reached its maxconn and that a
number of connections are present in the queue. After some discussion
around his proposed patch, the following solution was suggested : have
a per-proxy setting to fix a limit to the number of queued connections
on a server after which we break keep-alive. This ensures that even in
high latency networks where keep-alive is beneficial, we try to find a
different server.
This patch is partially based on his original proposal and implements
this configurable threshold.
In version 1.3.4, we got the ability to split configuration parts between
frontends and backends. The stats was attached to the backend and a control
was made to ensure that it was used only in a listen or backend section, but
not in a frontend.
The documentation clearly says that the statement may only be used in the
backend.
But since that same version above, the defaults stats configuration is
only filled in the frontend part of the proxy and not in the backend's.
So a backend will not get stats which are enabled in a defaults section,
despite what the doc says. However, a frontend configured after a defaults
section will get stats and will not emit the warning!
There were many technical limitations in 1.3.4 making it impossible to
have the stats working both in the frontend and backend, but now this has
become a total mess.
It's common however to see people create a frontend with a perfectly
working stats configuration which only emits a warning stating that it
might not work, adding to the confusion. Most people workaround the tricky
behaviour by declaring a "listen" section with no server, which was the
recommended solution in 1.3 where it was even suggested to add a dispatch
address to avoid a warning.
So the right solution seems to do the following :
- ensure that the defaults section's settings apply to the backends,
as documented ;
- let the frontends work in order not to break existing setups relying
on the defaults section ;
- officially allow stats to be declared in frontends and remove the
warninng
This patch should probably not be backported since it's not certain that
1.4 is fully compatible with having stats in frontends and backends (which
was really made possible thanks to applets).
Till now there was no check against misplaced use-server rules, and
no warning was emitted, adding to the confusion. They're processed
just after the use_backend rules, or more exactly at the same level
but for the backend.
Recently, the http-request ruleset started to be used a lot and some
bug reports were caused by misplaced http-request rules because there
was no warning if they're after a redirect or use_backend rule. Let's
fix this now. http-request rules are just after the block rules.
Since it became possible to use log-format expressions in use_backend,
having a mandatory condition becomes annoying because configurations
are full of "if TRUE". Let's relax the check to accept no condition
like many other keywords (eg: redirect).
When compiled with USE_GETADDRINFO, make sure we use getaddrinfo(3) to
perform name lookups. On default dual-stack setups this will change the
behavior of using IPv6 first. Global configuration option
'nogetaddrinfo' can be used to revert to deprecated gethostbyname(3).
A few occurrences of sprintf() were causing harmless warnings on OpenBSD :
src/cfgparse.o(.text+0x259e): In function `cfg_parse_global':
src/cfgparse.c:1044: warning: sprintf() is often misused, please use snprintf()
These ones were easy to get rid of, so better do it.
The cfgparse.c file becomes huge, and a large part of it comes from the
server keyword parser. Since the configuration is a bit more modular now,
move this parser to server.c.
This patch also moves the check of the "server" keyword earlier in the
supported keywords list, resulting in a slightly faster config parsing
for configs with large numbers of servers (about 10%).
No functional change was made, only the code was moved.
We have a use case where we look up a customer ID in an HTTP header
and direct it to the corresponding server. This can easily be done
using ACLs and use_backend rules, but the configuration becomes
painful to maintain when the number of customers grows to a few
tens or even a several hundreds.
We realized it would be nice if we could make the use_backend
resolve its name at run time instead of config parsing time, and
use a similar expression as http-request add-header to decide on
the proper backend to use. This permits the use of prefixes or
even complex names in backend expressions. If no name matches,
then the default backend is used. Doing so allowed us to get rid
of all the use_backend rules.
Since there are some config checks on the use_backend rules to see
if the referenced backend exists, we want to keep them to detect
config errors in normal config. So this patch does not modify the
default behaviour and proceeds this way :
- if the backend name in the use_backend directive parses as a log
format rule, it's used as-is and is resolved at run time ;
- otherwise it's a static name which must be valid at config time.
There was the possibility of doing this with the use-server directive
instead of use_backend, but it seems like use_backend is more suited
to this task, as it can be used for other purposes. For example, it
becomes easy to serve a customer-specific proxy.pac file based on the
customer ID by abusing the errorfile primitive :
use_backend bk_cust_%[hdr(X-Cust-Id)] if { hdr(X-Cust-Id) -m found }
default_backend bk_err_404
backend bk_cust_1
errorfile 200 /etc/haproxy/static/proxy.pac.cust1
Signed-off-by: Bertrand Jacquin <bjacquin@exosec.fr>
This patch permit to register new sections in the haproxy's
configuration file. This run like all the "keyword" registration, it is
used during the haproxy initialization, typically with the
"__attribute__((constructor))" functions.
The function str2net runs DNS resolution if valid ip cannot be parsed.
The DNS function used is the standard function of the libc and it
performs asynchronous request.
The asynchronous request is not compatible with the haproxy
archictecture.
str2net() is used during the runtime throught the "socket".
This patch remove the DNS resolution during the runtime.
This patch remove the limit of 32 groups. It also permit to use standard
"pat_parse_str()" function in place of "pat_parse_strcat()". The
"pat_parse_strcat()" is no longer used and its removed. Before this
patch, the groups are stored in a bitfield, now they are stored in a
list of strings. The matching is slower, but the number of groups is
low and generally the list of allowed groups is short.
The fetch function "smp_fetch_http_auth_grp()" used with the name
"http_auth_group" return valid username. It can be used as string for
displaying the username or with the acl "http_auth_group" for checking
the group of the user.
Maybe the names of the ACL and fetch methods are no longer suitable, but
I keep the current names for conserving the compatibility with existing
configurations.
The function "userlist_postinit()" is created from verification code
stored in the big function "check_config_validity()". The code is
adapted to the new authentication storage system and it is moved in the
"src/auth.c" file. This function is used to check the validity of the
users declared in groups and to check the validity of groups declared
on the "user" entries.
This resolve function is executed before the check of all proxy because
many acl needs solved users and groups.
The binary samples are sometimes copied as is into http headers.
A sample can contain bytes unallowed by the http rfc concerning
header content, for example if it was extracted from binary data.
The resulting http request can thus be invalid.
This issue does not yet happen because haproxy currently (mistakenly)
hex-encodes binary data, so it is not really possible to retrieve
invalid HTTP chars.
The solution consists in hex-encoding all non-printable chars prefixed
by a '%' sign.
No backport is needed since existing code is not affected yet.
cfg_parse_listen() currently checks for duplicated proxy names.
Now that we have a tree for this, we can use it.
The config load time was further reduced by 1.6, which is now
about 4.5 times faster than what it was without the trees.
In fact it was the last CPU-intensive processing involving proxy
names. Now the only remaining point is the automatic fullconn
computation which can be bypassed by having a fullconn in the
defaults section, reducing the load time by another 10x.
Large configurations can take time to parse when thousands of backends
are in use. Let's store all the proxies in trees.
findproxy_mode() has been modified to use the tree for lookups, which
has divided the parsing time by about 2.5. But many lookups are still
present at many places and need to be dealt with.
Disabled backends don't have their symbols resolved. We must not initialize
their peers section since they're not valid and instead still contain the
section's name.
There are other places where such unions are still in use, and other similar
errors might still happen. Ideally we should get rid of all of them in the
quite sensible config stage.
Commits e0d1bfb ("[MINOR] Allow shutdown of sessions when a server
becomes unavailable") and eb2c24a ("MINOR: checks: add on-marked-up
option") mentionned that the directive was supported in default-server
but while it can be stated there, it's ignored because the config value
is not copied from the default server upon creation of a new server.
Moving the statement to the "server" lines works fine though. Thanks
to Baptiste Assmann for reporting and diagnosing this bug.
These features were introduced in 1.5-dev6 and 1.5-dev10 respectively,
so no backport is needed.
Cyril Bont reported that despite commit 0dbbf317 which attempted
to fix the crash when a peers section has no name, we still get a
segfault after the error message when parsing the peers. The reason
is that the returned error code is ERR_FATAL and not ERR_ABORT, so
the parsing continues while the section was not initialized.
This is 1.5-specific, no backport is needed.
The ability to globally override the default client and server cipher
suites has been requested multiple times since the introduction of SSL.
This commit adds two new keywords to the global section for this :
- ssl-default-bind-ciphers
- ssl-default-server-ciphers
It is still possible to preset them at build time by setting the macros
LISTEN_DEFAULT_CIPHERS and CONNECT_DEFAULT_CIPHERS.
The new tune.idletimer value allows one to set a different value for
idle stream detection. The default value remains set to one second.
It is possible to disable it using zero, and to change the default
value at build time using DEFAULT_IDLE_TIMER.
A new tcp-check rule type: connect.
It allows HAProxy to test applications which stand on multiple ports or
multiple applications load-balanced through the same backend.
At the very beginning of haproxy, there was "option httpclose" to make
haproxy add a "Connection: close" header in both directions to invite
both sides to agree on closing the connection. It did not work with some
rare products, so "option forceclose" was added to do the same and actively
close the connection. Then client-side keep-alive was supported, so option
http-server-close was introduced. Now we have keep-alive with a fourth
option, not to mention the implicit tunnel mode.
The connection configuration has become a total mess because all the
options above may be combined together, despite almost everyone thinking
they cancel each other, as judging from the common problem reports on the
mailing list. Unfortunately, re-reading the doc shows that it's not clear
at all that options may be combined, and the opposite seems more obvious
since they're compared. The most common issue is options being set in the
defaults section that are not negated in other sections, but are just
combined when the user expects them to be overloaded. The migration to
keep-alive by default will only make things worse.
So let's start to address the first problem. A transaction can only work in
5 modes today :
- tunnel : haproxy doesn't bother with what follows the first req/resp
- passive close : option http-close
- forced close : option forceclose
- server close : option http-server-close with keep-alive on the client side
- keep-alive : option http-keep-alive, end to end
All 16 combination for each section fall into one of these cases. Same for
the 256 combinations resulting from frontend+backend different modes.
With this patch, we're doing something slightly different, which will not
change anything for users with valid configs, and will only change the
behaviour for users with unsafe configs. The principle is that these options
may not combined anymore, and that the latest one always overrides all the
other ones, including those inherited from the defaults section. The "no
option xxx" statement is still supported to cancel one option and fall back
to the default one. It is mainly needed to ignore defaults sections (eg:
force the tunnel mode). The frontend+backend combinations have not changed.
So for examplen the following configuration used to put the connection
into forceclose :
defaults http
mode http
option httpclose
frontend foo.
option http-server-close
=> http-server-close+httpclose = forceclose before this patch! Now
the frontend's config replaces the defaults config and results in
the more expected http-server-close.
All 25 combinations of the 5 modes in (frontend,backend) have been
successfully tested.
In order to prepare for upcoming changes, a new "option http-tunnel" was
added. It currently only voids all other options, and has the lowest
precedence when mixed with another option in another frontend/backend.
If no CA file specified on a server line, the config parser will show an error.
Adds an cmdline option '-dV' to re-set verify 'none' as global default on
servers side (previous behavior).
Also adds 'ssl-server-verify' global statement to set global default to
'none' or 'required'.
WARNING: this changes the default verify mode from "none" to "required" on
the server side, and it *will* break insecure setups.
Just like the previous commit, we sometimes want to limit the rate of
incoming SSL connections. While it can be done for a frontend, it was
not possible for a whole process, which makes sense when multiple
processes are running on a system to server multiple customers.
The new global "maxsslrate" setting is usable to fix a limit on the
session rate going to the SSL frontends. The limits applies before
the SSL handshake and not after, so that it saves the SSL stack from
expensive key computations that would finally be aborted before being
accounted for.
The same setting may be changed at run time on the CLI using
"set rate-limit ssl-session global".
It's sometimes useful to be able to limit the connection rate on a machine
running many haproxy instances (eg: per customer) but it removes the ability
for that machine to defend itself against a DoS. Thus, better also provide a
limit on the session rate, which does not include the connections rejected by
"tcp-request connection" rules. This permits to have much higher limits on
the connection rate without having to raise the session rate limit to insane
values.
The limit can be changed on the CLI using "set rate-limit sessions global",
or in the global section using "maxsessrate".
A config where multiple servers have the same name in the same backend is
prone to a number of issues : logs are not really exploitable, stats get
really tricky and even harder to change, etc...
In fact, it can be safe to have the same name between multiple servers only
when their respective IDs are known and used. So now we detect this situation
and emit a warning for the first conflict detected per server if any of the
servers uses an automatic ID.
When the load balancing algorithm in use is not deterministic, and a previous
request was sent to a server to which haproxy still holds a connection, it is
sometimes desirable that subsequent requests on a same session go to the same
server as much as possible. Note that this is different from persistence, as
we only indicate a preference which haproxy tries to apply without any form
of warranty. The real use is for keep-alive connections sent to servers. When
this option is used, haproxy will try to reuse the same connection that is
attached to the server instead of rebalancing to another server, causing a
close of the connection. This can make sense for static file servers. It does
not make much sense to use this in combination with hashing algorithms.
This new option enables HTTP keep-alive processing on the connections.
It can be overwritten by http-server-close, httpclose and forceclose.
Right now full-chain keep-alive is not yet implemented, but we need
the option to work on it. The doc will come later.
If a server is disabled in configuration and another one tracks it,
this last one must not inherit the MAINT flag otherwise it needs to
be explicitly enabled afterwards. Just remove this to fix the issue.
Health checks can now be paused. This is the status they get when the
server is put into maintenance mode, which is more logical than relying
on the server's state at some places. It will be needed to allow agent
checks to run when health checks are disabled (currently not possible).
Having the check state partially stored in the server doesn't help.
Some functions such as srv_getinter() rely on the server being checked
to decide what check frequency to use, instead of relying on the check
being configured. So let's get rid of SRV_CHECKED and SRV_AGENT_CHECKED
and only use the check's states instead.
At the moment, health checks and agent checks are tied : no agent
check is emitted if no health check is enabled. Other parameters
are considered in the condition for letting checks run. It will
help us selectively enable checks (agent and regular checks) to be
know whether they're enabled/disabled and configured or not. Now
we can already emit an error when trying to enable an unconfigured
agent.
Server tracking uses the same "tracknext" list for servers tracking
another one and for the servers being tracked. This caused an issue
which was fixed by commit f39c71c ([CRITICAL] fix server state tracking:
it was O(n!) instead of O(n)), consisting in ensuring that a server is
being checked before walking down the list, so that we don't propagate
the up/down information via servers being part of the track chain.
But the root cause is the fact that all servers share the same list.
The correct solution consists in having a list head for the tracked
servers and a list of next tracking servers. This simplifies the
propagation logic, especially for the case where status changes might
be passed to individual servers via the CLI.
Doing so ensures that we're consistent between all the functions in the whole
chain. This is important so that we can extract the argument parsing from this
function.
The function stktable_init() will return 0 if create_pool() returns NULL. Since
the returned value of this function is ignored, HAProxy will crash if the pool
of stick table is NULL and stksess_new() is called to allocate a new stick
session. It is a better choice to check the returned value and make HAProxy exit
with alert message if any error is caught.
Signed-off-by: Godbach <nylzhaowei@gmail.com>
The original codes are indented by spaces and not aligned with the former line.
It should be a convention to indent by tabs in HAProxy.
Signed-off-by: Godbach <nylzhaowei@gmail.com>
This is a generic health check which can be used to match a
banner or send a request and analyse a server response.
It works in a send/expect ways and many exchange can be done between
HAProxy and a server to decide the server status, making HAProxy able to
speak the server's protocol.
It can send arbitrary regular or binary strings and match content as a
regular or binary string or a regex.
Signed-off-by: Baptiste Assmann <bedis9@gmail.com>
Since commit 4a74143 (MEDIUM: Paramatise functions over the check of a
server), the check type is inherited from the current proxy's check type
at the moment where the server is declared instead of when reviewing
server configs. This causes an issue where a health check is disabled
when the server is declared before the checks. In fact the server will
inherit the last known check type declared before the "server" line :
backend foo
# this server is not checked at all
server s1 1.1.1.1:80 check
option tcpchk
# this server is tcp-checked :
server s2 1.1.1.2:80 check
option httpchk
# this server is http-checked :
server s3 1.1.1.3:80 check
The fix consists in assigning the check type during the config review
phase where the config is stable. No backport is nedeed.
We handle "http-request redirect" with a log-format string now, but we
leave "redirect" unaffected.
Note that the control of the special "/" case is move from the runtime
execution to the configuration parsing. If the format rule list is
empty, the build_logline() function does nothing.
When parsing track-sc* actions in tcp-request rules, we now automatically
compute the track-sc identifier number using %d when displaying an error
message. But the ID has become wrong since we introduced sc0, we continue
to report id+1 in error messages causing some confusion.
No backport is needed.
Add a DRAIN sub-state for a server which
will be shown on the stats page instead of UP if
its effective weight is zero.
Also, log if a server enters or leaves the DRAIN state
as the result of an agent check.
Signed-off-by: Simon Horman <horms@verge.net.au>
This is achieved by moving rise and fall from struct server to struct check.
After this move the behaviour of the primary check, server->check is
unchanged. However, the secondary agent check, server->agent now has
independent rise and fall values each of which are set to 1.
The result is that receiving "fail", "stopped" or "down" just once from the
agent will mark the server as down. And receiving a weight just once will
allow the server to be marked up if its primary check is in good health.
This opens up the scope to allow the rise and fall values of the agent
check to be configurable, however this has not been implemented at this
stage.
Signed-off-by: Simon Horman <horms@verge.net.au>
Allow an auxiliary agent check to be run independently of the
regular a regular health check. This is enabled by the agent-check
server setting.
The agent-port, which specifies the TCP port to use for the agent's
connections, is required.
The agent-inter, which specifies the interval between agent checks and
timeout of agent checks, is optional. If not set the value for regular
checks is used.
e.g.
server web1_1 127.0.0.1:80 check agent-port 10000
If either the health or agent check determines that a server is down
then it is marked as being down, otherwise it is marked as being up.
An agent health check performed by opening a TCP socket and reading an
ASCII string. The string should have one of the following forms:
* An ASCII representation of an positive integer percentage.
e.g. "75%"
Values in this format will set the weight proportional to the initial
weight of a server as configured when haproxy starts.
* The string "drain".
This will cause the weight of a server to be set to 0, and thus it
will not accept any new connections other than those that are
accepted via persistence.
* The string "down", optionally followed by a description string.
Mark the server as down and log the description string as the reason.
* The string "stopped", optionally followed by a description string.
This currently has the same behaviour as "down".
* The string "fail", optionally followed by a description string.
This currently has the same behaviour as "down".
Signed-off-by: Simon Horman <horms@verge.net.au>
Remove option lb-agent-chk and thus the facility to configure
a stand-alone agent health check. This feature was added by
"MEDIUM: checks: Add agent health check". It will be replaced
by subsequent patches with a features to allow an agent check
to be run as either a secondary check, along with any of the existing
checks, or as part of an http check with the status returned
in an HTTP header.
This patch does not entirely revert "MEDIUM: checks: Add agent health
check". The infrastructure it provides to parse the results of an
agent health check remains and will be re-used by the planned features
that are mentioned above.
Signed-off-by: Simon Horman <horms@verge.net.au>
This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.
Signed-off-by: Simon Horman <horms@verge.net.au>
Paramatise the following functions over the check of a server
* set_server_down
* set_server_up
* srv_getinter
* server_status_printf
* set_server_check_status
* set_server_disabled
* set_server_enabled
Generally the server parameter of these functions has been removed.
Where it is still needed it is obtained using check->server.
This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.
By paramatising these functions they may act on each of the checks
without further significant modification.
Explanation of the SSP_O_HCHK portion of this change:
* Prior to this patch SSP_O_HCHK serves a single purpose which
is to tell server_status_printf() weather it should print
the details of the check of a server or not.
With the paramatisation that this patch adds there are two cases.
1) Printing the details of the check in which case a
valid check parameter is needed.
2) Not printing the details of the check in which case
the contents check parameter are unused.
In case 1) we could pass SSP_O_HCHK and a valid check and;
In case 2) we could pass !SSP_O_HCHK and any value for check
including NULL.
If NULL is used for case 2) then SSP_O_HCHK becomes supurfulous
and as NULL is used for case 2) SSP_O_HCHK has been removed.
Signed-off-by: Simon Horman <horms@verge.net.au>
This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.
The split has been made by:
* Moving elements of struct server's check element that will
be shared by both checks into a new check_common element
of struct server.
* Moving the remaining elements to a new struct check and
making struct server's check element a struct check.
* Adding a server element to struct check, a back-pointer
to the server element it is a member of.
- At this time the server could be obtained using
container_of, however, this will not be so easy
once a second struct check element is added to struct server
to accommodate an agent health check.
Signed-off-by: Simon Horman <horms@verge.net.au>
This function was designed for haproxy while testing other functions
in the past. Initially it was not planned to be used given the not
very interesting numbers it showed on real URL data : it is not as
smooth as the other ones. But later tests showed that the other ones
are extremely sensible to the server count and the type of input data,
especially DJB2 which must not be used on numeric input. So in fact
this function is still a generally average performer and it can make
sense to merge it in the end, as it can provide an alternative to
sdbm+avalanche or djb2+avalanche for consistent hashing or when hashing
on numeric data such as a source IP address or a visitor identifier in
a URL parameter.
Summary:
Avalanche is supported not as a native hashing choice, but a modifier
on the hashing function. Note that this means that possible configs
written after 1.5-dev4 using "hash-type avalanche" will get an informative
error instead. But as discussed on the mailing list it seems nobody ever
used it anyway, so let's fix it before the final 1.5 release.
The default values were selected for backward compatibility with previous
releases, as discussed on the mailing list, which means that the consistent
hashing will still apply the avalanche hash by default when no explicit
algorithm is specified.
Examples
(default) hash-type map-based
Map based hashing using sdbm without avalanche
(default) hash-type consistent
Consistent hashing using sdbm with avalanche
Additional Examples:
(a) hash-type map-based sdbm
Same as default for map-based above
(b) hash-type map-based sdbm avalanche
Map based hashing using sdbm with avalanche
(c) hash-type map-based djb2
Map based hashing using djb2 without avalanche
(d) hash-type map-based djb2 avalanche
Map based hashing using djb2 with avalanche
(e) hash-type consistent sdbm avalanche
Same as default for consistent above
(f) hash-type consistent sdbm
Consistent hashing using sdbm without avalanche
(g) hash-type consistent djb2
Consistent hashing using djb2 without avalanche
(h) hash-type consistent djb2 avalanche
Consistent hashing using djb2 with avalanche
Summary:
In testing at tumblr, we found that using djb2 hashing instead of the
default sdbm hashing resulted is better workload distribution to our backends.
This commit implements a change, that allows the user to specify the hash
function they want to use. It does not limit itself to consistent hashing
scenarios.
The supported hash functions are sdbm (default), and djb2.
For a discussion of the feature and analysis, see mailing list thread
"Consistent hashing alternative to sdbm" :
http://marc.info/?l=haproxy&m=138213693909219
Note: This change does NOT make changes to new features, for instance,
applying an avalance hashing always being performed before applying
consistent hashing.
Mathew Levett reported an issue which is a bit nasty and hard to track
down. RDP cookies contain both the IP and the port, and haproxy matches
them exactly. So if a server has no port specified (or a remapped port),
it will never match a port specified in a cookie. Better warn the user
when this is detected.
In preparation of more flexibility in the stick counters, make their
number configurable. It still defaults to 3 which is the minimum
accepted value. Changing the value alone is not sufficient to get
more counters, some bitfields still need to be updated and the TCP
actions need to be updated as well, but this update tries to be
easier, which is nice for experimentation purposes.
The max weight of server is 256 now, but SRV_UWGHT_MAX is still 255. As a result,
FWRR will not work well when server's weight is 256. The description is as below:
There are some macros related to server's weight in include/types/server.h:
#define SRV_UWGHT_RANGE 256
#define SRV_UWGHT_MAX (SRV_UWGHT_RANGE - 1)
#define SRV_EWGHT_MAX (SRV_UWGHT_MAX * BE_WEIGHT_SCALE)
Since weight of server can be reach to 256 and BE_WEIGHT_SCALE equals to 16,
the max eweight of server should be 256*16 = 4096, it will exceed SRV_EWGHT_MAX
which equals to SRV_UWGHT_MAX*BE_WEIGHT_SCALE = 255*16 = 4080. When a server
with weight 256 is insterted into FWRR tree during initialization, the key value
of this server should be SRV_EWGHT_MAX - s->eweight = 4080 - 4096 = -16 which
is closed to UINT_MAX in unsigned type, so the server with highest weight will
be not elected as the first server to process request.
In addition, it is a better choice to compare with SRV_UWGHT_MAX than a magic
number 256 while doing check for the weight. The max number of servers for
round-robin algorithm is also updated.
Signed-off-by: Godbach <nylzhaowei@gmail.com>
It was a bit inconsistent to have gpc start at 0 and sc start at 1,
so make sc start at zero like gpc. No previous release was issued
with sc3 anyway, so no existing setup should be affected.
Some actions were clearly missing to process response headers. This
patch adds a new "http-response" ruleset which provides the following
actions :
- allow : stop evaluating http-response rules
- deny : stop and reject the response with a 502
- add-header : add a header in log-format mode
- set-header : set a header in log-format mode
We're often missin a third counter to track base, src and base+src at
the same time. Here we introduce track_sc3 to have this third counter.
It would be wise not to add much more counters because that slightly
increases the session size and processing time though the real issue
is more the declaration of the keywords in the code and in the doc.
By properly affecting the flags and values, it becomes easier to add
more tracked counters, for example for experimentation. It also slightly
reduces the code and the number of tests. No counters were added with
this patch.
This patch does not change the logic of the code, it only changes the
way OS-specific defines are tested.
At the moment the transparent proxy code heavily depends on Linux-specific
defines. This first patch introduces a new define "CONFIG_HAP_TRANSPARENT"
which is set every time the defines used by transparent proxy are present.
This also means that with an up-to-date libc, it should not be necessary
anymore to force CONFIG_HAP_LINUX_TPROXY during the build, as the flags
will automatically be detected.
The CTTPROXY flags still remain separate because this older API doesn't
work the same way.
A new line has been added in the version output for haproxy -vv to indicate
what transparent proxy support is available.
Source function will not work with the following line in default section:
source 0.0.0.0 usesrc clientip
even that related settings by iptables and ip rule have been set correctly.
But it can work well in backend setcion.
The reason is that the operation in line 1815 in cfgparse.c as below:
curproxy->conn_src.opts = defproxy.conn_src.opts & ~CO_SRC_TPROXY_MASK;
clears three low bits of conn_src.opts which stores the configuration of
'usesrc'. Without correct bits set, the source address function can not
work well. They should be copied to the backend proxy without being modified.
Since conn_src.tproxy_addr had not copied from defproxy to backend proxy
while initializing backend proxy, source function will not work well
with explicit source address set in default section either.
Signed-off-by: Godbach <nylzhaowei@gmail.com>
Note: the bug was introduced in 1.5-dev16 with commit ef9a3605
Commit a4312fa2 merged into dev18 improved log-format management by
processing "log-format" and "unique-id-format" where they were declared,
so that the faulty args could be reported with their correct line numbers.
Unfortunately, the log-format parser considers the proxy mode (TCP/HTTP)
and now if the directive is set before the "mode" statement, it can be
rejected and report warnings.
So we really need to parse these directives at the end of a section at
least. Right now we do not have an "end of section" event, so we need
to store the file name and line number for each of these directives,
and take care of them at the end.
One of the benefits is that now the line numbers can be inherited from
the line passing "option httplog" even if it's in a defaults section.
Future improvements should be performed to report line numbers in every
log-format processed by the parser.
While ACL args were resolved after all the config was parsed, it was not the
case with sample fetch args because they're almost everywhere now.
The issue is that ACLs now solely rely on sample fetches, so their args
resolving doesn't work anymore. And many fetches involving a server, a
proxy or a userlist don't work at all.
The real issue is that at the bottom layers we have no information about
proxies, line numbers, even ACLs in order to report understandable errors,
and that at the top layers we have no visibility over the locations where
fetches are referenced (think log node).
After failing multiple unsatisfying solutions attempts, we now have a new
concept of args list. The principle is that every proxy has a list head
which contains a number of indications such as the config keyword, the
context where it's used, the file and line number, etc... and a list of
arguments. This list head is of the same type as the elements, so it
serves as a template for adding new elements. This way, it is filled from
top to bottom by the callers with the information they have (eg: line
numbers, ACL name, ...) and the lower layers just have to duplicate it and
add an element when they face an argument they cannot resolve yet.
Then at the end of the configuration parsing, a loop passes over each
proxy's list and resolves all the args in sequence. And this way there is
all necessary information to report verbose errors.
The first immediate benefit is that for the first time we got very precise
location of issues (arg number in a keyword in its context, ...). Second,
in order to do this we had to parse log-format and unique-id-format a bit
earlier, so that was a great opportunity for doing so when the directives
are encountered (unless it's a default section). This way, the recorded
line numbers for these args are the ones of the place where the log format
is declared, not the end of the file.
Userlists report slightly more information now. They're the only remaining
ones in the ACL resolving function.
The acl_expr struct used to hold a pointer to the ACL keyword. But since
we now have all the relevant pointers, we don't need that anymore, we just
need the pointer to the keyword as a string in order to return warnings
and error messages.
So let's change this in order to remove the dependency on the acl_keyword
struct from acl_expr.
During this change, acl_cond_kw_conflicts() used to return a pointer to an
ACL keyword but had to be changed to return a const char* for the same reason.