Commit Graph

219 Commits

Author SHA1 Message Date
Baptiste Assmann
0453a1dd45 MINOR: dns: new flag to report that no IP can be found in a DNS response packet
Some DNS response may be valid from a protocol point of view but may not
contain any IP addresses.
This patch gives a new flag to the function dns_get_ip_from_response to
report such case.
It's up to the upper layer to decide what to do with this information.
2015-09-10 15:42:55 +02:00
Baptiste Assmann
96972bcd36 MINOR: dns: no expected DNS record type found
Some DNS responses may be valid from a protocol point of view, but may
not contain any information considered as interested by the requester..
Purpose of the flag DNS_RESP_NO_EXPECTED_RECORD introduced by this patch is
to allow reporting such situation.

When this happens, a new DNS query is sent with a new query type.

For now, the function only expect A and AAAA query types which is enough
to cover current cases.
In a next future, it will be up to the caller to tell the function which
query types are expected.
2015-09-10 15:41:53 +02:00
Willy Tarreau
07101d5a16 BUG/MEDIUM: dns: use the correct server hostname when resolving
The server's host name picked for resolution was incorrect, it did not
skip the address family specifier, did not resolve environment variables,
and messed up with the optional trailing colon.

Instead, let's get the fqdn returned by str2sa_range() and use that
exclusively.
2015-09-08 16:16:35 +02:00
Willy Tarreau
72b8c1f0aa MEDIUM: tools: make str2sa_range() optionally return the FQDN
The function does a bunch of things among which resolving environment
variables, skipping address family specifiers and trimming port ranges.
It is the only one which sees the complete host name before trying to
resolve it. The DNS resolving code needs to know the original hostname,
so we modify this function to optionally provide it to the caller.

Note that the function itself doesn't know if the host part was a host
or an address, but str2ip() knows that and can be asked not to try to
resolve. So we first try to parse the address without resolving and
try again with resolving enabled. This way we know if the address is
explicit or needs some kind of resolution.
2015-09-08 15:50:19 +02:00
Baptiste Assmann
90447582d7 MINOR: DNS client query type failover management
In the first version of the DNS resolver, HAProxy sends an ANY query
type and in case of issue fails over to the type pointed by the
directive in 'resolve-prefer'.
This patch allows the following new failover management:
1. default query type is still ANY
2. if response is truncated or in error because ANY is not supported by
   the server, then a fail over to a new query type is performed. The
   new query type is the one pointed by the directive 'resolve-prefer'.
3. if no response or still some errors occurs, then a query type fail over
   is performed to the remaining IP address family.
2015-09-08 15:04:17 +02:00
Baptiste Assmann
0df5d9669a MINOR: dns: New DNS response analysis code: DNS_RESP_TRUNCATED
This patch introduces a new internal response state about the analysis
of a DNS response received by a server.
It is dedicated to report to above layer that the response is
'truncated'.
2015-09-08 14:58:07 +02:00
Baptiste Assmann
11c4e4eefb BUG/MAJOR: dns: dns client resolution infinite loop
Under certain circonstance (a configuration with many servers relying on
DNS resolution and one of them triggering the replay of a request
because of a timeout or invalid response to an ANY query), HAProxy could
end up in an infinite loop over the currently supposed running DNS
queries.

This was caused because the FIFO list of running queries was improperly
updated in snr_resolution_error_cb. The head of the list was removed
instead of the resolution in error, when moving the resolution to the
end of the list.

In the mean time, a LIST_DEL statement is removed since useless. This
action is already performed by the dns_reset_resolution function.
2015-09-08 10:51:50 +02:00
Baptiste Assmann
f046f11561 BUG/MEDIUM: dns: wrong first time DNS resolution
First DNS resolution is supposed to be triggered by first health check,
which is not the case with current code.
This patch fixes this behavior by setting the
resolution->last_resolution time to 0 instead of now_ms when parsing
server's configuration at startup.
2015-08-28 17:23:04 +02:00
Willy Tarreau
29fbe51490 MAJOR: tproxy: remove support for cttproxy
This was the first transparent proxy technology supported by haproxy
circa 2005 but it was obsoleted in 2007 by Tproxy 4.0 which removed a
lot of the earlier versions' shortcomings and was finally merged into
the kernel. Since nobody has been using cttproxy for many years now
and nobody has even just tried to compile the files, it's time to
remove it. The doc was updated as well.
2015-08-20 19:35:14 +02:00
Baptiste Assmann
93c20623db MINOR: server SRV_ADMF_CMAINT flag doesn't imply SRV_ADMF_FMAINT
The newly created server flag SRV_ADMF_CMAINT means that the server is
in 'disabled' mode because of configuration statement 'disabled'.
The flag SRV_ADMF_FMAINT should not be set anymore in such case and is
reserved only when the server is Forced in maintenance mode from the
stats socket.
2015-08-17 15:42:07 +02:00
Baptiste Assmann
9f5ada32e4 MINOR: server: add new SRV_ADMF_CMAINT flag
The purpose of SRV_ADMF_CMAINT flag is to keep in mind the server was
forced to maintenance status because of the configuration file.
2015-08-08 18:18:17 +02:00
Willy Tarreau
7017cb040c MINOR: server: add a list of safe, already reused idle connections
These ones are considered safe as they have already been reused.
They will be useful in "aggressive" and "always" http-reuse modes
in order to place the first request of a connection with the least
risk.
2015-08-06 16:29:01 +02:00
Willy Tarreau
173a1c6b43 MINOR: server: add a list of already used idle connections
There's a difference with the other idle conns in that these new
ones have already been used and may be reused by other streams.
2015-08-06 11:13:47 +02:00
Willy Tarreau
600802aef0 MINOR: server: add a list of private idle connections
For now it's not populated but we have the list entry. It will carry
all idle connections that sessions don't want to share. They may be
used later to reclaim connections upon socket shortage for example.
2015-08-06 10:59:08 +02:00
Baptiste Assmann
19a106d24a MINOR: server: server_find functions: id, name, best_match
This patch introduces three new functions which can be used to find a
server in a farm using different server information:
- server unique id (srv->puid)
- server name
- find best match using either name or unique id

When performing best matching, the following applies:
 - use the server name first (if provided)
 - use the server id if provided
 in any case, the function can update the caller about mismatches
 encountered.
2015-07-21 23:24:16 +02:00
Baptiste Assmann
7cc419ae1d MINOR: server: new server flag: SRV_F_FORCED_ID
This flag aims at reporting whether the server unique id (srv->puid) has
been forced by the administrator in HAProxy's configuration.
If not set, it means HAProxy has generated automatically the server's
unique id.
2015-07-21 23:24:16 +02:00
Baptiste Assmann
a68ca96375 MAJOR: server: add DNS-based server name resolution
Relies on the DNS protocol freshly implemented in HAProxy.
It performs a server IP addr resolution based on a server hostname.
2015-06-13 22:07:35 +02:00
Baptiste Assmann
3d8f831f13 MEDIUM: server: change server ip address from stats socket
New command available on the stats socket to change a server addr using
the command "set server <backend>/<server> addr <ip4|ip6>"
2015-06-13 22:07:35 +02:00
Baptiste Assmann
14e4014a48 MEDIUM: server: add support for changing a server's address
Ability to change a server IP address during HAProxy run time.
For now this is provided via function update_server_addr() which
currently is not called.

A log is emitted on each change. For now we do it inconditionally,
but later we'll want to do it only on certain circumstances, which
explains why the logging block is enclosed in if(1).
2015-06-13 22:07:35 +02:00
Simon Horman
4cd477f372 MEDIUM: Send email alerts when servers are marked as UP or enter the drain state
This is similar to the way email alerts are sent when servers are marked as
DOWN.

Like the log messages corresponding to these state changes the messages
have log level notice. Thus they are suppressed by the default email-alert
level of 'alert'. To allow these messages the email-alert level should
be set to 'notice', 'info' or 'debug'. e.g:

email-alert level notice

"email-alert mailers" and "email-alert to" settings are also required in
order for any email alerts to be sent.

A follow-up patch will document the above.

Signed-off-by: Simon Horman <horms@verge.net.au>
2015-04-30 07:30:50 +02:00
Willy Tarreau
e7dff02dd4 REORG/MEDIUM: stream: rename stream flags from SN_* to SF_*
This is in order to keep things consistent.
2015-04-06 11:23:57 +02:00
Willy Tarreau
87b09668be REORG/MAJOR: session: rename the "session" entity to "stream"
With HTTP/2, we'll have to support multiplexed streams. A stream is in
fact the largest part of what we currently call a session, it has buffers,
logs, etc.

In order to catch any error, this commit removes any reference to the
struct session and tries to rename most "session" occurrences in function
names to "stream" and "sess" to "strm" when that's related to a session.

The files stream.{c,h} were added and session.{c,h} removed.

The session will be reintroduced later and a few parts of the stream
will progressively be moved overthere. It will more or less contain
only what we need in an embryonic session.

Sample fetch functions and converters will have to change a bit so
that they'll use an L5 (session) instead of what's currently called
"L4" which is in fact L6 for now.

Once all changes are completed, we should see approximately this :

   L7 - http_txn
   L6 - stream
   L5 - session
   L4 - connection | applet

There will be at most one http_txn per stream, and a same session will
possibly be referenced by multiple streams. A connection will point to
a session and to a stream. The session will hold all the information
we need to keep even when we don't yet have a stream.

Some more cleanup is needed because some code was already far from
being clean. The server queue management still refers to sessions at
many places while comments talk about connections. This will have to
be cleaned up once we have a server-side connection pool manager.
Stream flags "SN_*" still need to be renamed, it doesn't seem like
any of them will need to move to the session.
2015-04-06 11:23:56 +02:00
Thierry FOURNIER
bb2ae64b82 MEDIUM: protocol: automatically pick the proto associated to the connection.
When the destination IP is dynamically set, we can't use the "target"
to define the proto. This patch ensures that we always use the protocol
associated with the address family. The proto field was removed from
the server and check structs.
2015-02-28 23:12:31 +01:00
Simon Horman
64e3416662 MEDIUM: Allow suppression of email alerts by log level
This patch adds a new option which allows configuration of the maximum
log level of messages for which email alerts will be sent.

The default is alert which is more restrictive than
the current code which sends email alerts for all priorities.
That behaviour may be configured using the new configuration
option to set the maximum level to notice or greater.

	email-alert level notice

Signed-off-by: Simon Horman <horms@verge.net.au>
2015-02-06 07:59:58 +01:00
Simon Horman
00b69e08d5 MINOR: Remove trailing '.' from email alert messages
This removes the trailing '.' from both the header and the body of email
alerts.

The main motivation for this change is to make the format of email alerts
generated from srv_set_stopped() consistent with those generated from
set_server_check_status().

Signed-off-by: Simon Horman <horms@verge.net.au>
2015-02-06 07:59:58 +01:00
Simon Horman
0ba0e4ac07 MEDIUM: Support sending email alerts
Signed-off-by: Simon Horman <horms@verge.net.au>
2015-02-03 00:24:16 +01:00
Simon Horman
e16c1b3f3d MEDIUM: Attach tcpcheck_rules to check
This is to allow checks to be established whose tcpcheck_rules
are not those of its proxy.

Signed-off-by: Simon Horman <horms@verge.net.au>
2015-02-03 00:24:16 +01:00
Simon Horman
41f5876750 MEDIUM: Move proto and addr fields struct check
The motivation for this is to make checks more independent of each
other to allow further reuse of their infrastructure.

For nowserver->check and server->agent still always use the same values
for the addr and proto fields so this patch should not introduce any
behavioural changes.

Signed-off-by: Simon Horman <horms@verge.net.au>
2015-02-03 00:24:16 +01:00
Simon Horman
b1900d55df MEDIUM: Refactor init_check and move to checks.c
Refactor init_check so that an error string is returned
rather than alerts being printed by it. Also
init_check to checks.c and provide a prototype to allow
it to be used from multiple C files.

Signed-off-by: Simon Horman <horms@verge.net.au>
2015-02-03 00:24:15 +01:00
Simon Horman
1a23cf0dfb BUG/MEDIUM: Do not set agent health to zero if server is disabled in config
disable starts a server in the disabled state, however setting the health
of an agent implies that the agent is disabled as well as the server.

This is a problem because the state of the agent is not restored if
the state of the server is subsequently updated leading to an
unexpected state.

For example, if a server is started disabled and then the server
state is set to ready then without this change show stat indicates
that the server is "DOWN (agent)" when it is expected that the server
would be UP if its (non-agent) health check passes.

Reported-by: Mark Brooks <mark@loadbalancer.org>
Signed-off-by: Simon Horman <horms@verge.net.au>
2015-01-23 16:47:41 +01:00
KOVACS Krisztian
b3e54fe387 MAJOR: namespace: add Linux network namespace support
This patch makes it possible to create binds and servers in separate
namespaces.  This can be used to proxy between multiple completely independent
virtual networks (with possibly overlapping IP addresses) and a
non-namespace-aware proxy implementation that supports the proxy protocol (v2).

The setup is something like this:

net1 on VLAN 1 (namespace 1) -\
net2 on VLAN 2 (namespace 2) -- haproxy ==== proxy (namespace 0)
net3 on VLAN 3 (namespace 3) -/

The proxy is configured to make server connections through haproxy and sending
the expected source/target addresses to haproxy using the proxy protocol.

The network namespace setup on the haproxy node is something like this:

= 8< =
$ cat setup.sh
ip netns add 1
ip link add link eth1 type vlan id 1
ip link set eth1.1 netns 1
ip netns exec 1 ip addr add 192.168.91.2/24 dev eth1.1
ip netns exec 1 ip link set eth1.$id up
...
= 8< =

= 8< =
$ cat haproxy.cfg
frontend clients
  bind 127.0.0.1:50022 namespace 1 transparent
  default_backend scb

backend server
  mode tcp
  server server1 192.168.122.4:2222 namespace 2 send-proxy-v2
= 8< =

A bind line creates the listener in the specified namespace, and connections
originating from that listener also have their network namespace set to
that of the listener.

A server line either forces the connection to be made in a specified
namespace or may use the namespace from the client-side connection if that
was set.

For more documentation please read the documentation included in the patch
itself.

Signed-off-by: KOVACS Tamas <ktamas@balabit.com>
Signed-off-by: Sarkozi Laszlo <laszlo.sarkozi@balabit.com>
Signed-off-by: KOVACS Krisztian <hidden@balabit.com>
2014-11-21 07:51:57 +01:00
Cyril Bont
9ce1311ebc BUG/MEDIUM: checks: fix conflicts between agent checks and ssl healthchecks
Lasse Birnbaum Jensen reported an issue when agent checks are used at the same
time as standard healthchecks when SSL is enabled on the server side.

The symptom is that agent checks try to communicate in SSL while it should
manage raw data. This happens because the transport layer is shared between all
kind of checks.

To fix the issue, the transport layer is now stored in each check type,
allowing to use SSL healthchecks when required, while an agent check should
always use the raw_sock implementation.

The fix must be backported to 1.5.
2014-11-16 00:53:12 +01:00
Willy Tarreau
bfc7b7acd8 MAJOR: checks: add support for a new "drain" administrative mode
This patch adds support for a new "drain" mode. So now we have 3 admin
modes for a server :
  - READY
  - DRAIN
  - MAINT

The drain mode disables load balancing but leaves the server up. It can
coexist with maint, except that maint has precedence. It is also inherited
from tracked servers, so just like maint, it's represented with 2 bits.

New functions were designed to set/clear each flag and to propagate the
changes to tracking servers when relevant, and to log the changes. Existing
functions srv_set_adm_maint() and srv_set_adm_ready() were replaced to make
use of the new functions.

Currently the drain mode is not yet used, however the whole logic was tested
with all combinations of set/clear of both flags in various orders to catch
all corner cases.
2014-05-23 14:29:11 +02:00
Willy Tarreau
9943d3117e MINOR: server: make use of srv_is_usable() instead of checking eweight
srv_is_usable() is broader than srv_is_usable() as it not only considers
the weight but the server's state as well. Future changes will allow a
server to be in drain mode with a non-zero weight, so we should migrate
to use that function instead.
2014-05-23 14:29:11 +02:00
Willy Tarreau
8eb7784634 MINOR: server: implement srv_set_stopping()
This function was taken from check_set_server_drain(). It does not
consider health checks at all and only sets a server to stopping
provided it's not in maintenance and is not currently stopped. The
resulting state will be STOPPING. The state change is propagated
to tracked servers.

For now the function is not used, but the goal is to split health
checks status from server status and to be able to change a server's
state regardless of health checks statuses.
2014-05-23 14:29:11 +02:00
Willy Tarreau
dbd5e78f5b MINOR: server: implement srv_set_running()
This function was taken from check_set_server_up(). It does not consider
health checks at all and only sets a server up provided it's not in
maintenance. The resulting state may be either RUNNING or STARTING
depending on the presence of a slowstart or not. The state change is
propagated to tracked servers.

For now the function is not used, but the goal is to split health
checks status from server status and to be able to change a server's
state regardless of health checks statuses.
2014-05-23 14:29:11 +02:00
Willy Tarreau
e7d1ef16bf MINOR: server: implement srv_set_stopped()
This function was extracted from check_set_server_down(). In only
manipulates the server state and does not consider the health checks
at all, nor does it modify their status. It takes a reason message to
report in logs, however it passes NULL when recursing through the
trackers chain.

For now the function is not used, but the goal is to split health
checks status from server status and to be able to change a server's
state regardless of health checks statuses.
2014-05-23 14:29:11 +02:00
Willy Tarreau
bda92271e6 MINOR: server: make the status reporting function support a reason
srv_adm_append_status() was renamed srv_append_status() since it's no
more dedicated to maintenance mode. It now supports a reason which if
not null is appended to the output string.
2014-05-23 14:29:11 +02:00
Willy Tarreau
3209123fe7 MEDIUM: server: allow multi-level server tracking
Now that it is possible to know whether a server is in forced maintenance
or inherits its maintenance status from another one, it is possible to
allow server tracking at more than one level. We still provide a loop
detection however.

Note that for the stats it's a bit trickier since we have to report the
check state which corresponds to the state of the server at the end of
the chain.
2014-05-23 14:29:11 +02:00
Willy Tarreau
a0066ddbda MEDIUM: server: properly support and propagate the maintenance status
This change now involves a new flag SRV_ADMF_IMAINT to note that the
maintenance status of a server is inherited from another server. Thus,
we know at each server level in the chain if it's running, in forced
maintenance or in a maintenance status because it tracks another server,
or even in both states.

Disabling a server propagates this flag down to other servers. Enabling
a server flushes the flag down. A server becomes up again once both of
its flags are cleared.

Two new functions "srv_adm_set_maint()" and "srv_adm_set_ready()" are used to
manipulate this maintenance status. They're used by the CLI and the stats
page.

Now the stats page always says "MAINT" instead of "MAINT(via)" and it's
only the chk/down field which reports "via x/y" when the status is
inherited from another server, but it doesn't say it when a server was
forced into maintenance. The CSV output indicates "MAINT (via x/y)"
instead of only "MAINT(via)". This is the most accurate representation.

One important thing is that now entering/leaving maintenance for a
tracking server correctly follows the state of the tracked server.
2014-05-22 11:27:00 +02:00
Willy Tarreau
4aac7db940 REORG: checks: put the functions in the appropriate files !
Checks.c has become a total mess. A number of proxy or server maintenance
and queue management functions were put there probably because they were
used there, but that makes the code untouchable. And that's without saying
that their names does not always relate to what they really do!

So let's do a first pass by moving these ones :
  - set_backend_down()       => backend.c
  - redistribute_pending()   => queue.c:pendconn_redistribute()
  - check_for_pending()      => queue.c:pendconn_grab_from_px()
  - shutdown_sessions        => server.c:srv_shutdown_sessions()
  - shutdown_backup_sessions => server.c:srv_shutdown_backup_sessions()

All of them were moved at once.
2014-05-22 11:27:00 +02:00
Willy Tarreau
892337c8e1 MAJOR: server: use states instead of flags to store the server state
Servers used to have 3 flags to store a state, now they have 4 states
instead. This avoids lots of confusion for the 4 remaining undefined
states.

The encoding from the previous to the new states can be represented
this way :

  SRV_STF_RUNNING
   |  SRV_STF_GOINGDOWN
   |   |  SRV_STF_WARMINGUP
   |   |   |
   0   x   x     SRV_ST_STOPPED
   1   0   0     SRV_ST_RUNNING
   1   0   1     SRV_ST_STARTING
   1   1   x     SRV_ST_STOPPING

Note that the case where all bits were set used to exist and was randomly
dealt with. For example, the task was not stopped, the throttle value was
still updated and reported in the stats and in the http_server_state header.
It was the same if the server was stopped by the agent or for maintenance.

It's worth noting that the internal function names are still quite confusing.
2014-05-22 11:27:00 +02:00
Willy Tarreau
2012521d7b REORG/MEDIUM: server: move the maintenance bits out of the server state
Now we introduce srv->admin and srv->prev_admin which are bitfields
containing one bit per source of administrative status (maintenance only
for now). For the sake of backwards compatibility we implement a single
source (ADMF_FMAINT) but the code already checks any source (ADMF_MAINT)
where the STF_MAINTAIN bit was previously checked. This will later allow
us to add ADMF_IMAINT for maintenance mode inherited from tracked servers.

Along doing these changes, it appeared that some places will need to be
revisited when implementing the inherited bit, this concerns all those
modifying the ADMF_FMAINT bit (enable/disable actions on the CLI or stats
page), and the checks to report "via" on the stats page. But currently
the code is harmless.
2014-05-22 11:27:00 +02:00
Willy Tarreau
c93cd16b6c REORG/MEDIUM: server: split server state and flags in two different variables
Till now, the server's state and flags were all saved as a single bit
field. It causes some difficulties because we'd like to have an enum
for the state and separate flags.

This commit starts by splitting them in two distinct fields. The first
one is srv->state (with its counter-part srv->prev_state) which are now
enums, but which still contain bits (SRV_STF_*).

The flags now lie in their own field (srv->flags).

The function srv_is_usable() was updated to use the enum as input, since
it already used to deal only with the state.

Note that currently, the maintenance mode is still in the state for
simplicity, but it must move as well.
2014-05-22 11:27:00 +02:00
Willy Tarreau
c5150dafd8 MINOR: server: use functions to detect state changes and to update them
Detecting that a server's status has changed is a bit messy, as well
as it is to commit the status changes. We'll have to add new conditions
soon and we'd better avoid to multiply the number of touched locations
with the high risk of forgetting them.

This commit introduces :
  - srv_lb_status_changed() to report if the status changed from the
    previously committed one ;
  - svr_lb_commit_status() to commit the current status

The function is now used by all load-balancing algorithms.
2014-05-13 22:18:22 +02:00
Willy Tarreau
02615f9b16 MINOR: server: remove the SRV_DRAIN flag which can always be deduced
This flag is only a copy of (srv->uweight == 0), so better get rid of
it to reduce some of the confusion that remains in the code, and use
a simple function to return this state based on this weight instead.
2014-05-13 22:18:13 +02:00
Willy Tarreau
5cf0b52d29 MEDIUM: checks: only complain about the missing port when the check uses TCP
For UNIX socket addresses, we don't need any port, so let's disable the
check under this condition.
2014-05-10 01:26:38 +02:00
Willy Tarreau
9cf8d3f46b MINOR: protocols: use is_inet_addr() when only INET addresses are desired
We used to have is_addr() in place to validate sometimes the existence
of an address, sometimes a valid IPv4 or IPv6 address. Replace them
carefully so that is_inet_addr() is used wherever we can only use an
IPv4/IPv6 address.
2014-05-10 01:26:37 +02:00
Willy Tarreau
640556c692 BUG/MINOR: checks: correctly configure the address family and protocol
Currently, mixing an IPv4 and an IPv6 address in checks happens to
work by pure luck because the two protocols use the same functions
at the socket level and both use IPPROTO_TCP. However, they're
definitely wrong as the protocol for the check address is retrieved
from the server's address.

Now the protocol assigned to the connection is the same as the one
the address in use belongs to (eg: the server's address or the
explicit check address).
2014-05-10 01:26:37 +02:00
David S
afb768340c MEDIUM: connection: Implement and extented PROXY Protocol V2
This commit modifies the PROXY protocol V2 specification to support headers
longer than 255 bytes allowing for optional extensions.  It implements the
PROXY protocol V2 which is a binary representation of V1. This will make
parsing more efficient for clients who will know in advance exactly how
many bytes to read.  Also, it defines and implements some optional PROXY
protocol V2 extensions to send information about downstream SSL/TLS
connections.  Support for PROXY protocol V1 remains unchanged.
2014-05-09 08:25:38 +02:00
Willy Tarreau
272adea423 REORG: cfgparse: move server keyword parsing to server.c
The cfgparse.c file becomes huge, and a large part of it comes from the
server keyword parser. Since the configuration is a bit more modular now,
move this parser to server.c.

This patch also moves the check of the "server" keyword earlier in the
supported keywords list, resulting in a slightly faster config parsing
for configs with large numbers of servers (about 10%).

No functional change was made, only the code was moved.
2014-03-31 10:42:03 +02:00
Bhaskar Maddala
a20cb85eba MINOR: stats: Enhancement to stats page to provide information of last session time.
Summary:
Track and report last session time on the stats page for each server
in every backend, as well as the backend.

This attempts to address the requirement in the ROADMAP

  - add a last activity date for each server (req/resp) that will be
    displayed in the stats. It will be useful with soft stop.

The stats page reports this as time elapsed since last session. This
change does not adequately address the requirement for long running
session (websocket, RDP... etc).
2014-02-08 01:19:58 +01:00
Willy Tarreau
ff5ae35b9f MINOR: checks: use check->state instead of srv->state & SRV_CHECKED
Having the check state partially stored in the server doesn't help.
Some functions such as srv_getinter() rely on the server being checked
to decide what check frequency to use, instead of relying on the check
being configured. So let's get rid of SRV_CHECKED and SRV_AGENT_CHECKED
and only use the check's states instead.
2013-12-14 16:02:19 +01:00
Simon Horman
58c32978b2 MEDIUM: Set rise and fall of agent checks to 1
This is achieved by moving rise and fall from struct server to struct check.

After this move the behaviour of the primary check, server->check is
unchanged. However, the secondary agent check, server->agent now has
independent rise and fall values each of which are set to 1.

The result is that receiving "fail", "stopped" or "down" just once from the
agent will mark the server as down. And receiving a weight just once will
allow the server to be marked up if its primary check is in good health.

This opens up the scope to allow the rise and fall values of the agent
check to be configurable, however this has not been implemented at this
stage.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-25 07:31:16 +01:00
Willy Tarreau
004e045f31 BUG/MAJOR: server: weight calculation fails for map-based algorithms
A crash was reported by Igor at owind when changing a server's weight
on the CLI. Lukas Tribus could reproduce a related bug where setting
a server's weight would result in the new weight being multiplied by
the initial one. The two bugs are the same.

The incorrect weight calculation results in the total farm weight being
larger than what was initially allocated, causing the map index to be out
of bounds on some hashes. It's easy to reproduce using "balance url_param"
with a variable param, or with "balance static-rr".

It appears that the calculation is made at many places and is not always
right and not always wrong the same way. Thus, this patch introduces a
new function "server_recalc_eweight()" which is dedicated to this task
of computing ->eweight from many other elements including uweight and
current time (for slowstart), and all users now switch to use this
function.

The patch is a bit large but the code was not trivially fixable in a way
that could guarantee this situation would not occur anymore. The fix is
much more readable and has been verified to work with all algorithms,
with both consistent and map-based hashes, and even with static-rr.

Slowstart was tested as well, just like enable/disable server.

The same bug is very likely present in 1.4 as well, so the patch will
probably need to be backported eventhough it will not apply as-is.

Thanks to Lukas and Igor for the information they provided to reproduce it.
2013-11-21 15:09:02 +01:00
Simon Horman
125d099662 MEDIUM: Move health element to struct check
This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-19 09:36:07 +01:00
Simon Horman
4a741432be MEDIUM: Paramatise functions over the check of a server
Paramatise the following functions over the check of a server

* set_server_down
* set_server_up
* srv_getinter
* server_status_printf
* set_server_check_status
* set_server_disabled
* set_server_enabled

Generally the server parameter of these functions has been removed.
Where it is still needed it is obtained using check->server.

This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.
By paramatising these functions they may act on each of the checks
without further significant modification.

Explanation of the SSP_O_HCHK portion of this change:

* Prior to this patch SSP_O_HCHK serves a single purpose which
  is to tell server_status_printf() weather it should print
  the details of the check of a server or not.

  With the paramatisation that this patch adds there are two cases.
  1) Printing the details of the check in which case a
     valid check parameter is needed.
  2) Not printing the details of the check in which case
     the contents check parameter are unused.

  In case 1) we could pass SSP_O_HCHK and a valid check and;
  In case 2) we could pass !SSP_O_HCHK and any value for check
  including NULL.

  If NULL is used for case 2) then SSP_O_HCHK becomes supurfulous
  and as NULL is used for case 2) SSP_O_HCHK has been removed.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-19 09:35:54 +01:00
Simon Horman
6618300e13 MEDIUM: Split up struct server's check element
This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.

The split has been made by:
* Moving elements of struct server's check element that will
  be shared by both checks into a new check_common element
  of struct server.
* Moving the remaining elements to a new struct check and
  making struct server's check element a struct check.
* Adding a server element to struct check, a back-pointer
  to the server element it is a member of.
  - At this time the server could be obtained using
    container_of, however, this will not be so easy
    once a second struct check element is added to struct server
    to accommodate an agent health check.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-19 09:35:48 +01:00
Simon Horman
a360844735 CLEANUP: Make parameters of srv_downtime and srv_getinter const
The parameters of srv_downtime and srv_getinter are not modified
and thus may be const.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-19 08:04:58 +01:00
Simon Horman
b796afa60d MEDIUM: server: Tighten up parsing of weight string
Detect:
* Empty weight string, including no digits before '%' in relative
  weight string
* Trailing garbage, including between the last integer and '%'
  in relative weights

The motivation for this is to allow the weight string to be safely
logged if successfully parsed by this function

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-02-13 10:59:50 +01:00
Simon Horman
58b5d292b3 MEDIUM: server: Allow relative weights greater than 100%
Allow relative weights greater than 100%,
capping the absolute value to 256 which is
the largest supported absolute weight.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-02-13 10:56:28 +01:00
Simon Horman
7d09b9a4df MEDIUM: server: Break out set weight processing code
Break out set weight processing code.
This is in preparation for reusing the code.

Also, remove duplicate check in nested if clauses.
{px->lbprm.algo & BE_LB_PROP_DYN) is checked by
the immediate outer if clause, so there is no need
to check it a second time.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-02-13 10:53:40 +01:00
Willy Tarreau
dff5543618 MEDIUM: server: move parsing of keyword "id" to server.c
This is the first keyword to be moved to server.c.
2012-10-10 17:51:05 +02:00
Willy Tarreau
21faa91be6 MINOR: server: add minimal infrastructure to parse keywords
Just like with the "bind" lines, we'll switch the "server" line
parsing to keyword registration. The code is essentially the same
as for bind keywords, with minor changes such as support for the
default-server keywords and support for variable argument count.
2012-10-10 17:42:39 +02:00
Willy Tarreau
ec6c5df018 [CLEANUP] remove many #include <types/xxx> from C files
It should be stated as a rule that a C file should never
include types/xxx.h when proto/xxx.h exists, as it gives
less exposure to declaration conflicts (one of which was
caught and fixed here) and it complicates the file headers
for nothing.

Only types/global.h, types/capture.h and types/polling.h
have been found to be valid includes from C files.
2008-07-16 10:30:42 +02:00
Krzysztof Piotr Oledzki
5259dfedd1 [MEDIUM]: rework checks handling
This patch adds two new variables: fastinter and downinter.
When server state is:
 - non-transitionally UP -> inter (no change)
 - transitionally UP (going down), unchecked or transitionally DOWN (going up) -> fastinter
 - down -> downinter

It allows to set something like:
        server sr6 127.0.51.61:80 cookie s6 check inter 10000 downinter 20000 fastinter 500 fall 3 weight 40
In the above example haproxy uses 10000ms between checks but as soon as
one check fails fastinter (500ms) is used. If server is down
downinter (20000) is used or fastinter (500ms) if one check pass.
Fastinter is also used when haproxy starts.

New "timeout.check" variable was added, if set haproxy uses it as an additional
read timeout, but only after a connection has been already established. I was
thinking about using "timeout.server" here but most people set this
with an addition reserve but still want checks to kick out laggy servers.
Please also note that in most cases check request is much simpler
and faster to handle than normal requests so this timeout should be smaller.

I also changed the timeout used for check connections establishing.

Changes from the previous version:
 - use tv_isset() to check if the timeout is set,
 - use min("timeout connect", "inter") but only if "timeout check" is set
   as this min alone may be to short for full (connect + read) check,
 - debug code (fprintf) commented/removed
 - documentation

Compile tested only (sorry!) as I'm currently traveling but changes
are rather small and trivial.
2008-01-22 11:29:06 +01:00
Krzysztof Oledzki
85130941e7 [MEDIUM] stats: report server and backend cumulated downtime
Hello,

This patch implements new statistics for SLA calculation by adding new
field 'Dwntime' with total down time since restart (both HTTP/CSV) and
extending status field (HTTP) or inserting a new one (CSV) with time
showing how long each server/backend is in a current state. Additionaly,
down transations are also calculated and displayed for backends, so it is
possible to know how many times selected backend was down, generating "No
server is available to handle this request." error.

New information are presentetd in two different ways:
   - for HTTP: a "human redable form", one of "100000d 23h", "23h 59m" or
      "59m 59s"
   - for CSV: seconds

I believe that seconds resolution is enough.

As there are more columns in the status page I decided to shrink some
names to make more space:
   - Weight -> Wght
   - Check -> Chk
   - Down -> Dwn

Making described changes I also made some improvements and fixed some
small bugs:
   - don't increment s->health above 's->rise + s->fall - 1'. Previously it
     was incremented an then (re)set to 's->rise + s->fall - 1'.
   - do not set server down if it is down already
   - do not set server up if it is up already
   - fix colspan in multiple places (mostly introduced by my previous patch)
   - add missing "status" header to CSV
   - fix order of retries/redispatches in server (CSV)
   - s/Tthen/Then/
   - s/server/backend/ in DATA_ST_PX_BE (dumpstats.c)

Changes from previous version:
  - deal with negative time intervales
  - don't relay on s->state (SRV_RUNNING)
  - little reworked human_time + compacted format (no spaces). If needed it
    can be used in the future for other purposes by optionally making "cnt"
    as an argument
  - leave set_server_down mostly unchanged
  - only little reworked "process_chk: 9"
  - additional fields in CSV are appended to the rigth
  - fix "SEC" macro
  - named arguments (human_time, be_downtime, srv_downtime)

Hope it is OK. If there are only cosmetic changes needed please fill free
to correct it, however if there are some bigger changes required I would
like to discuss it first or at last to know what exactly was changed
especially since I already put this patch into my production server. :)

Thank you,

Best regards,

 				Krzysztof Oledzki
2007-10-22 21:36:23 +02:00
Willy Tarreau
e3ba5f0aaa [CLEANUP] included common/version.h everywhere 2006-06-29 18:54:54 +02:00
Willy Tarreau
baaee00406 [BIGMOVE] exploded the monolithic haproxy.c file into multiple files.
The files are now stored under :
  - include/haproxy for the generic includes
  - include/types.h for the structures needed within prototypes
  - include/proto.h for function prototypes and inline functions
  - src/*.c for the C files

Most include files are now covered by LGPL. A last move still needs
to be done to put inline functions under GPL and not LGPL.

Version has been set to 1.3.0 in the code but some control still
needs to be done before releasing.
2006-06-26 02:48:02 +02:00