16773 Commits

Author SHA1 Message Date
Willy Tarreau
7615366c70 MINOR: cli: add support for the "show sess all" command
Sometimes when debugging haproxy, it is important to take a full
snapshot of all sessions and their respective states. Till now it
was complicated to do because we had to use scripts and sessions
would vanish between two runs.

Now with this command we have the same output as "show sess $id"
but for all sessions in the table. This is a debugging command only,
it should only be used by developers as it is never guaranteed to
perfectly work !
2012-11-26 01:18:33 +01:00
Willy Tarreau
95898ac211 BUILD: buffer: fix another isprint() warning on solaris
This one came with commit recent be0efd8. Solaris wants ints, not chars.
2012-11-26 00:57:40 +01:00
Willy Tarreau
77e3af9e6f MINOR: tcp: add support for the "v4v6" bind option
Commit 9b6700f added "v6only". As suggested by Vincent Bernat, it is
sometimes useful to have the opposite option to force binding to the
two protocols when the system is configured to bind to v6 only by
default. This option does exactly this. v6only still has precedence.
2012-11-24 15:07:23 +01:00
Willy Tarreau
5e16cbc3bd MINOR: stats: report the total number of compressed responses per front/back
Depending on the content-types and accept-encoding fields, some responses
might or might not be compressed. Let's have a counter of the number of
compressed responses and report it in the stats to help improve compression
usage.

Some cosmetic issues were fixed in the CSV output too (missing commas at the
end).
2012-11-24 14:54:13 +01:00
Willy Tarreau
f149d8f21e MINOR: stats: also report the computed compression savings in html stats
It's interesting to know the average compression ratio obtained on
frontends and backends without having to compute it by hand, so let's
report it in the HTML stats.
2012-11-24 14:06:49 +01:00
Willy Tarreau
9b6700f673 MINOR: tcp: add support for the "v6only" bind option
This option forces a socket to bind to IPv6 only when it uses the
default address (eg: ":::80").
2012-11-24 12:20:28 +01:00
Willy Tarreau
e3635edc88 BUG/MEDIUM: connection: local_send_proxy must wait for connection to establish
The conn_local_send_proxy() function has to retrieve the local and remote
addresses, but the getpeername() and getsockname() functions may fail until
the connection is established. So now we catch this error and poll for write
when this happens.
2012-11-24 11:23:04 +01:00
Willy Tarreau
6c560da279 BUG/MEDIUM: checks: report handshake failures
Up to now, only data layer failures were reported to the task, but
if a handshake failed from the beginning, the error was not reported
as a failure.
2012-11-24 11:14:45 +01:00
Willy Tarreau
9a92cd5985 MINOR: connection: abort earlier when errors are detected
If an uncaught CO_FL_ERROR flag on a connection is detected, we
immediately go to the wakeup function. This ensures that even if
an error is asynchronously delivered, we don't risk re-enabling
polling or doing unexpected things in the handshake handlers.
2012-11-24 11:12:13 +01:00
Willy Tarreau
36fb02c526 BUG/MEDIUM: connection: always disable polling upon error
Commit 0ffde2cc in 1.5-dev13 tried to always disable polling on file
descriptors when errors were encountered. Unfortunately it did not
always succeed in doing so because it relied on detecting polling
changes to disable it. Let's use a dedicated conn_stop_polling()
function that is inconditionally called upon error instead.

This managed to stop a busy loop observed when a health check makes
use of the send-proxy protocol and fails before the connection can
be established.
2012-11-24 11:09:07 +01:00
Willy Tarreau
f0837b259b MEDIUM: tcp: add explicit support for delayed ACK in connect()
Commit 24db47e0 tried to improve support for delayed ACK upon connect
but it was incomplete, because checks with the proxy protocol would
always enable polling for data receive and there was no way of
distinguishing data polling and delayed ack.

So we add a distinct delack flag to the connect() function so that
the caller decides whether or not to use a delayed ack regardless
of pending data (eg: when send-proxy is in use). Doing so covers all
combinations of { (check with data), (sendproxy), (smart-connect) }.
2012-11-24 10:24:27 +01:00
Willy Tarreau
0eb2bed561 BUG/MINOR: stats: fix inversion of the report of a check in progress
Recent fix for health checks 5a78f36d inverted the condition to display
a "*" in front of the check status on the stats page.
2012-11-24 00:20:24 +01:00
Willy Tarreau
4a6e5c6d69 BUG/MEDIUM: acl: make prue_acl_expr() correctly free ACL expressions upon exit
When leaving, during the deinit() process, prune_acl_expr() is called to
delete all ACL expressions. A bug was introduced with commit 34db1084 that
caused every other expression argument to be skipped, and more annoyingly,
it introduced the risk of scanning past the arg list and crashing or
freezing the old process during a reload.

Credits for finding this issue go to Dmitry Sivachenko who first reported
it, and second did a lot of research to narrow it down to a minimal
configuration.
2012-11-24 00:02:14 +01:00
Willy Tarreau
7d1df41171 BUG/MEDIUM: acl: correctly resolve all args, not just the first one
Since 1.5-dev9, ACLs support multiple args. The changes performed in
acl_find_targets() were bogus as they were not always applied to the
current argument being processed, but sometimes to the first one only.

Fortunately till now, all ACLs which support resolvable arguments have
it in the first place only, so there was no impact.
2012-11-23 23:47:36 +01:00
Willy Tarreau
50de90a228 MINOR: listeners: make the accept loop more robust when maxaccept==0
If some listeners are mistakenly configured with 0 as the maxaccept value,
then we now consider them as limited to one accept() at a time. This will
avoid some issues as fixed by the past commit.
2012-11-23 20:22:10 +01:00
Willy Tarreau
ca57de3e7b BUG/MAJOR: peers: the listener's maxaccept was not set and caused loops
Recent commit 16a214 to move the maxaccept parameter to listeners didn't
set it on the peers' listeners, resulting in the value zero being used
there. This caused a busy loop for each peers section, because no incoming
connection could be accepted.

Thanks to Herv Commowick for reporting this issue.
2012-11-23 20:21:37 +01:00
Willy Tarreau
cfd97c6f04 BUG/MEDIUM: checks: prevent TIME_WAITs from appearing also on timeouts
We need to disable lingering before closing on timeout too, otherwise
we accumulate TIME_WAITs.
2012-11-23 17:35:59 +01:00
Willy Tarreau
2b199c9ac3 MEDIUM: connection: provide a common conn_full_close() function
Several places got the connection close sequence wrong because it
was not obvious. In practice we always need the same sequence when
aborting, so let's have a common function for this.
2012-11-23 17:32:21 +01:00
Willy Tarreau
db3b4a2891 MINOR: checks: fix recv polling after connect()
Commit a522f801 moved a call to __conn_data_want_recv() just after the
connect() call, which is not 100% correct. First, it does not take errors
into account, eventhough this is harmless. Second, this change will only
be taken into account after next call do conn_data_polling_update(), which
is not necessarily what is expected (eg: if an error is only reported on
the recv side).

So let's use conn_data_poll_recv() instead, which directly subscribes
the event to polling.
2012-11-23 16:32:33 +01:00
Willy Tarreau
b63b59641e BUG/MAJOR: checks: close FD on all timeouts
Since last commit, some timeouts were converted into an error to report
the status, and as a result, the socket was not closed because it was
supposed to have been done during the wake() call.

Close the socket as soon as the timeout is detected to fix the issue.
Also we now ensure to first initialize the connection flags.
2012-11-23 16:22:08 +01:00
Willy Tarreau
74fa7fbec9 MEDIUM: checks: close the socket as soon as we have a response
Until now, the check socked was closed in the task which handles the
check, which can sometimes be substantially later when many tasks are
running. It's much cleaner to close() in the wake call, which also
helps removing some FD management from the task itself.

The code is faster and smaller, and fast health checks show a more
predictable behaviour.
2012-11-23 14:43:49 +01:00
Willy Tarreau
24db47e0cc MEDIUM: checks: avoid waking the application up for pure TCP checks
Pure TCP checks only use the SYN/ACK in return to a SYN. By forcing
the system to use delayed ACKs, it is possible to send an RST instead
of the ACK and thus ensure that the application will never be needlessly
woken up. This avoids error logs or counters on checked components since
the application is never made aware of this connection which dies in the
network stack.
2012-11-23 14:18:39 +01:00
Willy Tarreau
acbdc7a760 BUG/MINOR: checks: slightly clean the state machine up
The process_chk() function still did not consider the the timeout when
it was woken up, so a spurious wakeup could trigger a false timeout. Some
checks were now redundant or could not be triggered (eg: L7 timeout).
So remove them and rearrange the timeout detection.
2012-11-23 14:02:57 +01:00
Willy Tarreau
5a78f36db3 MAJOR: checks: rework completely bogus state machine
The porting of checks to using connections was totally bogus. Some checks
were considered successful as soon as the connection was established,
regardless of any response. Some errors would be triggered upon recv
if polling was enabled for send or if the send channel was shut down.

Now the behaviour is much better. It would be cleaner to perform the
fd_delete() in wake_srv_chk() and to process failures and timeouts
separately, but this is already a good start.
2012-11-23 12:47:05 +01:00
Willy Tarreau
d3aac7088e CLEANUP: checks: rename some server check flags
Some server check flag names were not properly choosen and cause
analysis trouble, especially the CHK_RUNNING one which does not
mean that a check is running but that the server is running...

Here's the rename :
  CHK_RUNNING -> CHK_PASSED
  CHK_ERROR   -> CHK_FAILED
2012-11-23 11:32:12 +01:00
Willy Tarreau
e6d9702e7e MINOR: cli: report the msg state in full text in "show sess $PTR"
It's more convenient to debug with real state names.
2012-11-23 11:31:56 +01:00
William Lallemand
be0efd884d MINOR: buffer_dump with ASCII
Improve the buffer_dump function with ASCII output.
2012-11-23 11:13:16 +01:00
William Lallemand
00bf1dee9c BUG/MEDIUM: compression: does not forward trailers
The commit bf3ae617 introduced a regression about the forward of the
trailers in compression mode.
2012-11-23 11:12:33 +01:00
Willy Tarreau
fd29cc537b MEDIUM: checks: avoid accumulating TIME_WAITs during checks
Some checks which do not induce a close from the server accumulate
local TIME_WAIT sockets because they're cleanly shut down. Typically
TCP probes cause this. This is very problematic when there are many
servers, when the checks are fast or when local source ports are rare.

So now we'll disable lingering on the socket instead of sending a
shutdown. Before doing this we try to drain any possibly pending data.
That way we avoid sending an RST when the server has closed first.

This change means that some servers will see more RSTs, but this is
needed to avoid local source port starvation.
2012-11-23 09:18:20 +01:00
Willy Tarreau
ef8a719f70 BUG/MINOR: checks: don't mark the FD as closed before transport close
Some future transport layers might need the connection's file descriptor
on ->close(), so we must not destroy it before we're finished with it.
2012-11-23 09:05:05 +01:00
Willy Tarreau
a522f801fb BUG/MEDIUM: checks: ensure we completely disable polling upon success
When a check succeeds, it used to only disable receive events while
it should disable both directions. The problem is that if the send
event was reported too, it could re-enable the recv event. In theory
this is not a problem as the task is going to be woken up, but if
there are many tasks in the queue and this task is not processed
immediately, we could theorically face a storm of unprocessed events
(typically POLL_HUP).

So better stop both directions, prevent the send side from enabling
recv and have the process_chk() code enable both directions. This
will also help detecting closes before the check is sent.

Note that all this mess has been inherited from the old code that used
the fd as a flag to report if a check was running. We should have a
dedicated flag and perform the fd_delete() in wake_srv_chk() instead.
2012-11-23 09:03:59 +01:00
Willy Tarreau
6b0a850503 BUG/MEDIUM: checks: mark the check as stopped after a connect error
Health checks currently still use the connection's fd to know whether
a check is running (this needs to change). When a health check
immediately fails during connect() because of a lack of local resource
(eg: port), we failed to unset the fd, so each time the process_chk
woken up after such an error, it believed a check was still running
and used to close the fd again instead of starting a new check. This
could result in other connections being closed because they were
assigned the same fd value.

The bug is only marked medium because when this happens, the system
is already in a bad state.

A comment was added above tcp_connect_server() to clarify that the
fd is *not* valid on error.
2012-11-23 09:03:29 +01:00
Willy Tarreau
55058a7c1e MINOR: stats: report HTTP compression stats per frontend and per backend
It was a bit frustrating to have no idea about the bandwidth saved by
HTTP compression. Now we have per-frontend and per-backend stats. The
stats on the HTTP interface are shown in a hover title in the "bytes out"
column if at least something was fed to the compressor. 3 new columns
appeared in the CSV stats output.
2012-11-22 01:07:40 +01:00
Willy Tarreau
83d84cfc8a BUILD: silence a warning on Solaris about usage of isdigit()
On Solaris, isdigit() is a macro and it complains about the use of
a char instead of the int for the argument. Let's cast it to an int
to silence it.
2012-11-22 01:04:31 +01:00
Willy Tarreau
193b8c6168 MINOR: http: allow the cookie capture size to be changed
Some users need more than 64 characters to log large cookies. The limit
was set to 63 characters (and not 64 as previously documented). Now it
is possible to change this using the global "tune.http.cookielen" setting
if required.
2012-11-22 00:44:27 +01:00
Willy Tarreau
f9fbfe8229 BUG/MAJOR: stream_interface: read0 not always handled since dev12
The connection handling changed introduced in 1.5-dev12 introduced a
regression with commit 9bf9c14c. The issue is that the stream_sock_read0()
callback must update the channel flags to indicate that the side is closed
so that when process_session() is called, it can propagate the close to the
other side and terminate the session.

The issue only appears in HTTP tunnel mode. It's a bit tricky to trigger
the issue, it requires that the request channel is full with data flowing
from the client to the server and that both the response and the read0()
are received at once so that the flags are not updated, and that the HTTP
analyser switches to tunnel mode without being informed that the request
write side is closed. After that, process_session() does not know that the
connection has to be aborted either, and no more event appears on this side
where the connection stays here forever.

Many thanks to Igor at owind for testing several snapshots and for providing
valuable traces to reproduce and diagnose the issue!
2012-11-21 21:59:51 +01:00
Willy Tarreau
85d47f9d98 MINOR: cli: report an error message on missing argument to compression rate
"set rate-limit http-compression global" needs an integer and must
complain when it's not there.
2012-11-21 02:15:16 +01:00
William Lallemand
072a2bf537 MINOR: compression: CPU usage limit
New option 'maxcompcpuusage' in global section.
Sets the maximum CPU usage HAProxy can reach before stopping the
compression for new requests or decreasing the compression level of
current requests.  It works like 'maxcomprate' but with the Idle.
2012-11-21 02:15:16 +01:00
William Lallemand
c71407657d BUG/MINOR: compression: dynamic level increase
Using compression rate limit, the compression level wasn't taking care
of the max compression level during a session because the test was done
on the wrong variable.
2012-11-21 02:15:16 +01:00
William Lallemand
e3a7d99062 MINOR: compression: report zlib memory usage
Show the memory usage and the max memory available for zlib.
The value stored is now the memory used instead of the remaining
available memory.
2012-11-21 02:15:16 +01:00
William Lallemand
096f554ee1 MINOR: compression: rate limit in 'show info'
Show the compression rate limit 'CompressRateLim' in bytes per second on
the UNIX socket.
2012-11-21 01:58:11 +01:00
William Lallemand
8b52bb3878 MEDIUM: compression: use pool for comp_ctx
Use pool for comp_ctx, it is allocated during the comp_algo->init().
The allocation of comp_ctx is accounted for in the zlib_memory_available.
2012-11-21 01:56:47 +01:00
Willy Tarreau
1feca01896 MINOR: cli: report the fd state in "show sess xxx"
This is useful to check the FD polling state during debugging sessions.
2012-11-19 18:15:19 +01:00
Willy Tarreau
7a0169a41a BUILD: cli: fix build when SSL is enabled
Commit bc174aa forgot to include proto/ssl_sock.h.
2012-11-19 17:13:41 +01:00
Willy Tarreau
9f7c6a183b BUG/MAJOR: stream_interface: certain workloads could cause get stuck
Some very specifically scheduled workloads could sometimes get stuck when
data receive was disabled due to buffer full then re-enabled due to a full
send(). A conn_data_want_recv() had to be set again in this specific case.

This bug was introduced with connection rework and polling changes in dev12.
2012-11-19 17:11:00 +01:00
Willy Tarreau
bc174aa144 MINOR: cli: report connection status in "show sess xxx"
Connection flags, targets and transport layers are now reported in
"show sess $PTR", as it is an absolute requirement in debugging.
2012-11-19 16:22:22 +01:00
William Lallemand
bf3ae61789 MEDIUM: compression: don't compress when no data
This patch makes changes in the http_response_forward_body state
machine. It checks if the compress algorithm had consumed data before
swapping the temporary and the input buffer. So it prevents null sized
zlib chunks.
2012-11-19 14:57:29 +01:00
Willy Tarreau
b97b6190e1 BUG: compression: properly disable compression when content-type does not match
Disabling compression based on the content-type was improperly done since the
introduction of the COMP_READY flag, sometimes resulting in truncated responses.
2012-11-19 14:55:02 +01:00
Willy Tarreau
16a2147dfe MEDIUM: adjust the maxaccept per listener depending on the number of processes
global.tune.maxaccept was used for all listeners. This becomes really not
convenient when some listeners are bound to a single process and other ones
are bound to many processes.

Now we change the principle : we count the number of processes a listener
is bound to, and apply the maxaccept either entirely if there is a single
process, or divided by twice the number of processes in order to maintain
fairness.

The default limit has also been increased from 32 to 64 as it appeared that
on small machines, 32 was too low to achieve high connection rates.
2012-11-19 12:39:59 +01:00
Emeric Brun
4f65bff1a5 MINOR: ssl: Add tune.ssl.lifetime statement in global.
Sets the ssl session <lifetime> in seconds. Openssl default is 300 seconds.
2012-11-16 16:47:20 +01:00