5905 Commits

Author SHA1 Message Date
Willy Tarreau
db3b4a2891 MINOR: checks: fix recv polling after connect()
Commit a522f801 moved a call to __conn_data_want_recv() just after the
connect() call, which is not 100% correct. First, it does not take errors
into account, even though this is harmless. Second, this change will only
be taken into account after the next call to conn_data_polling_update(), which
is not necessarily what is expected (eg: if an error is only reported on
the recv side).

So let's use conn_data_poll_recv() instead, which directly subscribes
the event to polling.
2012-11-23 16:32:33 +01:00
Willy Tarreau
b63b59641e BUG/MAJOR: checks: close FD on all timeouts
Since last commit, some timeouts were converted into an error to report
the status, and as a result, the socket was not closed because it was
supposed to have been done during the wake() call.

Close the socket as soon as the timeout is detected to fix the issue.
Also, we now make sure to initialize the connection flags first.
2012-11-23 16:22:08 +01:00
Willy Tarreau
74fa7fbec9 MEDIUM: checks: close the socket as soon as we have a response
Until now, the check socket was closed in the task which handles the
check, which can sometimes be substantially later when many tasks are
running. It's much cleaner to close() in the wake call, which also
helps remove some FD management from the task itself.

The code is faster and smaller, and fast health checks show a more
predictable behaviour.
2012-11-23 14:43:49 +01:00
Willy Tarreau
24db47e0cc MEDIUM: checks: avoid waking the application up for pure TCP checks
Pure TCP checks only use the SYN/ACK in return to a SYN. By forcing
the system to use delayed ACKs, it is possible to send an RST instead
of the ACK and thus ensure that the application will never be needlessly
woken up. This avoids error logs or counters on checked components since
the application is never made aware of this connection which dies in the
network stack.
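
As a rough illustration of the delayed-ACK idea above, the sketch below turns off quick ACKs on a socket using the Linux-specific TCP_QUICKACK option. The helper name check_disable_quickack() is hypothetical and this is not necessarily how the commit implements it. On Linux the option would typically be set before connect() so that the handshake ACK is deferred.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (not HAProxy's actual code): ask the kernel to
 * delay ACKs on a check socket. Returns 0 on success, or when the
 * option is not available on this platform, and -1 on error.
 */
int check_disable_quickack(int fd)
{
#ifdef TCP_QUICKACK
	int zero = 0;

	/* TCP_QUICKACK is Linux-specific; 0 disables immediate ACKs */
	return setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &zero, sizeof(zero));
#else
	(void)fd;
	return 0;
#endif
}
```

If the socket is later closed with lingering disabled, the kernel can answer with an RST instead of completing the exchange, which is the effect the message describes.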
2012-11-23 14:18:39 +01:00
Willy Tarreau
acbdc7a760 BUG/MINOR: checks: slightly clean the state machine up
The process_chk() function still did not consider the timeout when
it was woken up, so a spurious wakeup could trigger a false timeout. Some
checks were now redundant or could not be triggered (eg: L7 timeout).
So remove them and rearrange the timeout detection.
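
A minimal sketch of the guard against spurious wakeups, assuming millisecond tick counters that may wrap around. The function name and the 32-bit tick type are illustrative assumptions, not taken from the commit.

```c
#include <stdint.h>

/* Hypothetical sketch: report a timeout only if the expiration date
 * has really been reached, so that a wakeup for any other reason is
 * not mistaken for a timeout.
 */
int check_timed_out(uint32_t now_ms, uint32_t expire_ms)
{
	/* the signed difference stays correct across a 32-bit wrap */
	return (int32_t)(now_ms - expire_ms) >= 0;
}
```

A caller would test this on every wakeup and simply go back to waiting when it returns 0.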
2012-11-23 14:02:57 +01:00
Willy Tarreau
5a78f36db3 MAJOR: checks: rework completely bogus state machine
The porting of checks to using connections was totally bogus. Some checks
were considered successful as soon as the connection was established,
regardless of any response. Some errors would be triggered upon recv
if polling was enabled for send or if the send channel was shut down.

Now the behaviour is much better. It would be cleaner to perform the
fd_delete() in wake_srv_chk() and to process failures and timeouts
separately, but this is already a good start.
2012-11-23 12:47:05 +01:00
Willy Tarreau
d3aac7088e CLEANUP: checks: rename some server check flags
Some server check flag names were not properly chosen and cause
analysis trouble, especially the CHK_RUNNING one which does not
mean that a check is running but that the server is running...

Here's the rename :
  CHK_RUNNING -> CHK_PASSED
  CHK_ERROR   -> CHK_FAILED
2012-11-23 11:32:12 +01:00
Willy Tarreau
e6d9702e7e MINOR: cli: report the msg state in full text in "show sess $PTR"
It's more convenient to debug with real state names.
2012-11-23 11:31:56 +01:00
William Lallemand
be0efd884d MINOR: buffer_dump with ASCII
Improve the buffer_dump function with ASCII output.
2012-11-23 11:13:16 +01:00
William Lallemand
00bf1dee9c BUG/MEDIUM: compression: does not forward trailers
Commit bf3ae617 introduced a regression in the forwarding of
trailers in compression mode.
2012-11-23 11:12:33 +01:00
Willy Tarreau
fd29cc537b MEDIUM: checks: avoid accumulating TIME_WAITs during checks
Some checks which do not induce a close from the server accumulate
local TIME_WAIT sockets because they're cleanly shut down. Typically
TCP probes cause this. This is very problematic when there are many
servers, when the checks are fast or when local source ports are rare.

So now we'll disable lingering on the socket instead of sending a
shutdown. Before doing this we try to drain any possibly pending data.
That way we avoid sending an RST when the server has closed first.

This change means that some servers will see more RSTs, but this is
needed to avoid local source port starvation.
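
The drain-then-close sequence described above can be sketched as follows. The helper name is hypothetical. The choice to skip SO_LINGER when the peer has already closed is this sketch's reading of the message, not a copy of the actual patch.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: drain pending data, then close the socket
 * without lingering unless the peer already closed its side.
 * Returns 0 on success, -1 on error.
 */
int check_close_nolinger(int fd)
{
	char buf[512];
	ssize_t n;
	int ret = 0;

	/* drain whatever the peer already sent, without blocking */
	while ((n = recv(fd, buf, sizeof(buf), MSG_DONTWAIT)) > 0)
		;

	if (n != 0) {
		/* peer has not closed yet: l_onoff=1 with l_linger=0 makes
		 * close() abort the connection instead of a clean shutdown,
		 * so no local TIME_WAIT socket is left behind
		 */
		struct linger nolinger = { .l_onoff = 1, .l_linger = 0 };

		if (setsockopt(fd, SOL_SOCKET, SO_LINGER,
		               &nolinger, sizeof(nolinger)) < 0)
			ret = -1;
	}

	if (close(fd) < 0)
		ret = -1;
	return ret;
}
```

When recv() returns 0 the server closed first, so a normal close() is enough and no RST is needed.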
2012-11-23 09:18:20 +01:00
Willy Tarreau
ef8a719f70 BUG/MINOR: checks: don't mark the FD as closed before transport close
Some future transport layers might need the connection's file descriptor
on ->close(), so we must not destroy it before we're finished with it.
2012-11-23 09:05:05 +01:00
Willy Tarreau
a522f801fb BUG/MEDIUM: checks: ensure we completely disable polling upon success
When a check succeeds, it used to only disable receive events while
it should disable both directions. The problem is that if the send
event was reported too, it could re-enable the recv event. In theory
this is not a problem as the task is going to be woken up, but if
there are many tasks in the queue and this task is not processed
immediately, we could theoretically face a storm of unprocessed events
(typically POLL_HUP).

So better stop both directions, prevent the send side from enabling
recv and have the process_chk() code enable both directions. This
will also help detecting closes before the check is sent.

Note that all this mess has been inherited from the old code that used
the fd as a flag to report if a check was running. We should have a
dedicated flag and perform the fd_delete() in wake_srv_chk() instead.
2012-11-23 09:03:59 +01:00
Willy Tarreau
6b0a850503 BUG/MEDIUM: checks: mark the check as stopped after a connect error
Health checks currently still use the connection's fd to know whether
a check is running (this needs to change). When a health check
immediately fails during connect() because of a lack of local resource
(eg: port), we failed to unset the fd, so each time process_chk() was
woken up after such an error, it believed a check was still running
and used to close the fd again instead of starting a new check. This
could result in other connections being closed because they were
assigned the same fd value.

The bug is only marked medium because when this happens, the system
is already in a bad state.

A comment was added above tcp_connect_server() to clarify that the
fd is *not* valid on error.
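
The failure mode can be modelled with a deliberately simplified structure. Everything below (struct check, check_start(), the failing_connect() stub) is hypothetical and only illustrates why the fd must be reset when connect fails.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical model: fd < 0 means "no check running" */
struct check {
	int fd;
};

/* Stub mimicking a connect routine that fails and leaves a stale
 * descriptor value behind. The value is NOT valid on error.
 */
int failing_connect(int *fd)
{
	*fd = 7;
	return -1;
}

/* Start a check. On connect failure, unset the fd so that a later
 * wakeup does not believe a check is still running and close a
 * descriptor that may now belong to another connection.
 */
int check_start(struct check *chk, int (*do_connect)(int *fd))
{
	assert(chk != NULL && do_connect != NULL);

	if (do_connect(&chk->fd) != 0) {
		chk->fd = -1;
		return -1;
	}
	return 0;
}

int check_is_running(const struct check *chk)
{
	return chk->fd >= 0;
}
```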
2012-11-23 09:03:29 +01:00
Willy Tarreau
55058a7c1e MINOR: stats: report HTTP compression stats per frontend and per backend
It was a bit frustrating to have no idea about the bandwidth saved by
HTTP compression. Now we have per-frontend and per-backend stats. The
stats on the HTTP interface are shown in a hover title in the "bytes out"
column if at least something was fed to the compressor. 3 new columns
appeared in the CSV stats output.
2012-11-22 01:07:40 +01:00
Willy Tarreau
83d84cfc8a BUILD: silence a warning on Solaris about usage of isdigit()
On Solaris, isdigit() is a macro and it complains about the use of
a char instead of the int for the argument. Let's cast it to an int
to silence it.
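
For illustration, here is a small standalone example of the kind of cast involved. Note that it casts through unsigned char, which is the usual portable idiom, rather than directly to int as the message describes. The function is a made-up example.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the decimal digits found in a string */
size_t count_digits(const char *s)
{
	size_t n = 0;

	for (; *s; s++) {
		/* isdigit() expects an int holding an unsigned char value
		 * (or EOF), so never pass a plain, possibly negative, char
		 */
		if (isdigit((unsigned char)*s))
			n++;
	}
	return n;
}
```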
2012-11-22 01:04:31 +01:00
Willy Tarreau
193b8c6168 MINOR: http: allow the cookie capture size to be changed
Some users need more than 64 characters to log large cookies. The limit
was set to 63 characters (and not 64 as previously documented). Now it
is possible to change this using the global "tune.http.cookielen" setting
if required.
2012-11-22 00:44:27 +01:00
Willy Tarreau
f9fbfe8229 BUG/MAJOR: stream_interface: read0 not always handled since dev12
The connection handling changes made in 1.5-dev12 introduced a
regression with commit 9bf9c14c. The issue is that the stream_sock_read0()
callback must update the channel flags to indicate that the side is closed
so that when process_session() is called, it can propagate the close to the
other side and terminate the session.

The issue only appears in HTTP tunnel mode and is a bit tricky to trigger.
It requires that the request channel is full with data flowing from the
client to the server, that both the response and the read0() are received
at once so that the flags are not updated, and that the HTTP analyser
switches to tunnel mode without being informed that the request write side
is closed. After that, process_session() does not know that the connection
has to be aborted either, and no further event appears on this side, so the
connection stays there forever.

Many thanks to Igor at owind for testing several snapshots and for providing
valuable traces to reproduce and diagnose the issue!
2012-11-21 21:59:51 +01:00
Willy Tarreau
85d47f9d98 MINOR: cli: report an error message on missing argument to compression rate
"set rate-limit http-compression global" needs an integer and must
complain when it's not there.
2012-11-21 02:15:16 +01:00
William Lallemand
072a2bf537 MINOR: compression: CPU usage limit
New option 'maxcompcpuusage' in global section.
Sets the maximum CPU usage HAProxy can reach before stopping the
compression for new requests or decreasing the compression level of
current requests. It works like 'maxcomprate' but is based on the Idle value.
2012-11-21 02:15:16 +01:00
William Lallemand
c71407657d BUG/MINOR: compression: dynamic level increase
With a compression rate limit, the compression level did not honor the
max compression level during a session because the test was done
on the wrong variable.
2012-11-21 02:15:16 +01:00
William Lallemand
e3a7d99062 MINOR: compression: report zlib memory usage
Show the memory usage and the max memory available for zlib.
The value stored is now the memory used instead of the remaining
available memory.
2012-11-21 02:15:16 +01:00
William Lallemand
096f554ee1 MINOR: compression: rate limit in 'show info'
Show the compression rate limit 'CompressRateLim' in bytes per second on
the UNIX socket.
2012-11-21 01:58:11 +01:00
William Lallemand
8b52bb3878 MEDIUM: compression: use pool for comp_ctx
Use pool for comp_ctx, it is allocated during the comp_algo->init().
The allocation of comp_ctx is accounted for in the zlib_memory_available.
2012-11-21 01:56:47 +01:00
Willy Tarreau
1feca01896 MINOR: cli: report the fd state in "show sess xxx"
This is useful to check the FD polling state during debugging sessions.
2012-11-19 18:15:19 +01:00
Willy Tarreau
7a0169a41a BUILD: cli: fix build when SSL is enabled
Commit bc174aa forgot to include proto/ssl_sock.h.
2012-11-19 17:13:41 +01:00
Willy Tarreau
9f7c6a183b BUG/MAJOR: stream_interface: certain workloads could get stuck
Some very specifically scheduled workloads could sometimes get stuck when
data receive was disabled due to buffer full then re-enabled due to a full
send(). A conn_data_want_recv() had to be set again in this specific case.

This bug was introduced with connection rework and polling changes in dev12.
2012-11-19 17:11:00 +01:00
Willy Tarreau
bc174aa144 MINOR: cli: report connection status in "show sess xxx"
Connection flags, targets and transport layers are now reported in
"show sess $PTR", as it is an absolute requirement in debugging.
2012-11-19 16:22:22 +01:00
William Lallemand
bf3ae61789 MEDIUM: compression: don't compress when no data
This patch makes changes in the http_response_forward_body state
machine. It checks if the compress algorithm had consumed data before
swapping the temporary and the input buffer. So it prevents null sized
zlib chunks.
2012-11-19 14:57:29 +01:00
Willy Tarreau
b97b6190e1 BUG: compression: properly disable compression when content-type does not match
Disabling compression based on the content-type was improperly done since the
introduction of the COMP_READY flag, sometimes resulting in truncated responses.
2012-11-19 14:55:02 +01:00
Willy Tarreau
16a2147dfe MEDIUM: adjust the maxaccept per listener depending on the number of processes
global.tune.maxaccept was used for all listeners. This becomes really
inconvenient when some listeners are bound to a single process and others
are bound to many processes.

Now we change the principle : we count the number of processes a listener
is bound to, and apply the maxaccept either entirely if there is a single
process, or divided by twice the number of processes in order to maintain
fairness.

The default limit has also been increased from 32 to 64 as it appeared that
on small machines, 32 was too low to achieve high connection rates.
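
The per-listener rule described above can be sketched as a small function. The name and the floor of 1 are assumptions made for this sketch, and the real code may round differently.

```c
/* Hypothetical sketch: compute a listener's maxaccept from the global
 * setting and the number of processes the listener is bound to.
 */
int listener_maxaccept(int global_maxaccept, int nbproc)
{
	int limit;

	/* a single process gets the whole budget */
	if (nbproc <= 1)
		return global_maxaccept;

	/* otherwise divide by twice the number of processes */
	limit = global_maxaccept / (2 * nbproc);

	/* assumption: never go below one accept per wakeup */
	return limit > 0 ? limit : 1;
}
```

With the new default of 64, a listener bound to 4 processes would accept at most 8 connections per wakeup under this rule.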
2012-11-19 12:39:59 +01:00
Emeric Brun
4f65bff1a5 MINOR: ssl: Add tune.ssl.lifetime statement in global.
Sets the SSL session <lifetime> in seconds. The OpenSSL default is 300 seconds.
2012-11-16 16:47:20 +01:00
Willy Tarreau
6ec58dbacc MINOR: ssl: rename and document the tune.ssl.cachesize option
It was initially called "tune.sslcachesize" but not documented, let's
rename it and document it.
2012-11-16 16:47:10 +01:00
Willy Tarreau
fc6c032d8d MEDIUM: global: add support for CPU binding on Linux ("cpu-map")
The new "cpu-map" directive allows one to assign the CPU sets that
a process is allowed to bind to. This is useful in combination with
the "nbproc" and "bind-process" directives.

The support is implicit on Linux 2.6.28 and above.
2012-11-16 16:16:53 +01:00
Emeric Brun
c52962f292 MINOR: conf: add warning if ssl is not enabled and a certificate is present on bind.
2012-11-15 18:46:03 +01:00
Willy Tarreau
110ecc1acd MINOR: config: support process ranges for "bind-process"
Several users have already been caught by "bind-process" which does not
support ranges, so let's support them now.
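
A minimal sketch of how a range argument such as "1-4" might be turned into a process bitmask. The function name, the 1-based numbering and the limit of 32 processes are assumptions made for this example, not details taken from the commit.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical parser: accept "N" or "N-M" (1-based, at most 32) and
 * set the matching bits in *mask. Returns 0 on success, -1 on error.
 */
int parse_process_range(const char *arg, unsigned long *mask)
{
	char *end;
	long low, high;

	assert(arg != NULL && mask != NULL);

	low = strtol(arg, &end, 10);
	if (end == arg)
		return -1;

	high = low;
	if (*end == '-') {
		const char *p = end + 1;

		high = strtol(p, &end, 10);
		if (end == p)
			return -1;
	}

	/* reject trailing garbage, out-of-bounds and reversed ranges */
	if (*end != '\0' || low < 1 || high > 32 || low > high)
		return -1;

	*mask = 0;
	for (long i = low; i <= high; i++)
		*mask |= 1UL << (i - 1);
	return 0;
}
```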
2012-11-15 17:50:01 +01:00
Willy Tarreau
247a13a315 MINOR: global: don't prevent nbproc from being redefined
Having nbproc preinitialized to zero is really annoying as it prevents
some checks from being correctly performed. Also the check to prevent
nbproc from being redefined is totally useless, so let's preset it to
1 and remove the test.
2012-11-15 17:38:15 +01:00
Willy Tarreau
543db62e1f BUG/MEDIUM: compression: release the zlib pools between keep-alive requests
There was a possible memory leak in the zlib code when the first response of
a keep-alive session was compressed, because the next request would reset the
compression algo, preventing a later call to session_free() from releasing it.
The reason is that it is necessary to release the assigned resources in
http_end_txn_clean_session().
2012-11-15 16:41:22 +01:00
William Lallemand
ec3e3890f0 BUG/MINOR: compression: deinit zlib only when required
The zlib stream was deinitialized even when the init failed.
2012-11-15 15:42:17 +01:00
William Lallemand
c04ca58222 BUG/MEDIUM: compression: no Content-Type header but type in configuration
HAProxy was compressing data when there was no Content-Type header in
the response but a compression type specified in the configuration.
2012-11-15 15:42:11 +01:00
Willy Tarreau
4690985fca BUG: compression: do not always increment the round counter on allocation failure
Zlib (at least 1.2 and 1.3) aborts when it fails to allocate the state, so we
must not count a round on this event. If the state allocation succeeds, then it allocates
all the 4 remaining counters at once.
2012-11-15 15:00:55 +01:00
Emeric Brun
4663577e24 MINOR: build: allow packagers to specify the ssl cache size
This is done by passing the default value to SSLCACHESIZE in sessions.
Users can use tune.sslcachesize to change this value.
By default, it is set to 20000 sessions, like the OpenSSL internal cache size.
Currently, a session entry size is between 592 and 616 bytes depending on the arch.
2012-11-15 10:52:19 +01:00
Willy Tarreau
4055a107a7 BUG: proxy: fix server name lookup in get_backend_server()
The lookup was broken by commit 050536d5. The server ID is
initialized to a negative value but unfortunately not all the
tests were converted. Thanks to Igor at owind for reporting it.
2012-11-15 00:15:18 +01:00
Willy Tarreau
96aa6b32d7 MINOR: build: allow packagers to specify the default maxzlibmem
This is done by passing the default value to DEFAULT_MAXZLIBMEM in megs.
2012-11-12 15:52:53 +01:00
Willy Tarreau
45b8893966 MINOR: splice: disable it when the system returns EBADF
At least on a heavily patched 2.6.35.9, we can see splice() fail
with EBADF :

  recv(6, "789.123456789.123456789.12345678"..., 1049, 0) = 1049
  send(5, "HTTP/1.1 200\r\nContent-length: 10"..., 8030, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_MORE) = 8030
  gettimeofday({1352717854, 515601}, NULL) = 0
  epoll_wait(0x3, 0x40221008, 0x7, 0)     = 0
  gettimeofday({1352717854, 515793}, NULL) = 0
  pipe([7, 8])                            = 0
  splice(0x6, 0, 0x8, 0, 0xfe12c, 0x3)    = -1 EBADF (Bad file descriptor)
  close(6)                                = 0

This clearly is a kernel issue since all FDs are valid here, so let's
simply disable splice() on the connection when this happens so that
the session correctly recovers from that issue using recv().
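
The recovery logic can be sketched as a small decision function. The types and names are hypothetical, and the handling of errors other than EBADF is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-connection transfer state */
struct xfer_state {
	int splice_enabled;
};

enum xfer_action {
	XFER_RETRY_SPLICE,  /* nothing to move yet, try again later */
	XFER_USE_RECV,      /* fall back to recv() on this connection */
	XFER_ABORT          /* unrecoverable error */
};

/* Decide what to do after splice() failed with error code err */
enum xfer_action on_splice_error(struct xfer_state *st, int err)
{
	assert(st != NULL);

	switch (err) {
	case EAGAIN:
		return XFER_RETRY_SPLICE;
	case EBADF:
		/* kernel refuses splice() although the FDs are valid:
		 * stop using it for this connection only
		 */
		st->splice_enabled = 0;
		return XFER_USE_RECV;
	default:
		return XFER_ABORT;
	}
}
```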
2012-11-12 12:02:20 +01:00
Emeric Brun
674b743067 BUG/MEDIUM: ssl: Fix sometimes reneg fails if requested by server.
SSL_do_handshake is not appropriate for reneg, it's only appropriate at the
beginning of a connection. OpenSSL correctly handles renegs using the data
functions, so we use SSL_peek() here to make its state machine progress if
SSL_renegotiate_pending() says a reneg is pending.
2012-11-12 11:46:08 +01:00
Emeric Brun
282a76acc1 BUG/MEDIUM: ssl: Fix some reneg cases not correctly handled.
SSL may decide to switch to a handshake in the middle of a transfer due to
a reneg. In this case we don't want to re-enable polling because data might
have been left pending in the buffer. We just want to switch immediately to
the handshake mode.
2012-11-12 11:43:05 +01:00
Emeric Brun
8af8dd1a9a BUG/MEDIUM: ssl: review polling on reneg.
SSL may return SSL_ERROR_WANT_WRITE or SSL_ERROR_WANT_READ when switching
from data to handshake even if it does not need to poll first.
2012-11-12 11:41:16 +01:00
Willy Tarreau
70d0ad560c BUG: polling: don't skip polled events in the spec list
Commit 09f245 came with a bug : if we don't process events from the
spec list that are also being polled, we can end up with some stuck
events that nobody processes.

We must process all events from the spec list even if they're being
polled in parallel.
2012-11-12 01:57:14 +01:00
Willy Tarreau
54a08d3e08 BUG: connection: fix typo in previous commit
A typo broke the logs (obj_type() instead of objt_server()).
2012-11-12 01:14:56 +01:00