747 Commits

Author SHA1 Message Date
Willy Tarreau
ee113f5345 [BUG] regparm is broken on gcc < 3
Gcc < 3 does not consider regparm declarations for function pointers.
This causes big trouble at least with pollers (and with any function
pointer after all). Disable CONFIG_HAP_USE_REGPARM for gcc < 3.
(cherry picked from commit 61eadc028fb8774ea05d893cd3eca6c671fb511e)
2008-09-02 11:03:37 +02:00
Willy Tarreau
2f9127b4b9 [BUG] maintain_proxies must not disable backends
maintain_proxies could disable backends (p->maxconn == 0) which is
wrong (but apparently harmless). Add a check for p->maxconn == 0.
(cherry picked from commit d5382b4aaa099ce5ce2af5828bd4d6dc38e9e8ea)
2008-09-02 11:03:22 +02:00
Willy Tarreau
116f4105d4 [BUG] ev_sepoll: closed file descriptors could persist in the spec list
If __fd_clo() was called on a file descriptor which was previously
disabled, it was not removed from the spec list. This apparently
could not happen on previous code because the TCP states prevented
this, but now it happens regularly. The effects are spec entries
stuck populated, leading to busy loops.

(cherry picked from commit 7a52a5c4680477272b2f34eaf5896b85746e6fd6)
2008-09-02 11:01:49 +02:00
Willy Tarreau
df82605d3e [BUG] server timeout was not considered in some circumstances
Due to a copy-paste typo, the client timeout was refreshed instead
of the server's when waiting for server response. This means that
the server's timeout remained eternity.

(cherry picked from commit 9f1f24bb7fb8ebd6b43b5fee1bda0afbdbcb768e)
2008-09-02 11:00:03 +02:00
Willy Tarreau
3449d158ad [BUG] fix segfault with url_param + check_post
If an HTTP/0.9-like POST request is sent to haproxy while
configured with url_param + check_post, it will crash. The
reason is that the total buffer length was computed based
on req->total (which equals the number of bytes read) and
not req->l (number of bytes in the buffer), thus leading
to wrong size calculations when calling memchr().

The affected code does not look like it could have been
exploited to run arbitrary code, only reads were performed
at wrong locations.
(cherry picked from commit fb0528bd56063e9800c7dd6fbd96b3c5c6a687f2)
2008-09-02 10:52:14 +02:00
Willy Tarreau
0e3e59b11f [CLEANUP] remove dependency on obsolete INTBITS macro
The INTBITS macro was found to be already defined on some platforms,
and to equal 32 (while INTBITS was 5 here). Due to pure luck, there
was no declaration conflict, but it's nonetheless a problem to fix.

Looking at the code showed that this macro was only used for left
shifts and nothing else anymore. So the replacement is obvious. The
new macro, BITS_PER_INT is more obviously correct.
(cherry picked from commit 177e2b012723ef65c6c7f850df3e6e0cd2cca2b4)
2008-09-02 10:50:22 +02:00
Willy Tarreau
e05484f3ec [BUG] use_backend would not correctly consider "unless"
A copy-paste typo made use_backend not correctly consider the "unless"
case, depending on the previous "block" rule.
(cherry picked from commit a8cfa34a9c011cecfaedfaf7d91de3e5f7f004a0)
2008-09-02 10:50:01 +02:00
Willy Tarreau
3ff2c1d9b8 [BUILD] silent a warning in unlikely() with gcc 4.x
The unlikely() implementation for gcc 4.x spits out a warning
when a pointer is passed. Add a cast to unsigned long.
(cherry picked from commit 75875a7c8c7348e90e373c007bafdedcc6253ab4)
2008-09-02 10:49:43 +02:00
Willy Tarreau
66c9f287c2 [BUILD] change declaration of base64tab to fix build with Intel C++
I got a report that Intel C++ complains about the size of the
base64tab in base64.c. Setting it to 65 chars to allow for the
trailing zero fixes the problem.
(cherry picked from commit 69e989ccbcc1d5cbb623493d6c9cca169fb36ff6)
2008-09-02 10:48:56 +02:00
Willy Tarreau
80f35306e9 [BUG] disable buffer read timeout when reading stats
The buffer read timeouts were not reset when stats were produced. This
caused unneeded wakeups.
(cherry picked from commit 284c7b319566a66d5b742c905072175aac6445e1)
2008-09-02 10:48:34 +02:00
Willy Tarreau
b4ac1f9995 [RELEASE] Released version 1.3.15.2
Released version 1.3.15.2 with the following main changes :
    - [BUILD] make install should depend on haproxy not "all"
    - [BUG] event pollers must not wait if a task exists in the run queue
    - [BUG] queue management: wake oldest request in queues
    - [BUG] log: reported queue position was offed-by-one
    - [BUG] fix the dequeuing logic to ensure that all requests get served
    - [DOC] documentation for the "retries" parameter was missing.
v1.3.15.2
2008-06-21 21:59:05 +02:00
Willy Tarreau
fd7782b52a [DOC] documentation for the "retries" parameter was missing. 2008-06-20 17:29:31 +02:00
Willy Tarreau
209f9c2fb6 [BUG] fix the dequeuing logic to ensure that all requests get served
The dequeuing logic was completely wrong. First, a task was assigned
to all servers to process the queue, but this task was never scheduled
and was only woken up on session free. Second, there was no reservation
of server entries when a task was assigned a server. This means that
as long as the task was not connected to the server, its presence was
not accounted for. This was causing trouble when detecting whether or
not a server had reached maxconn. Third, during a redispatch, a session
could lose its place at the server's and get blocked because another
session at the same moment would have stolen the entry. Fourth, the
redispatch option did not work when maxqueue was reached for a server,
and it was not possible to do so without indefinitely hanging a session.

The root cause of all those problems was the lack of pre-reservation of
connections at the server's, and the lack of tracking of servers during
a redispatch. Everything relied on combinations of flags which could
appear similarly in quite distinct situations.

This patch is a major rework but there was no other solution, as the
internal logic was deeply flawed. The resulting code is cleaner, more
understandable, uses less magics and is overall more robust.

As an added bonus, "option redispatch" now works when maxqueue has
been reached on a server.
2008-06-20 15:22:51 +02:00
Willy Tarreau
3d85ee3be0 [BUG] log: reported queue position was offed-by-one
The reported queue position in the logs was 0 for the first pending request
in the queue, which is wrong because it means that one request will have to
be completed before the queued one may execute. It caused the undesired side
effect that 0/0 was reported when either 0 or 1 request was pending in the
queue. Thus, we have to increment the queue size before reporting the value.
2008-06-20 15:22:51 +02:00
Willy Tarreau
8857af4b00 [BUG] queue management: wake oldest request in queues
When a server terminates a connection, the next session in its
own queue was immediately processed. Because of this, if all
server queues are always filled, then no new anonymous request
will be processed. Consider oldest request between global and
server queues to choose from which to pick the request.

An improvement over this will consist in adding a configurable
offset when comparing expiration dates, so that cookie-less
requests can get either less or more priority.
2008-06-20 15:22:50 +02:00
Willy Tarreau
f4c7353757 [BUG] event pollers must not wait if a task exists in the run queue
Under some circumstances, a task may already lie in the run queue
(eg: inter-task wakeup). It is disastrous to wait for an event in
this case because some processing gets delayed.
2008-06-20 15:22:50 +02:00
Willy Tarreau
97da34a8b4 [BUILD] make install should depend on haproxy not "all"
Reported by Cherife Li : just doing a "make install" fails because it
depends on "all" which is equivalent to "help" if no TARGET was specified.
Make it depend on "haproxy" instead.
2008-06-12 00:29:59 +02:00
Willy Tarreau
98f0b04234 [RELEASE] Released version 1.3.15.1
Released version 1.3.15.1 with the following main changes :
    - [BUILD] fix build with gcc 4.3
    - [TESTS] add a debug patch to help trigger the stats bug
    - [BUG] Flush buffers also where there are exactly 0 bytes left
    - [DOC] update the README file with new build options
    - [MEDIUM] reduce risk of event starvation in ev_sepoll
v1.3.15.1
2008-05-25 21:12:58 +02:00
Willy Tarreau
e3dabe51ad [MEDIUM] reduce risk of event starvation in ev_sepoll
If too many events are set for spec I/O, those ones can starve the
polled events. Experiments show that when polled events starve, they
quickly turn into spec I/O, making the situation even worse. While
we can reduce the number of polled events processed at once, we
cannot do this on speculative events because most of them are new
ones (avg 2/3 new - 1/3 old from experiments).

The solution against this problem relies on those two factors :
  1) one FD registered as a spec event cannot be polled at the same time
  2) even during very high loads, we will almost never be interested in
     simultaneous read and write streaming on the same FD.

The first point implies that during starvation, we will not have more than
half of our FDs in the poll list, otherwise it means there is less than that
in the spec list, implying there is no starvation.

The second point implies that we're statically only interested in half of
the maximum number of file descriptors at once, because we will unlikely
have simultaneous read and writes for a same buffer during long periods.

So, if we make it possible to drain maxsock/2/2 during peak loads, then we
can ensure that there will be no starvation effect. This means that we must
always allocate maxsock/4 events for the poller.

Last, sepoll uses an optimization consisting in reducing the number of calls
to epoll_wait() to once every too polls. However, when dealing with many
spec events, we can wait very long and skipping epoll_wait() every second
time increases latency. For this reason, we try to detect if we are beyond
a reasonable limit and stop doing so at this stage.
2008-05-25 14:39:35 +02:00
Willy Tarreau
caa9f39b5b [DOC] update the README file with new build options 2008-05-25 14:39:28 +02:00
Jeremy Hinegardner
ff78e9ade0 [BUILD] fix build with gcc 4.3
For Fedora 9 gcc 4.3 will be shipping as a feature, and right now haproxy does
not compile with gcc 4.3.

It appears that there is a reordering of headers or something along those lines,
This is the patch that gets haproxy to compile with gcc 4.3.  I'm not sure if
this is the correct approach you would want to use, so please correct me.

If this works for you, I'll go ahead and put this patch in the src rpm until a
release of haproxy which compiles with gcc 4.3 is released.
2008-05-25 14:39:15 +02:00
Krzysztof Oledzki
edaf01fc28 [TESTS] add a debug patch to help trigger the stats bug
About: [BUG] Flush buffers also where there are exactly 0 bytes left

I'm also attaching a debug patch that helps to trigger this bug.

Without the fix:
# echo -ne "GET /haproxy?stats;csv;norefresh HTTP/1.0\r\n\r\n"|nc 127.0.0.1
801|wc -c
16384

With the fix:
# echo -ne "GET /haproxy?stats;csv;norefresh HTTP/1.0\r\n\r\n"|nc 127.0.0.1
801|wc -c
33089

Best regards,

                                Krzysztof Oledzki
2008-05-25 14:39:12 +02:00
Krzysztof Piotr Oledzki
144a3d3016 [BUG] Flush buffers also where there are exactly 0 bytes left
I noticed it was possible to get truncated http/csv stats. Sometimes.
Usually the problem disappeared as fast as it appeared, but once it
happend that my http-stats page was truncated for about one hour.
It was quite weird as it happened independently for csv and http
output and it took me some time to track & fix this bug.

Both buffer_write & buffer_write_chunk used to return 0 in two
situations: is case of success or where there was exactly 0 bytes
left. The first one is intentional but I believe the second one
is not as it was not possible to distinguish between successful
write and unsuccessful one, which means that if the buffer was 100%
filled, it was never flushed and it was not possible to write
more data.

This patch fixes this problem.
2008-05-25 14:39:08 +02:00
Willy Tarreau
7b4c5aee55 [RELEASE] Released version 1.3.15
Released version 1.3.15 with the following main changes :
    - [BUILD] Added support for 'make install'
    - [BUILD] Added 'install-man' make target for installing the man page
    - [BUILD] Added 'install-bin' make target
    - [BUILD] Added 'install-doc' make target
    - [BUILD] Removed "/" after '$(DESTDIR)' in install targets
    - [BUILD] Changed 'install' target to install the binaries first
    - [BUILD] Replace hardcoded 'LD = gcc' with 'LD = $(CC)'
    - [MEDIUM]: Inversion for options
    - [MEDIUM]: Count retries and redispatches also for servers, fix redistribute_pending, extend logs, %d->%u cleanup
    - [BUG]: Restore clearing t->logs.bytes
    - [MEDIUM]: rework checks handling
    - [DOC] Update a "contrib" file with a hint about a scheme used for formathing subjects
    - [MEDIUM] Implement "track [<backend>/]<server>"
    - [MINOR] Implement persistent id for proxies and servers
    - [BUG] Don't increment server connections too much + fix retries
    - [MEDIUM]: Prevent redispatcher from selecting the same server, version #3
    - [MAJOR] proto_uxst rework -> SNMP support
    - [BUG] appsession lookup in URL does not work
    - [BUG] transparent proxy address was ignored in backend
    - [BUG] hot reconfiguration failed because of a wrong error check
    - [DOC] big update to the configuration manual
    - [DOC] large update to the configuration manual
    - [DOC] document more options
    - [BUILD] major rework of the GNU Makefile
    - [STATS] add support for "show info" on the unix socket
    - [DOC] document options forwardfor to logasap
    - [MINOR] add support for the "backlog" parameter
    - [OPTIM] introduce global parameter "tune.maxaccept"
    - [MEDIUM] introduce "timeout http-request" in frontends
    - [MINOR] tarpit timeout is also allowed in backends
    - [BUG] increment server connections for each connect()
    - [MEDIUM] add a turn-around state of one second after a connection failure
    - [BUG] fix typo in redispatched connection
    - [DOC] document options nolinger to ssl-hello-chk
    - [DOC] added documentation for "option tcplog" to "use_backend"
    - [BUG] connect_server: server might not exist when sending error report
    - [MEDIUM] support fully transparent proxy on Linux (USE_LINUX_TPROXY)
    - [MEDIUM] add non-local bind to connect() on Linux
    - [MINOR] add transparent proxy support for balabit's Tproxy v4
    - [BUG] use backend's source and not server's source with tproxy
    - [BUG] fix overlapping server flags
    - [MEDIUM] fix server health checks source address selection
    - [BUG] build failed on CONFIG_HAP_LINUX_TPROXY without CONFIG_HAP_CTTPROXY
    - [DOC] added "server", "source" and "stats" keywords
    - [DOC] all server parameters have been documented
    - [DOC] document all req* and rsp* keywords.
    - [DOC] added documentation about HTTP header manipulations
    - [BUG] log response byte count, not request
    - [BUILD] code did not build in full debug mode
    - [BUG] fix truncated responses with sepoll
    - [MINOR] use s->frt_addr as the server's address in transparent proxy
    - [MINOR] fix configuration hint about timeouts
    - [DOC] minor cleanup of the doc and notice to contributors
    - [MINOR] report correct section type for unknown keywords.
    - [BUILD] update MacOS Makefile to build on newer versions
    - [DOC] fix erroneous "useallbackups" option in the doc
    - [DOC] applied small fixes from early readers
    - [MINOR] add configuration support for "redir" server keyword
    - [MEDIUM] completely implement the server redirection method
    - [TESTS] add a test case for the server redirection mechanism
    - [DOC] add a configuration entry for "server ... redir <prefix>"
    - [BUILD] backend.c and checks.c did not build without tproxy !
    - Revert "[BUILD] backend.c and checks.c did not build without tproxy !"
    - [BUILD] backend.c and checks.c did not build without tproxy !
    - [OPTIM] used unsigned ints for HTTP state and message offsets
    - [OPTIM] GCC4's builtin_expect() is suboptimal
    - [BUG] failed conns were sometimes incremented in the frontend!
    - [BUG] timeout.check was not pre-set to eternity
    - [TESTS] add test-pollers.cfg to easily report pollers in use
    - [BUG] do not apply timeout.connect in checks if unset
    - [BUILD] ensure that makefile understands USE_DLMALLOC=1
    - [MINOR] silent gcc for a wrong warning
    - [CLEANUP] update .gitignore to ignore more temporary files
    - [CLEANUP] report dlmalloc's source path only if explictly specified
    - [BUG] str2sun could leak a small buffer in case of error during parsing
    - [BUG] option allbackups was not working anymore in roundrobin mode
    - [MAJOR] implementation of the "leastconn" load balancing algorithm
    - [BUILD] ensure that users don't build without setting the target anymore.
    - [DOC] document the leastconn LB algo
    - [MEDIUM] fix stats socket limitation to 16 kB
    - [DOC] fix unescaped space in httpchk example.
    - [BUG] fix double-decrement of server connections
    - [TESTS] add a test case for port mapping
    - [TESTS] add a benchmark for integer hashing
    - [TESTS] add new methods in ip-hash test file
    - [MAJOR] implement parameter hashing for POST requests
v1.3.15
2008-04-19 21:25:12 +02:00
Willy Tarreau
192ee3e630 [BUILD] fix build of POST analysis code with gcc < 3
move variable declarations at beginning of blocks.
2008-04-19 21:24:56 +02:00
matt.farnsworth@nokia.com
1c2ab96be5 [MAJOR] implement parameter hashing for POST requests
This patch extends the "url_param" load balancing method by introducing
the "check_post" option. Using this option enables analysis of the beginning
of POST requests to search for the specified URL parameter.

The patch also fixes a few minor typos in comments that were discovered
during code review.
2008-04-15 15:30:41 +02:00
Willy Tarreau
a532324128 [TESTS] add new methods in ip-hash test file
added methods to provide a better hash with small input sets
2008-04-13 09:27:00 +02:00
Willy Tarreau
53cfa0e58d [TESTS] add a benchmark for integer hashing
If we want to support netmasks for IP address hashing,
we will need something better than a pure modulus, otherwise
people with even numbers of servers will get surprizes.

Bob Jenkins is known for his works on hashing, and his site
has a lot of very interesting researches and algorithms for
integer hashing. He also points to the work of Thomas Wang
who has similar findings.

The program here tests their algorithms in order to find one
well suited for IP address hashing.
2008-04-12 22:28:32 +02:00
Willy Tarreau
73290cc74f [TESTS] add a test case for port mapping 2008-04-12 11:19:04 +02:00
Willy Tarreau
f899b94e63 [BUG] fix double-decrement of server connections
If a client does a sudden dirty close (CL_STCLOSE) during a server
connect turn-around, then the number of server connections is
decremented twice. This causes huge problems on the affected
server because when its connection number becomes negative, it
overflows and prevents the server from accepting new connections
due to an apparent saturation.

The fix consists in not decrementing the counter if the server is
in a turn-around state.
2008-03-28 18:19:05 +01:00
Willy Tarreau
ebaf21af95 [DOC] fix unescaped space in httpchk example.
Lars Braeuer reported a missing space in the example which drives
readers in wrong direction.
2008-03-21 20:18:01 +01:00
Christian Wiese
f630830b60 [BUILD] Replace hardcoded 'LD = gcc' with 'LD = $(CC)'
haproxy relies on linking the binary using gcc, so there is no real need to
hardcode both (CC and LD). Setting 'LD = $(CC)' will make the build system
a bit more cross-compile friendly because only the right cross-compiler has
to be passed via make.
2008-03-17 22:49:28 +01:00
Willy Tarreau
39f7e6d516 [MEDIUM] fix stats socket limitation to 16 kB
Due to the way the stats socket work, it was not possible to
maintain the information related to the command entered, so
after filling a whole buffer, the request was lost and it was
considered that there was nothing to write anymore.

The major reason was that some flags were passed directly
during the first call to stats_dump_raw() instead of being
stored persistently in the session.

To definitely fix this problem, flags were added to the stats
member of the session structure.

A second problem appeared. When the stats were produced, a first
call to client_retnclose() was performed, then one or multiple
subsequent calls to buffer_write_chunks() were done. But once the
stats buffer was full and a reschedule operated, the buffer was
flushed, the write flag cleared from the buffer and nothing was
done to re-arm it.

For this reason, a check was added in the proto_uxst_stats()
function in order to re-call the client FSM when data were added
by stats_dump_raw(). Finally, the whole unix stats dump FSM was
rewritten to avoid all the magics it depended on. It is now
simpler and looks more like the HTTP one.
2008-03-17 22:08:01 +01:00
Willy Tarreau
2d2a7f8fbc [DOC] document the leastconn LB algo 2008-03-17 12:08:13 +01:00
Christian Wiese
db5238d3a4 [BUILD] Changed 'install' target to install the binaries first 2008-03-17 08:59:07 +01:00
Christian Wiese
ce8f3428a2 [BUILD] Removed "/" after '$(DESTDIR)' in install targets
SBINDIR, MANDIR and DOCDIR use an absolute path
2008-03-17 08:59:07 +01:00
Christian Wiese
bf34eb4f73 [BUILD] Added 'install-doc' make target
This change is also introducing a new make variable called DOCDIR,
which is set to "$(PREFIX)/doc/haproxy" by default.
2008-03-17 08:59:07 +01:00
Christian Wiese
4cf5d575b5 [BUILD] Added 'install-bin' make target 2008-03-17 08:59:07 +01:00
Christian Wiese
19b5029318 [BUILD] Added 'install-man' make target for installing the man page
This change is also introducing a new variable in the Makefile called
MANDIR, which is set to "$PREFIX/share/man" by default.
2008-03-17 08:59:06 +01:00
Christian Wiese
a184aa273c [BUILD] Added support for 'make install'
To be flexible while installing haproxy following variables have been
added to the Makefile:
- DESTDIR useful i.e. while installing in a sandbox (not set by default)
- PREFIX  defines the default install prefix (default: /usr/local)
- SBINDIR defines the dir the haproxy binary gets installed
  (default: $PREFIX/sbin)
2008-03-17 08:59:06 +01:00
Willy Tarreau
e4208cbc9d [BUILD] ensure that users don't build without setting the target anymore.
Too often, people report performance issues on Linux 2.6 because they don't
use the available optimizations. We need to ensure that people are aware of
the available features, and for this, we must force them to choose a target
OS (or "generic"), but at least prevent them from blindly building for a
generic target.
2008-03-11 06:37:39 +01:00
Willy Tarreau
51406233bb [MAJOR] implementation of the "leastconn" load balancing algorithm
The new "leastconn" LB algorithm selects the server which has the
least established or pending connections. The weights are considered,
so that a server with a weight of 20 will get twice as many connections
as the server with a weight of 10.

The algorithm respects the minconn/maxconn settings, as well as the
slowstart since it is a dynamic algorithm. It also correctly supports
backup servers (one and all).

It is generally suited for protocols with long sessions (such as remote
terminals and databases), as it will ensure that upon restart, a server
with no connection will take all new ones until its load is balanced
with others.

A test configuration has been added in order to ease regression testing.
2008-03-10 22:04:30 +01:00
Willy Tarreau
f4cca45b5e [BUG] option allbackups was not working anymore in roundrobin mode
Commit 3168223a7b33a1d5aad1e11b8f2ad917645d7f27 broke option
"allbackups" in roundrobin mode due to an erroneous structure
member replacement in backend.c. The PR_O_USE_ALL_BK flag was
not tested in the right member anymore.

This bug uncoverred another one, by which all backup servers would
be used whatever the option's value, if all of them had been seen
as simultaneously failed at one moment.

This patch fixes the two stupid errors. Correctness has been tested
using the test-fwrr.cfg config example.
2008-03-08 21:42:54 +01:00
Willy Tarreau
caf720d3ff [BUG] str2sun could leak a small buffer in case of error during parsing
Matt Farnsworth reported a memory leak in str2sun() in case a too large
socket path is passed. The bug is very minor because it only happens
once during config parsing, but has to be fixed nevertheless. The patch
Matt provided could even be improved by completely removing the useless
strdup() in this function.
2008-03-07 10:07:04 +01:00
Willy Tarreau
f32d19a395 [CLEANUP] report dlmalloc's source path only if explictly specified
There's no point in reporting dlmalloc's source path if it was the
default one.
2008-03-07 10:02:14 +01:00
Willy Tarreau
83ded5082e [CLEANUP] update .gitignore to ignore more temporary files 2008-03-07 09:39:37 +01:00
Willy Tarreau
f863ac152a [MINOR] silent gcc for a wrong warning
gcc believes that avoididx may be used uninitialized, which is wrong.
2008-03-04 06:38:57 +01:00
Krzysztof Piotr Oledzki
2c6962c3c0 [MAJOR] proto_uxst rework -> SNMP support
Currently there is a ~16KB limit for a data size passed via unix socket.
It is caused by a trivial bug ttat is going to fixed soon, however
in most cases there is no need to dump a full stats.

This patch makes possible to select a scope of dumped data by extending
current "show stat" to "show stat [<iid> <type> <sid>]":
 - iid is a proxy id, -1 to dump all proxies
 - type selects type of dumpable objects: 1 for frontend, 2 for backend, 4 for
   server, -1 for all types. Values can be ORed, for example:
     1+2=3   -> frontend+backend.
     1+2+4=7 -> frontend+backend+server.
 - sid is a service id, -1 to dump everything from the selected proxy.

To do this I implemented a new session flag (SN_STAT_BOUND), added three
variables in data_ctx.stats (iid, type, sid), modified dumpstats.c and
completely revorked the process_uxst_stats: now it waits for a "\n"
terminated string, splits args and uses them. BTW: It should be quite easy
to add new commands, for example to enable/disable servers, the only problem
I can see is a not very lucky config name (*stats* socket). :|

During the work I also fixed two bug:
 - s->flags were not initialized for proto_uxst
 - missing comma if throttling not enabled (caused by a stupid change in
     "Implement persistent id for proxies and servers")

Other changes:
 - No more magic type valuse, use STATS_TYPE_FE/STATS_TYPE_BE/STATS_TYPE_SV
 - Don't memset full s->data_ctx (it was clearing s->data_ctx.stats.{iid/type/sid},
    instead initialize stats.sv & stats.sv_st (stats.px and stats.px_st were already
    initialized)

With all that changes it was extremely easy to write a short perl plugin
for a perl-enabled net-snmp (also included in this patch).

29385 is my PEN (Private Enterprise Number) and I'm willing to donate
the SNMPv2-SMI::enterprises.29385.106.* OIDs for HAProxy if there
is nothing assigned already.
2008-03-04 06:32:16 +01:00
Krzysztof Piotr Oledzki
5a329cf017 [MEDIUM]: Prevent redispatcher from selecting the same server, version #3
When haproxy decides that session needs to be redispatched it chose a server,
but there is no guarantee for it to be a different one. So, it often
happens that selected server is exactly the same that it was previously, so
a client ends up with a 503 error anyway, especially when one sever has
much bigger weight than others.

Changes from the previous version:
 - drop stupid and unnecessary SN_DIRECT changes

 - assign_server(): use srvtoavoid to keep the old server and clear s->srv
    so SRV_STATUS_NOSRV guarantees that t->srv == NULL (again)
    and get_server_rr_with_conns has chances to work (previously
    we were passing a NULL here)

 - srv_redispatch_connect(): remove t->srv->cum_sess and t->srv->failed_conns
   incrementing as t->srv was guaranteed to be NULL

 - add avoididx to get_server_rr_with_conns. I hope I correctly understand this code.

 - fix http_flush_cookie_flags() and move it to assign_server_and_queue()
   directly. The code here was supposed to set CK_DOWN and clear CK_VALID,
   but: (TX_CK_VALID | TX_CK_DOWN) == TX_CK_VALID == TX_CK_MASK so:
	if ((txn->flags & TX_CK_MASK) == TX_CK_VALID)
		txn->flags ^= (TX_CK_VALID | TX_CK_DOWN);
   was really a:
	if ((txn->flags & TX_CK_MASK) == TX_CK_VALID)
		txn->flags &= TX_CK_VALID

   Now haproxy logs "--DI" after redispatching connection.

 - defer srv->redispatches++ and s->be->redispatches++ so there
   are called only if a conenction was redispatched, not only
   supposed to.

 - don't increment lbconn if redispatcher selected the same sarver

 - don't count unsuccessfully redispatched connections as redispatched
   connections

 - don't count redispatched connections as errors, so:

 - the number of connections effectively served by a server is:
 srv->cum_sess - srv->failed_conns - srv->retries - srv->redispatches
   and
 SUM(servers->failed_conns) == be->failed_conns

 - requires the "Don't increment server connections too much + fix retries" patch

 - needs little more testing and probably some discussion so reverting to the RFC state

Tests #1:
 retries 4
 redispatch

i) 1 server(s): b (wght=1, down)
  b) sessions=5, lbtot=1, err_conn=1, retr=4, redis=0
  -> request failed

ii) server(s): b (wght=1, down), u (wght=1, down)
  b) sessions=4, lbtot=1, err_conn=0, retr=3, redis=1
  u) sessions=1, lbtot=1, err_conn=1, retr=0, redis=0
  -> request FAILED

iii) 2 server(s): b (wght=1, down), u (wght=1, up)
  b) sessions=4, lbtot=1, err_conn=0, retr=3, redis=1
  u) sessions=1, lbtot=1, err_conn=0, retr=0, redis=0
  -> request OK

iv) 2 server(s): b (wght=100, down), u (wght=1, up)
  b) sessions=4, lbtot=1, err_conn=0, retr=3, redis=1
  u) sessions=1, lbtot=1, err_conn=0, retr=0, redis=0
  -> request OK

v) 1 server(s): b (down for first 4 SYNS)
  b) sessions=5, lbtot=1, err_conn=0, retr=4, redis=0
  -> request OK

Tests #2:
 retries 4

i) 1 server(s): b (down)
  b) sessions=5, lbtot=1, err_conn=1, retr=4, redis=0
  -> request FAILED
2008-03-04 06:16:37 +01:00
Krzysztof Piotr Oledzki
626a19b66f [BUG] Don't increment server connections too much + fix retries
Commit 98937b875798e10fac671d109355cde29d2a411a while fixing
one bug introduced another one. With "retries 4" and
"option redispatch" haproxy tries to connect 4 times to
one server server and 1 time to a second one. However
logs showed 5 connections to the first server (the
last one was counted twice) and 2 to the second.

This patch also fixes srv->retries and be->retries increments.

Now I get: 3 retries and 1 error in a first server (4 cum_sess)
and 1 error in a second server (1 cum_sess) with:
 retries 4
 option redispatch

and: 4 retries and 1 error (5 cum_sess) with:
 retries 4

So, the number of connections effectively served by a server is:
 srv->cum_sess - srv->failed_conns - srv->retries
2008-03-04 06:11:17 +01:00