There were a few unchecked write() calls in the debug code that cause
gcc 4.x to emit warnings on recent libc. We don't want to check them
as we can't make anything from the result, let's simply surround them
with an empty if statement.
Note that one of the warnings was for chdir("/") which normally cannot
fail since it follows a successful chroot (which means the perms are
necessarily there). Anyway let's move the call uppe to protect it too.
%Fi: Frontend IP
%Fp: Frontend Port
%Si: Server IP
%Sp: Server Port
%Ts: Timestamp
%rt: HTTP request counter
%H: hostname
%pid: PID
+X: Hexadecimal represenation
The +X mode in logformat displays hexadecimal for the following flags
%Ci %Cp %Fi %Fp %Bi %Bp %Si %Sp %Ts %ct %pid
rename logformat_write_string() to lf_text()
Optimize size computation
Sometimes it is desirable to forward a particular request to a specific
server without having to declare a dedicated backend for this server. This
can be achieved using the "use-server" rules. These rules are evaluated after
the "redirect" rules and before evaluating cookies, and they have precedence
on them. There may be as many "use-server" rules as desired. All of these
rules are evaluated in their declaration order, and the first one which
matches will assign the server.
Released version 1.5-dev8 with the following main changes :
- MINOR: patch for minor typo (ressources/resources)
- MEDIUM: http: add support for sending the server's name in the outgoing request
- DOC: mention that default checks are TCP connections
- BUG/MINOR: fix options forwardfor if-none when an alternative header name is specified
- CLEANUP: Make check_statuses, analyze_statuses and process_chk static
- CLEANUP: Fix HCHK spelling errors
- BUG/MINOR: fix typo in processing of http-send-name-header
- MEDIUM: log: Use linked lists for loggers
- BUILD: fix declaration inside a scope block
- REORG: log: split send_log function
- MINOR: config: Parse the string of the log-format config keyword
- MINOR: add ultoa, ulltoa, ltoa, lltoa implementations
- MINOR: Date and time fonctions that don't use snprintf
- MEDIUM: log: make http_sess_log use log_format
- DOC: log-format documentation
- MEDIUM: log: use log_format for mode tcplog
- MEDIUM: log-format: backend source address %Bi %Bp
- BUG/MINOR: log-format: fix %o flag
- BUG/MEDIUM: bad length in log_format and __send_log
- MINOR: logformat %st is signed
- BUILD/MINOR: fix the source URL in the spec file
- DOC: acl is http_first_req, not http_req_first
- BUG/MEDIUM: don't trim last spaces from headers consisting only of spaces
- MINOR: acl: add new matches for header/path/url length
- BUILD: halog: make halog build on solaris
- BUG/MINOR: don't use a wrong port when connecting to a server with mapped ports
- MINOR: remove the client/server side distinction in SI addresses
- MINOR: halog: add support for matching queued requests
- DOC: indicate that cookie "prefix" and "indirect" should not be mixed
- OPTIM/MINOR: move struct sockaddr_storage to the tail of structs
- OPTIM/MINOR: make it possible to change pipe size (tune.pipesize)
- BUILD/MINOR: silent a build warning in src/pipe.c (fcntl)
- OPTIM/MINOR: move the hdr_idx pools out of the proxy struct
- MEDIUM: tune.http.maxhdr makes it possible to configure the maximum number of HTTP headers
- BUG/MINOR: fix a segfault when parsing a config with undeclared peers
- CLEANUP: rename possibly confusing struct field "tracked"
- BUG/MEDIUM: checks: fix slowstart behaviour when server tracking is in use
- MINOR: config: tolerate server "cookie" setting in non-HTTP mode
- MEDIUM: buffers: add some new primitives and rework existing ones
- BUG: buffers: don't return a negative value on buffer_total_space_res()
- MINOR: buffers: make buffer_pointer() support negative pointers too
- CLEANUP: kill buffer_replace() and use an inline instead
- BUG: tcp: option nolinger does not work on backends
- CLEANUP: ebtree: remove a few annoying signedness warnings
- CLEANUP: ebtree: clarify licence and update to 6.0.6
- CLEANUP: ebtree: remove 4-year old harmless typo in duplicates insertion code
- CLEANUP: ebtree: remove another typo, a wrong initialization in insertion code
- BUG: ebtree: ebst_lookup() could return the wrong entry
- OPTIM: stream_sock: reduce the amount of in-flight spliced data
- OPTIM: stream_sock: save a failed recv syscall when splice returns EAGAIN
- MINOR: acl: add support for TLS server name matching using SNI
- BUG: http: re-enable TCP quick-ack upon incomplete HTTP requests
- BUG: proto_tcp: don't try to bind to a foreign address if sin_family is unknown
- MINOR: pattern: export the global temporary pattern
- CLEANUP: patterns: get rid of pattern_data_setstring()
- MEDIUM: acl: use temp_pattern to store fetched information in the "method" match
- MINOR: acl: include pattern.h to make pattern migration more transparent
- MEDIUM: pattern: change the pattern data integer from unsigned to signed
- MEDIUM: acl: use temp_pattern to store any integer-type information
- MEDIUM: acl: use temp_pattern to store any address-type information
- CLEANUP: acl: integer part of acl_test is not used anymore
- MEDIUM: acl: use temp_pattern to store any string-type information
- CLEANUP: acl: remove last data fields from the acl_test struct
- MEDIUM: http: replace get_ip_from_hdr2() with http_get_hdr()
- MEDIUM: patterns: the hdr() pattern is now of type string
- DOC: add minimal documentation on how ACLs work internally
- DOC: add a coding-style file
- OPTIM: halog: keep a fast path for the lines-count only
- CLEANUP: silence a warning when building on sparc
- BUG: http: tighten the list of allowed characters in a URI
- MEDIUM: http: block non-ASCII characters in URIs by default
- DOC: add some documentation from RFC3986 about URI format
- BUG/MINOR: cli: correctly remove the whole table on "clear table"
- BUG/MEDIUM: correctly disable servers tracking another disabled servers.
- BUG/MEDIUM: zero-weight servers must not dequeue requests from the backend
- MINOR: halog: add some help on the command line
- BUILD: fix build error on FreeBSD
- BUG: fix double free in peers config error path
- MEDIUM: improve config check return codes
- BUILD: make it possible to look for pcre in the default system paths
- MINOR: config: emit a warning when 'default_backend' masks servers
- MINOR: backend: rework the LC definition to support other connection-based algos
- MEDIUM: backend: add the 'first' balancing algorithm
- BUG: fix httplog trailing LF
- MEDIUM: increase chunk-size limit to 2GB-1
- BUG: queue: fix dequeueing sequence on HTTP keep-alive sessions
- BUG: http: disable TCP delayed ACKs when forwarding content-length data
- BUG: checks: fix server maintenance exit sequence
- BUG/MINOR: stream_sock: don't remove BF_EXPECT_MORE and BF_SEND_DONTWAIT on partial writes
- DOC: enumerate valid status codes for "observe layer7"
- MINOR: buffer: switch a number of buffer args to const
- CLEANUP: silence signedness warning in acl.c
- BUG: stream_sock: si->release was not called upon shutw()
- MINOR: log: use "%ts" to log term status only and "%tsc" to log with cookie
- BUG/CRITICAL: log: fix risk of crash in development snapshot
- BUG/MAJOR: possible crash when using capture headers on TCP frontends
- MINOR: config: disable header captures in TCP mode and complain
parse_logformat_string: parse the string, detect the type: text,
separator or variable
parse_logformat_var: dectect variable name
parse_logformat_var_args: parse arguments and flags
add_to_logformat_list: add to the logformat linked list
When checking a configuration file using "-c -f xxx", sometimes it is
reported that a config is valid while it will later fail (eg: no enabled
listener). Instead, let's improve the return values :
- return 0 if config is 100% OK
- return 1 if config has errors
- return 2 if config is OK but no listener nor peer is enabled
This patch settles the 2 loggers limitation.
Loggers are now stored in linked lists.
Using "global log", the global loggers list content is added at the end
of the current proxy list. Each "log" entries are added at the end of
the proxy list.
"no log" flush a logger list.
Ludovic Levesque reported and diagnosed an annoying bug. When a server is
configured to track another one and has a slowstart interval set, it's
assigned a minimal weight when the tracked server goes back up but keeps
this weight forever.
This is because the throttling during the warmup phase is only computed
in the health checking function.
After several attempts to resolve the issue, the only real solution is to
split the check processing task in two tasks, one for the checks and one
for the warmup. Each server with a slowstart setting has a warmum task
which is responsible for updating the server's weight after a down to up
transition. The task does not run in othe situations.
In the end, the fix is neither complex nor long and should be backported
to 1.4 since the issue was detected there first.
It makes no sense to have one pointer to the hdr_idx pool in each proxy
struct since these pools do not depend on the proxy. Let's have a common
pool instead as it is already the case for other types.
Passing -C <dir> causes haproxy to chdir to <dir> before loading
any file. The argument may be passed anywhere on the command line.
A typical use case is :
$ haproxy -C /etc/haproxy -f global.cfg -f haproxy.cfg
The way the unix socket is initialized is awkward. Some of the settings are put
in the sockets itself, other ones in the backend. And more importantly the
global.maxsock value is adjusted so that the stats socket evades the global
maxconn value. This complexifies maxsock computations for nothing, since the
stats socket is not supposed to receive hundreds of concurrent connections when
the global maxconn is very low. What is needed however is to ensure that there
are always connections left for the stats socket even when traffic sockets are
saturated, but this guarantee is not offered anymore by current code.
So as of now, the stats socket is subject to the global maxconn limitation just
as any other socket until a reservation mechanism is implemented.
This global task is used to periodically check for end of resource shortage
and to try to enable queued listeners again. This is important in case some
temporary system-wide shortage is encountered, so that we don't have to wait
for an existing connection to be released before checking the queue again.
For situations where listeners are queued due to the global maxconn being
reached, the task is woken up at least every second. For situations where
a system resource shortage is detected (memory, sockets, ...) the task is
woken up at least every 100 ms. That way, recovery from severe events can
still be achieved under acceptable conditions.
This function is finally not needed anymore, as it has been replaced with
a per-proxy task that is scheduled when some limits are encountered on
incoming connections or when the process is stopping. The savings should
be noticeable on configs with a large number of proxies. The most important
point is that the rate limiting is now enforced in a clean and solid way.
When an accept() fails because of a connection limit or a memory shortage,
we now disable it and queue it so that it's dequeued only when a connection
is released. This has improved the behaviour of the process near the fd limit
as now a listener with a no connection (eg: stats) will not loop forever
trying to get its connection accepted.
The solution is still not 100% perfect, as we'd like to have this used when
proxy limits are reached (use a per-proxy list) and for safety, we'd need
to have dedicated tasks to periodically re-enable them (eg: to overcome
temporary system-wide resource limitations when no connection is released).
Managing listeners state is difficult because they have their own state
and can at the same time have theirs dictated by their proxy. The pause
is not done properly, as the proxy code is fiddling with sockets. By
introducing new functions such as pause_listener()/resume_listener(), we
make it a bit more obvious how/when they're supposed to be used. The
listen_proxies() function was also renamed to resume_proxies() since
it's only used for pause/resume.
This patch is the first in a series aiming at getting rid of the maintain_proxies
mess. In the end, proxies should not call enable_listener()/disable_listener()
anymore.
By default on a single process, we accept 100 connections at once. This is too
much on recent CPUs where the cache is constantly thrashing, because we visit
all those connections several times. We should batch the processing slightly
less so that all the accepted session may remain in cache during their initial
processing.
Lowering the batch size from 100 to 32 has changed the connection rate for
concurrencies between 5-10k from 67 kcps to 94 kcps on a Core i5 660 (4M L3),
and forward rates from 30k to 39.5k.
Tests on this hardware show that values between 10 and 30 seem to do the job fine.
The motivation for this is that when soft-restart is merged
it will be come more important to free all relevant memory in deinit()
Discovered using valgrind.
The motivation for this is that when soft-restart is merged
it will be come more important to free all relevant memory in deinit()
Discovered using valgrind.
The motivation for this is that when soft-restart is merged
it will be come more important to free all relevant memory in deinit()
Discovered using valgrind.
The motivation for this is that when soft-restart is merged
it will be come more important to free all relevant memory in deinit()
Discovered using valgrind.
And also rename "req_acl_rule" "http_req_rule". At the beginning that
was a bit confusing to me, especially the "req_acl" list which in fact
holds what we call rules. After some digging, it appeared that some
part of the code is 100% HTTP and not just related to authentication
anymore, so let's move that part to HTTP and keep the auth-only code
in auth.c.
It's very annoying that frontend and backend stats are merged because we
don't know what we're observing. For instance, if a "listen" instance
makes use of a distinct backend, it's impossible to know what the bytes_out
means.
Some points take care of not updating counters twice if the backend points
to the frontend, indicating a "listen" instance. The thing becomes more
complex when we try to add support for server side keep-alive, because we
have to maintain a pointer to the backend used for last request, and to
update its stats. But we can't perform such comparisons anymore because
the counters will not match anymore.
So in order to get rid of this situation, let's have both frontend AND
backend stats in the "struct proxy". We simply update the relevant ones
during activity. Some of them are only accounted for in the backend,
while others are just for frontend. Maybe we can improve a bit on that
later, but the essential part is that those counters now reflect what
they really mean.
As reported by the Loadbalancer.org team, it was not possible to bind
more than 1024 ports. This is because the process' limits were set after
trying to bind the sockets, which defeats their purpose.
This fix must be backported to 1.4 and 1.3.
One of the requirements we have is to run multiple instances of haproxy on a
single host; this is so that we can split the responsibilities (and change
permissions) between product teams. An issue we ran up against is how we
would distinguish between the logs generated by each instance. The solution
we came up with (please let me know if there is a better way) is to override
the application tag written to syslog. We can then configure syslog to write
these to different files.
I have attached a patch adding a global option 'log-tag' to override the
default syslog tag 'haproxy' (actually defaults to argv[0]).
Haproxy does not include the hostname rather the IP of the machine in
the syslog headers it sends. Unfortunately this means that for each log
line rsyslog does a reverse dns on the client IP and in the case of
non-routable IPs one gets the public hostname not the internal one.
While this is valid according to RFC3164 as one might imagine this is
troublsome if you have some machines with public IPs, internal IPs, no
reverse DNS entries, etc and you want a standardized hostname based log
directory structure. The rfc says the preferred value is the hostname.
This patch adds a global "log-send-hostname" statement which accepts an
optional string to force the host name. If unset, the local host name
is used.
HTTP content-based health checks will be involved in searching text in pages.
Some pages may not fit in the default buffer (16kB) and sometimes it might be
desired to have larger buffers in order to find patterns. Running checks on
smaller URIs is always preferred of course.
(cherry picked from commit 043f44aeb835f3d0b57626c4276581a73600b6b1)
In deinit(), it is possible that we first free the listeners, then
unbind them all. Right now this situation can't happen because the
only way to call deinit() is to pass via a soft-stop which will
already unbind all protocols. But later this might become a problem.
This counter is incremented for each incoming connection and each active
listener, and is used to prevent haproxy from stopping upon SIGUSR1. It
will thus be possible for some tasks in increment this counter in order
to prevent haproxy from dying until they have completed their job.
Signal zero is never delivered by the system. However having a signal to
which functions and tasks can subscribe to be notified of a stopping event
is useful. So this patch does two things :
1) allow signal zero to be delivered from any function of signal handler
2) make soft_stop() deliver this signal so that tasks can be notified of
a stopping condition.
The two new functions below make it possible to register any number
of functions or tasks to a system signal. They will be called in the
registration order when the signal is received.
struct sig_handler *signal_register_fct(int sig, void (*fct)(struct sig_handler *), int arg);
struct sig_handler *signal_register_task(int sig, struct task *task, int reason);
In case of binding failure during startup, we wait for some time sending
signals to old pids so that they release the ports we need. But if there
aren't any old pids anymore, it's useless to wait, we prefer to fail fast.
Along with this change, we now have the number of old pids really found
in the nb_oldpids variable.
The 'client.c' file now only contained frontend-specific functions,
so it has naturally be renamed 'frontend.c'. Same for client.h. This
has also been an opportunity to remove some cross references from
files that should not have depended on it.
In the end, this file should contain a protocol-agnostic accept()
code, which would initialize a session, task, etc... based on an
accept() from a lower layer. Right now there are still references
to TCP.
Apparently some systems define MSG_NOSIGNAL but do not necessarily
check it (or maybe binaries are built somewhere and used on older
versions). There were reports of very recent FreeBSD setups causing
SIGPIPEs, while older ones catch the signal. Recent FreeBSD manpages
indeed define MSG_NOSIGNAL.
So let's now unconditionnaly catch the signal. It's useless not to do
it for the rare cases where it's not needed (linux 2.4 and below).
We are seeing both real servers repeatedly going on- and off-line with
a period of tens of seconds. Packet tracing, stracing, and adding
debug code to HAProxy itself has revealed that the real servers are
always responding correctly, but HAProxy is sometimes receiving only
part of the response.
It appears that the real servers are sending the test page as three
separate packets. HAProxy receives the contents of one, two, or three
packets, apparently randomly. Naturally, the health check only
succeeds when all three packets' data are seen by HAProxy. If HAProxy
and the real servers are modified to use a plain HTML page for the
health check, the response is in the form of a single packet and the
checks do not fail.
(...)
I've added buffer and length variables to struct server, and allocated
space with the rest of the server initialisation.
(...)
It seems to be working fine in my tests, and handles check responses
that are bigger than the buffer.
Marcello Gorlani reported that at least on FreeBSD, a long hostname
was reported with garbage on the stats page. POSIX does not make it
mandatory for gethostname() to NULL-terminate the string in case of
truncation, and at least FreeBSD appears not to do it. So let's
force null-termination to keep safe.
The bounce realign function was algorithmically good but as expected
it was not cache-friendly. Using it with large requests caused so many
cache thrashing that the function itself could drain 70% of the total
CPU time for only 0.5% of the calls !
Revert back to a standard memcpy() using a specially allocated swap
buffer. We're now back to 2M req/s on pipelined requests.
Thich patch fixes cfgparser not to leak memory on each
default server statement and adds several missing free
calls in deinit():
- free(l->name)
- free(l->counters)
- free(p->desc);
- free(p->fwdfor_hdr_name);
None of them are critical, hopefully.
Released version 1.4-rc1 with the following main changes :
- [MEDIUM] add a maintenance mode to servers
- [MINOR] http-auth: last fix was wrong
- [CONTRIB] add base64rev-gen.c that was used to generate the base64rev table.
- [MINOR] Base64 decode
- [MINOR] generic auth support with groups and encrypted passwords
- [MINOR] add ACL_TEST_F_NULL_MATCH
- [MINOR] http-request: allow/deny/auth support for frontend/backend/listen
- [MINOR] acl: add http_auth and http_auth_group
- [MAJOR] use the new auth framework for http stats
- [DOC] add info about userlists, http-request and http_auth/http_auth_group acls
- [STATS] make it possible to change a CLI connection timeout
- [BUG] patterns: copy-paste typo in type conversion arguments
- [MINOR] pattern: make the converter more flexible by supporting void* and int args
- [MINOR] standard: str2mask: string to netmask converter
- [MINOR] pattern: add support for argument parsers for converters
- [MINOR] pattern: add the "ipmask()" converting function
- [MINOR] config: off-by-one in "stick-table" after list of converters
- [CLEANUP] acl, patterns: make use of my_strndup() instead of malloc+memcpy
- [BUG] restore accidentely removed line in last patch !
- [MINOR] checks: make the HTTP check code add the CRLF itself
- [MINOR] checks: add the server's status in the checks
- [BUILD] halog: make without arch-specific optimizations
- [BUG] halog: fix segfault in case of empty log in PCT mode (cherry picked from commit fe362fe476)
- [MINOR] http: disable keep-alive when process is going down
- [MINOR] acl: add build_acl_cond() to make it easier to add ACLs in config
- [CLEANUP] config: use build_acl_cond() instead of parse_acl_cond()
- [CLEANUP] config: use warnif_cond_requires_resp() to check for bad ACLs
- [MINOR] prepare req_*/rsp_* to receive a condition
- [CLEANUP] config: specify correct const char types to warnif_* functions
- [MEDIUM] config: factor out the parsing of 20 req*/rsp* keywords
- [MEDIUM] http: make the request filter loop check for optional conditions
- [MEDIUM] http: add support for conditional request filter execution
- [DOC] add some build info about the AIX platform (cherry picked from commit e41914c77e)
- [MEDIUM] http: add support for conditional request header addition
- [MEDIUM] http: add support for conditional response header rewriting
- [DOC] add some missing ACLs about response header matching
- [MEDIUM] http: add support for proxy authentication
- [MINOR] http-auth: make the 'unless' keyword work as expected
- [CLEANUP] config: use build_acl_cond() to simplify http-request ACL parsing
- [MEDIUM] add support for anonymous ACLs
- [MEDIUM] http: switch to tunnel mode after status 101 responses
- [MEDIUM] http: stricter processing of the CONNECT method
- [BUG] config: reset check request to avoid double free when switching to ssl/sql
- [MINOR] config: fix too large ssl-hello-check message.
- [BUG] fix error response in case of server error
Support the new syntax (http-request allow/deny/auth) in
http stats.
Now it is possible to use the same syntax is the same like in
the frontend/backend http-request access control:
acl src_nagios src 192.168.66.66
acl stats_auth_ok http_auth(L1)
stats http-request allow if src_nagios
stats http-request allow if stats_auth_ok
stats http-request auth realm LB
The old syntax is still supported, but now it is emulated
via private acls and an aditional userlist.
Add generic authentication & authorization support.
Groups are implemented as bitmaps so the count is limited to
sizeof(int)*8 == 32.
Encrypted passwords are supported with libcrypt and crypt(3), so it is
possible to use any method supported by your system. For example modern
Linux/glibc instalations support MD5/SHA-256/SHA-512 and of course classic,
DES-based encryption.