Unix permissions are per-bind configuration line and not per listener,
so let's concretize this in the way the config is stored. This avoids
some unneeded loops to set permissions on all listeners.
The access level is not part of the unix perms so it has been moved
away. Once we can use str2listener() to set all listener addresses,
we'll have a bind keyword parser for this one.
Navigating through listeners was very inconvenient and error-prone. Not to
mention that listeners were linked in reverse order and reversed afterwards.
To get rid of these issues for good, we now do the following :
- frontends have a doubly-linked list of bind_conf
- frontends have a doubly-linked list of listeners
- bind_conf have a doubly-linked list of listeners
- listeners have a pointer to their bind_conf
This way we can now navigate from anywhere to anywhere and always find the
proper bind_conf for a given listener, as well as find the list of listeners
for a current bind_conf.
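The relationships listed above can be sketched as follows. This is only a minimal illustration: the list helpers, the struct field names (by_fe, by_bind, binds) and the attach functions are assumptions made for the example, not the actual definitions used in the code.

```c
#include <stddef.h>

/* Minimal circular doubly-linked list head/element (illustrative). */
struct list {
	struct list *n; /* next */
	struct list *p; /* previous */
};

/* Recover the enclosing struct from a pointer to its list member. */
#define LIST_ELEM(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list *l)
{
	l->n = l->p = l;
}

/* Append <el> at the tail of <head>, preserving declaration order. */
static void list_append(struct list *head, struct list *el)
{
	el->p = head->p;
	el->n = head;
	head->p->n = el;
	head->p = el;
}

struct frontend {
	struct list binds;     /* every bind_conf of this frontend */
	struct list listeners; /* every listener of this frontend */
};

struct bind_conf {
	struct list by_fe;     /* element of frontend.binds */
	struct list listeners; /* listeners created from this bind line */
};

struct listener {
	struct list by_fe;           /* element of frontend.listeners */
	struct list by_bind;         /* element of bind_conf.listeners */
	struct bind_conf *bind_conf; /* back-pointer to the owning bind line */
};

static void frontend_init(struct frontend *fe)
{
	list_init(&fe->binds);
	list_init(&fe->listeners);
}

static void bind_conf_attach(struct frontend *fe, struct bind_conf *bc)
{
	list_init(&bc->listeners);
	list_append(&fe->binds, &bc->by_fe);
}

static void listener_attach(struct frontend *fe, struct bind_conf *bc,
                            struct listener *l)
{
	l->bind_conf = bc;
	list_append(&fe->listeners, &l->by_fe);
	list_append(&bc->listeners, &l->by_bind);
}

/* Walk the listeners of one bind line without touching the frontend. */
static int bind_conf_count_listeners(const struct bind_conf *bc)
{
	const struct list *it;
	int count = 0;

	for (it = bc->listeners.n; it != &bc->listeners; it = it->n)
		count++;
	return count;
}
```

Since each listener is chained twice and carries a back-pointer, any of the three objects can be reached from any other without a reverse lookup.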
Some settings need to be merged per-bind config line and are not necessarily
SSL-specific. It becomes quite inconvenient to have this ssl_conf SSL-specific,
so let's replace it with something more generic.
Since it's common enough to discover that some config options are not
supported due to some openssl version or build options, we report the
relevant ones in "haproxy -vv".
A side effect of this change is that the "ssl" keyword on "bind" lines is now
just a boolean and that "crt" is needed to designate certificate files or
directories.
Note that much refcounting was needed to have the free() work correctly due to
the number of cert aliases which can make a context be shared by multiple names.
SSL config holds many parameters which are per bind line and not per
listener. Let's use a per-bind line config instead of having it
replicated for each listener.
At the moment we only do this for the SSL part but this should probably
evolve to handle more of the configuration and maybe even the state per
bind line.
SSL connections take a huge amount of memory, and unfortunately openssl
does not check malloc() returns and easily segfaults when too many
connections are used.
The only solution against this is to provide a global maxsslconn setting
to reject SSL connections above the limit in order to avoid reaching
unsafe limits.
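The principle of such a limit can be sketched as a simple counter check before an SSL connection is set up. The struct, the function names and the convention that 0 means "no limit" are assumptions made for this example only.

```c
/* Hypothetical global counters, for illustration only. */
struct ssl_limits {
	int maxsslconn; /* configured ceiling, 0 = no limit (assumption) */
	int sslconns;   /* SSL connections currently established */
};

/* Returns 1 and accounts for the connection if it may proceed,
 * 0 if it must be rejected because the limit is reached.
 */
static int ssl_conn_try_acquire(struct ssl_limits *g)
{
	if (g->maxsslconn && g->sslconns >= g->maxsslconn)
		return 0;
	g->sslconns++;
	return 1;
}

/* Must be called once for each successfully acquired connection. */
static void ssl_conn_release(struct ssl_limits *g)
{
	if (g->sslconns > 0)
		g->sslconns--;
}
```

Rejecting early, before any SSL context is allocated, is what keeps the library from ever being asked for memory that may not be available.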
Thomas Heil reported that when using nbproc > 1, his pidfiles were
regularly truncated. The issue could be tracked down to the presence
of a call to lseek(pidfile, 0, SEEK_SET) just before the close() call
in the children, resulting in the file being truncated by the children
while the parent was feeding it. This unexpected lseek() is transparently
performed by fclose().
Since there is no way to have the file automatically closed during the
fork, the only solution is to bypass the libc and use open/write/close
instead of fprintf() and fclose().
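A minimal sketch of the libc-free approach is shown below. The helper names are made up for the example, and the open flags and file mode are assumptions; the point is only that a plain file descriptor can be closed in the children without any implicit lseek().

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Open (and truncate) the pidfile with a raw descriptor, no FILE stream. */
static int pidfile_open(const char *path)
{
	return open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
}

/* Append one "<pid>\n" line using write() only.
 * Returns 0 on success, -1 on formatting or write error.
 */
static int pidfile_append(int fd, pid_t pid)
{
	char buf[32];
	int len = snprintf(buf, sizeof(buf), "%ld\n", (long)pid);

	if (len < 0 || (size_t)len >= sizeof(buf))
		return -1;
	return write(fd, buf, (size_t)len) == (ssize_t)len ? 0 : -1;
}
```

The parent would call pidfile_append() after each fork() and the children would simply close(fd), which has no effect on the shared file offset.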
The issue was observed on eglibc 2.15.
This is a massive rename of most functions which should make use of the
word "channel" instead of the word "buffer" in their names.
It concerns the following ones (new names) :
unsigned long long channel_forward(struct channel *buf, unsigned long long bytes);
static inline void channel_init(struct channel *buf)
static inline int channel_input_closed(struct channel *buf)
static inline int channel_output_closed(struct channel *buf)
static inline void channel_check_timeouts(struct channel *b)
static inline void channel_erase(struct channel *buf)
static inline void channel_shutr_now(struct channel *buf)
static inline void channel_shutw_now(struct channel *buf)
static inline void channel_abort(struct channel *buf)
static inline void channel_stop_hijacker(struct channel *buf)
static inline void channel_auto_connect(struct channel *buf)
static inline void channel_dont_connect(struct channel *buf)
static inline void channel_auto_close(struct channel *buf)
static inline void channel_dont_close(struct channel *buf)
static inline void channel_auto_read(struct channel *buf)
static inline void channel_dont_read(struct channel *buf)
Some functions provided by channel.[ch] have kept their "buffer" name because
they are really designed to act on the buffer according to some information
gathered from the channel. They have been moved together to the same place in
the file for better readability but they were not changed at all.
The "buffer" memory pool was also renamed "channel".
The "raw_sock" prefix will be more convenient for naming functions as
it will be prefixed with the data layer and suffixed with the data
direction. So let's rename the files now to avoid any further confusion.
The #include directive was also removed from a number of files which do
not need it anymore.
In an attempt to get rid of fdtab[].state, and to move the relevant
parts to the connection struct, we remove the FD_STCLOSE state which
can easily be deduced from the <owner> pointer as there is a 1:1 match.
When passing arguments to ACLs and samples, some types are stored as
strings then resolved later after config parsing is done. Upon exit,
the arguments need to be freed only if the string was not resolved
yet. At the moment we can encounter double free during deinit()
because some arguments (eg: userlists) are freed once as their own
type and once as a string.
The solution consists in adding an "unresolved" flag to the args to
say whether the value is still held in the <str> part or is final.
This could be debugged thanks to a useful bug report from Sander Klein.
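The idea of the "unresolved" flag can be sketched like this. The types are heavily simplified and the function names are hypothetical; only ARGT_STOP and the <str> member come from the description above.

```c
#include <stdlib.h>

enum arg_type { ARGT_STOP = 0, ARGT_USR };

struct userlist { const char *name; };

struct arg {
	enum arg_type type;
	int unresolved;            /* 1: the value is still held in data.str */
	union {
		char *str;             /* name as written in the config */
		struct userlist *usr;  /* final value once resolved */
	} data;
};

/* Replace the temporary string with the final object. The string is
 * freed here, exactly once, and the flag records that it is gone.
 * Returns 0 on success, -1 if the argument was already resolved.
 */
static int arg_resolve_userlist(struct arg *a, struct userlist *ul)
{
	if (!a->unresolved)
		return -1;
	free(a->data.str);
	a->data.usr = ul;
	a->unresolved = 0;
	return 0;
}

/* On exit, only free what is still a string. Resolved arguments are
 * released elsewhere as their own type. Returns the number of strings freed.
 */
static int args_release(struct arg *args)
{
	int freed = 0;

	for (; args->type != ARGT_STOP; args++) {
		if (args->unresolved) {
			free(args->data.str);
			args->data.str = NULL;
			args->unresolved = 0;
			freed++;
		}
	}
	return freed;
}
```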
Option httplog needs to be checked only once the proxy has been validated,
so that its final mode (tcp/http) can be used. Also we need to check for
httplog before checking the log format, so that we can report a warning
about this specific option and not about the format it implies.
Before it was possible to resize the buffers using global.tune.bufsize,
the trash has always been the size of a buffer by design. Unfortunately,
the recent buffer sizing at runtime forgot to adjust the trash, resulting
in it being too short for content rewriting if buffers were enlarged from
the default value.
The bug was encountered in 1.4 so the fix must be backported there.
We'll soon have an SSL socket layer, and in order to make it easier to
tell the two apart, we use the name "sock_raw" to designate the one which
directly talks to the sockets without any conversion.
From time to time, some bugs are discovered that are caused by non-initialized
memory areas. It happens that most platforms return a zero-filled area upon
first malloc() thus hiding potential bugs. This patch also replaces malloc()
in pools with calloc() to ensure that all platforms exhibit the same behaviour
upon startup. In order to catch these bugs more easily, add a -dM command line
flag to enable memory poisoning. Optionally, passing -dM<byte> forces the
poisoning byte to <byte>.
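A rough sketch of what poisoning means for an allocator is given below. The function and variable names are invented for the example, and the default byte is passed in by the caller instead of being hard-coded, since the description above does not state it.

```c
#include <stdlib.h>
#include <string.h>

/* Poison byte in use, or a negative value when poisoning is disabled. */
static int mem_poison_byte = -1;

/* Parse "-dM" or "-dM<byte>". Returns the byte to use (0..255),
 * <dflt> when no byte is given, or -1 if <arg> is not this flag.
 */
static int parse_poison_flag(const char *arg, int dflt)
{
	if (strncmp(arg, "-dM", 3) != 0)
		return -1;
	if (arg[3] == '\0')
		return dflt & 0xff;
	return (int)(strtol(arg + 3, NULL, 0) & 0xff);
}

/* Allocate <size> bytes: zero-filled by default so that all platforms
 * behave the same, or filled with the poison byte to expose code that
 * relies on uninitialized memory being zero.
 */
static void *pool_alloc_sketch(size_t size)
{
	void *p;

	if (mem_poison_byte < 0)
		return calloc(1, size);
	p = malloc(size);
	if (p)
		memset(p, mem_poison_byte, size);
	return p;
}
```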
This is mainly a massive renaming in the code to get it in line with the
calling convention. Next patch will rename a few files to complete this
operation.
arg_i was almost unused, and since we migrated to use struct arg everywhere,
the rare cases where arg_i was needed could be replaced by switching to
arg->type = ARGT_STOP.
There were a few unchecked write() calls in the debug code that cause
gcc 4.x to emit warnings on recent libc. We don't want to check them
as we can't do anything with the result, so let's simply surround them
with an empty if statement.
Note that one of the warnings was for chdir("/") which normally cannot
fail since it follows a successful chroot (which means the perms are
necessarily there). Anyway let's move the call up to protect it too.
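The empty-if pattern looks roughly like this. The function name is hypothetical; the only purpose is to show that the return value is consumed so the compiler stops complaining about it being ignored.

```c
#include <string.h>
#include <unistd.h>

/* Emit a debug message on <fd>. A failure cannot be handled in any
 * useful way here, so the result is consumed by an empty statement
 * only to satisfy the warn_unused_result check.
 */
static void debug_emit(int fd, const char *msg)
{
	if (write(fd, msg, strlen(msg)) < 0) {
		/* nothing to do, this is best-effort output */
	}
}
```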
%Fi: Frontend IP
%Fp: Frontend Port
%Si: Server IP
%Sp: Server Port
%Ts: Timestamp
%rt: HTTP request counter
%H: hostname
%pid: PID
+X: Hexadecimal representation
The +X mode in logformat displays hexadecimal for the following flags :
%Ci %Cp %Fi %Fp %Bi %Bp %Si %Sp %Ts %ct %pid
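As an illustration of what a hexadecimal rendering of an address and a port may look like, here is a small sketch. The exact output layout (8 upper-case digits for an IPv4 address, 4 for a port) and the function names are assumptions for the example, not a description of the actual encoder.

```c
#include <stdint.h>
#include <stdio.h>

/* Write the 4 bytes of an IPv4 address as 8 hex digits.
 * Returns the number of characters written, or -1 if <size> is too small.
 */
static int ipv4_to_hex_sketch(char *dst, size_t size, const uint8_t ip[4])
{
	int n = snprintf(dst, size, "%02X%02X%02X%02X",
	                 ip[0], ip[1], ip[2], ip[3]);

	return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Write a port as 4 hex digits, same return convention as above. */
static int port_to_hex_sketch(char *dst, size_t size, uint16_t port)
{
	int n = snprintf(dst, size, "%04X", (unsigned int)port);

	return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```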
rename logformat_write_string() to lf_text()
Optimize size computation
Sometimes it is desirable to forward a particular request to a specific
server without having to declare a dedicated backend for this server. This
can be achieved using the "use-server" rules. These rules are evaluated after
the "redirect" rules and before evaluating cookies, and they take precedence
over them. There may be as many "use-server" rules as desired. All of these
rules are evaluated in their declaration order, and the first one which
matches will assign the server.
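The first-match semantics can be sketched as a simple ordered scan. Everything here is hypothetical (the rule struct, the condition callbacks and the server names); real rules are expressed with ACL conditions in the configuration.

```c
#include <stddef.h>
#include <string.h>

typedef int (*rule_cond)(const char *path);

/* One rule: a condition and the server to use when it matches. */
struct use_server_rule {
	rule_cond cond;
	const char *server;
};

/* Two example conditions standing in for ACLs. */
static int path_is_admin(const char *path)
{
	return strncmp(path, "/admin", 6) == 0;
}

static int path_is_static(const char *path)
{
	return strncmp(path, "/static/", 8) == 0;
}

/* Evaluate the rules in declaration order and return the server of the
 * first one which matches, or NULL so that the caller falls back to the
 * normal server selection.
 */
static const char *use_server_pick(const struct use_server_rule *rules,
                                   size_t count, const char *path)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (rules[i].cond(path))
			return rules[i].server;
	}
	return NULL;
}
```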
Released version 1.5-dev8 with the following main changes :
- MINOR: patch for minor typo (ressources/resources)
- MEDIUM: http: add support for sending the server's name in the outgoing request
- DOC: mention that default checks are TCP connections
- BUG/MINOR: fix options forwardfor if-none when an alternative header name is specified
- CLEANUP: Make check_statuses, analyze_statuses and process_chk static
- CLEANUP: Fix HCHK spelling errors
- BUG/MINOR: fix typo in processing of http-send-name-header
- MEDIUM: log: Use linked lists for loggers
- BUILD: fix declaration inside a scope block
- REORG: log: split send_log function
- MINOR: config: Parse the string of the log-format config keyword
- MINOR: add ultoa, ulltoa, ltoa, lltoa implementations
- MINOR: Date and time functions that don't use snprintf
- MEDIUM: log: make http_sess_log use log_format
- DOC: log-format documentation
- MEDIUM: log: use log_format for mode tcplog
- MEDIUM: log-format: backend source address %Bi %Bp
- BUG/MINOR: log-format: fix %o flag
- BUG/MEDIUM: bad length in log_format and __send_log
- MINOR: logformat %st is signed
- BUILD/MINOR: fix the source URL in the spec file
- DOC: acl is http_first_req, not http_req_first
- BUG/MEDIUM: don't trim last spaces from headers consisting only of spaces
- MINOR: acl: add new matches for header/path/url length
- BUILD: halog: make halog build on solaris
- BUG/MINOR: don't use a wrong port when connecting to a server with mapped ports
- MINOR: remove the client/server side distinction in SI addresses
- MINOR: halog: add support for matching queued requests
- DOC: indicate that cookie "prefix" and "indirect" should not be mixed
- OPTIM/MINOR: move struct sockaddr_storage to the tail of structs
- OPTIM/MINOR: make it possible to change pipe size (tune.pipesize)
- BUILD/MINOR: silent a build warning in src/pipe.c (fcntl)
- OPTIM/MINOR: move the hdr_idx pools out of the proxy struct
- MEDIUM: tune.http.maxhdr makes it possible to configure the maximum number of HTTP headers
- BUG/MINOR: fix a segfault when parsing a config with undeclared peers
- CLEANUP: rename possibly confusing struct field "tracked"
- BUG/MEDIUM: checks: fix slowstart behaviour when server tracking is in use
- MINOR: config: tolerate server "cookie" setting in non-HTTP mode
- MEDIUM: buffers: add some new primitives and rework existing ones
- BUG: buffers: don't return a negative value on buffer_total_space_res()
- MINOR: buffers: make buffer_pointer() support negative pointers too
- CLEANUP: kill buffer_replace() and use an inline instead
- BUG: tcp: option nolinger does not work on backends
- CLEANUP: ebtree: remove a few annoying signedness warnings
- CLEANUP: ebtree: clarify licence and update to 6.0.6
- CLEANUP: ebtree: remove 4-year old harmless typo in duplicates insertion code
- CLEANUP: ebtree: remove another typo, a wrong initialization in insertion code
- BUG: ebtree: ebst_lookup() could return the wrong entry
- OPTIM: stream_sock: reduce the amount of in-flight spliced data
- OPTIM: stream_sock: save a failed recv syscall when splice returns EAGAIN
- MINOR: acl: add support for TLS server name matching using SNI
- BUG: http: re-enable TCP quick-ack upon incomplete HTTP requests
- BUG: proto_tcp: don't try to bind to a foreign address if sin_family is unknown
- MINOR: pattern: export the global temporary pattern
- CLEANUP: patterns: get rid of pattern_data_setstring()
- MEDIUM: acl: use temp_pattern to store fetched information in the "method" match
- MINOR: acl: include pattern.h to make pattern migration more transparent
- MEDIUM: pattern: change the pattern data integer from unsigned to signed
- MEDIUM: acl: use temp_pattern to store any integer-type information
- MEDIUM: acl: use temp_pattern to store any address-type information
- CLEANUP: acl: integer part of acl_test is not used anymore
- MEDIUM: acl: use temp_pattern to store any string-type information
- CLEANUP: acl: remove last data fields from the acl_test struct
- MEDIUM: http: replace get_ip_from_hdr2() with http_get_hdr()
- MEDIUM: patterns: the hdr() pattern is now of type string
- DOC: add minimal documentation on how ACLs work internally
- DOC: add a coding-style file
- OPTIM: halog: keep a fast path for the lines-count only
- CLEANUP: silence a warning when building on sparc
- BUG: http: tighten the list of allowed characters in a URI
- MEDIUM: http: block non-ASCII characters in URIs by default
- DOC: add some documentation from RFC3986 about URI format
- BUG/MINOR: cli: correctly remove the whole table on "clear table"
- BUG/MEDIUM: correctly disable servers tracking another disabled servers.
- BUG/MEDIUM: zero-weight servers must not dequeue requests from the backend
- MINOR: halog: add some help on the command line
- BUILD: fix build error on FreeBSD
- BUG: fix double free in peers config error path
- MEDIUM: improve config check return codes
- BUILD: make it possible to look for pcre in the default system paths
- MINOR: config: emit a warning when 'default_backend' masks servers
- MINOR: backend: rework the LC definition to support other connection-based algos
- MEDIUM: backend: add the 'first' balancing algorithm
- BUG: fix httplog trailing LF
- MEDIUM: increase chunk-size limit to 2GB-1
- BUG: queue: fix dequeueing sequence on HTTP keep-alive sessions
- BUG: http: disable TCP delayed ACKs when forwarding content-length data
- BUG: checks: fix server maintenance exit sequence
- BUG/MINOR: stream_sock: don't remove BF_EXPECT_MORE and BF_SEND_DONTWAIT on partial writes
- DOC: enumerate valid status codes for "observe layer7"
- MINOR: buffer: switch a number of buffer args to const
- CLEANUP: silence signedness warning in acl.c
- BUG: stream_sock: si->release was not called upon shutw()
- MINOR: log: use "%ts" to log term status only and "%tsc" to log with cookie
- BUG/CRITICAL: log: fix risk of crash in development snapshot
- BUG/MAJOR: possible crash when using capture headers on TCP frontends
- MINOR: config: disable header captures in TCP mode and complain
parse_logformat_string: parse the string, detect the type: text,
separator or variable
parse_logformat_var: detect variable name
parse_logformat_var_args: parse arguments and flags
add_to_logformat_list: add to the logformat linked list
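To give an idea of the first step (splitting the string into text, separators and variables), here is a simplified tokenizer. It is not the parser described above: it treats a space as the only separator, a variable as '%' followed by alphanumeric characters, and ignores arguments and flags entirely. All names are made up for the example.

```c
#include <ctype.h>

enum lf_type { LF_TEXT, LF_SEPARATOR, LF_VARIABLE };

struct lf_token {
	enum lf_type type;
	const char *start; /* first character of the token in the input */
	int len;           /* token length in characters */
};

/* Does <s> point to the beginning of a "%name" variable ? */
static int lf_is_var_start(const char *s)
{
	return s[0] == '%' && isalpha((unsigned char)s[1]);
}

/* Split <s> into at most <max> tokens stored in <out>.
 * Returns the number of tokens produced.
 */
static int lf_tokenize(const char *s, struct lf_token *out, int max)
{
	int n = 0;

	while (*s && n < max) {
		const char *begin = s;
		enum lf_type type;

		if (lf_is_var_start(s)) {
			type = LF_VARIABLE;
			s++; /* skip '%' */
			while (isalnum((unsigned char)*s))
				s++;
		} else if (*s == ' ') {
			type = LF_SEPARATOR;
			while (*s == ' ')
				s++;
		} else {
			type = LF_TEXT;
			s++; /* always consume at least one character */
			while (*s && *s != ' ' && !lf_is_var_start(s))
				s++;
		}
		out[n].type = type;
		out[n].start = begin;
		out[n].len = (int)(s - begin);
		n++;
	}
	return n;
}
```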
When checking a configuration file using "-c -f xxx", sometimes it is
reported that a config is valid while it will later fail (eg: no enabled
listener). Instead, let's improve the return values :
- return 0 if config is 100% OK
- return 1 if config has errors
- return 2 if config is OK but no listener nor peer is enabled
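The mapping above boils down to the following sketch. The function and parameter names are hypothetical; only the three return values come from the description.

```c
/* Compute the exit status of a configuration check.
 *   0 : config is 100% OK
 *   1 : config has errors
 *   2 : config is OK but no listener nor peer is enabled
 */
static int config_check_status(int errors, int enabled_listeners,
                               int enabled_peers)
{
	if (errors)
		return 1;
	if (!enabled_listeners && !enabled_peers)
		return 2;
	return 0;
}
```

A wrapper script can then tell a config that is broken apart from one that is valid but would not accept any traffic.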
This patch removes the limitation to 2 loggers.
Loggers are now stored in linked lists.
Using "global log", the global loggers list content is added at the end
of the current proxy list. Each "log" entry is added at the end of
the proxy list.
"no log" flushes a logger list.
Ludovic Levesque reported and diagnosed an annoying bug. When a server is
configured to track another one and has a slowstart interval set, it's
assigned a minimal weight when the tracked server goes back up but keeps
this weight forever.
This is because the throttling during the warmup phase is only computed
in the health checking function.
After several attempts to resolve the issue, the only real solution is to
split the check processing task in two tasks, one for the checks and one
for the warmup. Each server with a slowstart setting has a warmup task
which is responsible for updating the server's weight after a down to up
transition. The task does not run in other situations.
In the end, the fix is neither complex nor long and should be backported
to 1.4 since the issue was detected there first.
It makes no sense to have one pointer to the hdr_idx pool in each proxy
struct since these pools do not depend on the proxy. Let's have a common
pool instead as it is already the case for other types.
Passing -C <dir> causes haproxy to chdir to <dir> before loading
any file. The argument may be passed anywhere on the command line.
A typical use case is :
$ haproxy -C /etc/haproxy -f global.cfg -f haproxy.cfg
The way the unix socket is initialized is awkward. Some of the settings are put
in the socket itself, other ones in the backend. And more importantly the
global.maxsock value is adjusted so that the stats socket evades the global
maxconn value. This complicates maxsock computations for nothing, since the
stats socket is not supposed to receive hundreds of concurrent connections when
the global maxconn is very low. What is needed however is to ensure that there
are always connections left for the stats socket even when traffic sockets are
saturated, but this guarantee is not offered anymore by current code.
So as of now, the stats socket is subject to the global maxconn limitation just
as any other socket until a reservation mechanism is implemented.
This global task is used to periodically check for end of resource shortage
and to try to enable queued listeners again. This is important in case some
temporary system-wide shortage is encountered, so that we don't have to wait
for an existing connection to be released before checking the queue again.
For situations where listeners are queued due to the global maxconn being
reached, the task is woken up at least every second. For situations where
a system resource shortage is detected (memory, sockets, ...) the task is
woken up at least every 100 ms. That way, recovery from severe events can
still be achieved under acceptable conditions.
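The two wake-up periods can be summarized by a small helper such as the one below. The enum and function names are assumptions made for the example; only the 1 second and 100 ms values come from the text.

```c
/* Why listeners are currently queued (illustrative). */
enum listener_wait {
	WAIT_NONE,     /* nothing is queued */
	WAIT_MAXCONN,  /* global maxconn was reached */
	WAIT_RESOURCE  /* system shortage: memory, sockets, ... */
};

/* Maximum delay in milliseconds before the global task must run again
 * to retry the queued listeners, or -1 when no wake-up is needed.
 */
static int requeue_wakeup_ms(enum listener_wait why)
{
	switch (why) {
	case WAIT_MAXCONN:
		return 1000;
	case WAIT_RESOURCE:
		return 100;
	default:
		return -1;
	}
}
```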
This function is finally not needed anymore, as it has been replaced with
a per-proxy task that is scheduled when some limits are encountered on
incoming connections or when the process is stopping. The savings should
be noticeable on configs with a large number of proxies. The most important
point is that the rate limiting is now enforced in a clean and solid way.
When an accept() fails because of a connection limit or a memory shortage,
we now disable it and queue it so that it's dequeued only when a connection
is released. This has improved the behaviour of the process near the fd limit
as now a listener with no connection (eg: stats) will not loop forever
trying to get its connection accepted.
The solution is still not 100% perfect, as we'd like to have this used when
proxy limits are reached (use a per-proxy list) and for safety, we'd need
to have dedicated tasks to periodically re-enable them (eg: to overcome
temporary system-wide resource limitations when no connection is released).
Managing listeners state is difficult because they have their own state
and can at the same time have theirs dictated by their proxy. The pause
is not done properly, as the proxy code is fiddling with sockets. By
introducing new functions such as pause_listener()/resume_listener(), we
make it a bit more obvious how/when they're supposed to be used. The
listen_proxies() function was also renamed to resume_proxies() since
it's only used for pause/resume.
This patch is the first in a series aiming at getting rid of the maintain_proxies
mess. In the end, proxies should not call enable_listener()/disable_listener()
anymore.
By default on a single process, we accept 100 connections at once. This is too
much on recent CPUs where the cache is constantly thrashing, because we visit
all those connections several times. We should batch the processing slightly
less so that all the accepted sessions may remain in cache during their initial
processing.
Lowering the batch size from 100 to 32 has changed the connection rate for
concurrencies between 5-10k from 67 kcps to 94 kcps on a Core i5 660 (4M L3),
and forward rates from 30k to 39.5k.
Tests on this hardware show that values between 10 and 30 seem to do the job fine.
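The batching itself can be sketched as a bounded loop. To keep the example self-contained, the accept() call is abstracted behind a callback and a fake source of pending connections is provided; all names are hypothetical.

```c
/* Try to accept one connection. Returns >= 0 on success, or -1 when
 * nothing is pending anymore.
 */
typedef int (*accept_fn)(void *ctx);

/* Accept at most <max_accept> connections in one pass, then return so
 * that the accepted sessions get processed while still hot in cache.
 * Returns the number of connections accepted during this pass.
 */
static int accept_batch(accept_fn try_accept, void *ctx, int max_accept)
{
	int done = 0;

	while (done < max_accept && try_accept(ctx) >= 0)
		done++;
	return done;
}

/* Fake connection source for the example: <ctx> points to the number of
 * pending connections, which is decremented on each successful accept.
 */
static int fake_accept(void *ctx)
{
	int *pending = ctx;

	if (*pending <= 0)
		return -1;
	(*pending)--;
	return 0;
}
```

With 100 pending connections and a batch size of 32, the listener is visited four times (32, 32, 32, 4) instead of once, and each batch is processed before the next one is accepted.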
The motivation for this is that when soft-restart is merged
it will become more important to free all relevant memory in deinit().
Discovered using valgrind.