Declaring a master CLI socket without activating the master-worker mode
is likely a user error, so we issue a warning.
This patch can be backported as far as 1.8.
There were still a very small list of functions, variables and fields
called "stats_" while they were really purely CLI-centric. There's the
frontend called "stats_fe" in the global section, which instantiates a
"cli_applet" called "<CLI>" so it was renamed "cli_fe".
The "alloc_stats_fe" function cas renamed to "cli_alloc_fe" which also
better matches the naming convention of all cli-specific functions.
Finally the "stats_permission_denied_msg" used to return an error on
the CLI was renamed "cli_permission_denied_msg".
Now there's no more "stats_something" that designates the CLI.
The recent default runqueue size reduction appeared to have significantly
lowered performance on low-thread count configs. Testing various values
runqueue values on different workloads under thread counts ranging from
1 to 64, it appeared that lower values are more optimal for high thread
counts and conversely. It could even be drawn that the optimal value for
various workloads sits around 280/sqrt(nbthread), and probably has to do
with both the L3 cache usage and how to optimally interlace the threads'
activity to minimize contention. This is much easier to optimally
configure, so let's do this by default now.
There are multiple per-thread lists in the listeners, which isn't the
most efficient in terms of cache, and doesn't easily allow to store all
the per-thread stuff.
Now we introduce an srv_per_thread structure which the servers will have an
array of, and place the idle/safe/avail conns tree heads into. Overall this
was a fairly mechanical change, and the array is now always initialized for
all servers since we'll put more stuff there. It's worth noting that the Lua
code still has to deal with its own deinit by itself despite being in a
global list, because its server is not dynamically allocated.
It's a real pain not to have access to the list of all registered servers,
because whenever there is a need to late adjust their configuration, only
those attached to regular proxies are seen, but not the peers, lua, logs
nor DNS.
What this patch does is that new_server() will automatically add the newly
created server to a global list, and it does so as well for the 1 or 2
statically allocated servers created for Lua. This way it will be possible
to iterate over all of them.
This makes the code more readable and less prone to copy-paste errors.
In addition, it allows to place some __builtin_constant_p() predicates
to trigger a link-time error in case the compiler knows that the freed
area is constant. It will also produce compile-time error if trying to
free something that is not a regular pointer (e.g. a function).
The DEBUG_MEM_STATS macro now also defines an instance for ha_free()
so that all these calls can be checked.
178 occurrences were converted. The vast majority of them were handled
by the following Coccinelle script, some slightly refined to better deal
with "&*x" or with long lines:
@ rule @
expression E;
@@
- free(E);
- E = NULL;
+ ha_free(&E);
It was verified that the resulting code is the same, more or less a
handful of cases where the compiler optimized slightly differently
the temporary variable that holds the copy of the pointer.
A non-negligible amount of {free(str);str=NULL;str_len=0;} are still
present in the config part (mostly header names in proxies). These
ones should also be cleaned for the same reasons, and probably be
turned into ist strings.
RAND_keep_random_devices_open is OpenSSL specific function, not
implemented in LibreSSL and BoringSSL. Let us define guard
HAVE_SSL_RAND_KEEP_RANDOM_DEVICES_OPEN in include/haproxy/openssl-compat.h
That guard does not depend anymore on HA_OPENSSL_VERSION
Now default proxies are stored into a dedicated tree, sorted by name.
Only unnamed entries are not kept upon new section creation. The very
first call to cfg_parse_listen() will automatically allocate a dummy
defaults section which corresponds to the previous static one, since
the code requires to have one at a few places.
The first immediately visible benefit is that it allows to reuse
alloc_new_proxy() to allocate a defaults section instead of doing it by
hand. And the secret goal is to allow to keep multiple named defaults
section in memory to reuse them from various proxies.
We don't want to expose this one anymore as we'll soon keep multiple
default proxies. Let's move it inside the parser which is the only
place which still uses it, and initialize it on the fly once needed
instead of doing it at boot time.
init_default_instance() was still left in cfgparse.c which is not the
best place to pre-initialize a proxy. Let's place it in proxy.c just
after init_new_proxy(), take this opportunity for renaming it to
proxy_preset_defaults() and taking out init_new_proxy() from it, and
let's pass it the pointer to the default proxy to be initialized instead
of implicitly assuming defproxy. We'll soon be able to exploit this.
Only two call places had to be updated.
The server idle/safe/available connection lists are replaced with ebmb-
trees. This is used to store backend connections, with the new field
connection hash as the key. The hash is a 8-bytes size field, used to
reflect specific connection parameters.
This is a preliminary work to be able to reuse connection with SNI,
explicit src/dst address or PROXY protocol.
The HAPROXY_CFGFILES env variable is built using a static trash chunk, via a
call to get_trash_chunk() function. This chunk is reserved during the whole
configuration parsing. It is far too large to guarantee it will not be
reused during the configuration parsing. And in fact, it happens in the lua
code since the commit f67442efd ("BUG/MINOR: lua: warn when registering
action, conv, sf, cli or applet multiple times"), when a lua script is
loaded.
To fix the bug, we now use a dynamic buffer instead. And we call memprintf()
function to handle both the allocation and the formatting. Allocation errors
at this stage are fatal.
This patch should fix the issue #1041. It must be backported as far as 2.0.
The strict-limits global option was introduced with commit 0fec3ab7b
("MINOR: init: always fail when setrlimit fails"). When used in
conjuction with master-worker, haproxy will not fail when a setrlimit
fails. This happens because we only exit() if master-worker isn't used.
This patch removes all tests for master-worker mode for all cases covered
by strict-limits scope.
This should be backported from 2.1 onward.
This should fix issue #1042.
Reviewed by William Dauchy <wdauchy@gmail.com>
GitHub Issue #1037 Reported a memory leak in deinit() caused by an
allocation made in sa2str() that was stored in srv_set_addr_desc().
When destroying each server for a proxy in deinit, include freeing the
memory in the key of server->addr_node.
The leak was introduced in commit 92149f9a8 ("MEDIUM: stick-tables: Add
srvkey option to stick-table") which is not in any released version so
no backport is needed.
Cc: Tim Duesterhus <tim@bastelstu.be>
This is from the output of codespell. It's done at once over a bunch
of files and only affects comments, so there is nothing user-visible.
No backport needed.
Released version 2.4-dev5 with the following main changes :
- BUG/MEDIUM: mux_h2: Add missing braces in h2_snd_buf()around trace+wakeup
- BUILD: hpack: hpack-tbl-t.h uses VAR_ARRAY but does not include compiler.h
- MINOR: time: increase the minimum wakeup interval to 60s
- MINOR: check: do not ignore a connection header for http-check send
- REGTESTS: complete http-check test
- CI: travis-ci: drop coverity scan builds
- MINOR: atomic: don't use ; to separate instruction on aarch64.
- IMPORT: xxhash: update to v0.8.0 that introduces stable XXH3 variant
- MEDIUM: xxhash: use the XXH3 functions to generate 64-bit hashes
- MEDIUM: xxhash: use the XXH_INLINE_ALL macro to inline all functions
- CLEANUP: xxhash: remove the unused src/xxhash.c
- MINOR: sample: add the xxh3 converter
- REGTESTS: add tests for the xxh3 converter
- MINOR: protocol: Create proto_quic QUIC protocol layer.
- MINOR: connection: Attach a "quic_conn" struct to "connection" struct.
- MINOR: quic: Redefine control layer callbacks which are QUIC specific.
- MINOR: ssl_sock: Initialize BIO and SSL objects outside of ssl_sock_init()
- MINOR: connection: Add a new xprt to connection.
- MINOR: ssl: Export definitions required by QUIC.
- MINOR: cfgparse: Do not modify the QUIC xprt when parsing "ssl".
- MINOR: tools: Add support for QUIC addresses parsing.
- MINOR: quic: Add definitions for QUIC protocol.
- MINOR: quic: Import C source code files for QUIC protocol.
- MINOR: listener: Add QUIC info to listeners and receivers.
- MINOR: server: Add QUIC definitions to servers.
- MINOR: ssl: SSL CTX initialization modifications for QUIC.
- MINOR: ssl: QUIC transport parameters parsing.
- MINOR: quic: QUIC socket management finalization.
- MINOR: cfgparse: QUIC default server transport parameters init.
- MINOR: quic: Enable the compilation of QUIC modules.
- MAJOR: quic: Make usage of ebtrees to store QUIC ACK ranges.
- MINOR: quic: Attempt to make trace more readable
- MINOR: quic: Make usage of the congestion control window.
- MINOR: quic: Flag RX packet as ack-eliciting from the generic parser.
- MINOR: quic: Code reordering to help in reviewing/modifying.
- MINOR: quic: Add traces to congestion avoidance NewReno callback.
- MINOR: quic: Display the SSL alert in ->ssl_send_alert() callback.
- MINOR: quic: Update the initial salt to that of draft-29.
- MINOR: quic: Add traces for in flght ack-eliciting packet counter.
- MINOR: quic: make a packet build fails when qc_build_frm() fails.
- MINOR: quic: Add traces for quic_packet_encrypt().
- MINOR: cache: Refactoring of secondary_key building functions
- MINOR: cache: Avoid storing responses whose secondary key was not correctly calculated
- BUG/MINOR: cache: Manage multiple headers in accept-encoding normalization
- MINOR: cache: Add specific secondary key comparison mechanism
- MINOR: http: Add helper functions to trim spaces and tabs
- MEDIUM: cache: Manage a subset of encodings in accept-encoding normalizer
- REGTESTS: cache: Simplify vary.vtc file
- REGTESTS: cache: Add a specific test for the accept-encoding normalizer
- MINOR: cache: Remove redundant test in http_action_req_cache_use
- MINOR: cache: Replace the "process-vary" option's expected values
- CI: GitHub Actions: enable daily Coverity scan
- BUG/MEDIUM: cache: Fix hash collision in `accept-encoding` handling for `Vary`
- MEDIUM: stick-tables: Add srvkey option to stick-table
- REGTESTS: add test for stickiness using "srvkey addr"
- BUILD: Makefile: disable -Warray-bounds until it's fixed in gcc 11
- BUG/MINOR: sink: Return an allocation failure in __sink_new if strdup() fails
- BUG/MINOR: lua: Fix memory leak error cases in hlua_config_prepend_path
- MINOR: lua: Use consistent error message 'memory allocation failed'
- CLEANUP: Compare the return value of `XXXcmp()` functions with zero
- CLEANUP: Apply the coccinelle patch for `XXXcmp()` on include/
- CLEANUP: Apply the coccinelle patch for `XXXcmp()` on contrib/
- MINOR: qpack: Add static header table definitions for QPACK.
- CLEANUP: qpack: Wrong comment about the draft for QPACK static header table.
- CLEANUP: quic: Remove useless QUIC event trace definitions.
- BUG/MINOR: quic: Possible CRYPTO frame building errors.
- MINOR: quic: Pass quic_conn struct to frame parsers.
- BUG/MINOR: quic: Wrong STREAM frames parsing.
- MINOR: quic: Drop packets with STREAM frames with wrong direction.
- CLEANUP: ssl: Remove useless loop in tlskeys_list_get_next()
- CLEANUP: ssl: Remove useless local variable in tlskeys_list_get_next()
- MINOR: ssl: make tlskeys_list_get_next() take a list element
- Revert "BUILD: Makefile: disable -Warray-bounds until it's fixed in gcc 11"
- BUG/MINOR: cfgparse: Fail if the strdup() for `rule->be.name` for `use_backend` fails
- CLEANUP: mworker: remove duplicate pointer tests in cfg_parse_program()
- CLEANUP: Reduce scope of `header_name` in http_action_store_cache()
- CLEANUP: Reduce scope of `hdr_age` in http_action_store_cache()
- CLEANUP: spoe: fix typo on `var_check_arg` comment
- BUG/MINOR: tcpcheck: Report a L7OK if the last evaluated rule is a send rule
- CI: github actions: build several popular "contrib" tools
- DOC: Improve the message printed when running `make` w/o `TARGET`
- BUG/MEDIUM: server: srv_set_addr_desc() crashes when a server has no address
- REGTESTS: add unresolvable servers to srvkey-addr
- BUG/MINOR: stats: Make stat_l variable used to dump a stat line thread local
- BUG/MINOR: quic: NULL pointer dereferences when building post handshake frames.
- SCRIPTS: improve announce-release to support different tag and versions
- SCRIPTS: make announce release support preparing announces before tag exists
- CLEANUP: assorted typo fixes in the code and comments
- BUG/MINOR: srv: do not init address if backend is disabled
- BUG/MINOR: srv: do not cleanup idle conns if pool max is null
- CLEANUP: assorted typo fixes in the code and comments
- CLEANUP: few extra typo and fixes over last one ("ot" -> "to")
This option is now ignored because I/O check buffers are now allocated using the
buffer pool. Thus, it is marked as deprecated in the documentation and ignored
during the configuration parsing. The field is also removed from the global
structure.
Because this option is ignored since a recent fix, backported as fare as 2.2,
this patch should be backported too. Especially because it updates the
documentation.
"RAND_keep_random_devices_open" is OpenSSL specific, does not present
in other OpenSSL variants like LibreSSL or BoringSSL. BoringSSL recently
"updated" its internal openssl version to 1.1.1, we temporarily set it
back to 1.1.0, as we are going to remove that hack, let us add proper
guarding.
Functions registered to release memory per-thread have no return value. But the
registering function and the function pointer in per_thread_free_fct structure
specify it should return an integer. This patch fixes it.
This patch may be backported as far as 2.0.
per-proxy and per-server post-check callback functions must be skipped for
disabled proxies because most of the configuration validity check is skipped for
these proxies.
This patch must be backported as far as 2.1.
This is an anticipation of finer grained locking for the queues. For now
all lock places take a write lock so that there is no difference at all
with previous code.
As previously discussed, nbproc usage is bad, deprecated, and scheduled
for removal in 2.5.
If "nbproc" is found with more than one process while nbthread is not
set, a warning will be emitted encouraging to remove it or to migrate
to nbthread instead. This makes sure the user has an opportunity to
both see the message and silence it.
It was previously a spinlock, and it happens that a number of LB algos
only lock it for lookups, without performing any modification. Let's
first turn it to an rwlock and w-lock it everywhere. This is strictly
identical.
It was carefully checked that every HA_SPIN_LOCK() was turned to
HA_RWLOCK_WRLOCK() and that HA_SPIN_UNLOCK() was turned to
HA_RWLOCK_WRUNLOCK() on this lock. _INIT and _DESTROY were updated too.
Now we define a new sock_accept_iocb() for socket-based stream protocols
and use it as a wrapper for listener_accept() which now takes a listener
and not an FD anymore. This will allow the receiver's I/O cb to be
redefined during registration, and more specifically to get rid of the
hard-coded hacks in protocol_bind_all() made for syslog.
The previous ->accept() callback in the protocol was removed since it
doesn't have anything to do with accept() anymore but is more generic.
A few places where listener_accept() was compared against the FD's IO
callback for debugging purposes on the CLI were updated.
When running a pure config check (haproxy -c) we go through the deinit
phase without having allocated fdtab, so we can't blindly dereference
it. The issue was added by recent commit ae7bc4a23 ("MEDIUM: deinit:
close all receivers/listeners before scanning proxies"), no backport is
needed.
On some operating systems, RLIM_INFINITY is set to -1 so that when the
hard limit on the number of FDs is set to unlimited, taking the MAX
of both values keeps rlim_fd_cur and everything works. But on other
systems this values is defined as the highest positive integer. This
is what was observed on a 32-bit AIX 5.1. The effect is that maxsock
becomes 2^31-1 and that fdtab allocation fails.
Note that a simple workaround consists in manually setting maxconn in
the global section.
Let's ignore unlimited as soon as we retrieve rlim_fd_max so that all
systems behave consistently.
This may be backported as far as 2.0, though it doesn't seem like it
has annoyed anyone.
When factoring out the pause/resume error messages in commit 775e00158
("MAJOR: signals: use protocol_pause_all() and protocol_resume_all()")
I forgot that ha_warning() and send_log() take a format string and not
just a const string. No backport is needed, this is 2.3-dev.
When temporarily pausing the listeners with SIG_TTOU, we now pause
all listeners via the protocols instead of the proxies. This has the
benefits that listeners are paused regardless of whether or not they
belong to a visible proxy. And for resuming via SIG_TTIN we do the
same, which allows to report binding conflicts and address them,
since the operation can be repeated on a per-listener basis instead
of a per-proxy basis.
While in appearance all cases were properly handled, it's impossible
to completely rule out the possibility that something broken used to
work by luck due to the scan ordering which is naturally different,
hence the major tag.
The two functions don't need to be distinguished anymore since they have
all the necessary info to act as needed on their listeners. Let's just
pass via stop_proxy() and make it check for each listener which one to
close or not.
Its sole remaining purpose was to display "proxy foo started", which
has little benefit and pollutes output for those with plenty of proxies.
Let's remove it now.
The VTCs were updated to reflect this, because many of them had explicit
counts of dropped lines to match this message.
This is tagged as MEDIUM because some users may be surprized by the
loss of this quite old message.
The remaining proxy states were only used to distinguish an enabled
proxy from a disabled one. Due to the initialization order, both
PR_STNEW and PR_STREADY were equivalent after startup, and they
would only differ from PR_STSTOPPED when the proxy is disabled or
shutdown (which is effectively another way to disable it).
Now we just have a "disabled" field which allows to distinguish them.
It's becoming obvious that start_proxies() is only used to print a
greeting message now, that we'd rather get rid of. Probably that
zombify_proxy() and stop_proxy() should be merged once their
differences move to the right place.
Instead of looking at listeners in proxies in PR_STNEW state, we'd
rather check for listeners in those not in PR_STSTOPPED as it's only
this state which indicates the proxy was disabled. And let's check
the listeners count instead of testing the list's head.
Because of the zombie state, proxies have a skewed vision of the state
of listeners, which explains why there are hacks switching the state
from ZOMBIE to INIT in the proxy cleaning loop. This is particularly
complicated and not needed, as all the information is now available
in the protocol list and the fdtab.
What we do here instead is to first close all active listeners or
receivers by protocol and clean their protocol parts. Then we scan the
fdtab to get rid of remaining ones that were necessarily in INIT state
after a previous invocation of delete_listener(). From this point, we
know the listeners are cleaned, the can safely be freed by scanning the
proxies.
During the startup process we don't have any fdtab nor fd_updt for quite
a long time, and as such some operations on the listeners are not
permitted, such as fd_want_*/fd_stop_* or fd_delete(). The latter is of
particular concern because it's used when stopping a disabled frontend,
and it's performed very early during check_config_validity() while there
is no fdtab yet. The trick till now relies on the listener's state which
is a bit brittle.
There is absolutely no valid reason for stopping a proxy's listeners this
early, we can postpone it after init_pollers() which will at least have
allocated fdtab.
Old processes didn't die if a log foward section is declared and
a soft stop is requested.
This patch fix this issue and should be backpored in banches including
the log forward feature.
This is executed on startup with the registered statistics module. The
existing statistics have been merged in a list containing all
statistics for each domain. This is useful to print all available
statistics in a generic way.
Allocate extra counters for all proxies/servers/listeners instances.
These counters are allocated with the counters from the stats modules
registered on startup.
We use chunk_initstr() to store the program name as the default log-tag.
If we use the log-tag directive in the config file, this chunk will be
destroyed and replaced. chunk_initstr() sets the chunk size to 0 so we
will free the chunk itself, but not its content.
This happens for a global section and also for a proxy.
We fix this by using chunk_initlen() instead of chunk_initstr().
We also check that the memory allocation was successfull, otherwise we quit.
This fixes github issue #850.
It can be backported as far as 1.9, with minor adjustments to includes.
The code dealing with zombie proxies in soft_stop() is bogus, it uses
close() instead of fd_delete(), leaving a live entry in the fdtab with
a dangling pointer to a free memory location. The FD might be reassigned
for an outgoing connection for the time it takes the proxy to completely
stop, or could be dumped on the CLI's "show fd" command. In addition,
the listener's FD was not even reset, leaving doubts about whether or
not it will happen again in deinit().
And in deinit(), the loop in charge of closing zombie FDs is particularly
unsafe because it closes the fd then calls unbind_listener() then
delete_listener() hoping none of them will touch it again. Since it
requires some mental efforts to figure what's done there, let's correctly
reset the fd here as well and close it using fd_delete() to eliminate any
remaining doubts.
It's uncertain whether this should be backported. Zombie proxies are rare
and the situations capable of triggering such issues are not trivial to
setup. However it's easy to imagine how things could go wrong if backported
too far. Better wait for any matching report if at all (this code has been
there since 1.8 without anobody noticing).
The listening socket is represented by its file descriptor, which is
generic to all receivers and not just listeners, so it must move to
the rx struct.
It's worth noting that in order to extend receivers and listeners to
other protocols such as QUIC, we'll need other handles than file
descriptors here, and that either a union or a cast to uintptr_t
will have to be used. This was not done yet and the field was
preserved under the name "fd" to avoid adding confusion.
Changes performed using the following coccinelle patch:
@@
type T;
expression E;
expression t;
@@
(
t = calloc(E, sizeof(*t))
|
- t = calloc(E, sizeof(T))
+ t = calloc(E, sizeof(*t))
)
Looking through the commit history, grepping for coccinelle shows that the same
replacement with a different patch was already performed in the past in commit
02779b6263.