As reported in github issue #1765, some people get trapped into building
haproxy and companion libraries on Windows using a compiler following the
LLP64 model. This has no chance to work, and definitely causes nasty bugs
everywhere when pointers are passed as longs. Let's save them time and
detect this at boot time.
The message and detection was factored with the existing one for -fwrapv
since we need the same info and actions.
This should be backported to all recent supported versions (the ones
that are likely to be tried on such platforms when people don't know).
When updating from 2.4 to 2.6, the child->reloads++ instruction changed
place, resulting in a former worker from the 2.4 process, still
identified as a current worker once in 2.6, because its reload counter
is still 0.
Unfortunately this counter is used to chose the mworker_proc structure
that will be used for the new worker.
What happens next, is that the mworker_proc structure of the previous
process is selected, and this one has ipc_fd[1] set to -1, because this
structure was supposed to be in the master.
The process then forks, and mworker_sockpair_register_per_thread() tries
to register ipc_fd[1] which is set to -1, instead of the fd of the new
socketpair.
This patch fixes the issue by checking if child->pid is equal to -1 when
selecting proc_self. This way we could be sure it wasn't a previous
process.
Should fix issue #1785.
This must be backported as far as 2.4 to fix the issue related to the
reload computation difference. However backporting it in every stable
branch will enforce the reload process.
Since these are not used anymore, let's now remove them. Given the
number of places where we're using ti->ldit_bit, maybe an equivalent
might be useful though.
The principle remains the same, but instead of having a single process
and ignoring extra ones, now we set the affinity masks for the respective
threads of all groups.
The doc was updated with a few extra examples.
These old legacy pollers are not designed for this. They're still
using a shared list of events for all threads, this will not scale at
all, so there's no point in enabling thread-groups there. Modern
systems have epoll, kqueue or event ports and do not need these ones.
We arrange for failing at boot time, only when thread-groups > 1 so
that existing setups will remain unaffected.
If there's a compelling reason for supporting thread groups with these
pollers in the future, the rework should not be too hard, it would just
consume a lot of memory to have an fd_evts[] array per thread, but that
is doable.
The sd_notify API is not able to change the "Active:" line in "systemcl
status". However a message can still be displayed on a "Status: " line,
even if the service is still green and "active (running)".
When startup succeed the Status will be set to "Ready.", upon a reload
it will be set to "Reloading Configuration." If the configuration
succeed "Ready." again. However if the reload failed, it will be set to
"Reload failed!".
Keep in mind that the "Active:" line won't change upon a reload failure,
and will still be green.
As much as possible we should take care of not leaving bits from stopped
threads in shared thread masks. It can avoid issues like the previous
fix and will also make debugging less confusing.
When soft-stopping, there's a comparison between stopping_threads and
threads_enabled to make sure all threads are stopped, but this is not
correct and is racy since the threads_enabled bit is removed when a
thread is stopped but not its stopping_threads bit. The consequence is
that depending on timing, when stopping, if the first stopping thread
is fast enough to remove its bit from threads_enabled, the other threads
will see that stopping_threads doesn't match threads_enabled anymore and
will wait forever. As such the mask must be applied to stopping_threads
during the test. This issue was introduced in recent commit ef422ced9
("MEDIUM: thread: make stopping_threads per-group and add stopping_tgroups"),
no backport is needed.
Commit ef422ced9 ("MEDIUM: thread: make stopping_threads per-group and add
stopping_tgroups") moved the stopping_threads mask to per-group, but one
test in the loop preserved its global value instead, resulting in stopping
threads never sleeping on stop and eating 100% CPU until all were stopped.
No backport is needed.
Commit 377e37a80 ("MINOR: tinfo: add the mask of enabled threads in each
group") forgot -1 on the tgid, thus the groups was not always correctly
tested, which is visible only when running with more than one group. No
backport is needed.
Stopping threads need a mask to figure who's still there without scanning
everything in the poll loop. This means this will have to be per-group.
And we also need to have a global stopping groups mask to know what groups
were already signaled. This is used both to figure what thread is the first
one to catch the event, and which one is the first one to detect the end of
the last job. The logic isn't changed, though a loop is required in the
slow path to make sure all threads are aware of the end.
Note that for now the soft-stop still takes time for group IDs > 1 as the
poller is not yet started on these threads and needs to expire its timeout
as there's no way to wake it up. But all threads are eventually stopped.
In order to kill all_threads_mask we'll need to have an equivalent for
the thread groups. The all_tgroups_mask does just this, it keeps one bit
set per enabled group.
In order to replace the global "all_threads_mask" we'll need to have an
equivalent per group. Take this opportunity for calling it threads_enabled
and make sure which ones are counted there (in case in the future we allow
to stop some).
Every single place where sleeping_thread_mask was still used was to test
or set a single thread. We can now add a per-thread flag to indicate a
thread is sleeping, and remove this shared mask.
The wake_thread() function now always performs an atomic fetch-and-or
instead of a first load then an atomic OR. That's cleaner and more
reliable.
This is not easy to test, as broadcast FD events are rare. The good
way to test for this is to run a very low rate-limited frontend with
a listener that listens to the fewest possible threads (2), and to
send it only 1 connection at a time. The listener will periodically
pause and the wakeup task will sometimes wake up on a random thread
and will call wake_thread():
frontend test
bind :8888 maxconn 10 thread 1-2
rate-limit sessions 5
Alternately, disabling/enabling a frontend in loops via the CLI also
broadcasts such events, but they're more difficult to observe since
this is causing connection failures.
Right now when an inter-thread wakeup happens, we preliminary check if the
thread was asleep, and if so we wake the poller up and remove its bit from
the sleeping mask. That's not very clean since the sleeping mask cannot be
entirely trusted since a thread that's about to wake up will already have
its sleeping bit removed.
This patch adds a new per-thread flag (TH_FL_NOTIFIED) to remember that a
thread was notified to wake up. It's cleared before checking the task lists
last, so that new wakeups can be considered again (since wake_thread() is
only used to notify about task wakeups and FD polling changes). This way
we do not need to modify a remote thread's sleeping mask anymore. As such
wake_thread() now only tests and sets the TH_FL_NOTIFIED flag but doesn't
clear sleeping anymore.
In bug #1751, it was reported that haproxy is consumming too much memory
since the 2.4 version. This is because of a change in the master, which
loses completely its configuration in wait mode, and lose its maxconn.
Without the maxconn, haproxy will try to compute one itself, and will
allocate RAM consequently, too much in our case. Which means the master
will have a too high maxconn and too much RAM allocated.
The patch fixes the issue by setting the maxconn to the default value
when re-executing the master in wait mode.
Must be backported as far as 2.5.
This is becoming difficult to distinguish the default values for
transport parameters which come with the RFC from our implementation
default values when not set by configuration (tunable parameters).
Add a comment to distinguish them.
Prefix these default values by QUIC_TP_DFLT_ to distinguish them from
QUIC_DFLT_* value even if there are not numerous.
Furthermore ->max_udp_payload_size must be first initialized to
QUIC_TP_DFLT_MAX_UDP_PAYLOAD_SIZE especially for received value.
Add tunable "tune.quic.frontend.max_streams_bidi" setting for QUIC frontends
to set the "initial_max_streams_bidi" transport parameter.
Add some documentation for this new setting.
Add two tunable settings both for backends and frontends "max_idle_timeout"
QUIC transport parameter, "tune.quic.frontend.max-idle-timeout" and
"tune.quic.backend.max-idle-timeout" respectively.
cfg_parse_quic_time() has been implemented to parse a time value thanks
to parse_time_err(). It should be reused for any tunable time value to be
parsed.
Add the documentation for this tunable setting only for frontend.
Commit 2cb3be76b ("CLEANUP: init: address a coverity warning about
possible multiply overflow") was incomplete, two other locations were
present. This should address issue #1585.
In issue #1585 Coverity suspects a risk of multiply overflow when
calculating the SSL cache size, though in practice the cache is
limited to 2^32 anyway thus it cannot really happen. Nevertheless,
casting the operation should be sufficient to avoid marking it as a
false positive.
This QUIC specific keyword may be used to set the theshold, in number of
connection openings, beyond which QUIC Retry feature will be automatically
enabled. Its default value is 100.
The random generator initialization needs to be performed before the
chroot but it is not needed before. If we want to add provider
configuration option to the configuration file, they need to be
processed before any call to a crypto-related OpenSSL function.
We can then delay the initialization until after the configuration file
is parsed and processed.
It could be usefull to set a ASCII secret which could be used for different
usages. For instance, it will be used to derive QUIC stateless reset tokens.
The freeing of pre-check callbacks was missing when this feature was
recently added with commit b53eb8790 ("MINOR: init: add the pre-check
callback"), let's do it to make valgrind happy.
It appears that it is safe to call perform a clean deinit at this point, so
let's do this to exercise the deinit paths some more.
Running `valgrind --leak-check=full --show-leak-kinds=all ./haproxy -vv` with
this change reports:
==261864== HEAP SUMMARY:
==261864== in use at exit: 344 bytes in 11 blocks
==261864== total heap usage: 1,178 allocs, 1,167 frees, 1,102,089 bytes allocated
==261864==
==261864== 24 bytes in 1 blocks are still reachable in loss record 1 of 2
==261864== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==261864== by 0x324BA6: hap_register_pre_check (init.c:92)
==261864== by 0x155824: main (haproxy.c:3024)
==261864==
==261864== 320 bytes in 10 blocks are still reachable in loss record 2 of 2
==261864== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==261864== by 0x26E54E: cfg_register_postparser (cfgparse.c:4238)
==261864== by 0x155824: main (haproxy.c:3024)
==261864==
==261864== LEAK SUMMARY:
==261864== definitely lost: 0 bytes in 0 blocks
==261864== indirectly lost: 0 bytes in 0 blocks
==261864== possibly lost: 0 bytes in 0 blocks
==261864== still reachable: 344 bytes in 11 blocks
==261864== suppressed: 0 bytes in 0 blocks
which is looking pretty good.
__comp_fetch_init() only presets the maxzlibmem, and only when both
USE_ZLIB and DEFAULT_MAXZLIBMEM are set. The intent is to preset a
default value to protect the system against excessive memory usage
when no setting is set by the user.
Nowadays the entry in the global struct is always there so there's no
point anymore in passing via a constructor to possibly set this value.
Let's go the cleaner way by always presetting DEFAULT_MAXZLIBMEM to 0
in defaults.h unless these conditions are met, and always assigning it
instead of pre-setting the entry to zero. This is more straightforward
and removes some ifdefs and the last constructor. In addition, now the
setting has a chance of being found.
On some systems, the hard limit for ulimit -n may be huge, in the order
of 1 billion, and using this to automatically compute maxconn doesn't
work as it requires way too much memory. Users tend to hard-code maxconn
but that's not convenient to manage deployments on heterogenous systems,
nor when porting configs to developers' machines. The ulimit-n parameter
doesn't work either because it forces the limit. What most users seem to
want (and it makes sense) is to respect the system imposed limits up to
a certain value and cap this value. This is exactly what fd-hard-limit
does.
This addresses github issue #1622.
This adds a call to function <fct> to the list of functions to be called at
the step just before the configuration validity checks. This is useful when you
need to create things like it would have been done during the configuration
parsing and where the initialization should continue in the configuration
check.
It could be used for example to generate a proxy with multiple servers using
the configuration parser itself. At this step the trash buffers are allocated.
Threads are not yet started so no protection is required. The function is
expected to return non-zero on success, or zero on failure. A failure will make
the process emit a succinct error message and immediately exit.
The new 'close-spread-time' global option can be used to spread idle and
active HTTP connction closing after a SIGUSR1 signal is received. This
allows to limit bursts of reconnections when too many idle connections
are closed at once. Indeed, without this new mechanism, in case of
soft-stop, all the idle connections would be closed at once (after the
grace period is over), and all active HTTP connections would be closed
by appending a "Connection: close" header to the next response that goes
over it (or via a GOAWAY frame in case of HTTP2).
This patch adds the support of this new option for HTTP as well as HTTP2
connections. It works differently on active and idle connections.
On active connections, instead of sending systematically the GOAWAY
frame or adding the 'Connection: close' header like before once the
soft-stop has started, a random based on the remainder of the close
window is calculated, and depending on its result we could decide to
keep the connection alive. The random will be recalculated for any
subsequent request/response on this connection so the GOAWAY will still
end up being sent, but we might wait a few more round trips. This will
ensure that goaways are distributed along a longer time window than
before.
On idle connections, a random factor is used when determining the expire
field of the connection's task, which should naturally spread connection
closings on the time window (see h2c_update_timeout).
This feature request was described in GitHub issue #1614.
This patch should be backported to 2.5. It depends on "BUG/MEDIUM:
mux-h2: make use of http-request and keep-alive timeouts" which
refactorized the timeout management of HTTP2 connections.
Similar to the sample fetch keywords, let's also list the converter
keywords. They're much simpler since there's no compatibility matrix.
Instead the input and output types are listed. This is called by
dump_registered_keywords() for the "cnv" keywords class.
New function smp_dump_fetch_kw lists registered sample fetch keywords
with their compatibility matrix, mandatory and optional argument types,
and output types. It's called from dump_registered_keywords() with class
"smp".
New function acl_dump_kwd() dumps the registered ACL keywords and their
sample-fetch equivalent to stdout. It's called by dump_registered_keywords()
for keyword class "acl".
New function cli_list_keywords() scans the list of registered CLI keywords
and dumps them on stdout. It's now called from dump_registered_keywords()
for the class "cli".
Some keywords are valid for the master, they'll be suffixed with
"[MASTER]". Others are valid for the worker, they'll have "[WORKER]".
Those accessible only in expert mode will show "[EXPERT]" and the
experimental ones will show "[EXPERIM]".
When no output stream is passed, stdout is used with one entry per line,
and this is called from dump_registered_services() when passed the class
"svc".
When passing a NULL output buffer the function will now dump to stdout
with a more compact format that is more suitable for machine processing.
An entry was added to dump_registered_keyword() to call it when the
keyword class "flt" is requested.
All registered config keywords that are valid in the config parser are
dumped to stdout organized like the regular sections (global, listen,
etc). Some keywords that are known to only be valid in frontends or
backends will be suffixed with [FE] or [BE].
All regularly registered "bind" and "server" keywords are also dumped,
one per "bind" or "server" line. Those depending on ssl are listed after
the "ssl" keyword. Doing so required to export the listener and server
keyword lists that were static.
The function is called from dump_registered_keywords() for keyword
class "cfg".
It's difficult from outside haproxy to detect the supported keywords
and syntax. Interestingly, many of our modern keywords are enumerated
since they're registered from constructors, so it's not very hard to
enumerate most of them.
This patch creates some basic infrastructure to support dumping existing
keywords from different classes on stdout. The format will differ depending
on the classes, but the idea is that the output could easily be passed to
a script that generates some simple syntax highlighting rules, completion
rules for editors, syntax checkers or config parsers.
The principle chosen here is that if "-dK" is passed on the command-line,
at the end of the parsing the registered keywords will be dumped for the
requested classes passed after "-dK". Special name "help" will show known
classes, while "all" will execute all of them. The reason for doing that
after the end of the config processor is that it will also enumerate
internally-generated keywords, Lua or even those loaded from external
code (e.g. if an add-on is loaded using LD_PRELOAD). A typical way to
call this with a valid config would be:
./haproxy -dKall -q -c -f /path/to/config
If there's no config available, feeding /dev/null will also do the job,
though it will not be able to detect dynamically created keywords, of
course.
This patch also updates the management doc.
For now nothing but the help is listed, various subsystems will follow
in subsequent patches.
The functions needed to manipulate the "tainted" flags were located in
too high a level to be callable from the lower code layers. Let's move
them to bug.h.
get_tainted() was using an atomic store from the atomic value to a
local one instead of using an atomic load. In practice it has no effect
given the relatively rare updates of this field and the fact that it's
read only when dumping "show info" output, but better fix it.
There's probably no need to backport this.
The 9 currently available debugging options may now be checked, set, or
cleared using -dM. The directive now takes a comma-delimited list of
options after the optional poisonning byte. With "help", the list of
available options is displayed with a short help and their current
status.
The management doc was updated.
New function pool_parse_debugging() is now dedicated to parsing options
of -dM. For now it only handles the optional memory poisonning byte, but
the function may already return an informative message to be printed for
help, a warning or an error. This way we'll reuse it for the settings
that will be needed for configurable debugging options.
The argument parser runs too late, we'll soon need it before creating
pools, hence just after init_early(). No visible change is expected but
this part is sensitive enough to be placed into its own commit for easier
bisection later if needed.
The cmdline argument parsing was performed quite late, which prevents
from retrieving elements that can be used to initialize the pools and
certain sensitive areas. The goal is to improve this by parsing command
line arguments right after the early init stage. This is possible
because the cmdline parser already does very little beyond retrieving
config elements that are used later.
Doing so requires to move the parser code to a separate function and
to externalize a few variables out of the function as they're used
later in the boot process, in the original function.
This patch creates init_args() but doesn't move it upfront yet, it's
still executed just before init(), which essentially corresponds to
what was done before (only the trash buffers, ACLs and Lua were
initialized earlier and are not needed for this).
The rest is not modified and as expected no change is observed.
Note that the diff doesn't to justice to the change as it makes it
look like the early init() code was moved to a new function after
the function was renamed, while in fact it's clearly the parser
itself which moved.
There are some delicate chicken-and-egg situations in the initialization
code, because the init() function currently does way too much (it goes
as far as parsing the config) and due to this it must be started very
late. But it's also in charge of initializing a number of variables that
are needed in early boot (e.g. hostname/pid for error reporting, or
entropy for random generators).
This patch carefully extracts all the early code that depends on
absolutely nothing, and places it immediately after the STG_LOCK init
stage. The only possible failures at this stage are only allocation
errors and they continue to provoke an immediate exit().
Some environment variables, hostname, date, pid etc are retrieved at
this stage. The program's arguments are also copied there since they're
needed to be kept intact for the master process.
The STG_REGISTER init level is used to register known keywords and
protocol stacks. It must be called earlier because some of the init
code already relies on it to be known. For example, "haproxy -vv"
for now is constrained to start very late only because of this.
This patch moves it between STG_LOCK and STG_ALLOC, which is fine as
it's used for static registration.
Now -dM will set POOL_DBG_POISON for consistency with the rest of the
pool debugging options. As such now we only check for the new flag,
which allows the default value to be preset.
There's no point keeping the vars_init_head() call in init() when we
already have a vars_init() registered at the right time to do that,
and it complexifies the boot sequence, so let's move it there.
When started in master-worker mode combined with daemon mode, HAProxy
will open() with O_TRUNC the pidfile when switching to wait mode.
In 2.5, it happens everytime after trying to load the configuration,
since we switch to wait mode.
In previous version this happens upon a failure of the configuration
loading.
Fixes bug #1545.
Must be backported in every supported branches.
Released version 2.6-dev1 with the following main changes :
- BUG/MINOR: cache: Fix loop on cache entries in "show cache"
- BUG/MINOR: httpclient: allow to replace the host header
- BUG/MINOR: lua: don't expose internal proxies
- MEDIUM: mworker: seamless reload use the internal sockpairs
- BUG/MINOR: lua: remove loop initial declarations
- BUG/MINOR: mworker: does not add the -sf in wait mode
- BUG/MEDIUM: mworker: FD leak of the eventpoll in wait mode
- MINOR: quic: do not reject PADDING followed by other frames
- REORG: quic: add comment on rare thread concurrence during CID alloc
- CLEANUP: quic: add comments on CID code
- MEDIUM: quic: handle CIDs to rattach received packets to connection
- MINOR: qpack: support litteral field line with non-huff name
- MINOR: quic: activate QUIC traces at compilation
- MINOR: quic: use more verbose QUIC traces set at compile-time
- MEDIUM: pool: refactor malloc_trim/glibc and jemalloc api addition detections.
- MEDIUM: pool: support purging jemalloc arenas in trim_all_pools()
- BUG/MINOR: mworker: deinit of thread poller was called when not initialized
- BUILD: pools: only detect link-time jemalloc on ELF platforms
- CI: github actions: add the output of $CC -dM -E-
- BUG/MEDIUM: cli: Properly set stream analyzers to process one command at a time
- BUILD: evports: remove a leftover from the dead_fd cleanup
- MINOR: quic: Set "no_application_protocol" alert
- MINOR: quic: More accurate immediately close.
- MINOR: quic: Immediately close if no transport parameters extension found
- MINOR: quic: Rename qc_prep_hdshk_pkts() to qc_prep_pkts()
- MINOR: quic: Possible crash when inspecting the xprt context
- MINOR: quic: Dynamically allocate the secrete keys
- MINOR: quic: Add a function to derive the key update secrets
- MINOR: quic: Add structures to maintain key phase information
- MINOR: quic: Optional header protection key for quic_tls_derive_keys()
- MINOR: quic: Add quic_tls_key_update() function for Key Update
- MINOR: quic: Enable the Key Update process
- MINOR: quic: Delete the ODCIDs asap
- BUG/MINOR: vars: Fix the set-var and unset-var converters
- MEDIUM: pool: Following up on previous pool trimming update.
- BUG/MEDIUM: mux-h1: Fix splicing by properly detecting end of message
- BUG/MINOR: mux-h1: Fix splicing for messages with unknown length
- MINOR: mux-h1: Improve H1 traces by adding info about http parsers
- MINOR: mux-h1: register a stats module
- MINOR: mux-h1: add counters instance to h1c
- MINOR: mux-h1: count open connections/streams on stats
- MINOR: mux-h1: add stat for total count of connections/streams
- MINOR: mux-h1: add stat for total amount of bytes received and sent
- REGTESTS: h1: Add a script to validate H1 splicing support
- BUG/MINOR: server: Don't rely on last default-server to init server SSL context
- BUG/MEDIUM: resolvers: Detach query item on response error
- MEDIUM: resolvers: No longer store query items in a list into the response
- BUG/MAJOR: segfault using multiple log forward sections.
- BUG/MEDIUM: h1: Properly reset h1m flags when headers parsing is restarted
- BUG/MINOR: resolvers: Don't overwrite the error for invalid query domain name
- BUILD: bug: Fix error when compiling with -DDEBUG_STRICT_NOCRASH
- BUG/MEDIUM: sample: Fix memory leak in sample_conv_jwt_member_query
- DOC: spoe: Clarify use of the event directive in spoe-message section
- DOC: config: Specify %Ta is only available in HTTP mode
- BUILD: tree-wide: avoid warnings caused by redundant checks of obj_types
- IMPORT: slz: use the correct CRC32 instruction when running in 32-bit mode
- MINOR: quic: fix segfault on CONNECTION_CLOSE parsing
- MINOR: h3: add BUG_ON on control receive function
- MEDIUM: xprt-quic: finalize app layer initialization after ALPN nego
- MINOR: h3: remove duplicated FIN flag position
- MAJOR: mux-quic: implement a simplified mux version
- MEDIUM: mux-quic: implement release mux operation
- MEDIUM: quic: detect the stream FIN
- MINOR: mux-quic: implement subscribe on stream
- MEDIUM: mux-quic: subscribe on xprt if remaining data after send
- MEDIUM: mux-quic: wake up xprt on data transferred
- MEDIUM: mux-quic: handle when sending buffer is full
- MINOR: quic: RX buffer full due to wrong CRYPTO data handling
- MINOR: quic: Race issue when consuming RX packets buffer
- MINOR: quic: QUIC encryption level RX packets race issue
- MINOR: quic: Delete remaining RX handshake packets
- MINOR: quic: Remove QUIC TX packet length evaluation function
- MINOR: hq-interop: fix tx buffering
- MINOR: mux-quic: remove uneeded code to check fin on TX
- MINOR: quic: add HTX EOM on request end
- BUILD: mux-quic: fix compilation with DEBUG_MEM_STATS
- MINOR: http-rules: Add capture action to http-after-response ruleset
- BUG/MINOR: cli/server: Don't crash when a server is added with a custom id
- MINOR: mux-quic: do not release qcs if there is remaining data to send
- MINOR: quic: notify the mux on CONNECTION_CLOSE
- BUG/MINOR: mux-quic: properly initialize flow control
- MINOR: quic: Compilation fix for quic_rx_packet_refinc()
- MINOR: h3: fix possible invalid dereference on htx parsing
- DOC: config: retry-on list is space-delimited
- DOC: config: fix error-log-format example
- BUG/MEDIUM: mworker/cli: crash when trying to access an old PID in prompt mode
- MINOR: hq-interop: refix tx buffering
- REGTESTS: ssl: use X509_V_ERR_UNABLE_TO_GET_ISSUER_CERT_LOCALLY for cert check
- MINOR: cli: "show version" displays the current process version
- CLEANUP: cfgparse: modify preprocessor guards around numa detection code
- MEDIUM: cfgparse: numa detect topology on FreeBSD.
- BUILD: ssl: unbreak the build with newer libressl
- MINOR: vars: Move UPDATEONLY flag test to vars_set_ifexist
- MINOR: vars: Set variable type to ANY upon creation
- MINOR: vars: Delay variable content freeing in var_set function
- MINOR: vars: Parse optional conditions passed to the set-var converter
- MINOR: vars: Parse optional conditions passed to the set-var actions
- MEDIUM: vars: Enable optional conditions to set-var converter and actions
- DOC: vars: Add documentation about the set-var conditions
- REGTESTS: vars: Add new test for conditional set-var
- MINOR: quic: Attach timer task to thread for the connection.
- CLEANUP: quic_frame: Remove a useless suffix to STOP_SENDING
- MINOR: quic: Add traces for STOP_SENDING frame and modify others
- CLEANUP: quic: Remove cdata_len from quic_tx_packet struct
- MINOR: quic: Enable TLS 0-RTT if needed
- MINOR: quic: No TX secret at EARLY_DATA encryption level
- MINOR: quic: Add quic_set_app_ops() function
- MINOR: ssl_sock: Set the QUIC application from ssl_sock_advertise_alpn_protos.
- MINOR: quic: Make xprt support 0-RTT.
- MINOR: qpack: Missing check for truncated QPACK fields
- CLEANUP: quic: Comment fix for qc_strm_cpy()
- MINOR: hq_interop: Stop BUG_ON() truncated streams
- MINOR: quic: Do not mix packet number space and connection flags
- CLEANUP: quic: Shorten a litte bit the traces in lstnr_rcv_pkt()
- MINOR: mux-quic: fix trace on stream creation
- CLEANUP: quic: fix spelling mistake in a trace
- CLEANUP: quic: rename quic_conn conn to qc in quic_conn_free
- MINOR: quic: add missing lock on cid tree
- MINOR: quic: rename constant for haproxy CIDs length
- MINOR: quic: refactor concat DCID with address for Initial packets
- MINOR: quic: compare coalesced packets by DCID
- MINOR: quic: refactor DCID lookup
- MINOR: quic: simplify the removal from ODCID tree
- REGTESTS: vars: Remove useless ssl tunes from conditional set-var test
- MINOR: ssl: Remove empty lines from "show ssl ocsp-response" output
- MINOR: quic: Increase the RX buffer for each connection
- MINOR: quic: Add a function to list remaining RX packets by encryption level
- MINOR: quic: Stop emptying the RX buffer asap.
- MINOR: quic: Do not expect to receive only one O-RTT packet
- MINOR: quic: Do not forget STREAM frames received in disorder
- MINOR: quic: Wrong packet refcount handling in qc_pkt_insert()
- DOC: fix misspelled keyword "resolve_retries" in resolvers
- CLEANUP: quic: rename quic_conn instances to qc
- REORG: quic: move mux function outside of xprt
- MINOR: quic: add reference to quic_conn in ssl context
- MINOR: quic: add const qualifier for traces function
- MINOR: trace: add quic_conn argument definition
- MINOR: quic: use quic_conn as argument to traces
- MINOR: quic: add quic_conn instance in traces for qc_new_conn
- MINOR: quic: Add stream IDs to qcs_push_frame() traces
- MINOR: quic: unchecked qc_retrieve_conn_from_cid() returned value
- MINOR: quic: Wrong dropped packet skipping
- MINOR: quic: Handle the cases of overlapping STREAM frames
- MINOR: quic: xprt traces fixes
- MINOR: quic: Drop asap Retry or Version Negotiation packets
- MINOR: pools: work around possibly slow malloc_trim() during gc
- DEBUG: ssl: make sure we never change a servername on established connections
- MINOR: quic: Add traces for RX frames (flow control related)
- MINOR: quic: Add CONNECTION_CLOSE phrase to trace
- REORG: quic: remove qc_ prefix on functions which not used it directly
- BUG/MINOR: quic: upgrade rdlock to wrlock for ODCID removal
- MINOR: quic: remove unnecessary call to free_quic_conn_cids()
- MINOR: quic: store ssl_sock_ctx reference into quic_conn
- MINOR: quic: remove unnecessary if in qc_pkt_may_rm_hp()
- MINOR: quic: replace usage of ssl_sock_ctx by quic_conn
- MINOR: quic: delete timer task on quic_close()
- MEDIUM: quic: implement refcount for quic_conn
- BUG/MINOR: quic: fix potential null dereference
- BUG/MINOR: quic: fix potential use of uninit pointer
- BUG/MEDIUM: backend: fix possible sockaddr leak on redispatch
- BUG/MEDIUM: peers: properly skip conn_cur from incoming messages
- CI: Github Actions: do not show VTest failures if build failed
- BUILD: opentracing: display warning in case of using OT_USE_VARS at compile time
- MINOR: compat: detect support for dl_iterate_phdr()
- MINOR: debug: add ability to dump loaded shared libraries
- MINOR: debug: add support for -dL to dump library names at boot
- BUG/MEDIUM: ssl: initialize correctly ssl w/ default-server
- REGTESTS: ssl: fix ssl_default_server.vtc
- BUG/MINOR: ssl: free the fields in srv->ssl_ctx
- BUG/MEDIUM: ssl: free the ckch instance linked to a server
- REGTESTS: ssl: update of a crt with server deletion
- BUILD/MINOR: cpuset FreeBSD 14 build fix.
- MINOR: pools: always evict oldest objects first in pool_evict_from_local_cache()
- DOC: pool: document the purpose of various structures in the code
- CLEANUP: pools: do not use the extra pointer to link shared elements
- CLEANUP: pools: get rid of the POOL_LINK macro
- MINOR: pool: allocate from the shared cache through the local caches
- CLEANUP: pools: group list updates in pool_get_from_cache()
- MINOR: pool: rely on pool_free_nocache() in pool_put_to_shared_cache()
- MINOR: pool: make pool_is_crowded() always true when no shared pools are used
- MINOR: pool: check for pool's fullness outside of pool_put_to_shared_cache()
- MINOR: pool: introduce pool_item to represent shared pool items
- MINOR: pool: add a function to estimate how many may be released at once
- MEDIUM: pool: compute the number of evictable entries once per pool
- MINOR: pools: prepare pool_item to support chained clusters
- MINOR: pools: pass the objects count to pool_put_to_shared_cache()
- MEDIUM: pools: centralize cache eviction in a common function
- MEDIUM: pools: start to batch eviction from local caches
- MEDIUM: pools: release cached objects in batches
- OPTIM: pools: reduce local pool cache size to 512kB
- CLEANUP: assorted typo fixes in the code and comments This is 29th iteration of typo fixes
- CI: github actions: update OpenSSL to 3.0.1
- BUILD/MINOR: tools: solaris build fix on dladdr.
- BUG/MINOR: cli: fix _getsocks with musl libc
- BUG/MEDIUM: http-ana: Preserve response's FLT_END analyser on L7 retry
- MINOR: quic: Wrong traces after rework
- MINOR: quic: Add trace about in flight bytes by packet number space
- MINOR: quic: Wrong first packet number space computation
- MINOR: quic: Wrong packet number space computation for PTO
- MINOR: quic: Wrong loss time computation in qc_packet_loss_lookup()
- MINOR: quic: Wrong ack_delay compution before calling quic_loss_srtt_update()
- MINOR: quic: Remove nb_pto_dgrams quic_conn struct member
- MINOR: quic: Wrong packet number space trace in qc_prep_pkts()
- MINOR: quic: Useless test in qc_prep_pkts()
- MINOR: quic: qc_prep_pkts() code moving
- MINOR: quic: Speeding up Handshake Completion
- MINOR: quic: Probe Initial packet number space more often
- MINOR: quic: Probe several packet number space upon timer expiration
- MINOR: quic: Comment fix.
- MINOR: quic: Improve qc_prep_pkts() flexibility
- MINOR: quic: Do not drop secret key but drop the CRYPTO data
- MINOR: quic: Prepare Handshake packets asap after completed handshake
- MINOR: quic: Flag asap the connection having reached the anti-amplification limit
- MINOR: quic: PTO timer too often reset
- MINOR: quic: Re-arm the PTO timer upon datagram receipt
- MINOR: proxy: add option idle-close-on-response
- MINOR: cpuset: switch to sched_setaffinity for FreeBSD 14 and above.
- CI: refactor spelling check
- CLEANUP: assorted typo fixes in the code and comments
- BUILD: makefile: add -Wno-atomic-alignment to work around clang abusive warning
- MINOR: quic: Only one CRYPTO frame by encryption level
- MINOR: quic: Missing retransmission from qc_prep_fast_retrans()
- MINOR: quic: Non-optimal use of a TX buffer
- BUG/MEDIUM: mworker: don't use _getsocks in wait mode
- BUG/MINOR: ssl: Store client SNI in SSL context in case of ClientHello error
- BUG/MAJOR: mux-h1: Don't decrement .curr_len for unsent data
- DOC: internals: document the pools architecture and API
- CI: github actions: clean default step conditions
- BUILD: cpuset: fix build issue on macos introduced by previous change
- MINOR: quic: Remaining TRACEs with connection as firt arg
- MINOR: quic: Reset ->conn quic_conn struct member when calling qc_release()
- MINOR: quic: Flag the connection as being attached to a listener
- MINOR: quic: Wrong CRYPTO frame concatenation
- MINOR: quid: Add traces quic_close() and quic_conn_io_cb()
- REGTESTS: ssl: Fix ssl_errors regtest with OpenSSL 1.0.2
- MINOR: quic: Do not dereference ->conn quic_conn struct member
- MINOR: quic: fix return of quic_dgram_read
- MINOR: quic: add config parse source file
- MINOR: quic: implement Retry TLS AEAD tag generation
- MEDIUM: quic: implement Initial token parsing
- MINOR: quic: define retry_source_connection_id TP
- MEDIUM: quic: implement Retry emission
- MINOR: quic: free xprt tasklet on its thread
- BUG/MEDIUM: connection: properly leave stopping list on error
- MINOR: pools: enable pools with DEBUG_FAIL_ALLOC as well
- MINOR: quic: As server, skip 0-RTT packet number space
- MINOR: quic: Do not wakeup the I/O handler before the mux is started
- BUG/MEDIUM: htx: Adjust length to add DATA block in an empty HTX buffer
- CI: github actions: use cache for OpenTracing
- BUG/MINOR: httpclient: don't send an empty body
- BUG/MINOR: httpclient: set default Accept and User-Agent headers
- BUG/MINOR: httpclient/lua: don't pop the lua stack when getting headers
- BUILD/MINOR: fix solaris build with clang.
- BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl
- CI: refactor OpenTracing build script
- DOC: management: mark "set server ssl" as deprecated
- MEDIUM: cli: yield between each pipelined command
- MINOR: channel: add new function co_getdelim() to support multiple delimiters
- BUG/MINOR: cli: avoid O(bufsize) parsing cost on pipelined commands
- MEDIUM: h2/hpack: emit a Dynamic Table Size Update after settings change
- MINOR: quic: Retransmit the TX frames in the same order
- MINOR: quic: Remove the packet number space TX MT_LIST
- MINOR: quic: Splice the frames which could not be added to packets
- MINOR: quic: Add the number of TX bytes to traces
- CLEANUP: quic: Replace <nb_pto_dgrams> by <probe>
- MINOR: quic: Send two ack-eliciting packets when probing packet number spaces
- MINOR: quic: Probe regardless of the congestion control
- MINOR: quic: Speeding up handshake completion
- MINOR: quic: Release RX Initial packets asap
- MINOR: quic: Release asap TX frames to be transmitted
- MINOR: quic: Probe even if coalescing
- BUG/MEDIUM: cli: Never wait for more data on client shutdown
- BUG/MEDIUM: mcli: do not try to parse empty buffers
- BUG/MEDIUM: mcli: always realign wrapping buffers before parsing them
- BUG/MINOR: stream: make the call_rate only count the no-progress calls
- MINOR: quic: do not use quic_conn after dropping it
- MINOR: quic: adjust quic_conn refcount decrement
- MINOR: quic: fix race-condition on xprt tasklet free
- MINOR: quic: free SSL context on quic_conn free
- MINOR: quic: Add QUIC_FT_RETIRE_CONNECTION_ID parsing case
- MINOR: quic: Wrong packet number space selection
- DEBUG: pools: add new build option DEBUG_POOL_INTEGRITY
- MINOR: quic: add missing include in quic_sock
- MINOR: quic: fix indentation in qc_send_ppkts
- MINOR: quic: remove dereferencement of connection when possible
- MINOR: quic: set listener accept cb on parsing
- MEDIUM: quic/ssl: add new ex data for quic_conn
- MINOR: quic: initialize ssl_sock_ctx alongside the quic_conn
- MINOR: ssl: fix build in release mode
- MINOR: pools: partially uninline pool_free()
- MINOR: pools: partially uninline pool_alloc()
- MINOR: pools: prepare POOL_EXTRA to be split into multiple extra fields
- MINOR: pools: extend pool_cache API to pass a pointer to a caller
- DEBUG: pools: add new build option DEBUG_POOL_TRACING
- DEBUG: cli: add a new "debug dev fd" expert command
- MINOR: fd: register the write side of the poller pipe as well
- CI: github actions: use cache for SSL libs
- BUILD: debug/cli: condition test of O_ASYNC to its existence
- BUILD: pools: fix build error on DEBUG_POOL_TRACING
- MINOR: quic: refactor header protection removal
- MINOR: quic: handle app data according to mux/connection layer status
- MINOR: quic: refactor app-ops initialization
- MINOR: receiver: define a flag for local accept
- MEDIUM: quic: flag listener for local accept
- MINOR: quic: do not manage connection in xprt snd_buf
- MINOR: quic: remove wait handshake/L6 flags on init connection
- MINOR: listener: add flags field
- MINOR: quic: define QUIC flag on listener
- MINOR: quic: create accept queue for QUIC connections
- MINOR: listener: define per-thr struct
- MAJOR: quic: implement accept queue
- CLEANUP: mworker: simplify mworker_free_child()
- BUILD/DEBUG: lru: update the standalone code to support the revision
- DEBUG: lru: use a xorshift generator in the testing code
- BUG/MAJOR: compiler: relax alignment constraints on certain structures
- BUG/MEDIUM: fd: always align fdtab[] to 64 bytes
- MINOR: quic: No DCID length for datagram context
- MINOR: quic: Comment fix about the token found in Initial packets
- MINOR: quic: Get rid of a struct buffer in quic_lstnr_dgram_read()
- MINOR: quic: Remove the QUIC haproxy server packet parser
- MINOR: quic: Add new defintion about DCIDs offsets
- MINOR: quic: Add a list to QUIC sock I/O handler RX buffer
- MINOR: quic: Allocate QUIC datagrams from sock I/O handler
- MINOR: proto_quic: Allocate datagram handlers
- MINOR: quic: Pass CID as a buffer to quic_get_cid_tid()
- MINOR: quic: Convert quic_dgram_read() into a task
- CLEANUP: quic: Remove useless definition
- MINOR: proto_quic: Wrong allocations for TX rings and RX bufs
- MINOR: quic: Do not consume the RX buffer on QUIC sock i/o handler side
- MINOR: quic: Do not reset a full RX buffer
- MINOR: quic: Attach all the CIDs to the same connection
- MINOR: quic: Make usage of by datagram handler trees
- MEDIUM: da: new optional data file download scheduler service.
- MEDIUM: da: update doc and build for new scheduler mode service.
- MEDIUM: da: update module to handle schedule mode.
- MINOR: quic: Drop Initial packets with wrong ODCID
- MINOR: quic: Wrong RX buffer tail handling when no more contiguous data
- MINOR: quic: Iterate over all received datagrams
- MINOR: quic: refactor quic CID association with threads
- BUG/MEDIUM: resolvers: Really ignore trailing dot in domain names
- DEV: flags: Add missing flags
- BUG/MINOR: sink: Use the right field in appctx context in release callback
- MINOR: sock: move the unused socket cleaning code into its own function
- BUG/MEDIUM: mworker: close unused transferred FDs on load failure
- BUILD: atomic: make the old HA_ATOMIC_LOAD() support const pointers
- BUILD: cpuset: do not use const on the source of CPU_AND/CPU_ASSIGN
- BUILD: checks: fix inlining issue on set_srv_agent_[addr,port}
- BUILD: vars: avoid overlapping field initialization
- BUILD: server-state: avoid using not-so-portable isblank()
- BUILD: mux_fcgi: avoid aliasing of a const struct in traces
- BUILD: tree-wide: mark a few numeric constants as explicitly long long
- BUILD: tools: fix warning about incorrect cast with dladdr1()
- BUILD: task: use list_to_mt_list() instead of casting list to mt_list
- BUILD: mworker: include tools.h for platforms without unsetenv()
- BUG/MINOR: mworker: fix a FD leak of a sockpair upon a failed reload
- MINOR: mworker: set the master side of ipc_fd in the worker to -1
- MINOR: mworker: allocate and initialize a mworker_proc
- CI: Consistently use actions/checkout@v2
- REGTESTS: Remove REQUIRE_VERSION=1.8 from all tests
- MINOR: mworker: sets used or closed worker FDs to -1
- MINOR: quic: Try to accept 0-RTT connections
- MINOR: quic: Do not try to treat 0-RTT packets without started mux
- MINOR: quic: Do not try to accept a connection more than one time
- MINOR: quic: Initialize the connection timer asap
- MINOR: quic: Do not use connection struct xprt_ctx too soon
- Revert "MINOR: mworker: sets used or closed worker FDs to -1"
- BUILD: makefile: avoid testing all -Wno-* options when not needed
- BUILD: makefile: validate support for extra warnings by batches
- BUILD: makefile: only compute alternative options if required
- DEBUG: fd: make sure we never try to insert/delete an impossible FD number
- MINOR: mux-quic: add comment
- MINOR: mux-quic: properly initialize qcc flags
- MINOR: mux-quic: do not consider CONNECTION_CLOSE for the moment
- MINOR: mux-quic: create a timeout task
- MEDIUM: mux-quic: delay the closing with the timeout
- MINOR: mux-quic: release idle conns on process stopping
- MINOR: listener: replace the listener's spinlock with an rwlock
- BUG/MEDIUM: listener: read-lock the listener during accept()
- MINOR: mworker/cli: set expert/experimental mode from the CLI
When starting HAProxy in master-worker, the master pre-allocate a struct
mworker_proc and do a socketpair() before the configuration parsing. If
the configuration loading failed, the FD are never closed because they
aren't part of listener, they are not even in the fdtab.
This patch fixes the issue by cleaning the mworker_proc structure that
were not asssigned a process, and closing its FDs.
Must be backported as far as 2.0, the srv_drop() only frees the memory
and could be dropped since it's done before an exec().
When the master process is reloaded on a new config, it will try to
connect to the previous process' socket to retrieve all known
listening FDs to be reused by the new listeners. If listeners were
removed, their unused FDs are simply closed.
However there's a catch. In case a socket fails to bind, the master
will cancel its startup and swithc to wait mode for a new operation
to happen. In this case it didn't close the possibly remaining FDs
that were left unused.
It is very hard to hit this case, but it can happen during a
troubleshooting session with fat fingers. For example, let's say
a config runs like this:
frontend ftp
bind 1.2.3.4:20000-29999
The admin wants to extend the port range down to 10000-29999 and
by mistake ends up with:
frontend ftp
bind 1.2.3.41:20000-29999
Upon restart the bind will fail if the address is not present, and the
master will then switch to wait mode without releasing the previous FDs
for 1.2.3.4:20000-29999 since they're now apparently unused. Then once
the admin fixes the config and does:
frontend ftp
bind 1.2.3.4:10000-29999
The service will start, but will bind new sockets, half of them
overlapping with the previous ones that were not properly closed. This
may result in a startup error (if SO_REUSEPORT is not enabled or not
available), in a FD number exhaustion (if the error is repeated many
times), or in connections being randomly accepted by the process if
they sometimes land on the old FD that nobody listens on.
This patch will need to be backported as far as 1.8, and depends on
previous patch:
MINOR: sock: move the unused socket cleaning code into its own function
Note that before 2.3 most of the code was located inside haproxy.c, so
the patch above should probably relocate the function there instead of
sock.c.
The startup code used to scan the list of unused sockets retrieved from
an older process, and to close them one by one. This also required that
the knowledge of the internal storage of these temporary sockets was
known from outside sock.c and that the code was copy-pasted at every
call place.
This patch moves this into sock.c under the name
sock_drop_unused_old_sockets(), and removes the xfer_sock_list
definition from sock.h since the rest of the code doesn't need to know
this.
This cleanup is minimal and preliminary to a future fix that will need
to be backported to all versions featuring FD transfers over the CLI.
The build on macos was broken by recent commit df91cbd58 ("MINOR: cpuset:
switch to sched_setaffinity for FreeBSD 14 and above."), let's move the
variable declaration inside the ifdef.
Since version 2.5 the master is automatically re-executed in wait-mode
when the config is successfully loaded, puting corner cases of the wait
mode in plain sight.
When using the -x argument and with the right timing, the master will
try to get the FDs again in wait mode even through it's not needed
anymore, which will harm the worker by removing its listeners.
However, if it fails, (and it's suppose to, sometimes), the
master will exit with EXIT_FAILURE because it does not have the
MODE_MWORKER flag, but only the MODE_MWORKER_WAIT flag. With the
consequence of killing the workers.
This patch fixes the issue by restricting the use of _getsocks to some
modes.
This patch must be backported in every version supported, even through
the impact should me more harmless in version prior to 2.5.
This is a second help to dump loaded library names late at boot, once
external code has already been initialized. The purpose is to provide
a format that makes it easy to pass to "tar" to produce an archive
containing the executable and the list of dependencies. For example
if haproxy is started as "haproxy -f foo.cfg", a config check only
will suffice to quit before starting, "-q" will be used to disable
undesired output messages, and -dL will be use to dump libraries.
This will result in such a command to trivially produce a tarball
of loaded libraries:
./haproxy -q -c -dL -f foo.cfg | tar -T - -hzcf archive.tgz
Commit 67e371e ("BUG/MEDIUM: mworker: FD leak of the eventpoll in wait
mode") introduced a regression. Upon a reload it tries to deinit the
poller per thread, but no poll loop was initialized after loading the
configuration.
This patch fixes the issue by moving this part of the code in
mworker_reload(), since this function will be called only when the
poller is fully initialized.
This patch must be backported in 2.5.
Since 2.5, before re-executing in wait mode, the master can have a
working configuration loaded, with a eventpoll fd. This case was not
handled correctly and a new eventpoll FD is leaking in the master at
each reload, which is inherited by the new worker.
Must be backported in 2.5.
Since the wait mode is automatically executed after charging the
configuration, -sf was shown in argv[] with the previous PID, which is
normal, but also the current one. This is only a visual problem when
listing the processes, because -sf does not do anything in wait mode.
Fix the issue by removing the whole "-sf" part in wait mode, but the
executed command can be seen in the argv[] of the latest worker forked.
Must be backported in 2.5.
With the master worker, the seamless reload was still requiring an
external stats socket to the previous process, which is a pain to
configure.
This patch implements a way to use the internal socketpair between the
master and the workers to transfer the sockets during the reload.
This way, the master will always try to transfer the socket, even
without any configuration.
The master will still reload with the -x argument, followed by the
sockpair@ syntax. ( ex -x sockpair@4 ). Which use the FD of internal CLI
to the worker.
Previously, the cleanup of the listeners was done in mworker_loop(),
which was called once the configuration file was parsed. HAProxy was
switching in wait mode when the configuration failed to load, so no
listeners where created.
Since the latest change on the mworker mode, HAProxy switch to wait mode
after successfuly loading the configuration, without cleaning its
listeners, because it was done in mworker_loop, resulting in the master
not closing its listeners and keeping them. The master needs its
configuration to know which listeners it need to close, so that must be
done before the exec().
This patch fixes the problem by cleaning the listeners in the
mworker_reexec() function.
No backport needeed.
Implement a reload failure counter which counts the number of failure
since the last success. This counter is available in 'show proc' over
the master CLI.
Clarify the startup and reload messages:
On a successful configuration load, haproxy will emit "Loading success."
after successfuly forked the children.
When it didn't success to load the configuration it will emit "Loading failure!".
When trying to reload the master process, it will emit "Reloading
HAProxy".
Use the waitpid mode after successfully loading the configuration, this
way the memory will be freed in the master, and will preserve the memory.
This will be useful when doing a reload with a configuration which has
large maps or a lot of SSL certificates, avoiding an OOM because too
much memory was allocated in the master.
nbproc was removed, it's time to remove any reference to the relative
PID in the master-worker, since there can be only 1 current haproxy
process.
This patch cleans up the alerts and warnings emitted during the exit of
a process, as well as the "show proc" output.
A proxy may now references the defaults section it is used. To do so, a
pointer on the default proxy was added in the proxy structure. And a
refcount must be used to track proxies using a default proxy. A default
proxy is destroyed iff its refcount is equal to zero and when it drops to
zero.
All this stuff must be performed during init/deinit staged for now. All
unreferenced default proxies are removed after the configuration parsing.
This patch is mandatory to support TCP/HTTP rules in defaults sections.
This change is required to support TCP/HTTP rules in defaults sections. The
'disabled' bitfield in the proxy structure, used to know if a proxy is
disabled or stopped, is replaced a generic bitfield named 'flags'.
PR_DISABLED and PR_STOPPED flags are renamed to PR_FL_DISABLED and
PR_FL_STOPPED respectively. In addition, everywhere there is a test to know
if a proxy is disabled or stopped, there is now a bitwise AND operation on
PR_FL_DISABLED and/or PR_FL_STOPPED flags.
ha_set_tid() was randomly used either to explicitly set thread 0 or to
set any possibly incomplete thread during boot. Let's replace it with
a pointer to a valid thread or NULL for any thread. This allows us to
check that the designated threads are always valid, and to ignore the
thread 0's mapping when setting it to NULL, and always use group 0 with
it during boot.
The initialization code is also cleaner, as we don't pass ugly casts
of a thread ID to a pointer anymore.
The scheduler contains a lot of stuff that is thread-local and not
exclusively tied to the scheduler. Other parts (namely thread_info)
contain similar thread-local context that ought to be merged with
it but that is even less related to the scheduler. However moving
more data into this structure isn't possible since task.h is high
level and cannot be included everywhere (e.g. activity) without
causing include loops.
In the end, it appears that the task_per_thread represents most of
the per-thread context defined with generic types and should simply
move to tinfo.h so that everyone can use them.
The struct was renamed to thread_ctx and the variable "sched" was
renamed to "th_ctx". "sched" used to be initialized manually from
run_thread_poll_loop(), now it's initialized by ha_set_tid() just
like ti, tid, tid_bit.
The memset() in init_task() was removed in favor of a bss initialization
of the array, so that other subsystems can put their stuff in this array.
Since the tasklet array has TL_CLASSES elements, the TL_* definitions
was moved there as well, but it's not a problem.
The vast majority of the change in this patch is caused by the
renaming of the structures.
This was previously open-coded in run_thread_poll_loop(). Now that
we have clock.c dedicated to such stuff, let's move the code there
so that we don't need to keep such ifdefs nor to depend on the
clock_id.
There is currently a problem related to time keeping. We're mixing
the functions to perform calculations with the os-dependent code
needed to retrieve and adjust the local time.
This patch extracts from time.{c,h} the parts that are solely dedicated
to time keeping. These are the "now" or "before_poll" variables for
example, as well as the various now_*() functions that make use of
gettimeofday() and clock_gettime() to retrieve the current time.
The "tv_*" functions moved there were also more appropriately renamed
to "clock_*".
Other parts used to compute stolen time are in other files, they will
have to be picked next.
It was brought by a variable declared after some statements in commit
21185970c ("MINOR: proc: setting the process to produce a core dump on
FreeBSD."). It's worth noting that some versions of clang seem to ignore
-Wdeclaration-after-statement by default. No backport is needed.
This removes the thread identifiers from struct thread_info and moves
them only in static array in thread.c since it's now the only file that
needs to touch it. It's also the only file that needs to include
pthread.h, beyond haproxy.c which needs it to start the poll loop. As
a result, much less system includes are needed and the LoC reduced by
around 3%.
haproxy.c still has to deal with pthread-specific low-level stuff that
is OS-dependent. We should not have to deal with this there, and we do
not need to access pthread anywhere else.
Let's move these 3 functions to thread.c and keep empty inline ones for
when threads are disabled.
The startup code was still ugly with tons of unreadable nested ifdefs.
Let's just have one function to set up the extra threads and another one
to wait for their completion. The ifdefs are isolated into their own
functions now and are more readable, just like the end of main(), which
now uses the same statements to start thread 0 with and without threads.
Till now the threads startup was quite messy:
- we would start all threads but one
- then we would change all threads' CPU affinities
- then we would manually start the poll loop for the current thread
Let's change this by moving the CPU affinity setting code to a function
set_thread_cpu_affinity() that does this job for the current thread only,
and that is called during the thread's initialization in the polling loop.
It takes care of not doing this for the master, and will result in all
threads to be properly bound earlier and with cleaner code. It also
removes some ugly nested ifdefs.
Move the code to allocate/free the mux cleanup task outside of the polling
loop. A new thread_alloc/free handler is registered for this in
connection.c.
This has the benefit to clean up the polling loop code. And as another
benefit, if the task allocation fails, the handler can report an error
to exit the haproxy process. This prevents a potential null pointer
dereferencing.
This should fix the github issue #1389.
This must be backported up to 2.4.
The vars_init() name is particularly confusing as it does not initialize
the variables code but the head of a list of variables passed in
arguments. And we'll soon need to have proper initialization code, so
let's rename it now.
using the procctl api to set the current process as traceable, thus being able to produce a core dump as well.
making it as compile option if not wished or using freebsd prior to 11.x (last no EOL release).
User reported that the config check returns an error with the message:
"Configuration file has no error but will not start (no listener) => exit(2)."
if the configuration present only a log-forward section with bind or dgram-bind
listeners but no listen/backend nor peer sections.
The process checked if there was 'peers' section avalaible with
an internal frontend (and so a listener) or a 'listen/backend'
section not disabled with at least one configured listener (into the
global proxies_list). Since the log-forward proxies appear in a
different list, they were not checked.
This patch adds a lookup on the 'log-forward' proxies list to check
if one of them presents a listener and is not disabled. And
this is done only if there was no available listener found into
'listen/backend' sections.
I have also studied how to re-work this check considering the 'listeners'
counter used after startup/init to keep the same algo and avoid further
mistakes but currently this counter seems increased during config parsing
and if a proxy is disabled, decreased during startup/init which is done
after the current config check. So the fix still not rely on this
counter.
This patch should fix the github issue #1346
This patch should be backported as far as 2.3 (so on branches
including the "log-forward" feature)
Commit 048368ef6 ("MINOR: deinit: always deinit the init_mutex on
failed initialization") added the missing unlock but forgot to
condition it on USE_THREAD, resulting in a build failure. No
backport is needed.
This addresses oss-fuzz issue 36426.
This undocumented variable is only for internal use, and its sole
presence affects the process' behavior, as shown in bug #1324. It must
not be exported to workers, external checks, nor programs. Let's unset
it before forking programs and workers.
This should be backported as far as 1.8. The worker code might differ
a bit before 2.5 due to the recent removal of multi-process support.
The master-worker code registers an exit handler to deal with configuration
issues during reload, leading to a restart of the master process in wait
mode. But it shouldn't do that when it's expected that the program stops
during config parsing or condition checks, as the reload operation is
unexpectedly called and results in abnormal behavior and even crashes:
$ HAPROXY_MWORKER_REEXEC=1 ./haproxy -W -c -f /dev/null
Configuration file is valid
[NOTICE] (18418) : haproxy version is 2.5-dev2-ee2420-6
[NOTICE] (18418) : path to executable is ./haproxy
[WARNING] (18418) : config : Reexecuting Master process in waitpid mode
Segmentation fault
$ HAPROXY_MWORKER_REEXEC=1 ./haproxy -W -cc 1
[NOTICE] (18412) : haproxy version is 2.5-dev2-ee2420-6
[NOTICE] (18412) : path to executable is ./haproxy
[WARNING] (18412) : config : Reexecuting Master process in waitpid mode
[WARNING] (18412) : config : Reexecuting Master process
Note that the presence of this variable happens by accident when haproxy
is called from within its own programs (see issue #1324), but this should
be the object of a separate fix.
This patch fixes this by preventing the atexit registration in such
situations. This should be backported as far as 1.8. MODE_CHECK_CONDITION
has to be dropped for versions prior to 2.5.
The init_mutex was not unlocked in case an error is encountered during
a thread initialization, and the polling loop was aborted during startup.
In practise it does not have any observable effect since an explicit
exit() is placed there, but it could confuse some debugging tools or
some static analysers, so let's release it as expected.
This addresses issue #1326.
The removal for the shared inter-process cache in commit 6fd0450b4
("CLEANUP: shctx: remove the different inter-process locking techniques")
accidentally removed the enforcement of rlimit_memmax_all which
corresponds to what is passed to the command-line "-m" argument.
Let's restore it.
Thanks to @nafets227 for spotting this. This fixes github issue #1319.
Till now we were dealing with single-word expressions but in order to
extend the configuration condition language a bit more, we'll need to
support slightly more complex expressions involving operators, and we
must absolutely support spaces around them to keep them readable.
As all arguments are pointers to the same line with spaces replaced by
zeroes, we can trivially rebuild the whole line before calling the
condition evaluator, and remove the test for extraneous argument. This
is what this patch does.
The .if/.else/.endif and condition evaluation code is quite dirty and
was dumped into cfgparse.c because it was easy. But it should be tidied
quite a bit as it will need to evolve.
Let's move all that to cfgcond.{c,h}.
I found myself a few times testing some conditoin examples from the doc
against command line's "-cc" to see that they didn't work with environment
variables expansion. Not being documented as being on purpose it looks like
a miss, so let's add PARSE_OPT_ENV and PARSE_OPT_WORD_EXPAND to be able to
test for example -cc "streq(${WITH_SSL},yes)" to help debug expressions.