Implement MT_LIST_POP_LOCKED(), that behaves as MT_LIST_POP() and
removes the first element from the list, if any, but keeps it locked.
This should be backported to 3.2, as it will be used in a bug fix in the
stick tables that affects 3.2 too.
Add some documentation about shm stats file structure to help writing
tools that can parse the file to use the shared stats counters.
This file was written for shm stats file version 1.0 specifically,
it may need to be updated when the shm stats file structure changes
in the future.
This one is a macro and will allocate a properly aligned and sized
object. This will help make sure that the alignment promised to the
compiler is respected.
When memstats is used, the type name is passed as a string into the
.extra field so that it can be displayed in "debug dev memstats". Two
tiny mistakes related to memstats macros were also fixed (calloc
instead of malloc for zalloc), and the doc was also added to document
how to use these calls.
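As an illustration of the idea only, a typed allocation macro can derive both the size and the alignment from the type itself and keep the stringified type name for later reporting. The sketch below is hypothetical: DEMO_ALLOC_TYPED(), demo_aligned_alloc() and struct demo_alloc_info are invented names, not the macros added by this patch, and C11 aligned_alloc() stands in for whatever allocator is really used.

```c
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record of an allocation, loosely modelled on the idea of
 * a memstats ".extra" field carrying the type name as a string.
 */
struct demo_alloc_info {
	const char *type_name;  /* stringified type, e.g. "struct demo_conn" */
	size_t size;            /* sizeof() of the type */
	size_t align;           /* alignof() of the type */
};

/* Hypothetical helper: allocate <size> bytes aligned on <align> bytes and
 * describe the allocation in <info> when it is not NULL.
 */
static inline void *demo_aligned_alloc(size_t size, size_t align,
                                       const char *name,
                                       struct demo_alloc_info *info)
{
	if (info) {
		info->type_name = name;
		info->size = size;
		info->align = align;
	}
	/* sizeof(T) is always a multiple of alignof(T), as C11
	 * aligned_alloc() expects.
	 */
	return aligned_alloc(align, size);
}

/* Size, alignment and name all come from the type, so the alignment
 * promised to the compiler cannot diverge from the one requested.
 */
#define DEMO_ALLOC_TYPED(type, info) \
	((type *)demo_aligned_alloc(sizeof(type), alignof(type), #type, (info)))

/* Example type requiring a 64-byte alignment */
struct demo_conn {
	alignas(64) uint64_t counters[4];
	int fd;
};
```

A caller would write "struct demo_conn *c = DEMO_ALLOC_TYPED(struct demo_conn, &info);" and release the object with free().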
This way we can preserve the entire contents of the released area for
later inspection. This automatically enables comparison at reallocation
time as well (like "integrity" does). If used in combination with
integrity, the comparison is disabled but the check of non-corruption
of the area mangled by integrity is still operated.
The new MEM_F_UAF flag can be set just after a pool's creation to make
this pool UAF for debugging purposes. This allows maintaining a better
overall performance required to reproduce issues while still having a
chance to catch UAF. It will only be used by developers who will manually
add it to areas worth being inspected, though.
Instead of using the thread dump buffer for post-mortem analysis, we'll
keep a copy of the assigned pointer whenever it's used, even for warnings
or "show threads". This will offer more opportunities to figure out from a
core what happened, and will give us more freedom regarding the value of
the thread_dump_buffer itself. For example, even at the end of the dump
when the pointer is reset, the last used buffer is now preserved.
Implement mt_list_try_lock_prev(), that does the same thing
as mt_list_lock_prev(), except that if the list is locked, it
returns { NULL, NULL } instead of waiting.
Each time we go into the watchdog and panic code, it's super hard to
figure out who calls what since signals are involved to bounce between
threads. Let's document the main principles and sequences to ease the
journey next time.
Before this patch, REGISTER_CONFIG_SECTION() allowed registering one and only
one callback (<post>), called after the parsing of a section.
This was limiting because a post callback could not be registered from
anywhere else in the code.
This patch introduces the new REGISTER_CONFIG_SECTION_POST() macro, which allows
registering a new post callback for a section keyword from anywhere.
The feature is implemented by allowing `struct cfg_section` entries that
do not have a `section_parser`, and then iterating over all cfg_section entries
with a post_section_parser for a given keyword.
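The principle can be sketched as follows. This is a simplified model under stated assumptions, not the real implementation: struct demo_cfg_section, the fixed-size table and all demo_* functions are hypothetical, and only the two field names section_parser and post_section_parser come from the description above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a section registration */
struct demo_cfg_section {
	const char *keyword;
	int (*section_parser)(const char *line);  /* NULL for post-only entries */
	int (*post_section_parser)(void);         /* NULL when there is no post callback */
};

#define DEMO_MAX_SECTIONS 16
static struct demo_cfg_section demo_sections[DEMO_MAX_SECTIONS];
static size_t demo_nb_sections;

/* Register a section with its parser and an optional post callback.
 * Returns 0 on success, -1 when the table is full.
 */
static inline int demo_register(const char *kw, int (*parser)(const char *),
                                int (*post)(void))
{
	if (demo_nb_sections >= DEMO_MAX_SECTIONS)
		return -1;
	demo_sections[demo_nb_sections++] =
		(struct demo_cfg_section){ kw, parser, post };
	return 0;
}

/* Register only a post callback: the entry has no section parser */
static inline int demo_register_post(const char *kw, int (*post)(void))
{
	return demo_register(kw, NULL, post);
}

/* Run every post callback registered for <kw>. Returns the number of
 * callbacks run, or -1 as soon as one of them fails.
 */
static inline int demo_run_post(const char *kw)
{
	int called = 0;

	for (size_t i = 0; i < demo_nb_sections; i++) {
		struct demo_cfg_section *cs = &demo_sections[i];

		if (!cs->post_section_parser || strcmp(cs->keyword, kw) != 0)
			continue;
		if (cs->post_section_parser() != 0)
			return -1;
		called++;
	}
	return called;
}

/* Tiny callbacks used to exercise the sketch */
static int demo_hits;
static inline int demo_parse(const char *line) { (void)line; return 0; }
static inline int demo_post_a(void) { demo_hits += 1;  return 0; }
static inline int demo_post_b(void) { demo_hits += 10; return 0; }
```

With demo_register("peers", demo_parse, demo_post_a) followed by demo_register_post("peers", demo_post_b), a call to demo_run_post("peers") runs both callbacks.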
shutdown-backup-sessions action for on-marked-up directive does not work anymore
since the stream_shutdown() function was modified to be async-safe.
When stream_shutdown() was modified to be async-safe, dedicated task events were
added to map the reasons to shut a stream down. SF_ERR_DOWN was mapped to
TASK_F_EVT1 and SF_ERR_KILLED was mapped to TASK_F_EVT2. The reverse mapping was
performed by process_stream() to shut the stream with the appropriate reason.
However, SF_ERR_UP reason, used by shutdown-backup-sessions action to shut a
stream down because a preferred server became available, was not mapped in the
same way. So since commit b8e3b0a18d ("BUG/MEDIUM: stream: make
stream_shutdown() async-safe"), this action is ignored and does not work
anymore.
To fix the issue, and to be able to backport the fix, a third task event was
added. TASK_F_EVT3 is now mapped to SF_ERR_UP.
This patch should fix the issue #2848. It must be backported as far as 2.6.
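The two-way mapping described above can be sketched like this. The constants are hypothetical stand-ins: the real SF_ERR_* and TASK_F_EVT* values live in HAProxy's headers and differ from these, and the function names are invented for the example.

```c
#include <stdint.h>

/* Hypothetical shutdown reasons standing in for SF_ERR_DOWN,
 * SF_ERR_KILLED and SF_ERR_UP.
 */
enum demo_reason {
	DEMO_ERR_NONE = 0,
	DEMO_ERR_DOWN,
	DEMO_ERR_KILLED,
	DEMO_ERR_UP,
};

/* Hypothetical event bits standing in for TASK_F_EVT1..3 */
#define DEMO_EVT1 0x01u
#define DEMO_EVT2 0x02u
#define DEMO_EVT3 0x04u

/* Caller side: turn a shutdown reason into a task event */
static inline uint32_t demo_reason_to_event(enum demo_reason why)
{
	switch (why) {
	case DEMO_ERR_DOWN:   return DEMO_EVT1;
	case DEMO_ERR_KILLED: return DEMO_EVT2;
	case DEMO_ERR_UP:     return DEMO_EVT3;  /* the mapping that was missing */
	default:              return 0;
	}
}

/* Stream side: recover the reason from the task state */
static inline enum demo_reason demo_event_to_reason(uint32_t state)
{
	if (state & DEMO_EVT1)
		return DEMO_ERR_DOWN;
	if (state & DEMO_EVT2)
		return DEMO_ERR_KILLED;
	if (state & DEMO_EVT3)
		return DEMO_ERR_UP;
	return DEMO_ERR_NONE;
}
```

Without the third case, a reason equivalent to SF_ERR_UP yields no event at all and the stream is never shut down, which matches the symptom reported here.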
tasklet_wakeup_on() and its derivatives (tasklet_wakeup_after() and
tasklet_wakeup()) do not support passing a wakeup cause like
task_wakeup(). This is essentially due to an API limitation caused by
the fact that for a very long time the only reason for waking up was
to process pending I/O. But with the growing complexity of mux tasks,
it is becoming important to be able to skip certain heavy processing
when not strictly needed.
One possibility is to permit the caller of tasklet_wakeup() to pass
flags like task_wakeup(). Instead of going with a complex naming scheme,
let's simply make the flags optional and be zero when not specified. This
means that tasklet_wakeup_on() now takes either 2 or 3 args, and that the
third one is the optional flags to be passed to the callee. Eligible flags
are essentially the non-persistent ones (TASK_F_UEVT* and TASK_WOKEN_*)
which are cleared when the tasklet is executed. This way the handler
will find them in its <state> argument and will be able to distinguish
various causes for the call.
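One portable way to obtain a macro taking either 2 or 3 arguments is to append default values and keep only the first three. The sketch below shows that technique on hypothetical names (struct demo_tasklet, demo_wakeup_on(), DEMO_F_UEVT1). It is not claimed to be how tasklet_wakeup_on() itself is written.

```c
#include <stdint.h>

/* Hypothetical minimal tasklet: target thread and pending wakeup flags */
struct demo_tasklet {
	int tid;
	uint32_t state;
};

#define DEMO_F_UEVT1 0x1u  /* hypothetical one-shot user event flag */

/* Real worker: always takes the three arguments */
static inline void demo_wakeup_impl(struct demo_tasklet *tl, int thr,
                                    uint32_t flags)
{
	tl->tid = thr;
	tl->state |= flags;
}

/* Keep the first three arguments and drop the rest. Two zeroes are
 * appended so that the flags default to 0 when omitted and so that the
 * variadic part is never empty.
 */
#define DEMO_WAKEUP_PICK3(tl, thr, flags, ...) \
	demo_wakeup_impl((tl), (thr), (flags))
#define demo_wakeup_on(...) DEMO_WAKEUP_PICK3(__VA_ARGS__, 0, 0)

/* Handler side: non-persistent flags are consumed when the tasklet
 * runs, so the handler sees them once in its state argument.
 */
static inline uint32_t demo_run(struct demo_tasklet *tl)
{
	uint32_t state = tl->state;

	tl->state = 0;
	return state;
}
```

Both demo_wakeup_on(&t, 2) and demo_wakeup_on(&t, 3, DEMO_F_UEVT1) compile. The first one passes zero flags, the second one lets the handler distinguish the cause of the call.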
These are user-defined one-shot events that are application-specific
and reset upon wakeup and were not documented. No backport is needed
since these were added to 3.1.
The buffer ring is problematic in multiple aspects, one of which being
that it is only usable by one entity. With multiplexed protocols, we need
to have shared buffers used by many entities (streams and connection),
and the only way to use the buffer ring model in this case is to have
each entity store its own array, and keep a shared counter on allocated
entries. But even with the default 32 buf and 100 streams per HTTP/2
connection, we're speaking about 32*101*32 bytes = 103424 bytes per H2
connection, just to store up to 32 shared buffers, spread randomly in
these tables. Some users might want to achieve much higher than default
rates over high speed links (e.g. 30-50 MB/s at 100ms), which is 3 to 5
MB storage per connection, hence 180 to 300 buffers. There it starts to
cost a lot, up to 1 MB per connection, just to store buffer indexes.
Instead this patch introduces a variant which we call a buffer list.
That's basically just a free list encoded in an array. Each cell
contains a buffer structure, a next index, and a few flags. The index
could be reduced to 16 bits if needed, in order to make room for a new
struct member. The design permits initializing a whole freelist at once
using memset(0).
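A free list encoded in an array, where the all-zeroes state is a valid empty list, could look like the sketch below. The conventions used here (cell 0 acting as the head, its flags field reused as a high-water mark) are assumptions made for the example and not necessarily the layout used in buf.c. All demo_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a buffer structure */
struct demo_buf {
	char *area;
	size_t size, head, data;
};

/* One cell: a buffer, a next index and a few flags */
struct demo_cell {
	struct demo_buf buf;
	uint32_t next;
	uint32_t flags;
};

/* Cell 0 is never handed out, it serves as the list head:
 *   bl[0].next  = index of the most recently released cell (0 = none)
 *   bl[0].flags = highest index handed out so far (0 = none yet)
 * Both are zero after memset(), which means "everything is free".
 * <nb> is the total number of cells and must be at least 1.
 */
static inline void demo_bl_init(struct demo_cell *bl, uint32_t nb)
{
	memset(bl, 0, (size_t)nb * sizeof(*bl));
}

/* Returns the index of an allocated cell (>= 1), or 0 if none is left */
static inline uint32_t demo_bl_get(struct demo_cell *bl, uint32_t nb)
{
	uint32_t idx = bl[0].next;

	if (idx)
		bl[0].next = bl[idx].next;   /* reuse a released cell */
	else if (bl[0].flags + 1 < nb)
		idx = ++bl[0].flags;         /* take a never-used cell */
	else
		return 0;                    /* list exhausted */

	bl[idx].next = 0;
	return idx;
}

/* Releases cell <idx> (>= 1) by pushing it in front of the free list */
static inline void demo_bl_put(struct demo_cell *bl, uint32_t idx)
{
	memset(&bl[idx].buf, 0, sizeof(bl[idx].buf));
	bl[idx].next = bl[0].next;
	bl[0].next = idx;
}
```

With a 4-cell array, three calls to demo_bl_get() return 1, 2 and 3, a fourth returns 0, and releasing cell 2 makes it the next one to be returned.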
The list pointer is stored at a single location (e.g. the connection)
and all users (the streams) will just have indexes referencing their
first and last assigned entries (head and tail). This means that with
a single table we can now have all our buffers shared between multiple
streams, irrelevant to the number of potential streams which would want
to use them. Now the 180 to 300 entries array only costs 7.2 to 12 kB,
or 80 times less.
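The figures above can be checked with a few lines of arithmetic. The 32-byte entry size comes from the "32*101*32" computation. The 40-byte cell size is inferred from 7.2 kB for 180 entries and is an assumption, as are the function names.

```c
#include <stddef.h>

#define DEMO_RING_ENTRY_SIZE 32  /* bytes per ring entry, from 32*101*32 */
#define DEMO_LIST_CELL_SIZE  40  /* bytes per list cell, inferred from 7.2 kB / 180 */

/* Ring model: each stream plus the connection stores its own array */
static inline size_t demo_ring_cost(size_t entries, size_t streams)
{
	return entries * (streams + 1) * DEMO_RING_ENTRY_SIZE;
}

/* List model: a single shared table, whatever the number of streams */
static inline size_t demo_list_cost(size_t entries)
{
	return entries * DEMO_LIST_CELL_SIZE;
}
```

For 100 streams this gives 103424 bytes for a 32-entry ring and 969600 bytes (about 1 MB) for 300 entries, against 7200 to 12000 bytes for a 180 to 300 entry list, hence a ratio of about 80.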
Two large functions (bl_deinit() & bl_get()) were implemented in buf.c.
A basic doc was added to explain how it works.
This is the second attempt at importing the updated mt_list code (commit
59459ea3). The previous one was attempted with commit c618ed5ff4 ("MAJOR:
import: update mt_list to support exponential back-off") but revealed
problems with QUIC connections and was reverted.
The problem that was faced was that elements deleted inside an iterator
were no longer reset, and that if they were to be recycled in this form,
they could appear as busy to the next user. This was trivially reproduced
with this:
$ cat quic-repro.cfg
global
stats socket /tmp/sock1 level admin
stats timeout 1h
limited-quic
frontend stats
mode http
bind quic4@:8443 ssl crt rsa+dh2048.pem alpn h3
timeout client 5s
stats uri /
$ ./haproxy -db -f quic-repro.cfg &
$ h2load -c 10 -n 100000 --npn h3 https://127.0.0.1:8443/
=> hang
This was purely an API issue caused by the simplified usage of the macros
for the iterator. The original version had two backups (one full element
and one pointer) that the user had to take care of, while the new one only
uses one that is transparent for the user. But during removal, the element
still has to be unlocked if it's going to be reused.
All of this sparked discussions with Fred and Aurélien regarding the still
unclear state of locking. It was found that the lock API does too much at
once and is lacking granularity. The new version offers a much more fine-
grained control allowing to selectively lock/unlock an element, a link,
the rest of the list etc.
It was also found that plenty of places just want to free the current
element, or delete it to do anything with it, hence don't need to reset
its pointers (e.g. event_hdl). Finally it appeared obvious that the
root cause of the problem was the unclear usage of the list iterators
themselves because one does not necessarily expect the element to be
presented locked when not needed, which makes the unlock easy to overlook
during reviews.
The updated version of the list presents explicit lock status in the
macro name (_LOCKED or _UNLOCKED suffixes). When using the _LOCKED
suffix, the caller is expected to unlock the element if it intends to
reuse it. At least the status is advertised. The _UNLOCKED variant,
instead, always unlocks it before starting the loop block. This means
it's not necessary to think about unlocking it, though it's obviously
not usable with everything. A few _UNLOCKED were used at obvious places
(i.e. where the element is deleted and freed without any prior check).
Interestingly, the tests performed last year on QUIC forwarding, that
resulted in limited traffic for the original version and higher bit
rate for the new one couldn't be reproduced because since then the QUIC
stack has gained in efficiency, and the 100 Gbps barrier is now reached
with or without the mt_list update. However the unit tests definitely
show a huge difference, particularly on EPYC platforms where the EBO
provides tremendous CPU savings.
Overall, the following changes are visible from the application code:
- mt_list_for_each_entry_safe() + 1 back elem + 1 back ptr
=> MT_LIST_FOR_EACH_ENTRY_LOCKED() or MT_LIST_FOR_EACH_ENTRY_UNLOCKED()
+ 1 back elem
- MT_LIST_DELETE_SAFE() no longer needed in MT_LIST_FOR_EACH_ENTRY_UNLOCKED()
=> just manually set iterator to NULL however.
For MT_LIST_FOR_EACH_ENTRY_LOCKED()
=> mt_list_unlock_self() (if element going to be reused) + NULL
- MT_LIST_LOCK_ELT => mt_list_lock_full()
- MT_LIST_UNLOCK_ELT => mt_list_unlock_full()
- l = MT_LIST_APPEND_LOCKED(h, e); MT_LIST_UNLOCK_ELT();
=> l=mt_list_lock_prev(h); mt_list_lock_elem(e); mt_list_unlock_full(e, l)
Released version 3.1-dev2 with the following main changes :
- BUG/MINOR: log: fix broken '+bin' logformat node option
- DEBUG: hlua: distinguish burst timeout errors from exec timeout errors
- REGTESTS: ssl: fix some regtests 'feature cmd' start condition
- BUG/MEDIUM: ssl: AWS-LC + TLSv1.3 won't do ECDSA in RSA+ECDSA configuration
- MINOR: ssl: activate sigalgs feature for AWS-LC
- REGTESTS: ssl: activate new SSL reg-tests with AWS-LC
- BUG/MEDIUM: proxy: fix email-alert invalid free
- REORG: mailers: move free_email_alert() to mailers.c
- BUG/MINOR: proxy: fix email-alert leak on deinit() (2nd try)
- DOC: configuration: fix alphabetical order of bind options
- DOC: management: document ptr lookup for table commands
- BUG/MAJOR: quic: fix padding with short packets
- BUG/MAJOR: quic: do not loop on emission on closing/draining state
- MINOR: sample: date converter takes HTTP date and output an UNIX timestamp
- SCRIPTS: git-show-backports: do not truncate git-show output
- DOC: api/event_hdl: small updates, fix an example and add some precisions
- BUG/MINOR: h3: fix crash on STOP_SENDING receive after GOAWAY emission
- BUG/MINOR: mux-quic: fix crash on qcs SD alloc failure
- BUG/MINOR: h3: fix BUG_ON() crash on control stream alloc failure
- BUG/MINOR: quic: fix BUG_ON() on Tx pkt alloc failure
- DEV: flags/show-fd-to-flags: adapt to recent versions
- MINOR: capabilities: export capget and __user_cap_header_struct
- MINOR: capabilities: prepare support for version 3
- MINOR: capabilities: use _LINUX_CAPABILITY_VERSION_3
- MINOR: cli/debug: show dev: add cmdline and version
- MINOR: cli/debug: show dev: show capabilities
- MINOR: debug: print gdb hints when crashing
- BUILD: debug: also declare strlen() in __ABORT_NOW()
- BUILD: Missing inclusion header for ssize_t type
- BUG/MINOR: hlua: report proper context upon error in hlua_cli_io_handler_fct()
- MINOR: cfgparse/log: remove leftover dead code
- BUG/MEDIUM: stick-table: Decrement the ref count inside lock to kill a session
- MINOR: stick-table: Always decrement ref count before killing a session
- REORG: init: do MODE_CHECK_CONDITION logic first
- REORG: init: encapsulate CHECK_CONDITION logic in a func
- REORG: init: encapsulate 'reload' sockpair and master CLI listeners creation
- REORG: init: encapsulate code that reads cfg files
- BUG/MINOR: server: fix first server template name lookup UAF
- MINOR: activity: make the memory profiling hash size configurable at build time
- BUG/MEDIUM: server/dns: prevent DOWN/UP flap upon resolution timeout or error
- BUG/MEDIUM: h3: ensure the ":method" pseudo header is totally valid
- BUG/MEDIUM: h3: ensure the ":scheme" pseudo header is totally valid
- BUG/MEDIUM: quic: fix race-condition in quic_get_cid_tid()
- BUG/MINOR: quic: fix race condition in qc_check_dcid()
- BUG/MINOR: quic: fix race-condition on trace for CID retrieval
Fix an example suggesting that using EVENT_HDL_SUB_TYPE(x, y) with y being
0 was valid. Then add some notes to explain how to use
EVENT_HDL_SUB_FAMILY() and EVENT_HDL_SUB_TYPE() with valid values.
Also mention that the feature is available starting from 2.8 and not 2.7.
Finally, perform some purely cosmetic updates.
This could be backported in 2.8.
Add documentation about the history of the master-worker and how it
was implemented in its first version and how it is currently working.
This is a global view of the architecture, and not an exhaustive
explanation of all mechanisms.
The goal is to indicate how critical the allocation is, between the
least one (growing an existing buffer ring) and the topmost one (boot
time allocation for the life of the process).
The 3 tcp-based muxes (h1, h2, fcgi) use a common allocation function
to try to allocate, otherwise subscribe. There's currently no distinction
of direction nor part that tries to allocate, and this should be revisited
to improve this situation, particularly when we consider that mux-h2 can
reduce its Tx allocations if needed.
For now, 4 main levels are planned, to translate how the data travels
inside haproxy from a producer to a consumer:
- MUX_RX: buffer used to receive data from the OS
- SE_RX: buffer used to place a transformation of the RX data for
a mux, or to produce a response for an applet
- CHANNEL: the channel buffer for sync recv
- MUX_TX: buffer used to transfer data from the channel to the outside,
generally a mux but there can be a few specificities (e.g.
http client's response buffer passed to the application,
which also gets a transformation of the channel data).
The other levels are a bit different in that they don't strictly need to
allocate for the first two ones, or they're permanent for the last one
(used by compression).
Released version 2.9-dev9 with the following main changes :
- DOC: internal: filters: fix reference to entities.pdf
- BUG/MINOR: ssl: load correctly @system-ca when ca-base is define
- MINOR: lua: Add flags to configure logging behaviour
- MINOR: lua: change tune.lua.log.stderr default from 'on' to 'auto'
- BUG/MINOR: backend: fix wrong BUG_ON for avail conn
- BUG/MAJOR: backend: fix idle conn crash under low FD
- MINOR: backend: refactor insertion in avail conns tree
- DEBUG: mux-h2/flags: fix list of h2c flags used by the flags decoder
- BUG/MEDIUM: server/log: "mode log" after server keyword causes crash
- MINOR: connection: add conn_pr_mode_to_proto_mode() helper func
- BUG/MEDIUM: server: "proto" not working for dynamic servers
- MINOR: server: add helper function to detach server from proxy list
- DEBUG: add a tainted flag when ha_panic() is called
- DEBUG: lua: add tainted flags for stuck Lua contexts
- DEBUG: pools: detect that malloc_trim() is in progress
- BUG/MINOR: quic: do not consider idle timeout on CLOSING state
- MINOR: frontend: implement a dedicated actconn increment function
- BUG/MINOR: ssl: use a thread-safe sslconns increment
- MEDIUM: quic: count quic_conn instance for maxconn
- MEDIUM: quic: count quic_conn for global sslconns
- BUG/MINOR: ssl: suboptimal certificate selection with TLSv1.3 and dual ECDSA/RSA
- REGTESTS: ssl: update the filters test for TLSv1.3 and sigalgs
- BUG/MINOR: mux-quic: fix early close if unset client timeout
- BUG/MEDIUM: ssl: segfault when cipher is NULL
- BUG/MINOR: tcpcheck: Report hexstring instead of binary one on check failure
- MEDIUM: systemd: be more verbose about the reload
- MINOR: sample: Add fetcher for getting all cookie names
- BUG/MINOR: proto_reverse_connect: support SNI on active connect
- MINOR: proxy/stktable: add resolve_stick_rule helper function
- BUG/MINOR: stktable: missing free in parse_stick_table()
- BUG/MINOR: cfgparse/stktable: fix error message on stktable_init() failure
- MINOR: stktable: stktable_init() sets err_msg on error
- MINOR: stktable: check if a type should be used as-is
- MEDIUM: stktable/peers: "write-to" local table on peer updates
- CI: github: update wolfSSL to 5.6.4
- DOC: install: update the wolfSSL required version
- MINOR: server: Add parser support for set-proxy-v2-tlv-fmt
- MINOR: connection: Send out generic, user-defined server TLVs
- BUG/MEDIUM: pattern: don't trim pools under lock in pat_ref_purge_range()
- MINOR: mux-h2: always use h2_send() in h2_done_ff(), not h2_process()
- OPTIM: mux-h2: call h2_send() directly from h2_snd_buf()
- BUG/MINOR: server: remove some incorrect free() calls on null elements
This reverts commit c618ed5ff41ce29454e784c610b23bad0ea21f4f.
The list iterator is broken. As found by Fred, running QUIC single-
threaded shows that only the first connection is accepted because the
accepter relies on the element being initialized once detached (which
is expected and matches what MT_LIST_DELETE_SAFE() used to do before).
However while doing this in the quic_sock code seems to work, doing it
inside the macro shows total breakage and the unit test doesn't work
anymore (random crashes). Thus it looks like the fix is not trivial,
let's roll this back for the time it will take to fix the loop.
The new mt_list code supports exponential back-off on conflict, which
is important for use cases where there is contention on a large number
of threads. The API evolved a little bit and required some updates:
- mt_list_for_each_entry_safe() is now in upper case to explicitly
show that it is a macro, and only uses the back element, doesn't
require a secondary pointer for deletes anymore.
- MT_LIST_DELETE_SAFE() doesn't exist anymore, instead one just has
to set the list iterator to NULL so that it is not re-inserted
into the list and the list is spliced there. One must be careful
because it was usually performed before freeing the element. Now
instead the element must be nulled before the continue/break.
- MT_LIST_LOCK_ELT() and MT_LIST_UNLOCK_ELT() have always been
unclear. They were replaced by mt_list_cut_around() and
mt_list_connect_elem() which more explicitly detach the element
and reconnect it into the list.
- MT_LIST_APPEND_LOCKED() was only in haproxy so it was left as-is
in list.h. It may however possibly benefit from being upstreamed.
This required tiny adaptations to event_hdl.c and quic_sock.c. The
test case was updated and the API doc added. Note that in order to
keep include files small, the struct mt_list definition remains in
list-t.h (part of the internal API) and was ifdef'd out in mt_list.h.
A test on QUIC with both quictls 1.1.1 and wolfssl 5.6.3 on ARM64 with
80 threads shows a drastic reduction of CPU usage thanks to this and
the refined memory barriers. Please note that the CPU usage on OpenSSL
3.0.9 is significantly higher due to the excessive use of atomic ops
by openssl, but 3.1 is only slightly above 1.1.1 though:
- before: 35 Gbps, 3.5 Mpps, 7800% CPU
- after: 41 Gbps, 4.2 Mpps, 2900% CPU
More and more utility functions rely on the trash while most of the init
code doesn't have access to it because it's initialized very late (in
PRE_CHECK for the initial one). It's a pool, and it purposely supports
being reallocated, so let's initialize it in STG_POOL so that early
STG_INIT code can at least use it.
These were docs for very old design thoughts or internal subsystems
which are now totally irrelevant and even misleading. Those with some
outdated ideas mixed with useful stuff were kept though.
The conditions where ERR, EOS and EOI are found are not always
crystal clear, and the fact that there's still a good bunch of
original ones dating from the early days and that seem to test for
non-existing cases doesn't help either.
After auditing the code base and projecting the 3 main muxes' stream
termination conditions, with Christopher and Amaury we could establish
the current flags matrix which indicates both what each combination
means for each mux and when it is set by each of them (or not set and
for what reason).
It should be sufficient to avoid doubts when adding code or when chasing
a bug.
It *must not* be backported because it is highly specific to the latest
2.8-dev.
Released version 2.8-dev7 with the following main changes :
- BUG/MINOR: stats: Don't replace sc_shutr() by SE_FL_EOS flag yet
- BUG/MEDIUM: mux-h2: Be able to detect connection error during handshake
- BUG/MINOR: quic: Missing padding in very short probe packets
- MINOR: proxy/pool: prevent unnecessary calls to pool_gc()
- CLEANUP: proxy: remove stop_time related dead code
- DOC/MINOR: reformat configuration.txt's "quoting and escaping" table
- MINOR: http_fetch: Add support for empty delim in url_param
- MINOR: http_fetch: add case insensitive support for smp_fetch_url_param
- MINOR: http_fetch: Add case-insensitive argument for url_param/urlp_val
- REGTESTS : Add test support for case insentitive for url_param
- BUG/MEDIUM: proxy/sktable: prevent watchdog trigger on soft-stop
- BUG/MINOR: backend: make be_usable_srv() consistent when stopping
- BUG/MINOR: ssl: Remove dead code in cli_parse_update_ocsp_response
- BUG/MINOR: ssl: Fix potential leak in cli_parse_update_ocsp_response
- BUG/MINOR: ssl: ssl-(min|max)-ver parameter not duplicated for bundles in crt-list
- BUG/MINOR: quic: Wrong use of now_ms timestamps (cubic algo)
- MINOR: quic: Add recovery related information to "show quic"
- BUG/MINOR: quic: Wrong use of now_ms timestamps (newreno algo)
- BUG/MINOR: quic: Missing max_idle_timeout initialization for the connection
- MINOR: quic: Implement cubic state trace callback
- MINOR: quic: Adjustments for generic control congestion traces
- MINOR: quic: Traces adjustments at proto level.
- MEDIUM: quic: Ack delay implementation
- BUG/MINOR: quic: Wrong rtt variance computing
- MINOR: cli: support filtering on FD types in "show fd"
- MINOR: quic: Add a fake congestion control algorithm named "nocc"
- CI: run smoke tests on config syntax to check memory related issues
- CLEANUP: assorted typo fixes in the code and comments
- CI: exclude doc/{design-thoughts,internals} from spell check
- BUG/MINOR: quic: Remaining useless statements in cubic slow start callback
- BUG/MINOR: quic: Cubic congestion control window may wrap
- MINOR: quic: Add missing traces in cubic algorithm implementation
- BUG/MAJOR: quic: Congestion algorithms states shared between the connection
- BUG/MINOR: ssl: Undefined reference when building with OPENSSL_NO_DEPRECATED
- BUG/MINOR: quic: Remove useless BUG_ON() in newreno and cubic algo implementation
- MINOR: http-act: emit a warning when a header field name contains forbidden chars
- DOC: config: strict-sni allows to start without certificate
- MINOR: quic: Add trace to debug idle timer task issues
- BUG/MINOR: quic: Unexpected connection closures upon idle timer task execution
- BUG/MINOR: quic: Wrong idle timer expiration (during 20s)
- BUILD: quic: 32bits compilation issue in cli_io_handler_dump_quic()
- BUG/MINOR: quic: Possible wrong PTO computing
- BUG/MINOR: tcpcheck: Be able to expect an empty response
- BUG/MEDIUM: stconn: Add a missing return statement in sc_app_shutr()
- BUG/MINOR: stream: Fix test on channels flags to set clientfin/serverfin touts
- MINOR: applet: Uninline appctx_free()
- MEDIUM: applet/trace: Register a new trace source with its events
- CLEANUP: stconn: Remove remaining debug messages
- BUG/MEDIUM: channel: Improve reports for shut in co_getblk()
- BUG/MEDIUM: dns: Properly handle error when a response consumed
- MINOR: stconn: Remove unecessary test on SE_FL_EOS before receiving data
- MINOR: stconn/channel: Move CF_READ_DONTWAIT into the SC and rename it
- MINOR: stconn/channel: Move CF_SEND_DONTWAIT into the SC and rename it
- MINOR: stconn/channel: Move CF_NEVER_WAIT into the SC and rename it
- MINOR: stconn/channel: Move CF_EXPECT_MORE into the SC and rename it
- MINOR: mux-pt: Report end-of-input with the end-of-stream after a read
- BUG/MINOR: mux-h1: Properly report EOI/ERROR on read0 in h1_rcv_pipe()
- CLEANUP: mux-h1/mux-pt: Remove useless test on SE_FL_SHR/SE_FL_SHW flags
- MINOR: mux-h1: Report an error to the SE descriptor on truncated message
- MINOR: stconn: Always ack EOS at the end of sc_conn_recv()
- MINOR: stconn/applet: Handle EOI in the applet .wake callback function
- MINOR: applet: No longer set EOI on the SC
- MINOR: stconn/applet: Handle EOS in the applet .wake callback function
- MEDIUM: cache: Use the sedesc to report and detect end of processing
- MEDIUM: cli: Use the sedesc to report and detect end of processing
- MINOR: dns: Remove the test on the opposite SC state to send requests
- MEDIUM: dns: Use the sedesc to report and detect end of processing
- MEDIUM: spoe: Use the sedesc to report and detect end of processing
- MEDIUM: hlua/applet: Use the sedesc to report and detect end of processing
- MEDIUM: log: Use the sedesc to report and detect end of processing
- MEDIUM: peers: Use the sedesc to report and detect end of processing
- MINOR: sink: Remove the tests on the opposite SC state to process messages
- MEDIUM: sink: Use the sedesc to report and detect end of processing
- MEDIUM: stats: Use the sedesc to report and detect end of processing
- MEDIUM: promex: Use the sedesc to report and detect end of processing
- MEDIUM: http_client: Use the sedesc to report and detect end of processing
- MINOR: stconn/channel: Move CF_EOI into the SC and rename it
- MEDIUM: tree-wide: Move flags about shut from the channel to the SC
- MINOR: tree-wide: Simplifiy some tests on SHUT flags by accessing SCs directly
- MINOR: stconn/applet: Add BUG_ON_HOT() to be sure SE_FL_EOS is never set alone
- MINOR: server: add SRV_F_DELETED flag
- BUG/MINOR: server/del: fix srv->next pointer consistency
- BUG/MINOR: stats: properly handle server stats dumping resumption
- BUG/MINOR: sink: free forward_px on deinit()
- BUG/MINOR: log: free log forward proxies on deinit()
- MINOR: server: always call ssl->destroy_srv when available
- MINOR: server: correctly free servers on deinit()
- BUG/MINOR: hlua: hook yield does not behave as expected
- MINOR: hlua: properly handle hlua_process_task HLUA_E_ETMOUT
- BUG/MINOR: hlua: enforce proper running context for register_x functions
- MINOR: hlua: Fix two functions that return nothing useful
- MEDIUM: hlua: Dynamic list of frontend/backend in Lua
- MINOR: hlua_fcn: alternative to old proxy and server attributes
- MEDIUM: hlua_fcn: dynamic server iteration and indexing
- MEDIUM: hlua_fcn/api: remove some old server and proxy attributes
- CLEANUP: hlua: fix conflicting comment in hlua_ctx_destroy()
- MINOR: hlua: add simple hlua reference handling API
- MINOR: hlua: fix return type for hlua_checkfunction() and hlua_checktable()
- BUG/MINOR: hlua: fix reference leak in core.register_task()
- BUG/MINOR: hlua: fix reference leak in hlua_post_init_state()
- BUG/MINOR: hlua: prevent function and table reference leaks on errors
- CLEANUP: hlua: use hlua_ref() instead of luaL_ref()
- CLEANUP: hlua: use hlua_pushref() instead of lua_rawgeti()
- CLEANUP: hlua: use hlua_unref() instead of luaL_unref()
- MINOR: hlua: simplify lua locking
- BUG/MEDIUM: hlua: prevent deadlocks with main lua lock
- MINOR: hlua_fcn: add server->get_rid() method
- MINOR: hlua: support for optional arguments to core.register_task()
- DOC: lua: silence "literal block ends without a blank line" Sphinx warnings
- DOC: lua: silence "Unexpected indentation" Sphinx warnings
- BUG/MINOR: event_hdl: fix rid storage type
- BUG/MINOR: event_hdl: make event_hdl_subscribe thread-safe
- MINOR: event_hdl: global sublist management clarification
- BUG/MEDIUM: event_hdl: clean soft-stop handling
- BUG/MEDIUM: event_hdl: fix async data refcount issue
- MINOR: event_hdl: normal tasks support for advanced async mode
- MINOR: event_hdl: add event_hdl_async_equeue_isempty() function
- MINOR: event_hdl: add event_hdl_async_equeue_size() function
- MINOR: event_hdl: pause/resume for subscriptions
- MINOR: proxy: add findserver_unique_id() and findserver_unique_name()
- MEDIUM: hlua/event_hdl: initial support for event handlers
- MINOR: hlua/event_hdl: per-server event subscription
- EXAMPLES: add basic event_hdl lua example script
- MINOR: http-ana: Add a HTTP_MSGF flag to state the Expect header was checked
- BUG/MINOR: http-ana: Don't switch message to DATA when waiting for payload
- BUG/MINOR: quic: Possible crashes in qc_idle_timer_task()
- MINOR: quic: derive first DCID from client ODCID
- MINOR: quic: remove ODCID dedicated tree
- MINOR: quic: remove address concatenation to ODCID
- BUG/MINOR: mworker: unset more internal variables from program section
- BUG/MINOR: errors: invalid use of memprintf in startup_logs_init()
- MINOR: applet: Use unsafe version to get stream from SC in the trace function
- BUG/MUNOR: http-ana: Use an unsigned integer for http_msg flags
- MINOR: compression: Make compression offload a flag
- MINOR: compression: Prepare compression code for request compression
- MINOR: compression: Store algo and type for both request and response
- MINOR: compression: Count separately request and response compression
- MEDIUM: compression: Make it so we can compress requests as well.
- BUG/MINOR: lua: remove incorrect usage of strncat()
- CLEANUP: tcpcheck: remove the only occurrence of sprintf() in the code
- CLEANUP: ocsp: do no use strpcy() to copy a path!
- CLEANUP: tree-wide: remove strpcy() from constant strings
- CLEANUP: opentracing: remove the last two occurrences of strncat()
- BUILD: compiler: fix __equals_1() on older compilers
- MINOR: compiler: define a __attribute__warning() macro
- BUILD: bug.h: add a warning in the base API when unsafe functions are used
- BUG/MEDIUM: listeners: Use the right parameters for strlcpy2().
Advanced async mode (EVENT_HDL_ASYNC_TASK) provided full support for
custom tasklets registration.
Due to the similarities between tasks and tasklets, it may be useful
to use the advanced mode with an existing task (not a tasklet).
While the API did not explicitly disallow this usage, things would
get bad if we try to wake up a task using tasklet_wakeup() for notifying
the task about new events.
To make the API support both custom tasks and tasklets, we use the
TASK_IS_TASKLET() macro to call the proper waking function depending
on the task's type:
- For tasklets: we use tasklet_wakeup()
- For tasks: we use task_wakeup()
If 68e692da0 ("MINOR: event_hdl: add event handler base api")
is being backported, then this commit should be backported with it.
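The dispatch described above amounts to a type check before the wakeup. In the sketch below, the flag, the structure and the two wakeup functions are hypothetical stand-ins that only count calls. Only the idea of testing the task's type, as TASK_IS_TASKLET() does, comes from the description.

```c
#include <stdint.h>

#define DEMO_F_TASKLET 0x1u  /* hypothetical flag marking a tasklet */

/* Hypothetical handle that may designate a task or a tasklet */
struct demo_task {
	uint32_t state;
	int woken_as_task;     /* number of task-style wakeups */
	int woken_as_tasklet;  /* number of tasklet-style wakeups */
};

#define DEMO_IS_TASKLET(t) (((t)->state & DEMO_F_TASKLET) != 0)

/* Stand-ins for task_wakeup() and tasklet_wakeup() */
static inline void demo_task_wakeup(struct demo_task *t)
{
	t->woken_as_task++;
}

static inline void demo_tasklet_wakeup(struct demo_task *t)
{
	t->woken_as_tasklet++;
}

/* Notify the subscriber about new events with the proper function */
static inline void demo_notify(struct demo_task *t)
{
	if (DEMO_IS_TASKLET(t))
		demo_tasklet_wakeup(t);
	else
		demo_task_wakeup(t);
}
```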
The same change was already performed for the cli. The stats applet and the
prometheus exporter are also concerned. Both use the stats API and rely on
pool functions to get the total pool usage in bytes. pool_total_allocated()
and pool_total_used() must return 64-bit unsigned integers to avoid wrapping
around 4G.
This may be backported to all versions.
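To illustrate why the return type matters, here is a minimal sketch that sums
per-pool usage with a 32-bit and with a 64-bit accumulator. The demo_pool
structure and both functions are hypothetical and only model the arithmetic;
they are not the real pool API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pool descriptor: object size and number of allocated objects. */
struct demo_pool {
	uint32_t size;
	uint32_t allocated;
};

/* 32-bit accumulator: the total silently wraps modulo 2^32 (4G). */
static uint32_t demo_total_allocated32(const struct demo_pool *p, size_t n)
{
	uint32_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += p[i].size * p[i].allocated;
	return total;
}

/* 64-bit accumulator: operands are widened before the multiplication. */
static uint64_t demo_total_allocated64(const struct demo_pool *p, size_t n)
{
	uint64_t total = 0;

	assert(p != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		total += (uint64_t)p[i].size * p[i].allocated;
	return total;
}
```

With two pools of 16384-byte objects holding 200000 and 100000 objects, the
real total is 4915200000 bytes, which no longer fits in 32 bits.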
Until now it was only possible to change the thread-local hot cache size
at build time using CONFIG_HAP_POOL_CACHE_SIZE. But during benchmarks,
heavy contention was sometimes noticed in the lower-level memory
allocators, indicating that larger caches could be beneficial, especially
on CPUs with large L2 caches.
Given that the checks against this value are no longer on a hot path,
there was no reason to keep forcing it to be tuned at build time. So
this patch allows it to be set using tune.memory-hot-size.
It's worth noting that during the boot phase the value remains zero so
that it's possible to know if the value was set or not, which opens the
possibility that we try to automatically adjust it based on the per-cpu
L2 cache size or the use of certain protocols (none of this is done yet).
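The "zero means not set" convention can be sketched like this. All names are
hypothetical, and the default value is a placeholder standing in for a
build-time setting such as CONFIG_HAP_POOL_CACHE_SIZE; the real code may
differ.

```c
#include <assert.h>

/* Placeholder default (assumption), standing in for the build-time value. */
#define DEMO_DEFAULT_HOT_SIZE 524288u

/* Remains zero during boot, meaning "not configured yet". */
static unsigned int demo_hot_size = 0;

/* Would be called by a (hypothetical) config keyword parser.
 * Returns 0 on success, -1 if the value is rejected. */
static int demo_set_hot_size(long value)
{
	if (value <= 0 || value > 0x7fffffffL)
		return -1;
	demo_hot_size = (unsigned int)value;
	return 0;
}

/* Called once after parsing. Returns 1 if the default had to be applied,
 * 0 if the user had already set a value. */
static int demo_finalize_hot_size(void)
{
	int used_default = (demo_hot_size == 0);

	if (used_default)
		demo_hot_size = DEMO_DEFAULT_HOT_SIZE;
	assert(demo_hot_size != 0);
	return used_default;
}
```

Because the finalization step can tell "unset" from "set", it is also the
natural place for any future automatic sizing.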
Since the massive pools cleanup that happened in 2.6, the pools
architecture has become much more hierarchical and many alternate code
blocks could be moved to runtime flags set by -dM. One of them had not
been converted by then: DEBUG_UAF. It's actually not much more difficult,
since it only acts on a pair of function indirections on the slow path
(OS-level allocator) and a default setting for the cache activation.
This patch adds the "uaf" setting to the options permitted in -dM so
that it now becomes possible to set or unset UAF at boot time without
recompiling. This is particularly convenient, because every 3 months on
average, developers ask a user to recompile haproxy with DEBUG_UAF to
understand a bug. This will no longer be needed: the user will only
have to disable pools and enable uaf using -dMuaf. Note that -dMuaf
only disables previously enabled pools, and it remains possible to
re-enable caching by specifying "cache" afterwards, as in -dMuaf,cache.
A few tests with this mode show that it can be an interesting combination
which catches significantly fewer UAFs but does so with much less
overhead, so it might be compatible with some high-traffic deployments.
The change is very small and isolated. It could be helpful to backport
this at least to 2.7 once confirmed not to cause build issues on exotic
systems, and even to 2.6 a bit later as this has proven to be useful
over time, and could be even more if it did not require a rebuild. If
a backport is desired, the following patches are needed as well:
CLEANUP: pools: move the write before free to the uaf-only function
CLEANUP: pool: only include pool-os from pool.c not pool.h
REORG: pool: move all the OS specific code to pool-os.h
CLEANUP: pools: get rid of CONFIG_HAP_POOLS
DEBUG: pool: show a few examples in -dMhelp
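The left-to-right semantics of an option list such as "uaf,cache" can be
sketched as below. This is not the actual -dM parser: the flag names and
demo_parse_dm() are invented for illustration, and only the two keywords
mentioned above are handled.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical debugging flags for this sketch. */
#define DEMO_DBG_UAF      0x01u  /* use-after-free detection enabled */
#define DEMO_DBG_NO_CACHE 0x02u  /* per-thread caches disabled */

/* Applies a comma-separated option list to <flags>, left to right.
 * Returns 0 on success, -1 on an unknown, empty or over-long option. */
static int demo_parse_dm(const char *args, unsigned int *flags)
{
	char word[32];

	assert(args != NULL && flags != NULL);
	while (*args) {
		size_t len = strcspn(args, ",");

		if (len >= sizeof(word))
			return -1;
		memcpy(word, args, len);
		word[len] = '\0';

		if (strcmp(word, "uaf") == 0)
			*flags |= DEMO_DBG_UAF | DEMO_DBG_NO_CACHE; /* uaf turns caching off */
		else if (strcmp(word, "cache") == 0)
			*flags &= ~DEMO_DBG_NO_CACHE;               /* a later "cache" turns it back on */
		else
			return -1;

		args += len;
		if (*args == ',')
			args++;
	}
	return 0;
}
```

Since options are applied in order, "uaf,cache" ends with UAF detection and
caching both active, whereas "uaf" alone leaves caching disabled.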
This is an initial work for the dedicated
event handler API internal documentation.
The file is located at doc/internals/api/event_hdl.txt
event_hdl feature has been introduced with:
MINOR: event_hdl: add event handler base api
Let's keep these notes as references for later use. Polling on connect()
can sometimes return a few unexpected state combinations that such tests
illustrate. They can serve as reminders for special error handling.
The "sequence" and "entities" diagrams have become so outdated that
they are confusing at best, and more generally wrong. Let's simply remove
them.
The "layers" mini-doc shows how streams, stconn, sedesc, conns, applets
and muxes interact, with field names, pointers and invariants. It should
be completed but already provides a quick overview about what can be
guaranteed at any step and at different layers.
The stream connector replaced the conn_stream and the sc_conn_io_cb()
function appeared. There's no place there to mention the endpoint
descriptor, but a separate diagram showing the relation between stream
and endpoint via the connector would be nice.
This adds a call to function <fct> to the list of functions to be called at
the step just before the configuration validity checks. This is useful when you
need to create objects the same way the configuration parser would have
created them, and where their initialization should then continue in the
configuration check.
It could be used for example to generate a proxy with multiple servers using
the configuration parser itself. At this step the trash buffers are allocated.
Threads are not yet started so no protection is required. The function is
expected to return non-zero on success, or zero on failure. A failure will make
the process emit a succinct error message and immediately exit.
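The registration and calling convention can be sketched as follows. The
demo_* names and the fixed-size table are assumptions made for this example
and do not reflect the real implementation; in particular, the sketch returns
zero to its caller instead of emitting an error and exiting.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_PRE_CHECK 16

/* A pre-check callback returns non-zero on success, zero on failure. */
typedef int (*demo_pre_check_fct)(void);

static demo_pre_check_fct demo_pre_checks[DEMO_MAX_PRE_CHECK];
static size_t demo_nb_pre_checks;

/* Appends <fct> to the list. Returns 1 on success, 0 if the table is full. */
static int demo_register_pre_check(demo_pre_check_fct fct)
{
	assert(fct != NULL);
	if (demo_nb_pre_checks >= DEMO_MAX_PRE_CHECK)
		return 0;
	demo_pre_checks[demo_nb_pre_checks++] = fct;
	return 1;
}

/* Runs the callbacks in registration order and stops at the first failure.
 * Returns 1 if all of them succeeded, 0 otherwise. */
static int demo_run_pre_checks(void)
{
	for (size_t i = 0; i < demo_nb_pre_checks; i++)
		if (!demo_pre_checks[i]())
			return 0;
	return 1;
}

/* Two sample callbacks that record how many times they were called. */
static int demo_calls;
static int demo_ok(void)   { demo_calls++; return 1; }
static int demo_fail(void) { demo_calls++; return 0; }
```

No locking is used here, which matches the statement that threads are not
yet started at this step.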
The STG_REGISTER init level is used to register known keywords and
protocol stacks. It must be called earlier because some of the init
code already relies on it to be known. For example, "haproxy -vv"
for now is constrained to start very late only because of this.
This patch moves it between STG_LOCK and STG_ALLOC, which is fine as
it's used for static registration.