This was reported by Coverity. In sample_conv_q_preferred() function, a goto
statement after a "while(1)" loop is unreachable. Instead of just removing
it, the same goto statement in the loop is replaced by a break. It is safer
this way, in case the loop change in future.
This patch should fix the issue #2683.
The receiver protocol is always set when a listener is created or cloned. At
least for now. And there is no check on it at many places, except in
listener_accept() function. So, let's remove remaining useless checks. That
will avoid false Coverity reports in future.
This patch should fix the issue #2631.
This bug arrived with this commit:
cdfceb10a MINOR: quic: refactor qc_prep_pkts() loop
which prevents haproxy from sending PING only packets/datagrams (some
packets/datagrams with only PING frame as ack-eliciting frames inside).
Such packets/datagrams are useful in rare cases during retransmissions
when one wants to probe the peer without exceeding the anti-amplification
limit.
Modify the condition passed to qc_build_pkt() to add padding to the current
datagram. One does not want to do that when probing the peer without ack-eliciting
frames passed as <frms> parameter. Indeed qc_build_pkt() calls qc_do_build_pkt()
which supports this case: if <probe> is true (probing required), qc_do_build_pkt()
handles the case where some padding must be added to a PING only packet/datagram.
This is the case when probing with an empty <frms> frame list of ack-eliciting
frames without exceeding the anti-amplification limit from qc_dgrams_retransmit().
Add some comments to qc_build_pkt() and qc_do_build_pkt() to clarify this
as this code is easy to break!
Thank you for @Tristan971 for having reported this issue in GH #2709.
Must be backported to 3.0.
Add a BUG_ON() to detect some malformed packets which are supposed to probe the
peer without being ack-eliciting: the peer would not acknowledged such packets.
These functions return a symbolic error code such as ECONNRESET to keep
logs compact while making them human-readable. It's a good alternative
to the numeric code in that it's more expressive, and a good one to the
full message since it's shorter and more precise (some codes even match
errno names).
The doc was updated so that the symbolic names appear in the table. It
could be useful to backport this feature to help with troubleshooting
some issues, though backporting the doc might possibly be more annoying
in case users have local patches already, so maybe the table update does
not need to be backported in this case.
For a long time the errno values returned by recv/send/splice() were not
translated to connection error codes. There are not that many eligible
and having them would help a lot when debugging some complex issues where
logs disagree with network traces. Let's add them now.
While we get reports of connection setup errors in fc_err/bc_err, we
don't have the equivalent for the recv/send/splice syscalls. Let's
add provisions for new codes that cover the common errno values that
recv/send/splice can return, i.e. ECONNREFUSED, ENOMEM, EBADF, EFAULT,
EINVAL, ENOTCONN, ENOTSOCK, ENOBUFS, EPIPE. We also add a special case
for when the poller reported the error itself. It's worth noting that
EBADF/EFAULT/EINVAL will generally indicate serious bugs in the code
and should not be reported.
The only thing is that it's quite hard to forcefully (and reliably)
trigger these errors in automated tests as the timing is critical.
Using iptables to manually reset established connections in the
middle of large transfers at least permits to see some ECONNRESET
and/or EPIPE, but the other ones are harder to trigger.
"debug dev close <fd>" currently closes that FD using fd_delete() after
checking that it's known from the fdtab. Sometimes we also want to just
perform a pure close() of FDs not in the fdtab (pollers, etc) in order
to provoke certain error cases. The optional "hard" argument to the
command will make it use a plain close() instead of fd_delete() and skip
the fd owner check. The main visible effect when closing a traffic socket
with it is that instead of dying from a double fd_delete() by seeing that
fd.owner is already 0, it will die during the next fd_insert() seeing that
fd.owner was not 0.
It was the only one prefixed with "CO_ERR_", making it harder to batch
process and to look up. It was added in 2.5 by commit 61944f7a73 ("MINOR:
ssl: Set connection error code in case of SSL read or write fatal failure")
so it can be backported as far as 2.6 if needed to help integrate other
patches.
First there is a typo in the directive name, then it is not documented and
finally, it is not used at all. The directive is only removed from the
keyword list. Parsing function is not updated.
This patch should fix the issue #2601.
We're using a few occurrences of __builtin_prefetch() but tcc doesn't
know about it so let's give it a dummy definition. Now the code builds
and works again with tcc without thread support.
TCC is often convenient to quickly test builds, run CI tests etc. It has
limited thread support (e.g. no thread-local stuff) but that is often
sufficient for testing. TCC lacks __atomic_exchange_n() but has the
exactly equivalent __atomic_exchange(), and doesn't have any barrier.
For this reason we force the atomic_exchange to use the stricter SEQ_CST
mem ordering that allows to ignore the barrier.
[wt: that's upstream commit ca8b865 ("BUILD: support building with TCC")]
When extra counters are dumped for an entity (frontend, backend, server or
listener), there is a filter on capabilities. Some extra counters are not
available for all entities and must be ignored. However, when this was
performed, the field number, used as an index to dump the metric value, was
still incremented while it should not and leads to an overflow or a stats
mix-up.
This patch must be backported to 3.0.
This commit introduces the tune.renice.startup and tune.renice.runtime
global keywords that allows to change the priority with setpriority().
tune.renice.startup is parsed and applied in the worker or the standalone
process for configuration parsing. If this keyword is used alone, the
nice value is changed to the previous one after configuration parsing.
tune.renice.runtime is applied after configuration parsing, so in the
worker or a standalone process. Combined with tune.renice.startup it
allows to have a different nice value during configuration parsing and
during runtime.
The feature was discussed in github issue #1919.
Example:
global
tune.renice.startup 15
tune.renice.runtime 0
Released version 3.1-dev11 with the following main changes :
- BUG/MINOR: httpclient: return NULL when no proxy available during httpclient_new()
- BUG/MEDIUM: mworker/httpclient: initialization skipped by accident in mworker mode
- BUG/MINOR: resolvers/mworker: missing default resolvers in mworker mode
- MINOR: mworker/ocsp: skip ocsp-update proxy init in master
- BUG/MEDIUM: stconn: Wait iobuf is empty to shut SE down during a check send
- MINOR: mux-h1: Show the SD iobuf in trace messages on stream send events
- MINOR: mux-h1: Add a trace on shutdown when keep-alive is not possible
- BUG/MINOR: http-ana: Don't report a server abort if response payload is invalid
- BUG/MEDIUM: stconn: Check FF data of SC to perform a shutdown in sc_notify()
- BUG/MAJOR: filters/htx: Add a flag to state the payload is altered by a filter
- REGTESTS: Never reuse server connection in http-messaging/truncated.vtc
- BUG/MINOR: quic: avoid leaking post handshake frames
- MINOR: quic: send new tokens (NEW_TOKEN) even for 1RTT sessions
- BUG/MEDIUM: quic: avoid freezing 0RTT connections
- DOC: config: fix rfc7239 forwarded typo in desc
- MINOR: http_ext: implement rfc7239_{nn,np} converters
- CLEANUP: http_ext: remove useless BUG_ON() in http_handle_xot_header()
- BUG/MINOR: sample: free err2 in smp_resolve_args for type ARGT_REG
- MINOR: arg: add an argument type for identifier
- BUILD: buffers: keep b_getblk_nc() and b_peek_varint() in buf.h
- CLEANUP: buffers: simplify b_get_varint()
- OPTIM: buffers: avoid a useless wrapping check for ofs == 0
- MINOR: debug: make mark_tainted() return the previous value
- MINOR: chunk: drop the global thread_dump_buffer
- MINOR: debug: split ha_thread_dump() in two parts
- MINOR: debug: slightly change the thread_dump_pointer signification
- MINOR: debug: make ha_thread_dump_done() take the pointer to be used
- MINOR: debug: replace ha_thread_dump() with its two components
- MEDIUM: debug: on panic, make the target thread automatically allocate its buf
- BUILD: mux-h2/traces: fix build on 32-bit due to size of the DATA frame
- CI: prepare Coverity build for Ubuntu 24
- CI: bump development builds explicitely to Ubuntu 24.04
- CI: modernize macos builds to macos-15
- BUG/MINOR: mworker: fix mworker-max-reloads parser
- MINOR: mux-quic: simplify sending of empty STREAM FIN
- BUG/MINOR: mux-quic: do not close STREAM with empty FIN if no data sent
- CLEANUP: debug: make the BUG_ON() macros check the condition in the outer one
- MEDIUM: debug: add match counters for BUG_ON/WARN_ON/CHECK_IF
- MINOR: debug: add a new debug macro COUNT_IF()
- MINOR: debug: add "debug dev counters" to list code counters
- BUG/MEDIUM: stats-html: Never dump more data than expected during 0-copy FF
- BUG/MEDIUM: mux-h2: Remove H2S from send list if data are sent via 0-copy FF
- BUG/MINOR: stconn: Pretend the SE have more data to deliver on abortonclose
- CLEANUP: stream: remove outdated comments
- DEBUG: stream: Add debug counters to track some client/server aborts
- DEBUG: mux-h1: Add debug counters to track some errors
- MINOR: mux-h1: Add support of the debug string for logs
- MINOR: stream: maintain per-stream counters of the number of passes on code
- MINOR: filters: add per-filter call counters
- MINOR: sample: add the "when" converter to condition some expressions
- BUG/MEDIUM: connection/http-reuse: fix address collision on unhandled address families
- BUILD: spoe: fix build warning on older gcc around sub-struct initialization
- Revert "OPTIM: mux-h2: make h2_send() report more accurate wake up conditions"
- DEBUG: mux-h1: Add debug counters to track errors with in/out pending data
- BUG/MINOR: mux-h1: Fix conditions on pipe in some COUNT_IF()
- MINOR: activity/memprofile: show per-DSO stats
- BUG/MINOR: mworker/cli: show master startup logs in recovery mode
- MINOR: mworker: stop MASTER proxy listener on worker mcli sockpair
- MINOR: error: simplify startup_logs_init_shm
- BUG/MINOR: mworker: show worker warnings in startup logs
- CLEANUP: mworker: clean mworker_reexec
- MINOR: mworker/cli: split mworker_cli_proxy_create
- BUG/MINOR: server: fix dynamic server leak with check on failed init
- BUG/MEDIUM: server: fix race on servers_list during server deletion
- BUG/MEDIUM: stconn: Report blocked send if sends are blocked by an error
- BUG/MINOR: http-ana: Fix wrong client abort reports during responses forwarding
- BUG/MINOR: stconn: Don't disable 0-copy FF if EOS was reported on consumer side
- MINOR: mworker/cli: add 'debug' to 'show proc'
- MINOR: mworker/cli: remove comment line for program when useless
- MINOR: mworker/cli: 'show proc debug' for old workers
- BUILD: debug: silence a build warning with threads disabled
- CLEANUP: mux-h2: remove the unused "full" variable in h2_frt_transfer_data()
- MINOR: pools: export the pools variable
- MINOR: debug: place a magic pattern at the beginning of post_mortem
- MINOR: debug: place the post_mortem struct in its own section.
- MINOR: debug: store important pointers in post_mortem
- MINOR: debug: do not limit backtraces to stuck threads
- MINOR: cli: remove non-printable characters from 'debug dev fd'
- MINOR: cli: add an 'echo' command
- MINOR: debug: also add a pointer to struct global to post_mortem
- CLEANUP: mworker: make mworker_create_master_cli more readable
- BUG/MEIDUM: mworker: fix fd leak from master to worker
- BUG/MINOR: mworker/cli: fix mworker_cli_global_proxy_new_listener
- MINOR: tools: add strnlen2() helper
- CLEANUP: log: use strnlen2() in _lf_text_len() to compute string length
- DOC: design: add notes about more detailed error reporting for logs
- MINOR: debug: also add fdtab and acitvity to struct post_mortem
- MINOR: debug: remove the redundant process.thread_info array from post_mortem
- DEV: gdb: add a number of gdb scripts to navigate in core dumps
- BUG/MINOR: trace: stop rewriting argv with -dt
- MEDIUM: protocol: make abns a custom unix socket address family
- MEDIUM: protocol: rely on AF_CUST_ABNS family to recognize ABNS sockets
- CLEANUP: tools: rely on address family to detect ABNS sockets
- MINOR: protocol: create abnsz socket address family
- MINOR: sock: restore effective UNIX family in sock_get_old_sockets()
- MEDIUM: sock: also restore effective unix family in get_{src,dst}()
- MEDIUM: sock_unix: use per-family addrcmp function
- MEDIUM: socket: add zero-terminated ABNS alternative
- BUG/MINOR: ssl/cli: 'set ssl cert' does not check the transaction name correctly
- BUG/MINOR: mworker: mworker_reexec: unset MODE_STARTING before free startup logs ring
- BUG/MINOR: errors: startup_logs_free: set global startup_logs ptr to NULL
- BUG/MINOR: errors: print_message: don't allocate startup logs ring
- BUG/MINOR: startup: don't fork worker if started with -c -W
- BUG/MINOR: startup: dump libs only in worker if started with -W -dL
- BUG/MINOR: startup: dump keywords only in worker if started with -W -dKAll
- BUG/MINOR: startup: don't dump polling info for master in verbose mode
- CI: switch QUIC Interop on AWS-LC to common docker image
- CI: switch QUIC Interop on LibreSSL to common docker image
- CI: enable chacha20 test on LibreSSL QUIC Interop
- DOC: config: add missing glitch_{cnt,rate} data types
- DOC: config: add missing glitch_{cnt,rate} sample definitions
- CI: LibreSSL QUIC Interop: fix docker context
- DEBUG: mux-h1: Add H1C expiration dates in trace messages
- BUG/MEDIUM: mux-h1: Fix how timeouts are applied on H1 connections
- BUG/MINOR: http-ana: Report internal error if an action yields on a final eval
- MINOR: stream: Save last evaluated rule on invalid yield
- MINOR: quic: complete trace in qc_may_build_pkt()
- MINOR: quic: move qc_send_mux() prototype into quic_tx.h
- MINOR: stream: Replace last_rule_file/line fields by a more generic field
- MINOR: stream: Save the last filter evaluated interrupting the processing
- MINOR: stream: Save the entity waiting to continue its processing
- MINOR: stream: Use an enum to identify last and waiting entities for streams
- MINOR: stream: Add http-buffer-request option in the waiting entities
- DOC: config: Add documentation about last_entity sample fetch
- DOC: config: Add documentation about waiting_entity sample fetch
When http-buffer-request option is set on a proxy, the processing will be
paused to wait the full request payload or a full buffer. So it is an entity
that block the processing, just like a rule or a filter that yields. So now,
it is reported as a waiting entity if an error or a timeout occurred.
To do so, an stream entity type is added for this option. There is no
pointer. And "waiting_entity" sample fetch returns the option name.
When a rule or a filter yields because it waits for something to be able to
continue its processing, this entity is saved in the stream. If an error or
a timeout occurred, info on this entity may be retrieved via the
"waiting_entity" sample fetch, for instance to dump it in the logs. This
info may be useful to found root cause of some bugs because it is a way to
know the processing was temporarily stopped. This may explain timeouts for
instance.
The sample fetch is not documented yet.
It is very similar to the last evaluated rule. When a filter returns an
error that interrupts the processing, it is saved in the stream, in the
last_entity field, with the type 2. The pointer on filter config is
saved. This pointer never changes during runtime and is part of the proxy's
structure. It is an element of the filter_configs list in the proxy
structure.
"last_entity" sample fetch was update accordingly. The filter identifier is
returned, if defined. Otherwise the save pointer.
The last evaluated rule is now saved in a generic structure, named
last_entity, with a type to identify it. The idea is to be able to store
other kind of entity that may interrupt a specific processing.
The type of the last evaluated rule is set to 1. It will be replace later by
an enum to be more explicit. In addition, the pointer to the rule itself is
saved instead of its location.
The sample fetch "last_entity" was added to retrieve the information about
it. In this case, it is the rule localtion, the config file containing the
rule followed by the line where the rule is defined, separated by a
colon. This sample fetch is not documented yet.
When an action yields while it is not allowed, an internal error is
reported. This interrupts the processing. So info about the last evaluated
rule must be filled.
This patch may be bakcported if needed. If so, the commit ("MINOR: stream:
Save last evaluated rule on invalid yield") must be backported first.
This was already performed for tcp actions at content level, but not for
HTTP actions. It is always a bug, so it must be reported accordingly.
This patch may be backported to all stable versions.
There were several flaws in the way the different timeouts were applied on
H1 connections. First, the H1C task handling timeouts was not created if no
client/server timeout was specified. But there are other timeouts to
consider. First, the client-fin/server-fin timeouts. But for frontend
connections, http-keey-alive and http-request timeouts may also be used. And
finally, on soft-stop, the close-spread-time value must be considered too.
So at the end, it is probably easier to always create a task to manage H1C
timeouts. Especially since the client/server timeouts are most often set.
Then, when the expiration date of the H1C's task must only be updated if the
considered timeout is set. So tick_add_ifset() must be used instead of
tick_add(). Otherwise, if a timeout is undefined, the taks may expire
immediately while it should in fact never expire.
Finally, the idle expiration date must only be considered for idle
connections.
This patch should be backported in all stable versions, at least as far as
2.6. On the 2.4, it will have to be slightly adapted for the idle_exp
part. On 2.2 and 2.0, the patch will have to be rewrite because
h1_refresh_timeout() is quite different.
Following previous commit, when glitch_cnt and glitch_rate data types were
implemented in c9c6b683f ("MEDIUM: stick-tables: add a new stored type for
glitch_cnt and glitch_rate"), newly exposed samples such as
table_glitch_cnt(), table_glitch_rate, src_glitch_cnt() and
src_glitch_rate() were documented but their definitions was missing in
supported keywords list.
It should be backported in 3.0 with c9c6b683f
When glitch_cnt and glitch_rate data types were implemented in
c9c6b683f ("MEDIUM: stick-tables: add a new stored type for glitch_cnt and
glitch_rate"), the data types list for "stick-table" keyword documentation
was overlooked.
This was reported by Nick Ramirez.
It should be backported in 3.0 with c9c6b683f.
As master-worker fork happens now before step_init_2(), when pollers are
initialized and polling settings and dumped then in verbose and in debug modes
to stdout, it turns out that master and worker dump its same polling
settings separately. This creates long and messy output in these modes.
Polling settings are the same for master and for worker process for the moment.
Even if they would diverge in future we are interested here in worker's
settings. So, when started in the master-worker mode let's dump it only in the
worker context.
This doesn't need to be backported as related to the latest master-worker
refactoring.
If haproxy was started with -W -dK*, after master-worker refactoring, we dump
registered keywords to stdout twice in master and in worker processes. This
information is redundant and output has no longer the right format. So, as the
keyword registration happens very early before the fork, let's dump keywords
only in the worker context, if haproxy was launched with -W.
This does not need to be backported, as related to the latest master-worker
refactoring.
If haproxy was started with -W -dL, after master-worker refactoring we dump
libs to stdout twice in master and in worker processes. This is information is
redundant. So let's show linked libraries only in the worker context, if
haproxy was started also with -W.
This does not need to be backported, as related to the latest master-worker
rework.
Don't do master-worker fork if MODE_CHECK is detected from the command line along
with the master-worker mode. We should exit in MODE_CHECK, after the
configuration parsing and validation. So, with the new master-worker architecture
it's better to align this mode with the standalone.
This patch does not need to be backported, as related to the latest
master-worker rework.
Don't call startup_logs_init() in order to allocate the startup logs ring
again, if startup_logs pointer is NULL. Startup logs ring is allocated
explicitly in step_init_1 routine, when the process starts, and it's freed
explicitly for master process at the end of mworker_reexec scope. So, when
we no longer have this pointer, let's just save the log message in the
message buffer.
Otherwise, in case of master process, we will allocate the startup logs ring
again here and we will lost its address after execvp.
No need to backport this fix as it's related to the latest master-worker
refactoring.
ring_free() calls free() on the ring struct pointer, but startup_logs continues
to keep this address. So let's reset at the end startup_logs to NULL.
startup_logs is checked in print_message().
No need to backport this fix, as it's related to the latest master-worker
refactoring.
Flag MODE_STARTING should be unset for master just before freeing the startup
logs ring, as it triggers the copy of process logs to this ring, see the code
of print_message().
Moreover with this flag set, if startup logs ring pointer is NULL, any
print_message() triggered just before the execvp in mworker_reexec() will call
startup_logs_init(). So ring will be allocated again "discretely" and after
execvp we will lost its address, as in step_init_1() we will call again
startup_logs_init().
No need to backport this fix as it's related to the latest master-worker
refactoring.
Since commit 089c13850f ("MEDIUM: ssl: ssl-load-extra-del-ext work
only with .crt"), the 'set ssl cert' CLI command does not check
correctly if the transaction you are trying to update is the right one.
The consequence is that you could commit accidentaly a transaction on
the wrong certificate.
The fix introduces the check again in case you are not using
ssl-load-extra-del-ext.
This must be backported in all stable versions.
When an abstract unix socket is bound by HAProxy (using "abns@" prefix),
NUL bytes are appended at the end of its path until sun_path is filled
(for a total of 108 characters).
Here we add an alternative to pass only the non-NUL length of that path
to connect/bind calls, such that the effective path of the socket's name
is as humanly written. This may be useful to interconnect with existing
softwares that implement abstract sockets with this logic instead of the
default haproxy one.
This is achieved by implementing the "abnsz" socket prefix (instead of
"abns"), which stands for "zero-terminated ABNS". "abnsz" prefix may be
used anywhere "abns" is. Internally, haproxy uses the custom socket
family (AF_CUST_ABNS vs AF_CUST_ABNSZ) to differentiate default abns
sockets from zero-terminated ones.
Documentation was updated and regtest was added.
Fixes GH issues #977 and #2479
Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>
Thanks to previous commit, we may now use dedicated addrcmp functions for
each UNIX address family. This allows to simplify sock_unix_addrcmp()
function and avoid useless checks in order to try to guess the socket
type.
In this patch we implement sock_abns_addrcmp() and sock_abnsz_addrcmp()
functions, which are respectively used for ABNS and ABNSZ custom families
sock_unix_addrcmp() now only holds regular UNIX socket comparing logic.
As in previous commit, let's push the logic a bit further in order to
properly restore the effective UNIX socket type when leveraging
get_src() and get_dst() sock functions, since they rely on getpeername()
and getsockname() under the hood, both of which will actually loose the
effective family and return AF_UNIX for all our custom UNIX sockets.
To do this, add sock_restore_unix_family() helper function from the logic
implemented in the previous commit, and call this function from get_src()
and get_dst() in case of unix socket prior to returning.
When getting sockets from older process in sock_get_old_sockets(), we
leverage getsockname() to fill sockaddr struct from known fd.
However, the kernel doesn't know about our custom UNIX families such
as CUST_ABNS and CUST_ABNSZ which are both based on AF_UNIX real family.
Since haproxy socket API relies on effective family (and not real family)
to recognize the socket type instead of having to guess it by analyzing
the path content, let's restore it right after getsockname() since we
have all the infos needed to deduce the right family.
If the path starts with a NULL byte, we know that it is an abstract sock.
Then we simply check <addrlen> value from getsockname() to know if the
addr makes uses of the whole path space (normal ABNS) or partial path
space (zero ABNS / aka ABNZ) terminated by 0.
For now it's the same as abns. We'll need to modify sock_unix_addrcmp(),
and a few other ones to support effective path length when dealing with
the \0. Let's check with Tristan's patch for this (upcoming patch).
Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>
Following previous commit, in str2sa_range(), make use of address' family
which was just set to check if the socket is ABNS or not instead of
relying on an extra boolean to save this info.
Now that we can easily distinguish regular UNIX socket from ABNS sockets
by simply looking at the address family, stop looking at the first byte
from addr->sun_path to guess if the socket is an ABNS one or not. Looking
at the family is straightforward and will allow to differentiate between
upcoming ABNSZ and ABNS (where looking at the first byte from path won't
help anymore).