Released version 3.2.0 with the following main changes :
- MINOR: promex: Add agent check status/code/duration metrics
- MINOR: ssl: support strict-sni in ssl-default-bind-options
- MINOR: ssl: also provide the "tls-tickets" bind option
- MINOR: server: define CLI I/O handler for "add server"
- MINOR: server: implement "add server help"
- MINOR: server: use stress mode for "add server help"
- BUG/MEDIUM: server: fix crash after duplicate GUID insertion
- BUG/MEDIUM: server: fix potential null-deref after previous fix
- MINOR: config: list recently added sections with -dKcfg
- BUG/MAJOR: cache: Crash because of wrong cache entry deleted
- DOC: configuration: fix the example in crt-store
- DOC: config: clarify the wording around single/double quotes
- DOC: config: clarify the legacy cookie and header captures
- DOC: config: fix alphabetical ordering of layer 7 sample fetch functions
- DOC: config: fix alphabetical ordering of layer 6 sample fetch functions
- DOC: config: fix alphabetical ordering of layer 5 sample fetch functions
- DOC: config: fix alphabetical ordering of layer 4 sample fetch functions
- DOC: config: fix alphabetical ordering of internal sample fetch functions
- BUG/MINOR: h3: Set HTX flags corresponding to the scheme found in the request
- BUG/MEDIUM: h3: Declare absolute URI as normalized when a :authority is found
- DOC: config: mention in bytes_in and bytes_out that they're read on input
- DOC: config: clarify the basics of ACLs (call point, multi-valued etc)
- REGTESTS: Make the script testing conditional set-var compatible with Vtest2
- REGTESTS: Explicitly allow failing shell commands in some scripts
- MINOR: listeners: Add support for a label on bind line
- BUG/MEDIUM: cli/ring: Properly handle shutdown in "show event" I/O handler
- BUG/MEDIUM: hlua: Properly detect shudowns for TCP applets based on the new API
- BUG/MEDIUM: hlua: Fix getline() for TCP applets to work with applet's buffers
- BUG/MEDIUM: hlua: Fix receive API for TCP applets to properly handle shutdowns
- CI: vtest: Rely on VTest2 to run regression tests
- CI: vtest: Fix the build script to properly work on MaOS
- CI: combine AWS-LC and AWS-LC-FIPS by template
- BUG/MEDIUM: httpclient: Throw an error if an lua httpclient instance is reused
- DOC: hlua: Add a note to warn user about httpclient object reuse
- DOC: hlua: fix a few typos in HTTPMessage.set_body_len() documentation
- DEV: patchbot: prepare for new version 3.3-dev
- MINOR: version: mention that it's 3.2 LTS now.
A few typos were noticed while gathering info for the 3.2 announce
messages, this fixes them, and will probably constitute the last
commit of this release. There's no need to backport it unless commit
94055a5e7 ("MEDIUM: hlua: Add function to change the body length of
an HTTP Message") is backported.
It is not supported to reuse an lua httpclient instance to process several
requests. A new object must be created for each request. Thanks to the
previous patch ("BUG/MEDIUM: httpclient: Throw an error if an lua httpclient
instance is reused"), an error is now reported if this happens. But it is
not obvious for users. So the lua-api docuementation was updated accordingly.
This patch is related to issue #2986. It should be backported with the
commit above.
It is not expected/supported to reuse an httpclient instance to process
several requests. A new instance must be created for each request. However,
in lua, there is nothing to prevent a user to create an httpclient object
and use it in a loop to process requests.
That's unfortunate because this will apparently work, the requests will be
sent and a response will be received and processed. However internally some
ressources will be allocated and never released. When the next response is
processed, the ressources allocated for the previous one are definitively
lost.
In this patch we take care to check that the httpclient object was never
used when a request is sent from a lua script by checking
HTTPCLIENT_FS_STARTED flags. This flag is set when a httpclient applet is
spawned to process a request and never removed after that. In lua, the
httpclient applet is created when the request is sent. So, it is the right
place to do this test.
This patch should fix the issue #2986. It should be backported as far as
2.6.
VTest2 (https://github.com/vtest/VTest2) was released and is a remplacement
for VTest. VTest was archived. So let's use the new version now.
If this commit is backported, the 2 following commits must also be
backported:
* 2808e3577 ("REGTESTS: Explicitly allow failing shell commands in some scripts")
* 82c291124 ("REGTESTS: Make the script testing conditional set-var compatible with Vtest2")
An optional timeout was added to AppletTCP.receive() to interrupt calls after a
delay. It was mandatory to be able to implement interactive applets (like
trisdemo). However, this broke the API and it made impossible to differentiate
the shutdowns from the delays expirations. Indeed, in both cases, an empty
string was returned.
Because historically an empty string was used to notify a connection shutdown,
it should not be changed. So now, 'nil' value is returned when no data was
available before the delay expiration.
The new AppletTCP:try_receive() function was also affected. To fix it, instead
of stating there is no delay when a receive is tried, an expired delay is
set. Concretely TICK_ETERNITY was replaced by now_ms.
Finally, AppletTCP:getline() function is not concerned for now because there
is no way to interrupt it after some delay.
The documentation and trisdemo lua script were updated accordingly.
This patch depends on "BUG/MEDIUM: hlua: Properly detect shudowns for TCP
applets based on the new API". However, it is a 3.2-specific issue, so no
backport is needed.
The commit e5e36ce09 ("BUG/MEDIUM: hlua/cli: Fix lua CLI commands to work
with applet's buffers") fixed the TCP applets API to work with applets using
its own buffers. Howver the getline() function was not updated. It could be
an issue for anyone registering a CLI commands reading lines.
This patch should be backported as far as 3.0.
The internal function responsible to receive data for TCP applets with
internal buffers is buggy. Indeed, for these applets, the buffer API is used
to get data. So there is no tests on the SE to properly detect connection
shutdowns. So, it must be performed by hand after the call to b_getblk_nc().
This patch must be backported as far as 3.0.
The commit 03dc54d802 ("BUG/MINOR: ring: Fix I/O handler of "show event"
command to not rely on the SC") introduced a regression. By removing
dependencies on the SC, a test to detect client shutdowns was removed. So
now, the CLI applet is no longer released when the client shut the
connection during a "show event -w".
So of course, we should not use the SC to detect the shutdowns. But the SE
must be used insteead.
It is a 3.2-specific issue, so no backport needed.
It is now possile to set a label on a bind line. All sockets attached to
this bind line inherits from this label. The idea is to be able to groud of
sockets. For now, there is no mechanism to create these groups, this must be
done by hand.
Vtest2, that should replaced Vtest in few months, will reject any failing
commands in shell blocks. However, some scripts are executing some commands,
expecting an error to be able to parse the error output. So, now use "set
+e" in those scripts to explicitly state failing commads are expected.
It is just used for non-final commands. At the end, the shell block must
still report a success.
VTest2 will replaced VTest in few months. There is not so much change
expected. One of them is that a User-Agent header is added by default in all
requests, except if an custom one is already set or if "-nouseragent" option
is used. To still be compatible with VTest, it is not possible to use the
option to avoid the header addition. So, a custom user-agent is added in the
last test of "sample_fetches/cond_set_var.vtc" to be sure it will pass with
Vtest and Vtest2. It is mandatory because the request length is tested.
This is essentially in order to address the concerns expressed in
issue #2226 where it is mentioned that the moment they are called is
not clear enough. Admittedly, re-reading the paragraph doesn't make
it obvious on a quick read that they behave like functions. This patch
adds an extra paragraph that makes the parallel with programming
languages' boolean functions and explains the fact that they can be
multi-valued. Hoping this is clearer now.
Issue #2267 suggests that it's unclear what exactly the byte counts mean
(particularly when compression is involved). Let's clarify that the counts
are read on data input and that they also cover headers and a bit of
internal overhead.
Since commit 2c3d656f8 ("MEDIUM: h3: use absolute URI form with
:authority"), the absolute URI form is used when a ':authority'
pseudo-header is found. However, this URI was not declared as normalized
internally. So, when the request is reformated to be sent to an h1 server,
the absolute-form is used instead of the origin-form. It is unexpected and
may be an issue for some servers that could reject the request.
So, now, we take care to set HTX_SL_F_HAS_AUTHORITY flag on the HTX message
when an authority was found and HTX_SL_F_NORMALIZED_URI flag is set for
"http" or "https" schemes.
No backport needed because the commit above must not be backported. It
should fix a regression reported on the 3.2-dev17 in issue #2977.
This commit depends on "BUG/MINOR: h3: Set HTX flags corresponding to the
scheme found in the request".
When a ":scheme" pseudo-header is found in a h3 request, the
HTX_SL_F_HAS_SCHM flag must be set on the HTX message. And if the scheme is
'http' or 'https', the corresponding HTX flag must also be set. So,
respectively, HTX_SL_F_SCHM_HTTP or HTX_SL_F_SCHM_HTTPS.
It is mainly used to send the right ":scheme" pseudo-header value to H2
server on backend side.
This patch could be backported as far as 2.6.
As reported in issue #2195, cookie captures and header captures are no
longer the recommended way to proceed. Let's mention that this is the
legacy way and provide a few pointers to the recommended functions and
actions to use the modern methods.
As reported in issue #2327, the wording used in the section about quoting
can be read two ways due to the use of the two types of quotes to protect
each other quote. Better only use the quoting without mixing the two when
mentioning them.
Fix a bad example in the crt-store section. site1 does not use the "web"
crt-store but the global one.
Must be backported as far as 3.0 however the section was 3.12 in
previous version.
When "vary" is enabled, we can have multiple entries for a given primary
key in the cache tree. There is a limit to how many secondary entries
can be inserted for a given key. When we try to insert a new secondary
entry, if the limit is already reached, we can try to find expired
entries with the same primary key, and if the limit is still reached we
want to abort the current insertion and to remove the node that was just
inserted.
In commit "a29b073: MEDIUM: cache: Add refcount on cache_entry" though,
a regression was introduced. Instead of removing the entry just inserted
as the comments suggested, we removed the second to last entry and
returned NULL. We then reset the eb.key of the cache_entry in the caller
because we assumed that the entry was already removed from the tree.
This means that some entries with an empty key were wrongly kept in the
tree and the last secondary entry, which keeps the number of secondary
entries of a given key was removed.
This ended up causing some crashes later on when we tried to iterate
over the elements of this given key. The crash could occur in multiple
places, either when trying to retrieve an entry or to add some new ones.
This crash was raised in GitHub issue #2950.
The fix should be backported up to 3.0.
A valid build warning was reported in the CI with latest commit b40ce97ecc
("BUG/MEDIUM: server: fix crash after duplicate GUID insertion"). Indeed,
if the first test in the function fails, we branch to the err label
with guid==NULL and will crash there. Let's just test guid before
dereferencing it for freeing.
This needs to be backported to 3.0 as well since the commit above was
meant to go there.
On "add server", if a GUID is defined, guid_insert() is used to add the
entry into the global GUID tree. If a similar entry already exists, GUID
insertion fails and the server creation is eventually aborted.
A crash could occur in this case because of an invalid memory access via
guid_remove(). The latter is caused via free_server() as the server
insertion is rejected. The invalid occurs on GUID key.
The issue occurs because of guid_insert(). The function properly
deallocates the GUID key on duplicate insertion, but it failed to reset
<guid.node.key> to NULL. This caused the invalid memory access on
guid_remove(). To fix this, ensure that key member is properly resetted
on guid_insert() error path.
This must be backported up to 3.0.
Implement stress mode on "add server help". This ensures that the
command is fully reentrant on full output buffer.
For testing, it requires compilation with USE_STRESS and global setting
"stress-level 1".
Implement "help" as a sub-command for "add server" CLI. The objective is
to list all the keywords that are supported for dynamic servers. CLI IO
handler and add_srv_ctx are used to support reentrancy on full output
buffer.
Now that this command is implemented, the outdated keyword list on "add
server" from management documentation can be removed.
Extend "add server" to support an IO handler function named
cli_io_handler_add_server(). A context object is also defined whose
usage will depend on IO handler capabilities.
IO handler is skipped when "add server" is run in default mode, i.e. on
a dynamic server creation. Thus, currently IO handler is unneeded.
However, it will become useful to support sub-commands for "add server".
Note that return value of "add server" parser has been changed on server
creation success. Previously, it was used incorrectly to report if
server was inserted or not. In fact, parser return value is used by CLI
generic code to detect if command processing has been completed, or
should continue to the IO handler. Now, "add server" always returns 1 to
signal that CLI processing is completed. This is necessary to preserve
CLI output emitted by parser, even now that IO handler is defined for
the command. Previously, output was emitted in every situations due to
IO handler not defined. See below code snippet from cli.c for a better
overview :
if (kw->parse && kw->parse(args, payload, appctx, kw->private) != 0) {
ret = 1;
goto fail;
}
/* kw->parse could set its own io_handler or io_release handler */
if (!appctx->cli_ctx.io_handler) {
ret = 1;
goto fail;
}
appctx->st0 = CLI_ST_CALLBACK;
ret = 1;
goto end;
Currently there is "no-tls-tickets" that is also supported in the
ssl-default-bind-options directive, but there's no way to re-enable
them on a specific "bind" line. This patch simply provides the option
to re-enable them. Note that the flag is inverted because tickets are
enabled by default and the no-tls-ticket option sets the flag to
disable them.
Several users already reported that it would be nice to support
strict-sni in ssl-default-bind-options. However, in order to support
it, we also need an option to disable it.
This patch moves the setting of the option from the strict_sni field
to a flag in the ssl_options field so that it can be inherited from
the default bind options, and adds a new "no-strict-sni" directive to
allow to disable it on a specific "bind" line.
The test file "del_ssl_crt-list.vtc" which already tests both options
was updated to make use of the default option and the no- variant to
confirm everything continues to work.
In the Prometheus exporter, the last health check status is already exposed,
with its code and duration in seconds. The server status is also exposed.
But the information about the agent check are not available. It is not
really handy because when a server status is changed because of the agent,
it is not obvious by looking to the Prometheus metrics. Indeed, the server
may reported as DOWN for instance, while the health check status still
reports a success. Being able to get the agent status in that case could be
valuable.
So now, the last agent check status is exposed, with its code and duration
in seconds. Following metrics can be grabbe now:
* haproxy_server_agent_status
* haproxy_server_agent_code
* haproxy_server_agent_duration_seconds
Note that unlike the other metrics, no per-backend aggregated metric is
exposed.
This patch is related to issue #2983.
Released version 3.2-dev17 with the following main changes :
- DOC: configuration: explicit multi-choice on bind shards option
- BUG/MINOR: sink: detect and warn when using "send-proxy" options with ring servers
- BUG/MEDIUM: peers: also limit the number of incoming updates
- MEDIUM: hlua: Add function to change the body length of an HTTP Message
- BUG/MEDIUM: stconn: Disable 0-copy forwarding for filters altering the payload
- BUG/MINOR: h3: don't insert more than one Host header
- BUG/MEDIUM: h1/h2/h3: reject forbidden chars in the Host header field
- DOC: config: properly index "table and "stick-table" in their section
- DOC: management: change reference to configuration manual
- BUILD: debug: mark ha_crash_now() as attribute(noreturn)
- IMPORT: slz: avoid multiple shifts on 64-bits
- IMPORT: slz: support crc32c for lookup hash on sse4 but only if requested
- IMPORT: slz: use a better hash for machines with a fast multiply
- IMPORT: slz: fix header used for empty zlib message
- IMPORT: slz: silence a build warning on non-x86 non-arm
- BUG/MAJOR: leastconn: do not loop forever when facing saturated servers
- BUG/MAJOR: queue: properly keep count of the queue length
- BUG/MINOR: quic: fix crash on quic_conn alloc failure
- BUG/MAJOR: leastconn: never reuse the node after dropping the lock
- MINOR: acme: renewal notification over the dpapi sink
- CLEANUP: quic: Useless BIO_METHOD initialization
- MINOR: quic: Add useful error traces about qc_ssl_sess_init() failures
- MINOR: quic: Allow the use of the new OpenSSL 3.5.0 QUIC TLS API (to be completed)
- MINOR: quic: implement all remaining callbacks for OpenSSL 3.5 QUIC API
- MINOR: quic: OpenSSL 3.5 internal QUIC custom extension for transport parameters reset
- MINOR: quic: OpenSSL 3.5 trick to support 0-RTT
- DOC: update INSTALL for QUIC with OpenSSL 3.5 usages
- DOC: management: update 'acme status'
- BUG/MEDIUM: wdt: always ignore the first watchdog wakeup
- CLEANUP: wdt: clarify the comments on the common exit path
- BUILD: ssl: avoid possible printf format warning in traces
- BUILD: acme: fix build issue on 32-bit archs with 64-bit time_t
- DOC: management: precise some of the fields of "show servers conn"
- BUG/MEDIUM: mux-quic: fix BUG_ON() on rxbuf alloc error
- DOC: watchdog: update the doc to reflect the recent changes
- BUG/MEDIUM: acme: check if acme domains are configured
- BUG/MINOR: acme: fix formatting issue in error and logs
- EXAMPLES: lua: avoid screen refresh effect in "trisdemo"
- CLEANUP: quic: remove unused cbuf module
- MINOR: quic: move function to check stream type in utils
- MINOR: quic: refactor handling of streams after MUX release
- MINOR: quic: add some missing includes
- MINOR: quic: adjust quic_conn-t.h include list
- CLEANUP: cfgparse: alphabetically sort the global keywords
- MINOR: glitches: add global setting "tune.glitches.kill.cpu-usage"
It was mentioned during the development of glitches that it would be
nice to support not killing misbehaving connections below a certain
CPU usage so that poor implementations that routinely misbehave without
impact are not killed. This is now possible by setting a CPU usage
threshold under which we don't kill them via this parameter. It defaults
to zero so that we continue to kill them by default.
Adjust include list in quic_conn-t.h. This file is included in many QUIC
source, so it is useful to keep as lightweight as possible. Note that
connection/QUIC MUX are transformed into forward declaration for better
layer separation.
Insert some missing includes statement in QUIC source files. This was
detected after the next commit which adjust the include list used in
quic_conn-t.h file.
quic-conn layer has to handle itself STREAM frames after MUX release. If
the stream was already seen, it is probably only a retransmitted frame
which can be safely ignored. For other streams, an active closure may be
needed.
Thus it's necessary that quic-conn layer knows the highest stream ID
already handled by the MUX after its release. Previously, this was done
via <nb_streams> member array in quic-conn structure.
Refactor this by replacing <nb_streams> by two members called
<stream_max_uni>/<stream_max_bidi>. Indeed, it is unnecessary for
quic-conn layer to monitor locally opened uni streams, as the peer
cannot by definition emit a STREAM frame on it. Also, bidirectional
streams are always opened by the remote side.
Previously, <nb_streams> were set by quic-stream layer. Now,
<stream_max_uni>/<stream_max_bidi> members are only set one time, just
prior to QUIC MUX release. This is sufficient as quic-conn do not use
them if the MUX is available.
Note that previously, IDs were used relatively to their type, thus
incremented by 1, after shifting the original value. For simplification,
use the plain stream ID, which is incremented by 4.
Move general function to check if a stream is uni or bidirectional from
QUIC MUX to quic_utils module. This should prevent unnecessary include
of QUIC MUX header file in other sources.
In current version of the game, there is a "screen refresh" effect: the
screen is cleared before being re-drawn.
I moved the clear right after the connection is opened and removed it
from rendering time.
Stop emitting \n in errmsg for intermediate error messages, this was
emitting multiline logs and was returning to a new line in the middle of
sentences.
We don't need to emit them in acme_start_task() since the errmsg is
ouput in a send_log which already contains a \n or on the CLI which
also emits it.
When starting the ACME task with a ckch_conf which does not contain the
domains, the ACME task would segfault because it will try to dereference
a NULL in this case.
The patch fix the issue by emitting a warning when no domains are
configured. It's not done at configuration parsing because it is not
easy to emit the warning because there are is no callback system which
give access to the whole ckch_conf once a line is parsed.
No backport needed.