It can happen in some cases that the last block of an H2 transfer over
HTX is truncated. This was tracked down to a leftover of an earlier
implementation of htx_xfer_blks() causing the size of a block to be
computed incorrectly when a data block doesn't completely fit into the
target buffer. In practice it causes the EOM block to be emitted with a
wrong size, truncating the message. One way to
reproduce this is to chain two haproxy instances in h1->h2->h1 with
httpterm as the server and h2load as the client, making many requests
between 8 and 10kB over a single connection. Usually one of the very
first requests will fail.
This fix must be backported to 1.9.
Modify h2c_restart_reading() to add a new parameter telling it whether a
non-empty buffer alone should justify a new read attempt, and call it
with 0 from h2_process_demux().
If we're leaving h2_process_demux() with a non-empty buffer, it means the
frame is incomplete and we're waiting for more data, and if we already
subscribed, we'll be woken up when more data are available.
Failing to do so means we'll be woken up in a loop until more data are
available.
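A minimal sketch of the resulting helper, assuming 1.9-era names
(h2_recv_allowed(), wait_event, dbuf) which may differ slightly from the
actual code :

    static inline void h2c_restart_reading(const struct h2c *h2c, int consider_buffer)
    {
        if (!h2_recv_allowed(h2c))
            return;
        if ((!consider_buffer || !b_data(&h2c->dbuf)) &&
            (h2c->wait_event.events & SUB_RETRY_RECV))
            return; /* already subscribed, and pending data alone doesn't justify a wakeup */
        tasklet_wakeup(h2c->wait_event.task);
    }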
This should be backported to 1.9.
The deinit took place only in peer_session_release(), but in the case of
a previous call to peer_session_forceshutdown(), the session cursors
won't be reset, resulting in a bad state for new sessions of the same
peer. For instance, a table definition message could be dropped and
so all update messages would be dropped by the remote peer.
This patch moves the deinit processing directly into the force shutdown
function. A killed session remains in "ST_END" state, but the ref on the
peer is reset to NULL and the deinit will be skipped in the session
release function. The session release function continues to ensure the
deinit for "active" sessions.
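A rough sketch of the intent, with simplified and partly hypothetical
names (peer_session_deinit() stands for the cursor/state reset shared
with the release path) :

    static void peer_session_forceshutdown(struct appctx *appctx)
    {
        struct peer *peer = appctx->ctx.peers.ptr; /* hypothetical context field */

        if (peer) {
            peer_session_deinit(peer);     /* reset update cursors, current table, ... */
            appctx->ctx.peers.ptr = NULL;  /* drop the ref so that the release
                                            * function skips the deinit */
        }
        appctx->st0 = PEER_SESS_ST_END;    /* the killed session stays in the END state */
    }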
This patch should be backported to all stable versions since peers
protocol v2.
Because we try to forward the message body infinitely, when its state is
set to DONE, we must be sure to reset the to_forward value of the
corresponding channel. Otherwise, some errors can be erroneously
triggered.
No need to backport this patch.
Displays a prefix for every address in 'show cli sockets'.
It could be 'unix@', 'ipv4@', 'ipv6@', 'abns@' or 'sockpair@'.
Could be backported to 1.9 and 1.8.
The 'show cli sockets' command was not handling abns sockets. Since
these use the AF_UNIX family, nothing was displayed in the path column
because an abns path starts with \0.
Should be backported to 1.9 and 1.8.
This patch implements the external binary support in the master worker.
To configure an external process, you need to use the program section,
for example:
    program dataplane-api
        command ./dataplane_api
Those processes are launched at the same time as the workers.
During a reload of HAProxy, those processes go through the same
sequence as a worker:
- the master is re-executed
- the master sends a USR1 signal to the program
- the master launches a new instance of the program
During a stop or restart, a SIGTERM is sent to the program.
The children variable was still used in haproxy even though it is not
required anymore, since the information about the current workers is
available in the mworker_proc linked list.
The oldpids array is also replaced by this linked list when generating
the arguments for the master re-exec.
The converter can be used to decrypt raw byte input using the AES-GCM
algorithm with the provided nonce, key and AEAD tag. This can be useful,
for example, to decrypt encrypted cookies and make decisions based on
their content.
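A hypothetical configuration sketch (the variable names are illustrative
and assumed to have been set earlier), with the key size in bits followed
by the nonce, key and AEAD tag as arguments :

    http-request set-var(txn.plain) req.cook(session),b64dec,aes_gcm_dec(128,txn.nonce,txn.key,txn.tag)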
AIX defines ip_v as ip_ff.ip_fv in netinet/ip.h using a macro, and
unfortunately we have a local variable with this exact name in code
which includes the same header file. Let's rename the variable to
ip_ver to fix this.
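For illustration, the clash looks like this (the first line mirrors what
the AIX header does, the rest is illustrative) :

    #define ip_v  ip_ff.ip_fv   /* from AIX netinet/ip.h */

    int ip_v;                   /* preprocesses to "int ip_ff.ip_fv;" -> build error */
    int ip_ver;                 /* the renamed variable doesn't clash */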
I found on an (old) AIX 5.1 machine that stdint.h didn't exist, while
inttypes.h, which is expected to include it, does exist and provides the
desired functionality.
As explained here, stdint being just a subset of inttypes for use in
freestanding environments, it's probably always OK to switch to inttypes
instead:
https://pubs.opengroup.org/onlinepubs/009696799/basedefs/stdint.h.html
Also it's even clearer here in the autoconf doc :
https://www.gnu.org/software/autoconf/manual/autoconf-2.61/html_node/Header-Portability.html
"The C99 standard says that inttypes.h includes stdint.h, so there's
no need to include stdint.h separately in a standard environment.
Some implementations have inttypes.h but not stdint.h (e.g., Solaris
7), but we don't know of any implementation that has stdint.h but not
inttypes.h"
The current initcall implementation relies on dedicated sections (one
section per init stage) to store the initcall descriptors. Then upon
startup, these sections are scanned from beginning to end and all items
found there are called in sequence.
On platforms like AIX or Cygwin it seems difficult to figure out the
beginning and end of sections, as the linker doesn't seem to provide
the corresponding symbols. In order to replace this, this patch
simply implements an array of singly linked lists (one per init stage)
which are fed using constructors for each registration call. These
constructors are declared static, with a name depending on their
line number in the file, in order to avoid name clashes. The final
effect is the same, except that the method is slightly more expensive
in that it explicitly produces code to register these initcalls :
    $ size haproxy.sections haproxy.constructor
       text    data     bss     dec     hex filename
    4060312  249176 1457652 5767140  57ffe4 haproxy.sections
    4062862  260408 1457652 5780922  5835ba haproxy.constructor
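A simplified sketch of the constructor-based scheme (the real macros
differ, notably in the number of arguments and stage handling) :

    struct initcall {
        void (*fn)(void *);
        void *arg;
        struct initcall *next;
    };

    enum init_stage { STG_PREPARE, STG_REGISTER, STG_INIT, STG_SIZE }; /* reduced */

    static struct initcall *initcall_head[STG_SIZE]; /* one list per init stage */

    #define CONCAT2_(a, b)  a ## b
    #define CONCAT2(a, b)   CONCAT2_(a, b)

    /* each registration expands to a static constructor whose name embeds
     * __LINE__ to avoid clashes, and which links its descriptor into the
     * list of its stage.
     */
    #define REGISTER_INITCALL(stage, func, a)                          \
        __attribute__((constructor))                                   \
        static void CONCAT2(__initcb_, __LINE__)(void)                 \
        {                                                              \
            static struct initcall ic = { .fn = (func), .arg = (a) };  \
            ic.next = initcall_head[(stage)];                          \
            initcall_head[(stage)] = &ic;                              \
        }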
This mechanism is enabled as an alternative to the default one when
build option USE_OBSOLETE_LINKER is set. This option is currently
enabled by default only on AIX and Cygwin, and may be attempted for
any target which fails to build complaining about missing symbols
__start_init_* and/or __stop_init_*.
Once confirmed as a reliable fix, this will likely have to be backported
to 1.9 where AIX and Cygwin do not build anymore.
Older Solaris and AIX versions do not have unsetenv(). This adds a
fairly simple implementation which scans the environment, for use
with those systems. It simply requires passing the define in the
"DEFINE" macro at build time, like this :
    DEFINE="-Dunsetenv=my_unsetenv"
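A minimal sketch of such an implementation (the real one may differ in
details) :

    #include <string.h>

    extern char **environ;

    int my_unsetenv(const char *name)
    {
        size_t len = strlen(name);
        char **p = environ;

        while (*p) {
            if (strncmp(*p, name, len) == 0 && (*p)[len] == '=') {
                char **q = p;
                /* shift out the matching entry, including the NULL */
                do { q[0] = q[1]; } while (*q++);
            } else
                p++;
        }
        return 0;
    }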
Most modern platforms don't touch the output buffer when the size
argument is zero, but there exist a few old ones (like AIX 5 and
possibly Tru64) where the output will be dereferenced anyway, probably
to write the trailing null, crashing the process. memprintf() relies on
this zero-size call to measure the desired length.
There is a very simple workaround to this consisting in passing a pointer
to a character instead of a NULL pointer. It was confirmed to fix the issue
on AIX 5.1.
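A sketch of the workaround : the zero size still prevents any real write
on conforming platforms, while the valid one-byte target absorbs the
trailing null on the faulty ones.

    #include <stdarg.h>
    #include <stdio.h>

    static int fmt_length(const char *format, va_list args)
    {
        char dummy;  /* valid output target instead of a NULL pointer */

        return vsnprintf(&dummy, 0, format, args);
    }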
The struct http_cache_applet was fully defined at the beginning of the
file instead of just being forward-declared using an extern modifier.
Some linkers report warnings about a redefined symbol since these
really are two complete definitions.
The proper way to do this is to use extern on the first one and to
have the full definition later. However it's not permitted to have
both static and extern, so the change done in commit 0f2229943
("CLEANUP: cache: don't export http_cache_applet anymore") has to
be partially undone.
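The resulting pattern, reduced to a self-contained sketch :

    struct applet { const char *name; };      /* reduced for the example */

    /* in the header : declaration only, no storage allocated */
    extern struct applet http_cache_applet;

    /* in cache.c : the one and only full definition */
    struct applet http_cache_applet = {
        .name = "<CACHE>",
    };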
This should be backported to 1.9 for sanity but has no effect on
most platforms. However on 1.9 the extern keyword must also be
added to include/types/cache.h.
When using TCP health checks (tcp-check connect), it is possible to
crash with a segfault when, for reasons yet to be understood, the
protocol family is unknown.
In the function tcpcheck_main(), proto is dereferenced without a prior
NULL test, leading to the segfault during the proto->connect
dereference.
The line has remained unmodified since it was introduced in commit
69e273f3fcfbfb9cc0fb5a09668faad66cfbd36b. This was the only use of
proto (or more specifically, the return of protocol_by_family()) that
was unprotected; all other callsites perform the test for a NULL
pointer.
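A sketch of the fix in tcpcheck_main() (the arguments and the error code
are illustrative) :

    int ret = SF_ERR_INTERNAL;

    proto = protocol_by_family(conn->addr.to.ss_family);
    /* proto may be NULL when the family is unknown : test it before
     * use, as every other caller of protocol_by_family() does.
     */
    if (proto && proto->connect)
        ret = proto->connect(conn, 0);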
This patch should be backported to 1.9, 1.8, 1.7, and 1.6.
In __event_srv_chk_r() and __event_srv_chk_w(), don't bother subscribing
if we're waiting for a handshake but already had a connection error. We
will never be able to send/receive anything on that connection anyway,
the conn_stream is probably about to be destroyed, and we would crash if
the tasklet were woken up.
I'm not convinced we need to subscribe here at all anyway, but I'd rather
modify the check code as little as possible.
This should be backported to 1.9.
A bug occurs when the sigchld handler is called while a child which is
not in the process list has just exited, or while the process list is
empty. The child variable is then either left uninitialized or set to
the wrong child entry, which can lead to a free of this uninitialized
variable or of the wrong child.
This can lead to a crash of the master during a stop or a reload.
It is not supposed to happen with a worker which was created by the
master. A possible cause could be a fork made by a dependency
(openssl, lua ?).
This patch strengthens the handling of the missing child by doing the
free only if the child was found.
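A sketch of the strengthened logic (simplified from the real handler) :

    struct mworker_proc *child = NULL, *it;

    /* look the exited PID up in the list of processes we created */
    list_for_each_entry(it, &proc_list, list) {
        if (it->pid == exited_pid) {   /* exited_pid : illustrative name */
            child = it;
            break;
        }
    }

    if (child) {
        LIST_DEL(&child->list);
        free(child);
    }
    /* else : not one of ours (e.g. forked by a library), nothing to free */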
This patch must be backported to 1.9.
When an HTTP request with an empty body is received, the flag
HTX_SL_F_BODYLESS is set on the HTX start-line block. It is true if the
header content-length is explicitly set to 0 or if it is omitted for a
non-chunked request.
On the server side, when the request is reformatted, because
HTX_SL_F_BODYLESS is set, the flag H1_MF_CLEN is added on the request
parser. It is done to avoid adding a transfer-encoding header on
bodyless requests. But if a content-length header is explicitly set to
0, when it is parsed, because H1_MF_CLEN is already set, the function
h1_parse_cont_len_header() returns 0, meaning the header can be dropped.
So in such a case, a request without any content-length header is sent
to the server.
Some servers seem to reject empty POST requests with a 411 error when
there is no content-length header. So to fix this issue, on the output
side, only headers with an invalid content length are skipped, i.e. only
when the function h1_parse_cont_len_header() returns a negative value.
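A sketch of the output-side condition :

    /* only drop a content-length header when it is really invalid, i.e.
     * when h1_parse_cont_len_header() reports an error ; a 0 return
     * (e.g. "content-length: 0" already accounted for by H1_MF_CLEN)
     * must keep the header in the output message.
     */
    if (h1_parse_cont_len_header(h1m, &value) < 0)
        goto skip_hdr;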
This patch must be backported to 1.9.
This patch fixes a bug introduced by commit 045e0d4 where it was really
a bad idea to reset the peer applet context before shutting down the
underlying session. This had the side effect of cancelling the
re-initializations done by peer_session_release(), and especially
prevented this function from re-initializing the current table pointer
which is there to force the announcement of stick-table definitions when
reconnecting. Consequently the peers could send stick-table update
messages without a first stick-table definition message. As this is
forbidden, this led the remote peers to close the sessions.
It's not convenient not to know the status of default options, and it
requires the user to know which options are enabled by default on each
target. With this patch, a new "Features list" line is added to the
output of "haproxy -vv" to report the whole list of known features
with their respective status. They're prefixed with a "+" when enabled
or a "-" when disabled. The "USE_" prefix is removed for clarity.
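For illustration, the reported line may look like this (the exact set
depends on the build target) :

    Features list : +EPOLL -KQUEUE -MY_EPOLL -MY_SPLICE +NETFILTER -PCRE +POLL +THREAD -ZLIB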
There were tabs in between macro names and their values in their
definition, forcing everyone to do the same, and causing some
mangling in patches. Let's fix all this.
Commit 645635d was not sufficient to implement the heartbeat feature.
When no heartbeat was received before its timeout expired, the session
was not closed, due to the fact that process_peer_sync(), the task
responsible for handling the heartbeat and session expirations, only
checked the heartbeat timeout and sent a heartbeat message if it had
expired. This had the side effect of leaving the session open. On the
remote side, a peer which receives a heartbeat message, even if not
supported, does not close the session. Furthermore it is not sufficient
to update the ->reconnect peer member field to schedule a peer session
release.
With this patch, a peer is flagged as alive as soon as it receives any
peer protocol message (not only heartbeat messages). When no updates
must be sent, we first check the reconnection timeout (->reconnect peer
member field). If it has expired, we really shut down the session if the
peer is not alive; but if the peer is seen as alive, we reset this flag
and update ->reconnect for the next period.
If the reconnection timeout has not expired, we then check the heartbeat
timeout, which is there only to emit heartbeat messages upon expiration.
If it has expired, as before this patch, we add 3s to the heartbeat
timeout to schedule the next heartbeat message, then emit a heartbeat
message waking up the peer I/O handler.
In every case we update the task expiration to the earliest of the
reconnection time and the heartbeat time, so as to be sure to check
these two ->reconnect and ->heartbeat timers again.
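A condensed sketch of the resulting logic in process_peer_sync(), using
haproxy's tick helpers (the flag and timeout names follow the description
above and may not match the code exactly) :

    /* when there is nothing to send for this peer */
    if (tick_is_expired(ps->reconnect, now_ms)) {
        if (!(ps->flags & PEER_F_ALIVE)) {
            /* no message received during the whole period : kill the session */
            peer_session_forceshutdown(ps);
        }
        else {
            /* the peer proved alive : rearm for the next period */
            ps->flags &= ~PEER_F_ALIVE;
            ps->reconnect = tick_add(now_ms, MS_TO_TICKS(PEER_RECONNECT_TIMEOUT));
        }
    }
    else if (tick_is_expired(ps->heartbeat, now_ms)) {
        /* only emits a heartbeat message, never kills the session */
        ps->heartbeat = tick_add(now_ms, MS_TO_TICKS(PEER_HEARTBEAT_TIMEOUT));
        ps->flags |= PEER_F_HEARTBEAT;
        appctx_wakeup(ps->appctx);
    }
    task->expire = tick_first(ps->reconnect, ps->heartbeat);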
This flag is set by the stream layer to request an abort, and results in
CF_SHUTR being set once the abort is performed. However, by analogy with
the send side, the flag was removed once CF_SHUTR was set, thus we lost
the information about the cause of the shutr. This is what creates
the confusion that sometimes arises between client and server aborts.
This patch makes sure we don't remove this flag anymore in this case.
All call sites only use it to perform the shutr and already check it
against CF_SHUTR, so no condition needs to be updated to take this into
account.
Some later, more careful changes may consist in refining the conditions
where we report a client or a server reset, ignoring SHUTR when
SHUTR_NOW is set, so that we don't report such misleading information
anymore.
Recent commit 63768a63d ("MEDIUM: mux-h2: Don't mix the end of the message
with the end of stream") introduced a race which may manifest itself with
small connection counts on large objects and large server timeouts in
legacy mode. Sometimes h2s_close() is called while the data layer is
subscribed to read events but nothing in the chain can cause this wake-up
to happen and some streams stall for a while at the end of a transfer
until the server timeout strikes and ends the stream.
We need to wake the stream up if it's subscribed to rx events there,
which is what this patch does. When the patch above is backported to
1.9, this patch will also have to be backported.
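A sketch of the wake-up, following the same pattern as h2s_notify_recv()
(1.9-era field names) :

    /* the data layer subscribed for rx events but no I/O will ever
     * happen again on this closed stream : wake it up ourselves.
     */
    if (h2s->recv_wait) {
        struct wait_event *sw = h2s->recv_wait;

        sw->events &= ~SUB_RETRY_RECV;
        tasklet_wakeup(sw->task);
        h2s->recv_wait = NULL;
    }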
Previous commit 3ea351368 ("BUG/MEDIUM: h2: Remove the tasklet from the
task list if unsubscribing.") uncovered an issue which needs to be
addressed in the scheduler's API. The function task_remove_from_task_list()
was initially designed to remove a task from the running tasklet list from
within the scheduler, and had to be used in h2 to abort pending I/O events.
However this function was not designed to be idempotent, occasionally
causing a double removal from the tasklet list, with the second one doing
nothing but affecting the apparent task count and making haproxy use
100% CPU on some tests consisting in stopping the client during some
transfers. The h2_unsubscribe() function can sometimes be called upon
stream exit after an error where the tasklet was possibly already
removed, so it needs such an idempotent removal function.
This patch does 2 things :
- it renames task_remove_from_task_list() to
  __task_remove_from_tasklet_list() to discourage users from calling
  it. Also note the fix in the naming since it's a tasklet list and
  not a task list. This function is still used from the scheduler.
- it adds a new, idempotent, task_remove_from_tasklet_list() function
  which does nothing if the task is already not in the tasklet list.
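A sketch of the resulting pair (simplified; the accounting line is
illustrative) :

    /* internal version, reserved to the scheduler : unconditional removal */
    static inline void __task_remove_from_tasklet_list(struct task *t)
    {
        struct tasklet *tl = (struct tasklet *)t;

        LIST_DEL(&tl->list);
        LIST_INIT(&tl->list);
        task_per_thread[tid].task_list_size--;
    }

    /* public, idempotent version : safe to call even when already removed */
    static inline void task_remove_from_tasklet_list(struct task *t)
    {
        if (!LIST_ISEMPTY(&((struct tasklet *)t)->list))
            __task_remove_from_tasklet_list(t);
    }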
This patch will need to be backported where the commit above is backported.
In h2_unsubscribe(), if we unsubscribe on SUB_CALL_UNSUBSCRIBE, then remove
ourselves from the sending_list, and remove the tasklet from the task list.
We're probably about to destroy the stream anyway, so we don't want the
tasklet to run, or to stay in the sending_list, or it could lead to a crash.
This should be backported to 1.9.
In h2_deferred_shut(), don't just set h2s->send_wait to NULL; instead, use
the same logic as in h2_snd_buf() and only do so if we successfully sent
data (or if we don't want to send them anymore). Setting it to NULL
unconditionally can lead to crashes.
This should be backported to 1.9.
In h2s_notify_send(), use the new sending_list instead of the old way of
setting h2s->send_wait to NULL; failing to do so may lead to crashes.
This should be backported to 1.9.
In h2_deferred_shut(), only attempt to destroy the h2s if h2s->cs is NULL.
h2s->cs being non-NULL means it's still referenced by the stream interface,
so it may try to use it later, and that could lead to a crash.
This should be backported to 1.9.
This commit was reverted because of bugs. Now it should be OK. The
difference with commit f52170d2f ("MEDIUM: proto_htx: Switch to infinite
forwarding if there is no data filter") is that when infinite forwarding
is enabled, the message is switched to the state HTTP_MSG_DONE if the
flag CF_EOI is set.
Because the flag CF_SHUTR is no longer set by the H2 multiplexer to mark
the end of the message, we can rely on it again to detect aborts. There
is no more need to check the flag SI_FL_CLEAN_ABRT when the option
abortonclose is enabled. So this option should work as before for H2
clients.
This patch must be backported to 1.9 with the previous EOI patches.
As for the H2 multiplexer, when the end of a message is detected, the
flag CS_FL_EOI is set on the conn_stream.
This patch should be backported to 1.9.
The H2 multiplexer now sets CS_FL_EOI when it receives a frame with the ES
flag. And when the H2 stream is closed, it sets the flag CS_FL_REOS.
This patch should be backported to 1.9.
The flag CF_EOI is now set on the input channel when the flag CS_FL_EOI is set
on the corresponding conn_stream. In addition, if a read activity is reported
when this flag is set, the stream is woken up.
This patch should be backported to 1.9.
In the commit 93e02d8b7 ("MINOR: proto-http/proto-htx: Make error handling
clearer during data forwarding"), a return clause was removed by mistake
in the function http_request_forward_body(). This bug does not seem to
have any visible impact.
This patch must be backported to 1.9.
On the send path, try to be fair, and make sure the first stream to
attempt to send data will actually be the first to send data when it's
possible (ie when the mux's buffer is not full anymore).
To do so, use a separate list element for the sending_list, and only remove
the h2s from the send_list/fctl_list if we successfully sent data. If we did
not, we'll keep our place in the list, and will be able to try again next time.
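A sketch of the two list elements and of the removal rule (simplified) :

    struct h2s {
        /* ... */
        struct list list;          /* position in send_list or fctl_list */
        struct list sending_list;  /* separate position in the sending_list */
    };

    /* after a send attempt : only give our place up if data really left */
    if (sent) {
        LIST_DEL(&h2s->list);      /* leave send_list/fctl_list */
        LIST_INIT(&h2s->list);
    }
    /* otherwise keep our rank so we're still the first to try next time */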
This should be backported to 1.9.