It is not especially a bug fixed. But APPCTX_FL_EOS and APPCTX_FL_ERROR
flags must be handled first. These flags are set by the applet itself and
should mark the end of all processing. So there is not reason to get the
output buffer in first place.
This patch could be backported as far as 3.0.
The output buffer must be available to process a command, at least to be
able to emit error messages. When this buffer is full or cannot be
allocated, we must wait. In that case, we must take care to notify the SE
will not consume input data. It is important to avoid wakeup in loop,
especially when the client aborts.
When the output buffer is available again and no longer full, and the CLI
applet is waiting for a command line, it must notify it will consume input
data.
This patch must be backported as far as 3.0.
Replace ->li quic_conn pointer to struct listener member by ->target which is
an object type enum and adapt the code.
Use __objt_(listener|server)() where the object type is known. Typically
this is were the code which is specific to one connection type (frontend/backend).
Remove <server> parameter passed to qc_new_conn(). It is redundant with the
<target> parameter.
GSO is not supported at this time for QUIC backend. qc_prep_pkts() is modified
to prevent it from building more than an MTU. This has as consequence to prevent
qc_send_ppkts() to use GSO.
ssl_clienthello.c code is run only by listeners. This is why __objt_listener()
is used in place of ->li.
Empty lines was not properly parsed and could lead to crashes because the
last argument was parsed outside of the cmdline buffer. Indeed, the last
argument is parsed to look for an eventual payload pattern. It is started
one character after the newline at the end of the command line. But it is
only valid for an non-empty command line.
So, now, this case is properly detected when we leave if an empty line is
detected.
This patch must be backported to 3.2.
while new_server() takes the parent proxy as argument and even assigns
srv->proxy to the parent proxy, it didn't actually inserted the server
to the parent proxy server list on success.
The result is that sometimes we add the server to the list after
new_server() is called, and sometimes we don't.
This is really error-prone and because of that hooks such as
REGISTER_POST_SERVER_CHECK() which as run for all servers listed in
all proxies may not be relied upon for servers which are not actually
inserted in their parent proxy server list. Plus it feels very strange
to have a server that points to a proxy, but then the proxy doesn't know
about it because it cannot find it in its server list.
To prevent errors and make proxy->srv list reliable, we move the insertion
logic directly under new_server(). This requires to know if we are called
during parsing or during runtime to either insert or append the server to
the parent proxy list. For that we use PR_FL_CHECKED flag from the parent
proxy (if the flag is set, then the proxy was checked so we are past the
init phase, thus we assume we are called during runtime)
This implies that during startup if new_server() has to be cancelled on
error paths we need to call srv_detach() (which is now exposed in server.h)
before srv_drop().
The consequence of this commit is that REGISTER_POST_SERVER_CHECK() should
not run reliably on all servers created using new_server() (without having
to manually loop on global servers_list)
d3f928944 ("BUG/MINOR: cli: Issue an error when too many args are passed
for a command") added a new check to prevent the command to run when
too many arguments are provided. In this case an error is reported.
However it turns out this check (despite marked for backports) was
ineffective prior to 20ec1de21 ("MAJOR: cli: Refacor parsing and
execution of pipelined commands") as 'p' pointer was reset to the end of
the buffer before the check was executed.
Now since 20ec1de21, the check works, but we have another issue: we may
read past initialized bytes in the buffer because 'p' pointer is always
incremented in a while loop without checking if we increment it past 'end'
(This was detected using valgrind)
To fix the issue introduced by 20ec1de21, let's only increment 'p' pointer
if p < end.
For 3.2 this is it, now for older versions, since d3f928944 was marked for
backport, a sligthly different approach is needed:
- conditional p increment must be done in the loop (as in this patch)
- max arg check must moved above "fill unused slots" comment where p is
assigned to the end of the buffer
This patch should be backported with d3f928944.
While humans can find it convenient to enter the worker process in prompt
mode, for external tools it will not be convenient to have to systematically
disable it. A better approach is to replicate the master socket's mode
there, since it has already been configured to suit the user: interactive,
prompt and timed modes are automatically passed to the worker process.
This makes the using the worker commands more natural from the master
process, without having to systematically adapt it for each new connection.
Support the same syntax in master mode as in worker mode in order to
configure the prompt. The only thing is that for now the master doesn't
have a non-interactive mode and it doesn't seem necessary to implement
it, so we only support the interactive and prompt modes. However the code
was written in a way that makes it easy to change this later if desired.
Now the prompt mode can more finely be configured between non-interactive
(default), interactive without prompt, and interactive with prompt. This
will ease the usage from automated tools which are not necessarily
interested in having to consume '> ' after each command nor displaying
"+" on payload lines. This can also be convenient when coming from the
master CLI to keep the same output format.
The CLI's "prompt" command toggles two distinct things:
- displaying or hiding the prompt at the beginning of the line
- single-command vs interactive mode
These are two independent concepts and the prompt mode doesn't
always cope well with tools that would like to upload data without
having to read the prompt on return. Also, the master command line
works in interactive mode by default with no prompt, which is not
consistent (and not convenient for tools). So let's start by splitting
the bit in two, and have a new APPCTX_CLI_ST1_INTER flag dedicated
to the interactive mode. For now the "prompt" command alone continues
to toggle the two at once.
The new adhoc parser for the '@@' prefix forgot to require the presence
of the LF character marking the end of the line. This is the reason why
entering incomplete commands would display garbage, because the line was
expected to have its LF character replaced with a zero.
The problem is well illustrated by using socat in raw mode:
socat /tmp/master.sock STDIO,raw,echo=0
then entering "@@1 show info" one character at a time would error just
after the second "@". The command must take care to report an incomplete
line and wait for more data in such a case.
This reverts commit 0e94339eaf.
This patch was in fact fixing the symptom, not the cause. The root cause
of the problem is that the parser was processing an incomplete line when
looking for '@@'. When the LF is present, this problem does not exist
as it's properly replaced with a zero. This can be verified using socat
in raw mode:
socat /tmp/master.sock STDIO,raw,echo=0
Then entering "@@1 show info" one character at a time will immediately
fail on "@@" without going further. A subsequent patch will fix this.
No backport is needed.
When the CLI applet was refactord in the commit 20ec1de21 ("MAJOR: cli:
Refacor parsing and execution of pipelined commands"), a regression was
introduced. The applet shutdown was not longer handled when the applet was
waiting for the next command line. It is especially visible when a client
timeout occurred because the client connexion is no longer closed.
To fix the issue, the test on the SE_FL_SHW flag was reintroduced in
CLI_ST_PARSE_CMDLINE state, but only is there is no pending input data.
It is a 3.2-specific issue. No backport needed.
When '@@' alone is sent on the master CLI (no trailing LF), we get an
error that displays anything past these two characters in the buffer
since there's no room for a \0. Let's make sure to limit the length of
the process name in this case. No backport is needed since this was added
with 00c967fac4 ("MINOR: master/cli: support bidirectional communications
with workers").
There are several fields in the appctx structure only used by the CLI. To
make things cleaner, all these fields are now placed in a dedicated context
inside the appctx structure. The final goal is to move it in the service
context and add an API for cli commands to get a command coontext inside the
cli context.
CLI_ST_GETREQ state was renamed into CLI_ST_PARSE_CMDLINE and CLI_ST_PARSEREQ
into CLI_ST_PROCESS_CMDLINE to reflect the real action performed in these
states.
Before this patch, when pipelined commands were received, each command was
parsed and then excuted before moving to the next command. Pending commands
were not copied in the input buffer of the applet. The major issue with this
way to handle commands is the impossibility to consume inputs from commands
with an I/O handler, like "show events" for instance. It was working thanks
to a "bug" if such commands were the last one on the command line. But it
was impossible to use them followed by another command. And this prevents us
to implement any streaming support for CLI commands.
So we decided to refactor the command line parsing to have something similar
to a basic shell. Now an entire line is parsed, including the payload,
before starting commands execution. The command line is copied in a
dedicated buffer. "appctx->chunk" buffer is used for this purpose. It was an
unsed field, so it is safe to use it here. Once the command line copied, the
commands found on this line are executed. Because the applet input buffer
was flushed, any input can be safely consumed by the CLI applet and is
available for the command I/O handler. Thanks to this change, "show event
-w" command can be followed by a command. And in theory, it should be
possible to implement commands supporting input data streaming. For
instance, the Tetris like lua applet can be used on the CLI now.
Note that the payload, if any, is part of the command line and must be fully
received before starting the commands processing. It means there is still
the limitation to a buffer, but not only for the payload but for the whole
command line. The payload is still necessarily at the end of the command
line and is passed as argument to the last command. Internally, the
"appctx->cli_payload" field was introduced to point on the payload in the
command line buffer.
This patch is quite huge but it cannot easily be splitted. It should not
introduced significant changes.
When a bidirection connection with no command is establisehd with a worker
(so "@@<pid>" alone), a "prompt" command is automatically added to display
the worker's prompt and enter in interactive mode in the worker context.
However, till now, an unfinished command line is sent, with a semicolon
instead of a newline at the end. It is not exactly a bug because this
works. But it is not really expected and could be a problem for future
changes.
So now, a full command line is sent: the "prompt" command finished by a
newline character.
When a command is parsed to split it in an array of arguments, by default,
at most 64 arguments are supported. But no warning was emitted when there
were too many arguments. Instead, the arguments above the limit were
silently ignored. It could be an issue for some commands, like "add server",
because there was no way to know some arguments were ignored.
Now an error is issued when too many arguments are passed and the command is
not executed.
This patch should be backported to all stable versions.
These ones were added by mistake during the change of the cfgparse
mechanism in 3.1, but they're corrupting the output of "debug counters"
by leaving stray ']' on their own lines. We could possibly check them
all once at boot but it doens't seem worth it.
This should be backported to 3.1.
Some rare commands in the worker require to keep their input open and
terminate when it's closed ("show events -w", "wait"). Others maintain
a per-session context ("set anon on"). But in its default operation
mode, the master CLI passes commands one at a time to the worker, and
closes the CLI's input channel so that the command can immediately
close upon response. This effectively prevents these two specific cases
from being used.
Here the approach that we take is to introduce a bidirectional mode to
connect to the worker, where everything sent to the master is immediately
forwarded to the worker (including the raw command), allowing to queue
multiple commands at once in the same session, and to continue to watch
the input to detect when the client closes. It must be a client's choice
however, since doing so means that the client cannot batch many commands
at once to the master process, but must wait for these commands to complete
before sending new ones. For this reason we use the prefix "@@<pid>" for
this. It works exactly like "@" except that it maintains the channel
open during the whole execution. Similarly to "@<pid>" with no command,
"@@<pid>" will simply open an interactive CLI session to the worker, that
will be ended by "quit" or by closing the connection. This can be convenient
for the user, and possibly for clients willing to dedicate a connection to
the worker.
In this patch we try to use the proxy API init functions as much as
possible to avoid code redundancy and prevent proxy initialization
errors. As such, we prefer using alloc_new_proxy() and setup_new_proxy()
instead of manually allocating the proxy pointer and performing the
base init ourselves.
Thanks to the previous commits, we now know that "wait srv-removable"
does not require thread isolation, as long as 3372a2ea00 ("BUG/MEDIUM:
queues: Stricly respect maxconn for outgoing connections") and c880c32b16
("MINOR: stream: decrement srv->served after detaching from the list")
are present. Let's just get rid of thread_isolate() here, which can
consume a lot of CPU on highly threaded machines when removing many
servers at once.
On reload, the new worker requests bound FDs to the old one. The old worker
sends them in message of at most 252 FDs. Each message is acknowledged by
the new worker. All messages sent or received by the old worker are handled
manually via sendmsg/recv syscalls. So the old worker must be sure consume
all the ACK replies. However, the last one was never consumed. So it was
considered as a command by the CLI applet. This issue was hidden since
recently. But it was the root cause of the issue #2862.
Note this last ack is also the first one when there are less than 252 FDs to
transfer.
This patch must be backported to all stable versions.
Commit 7214dcd ("BUG/MEDIUM: applet: Don't pretend to have more data to
handle EOI/EOS/ERROR") revealed a bug with the CLI applet. Pending input
data when the applet is in CLI_ST_END state were never consumed or dropped,
leading to a wakeup loop.
The CLI applet implements its own snd_buf callback function. It is important
it consumes all pending input data. Otherwise, the applet is woken up in
loop until it empties the request buffer. Another way to fix the issue would
be to report an error. But in that case, it seems reasonnable to drop these
data.
The issue can be observed on reload, in master/worker mode, because of issue
about the last ACK message which was never consummed by the _getsocks()
command.
This patch should fix the issue #2862. It must be backported to 3.1 with the
commit above.
The crappy epoll API stroke again with reloads and transferred FDs.
Indeed, when listening sockets are retrieved by a new worker from a
previous one, and the old one finally stops listening on them, it
closes the FDs. But in this case, since the sockets themselves were
not closed, epoll will not unregister them and will continue to
report new activity for these in the old process, which can only
observe, count an fd_poll_drop event and not unregister them since
they're not reachable anymore.
The unfortunate effect is that long-lasting old processes are woken
up at the same rate as the new process when accepting new connections,
and can waste a lot of CPU. Accept rates divided by 8 were observed on
a small test involving a slow transfer on 10 connections facing a reload
every second so that 10 processes were busy dealing with them while
another process was hammering the service with new connections.
Fortunately, years ago we implemented a flag FD_CLONED exactly for
similar purposes. Let's simply mark transferred FDs with FD_CLONED
so that the process knows that these ones require special treatment
and have to be manually unregistered before being closed. This does
the job fine, now old processes correctly unregister the FD before
closing it and no longer receive accept events for the new process.
This needs to be backported to all stable versions. It only affects
epoll, as usual, and this time in combination with transferred FDs
(typically reloads in master-worker mode). Thanks to Damien Claisse
for providing all detailed measurements and statistics allowing to
understand and reproduce the problem.
In _getsocks() functuoin, when we failed to set the unix socket in
non-blocking mode, a goto to "out" label led to loop infinitly. To fix the
issue, we must only let the function exit.
This patch should be backported to all stable versions.
Some errors in parse function of _getsocks commands were not properly handled
and immediately returned, leading to a memory leak on cmsgbuf and tmpbuf
buffers.
To fix the issue, instead of immediately return with -1, we jump to "out"
label. Returning 1 intead of -1 in that case is valid.
This was reported by Coverity in #2841: CIDs 1587773 and 1587772.
This patch should be backported as far as 2.4.
Since the CLI was updated to use the new applet API, it should no longer set
directly the SE flags. Instead, the corresponding applet flags must be set,
using the applet API (appet_set_*). It is true for the CLI I/O handler but also
for the commands parse function and I/O callback function.
This patch should be backported as far as 3.0.
This patch adds a counter of close() on file descriptors in the fdtab.
The goal is to better detect if reported events concern the current or
a previous file descriptor. For now the counter is only added, and is
showed in "show fd" as "gen". We're reusing unused space at the end of
the struct. If it's needed for something more important later, this
patch can be reverted.
That's essentially in order to help with debugging strange cases like
the occasional epoll issues/races, by keeping a counter of how many
times an FD was taken over since last inserted. The room is available
so let's use it. If it's needed later, this patch can easily be reverted.
The counter is also reported in "show fd" as "tkov".
cli_snd_buf() analyzez input line by line. Before this patch it has always
scanned a given line for the presence of '\r' followed by '\n'.
This is only needed for strings, that contain the commands itself like
"show ssl cert\n", "set ssl cert test.pem <<\n".
In case of strings, which contain the command's payload, like
"-----BEGIN CERTIFICATE-----\r\n", '\r\n' should be preserved
as is.
This patch fixes the GitHub issue #2818.
This patch should be backported in v3.1 and in v3.0.
Some master process' initialization steps are conditioned by receiving the
READY message from worker (pidfile creation, forwarding READY message to the
launching parent). So, master process can not do these initialization routines
before.
If the master process fails, while creating pid or forwarding the READY to the
parent in daemon mode, he exits with a proper alert message. In daemon mode we
no longer see such message, as process is already detached from the tty.
To fix this, as these alerts could be very useful, let's detach the master
process from the tty after his last initialization steps in _send_status.
As daemonization fork happens now very early and before the master-worker
fork, if master or worker processes fail during the initialization, some
critical errors can't be reported to stdout. The launching (parent) process in
such cases exits with 0. This makes an impression, that master and his worker
have successfully started at background, which really complicates the
operations.
In the previous commit a pipe was added to make daemonized child communicate
with his parent. Let's add the same logic to master-worker mode. Up to
receiving the READY message from the worker, master will "forward" it via the
pipe to the launching process. Launching process can obtain master's exit
status, if the master fails to start and nothing has been written in the pipe.
This fix should be backported only in 3.1.
If master process can't open a pidfile, there is no sense to send SIGTTIN to
oldpids, as it will exit. So, old workers will terminate as well. It's better
to send the last alert to the log about unrecoverable error, because master is
already in its polling loop.
For the standalone mode we should keep the previous logic in this case: send
SIGTTIN to old process and unbind listeners for the new one. So, it's better
to put this error path in main(), as it's done when other configuration settings
can't be applied.
This patch should be backported only in 3.1.
When a new master process is launched like below:
./haproxy -W -D -p ha.pid -sf $(cat ha.pid)...
The old master process and its workers do not stop. Since the master-worker
refactoring, the code, which sends USR1/TERM to old pids from -sf, is called
only for the standalone mode. In master-worker mode we should receive the READY
message from the newly forked worker at first, in order to be able to terminate
the previous master.
So, to fix this, let's terminate the previous master in _send_status(), where
we parse the READY message from the newly forked worker. And let's continue to
use oldpids array, as it was in 3.0, in order to stop the workers, launched
before the reload.
This patch should be backported only in 3.1.
Pidfile should be created at the latest initialization stage, when we are
sure, that process is able to start successfully, otherwise PID value, written
in this file is no longer valid.
So, for the standalone mode, let's move the block, which opens the pidfile and
let's put it just before applying "chroot". In master-worker mode, master
doesn't perform chroot. So it creates the pidfile, only when the "READY"
message from the newly forked worker is received.
This should be backported only in 3.1
Since sd_notify() is now implemented in src/systemd.c, there is no need
anymore to build its support conditionnally with USE_SYSTEMD.
This patch add supports for -Ws for every build and removes the
USE_SYSTEMD build option. It also remove every reference to USE_SYSTEMD
in the documentation and the CI.
This also allows to run the reg-tests in -Ws with the new VTest support.
Before this patch, we have need to put the master CLI in debug mode to be able
to issue 'show env' command for the master process. Output of this command is
handy even for the master process context, as it allows to control its
environment variables, which could be used/modified in the 'global' section.
So, let's provide in 'show env' command structure the level ACCESS_MASTER.
This allows to see and to access this command in master CLI without putting it
in debug mode.
Before this fix, HAPROXY_CLI and HAPROXY_MASTER_CLI have contained along with
CLI sockets addresses internal sockpairs, which are used only for master CLI
(reload sockpair and sockpair shared with a worker process). These internal
sockpairs are always need to be hidden.
At the moment there is no any client, who uses sockpair addresses for the
stats listener or in order to connect to master CLI. So, let's simply not copy
these internal sockpair addresses of MASTER and GLOBAL proxy listeners.
As listeners with sockpairs are skipped and they can be presented in the
listeners list in any order, let's add semicolon separator between addresses
only in the case, when there are already some string saved in the trash and we
are sure, that we are adding a new address to it. Otherwise, we could have such
weird output:
HAPROXY_MASTER_CLI=unix@/tmp/mcli.sock;;
This fix is need to be backported in all stable versions.
After master-worker refactoring, master performs re-exec only once up to
receiving "reload" command or USR2 signal. There is no more the second
master's re-exec to free unused memory. Thus, there is no longer need to export
environment variable HAPROXY_LOAD_SUCCESS with worker process load status. This
status can be simply saved in a global variable load_status.
For now it's the same as abns. We'll need to modify sock_unix_addrcmp(),
and a few other ones to support effective path length when dealing with
the \0. Let's check with Tristan's patch for this (upcoming patch).
Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>
Now that we can easily distinguish regular UNIX socket from ABNS sockets
by simply looking at the address family, stop looking at the first byte
from addr->sun_path to guess if the socket is an ABNS one or not. Looking
at the family is straightforward and will allow to differentiate between
upcoming ABNSZ and ABNS (where looking at the first byte from path won't
help anymore).
This is a pre-requisite to adding the abnsz socket address family:
in this patch we make use of protocol API rework started by 732913f
("MINOR: protocol: properly assign the sock_domain and sock_family") in
order to implement a dedicated address family for ABNS sockets (based on
UNIX parent family).
Thanks to this, it will become trivial to implement a new ABNSZ (for abns
zero) family which is essentially the same as ABNS but with a slight
difference when it comes to path handling (ABNS uses the whole sun_path
length, while ABNSZ's path is zero terminated and evaluation stops at 0)
It was verified that this patch doesn't break reg-tests and behaves
properly (tests performed on the CLI with show sess and show fd).
Anywhere relevant, AF_CUST_ABNS is handled alongside AF_UNIX. If no
distinction needs to be made, real_family() is used to fetch the proper
real family type to handle it properly.
Both stream and dgram were converted, so no functional change should be
expected for this "internal" rework, except that proto will be displayed
as "abns_{stream,dgram}" instead of "unix_{stream,dgram}".
Before ("show sess" output):
0x64c35528aab0: proto=unix_stream src=unix:1 fe=GLOBAL be=<NONE> srv=<none> ts=00 epoch=0 age=0s calls=1 rate=0 cpu=0 lat=0 rq[f=848000h,i=0,an=00h,ax=] rp[f=80008000h,i=0,an=00h,ax=] scf=[8,0h,fd=21,rex=10s,wex=] scb=[8,1h,fd=-1,rex=,wex=] exp=10s rc=0 c_exp=
After:
0x619da7ad74c0: proto=abns_stream src=unix:1 fe=GLOBAL be=<NONE> srv=<none> ts=00 epoch=0 age=0s calls=1 rate=0 cpu=0 lat=0 rq[f=848000h,i=0,an=00h,ax=] rp[f=80008000h,i=0,an=00h,ax=] scf=[8,0h,fd=22,rex=10s,wex=] scb=[8,1h,fd=-1,rex=,wex=] exp=10s rc=0 c_exp=
Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>
There are two parts in mworker_cli_proxy_create(): allocating and setting up
MASTER proxy and allocating and setting up servers on ipc_fd[0] of the
sockpairs shared with workers.
So, let's split mworker_cli_proxy_create() into two functions respectively.
Each of them takes **errmsg as an argument to write an error message, which may
be triggered by some subcalls. The content of this errmsg will allow to extend
the final alert message shown to user, if these new functions will fail.
The main goals of this split is to allow to move these two parts independantly
in future and makes the code of haproxy initialization in haproxy.c more
transparent.
After sending its "READY" status worker should not keep the access
to MASTER proxy, thus, it shouldn't be able to send any other commands further
to master process.
To achieve this, let's stop in master context master CLI listener attached on
the sockpair shared with worker. We do this just after receiving the worker's
status message.
Let's reintroduce systemd support in the refactored master-worker mode.
As for now, the master-worker fork happens during early initialization steps and
then the master process receieves the "READY" status message from the newly
forked worker, that has successfully loaded. Let's propagate this "READY" status
message at this moment to the systemd from the master process context
(_send_status()). We use the master process to send messages to systemd,
because it is only the process, monitored by systemd.
In master recovery mode, we also need to send to the systemd the "READY"
message, but with the status "Reload failed". "READY" will signal to systemd,
that master process is still alive, because it doesn't exit in recovery mode
and it keeps the existed worker. Status "Reload failed" will signal to user,
that something wrong has happened with the configuration. Same message logic
was originally preserved for the case, when the worker fails to read its
configuration, see on_new_child_failure() for more details.