A config containing "stats socket /path/to/socket mode admin" used to
silently start and be unusable (mode 0, level user) because the "mode"
parser doesn't take care of non-digits. Now it properly reports :
[ALERT] 276/144303 (7019) : parsing [ext-check.cfg:4] : 'stats socket' : ''mode' : missing or invalid mode 'admin' (octal integer expected)'
This can probably be backported to 1.7, 1.6 and 1.5, though reporting
parsing errors in very old versions probably isn't a good idea if the
feature was left unused for years.
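
As an illustration only, a strict octal parser along these lines rejects
anything that is not a digit between 0 and 7. The name parse_octal_mode()
and its return convention are hypothetical and are not taken from the
patch itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: parse a strictly octal mode string such as "600".
 * Returns 0 and stores the value in <mode> on success, -1 if the string
 * is empty, contains a non-octal character, or exceeds 07777.
 */
int parse_octal_mode(const char *s, unsigned int *mode)
{
	unsigned int val = 0;

	assert(mode != NULL);
	if (s == NULL || *s == '\0')
		return -1;

	for (; *s != '\0'; s++) {
		if (*s < '0' || *s > '7')
			return -1;      /* "admin" is rejected here */
		val = val * 8 + (unsigned int)(*s - '0');
		if (val > 07777)
			return -1;      /* larger than the permission bits */
	}
	*mode = val;
	return 0;
}
```

With such a check, "mode 600" yields 0600 while "mode admin" becomes a
parsing error instead of silently producing mode 0.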
Since everything is self contained in proto_uxst.c there's no need to
export anything. The same should be done for proto_tcp.c but the file
contains other stuff that's not related to the TCP protocol itself
and which should first be moved somewhere else.
cfgparse has no business directly calling each individual protocol's 'add'
function to create a listener. Now that they're all registered, better
perform a protocol lookup on the family and have a standard ->add method
for all of them.
It's a shame that cfgparse() has to make special cases of each protocol
just to cast the port to the target address family. Let's pass the port
in argument to the function. The unix listener simply ignores it.
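
A minimal sketch of the idea, assuming a small static table of protocols.
The structures, the create_listener() helper and the exact ->add signature
are simplified for illustration and do not reproduce the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

struct listener {
	int family;
	int port;               /* meaningless for AF_UNIX */
};

struct protocol {
	const char *name;
	int family;
	int nb_listeners;
	/* standard method: every protocol receives the port */
	void (*add)(struct protocol *proto, struct listener *l, int port);
};

static void tcpv4_add(struct protocol *proto, struct listener *l, int port)
{
	l->family = proto->family;
	l->port = port;
	proto->nb_listeners++;
}

static void uxst_add(struct protocol *proto, struct listener *l, int port)
{
	(void)port;             /* the unix listener simply ignores it */
	l->family = proto->family;
	l->port = 0;
	proto->nb_listeners++;
}

static struct protocol protocols[] = {
	{ "tcpv4",       AF_INET, 0, tcpv4_add },
	{ "unix_stream", AF_UNIX, 0, uxst_add  },
};

/* Look a registered protocol up by address family, NULL if unknown */
struct protocol *protocol_by_family(int family)
{
	size_t i;

	for (i = 0; i < sizeof(protocols) / sizeof(protocols[0]); i++)
		if (protocols[i].family == family)
			return &protocols[i];
	return NULL;
}

/* What the config parser would do: no per-protocol special case */
int create_listener(struct listener *l, int family, int port)
{
	struct protocol *proto = protocol_by_family(family);

	assert(l != NULL);
	if (proto == NULL)
		return -1;
	proto->add(proto, l, port);
	return 0;
}
```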
Till now connections used to rely exclusively on file descriptors. It
was planned in the past that alternative solutions would be implemented,
leading to member "union t" presenting sock.fd only for now.
With QUIC, the connection will need to continue to exist but will not
rely on a file descriptor but a connection ID.
So this patch introduces a "connection handle" which is either a file
descriptor or a connection ID, to replace the existing "union t". We've
now removed the intermediate "struct sock" which was never used. There
is no functional change at all, though the struct connection was inflated
by 32 bits on 64-bit platforms due to alignment.
James Brown reported some cases where a race condition happens between
the old and the new processes resulting in the leaving process removing
a newly bound unix socket. He gave all the details he observed here :
https://www.mail-archive.com/haproxy@formilux.org/msg25001.html
The unix socket removal was an attempt at an optimal cleanup, which
almost never works anyway since the process is supposed to be chrooted.
And in the rare cases where it works it occasionally creates trouble.
There was already a workaround in place to avoid removing this socket
when it's been inherited from a parent's file descriptor.
So let's finally kill this useless stuff now to definitely get rid of
this persistent problem.
This fix should be backported to all stable releases.
Add a new command that will send all the listening sockets, via the
stats socket, and their properties.
This is a first step to work around the Linux problem when reloading
haproxy.
There's a test after a successful synchronous connect() consisting
in waking the data layer up asap if there's no more handshake.
Unfortunately this test is run before setting the CO_FL_SEND_PROXY
flag and before the transport layer adds its own flags, so it can
indicate a willingness to send data while it's not the case and it
will have to be handled later.
This has no visible effect except a useless call to a function in
case of health checks making use of the proxy protocol for example.
Additionally a corner case where EALREADY was returned and considered
equivalent to EISCONN was fixed so that it's considered equivalent to
EINPROGRESS given that the connection is not complete yet. But this
code should never return on the first call anyway so it's mostly a
cleanup.
This fix should be backported to 1.7 and 1.6 at least to avoid
headaches during some debugging.
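
The errno handling described above can be summarized by a small
classifier. This is only a sketch: the enum and the function name are
invented for the example, and the real code does more than this.

```c
#include <assert.h>
#include <errno.h>

enum conn_progress {
	CONN_DONE,      /* connection established */
	CONN_PENDING,   /* still in progress, poll for writes */
	CONN_FAILED     /* hard error */
};

/* Classify the errno observed after a non-blocking connect().
 * <err> is 0 when connect() returned success.
 */
enum conn_progress classify_connect_errno(int err)
{
	switch (err) {
	case 0:
	case EISCONN:           /* already connected */
		return CONN_DONE;
	case EINPROGRESS:
	case EALREADY:          /* previous attempt not complete yet */
		return CONN_PENDING;
	default:
		return CONN_FAILED;
	}
}
```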
When a connect() to a unix socket returns EAGAIN we talk about
"no free ports" in the error/debug message, which only makes
sense when using TCP.
Explain connect() failure and suggest troubleshooting server
backlog size.
Abstract namespace sockets ignore the shutdown() call and do not make
it possible to temporarily stop listening. The issue it causes is that
during a soft reload, the new process cannot bind, complaining that the
address is already in use.
This change registers a new pause() function for unix sockets and
completely unbinds the abstract ones since it's possible to rebind
them later. It requires the two previous patches as well as preceding
fixes.
This fix should be backported into 1.5 since the issue appears there.
Jan Seda noticed that abstract sockets are incompatible with soft reload,
because the new process cannot bind and immediately fails. This patch marks
the binding as retryable and not fatal so that the new process can try to
bind again after sending a signal to the old process.
Note that this fix is not enough to completely solve the problem, but it
is necessary. This patch should be backported to 1.5.
When bind() fails (function uxst_bind_listener()), the fail path doesn't
consider the abstract namespace and tries to unlink paths held in
uninitialized memory (tempname and backname). See the strace excerpt;
the strings still hold the path from test1.
===============================================================================================
23722 bind(5, {sa_family=AF_FILE, path=@"test2"}, 110) = -1 EADDRINUSE (Address already in use)
23722 unlink("/tmp/test1.sock.23722.tmp") = -1 ENOENT (No such file or directory)
23722 close(5) = 0
23722 unlink("/tmp/test1.sock.23722.bak") = -1 ENOENT (No such file or directory)
===============================================================================================
This patch should be backported to 1.5.
Plain "tcp" health checks sent to a unix socket cause two connect()
calls to be made, one to connect, and a second one to verify that the
connection properly established. But with unix sockets, we get
immediate notification of success, so we can avoid this second
attempt. However we need to ensure that we'll visit the connection
handler even if there's no remaining handshake pending, so for this
we claim we have some data to send in order to enable polling for
writes if there's no more handshake.
These sockets are the same as Unix sockets except that there's no need
for any filesystem access. The address may be whatever string both sides
agree upon. This can be really convenient for inter-process communications
as well as for chaining backends to frontends.
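
On Linux, an abstract address is a sockaddr_un whose sun_path starts with
a NUL byte. The sketch below shows how an "abns@name" string could be
turned into such an address. The helper name is hypothetical, and the
zero-padding of the whole path is an assumption of this example: both
sides must agree on the address length passed to bind() and connect().

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fill <addr> from an "abns@name" string. Returns 0 on success, -1 if
 * the prefix is missing, the name is empty or it does not fit.
 */
int abns_to_sockaddr(const char *str, struct sockaddr_un *addr)
{
	static const char prefix[] = "abns@";
	const size_t plen = sizeof(prefix) - 1;
	size_t nlen;

	assert(addr != NULL);
	if (str == NULL || strncmp(str, prefix, plen) != 0)
		return -1;

	str += plen;
	nlen = strlen(str);
	if (nlen == 0 || nlen > sizeof(addr->sun_path) - 1)
		return -1;

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	/* sun_path[0] stays '\0': no filesystem entry is involved */
	memcpy(addr->sun_path + 1, str, nlen);
	return 0;
}
```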
These addresses are forced by prepending their address with "abns@" for
"abstract namespace".
We've had everything in place for this for a while now, we just missed
the connect function for UNIX sockets. Note that in order to connect to
a UNIX socket inside a chroot, the path will have to be relative to the
chroot.
UNIX sockets connect about twice as fast as TCP sockets (or consume
about half of the CPU at the same rate). This is interesting for
internal communications between SSL processes and HTTP processes
for example, or simply to avoid allocating source ports on the
loopback.
The tcp_connect_probe() function is still used to probe a dataless
connection, but it is compatible so that's not an issue for now.
Health checks are not yet fully supported since they require a port.
Using the address syntax "fd@<num>", a listener may inherit a file
descriptor that the caller process has already bound and passed as
this number. The fd's socket family is detected using getsockname(),
and the usual initialization is performed through the existing code
for that family, but the socket creation is skipped.
Whether the parent has performed the listen() call or not is not
important as this is detected.
For UNIX sockets, we immediately clear the path after preparing a
socket so that we never remove it in case an abort would happen due
to a late error during startup.
Unix permissions are per-bind configuration line and not per listener,
so let's concretize this in the way the config is stored. This avoids
some unneeded loops to set permissions on all listeners.
The access level is not part of the unix perms so it has been moved
away. Once we can use str2listener() to set all listener addresses,
we'll have a bind keyword parser for this one.
Navigating through listeners was very inconvenient and error-prone. Not to
mention that listeners were linked in reverse order and reversed afterwards.
In order to definitely get rid of these issues, we now do the following :
- frontends have a dual-linked list of bind_conf
- frontends have a dual-linked list of listeners
- bind_conf have a dual-linked list of listeners
- listeners have a pointer to their bind_conf
This way we can now navigate from anywhere to anywhere and always find the
proper bind_conf for a given listener, as well as find the list of listeners
for a current bind_conf.
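
The relationships listed above can be pictured with the simplified
structures below. Field and function names are chosen for the example
and are not meant to match the real ones.

```c
#include <assert.h>
#include <stddef.h>

struct list { struct list *n, *p; };    /* next, prev */

#define LIST_ELEM(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list *h) { h->n = h->p = h; }

static void list_append(struct list *head, struct list *el)
{
	el->p = head->p;
	el->n = head;
	head->p->n = el;
	head->p = el;
}

struct bind_conf {
	struct list by_fe;      /* element of frontend->bind_confs */
	struct list listeners;  /* head: listeners of this bind line */
};

struct listener {
	struct bind_conf *bind_conf;  /* owning bind line */
	struct list by_fe;      /* element of frontend->listeners */
	struct list by_bind;    /* element of bind_conf->listeners */
};

struct frontend {
	struct list bind_confs; /* head: all bind lines */
	struct list listeners;  /* head: all listeners */
};

void frontend_init(struct frontend *fe)
{
	list_init(&fe->bind_confs);
	list_init(&fe->listeners);
}

void bind_conf_attach(struct frontend *fe, struct bind_conf *bc)
{
	list_init(&bc->listeners);
	list_append(&fe->bind_confs, &bc->by_fe);
}

void listener_attach(struct frontend *fe, struct bind_conf *bc,
                     struct listener *l)
{
	assert(fe != NULL && bc != NULL && l != NULL);
	l->bind_conf = bc;
	list_append(&fe->listeners, &l->by_fe);
	list_append(&bc->listeners, &l->by_bind);
}

int list_count(const struct list *head)
{
	const struct list *el;
	int n = 0;

	for (el = head->n; el != head; el = el->n)
		n++;
	return n;
}
```

Elements are appended at the tail, so they stay in configuration order
and no reversal pass is needed.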
The "mode", "uid", "gid", "user" and "group" bind options were moved to
proto_uxst as they are unix-specific.
Note that previous versions had a bug here, only the last listener was
updated with the specified settings. However, it almost never happens
that bind lines contain multiple UNIX socket paths so this is not that
much of a problem anyway.
The "raw_sock" prefix will be more convenient for naming functions as
it will be prefixed with the data layer and suffixed with the data
direction. So let's rename the files now to avoid any further confusion.
The #include directive was also removed from a number of files which do
not need it anymore.
fdtab[].state was only used to know whether a connection was in progress
or an error was encountered. Instead we now use connection->flags to store
a flag for both. This way, connection management will be able to update the
connection status on I/O.
The destination address is purely a connection thing and not an fd thing.
It's also likely that later the address will be stored into the connection
and linked to by the SI.
struct fdinfo only keeps the pointer to the port range and the local port
for now. All of this also needs to move to the connection but before this
the release of the port range must move from fd_delete() to a new function
dedicated to the connection.
These pointers were used to hold pointers to buffers in the past, but
since we introduced the stream interface, they're no longer used but
they were still sometimes set.
Removing them shrinks the struct fdtab from 32 to 24 bytes on 32-bit machines,
and from 52 to 36 bytes on 64-bit machines, which is a significant saving. A
quick test shows a steady 0.5% performance gain, probably due to the better
cache efficiency.
Commit e164e7a removed get_src/get_dst setting in the stream interfaces but
forgot to set it in proto_tcp. Get the feature back because we need it for
logging, transparent mode, ACLs etc... We now rely on the stream interface
direction to know what syscall to use.
One benefit of doing it this way is that we don't use getsockopt() anymore
on outgoing stream interfaces nor on UNIX sockets.
We'll soon have an SSL socket layer, and in order to make the two easy
to tell apart, we use the name "sock_raw" to designate the one which
directly talks to the sockets without any conversion.
The previous sockstream_accept() function uses nothing from sockstream, and
is totally irrelevant to stream interfaces. Move this to the protocols.c
file which handles listeners and protocols, and call it listener_accept().
It now makes much more sense that the code dealing with listen() also handles
accept() and passes it to upper layers.
Managing listeners state is difficult because they have their own state
and can at the same time have theirs dictated by their proxy. The pause
is not done properly, as the proxy code is fiddling with sockets. By
introducing new functions such as pause_listener()/resume_listener(), we
make it a bit more obvious how/when they're supposed to be used. The
listen_proxies() function was also renamed to resume_proxies() since
it's only used for pause/resume.
This patch is the first in a series aiming at getting rid of the maintain_proxies
mess. In the end, proxies should not call enable_listener()/disable_listener()
anymore.
Since unix sockets are supported for bind, the default backlog size was not
enough to accept the traffic. The size is now inherited from the listener
to behave like the tcp listeners.
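
As a rough sketch, the value passed to listen() could be derived from the
listener as follows. The field names and the rule "explicit backlog first,
otherwise maxconn" are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

struct listener {
	int maxconn;    /* max concurrent connections on this listener */
	int backlog;    /* explicit backlog, 0 when not configured */
};

/* Value to pass as the second argument of listen() */
int listener_backlog(const struct listener *l)
{
	assert(l != NULL);
	return l->backlog > 0 ? l->backlog : l->maxconn;
}
```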
This also affects the "stats socket" backlog, which is now determined by
"stats maxconn".
There were a lot of snprintf() everywhere in the UNIX bind code. Now we
proceed as for tcp and indicate the socket path at the end between square
brackets. The code is smaller and more readable.
MAXPATHLEN may be used at other places, it's inconvenient to have it
redefined in a few files. Also, since checking it requires including
sys/param.h, some versions of it cause a macro declaration conflict
with MIN/MAX which are defined in tools.h. The solution consists in
including sys/param.h in both files so that we ensure it's loaded
before the macros are defined and MAXPATHLEN is checked.
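
The include ordering can be illustrated as follows. The fallback value
for MAXPATHLEN and the clamp_path_len() helper are arbitrary choices for
this sketch.

```c
/* System header first: some versions define MIN/MAX themselves */
#include <sys/param.h>
#include <assert.h>

/* Only checked once sys/param.h has had a chance to define it */
#ifndef MAXPATHLEN
#define MAXPATHLEN 128          /* arbitrary fallback for this sketch */
#endif

/* Local definitions are guarded so they never clash with the system's */
#ifndef MIN
#define MIN(a, b) (((a) < (b)) ? (a) : (b))
#endif
#ifndef MAX
#define MAX(a, b) (((a) > (b)) ? (a) : (b))
#endif

/* Example use: length of a path once truncated to a MAXPATHLEN buffer */
int clamp_path_len(int len)
{
	assert(len >= 0);
	return MIN(len, MAXPATHLEN - 1);
}
```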
For a long time we had two large accept() functions, one for TCP
sockets instantiating proxies, and another one for UNIX sockets
instantiating the stats interface.
A lot of code was duplicated and both did not work exactly the same way.
Now we have a stream_sock layer accept() called for either TCP or UNIX
sockets, and this function calls the frontend-specific accept() function
which does the rest of the frontend-specific initialisation.
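
The split between the generic part and the frontend-specific part might
be pictured like this. It is a sketch only: the callback signature, the
connection accounting and the stub are invented for the example, and
<cfd> stands for the descriptor that accept() would have returned.

```c
#include <assert.h>
#include <stddef.h>

struct listener;

/* Frontend-specific part: returns > 0 to keep the connection */
typedef int (*accept_cb)(struct listener *l, int cfd);

struct listener {
	int nbconn;             /* current connections */
	int maxconn;            /* limit for this listener */
	accept_cb accept;       /* e.g. proxy or stats initialization */
	void *frontend;         /* what the callback works on */
};

/* Generic part, shared by TCP and UNIX listeners */
int generic_accept(struct listener *l, int cfd)
{
	assert(l != NULL && l->accept != NULL);

	if (l->nbconn >= l->maxconn)
		return -1;      /* listener is full */

	l->nbconn++;
	if (l->accept(l, cfd) <= 0) {
		l->nbconn--;    /* refused: undo the accounting */
		return -1;
	}
	return 0;
}

/* Stand-in for a frontend accept(): refuses invalid descriptors */
int frontend_accept_stub(struct listener *l, int cfd)
{
	(void)l;
	return cfd >= 0 ? 1 : 0;
}
```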
Some code is still duplicated (session & task allocation, stream interface
initialization), and might benefit from having an intermediate session-level
accept() callback to perform such initializations. Still there are some
minor differences that need to be addressed first. For instance, the monitor
nets should only be checked for proxies and not for other connection templates.
Last, we renamed l->private as l->frontend. The "private" pointer in
the listener is only used to store a frontend, so let's rename it to
eliminate this ambiguity. When we later support detached listeners
(eg: FTP), we'll add another field to avoid the confusion.
Right now we count the incoming connection only once everything has
been allocated. Since we're planning on considering early ACL rules,
we need to count the connection earlier.
The response analyser was not emptied upon creation of a new session. In
fact it was always zero just because the last session left it in a zero state,
but in case of shared pools this cannot be guaranteed. The net effect is
that it was possible to have some HTTP (or any other) analysers on the
response path of a stats unix socket, which would reject the response.
This fix must be backported to 1.4.
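
The class of bug can be reproduced with a toy pool, as sketched below.
The free list stands in for a shared pool and all names are made up for
the example; the point is only that a recycled object keeps its previous
contents unless every field is explicitly reset.

```c
#include <assert.h>
#include <stdlib.h>

struct session {
	unsigned int req_analysers;
	unsigned int rep_analysers;
	struct session *next_free;
};

static struct session *free_list;   /* trivial stand-in for a pool */

struct session *session_new(unsigned int req_analysers)
{
	struct session *s = free_list;

	if (s != NULL)
		free_list = s->next_free;   /* recycled: stale contents */
	else
		s = malloc(sizeof(*s));
	if (s == NULL)
		return NULL;

	s->req_analysers = req_analysers;
	s->rep_analysers = 0;           /* explicit, never assumed */
	s->next_free = NULL;
	return s;
}

void session_release(struct session *s)
{
	assert(s != NULL);
	s->next_free = free_list;       /* analysers are left as they are */
	free_list = s;
}
```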
Since the last documentation cleanups, I've found more typos that I kept
in a corner instead of sending you a mail just for one character :)
--
Cyril Bonté
Some rarely used information is stored in fdtab, making it larger for no
reason (source port ranges, remote address, ...). Such information
lies there because the checks can't find it anywhere else. The goal
will be to move this information to the stream interface once the
checks make use of it.
For now, we move it to an fdinfo array. This simple change might
have improved the cache hit ratio a little bit because a 0.5%
performance increase has been measured.