Add a new pipe, one per thread, so that we can write on it to wake a thread
sleeping in a poller, and use it to wake threads supposed to take care of a
task, if they are all sleeping.
It remained some fragments of the old buffers API in debug messages, here and
there.
This was caused by the recent buffer API changes, no backport is needed.
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.
Most of the changes were made with spatch using this coccinelle script :
@rule_d1@
typedef chunk;
struct chunk chunk;
@@
- chunk.str
+ chunk.area
@rule_d2@
typedef chunk;
struct chunk chunk;
@@
- chunk.len
+ chunk.data
@rule_i1@
typedef chunk;
struct chunk *chunk;
@@
- chunk->str
+ chunk->area
@rule_i2@
typedef chunk;
struct chunk *chunk;
@@
- chunk->len
+ chunk->data
Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
Now the buffers only contain the header and a pointer to the storage
area which can be anywhere. This will significantly simplify buffer
swapping and will make it possible to map chunks on buffers as well.
The buf_empty variable was removed, as now it's enough to have size==0
and area==NULL to designate the empty buffer (thus a non-allocated head
is the empty buffer by default). buf_wanted for now is indicated by
size==0 and area==(void *)1.
The channels and the checks now embed the buffer's head, and the only
pointer is to the storage area. This slightly increases the unallocated
buffer size (3 extra ints for the empty buffer) but considerably
simplifies dynamic buffer management. It will also later permit to
detach unused checks.
The way the struct buffer is arranged has proven quite efficient on a
number of tests, which makes sense given that size is always accessed
and often first, followed by the othe ones.
There's no real reason to have a specific scheduler for applets anymore, so
nuke it and just use tasks. This comes with some benefits, the first one
being that applets cannot induce high latencies anymore since they share
nice values with other tasks. Later it will be possible to configure the
applets' nice value. The second benefit is that the applet scheduler was
not very thread-friendly, having a big lock around it in prevision of this
change. Thus applet-intensive workloads should now scale much better with
threads.
Some more improvement is possible now : some applets also use a task to
handle timers and timeouts. These ones could now be simplified to use only
one task.
In commit abbf607 ("MEDIUM: cli: Add payload support") some cli keywords
without usage message have been added at the beginning of the keywords
array.
cli_gen_usage_usage_msg() use the kw->usage == NULL to stop generating
the usage message for the current keywords array. With those keywords at
the beginning, the whole array in cli.c was ignored in the usage message
generation.
This patch now checks the keyword itself, allowing a keyword without
usage message anywhere in the array.
In order to use arbitrary data in the CLI (multiple lines or group of words
that must be considered as a whole, for example), it is now possible to add a
payload to the commands. To do so, the first line needs to end with a special
pattern: <<\n. Everything that follows will be left untouched by the CLI parser
and will be passed to the commands parsers.
Per-command support will need to be added to take advantage of this
feature.
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
Some error paths (especially those followed when running out of memory)
can set the error message to NULL. In order to avoid a crash, use a
generic message ("Out of memory") when this case arises.
It should be backported to 1.8.
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
This function will be called from the CLI's "show fd" command to append some
extra mux-specific information that only the mux handler can decode. This is
supposed to help collect various hints about what is happening when facing
certain anomalies.
Commit 35b1b48 ("MINOR: cli: make "show fd" report the mux and mux_ctx
pointers when available") introduced an accidental build warning due to
a missing const statement.
This is handy to quickly distinguish H2 connections as well as to easily
access the h2c context. It could be backported to 1.8 to help during
troubleshooting sessions.
This bug was introduced in 48bcfdab2 ("MEDIUM: dumpstat: make the CLI
parser understand the backslash as an escape char").
This should be backported to 1.8.
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
Since 200b0fac ("MEDIUM: Add support for updating TLS ticket keys via
socket"), 4147b2ef ("MEDIUM: ssl: basic OCSP stapling support."),
4df59e9 ("MINOR: cli: add socket commands and config to prepend
informational messages with severity") and 654694e1 ("MEDIUM: stats/cli:
add support for "set table key" to enter values"), commands
'set ssl tls-key', 'set ssl ocsp-response', 'set severity-output' and
'set table' do not always send an extra LF at the end of their outputs.
This is required as mentioned in doc/management.txt:
"Since multiple commands may be issued at once, haproxy uses the empty
line as a delimiter to mark an end of output for each command"
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
The correct keyword is 'ssl-sessions' (vs. 'ssl-session').
The typo was introduced in 45c742be05 ('REORG: cli: move the "set
rate-limit" functions to their own parser').
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
This printf() was added in f886e3478d ("MINOR: cli: Add a command to
send listening sockets.").
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
This bug is present since 7a4a0ac71d ("MINOR: cli: add a new "show fd"
command").
This should be backported to 1.8.
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
An fd cache entry might be removed and added at the end of the list, while
another thread is parsing it, if that happens, we may miss fd cache entries,
to avoid that, add a new field in the struct fdtab, "added_mask", which
contains a mask for potentially affected threads, if it is set, the
corresponding thread will set its bit in fd_cache_mask, to avoid waiting in
poll while it may have more work to do.
Create a local, per-thread, fdcache, for file descriptors that only belongs
to one thread, and make the global fd cache mostly lockless, as we can get
a lot of contention on the fd cache lock.
The "show fd" command on the CLI doesn't list the last FD in use since
it doesn't include maxfd. We don't need to use maxfd here anyway as
global.maxsock will do the job pretty well and removes this dependency.
This patch may be backported to 1.8.
Since the fd update tables are per-thread, we need to have a bit per
thread to indicate whether an update exists, otherwise this can lead
to lost update events every time multiple threads want to update the
same FD. In practice *for now*, it only happens at start time when
listeners are enabled and ask for polling after facing their first
EAGAIN. But since the pollers are still shared, a lost event is still
recovered by a neighbor thread. This will not reliably work anymore
with per-thread pollers, where it has been observed a few times on
startup that a single-threaded listener would not always accept
incoming connections upon startup.
It's worth noting that during this code review it appeared that the
"new" flag in the fdtab isn't used anymore.
This fix should be backported to 1.8.
A number of counters have been added at special places helping better
understanding certain bug reports. These counters are maintained per
thread and are shown using "show activity" on the CLI. The "clear
counters" commands also reset these counters. The output is sent as a
single write(), which currently produces up to about 7 kB of data for
64 threads. If more counters are added, it may be necessary to write
into multiple buffers, or to reset the counters.
To backport to 1.8 to help collect more detailed bug reports.
Rename the global variable "proxy" to "proxies_list".
There's been multiple proxies in haproxy for quite some time, and "proxy"
is a potential source of bugs, a number of functions have a "proxy" argument,
and some code used "proxy" when it really meant "px" or "curproxy". It worked
by pure luck, because it usually happened while parsing the config, and thus
"proxy" pointed to the currently parsed proxy, but we should probably not
rely on this.
[wt: some of these are definitely fixes that are worth backporting]
The prefix "auto:" can be added before the process set to let HAProxy
automatically bind a process to a CPU by incrementing process and CPU sets. To
be valid, both sets must have the same size. No matter the declaration order of
the CPU sets, it will be bound from the lower to the higher bound.
Examples:
# all these lines bind the process 1 to the cpu 0, the process 2 to cpu 1
# and so on.
cpu-map auto:1-4 0-3
cpu-map auto:1-4 0-1 2-3
cpu-map auto:1-4 3 2 1 0
# bind each process to exaclty one CPU using all/odd/even keyword
cpu-map auto:all 0-63
cpu-map auto:even 0-31
cpu-map auto:odd 32-63
# invalid cpu-map because process and CPU sets have different sizes.
cpu-map auto:1-4 0 # invalid
cpu-map auto:1 0-3 # invalid
This is useful to know what thread(s) an fd is scheduled to be
handled on. It's worth noting that at the moment the "show fd"d
doesn't seem totally thread-safe.
All the references to connections in the data path from streams and
stream_interfaces were changed to use conn_streams. Most functions named
"something_conn" were renamed to "something_cs" for this. Sometimes the
connection still is what matters (eg during a connection establishment)
and were not always renamed. The change is significant and minimal at the
same time, and was quite thoroughly tested now. As of this patch, all
accesses to the connection from upper layers go through the pass-through
mux.
For HTTP/2 we'll need some buffer-only equivalent functions to some of
the ones applying to channels and still squatting the bi_* / bo_*
namespace. Since these names have kept being misleading for quite some
time now and are really getting annoying, it's time to rename them. This
commit will use "ci/co" as the prefix (for "channel in", "channel out")
instead of "bi/bo". The following ones were renamed :
bi_getblk_nc, bi_getline_nc, bi_putblk, bi_putchr,
bo_getblk, bo_getblk_nc, bo_getline, bo_getline_nc, bo_inject,
bi_putchk, bi_putstr, bo_getchr, bo_skip, bi_swpbuf
I misplaced the "if (!fdt.owner)" test so it can occasionally crash
when dumping an fd that's already been closed but still appears in
the table. It's not critical since this was not pushed into any
release nor backported though.
Since everything is self contained in proto_uxst.c there's no need to
export anything. The same should be done for proto_tcp.c but the file
contains other stuff that's not related to the TCP protocol itself
and which should first be moved somewhere else.
Adds cli commands to change at runtime whether informational messages
are prepended with severity level or not, with support for numeric and
worded severity in line with syslog severity level.
Adds stats socket config keyword severity-output to set default behavior
per socket on startup.
Historically listeners used to have a handler depending on the upper
layer. But now it's exclusively process_stream() and nothing uses it
anymore so it can safely be removed.
Till now connections used to rely exclusively on file descriptors. It
was planned in the past that alternative solutions would be implemented,
leading to member "union t" presenting sock.fd only for now.
With QUIC, the connection will need to continue to exist but will not
rely on a file descriptor but a connection ID.
So this patch introduces a "connection handle" which is either a file
descriptor or a connection ID, to replace the existing "union t". We've
now removed the intermediate "struct sock" which was never used. There
is no functional change at all, though the struct connection was inflated
by 32 bits on 64-bit platforms due to alignment.
Recent commit 7a4a0ac ("MINOR: cli: add a new "show fd" command") introduced
a warning when building at -O2 and above. The compiler doesn't know if a
variable's value might have changed between two if blocks so warns that some
values might be used uninitialized, which is not the case. Let's simply
initialize them to shut the warning.
This one dumps the fdtab for all active FDs with some quickly interpretable
characters to read the flags (like upper case=set, lower case=unset). It
can probably be improved to report fdupdt[] and/or fdinfo[] but at least it
provides a good start and allows to see how FDs are seen. When the fd owner
is a connection, its flags are also reported as it can help compare with the
polling status, and the target (fe/px/sv) as well. When it's a listener, the
listener's state is reported as well as the frontend it belongs to.
This patch changes the stats socket rights for allowing the sending of
listening sockets.
The previous behavior was to allow any unix stats socket with admin
level to send sockets. It's not possible anymore, you have to set this
option to activate the socket sending.
Example:
stats socket /var/run/haproxy4.sock mode 666 expose-fd listeners level user process 4
The current level variable use only 2 bits for storing the 3 access
level (user, oper and admin).
This patch add a bitmask which allows to use the remaining bits for
other usage.
When running with multiple process, if some proxies are just assigned
to some processes, the other processes will just close the file descriptors
for the listening sockets. However, we may still have to provide those
sockets when reloading, so instead we just try hard to pretend those proxies
are dead, while keeping the sockets opened.
A new global option, no-reused-socket", has been added, to restore the old
behavior of closing the sockets not bound to this process.
Add a new command that will send all the listening sockets, via the
stats socket, and their properties.
This is a first step to workaround the linux problem when reloading
haproxy.
Now we exclusively use xprt_get(XPRT_RAW) instead of &raw_sock or
xprt_get(XPRT_SSL) for &ssl_sock. This removes a bunch of #ifdef and
include spread over a number of location including backend, cfgparse,
checks, cli, hlua, log, server and session.