Commit 41f28b3c53 ("DEV: phash: Update 414 and 431 status codes to phash")
accidentally committed a.out, resulting in build/checkout issues when
rebuilding locally. Let's drop it.
This should be backported to 3.1.
Nick Ramirez reported the following error while testing the h2-tracer.lua
script:
Lua filter 'h2-tracer' : [state-id 0] runtime error: /etc/haproxy/h2-tracer.lua:227: attempt to index a nil value (field '?') from /etc/haproxy/h2-tracer.lua:227: in function line 109.
It is caused by h2ff being indexed with an out-of-bounds value. Indeed,
h2ff is indexed with the frame type, which can potentially be > 9
(neither common nor observed during Willy's tests), while h2ff only
defines indexes from 0 to 9.
The fix was provided by Willy; it consists in skipping the h2ff indexing
when the frame type is > 9. It was confirmed that doing so fixes the
error.
This is also needed in order to make the requested number of CPUs
appear. For now we don't reroute to the original sysconf() call, so we
return -1,EINVAL for all other info.
The following config is sufficient to trace H2 exchanges between a client
and a server:
global
    lua-load "dev/h2/h2-tracer.lua"

listen h2_sniffer
    mode tcp
    bind :8002
    filter lua.h2-tracer #hex
    server s1 127.0.0.1:8003
The commented "hex" argument will also display full frames in hex (not
recommended). The connections are prefixed with a 3-hex digit number in
order to also support a bit of multiplexing without impacting the reading
too much. The screen is split in two, with the request on the left and
the response on the right. Here's an example of what it does between an
haproxy backend and an haproxy frontend both in H2, when submitted a
curl request for /?s=30k handled by httpterm:
[001] ### req start
[001] [PREFACE len=24]
[001] [SETTINGS sid=0 len=24 (bytes=24)]
[001] | ### res start
[001] | [SETTINGS sid=0 len=18 (bytes=27)]
[001] | [SETTINGS ACK sid=0 len=0 (bytes=0)]
[001] [SETTINGS ACK sid=0 len=0 (bytes=56)]
[001] [HEADERS EH+ES sid=1 len=47 (bytes=47)]
[001] | [HEADERS EH sid=1 len=101 (bytes=15351)]
[001] | [DATA sid=1 len=15126 (bytes=15241)]
[001] | [DATA sid=1 len=1258 (bytes=106)]
[001] | ... -106 = 1152
[001] | ... -1152 = 0
[001] [WINDOW_UPDATE sid=1 len=4 (bytes=43)]
[001] [WINDOW_UPDATE sid=0 len=4 (bytes=30)]
[001] [WINDOW_UPDATE sid=1 len=4 (bytes=17)]
[001] [WINDOW_UPDATE sid=0 len=4 (bytes=4)]
[001] | [DATA ES sid=1 len=14336 (bytes=14336)]
[001] [WINDOW_UPDATE sid=0 len=4 (bytes=4)]
[001] ### req end: 31080 bytes total
[001] | [GOAWAY sid=0 len=8 (bytes=8)]
[001] | ### res end: 31097 bytes total
It deserves some improvements. For instance, at the moment it does not
verify the preface; any 24 bytes will work. It does not perform any
protocol validation either. Detecting some issues such as out-of-sequence
frames could be helpful. But it already helps as-is.
Connection errors can be detected via the connect/recv/send syscalls,
but also when reported by the poller. So dedicated events, at the FD
level, are introduced to distinguish between these cases.
The term_events tool was updated accordingly.
This development tool can be used to convert a string representing
termination event logs to its human-readable representation. Several
strings may be converted at a time. To do so, several arguments can be
specified on the command line, or they can be provided on STDIN using
the "-" argument.
Here is an example:
> term_events f2x2f4x4 m2m4m1 e2e1 s2s1S1 E1 M1 F1
### f2x2f4x4 : fd:shutr > xprt:shutr > fd:snd_err > xprt:snd_err
### m2m4m1 : muxc:shutr > muxc:snd_err > muxc:shutw
### e2e1 : se:eos > se:shutw
### s2s1S1 : strm:eos > strm:shutw > STRM:shutw
### E1 : SE:shutw
### M1 : MUXC:shutw
### F1 : FD:shutw
The make target "dev/term_events/term_events" must be used to compile it.
It's convenient to have a shared lib also be able to work as a wrapper.
But recent glibc broke support for this dual-mode thing some time ago:
https://patchwork.ozlabs.org/project/glibc/patch/20190312130235.8E82C89CE49C@oldenburg2.str.redhat.com/
https://stackoverflow.com/questions/59074126/loading-executable-or-executing-a-library
Trying to preload such an executable indeed returns:
ERROR: ld.so: object '/path/to/ncpu.so' from LD_PRELOAD cannot be preloaded (cannot dynamically load position-independent executable): ignored.
Note that the code still supports it since libc.so is both an executable
and a lib. The approach taken here is the same as in the nousr.so wrapper.
It consists in dropping the DF_1_PIE flag from the resulting executable
since it's what the dynamic linker is looking for. This flag is found in
FLAGS_1 in the .dynamic section. As readelf -a suggests, it's after the
tag 0x6ffffffb. The value is 0x08000000. We're using objdump to figure
out the length and offset of the struct, dd to extract the 3 parts, and
sed to patch the binary.
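For illustration only: the commit performs this patch externally with
objdump, dd and sed as described above; expressed in C (assuming a
64-bit ELF whose .dynamic section is already mapped writable, and a
hypothetical drop_df_1_pie() helper), the same edit would look like:

    #include <elf.h>

    /* Clear DF_1_PIE (0x08000000) from the DT_FLAGS_1 (tag 0x6ffffffb)
     * entry; this is the bit the dynamic linker checks before refusing
     * to preload a PIE executable.
     */
    static void drop_df_1_pie(Elf64_Dyn *dyn)
    {
        for (; dyn->d_tag != DT_NULL; dyn++) {
            if (dyn->d_tag == DT_FLAGS_1) {
                dyn->d_un.d_val &= ~(Elf64_Xword)DF_1_PIE;
                break;
            }
        }
    }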
It's likely that it will only work on 64-bit little endian, though
tests should be performed to see what to do on other platforms. At least
on x86_64, ld.so is happy and it remains possible to use the binary as
a .so, and that's the platform where most of the development happens, so
that's fine.
In any case the wrapper and the standard shared lib are still made two
distinct files so that it's possible to use the non-patched version on
unsupported OSes or architectures.
The wrapper mode allows the lib to insert itself via LD_PRELOAD before
loading haproxy, which is often more convenient since it allows passing
the number of CPUs as an argument. However, this mode is no longer
supported by
modern glibcs, so a future patch will come to implement a trick that was
tested to work at least on x86.
Collecting captures of /sys isn't sufficient for NUMA development because
haproxy detects the number of CPUs at boot time and will not be able to
inspect more than this number. Let's just have a small utility, loaded
using LD_PRELOAD, that reports a fake number of CPUs. It checks the
NCPU environment variable and, if set, presents that number of CPUs;
otherwise it exposes the maximum supported number.
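As a minimal sketch of the principle (hypothetical code; the fallback
constant and the exact set of intercepted calls are illustrative
assumptions, the real utility may differ):

    #include <errno.h>
    #include <sched.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Interpose sysconf() via LD_PRELOAD to report a fake CPU count
     * taken from the NCPU variable. Other requests are not rerouted
     * to the original sysconf() and simply fail with -1,EINVAL.
     */
    long sysconf(int name)
    {
        if (name == _SC_NPROCESSORS_CONF || name == _SC_NPROCESSORS_ONLN) {
            const char *ncpu = getenv("NCPU");
            return ncpu ? atol(ncpu) : CPU_SETSIZE; /* illustrative maximum */
        }
        errno = EINVAL;
        return -1;
    }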
This is a collection of functions I'm occasionally using to navigate
core dumps. Only working ones were extracted.
Those requiring knowledge of global variables (e.g. pools, proxy list)
use the ones extracted from the post_mortem struct. That one is defined
in post-mortem.gdb and needs to be initialized using "pm_init post_mortem"
or "pm_init <pointer>". From this point a number of global variables are
accessible even if symbols are missing; these are then used by other
functions to dump streams, threads, pools, proxies etc.
The files can be sourced or copy-pasted into a gdb session. It's worth
trying to keep them up-to-date, as the old ones used to navigate through
tasks are no longer usable due to massive changes.
It's useful to instantly see how many patches of each category have
already been backported and are still pending, let's count them and
report them at the top of the page.
Decode QUIC MUX connection and stream elements via qcc_show_flags() and
qcs_show_flags(). The flag definitions have been moved outside of
USE_QUIC to ease compilation of the flags binary.
It is not possible yet to use it. Idle connections and pipelining mode
are not supported for now. But it should be possible to open a SPOP
connection,
perform the HELLO handshake, send a NOTIFY frame based on data produced by
the client side and receive the corresponding ACK frame to transfer its
content to the client side.
The related issue is #2502.
The script hadn't been updated since it was introduced, and the
hard-coded field 12 doesn't match anymore (it's 16 now). Let's just
use "grep -o cflg..." to extract the desired part more flexibly.
This can be backported at least to 3.0, probably further, but it
will need to be tested prior to this. Better not bring it too far,
it's only used when debugging.
We're now locking the tail while looking for some room in the ring. In
fact it's still while writing to it, but the goal definitely is to get
rid of the lock ASAP. For this we reserve the topmost bit of the tail
as a lock, which may have as a possible visible effect that buffers will
be limited to 2GB instead of 4GB on 32-bit machines (though in practice,
good luck allocating more than 2GB contiguous on 32-bit), but in
practice since the size is read with atol() and some operating systems
limit it to LONG_MAX unless passing negative numbers, the limit is
already there.
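As a hedged sketch of the idea (hypothetical names; the real code
differs), locking the tail could look like this:

    /* The top bit of the tail offset serves as a lock bit, which is
     * why only 31 offset bits remain on 32-bit machines.
     */
    #define RING_TAIL_LOCK (~((size_t)-1 >> 1)) /* topmost bit of size_t */

    static size_t ring_lock_tail(size_t *tail_ptr)
    {
        size_t tail;

        /* spin until we're the one that set the lock bit */
        do {
            tail = __atomic_fetch_or(tail_ptr, RING_TAIL_LOCK, __ATOMIC_ACQUIRE);
        } while (tail & RING_TAIL_LOCK);
        return tail; /* previous, unlocked value */
    }

    static void ring_unlock_tail(size_t *tail_ptr, size_t new_tail)
    {
        __atomic_store_n(tail_ptr, new_tail, __ATOMIC_RELEASE);
    }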
For now the impact on x86_64 is significant (a drop from 2.35M to
1.4M msg/s with 48 threads on a 24-core EPYC) but this situation is
only temporary so that changes remain reviewable and bisectable.
Other approaches were attempted, such as using XCHG instead, which is
slightly faster on x86 with low thread counts (but causes more write
contention), and forces readers to stall under heavy traffic because
they can't access a valid value for the queue anymore. A CAS requires
preloading the value and is less good on ARMv8.1. XADD could also be
considered with 12-13 upper bits of the offset dedicated to locking,
but that looks overkill.
We really want to let the readers and writers act on different areas, so
we want to have the tail and the head on separate cache lines, themselves
separate from the rest of the ring. Doing so improves the performance from
2.15 to 2.35M msg/s at 48 threads on a 24-core EPYC.
This increases the header space from 32 to 192 bytes when threads are
enabled. But since we already have the header size available in the file,
haring remains able to detect the aligned vs unaligned formats and call
dump_v2a() when aligned is detected.
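Illustrative sketch of such a layout (64-byte cache lines assumed; the
actual struct and macros differ):

    /* size and rsvd share the first cache line, tail and head each get
     * their own, and the storage area starts on a fourth boundary,
     * hence 3 x 64 = 192 header bytes with threads enabled.
     */
    struct ring_hdr {
        size_t size;                                 /* storage size */
        size_t rsvd;                                 /* reserved header bytes */
        size_t tail __attribute__((aligned(64)));    /* written by producers */
        size_t head __attribute__((aligned(64)));    /* written by consumers */
        char   area[0] __attribute__((aligned(64))); /* messages start here */
    };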
The purpose is to store a head and a tail that are independent so that
we can further improve the API to update them independently from each
other.
The struct was arranged like the original one so that as long as a ring
has its head set to zero (i.e. no recycling) it will continue to work.
The new format is already detectable thanks to the "rsvd" field which
indicates the number of reserved bytes at the beginning. It's located
where the buffer's area pointer previously was, so that older versions
of haring can continue to open the ring in repair mode, and newer ones
can use the fact that the upper bits of that variable are zero to guess
that it's working with the new format instead of the old one. Also let's
keep in mind that the layout will further change to place some alignment
constraints.
The haring tool was thus updated based on this: it detects that the
rsvd field is smaller than a page and that the sum of it with the size
equals the mapped size, in which case it uses the new dump_v2() function
instead of dump_v1(). The new function also creates a buffer from the
ring's area, size, head and tail, and calls the generic one so that no
other code had to be adapted.
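The detection heuristic can be summarized as follows (hedged sketch,
hypothetical names):

    /* In the old format this word held the buffer's area pointer, so
     * its upper bits were non-zero; in the new one it's a small count
     * of reserved header bytes, and rsvd + size covers the whole
     * mapping.
     */
    static int ring_is_v2(size_t rsvd, size_t size, size_t mapped_size)
    {
        return rsvd < 4096 && rsvd + size == mapped_size;
    }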
Instead of emitting a warning, since we don't need the ring struct
anymore, we can just read what we need, parse the buffer and use the
advertised offset. Thus for now -f is simply ignored.
By splitting the initialization and the parsing of the ring, we'll ease
the support for multiple ring sizes and get rid of the annoyances of the
optional lock.
haring needs to be self-sufficient about the ring format so that it continues
to build when the ring API changes. Let's import the struct ring definition
and call it "ring_v1".
In issue #2427 Ilya reports that gcc-14 rightfully complains about
sizeof() being placed in the left term of calloc(). There's no impact
but it's a bad pattern that gets copy-pasted over time. Let's fix the
few remaining occurrences (debug.c, halog, udp-perturb).
This can be backported to all branches, and the irrelevant parts dropped.
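For the record, the pattern in question (with a hypothetical struct;
calloc() takes the number of members first, then the member size):

    struct foo *p;

    p = calloc(sizeof(*p), 10); /* bad: sizeof() in the left term */
    p = calloc(10, sizeof(*p)); /* good: nmemb first, size second */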
For HPACK-encoded headers (particularly with huffman encoding), it's
really necessary to support hex sequences as they appear in RFC7541
examples, so let's support hex digit pairs with -R.
Now it's possible to do this to send GET https://www.example.com/ :
(dev/h2/mkhdr.sh -t p; dev/h2/mkhdr.sh -t s;
dev/h2/mkhdr.sh -t h -i 1 -f es,eh \
-R '8286 8441 0f77 7777 2e65 7861 6d70 6c65 2e63 6f6d ') | nc 0 8080
We have a few places where we're checking many status codes. The sets
are so small that just a multiply, a shift and an optional xor are
sufficient to turn that into a smaller set. The program used to produce
them is hackish as it makes it easy to fiddle with various operations
manually and to experiment by brute-forcing a pair of integers for the
mul and the xor. It also supports producing an incomplete table that
gives an easier modulo operation. Let's commit it as-is so that it can
be reused later (e.g. if new status codes are introduced).
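As an illustration of the technique (made-up set and constants; the
real ones are brute-forced by the tool):

    /* One multiply and one shift map each code of the set
     * {200,400,404,408,500} to a distinct slot of a small table, and a
     * single compare confirms membership; no xor is needed here.
     */
    static const unsigned short status_tbl[8] = { 408, 200, 0, 400, 0, 404, 0, 500 };

    static int status_in_set(unsigned int status)
    {
        return status_tbl[((status * 11u) >> 4) & 7] == status;
    }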
Some rare commit messages are really too large because they contain
code excerpts in the message body or are release commits with their
changelog. In this case, instead of leaving an empty file that will
be silently ignored, let's produce an output message indicating that
the verdict is uncertain, with an explanation stating that there was
an error.
With -r it's possible to pass raw data that will be interpreted by
printf so it even supports \x sequences. E.g. for a RST_STREAM, let's
just use \x00\x00\x00\x00.
In order to spot old patches marked "wait" that have not yet been
backported, it's convenient to be able to click "all" to start the
review from the first patch, then deselect "no", "uncertain" and
"backported", leaving only "wait" and "yes". This reveals all pending
patches that have still not been backported, including those prior to
the last review, allowing older patches marked "wait" that have not
yet been picked to be reconsidered.
The statuses[] table was pre-filled from the shell code during
initialization based on the evaluation and the buttons pre-checked
accordingly, but upon reload, the checked buttons are preserved and
the statuses reinitialized, leading to a different status and color
on lines that were changed.
In practice we don't need this table and we can directly check each
button's state. This makes sure that the displayed state is consistent
with the checked buttons and allows the statuses to be preserved upon
reloads to benefit from updates. Only the start of the review is reset
upon reload now (this allows the latest backport state to be considered).
Of course, a full reload (shift-ctrl-R) continues to reset the form.
This is a set of scripts, prompts and howtos to have an LLM read commit
messages and determine with great accuracy whether the patch's author
intended for the patch to be backported ASAP, backported after some time,
not backported, or unknown state. It provides all this in an interactive
interface making it easy to adjust choices and proceed with what was
selected. This has been improving over the last 9 months, as it helped to
spot patches for a handful of backport sessions, and was only limited by
usability issues (UI). Now that these issues are solved, let's commit the
tool in its current working state. It currently runs every hour in a
crontab for me and started to prove useful since the last update, so it
should be considered in a usable state now, especially since this latest
update reaches close to 100% accuracy compared to a human choice, so it
saves precious development time and may allow stable releases to be
emitted more regularly.
There's a detailed readme, please read it before complaining about the
ugliness of the UI :-)