The goal is to indicate how critical the allocation is, between the
least one (growing an existing buffer ring) and the topmost one (boot
time allocation for the life of the process).
The 3 tcp-based muxes (h1, h2, fcgi) use a common allocation function
to try to allocate otherwise subscribe. There's currently no distinction
of direction nor part that tries to allocate, and this should be revisited
to improve this situation, particularly when we consider that mux-h2 can
reduce its Tx allocations if needed.
For now, 4 main levels are planned, to translate how the data travels
inside haproxy from a producer to a consumer:
- MUX_RX: buffer used to receive data from the OS
- SE_RX: buffer used to place a transformation of the RX data for
a mux, or to produce a response for an applet
- CHANNEL: the channel buffer for sync recv
- MUX_TX: buffer used to transfer data from the channel to the outside,
generally a mux but there can be a few specificities (e.g.
http client's response buffer passed to the application,
which also gets a transformation of the channel data).
The other levels are a bit different in that they don't strictly need to
allocate for the first two ones, or they're permanent for the last one
(used by compression).
While HTTP makes no promises of sub-message delivery, haproxy tries to
make reasonable efforts to be friendly to applications relying on this,
particularly though the "option http-no-delay" statement. However, it
was reported that when slz compression is being used, a few bytes can
remain pending for more data to complete them in the SLZ queue when
built on a 64-bit little endian architecture. This is because aligning
blocks on byte boundary is costly (requires to switch to literals and
to send 4 bytes of block size), so incomplete bytes are left pending
there until they reach at least 32 bits. On other architecture, the
queue is only 8 bits long.
Robert Newson from Apache's CouchDB project explained that the heartbeat
used by CouchDB periodically delivers a single LF character, that it used
to work fine before the change enlarging the queue for 64-bit platforms,
but only forwards once every 3 LF after the change. This was definitely
caused by this incomplete byte sequence queuing. Zlib is not affected,
and the code shows that ->flush() is always called. In the case of SLZ,
the called function is rfc195x_flush_or_finish() and when its "finish"
argument is zero, no flush is performed because there was no such flush()
operation.
The previous patch implemented a flush() operation in SLZ, and this one
makes rfc195x_flush_or_finish() call it when finish==0. This way each
delivered data block will now provoke a flush of the queue as is done
with zlib.
This may very slightly degrade the compression ratio, but another change
is needed to condition this on "option http-no-delay" only in order to
be consistent with other parts of the code.
This patch (and the preceeding slz one) should be backported at least to
2.6, but any further change to depend on http-no-delay should not.
Make provision for being able to store both compression algorithms and
content-types to compress for both requests and responses. For now only
the responses one are used.
__comp_fetch_init() only presets the maxzlibmem, and only when both
USE_ZLIB and DEFAULT_MAXZLIBMEM are set. The intent is to preset a
default value to protect the system against excessive memory usage
when no setting is set by the user.
Nowadays the entry in the global struct is always there so there's no
point anymore in passing via a constructor to possibly set this value.
Let's go the cleaner way by always presetting DEFAULT_MAXZLIBMEM to 0
in defaults.h unless these conditions are met, and always assigning it
instead of pre-setting the entry to zero. This is more straightforward
and removes some ifdefs and the last constructor. In addition, now the
setting has a chance of being found.
The "thread_info" name was initially chosen to store all info about
threads but since we now have a separate per-thread context, there is
no point keeping some of its elements in the thread_info struct.
As such, this patch moves prev_cpu_time, prev_mono_time and idle_pct to
thread_ctx, into the thread context, with the scheduler parts. Instead
of accessing them via "ti->" we now access them via "th_ctx->", which
makes more sense as they're totally dynamic, and will be required for
future evolutions. There's no room problem for now, the structure still
has 84 bytes available at the end.
A memory allocation failure happening in comp_append_type or
comp_append_algo called while parsing compression options would have
resulted in a crash. These functions are only called during
configuration parsing.
It was raised in GitHub issue #1233.
It could be backported to all stable branches.
As we now embed the library we don't need to support the older 1.0 API
any more, so we can remove the explicit calls to slz_make_crc_table()
and slz_prepare_dist_table().
Now that SLZ is merged, let's update the makefile and compression
files to use it. As a result, SLZ_INC and SLZ_LIB are neither defined
nor used anymore.
USE_SLZ is enabled by default ("USE_SLZ=default") and can be disabled
by passing "USE_SLZ=" or by enabling USE_ZLIB=1.
The doc was updated to reflect the changes.
The tests were made on slz and the zlib parsers for memlevel and windowsize
managed to escape the change made by commit 018251667 ("CLEANUP: config: make
the cfg_keyword parsers take a const for the defproxy"). This is now fixed.
In issue #685 gcc 10 seems to find a null pointer deref in free_zlib().
There is no such case because all possible pools are tested and there's
no other one in zlib, except if there's a bug or memory corruption
somewhere else. The code used to be written like this to make sure that
any such bug couldn't remain unnoticed.
Now gcc 10 sees this theorical code path and complains, so let's just
change the code to place an explicit crash in case of no match (which
must never happen).
This might be backported if other versions are affected.
This patch fixes all the leftovers from the include cleanup campaign. There
were not that many (~400 entries in ~150 files) but it was definitely worth
doing it as it revealed a few duplicates.
There's no point splitting the file in two since only cfgparse uses the
types defined there. A few call places were updated and cleaned up. All
of them were in C files which register keywords.
There is nothing left in common/ now so this directory must not be used
anymore.
This one was not easy because it was embarking many includes with it,
which other files would automatically find. At least global.h, arg.h
and tools.h were identified. 93 total locations were identified, 8
additional includes had to be added.
In the rare files where it was possible to finalize the sorting of
includes by adjusting only one or two extra lines, it was done. But
all files would need to be rechecked and cleaned up now.
It was the last set of files in types/ and proto/ and these directories
must not be reused anymore.
The files were moved almost as-is, just dropping arg-t and auth-t from
acl-t but keeping arg-t in acl.h. It was useful to revisit the call places
since a handful of files used to continue to include acl.h while they did
not need it at all. Struct stream was only made a forward declaration
since not otherwise needed.
global.h was one of the messiest files, it has accumulated tons of
implicit dependencies and declares many globals that make almost all
other file include it. It managed to silence a dependency loop between
server.h and proxy.h by being well placed to pre-define the required
structs, forcing struct proxy and struct server to be forward-declared
in a significant number of files.
It was split in to, one which is the global struct definition and the
few macros and flags, and the rest containing the functions prototypes.
The UNIX_MAX_PATH definition was moved to compat.h.
The pretty confusing "buffer.h" was in fact not the place to look for
the definition of "struct buffer" but the one responsible for dynamic
buffer allocation. As such it defines the struct buffer_wait and the
few functions to allocate a buffer or wait for one.
This patch moves it renaming it to dynbuf.h. The type definition was
moved to its own file since it's included in a number of other structs.
Doing this cleanup revealed that a significant number of files used to
rely on this one to inherit struct buffer through it but didn't need
anything from this file at all.
Now the file is ready to be stored into its final destination. A few
minor reorderings were performed to keep the file properly organized,
making the various sections more visible (cache & lockless).
In addition and to stay consistent, memory.c was renamed to pool.c.
types/freq_ctr.h was moved to haproxy/freq_ctr-t.h and proto/freq_ctr.h
was moved to haproxy/freq_ctr.h. Files were updated accordingly, no other
change was applied.
This splits the hathreads.h file into types+macros and functions. Given
that most users of this file used to include it only to get the definition
of THREAD_LOCAL and MAXTHREADS, the bare minimum was placed into thread-t.h
(i.e. types and macros).
All the thread management was left to haproxy/thread.h. It's worth noting
the drop of the trailing "s" in the name, to remove the permanent confusion
that arises between this one and the system implementation (no "s") and the
makefile's option (no "s").
For consistency, src/hathreads.c was also renamed thread.c.
A number of files were updated to only include thread-t which is the one
they really needed.
Some future improvements are possible like replacing empty inlined
functions with macros for the thread-less case, as building at -O0 disables
inlining and causes these ones to be emitted. But this really is cosmetic.
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:
- common/config.h
- common/compat.h
- common/compiler.h
- common/defaults.h
- common/initcall.h
- common/tools.h
The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.
In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.
No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
This commit replaces the explicit pool creation that are made in
constructors with a pool registration. Not only this simplifies the
pools declaration (it can be done on a single line after the head is
declared), but it also removes references to pools from within
constructors. The only remaining create_pool() calls are those
performed in init functions after the config is parsed, so there
is no more user of potentially uninitialized pool now.
It has been the opportunity to remove no less than 12 constructors
and 6 init functions.
Most register_build_opts() calls use static strings. These ones were
replaced with a trivial REGISTER_BUILD_OPTS() statement adding the string
and its call to the STG_REGISTER section. A dedicated section could be
made for this if needed, but there are very few such calls for this to
be worth it. The calls made with computed strings however, like those
which retrieve OpenSSL's version or zlib's version, were moved to a
dedicated function to guarantee they are called late in the process.
For example, the SSL call probably requires that SSL_library_init()
has been called first.
This patch replaces a number of __decl_hathread() followed by HA_SPIN_INIT
or HA_RWLOCK_INIT by the new __decl_spinlock() or __decl_rwlock() which
automatically registers the lock for initialization in during the STG_LOCK
init stage. A few static modifiers were lost in the process, but since they
were not essential at all it was not worth extending the API to provide such
a variant.
This switches explicit calls to various trivial registration methods for
keywords, muxes or protocols from constructors to INITCALL1 at stage
STG_REGISTER. All these calls have in common to consume a single pointer
and return void. Doing this removes 26 constructors. The following calls
were addressed :
- acl_register_keywords
- bind_register_keywords
- cfg_register_keywords
- cli_register_kw
- flt_register_keywords
- http_req_keywords_register
- http_res_keywords_register
- protocol_register
- register_mux_proto
- sample_register_convs
- sample_register_fetches
- srv_register_keywords
- tcp_req_conn_keywords_register
- tcp_req_cont_keywords_register
- tcp_req_sess_keywords_register
- tcp_res_cont_keywords_register
- flt_register_keywords
Surprisingly, the compression pool was created at runtime on first use,
which is not very convenient, has performance and reliability impacts,
and even makes monitoring less easy. Let's move the pool creation at
startup time instead. This even removes the need for the spinlock in
case USE_ZLIB is not defined.
The tune.maxzlibmem setting was moved with commit 368780334 ("MEDIUM:
compression: move the zlib-specific stuff from global.h to compression.c")
but the preset value using DEFAULT_MAXZLIBMEM was incorrectly moved :
- the field is in "global" and not "global.tune"
- the trailing comma instead of semi-colon will make it either zero
(threads enabled), break (threads enabled with debugging), or cast
the memprintf's return pointer to int (threads disabled)
It simply proves that nobody ever used DEFAULT_MAXZLIBMEM since 1.8!
This needs to be backported to 1.8.
It's a bit painful to have to deal with HTTP semantics for each protocol
version (H1 and H2), and working on the version-agnostic code further
emphasizes the problem.
This patch creates http.h and http.c which are agnostic to the version
in use, and which borrow a few parts from proto_http and from h1. For
example the once thought h1-specific h1_char_classes array is in fact
dictated by RFC7231 and is used to parse HTTP headers. A few changes
were made to a few files which were including proto_http.h while they
only needed http.h.
Certain string definitions pre-dated the introduction of indirect
strings (ist) so some were used to simplify the definition of the known
HTTP methods. The current lookup code saves 2 kB of a heavily used table
and is faster than the previous table based lookup (typ. 14 ns vs 16
before).
Now the buffers only contain the header and a pointer to the storage
area which can be anywhere. This will significantly simplify buffer
swapping and will make it possible to map chunks on buffers as well.
The buf_empty variable was removed, as now it's enough to have size==0
and area==NULL to designate the empty buffer (thus a non-allocated head
is the empty buffer by default). buf_wanted for now is indicated by
size==0 and area==(void *)1.
The channels and the checks now embed the buffer's head, and the only
pointer is to the storage area. This slightly increases the unallocated
buffer size (3 extra ints for the empty buffer) but considerably
simplifies dynamic buffer management. It will also later permit to
detach unused checks.
The way the struct buffer is arranged has proven quite efficient on a
number of tests, which makes sense given that size is always accessed
and often first, followed by the othe ones.
This part is tricky, it passes a channel where we used to have a buffer,
in order to reduce the API changes during the big switch. This way all
the channel's wrappers to distinguish between input and output are
available. It also makes sense given that the compression applies on
a channel since it's in the forwarding path.
We used to have variations around buffer_total_space() and
size-buffer_len() or size-b_data(). Let's simplify all this. buffer_len()
was also removed as not used anymore.
During the migration to the second version of the pools, the new
functions and pool pointers were all called "pool_something2()" and
"pool2_something". Now there's no more pool v1 code and it's a real
pain to still have to deal with this. Let's clean this up now by
removing the "2" everywhere, and by renaming the pool heads
"pool_head_something".
This macro should be used to declare variables or struct members depending on
the USE_THREAD compile option. It avoids the encapsulation of such declarations
between #ifdef/#endif. It is used to declare all lock variables.
When haproxy is compiled without zlib or slz, the output of
haproxy -vv shows (null).
Make haproxy -vv output great again by providing the proper
information (which is what we did before).
This is for 1.8 only.
This finishes to clean up the zlib-specific parts. It also unbreaks recent
commit b97c6fb ("CLEANUP: compression: use the build options list to report
the algos") which broke USE_ZLIB due to MAXWBITS not being defined anymore
in haproxy.c.
This removes 2 #ifdef, an include, an ugly construct and a wild "extern"
declaration from haproxy.c. The message indicating that compression is
*not* enabled is not there anymore.
This warning appears when building without any compression library :
src/compression.c:143:19: warning: unused function 'init_comp_ctx' [-Wunused-function]
static inline int init_comp_ctx(struct comp_ctx **comp_ctx)
^
src/compression.c:176:19: warning: unused function 'deinit_comp_ctx' [-Wunused-function]
static inline int deinit_comp_ctx(struct comp_ctx **comp_ctx)
No backport is needed.
For quite some time, a few users have been experiencing random crashes
when compressing with zlib, from versions 1.2.3 to 1.2.8 included.
Upon thourough investigation in zlib's deflate.c, it appeared obvious
that avail_in and next_in are used during the flush operation and need
to be initialized, while admittedly it's not obvious in the documentation.
By simply forcing both values to -1 it's possible to immediately reproduce
the exact crash that these users have been experiencing :
(gdb) bt
#0 0x00007fa73ce10c00 in __memcpy_sse2 () from /lib64/libc.so.6
#1 0x00007fa73e0c5d49 in ?? () from /lib64/libz.so.1
#2 0x00007fa73e0c68e0 in ?? () from /lib64/libz.so.1
#3 0x00007fa73e0c73c7 in deflate () from /lib64/libz.so.1
#4 0x00000000004dca1c in deflate_flush_or_finish (comp_ctx=0x7b6580, out=0x7fa73e5bd010, flag=2) at src/compression.c:808
#5 0x00000000004dcb60 in deflate_flush (comp_ctx=0x7b6580, out=0x7fa73e5bd010) at src/compression.c:835
#6 0x00000000004dbc50 in http_compression_buffer_end (s=0x7c0050, in=0x7c00a8, out=0x78adf0 <tmpbuf.24662>, end=0) at src/compression.c:249
#7 0x000000000048bb5f in http_response_forward_body (s=0x7c0050, res=0x7c00a0, an_bit=1048576) at src/proto_http.c:7173
#8 0x00000000004cce54 in process_stream (t=0x7bffd8) at src/stream.c:1939
#9 0x0000000000427ddf in process_runnable_tasks () at src/task.c:238
#10 0x0000000000419892 in run_poll_loop () at src/haproxy.c:1573
#11 0x000000000041a4a5 in main (argc=4, argv=0x7fffcda38348) at src/haproxy.c:1933
Note that for all reports deflate_flush_or_finish() was always involved.
The crash is very hard to reproduce when using regular traffic because it
requires that the combination of avail_in and next_in are inadequate so
that the memcpy() call reads out of bounds. But this can very likely
happen when the input buffer points to an area reused by another stream
when the flush has been interrupted due to a full output buffer. This
also explains why this report is recent, as dynamic buffer allocation
was introduced in 1.6.
Anyway it's not acceptable to call a function with a randomly set input
buffer. The deflate() function explicitly checks for the case where both
avail_in and next_in are null and doesn't use it in this case during a
flush, so this is the best solution.
Special thanks to Sasha Litvak, James Hartshorn and Paul Bauer for
reporting very useful stack traces which were critical to finding the
root cause of this bug.
This fix must be backported into 1.6 and 1.5, though 1.5 is less likely to
trigger this case given that it keeps its own buffers allocated all along
the session's life.
Instead of repeating the type of the LHS argument (sizeof(struct ...))
in calls to malloc/calloc, we directly use the pointer
name (sizeof(*...)). The following Coccinelle patch was used:
@@
type T;
T *x;
@@
x = malloc(
- sizeof(T)
+ sizeof(*x)
)
@@
type T;
T *x;
@@
x = calloc(1,
- sizeof(T)
+ sizeof(*x)
)
When the LHS is not just a variable name, no change is made. Moreover,
the following patch was used to ensure that "1" is consistently used as
a first argument of calloc, not the last one:
@@
@@
calloc(
+ 1,
...
- ,1
)