The hash type stored everywhere is XXH64_hash_t, which annoyingly forces
everyone to include the huge xxhash file. We know it's an uint64_t because
that's its purpose and the type is only made to abstract it on machines
where uint64_t is not availble. Let's switch the type to uint64_t
everywhere and avoid including xxhash from the type file.
This one is expensive in code size because it comes with xxhash.h at a
low level of dependency that's inherited at plenty of places, and for
a function does doesn't benefit from inlining and could possibly even
benefit from not being inline given that it's large and called from the
scheduler.
Moving it to activity.c reduces the LoC by 1.2% and the binary size by
~1kB.
This function has no reason for being inlined, it's called from non
critical places (once in pollers), is quite large and comes with
dependencies (time and freq_ctr). Let's move it to acitvity.c. That's
another 0.4% less LoC to build.
The idle time calculation stuff was moved to task.h by commit 6dfab112e
("REORG: sched: move idle time calculation from time.h to task.h") but
these two variables that are only maintained by task.{c,h} were still
left in time.{c,h}. They have to move as well.
These ones require openssl and are only built when it's enabled. There's
no point keeping them in sample.c when ssl_sample.c already deals with this
and the required includes. This also allows to remove openssl-compat.h
from sample.c and to further reduce the number of inclusions of openssl
includes, and the build time is now down to under 8 seconds.
These two counters were the only ones not in the global struct, while
the SSL freq counters or the req counts are already in it, this forces
stats.c to include ssl_sock just to know about them. Let's move them
over there with their friends. This reduces from 408 to 384 the number
of includes of opensslconf.h.
This one has nothing to do with ssl_sock as it manipulates the struct
server only. Let's move it to server.c and remove unneeded dependencies
on ssl_sock.h. This further reduces by 10% the number of includes of
opensslconf.h and by 0.5% the number of compiled lines.
This one doesn't use anything from an SSL context, it only checks the
type of the transport layer of a connection, thus it belongs to
connection.h. This is particularly visible due to all the ifdefs
around it in various call places.
These functions have no reason for being inlined, and they require some
includes with long dependencies. Let's move them to listener.c and trim
unused includes in listener.h.
The lock-debugging code in thread.h has no reason to be inlined. the
functions are quite fat and perform a lot of operations so there's no
saving keeping them inlined. Worse, most of them are in fact not
inlined, resulting in a significantly bigger executable.
This patch moves all this part from thread.h to thread.c. The functions
are still exported in thread.h of course. This results in ~166kB less
code:
text data bss dec hex filename
3165938 99424 897376 4162738 3f84b2 haproxy-before
2991987 99424 897376 3988787 3cdd33 haproxy-after
In addition the build time with thread debugging enabled has shrunk
from 19.2 to 17.7s thanks to much less code to be parsed in thread.h
that is included virtually everywhere.
pool-os.h relies on a number of includes solely because the
pool_alloc_area() function was inlined, and this only because we want
the normal version to be inlined so that we can track the calling
places for the memory profiler. It's worth noting that it already
does not work at -O0, and that when UAF is enabled we don't care a
dime about profiling.
This patch does two things at once:
- force-inline the functions so that pool_alloc_area() is still
inlined at -O0 to help track malloc() users ;
- uninline the UAF version of these (that rely on mmap/munmap)
and move them to pools.c so that we can remove all unneeded
includes.
Doing so reduces by ~270kB or 0.15% the total build size.
A number of files currently access activity counters but rely on their
definitions to be inherited from other files (task.c, backend.c hlua.c,
sock.c, pool.c, stats.c, fd.c).
backend.c, all muxes, backend.c started manipulating ebmb_nodes with
the introduction of idle conns but the types were inherited through
other includes. Let's add ebmbtree.h there.
The various variable-to-sample converters allow to turn a variable to
a sample of type string, sint or binary, but both the string one used
by strcmp() and the binary one used by secure_memcmp() are missing a
pointer check on the ability to the cast, making them crash if a
variable of type addr is used with strcmp(), or if an addr or bool is
used with secure_memcmp().
Let's rely on the new sample_conv_var2smp() function to run the proper
checks.
This will need to be backported to all supported version. It relies on
previous commits:
CLEANUP: server: always include the storage for SSL settings
CLEANUP: sample: rename sample_conv_var2smp() to *_sint
CLEANUP: sample: uninline sample_conv_var2smp_str()
MINOR: sample: provide a generic var-to-sample conversion function
For backports it's probably easier to check the sample_casts[] pointer
before calling it in sample_conv_strcmp() and sample_conv_secure_memcmp().
We're using variable-to-sample conversion at least 4 times in the code,
two of which are bogus. Let's introduce a generic conversion function
that performs the required checks.
This one only handles integers, contrary to its sibling with the suffix
_str that only handles strings. Let's rename it and uninline it since it
may well be used from outside.
The SSL stuff in struct server takes less than 3% of it and requires
lots of annoying ifdefs in the code just to take care of the cases
where the field is absent. Let's get rid of this and stop including
openssl-compat from server.c to detect NPN and ALPN capabilities.
This reduces the total LoC by another 0.4%.
Migrate the httpclient:get() method to named arguments so we can
specify optional arguments.
This allows to pass headers as an optional argument as an array.
The () in the method call must be replaced by {}:
local res = httpclient:get{url="http://127.0.0.1:9000/?s=99",
headers= {["X-foo"] = { "salt" }, ["X-bar"] = {"pepper" }}}
During httpclient_destroy, add a condition in the BUG_ON which checks
that the client was started before it has ended. A httpclient structure
could have been created without being started.
When using the lua httpclient, haproxy could crash because a b_xfer is
done in httpclient_xfer, which will do a zero-copy swap of the data in
the buffers. The ptr will then be free() by the pool.
However this can't work with a trash buffer, because the area was not
allocated from the pool buffer, so the pool is not suppose to free it
because it does not know this ptr, using -DDEBUG_MEMORY_POOLS will
result with a crash during the free.
Fix the problem by using b_force_xfer() instead of b_xfer which copy
the data instead. The problem still exist with the trash however, and
the trash API must be reworked.
Implement the garbage collector of the lua httpclient.
This patch declares the __gc method of the httpclient object which only
does a httpclient_stop_and_destroy().
httpclient_stop_and_destroy() tries to destroy the httpclient structure
if the client was stopped.
In the case the client wasn't stopped, it ask the client to stop itself
and to destroy the httpclient structure itself during the release of the
applet.
httpclient_destroy() must free all the ist in the httpclient structure,
the URL in the request, the vsn and reason in the response.
It also must free the list of headers of the response.
A bug was introduced by the commit 2d5650082 ("BUG/MEDIUM: http-ana: Reset
channels analysers when returning an error").
The request analyzers must be cleared when a redirect rule is applied. It is
not a problem if the redirect rule is inside an http-request ruleset because
the analyzer takes care to clear it. However, when it comes from a redirect
ruleset (via the "redirect ..." directive), because of the above commit,
the request analyzers are no longer cleared. It means some HTTP request
analyzers may be called while the request channel was already flushed. It is
totally unexpected and may lead to crash.
Thanks to Yves Lafon for reporting the problem.
This patch must be backported everywhere the above commit was backported.
When a filter is attached to a stream, the wrong FLT_END analyzer is added
on the request channel. AN_REQ_FLT_END must be added instead of
AN_RES_FLT_END. Because of this bug, the stream may hang on the filter
release stage.
It seems to be ok for HTTP filters (cache & compression) in HTTP mode. But
when enabled on a TCP proxy, the stream is blocked until the client or the
server timeout expire because data forwarding is blocked. The stream is then
prematurely aborted.
This bug was introduced by commit 26eb5ea35 ("BUG/MINOR: filters: Always set
FLT_END analyser when CF_FLT_ANALYZE flag is set"). The patch must be
backported in all stable versions.
time.h is a horrible place to put activity calculation, it's a
historical mistake because the functions were there. We already have
most of the parts in sched.{c,h} and these ones make an exception in
the middle, forcing time.h to include some thread stuff and to access
the before/after_poll and idle_pct values.
Let's move these 3 functions to task.h with the other ones. They were
prefixed with "sched_" instead of the historical "tv_" which already
made no sense anymore.
I don't know why I inlined this one, this makes no sense given that it's
only used for stats, and it starts a circular dependency on tinfo.h which
can be problematic in the future. In addition, all the stuff related to
idle time calculation should be with the rest of the scheduler, which
currently is in task.{c,h}, so let's move it there.
We'll need to improve the API to pass other arguments in the future, so
let's start to adapt better to the current use cases. task_new() is used:
- 18 times as task_new(tid_bit)
- 18 times as task_new(MAX_THREADS_MASK)
- 2 times with a single bit (in a loop)
- 1 in the debug code that uses a mask
This patch provides 3 new functions to achieve this:
- task_new_here() to create a task on the calling thread
- task_new_anywhere() to create a task to be run anywhere
- task_new_on() to create a task to run on a specific thread
The change is trivial and will allow us to later concentrate the
required adaptations to these 3 functions only. It's still possible
to call task_new() if needed but a comment was added to encourage the
use of the new ones instead. The debug code was not changed and still
uses it.
Work lists were a mechanism introduced in 1.8 to asynchronously delegate
some work to be performed on another thread via a dedicated task.
The only user was the listeners, to deal with the queue. Nowadays
the tasklets have made this much more convenient, and have replaced
work_lists in the listeners. It seems there will be no valid use case
of work lists anymore, so better get rid of them entirely and keep the
scheduler code cleaner.
__task_queue() must absolutely not be called with TICK_ETERNITY or it
will place a never-expiring node upfront in the timers queue, preventing
any timer from expiring until the process is restarted. Code was found
to cause this using "task_schedule(task, now_ms)" which does this one
millisecond every 49.7 days, so let's add a condition against this. It
must never trigger since any process susceptible to trigger it would
already accumulate tasks until it dies.
An extra test was added in wake_expired_tasks() to detect tasks whose
timeout would have been changed after being queued.
An improvement over this could be in the future to use a non-scalar
type (union/struct) for expiration dates so as to avoid the risk of
using them directly like this. But now_ms is already such a valid
time and this specific construct would still not be caught.
This could even be backported to stable versions to help detect other
occurrences if any.
For now, tcp-request and tcp-response content rules evaluation is
interrupted before the inspect-delay when the channel's buffer is full, the
RX path is blocked or when a shutdown for reads was received. To sum up, the
evaluation is interrupted when no more input data are expected. However, it
is not exhaustive. It also happens when end of input is reached (CF_EOI flag
set) or when a read error occurred (CF_READ_ERROR flag set).
Note that, AFAIK, it is only a problem on HAProy 2.3 and prior when a H1 to
H2 upgrade is performed. On newer versions, it works as expected because the
stream is not created at this stage.
This patch must be backported as far as 2.0.