There's no more reason for keepin the code and definitions in conn_stream,
let's move all that to stconn. The alphabetical ordering of include files
was adjusted.
This file contains all the stream-connector functions that are specific
to application layers of type stream. So let's name it accordingly so
that it's easier to figure what's located there.
The alphabetical ordering of include files was preserved.
The following functions which act on a connection-based stream connector
were renamed to sc_conn_* (~60 places):
cs_conn_drain_and_shut
cs_conn_process
cs_conn_read0
cs_conn_ready
cs_conn_recv
cs_conn_send
cs_conn_shut
cs_conn_shutr
cs_conn_shutw
These functions return the app-layer associated with an stconn, which
is a check, a stream or a stream's task. They're used a lot to access
channels, flags and for waking up tasks. Let's just name them
appropriately for the stream connector.
We're starting to propagate the stream connector's new name through the
API. Most call places of these functions that retrieve the channel or its
buffer are in applets. The local variable names are not changed in order
to keep the changes small and reviewable. There were ~92 uses of cs_ic(),
~96 of cs_oc() (due to co_get*() being less factorizable than ci_put*),
and ~5 accesses to the buffer itself.
This applies the change so that the applet code stops using ci_putchk()
and friends everywhere possible, for the much saferapplet_put*() instead.
The change is mechanical but large. Two or three functions used to have no
appctx and a cs derived from the appctx instead, which was a reminiscence
of old times' stream_interface. These were simply changed to directly take
the appctx. No sensitive change was performed, and the old (more complex)
API is still usable when needed (e.g. the channel is already known).
The change touched roughly a hundred of locations, with no less than 124
lines removed.
It's worth noting that the stats applet, the oldest of the series, could
get a serious lifting, as it's still very channel-centric instead of
propagating the appctx along the chain. Given that this code doesn't
change often, there's no emergency to clean it up but it would look
better.
This renames the "struct conn_stream" to "struct stconn" and updates
the descriptions in all comments (and the rare help descriptions) to
"stream connector" or "connector". This touches a lot of files but
the change is minimal. The local variables were not even renamed, so
there's still a lot of "cs" everywhere.
This one is the pointer to the conn_stream which is always in the
endpoint that is always present in the appctx, thus it's not needed.
This patch removes it and replaces it with appctx_cs() instead. A
few occurences that were using __cs_strm(appctx->owner) were moved
directly to appctx_strm() which does the equivalent.
Just like for the TCP service, let's move the context away from
appctx.ctx. A new struct hlua_http_ctx was defined, reserved in
hlua_applet_http_init() and used everywhere else. Similarly, the
task dump code will no more report decoded stack traces in case
these services would be involved. That may be solved later.
The use-service mechanism for Lua in TCP mode relies on the
hlua_tcp storage in appctx->ctx. We can move its definition to
hlua.c and simply use appctx_reserve_svcctx() to reserve and access
the stoage. One tiny side effect is that the task dump used in panics
will not show anymore the Lua call stack in its trace. For this a
better API is needed from the Lua code to expose a function that does
the job from an appctx.
The command only requires to store an int, but it will be useful later
to have a struct to pass extra info such as an "all" flag to dump all
FDs. The new context is now a struct dev_fd_ctx stored in svcctx.
This gets rid of most open-coded fcntl() calls, some of which were passed
through DISGUISE() to avoid a useless test. The FD_CLOEXEC was most often
set without preserving previous flags, which could become a problem once
new flags are created. Now this will not happen anymore.
cs_conn_io_cb(), cs_conn_sync_recv() and cs_conn_sync_send() are moved in
conn_stream.c. Associated functions are moved too (cs_notify, cs_conn_read0,
cs_conn_recv, cs_conn_send and cs_conn_process).
Remaining flags and associated functions are move in the conn-stream
scope. These flags are added on the endpoint and not the conn-stream
itself. This way it will be possible to get them from the mux or the
applet. The functions to get or set these flags are renamed accordingly with
the "cs_" prefix and updated to manipualte a conn-stream instead of a
stream-interface.
The stream-interface state (SI_ST_*) is now in the conn-stream. It is a
mechanical replacement for now. Nothing special. SI_ST_* and SI_SB_* were
renamed accordingly. Utils functions to manipulate these infos were moved
under the conn-stream scope.
But it could be good to keep in mind that this part should be
reworked. Indeed, at the CS level, we only need to know if it is ready to
receive or to send. The state of conn-stream from INI to EST is only used on
the server side. The client CS is immediately set to EST. Thus current
SI_ST_* states should probably be moved to the stream to reflect the server
connection state during the establishment stage.
The expiration date in the stream-interface was only used on the server side
to set the connect, queue or turn-around timeout. It was checked on the
frontend stream-interface, but never used concretely. So it was removed and
replaced by a connect expiration date in the stream itself. Thus, SI_FL_EXP
flag in stream-interfaces is replaced by a stream flag, SF_CONN_EXP.
At many places, we now use the new CS functions to get a stream or a channel
from a conn-stream instead of using the stream-interface API. It is the
first step to reduce the scope of the stream-interfaces. The main change
here is about the applet I/O callback functions. Before the refactoring, the
stream-interface was the appctx owner. Thus, it was heavily used. Now, as
far as possible,the conn-stream is used. Of course, it remains many calls to
the stream-interface API.
The appctx owner is not a stream-interface anymore. It is now a conn-stream.
In the cli I/O handler for the command "debug dev fd", we still handle it as
a stream-interface. It is now fixed.
It is 2.6-specific, no backport is needed.
Many inline functions involve some BUG_ON() calls and because of the
partial complexity of the functions, they're not inlined anymore (e.g.
co_data()). The reason is that the expression instantiates the message,
its size, sometimes a counter, then the atomic OR to taint the process,
and the back trace. That can be a lot for an inline function and most
of it is always the same.
This commit modifies this by delegating the common parts to a dedicated
function "complain()" that takes care of updating the counter if needed,
writing the message and measuring its length, and tainting the process.
This way the caller only has to check a condition, pass a pointer to the
preset message, and the info about the type (bug or warn) for the tainting,
then decide whether to dump or crash. Note that this part could also be
moved to the function but resulted in complain() always being at the top
of the stack, which didn't seem like an improvement.
Thanks to these changes, the BUG_ON() calls do not result in uninlining
functions anymore and the overall code size was reduced by 60 to 120 kB
depending on the build options.
The only reason for warning once is to check if a condition really
happens. Let's use a term that better translates the intent, that's
important when reading the code.
This one will maintain a static counter per call place and will only
emit the warning on the first call. It may be used to invite users to
report an unexpected event without spamming them with messages.
This is the same as BUG_ON() except that it never crashes and only emits
a warning and a backtrace, inviting users to report the problem. This will
be usable for non-fatal issues that should not happen and need to be fixed.
This way the BUG_ON() when using DEBUG_STRICT_NOCRASH is effectively an
equivalent of WARN_ON().
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the debug part.
Because appctx is now an endpoint of the conn-stream, there is no reason to
still have the stream-interface as appctx owner. Thus, the conn-stream is
now the appctx owner.
David Carlier reported a build breakage on Haiku since commit
5be7c198e ("DEBUG: cli: add a new "debug dev fd" expert command")
due to O_ASYNC not being defined. Ilya also reported it broke the
build on Cygwin. It's not that portable and sometimes defined as
O_NONBLOCK for portability. But here we don't even need that, as
we already condition other flags, let's just ignore it if it does
not exist.
This command will scan the whole file descriptors space to look for
existing FDs that are unknown to haproxy's fdtab, and will try to dump
a maximum number of information about them (including type, mode, device,
size, uid/gid, cloexec, O_* flags, socket types and addresses when
relevant). The goal is to help detecting inherited FDs from parent
processes as well as potential leaks.
Some of those listed are actually known but handled so deep into some
systems that they're not in the fdtab (such as epoll FDs or inter-
thread pipes). This might be refined in the future so that these ones
become known and do not appear.
Example of output:
$ socat - /tmp/sock1 <<< "expert-mode on;debug dev fd"
0 type=tty. mod=0620 dev=0x8803 siz=0 uid=1000 gid=5 fs=0x16 ino=0x6 getfd=+0 getfl=O_RDONLY,O_APPEND
1 type=tty. mod=0620 dev=0x8803 siz=0 uid=1000 gid=5 fs=0x16 ino=0x6 getfd=+0 getfl=O_RDONLY,O_APPEND
2 type=tty. mod=0620 dev=0x8803 siz=0 uid=1000 gid=5 fs=0x16 ino=0x6 getfd=+0 getfl=O_RDONLY,O_APPEND
3 type=pipe mod=0600 dev=0 siz=0 uid=1000 gid=100 fs=0xc ino=0x18112348 getfd=+0
4 type=epol mod=0600 dev=0 siz=0 uid=0 gid=0 fs=0xd ino=0x3674 getfd=+0 getfl=O_RDONLY
33 type=pipe mod=0600 dev=0 siz=0 uid=1000 gid=100 fs=0xc ino=0x24af8251 getfd=+0 getfl=O_RDONLY
34 type=epol mod=0600 dev=0 siz=0 uid=0 gid=0 fs=0xd ino=0x3674 getfd=+0 getfl=O_RDONLY
36 type=pipe mod=0600 dev=0 siz=0 uid=1000 gid=100 fs=0xc ino=0x24af8d1b getfd=+0 getfl=O_RDONLY
37 type=epol mod=0600 dev=0 siz=0 uid=0 gid=0 fs=0xd ino=0x3674 getfd=+0 getfl=O_RDONLY
39 type=pipe mod=0600 dev=0 siz=0 uid=1000 gid=100 fs=0xc ino=0x24afa04f getfd=+0 getfl=O_RDONLY
41 type=pipe mod=0600 dev=0 siz=0 uid=1000 gid=100 fs=0xc ino=0x24af8252 getfd=+0 getfl=O_RDONLY
42 type=epol mod=0600 dev=0 siz=0 uid=0 gid=0 fs=0xd ino=0x3674 getfd=+0 getfl=O_RDONLY
Many times core dumps reported by users who experience trouble are
difficult to exploit due to missing system libraries. Sometimes,
having just a list of loaded libraries and their respective addresses
can already provide some hints about some problems.
This patch makes a step in that direction by adding a new "show libs"
command that will try to enumerate the list of object files that are
loaded in memory, relying on the dynamic linker for this. It may also
be used to detect that some foreign code embarks other undesired libs
(e.g. some external Lua modules).
At the moment it's only supported on glibc when USE_DL is set, but it's
implemented in a way that ought to make it reasonably easy to be extended
to other platforms.
Now thread dumps will report the thread group number and the ID within
this group. Note that this is still quite limited because some masks
are calculated based on the thread in argument while they have to be
performed against a group-level thread ID.
The TI_FL_STUCK flag is manipulated by the watchdog and scheduler
and describes the apparent life/death of a thread so it changes
all the time and it makes sense to move it to the thread's context
for an active thread.
The "thread_info" name was initially chosen to store all info about
threads but since we now have a separate per-thread context, there is
no point keeping some of its elements in the thread_info struct.
As such, this patch moves prev_cpu_time, prev_mono_time and idle_pct to
thread_ctx, into the thread context, with the scheduler parts. Instead
of accessing them via "ti->" we now access them via "th_ctx->", which
makes more sense as they're totally dynamic, and will be required for
future evolutions. There's no room problem for now, the structure still
has 84 bytes available at the end.
The scheduler contains a lot of stuff that is thread-local and not
exclusively tied to the scheduler. Other parts (namely thread_info)
contain similar thread-local context that ought to be merged with
it but that is even less related to the scheduler. However moving
more data into this structure isn't possible since task.h is high
level and cannot be included everywhere (e.g. activity) without
causing include loops.
In the end, it appears that the task_per_thread represents most of
the per-thread context defined with generic types and should simply
move to tinfo.h so that everyone can use them.
The struct was renamed to thread_ctx and the variable "sched" was
renamed to "th_ctx". "sched" used to be initialized manually from
run_thread_poll_loop(), now it's initialized by ha_set_tid() just
like ti, tid, tid_bit.
The memset() in init_task() was removed in favor of a bss initialization
of the array, so that other subsystems can put their stuff in this array.
Since the tasklet array has TL_CLASSES elements, the TL_* definitions
was moved there as well, but it's not a problem.
The vast majority of the change in this patch is caused by the
renaming of the structures.
This removes the knowledge of clockid_t from anywhere but clock.c, thus
eliminating a source of includes burden. The unused clock_id field was
removed from thread_info, and the definition setting of clockid_t was
removed from compat.h. The most visible change is that the function
now_cpu_time_thread() now takes the thread number instead of a tinfo
pointer.
There is currently a problem related to time keeping. We're mixing
the functions to perform calculations with the os-dependent code
needed to retrieve and adjust the local time.
This patch extracts from time.{c,h} the parts that are solely dedicated
to time keeping. These are the "now" or "before_poll" variables for
example, as well as the various now_*() functions that make use of
gettimeofday() and clock_gettime() to retrieve the current time.
The "tv_*" functions moved there were also more appropriately renamed
to "clock_*".
Other parts used to compute stolen time are in other files, they will
have to be picked next.
There were 102 CLI commands whose help were zig-zagging all along the dump
making them unreadable. This patch realigns all these messages so that the
command now uses up to 40 characters before the delimiting colon. About a
third of the commands did not correctly list their arguments which were
added after the first version, so they were all updated. Some abuses of
the term "id" were fixed to use a more explanatory term. The
"set ssl ocsp-response" command was not listed because it lacked a help
message, this was fixed as well. The deprecated enable/disable commands
for agent/health/server were prominently written as deprecated. Whenever
possible, clearer explanations were provided.
This command attempts to resolve a pointer to a symbol name. This is
convenient during development as it's easier to get such pointers live
than by issuing a debugger or calling addr2line.
This patch replaces roughly all occurrences of an HA_ATOMIC_ADD(&foo, 1)
or HA_ATOMIC_SUB(&foo, 1) with the equivalent HA_ATOMIC_INC(&foo) and
HA_ATOMIC_DEC(&foo) respectively. These are 507 changes over 45 files.
The commit reverts following commits:
* 83926a04 BUG/MEDIUM: debug/lua: Don't dump the lua stack if not dumpable
* a61789a1 MEDIUM: lua: Use a per-thread counter to track some non-reentrant parts of lua
Instead of relying on a Lua function to print the lua traceback into the
debugger, we are now using our own internal function (hlua_traceback()).
This one does not allocate memory and use a chunk instead. This avoids any
issue with a possible deadlock in the memory allocator because the thread
processing was interrupted during a memory allocation.
This patch relies on the commit "BUG/MEDIUM: debug/lua: Use internal hlua
function to dump the lua traceback". Both must be backported wherever the
patches above are backported, thus as far as 2.0
When we try to dump the stack of a lua context, if it is not dumpable,
nothing is performed and a message is emitted instead. This happens when a
lua execution was interrupted inside a non-reentrant part.
This patch depends on following commit :
* MEDIUM: lua: Use a per-thread counter to track some non-reentrant parts of lua
Thanks to this patch, we avoid a possible deadllock if the lua is
interrupted by the watchdog in the lua memory allocator, because realloc()
is not async-signal-safe.
Both patches must be backported as far as 2.0.
It's been too short for quite a while now and is now full. It's still
time to extend it to 32-bits since we have room for this without
wasting any space, so we now gained 16 new bits for future flags.
The values were not reassigned just in case there would be a few
hidden u16 or short somewhere in which these flags are placed (as
it used to be the case with stream->pending_events).
The patch is tagged MEDIUM because this required to update the task's
process() prototype to use an int instead of a short, that's quite a
bunch of places.
We frequently need to access a simple and fast PRNG for statistical
purposes. The debug_prng() function did exactly this using a xorshift
generator but its use was limited to debug only. Let's move this to
tools.h and tools.c to make it accessible everywhere. Since it needs to
be fast, its state is thread-local. An initialization function starts a
different initial value for each thread for better distribution.
This one is systematically misunderstood due to its unclear name. It
is in fact the number of tasks in the local tasklet list. Let's call
it "tasks_in_list" to remove some of the confusion.
This counter is solely used for reporting in the stats and is the hottest
thread contention point to date. Moving it to the scheduler and having a
separate one for the global run queue dramatically improves the performance,
showing a 12% boost on the request rate on 16 threads!
In addition, the thread debugging output which used to rely on rqueue_size
was not totally accurate as it would only report task counts. Now we can
return the exact thread's run queue length.
It is also interesting to note that there are still a few other task/tasklet
counters in the scheduler that are not efficiently updated because some cover
a single area and others cover multiple areas. It looks like having a distinct
counter for each of the following entries would help and would keep the code
a bit cleaner:
- global run queue (tree)
- per-thread run queue (tree)
- per-thread shared tasklets list
- per-thread local lists
Maybe even splitting the shared tasklets lists between pure tasklets and
tasks instead of having the whole and tasks would simplify the code because
there remain a number of places where several counters have to be updated.