5449 Commits

Author SHA1 Message Date
Willy Tarreau
c046d167e4 MEDIUM: log: add support for logging to a ring buffer
Now by prefixing a log server with "ring@<name>" it's possible to send
the logs to a ring buffer. One nice thing is that it allows multiple
sessions to consult the logs in real time in parallel over the CLI, and
without requiring file system access. At the moment, ring0 is created as
a default sink for tracing purposes and is available. No option is
provided to create new rings though this is trivial to add to the global
section.
2019-08-30 15:24:59 +02:00
Willy Tarreau
f3dc30f6de MINOR: log: add a target type instead of hacking the address family
Instead of detecting an AF_UNSPEC address family for a log server and
to deduce a file descriptor, let's create a target type field and
explicitly mention that the socket is of type FD.
2019-08-30 15:07:25 +02:00
Willy Tarreau
d660990cee MINOR: fd: add a new "initialized" bit in the fdtab struct
The purpose is to be able to remember that initialization was already
done for a file descriptor. This will allow to get rid of some dirty
hacks performed in the logs or fd sinks where the init state of the
fd has to be guessed.
2019-08-30 15:07:25 +02:00
Willy Tarreau
76913d3ef4 CLEANUP: fd: remove leftovers of the fdcache
The "cache" entry was still present in the fdtab struct and it was
reported in "show sess". Removing it broke the cache-line alignment
on 64-bit machines which is important for threads, so it was fixed
by adding an attribute(aligned()) when threads are in use. Doing it
only in this case allows 32-bit thread-less platforms to see the
struct fit into 32 bytes.
2019-08-30 15:07:25 +02:00
Willy Tarreau
1d181e489c MEDIUM: ring: implement a wait mode for watchers
Now it is possible for a reader to subscribe and wait for new events
sent to a ring buffer. When new events are written to a ring buffer,
the applets that are subscribed are woken up to display new events.
For now we only support this with the CLI applet called by "show events"
since the I/O handler is indeed a CLI I/O handler. But it's not
complicated to add other mechanisms to consume events and forward them
to external log servers for example. The wait mode is enabled by adding
"-w" after "show events <sink>". An extra "-n" was added to directly
seek to new events only.
2019-08-30 11:58:58 +02:00
Willy Tarreau
300decc8d9 MINOR: cli: extend the CLI context with a list and two offsets
Some CLI parsers are currently abusing the CLI context types such as
pointers to stuff longs into them by lack of room. But the context is
80 bytes while cli is only 48, thus there's some room left. This patch
adds a list element and two size_t usable as various offsets. The list
element is initialized.
2019-08-30 11:58:58 +02:00
Willy Tarreau
370a694879 MINOR: trace: change the detail_level to per-source verbosity
The detail level initially based on syslog levels is not used, while
something related is missing, trace verbosity, to indicate whether or
not we want to call the decoding callback and what level of decoding
we want (raw captures etc). Let's change the field to "verbosity" for
this. A verbosity of zero means that the decoding callback is not
called, and all other levels are handled by this callback and are
source-specific. The source is now prompted to list the levels that
are proposed to the user. When the source doesn't define anything,
"quiet" and "default" are available.
2019-08-29 17:11:25 +02:00
Willy Tarreau
09fb0df6fd MINOR: trace: prepend the function name for developer level traces
Working on adding traces to mux-h2 revealed that the function names are
manually copied a lot in developer traces. The reason is that they are
not preprocessor macros and as such cannot be concatenated. Let's
slightly adjust the trace() function call to take a function name just
after the file:line argument. This argument is only added for the
TRACE_DEVEL and 3 new TRACE_ENTER, TRACE_LEAVE, and TRACE_POINT macros
and left NULL for others. This way the function name is only reported
for traces aimed at the developers. The pretty-print callback was also
extended to benefit from this. This will also significantly shrink the
data segment as the "entering" and "leaving" strings will now be merged.

One technical point worth mentioning is that the function name is *not*
passed as an ist to the inline function because it's not considered as
a builtin constant by the compiler, and would lead to strlen() being
run on it from all call places before calling the inline function. Thus
instead we pass the const char * (that the compiler knows where to find)
and it's the __trace() function that converts it to an ist for internal
consumption and for the pretty-print callback. Doing this avoids losing
5-10% peak performance.
2019-08-29 17:09:13 +02:00
Willy Tarreau
2ea549bc43 MINOR: trace: change the "payload" level to "data" and move it
The "payload" trace level was ambigous because its initial purpose was
to be able to dump received data. But it doesn't make sense to force to
report data transfers just to be able to report state changes. For
example, all snd_buf()/rcv_buf() operations coming from the application
layer should be tagged at this level. So here we move this payload level
above the state transitions and rename it to avoid the ambiguity making
one think it's only about request/response payload. Now it clearly is
about any data transfer and is thus just below the developer level. The
help messages on the CLI and the doc were slightly reworded to help
remove this ambiguity.
2019-08-29 10:46:11 +02:00
Willy Tarreau
be5a288424 MINOR: trace: replace struct trace_lockon_args with struct name_desc
No need for a specific struct anymore, name_desc suits us.
2019-08-29 09:34:53 +02:00
Willy Tarreau
fb4ba91ac1 MINOR: tools: add a generic struct "name_desc" for name-description pairs
In prompts on the CLI we now commonly need to propose a keyword name
and a description and it doesn't make sense to define a new struct for
each such pairs. Let's simply have a generic "name_desc" for this.
2019-08-29 09:34:53 +02:00
Geoff Simmons
7185b789f9 MINOR: connection: add the fc_pp_authority fetch -- authority TLV, from PROXYv2
Save the authority TLV in a PROXYv2 header from the client connection,
if present, and make it available as fc_pp_authority.

The fetch can be used, for example, to set the SNI for a backend TLS
connection.
2019-08-28 17:16:20 +02:00
Willy Tarreau
c326ecc9b1 MINOR: trace: change the TRACE() calling convention to put the args and cb last
Previously the callback was almost mandatory so it made sense to have it
before the message. Now that it can default to the one declared in the
trace source, most TRACE() calls contain series of empty args and callbacks,
which make them suitable for being at the end and being totally omitted.

This patch thus reverses the TRACE arguments so that the message appears
first, then the mask, then arg1..arg4, then the callback. In practice
we'll mostly see 1 arg, or 2 args and nothing else, and it will not be
needed anymore to pass long series of commas in the middle of the
arguments. However if a source is enforced, the empty commas will still
be needed for all omitted arguments.
2019-08-28 10:39:43 +02:00
Willy Tarreau
3da0026d25 MINOR: trace: support a default callback for the source
It becomes apparent that most traces will use a single trace pretty
print callback, so let's allow the trace source to declare a default
one so that it can be omitted from trace calls, and will be used if
no other one is specified.
2019-08-28 07:06:23 +02:00
Willy Tarreau
8f24023ba0 MINOR: sink: now report the number of dropped events on output
The principle is that when emitting a message, if some dropped events
were logged, we first attempt to report this counter before going
further. This is done under an exclusive lock while all logs are
produced under a shared lock. This ensures that the dropped line is
accurately reported and doesn't accidently arrive after a later
event.
2019-08-27 17:14:19 +02:00
Willy Tarreau
4ed23ca0e7 MINOR: sink: add support for ring buffers
This now provides sink_new_buf() which allocates a ring buffer. One such
ring ("buf0") of 1 MB is created already, and may be used by sink_write().
The sink's creation should probably be moved somewhere else later.
2019-08-27 17:14:19 +02:00
Willy Tarreau
072931cdcb MINOR: ring: add a generic CLI io_handler to dump a ring buffer
The three functions (attach, IO handler, and release) are meant to be
called by any CLI command which requires to dump the contents of a ring
buffer. We do not implement anything generic to dump any ring buffer on
the CLI since it's meant to be used by other functionalities above.
However these functions deal with locking and everything so it's trivial
to embed them in other code.
2019-08-27 17:14:19 +02:00
Willy Tarreau
be97853c2f MINOR: ring: add a ring_write() function
This function tries to write to the ring buffer, possibly removing enough
old messages to make room for the new one. It takes two arrays of fragments
on input to ease the insertion of prefixes by the caller. It atomically
writes the message, possibly truncating it if desired, and returns the
operation's status.
2019-08-27 17:14:19 +02:00
Willy Tarreau
172945fbad MINOR: ring: add a new mechanism for retrieving/storing ring data in buffers
Our circular buffers are well suited for being used as ring buffers for
not-so-structured data. The machanism here consists in making room in a
buffer before inserting a new record which is prefixed by its size, and
looking up next record based on the previous one's offset and size. We
can have up to 255 consumers watching for data (dump in progress, tail)
which guarantee that entrees are not recycled while they're being dumped.
The complete representation is described in the header file. For now only
ring_new(), ring_resize() and ring_free() are created.
2019-08-27 17:14:19 +02:00
Willy Tarreau
931d8b79a8 MINOR: fd: add fd_write_frag_line() to send a fragmented line to an fd
Currently both logs and event sinks may use a file descriptor to
atomically emit some output contents. The two may use the same FD though
nothing is done to make sure they use the same lock. Also there is quite
some redundancy between the two. Better make a specific function to send
a fragmented message to a file descriptor which will take care of the
locking via the fd's lock. The function is also able to truncate a
message and to enforce addition of a trailing LF when building the
output message.
2019-08-27 17:14:19 +02:00
Willy Tarreau
b88d231773 MINOR: buffer: add functions to read/write varints from/to buffers
The new functions are :
  __b_put_varint() : inserts a varint when it's known that it fits
  b_put_varint()   : tries to insert a varint at the tail
  b_get_varint()   : tries to get a varint from the head
  b_peek_varint()  : tries to peek a varint at a specific offset

Wrapping is supported so that they are expected to be safe to use to
manipulate varints with buffers anywhere.
2019-08-27 17:14:19 +02:00
Willy Tarreau
4d589e719b MINOR: tools: add a function varint_bytes() to report the size of a varint
It will sometimes be useful to encode varints to know the output size in
advance. Two versions are provided, one inline using a switch/case construct
which will be trivial for use with constants (and will be very fast albeit
huge) and one function iterating on the number which is 5 times smaller,
for use with variables.
2019-08-27 17:14:19 +02:00
Willy Tarreau
e40f274878 BUILD: trace: make the lockon_ptr const to silence a warning without threads
I forgot to fix this one before pushing, despite my tests. lockon_ptr is
only used to compare pointers, it doesn't need to point to a writable
location. Without threads the atomic store is turned into an assignment
and rightfully complains.
2019-08-22 20:26:28 +02:00
Willy Tarreau
c14eea49e6 MINOR: trace: add the possibility to lock on some arguments
Given that we can pass typed arguments to the trace() function, let's
add provisions for tracking them. They are source-specific so we need
to let the source fill their name and description. Only those with a
non-null name will be proposed.
2019-08-22 20:21:00 +02:00
Willy Tarreau
17a51c64b5 MINOR: trace: add a definition of typed arguments to trace()
With a few macros it's possible for a trace source to commit to only
using a certain type for a given argument (or set of). This will be
particularly useful to let the trace subsystem retrieve some precious
information such as a connection, session, listener, source address or
so, and enable/disable filtering and/or locking.
2019-08-22 20:21:00 +02:00
Willy Tarreau
4ab242136d MINOR: trace: add per-level macros to produce traces
The new TRACE_<level>() macros take a mask, 4 args, a callback and a
static message. From this they also inherit the TRACE_SOURCE macro from
the caller, which contains the pointer to the trace source (so that it's
not required to paste it everywhere), and an ist string is also made by
the concatenation of the file name and the line number. This uses string
concatenation by the preprocessor, and turns it into an ist by the compiler
so that there is no operation at all to perform to adjust the data length
as the compiler knows where to cut during the optimization phase. Last,
the message is also automatically turned into an ist so that it's trivial
to put it into an iovec without having to run strlen() on it.

All arguments and the callback may be empty and will then automatically
be replaced with a NULL pointer. This makes the TRACE calls slightly
lighter especially since arguments are not always used. Several other
options were considered to use variadic macros but there's no outstanding
rule that justifies to place an argument before another one, and it still
looks convenient to have the message be the last one to encourage copy-
pasting of the trace statements.

A generic TRACE() macro takes TRACE_LEVEL in from the source file as the
trace level instead of taking it from its name. This may slightly simplify
the production of traces that always run at the same level (internal core
parts may probably only be called at developer level).
2019-08-22 20:21:00 +02:00
Willy Tarreau
bfd14fc6eb MINOR: trace: implement a call to a decode function
The trace() call will support an optional decoding callback and 4
arguments that this function is supposed to know how to use to provide
extra information. The output remains unchanged when the function is
NULL. Otherwise, the message is pre-filled into the thread-local
trace_buf, and the function is called with all arguments so that it
completes the buffer in a readable form depending on the expected
level of detail.
2019-08-22 20:21:00 +02:00
Willy Tarreau
5da408818b MINOR: trace: make trace() now also take a level in argument
This new "level" argument will allow the trace sources to label the
traces for different purposes, and filter out some of them if they
are not relevant to the current target. Right now we have 5 different
levels:
  - USER : the least verbose one, only a few functional information
  - PAYLOAD: like user but also displays some payload-related information
  - PROTO: focuses on the protocol's framing
  - STATE: also indicate state internal transitions or non-transitions
  - DEVELOPER: adds extra info about branches taken in the code (break
    points, return points)
2019-08-22 20:21:00 +02:00
Willy Tarreau
419bd49f0b MINOR: trace: add the file name and line number in the prefix
We now pass an extra argument "where" to the trace() call, which
is supposed to be an ist made of the concatenation of the filename
and the line number. We only keep the last 10 chars from this string
since the end of file names is most often easy to recognize. This
gives developers useful information at very low cost.
2019-08-22 20:21:00 +02:00
Willy Tarreau
4c2ae48375 MINOR: trace: implement a very basic trace() function
For now it remains quite basic. It performs a few state checks, calls
the source's sink if defined, and performs the transitions between
RUNNING, STOPPED and WAITING when the configured events match.
2019-08-22 20:21:00 +02:00
Willy Tarreau
864e880f6c MINOR: trace/cli: register the "trace" CLI keyword to list the sources
For now it lists the sources if one is not provided, and checks
for the source's existence. It lists the events if not provided,
checks for their existence if provided, and adjusts reported
events/start/stop/pause events, and performs state transitions.
It lists sinks and adjusts them as well. Filters, lock, and
level are not implemented yet.
2019-08-22 20:21:00 +02:00
Willy Tarreau
88ebd4050e MINOR: trace: add allocation of buffer-sized trace buffers
This will be needed so that we can implement protocol decoders which will
have to emit their contents into such a buffer.
2019-08-22 20:21:00 +02:00
Willy Tarreau
4151c753fc MINOR: trace: start to create a new trace subsystem
The principle of this subsystem will be to support taking live traces
at various places in the code with conditional triggers, filters, and
ability to lock on some elements. The traces will support typed events
and will be sent into sinks made of ring buffers, file descriptors or
remote servers.
2019-08-22 20:21:00 +02:00
Willy Tarreau
973e662fe8 MINOR: sink: add a support for file descriptors
This is the most basic type of sink. It pre-registers "stdout" and
"stderr", and is able to use writev() on them. The writev() operation
is locked to avoid mixing outputs. It's likely that the registration
should move somewhere else to take into account the fact that stdout
and stderr are still opened or are closed.
2019-08-22 20:21:00 +02:00
Willy Tarreau
67b5a161b4 MINOR: sink: create definitions a minimal code for event sinks
The principle will be to be able to dispatch events to various destinations
called "sinks". This is already done in part in logs where log servers can
be either a UDP socket or a file descriptor. This will be needed with the
new trace subsystem where we may also want to add ring buffers. And it turns
out that all such destinations make sense at all places. Logs may need to be
sent to a TCP server via a ring buffer, or consulted from the CLI. Trace
events may need to be sent to stdout/stderr as well as to remote log servers.

This patch creates a new structure "sink" aiming at addressing these similar
needs. The goal is to merge together what is common to all of them, such as
the output format, the dropped events count, etc, and also keep separately
the target identification (network address, file descriptor). Provisions
were made to have a "waiter" on the sink. For a TCP log server it will be
the task to wake up after writing to the log buffer. For a ring buffer, it
could be the list of watchers on the CLI running a "tail" operation and
waiting for new events. A lock was also placed in the struct since many
operations will require some locking, including the FD ones. The output
formats covers those in use by logs and two extra ones prepending the ISO
time in front of the message (convenient for stdio/buffer).

For now only the generic infrastructure is present, no type-specific
output is implemented. There's the sink_write() function which prepares
and formats a message to be sent, trying hard to avoid copies and only
using pointer manipulation, where the type-specific code just has to be
added. Dropped messages are already counted (for now 100% drop). The
message is put into an iovec array as it will be trivial to use with
file descriptors and sockets.
2019-08-22 20:21:00 +02:00
Willy Tarreau
9eebd8a978 REORG: trace: rename trace.c to calltrace.c and mention it's not thread-safe
The function call tracing code is a quite old and was never ported to
support threads. It's not even sure whether it still works well, but
at least its presence creates confusion for future work so let's rename
it to calltrace.c and add a comment about its lack of thread-safety.
2019-08-22 20:21:00 +02:00
Willy Tarreau
32c24552e4 MINOR: tools: add a DEFNULL() macro to use NULL for empty args
It's sometimes convenient for debugging macros not to be forced to
explicitly pass NULL in an unused argument. This macro does this, it
replaces a missing arg with NULL.
2019-08-22 20:21:00 +02:00
Willy Tarreau
9bead8c7f5 MINOR: list: add LIST_SPLICE() to merge one list into another
This will move the contents of list <old> at the beginning of list
<new>.
2019-08-22 20:21:00 +02:00
Willy Tarreau
60409db0b1 MINOR: lua: export applet and task handlers
The current functions are seen outside from the debugging code and are
convenient to export so that we can improve the thread dump output :

  void hlua_applet_tcp_fct(struct appctx *ctx);
  void hlua_applet_http_fct(struct appctx *ctx);
  struct task *hlua_process_task(struct task *task, void *context, unsigned short state);

Of course they are only available when USE_LUA is defined.
2019-08-21 14:32:09 +02:00
Willy Tarreau
a2c9911ace MINOR: tools: add append_prefixed_str()
This is somewhat related to indent_msg() except that this one places a
known prefix at the beginning of each line, allows to replace the EOL
character, and not to insert a prefix on the first line if not desired.
It works with a normal output buffer/chunk so it doesn't need to allocate
anything nor to modify the input string. It is suitable for use in multi-
line backtraces.
2019-08-21 14:32:09 +02:00
Willy Tarreau
f5cab82025 MINOR: fd: make sure to mark the thread as not stuck in fd_update_events()
When I/O events are being processed, we want to make sure to mark the
thread as not stuck. The reason is that some pollers (like poll()) which
do not limit the number of FDs they report could possibly report a huge
amount of FD all having to perform moderately expensive operations in
the I/O callback (e.g. via mux-pt which forwards to the upper layers),
making the watchdog think the thread is stuck since it does not schedule.
Of course this must never happen but if it ever does we must be liberal
about it.

This should be backported to 2.0, where the situation may happen more
easily due to the FD cache which can start to collect a large amount of
events. It may be related to the report in issue #201 though nothing is
certain about it.
2019-08-16 16:06:14 +02:00
Willy Tarreau
edb91ad647 MINOR: cli: add cli_msg(), cli_err(), cli_dynmsg(), cli_dynerr()
These functions perform all the boring filling of the appctx's
cli struct needed by CLI parsers to return a message or an error,
and they return 1 so that they can be used as a single-line return
statement. They may be used for const messages or dynamic messages.
2019-08-09 10:11:38 +02:00
Willy Tarreau
d50c7feaa1 MINOR: cli: add two new states to print messages on the CLI
Right now we used to have extremely inconsistent states to report output,
one is CLI_ST_PRINT which prints constant message cli->msg with the
assigned severity, and CLI_ST_PRINT_FREE which prints dynamically
allocated cli->err with severity LOG_ERR, and nothing in between,
eventhough it's useful to be able to report dynamically allocated
messages as well as constant error messages.

This patch adds two extra states, which are not particularly well named
given the constraints imposed by existing ones. One is CLI_ST_PRINT_ERR
which prints a constant error message. The other one is CLI_ST_PRINT_DYN
which prints a dynamically allocated message. By doing so we maintain
the compatibility with current code.

It is important to keep in mind that we cannot pre-initialize pointers
and automatically detect what message type it is based on the assigned
fields, because the CLI's context is in a union shared with all other
users, thus unused fields contain anything upon return. This is why we
have no choice but using 4 states. Keeping the two fields <msg> and
<err> remains useful because one is const and not the other one, and
this catches may copy-paste mistakes. It's just that <err> is pretty
confusing here, it should be renamed.
2019-08-09 10:11:38 +02:00
Willy Tarreau
247a8b1d81 CLEANUP: task: move the cpu_time field to the task-only part
The CPU time accounting field called "cpu_time" is used only by tasks
and not tasklets, yet it used to be stored into the TASK_COMMON part,
which doesn't make sense and wastes tasklet memory. In addition, moving
it to tasks also helps better group the various parts in cache lines.
2019-08-08 10:11:05 +02:00
Willy Tarreau
e0d0b4089d CLEANUP: buffer: replace b_drop() with b_free()
Since last commit there's no point anymore in having two variants of the
same function, let's switch to b_free() only. __b_drop() was renamed to
__b_free() for obvious consistency reasons.
2019-08-08 08:07:45 +02:00
Willy Tarreau
3b091f80aa BUG/MINOR: buffers/threads: always clear a buffer's head before releasing it
A small race exists in buffers with "show sess all". This one wants to show
some information grabbed from the buffer (especially in HTX mode). But the
thread owning this buffer might just be releasing its area, right after a
free() or munmap() call, resulting in a head that is not seen as empty yet
though the area was released. It may then be dereferenced by "show sess all"
causing a crash. Note that in practice it only happens in debug mode with
UAF enabled, but it's tricky enough to fix it right now.

This should be backported to stable versions which support threads and a
store barrier. It's worth noting that by performing the clearing first,
b_free() and b_drop() now become two exact equivalent.
2019-08-08 08:07:45 +02:00
Willy Tarreau
229e739c21 BUG/MINOR: pools: don't mark the thread harmless if already isolated
Commit 85b2cae63 ("MINOR: pools: make the thread harmless during the
mmap/munmap syscalls") was used to relax the pressure experienced by
other threads when running in debug mode with UAF enabled. It places
a pair of thread_harmless_now()/thread_harmless_end() around the call
to mmap(), assuming callers are not sensitive to parallel activity.
But there are a few cases like "show sess all" where this happens in
isolated threads, and marking the thread as harmless there is a very
bad idea, even worse when arriving to thread_harmless_end() which loops
forever.

Let's only do that when the thread is not isolated. No backport is
needed as the patch above was only in 2.1-dev.
2019-08-08 07:41:52 +02:00
Frédéric Lécaille
be36793d1d BUG/MEDIUM: stick-table: Wrong stick-table backends parsing.
When parsing references to stick-tables declared as backends, they are added to
a list of proxies (they are proxies!) which refer to this stick-tables.
Before this patch we added them to these list without checking they were already
present, making the silly hypothesis the actions/sample were checked/resolved in the same
order the proxies are parsed.

This patch implement a simple inline function to in_proxies_list() to test
the presence of a proxy in a list of proxies. We use this function when resolving
/checking samples/actions.

This bug was introduced by 015e4d7 commit.

Must be backported to 2.0.
2019-08-07 10:32:31 +02:00
Olivier Houchard
4c18f94c11 BUG/MEDIUM: proxy: Make sure to destroy the stream on upgrade from TCP to H2
In stream_set_backend(), if we have a TCP stream, and we want to upgrade it
to H2 instead of attempting ot reuse the stream, just destroy the
conn_stream, make sure we don't log anything about the stream, and pretend
we failed setting the backend, so that the stream will get destroyed.
New streams will then be created by the mux, as if the connection just
happened.
This fixes a crash when upgrading from TCP to H2, as the H2 mux totally
ignored the conn_stream provided by the upgrade, as reported in github
issue #196.

This should be backported to 2.0.
2019-08-02 18:28:58 +02:00
Emmanuel Hocdet
f580d0f391 BUILD: ssl: BoringSSL add EVP_PKEY_base_id
Remove EVP_PKEY_base_id compatibility, it is now included in BoringSSL.
2019-08-01 11:21:42 +02:00