Now it's possible to preserve spacing everywhere except in "log-format",
"log-format-sd" and "unique-id-format" directives, where spaces are
delimiters and are merged. That may be useful when the response payload
is specified as a log format string by "lf-file" or "lf-string", or even
for headers or anything else.
In order to merge spaces, a new option LOG_OPT_MERGE_SPACES is applied
exclusively on options passed to function parse_logformat_string().
This patch fixes an issue #701 ("http-request return log-format file
evaluation altering spacing of ASCII output/art").
We used to have 3 thread-based arrays for toremove_lock, idle_cleanup,
and toremove_connections. The problem is that these items are small,
and that this creates false sharing between threads since it's possible
to pack up to 8-16 of these values into a single cache line. This can
cause real damage where there is contention on the lock.
This patch creates a new array of struct "idle_conns" that is aligned
on a cache line and which contains all three members above. This way
each thread has access to its variables without hindering the other
ones. Just doing this increased the HTTP/1 request rate by 5% on a
16-thread machine.
The definition was moved to connection.{c,h} since it appeared a more
natural evolution of the ongoing changes given that there was already
one of them declared in connection.h previously.
Issue 23653 in oss-fuzz reports a heap overflow bug which is in fact a
bug introduced by commit 9e1758efb ("BUG/MEDIUM: cfgparse: use
parse_line() to expand/unquote/unescape config lines") to address
oss-fuzz issue 22689, which was only partially fixed by commit 70f58997f
("BUG/MINOR: cfgparse: Support configurations without newline at EOF").
Actually on an empty line, end == line so we cannot dereference end-1
to check for a trailing LF without first being sure that end is greater
than line.
No backport is needed, this is 2.2 only.
With the rework of the config line parser, we've started to emit a dump
of the initial line underlined by a caret character indicating the error
location. But with extremely large lines it starts to take time and can
even cause trouble to slow terminals (e.g. over ssh), and this becomes
useless. In addition, control characters could be dumped as-is which is
bad, especially when the input file is accidently wrong (an executable).
This patch adds a string sanitization function which isolates an area
around the error position in order to report only that area if the string
is too large. The limit was set to 80 characters, which will result in
roughly 40 chars around the error being reported only, prefixed and suffixed
with "..." as needed. In addition, non-printable characters in the line are
now replaced with '?' so as not to corrupt the terminal. This way invalid
variable names, unmatched quotes etc will be easier to spot.
A typical output is now:
[ALERT] 176/092336 (23852) : parsing [bad.cfg:8]: forbidden first char in environment variable name at position 811957:
...c$PATH$PATH$d(xlc`%?$PATH$PATH$dgc?T$%$P?AH?$PATH$PATH$d(?$PATH$PATH$dgc?%...
^
The config parser change in commit 9e1758efb ("BUG/MEDIUM: cfgparse: use
parse_line() to expand/unquote/unescape config lines") is wrong when
displaying the last parsed word, because it doesn't verify that the output
string was properly allocated. This may fail in two cases:
- very first line (outline is NULL, as in oss-fuzz issue 23657)
- much longer line than previous ones, requiring a realloc(), in which
case the final 0 is out of the allocated space.
This patch moves the reporting after the allocation check to fix this.
No backport is needed, this is 2.2 only.
parse_line() as added in commit c8d167bcf ("MINOR: tools: add a new
configurable line parse, parse_line()") presents an difficult usage
because it's up to the caller to determine the last written argument
based on what was passed to it. In practice the only way to safely
use it is for the caller to always pass nbarg-1 and make that last
entry point to the last arg + its strlen. This is annoying because
it makes it as painful to use as the infamous strncpy() while it has
all the information the caller needs.
This patch changes its behavior so that it guarantees that at least
one argument will point to the trailing zero at the end of the output
string, as long as there is at least one argument. The caller just
has to pass +1 to the arg count to make sure at least a last one is
empty.
When fgets() returns an incomplete line we must not increment linenum
otherwise line numbers become incorrect. This may happen when parsing
files with extremely long lines which require a realloc().
The bug has been present since unbounded line length was supported, so
the fix should be backported to older branches.
The arguments are relative to the outline, not relative to the input line.
This patch fixes up commit 9e1758efbd which
is 2.2 only. No backport needed.
The returned `arg` value is the number of arguments found, but in case
of the error message it's not a valid argument index.
Because we know how many arguments we allowed (MAX_LINE_ARGS) we know
what to print in the error message, so do just that.
Consider a configuration like this:
listen foo
1 2 3 [...] 64 65
Then running a configuration check within valgrind reports the following:
==18265== Conditional jump or move depends on uninitialised value(s)
==18265== at 0x56E8B83: vfprintf (vfprintf.c:1631)
==18265== by 0x57B1895: __vsnprintf_chk (vsnprintf_chk.c:63)
==18265== by 0x4A8642: vsnprintf (stdio2.h:77)
==18265== by 0x4A8642: memvprintf (tools.c:3647)
==18265== by 0x4CB8A4: print_message (log.c:1085)
==18265== by 0x4CE0AC: ha_alert (log.c:1128)
==18265== by 0x459E41: readcfgfile (cfgparse.c:1978)
==18265== by 0x507CB5: init (haproxy.c:2029)
==18265== by 0x4182A2: main (haproxy.c:3137)
==18265==
==18265== Use of uninitialised value of size 8
==18265== at 0x56E576B: _itoa_word (_itoa.c:179)
==18265== by 0x56E912C: vfprintf (vfprintf.c:1631)
==18265== by 0x57B1895: __vsnprintf_chk (vsnprintf_chk.c:63)
==18265== by 0x4A8642: vsnprintf (stdio2.h:77)
==18265== by 0x4A8642: memvprintf (tools.c:3647)
==18265== by 0x4CB8A4: print_message (log.c:1085)
==18265== by 0x4CE0AC: ha_alert (log.c:1128)
==18265== by 0x459E41: readcfgfile (cfgparse.c:1978)
==18265== by 0x507CB5: init (haproxy.c:2029)
==18265== by 0x4182A2: main (haproxy.c:3137)
==18265==
==18265== Conditional jump or move depends on uninitialised value(s)
==18265== at 0x56E5775: _itoa_word (_itoa.c:179)
==18265== by 0x56E912C: vfprintf (vfprintf.c:1631)
==18265== by 0x57B1895: __vsnprintf_chk (vsnprintf_chk.c:63)
==18265== by 0x4A8642: vsnprintf (stdio2.h:77)
==18265== by 0x4A8642: memvprintf (tools.c:3647)
==18265== by 0x4CB8A4: print_message (log.c:1085)
==18265== by 0x4CE0AC: ha_alert (log.c:1128)
==18265== by 0x459E41: readcfgfile (cfgparse.c:1978)
==18265== by 0x507CB5: init (haproxy.c:2029)
==18265== by 0x4182A2: main (haproxy.c:3137)
==18265==
==18265== Conditional jump or move depends on uninitialised value(s)
==18265== at 0x56E91AF: vfprintf (vfprintf.c:1631)
==18265== by 0x57B1895: __vsnprintf_chk (vsnprintf_chk.c:63)
==18265== by 0x4A8642: vsnprintf (stdio2.h:77)
==18265== by 0x4A8642: memvprintf (tools.c:3647)
==18265== by 0x4CB8A4: print_message (log.c:1085)
==18265== by 0x4CE0AC: ha_alert (log.c:1128)
==18265== by 0x459E41: readcfgfile (cfgparse.c:1978)
==18265== by 0x507CB5: init (haproxy.c:2029)
==18265== by 0x4182A2: main (haproxy.c:3137)
==18265==
==18265== Conditional jump or move depends on uninitialised value(s)
==18265== at 0x56E8C59: vfprintf (vfprintf.c:1631)
==18265== by 0x57B1895: __vsnprintf_chk (vsnprintf_chk.c:63)
==18265== by 0x4A8642: vsnprintf (stdio2.h:77)
==18265== by 0x4A8642: memvprintf (tools.c:3647)
==18265== by 0x4CB8A4: print_message (log.c:1085)
==18265== by 0x4CE0AC: ha_alert (log.c:1128)
==18265== by 0x459E41: readcfgfile (cfgparse.c:1978)
==18265== by 0x507CB5: init (haproxy.c:2029)
==18265== by 0x4182A2: main (haproxy.c:3137)
==18265==
==18265== Conditional jump or move depends on uninitialised value(s)
==18265== at 0x56E941A: vfprintf (vfprintf.c:1631)
==18265== by 0x57B1895: __vsnprintf_chk (vsnprintf_chk.c:63)
==18265== by 0x4A8642: vsnprintf (stdio2.h:77)
==18265== by 0x4A8642: memvprintf (tools.c:3647)
==18265== by 0x4CB8A4: print_message (log.c:1085)
==18265== by 0x4CE0AC: ha_alert (log.c:1128)
==18265== by 0x459E41: readcfgfile (cfgparse.c:1978)
==18265== by 0x507CB5: init (haproxy.c:2029)
==18265== by 0x4182A2: main (haproxy.c:3137)
==18265==
==18265== Conditional jump or move depends on uninitialised value(s)
==18265== at 0x56E8CAB: vfprintf (vfprintf.c:1631)
==18265== by 0x57B1895: __vsnprintf_chk (vsnprintf_chk.c:63)
==18265== by 0x4A8642: vsnprintf (stdio2.h:77)
==18265== by 0x4A8642: memvprintf (tools.c:3647)
==18265== by 0x4CB8A4: print_message (log.c:1085)
==18265== by 0x4CE0AC: ha_alert (log.c:1128)
==18265== by 0x459E41: readcfgfile (cfgparse.c:1978)
==18265== by 0x507CB5: init (haproxy.c:2029)
==18265== by 0x4182A2: main (haproxy.c:3137)
==18265==
==18265== Conditional jump or move depends on uninitialised value(s)
==18265== at 0x56E8CE2: vfprintf (vfprintf.c:1631)
==18265== by 0x57B1895: __vsnprintf_chk (vsnprintf_chk.c:63)
==18265== by 0x4A8642: vsnprintf (stdio2.h:77)
==18265== by 0x4A8642: memvprintf (tools.c:3647)
==18265== by 0x4CB8A4: print_message (log.c:1085)
==18265== by 0x4CE0AC: ha_alert (log.c:1128)
==18265== by 0x459E41: readcfgfile (cfgparse.c:1978)
==18265== by 0x507CB5: init (haproxy.c:2029)
==18265== by 0x4182A2: main (haproxy.c:3137)
==18265==
==18265== Conditional jump or move depends on uninitialised value(s)
==18265== at 0x56EA2DB: vfprintf (vfprintf.c:1632)
==18265== by 0x57B1895: __vsnprintf_chk (vsnprintf_chk.c:63)
==18265== by 0x4A8642: vsnprintf (stdio2.h:77)
==18265== by 0x4A8642: memvprintf (tools.c:3647)
==18265== by 0x4CB8A4: print_message (log.c:1085)
==18265== by 0x4CE0AC: ha_alert (log.c:1128)
==18265== by 0x459E41: readcfgfile (cfgparse.c:1978)
==18265== by 0x507CB5: init (haproxy.c:2029)
==18265== by 0x4182A2: main (haproxy.c:3137)
==18265==
[ALERT] 174/165720 (18265) : parsing [./config.cfg:2]: too many words, truncating at word 65, position -95900735: <(null)>.
[ALERT] 174/165720 (18265) : Error(s) found in configuration file : ./config.cfg
[ALERT] 174/165720 (18265) : Fatal errors found in configuration.
Valgrind reports conditional jumps relying on an undefined value and the
error message clearly shows incorrect stuff.
After this patch is applied the relying on undefined values is gone and
the <(null)> will actually show the argument. However the position value
still is incorrect. This will be fixed in a follow up patch.
This patch fixes up commit 9e1758efbd which
is 2.2 only. No backport needed.
As discussed on the list: https://www.mail-archive.com/haproxy@formilux.org/msg37698.html
This patch adds warnings to the configuration parser that detect the
following situations:
- A line being truncated by a null byte in the middle.
- A file not ending in a new line (and possibly being truncated).
Fix parsing of configurations if the configuration file does not end with
an LF.
This patch fixes GitHub issue #704. It's a regression in
9e1758efbd which is 2.2 specific. No backport
needed.
One issue with the config parser is that while it tries to report as many
errors as possible at once, it's actually unbounded. Thus, when calling
haproxy on a wrong file, it can take ages to process, such as here on
half a gigabyte of map file instead of config file:
$ time ./haproxy -c -f large.map 2>&1 |wc -l
16777220
real 0m31.324s
user 0m22.595s
sys 0m28.909s
This patch modifies readcfgfile() to stop reading the config file after a
reasonable amount of fatal errors. This threshold is set to 50, which seems
more than enough to spot a recurrent issue with a bit of context in a terminal
to address several issues at once, without filling logs nor taking time to
parse the file. The difference is clear now:
$ time ./haproxy -c -f large.map 2>&1 |wc -l
55
real 0m0.005s
user 0m0.004s
sys 0m0.003s
This may be backported to older versions without causing too many
difficulties. However the patch will not apply as-is, it will require
to increment the "fatal" count for each place where ERR_FATAL is set
in the parsing loop.
Issue 22689 in oss-fuzz shows that specially crafted config files can take
a long time to process. This happens when variable expansion, backslash
escaping or unquoting causes calls to memmove() and possibly to realloc()
resulting in O(N^2) complexity with N following the line size.
By using parse_line() we now have a safe parser that remains in O(N)
regardless of the type of operation. Error reporting changed a little bit
since the errors are not reported anymore from the deepest parsing level.
As such we now report the beginning of the error. One benefit is that for
many invalid character sequences, the original line is shown and the first
bad char or sequence is designated with a caret ('^'), which tends to be
visually easier to spot, for example:
[ALERT] 167/170507 (14633) : parsing [mini5.cfg:19]: unmatched brace in environment variable name below:
"${VAR"}
^
or:
[ALERT] 167/170645 (14640) : parsing [mini5.cfg:18]: unmatched quote below:
timeout client 10s'
^
In case the target buffer is too short for the new line, the output buffer
is grown in 1kB chunks and kept till the end, so that it should not happen
too often.
Before this patch a test like below involving a 4 MB long line would take
138s to process, 98% of which were spent in __memmove_avx_unaligned_erms(),
and now it takes only 65 milliseconds:
$ perl -e 'print "\"\$A\""x1000000,"\n"' | ./haproxy -c -f /dev/stdin 2>/dev/null
This may be backported to stable versions after a long period of
observation to be sure nothing broke. It relies on patch "MINOR: tools:
add a new configurable line parse, parse_line()".
This patch fixes all the leftovers from the include cleanup campaign. There
were not that many (~400 entries in ~150 files) but it was definitely worth
doing it as it revealed a few duplicates.
check.c is one of the largest file and contains too many things. The
e-mail alerting code is stored there while nothing is in mailers.c.
Let's move this code out. That's only 4% of the code but a good start.
In order to do so, a few tcp-check functions had to be exported.
There's no point splitting the file in two since only cfgparse uses the
types defined there. A few call places were updated and cleaned up. All
of them were in C files which register keywords.
There is nothing left in common/ now so this directory must not be used
anymore.
This one was not easy because it was embarking many includes with it,
which other files would automatically find. At least global.h, arg.h
and tools.h were identified. 93 total locations were identified, 8
additional includes had to be added.
In the rare files where it was possible to finalize the sorting of
includes by adjusting only one or two extra lines, it was done. But
all files would need to be rechecked and cleaned up now.
It was the last set of files in types/ and proto/ and these directories
must not be reused anymore.
extern struct dict server_name_dict was moved from the type file to the
main file. A handful of inlined functions were moved at the bottom of
the file. Call places were updated to use server-t.h when relevant, or
to simply drop the entry when not needed.
The files remained mostly unchanged since they were OK. However, half of
the users didn't need to include them, and about as many actually needed
to have it and used to find functions like srv_currently_usable() through
a long chain that broke when moving the file.
This one is particularly difficult to split because it provides all the
functions used to manipulate a proxy state and to retrieve names or IDs
for error reporting, and as such, it was included in 73 files (down to
68 after cleanup). It would deserve a small cleanup though the cut points
are not obvious at the moment given the number of structs involved in
the struct proxy itself.
The current state of the logging is a real mess. The main problem is
that almost all files include log.h just in order to have access to
the alert/warning functions like ha_alert() etc, and don't care about
logs. But log.h also deals with real logging as well as log-format and
depends on stream.h and various other things. As such it forces a few
heavy files like stream.h to be loaded early and to hide missing
dependencies depending where it's loaded. Among the missing ones is
syslog.h which was often automatically included resulting in no less
than 3 users missing it.
Among 76 users, only 5 could be removed, and probably 70 don't need the
full set of dependencies.
A good approach would consist in splitting that file in 3 parts:
- one for error output ("errors" ?).
- one for log_format processing
- and one for actual logging.
It was moved without any change, however many callers didn't need it at
all. This was a consequence of the split of proto_http.c into several
parts that resulted in many locations to still reference it.
Just some minor reordering, and the usual cleanup of call places for
those which didn't need it. We don't include the whole tools.h into
stats-t anymore but just tools-t.h.
Initially it looked like this could have been placed into auth.h or
stats.h but it's not the case as it's what makes the link between them
and the HTTP layer. However the file needed to be split in two. Quite
a number of call places were dropped because these were mostly leftovers
from the early days where the stats and cli were packed together.
The files were moved almost as-is, just dropping arg-t and auth-t from
acl-t but keeping arg-t in acl.h. It was useful to revisit the call places
since a handful of files used to continue to include acl.h while they did
not need it at all. Struct stream was only made a forward declaration
since not otherwise needed.
The stktable_types[] array declaration was moved to the main file as
it had nothing to do in the types. A few declarations were reordered
in the types file so that defines were before the structs. Thread-t
was added since there are a few __decl_thread(). The loss of peers.h
revealed that cfgparse-listen needed it.
The cfg_peers external declaration was moved to the main file instead
of the type one. A few types were still missing from the proto, causing
warnings in the functions prototypes (proxy, stick_table).
All includes that were not absolutely necessary were removed because
checks.h happens to very often be part of dependency loops. A warning
was added about this in check-t.h. The fields, enums and structs were
a bit tidied because it's particularly tedious to find anything there.
It would make sense to split this in two or more files (at least
extract tcp-checks).
The file was renamed to the singular because it was one of the rare
exceptions to have an "s" appended to its name compared to the struct
name.
The type file is becoming a mess, half of it is for the proxy protocol,
another good part describes conn_streams and mux ops, it would deserve
being split again. At least it was reordered so that elements are easier
to find, with the PP-stuff left at the end. The MAX_SEND_FD macro was moved
to compat.h as it's said to be the value for Linux.
The TASK_IS_TASKLET() macro was moved to the proto file instead of the
type one. The proto part was a bit reordered to remove a number of ugly
forward declaration of static inline functions. About a tens of C and H
files had their dependency dropped since they were not using anything
from task.h.
global.h was one of the messiest files, it has accumulated tons of
implicit dependencies and declares many globals that make almost all
other file include it. It managed to silence a dependency loop between
server.h and proxy.h by being well placed to pre-define the required
structs, forcing struct proxy and struct server to be forward-declared
in a significant number of files.
It was split in to, one which is the global struct definition and the
few macros and flags, and the rest containing the functions prototypes.
The UNIX_MAX_PATH definition was moved to compat.h.
This one is particularly tricky to move because everyone uses it
and it depends on a lot of other types. For example it cannot include
arg-t.h and must absolutely only rely on forward declarations to avoid
dependency loops between vars -> sample_data -> arg. In order to address
this one, it would be nice to split the sample_data part out of sample.h.
The file mostly contained struct definitions but there was also a
variable export. Most of the stuff currently lies in checks.h and
should definitely move here!