last_change was a member present in both proxy and server struct. It is
used as an age statistics to report the last update of the object.
Move last_change into fe_counters/be_counters. This is necessary to be
able to manipulate it through generic stat column and report it into
stats-file.
Note that there is a change for proxy structure with now 2 different
last_change values, on frontend and backend side. Special care was taken
to ensure that the value is initialized only on the proxy side. The
other value is set to 0 unless a listen proxy is instantiated. For the
moment, only backend counter is reported in stats. However, with now two
distinct values, stats could be extended to report it on both side.
Implement support for FN_RATE stat column into stat-file.
For the output part, only minimal change is required. Reuse the function
read_freq_ctr() to print the same value in both stats output and
stats-file dump.
For counter preloading, define a new utility function
preload_freq_ctr(). This can be used to initialize a freq-ctr type by
preloading previous period value. Reuse this function in load_ctr()
during stats-file parsing.
At the moment, no rate column is defined as generic. Thus, this commit
does not have functional change. This will be changed as soon as FN_RATE
are converted to generic columns.
Move freq-ctr defined in proxy or server structures into their dedicated
fe_counters/be_counters struct.
Functionnaly no change here. This commit will allow to convert rate
stats column to generic one, which is mandatory to manipulate them in
the stats-file.
Currently, only FN_COUNTER are dumped and preloaded via a stats-file.
Thus in several places we relied on the assumption that only FN_COUNTER
are valid in stats-file context.
New stats types will soon be implemented as they are also eligilible to
statistics reloading on process startup. Thus, prepare stats-file
functions to remove any FN_COUNTER restriction.
As one of this change, generate_stat_tree() now uses stcol_is_generic()
for stats name tree indexing before stats-file parsing.
Also related to stats-file parsing, individual counter preloading step
as been extracted from line parsing in a dedicated new function
load_ctr(). This will allow to extend it to support multiple mechanism
of counter preloading depending on the stats type.
If 'namespace' keyword is used in the backend server settings or/and in the
bind string, it means that haproxy process will call setns() to change its
default namespace to the configured one and then, it will create a
socket in this new namespace. setns() syscall requires CAP_SYS_ADMIN
capability in the process Effective set (see man 2 setns). Otherwise, the
process must be run as root.
To avoid to run haproxy as root, let's add cap_sys_admin capability in the
same way as we already added the support for some other network capabilities.
As CAP_SYS_ADMIN belongs to CAP_SYS_* capabilities type, let's add a separate
flag LSTCHK_SYSADM for it. This flag is set, if the 'namespace' keyword was
found during configuration parsing. The flag may be unset only in
prepare_caps_for_setuid() or in prepare_caps_from_permitted_set(), which
inspect process EUID/RUID and Effective and Permitted capabilities sets.
If system doesn't support Linux capabilities or 'cap_sys_admin' was not set
in 'setcap', but 'namespace' keyword is presented in the configuration, we
keep the previous strict behaviour. Process, that has changed uid to the
non-priviledged user, will terminate with alert. This alert invites the user
to recheck its configuration.
In the case, when haproxy will start and run under a non-root user and
'cap_sys_admin' is not set, but 'namespace' keyword is presented, this patch
does not change previous behaviour as well. We'll still let the user to try
its configuration, but we inform via warning, that unexpected things, like
socket creation errors, may occur.
quic_connect_server(), tcp_connect_server(), uxst_connect_server() duplicate
same code to check different ERRNOs, that socket() and setns() may return.
They also duplicate some runtime condition checks, applied to the obtained
server socket fd.
So, in order to remove these duplications and to improve code readability,
let's encapsulate socket() and setns() ERRNOs handling in
sock_handle_system_err(). It must be called just before fd's runtime condition
checks, which we also move in sock_create_server_socket by the same reason.
SO_MARK, SO_USER_COOKIE, SO_RTABLE socket options (used to set the special
mark/ID on socket, in order to perform mark-based routing) are only supported
by AF_INET sockets. So, let's check socket address family, when we enter into
this function.
In 98b44e8 ("BUG/MINOR: log: fix global lf_expr node options behavior"),
I properly restored global node options behavior for when encoding is
not used, however the fix is not optimal when encoding is involved:
Indeed, encoding logic in sess_build_logline() relies on global node
options to know if encoding must be handled expression-wide or
individually. However, because of the above fix, if an expression is
made of 1 or multiple nodes that all set an encoding option manually
(without '%o'), we consider that the option was set globally, but
that's probably not what the user intended. Instead we should only
evaluate global options from '%o', so that it remains possible to
skip global encoding when needed.
No backport needed.
LF_NODE_WITH_OPT(node) returns true if the node's option may be set and
thus should be considered. Logic is based on logformat node's type:
for now only TAG and FMT nodes can be configured.
Rename e_byte_fct to e_fct_byte and e_fct_byte_ctx to e_fct_ctx, and
adjust some comments to make it clear that e_fct_ctx is here to provide
additional user-ctx to the custom cbor encode function pointers.
For now, only e_fct_byte function may be provided, but we could imagine
having e_fct_int{16,32,64}() one day to speed up the encoding when we
know we can encode multiple bytes at a time, but for now it's not worth
the hassle.
The new LIST_ATMOST1() test verifies that the designated element is either
alone or points on both sides to the same element. This is used to detect
that a list has at most a single element, or that an element about to be
deleted was the last one of a list.
In this patch, we make use of the CBOR (RFC8949) encode helper functions
from the previous commit to implement '+cbor' encoding option for log-
formats. The logic behind it is pretty similar to '+json' encoding option,
except that the produced output is a CBOR payload written in HEX format so
that it remains compatible to use this with regular syslog endpoints.
Example:
log-format "%{+cbor}o %[int(4)] test %(named_field)[str(ok)]"
Will produce:
BF6B6E616D65645F6669656C64626F6BFF
Detailed view (from cbor.me):
BF # map(*)
6B # text(11)
6E616D65645F6669656C64 # "named_field"
62 # text(2)
6F6B # "ok"
FF # primitive(*)
If the option isn't set globally, but on a specific node instead, then
only the value will be encoded according to CBOR specification.
Example:
log-format "test cbor bool: %{+cbor}[bool(true)]"
Will produce:
test cbor bool: F5
Add cbor helpers to encode strings (bytes/text) and integers according to
RFC8949, also add cbor_encode_ctx struct to pass encoding options such as
how to encode a single byte.
In this patch, we add the "+json" log format option that can be set
globally or per log format node.
What it does, it that it sets the LOG_OPT_ENCODE_JSON flag for the
current context which is provided to all lf_* log building function.
This way, all lf_* are now aware of this option and try to comply with
JSON specification when the option is set.
If the option is set globally, then sess_build_logline() will produce a
map-like object with key=val pairs for named logformat nodes.
(logformat nodes that don't have a name are simply ignored).
Example:
log-format "%{+json}o %[int(4)] test %(named_field)[str(ok)]"
Will produce:
{"named_field": "ok"}
If the option isn't set globally, but on a specific node instead, then
only the value will be encoded according to JSON specification.
Example:
log-format "{ \"manual_key\": %(named_field){+json}[bool(true)] }"
Will produce:
{"manual_key": true}
When the option is set, +E option will be ignored, and partial numerical
values (ie: because of logasap) will be encoded as-is.
Support '+bin' option argument on logformat nodes to try to preserve
binary output type with binary sample expressions.
For this, we rely on the log/sink API which is capable of conveying binary
data since all related functions don't search for a terminating NULL byte
in provided log payload as they take a string pointer and a string length
as argument.
Example:
log-format "%{+bin}o %[bin(00AABB)]"
Will produce:
00aabb
(output was piped to `hexdump -ve '1/1 "%.2x"'` to dump raw bytes as HEX
characters)
This should be used carefully, because many syslog endpoints don't expect
binary data (especially NULL bytes). This is mainly intended for use with
set-var-fmt actions or with ring/udp log endpoints that know how to deal
with such binary payloads.
Also, this option is only supported globally (for use with '%o'), it will
not have any effect when set on an individual node. (it makes no sense to
have binary data in the middle of log payload that was started without
binary data option)
There is no need to expose such functions since they are only involved in
the log building process that occurs inside sess_build_logline().
Making functions static and removing their public prototype to ease code
maintenance.
This patch implements parsing of headers line from stats-file.
A header line is defined as starting with '#' character. It is directly
followed by a domain name. For the moment, either 'fe' or 'be' is
allowed. The following lines will contain counters values relatives to
the domain context until the next header line.
This is implemented via static function parse_header_line(). It first
sets the domain context used during apply_stats_file(). A stats column
array is generated to contains the order on which column are stored.
This will be reused to parse following lines values.
If an invalid line is found and no header was parsed, considered the
stats-file as ill formatted and stop parsing. This allows to immediately
interrupt parsing if a garbage file was used without emitting a ton of
warnings to the user.
This commit is the first one of a serie to implement preloading of
haproxy counters via stats-file parsing.
This patch defines a basic apply_stats_file() function. It implements
reading line by line of a stats-file without any parsing for the moment.
It is called automatically on process startup via init().
Extract GUID format validation in a dedicated function named
guid_is_valid_fmt(). For the moment, it is only used on guid_insert().
This will be reused when parsing stats-file, to ensure GUID has a valid
format before tree lookup.
Define a new CLI command "dump stats-file" with its handler
cli_parse_dump_stat_file(). It will loop twice on proxies_list to dump
first frontend and then backend side. It reuses the common function
stats_dump_stat_to_buffer(), using STAT_F_BOUND to restrict on the
correct side.
A new module stats-file.c is added to regroup function specifics to
stats-file. It defines two main functions :
* stats_dump_file_header() to generate the list of column list prefixed
by the line context, either "#fe" or "#be"
* stats_dump_fields_file() to generate each stat lines. Object without
GUID are skipped. Each stat entry is separated by a comma.
For the moment, stats-file does not support statistics modules. As such,
stats_dump_*_line() functions are updated to prevent looping over stats
module on stats-file output.
Prepare stats function to handle a new format labelled "stats-file". Its
purpose is to generate a statistics dump with a format closed from the
CSV output. Such output will be then used to preload haproxy internal
counters on process startup.
stats-file output differs from a standard CSV on several points. First,
only an excerpt of all statistics is outputted. All values that does not
make sense to preload are excluded. For the moment, stats-file only list
stats fully defined via "struct stat_col" method. Contrary to a CSV, sll
columns of a stats-file will be filled. As such, empty field value is
used to mark stats which should not be outputted.
Some adaptation specifics to stats-file are necessary into
me_generate_field(). First, stats-file will output separatedly values
from frontend and backend sides with their own respective set of
columns. As such, an empty field value is returned if stat is not
defined for either frontend/listener, or backend/server when outputting
the other side. Also, as stats-file does not support empty column,
stcol_hide() is not used for it.
A minor adjustement was necessary for stats_fill_fe_line() to pass
context flags. This is necessary to detect stat output format. All other
listener/server/backend corresponding functions already have it.
Convert most of proxy counters statistics to new "struct stat_col"
definition. Remove their corresponding switch..case entries in
stats_fill_*_line() functions. Their value are automatically calculate
via me_generate_field() invocation.
Along with this, also complete stcol_hide() when some stats should be
hidden.
Only a few counters where not converted. This is because they rely on
values stored outside of fe/be_counters structure, which
me_generate_field() cannot use for now.
This commit is a direct follow-up of the previous one which define a new
type "struct stat_col" to fully define a statistic entry.
Define a new function metric_generate(). For metrics statistics, it is
able to automatically calculate a stat value field for "offsets" from
"struct stat_col". Use it in stats_fill_*_stats() functions. Maintain a
fallback to previously used switch-case for old-style statistics.
This commit does not introduce functional change as currently no
statistic is defined as "struct stat_col". This will be the subject of a
future commit.
Previously, statistics were simply defined as a list of name_desc, as
for example "stat_cols_px" for proxy stats. No notion of type was fixed
for each stat definition. This correspondance was done individually
inside stats_fill_*_line() functions. This renders the process to
define new statistics tedious.
Implement a more expressive stat definition method via a new API. A new
type "struct stat_col" for stat column to replace name_desc usage is
defined. It contains a field to store the stat nature and format. A
<cap> field is also defined to be able to define a proxy stat only for
certain type of objects.
This new type is also further extended to include counter offsets. This
allows to define a method to automatically generate a stat value field
from a "struct stat_col". This will be the subject of a future commit.
New type "struct stat_col" is fully compatible full name_desc. This
allows to gradually convert stats definition. The focus will be first
for proxies counters to implement statistics preservation on reload.
The name "metrics" was chosen to represent the various list of haproxy
exposed statistics. However, it is deemed as ambiguous as some stats are
indeed metric in the true sense, but some are not, as highlighted by
various "enum field_origin" values.
Replace it by the new name "stat_cols" for statistic columns. Along with
the already existing notion of stat lines it should better reflect its
purpose.
When a process is reloaded, the old process must performed a synchronisation
with the new process. To do so, the sync task notify the local peer to
proceed and waits. Internally, the sync task used PEERS_F_DONOTSTOP flag to
know it should wait. However, this flag was only set/unset in a single
function. There is no real reason to set a flag to do so. A static variable
set to 1 when the resync starts and to 0 when it is finished is enough.
Some flags were used to define the learn state of a peer. It was a bit
confusing, especially because the learn state of a peer is manipulated from
the peer applet but also from the sync task. It is harder to understand the
transitions if it is based on flags than if it is based a dedicated state
based on an enum. It is the purpose of this patch.
Now, we can define the following rules regarding this learn state:
* A peer is assigned to learn by the sync task
* The learn state is then changed by the peer itself to notify the
learning is in progress and when it is finished.
* Finally, when the peer finished to learn, the sync task must acknowledge
it by unassigning the peer.
This patch is a cleanup of the recent change about the relation between a
peer and the applet used to deal with I/O. Three flags was introduced to
reflect the peer applet state as seen from outside (from the sync task in
fact). Using flags instead of true states was in fact a bad idea. This work
but it is confusing. Especially because it was mixed with LEARN and TEACH
peer flags.
So, now, to make it clearer, we are now using a dedicated state for this
purpose. From the outside, the peer may be in one of the following state
with respects of its applet:
* the peer has no applet, it is stopped (PEER_APP_ST_STOPPED).
* the peer applet was created with a validated connection from the protocol
perspective. But the sync task must synchronized it with the peers
section. It is in starting state (PEER_APP_ST_STARTING).
* The starting starting was acknowledged by the sync task, the peer applet
can start to process messages. It is in running state
(PEER_APP_ST_RUNNING).
* The last peer applet was released and the associated connection
closed. But the sync task must synchronized it with the peers section. It
is in stopping state (PEER_APP_ST_STOPPING).
Functionnaly speaking, there is no true change here. But it should be easier
to understand now.
In addition to these changes, __process_peer_state() function was renamed
sync_peer_app_state().
appctx_is_back() function may be used to know if an applet was create on
frontend side or on backend side. It may be handy for some applets that may
exist on both sides, like peer applets.
These new functions is_char4_outside() and is_char8_outside() are meant
to be used to verify if any of the 4 or 8 chars represented respectively
by a uint32_t or a uint64_t is outside of the min,max byte range passed
in argument. This is the simplified, fast version of the function so it
is restricted to less than 0x80 distance between min and max (sufficient
to validate chars). Extra functions are also provided to check for min
or max alone as well, with the same restriction.
The use case typically is to check that the output of read_u32() or
read_u64() contains exclusively certain bytes.
From Linux 5.17, anonymous regions can be name via prctl/PR_SET_VMA
so caches can be identified when looking at HAProxy process memory
mapping.
The most possible error is lack of kernel support, as a result
we ignore it, if the naming fails the mapping of memory context
ought to still occur.
Since 3.0-dev7 with commit 1a088da7c2 ("MAJOR: stktable: split the keys
across multiple shards to reduce contention"), building without threads
yields a warning about the shard not being used. This is because the
locks API does nothing of its arguments, which is the only place where
the shard is being used. We cannot modify the lock API to pretend to
consume its argument because quite often it's not even instantiated.
Let's just pretend we consume shard using an explict ALREADY_CHECKED()
statement instead. While we're at it, let's make sure that XXH32() is
not called when there is a single bucket!
No backport is needed.
Some flags are defined during statistics generation and output. They use
the prefix STAT_* which is also used for other purposes. Rename them
with the new prefix STAT_F_* to differentiate them from the other
usages.
Several unique names were used for different purposes under statistics
implementation. This caused the code to be difficult to understand.
* stat/stats name is removed when a more specific name could be used
* restrict field usage to purely refer to <struct field> which
represents a raw stat value.
* use "line" naming to represent an array of <struct field>
Info are used to expose haproxy global metrics. It is similar to proxy
statistics and any other module. As such, rename info indexes using
SI_I_INF_* prefix. Also info variable is renamed stat_line_info.
Thanks to this, naming is now consistent between info and other
statistics. It will help to integrate it as a "global" statistics
module.
Statistics were extended with the introduction of stats module. This
mechanism allows to expose various metrics for several haproxy
components. As a consequence of this, some static variables were
transformed to dynamic ones to be able to regroup all statistics
definition.
Rename these variables with more explicit naming :
* stat_lines can be used to generate one line of statistics for any
module using struct field as value
* metrics and metrics_len are used to stored description of metrics
indexed by module
Note that info is not integrated in the statistics module mechanism.
However, it could be done in the future to better reflect its purpose.
This commit is the first one of a serie which adjust naming convention
for stats module. The objective is to remove ambiguity and better
reflect how stats are implemented, especially since the introduction of
stats module.
This patch renames elements related to proxies statistics. One of the
main change is to rename ST_F_* statistics indexes prefix with the new
name ST_I_PX_*. This remove the reference to field which represents
another concept in the stats module. In the same vein, global
stat_fields variable is renamed metrics_px.
This commit is part of a serie to align counters usage between
frontends/listeners on one side and backends/servers on the other.
"stot" metric refers to the total number of sessions. On backend side,
it is interpreted as a number of streams. Previously, this was accounted
using <cum_sess> be_counters field for servers, but <cum_conn> instead
for backend proxies.
Adjust this by using <cum_sess> for both proxies and servers. As such,
<cum_conn> field can be removed from be_counters.
Note that several diagnostic messages which reports total frontend and
backend connections were adjusted to use <cum_sess>. However, this is an
outdated and misleading information as it does reports streams count on
backend side. These messages should be fixed in a separate commit.
This should be backported to all stable releases.
This commit is the first one of a series which aims to align counters
usage between frontends/listeners on one side and backends/servers on
the other.
Remove <down_trans> field from proxy structure. Use instead the same
name field from be_counters structure, which is already used for
servers.
mux-ops .shutr and .shutw callback functions are merged into a unique
functions, called .shut. The shutdown mode is still passed as argument,
muxes are responsible to test it. Concretly, .shut() function of each mux is
now the content of the old .shutw() followed by the content of the old
.shutr().
se_shutdown() function is now used to perform a shutdown on a connection
endpoint and an applet endpoint. The same function is used for
both. sc_conn_shut() function was removed and appctx_shut() function was
updated to only deal with the applet stuff.
It is the same than the previous patch but for applets. Here there is
already only one function. But with this patch, appctx_shut() function was
modified to explicitly get shutdown mode as parameter. In addition
appctx_shutw() was removed.
The SC API to perform shutdowns on connection endpoints was unified to have
only one function, sc_conn_shut(), with read/write shut modes passed
explicitly. It means sc_conn_shutr() and sc_conn_shutw() were removed. The
next step is to do the same at the mux level.
CO_SHR_* and CO_SHW_* modes are in fact used by the stream-connectors to
instruct the muxes how streams must be shut done. It is then the mux
responsibility to decide if it must be propagated to the connection layer or
not. And in this case, the modes above are only tested to pass a boolean
(clean or not).
So, it is not consistant to still use connection related modes for
information set at an upper layer and never used by the connection layer
itself.
These modes are thus moved at the sedesc level and merged into a single
enum. Idea is to add more modes, not necessarily mutually exclusive, to pass
more info to the muxes. For now, it is a one-for-one renaming.
Since the begining, this function returns a pointer on an appctx while it
should be a void pointer. It is the caller responsibility to cast it to the
right type, the corresponding mux stream in this case.
However, it is not a big deal because this function is unused for now. Only
the unsafe one is used.
This patch must be backported as far as 2.6.
When the stat code was reorganized, and the prototype to
stats_dump_html_end() was moved to its own header, it missed the function
arguments. Fix that.
This should fix issue 2540.
Extract functions related to HTML stats webpage from stats.c into a new
module named stats-html. This allows to reduce stats.c to roughly half
of its original size.
A static variable trash_chunk was used as implicit buffer in most of
stats output function. It was a oneline buffer uses as temporary storage
before emitting to the final applet or CLI buffer.
Replaces it by a buffer defined in show_stat_ctx structure. This allows
to retrieve it in most of stats output function. An additional parameter
was added for the function where context was not already used. This
renders the code cleaner and will allow to split stats.c in several
source files.
As a result of a new member into show_stat_ctx, per-command context max
size has increased. This forces to increase APPLET_MAX_SVCCTX to ensure
pool size is big enough. Increase it to 128 bytes which includes some
extra room for the future.
Expected arguments were not specified in the
prepare_caps_from_permitted_set() function declaration. It is an issue for
some compilers, for instance clang. But at the end, it is unexpected and
deprecated.
No backport needed, except if f0b6436f57 ("MEDIUM: capabilities: check
process capabilities sets") is backported.
applet_putblk and co were added to simplify applets. In 2.8, a fix was
pushed to deal with all errors as a room error because the vast majority of
applets didn't expect other kind of errors. The API was changed with the
commit 389b7d1f7b ("BUG/MEDIUM: applet: Fix API for function to push new
data in channels buffer").
Unfortunately and for unknown reason, the fix was totally failed. Checks on
channel functions were just wrong and not consistent. applet_putblk()
function is especially affected because the error is returned but no flag
are set on the SC to request more room. Because of this bug, applets relying
on it may be blocked, waiting for more room, and never woken up.
It is an issue for the peer and spoe applets.
This patch must be backported as far as 2.8.
The crt-store load line parser relies on offsets of member of the
ckch_conf struct. However the new "alias" keyword as an offset to
-1, because it does not need to be used. Plan was to handle it that way
in the parser, but it wasn't supported yet. So -1 was still used in an
offset computation which was not used, but ASAN could see the problem.
This patch fixes the issue by using a signed type for the offset value,
so any negative value would be skipped. It also introduced a
PARSE_TYPE_NONE for the parser.
No backport needed.
Testing an undefined macro emits warnings due to -Wundef, and we have
exactly one such case in xxhash:
include/import/xxhash.h:3390:42: warning: "__cplusplus" is not defined [-Wundef]
#if ((defined(sun) || defined(__sun)) && __cplusplus) /* Solaris includes __STDC_VERSION__ with C++. Tested with GCC 5.5 */
Let's just prepend "defined(__cplusplus) &&" before __cplusplus to
resolve the problem. Upstream is still affected apparently.
There were several places in grpc and its dependency protobuf where unaligned
accesses were done. Read accesses to 32 (resp. 64) bits values should be performed
by read_u32() (resp. read_u64()).
Replace these unligned read accesses by correct calls to these functions.
Same fixes for doubles and floats.
Such unaligned read accesses could lead to crashes with bus errors on CPU
archictectures which do not fix them at run time.
This patch depends on this previous commit:
861199fa71 MINOR: net_helper: Add support for floats/doubles.
Must be backported as far as 2.6.
The global 'key-base' keyword allows to read the 'key' parameter of a
crt-store load line using a path prefix.
This is the equivalent of the 'crt-base' keyword but for 'key'.
It only applies on crt-store.
Add crt-base support for "crt-store". It will be used by 'crt', 'ocsp',
'issuer', 'sctl' load line parameter.
In order to keep compatibility with previous configurations and scripts
for the CLI, a crt-store load line will save its ckch_store using the
absolute crt path with the crt-base as the ckch tree key. This way, a
`show ssl cert` on the CLI will always have the completed path.
There's currently an abiguity around ring_size(), it's said to return
the allocated size but returns the usable size. We can't change it as
it's used everywhere in the code like this. Let's fix the comment and
add ring_allocated_size() instead for anything related to allocation.
When the integrity check fails, it's useful to get a dump of the area
around the first faulty byte. That's what this patch does. For example
it now shows this before reporting info about the tag itself:
Contents around first corrupted address relative to pool item:.
Contents around address 0xe4febc0792c0+40=0xe4febc0792e8:
0xe4febc0792c8 [80 75 56 d8 fe e4 00 00] [.uV.....]
0xe4febc0792d0 [a0 f7 23 a4 fe e4 00 00] [..#.....]
0xe4febc0792d8 [90 75 56 d8 fe e4 00 00] [.uV.....]
0xe4febc0792e0 [d9 93 fb ff fd ff ff ff] [........]
0xe4febc0792e8 [d9 93 fb ff ff ff ff ff] [........]
0xe4febc0792f0 [d9 93 fb ff ff ff ff ff] [........]
0xe4febc0792f8 [d9 93 fb ff ff ff ff ff] [........]
0xe4febc079300 [d9 93 fb ff ff ff ff ff] [........]
This may be backported to 2.9 and maybe even 2.8 as it does help spot
the cause of the memory corruption.
This function is particularly useful to dump unknown areas watching
for opportunistic symbols, so let's move it to tools.c so that we can
reuse it a little bit more.
'crt-store' is a new section useful to define the struct ckch_store.
The "load" keyword in the "crt-store" section allows to define which
files you want to load for a specific certificate definition.
Ex:
crt-store
load crt "site1.crt" key "site1.key"
load crt "site2.crt" key "site2.key"
frontend in
bind *:443 ssl crt "site1.crt" crt "site2.crt"
This is part of the certificate loading which was discussed in #785.
This option has been set by default for a very long time and also
complicates the manipulation of the DEBUG variable. Let's make it
the official default and permit to unset it by setting it to zero.
The other pool-related DEBUG options were adjusted to also explicitly
check for the zero value for consistency.
We continue to carry it in the makefile, which adds to the difficulty
of passing new options. Let's make DEBUG_STRICT=1 the default so that
one has to explicitly pass DEBUG_STRICT=0 to disable it. This allows us
to remove the option from the default DEBUG variable in the makefile.
Setting DEBUG_STRICT=0 only validates the defined(DEBUG_STRICT) test
regarding DEBUG_STRICT_ACTION, which is equivalent to DEBUG_STRICT>=0.
Let's make sure the test checks for >0 so that DEBUG_STRICT=0 properly
disables DEBUG_STRICT.
Recent commit 4c1480f13b ("MINOR: stick-tables: mark the seen stksess
with a flag "seen"") introduced a build regression on older versions of
gcc before 4.7. This is in the old __sync_ API, the HA_ATOMIC_LOAD()
implementation uses an intermediary return value called "ret" that is
of the same name as the variable passed in argument to the macro in the
aforementioned commit. As such, the compiler complains with a cryptic
error:
src/peers.c: In function 'peer_teach_process_stksess_lookup':
src/peers.c:1502: error: invalid type argument of '->' (have 'int')
The solution is to avoid referencing the argument in the expression and
using an intermediary variable for the pointer as done elsewhere in the
code. It seems there's no other place affected with this. It probably
does not need to be backported since this code is antique and very rarely
used nowadays.
William rightfully reported that not supporting =0 to disable a USE_xxx
option is sometimes painful (e.g. a script might do USE_xxx=$(command)).
It's not that difficult to handle actually, we just need to consider the
value 0 as empty at the few places that test for an empty string in
options.mk, and in each "ifneq" test in the main Makefile, so let's do
that. We even take care of preserving the original value in the build
options string so that building with USE_OPENSSL=0 will be reported
as-is in haproxy -vv, and with "-OPENSSL" in the feature list.
William suggested that it would be nice to warn about unknown USE_*
variables to more easily catch misspelled ones. The valid ones are
present in use_opts, so by appending "=%" to each of them, we can
build a series of patterns to exclude from MAKEOVERRIDES and emit
a warning for the ones that stand out.
Example:
$ make TARGET=linux-glibc USE_QUIC_COMPAT_OPENSSL=1
Makefile:338: Warning: ignoring unknown build option: USE_QUIC_COMPAT_OPENSSL=1
CC src/slz.o
These values are obviously wrong. There is an extra zero at the end for both
defines. By chance, it is harmless. But it is better to fix it.
This patch should be backported as far as 2.6.
qc_send() was systematically called by quic_conn IO handlers with all
instantiated quic_enc_level. Change this to only register quic_enc_level
for send if needed. Do not call at all qc_send() if no qel registered.
A new function qel_need_sending() is defined to detect if sending is
required. First, it checks if quic_enc_level has prepared frames or
probing is set. It can also returns true if ACK required either on
quic_enc_level itself or because of quic_conn ack timer fired. Finally,
a CONNECTION_CLOSE emission for quic_conn is also a valid case.
This should reduce the number of invocations of qc_send(). This could
improve slightly performance, as well as simplify traces debugging.
A series of previous patches have clean up sending function for
handshake case. Their new exposed API is now flexible enough to convert
app case to use the same functions.
As such, qc_send_hdshk_pkts() is renamed qc_send() and become the single
entry point for QUIC emission. It is used during application packets
emission in quic_conn_app_io_cb(), qc_send_mux(). Also the internal
function qc_prep_hpkts() is renamed qc_prep_pkts().
Remove the new unneeded qc_send_app_pkts() and qc_prep_app_pkts().
Also removed qc_send_app_probing(). It was a simple wrapper over other
application send functions. Now, default qc_send() can be reuse for such
cases with <old_data> argument set to true.
An adjustment was needed when converting qc_send_hdshk_pkts() to the
general qc_send() version. Previously, only a single packets
encoding/emission cycle was performed. This was enough as handshake
packets are always smaller than Tx buffer. However, it may be possible
to emit more application data. As such, a loop is necessary to perform
multiple encoding/emission cycles, as this was already the case in
qc_send_app_pkts().
No functional difference should happen with this commit. However, as
these are critcal functions with a lot of changes, this patch is
labelled as medium.
quic_conn_io_cb() manually implements emission by using lower level
functions qc_prep_pkts() and qc_send_ppkts(). Replace this by using the
higher level function qc_send_hdshk_pkts() which notably handle buffer
allocation and purging.
This allows to clean up send API by flagging qc_prep_pkts() and
qc_send_ppkts() as static. They are now used in a single location inside
qc_send_hdshk_pkts().
qc_send_hdshk_pkts() is a wrapper for qc_prep_hpkts() used on
retransmission. It was restricted to use two quic_enc_level pointers as
distinct arguments. Adapt it to directly use the same list of
quic_enc_level which is passed then to qc_prep_hpkts().
Now for retransmission quic_enc_level send list is built directly into
qc_dgrams_retransmit() which calls qc_send_hdshk_pkts().
Along this change, a new utility function qel_register_send() is
defined. It is an helper to build the quic_enc_level send list. It
enfores that each quic_enc_level instance is only registered in a single
list to prevent memory issues. It is both used in qc_dgrams_retransmit()
and quic_conn_io_cb().
Emission of packets during handshakes was implemented via an API which
uses two alternative ways to specify the list of frames.
The first one uses a NULL list of quic_enc_level as argument for
qc_prep_hpkts(). This was an implicit method to iterate on all qels
stored in quic_conn instance, with frames already inserted in their
corresponding quic_pktns.
The second method was used for retransmission. It uses a custom local
quic_enc_level list specified by the caller as input to qc_prep_hpkts().
Frames were accessible through <retransmit> list pointers of each
quic_enc_level used in an implicit mechanism.
This commit clarifies the API by using a single common method. Now
quic_enc_level list must always be specified by the caller. As for
frames list, each qels must set its new field <send_frms> pointer to the
list of frames to send. Callers of qc_prep_hpkts() are responsible to
always clear qels send list. This prevent a single instance of
quic_enc_level to be inserted while being attached to another list.
This allows notably to clean up some unnecessary code. First,
<retransmit> list of quic_enc_level is removed as it is replaced by new
<send_frms>. Also, it's now possible to use proper list_for_each_entry()
inside qc_prep_hpkts() to loop over each qels. Internal functions for
quic_enc_level selection is now removed.
encode_{chunk,string}() is often found to be used this way:
ret = encode_{chunk,string}(start, stop...)
if (ret == NULL || *ret != '\0') {
//error
}
//success
Indeed, encode_{chunk,string} will always try to add terminating NULL byte
to the output string, unless no space is available for even 1 byte.
However, it means that for the caller to be able to spot an error, then it
must provide a buffer (here: start) which is already initialized.
But this is wrong: not only this is very tricky to use, but since those
functions don't return NULL on failure, then if the output buffer was not
properly initialized prior to calling the function, the caller will
perform invalid reads when checking for failure this way. Moreover, even
if the buffer is initialized, we cannot reliably tell if the function
actually failed this way because if the buffer was previously initialized
with NULL byte, then the caller might think that the call actually
succeeded (since the function didn't return NULL and didn't update the
buffer).
Also, sess_build_logline() relies lf_encode_{chunk,string}() functions
which are in fact wrappers for encode_{chunk,string}() functions and thus
exhibit the same error handling mechanism. It turns out that
sess_build_logline() makes unsafe use of those functions because it uses
the error-checking logic mentionned above while buffer (tmplog) is not
guaranteed to be initialized when entering the function. This may
ultimately cause malfunctions or invalid reads if the output buffer is
lacking space.
To fix the issue once and for all and prevent similar bugs from being
introduced, we make it so encode_{string, chunk} and escape_string()
(based on encode_string()) now explicitly return NULL on failure
(when the function failed to write at least the ending NULL byte)
lf_encode_{string,chunk}() helpers had to be patched as well due to code
duplication.
This should be backported to all stable versions.
[ada: for 2.4 and 2.6 the patch won't apply as-is, it might be helpful to
backport ae1e14d65 ("CLEANUP: tools: removing escape_chunk() function")
first, considering it's not very relevant to maintain a dead function]
Since the Linux capabilities support add-on (see the commit bd84387beb
("MEDIUM: capabilities: enable support for Linux capabilities")), we can also
check haproxy process effective and permitted capabilities sets, when it
starts and runs as non-root.
Like this, if needed network capabilities are presented only in the process
permitted set, we can get this information with capget and put them in the
process effective set via capset. To do this properly, let's introduce
prepare_caps_from_permitted_set().
First, it checks if binary effective set has CAP_NET_ADMIN or CAP_NET_RAW. If
there is a match, LSTCHK_NETADM is removed from global.last_checks list to
avoid warning, because in the initialization sequence some last configuration
checks are based on LSTCHK_NETADM flag and haproxy process euid may stay
unpriviledged.
If there are no CAP_NET_ADMIN and CAP_NET_RAW in the effective set, permitted
set will be checked and only capabilities given in 'setcap' keyword will be
promoted in the process effective set. LSTCHK_NETADM will be also removed in
this case by the same reason. In order to be transparent, we promote from
permitted set only capabilities given by user in 'setcap' keyword. So, if
caplist doesn't include CAP_NET_ADMIN or CAP_NET_RAW, LSTCHK_NETADM would not
be unset and warning about missing priviledges will be emitted at
initialization.
Need to call it before protocol_bind_all() to allow binding to priviledged
ports under non-root and 'setcap cap_net_bind_service' must be set in the
global section in this case.
This commit is similar with the two previous ones. Its purpose is to add
GUID support on listeners. Due to bind_conf and listeners configuration,
some specifities were required.
Its possible to define several listeners on a single bind line, for
example by specifying multiple addresses. As such, it's impossible to
support a "guid" keyword on a bind line. The problem is exacerbated by
the cloning of listeners when sharding is used.
To resolve this, a new keyword "guid-prefix" is defined for bind lines.
It allows to specify a string which will be used as a prefix for
automatically generated GUID for each listeners attached to a bind_conf.
Automatic GUID listeners generation is implemented via a new function
bind_generate_guid(). It is called on post-parsing, after
bind_complete_thread_setup(). For each listeners on a bind_conf, a new
GUID is generated with bind_conf prefix and the index of the listener
relative to other listeners in the bind_conf. This last value is stored
in a new bind_conf field named <guid_idx>. If a GUID cannot be inserted,
for example due to a non-unique value, an error is returned, startup is
interrupted with configuration rejected.
This commit is similar to previous one, except that it implements GUID
support for server instances. A guid_node field is inserted into server
structure. A new "guid" server keyword is defined.
Implement proxy identiciation through GUID. As such, a guid_node member
is inserted into proxy structure. A proxy keyword "guid" is defined to
allow user to fix its value.
GUID format is unspecified to allow users to choose the naming scheme.
Some restrictions however are added by this patch, mainly to ensure
coherence and memory usage.
The first restriction is on the length of GUID. No more than 127
characters can be used to prevent memory over consumption.
The second restriction is on the character set allowed in GUID. Utility
function invalid_char() is used for this : it allows alphanumeric
values and '-', '_', '.' and ':'.
Define a new module guid. Its purpose is to be able to attach a global
identifier for various objects such as proxies, servers and listeners.
A new type guid_node is defined. It will be stored in the objects which
can be referenced by such GUID. Several functions are implemented to
properly initialized, insert, remove and lookup GUID in a global tree.
Modification operations should only be conducted under thread isolation.
Currently, the way proxy-oriented logformat directives are handled is way
too complicated. Indeed, "log-format", "log-format-error", "log-format-sd"
and "unique-id-format" all rely on preparsing hints stored inside
proxy->conf member struct. Those preparsing hints include the original
string that should be compiled once the proxy parameters are known plus
the config file and line number where the string was found to generate
precise error messages in case of failure during the compiling process
that happens within check_config_validity().
Now that lf_expr API permits to compile a lf_expr struct that was
previously prepared (with original string and config hints), let's
leverage lf_expr_compile() from check_config_validity() and instead
of relying on individual proxy->conf hints for each logformat expression,
store string and config hints in the lf_expr struct directly and use
lf_expr helpers funcs to handle them when relevant (ie: original
logformat string freeing is now done at a central place inside
lf_expr_deinit(), which allows for some simplifications)
Doing so allows us to greatly simplify the preparsing logic for those 4
proxy directives, and to finally save some space in the proxy struct.
Also, since httpclient proxy has its "logformat" automatically compiled
in check_config_validity(), we now use the file hint from the logformat
expression struct to set an explicit name that will be reported in case
of error ("parsing [httpclient:0] : ...") and remove the extraneous check
in httpclient_precheck() (logformat was parsed twice previously..)
split parse_logformat_string() into two functions:
parse_logformat_string() sticks to the same behavior, but now becomes an
helper for lf_expr_compile() which uses explicit arguments so that it
becomes possible to use lf_expr_compile() without a proxy, but also
compile an expression which was previously prepared for compiling (set
string and config hints within the logformat expression to avoid manually
storing string and config context if the compiling step happens later).
lf_expr_dup() may be used to duplicate an expression before it is
compiled, lf_expr_xfer() now makes sure that the input logformat is
already compiled.
This is some prerequisite works for log-profiles implementation, no
functional change should be expected.
This patch tries to address a design flaw with how logformat expressions
are parsed from config. Indeed, some parse_logformat_string() calls are
performed during config parsing when the proxy mode is not yet known.
Here's a config example that illustrates the issue:
defaults
mode tcp
listen test
bind :8888
http-response set-header custom-hdr "%trl" # needs http
mode http
The above config should work, because the effective proxy mode is http,
yet haproxy fails with this error:
[ALERT] (99051) : config : parsing [repro.conf:6] : error detected in proxy 'test' while parsing 'http-response set-header' rule : format tag 'trl' is reserved for HTTP mode.
To fix the issue once and for all, let's implement smart postparsing for
logformat expressions encountered during config parsing:
- split parse_logformat_string() (and subfonctions) in order to create a
new lf_expr_postcheck() function that must be called to finish
preparing and checking the logformat expression once the proxy type is
known.
- save some config hints info during parse_logformat_string() to
generate more precise error messages during lf_expr_postcheck(), if
needed, we rely on curpx->conf.args.{file,line} hints for that because
parse_logformat_string() doesn't know about current file and line
number.
- lf_expr_postcheck() uses PR_FL_CHECKED proxy flag to know if the
function may try to make the proxy compatible with the expression, or
if it should simply fail as soon as an incompatibility is detected.
- if parse_logformat_string() is called from an unchecked proxy, then
schedule the expression for postparsing, else (ie: during runtime),
run the postcheck right away.
This change will also allow for some logformat expression error handling
simplifications in the future.
PR_FL_CHECKED is set on proxy once the proxy configuration was fully
checked (including postparsing checks).
This information may be useful to functions that need to know if some
config-related proxy properties are likely to change or not due to parsing
or postparsing/check logics. Also, during runtime, except for some rare cases
config-related proxy properties are not supposed to be changed.
log format expressions are broadly used within the code: once they are
parsed from input string, they are converted to a linked list of
logformat nodes.
We're starting to face some limitations because we're simply storing the
converted expression as a generic logformat_node list.
The first issue we're facing is that storing logformat expressions that
way doesn't allow us to add metadata alongside the list, which is part
of the prerequites for implementing log-profiles.
Another issue with storing logformat expressions as generic lists of
logformat_node elements is that it's starting to become really hard to
tell when we rely on logformat expressions or not in the code given that
there isn't always a comment near the list declaration or manipulation
to indicate that it's relying on logformat expressions under the hood,
so this adds some complexity for code maintenance.
This patch looks quite impressive due to changes in a lot of header and
source files (since logformat expressions are broadly used), but it does
a simple thing: it defines the lf_expr structure which itself holds a
generic list of logformat nodes, and then declares some helpers to
manipulate lf_expr elements and fixes the code so that we now exclusively
manipulate logformat_node lists as lf_expr elements outside of log.c.
For now, lf_expr struct only contains the list of logformat nodes (no
additional metadata), but now that we have dedicated type and helpers,
doing so in the future won't be problematic at all and won't require
extensive code changes.
This is a pretty simple patch despite requiring to make some visible
changes in the code:
When parsing a logformat string, log tags (ie: '%tag', AKA log tags) are
turned into logformat nodes with their type set to the type of the
corresponding logformat_tag element which was matched by name. Thus, when
"compiling" a logformat tag, we only keep a reference to the tag type
from the original logformat_tag.
For example, for "%B" log tag, we have the following logformat_tag
element:
{
.name = "B",
.type = LOG_FMT_BYTES,
.mode = PR_MODE_TCP,
.lw = LW_BYTES,
.config_callback = NULL
}
When parsing "%B" string, we search for a matching logformat tag
inside logformat_tags[] array using the provided name, once we find a
matching element, we craft a logformat node whose type will be
LOG_FMT_BYTES, but from the node itself, we no longer have access to
other informations that are set in the logformat_tag struct element.
Thus from a logformat_node resulting from a log tag, with current
implementation, we cannot easily get back to matching logformat_tag
struct element as it would require us to scan the whole logformat_tags
array at runtime using node->type to find the matching element.
Let's take a simpler path and consider all tag-specific LOG_FMT_*
subtypes as being part of the same logformat node type: LOG_FMT_TAG.
Thanks to that, we're now able to distinguish logformat nodes made
from logformat tag from other logformat nodes, and link them to
their corresponding logformat_tag element from logformat_tags[] array. All
it costs is a simple indirection and an extra pointer in logformat_node
struct.
While at it, all LOG_FMT_* types related to logformat tags were moved
inside log.c as they have no use outside of it since they are simply
lookup indexes for sess_build_logline() and could even be replaced by
function pointers some day...
rename logformat_type internal struct to logformat_tag to to make it less
confusing, then expose logformat_tag struct through header file so that it
can be referenced in other structs.
also rename logformat_keywords[] to logformat_tags[] for better
consistency.