Commit Graph

13949 Commits

Author SHA1 Message Date
Amaury Denoyelle
adb2276524 MINOR: quic: compare coalesced packets by DCID
If an UDP datagram contains multiple QUIC packets, they must all use the
same DCID. The datagram context is used partly for this.

To ensure this, a comparison was made on the dcid_node of DCID tree. As
this is a comparison based on pointer address, it can be faulty when
nodes are removed/readded on the same pointer address.

Replace this comparison by a proper comparison on the DCID data itself.
To this end, the dgram_ctx structure contains now a quic_cid member.
2021-12-17 10:59:36 +01:00
Amaury Denoyelle
c92cbfc014 MINOR: quic: refactor concat DCID with address for Initial packets
For first Initial packets, the socket source dest address is
concatenated to the DCID. This is used to be able to differentiate
possible collision between several clients which used the same ODCID.

Refactor the code to manage DCID and the concatenation with the address.
Before this, the concatenation was done on the quic_cid struct and its
<len> field incremented. In the code it is difficult to differentiate a
normal DCID with a DCID + address concatenated.

A new field <addrlen> has been added in the quic_cid struct. The <len>
field now only contains the size of the QUIC DCID. the <addrlen> is
first initialized to 0. If the address is concatenated, it will be
updated with the size of the concatenated address. This now means we
have to explicitely used either cid.len or cid.len + cid.addrlen to
access the DCID or the DCID + the address. The code should be clearer
thanks to this.

The field <odcid_len> in quic_rx_packet struct is now useless and has
been removed. However, a new parameter must be added to the
qc_new_conn() function to specify the size of the ODCID addrlen.
2021-12-17 10:59:36 +01:00
Amaury Denoyelle
d496251cde MINOR: quic: rename constant for haproxy CIDs length
On haproxy implementation, generated DCID are on 8 bytes, the minimal
value allowed by the specification. Rename the constant representing
this size to inform that this is haproxy specific.
2021-12-17 10:59:36 +01:00
Amaury Denoyelle
260e5e6c24 MINOR: quic: add missing lock on cid tree
All operation on the ODCID/DCID trees must be conducted under a
read-write lock. Add a missing read-lock on the lookup operation inside
listener handler.
2021-12-17 10:59:36 +01:00
Amaury Denoyelle
67e6cd50ef CLEANUP: quic: rename quic_conn conn to qc in quic_conn_free
Rename quic_conn from conn to qc to differentiate it from a struct
connection instance. This convention is already used in the majority of
the code.
2021-12-17 10:59:35 +01:00
Amaury Denoyelle
47e1f6d4e2 CLEANUP: quic: fix spelling mistake in a trace
Initiial -> Initial
2021-12-17 10:59:35 +01:00
Amaury Denoyelle
fdbf63e86e MINOR: mux-quic: fix trace on stream creation
Replace non-initialized qcs.by_id.key by the id to report the proper
stream ID on stream creation.
2021-12-17 09:55:01 +01:00
Frédéric Lécaille
8678eb0d19 CLEANUP: quic: Shorten a litte bit the traces in lstnr_rcv_pkt()
Some traces were too long and confusing when displaying 0 for a non-already
parsed packet number.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
25eeebe293 MINOR: quic: Do not mix packet number space and connection flags
The packet number space flags were mixed with the connection level flags.
This leaded to ACK to be sent at the connection level without regard to
the underlying packet number space. But we want to be able to acknowleged
packets for a specific packet number space.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
afd373c232 MINOR: hq_interop: Stop BUG_ON() truncated streams
This is required if we do not want to make haproxy crash during zerortt
interop runner test which makes a client open multiple streams with
long request paths.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
3fe7df877d CLEANUP: quic: Comment fix for qc_strm_cpy()
This function never returns a negative value... hopefully because it returns
a size_t!!!
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
e629cfd96a MINOR: qpack: Missing check for truncated QPACK fields
Decrementing <len> variable without checking could make haproxy crash (on abort)
when printing a huge buffer (with negative length).
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
a5da31d186 MINOR: quic: Make xprt support 0-RTT.
A client sends a 0-RTT data packet after an Initial one in the same datagram.
We must be able to parse such packets just after having parsed the Initial packets.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
1761fdf0c6 MINOR: ssl_sock: Set the QUIC application from ssl_sock_advertise_alpn_protos.
Make this function call quic_set_app_ops() if the protocol could be negotiated
by the TLS stack.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
b0bd62db23 MINOR: quic: Add quic_set_app_ops() function
Export the code responsible which set the ->app_ops structure into
quic_set_app_ops() function. It must be called by  the TLS callback which
selects the application (ssl_sock_advertise_alpn_protos) so that
to be able to build application packets after having received 0-RTT data.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
4015cbb723 MINOR: quic: No TX secret at EARLY_DATA encryption level
The TLS does not provide us with TX secrets after we have provided it
with 0-RTT data. This is logic: the server does not need to send 0-RTT
data. We must skip the section where such secrets are derived if we do not
want to close the connection with a TLS alert.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
ad3c07ae81 MINOR: quic: Enable TLS 0-RTT if needed
Enable 0-RTT at the TLS context level:
    RFC 9001 4.6.1. Enabling 0-RTT
    Accordingly, the max_early_data_size parameter is repurposed to hold a
    sentinel value 0xffffffff to indicate that the server is willing to accept
    QUIC 0-RTT data.
At the SSL connection level, we must call SSL_set_quic_early_data_enabled().
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
0371cd54d0 CLEANUP: quic: Remove cdata_len from quic_tx_packet struct
This field is no more useful. Modify the traces consequently.
Also initialize ->pn_node.key value to -1, which is an illegal value
for QUIC packet number, and display it in traces if different from -1.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
d8b8443047 MINOR: quic: Add traces for STOP_SENDING frame and modify others
If not handled by qc_parse_pkt_frms(), the packet which contains it is dropped.
Add only a trace when parsing this frame at this time.
Also modify others to reduce the traces size and have more information about streams.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
1d2faa24d2 CLEANUP: quic_frame: Remove a useless suffix to STOP_SENDING
This is to be consistent with the other frame names. Adding a _frame
suffixe to STOP_SENDING is useless. We know this is a frame.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
f57c333ac1 MINOR: quic: Attach timer task to thread for the connection.
This is to avoid races between the connection I/O handler and this task
which share too much variables.
2021-12-17 08:38:43 +01:00
Remi Tricot-Le Breton
0b9e190028 MEDIUM: vars: Enable optional conditions to set-var converter and actions
This patch adds the possibility to add a set of conditions to a set-var
call, be it a converter or an action (http-request or http-response
action for instance). The conditions must all be true for the given
set-var call for the variable to actually be set. If any of the
conditions is false, the variable is left untouched.
The managed conditions are the following : "ifexists", "ifnotexists",
"ifempty", "ifnotempty", "ifset", "ifnotset", "ifgt", "iflt".  It is
possible to combine multiple conditions in a single set-var call since
some of them apply to the variable itself, and some others to the input.

This patch does not change the fact that variables of scope proc are
still created during configuration parsing, regardless of the conditions
that might be added to the set-var calls in which they are mentioned.
For instance, such a line :
    http-request set-var(proc.foo,ifexists) int(5)
would not prevent the creation of the variable during init, and when
actually reaching this line during runtime, the proc.foo variable would
already exist. This is specific to the proc scope.

These new conditions mean that a set-var could "fail" for other reasons
than memory allocation failures but without clearing the contents of the
variable.
2021-12-16 17:31:57 +01:00
Remi Tricot-Le Breton
bb6bc95b1e MINOR: vars: Parse optional conditions passed to the set-var actions
This patch adds the parsing of the optional condition parameters that
can be passed to the set-var and set-var-fmt actions (http as well as
tcp). Those conditions will not be taken into account yet in the var_set
function so conditions passed as parameters will not have any effect.
Since actions do not benefit from the parameter preparsing that
converters have, parsing conditions needed to be done by hand.
2021-12-16 17:31:57 +01:00
Remi Tricot-Le Breton
51899d251c MINOR: vars: Parse optional conditions passed to the set-var converter
This patch adds the parsing of the optional condition parameters that
can be passed to the set-var converter. Those conditions will not be
taken into account yet in the var_set function so conditions passed as
parameters will not have any effect. This is true for any condition
apart from the "ifexists" one that is also used to replace the
VF_UPDATEONLY flag that was used to prevent proc scope variable creation
from a LUA module.
2021-12-16 17:31:55 +01:00
Remi Tricot-Le Breton
25fccd52ac MINOR: vars: Delay variable content freeing in var_set function
When calling var_set on a variable of type string (SMP_T_STR, SMP_T_BIN
or SMP_T_METH), the contents of the variable were freed directly. When
adding conditions to set-var calls we might have cases in which the
contents of an existing variable should be kept unchanged so the freeing
of the internal buffers is delayed in the var_set function (so that we
can bypass it later).
2021-12-16 17:31:31 +01:00
Remi Tricot-Le Breton
1bd9805085 MINOR: vars: Set variable type to ANY upon creation
The type of a newly created variable was not initialized. This patch
sets it to SMP_T_ANY by default. This will be required when conditions
can be added to a set-var call because we might end up creating a
variable without setting it yet.
2021-12-16 17:31:31 +01:00
Remi Tricot-Le Breton
7055301934 MINOR: vars: Move UPDATEONLY flag test to vars_set_ifexist
The vars_set_by_name_ifexist function was created to avoid creating too
many variables from a LUA module. This was made thanks to the
VF_UPDATEONLY flags which prevented variable creation in the var_set
function. Since commit 3a4bedccc ("MEDIUM: vars: replace the global name
index with a hash") this limitation was restricted to 'proc' scope
variables only.
This patch simply moves the scope test to the vars_set_by_name_ifexist
function instead of the var_set function.
2021-12-16 17:31:27 +01:00
David CARLIER
f5d48f8b3b MEDIUM: cfgparse: numa detect topology on FreeBSD.
allowing for all platforms supporting cpu affinity to have a chance
 to detect the cpu topology from a given valid node (e.g.
 DragonflyBSD seems to be NUMA aware from a kernel's perspective
 and seems to be willing start to provide userland means to get
 proper info).
2021-12-15 11:05:51 +01:00
Amaury Denoyelle
b09f4477f4 CLEANUP: cfgparse: modify preprocessor guards around numa detection code
numa_detect_topology() is always define now if USE_CPU_AFFINITY is
activated. For the moment, only on Linux an actual implementation is
provided. For other platforms, it always return 0.

This change has been made to easily add implementation of NUMA detection
for other platforms. The phrasing of the documentation has also been
edited to removed the mention of Linux-only on numa-cpu-mapping
configuration option.
2021-12-15 11:05:51 +01:00
William Lallemand
740629e296 MINOR: cli: "show version" displays the current process version
This patch implements a simple "show version" command which returns
the version of the current process.

It's available from the master and the worker processes, so it is easy
to check if the master and the workers have the same version.

This is a minor patch that really improve compatibility checks
for scripts.

Could be backported in haproxy version as far as 2.0.
2021-12-14 15:40:06 +01:00
Amaury Denoyelle
1ac95445e6 MINOR: hq-interop: refix tx buffering
Incorrect usage of the buffer API : b_room() replaces b_size() to ensure
that we have enough size for http data copy.
2021-12-10 15:14:58 +01:00
William Lallemand
dcbe7b91d6 BUG/MEDIUM: mworker/cli: crash when trying to access an old PID in prompt mode
The master process encounter a crash when trying to access an old
process which left from the master CLI.

To reproduce the problem, you need a prompt to a previous worker, then
wait for this worker to leave, once it left launch a command from this
prompt. The s->target is then filled with a NULL which is dereferenced
when trying to connect().

This patch fixes the problem by checking if s->target is NULL.

Must be backported as far as 2.0.
2021-12-10 14:30:18 +01:00
Amaury Denoyelle
7059ebc095 MINOR: h3: fix possible invalid dereference on htx parsing
The htx variable is only initialized if we have received a HTTP/3
HEADERS frame. Else it must not be dereferenced.

This should fix the compilation on CI with gcc.
  src/h3.c: In function ‘h3_decode_qcs’:
  src/h3.c:224:14: error: ‘htx’ may be used uninitialized in this function
  [-Werror=maybe-uninitialized]
    224 |   htx->flags |= HTX_FL_EOM
2021-12-08 15:52:59 +01:00
Amaury Denoyelle
f3b0ba7dc9 BUG/MINOR: mux-quic: properly initialize flow control
Initialize all flow control members on the qcc instance. Without this,
the value are undefined and it may be possible to have errors about
reached streams limit.
2021-12-08 15:26:16 +01:00
Amaury Denoyelle
5154e7a252 MINOR: quic: notify the mux on CONNECTION_CLOSE
The xprt layer is reponsible to notify the mux of a CONNECTION_CLOSE
reception. In this case the flag QC_CF_CC_RECV is positionned on the
qcc and the mux tasklet is waken up.

One of the notable effect of the QC_CF_CC_RECV is that each qcs will be
released even if they have remaining data in their send buffers.
2021-12-08 15:26:16 +01:00
Amaury Denoyelle
2873a31c81 MINOR: mux-quic: do not release qcs if there is remaining data to send
A qcs is not freed if there is remaining data in its buffer. In this
case, the flag QC_SF_DETACH is positionned.

The qcc io handler is responsible to remove the qcs if the QC_SF_DETACH
is set and their buffers are empty.
2021-12-08 15:26:16 +01:00
Christopher Faulet
70f8948364 BUG/MINOR: cli/server: Don't crash when a server is added with a custom id
When a server is dynamically added via the CLI with a custom id, the key
used to insert it in the backend's tree of used names is not initialized.
The server id must be used but it is only used when no custom id is
provided. Thus, with a custom id, HAProxy crashes.

Now, the server id is always used to init this key, to be able to insert the
server in the corresponding tree.

This patch should fix the issue #1481. It must be backported as far as 2.4.
2021-12-07 19:04:33 +01:00
Christopher Faulet
ba8f06304e MINOR: http-rules: Add capture action to http-after-response ruleset
It is now possible to perform captures on the response when
http-after-response rules are evaluated. It may be handy to capture headers
from responses generated by HAProxy.

This patch is trivial, it may be backported if necessary.
2021-12-07 19:04:33 +01:00
Amaury Denoyelle
db44338473 MINOR: quic: add HTX EOM on request end
Set the HTX EOM flag on RX the app layer. This is required to notify
about the end of the request for the stream analyzers, else the request
channel never goes to MSG_DONE state.
2021-12-07 17:11:22 +01:00
Amaury Denoyelle
fecfa0d822 MINOR: mux-quic: remove uneeded code to check fin on TX
Remove a wrong comparaison with the same buffer on both sides. In any
cases, the FIN is properly set by qcs_push_frame only when the payload
has been totally emptied.
2021-12-07 17:11:22 +01:00
Amaury Denoyelle
5ede40be67 MINOR: hq-interop: fix tx buffering
On h09 app layer, if there is not enought size in the tx buffer, the
transfer is interrupted and the flag QC_SF_BLK_MROOM is positionned.
The transfer is woken up by the mux when new buffer size becomes
available.

This ensure that no data is silently discarded during transfer. Without
this, once the buffer is full the data were removed and thus not send to
the client resulting in a truncating payload.
2021-12-07 17:08:52 +01:00
Frédéric Lécaille
73dcc6ee62 MINOR: quic: Remove QUIC TX packet length evaluation function
Remove qc_eval_pkt() which has come with the multithreading support. It
was there to evaluate the length of a TX packet before building. We could
build from several thread TX packets without consuming a packet number for nothing (when
the building failed). But as the TX packet building functions are always
executed by the same thread, the one attached to the connection, this does
not make sense to continue to use such a function. Furthermore it is buggy
since we had to recently pad the TX packet under certain circumstances.
2021-12-07 15:53:56 +01:00
Frédéric Lécaille
fee7ba673f MINOR: quic: Delete remaining RX handshake packets
After the handshake has succeeded, we must delete any remaining
Initial or Handshake packets from the RX buffer. This cannot be
done depending on the state the connection (->st quic_conn struct
member value) as the packet are not received/treated in order.
2021-12-07 15:53:56 +01:00
Frédéric Lécaille
7d807c93f4 MINOR: quic: QUIC encryption level RX packets race issue
The tree containing RX packets must be protected from concurrent accesses.
2021-12-07 15:53:56 +01:00
Frédéric Lécaille
d61bc8db59 MINOR: quic: Race issue when consuming RX packets buffer
Add a null byte to the end of the RX buffer to notify the consumer there is no
more data to treat.
Modify quic_rx_packet_pool_purge() which is the function which remove the
RX packet from the buffer.
Also rename this function to quic_rx_pkts_del().
As the RX packets may be accessed by the QUIC connection handler (quic_conn_io_cb())
the function responsible of decrementing their reference counters must not
access other information than these reference counters! It was a very bad idea
to try to purge the RX buffer asap when executing this function.
2021-12-07 15:53:56 +01:00
Frédéric Lécaille
f9cb3a9b0e MINOR: quic: RX buffer full due to wrong CRYPTO data handling
Do not leave in the RX buffer packets with CRYPTO data which were
already received. We do this when parsing CRYPTO frame. If already
received we must not consider such frames as if they were not received
in order! This had as side effect to interrupt the transfer of long streams
(ACK frames not parsed).
2021-12-07 15:53:56 +01:00
Amaury Denoyelle
84ea8dcbc4 MEDIUM: mux-quic: handle when sending buffer is full
Handle the case when the app layer sending buffer is full. A new flag
QC_SF_BLK_MROOM is set in this case and the transfer is interrupted. It
is expected that then the conn-stream layer will subscribe to SEND.

The MROOM flag is reset each time the muxer transfer data from the app
layer to its own buffer. If the app layer has been subscribed on SEND it
is woken up.
2021-12-07 15:44:45 +01:00
Amaury Denoyelle
e257d9e8ec MEDIUM: mux-quic: wake up xprt on data transferred
On qc_send, data are transferred for each stream from their qcs.buf to
the qcs.xprt_buf. Wake up the xprt to warn about new data available for
transmission.
2021-12-07 15:44:45 +01:00
Amaury Denoyelle
a2c58a7c8d MEDIUM: mux-quic: subscribe on xprt if remaining data after send
The streams data are transferred from the qcs.buf to the qcs.xprt_buf
during qc_send. If the xprt_buf is not empty and not all data can be
transferred, subscribe the connection on the xprt for sending.

The mux will be woken up by the xprt when the xprt_buf will be cleared.
This happens on ACK reception.
2021-12-07 15:44:45 +01:00
Amaury Denoyelle
a3f222dc1e MINOR: mux-quic: implement subscribe on stream
Implement the subscription in the mux on the qcs instance.

Subscribe is now used by the h3 layer when receiving an incomplete frame
on the H3 control stream. It is also used when attaching the remote
uni-directional streams on the h3 layer.

In the qc_send, the mux wakes up the qcs for each new transfer executed.
This is done via the method qcs_notify_send().

The xprt wakes up the qcs when receiving data on unidirectional streams.
This is done via the method qcs_notify_recv().
2021-12-07 15:44:45 +01:00
Amaury Denoyelle
c2025c1ec6 MEDIUM: quic: detect the stream FIN
Set the QC_SF_FIN_STREAM on the app layers (h3 / hq-interop) when
reaching the HTX EOM. This is used to warn the mux layer to set the FIN
on the QUIC stream.
2021-12-07 15:44:45 +01:00
Amaury Denoyelle
916f0ac1e7 MEDIUM: mux-quic: implement release mux operation
Implement qc_release. This function is called by the upper layer on
connection close. For the moment, this only happens on client timeout.

This functions is used the free a qcs instance. If all bidirectional
streams are freed, the qcc instance and the connection are purged.
2021-12-07 15:44:45 +01:00
Amaury Denoyelle
deed777766 MAJOR: mux-quic: implement a simplified mux version
Re-implement the QUIC mux. It will reuse the mechanics from the previous
mux without all untested/unsupported features. This should ease the
maintenance.

Note that a lot of features are broken for the moment. They will be
re-implemented on the following commits to have a clean commit history.
2021-12-07 15:44:45 +01:00
Amaury Denoyelle
d1202edadd MINOR: h3: remove duplicated FIN flag position
The FIN flag is already set in h3_snd_buf on HTX EOM reception. The same
action in h3_resp_headers_send is duplicated and thus now removed.
2021-12-07 15:37:53 +01:00
Amaury Denoyelle
e2288c3087 MEDIUM: xprt-quic: finalize app layer initialization after ALPN nego
The app layer is initialized after the handshake completion by the XPRT
stack. Call the finalize operation just after that.

Remove the erroneous call to finalize by the mux in the TPs callback as
the app layer is not yet initialized at this stage.

This should fix the missing H3 settings currently not emitted by
haproxy.
2021-12-07 15:37:53 +01:00
Amaury Denoyelle
e1f3ff0d08 MINOR: h3: add BUG_ON on control receive function
Add BUG_ON statement when handling a non implemented frames on the
control stream. This is required because frames must be removed from the
RX buffer or else it will stall the buffer.
2021-12-07 15:37:53 +01:00
Amaury Denoyelle
942fc79b5f MINOR: quic: fix segfault on CONNECTION_CLOSE parsing
At the moment the reason_phrase member of a
quic_connection_close/quic_connection_close_app structure is not
allocated. Comment the memcpy to it to avoid segfault.
2021-12-07 15:37:53 +01:00
Willy Tarreau
b154422db1 IMPORT: slz: use the correct CRC32 instruction when running in 32-bit mode
Many ARMv8 processors also support Aarch32 and can run armv7 and even
thumb2 code. While armv8 compilers will not emit these instructions,
armv7 compilers that are aware of these processors will do. For
example, using gcc built for an armv7 target and passing it
"-mcpu=cortex-a72" or "-march=armv8-a+crc" will result in the CRC32
instruction to be used.

In this case the current assembly code fails because with the ARM and
Thumb2 instruction sets there is no such "%wX" half-registers. We need
to use "%X" instead as the native 32-bit register when running with a
32-bit instruction set, and use "%wX" when using the 64-bit instruction
set (A64).

This is slz upstream commit fab83248612a1e8ee942963fe916a9cdbf085097
2021-12-06 09:14:20 +01:00
Willy Tarreau
88bc800eae BUILD: tree-wide: avoid warnings caused by redundant checks of obj_types
At many places we use construct such as:

   if (objt_server(blah))
       do_something(objt_server(blah));

At -O2 the compiler manages to simplify the operation and see that the
second one returns the same result as the first one. But at -O1 that's
not always the case, and the compiler is able to emit a second
expression and sees the potential null that results from it, and may
warn about a potential null deref (e.g. with gcc-6.5). There are two
solutions to this:
  - either the result of the first test has to be passed to a local
    variable
  - or the second reference ought to be unchecked using the __objt_*
    variant.

This patch fixes all occurrences at once by taking the second approach
(the least intrusive). For constructs like:

   objt_server(blah) ? objt_server(blah)->name : "no name"

a macro could be useful. It would for example take the object type
(server), the field name (name) and the default value. But there
are probably not enough occurrences across the whole code for this
to really matter.

This should be backported wherever it applies.
2021-12-06 09:11:47 +01:00
Tim Duesterhus
caf5f5d302 BUG/MEDIUM: sample: Fix memory leak in sample_conv_jwt_member_query
The function leaked one full buffer per invocation. Fix this by simply removing
the call to alloc_trash_chunk(), the static chunk from get_trash_chunk() is
sufficient.

This bug was introduced in 0a72f5ee7c, which is
2.5-dev10. This fix needs to be backported to 2.5+.
2021-12-03 09:03:55 +01:00
Christopher Faulet
af93d2fd70 BUG/MINOR: resolvers: Don't overwrite the error for invalid query domain name
When a response is validated, the query domain name is checked to be sure it
is the same than the one requested. When an error is reported, the wrong
goto label was used. Thus, the error was lost. Instead of
RSLV_RESP_WRONG_NAME, RSLV_RESP_INVALID was reported.

This bug was introduced by the commit c1699f8c1 ("MEDIUM: resolvers: No
longer store query items in a list into the response").

This patch should fix the issue #1473. No backport is needed.
2021-12-02 10:05:04 +01:00
Christopher Faulet
02c893332b BUG/MEDIUM: h1: Properly reset h1m flags when headers parsing is restarted
If H1 headers are not fully received at once, the parsing is restarted a
last time when all headers are finally received. When this happens, the h1m
flags are sanitized to remove all value set during parsing.

But some flags where erroneously preserved. Among others, H1_MF_TE_CHUNKED
flag was not removed, what could lead to parsing error.

To fix the bug and make things easy, a mask has been added with all flags
that must be preserved. It will be more stable. This mask is used to
sanitize h1m flags.

This patch should fix the issue #1469. It must be backported to 2.5.
2021-12-02 09:46:29 +01:00
Emeric Brun
2ad2b1c94c BUG/MAJOR: segfault using multiple log forward sections.
For each new log forward section, the proxy was added to the log forward
proxy list but the ref on the previous log forward section's proxy was
scratched using "init_new_proxy" which performs a memset. After configuration
parsing this list contains only the last section's proxy.

The post processing walk through this list to resolve "ring" names.
Since some section's proxies are missing in this list, the resolving
is not done for those ones and the pointer on the ring is kept to null
causing a segfault at runtime trying to write a log message
into the ring.

This patch shift the "init_new_proxy" before adding the ref on the
previous log forward section's proxy on currently parsed one.

This patch shoud fix github issue #1464

This patch should be backported to 2.3
2021-12-01 15:21:56 +01:00
Christopher Faulet
c1699f8c1b MEDIUM: resolvers: No longer store query items in a list into the response
When the response is parsed, query items are stored in a list, attached to
the parsed response (resolve_response).

First, there is one and only one query sent at a time. Thus, there is no
reason to use a list. There is a test to be sure there is only one query
item in the response. Then, the reference on this query item is only used to
validate the domain name is the one requested. So the query list can be
removed. We only expect one query item, no reason to loop on query records.
In addition, the query domain name is now immediately checked against the
resolution domain name. This way, the query item is only manipulated during
the response parsing.
2021-12-01 15:21:56 +01:00
Christopher Faulet
80b2e34b18 BUG/MEDIUM: resolvers: Detach query item on response error
When a new response is parsed, it is unexpected to have an old query item
still attached to the resolution. And indeed, when the response is parsed
and validated, the query item is detached and used for a last check on its
dname. However, this is only true for a valid response. If an error is
detected, the query is not detached. This leads to undefined behavior (most
probably a crash) on the next response because the first element in the
query list is referencing an old response.

This patch must be backported as far as 2.0.
2021-12-01 11:47:08 +01:00
Christopher Faulet
4ab2679689 BUG/MINOR: server: Don't rely on last default-server to init server SSL context
During post-parsing stage, the SSL context of a server is initialized if SSL
is configured on the server or its default-server. It is required to be able
to enable SSL at runtime. However a regression was introduced, because the
last parsed default-server is used. But it is not necessarily the
default-server line used to configure the server. This may lead to
erroneously initialize the SSL context for a server without SSL parameter or
the skip it while it should be done.

The problem is the default-server used to configure a server is not saved
during configuration parsing. So, the information is lost during the
post-parsing. To fix the bug, the SRV_F_DEFSRV_USE_SSL flag is
introduced. It is used to know when a server was initialized with a
default-server using SSL.

For the record, the commit f63704488e ("MEDIUM: cli/ssl: configure ssl on
server at runtime") has introduced the bug.

This patch must be backported as far as 2.4.
2021-12-01 11:47:08 +01:00
Christopher Faulet
41951ab9d6 MINOR: mux-h1: add stat for total amount of bytes received and sent
Add counters for total amount of bytes received and sent. Bytes received and
sent via kernel splicing are also counted.
2021-12-01 11:47:08 +01:00
Christopher Faulet
3bca28c9fd MINOR: mux-h1: add stat for total count of connections/streams
Add counters for total number of http1 connections/stream since haproxy
startup. Contrary to open_conn/stream, they are never reset to zero.
2021-12-01 11:47:08 +01:00
Christopher Faulet
60fa051e71 MINOR: mux-h1: count open connections/streams on stats
Implement as a gauge h1 counters for currently open connections and
streams. The counters are decremented when closing the stream or the
connection.
2021-12-01 11:47:08 +01:00
Christopher Faulet
563c345f6f MINOR: mux-h1: add counters instance to h1c
Add pointer to counters as a member for h1c structure. This pointer is
initialized on h1_init function. This is useful to quickly access and
manipulate the counters inside every h1 functions.
2021-12-01 11:47:08 +01:00
Christopher Faulet
b4c584eed1 MINOR: mux-h1: register a stats module
Use statistics API to register a new stats module generating counters on h1
module. The counters are attached to frontend/backend instances.
2021-12-01 11:47:08 +01:00
Christopher Faulet
6580f2868e MINOR: mux-h1: Improve H1 traces by adding info about http parsers
Info about the request and the response parsers are now displayed in H1
traces for advanced and complete verbosity only. This should help debugging.

This patch may be backported as far as 2.4.
2021-12-01 11:47:08 +01:00
Christopher Faulet
f5ce320156 BUG/MINOR: mux-h1: Fix splicing for messages with unknown length
Splicing was disabled fo Messages with an unknown length (no C-L or T-E
header) with no valid reason. So now, it is possible to use the kernel
splicing for such messages.

This patch should be backported as far as 2.4.
2021-12-01 11:47:08 +01:00
Christopher Faulet
140f1a5852 BUG/MEDIUM: mux-h1: Fix splicing by properly detecting end of message
Since the 2.4.4, the splicing support in the H1 multiplexer is buggy because
end of the message is not properly detected.

On the 2.4, when the requests is spliced, there is no issue. But when the
response is spliced, the client connection is always closed at the end of the
message. Note the response is still fully sent.

On the 2.5 and higher, when the last requests on a connection is spliced, a
client abort is reported. For other requests there is no issue. In all cases,
the requests are fully sent. When the response is spliced, the server connection
hangs till the server timeout and a server abort is reported. The response is
fully sent with no delay.

The root cause is the EOM block suppression. There is no longer extra block to
be sure to call a last time rcv_buf()/snd_buf() callback functions. At the end,
to fix the issue, we must now detect end of the message in rcv_pipe() and
snd_pipe() callback functions. To do so, we rely on the announced message length
to know when the payload is finished. This works because the chunk-encoded
messages are not spliced.

This patch must be backported as far as 2.4 after an observation period.
2021-12-01 11:46:21 +01:00
David CARLIER
b1e190a885 MEDIUM: pool: Following up on previous pool trimming update.
Apple libmalloc has its own notion of memory arenas as malloc_zone with
rich API having various callbacks for various allocations strategies but
here we just use the defaults.
In trim_all_pools,  we advise to purge each zone as much as possible, called "greedy" mode.
2021-12-01 10:38:31 +01:00
Remi Tricot-Le Breton
bb3e80e181 BUG/MINOR: vars: Fix the set-var and unset-var converters
In commit 3a4bedccc6 the variable logic was changed. Instead of
accessing variables by their name during runtime, the variable tables
are now indexed by a hash of the name. But the set-var and unset-var
converters try to access the correct variable by calculating a hash on
the sample instead of the already calculated variable hash.

It should be backported to 2.5.
2021-12-01 10:32:19 +01:00
Frédéric Lécaille
008386bec4 MINOR: quic: Delete the ODCIDs asap
As soon as the connection ID (the one choosen by the QUIC server) has been used
by the client, we can delete its original destination connection ID from its tree.
2021-11-30 12:01:32 +01:00
Frédéric Lécaille
a7d2c09468 MINOR: quic: Enable the Key Update process
This patch modifies ha_quic_set_encryption_secrets() to store the
secrets received by the TLS stack and prepare the information for the
next key update thanks to quic_tls_key_update().
qc_pkt_decrypt() is modified to check if we must used the next or the
previous key phase information to decrypt a short packet.
The information are rotated if the packet could be decrypted with the
next key phase information. Then new secrets, keys and IVs are updated
calling quic_tls_key_update() to prepare the next key phase.
quic_build_packet_short_header() is also modified to handle the key phase
bit from the current key phase information.
2021-11-30 11:51:12 +01:00
Frédéric Lécaille
a7973a6dce MINOR: quic: Add quic_tls_key_update() function for Key Update
This function derives the next RX and TX keys and IVs from secrets
for the next key update key phase. We also implement quic_tls_rotate_keys()
which rotate the key update key phase information to be able to continue
to decrypt old key phase packets. Most of these information are pointers
to unsigned char.
2021-11-30 11:51:12 +01:00
Frédéric Lécaille
6e351d6c19 MINOR: quic: Optional header protection key for quic_tls_derive_keys()
quic_tls_derive_keys() is responsible to derive the AEAD keys, IVs and$
header protection key from a secret provided by the TLS stack. We want
to make the derivation of the header protection key be optional. This
is required for the Key Update process where there is no update for
the header protection key.
2021-11-30 11:51:12 +01:00
Frédéric Lécaille
40df78f116 MINOR: quic: Add structures to maintain key phase information
When running Key Update process, we must maintain much information
especially when the key phase bit has been toggled by the peer as
it is possible that it is due to late packets. This patch adds
quic_tls_kp new structure to do so. They are used to store
previous and next secrets, keys and IVs associated to the previous
and next RX key phase. We also need the next TX key phase information
to be able to encrypt packets for the next key phase.
2021-11-30 11:51:12 +01:00
Frédéric Lécaille
39484de813 MINOR: quic: Add a function to derive the key update secrets
This is the function used to derive an n+1th secret from the nth one as
described in RFC9001 par. 6.1.
2021-11-30 11:51:12 +01:00
Frédéric Lécaille
fc768ecc88 MINOR: quic: Dynamically allocate the secrete keys
This is done for any encryption level. This is to prepare the Key Update feature.
2021-11-30 11:51:12 +01:00
Frédéric Lécaille
d77c50b6d6 MINOR: quic: Possible crash when inspecting the xprt context
haproxy may crash when running this statement in qc_lstnr_pkt_rcv():
	conn_ctx = qc->conn->xprt_ctx;
because qc->conn may not be initialized. With this patch we ensure
qc->conn is correctly initialized before accessing its ->xprt_ctx
members. We zero the xrpt_ctx structure (ssl_conn_ctx struct), then
initialize its ->conn member with HA_ATOMIC_STORE. Then, ->conn and
->conn->xptr_ctx members of quic_conn struct can be accessed with HA_ATOMIC_LOAD()
2021-11-30 11:50:42 +01:00
Frédéric Lécaille
e2660e61e2 MINOR: quic: Rename qc_prep_hdshk_pkts() to qc_prep_pkts()
qc_prep_hdshk_pkts() does not prepare only handshake packets but
any type of packet.
2021-11-30 11:47:46 +01:00
Frédéric Lécaille
b5b5247b18 MINOR: quic: Immediately close if no transport parameters extension found
If the ClientHello callback does not manage to find a correct QUIC transport
parameters extension, we immediately close the connection with
missing_extension(109) as TLS alert which is turned into 0x16d QUIC connection
error.
2021-11-30 11:47:46 +01:00
Frédéric Lécaille
1fc5e16c4c MINOR: quic: More accurate immediately close.
When sending a CONNECTION_CLOSE frame to immediately close the connection,
do not provide CRYPTO data to the TLS stack. Do not built anything else than a
CONNECTION_CLOSE and do not derive any secret when in immediately close state.
Seize the opportunity of this patch to rename ->err quic_conn struct member
to ->error_code.
2021-11-30 11:47:46 +01:00
Frédéric Lécaille
067a82bba1 MINOR: quic: Set "no_application_protocol" alert
We set this TLS error when no application protocol could be negotiated
via the TLS callback concerned. It is converted as a QUIC CRYPTO_ERROR
error (0x178).
2021-11-30 11:47:46 +01:00
Willy Tarreau
3cc1e3d5ca BUILD: evports: remove a leftover from the dead_fd cleanup
Commit b1f29bc62 ("MINOR: activity/fd: remove the dead_fd counter") got
rid of FD_UPDT_DEAD, but evports managed to slip through the cracks and
wasn't cleaned up, thus it doesn't build anymore, as reported in github
issue #1467. We just need to remove the related lines since the situation
is already handled by the remaining conditions.

Thanks to Dominik Hassler for reporting the issue and confirming the fix.

This must be backported to 2.5 only.
2021-11-30 09:34:32 +01:00
Christopher Faulet
d98da3bc90 BUG/MEDIUM: cli: Properly set stream analyzers to process one command at a time
The proxy used by the master CLI is an internal proxy and no filter are
registered on it. Thus, there is no reason to take care to set or unset
filter analyzers in the master CLI analyzers. AN_REQ_FLT_END was set on the
request channel to prevent the infinite forward and be sure to be able to
process one commande at a time. However, the only work because
CF_FLT_ANALYZE flag was used by error as a channel analyzer instead of a
channel flag. This erroneously set AN_RES_FLT_END on the request channel,
that really prevent the infinite forward, be side effet.

In fact, We must avoid this kind of trick because this only work by chance
and may be source of bugs in future. Instead, we must always keep the CLI
request analyzer and add an early return if the response is not fully
processed. It happens when the CLI response analyzer is set.

This patch must be backported as far as 2.0.
2021-11-29 11:28:54 +01:00
Willy Tarreau
781f07a620 BUILD: pools: only detect link-time jemalloc on ELF platforms
The build broke on Windows and MacOS after commit ed232148a ("MEDIUM:
pool: refactor malloc_trim/glibc and jemalloc api addition detections."),
because the extern+attribute(weak) combination doesn't result in a really
weak symbol and it causes an undefined symbol at link time.

Let's reserve this detection to ELF platforms. The runtime detection using
dladdr() remains used if defined.

No backport needed, this is purely 2.6.
2021-11-26 16:13:17 +01:00
William Lallemand
efd954793e BUG/MINOR: mworker: deinit of thread poller was called when not initialized
Commit 67e371e ("BUG/MEDIUM: mworker: FD leak of the eventpoll in wait
mode") introduced a regression. Upon a reload it tries to deinit the
poller per thread, but no poll loop was initialized after loading the
configuration.

This patch fixes the issue by moving this part of the code in
mworker_reload(), since this function will be called only when the
poller is fully initialized.

This patch must be backported in 2.5.
2021-11-26 14:43:57 +01:00
David Carlier
d450ff636c MEDIUM: pool: support purging jemalloc arenas in trim_all_pools()
In the case of Linux/glibc, falling back to malloc_trim if jemalloc
had not been detected beforehand.
2021-11-25 18:54:50 +01:00
David Carlier
ed232148a7 MEDIUM: pool: refactor malloc_trim/glibc and jemalloc api addition detections.
Attempt to detect jemalloc at runtime before hand whether linked or via
symbols overrides, and fall back to malloc_trim/glibc for Linux otherwise.
2021-11-25 18:54:50 +01:00
Amaury Denoyelle
5bae85d0d2 MINOR: quic: use more verbose QUIC traces set at compile-time
Remove the verbosity set to 0 on quic_init_stdout_traces. This will
generate even more verbose traces on stdout with the default verbosity
of 1 when compiling with -DENABLE_QUIC_STDOUT_TRACES.
2021-11-25 18:10:58 +01:00
Amaury Denoyelle
118b2cbf84 MINOR: quic: activate QUIC traces at compilation
Implement a function quic_init_stdout_traces called at STG_INIT. If
ENABLE_QUIC_STDOUT_TRACES preprocessor define is set, the QUIC trace
module will be automatically activated to emit traces on stdout on the
developer level.

The main purpose for now is to be able to generate traces on the haproxy
docker image used for QUIC interop testing suite. This should facilitate
test failure analysis.
2021-11-25 16:12:44 +01:00
Amaury Denoyelle
7d3aea50b8 MINOR: qpack: support litteral field line with non-huff name
Support qpack header using a non-huffman encoded name in a litteral
field line with name reference.

This format is notably used by picoquic client and should improve
haproxy interop covering.
2021-11-25 11:41:29 +01:00
Amaury Denoyelle
d6a352a58b MEDIUM: quic: handle CIDs to rattach received packets to connection
Change the way the CIDs are organized to rattach received packets DCID
to QUIC connection. This is necessary to be able to handle multiple DCID
to one connection.

For this, the quic_connection_id structure has been extended. When
allocated, they are inserted in the receiver CID tree instead of the
quic_conn directly. When receiving a packet, the receiver tree is
inspected to retrieve the quic_connection_id. The quic_connection_id
contains now contains a reference to the QUIC connection.
2021-11-25 11:41:29 +01:00
Amaury Denoyelle
42b9f1c6dd CLEANUP: quic: add comments on CID code
Add minor comment to explain how the CID are stored in the QUIC
connection.
2021-11-25 11:33:35 +01:00
Amaury Denoyelle
aff4ec86eb REORG: quic: add comment on rare thread concurrence during CID alloc
The comment is here to warn about a possible thread concurrence issue
when treating INITIAL packets from the same client. The macro unlikely
is added to further highlight this scarce occurence.
2021-11-25 11:13:12 +01:00
Amaury Denoyelle
cb318a80e4 MINOR: quic: do not reject PADDING followed by other frames
It is valid for a QUIC packet to contain a PADDING frame followed by
one or several other frames.

quic_parse_padding_frame() does not require change as it detect properly
the end of the frame with the first non-null byte.

This allow to use quic-go implementation which uses a PADDING-CRYPTO as
the first handshake packet.
2021-11-25 11:13:12 +01:00
William Lallemand
67e371ea14 BUG/MEDIUM: mworker: FD leak of the eventpoll in wait mode
Since 2.5, before re-executing in wait mode, the master can have a
working configuration loaded, with a eventpoll fd. This case was not
handled correctly and a new eventpoll FD is leaking in the master at
each reload, which is inherited by the new worker.

Must be backported in 2.5.
2021-11-25 10:45:29 +01:00
William Lallemand
befab9ee4a BUG/MINOR: mworker: does not add the -sf in wait mode
Since the wait mode is automatically executed after charging the
configuration, -sf was shown in argv[] with the previous PID, which is
normal, but also the current one. This is only a visual problem when
listing the processes, because -sf does not do anything in wait mode.

Fix the issue by removing the whole "-sf" part in wait mode, but the
executed command can be seen in the argv[] of the latest worker forked.

Must be backported in 2.5.
2021-11-25 10:39:54 +01:00
Bertrand Jacquin
7fbc7708d4 BUG/MINOR: lua: remove loop initial declarations
HAProxy is documented to support gcc >= 3.4 as per INSTALL file, however
hlua.c makes use of c11 only loop initial declarations leading to build
failure when using gcc-4.9.4:

  x86_64-unknown-linux-gnu-gcc -Iinclude  -Wchar-subscripts -Wcomment -Wformat -Winit-self -Wmain -Wmissing-braces -Wno-pragmas -Wparentheses -Wreturn-type -Wsequence-point -Wstrict-aliasing -Wswitch -Wtrigraphs -Wuninitialized -Wunknown-pragmas -Wunused-label -Wunused-variable -Wunused-value -Wpointer-sign -Wimplicit -pthread -fdiagnostics-color=auto -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -O3 -msse -mfpmath=sse -march=core2 -g -fPIC -g -Wall -Wextra -Wundef -Wdeclaration-after-statement -fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wtype-limits      -DUSE_EPOLL  -DUSE_NETFILTER   -DUSE_PCRE2 -DUSE_PCRE2_JIT -DUSE_POLL -DUSE_THREAD -DUSE_BACKTRACE   -DUSE_TPROXY -DUSE_LINUX_TPROXY -DUSE_LINUX_SPLICE -DUSE_LIBCRYPT -DUSE_CRYPT_H -DUSE_GETADDRINFO -DUSE_OPENSSL -DUSE_LUA -DUSE_ACCEPT4   -DUSE_SLZ -DUSE_CPU_AFFINITY -DUSE_TFO -DUSE_NS -DUSE_DL -DUSE_RT      -DUSE_PRCTL  -DUSE_THREAD_DUMP        -DUSE_PCRE2 -DPCRE2_CODE_UNIT_WIDTH=8  -I/usr/local/include -DCONFIG_HAPROXY_VERSION=\"2.5.0\" -DCONFIG_HAPROXY_DATE=\"2021/11/23\" -c -o src/connection.o src/connection.c
  src/hlua.c: In function 'hlua_config_prepend_path':
  src/hlua.c:11292:2: error: 'for' loop initial declarations are only allowed in C99 or C11 mode
    for (size_t i = 0; i < 2; i++) {
    ^
  src/hlua.c:11292:2: note: use option -std=c99, -std=gnu99, -std=c11 or -std=gnu11 to compile your code

This commit moves loop iterator to an explicit declaration.

Must be backported to 2.5 because this issue was introduced in
v2.5-dev10~69 with commit 9e5e586e35 ("BUG/MINOR: lua: Fix lua error
 handling in `hlua_config_prepend_path()`")
2021-11-25 09:07:34 +01:00
William Lallemand
2be557f7cb MEDIUM: mworker: seamless reload use the internal sockpairs
With the master worker, the seamless reload was still requiring an
external stats socket to the previous process, which is a pain to
configure.

This patch implements a way to use the internal socketpair between the
master and the workers to transfer the sockets during the reload.
This way, the master will always try to transfer the socket, even
without any configuration.

The master will still reload with the -x argument, followed by the
sockpair@ syntax. ( ex -x sockpair@4 ). Which use the FD of internal CLI
to the worker.
2021-11-24 19:00:39 +01:00
William Lallemand
82d5f013f9 BUG/MINOR: lua: don't expose internal proxies
Since internal proxies are now in the global proxy list, they are now
reachable from core.proxies, core.backends, core.frontends.

This patch fixes the issue by checking the PR_CAP_INT flag before
exposing them in lua, so the user can't have access to them.

This patch must be backported in 2.5.
2021-11-24 16:14:24 +01:00
William Lallemand
f03b53c81d BUG/MINOR: httpclient: allow to replace the host header
This patch allows to replace the host header generated by the
httpclient instead of adding a new one, resulting in the server replying
an error 400.

The host header is now generated from the uri only if it wasn't found in
the list of headers.

Also add a new request in the VTC file to test this.

This patch must be backported in 2.5.
2021-11-24 15:44:36 +01:00
Christopher Faulet
27f88a9059 BUG/MINOR: cache: Fix loop on cache entries in "show cache"
A regression was introduced in the commit da91842b6 ("BUG/MEDIUM: cache/cli:
make "show cache" thread-safe"). When cli_io_handler_show_cache() is called,
only one node is retrieved and is used to fill the output buffer in loop.
Once set, the "node" variable is never renewed. At the end, all nodes are
dumped but each one is duplicated several time into the output buffer.

This patch must be backported everywhere the above commit is. It means only
to 2.5 and 2.4.
2021-11-23 16:15:02 +01:00
William Lallemand
ce9903319c BUG/MINOR: ssl: free correctly the sni in the backend SSL cache
__ssl_sock_load_new_ckch_instance() does not free correctly the SNI in
the session cache, it only frees the one in the current tid.

This bug was introduced with e18d4e8 ("BUG/MEDIUM: ssl: backend TLS
resumption with sni and TLSv1.3").

This fix must be backported where the mentionned commit was backported.
(all maintained versions).
2021-11-23 15:20:59 +01:00
Willy Tarreau
c5e7cf9e69 BUG/MINOR: ssl: make SSL counters atomic
SSL counters were added with commit d0447a7c3 ("MINOR: ssl: add counters
for ssl sessions") in 2.4, but their updates were not atomic, so it's
likely that under significant loads they are not correct.

This needs to be backported to 2.4.
2021-11-22 17:46:13 +01:00
Willy Tarreau
0a1e1cb555 BUG/MEDIUM: cli: make sure we can report a warning from a bind keyword
Since recent 2.5 commit c8cac04bd ("MEDIUM: listener: deprecate "process"
in favor of "thread" on bind lines"), the "process" bind keyword may
report a warning. However some parts like the "stats socket" parser
will call such bind keywords and do not expect to face warnings, so
this will instantly cause a fatal error to be reported. A concrete
effect is that "stats socket ... process 1" will hard-fail indicating
the keyword is deprecated and will be removed in 2.7.

We must relax this test, but the code isn't designed to report warnings,
it uses a single string and only supports reporting an error code (-1).

This patch makes a special case of the ERR_WARN code and uses ha_warning()
to report it, and keeps the rest of the existing error code for other
non-warning codes. Now "process" on the "stats socket" is properly
reported as a warning.

No backport is needed.
2021-11-20 20:15:37 +01:00
Willy Tarreau
97b5d07a3e BUILD: cli: clear a maybe-unused warning on some older compilers
The SHOW_TOT() and SHOW_AVG() macros used in cli_io_handler_show_activity()
produce a warning on gcc 4.7 on MIPS with threads disabled because the
compiler doesn't know that global.nbthread is necessarily non-null, hence
that at least one iteration is performed. Let's just change the loop for
a do {} while () that lets the compiler know it's always initialized. It
also has the tiny benefit of making the code shorter.
2021-11-20 20:15:37 +01:00
Tim Duesterhus
f897fc99bd CLEANUP: sock: Wrap accept4_broken = 1 into additional parenthesis
This makes it clear to static analysis tools that this assignment is
intentional and not a mistyped comparison.
2021-11-20 14:52:01 +01:00
Willy Tarreau
48b608026b MINOR: shctx: add a few BUG_ON() for consistency checks
The shctx code relies on sensitive conditions that are hard to infer
from the code itself, let's add some BUG_ON() to verify them. They
helped spot the previous bugs.
2021-11-19 19:25:13 +01:00
Willy Tarreau
cafe15c743 BUG/MINOR: shctx: do not look for available blocks when the first one is enough
In shctx_row_reserve_hot() we only leave if we've found the exact
requested size instead of at least as large, as is documented. This
results in extra lookups and free calls in the avail loop while it is
not needed, and participates to seeing a negative data_len early as
spotted in previous bugs.

It doesn't seem to have any other impact however, but it's better to
backport it to stable branches.
2021-11-19 19:25:13 +01:00
Willy Tarreau
b15e8a1c96 BUG/MEDIUM: shctx: leave the block allocator when enough blocks are found
In shctx_row_reserve_hot(), a missing break allows the avail loop to
loop for a while after having allocated the required blocks, possibly
leading to the point where it could trigger the watchdog after checking
up to 2 million blocks. In addition, the extra iteration may leave one
block assigned with size zero at the head of the avail list, and mark
it as being an isolated chain of 1 block. It's unclear whether this
could have had other consequences.

There is a non-negligible chance that it addreses bugs #1451 and #1284,
as the pattern observed in the loop looks exactly the same as the one
reported there in the crashes.

It's only marked medium because it is extremely hard to trigger. Here
the conditions were reproduced when starting 4k connections at once
requesting objects of random sizes between 0 and 20k to store them into
a small 1MB cache. However the watchdog will never trigger in such a case
so one needs to instrument the functions.

Thanks to Sohaib Ahmad and @g0uZ for providing useful traces.

This will need to be backported to all stable branches.
2021-11-19 19:25:13 +01:00
Willy Tarreau
da91842b6c BUG/MEDIUM: cache/cli: make "show cache" thread-safe
The "show cache" command restarts from the previous node to look for a
duplicate key, but does this after having released the lock, so under
high write load, the node has many chances of having been reassigned
and the dereference of the node crashes after a few iterations. Since
the keys are unique anyway, there's no point looking for a dup, so
let's just continue from the next value.

This is only marked as medium as it seems to have been there for a
while, and discovering it that late simply means that nobody uses that
command, thus in practice it has a very limited impact on real users.

This should be backported to all stable versions.
2021-11-19 19:25:13 +01:00
Amaury Denoyelle
ee72a43321 BUILD: quic: fix potential NULL dereference on xprt_quic
A warning is triggered by gcc9 on this code path, which is the compiler
version used by ubuntu20.04 on the github CI.

This is linked to github issue #1445.
2021-11-19 15:55:19 +01:00
Amaury Denoyelle
b48c59a5a3 BUG/MINOR: hq-interop: fix potential NULL dereference
Test return from htx_add_stline() and returns an error if NULL.
2021-11-19 15:10:46 +01:00
Amaury Denoyelle
ed66b0f04a BUG/MINOR: quic: fix segfault on trace for version negotiation
When receiving Initial packets for Version Negotiation, no quic_conn is
instantiated. Thus, on the final trace, the quic_conn dereferencement
must be tested before using it.
2021-11-19 15:10:44 +01:00
Frédéric Lécaille
56d3e1b0bd MINOR: quic: Support draft-29 QUIC version
This is only to support quic-tracker test suite.
2021-11-19 15:09:57 +01:00
Frédéric Lécaille
ea78ee1adb MINOR: quic: Wrong value for version negotiation packet 'Unused' field
The seven less significant bits of the first byte must be arbitrary.
Without this fix, QUIC tracker "version_negotiation" test could not pass.
2021-11-19 14:37:35 +01:00
Frédéric Lécaille
f366cb7bf6 MINOR: quic: Add minimalistic support for stream flow control frames
This simple patch add the parsing support for theses frames. But nothing is
done at this time about the streams or flow control concerned. This is only to
prevent some QUIC tracker or interop runner tests from failing for a reason
independant of their tested features.
2021-11-19 14:37:35 +01:00
Frédéric Lécaille
83b7a5b490 MINOR: quic: Wrong largest acked packet number parsing
When we have already received ACK frames with the same largest packet
number, this is not an error at all. In this case, we must continue
to parse the ACK current frame.
2021-11-19 14:37:35 +01:00
Frédéric Lécaille
66cbb8232c MINOR: quic: Send CONNECTION_CLOSE frame upon TLS alert
Add ->err member to quic_conn struct to store the connection errors.
This is the responsability of ->send_alert callback of SSL_QUIC_METHOD
struct to handle the TLS alert and consequently update ->err value.
At this time, when entering qc_build_pkt() we build a CONNECTION_CLOSE
frame close the connection when ->err value is not null.
2021-11-19 14:37:35 +01:00
Frédéric Lécaille
0e25783d47 MINOR: quic: Wrong ACK range building
When adding a range, if no "lower" range was present in the ack range root for
the packet number space concerned, we did not check if the new added range could
overlap the next one. This leaded haproxy to crash when encoding negative integer
when building ACK frames.
This bug was revealed thanks to "multi_packet_client_hello" QUIC tracker
test which makes a client send two first Initial packets out of order.
2021-11-19 14:37:35 +01:00
Frédéric Lécaille
f67b35620e MINOR: quic: Wrong Initial packet connection initialization
->qc (QUIC connection) member of packet structure were badly initialized
when received as second Initial packet (from picoquic -Q for instance).
This leaded to corrupt the quic_conn structure with random behaviors
as size effects. This bug came with this commit:
   "MINOR: quic: Possible wrong connection identification"
2021-11-19 14:37:35 +01:00
Frédéric Lécaille
ca98a7f9c0 MINOR: quic: Anti-amplification implementation
A QUIC server MUST not send more than three times as many bytes as received
by clients before its address validation.
2021-11-19 14:37:35 +01:00
Frédéric Lécaille
a956d15118 MINOR: quic: Support transport parameters draft TLS extension
If we want to run quic-tracker against haproxy, we must at least
support the draft version of the TLS extension for the QUIC transport
parameters (0xffa5). quic-tracker QUIC version is draft-29 at this time.
We select this depending on the QUIC version. If draft, we select the
draft TLS extension.
2021-11-19 14:37:35 +01:00
Frédéric Lécaille
28f51faf0b MINOR: quic: Correctly pad UDP datagrams
UDP datagrams with Initial packet were padded only for the clients (haproxy
servers). But such packets MUST also be padded for the servers (haproxy
listeners). Furthere, for servers, only UDP datagrams containing ack-eliciting
Initial packet must be padded.
2021-11-19 14:37:35 +01:00
Frédéric Lécaille
8370c93a03 MINOR: quic: Possible wrong connection identification
A client may send several Initial packets. This is the case for picoquic
with -Q option. In this case we must identify the connection of incoming
Initial packets thanks to the original destination connection ID.
2021-11-19 14:37:35 +01:00
Frédéric Lécaille
d169efe52b MINOR: quic_sock: missing CO_FL_ADDR_TO_SET flag
When allocating destination addresses for QUIC connections we did not set
this flag which denotes these addresses have been set. This had as side
effect to prevent the H3 request results from being returned to the QUIC clients.

Note that this bug was revealed by this commit:
  "MEDIUM: backend: Rely on addresses at stream level to init server connection"

Thanks to Christopher for having found the real cause of this issue.
2021-11-19 14:37:35 +01:00
Willy Tarreau
3a8bbcc38e BUG/MEDIUM: mux-h2: always process a pending shut read
During 2.4-dev, an issue with partial frames was fixed with commit
3d4631fec ("BUG/MEDIUM: mux-h2: fix read0 handling on partial frames").
However this patch is not completely correct. It makes h2_recv() return
0 if the connection was shut for reads, but this not make h2_io_cb()
call h2_process(), so if there are any pending data left in the demux
buffer, they will never be processed, and the I/O callback will be
called in loops forever from the poller.

The correct return value there is 1, as is done at the end of the
function to report a pending read0.

This should definitely fix issue #1328. However even after a lot of
tests I couldn't manage to reproduce it, the conditions to enter that
situation are quite racy.

This must be backported to 2.0 since the fix above was merged into
2.0.21 and 2.2.9.
2021-11-19 12:10:02 +01:00
William Lallemand
7980dff10c BUG/MEDIUM: ssl: abort with the correct SSL error when SNI not found
Since commit c2aae74 ("MEDIUM: ssl: Handle early data with OpenSSL
1.1.1"), the codepath of the clientHello callback changed, letting an
unknown SNI escape with a 'return 1' instead of passing through the
abort label.

An error was still emitted because the frontend continued the handshake
with the initial_ctx, which can't be used to achieve an handshake.
However, it had the ugly side effect of letting the request pass in the
case of a TLS resume. Which could be surprising when combining strict-sni
with the removing of a crt-list entry over the CLI for example. (like
its done in the ssl/new_del_ssl_crlfile.vtc reg-test).

This patch switches the code path of the allow_early and abort label, so
the default code path is the abort one, letting the clientHello returns
the correct SSL_AD_UNRECOGNIZED_NAME in case of errors.

Which means the client will now receive:

	OpenSSL error[0x14094458] ssl3_read_bytes: tlsv1 unrecognized name

Instead of:

	OpenSSL error[0x14094410] ssl3_read_bytes: sslv3 alert handshake failure

Which was the error emitted before HAProxy 1.8.

This patch must be carrefuly backported as far as 1.8 once we validated
its impact.
2021-11-19 03:59:56 +01:00
William Lallemand
e18d4e8286 BUG/MEDIUM: ssl: backend TLS resumption with sni and TLSv1.3
When establishing an outboud connection, haproxy checks if the cached
TLS session has the same SNI as the connection we are trying to
resume.

This test was done by calling SSL_get_servername() which in TLSv1.2
returned the SNI. With TLSv1.3 this is not the case anymore and this
function returns NULL, which invalidates any outboud connection we are
trying to resume if it uses the sni keyword on its server line.

This patch fixes the problem by storing the SNI in the "reused_sess"
structure beside the session itself.

The ssl_sock_set_servername() now has a RWLOCK because this session
cache entry could be accessed by the CLI when trying to update a
certificate on the backend.

This fix must be backported in every maintained version, however the
RWLOCK only exists since version 2.4.
2021-11-19 03:58:30 +01:00
Willy Tarreau
ec347b1239 MINOR: config: support default values for environment variables
Sometimes it is really useful to be able to specify a default value for
an optional environment variable, like the ${name-value} construct in
shell. In fact we're really missing this for a number of settings in
reg tests, starting with timeouts.

This commit simply adds support for the common syntax above. Other
common forms like '+' to replace existing variables, or ':-' and ':+'
to act on empty variables, were not implemented at this stage, as they
are less commonly needed.
2021-11-18 17:54:49 +01:00
William Lallemand
002e2068cc CLEANUP: ssl: fix wrong #else commentary
The else is not for boringSSL but for the lack of Client Hello callback.
Should have been changed in 1fc44d4 ("BUILD: ssl: guard Client Hello
callbacks with HAVE_SSL_CLIENT_HELLO_CB macro instead of openssl
version").

Could be backported in 2.4.
2021-11-18 15:38:42 +01:00
Amaury Denoyelle
10eed8ed03 BUG/MINOR: quic: fix version negotiation packet generation
Fix wrong memcpy usage for source and connection ID in generated Version
Negotiation packet.
2021-11-18 13:49:40 +01:00
William Lallemand
c4810b8cc8 BUG/MEDIUM: mworker: cleanup the listeners when reexecuting
Previously, the cleanup of the listeners was done in mworker_loop(),
which was called once the configuration file was parsed. HAProxy was
switching in wait mode when the configuration failed to load, so no
listeners where created.

Since the latest change on the mworker mode, HAProxy switch to wait mode
after successfuly loading the configuration, without cleaning its
listeners, because it was done in mworker_loop, resulting in the master
not closing its listeners and keeping them. The master needs its
configuration to know which listeners it need to close, so that must be
done before the exec().

This patch fixes the problem by cleaning the listeners in the
mworker_reexec() function.

No backport needeed.
2021-11-18 11:01:16 +01:00
Amaury Denoyelle
a22d860406 MEDIUM: quic: send version negotiation packet on unknown version
If the client announced a QUIC version not supported by haproxy, emit a
Version Negotiation Packet, according to RFC9000 6. Version Negotiation.

This is required to be able to use the framework for QUIC interop
testing from https://github.com/marten-seemann/quic-interop-runner. The
simulator checks that the server is available by sending packets to
force the emission of a Version Negotiation Packet.
2021-11-18 10:50:58 +01:00
Amaury Denoyelle
154bc7f864 MINOR: quic: support hq-interop
Implement a new app_ops layer for quic interop. This layer uses HTTP/0.9
on top of QUIC. Implementation is minimal, with the intent to be able to
pass interoperability test suite from
https://github.com/marten-seemann/quic-interop-runner.

It is instantiated if the negotiated ALPN is "hq-interop".
2021-11-18 10:50:58 +01:00
Amaury Denoyelle
71e588c8a7 MEDIUM: quic: inspect ALPN to install app_ops
Remove the hardcoded initialization of h3 layer on mux init. Now the
ALPN is looked just after the SSL handshake. The app layer is then
installed if the ALPN negotiation returned a supported protocol.

This required to add a get_alpn on the ssl_quic layer which is just a
call to ssl_sock_get_alpn() from ssl_sock. This is mandatory to be able
to use conn_get_alpn().
2021-11-18 10:50:58 +01:00
Amaury Denoyelle
abbe91e5e8 MINOR: quic: redirect app_ops snd_buf through mux
This change is required to be able to use multiple app_ops layer on top
of QUIC. The stream-interface will now call the mux snd_buf which is
just a proxy to the app_ops snd_buf function.

The architecture may be simplified in the structure to install the
app_ops on the stream_interface and avoid the detour via the mux layer
on the sending path.
2021-11-18 10:50:58 +01:00
Amaury Denoyelle
d1acaf9828 BUG/MINOR: h3: ignore unknown frame types
When receiving an unknown h3 frame type, the frame must be discarded
silently and the processing of the remaing frames must continue. This is
according to the HTTP/3 draft34.

This issue was detected when using the quiche client which uses GREASE
frame to test interoperability.
2021-11-18 10:50:58 +01:00
Christopher Faulet
7530830414 BUG/MEDIUM: mux-h1: Handle delayed silent shut in h1_process() to release H1C
The commit a85c522d4 ("BUG/MINOR: mux-h1: Save shutdown mode if the shutdown
is delayed") revealed several hidden bugs in connection's shutdown
handling. One of them is about delayed silent shudown.

If outgoing data are not fully sent, we delayed the shutdown. However, in
h1_process(), only normal (or clean) shutdown are really detected. If a
silent (or dirty) shutdown is performed, the H1 connection is not
immediately released. Of course, in this situation, the client never
acknowledged the shutdown. Thus, the H1 connection remains open till the
client timeout.

This patch should fix the issues #1448 and #1453. It must be backported as
far as 2.0.
2021-11-15 15:03:21 +01:00
Christopher Faulet
1ccbe12f4a DOC: log: Add comments to specify when session's listener is defined or not
When a log message is emitted, The session's listener is always defined when
the session's owner is an inbound connection while it is undefined for a
health-check. It is not obvious. So, comments have been added to make it
clear.

This patch is related to the issue #1434.
2021-11-15 11:31:09 +01:00
Christopher Faulet
d9e6b35701 CLEANUP: peers: Remove useless test on peer variable in peer_trace()
A useless test on peer variable was reported by cppcheck in peer_trace().

This patch should fix the issue #1165.
2021-11-15 09:41:00 +01:00
Christopher Faulet
b7c962b0c0 BUG/MINOR: stick-table/cli: Check for invalid ipv6 key
When an ipv6 key is used to filter a CLI command on a stick table
(clear/set/show table ...), the return value of inet_pton() call must be
checked to be sure the key is valid.

This patch should fix the issue #1163. It should be backported to all
supported versions.
2021-11-15 09:17:27 +01:00
Willy Tarreau
fdf53b4962 BUG/MINOR: pools: don't mark ourselves as harmless in DEBUG_UAF mode
When haproxy is built with DEBUG_UAF=1, some particularly slow
allocation functions are used for each pool, and it was not uncommon
to see the watchdog trigger during performance tests. For this reason
the allocation functions were surrounded by a pair of thread_harmless
calls to mention that the function was waiting in slow syscalls. The
problem is that this also releases functions blocked in thread_isolate()
which can then start their work.

In order to protect against the accidental removal of a shared resource
in this situation, in 2.5-dev4 with commit ba3ab7907 ("MEDIUM: servers:
make the server deletion code run under full thread isolation") was added
thread_isolate_full() for functions which want to be totally protected
due to being manipulating some data.

But this is not sufficient, because there are still places where we
can allocate/free (thus sleep) under a lock, such as in long call
chains involving the release of an idle connection. In this case, if
one thread asks for isolation, one thread might hang in
pool_alloc_area_uaf() with a lock held (for example the conns_lock
when coming from conn_backend_get()->h1_takeover()->task_new()), with
another thread blocked on a lock waiting for that one to release it,
both keeping their bit clear in the thread_harmless mask, preventing
the first thread from being released, thus causing a deadlock.

In addition to this, it was already seen that the "show fd" CLI handler
could wake up during a pool_free_area_uaf() with an incompletely
released memory area while deleting a file descriptor, and be fooled
showing bad pointers, or during a pool_alloc() on another thread that
was in the process of registering a freshly allocated connection to a
new file descriptor.

One solution could consist in replacing all thread_isolate() calls by
thread_isolate_full() but then that makes thread_isolate() useless
and only shifts the problem by one slot.

A better approach could possibly consist in having a way to mark that
a thread is entering an extremely slow section. Such sections would
be timed so that this is not abused, and the bit would be used to
make the watchdog more patient. This would be acceptable as this would
only affect debugging.

The approach used here for now consists in removing the harmless bits
around the UAF allocator, thus essentially undoing commit 85b2cae63
("MINOR: pools: make the thread harmless during the mmap/munmap
syscalls").

This is marked as minor because nobody is expected to be running with
DEBUG_UAF outside of development or serious debugging, so this issue
cannot affect regular users. It must be backported to stable branches
that have thread_harmless_now() around the mmap() call.
2021-11-12 11:17:37 +01:00
Christopher Faulet
47940c39e2 BUG/MINOR: mux-h2: Fix H2_CF_DEM_SHORT_READ value
The value for H2_CF_DEM_SHORT_READ flag is wrong. 2 bits are erroneously
set, 0x200 and 0x80000.  It is not an issue because both bits are not used
anywhere else.

The typo was introduced in the commit b5f7b5296 ("BUG/MEDIUM: mux-h2: Handle
remaining read0 cases on partial frames"). Thus this patch must also be
backported as far a 2.0.
2021-11-10 18:04:36 +01:00
William Lallemand
67b778418e BUG/MEDIUM: httpclient/cli: free of unallocated hc->req.uri
httpclient_new() sets the hc->req.uri ist without duplicating its
memory, which is a problem since the string in the ist could be
inaccessible at some point. The API was made to use a ist which was
allocated dynamically, but httpclient_new() didn't do that, which result
in a crash when calling istfree().

This patch fixes the problem by doing an istdup()

Fix issue #1452.
2021-11-10 17:02:50 +01:00
William Lallemand
5f47b2e280 BUG/MINOR: mworker: doesn't launch the program postparser
When in wait mode, the mworker-prog postparser is launched, but
unfortunately the child structure doesn't contain all required
information to be able to launch the test.

This test is only required when doing a configuration parsing.

Must be backported as far as 2.0.
2021-11-10 15:53:01 +01:00
William Lallemand
90034bba15 MINOR: mworker: change the way we set PROC_O_LEAVING
Since the wait mode is always used once we successfuly loaded the
configuration, every processes were marked as old workers.

To fix this, the PROC_O_LEAVING flag is set only on the processes which
have a number of reloads greater than the current processes.
2021-11-10 15:53:01 +01:00
William Lallemand
3ba7c7b5e1 MINOR: mworker: ReloadFailed shown depending on failedreload
The ReloadFailed prompt in the master CLI is shown only when
failedreloads > 0. It was previously using a check on the wait mode, but
we always use the wait mode now.
2021-11-10 15:53:01 +01:00
William Lallemand
6883674084 MINOR: mworker: implement a reload failure counter
Implement a reload failure counter which counts the number of failure
since the last success. This counter is available in 'show proc' over
the master CLI.
2021-11-10 15:53:01 +01:00
William Lallemand
ad221f4ece MINOR: mworker: only increment the number of reload in wait mode
Since the wait mode will be started in any case of succesful or failed
reload, change the way haproxy computes the number of reloads of the
processes.
2021-11-10 15:53:01 +01:00
William Lallemand
836bda226c MINOR: mworker: clarify starting/failure messages
Clarify the startup and reload messages:

On a successful configuration load, haproxy will emit "Loading success."
after successfuly forked the children.

When it didn't success to load the configuration it will emit "Loading failure!".

When trying to reload the master process, it will emit "Reloading
HAProxy".
2021-11-10 15:53:01 +01:00
William Lallemand
fab0fdce98 MEDIUM: mworker: reexec in waitpid mode after successful loading
Use the waitpid mode after successfully loading the configuration, this
way the memory will be freed in the master, and will preserve the memory.

This will be useful when doing a reload with a configuration which has
large maps or a lot of SSL certificates, avoiding an OOM because too
much memory was allocated in the master.
2021-11-10 15:53:01 +01:00
William Lallemand
5d71a6b0f1 CLEANUP: mworker: remove any relative PID reference
nbproc was removed, it's time to remove any reference to the relative
PID in the master-worker, since there can be only 1 current haproxy
process.

This patch cleans up the alerts and warnings emitted during the exit of
a process, as well as the "show proc" output.
2021-11-10 15:53:01 +01:00
Christopher Faulet
99293b0380 MINOR: mux-h1: Slightly Improve H1 traces
Connection and conn-stream pointers and flags are now dumped, if available,
in each trace messages. In addition, shutr and shutw mode is now reported.
2021-11-10 11:45:27 +01:00
Christopher Faulet
4c5a591b10 Revert "BUG/MINOR: http-ana: Don't eval front after-response rules if stopped on back"
This reverts commit 597909f4e6

http-after-response rules evaluation was changed to do the same that was
done for http-response, in the code. However, the opposite must be performed
instead. Only the rules of the current section must be stopped. Thus the
above commit is reverted and the http-response rules evaluation will be
fixed instead.

Note that only "allow" action is concerned. It is most probably an uncommon
action for an http-after-request rule.

This patch must be backported as far as 2.2 if the above commit was
backported.
2021-11-09 18:02:49 +01:00
Christopher Faulet
46f46df300 BUG/MINOR: http-ana: Apply stop to the current section for http-response rules
A TCP/HTTP action can stop the rules evaluation. However, it should be
applied on the current section only. For instance, for http-requests rules,
an "allow" on a frontend must stop evaluation of rules defined in this
frontend. But the backend rules, if any, must still be evaluated.

For http-response rulesets, according the configuration manual, the same
must be true. Only "allow" action is concerned. However, since the
beginning, this action stops evaluation of all remaining rules, not only
those of the current section.

This patch may be backported to all supported versions. But it is not so
critical because the bug exists since a while. I doubt it will break any
existing configuration because the current behavior is
counterintuitive.
2021-11-09 18:02:36 +01:00
William Dauchy
42d7c402d5 MINOR: promex: backend aggregated server check status
- add new metric: `haproxy_backend_agg_server_check_status`
  it counts the number of servers matching a specific check status
  this permits to exclude per server check status as the usage is often
  to rely on the total. Indeed in large setup having thousands of
  servers per backend the memory impact is not neglible to store the per
  server metric.
- realign promex_str_metrics array

quite simple implementation - we could improve it later by adding an
internal state to the prometheus exporter, thus to avoid counting at
every dump.

this patch is an attempt to close github issue #1312. It may bebackported
to 2.4 if requested.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-11-09 10:51:08 +01:00
William Lallemand
db8a1f391d BUG/MEDIUM: httpclient: channel_add_input() must use htx->data
The httpclient uses channel_add_input() to notify the channel layer that
it must forward some data. This function was used with b_data(&req->buf)
which ask to send the size of a buffer (because of the HTX metadata
which fill the buffer completely).

This is wrong and will have the consequence of trying to send data that
doesn't exist, letting HAProxy looping at 100% CPU.

When using htx channel_add_input() must be used with the size of the htx
payload, and not the size of a buffer.

When sending the request payload it also need to sets the buffer size to
0, which is achieved with a htx_to_buf() when the htx payload is empty.
2021-11-08 17:36:31 +01:00
William Lallemand
933fe394bb BUG/MINOR: httpclient/lua: rcv freeze when no request payload
This patch fixes the receive part of the lua httpclient when no payload
was sent.

The lua task was not awoken once it jumped into
hlua_httpclient_rcv_yield(), which caused the lua client to freeze.

It works with a payload because the payload push is doing the wakeup.

A change in the state machine of the IO handler is also require to
achieve correctly the change from the REQ state to the RES state, it has
to detect if there is the right EOM flag in the request.
2021-11-08 17:36:31 +01:00
Willy Tarreau
1f38bdb3f6 BUG/MINOR: cache: properly ignore unparsable max-age in quotes
When "max-age" or "s-maxage" receive their values in quotes, the pointer
to the integer to be parsed is advanced by one, but the error pointer
check doesn't consider this advanced offset, so it will not match a
parse error such as max-age="a" and will take the value zero instead.

This probably needs to be backported, though it's unsure it has any
effect in the real world.
2021-11-08 12:09:27 +01:00
Willy Tarreau
49b0482ed4 CLEANUP: chunk: remove misleading chunk_strncat() function
This function claims to perform an strncat()-like operation but it does
not, it always copies the indicated number of bytes, regardless of the
presence of a NUL character (what is currently done by chunk_memcat()).
Let's remove it and explicitly replace it with chunk_memcat().
2021-11-08 12:08:26 +01:00
Tim Duesterhus
9f7ed8a60c CLEANUP: Apply ist.cocci
This is to make use of `chunk_istcat()`.
2021-11-08 12:08:26 +01:00
Tim Duesterhus
2471f5c2b2 CLEANUP: Apply ist.cocci
Make use of the new rules to use `isttrim()`.
2021-11-08 12:08:26 +01:00
Frédéric Lécaille
c4becf5424 MINOR: quic: Fix potential null pointer dereference
Fix compilation warnings about non initialized pointers.

This partially address #1445 github issue.
2021-11-08 11:31:12 +01:00
Amaury Denoyelle
b9ce14e5a2 MINOR: h3: fix potential NULL dereference
Fix potential allocation failure of HTX start-line during H3 request
decoding. In this case, h3_decode_qcs returns -1 as error code.

This addresses in part github issue #1445.
2021-11-08 09:17:24 +01:00
Amaury Denoyelle
7bb54f9906 MINOR: mux-quic: fix gcc11 warning
Fix minor warnings about an unused variable.

This addresses in part github issue #1445.
2021-11-08 08:59:30 +01:00
Amaury Denoyelle
3cae4049b0 MINOR: h3/qpack: fix gcc11 warnings
Fix minor warnings about unused variables and mixed declarations.

This addresses in part github issue #1445.
2021-11-08 08:59:30 +01:00
Tim Duesterhus
16cc16dd82 CLEANUP: Re-apply xalloc_size.cocci
Use a consistent size as the parameter for the *alloc family.
2021-11-08 08:05:39 +01:00
Tim Duesterhus
4c8f75fc31 CLEANUP: Apply ist.cocci
Make use of the new rules to use `istend()`.
2021-11-08 08:05:39 +01:00
Willy Tarreau
68574dd492 MEDIUM: log: add the client's SNI to the default HTTPS log format
During a troublehooting it came obvious that the SNI always ought to
be logged on httpslog, as it explains errors caused by selection of
the default certificate (or failure to do so in case of strict-sni).

This expectation was also confirmed on the mailing list.

Since the field may be empty it appeared important not to leave an
empty string in the current format, so it was decided to place the
field before a '/' preceding the SSL version and ciphers, so that
in the worst case a missing field leads to a field looking like
"/TLSv1.2/AES...", though usually a missing element still results
in a "-" in logs.

This will change the log format for users who already deployed the
2.5-dev versions (hence the medium level) but no released version
was using this format yet so there's no harm for stable deployments.
The reg-test was updated to check for "-" there since we don't send
SNI in reg-tests.

Link: https://www.mail-archive.com/haproxy@formilux.org/msg41410.html
Cc: William Lallemand <wlallemand@haproxy.org>
2021-11-06 09:20:07 +01:00
Willy Tarreau
579259d150 MINOR: ssl: make the ssl_fc_sni() sample-fetch function always available
Its definition is enclosed inside an ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME
which is defined since OpenSSL 0.9.8. Having it conditioned like this
prevents us from using it by default in a log format, which could cause
an error on an old or exotic library.

Let's just always define it and make the sample fetch fail to return
anything on such libs instead.
2021-11-06 09:20:07 +01:00
Willy Tarreau
6f7497616e MEDIUM: connection: rename fc_conn_err and bc_conn_err to fc_err and bc_err
Commit 3d2093af9 ("MINOR: connection: Add a connection error code sample
fetch") added these convenient sample-fetch functions but it appears that
due to a misunderstanding the redundant "conn" part was kept in their
name, causing confusion, since "fc" already stands for "front connection".

Let's simply call them "fc_err" and "bc_err" to match all other related
ones before they appear in a final release. The VTC they appeared in were
also updated, and the alpha sort in the keywords table updated.

Cc: William Lallemand <wlallemand@haproxy.org>
2021-11-06 09:20:07 +01:00
Christopher Faulet
44d34bfbe7 MINOR: compression: Warn for 'compression offload' in defaults sections
This directive is documented as being ignored if set in a defaults
section. But it is only mentionned in a small note in the configuration
manual. Thus, now, a warning is emitted. To do so, the errors handling in
parse_compression_options() function was slightly changed.

In addition, this directive is now documented apart from the other
compression directives. This way, it is clearly visible that it must not be
used in a defaults section.
2021-11-05 16:36:42 +01:00
Christopher Faulet
34a3eb4c42 MINOR: backend: Get client dst address to set the server's one only if needful
In alloc_dst_address(), the client destination address must only be
retrieved when we are sure to use it. Most of time, this save a syscall to
getsockname(). It is not a bugfix in itself. But it revealed a bug in the
QUIC part. The CO_FL_ADDR_TO_SET flag is not set when the destination
address is create for anew quic client connection.
2021-11-05 15:25:34 +01:00
Frédéric Lécaille
b0006eee09 MINOR: quic: Use QUIC_LOCK QUIC specific lock label.
Very minor modifications without any impact.
2021-11-05 15:20:04 +01:00
Frédéric Lécaille
46ea033be0 MINOR: quic: Remove a useless lock for CRYPTO frames
->frms_rwlock is an old lock supposed to be used when several threads
could handle the same connection. This is no more the case since this
commit:
 "MINOR: quic: Attach the QUIC connection to a thread."
2021-11-05 15:20:04 +01:00
Frédéric Lécaille
324ecdafbb MINOR: quic: Enhance the listener RX buffering part
Add a buffer per QUIC connection. At this time the listener which receives
the UDP datagram is responsible of identifying the underlying QUIC connection
and must copy the QUIC packets to its buffer.
->pkt_list member has been added to quic_conn struct to enlist the packets
in the order they have been copied to the connection buffer so that to be
able to consume this buffer when the packets are freed. This list is locked
thanks to a R/W lock to protect it from concurent accesses.
quic_rx_packet struct does not use a static buffer anymore to store the QUIC
packets contents.
2021-11-05 15:20:04 +01:00
Frédéric Lécaille
c5c69a0ad2 CLEANUP: quic: Remove useless code
Remove old I/O handler implementation (listener and server).
At this time keep a defined but not used function for servers (qc_srv_pkt_rcv()).
2021-11-05 15:20:04 +01:00
Frédéric Lécaille
c1029f6182 MINOR: quic: Allocate listener RX buffers
At this time we allocate an RX buffer by thread.
Also take the opportunity offered by this patch to rename TX related variable
names to distinguish them from the RX part.
2021-11-05 15:20:04 +01:00
Tim Duesterhus
284fbe1214 CLEANUP: Apply ist.cocci
Make use of the new rules to use `istnext()`.
2021-11-05 07:48:38 +01:00
Tim Duesterhus
025b93e3a2 CLEANUP: Apply ha_free.cocci
Use `ha_free()` where possible.
2021-11-05 07:48:38 +01:00
Remi Tricot-Le Breton
7266350181 BUG/MINOR: jwt: Fix jwt_parse_alg incorrectly returning JWS_ALG_NONE
jwt_parse_alg would mistakenly return JWT_ALG_NONE for algorithms "",
"n", "no" and "non" because of a strncmp misuse. It now sees them as
unknown algorithms.

No backport needed.

Cc: Tim Duesterhus <tim@bastelstu.be>
2021-11-03 17:19:48 +01:00
Emeric Brun
f8642ee826 MEDIUM: resolvers: rename dns extra counters to resolvers extra counters
This patch renames all dns extra counters and stats functions, types and
enums using the 'resolv' prefix/suffixes.

The dns extra counter domain id used on cli was replaced by "resolvers"
instead of "dns".

The typed extra counter prefix dumping resolvers domain "D." was
also renamed "N." because it points counters on a Nameserver.

This was done to finish the split between "resolver" and "dns" layers
and to avoid further misunderstanding when haproxy will handle dns
load balancing.

This should not be backported.
2021-11-03 17:16:46 +01:00
Emeric Brun
d174f0e59a MINOR: resolvers/dns: split dns and resolver counters in dns_counter struct
This patch add a union and struct into dns_counter struct to split
application specific counters.

The only current existing application is the resolver.c layer but
in futur we could handle different application such as dns load
balancing with others specific counters.

This patch should not be backported.
2021-11-03 17:16:46 +01:00
Emeric Brun
0161d32df2 BUG/MINOR: resolvers: throw log message if trash not large enough for query
Before this patch the sent error counter was increased
for each targeted nameserver as soon as we were unable to build
the query message into the trash buffer. But this counter is here
to count sent errors at dns.c transport layer and this error is not
related to a nameserver.

This patch stops to increase those counters and sent a log message
to signal the trash buffer size is not large enough to build the query.

Note: This case should not happen except if trash size buffer was
customized to a very low value.

The function was also re-worked to return -1 in this error case
as it was specified in comment. This function is currently
called at multiple point in resolver.c but return code
is still not yet handled. So to advert the user of the malfunction
the log message was added.

This patch should be backported on all versions including the
layer split between dns.c and resolver.c (v >= 2.4)
2021-11-03 17:16:46 +01:00
Emeric Brun
c37caab21c BUG/MINOR: resolvers: fix sent messages were counted twice
The sent messages counter was increased at both resolver.c and dns.c
layers.

This patch let the dns.c layer count the sent messages since this
layer handle a retry if transport layer is not ready (EAGAIN on udp
or tcp session ring buffer full).

This patch should be backported on all versions using a split of those
layers for resolving (v >=2.4)
2021-11-03 17:16:46 +01:00
Amaury Denoyelle
f9d5957cd9 MINOR: server: add ws keyword
Implement parsing for the server keyword 'ws'. This is used to configure
the mode of selection for websocket protocol. The configuration
documentation has been updated.

A new regtest has been created to test the proper behavior of the
keyword.
2021-11-03 16:24:48 +01:00
Amaury Denoyelle
9c3251d108 MEDIUM: server/backend: implement websocket protocol selection
Handle properly websocket streams if the server uses an ALPN with both
h1 and h2. Add a new field h2_ws in the server structure. If set to off,
reuse is automatically disable on backend and ALPN is forced to http1.x
if possible. Nothing is done if on.

Implement a mechanism to be able to use a different http version for
websocket streams. A new server member <ws> represents the algorithm to
select the protocol. This can overrides the server <proto>
configuration. If the connection uses ALPN for proto selection, it is
updated for websocket streams to select the right protocol.

Three mode of selection are implemented :
- auto : use the same protocol between non-ws and ws streams. If ALPN is
  use, try to update it to "http/1.1"; this is only done if the server
  ALPN contains "http/1.1".
- h1 : use http/1.1
- h2 : use http/2.0; this requires the server to support RFC8441 or an
  error will be returned by haproxy.
2021-11-03 16:24:48 +01:00
Amaury Denoyelle
ac03ef26e8 MINOR: connection: add alternative mux_ops param for conn_install_mux_be
Add a new parameter force_mux_ops. This will be useful to specify an
alternative to the srv->mux_proto field. If non-NULL, it will be use to
force the mux protocol wether srv->mux_proto is set or not.

This argument will become useful to install a mux for non-standard
streams, most notably websocket streams.
2021-11-03 16:24:48 +01:00
Amaury Denoyelle
2454bda140 MINOR: connection: implement function to update ALPN
Implement a new function to update the ALPN on an existing connection.
on an existing connection. The ALPN from the ssl context can be checked
to update the ALPN only if it is a subset of the context value.

This method will be useful to change a connection ALPN for websocket,
must notably if the server does not support h2 websocket through the
rfc8441 Extended Connect.
2021-11-03 16:24:48 +01:00
Amaury Denoyelle
90ac605ef3 MINOR: stream/mux: implement websocket stream flag
Define a new stream flag SF_WEBSOCKET and a new cs flag CS_FL_WEBSOCKET.
The conn-stream flag is first set by h1/h2 muxes if the request is a
valid websocket upgrade. The flag is then converted to SF_WEBSOCKET on
the stream creation.

This will be useful to properly manage websocket streams in
connect_server().
2021-11-03 16:24:48 +01:00
Amaury Denoyelle
0df043608f BUG/MEDIUM: mux-h2: reject upgrade if no RFC8441 support
The RFC8441 was not respected by haproxy in regards with server support
for Extended CONNECT. The Extended CONNECT method was used to convert an
Upgrade header stream even if no SETTINGS_ENABLE_CONNECT_PROTOCOL was
received, which is forbidden by the RFC8441. In this case, the behavior
of the http/2 server is unspecified.

Fix this by flagging the connection on receiption of the RFC8441
settings SETTINGS_ENABLE_CONNECT_PROTOCOL. Extended CONNECT is thus only
be used if the flag is present. In the other case, the stream is
immediatly closed as there is no way to handle it in http/2. It results
in a http/1.1 502 or http/2 RESET_STREAM to the client side.

The protocol-upgrade regtest has been extended to test that haproxy does
not emit Extended CONNECT on servers without RFC8441 support.

It must be backported up to 2.4.
2021-11-03 16:24:48 +01:00
Amaury Denoyelle
e0c258c84d MINOR: mux-h2: add trace on extended connect usage
Add a state trace to report that a protocol upgrade is converted using
the rfc8441 Extended connect method. This is useful in regards with the
recent changes to improve http/2 websockets.
2021-11-03 11:42:02 +01:00
Tim Duesterhus
ab896ee3f7 MINOR: jwt: Make invalid static JWT algorithms an error in jwt_verify converter
It is not useful to start a configuration where an invalid static string is
provided as the JWT algorithm. Better make the administrator aware of the
suspected typo by failing to start.
2021-11-03 11:15:32 +01:00
Jaroslaw Rzesztko
c8637032a7 MINOR: vars: add "set-var" for "tcp-request connection" rules.
Session struct is already allocated when "tcp-request connection" rules
are evaluated so session-scoped variables turned out easy to support.

This resolves github issue #1408.
2021-11-02 17:58:35 +01:00
Willy Tarreau
44c5ff69ac MEDIUM: vars: make the var() sample fetch function really return type ANY
A long-standing issue was reported in issue #1215.

In short, var() was initially internally declared as returning a string
because it was not possible by then to return "any type". As such, users
regularly get trapped thinking that when they're storing an integer there,
then the integer matching method automatically applies. Except that this
is not possible since this is related to the config parser and is decided
at boot time where the variable's type is not known yet.

As such, what is done is that the output being declared as type string,
the string match will automatically apply, and any value will first be
converted to a string. This results in several issues like:

    http-request set-var(txn.foo) int(-1)
    http-request deny if { var(txn.foo) lt 0 }

not working. This is because the string match on the second line will in
fact compare the string representation of the variable against strings
"lt" and "0", none of which matches.

The doc says that the matching method is mandatory, though that's not
the case in the code due to that default string type being permissive.
There's not even a warning when no explicit match is placed, because
this happens very deep in the expression evaluator and making a special
case just for "var" can reveal very complicated.

The set-var() converter already mandates a matching method, as the
following will be rejected:

    ... if { int(12),set-var(txn.truc) 12 }

  while this one will work:

    ... if { int(12),set-var(txn.truc) -m int 12 }

As such, this patch this modifies var() to match the doc, returning the
type "any", and mandating the matching method, implying that this bogus
config which does not work:

    http-request set-var(txn.foo) int(-1)
    http-request deny if { var(txn.foo) lt 0 }

  will need to be written like this:

    http-request set-var(txn.foo) int(-1)
    http-request deny if { var(txn.foo) -m int lt 0 }

This *will* break some configs (and even 3 of our regtests relied on
this), but except those which already match string exclusively, all
other ones are already broken and silently fail (and one of the 3
regtests, the one on FIX, was bogus regarding this).

In order to fix existing configs, one can simply append "-m str"
after a "var()" in an ACL or "if" expression:

    http-request deny unless { var(txn.jwt_alg) "ES" }

  must become:

    http-request deny unless { var(txn.jwt_alg) -m str "ES" }

Most commonly, patterns such as "le", "lt", "ge", "gt", "eq", "ne" in
front of a number indicate that the intent was to match an integer,
and in this case "-m int" would be desired:

    tcp-response content reject if ! { var(res.size) gt 3800 }

  ought to become:

    tcp-response content reject if ! { var(res.size) -m int gt 3800 }

This must not be backported, but if a solution is found to at least
detect this exact condition in the generic expression parser and
emit a warning, this could probably help spot configuration bugs.

Link: https://www.mail-archive.com/haproxy@formilux.org/msg41341.html
Cc: Christopher Faulet <cfaulet@haproxy.com>
Cc: Tim Dsterhus <tim@bastelstu.be>
2021-11-02 17:28:43 +01:00
Christopher Faulet
e8f3596cd0 MINOR: stream: Improve dump of bogus streams
Stream flags and information about the HTTP txn, if defined, are now
emitted. This will help us to identify bugs when such message is reported.
2021-11-02 17:25:48 +01:00
Christopher Faulet
9ed1a0601d BUG/MEDIUM: resolvers: Track api calls with a counter to free resolutions
The kill list introduced in commit f766ec6b5 ("MEDIUM: resolvers: use a kill
list to preserve the list consistency") contains a bug. The deatch_row must
be initialized before calling resolv_process_responses() function. However,
this function is called for the dns code. The death_row is not visible from
the outside. So, it is possible to add a resolution in an uninitialized
death_row, leading to a crash.

But, with the current implementation, it is not possible to handle the
death_row in resolv_process_responses() function because, internally, the
kill list may be freed via a call to resolv_unlink_resolution(). At the end,
we are unable to determine all call chains to guarantee a safe use of the
kill list. It is a shameful observation, but unfortunatly true.

So, to make the fix simple, we track all calls to the public resolvers
api. A counter is incremented when we enter in the resolver code and
decremented when we leave it. This way, we are able to track the recursions
to init and release the kill list only once, at the edge.

Following functions are incrementing/decrementing the recurse counter:

  * resolv_trigger_resolution()
  * resolv_srvrq_expire_task()
  * resolv_link_resolution()
  * resolv_unlink_resolution()
  * resolv_detach_from_resolution_answer_items()
  * resolv_process_responses()
  * process_resolvers()
  * resolvers_finalize_config()
  * resolv_action_do_resolve()

This patch should fix the issue #1404. It must be backported everywhere the
above commit was backported.
2021-11-02 16:55:01 +01:00
Christopher Faulet
69fad00ebf BUG/MEDIUM: stream-int: Block reads if channel cannot receive more data
First of all, we must be careful here because this part was modified and
each time, this introduced a bug. But, in si_update_rx(), we must not
re-enables receives if the channel buffer cannot receive more
data. Otherwise the multiplexer will be wake up for nothing. Because the
stream is woken up when the multiplexer is waiting for more room to move on,
this may lead to a ping-pong loop between the stream and the mux.

Note that for now, it does not fix any known bug. All reported issues in
this area were fixed in another way.

This patch must be backported with a special care. Technically speaking, it
may be backported as far as 2.0.
2021-11-02 16:55:01 +01:00
William Lallemand
0f41c384ea BUG/MINOR: httpclient: use a placeholder value for Host header
A Host header must be present for http_update_host() to success.
htx_add_header(htx, ist("Host"), IST_NULL) was used but this is not a
good idea from a semantic point of view. It also tries to make a memcpy
with a len of 0, which is unrequired.

Use an ist("h") instead as a placeholder value.

This patch fixes bug #1439.
2021-11-02 15:53:09 +01:00
William Lallemand
d1187eb3e1 BUG/MINOR: httpclient/lua: misplaced luaL_buffinit()
Some luaL_buffinit() call was done before the push of the variable name,
where it seems to work correctly with lua < 5.4.3, it brokes
systematically on this version.

This patch inverts the pushstring and the buffinit.
2021-11-02 10:40:06 +01:00
Remi Tricot-Le Breton
7da35bff9f BUG/MINOR: http: http_auth_bearer fetch does not work on custom header name
The http_auth_bearer sample fetch can take a header name as parameter,
in which case it will try to extract a Bearer value out of the given
header name instead of the default "Authorization" one. In this case,
the extraction would not have worked because of a misuse of strncasecmp.
This patch fixes this by replacing the standard string functions by ist
ones.
It also properly manages the multiple spaces that could be found between
the scheme and its value.

No backport needed, that's part of JWT which is only in 2.5.

Co-authored-by: Tim Duesterhus <tim@bastelstu.be>
2021-10-29 17:40:17 +02:00
Remi Tricot-Le Breton
68c4eae87f BUG/MINOR: http: Authorization value can have multiple spaces after the scheme
As per RFC7235, there can be multiple spaces in the value of an
Authorization header, between the scheme and the actual authentication
parameters.

This can be backported to all stable versions since basic auth has almost
always been there.
2021-10-29 17:40:17 +02:00
Christopher Faulet
b0c87f1c61 BUG/MEDIUM: http-ana: Drain request data waiting the tarpit timeout expiration
When a tarpit action is performed, we must be sure to drain data from the
request channel. Otherwise, the mux on the frontend side may be blocked
because the request channel buffer is full.

This may lead to Two bugs. The first one is a HOL blocking on the H2
multiplexer. A tarpitted stream may block all the others because data are
not drained for the whole tarpit timeout. The second bug is a ping-pong loop
between the multiplexer and the stream. The mux is waiting for more space in
the channel buffer, so it wakes up the stream. And the stream systematically
re-enables receives.

This last part is not pretty clean and it will be addressed with another
fix. But draning request data is a good way to fix both bugs in same time.

This patch must be backported as far as 2.0. The legacy HTTP mode is
probably affected, but I don't know if same bugs may be experienced in this
mode.
2021-10-29 15:06:31 +02:00
Christopher Faulet
bce6db6c3c BUG/MEDIUM: resolvers: Don't recursively perform requester unlink
When a requester is unlink from a resolution, by reading the code, we can
have this call chain:

_resolv_unlink_resolution(srv->resolv_requester)
  resolv_detach_from_resolution_answer_items(resolution, requester)
    resolv_srvrq_cleanup_srv(srv)
      _resolv_unlink_resolution(srv->resolv_requester)

A loop on the resolution answer items is performed inside
resolv_detach_from_resolution_answer_items(). But by reading the code, it
seems possible to recursively unlink the same requester.

To avoid any loop at this stage, the requester clean up must be performed
before the call to resolv_detach_from_resolution_answer_items(). This way,
the second call to _resolv_unlink_resolution() does nothing and returns
immediately because the requester was already detached from the resolution.

This patch is related to the issue #1404. It must be backported as far as
2.2.
2021-10-29 15:06:31 +02:00
Christopher Faulet
e76b4f055d BUG/MEDIUM: mux-h1: Perform a connection shutdown when the h1c is released
When the H1 connection is released, a connection shutdown is now performed.
If it was already performed when the stream was detached, this action has no
effect. But it is mandatory, when an idle H1C is released. Otherwise the
xprt and the socket shutdown is never perfmed. It is especially important
for SSL client connections, because it is the only way to perform a clean
SSL shutdown.

Without this patch, SSL_shutdown is never called, preventing, among other
things, the SSL session caching.

This patch depends on the commit "BUG/MINOR: mux-h1: Save shutdown mode if
the shutdown is delayed". It should be backported as far as 2.0.
2021-10-29 15:06:31 +02:00
Christopher Faulet
a85c522d42 BUG/MINOR: mux-h1: Save shutdown mode if the shutdown is delayed
The connection shutdown may be delayed if there are pending outgoing
data. The action is performed once data are fully sent. In this case the
mode (dirty/clean) was lost and a clean shutdown was always performed. Now,
the mode is saved to be sure to perform the connection shutdown using the
right mode. To do so, H1C_F_ST_SILENT_SHUT flag is introduced.

This patch should be backported as far as 2.0.
2021-10-29 15:06:31 +02:00
William Lallemand
bd5739e93e MINOR: httpclient/lua: handle the streaming into the lua applet
With this feature the lua implementation of the httpclient is now able
to stream a payload larger than an haproxy buffer.

The hlua_httpclient_send() function is now split into:

hlua_httpclient_send() which initiate the httpclient and parse the lua
parameters

hlua_httpclient_snd_yield() which will send the request and be called
again to stream the request if the body is larger than an haproxy buffer

hlua_httpclient_rcv_yield() which will receive the response and store it
in the lua buffer.
2021-10-28 16:24:14 +02:00
William Lallemand
0da616ee18 MINOR: httpclient: request streaming with a callback
This patch add a way to handle HTTP requests streaming using a
callback.

The end of the data must be specified by using the "end" parameter in
httpclient_req_xfer().
2021-10-28 16:24:14 +02:00
Tim Duesterhus
8aee3030f8 CLEANUP: hlua: Remove obsolete branch in hlua_alloc()
This branch is no longer required, because the `!nsize` case is handled for any
value of `ptr` now.

see 22586524e3
see a5efdff93c
2021-10-28 09:45:48 +02:00
Tim Duesterhus
e0c1d749a8 CLEANUP: jwt: Remove the use of a trash buffer in jwt_jwsverify_rsa_ecdsa()
`trash` was completely unused within this function.
2021-10-28 09:45:48 +02:00
Tim Duesterhus
c87d3c21bf CLEANUP: jwt: Remove the use of a trash buffer in jwt_jwsverify_hmac()
The OpenSSL documentation (https://www.openssl.org/docs/man1.1.0/man3/HMAC.html)
specifies:

> It places the result in md (which must have space for the output of the hash
> function, which is no more than EVP_MAX_MD_SIZE bytes). If md is NULL, the
> digest is placed in a static array. The size of the output is placed in
> md_len, unless it is NULL. Note: passing a NULL value for md to use the
> static array is not thread safe.

`EVP_MAX_MD_SIZE` appears to be defined as `64`, so let's simply use a stack
buffer to avoid the whole memory management.
2021-10-28 09:45:48 +02:00
Willy Tarreau
14e7f29e86 MINOR: protocols: replace protocol_by_family() with protocol_lookup()
At a few places we were still using protocol_by_family() instead of
the richer protocol_lookup(). The former is limited as it enforces
SOCK_STREAM and a stream protocol at the control layer. At least with
protocol_lookup() we don't have this limitationn. The values were still
set for now but later we can imagine making them configurable on the
fly.
2021-10-27 17:41:07 +02:00
Willy Tarreau
e3b4518414 MINOR: protocols: make use of the protocol type to select the protocol
Instead of using sock_type and ctrl_type to select a protocol, let's
make use of the new protocol type. For now they always match so there
is no change. This is applied to address parsing and to socket retrieval
from older processes.
2021-10-27 17:31:20 +02:00
Willy Tarreau
337edfdbc5 MINOR: protocols: add a new protocol type selector
The protocol selection is currently performed based on the family,
control type and socket type. But this is often not enough, as both
only provide DGRAM or STREAM, leaving few variants. Protocols like
SCTP for example might be indistinguishable from TCP here. Same goes
for TCP extensions like MPTCP.

This commit introduces a new enum proto_type that is placed in each
and every protocol definition, that will usually more or less match
the sock_type, but being an enum, will support additional values.
2021-10-27 17:05:36 +02:00
Willy Tarreau
bdcee7fbc9 DEBUG: protocol: yell loudly during registration of invalid sock_domain
The test on the sock_domain is a bit useless because the protocols are
registered at boot time, and the test silently fails and returns no
error. Use a BUG_ON() instead to make sure to catch such bugs in the
code if any.
2021-10-27 15:50:49 +02:00
Christopher Faulet
52b28d2f30 BUILD: log: Fix compilation without SSL support
When compiled without SSL support, a variable is reported as not used by
GCC.

src/log.c: In function ‘sess_build_logline’:
src/log.c:2056:36: error: unused variable ‘conn’ [-Werror=unused-variable]
 2056 |                 struct connection *conn;
      |                                    ^~~~

This does not need to be backported.
2021-10-27 12:00:15 +02:00
Christopher Faulet
16f16afb31 MINOR: stream: Use backend stream-interface dst address instead of target_addr
target_addr field in the stream structure is removed. The backend
stream-interface destination address is now used.
2021-10-27 11:35:59 +02:00
Christopher Faulet
888cd700f4 MINOR: tcp-sample: Add samples to get original info about client connection
Because source and destination address of the client connection are now
updated at the appropriated level (connection, session or stream), original
info about the client connection are preserved.  src/src_port/src_is_local
and dst/dst_port/dst_is_local return current info about the client
connection. It is the info at the highest available level. Most of time, the
stream. Any tcp/http rules may alter this info.

To get original info, "fc_" prefix must be added. For instance
"fc_src". Here, only "tcp-request connection" rules may alter source and
destination address/port.
2021-10-27 11:35:59 +02:00
Christopher Faulet
1e83b70409 MINOR: tcp-act: Add set-src/set-src-port for "tcp-request content" rules
This patch was reverted because it was inconsitent to change connection
addresses at stream level. Especially in HTTP because all requests was
affected by this change and not only the current one. In HTTP/2, it was
worse. Several streams was able to change the connection addresses at the
same time.

It is no longer an issue, thanks to recent changes. With multi-level client
source and destination addresses, it is possible to limit the change to the
current request. Thus this patch can be reintroduced.

If it possible to set source IP/Port from "tcp-request connection",
"tcp-request session" and "http-request" rules but not from "tcp-request
content" rules. There is no reason for this limitation and it may be a
problem for anyone wanting to call a lua fetch to dynamically set source
IP/Port from a TCP proxy. Indeed, to call a lua fetch, we must have a
stream. And there is no stream when "tcp-request connection/session" rules
are evaluated.

Thanks to this patch, "set-src" and "set-src-port" action are now supported
by "tcp_request content" rules.

This patch is related to the issue #1303.
2021-10-27 11:35:59 +02:00
Christopher Faulet
d69377eb02 MEDIUM: tcp-act: Set addresses at the apprioriate level in set-(src/dst) actions
When client source or destination addresses are changed via a tcp/http
action, we update addresses at the appropriate level. When "tcp-request
connection" rules are evaluated, we update addresses at the connection
level. When "tcp-request session" rules is evaluated, we update those at the
session level. And finally, when "tcp-request content" or "http-request"
rules are evaluated, we update the addresses at the stream level.

The same is performed when source or destination ports are changed.

Of course, for now, not all level are supported. But thanks to this patch,
it will be possible.
2021-10-27 11:35:59 +02:00
Christopher Faulet
e83e8821bb MEDIUM: connection: Assign session addresses when NetScaler CIP proto is parsed
Just like for the PROXY protocol, when the NetScaler Client IP insertion
header is received, the retrieved client source and destination addresses
are set at the session level. This leaves those at the connection level
intact.
2021-10-27 11:35:59 +02:00
Christopher Faulet
c105c9213f MEDIUM: connection: Assign session addresses when PROXY line is received
When PROXY protocol line is received, the retrieved client source and
destination addresses are set at the session level. This leaves those at the
connection level intact.
2021-10-27 11:35:59 +02:00
Christopher Faulet
a8e95fed43 MEDIUM: backend: Rely on addresses at stream level to init server connection
Client source and destination addresses at stream level are used to initiate
the connections to a server. For now, stream-interface addresses are never
set. So, thanks to the fallback mechanism, no changes are expected with this
patch. But its purpose is to rely on addresses at the appropriate level when
set instead of those at the connection level.
2021-10-27 11:35:59 +02:00
Christopher Faulet
b097aef2ef MEDIUM: connection: Rely on addresses at stream level to make proxy line
If the stream exists, the frontend stream-interface is used to get the
client source and destination addresses when the proxy line is built. For
now, stream-interface or session addresses are never set. So, thanks to the
fallback mechanism, no changes are expected with this patch. But its purpose
is to rely on addresses at the appropriate level when set instead of those
at the connection level.
2021-10-27 11:35:57 +02:00
Christopher Faulet
c03be1a129 MEDIUM: tcp-sample: Rely on addresses at the appropriate level in tcp samples
In src, src-port, dst and dst-port sample fetches, the client source and
destination addresses are retrieved from the appropriate level. It means
that, if the stream exits, we use the frontend stream-interface to get the
client source and destination addresses. Otherwise, the session is used. For
now, stream-interface or session addresses are never set. So, thanks to the
fallback mechanism, no changes are expected with this patch. But its purpose
is to rely on addresses at the appropriate level when set instead of those
at the connection level.
2021-10-27 11:34:21 +02:00
Christopher Faulet
568008d199 MINOR: mux-fcgi: Rely on client addresses at stream level to set default params
Client source and destination addresses at stream level are now used to emit
SERVER_NAME/SERVER_PORT and REMOTE_ADDR/REMOTE_PORT parameters. For now,
stream-interface addresses are never set. So, thanks to the fallback
mechanism, no changes are expected with this patch. But its purpose is to
rely on addresses at the stream level, when set, instead of those at the
connection level.
2021-10-27 11:34:21 +02:00
Christopher Faulet
6fc817a28e MINOR: http-fetch: Rely on addresses at stream level in HTTP sample fetches
Client source and destination addresses at stream level are now used to
compute base32+src and url32+src hashes. For now, stream-interface addresses
are never set. So, thanks to the fallback mechanism, no changes are expected
with this patch. But its purpose is to rely on addresses at the stream
level, when set, instead of those at the connection level.
2021-10-27 11:34:21 +02:00
Christopher Faulet
8a104ba3e0 MINOR: http-ana: Rely on addresses at stream level to set xff and xot headers
Client source and destination addresses at stream level are now used to emit
X-Forwarded-For and X-Original-To headers. For now, stream-interface addresses
are never set. So, thanks to the fallback mechanism, no changes are expected
with this patch. But its purpose is to rely on addresses at the stream level,
when set, instead of those at the connection level.
2021-10-27 11:34:21 +02:00
Christopher Faulet
c269f664bd MINOR: session: Rely on client source address at session level to log error
When an embryonic session is killed, if no log format is defined for this
error, a generic error is emitted. When this happens, we now rely on the
session to get the client source address. For now, session addresses are
never set. So, thanks to the fallback mechanism, no changes are expected
with this patch. But its purpose is to rely on addresses at the session
level when set instead of those at the connection level.
2021-10-27 11:34:21 +02:00
Christopher Faulet
f9c4d8d5be MINOR: log: Rely on client addresses at the appropriate level to log messages
When a log message is emitted, if the stream exits, we use the frontend
stream-interface to retrieve the client source and destination
addresses. Otherwise, the session is used. For now, stream-interface or
session addresses are never set. So, thanks to the fallback mechanism, no
changes are expected with this patch. But its purpose is to rely on
addresses at the appropriate level when set instead of those at the
connection level.
2021-10-27 11:34:21 +02:00
Christopher Faulet
c9c8e1cc01 MINOR: frontend: Rely on client src and dst addresses at stream level
For now, stream-interface or session addresses are never set. So, thanks to
the fallback mechanism, no changes are expected with this patch. But its
purpose is to rely on the client addresses at the stream level, when set,
instead of those at the connection level. The addresses are retrieved from
the frontend stream-interface.
2021-10-27 11:34:21 +02:00
Christopher Faulet
859ff84f8c MINOR: stream-int: Add src and dst addresses to the stream-interface
For now, these addresses are never set. But the idea is to be able to set, at
least first, the client source and destination addresses at the stream level
without updating the session or connection ones.

Of course, because these addresses are carried by the strream-interface, it
would be possible to set server source and destination addresses at this level
too.

Functions to fill these addresses have been added: si_get_src() and
si_get_dst(). If not already set, these functions relies on underlying
layers to fill stream-interface addresses. On the frontend side, the session
addresses are used if set, otherwise the client connection ones are used. On
the backend side, the server connection addresses are used.

And just like for sessions and conncetions, si_src() and si_dst() may be used to
get source and destination addresses or the stream-interface. And, if not set,
same mechanism as above is used.
2021-10-27 11:34:21 +02:00
Christopher Faulet
f46e1ea1ad MINOR: session: Add src and dst addresses to the session
For now, these addresses are never set. But the idea is to be able to set
client source and destination addresses at the session level without
updating the connection ones.

Functions to fill these addresses have been added: sess_get_src() and
sess_get_dst(). If not already set, these functions relies on
conn_get_src() and conn_get_dst() to fill session addresses.

And just like for conncetions, sess_src() and sess_dst() may be used to get
source and destination addresses. However, if not set, the corresponding
address from the underlying client connection is returned. When this
happens, the addresses is filled in the connection object.
2021-10-27 11:34:21 +02:00
Christopher Faulet
e6465b3b75 CLEANUP: lua: Use a const address to retrieve info about a connection
hlua_socket_info() only extracts information about an address, there is no
reason to not use a const.
2021-10-27 11:34:21 +02:00
Christopher Faulet
4bfce397b8 CLEANUP: connection: No longer export make_proxy_line_v1/v2 functions
These functions are only used by the make_proxy_line() function. Thus, we
can turn them as static.
2021-10-27 11:34:14 +02:00
vishnu
0af4bd7beb BUG/MEDIUM: lua: fix invalid return types in hlua_http_msg_get_body
hlua_http_msg_get_body must return either a Lua string or nil. For some
HTTPMessage objects, HTX_BLK_EOT blocks are also present in the HTX buffer
along with HTX_BLK_DATA blocks. In such cases, _hlua_http_msg_dup will start
copying data into a luaL_Buffer until it encounters an HTX_BLK_EOT. But then
instead of pushing neither the luaL_Buffer nor `nil` to the Lua stack, the
function will return immediately. The end result will be that the caller of
the HTTPMessage.body() method from a Lua filter will see whatever object was
on top of the stack as return value. It may be either a userdata object if
HTTPMessage.body() was called with only two arguments, or the third argument
itself if called with three arguments. Hence HTTPMessage.body() would return
either nil, or HTTPMessage body as Lua string, or a userdata objects, or
number.

This fix ensure that HTTPMessage.body() will always return either a string
or nil.

Reviewed-by: Christopher Faulet <cfaulet@haproxy.com>
2021-10-27 11:04:16 +02:00
William Lallemand
6137a9ee20 MINOR: httpclient/lua: return an error when it can't generate the request
Add a check during the httpclient request generation which yield an lua
error when the generation didn't work. The most common case is the lack
of space in the buffer, it can because of too much headers or a too big
body.
2021-10-27 10:19:58 +02:00
William Lallemand
dc2cc9008b MINOR: httpclient/lua: support more HTTP methods
Add support for HEAD/PUT/POST/DELETE method with the lua httpclient.

This patch use the httpclient_req_gen() function with a different meth
parameter to implement this.

Also change the reg-test to support a POST request with a body.
2021-10-27 10:19:49 +02:00
William Lallemand
dec25c3e14 MINOR: httpclient: support payload within a buffer
httpclient_req_gen() takes a payload argument which can be use to put a
payload in the request. This payload can only fit a request buffer.

This payload can also be specified by the "body" named parameter within
the lua. httpclient.

It is also used within the CLI httpclient when specified as a CLI
payload with "<<".
2021-10-27 10:19:41 +02:00
Amaury Denoyelle
8e358af8a3 MINOR: connection: remove unneeded memset 0 for idle conns
Remove the zeroing of an idle connection node on remove from a tree.
This is not needed and should improve slightly the performance of idle
connection usage. Besides, it breaks the memory poisoning feature.
2021-10-22 17:29:25 +02:00
Amaury Denoyelle
926712ab2d MINOR: backend: improve perf with tcp proxies skipping idle conns
Skip the hash connection calcul when reuse must not be used in
connect_server() : this is the case for TCP proxies. This should result
in slightly better performance when using this use-case.
2021-10-22 17:28:29 +02:00
Amaury Denoyelle
aee4fdbd17 BUG/MINOR: backend: fix improper insert in avail tree for always reuse
In connect_server(), if http-reuse always is set, the backend connection
is inserted into the available tree as soon as created. However, the
hash connection field is only set later at the end of the function.

This seems to have no impact as the hash connection field is always
position before a lookup. However, this is not a proper usage of ebmb
API. Fix this by setting the hash connection field before the insertion
into the avail tree.

This must be backported up to 2.4.
2021-10-22 17:26:22 +02:00
Amaury Denoyelle
1252b6f951 MINOR: backend: add traces for idle connections reuse
Add traces in connect_server() to debug idle connection reuse. These
are attached to stream trace module, as it's already in use in
backend.c with the macro TRACE_SOURCE.
2021-10-22 17:21:14 +02:00
Willy Tarreau
1de51eb727 MINOR: memprof: add one pointer size to the size of allocations
The current model causes an issue when trying to spot memory leaks,
because malloc(0) or realloc(0) do not count as allocations since we only
account for the application-usable size. This is the problem that made
issue #1406 not to appear as a leak.

What we're doing now is to account for one extra pointer (the one that
memory allocators usually place before the returned area), so that a
malloc(0) will properly account for 4 or 8 bytes. We don't need something
exact, we just need something non-zero so that a realloc(X) followed by a
realloc(0) without a free() gives a small non-zero result.

It was verified that the results are stable including in the presence
of lots of malloc/realloc/free as happens when stressing Lua.

It would make sense to backport this to 2.4 as it helps in bug reports.
2021-10-22 16:40:09 +02:00
Willy Tarreau
8cce4d79ff MINOR: memprof: report the delta between alloc and free on realloc()
realloc() calls are painful to analyse because they have two non-zero
columns and trying to spot a leaking one requires a bit of scripting.
Let's simply append the delta at the end of the line when alloc and
free are non-nul.

It would be useful to backport this to 2.4 to help with bug reports.
2021-10-22 16:40:09 +02:00
Willy Tarreau
a5efdff93c BUG/MEDIUM: lua: fix memory leaks with realloc() on non-glibc systems
In issue #1406, Lev Petrushchak reported a nasty memory leak on Alpine
since haproxy 2.4 when using Lua, that memory profiling didn't detect.
After inspecting the code and Lua's code, it appeared that Lua's default
allocator does an explicit free() on size zero, while since 2.4 commit
d36c7fa5e ("MINOR: lua: simplify hlua_alloc() to only rely on realloc()"),
haproxy only calls realloc(ptr,0) that performs a free() on glibc but not
on other systems as it's not required by POSIX...

This patch reinstalls the explicit test for nsize==0 to call free().

Thanks to Lev for the very documented report, and to Tim for the links
to a musl thread on the same subject that confirms the diagnostic.

This must be backported to 2.4.
2021-10-22 16:40:09 +02:00
Frédéric Lécaille
46be7e92b4 MINOR: quic: Increase the size of handshake RX UDP datagrams
Some browsers may send Initial packets with sizes greater than 1252 bytes
(QUIC_INITIAL_IPV4_MTU). Let us increase this size limit up to 2048 bytes.
Also use this size for "max_udp_payload_size" transport parameter to limit
the size of the datagrams we want to receive.
2021-10-22 15:48:19 +02:00
Willy Tarreau
dbb0bb59e3 CLEANUP: resolvers: get rid of single-iteration loop in resolv_get_ip_from_response()
In issue 1424 Coverity reports that the loop increment is unreachable,
which is true, the list_for_each_entry() was replaced with a for loop,
but it was already not needed and was instead used as a convenient
construct for a single iteration lookup. Let's get rid of all this
now and replace the loop with an "if" statement.
2021-10-22 08:34:14 +02:00
Willy Tarreau
0b22247606 MINOR: mux-h2: perform a full cycle shutdown+drain on close
While in H1 we can usually close quickly, in H2 a client might be sending
window updates or anything while we're sending a GOAWAY and the pending
data in the socket buffers at the moment the close() is performed on the
socket results in the output data being lost and an RST being emitted.

One example where this happens easily is with h2spec, which randomly
reports connection resets when waiting for a GOAWAY while haproxy sends
it, as seen in issue #1422. With h2spec it's not window updates that are
causing this but the fact that h2spec has to upload the payload that
comes with invalid frames to accommodate various implementations, and
does that in two different segments. When haproxy aborts on the invalid
frame header, the payload was not yet received and causes an RST to
be sent.

Here we're dealing with this two ways:
  - we perform a shutdown(WR) on the connection to forcefully push pending
    data on a front connection after the xprt is shut and closed ;
  - we drain pending data
  - then we close

This totally solves the issue with h2spec, and the extra cost is very
low, especially if we consider that H2 connections are not set up and
torn down often. This issue was never observed with regular clients,
most likely because this pattern does not happen in regular traffic.

After more testing it could make sense to backport this, at least to
avoid reporting errors on h2spec tests.
2021-10-21 22:24:31 +02:00
Willy Tarreau
20b622e04b MINOR: connection: add a new CO_FL_WANT_DRAIN flag to force drain on close
Sometimes we'd like to do our best to drain pending data before closing
in order to save the peer from risking to receive an RST on close.

This adds a new connection flag CO_FL_WANT_DRAIN that is used to
trigger a call to conn_ctrl_drain() from conn_ctrl_close(), and the
sock_drain() function ignores fd_recv_ready() if this flag is set,
in order to catch latest data. It's not used for now.
2021-10-21 21:48:23 +02:00
Willy Tarreau
e6dc7a0129 BUG/MINOR: mux-h2: do not prevent from sending a final GOAWAY frame
Some checks were added by commit 9a3d3fcb5 ("BUG/MAJOR: mux-h2: Don't try
to send data if we know it is no longer possible") to make sure we don't
loop forever trying to send data that cannot leave. But one of the
conditions there is not correct, the one relying on H2_CS_ERROR2. Indeed,
this state indicates that the error code was serialized into the mux
buffer, and since the test is placed before trying to send the data to
the socket, if the connection states only contains a GOAWAY frame, it
may refrain from sending and may close without sending anything. It's
not dramatic, as GOAWAY reports connection errors in situations where
delivery is not even certain, but it's cleaner to make sure the error
is properly sent, and it avoids upsetting h2spec, as seen in github
issue #1422.

Given that the patch above was backported as far as 1.8, this patch will
also have to be backported that far.

Thanks to Ilya for reporting this one.
2021-10-21 17:37:22 +02:00
Willy Tarreau
3193eb9907 BUG/MINOR: task: do not set TASK_F_USR1 for no reason
This applicationn specific flag was added in 2.4-dev by commit 6fa8bcdc7
("MINOR: task: add an application specific flag to the state: TASK_F_USR1")
to help preserve a the idle connections status across wakeup calls. While
the code to do this was OK for tasklets, it was wrong for tasks, as in an
effort not to lose it when setting the RUNNING flag (that tasklets don't
have), it ended up being inconditionally set. It just happens that for now
no regular tasks use it, only tasklets.

This fix makes sure we always atomically perform (state & flags | running)
there, using a CAS. It also does it for tasklets because it was possible
to lose some such flags if set by another thread, even though this should
not happen with current code. In order to make the code more readable (and
avoid the previous mistake of repeated flags in the bit field), a new
TASK_PERSISTENT aggregate was declared in task.h for this.

In practice the CAS is cheap here because task states are stable or
convergent so the loop will almost never be taken.

This should be backported to 2.4.
2021-10-21 16:17:29 +02:00
Willy Tarreau
dde1b4499a OPTIM: dns: use an atomic check for the list membership
The crash that was fixed by commit 7045590d8 ("BUG/MAJOR: dns: attempt
to lock globaly for msg waiter list instead of use barrier") was now
completely analysed and confirmed to be partially a result of the
debugging code added to LIST_INLIST(), which was looking at both
pointers and their reciprocals, and that, if used in a concurrent
context, could perfectly return false if a neighbor was being added or
removed while the current one didn't change, allowing the LIST_APPEND
to fail.

As the LIST API was not designed to be used in a concurrent context,
we should not rely on LIST_INLIST() but on the newly introduced
LIST_INLIST_ATOMIC().

This patch simply reverts the commit above to switch to the new test,
saving a lock during potentially long operations. It was verified that
the check doesn't fail anymore.

It is unsure what the performance impact of the fix above could be in
some contexts. If any performance regression is observed, it could make
sense to backport this patch, along with the previous commit introducing
the LIST_INLIST_ATOMIC() macro.
2021-10-21 15:28:42 +02:00
Willy Tarreau
dcb696cd31 MEDIUM: resolvers: hash the records before inserting them into the tree
We're using an XXH32() on the record to insert it into or look it up from
the tree. This way we don't change the rest of the code, the comparisons
are still made on all fields and the next node is visited on mismatch. This
also allows to continue to use roundrobin between identical nodes.

Just doing this is sufficient to see the CPU usage go down from ~60-70% to
4% at ~2k DNS requests per second for farm with 300 servers. A larger
config with 12 backends of 2000 servers each shows ~8-9% CPU for 6-10000
DNS requests per second.

It would probably be possible to go further with multiple levels of indexing
but it's not worth it, and it's important to remember that tree nodes take
space (the struct answer_list went back from 576 to 600 bytes).
2021-10-21 08:29:02 +02:00
Willy Tarreau
7893ae117f MEDIUM: resolvers: replace the answer_list with a (flat) tree
With SRV records, a huge amount of time is spent looking for records
by walking long lists. It is possible to reduce this by indexing values
in trees instead. However the whole code relies a lot on the list
ordering, and even implements some round-robin on it to distribute IP
addresses to servers.

This patch starts carefully by replacing the list with a an eb32 tree
that is still used like a list, with a constant key 0. Since ebtrees
preserve insertion order for duplicates, the tree walk visits the nodes
in the exact same order it did with the lists. This allows to implement
the required infrastructure without changing the behavior.
2021-10-21 08:02:08 +02:00
Willy Tarreau
a89c19127d BUG/MEDIUM: checks: fix the starting thread for external checks
When cleaning up the code to remove most explicit task masks in commit
beeabf531 ("MINOR: task: provide 3 task_new_* wrappers to simplify the
API"), a mistake was done with the external checks where the call does
task_new_on(1) instead of task_new_on(0) due to the confusion with the
previous mask 1.

No backport is needed as that's only 2.5-dev.
2021-10-20 18:43:30 +02:00
Willy Tarreau
6878f80427 MEDIUM: resolvers: remove the last occurrences of the "safe" argument
This one was used to indicate whether the callee had to follow particularly
safe code path when removing resolutions. Since the code now uses a kill
list, this is not needed anymore.
2021-10-20 17:54:27 +02:00
Willy Tarreau
f766ec6b53 MEDIUM: resolvers: use a kill list to preserve the list consistency
When scanning resolution.curr it's possible to try to free some
resolutions which will themselves result in freeing other ones. If
one of these other ones is exactly the next one in the list, the list
walk visits deleted nodes and causes memory corruption, double-frees
and so on. The approach taken using the "safe" argument to some
functions seems to work but it's extremely brittle as it is required
to carefully check all call paths from process_ressolvers() and pass
the argument to 1 there to refrain from deleting entries, so the bug
is very likely to come back after some tiny changes to this code.

A variant was tried, checking at various places that the current task
corresponds to process_resolvers() but this is also quite brittle even
though a bit less.

This patch uses another approach which consists in carefully unlinking
elements from the list and deferring their removal by placing it in a
kill list instead of deleting them synchronously. The real benefit here
is that the complexity only has to be placed where the complications
are.

A thread-local list is fed with elements to be deleted before scanning
the resolutions, and it's flushed at the end by picking the first one
until the list is empty. This way we never dereference the next element
and do not care about its presence or not in the list. One function,
resolv_unlink_resolution(), is exported and used outside, so it had to
be modified to use this list as well. Internal code has to use
_resolv_unlink_resolution() instead.
2021-10-20 17:54:22 +02:00
Willy Tarreau
aae7320b0d CLEANUP: resolvers: replace all LIST_DELETE with LIST_DEL_INIT
The code as it is uses crossed lists between many elements, and at
many places the code relies on list iterators or emptiness checks,
which does not work with only LIST_DELETE. Further, it is quite
difficult to place debugging code and checks in the current situation,
and gdb is helpless.

This code replaces all LIST_DELETE calls with LIST_DEL_INIT so that
it becomes possible to trust the lists.
2021-10-20 17:54:14 +02:00
Willy Tarreau
239675e4a9 CLEANUP: resolvers: simplify resolv_link_resolution() regarding requesters
This function allocates requesters by hand for each and every type. This
is complex and error-prone, and it doesn't even initialize the list part,
leaving dangling pointers that complicate debugging.

This patch introduces a new function resolv_get_requester() that either
returns the current pointer if valid or tries to allocate a new one and
links it to its destination. Then it makes use of it in the function
above to clean it up quite a bit. This allows to remove complicated but
unneeded tests.
2021-10-20 17:54:01 +02:00
Willy Tarreau
48664c048d CLEANUP: always initialize the answer_list
Similar to the previous patch, the answer's list was only initialized the
first time it was added to a list, leading to bogus outdated pointer to
appear when debugging code is added around it to watch it. Let's make
sure it's always initialized upon allocation.
2021-10-20 17:53:54 +02:00
Willy Tarreau
25e010906a BUG/MEDIUM: resolvers: always check a valid item in query_list
The query_list is physically stored in the struct resolution itself,
so we have a list that contains a list to items stored in itself (and
there is a single item). But the list is first initialized in
resolv_validate_dns_response(), while it's scanned in
resolv_process_responses() later after calling the former. First,
this results in crashes as soon as the code is instrumented a little
bit for debugging, as elements from a previous incarnation can appear.

But in addition to this, the presence of an element is checked by
verifying that the return of LIST_NEXT() is not NULL, while it may
never be NULL even for an empty list, resulting in bugs or crashes
if the number of responses does not match the list's contents. This
is easily triggered by testing for the list non-emptiness outside of
the function.

Let's make sure the list is always correct, i.e. it's initialized to
an empty list when the structure is allocated, elements are checked by
first verifying the list is not empty, they are deleted once checked,
and in any case at end so that there are no dangling pointers.

This should be backported, but only as long as the patch fits without
modifications, as adaptations can be risky there given that bugs tend
to hide each other.
2021-10-20 17:53:35 +02:00
Willy Tarreau
10c1a8c3bd BUILD: resolvers: avoid a possible warning on null-deref
Depending on the code that precedes the loop, gcc may emit this warning:

  src/resolvers.c: In function 'resolv_process_responses':
  src/resolvers.c:1009:11: warning: potential null pointer dereference [-Wnull-dereference]
   1009 |  if (query->type != DNS_RTYPE_SRV && flags & DNS_FLAG_TRUNCATED) {
        |      ~~~~~^~~~~~

However after carefully checking, r_res->header.qdcount it exclusively 1
when reaching this place, which forces the for() loop to enter for at
least one iteration, and <query> to be set. Thus there's no code path
leading to a null deref. It's possibly just because the assignment is
too far and the compiler cannot figure that the condition is always OK.
Let's just mark it to please the compiler.
2021-10-20 17:53:35 +02:00
Willy Tarreau
2acc160c05 CLEANUP: resolvers: do not export resolv_purge_resolution_answer_records()
This code is dangerous enough that we certainly don't want external code
to ever approach it, let's not export unnecessary functions like this one.
It was made static and a comment was added about its purpose.
2021-10-20 17:52:50 +02:00
Willy Tarreau
2a67aa0a51 BUG/MAJOR: resolvers: add other missing references during resolution removal
There is a fundamental design bug in the resolvers code which is that
a list of active resolutions is being walked to try to delete outdated
entries, and that the code responsible for removing them also removes
other elements, including the next one which will be visited by the
list iterator. This randomly causes a use-after-free condition leading
to crashes, infinite loops and various other issues such as random memory
corruption.

A first fix for the memory fix for this was brought by commit 0efc0993e
("BUG/MEDIUM: resolvers: Don't release resolution from a requester
callbacks"). While preparing for more fixes, some code was factored by
commit 11c6c3965 ("MINOR: resolvers: Clean server in a dedicated function
when removing a SRV item"), which inadvertently passed "0" as the "safe"
argument all the time, missing one case of removal protection, instead
of always using "safe". This patch reintroduces the correct argument.

This must be backported with all fixes above.

Cc: Christopher Faulet <cfaulet@haproxy.com>
2021-10-20 17:52:36 +02:00
Willy Tarreau
62e467c667 DEBUG: dns: add a few more BUG_ON at sensitive places
A few places have been caught triggering late bugs recently, always cases
of use-after-free because a freed element was still found in one of the
lists. This patch adds a few checks for such elements in dns_session_free()
before the final pool_free() and dns_session_io_handler() before adding
elements to lists to make sure they remain consistent. They do not trigger
anymore now.
2021-10-20 17:52:17 +02:00
Willy Tarreau
b56a878950 CLEANUP: dns: always detach the appctx from the dns session on release
When dns_session_release() calls dns_session_free(), it was shown that
it might still be attached there:

  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  0x00000000006437d7 in dns_session_free (ds=0x7f895439e810) at src/dns.c:768
  768             BUG_ON(!LIST_ISEMPTY(&ds->ring.waiters));
  [Current thread is 1 (Thread 0x7f895bbe2700 (LWP 31792))]
  (gdb) bt
  #0  0x00000000006437d7 in dns_session_free (ds=0x7f895439e810) at src/dns.c:768
  #1  0x0000000000643ab8 in dns_session_release (appctx=0x7f89545a4ff0) at src/dns.c:805
  #2  0x000000000062e35a in si_applet_release (si=0x7f89545a5550) at include/haproxy/stream_interface.h:236
  #3  0x000000000063150f in stream_int_shutw_applet (si=0x7f89545a5550) at src/stream_interface.c:1697
  #4  0x0000000000640ab8 in si_shutw (si=0x7f89545a5550) at include/haproxy/stream_interface.h:437
  #5  0x0000000000643103 in dns_session_io_handler (appctx=0x7f89545a4ff0) at src/dns.c:725
  #6  0x00000000006d776f in task_run_applet (t=0x7f89545a5100, context=0x7f89545a4ff0, state=81924) at src/applet.c:90
  #7  0x000000000068b82b in run_tasks_from_lists (budgets=0x7f895bbbf5c0) at src/task.c:611
  #8  0x000000000068c258 in process_runnable_tasks () at src/task.c:850
  #9  0x0000000000621e61 in run_poll_loop () at src/haproxy.c:2636
  #10 0x0000000000622328 in run_thread_poll_loop (data=0x8d7440 <ha_thread_info+64>) at src/haproxy.c:2807
  #11 0x00007f895c54a06b in start_thread () from /lib64/libpthread.so.0
  #12 0x00007f895bf3772f in clone () from /lib64/libc.so.6
  (gdb) p &ds->ring.waiters
  $1 = (struct list *) 0x7f895439e8a8
  (gdb) p ds->ring.waiters
  $2 = {
    n = 0x7f89545a5078,
    p = 0x7f89545a5078
  }
  (gdb) p ds->ring.waiters->n
  $3 = (struct list *) 0x7f89545a5078
  (gdb) p *ds->ring.waiters->n
  $4 = {
    n = 0x7f895439e8a8,
    p = 0x7f895439e8a8
  }

Let's always detach it before freeing so that it remains possible to
check the dns_session's ring before releasing it, and possibly catch
bugs.
2021-10-20 17:52:13 +02:00
Emeric Brun
7045590d8a BUG/MAJOR: dns: attempt to lock globaly for msg waiter list instead of use barrier
The barrier is insufficient here to protect the waiters list as we can
definitely catch situations where ds->waiter shows an inconsistency
whereby the element is not attached when entering the "if" block and
is already attached when attaching it later.

This patch uses a larger lock to maintain consistency. Without it the
code would crash in 30-180 minutes under heavy stress, always showing
the same problem (ds->waiter->n->p != &ds->waiter). Now it seems to
always resist, suggesting that this was indeed the problem.

This will have to be backported to 2.4.
2021-10-20 17:52:07 +02:00
Emeric Brun
d20dc21eec BUG/MAJOR: dns: tcp session can remain attached to a list after a free
Using tcp, after a session release and free, the session can remain
attached to the list of sessions with a response message waiting for
a commit (ds->waiter). This results to a use after free of this
session.

Also, on some error path and after free, a session could remain attached
to the lists of available idle/free sessions (ds->list).

This patch ensure to remove the session from those external lists
before a free.

This patch should be backported to all version including
the dns over tcp (2.4)
2021-10-20 17:52:02 +02:00
Christopher Faulet
d16e7dd0e4 BUG/MEDIUM: tcpcheck: Properly catch early HTTP parsing errors
When an HTTP response is parsed, early parsing errors are not properly
handled. When this error is reported by the multiplexer, nothing is copied
into the input buffer. The HTX message remains empty but the
HTX_FL_PARSING_ERROR flag is set. In addition CS_FL_EOI is set on the
conn-stream. This last flag must be handled to prevent subscription for
receive events. Otherwise, in the best case, a L7 timeout error is
reported. But a transient loop is also possible if a shutdown is received
because the multiplexer notifies the check of the event while the check
never handles it and waits for more data.

Now, if CS_FL_EOI flag is set on the conn-stream, expect rules are
evaluated. Any error must be handled there.

Thanks to @kazeburo for his valuable report.

This patch should fix the issue #1420. It must be backported at least to
2.4. On 2.3 and 2.2, there is no loop but the wrong error is reported (empty
response instead of invalid one). Thus it may also be backported as far as
2.2.
2021-10-20 14:35:38 +02:00
William Lallemand
34b3a93655 MINOR: httpclient/cli: access should be only done from expert mode
Only enable the usage of the CLI HTTP client in expert mode.
2021-10-19 15:02:42 +02:00
Christopher Faulet
813f913444 BUG/MEDIUM: stream: Keep FLT_END analyzers if a stream detects a channel error
If a channel error (READ_ERRO|READ_TIMEOUT|WRITE_ERROR|WRITE_TIMEOUT) is
detected by the stream, in process_stream(), FLT_END analyers must be
preserved. It is important to be sure to ends filter analysis and be able to
release the stream.

First, filters may release some ressources when FLT_END analyzers are
called. Then, the CF_FL_ANALYZE flag is used to sync end of analysis for the
request and the response. If FLT_END analyzer is ignored on a channel, this
may block the other side and freeze the stream.

This patch must be backported to all stable versions
2021-10-19 11:29:30 +02:00
Remi Tricot-Le Breton
8abed17a34 MINOR: jwt: Do not rely on enum order anymore
Replace the test based on the enum value of the algorithm by an explicit
switch statement in case someone reorders it for some reason (while
still managing not to break the regtest).
2021-10-18 16:02:31 +02:00
Remi Tricot-Le Breton
0b24d2fa45 MINOR: jwt: Empty the certificate tree during deinit
The tree in which the JWT certificates are stored was not emptied. It is
now done during deinit.
2021-10-18 16:02:28 +02:00
Willy Tarreau
75cc65356f MEDIUM: resolvers: replace bogus resolv_hostname_cmp() with memcmp()
resolv_hostname_cmp() is bogus, it is applied on labels and not plain
names, but doesn't make any distinction between length prefixes and
characters, so it compares the labels lengths via tolower() as well.
The only reason for which it doesn't break is because labels cannot
be larger than 63 bytes, and that none of the common encoding systems
have upper case letters in the lower 63 bytes, that could be turned
into a different value via tolower().

Now that all labels are stored in lower case, we don't need to burn
CPU cycles in tolower() at run time and can use memcmp() instead of
resolv_hostname_cmp(). This results in a ~22% lower CPU usage on large
farms using SRV records:

before:
  18.33%  haproxy                   [.] resolv_validate_dns_response
  10.58%  haproxy                   [.] process_resolvers
  10.28%  haproxy                   [.] resolv_hostname_cmp
   7.50%  libc-2.30.so              [.] tolower
  46.69%  total

after:
  24.73%  haproxy                     [.] resolv_validate_dns_response
   7.78%  libc-2.30.so                [.] __memcmp_avx2_movbe
   3.65%  haproxy                     [.] process_resolvers
  36.16%  total
2021-10-18 10:47:36 +02:00
Willy Tarreau
814889c28a MEDIUM: resolvers: lower-case labels when converting from/to DNS names
The whole code relies on performing case-insensitive comparison on
lookups, which is extremely inefficient.

Let's make sure that all labels to be looked up or sent are first
converted to lower case. Doing so is also the opportunity to eliminate
an inefficient memcpy() in resolv_dn_label_to_str() that essentially
runs over a few unaligned bytes at once. As a side note, that call
was dangerous because it relied on a sign-extended size taken from a
string that had to be sanitized first.

This is tagged medium because while this is 100% safe, it may cause
visible changes on the wire at the packet level and trigger bugs in
test programs.
2021-10-18 09:14:02 +02:00
Ilya Shipitsin
bd6b4be721 CLEANUP: assorted typo fixes in the code and comments
This is 27th iteration of typo fixes
2021-10-18 07:26:19 +02:00
Bjrn Jacke
20d0f50b00 MINOR: add ::1 to predefined LOCALHOST acl
The "LOCALHOST" ACL currently matches only 127.0.0.1/8. This adds the
IPv6 "::1" address to the supported patterns.
2021-10-18 07:21:28 +02:00
Tim Duesterhus
c5aa113d80 CLEANUP: Apply strcmp.cocci
This fixes the use of the various *cmp functions to use != 0 or == 0.
2021-10-18 07:17:04 +02:00
Willy Tarreau
6d19f0d837 CLEANUP: listeners: remove unreachable code in clone_listener()
Coverity reported in issue #1416 that label oom3 is not reachable in
function close_listener() added by commit 59a877dfd ("MINOR: listeners:
add clone_listener() to duplicate listeners at boot time"). The code
leading to it was removed during the development of the function, but
not the label itself.
2021-10-16 14:58:30 +02:00
Willy Tarreau
7c4c830d04 BUG/MINOR: listener: add an error check for unallocatable trash
Coverity noticed in issue #1416 that a missing allocation error was
introduced in tcp_bind_listener() with the rework of error messages by
commit ed1748553 ("MINOR: proto_tcp: use chunk_appendf() to ouput socket
setup errors"). In practice nobody will ever face it but better address
it anyway.

No backport is needed.
2021-10-16 14:54:19 +02:00
Willy Tarreau
a146289d4f BUG/MINOR: listener: fix incorrect return on out-of-memory
When the clone_listener() function was added in commit 59a877dfd
("MINOR: listeners: add clone_listener() to duplicate listeners at
boot time"), a stupid bug was introduced when splitting the error
path because while the first case where calloc fails will leave NULL
in the output value, the other cases will return the pointer to a
freed area. This was reported by Coverity in issue #1416.

In practice nobody will face it (out-of-memory while checking config),
but let's fix it.

No backport is needed.
2021-10-16 14:45:29 +02:00
Willy Tarreau
b39e47a52b BUG/MINOR: sample: fix backend direction flags consecutive to last fix
Commit 7a06ffb85 ("BUG/MEDIUM: sample: Cumulate frontend and backend
sample validity flags") introduced a typo confusing the request and
the response direction when checking for validity of a rule applied
to a backend. This was reported by Coverity in issue #1417.

This needs to be backported where the patch above is backported.
2021-10-16 14:41:09 +02:00
Amaury Denoyelle
697cfde340 BUG/MEDIUM: cpuset: fix cpuset size for FreeBSD
Fix the macro used to retrieve the max number of cpus on FreeBSD. The
MAXCPU is not properly defined in userspace and always set to 1 despite
the machine architecture. Replace it with CPU_SETSIZE.

See https://freebsd-hackers.freebsd.narkive.com/gw4BeLum/smp-in-machine-params-h#post6

Without this, the following config file is rejected on FreeBSD even if
the machine is SMP :
  global
    cpu-map 1-2 0-1

This must be backported up to 2.4.
2021-10-15 17:16:11 +02:00
Christopher Faulet
6db9a97f61 BUG/MINOR: proxy: Release ACLs and TCP/HTTP rules of default proxies
It is now possible to have TCP/HTTP rules and ACLs defined in defaults
sections. So we must try to release corresponding lists when a default proxy
is destroyed.

No backport needed.
2021-10-15 14:33:35 +02:00
Christopher Faulet
7a06ffb854 BUG/MEDIUM: sample: Cumulate frontend and backend sample validity flags
When the sample validity flags are computed to check if a sample is used in
a valid scope, the flags depending on the proxy capabilities must be
cumulated. Historically, for a sample on the request, only the frontend
capability was used to set the sample validity flags while for a sample on
the response only the backend was used. But it is a problem for listen or
defaults proxies. For those proxies, all frontend and backend samples should
be valid. However, at many place, only frontend ones are possible.

For instance, it is impossible to set the backend name (be_name) into a
variable from a listen proxy.

This bug exists on all stable versions. Thus this patch should probably be
backported. But with some caution because the code has probably changed
serveral times. Note that nobody has ever noticed this issue. So the need to
backport this patch must be evaluated for each branch.
2021-10-15 14:12:19 +02:00
Christopher Faulet
d4150ad869 MEDIUM: http-ana: Eval HTTP rules defined in defaults sections
As for TCP rules, HTTP rules from defaults section are now evaluated. These
rules are evaluated before those of the proxy. The same default ruleset
cannot be attached to the frontend and the backend. However, at this stage,
we take care to not execute twice the same ruleset. So, in theory, a
frontend and a backend could use the same defaults section. In this case,
the default ruleset is executed before all others and only once.
2021-10-15 14:12:19 +02:00
Christopher Faulet
c8016d0f58 MEDIUM: tcp-rules: Eval TCP rules defined in defaults sections
TCP rules from defaults section are now evaluated. These rules are evaluated
before those of the proxy. For L7 TCP rules, the same default ruleset cannot
be attached to the frontend and the backend. However, at this stage, we take
care to not execute twice the same ruleset. So, in theory, a frontend and a
backend could use the same defaults section. In this case, the default
ruleset is executed before all others and only once.
2021-10-15 14:12:19 +02:00
Christopher Faulet
ee08d6cc74 MEDIUM: rules/acl: Parse TCP/HTTP rules and acls defined in defaults sections
TCP and HTTP rules can now be defined in defaults sections, but only those
with a name. Because these rules may use conditions based on ACLs, ACLs can
also be defined in defaults sections.

However there are some limitations:

  * A defaults section defining TCP/HTTP rules cannot be used by a defaults
    section
  * A defaults section defining TCP/HTTP rules cannot be used bu a listen
    section
  * A defaults sections defining TCP/HTTP rules cannot be used by frontends
    and backends at the same time
  * A defaults sections defining 'tcp-request connection' or 'tcp-request
    session' rules cannot be used by backends
  * A defaults sections defining 'tcp-response content' rules cannot be used
    by frontends

The TCP request/response inspect-delay of a proxy is now inherited from the
defaults section it uses. For now, these rules are only parsed. No evaluation is
performed.
2021-10-15 14:12:19 +02:00
Christopher Faulet
6ff7de5d64 MINOR: tcpcheck: Support 2-steps args resolution in defaults sections
With the commit eaba25dd9 ("BUG/MINOR: tcpcheck: Don't use arg list for
default proxies during parsing"), we restricted the use of sample fetch in
tcpcheck rules defined in a defaults section to those depending on explicit
arguments only. This means a tcpcheck rules defined in a defaults section
cannot rely on argument unresolved during the configuration parsing.

Thanks to recent changes, it is now possible again.

This patch is mandatory to support TCP/HTTP rules in defaults sections.
2021-10-15 14:12:19 +02:00
Christopher Faulet
52b8a43d4e MINOR: config: No longer remove previous anonymous defaults section
When the parsing of a defaults section is started, the previous anonymous
defaults section is removed. It may be a problem with referenced defaults
sections. And because all unused defautl proxies are removed after the
configuration parsing, it is not required to remove it so early.

This patch is mandatory to support TCP/HTTP rules in defaults sections.
2021-10-15 14:12:19 +02:00
Christopher Faulet
ff556276eb MINOR: config: Finish configuration for referenced default proxies
If a not-ready default proxy is referenced by a proxy during the
configuration validity check, its configuration is also finished and
PR_FL_READY flag is set on it.

For now, the arguments resolution is the only step performed.

This patch is mandatory to support TCP/HTTP rules in defaults sections.
2021-10-15 14:12:19 +02:00
Christopher Faulet
56717803e1 MINOR: proxy: Add PR_FL_READY flag on fully configured and usable proxies
The PR_FL_READY flags must now be set on a proxy at the end of the
configuration validity check to notify it is fully configured and may be
safely used.

For now there is no real usage of this flag. But it will be usefull for
referenced default proxies to finish their configuration only once.

This patch is mandatory to support TCP/HTTP rules in defaults sections.
2021-10-15 14:12:19 +02:00
Christopher Faulet
27c8d20451 MINOR: proxy: Be able to reference the defaults section used by a proxy
A proxy may now references the defaults section it is used. To do so, a
pointer on the default proxy was added in the proxy structure. And a
refcount must be used to track proxies using a default proxy. A default
proxy is destroyed iff its refcount is equal to zero and when it drops to
zero.

All this stuff must be performed during init/deinit staged for now. All
unreferenced default proxies are removed after the configuration parsing.

This patch is mandatory to support TCP/HTTP rules in defaults sections.
2021-10-15 14:12:19 +02:00
Christopher Faulet
b40542000d MEDIUM: proxy: Warn about ambiguous use of named defaults sections
It is now possible to designate the defaults section to use by adding a name
of the corresponding defaults section and referencing it in the desired
proxy section. However, this introduces an ambiguity. This named defaults
section may still be implicitly used by other proxies if it is the last one
defined. In this case for instance:

  default common
    ...

  default frt from common
    ...

  default bck from common
    ...

  frontend fe from frt
    ...

  backend be from bck
    ...

  listen stats
    ...

Here, it is not really obvious the last section will use the 'bck' defaults
section. And it is probably not the expected behaviour. To help users to
properly configure their haproxy, a warning is now emitted if a defaults
section is explicitly AND implicitly used. The configuration manual was
updated accordingly.

Because this patch adds a warning, it should probably not be backported to
2.4. However, if is is backported, it depends on commit "MINOR: proxy:
Introduce proxy flags to replace disabled bitfield".
2021-10-15 14:12:19 +02:00
Christopher Faulet
37a9e21a3a MINOR: sample/arg: Be able to resolve args found in defaults sections
It is not yet used but thanks to this patch, it will be possible to resolve
arguments found in defaults sections. However, there is some restrictions:

  * For FE (frontend) or BE (backend) arguments, if the proxy is explicity
    defined, there is no change. But for implicit proxy (not specified), the
    argument points on the default proxy. when a sample fetch using this
    kind of argument is evaluated, the default proxy replaced by the current
    one.

  * For SRV (server) and TAB (stick-table)arguments, the proxy must always
    be specified. Otherwise an error is reported.

This patch is mandatory to support TCP/HTTP rules in defaults sections.
2021-10-15 14:12:19 +02:00
Christopher Faulet
dfd10ab5ee MINOR: proxy: Introduce proxy flags to replace disabled bitfield
This change is required to support TCP/HTTP rules in defaults sections. The
'disabled' bitfield in the proxy structure, used to know if a proxy is
disabled or stopped, is replaced a generic bitfield named 'flags'.

PR_DISABLED and PR_STOPPED flags are renamed to PR_FL_DISABLED and
PR_FL_STOPPED respectively. In addition, everywhere there is a test to know
if a proxy is disabled or stopped, there is now a bitwise AND operation on
PR_FL_DISABLED and/or PR_FL_STOPPED flags.
2021-10-15 14:12:19 +02:00
Christopher Faulet
647a61cc4b BUG/MINOR: proxy: Use .disabled field as a bitfield as documented
.disabled field in the proxy structure is documented to be a bitfield. So
use it as a bitfield. This change was introduced to the 2.5, by commit
8e765b86f ("MINOR: proxy: disabled takes a stopping and a disabled state").

No backport is needed except if the above commit is backported.
2021-10-15 14:12:19 +02:00
Christopher Faulet
a5aa082742 BUG/MINOR: sample: Fix 'fix_tag_value' sample when waiting for more data
The test on the return value of fix_tag_value() function was inverted. To
wait for more data, the return value must be a valid empty string and not
IST_NULL.

This patch must be backported to 2.4.
2021-10-15 14:12:19 +02:00
Christopher Faulet
597909f4e6 BUG/MINOR: http-ana: Don't eval front after-response rules if stopped on back
http-after-response rules evaluation must be stopped after a "allow". It
means the frontend ruleset must not be evaluated if a "allow" was performed
in the backend ruleset. Internally, the evaluation must be stopped if on
HTTP_RULE_RES_STOP return value. Only the "allow" action is concerned by
this change.

Thanks to this patch, http-response and http-after-response behave in the
same way.

This patch should be backported as far as 2.2.
2021-10-15 14:12:19 +02:00
Willy Tarreau
e20e026033 BUG/MEDIUM: sample/jwt: fix another instance of base64 error detection
This is the same as for commit 468c000db ("BUG/MEDIUM: jwt: fix base64
decoding error detection"), but for function sample_conv_jwt_member_query()
that is used by sample converters jwt_header_query() and jwt_payload_query().
Thanks to Tim for the report. No backport is needed.
2021-10-15 12:14:16 +02:00
Willy Tarreau
ce16db4145 BUG/MINOR: jwt: use CRYPTO_memcmp() to compare HMACs
As Tim reported in github issue #1414, we ought to use a constant-time
memcmp() when comparing hashes to avoid time-based attacks. Let's use
CRYPTO_memcmp() since this code already depends on openssl.

No backport is needed, this was just merged into 2.5.
2021-10-15 11:54:04 +02:00
Willy Tarreau
468c000db0 BUG/MEDIUM: jwt: fix base64 decoding error detection
Tim reported that a decoding error from the base64 function wouldn't
be matched in case of bad input, and could possibly cause trouble
with -1 being passed in decoded_sig->data. In the case of HMAC+SHA
it is harmless as the comparison is made using memcmp() after checking
for length equality, but in the case of RSA/ECDSA this result is passed
as a size_t to EVP_DigetVerifyFinal() and may depend on the lib's mood.

The fix simply consists in checking the intermediary result before
storing it.

That's precisely what happens with one of the regtests which returned
0 instead of 4 on the intentionally defective token, so the regtest
was fixed as well.

No backport is needed as this is new in this release.
2021-10-15 11:41:16 +02:00
Willy Tarreau
7b232f132d BUG/MEDIUM: resolvers: fix truncated TLD consecutive to the API fix
A bug was introduced by commit previous bf9498a31 ("MINOR: resolvers:
fix the resolv_str_to_dn_label() API about trailing zero") as the code
is particularly contrived and hard to test. The output writes the last
char at [i+1] so the trailing zero and return value must be at i+1.

This will have to be backported where the patch above is backported
since it was needed for a fix.
2021-10-15 08:09:25 +02:00
Willy Tarreau
cc8fd4c040 MINOR: resolvers: merge address and target into a union "data"
These two fields are exclusive as they depend on the data type.
Let's move them into a union to save some precious bytes. This
reduces the struct resolv_answer_item size from 600 to 576 bytes.
2021-10-14 22:52:04 +02:00
Willy Tarreau
b4ca0195a9 BUG/MEDIUM: resolvers: use correct storage for the target address
The struct resolv_answer_item contains an address field of type
"sockaddr" which is only 16 bytes long, but which is used to store
either IPv4 or IPv6. Fortunately, the contents only overlap with
the "target" field that follows it and that is large enough to
absorb the extra bytes needed to store AAAA records. But this is
dangerous as just moving fields around could result in memory
corruption.

The fix uses a union and removes the casts that were used to hide
the problem.

Older versions need to be checked and possibly fixed. This needs
to be backported anyway.
2021-10-14 22:44:51 +02:00
Willy Tarreau
6dfbef4145 MEDIUM: listener: add the "shards" bind keyword
In multi-threaded mode, on operating systems supporting multiple listeners on
the same IP:port, this will automatically create this number of multiple
identical listeners for the same line, all bound to a fair share of the number
of the threads attached to this listener. This can sometimes be useful when
using very large thread counts where the in-kernel locking on a single socket
starts to cause a significant overhead. In this case the incoming traffic is
distributed over multiple sockets and the contention is reduced. Note that
doing this can easily increase the CPU usage by making more threads work a
little bit.

If the number of shards is higher than the number of available threads, it
will automatically be trimmed to the number of threads. A special value
"by-thread" will automatically assign one shard per thread.
2021-10-14 21:27:48 +02:00
Willy Tarreau
59a877dfd9 MINOR: listeners: add clone_listener() to duplicate listeners at boot time
This function's purpose will be to duplicate a listener in INIT state.
This will be used to ease declaration of listeners spanning multiple
groups, which will thus require multiple FDs hence multiple receivers.
2021-10-14 21:27:48 +02:00
Willy Tarreau
01cac3f721 MEDIUM: listeners: split the thread mask between receiver and bind_conf
With groups at some point we'll have to have distinct masks/groups in the
receiver and the bind_conf, because a single bind_conf might require to
instantiate multiple receivers (one per group).

Let's split the thread mask and group to have one for the bind_conf and
another one for the receiver while it remains easy to do. This will later
allow to use different storage for the bind_conf if needed (e.g. support
multiple groups).
2021-10-14 21:27:48 +02:00
Willy Tarreau
875ee704dd MINOR: resolvers: fix the resolv_dn_label_to_str() API about trailing zero
This function suffers from the same API issue as its sibling that does the
opposite direction, it demands that the input string is zero-terminated
*and* that its length *including* the trailing zero is passed on input,
forcing callers to pass length + 1, and itself to use that length - 1
everywhere internally.

This patch addressess this. There is a single caller, which is the
location of the previous bug, so it should probably be backported at
least to keep the code consistent across versions. Note that the
function is called dns_dn_label_to_str() in 2.3 and earlier.
2021-10-14 21:24:18 +02:00
Willy Tarreau
85c15e6bff BUG/MINOR: resolvers: do not reject host names of length 255 in SRV records
An off-by-one issue in buffer size calculation used to limit the output
of resolv_dn_label_to_str() to 254 instead of 255.

This must be backported to 2.0.
2021-10-14 21:24:18 +02:00
Willy Tarreau
947ae125cc BUG/MEDIUM: resolver: make sure to always use the correct hostname length
In issue #1411, @jjiang-stripe reports that do-resolve() sometimes seems
to be trying to resolve crap from random memory contents.

The issue is that action_prepare_for_resolution() tries to measure the
input string by itself using strlen(), while resolv_action_do_resolve()
directly passes it a pointer to the sample, omitting the known length.
Thus of course any other header present after the host in memory are
appended to the host value. It could theoretically crash if really
unlucky, with a buffer that does not contain any zero including in the
index at the end, and if the HTX buffer ends on an allocation boundary.
In practice it should be too low a probability to have ever been observed.

This patch modifies the action_prepare_for_resolution() function to take
the string length on with the host name on input and pass that down the
chain. This should be backported to 2.0 along with commit "MINOR:
resolvers: fix the resolv_str_to_dn_label() API about trailing zero".
2021-10-14 21:24:18 +02:00
Willy Tarreau
bf9498a31b MINOR: resolvers: fix the resolv_str_to_dn_label() API about trailing zero
This function is bogus at the API level: it demands that the input string
is zero-terminated *and* that its length *including* the trailing zero is
passed on input. While that already looks smelly, the trailing zero is
copied as-is, and is then explicitly replaced with a zero... Not only
all callers have to pass hostname_len+1 everywhere to work around this
absurdity, but this requirement causes a bug in the do-resolve() action
that passes random string lengths on input, and that will be fixed on a
subsequent patch.

Let's fix this API issue for now.

This patch will have to be backported, and in versions 2.3 and older,
the function is in dns.c and is called dns_str_to_dn_label().
2021-10-14 21:24:18 +02:00
Willy Tarreau
6823a3acee MINOR: protocol: uniformize protocol errors
Some protocols fail with "error blah [ip:port]" and other fail with
"[ip:port] error blah". All this already appears in a "starting" or
"binding" context after a proxy name. Let's choose a more universal
approach like below where the ip:port remains at the end of the line
prefixed with "for".

  [WARNING]  (18632) : Binding [binderr.cfg:10] for proxy http: cannot bind receiver to device 'eth2' (No such device) for [0.0.0.0:1080]
  [WARNING]  (18632) : Starting [binderr.cfg:10] for proxy http: cannot set MSS to 12 for [0.0.0.0:1080]
2021-10-14 21:22:52 +02:00
Willy Tarreau
37de553f1d MINOR: protocol: report the file and line number for binding/listening errors
Binding errors and late socket errors provide no information about
the file and line where the problem occurs. These are all done by
protocol_bind_all() and they only report "Starting proxy blah". Let's
change this a little bit so that:
  - the file name and line number of the faulty bind line is alwas mentioned
  - early binding errors are indicated with "Binding" instead of "Starting".

Now we can for example have this:
  [WARNING]  (18580) : Binding [binderr.cfg:10] for proxy http: cannot bind receiver to device 'eth2' (No such device) [0.0.0.0:1080]
2021-10-14 21:22:52 +02:00
Willy Tarreau
f78b52eb7d MINOR: inet: report the faulty interface name in "bind" errors
When a "bind ... interface foo" statement fails, let's report the
interface name in the error message to help locating it in the file.
2021-10-14 21:22:52 +02:00
Willy Tarreau
3cf05cb0b1 MINOR: proto_tcp: also report the attempted MSS values in error message
The MSS errors are the only ones not indicating what was attempted, let's
report the value that was tried, as it can help users spot them in the
config (particularly if a default value was used).
2021-10-14 21:22:52 +02:00
Bjoern Jacke
ed1748553a MINOR: proto_tcp: use chunk_appendf() to ouput socket setup errors
Right now only the last warning or error is reported from
tcp_bind_listener(), but it is useful to report all warnings and no only
the last one, so we now emit them delimited by commas. Previously we used
a fixed buffer of 100 bytes, which was too small to store more than one
message, so let's extend it.

Signed-off-by: Bjoern Jacke <bjacke@samba.org>
2021-10-14 21:22:52 +02:00
Remi Tricot-Le Breton
130e142ee2 MEDIUM: jwt: Add jwt_verify converter to verify JWT integrity
This new converter takes a JSON Web Token, an algorithm (among the ones
specified for JWS tokens in RFC 7518) and a public key or a secret, and
it returns a verdict about the signature contained in the token. It does
not simply return a boolean because some specific error cases cas be
specified by returning an integer instead, such as unmanaged algorithms
or invalid tokens. This enables to distinguich malformed tokens from
tampered ones, that would be valid format-wise but would have a bad
signature.
This converter does not perform a full JWT validation as decribed in
section 7.2 of RFC 7519. For instance it does not ensure that the header
and payload parts of the token are completely valid JSON objects because
it would need a complete JSON parser. It only focuses on the signature
and checks that it matches the token's contents.
2021-10-14 16:38:14 +02:00
Remi Tricot-Le Breton
0a72f5ee7c MINOR: jwt: jwt_header_query and jwt_payload_query converters
Those converters allow to extract a JSON value out of a JSON Web Token's
header part or payload part (the two first dot-separated base64url
encoded parts of a JWS in the Compact Serialization format).
They act as a json_query call on the corresponding decoded subpart when
given parameters, and they return the decoded JSON subpart when no
parameter is given.
2021-10-14 16:38:13 +02:00
Remi Tricot-Le Breton
864089e0a6 MINOR: jwt: Insert public certificates into dedicated JWT tree
A JWT signed with the RSXXX or ESXXX algorithm (RSA or ECDSA) requires a
public certificate to be verified and to ensure it is valid. Those
certificates must not be read on disk at runtime so we need a caching
mechanism into which those certificates will be loaded during init.
This is done through a dedicated ebtree that is filled during
configuration parsing. The path to the public certificates will need to
be explicitely mentioned in the configuration so that certificates can
be loaded as early as possible.
This tree is different from the ckch one because ckch entries are much
bigger than the public certificates used in JWT validation process.
2021-10-14 16:38:12 +02:00
Remi Tricot-Le Breton
e0d3c00086 MINOR: jwt: JWT tokenizing helper function
This helper function splits a JWT under Compact Serialization format
(dot-separated base64-url encoded strings) into its different sub
strings. Since we do not want to manage more than JWS for now, which can
only have at most three subparts, any JWT that has strictly more than
two dots is considered invalid.
2021-10-14 16:38:10 +02:00
Remi Tricot-Le Breton
7feb361776 MINOR: jwt: Parse JWT alg field
The full list of possible algorithms used to create a JWS signature is
defined in section 3.1 of RFC7518. This patch adds a helper function
that converts the "alg" strings into an enum member.
2021-10-14 16:38:08 +02:00
Remi Tricot-Le Breton
f5dd337b12 MINOR: http: Add http_auth_bearer sample fetch
This fetch can be used to retrieve the data contained in an HTTP
Authorization header when the Bearer scheme is used. This is used when
transmitting JSON Web Tokens for instance.
2021-10-14 16:38:07 +02:00
William Lallemand
1d58b01316 MINOR: ssl: add ssl_fc_is_resumed to "option httpslog"
In order to trace which session were TLS resumed, add the
ssl_fc_is_resumed in the httpslog option.
2021-10-14 14:27:48 +02:00
Amaury Denoyelle
493bb1db10 MINOR: quic: handle CONNECTION_CLOSE frame
On receiving CONNECTION_CLOSE frame, the mux is flagged for immediate
connection close. A stream is closed even if there is data not ACKed
left if CONNECTION_CLOSE has been received.
2021-10-13 16:38:56 +02:00
Amaury Denoyelle
1e308ffc79 MINOR: mux: remove last occurences of qcc ring buffer
The mux tx buffers have been rewritten with buffers attached to qcs
instances. qc_buf_available and qc_get_buf functions are updated to
manipulates qcs. All occurences of the unused qcc ring buffer are
removed to ease the code maintenance.
2021-10-13 16:38:56 +02:00
Amaury Denoyelle
cae0791942 MEDIUM: mux-quic: defer stream shut if remaining tx data
Defer the shutting of a qcs if there is still data in its tx buffers. In
this case, the conn_stream is closed but the qcs is kept with a new flag
QC_SF_DETACH.

On ACK reception, the xprt wake up the shut_tl tasklet if the stream is
flagged with QC_SF_DETACH. This tasklet is responsible to free the qcs
and possibly the qcc when all bidirectional streams are removed.
2021-10-13 16:38:56 +02:00
Amaury Denoyelle
ac8ee25659 MINOR: mux-quic: implement standard method to detect if qcc is dead
For the moment, a quic connection is considered dead if it has no
bidirectional streams left on it. This test is implemented via
qcc_is_dead function. It can be reused to properly close the connection
when needed.
2021-10-13 16:38:56 +02:00
Amaury Denoyelle
4fc8b1cb17 CLEANUP: h3: remove dead code
Remove unused function. This will simplify code maintenance.
2021-10-13 16:38:56 +02:00
Amaury Denoyelle
a587136c6f MINOR: mux-quic: standardize h3 settings sending
Use same buffer management to send h3 settings as for streams. This
simplify the code maintenance with unused function removed.
2021-10-13 16:38:56 +02:00
Amaury Denoyelle
a543eb1f6f MEDIUM: h3: properly manage tx buffers for large data
Properly handle tx buffers management in h3 data sending. If there is
not enough contiguous space, the buffer is first realigned. If this is
not enough, the stream is flagged with QC_SF_BLK_MROOM waiting for the
buffer to be emptied.

If a frame on a stream is successfully pushed for sending, the stream is
called if it was flagged with QC_SF_BLK_MROOM.
2021-10-13 16:38:56 +02:00
Amaury Denoyelle
d3d97c6ae7 MEDIUM: mux-quic: rationalize tx buffers between qcc/qcs
Remove the tx mux ring buffers in qcs, which should be in the qcc. For
the moment, use a simple architecture with 2 simple tx buffers in the
qcs.

The first buffer is used by the h3 layer to prepare the data. The mux
send operation transfer these into the 2nd buffer named xprt_buf. This
buffer is only freed when an ACK has been received.

This architecture is functional but not optimal for two reasons :
- it won't limit the buffer usage by connection
- each transfer on a new stream requires an allocation
2021-10-13 16:38:56 +02:00
Remi Tricot-Le Breton
b01179aa92 MINOR: ssl: Add ssllib_name_startswith precondition
This new ssllib_name_startswith precondition check can be used to
distinguish application linked with OpenSSL from the ones linked with
other SSL libraries (LibreSSL or BoringSSL namely). This check takes a
string as input and returns 1 when the SSL library's name starts with
the given string. It is based on the OpenSSL_version function which
returns the same output as the "openssl version" command.
2021-10-13 11:28:08 +02:00
Tim Duesterhus
9e5e586e35 BUG/MINOR: lua: Fix lua error handling in hlua_config_prepend_path()
Set an `lua_atpanic()` handler before calling `hlua_prepend_path()` in
`hlua_config_prepend_path()`.

This prevents the process from abort()ing when `hlua_prepend_path()` fails
for some reason.

see GitHub Issue #1409

This is a very minor issue that can't happen in practice. No backport needed.
2021-10-12 11:28:57 +02:00
Christopher Faulet
8c67eceeca CLEANUP: stream: Properly indent current_rule line in "show sess all"
This line is not related to the response channel but to the stream. Thus it
must be indented at the same level as stream-interfaces, connections,
channels...
2021-10-12 11:27:24 +02:00
Christopher Faulet
d4762b8474 MINOR: stream: report the current filter in "show sess all" when known
Filters can block the stream on pre/post analysis for any reason and it can
be useful to report it in "show sess all". So now, a "current_filter" extra
line is reported for each channel if a filter is blocking the analysis. Note
that this does not catch the TCP/HTTP payload analysis because all
registered filters are always evaluated when more data are received.
2021-10-12 11:26:49 +02:00
Willy Tarreau
1274e10d5c MINOR: stream: report the current rule in "show sess all" when known
Sometimes an HTTP or TCP rule may take time to complete because it is
waiting for external data (e.g. "wait-for-body", "do-resolve"), and it
can be useful to report the action and the location of that rule in
"show sess all". Here for streams blocked on such a rule, there will
now be a "current_line" extra line reporting this. Note that this does
not catch rulesets which are re-evaluated from the start on each change
(e.g. tcp-request content waiting for changes) but only when a specific
rule is being paused.
2021-10-12 07:38:30 +02:00
Willy Tarreau
c9e4868510 MINOR: rules: add a file name and line number to act_rules
These ones are passed on rule creation for the sole purpose of being
reported in "show sess", which is not done yet. For now the entries
are allocated upon rule creation and freed in free_act_rules().
2021-10-12 07:38:30 +02:00
Willy Tarreau
d535f807bb MINOR: rules: add a new function new_act_rule() to allocate act_rules
Rules are currently allocated using calloc() by their caller, which does
not make it very convenient to pass more information such as the file
name and line number.

This patch introduces new_act_rule() which performs the malloc() and
already takes in argument the ruleset (ACT_F_*), the file name and the
line number. This saves the caller from having to assing ->from, and
will allow to improve the internal storage with more info.
2021-10-12 07:38:30 +02:00
Willy Tarreau
db2ab8218c MEDIUM: stick-table: never learn the "conn_cur" value from peers
There have been a large number of issues reported with conn_cur
synchronization because the concept is wrong. In an active-passive
setup, pushing the local connections count from the active node to
the passive one will result in the passive node to have a higher
counter than the real number of connections. Due to this, after a
switchover, it will never be able to close enough connections to
go down to zero. The same commonly happens on reloads since the new
process preloads its values from the old process, and if no connection
happens for a key after the value is learned, it is impossible to reset
the previous ones. In active-active setups it's a bit different, as the
number of connections reflects the number on the peer that pushed last.

This patch solves this by marking the "conn_cur" local and preventing
it from being learned from peers. It is still pushed, however, so that
any monitoring system that collects values from the peers will still
see it.

The patch is tiny and trivially backportable. While a change of behavior
in stable branches is never welcome, it remains possible to fix issues
if reports become frequent.
2021-10-08 17:53:12 +02:00
Willy Tarreau
e3f4d7496d MEDIUM: config: resolve relative threads on bind lines to absolute ones
Now threads ranges specified on bind lines will be turned to effective
ones that will lead to a usable thread mask and a group ID.
2021-10-08 17:22:26 +02:00
Willy Tarreau
627def9e50 MINOR: threads: add a new function to resolve config groups and masks
In the configuration sometimes we'll omit a thread group number to designate
a global thread number range, and sometimes we'll mention the group and
designate IDs within that group. The operation is more complex than it
seems due to the need to check for ranges spanning between multiple groups
and determining groups from threads from bit masks and remapping bit masks
between local/global.

This patch adds a function to perform this operation, it takes a group and
mask on input and updates them on output. It's designed to be used by "bind"
lines but will likely be usable at other places if needed.

For situations where specified threads do not exist in the group, we have
the choice in the code between silently fixing the thread set or failing
with a message. For now the better option seems to return an error, but if
it turns out to be an issue we can easily change that in the future. Note
that it should only happen with "x/even" when group x only has one thread.
2021-10-08 17:22:26 +02:00
Willy Tarreau
d57b9ff7af MEDIUM: listeners: support the definition of thread groups on bind lines
This extends the "thread" statement of bind lines to support an optional
thread group number. When unspecified (0) it's an absolute thread range,
and when specified it's one relative to the thread group. Masks are still
used so no more than 64 threads may be specified at once, and a single
group is possible. The directive is not used for now.
2021-10-08 17:22:26 +02:00
Willy Tarreau
a3870b7952 MINOR: debug: report the group and thread ID in the thread dumps
Now thread dumps will report the thread group number and the ID within
this group. Note that this is still quite limited because some masks
are calculated based on the thread in argument while they have to be
performed against a group-level thread ID.
2021-10-08 17:22:26 +02:00
Willy Tarreau
b90935c908 MINOR: threads: add the current group ID in thread-local "tgid" variable
This is the equivalent of "tid" for ease of access. In the future if we
make th_cfg a pure thread-local array (not a pointer), it may make sense
to move it there.
2021-10-08 17:22:26 +02:00
Willy Tarreau
43ab05b3da MEDIUM: threads: replace ha_set_tid() with ha_set_thread()
ha_set_tid() was randomly used either to explicitly set thread 0 or to
set any possibly incomplete thread during boot. Let's replace it with
a pointer to a valid thread or NULL for any thread. This allows us to
check that the designated threads are always valid, and to ignore the
thread 0's mapping when setting it to NULL, and always use group 0 with
it during boot.

The initialization code is also cleaner, as we don't pass ugly casts
of a thread ID to a pointer anymore.
2021-10-08 17:22:26 +02:00
Willy Tarreau
cc7a11ee3b MINOR: threads: set the tid, ltid and their bit in thread_cfg
This will be a convenient way to communicate the thread ID and its
local ID in the group, as well as their respective bits when creating
the threads or when only a pointer is given.
2021-10-08 17:22:26 +02:00
Willy Tarreau
6eee85f887 MINOR: threads: set the group ID and its bit in the thread group
This will ease the reporting of the current thread group ID when coming
from the thread itself, especially since it returns the visible ID,
starting at 1.
2021-10-08 17:22:26 +02:00
Willy Tarreau
e6806ebecc MEDIUM: threads: automatically assign threads to groups
This takes care of unassigned threads groups and places unassigned
threads there, in a more or less balanced way. Too sparse allocations
may still fail though. For now with a maximum group number fixed to 1
nothing can really fail.
2021-10-08 17:22:26 +02:00
Willy Tarreau
d04bc3ac21 MINOR: global: add a new "thread-group" directive
This registers a mapping of threads to groups by enumerating for each thread
what group it belongs to, and marking the group as assigned. It takes care of
checking for redefinitions, overlaps, and holes. It supports both individual
numbers and ranges. The thread group is referenced from the thread config.
2021-10-08 17:22:26 +02:00
Willy Tarreau
c33b969e35 MINOR: global: add a new "thread-groups" directive
This is used to configure the number of thread groups. For now it can
only be 1.
2021-10-08 17:22:26 +02:00
Willy Tarreau
f9662848f2 MINOR: threads: introduce a minimalistic notion of thread-group
This creates a struct tgroup_info which knows the thread ID of the first
thread in a group, and the number of threads in it. For now there's only
one thread group supported in the configuration, but it may be forced to
other values for development purposes by defining MAX_TGROUPS, and it's
enabled even when threads are disabled and will need to remain accessible
during boot to keep a simple enough internal API.

For the purpose of easing the configurations which do not specify a thread
group, we're starting group numbering at 1 so that thread group 0 can be
"undefined" (i.e. for "bind" lines or when binding tasks).

The goal will be to later move there some global items that must be
made per-group.
2021-10-08 17:22:26 +02:00
Willy Tarreau
6036342f58 MINOR: thread: make "ti" a const pointer and clean up thread_info a bit
We want to make sure that the current thread_info accessed via "ti" will
remain constant, so that we don't accidentally place new variable parts
there and so that the compiler knows that info retrieved from there is
not expected to have changed between two function calls.

Only a few init locations had to be adjusted to use the array and the
rest is unaffected.
2021-10-08 17:22:26 +02:00
Willy Tarreau
b4e34766a3 REORG: thread/sched: move the last dynamic thread_info to thread_ctx
The last 3 fields were 3 list heads that are per-thread, and which are:
  - the pool's LRU head
  - the buffer_wq
  - the streams list head

Moving them into thread_ctx completes the removal of dynamic elements
from the struct thread_info. Now all these dynamic elements are packed
together at a single place for a thread.
2021-10-08 17:22:26 +02:00
Willy Tarreau
a0b99536c8 REORG: thread/sched: move the thread_info flags to the thread_ctx
The TI_FL_STUCK flag is manipulated by the watchdog and scheduler
and describes the apparent life/death of a thread so it changes
all the time and it makes sense to move it to the thread's context
for an active thread.
2021-10-08 17:22:26 +02:00
Willy Tarreau
45c38e22bf REORG: thread/clock: move the clock parts of thread_info to thread_ctx
The "thread_info" name was initially chosen to store all info about
threads but since we now have a separate per-thread context, there is
no point keeping some of its elements in the thread_info struct.

As such, this patch moves prev_cpu_time, prev_mono_time and idle_pct to
thread_ctx, into the thread context, with the scheduler parts. Instead
of accessing them via "ti->" we now access them via "th_ctx->", which
makes more sense as they're totally dynamic, and will be required for
future evolutions. There's no room problem for now, the structure still
has 84 bytes available at the end.
2021-10-08 17:22:26 +02:00
Willy Tarreau
1a9c922b53 REORG: thread/sched: move the task_per_thread stuff to thread_ctx
The scheduler contains a lot of stuff that is thread-local and not
exclusively tied to the scheduler. Other parts (namely thread_info)
contain similar thread-local context that ought to be merged with
it but that is even less related to the scheduler. However moving
more data into this structure isn't possible since task.h is high
level and cannot be included everywhere (e.g. activity) without
causing include loops.

In the end, it appears that the task_per_thread represents most of
the per-thread context defined with generic types and should simply
move to tinfo.h so that everyone can use them.

The struct was renamed to thread_ctx and the variable "sched" was
renamed to "th_ctx". "sched" used to be initialized manually from
run_thread_poll_loop(), now it's initialized by ha_set_tid() just
like ti, tid, tid_bit.

The memset() in init_task() was removed in favor of a bss initialization
of the array, so that other subsystems can put their stuff in this array.

Since the tasklet array has TL_CLASSES elements, the TL_* definitions
was moved there as well, but it's not a problem.

The vast majority of the change in this patch is caused by the
renaming of the structures.
2021-10-08 17:22:26 +02:00
Willy Tarreau
6414e4423c CLEANUP: wdt: do not remap SI_TKILL to SI_LWP, test the values directly
We used to remap SI_TKILL to SI_LWP when SI_TKILL was not available
(e.g. FreeBSD) but that's ugly and since we need this only in a single
switch/case block in wdt.c it's even simpler and cleaner to perform the
two tests there, so let's do this.
2021-10-08 17:22:26 +02:00
Willy Tarreau
b474f43816 MINOR: wdt: move wd_timer to wdt.c
The watchdog timer had no more reason for being shared with the struct
thread_info since the watchdog is the only user now. Let's remove it
from the struct and move it to a static array in wdt.c. This removes
some ifdefs and the need for the ugly mapping to empty_t that might be
subject to a cast to a long when compared to TIMER_INVALID. Now timer_t
is not known outside of wdt.c and clock.c anymore.
2021-10-08 17:22:26 +02:00
Willy Tarreau
2169498941 MINOR: clock: move the clock_ids to clock.c
This removes the knowledge of clockid_t from anywhere but clock.c, thus
eliminating a source of includes burden. The unused clock_id field was
removed from thread_info, and the definition setting of clockid_t was
removed from compat.h. The most visible change is that the function
now_cpu_time_thread() now takes the thread number instead of a tinfo
pointer.
2021-10-08 17:22:26 +02:00
Willy Tarreau
6cb0c391e7 REORG: clock/wdt: move wdt timer initialization to clock.c
The code that deals with timer creation for the WDT was moved to clock.c
and is called with the few relevant arguments. This removes the need for
awareness of clock_id from wdt.c and as such saves us from having to
share it outside. The timer_t is also known only from both ends but not
from the public API so that we don't have to create a fake timer_t
anymore on systems which do not support it (e.g. macos).
2021-10-08 17:22:26 +02:00
Willy Tarreau
44c58da52f REORG: clock: move the clock_id initialization to clock.c
This was previously open-coded in run_thread_poll_loop(). Now that
we have clock.c dedicated to such stuff, let's move the code there
so that we don't need to keep such ifdefs nor to depend on the
clock_id.
2021-10-08 17:22:26 +02:00
Willy Tarreau
2c6a998727 CLEANUP: clock: stop exporting before_poll and after_poll
We don't need to export them anymore so let's make them static.
2021-10-08 17:22:26 +02:00
Willy Tarreau
20adfde9c8 MINOR: activity: get the run_time from the clock updates
Instead of fiddling with before_poll and after_poll in
activity_count_runtime(), the function is now called by
clock_entering_poll() which passes it the number of microseconds
spent working. This allows to remove all calls to
activity_count_runtime() from the pollers.
2021-10-08 17:22:26 +02:00
Willy Tarreau
f9d5e1079c REORG: clock: move the updates of cpu/mono time to clock.c
The entering_poll/leaving_poll/measure_idle functions that were hard
to classify and used to move to various locations have now been placed
into clock.c since it's precisely about time-keeping. The functions
were renamed to clock_*. The samp_time and idle_time values are now
static since there is no reason for them to be read from outside.
2021-10-08 17:22:26 +02:00
Willy Tarreau
5554264f31 REORG: time: move time-keeping code and variables to clock.c
There is currently a problem related to time keeping. We're mixing
the functions to perform calculations with the os-dependent code
needed to retrieve and adjust the local time.

This patch extracts from time.{c,h} the parts that are solely dedicated
to time keeping. These are the "now" or "before_poll" variables for
example, as well as the various now_*() functions that make use of
gettimeofday() and clock_gettime() to retrieve the current time.

The "tv_*" functions moved there were also more appropriately renamed
to "clock_*".

Other parts used to compute stolen time are in other files, they will
have to be picked next.
2021-10-08 17:22:26 +02:00
Willy Tarreau
28345c6652 BUILD: init: avoid a build warning on FreeBSD with USE_PROCCTL
It was brought by a variable declared after some statements in commit
21185970c ("MINOR: proc: setting the process to produce a core dump on
FreeBSD."). It's worth noting that some versions of clang seem to ignore
-Wdeclaration-after-statement by default. No backport is needed.
2021-10-08 17:21:48 +02:00
Amaury Denoyelle
eb01f597eb BUG/MINOR: quic: fix includes for compilation
Fix missing includes in quic code following the general recent include
reorganization. This fixes the compilation error with QUIC enabled.
2021-10-08 15:59:02 +02:00
Amaury Denoyelle
769e9ffd94 CLEANUP: mux-quic: remove unused code
Remove unused code in mux-quic. This is mostly code related to the
backend side. This code is untested for the moment, its removal will
simplify the code maintenance.
2021-10-08 15:48:00 +02:00
Amaury Denoyelle
9c8c4fa3a2 MINOR: qpack: fix memory leak on huffman decoding
Remove an unneeded strdup invocation during QPACK huffman decoding. A
temporary storage buffer is passed by the function and exists after
decoding so no need to duplicate memory here.
2021-10-08 15:45:57 +02:00
Amaury Denoyelle
3a590c7ff2 MINOR: qpack: support non-indexed http status code encoding
If a HTTP status code is not present in the QPACK static table, encode
it with a literal field line with name reference.
2021-10-08 15:30:18 +02:00
Amaury Denoyelle
fccffe08b3 MINOR: qpack: do not encode invalid http status code
Ensure that the HTTP status code is valid before encoding with QPACK. An
error is return if this is not the case.
2021-10-08 15:28:35 +02:00
Christopher Faulet
485da0b053 BUG/MEDIUM: mux_h2: Handle others remaining read0 cases on partial frames
We've found others places where the read0 is ignored because of an
incomplete frame parsing. This time, it happens during parsing of
CONTINUATION frames.

When frames are parsed, incomplete frames are properly handled and
H2_CF_DEM_SHORT_READ flag is set. It is also true for HEADERS
frames. However, for CONTINUATION frames, there is an exception. Besides
parsing the current frame, we try to peek header of the next one to merge
payload of both frames, the current one and the next one. Idea is to create
a sole HEADERS frame before parsing the payload. However, in this case, it
is possible to have an incomplete frame too, not the current one but the
next one. From the demux point of view, the current frame is complete. We
must go to the internal function h2c_decode_headers() to detect an
incomplete frame. And this case was not identified and fixed when
H2_CF_DEM_SHORT_READ flag was introduced in the commit b5f7b5296
("BUG/MEDIUM: mux-h2: Handle remaining read0 cases on partial frames")

This bug was reported in a comment of the issue #1362. The patch must be
backported as far as 2.0.
2021-10-08 09:17:27 +02:00
Amaury Denoyelle
2af1985af8 BUG/MAJOR: quic: remove qc from receiver cids tree on free
Remove the quic_conn from the receiver connection_ids tree on
quic_conn_free. This fixes a crash due to dangling references in the
tree after a quic connection release.

This operation must be conducted under the listener lock. For this
reason, the quic_conn now contains a reference to its attached listener.
2021-10-07 17:35:25 +02:00
Amaury Denoyelle
d595f108db MINOR: mux-quic: release connection if no more bidir streams
Use the count of bidirectional streams to call qc_release in qc_detach.
We cannot inspect the by_id tree because uni-streams are never removed
from it. This allows the connection to be properly freed.
2021-10-07 17:35:25 +02:00
Amaury Denoyelle
336f6fd964 BUG/MAJOR: xprt-quic: do not queue qc timer if not set
Do not queue the pto/loss-detection timer if set to TICK_ETERNITY. This
usage is invalid with the scheduler and cause a BUG_ON trigger.
2021-10-07 17:35:25 +02:00
Amaury Denoyelle
139814a67a BUG/MEDIUM: mux-quic: reinsert all streams in by_id tree
It is required that all qcs streams are in the by_id tree for the xprt
to function correctly. Without this, some ACKs are not properly emitted
by xprt.

Note that this change breaks the free of the connection because the
condition eb_is_empty in qc_detach is always true. This will be fixed in
a following patch.
2021-10-07 17:35:25 +02:00
Frédéric Lécaille
75dd2b7987 MINOR: quic: Fix SSL error issues (do not use ssl_bio_and_sess_init())
It seems it was a bad idea to use the same function as for TCP ssl sockets
to initialize the SSL session objects for QUIC with ssl_bio_and_sess_init().
Indeed, this had as very bad side effects to generate SSL errors due
to the fact that such BIOs initialized for QUIC could not finally be controlled
via the BIO_ctrl*() API, especially BIO_ctrl() function used by very much other
internal OpenSSL functions (BIO_push(), BIO_pop() etc).
Others OpenSSL base QUIC implementation do not use at all BIOs to configure
QUIC connections. So, we decided to proceed the same way as ngtcp2 for instance:
only initialize an SSL object and call SSL_set_quic_method() to set its
underlying method. Note that calling this function silently disable this option:
SSL_OP_ENABLE_MIDDLEBOX_COMPAT.
We implement qc_ssl_sess_init() to initialize SSL sessions for QUIC connections
to do so with a retry in case of allocation failure as this is done by
ssl_bio_and_sess_init(). We also modify the code part for haproxy servers.
2021-10-07 17:35:25 +02:00
Frédéric Lécaille
7c881bdab8 MINOR: quic: BUG_ON() SSL errors.
As this QUIC implementation is still experimental, let's BUG_ON()
very important SSL handshake errors.
Also dump the SSL errors before BUG_ON().
2021-10-07 17:35:25 +02:00
Frédéric Lécaille
6f0fadb5a7 MINOR: quic: Add a function to dump SSL stack errors
This has been very helpful to fix SSL related issues.
2021-10-07 17:35:25 +02:00
Frédéric Lécaille
57e6e9eef8 MINOR: quic: Distinguish packet and SSL read enc. level in traces
This is only to distinguish the encryption level of packet traces from
the TLS stack current read encryption level.
2021-10-07 17:35:25 +02:00
Willy Tarreau
1b4a714266 MINOR: pools: report the amount used by thread caches in "show pools"
The "show pools" command provides some "allocated" and "used" estimates
on the pools objects, but this applies to the shared pool and the "used"
includes what is currently assigned to thread-local caches. It's possible
to know how much each thread uses, so let's dump the total size allocated
by thread caches as an estimate. It's only done when pools are enabled,
which explains why the patch adds quite a lot of ifdefs.
2021-10-07 17:30:06 +02:00
Willy Tarreau
aa992761d8 CLEANUP: thread: uninline ha_tkill/ha_tkillall/ha_cpu_relax()
These ones are rarely used or only to waste CPU cycles waiting, and are
the last ones requiring system includes in thread.h. Let's uninline them
and move them to thread.c.
2021-10-07 01:41:15 +02:00
Willy Tarreau
5e03dfaaf6 MINOR: thread: use a dedicated static pthread_t array in thread.c
This removes the thread identifiers from struct thread_info and moves
them only in static array in thread.c since it's now the only file that
needs to touch it. It's also the only file that needs to include
pthread.h, beyond haproxy.c which needs it to start the poll loop. As
a result, much less system includes are needed and the LoC reduced by
around 3%.
2021-10-07 01:41:15 +02:00
Willy Tarreau
4eeb88363c REORG: thread: move ha_get_pthread_id() to thread.c
It's the last function which directly accesses the pthread_t, let's move
it to thread.c and leave a static inline for non-thread.
2021-10-07 01:41:14 +02:00
Willy Tarreau
d10385ac4b REORG: thread: move the thread init/affinity/stop to thread.c
haproxy.c still has to deal with pthread-specific low-level stuff that
is OS-dependent. We should not have to deal with this there, and we do
not need to access pthread anywhere else.

Let's move these 3 functions to thread.c and keep empty inline ones for
when threads are disabled.
2021-10-07 01:41:14 +02:00
Willy Tarreau
19b18ad552 CLENAUP: wdt: use ha_tkill() instead of accessing pthread directly
Instead of calling pthread_kill() directly on the pthread_t let's
call ha_tkill() which does the same by itself. This will help isolate
pthread_t.
2021-10-07 01:41:14 +02:00
Willy Tarreau
b63888c67c REORG: fd: uninline compute_poll_timeout()
It's not needed to inline it at all (one call per loop) and it introduces
dependencies, let's move it to fd.c.

Removing the few remaining includes that came with it further reduced
by ~0.2% the LoC and the build time is now below 6s.
2021-10-07 01:41:14 +02:00
Willy Tarreau
d8b325c748 REORG: task: uninline the loop time measurement code
It's pointless to inline this, it's called exactly once per poll loop,
and it depends on time.h which is quite deep. Let's move that to task.c
along with sched_report_idle().
2021-10-07 01:41:14 +02:00
Willy Tarreau
8de90c71b3 REORG: connection: uninline the rest of the alloc/free stuff
The remaining large functions are those allocating/initializing and
occasionally freeing connections, conn_streams and sockaddr. Let's
move them to connection.c. In fact, cs_free() is the only one-liner
but let's move it along with the other ones since a call will be
small compared to the rest of the work done there.
2021-10-07 01:41:14 +02:00
Willy Tarreau
aac777f169 REORG: connection: move the largest inlines from connection.h to connection.c
The following inlined functions are particularly large (and probably not
inlined at all by the compiler), and together represent roughly half of
the file, while they're used at most once per connection. They were moved
to connection.c.

  conn_upgrade_mux_fe, conn_install_mux_fe, conn_install_mux_be,
  conn_install_mux_chk, conn_delete_from_tree, conn_init, conn_new,
  conn_free
2021-10-07 01:41:14 +02:00
Willy Tarreau
260f324c19 REORG: server: uninline the idle conns management functions
The following functions are quite heavy and have no reason to be kept
inlined:

   srv_release_conn, srv_lookup_conn, srv_lookup_conn_next,
   srv_add_to_idle_list

They were moved to server.c. It's worth noting that they're a bit
at the edge between server and connection and that maybe we could
create an idle-conn file for these in the near future.
2021-10-07 01:41:14 +02:00
Willy Tarreau
930428c0bf REORG: connection: uninline conn_notify_mux() and conn_delete_from_tree()
The former is far too huge to be inlined and the second is the only
one requiring an ebmb tree through all includes, let's move them to
connection.c.
2021-10-07 01:41:14 +02:00
Willy Tarreau
e5983ffb3a REORG: connection: move the hash-related stuff to connection.c
We do not really need to have them inlined, and having xxhash.h included
by connection.h results in this 4700-lines file being processed 101 times
over the whole project, which accounts for 13.5% of the total size!
Additionally, half of the functions are only needed from connection.c.
Let's move the functions there and get rid of the painful include.

The build time is now down to 6.2s just due to this.
2021-10-07 01:41:14 +02:00
Willy Tarreau
fd21c6c6fd MINOR: connection: use uint64_t for the hashes
The hash type stored everywhere is XXH64_hash_t, which annoyingly forces
everyone to include the huge xxhash file. We know it's an uint64_t because
that's its purpose and the type is only made to abstract it on machines
where uint64_t is not availble. Let's switch the type to uint64_t
everywhere and avoid including xxhash from the type file.
2021-10-07 01:41:14 +02:00
Willy Tarreau
a26be37e20 REORG: acitvity: uninline sched_activity_entry()
This one is expensive in code size because it comes with xxhash.h at a
low level of dependency that's inherited at plenty of places, and for
a function does doesn't benefit from inlining and could possibly even
benefit from not being inline given that it's large and called from the
scheduler.

Moving it to activity.c reduces the LoC by 1.2% and the binary size by
~1kB.
2021-10-07 01:41:14 +02:00
Willy Tarreau
e0650224b8 REORG: activity: uninline activity_count_runtime()
This function has no reason for being inlined, it's called from non
critical places (once in pollers), is quite large and comes with
dependencies (time and freq_ctr). Let's move it to acitvity.c. That's
another 0.4% less LoC to build.
2021-10-07 01:41:14 +02:00
Willy Tarreau
9310f481ce CLEANUP: tree-wide: remove unneeded include time.h in ~20 files
20 files used to have haproxy/time.h included only for now_ms, and two
were missing it for other things but used to inherit from it via other
files.
2021-10-07 01:41:14 +02:00
Willy Tarreau
078c2573c2 REORG: sched: moved samp_time and idle_time to task.c as well
The idle time calculation stuff was moved to task.h by commit 6dfab112e
("REORG: sched: move idle time calculation from time.h to task.h") but
these two variables that are only maintained by task.{c,h} were still
left in time.{c,h}. They have to move as well.
2021-10-07 01:41:14 +02:00
Willy Tarreau
99ea188c0e REORG: sample: move the crypto samples to ssl_sample.c
These ones require openssl and are only built when it's enabled. There's
no point keeping them in sample.c when ssl_sample.c already deals with this
and the required includes. This also allows to remove openssl-compat.h
from sample.c and to further reduce the number of inclusions of openssl
includes, and the build time is now down to under 8 seconds.
2021-10-07 01:41:14 +02:00
Willy Tarreau
82531f6730 REORG: ssl-sock: move the sslconns/totalsslconns counters to global
These two counters were the only ones not in the global struct, while
the SSL freq counters or the req counts are already in it, this forces
stats.c to include ssl_sock just to know about them. Let's move them
over there with their friends. This reduces from 408 to 384 the number
of includes of opensslconf.h.
2021-10-07 01:41:14 +02:00
Willy Tarreau
a8a72c68d5 CLEANUP: ssl/server: move ssl_sock_set_srv() to srv_set_ssl() in server.c
This one has nothing to do with ssl_sock as it manipulates the struct
server only. Let's move it to server.c and remove unneeded dependencies
on ssl_sock.h. This further reduces by 10% the number of includes of
opensslconf.h and by 0.5% the number of compiled lines.
2021-10-07 01:41:06 +02:00
Willy Tarreau
d2ae3858e9 CLEANUP: mux_fcgi: remove dependency on ssl_sock
It's not needed anymore (used to be needed for ssl_sock_is_ssl()).
2021-10-07 01:36:51 +02:00
Willy Tarreau
1057beecda REORG: ssl: move ssl_sock_is_ssl() to connection.h and rename it
This one doesn't use anything from an SSL context, it only checks the
type of the transport layer of a connection, thus it belongs to
connection.h. This is particularly visible due to all the ifdefs
around it in various call places.
2021-10-07 01:36:51 +02:00
Willy Tarreau
dbf78025a0 REORG: listener: move bind_conf_alloc() and listener_state_str() to listener.c
These functions have no reason for being inlined, and they require some
includes with long dependencies. Let's move them to listener.c and trim
unused includes in listener.h.
2021-10-07 01:36:51 +02:00
Willy Tarreau
dced3ebb4a MINOR: thread/debug: replace nsec_now() with now_mono_time()
The two functions do exactly the same except that the second one
is already provided by time.h and still defined if not available.
2021-10-07 01:36:51 +02:00
Willy Tarreau
407ef893e7 REORG: thread: uninline the lock-debugging code
The lock-debugging code in thread.h has no reason to be inlined. the
functions are quite fat and perform a lot of operations so there's no
saving keeping them inlined. Worse, most of them are in fact not
inlined, resulting in a significantly bigger executable.

This patch moves all this part from thread.h to thread.c. The functions
are still exported in thread.h of course. This results in ~166kB less
code:

     text    data     bss     dec     hex filename
  3165938   99424  897376 4162738  3f84b2 haproxy-before
  2991987   99424  897376 3988787  3cdd33 haproxy-after

In addition the build time with thread debugging enabled has shrunk
from 19.2 to 17.7s thanks to much less code to be parsed in thread.h
that is included virtually everywhere.
2021-10-07 01:36:51 +02:00
Willy Tarreau
f14d19024b REORG: pools: uninline the UAF allocator and force-inline the rest
pool-os.h relies on a number of includes solely because the
pool_alloc_area() function was inlined, and this only because we want
the normal version to be inlined so that we can track the calling
places for the memory profiler. It's worth noting that it already
does not work at -O0, and that when UAF is enabled we don't care a
dime about profiling.

This patch does two things at once:
  - force-inline the functions so that pool_alloc_area() is still
    inlined at -O0 to help track malloc() users ;

  - uninline the UAF version of these (that rely on mmap/munmap)
    and move them to pools.c so that we can remove all unneeded
    includes.

Doing so reduces by ~270kB or 0.15% the total build size.
2021-10-07 01:36:51 +02:00
Willy Tarreau
5d9ddc5442 BUILD: tree-wide: add several missing activity.h
A number of files currently access activity counters but rely on their
definitions to be inherited from other files (task.c, backend.c hlua.c,
sock.c, pool.c, stats.c, fd.c).
2021-10-07 01:36:51 +02:00
Willy Tarreau
410e2590e9 BUILD: mworker: mworker-prog needs time.h for the 'now' variable
It wasn't included and it used to get them through other includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
6cd007d078 BUILD: tcp_sample: include missing errors.h and session-t.h
Both are used without being defined as they were inherited from other
files.
2021-10-07 01:36:51 +02:00
Willy Tarreau
0d1dd0e894 BUILD: cfgparse-ssl: add missing errors.h
ha_warning(), ha_alert() and friends are in errors.h and it used
to be inherited via other files.
2021-10-07 01:36:51 +02:00
Willy Tarreau
b7fc4c4e9f BUILD: tree-wide: add missing http_ana.h from many places
At least 6 files make use of s->txn without including http_ana which
defines it. They used to get it from other includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
63617dbec6 BUILD: idleconns: include missing ebmbtree.h at several places
backend.c, all muxes, backend.c started manipulating ebmb_nodes with
the introduction of idle conns but the types were inherited through
other includes. Let's add ebmbtree.h there.
2021-10-07 01:36:51 +02:00
Willy Tarreau
74f2456c42 BUILD: ssl_ckch: include ebpttree.h in ssl_ckch.c
It's used but is only found through other includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
8db34cc974 BUILD: peers: need to include eb{32/mb/pt}tree.h
peers.c uses them all and used to only find them through other includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
b555eb1176 BUILD: vars: need to include xxhash
It's needed for XXH3(), and it used to get it through other includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
0ce6dc0107 BUILD: http_rules: requires http_ana-t.h for REDIRECT_*
It used to inherit it through other includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
286631a1a0 BUILD: sample: include openssl-compat
It's needed for EVP_*.
2021-10-07 01:36:51 +02:00
Willy Tarreau
1df20428f1 BUILD: httpclient: include missing ssl_sock-t
It's needed for SSL_SOCK_VERIFY_NONE.
2021-10-07 01:36:51 +02:00
Willy Tarreau
27539409fd BUILD: hlua: needs to include stream-t.h
It uses the SF_ERR_* error codes and currently gets them via
intermediary includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
397ad4135a BUILD: extcheck: needs to include stream-t.h
It uses the SF_ERR_* error codes and currently gets them via
intermediary includes.
2021-10-07 01:36:51 +02:00
Willy Tarreau
2476ff102f BUG/MEDIUM: sample: properly verify that variables cast to sample
The various variable-to-sample converters allow to turn a variable to
a sample of type string, sint or binary, but both the string one used
by strcmp() and the binary one used by secure_memcmp() are missing a
pointer check on the ability to the cast, making them crash if a
variable of type addr is used with strcmp(), or if an addr or bool is
used with secure_memcmp().

Let's rely on the new sample_conv_var2smp() function to run the proper
checks.

This will need to be backported to all supported version. It relies on
previous commits:

  CLEANUP: server: always include the storage for SSL settings
  CLEANUP: sample: rename sample_conv_var2smp() to *_sint
  CLEANUP: sample: uninline sample_conv_var2smp_str()
  MINOR: sample: provide a generic var-to-sample conversion function

For backports it's probably easier to check the sample_casts[] pointer
before calling it in sample_conv_strcmp() and sample_conv_secure_memcmp().
2021-10-07 01:36:51 +02:00
Willy Tarreau
168e8de1d0 MINOR: sample: provide a generic var-to-sample conversion function
We're using variable-to-sample conversion at least 4 times in the code,
two of which are bogus. Let's introduce a generic conversion function
that performs the required checks.
2021-10-07 01:36:51 +02:00
Willy Tarreau
4034e2cb58 CLEANUP: sample: uninline sample_conv_var2smp_str()
There's no reason to limit this one to this file, it could be used in
other contexts.
2021-10-07 01:36:51 +02:00
Willy Tarreau
d9be599529 CLEANUP: sample: rename sample_conv_var2smp() to *_sint
This one only handles integers, contrary to its sibling with the suffix
_str that only handles strings. Let's rename it and uninline it since it
may well be used from outside.
2021-10-07 01:36:51 +02:00
Willy Tarreau
80527bcb9d CLEANUP: server: always include the storage for SSL settings
The SSL stuff in struct server takes less than 3% of it and requires
lots of annoying ifdefs in the code just to take care of the cases
where the field is absent. Let's get rid of this and stop including
openssl-compat from server.c to detect NPN and ALPN capabilities.

This reduces the total LoC by another 0.4%.
2021-10-07 01:36:51 +02:00
William Lallemand
746e6f3f8e MINOR: httpclient/lua: supports headers via named arguments
Migrate the httpclient:get() method to named arguments so we can
specify optional arguments.

This allows to pass headers as an optional argument as an array.

The () in the method call must be replaced by {}:

	local res = httpclient:get{url="http://127.0.0.1:9000/?s=99",
	            headers= {["X-foo"]  = { "salt" }, ["X-bar"] = {"pepper" }}}
2021-10-06 15:21:02 +02:00
William Lallemand
ef574b2101 BUG/MINOR: httpclient/lua: does not process headers when failed
Do not try to process the header list when it is NULL. This case can
arrive when the request failed and did not return a response.
2021-10-06 15:15:03 +02:00
William Lallemand
2a879001b5 MINOR: httpclient: destroy checks if a client was started but not stopped
During httpclient_destroy, add a condition in the BUG_ON which checks
that the client was started before it has ended. A httpclient structure
could have been created without being started.
2021-10-06 15:15:03 +02:00
William Lallemand
4d60184887 BUG/MEDIUM: httpclient/lua: crash because of b_xfer and get_trash_chunk()
When using the lua httpclient, haproxy could crash because a b_xfer is
done in httpclient_xfer, which will do a zero-copy swap of the data in
the buffers. The ptr will then be free() by the pool.

However this can't work with a trash buffer, because the area was not
allocated from the pool buffer, so the pool is not suppose to free it
because it does not know this ptr, using -DDEBUG_MEMORY_POOLS will
result with a crash during the free.

Fix the problem by using b_force_xfer() instead of b_xfer which copy
the data instead. The problem still exist with the trash however, and
the trash API must be reworked.
2021-10-06 15:15:03 +02:00
William Lallemand
f77f1de802 MINOR: httpclient/lua: implement garbage collection
Implement the garbage collector of the lua httpclient.

This patch declares the __gc method of the httpclient object which only
does a httpclient_stop_and_destroy().
2021-10-06 15:15:03 +02:00
William Lallemand
b8b1370307 MINOR: httpclient: test if started during stop_and_destroy()
If the httpclient was never started, it is safe to destroy completely
the httpclient.
2021-10-06 15:15:03 +02:00
William Lallemand
ecb83e13eb MINOR: httpclient: stop_and_destroy() ask the applet to autokill
httpclient_stop_and_destroy() tries to destroy the httpclient structure
if the client was stopped.

In the case the client wasn't stopped, it ask the client to stop itself
and to destroy the httpclient structure itself during the release of the
applet.
2021-10-06 15:15:03 +02:00
William Lallemand
739f90a6ef MINOR: httpclient: set HTTPCLIENT_F_ENDED only in release
Only set the HTTPCLIENT_F_ENDED flag in httpclient_applet_release()
function so we are sure that the appctx is not used anymore once the
flag is set.
2021-10-06 15:15:03 +02:00
William Lallemand
03f5a1c77d MINOR: httpclient: destroy() must free the headers and the ists
httpclient_destroy() must free all the ist in the httpclient structure,
the URL in the request, the vsn and reason in the response.

It also must free the list of headers of the response.
2021-10-06 15:15:03 +02:00
Christopher Faulet
d34758849e BUG/MEDIUM: http-ana: Clear request analyzers when applying redirect rule
A bug was introduced by the commit 2d5650082 ("BUG/MEDIUM: http-ana: Reset
channels analysers when returning an error").

The request analyzers must be cleared when a redirect rule is applied. It is
not a problem if the redirect rule is inside an http-request ruleset because
the analyzer takes care to clear it. However, when it comes from a redirect
ruleset (via the "redirect ..."  directive), because of the above commit,
the request analyzers are no longer cleared. It means some HTTP request
analyzers may be called while the request channel was already flushed. It is
totally unexpected and may lead to crash.

Thanks to Yves Lafon for reporting the problem.

This patch must be backported everywhere the above commit was backported.
2021-10-04 14:32:02 +02:00
Christopher Faulet
d28b2b2352 BUG/MEDIUM: filters: Fix a typo when a filter is attached blocking the release
When a filter is attached to a stream, the wrong FLT_END analyzer is added
on the request channel. AN_REQ_FLT_END must be added instead of
AN_RES_FLT_END. Because of this bug, the stream may hang on the filter
release stage.

It seems to be ok for HTTP filters (cache & compression) in HTTP mode. But
when enabled on a TCP proxy, the stream is blocked until the client or the
server timeout expire because data forwarding is blocked. The stream is then
prematurely aborted.

This bug was introduced by commit 26eb5ea35 ("BUG/MINOR: filters: Always set
FLT_END analyser when CF_FLT_ANALYZE flag is set"). The patch must be
backported in all stable versions.
2021-10-04 08:28:44 +02:00
Willy Tarreau
6dfab112e1 REORG: sched: move idle time calculation from time.h to task.h
time.h is a horrible place to put activity calculation, it's a
historical mistake because the functions were there. We already have
most of the parts in sched.{c,h} and these ones make an exception in
the middle, forcing time.h to include some thread stuff and to access
the before/after_poll and idle_pct values.

Let's move these 3 functions to task.h with the other ones. They were
prefixed with "sched_" instead of the historical "tv_" which already
made no sense anymore.
2021-10-01 18:37:51 +02:00
Willy Tarreau
6136989a22 MINOR: time: uninline report_idle() and move it to task.c
I don't know why I inlined this one, this makes no sense given that it's
only used for stats, and it starts a circular dependency on tinfo.h which
can be problematic in the future. In addition, all the stuff related to
idle time calculation should be with the rest of the scheduler, which
currently is in task.{c,h}, so let's move it there.
2021-10-01 18:37:50 +02:00
Willy Tarreau
beeabf5314 MINOR: task: provide 3 task_new_* wrappers to simplify the API
We'll need to improve the API to pass other arguments in the future, so
let's start to adapt better to the current use cases. task_new() is used:
  - 18 times as task_new(tid_bit)
  - 18 times as task_new(MAX_THREADS_MASK)
  - 2 times with a single bit (in a loop)
  - 1 in the debug code that uses a mask

This patch provides 3 new functions to achieve this:
  - task_new_here()     to create a task on the calling thread
  - task_new_anywhere() to create a task to be run anywhere
  - task_new_on()       to create a task to run on a specific thread

The change is trivial and will allow us to later concentrate the
required adaptations to these 3 functions only. It's still possible
to call task_new() if needed but a comment was added to encourage the
use of the new ones instead. The debug code was not changed and still
uses it.
2021-10-01 18:36:29 +02:00
Willy Tarreau
6a2a912cb8 CLEANUP: tasks: remove the long-unused work_lists
Work lists were a mechanism introduced in 1.8 to asynchronously delegate
some work to be performed on another thread via a dedicated task.
The only user was the listeners, to deal with the queue. Nowadays
the tasklets have made this much more convenient, and have replaced
work_lists in the listeners. It seems there will be no valid use case
of work lists anymore, so better get rid of them entirely and keep the
scheduler code cleaner.
2021-10-01 18:30:14 +02:00
Willy Tarreau
7a9699916a MINOR: tasks: catch TICK_ETERNITY with BUG_ON() in __task_queue()
__task_queue() must absolutely not be called with TICK_ETERNITY or it
will place a never-expiring node upfront in the timers queue, preventing
any timer from expiring until the process is restarted. Code was found
to cause this using "task_schedule(task, now_ms)" which does this one
millisecond every 49.7 days, so let's add a condition against this. It
must never trigger since any process susceptible to trigger it would
already accumulate tasks until it dies.

An extra test was added in wake_expired_tasks() to detect tasks whose
timeout would have been changed after being queued.

An improvement over this could be in the future to use a non-scalar
type (union/struct) for expiration dates so as to avoid the risk of
using them directly like this. But now_ms is already such a valid
time and this specific construct would still not be caught.

This could even be backported to stable versions to help detect other
occurrences if any.
2021-09-30 17:09:39 +02:00
Christopher Faulet
cb59e0bc3c BUG/MINOR: tcp-rules: Stop content rules eval on read error and end-of-input
For now, tcp-request and tcp-response content rules evaluation is
interrupted before the inspect-delay when the channel's buffer is full, the
RX path is blocked or when a shutdown for reads was received. To sum up, the
evaluation is interrupted when no more input data are expected. However, it
is not exhaustive. It also happens when end of input is reached (CF_EOI flag
set) or when a read error occurred (CF_READ_ERROR flag set).

Note that, AFAIK, it is only a problem on HAProy 2.3 and prior when a H1 to
H2 upgrade is performed. On newer versions, it works as expected because the
stream is not created at this stage.

This patch must be backported as far as 2.0.
2021-09-30 16:37:29 +02:00
Christopher Faulet
eaba25dd97 BUG/MINOR: tcpcheck: Don't use arg list for default proxies during parsing
During tcp/http check rules parsing, when a sample fetch or a log-format
string is parsed, the proxy's argument list used to track unresolved
argument is no longer passed for default proxies. It means it is no longer
possible to rely on sample fetches depending on the execution context (for
instance 'nbsrv').

It is important to avoid HAProxy crashes because these arguments are
resolved during the configuration validity check. But, default proxies are
not evaluated during this stage. Thus, these arguments remain unresolved.

It will probably be possible to relax this rule. But to ease backports, it
is forbidden for now.

This patch must be backported as far as 2.2. It depends on the commit
"MINOR: arg: Be able to forbid unresolved args when building an argument
list".  It must be adapted for the 2.3 because PR_CAP_DEF capability was
introduced in the 2.4. A solution may be to test The proxy's id agains NULL.
2021-09-30 16:37:05 +02:00
Christopher Faulet
35926a16ac MINOR: arg: Be able to forbid unresolved args when building an argument list
In make_arg_list() function, unresolved dependencies are pushed in an
argument list to be resolved later, during the configuration validity
check. It is now possible to forbid such unresolved dependencies by omitting
<al> parameter (setting it to NULL). It is usefull when the parsing context
is not the same than the running context or when the parsing context is lost
after the startup stage. For instance, an argument may be defined in
defaults section during parsing and executed in a frontend/backend section.
2021-09-30 16:37:05 +02:00
Willy Tarreau
e3957f83e0 BUG/MAJOR: lua: use task_wakeup() to properly run a task once
The Lua tasks registered vi core.register_task() use a dangerous
task_schedule(task, now_ms) to start them, that will most of the
time work by accident, except when the time wraps every 49.7 days,
if now_ms is 0, because it's not valid to queue a task with an
expiration date set to TICK_ETERNITY, as it will fail all wakeup
checks and prevent all subsequent timers from being seen as expired.
The only solution in this case is to restart the process.

Fortunately for the vast majority of users it is extremely unlikely
to ever be met (only one millisecond every 49.7 days is at risk), but
this can be systematic for a process dealing with 1000 req/s, hence
the major tag.

The bug was introduced in 1.6-dev with commit 24f335340 ("MEDIUM: lua:
add coroutine as tasks."), so the fix must be backported to all stable
branches.
2021-09-30 16:26:51 +02:00
Willy Tarreau
12c02701d3 BUG/MEDIUM: lua: fix wakeup condition from sleep()
A time comparison was wrong in hlua_sleep_yield(), making the sleep()
code do nothing for periods of 24 days every 49 days. An arithmetic
comparison was performed on now_ms instead of using tick_is_expired().

This bug was added in 1.6-dev by commit 5b8608f1e ("MINOR: lua: core:
add sleep functions") so the fix should be backported to all stable
versions.
2021-09-30 16:26:51 +02:00
Remi Tricot-Le Breton
9543d5ad5b MINOR: ssl: Store the last SSL error code in case of read or write failure
In case of error while calling a SSL_read or SSL_write, the
SSL_get_error function is called in order to know more about the error
that happened. If the error code is SSL_ERROR_SSL or SSL_ERROR_SYSCALL,
the error queue might contain more information on the error. This error
code was not used until now. But we now need to store it in order for
backend error fetches to catch all handshake related errors.

The change was required because the previous backend fetch would not
have raised anything if the client's certificate was rejected by the
server (and the connection interrupted). This happens because starting
from TLS1.3, the 'Finished' state on the client is reached before its
certificate is sent to the server (see the "Protocol Overview" part of
RFC 8446). The only place where we can detect that the server rejected the
certificate is after the first SSL_read call after the SSL_do_handshake
function.

This patch then adds an extra ERR_peek_error after the SSL_read and
SSL_write calls in ssl_sock_to_buf and ssl_sock_from_buf. This means
that it could set an error code in the SSL context a long time after the
handshake is over, hence the change in the error fetches.
2021-09-30 11:04:35 +02:00
Remi Tricot-Le Breton
1fe0fad88b MINOR: ssl: Rename ssl_bc_hsk_err to ssl_bc_err
The ssl_bc_hsk_err sample fetch will need to raise more errors than only
handshake related ones hence its renaming to a more generic ssl_bc_err.
This patch is required because some handshake failures that should have
been caught by this fetch (verify error on the server side for instance)
were missed. This is caused by a change in TLS1.3 in which the
'Finished' state on the client is reached before its certificate is sent
(and verified) on the server side (see the "Protocol Overview" part of
RFC 8446).
This means that the SSL_do_handshake call is finished long before the
server can verify and potentially reject the client certificate.

The ssl_bc_hsk_err will then need to be expanded to catch other types of
errors.

This change is also applied to the frontend fetches (ssl_fc_hsk_err
becomes ssl_fc_err) and to their string counterparts.
2021-09-30 11:04:35 +02:00
Remi Tricot-Le Breton
61944f7a73 MINOR: ssl: Set connection error code in case of SSL read or write fatal failure
In case of a connection error happening after the SSL handshake is
completed, the error code stored in the connection structure would not
always be set, hence having some connection failures being described as
successful in the fc_conn_err or bc_conn_err sample fetches.
The most common case in which it could happen is when the SSL server
rejects the client's certificate. The SSL_do_handshake call on the
client side would be sucessful because the client effectively sent its
client hello and certificate information to the server, but the next
call to SSL_read on the client side would raise an SSL_ERROR_SSL code
(through the SSL_get_error function) which is decribed in OpenSSL
documentation as a non-recoverable and fatal SSL error.
This patch ensures that in such a case, the connection's error code is
set to a special CO_ERR_SSL_FATAL value.
2021-09-30 11:04:35 +02:00
Christopher Faulet
da3adebd06 BUG/MEDIUM: mux-h1/mux-fcgi: Reject messages with unknown transfer encoding
HAproxy only handles "chunked" encoding internally. Because it is a gateway,
we stated it was not a problem if unknown encodings were applied on a
message because it is the recipient responsibility to accept the message or
not. And indeed, it is not a problem if both the client and the server
connections are using H1. However, Transfer-Encoding headers are dropped
from H2 messages. It is not a problem for chunk-encoded payload because
dechunking is performed during H1 parsing. But, for any other encodings, the
xferred H2 message is invalid.

It is also a problem for internal payload manipulations (lua,
filters...). Because the TE request headers are now sanitiezd, unsupported
encoding should not be used by servers. Thus it is only a problem for the
request messages. For this reason, such messages are now rejected. And if a
server decides to use an unknown encoding, the response will also be
rejected.

Note that it is pretty uncommon to use other encoding than "chunked" on the
request payload. So it is not necessary to backport it.

This patch should fix the issue #1301. No backport is needed.
2021-09-28 16:39:47 +02:00
Christopher Faulet
545fbba273 MINOR: h1: Change T-E header parsing to fail if chunked encoding is found twice
According to the RFC7230, "chunked" encoding must not be applied more than
once to a message body. To handle this case, h1_parse_xfer_enc_header() is
now responsible to fail when a parsing error is found. It also fails if the
"chunked" encoding is not the last one for a request.

To help the parsing, two H1 parser flags have been added: H1_MF_TE_CHUNKED
and H1_MF_TE_OTHER. These flags are set, respectively, when "chunked"
encoding and any other encoding are found. H1_MF_CHNK flag is used when
"chunked" encoding is the last one.
2021-09-28 16:21:25 +02:00
Christopher Faulet
92cafb39e7 MINOR: http: Add 422-Unprocessable-Content error message
The last HTTP/1.1 draft adds the 422 status code in the list of client
errors. It normalizes the WebDav specific one (422-Unprocessable-Entity).
2021-09-28 16:21:25 +02:00
Christopher Faulet
f56e8465f0 BUG/MINOR: mux-h1/mux-fcgi: Sanitize TE header to only send "trailers"
Only chunk-encoded response payloads are supported by HAProxy. All other
transfer encodings are not supported and will be an issue if the HTTP
compression is enabled. So be sure only "trailers" is send in TE request
headers.

The patch is related to the issue #1301. It must be backported to all stable
versions. Be carefull for 2.0 and lower because the HTTP legacy must also be
fixed.
2021-09-28 16:21:25 +02:00
Christopher Faulet
631c7e8665 MEDIUM: h1: Force close mode for invalid uses of T-E header
Transfer-Encoding header is not supported in HTTP/1.0. However, softwares
dealing with HTTP/1.0 and HTTP/1.1 messages may accept it and transfer
it. When a Content-Length header is also provided, it must be
ignored. Unfortunately, this may lead to vulnerabilities (request smuggling
or response splitting) if an intermediary is only implementing
HTTP/1.0. Because it may ignore Transfer-Encoding header and only handle
Content-Length one.

To avoid any security issues, when Transfer-Encoding and Content-Length
headers are found in a message, the close mode is forced. The same is
performed for HTTP/1.0 message with a Transfer-Encoding header only. This
change is conform to what it is described in the last HTTP/1.1 draft. See
also httpwg/http-core#879.

Note that Content-Length header is also removed from any incoming messages
if a Transfer-Encoding header is found. However it is not true (not yet) for
responses generated by HAProxy.
2021-09-28 16:21:25 +02:00
Christopher Faulet
e136bd12a3 MEDIUM: mux-h1: Reject HTTP/1.0 GET/HEAD/DELETE requests with a payload
This kind of requests is now forbidden and rejected with a
413-Payload-Too-Large error.

It is unexpected to have a payload for GET/HEAD/DELETE requests. It is
explicitly allowed in HTTP/1.1 even if some servers may reject such
requests. However, HTTP/1.0 is not clear on this point and some old servers
don't expect any payload and never look for body length (via Content-Length
or Transfer-Encoding headers).

It means that some intermediaries may properly handle the payload for
HTTP/1.0 GET/HEAD/DELETE requests, while some others may totally ignore
it. That may lead to security issues because a request smuggling attack is
possible.

To prevent any issue, those requests are now rejected.

See also httpwg/http-core#904
2021-09-28 16:21:11 +02:00
Christopher Faulet
b3230f76e8 MINOR: mux-h1: Be able to set custom status code on parsing error
When a parsing error is triggered, the status code may be customized by
setting H1C .errcode field. By default a 400-Bad-Request is returned. The
function h1_handle_bad_req() has been renamed to h1_handle_parsing_error()
to be more generic.
2021-09-28 16:18:17 +02:00
Christopher Faulet
36e46aa28c MINOR: mux-h1: Set error code if possible when MUX_EXIT_STATUS is returned
In h1_ctl(), if output parameter is provided when MUX_EXIT_STATUS is
returned, it is used to set the error code. In addition, any client errors
(4xx), except for 408 ones, are handled as invalid errors
(MUX_ES_INVALID_ERR). This way, it will be possible to customize the parsing
error code for request messages.
2021-09-28 16:17:59 +02:00
Christopher Faulet
a015b3ec8b MINOR: log: Try to get the status code when MUX_EXIT_STATUS is retrieved
The mux .ctl callback can provide some information about the mux to the
caller if the third parameter is provided. Thus, when MUX_EXIT_STATUS is
retrieved, a pointer on the status is now passed. The mux may fill it. It
will be pretty handy to provide custom error code from h1 mux instead of
default ones (400/408/500/501).
2021-09-28 13:52:25 +02:00
Willy Tarreau
2d5d4e0c3e MINOR: init: extract the setup and end of threads to their own functions
The startup code was still ugly with tons of unreadable nested ifdefs.
Let's just have one function to set up the extra threads and another one
to wait for their completion. The ifdefs are isolated into their own
functions now and are more readable, just like the end of main(), which
now uses the same statements to start thread 0 with and without threads.
2021-09-28 11:44:31 +02:00
Willy Tarreau
fb641d7af0 MEDIUM: init: de-uglify the per-thread affinity setting
Till now the threads startup was quite messy:
  - we would start all threads but one
  - then we would change all threads' CPU affinities
  - then we would manually start the poll loop for the current thread

Let's change this by moving the CPU affinity setting code to a function
set_thread_cpu_affinity() that does this job for the current thread only,
and that is called during the thread's initialization in the polling loop.

It takes care of not doing this for the master, and will result in all
threads to be properly bound earlier and with cleaner code. It also
removes some ugly nested ifdefs.
2021-09-28 11:42:19 +02:00
Willy Tarreau
2a30f4d87e CLEANUP: init: remove useless test against MAX_THREADS in affinity loop
The test i < MAX_THREADS is pointless since the loop boundary is bound
to global.nbthread which is already not greater.
2021-09-28 09:56:44 +02:00
Willy Tarreau
51ec03a61d MINOR: config: use a standard parser for the "nbthread" keyword
Probably because of some copy-paste from "nbproc", "nbthread" used to
be parsed in cfgparse instead of using a registered parser. Let's fix
this to clean up the code base now.
2021-09-27 09:47:40 +02:00
William Lallemand
614e68337d BUG/MEDIUM: httpclient: replace ist0 by istptr
ASAN reported a buffer overflow in the httpclient. This overflow is the
consequence of ist0() which is incorrect here.

Replace all occurences of ist0() by istptr() which is more appropried
here since all ist in the httpclient were created from strings.
2021-09-26 18:19:55 +02:00
William Lallemand
4a4e663771 Revert "head-truc"
This reverts commit fe67e091859b07dca4622981a8d98a0b64de3cab.

Revert a development/test patch which was accidentely introduced.
2021-09-24 19:19:37 +02:00
William Lallemand
7d21836bc6 head-truc 2021-09-24 19:05:41 +02:00
Tim Duesterhus
eaf16fcb53 CLEANUP: slz: Mark reset_refs as static
This function has no prototype and is not used outside of slz.c.
2021-09-24 15:07:50 +02:00
William Lallemand
79416cbd7a BUG/MINOR: httpclient/lua: return an error on argument check
src/hlua.c:7074:6: error: variable 'url_str' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
        if (lua_type(L, -1) == LUA_TSTRING)
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
src/hlua.c:7079:36: note: uninitialized use occurs here
        hlua_hc->hc->req.url = istdup(ist(url_str));
                                          ^~~~~~~

Return an error on the stack if the argument is not a string.
2021-09-24 14:57:15 +02:00
William Lallemand
d7df73a114 MINOR: httpclient/lua: implement the headers in the response object
Provide a new field "headers" in the response of the HTTPClient, which
contains all headers of the response.

This field is a multi-dimensionnal table which could be represented this
way in lua:

    headers = {
       ["content-type"] = { "text/html" },
       ["cache-control"] = { "no-cache" }
    }
2021-09-24 14:29:36 +02:00
William Lallemand
3956c4ead2 MINOR: httpclient/lua: httpclient:get() API in lua
This commit provides an hlua_httpclient object which is a bridge between
the httpclient and the lua API.

The HTTPClient is callable in lua this way:

    local httpclient = core.httpclient()
    local response = httpclient:get("http://127.0.0.1:9000/?s=9999")
    core.Debug("Status: ".. res.status .. ", Reason : " .. res.reason .. ", Len:" .. string.len(res.body) .. "\n")

The resulting response object will provide a "status" field which
contains the status code, a "reason" string which contains the reason
string, and a "body" field which contains the response body.

The implementation uses the httpclient callback to wake up the lua task
which yield each time it pushes some data. The httpclient works in the
same thread as the lua task.
2021-09-24 14:29:36 +02:00
William Lallemand
1123dde6dd MINOR: httpclient: httpclient_ended() returns 1 if the client ended
httpclient_ended() returns 1 if there is no more data to collect,
because the client received everything or the connection ended.
2021-09-24 14:21:26 +02:00
William Lallemand
518878e007 MINOR: httpclient: httpclient_data() returns the available data
httpclient_data() returns the available data in the httpclient.
2021-09-24 14:21:26 +02:00
Thierry Fournier
b6b1cdeae4 CLEANUP: stats: Fix some alignment mistakes
This patch fix some broken alignements. Code is not modified
The command `git show -w` whows nothing.
2021-09-24 08:52:45 +02:00
Thierry Fournier
e9ed63e548 MINOR: stats: Enable dark mode on stat web page
According with the W3 CSS specification, media queries 5 allow
the browser to enable some CSS when dark mode is enabled. This
patch defines dark mode CSS for the stats page.

https://www.w3.org/TR/mediaqueries-5/#prefers-color-scheme
2021-09-24 08:27:40 +02:00
Dragan Dosen
9a006f9641 BUG/MINOR: http-ana: increment internal_errors counter on response error
A bug was introduced in the commit cff0f739e5 ("MINOR: counters: Review
conditions to increment counters from analysers"). The internal_errors
counter for the target server was incremented twice. The counter for the
session listener needs to be incremented instead.

This must be backported everywhere the commit cff0f739e5 is.
2021-09-23 16:25:47 +02:00
Christopher Faulet
564e39c4c6 MINOR: stream-int: Notify mux when the buffer is not stuck when calling rcv_buf
The transient flag CO_RFL_BUF_NOT_STUCK should now be set when the mux's
rcv_buf() function is called, in si_cs_recv(), to be sure the mux is able to
perform some optimisation during data copy. This flag is set when we are
sure the channel buffer is not stuck. Concretely, it happens when there are
data scheduled to be sent.

It is not a fix and this flag is not used for now. But it makes sense to have
this info to be sure to be able to do some optimisations if necessary.

This patch is related to the issue #1362. It may be backported to 2.4 to
ease future backports.
2021-09-23 16:25:47 +02:00
Christopher Faulet
2bc364c191 BUG/MEDIUM: stream-int: Defrag HTX message in si_cs_recv() if necessary
The stream interface is now responsible for defragmenting the HTX message of
the input channel if necessary, before calling the mux's .rcv_buf()
function. The defrag is performed if the underlying buffer contains only
input data while the HTX message free space is not contiguous.

The defrag is important here to be sure the mux and the app layer have the
same criteria to decide if a buffer is full or not. Otherwise, the app layer
may wait for more data because the buffer is not full while the mux is
blocked because it needs more space to proceed.

This patch depends on following commits:

  * MINOR: htx: Add an HTX flag to know when a message is fragmented
  * MINOR: htx: Add a function to know if the free space wraps

This patch is related to the issue #1362. It may be backported as far as 2.0
after some observation period (not sure it is required or not).
2021-09-23 16:25:16 +02:00
Christopher Faulet
4697c92c9d MINOR: htx: Add an HTX flag to know when a message is fragmented
HTX_FL_FRAGMENTED flag is now set on an HTX message when it is
fragmented. It happens when an HTX block is removed in the middle of the
message and flagged as unused. HTX_FL_FRAGMENTED flag is removed when all
data are removed from the message or when the message is defragmented.

Note that some optimisations are still possible because the flag can be
avoided in other situations. For instance when the last header of a bodyless
message is removed.
2021-09-23 16:19:36 +02:00
Christopher Faulet
68a14db573 MINOR: stream-int: Set CO_RFL transient/persistent flags apart in si_cs_rcv()
In si_cs_recv(), some CO_RFL flags are set when the mux's .rcv_buf()
function is called. Some are persitent inside si_cs_recv() scope, some
others must be computed at each call to rcv_buf(). This patch takes care of
distinguishing them.

Among others, CO_RFL_KEEP_RECV is a persistent flag while CO_RFL_BUF_WET is
transient.
2021-09-23 16:19:36 +02:00
Christopher Faulet
7833596ff4 BUG/MEDIUM: stream: Stop waiting for more data if SI is blocked on RXBLK_ROOM
If the stream-interface is waiting for more buffer room to store incoming
data, it is important at the stream level to stop to wait for more data to
continue. Thanks to the previous patch ("BUG/MEDIUM: stream-int: Notify
stream that the mux wants more room to xfer data"), the stream is woken up
when this happens. In this patch, we take care to interrupt the
corresponding tcp-content ruleset or to stop waiting for the HTTP message
payload.

To ease detection of the state, si_rx_blocked_room() helper function has
been added. It returns non-zero if the stream interface's Rx path is blocked
because of lack of room in the input buffer.

This patch is part of a series related to the issue #1362. It should be
backported as ar as 2.0, probably with some adaptations. So be careful
during backports.
2021-09-23 16:18:07 +02:00
Christopher Faulet
df99408e0d BUG/MEDIUM: stream-int: Notify stream that the mux wants more room to xfer data
When the mux failed to transfer data to the upper layer because of a lack of
room, it is important to wake the stream up to let it handle this
event. Otherwise, if the stream is waiting for more data, both the stream
and the mux reamin blocked waiting for each other.

When this happens, the mux set the CS_FL_WANT_ROOM flag on the
conn-stream. Thus, in si_cs_recv() we are able to detect this event. Today,
the stream-interface is blocked. But, it is not enough to wake the stream
up. To fix the bug, CF_READ_PARTIAL flag is extended to also handle cases
where a read exception occurred. This flag should idealy be renamed. But for
now, it is good enough. By setting this flag, we are sure the stream will be
woken up.

This patch is part of a series related to the issue #1362. It should be
backported as far as 2.0, probably with some adaptations. So be careful
during backports.
2021-09-23 16:16:57 +02:00
Christopher Faulet
46e058dda5 BUG/MEDIUM: mux-h1: Adjust conditions to ask more space in the channel buffer
When a message is parsed and copied into the channel buffer, in
h1_process_demux(), more space is requested if some pending data remain
after the parsing while the channel buffer is not empty. To do so,
CS_FL_WANT_ROOM flag is set. It means the H1 parser needs more space in the
channel buffer to continue. In the stream-interface, when this flag is set,
the SI is considered as blocked on the RX path. It is only unblocked when
some data are sent.

However, it is not accurrate because the parsing may be stopped because
there is not enough data to continue. For instance in the middle of a chunk
size. In this case, some data may have been already copied but the parser is
blocked because it must receive more data to continue. If the calling SI is
blocked on RX at this stage when the stream is waiting for the payload
(because http-buffer-request is set for instance), the stream remains stuck
infinitely.

To fix the bug, we must request more space to the app layer only when it is
not possible to copied more data. Actually, this happens when data remain in
the input buffer while the H1 parser is in states MSG_DATA or MSG_TUNNEL, or
when we are unable to copy headers or trailers into a non-empty buffer.

The first condition is quite easy to handle. The second one requires an API
refactoring. h1_parse_msg_hdrs() and h1_parse_msg_tlrs() fnuctions have been
updated. Now it is possible to know when we need more space in the buffer to
copy headers or trailers (-2 is returned). In the H1 mux, a new H1S flag
(H1S_F_RX_CONGESTED) is used to track this state inside h1_process_demux().

This patch is part of a series related to the issue #1362. It should be
backported as far as 2.0, probably with some adaptations. So be careful
during backports.
2021-09-23 16:13:17 +02:00
Christopher Faulet
216d3352b1 BUG/MINOR: h1-htx: Fix a typo when request parser is reset
In h1_postparse_req_hdrs(), if we need more space to copy headers, the request
parser is reset. However, because of a typo, it was reset as a response parser
instead of a request one. h1m_init_req() must be called.

This patch must be backported as far as 2.2.
2021-09-23 16:10:36 +02:00
Amaury Denoyelle
cde911231e MINOR: quic: fix qcc subs initialization 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
cd28b27581 MEDIUM: quic: implement mux release/conn free 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
414cac5f9d MINOR: quic: define close handler 2021-09-23 15:27:25 +02:00
Frédéric Lécaille
865b07855e MINOR: quic: Crash upon too big packets receipt
This bug came with this commit:
    ("MINOR: quic: RX packets memory leak")
Too big packets were freed twice.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
3230bcfdc4 MINOR: quic: Possible endless loop in qc_treat_rx_pkts()
Ensure we do not endlessly treat always the same encryption level
in qc_treat_rx_pkts().
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
310d1bd08f MINOR: quic: RX packets memory leak
Missing RX packet reference counter decrementation at the lowest level.
This leaded the memory reserved for RX packets to never be released.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
ebc3fc1509 CLEANUP: quic: Remove useless inline functions
We want to track the packet reference counting more easily, so without
inline functions.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
8526f14acd MINOR: quic: Wake up the xprt from mux
We wake up the xprt as soon as STREAM frames have been pushed to
the TX mux buffer (->tx.buf).
We also make the mux subscribe() to the xprt layer if some data
remain in its ring buffer after having try to transfer them to the
xprt layer (TX mux buffer for the stream full).
Also do not consider a buffer in the ring if not allocated (see b_size(buf))
condition in the for(;;) loop.
Make a call to qc_process_mux() if possible when entering qc_send() to
fill the mux with data from streams in the send or flow control lists.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
1d40240f25 MINOR: quic: Implement qc_process_mux()
At this time, we only add calls to qc_resume_each_sending_qcs()
which handle the flow control and send lists.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
d2ba0967b7 MINOR: quic: Stream FIN bit fix in qcs_push_frame()
The FIN of a STREAM frame to be built must be set if there is no more
at all data in the ring buffer.
Do not do anything if there is nothing to transfer the ->tx.buf mux
buffer via b_force_xfer() (without zero copy)
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
1c482c665b MINOR: quic: Wake up the mux upon ACK receipt
When ACK have been received by the xprt, it must wake up the
mux if this latter has subscribed to SEND events. This is the
role of qcs_try_to_consume() to detect such a situation. This
is the function which consumes the buffer filled by the mux.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
513b4f290a MINOR: quic: Implement quic_conn_subscribe()
We implement ->subscribe() xprt callback which should be used only by the mux.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
153194f47a MINOR: mux_quic: Export the mux related flags
These flags should be available from the xprt which must be able to
wake up the mux when blocked.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
acd43a597c MINOR: quic: Add useful trace about pktns discarding
It is important to know if the packet number spaces used during the
handshakes have really been discarding. If not, this may have a
significant impact on the packet loss detection.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
8c27de7d20 MINOR: quic: Initial packet number spaced not discarded
There were cases where the Initial packet number space was not discarded.
This leaded the packet loss detection to continue to take it into
considuration during the connection lifetime. Some Application level
packets could not be retransmitted.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
2cb130c980 MINOR: quic: Constantness fixes for frame builders/parsers.
This is to ensure we do not modify important static variables:
the QUIC frame builders and parsers.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
dc2593e460 MINOR: quic: Wrong packet flags settings during frame building
We flag the packet as being ack-eliciting when building the frame.
But a wrong variable was used to to so.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
156a59b7c9 MINOR: quic: Confusion between TX/RX for the frame builders
QUIC_FL_TX_PACKET_ACK_ELICITING was replaced by QUIC_FL_RX_PACKET_ACK_ELICITING
by this commit due to a copy and paste:
   e5b47b637 ("MINOR: quic: Add a mask for TX frame builders and their authorized packet types")
Furthermore the flags for the PADDING frame builder was not initialized.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
578a7898f2 MINOR: mux_quic: move qc_process() code to qc_send()
qc_process is supposed to be run for each I/O handler event, not
only for "send" events.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
785d3bdedc MINOR: quic: Make use of buffer structs to handle STREAM frames
The STREAM data to send coming from the upper layer must be stored until
having being acked by the peer. To do so, we store them in buffer structs,
one by stream (see qcs.tx.buf). Each time a STREAM is built by quic_push_frame(),
its offset must match the offset of the first byte added to the buffer (modulo
the size of the buffer) by the frame. As they are not always acknowledged in
order, they may be stored in eb_trees ordered by their offset to be sure
to sequentially delete the STREAM data from their buffer, in the order they
have been added to it.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
b9c06fbe52 MINOR: quic_sock: Do not flag QUIC connections as being set
This is to let conn_get_src() or conn_get_src() set the source
or destination addresses for the connection.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
48fc74af64 MINOR: quic: Missing active_connection_id_limit default value
The peer transport parameter values were not initialized with
the default ones (when absent), especially the
"active_connection_id_limit" parameter with 2 as default value
when absent from received remote transport parameters. This
had as side effect to send too much NEW_CONNECTION_ID frames.
This was the case for curl which does not announce any
"active_connection_id_limit" parameter.
Also rename ->idle_timeout to ->max_idle_timeout to reflect the RFC9000.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
d4d6aa7b5c MINOR: quic: Attach the QUIC connection to a thread.
Compute a thread ID from a QUIC CID and attach the I/O handler to this
thread.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
2fc76cffaf MINOR: quic: Make QUIC-TLS support at least two initial salts
These salts are used to derive initial secrets to decrypt the first Initial packet.
We support draft-29 and v1 QUIC version initial salts.
Add parameters to our QUIC-TLS API functions used to derive these secret for
these salts.
Make our xprt_quic use the correct initial salt upon QUIC version field found in
the first paquet. Useful to support connections with curl which use draft-29
QUIC version.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
2766e78f3b MINOR: quic: Shorten some handshakes
Move the "ACK required" bit from the packet number space to the connection level.
Force the "ACK required" option when acknowlegding Handshake or Initial packet.
A client may send three packets with a different encryption level for each. So,
this patch modifies qc_treat_rx_pkts() to consider two encryption level passed
as parameters, in place of only one.
Make qc_conn_io_cb() restart its process after the handshake has succeeded
so that to process any Application level packets which have already been received
in the same datagram as the last CRYPTO frames in Handshake packets.
2021-09-23 15:27:25 +02:00
Amaury Denoyelle
42bb8aac65 MINOR: h3/mux: detect fin on last h3 frame of the stream 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
8e2a998b17 MINOR: h3: send htx data 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
15b096180d MINOR: h3: encode htx headers to QPACK 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
e0930fcb07 MINOR: qpack: encode headers functions 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
4652a59255 MINOR: qpack: create qpack-enc module 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
26dfd90eb0 MINOR: h3: define snd_buf callback and divert mux ops 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
7b1d3d6d3d MINOR: mux-quic: send SETTINGS on uni stream 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
f52151d83e MEDIUM: mux-quic: implement ring buffer on stream tx 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
990435561b MINOR: h3: allocate stream on headers 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
b49fa1aa6d MINOR: h3: parse headers to htx 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
fd7cdc3e70 MINOR: qpack: generate headers list on decoder
TMP -> non-free strdup
TMP -> currently only support indexed field line or literal field line
with name reference
2021-09-23 15:27:25 +02:00
Amaury Denoyelle
484317e5e8 MINOR: qpack: fix wrong comment 2021-09-23 15:27:25 +02:00
Amaury Denoyelle
3394939475 MINOR: h3: change default settings
In particular, advertise a 0-length dynamic table for QPACK.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
a5b1b894c6 MINOR: quic: Prepare STREAM frames to fill QUIC packets
We must take as most as possible data from STREAM frames to be encapsulated
in QUIC packets, almost as this is done for CRYPTO frames whose fields are
variable length fields. The difference is that STREAM frames are only accepted
for short packets without any "Length" field. So it is sufficient to call
max_available_room() for that in place of max_stream_data_size() as this
is done for CRYPTO data.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
e16f0bd1e3 MINOR: h3: Send h3 settings asap
As it is possible to send Application level packets during the handshake,
let's send the h3 settings asaps.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
4bade77bf9 MINOR: quic: Prepare Application level packet asap.
It is possible the TLS stack stack provides us with 1-RTT TX secrets
at the same time as Handshake secrets are provided. Thanks to this
simple patch we can build Application level packets during the handshake.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
f798096412 MINOR: quic: Post handshake packet building improvements
Make qc_prep_hdshk_pkts() and qui_conn_io_cb() handle the case
where we enter them with QUIC_HS_ST_COMPLETE or QUIC_HS_ST_CONFIRMED
as connection state with QUIC_TLS_ENC_LEVEL_APP and QUIC_TLS_ENC_LEVEL_NONE
to consider to prepare packets.
quic_get_tls_enc_levels() is modified to return QUIC_TLS_ENC_LEVEL_APP
and QUIC_TLS_ENC_LEVEL_NONE as levels to consider when coalescing
packets in the same datagram.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
754f99e995 MINOR: quic: Missing case when discarding HANDSHAKE secrets
With very few packets received by the listener, it is possible
that its state may move from QUIC_HS_ST_SERVER_INITIAL to
QUIC_HS_ST_COMPLETE without transition to QUIC_HS_ST_SERVER_HANDSHAKE state.
This latter state is not mandatory.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
67f47d0125 MINOR: quic: Wrong flags handling for acks
Fixes several concurrent accesses issue regarding QUIC_FL_PKTNS_ACK_RECEIVED and
QUIC_FL_PKTNS_ACK_REQUIRED flags.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
d067088695 MINOR: quic: Coalesce Application level packets with Handshake packets.
This simple enable use to coalesce Application level packet with
Handshake ones at the end of the handshake. This is highly useful
if we do want to send a short Handshake packet followed by Application
level ones.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
eed7a7d73b MINOR: quic: Atomically get/set the connection state
As ->state quic_conn struct member field is shared between threads
we must atomically get and set its value.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
4436cb6606 MINOR: quic: Evaluate the packet lengths in advance
We must evaluate the packet lenghts in advance to be sure we do not
consume a packet number for nothing. The packet building must always
succeeds. This is the role of qc_eval_pkt() implemented by this patch
called before calling qc_do_build_pkt() which was previously modified to
always succeed.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
82b8652ac6 MINOR: quic: Missing acks encoded size updates.
There were cases where the encoded size of acks was not updated leading
to ACK frames building too big compared to the expected size. At this
time, this makes the code "BUG_ON()".
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
8b19a9f162 MINOR: quic: Make use of the last cbuf API when initializing TX ring buffers
Initialize the circular buffer internal buffer from a specific pool for TX ring
buffers named "pool_head_quic_tx_ring".
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
dbe25afbe6 MINOR: quic: Add a pool for TX ring buffer internal buffer
We want to allocate the internal buffer of TX ring buffer from a pool.
This patch add "quic_tx_ring_pool" to do so.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
a2e954a817 MINOR: quic: Make circular buffer internal buffers be variable-sized.
For now on thanks to this simple patch we can use circular buffers with
a variable-sized internal buffer.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
9445abc013 MINOR: quic: Rename functions which do not build only Handshake packets
Rename qc_build_hdshk_pkt() to qc_build_pkt() and qc_do_build_hdshk_pkt()
to qc_do_build_pkt().
Update their comments consequently.
Make qc_do_build_hdshk_pkt() BUG_ON() when it does not manage to build
a packet. This is a bug!
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
5d00b2d7b1 MINOR: quic: Remove Application level related functions
Remove the functions which were specific to the Application level.
This is the same function which build any packet for any encryption
level: quic_prep_hdshk_pkts() directly called from the quic_conn_io_cb().
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
f252adb368 MINOR: quic: qc_do_build_hdshk_pkt() does not need to pass a copy of CRYPTO frame
There is no need to pass a copy of CRYPTO frames to qc_build_frm() from
qc_do_build_hdshk_pkt(). Furthermore, after the previous modifications,
qc_do_build_hdshk_pkt() do not build only CRYPTO frame from ->pktns.tx.frms
MT_LIST but any type of frame.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
a7348f6f85 MINOR: quic: Make qc_build_hdshk_pkt() atomically consume a packet number
Atomically increase the "next packet variable" before building a new packet.
Make the code bug on a packet building failure. This should never happen
if we do not want to consume a packet number for nothing. There are remaining
modifications to come to ensure this is the case.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
91ae7aa7ec MINOR: quic: quic_conn_io_cb() task rework
Modify this task which is called at least each a packet is received by a listener
so that to make it behave almost as qc_do_hdshk(). This latter is no more useful
and removed.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
0ac3851f14 MINOR: quic: Modify qc_build_cfrms() to support any frame
This function was responsible of building CRYPTO frames to fill as much as
possible a packet passed as argument. This patch makes it support any frame
except STREAM frames whose lengths are highly variable.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
59b07c737b MINOR: quic: Atomically handle packet number space ->largest_acked_pn variable
Protect this variable (largest acked packet number) from any concurrent access.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
e1aa0d347a MINOR: quic: Modify qc_do_build_hdshk_pkt() to accept any packet type
With this patch qc_do_build_hdshk_pkt() is also able to build Application level
packet type. Its name should be consequently renamed (to come).
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
0e50e1b0b5 MINOR: quic: Add the packet type to quic_tx_packet struct
This is required to build packets from the same function.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
522c65ce39 MINOR: quic: Store post handshake frame in ->pktns.tx.frms MT_LIST
We want to treat all the frames to be built the same way as frames
built during handshake (CRYPTO frames). So, let't store them at the same
place which is an MT_LIST.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
546186b1cf MINOR: quic: Add the QUIC connection state to traces
This connection variable was missing. It is useful to debug issues.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
f5821dc7b7 MINOR: quic: Add a mask for TX frame builders and their authorized packet types
As this has been done for RX frame parsers, we add a mask for each TX frame
builder to denote the packet types which are authorized to embed such frames.
Each time a TX frame builder is called, we check that its mask matches the
packet type the frame is built for.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
0ad0458a56 MINOR: quic: Replace quic_tx_frm struct by quic_frame struct
These structures are similar. quic_tx_frm was there to try to reduce the
size of such objects which embed a union for all the QUIC frames.
Furtheremore this patch fixes the issue where quic_tx_frm objects were freed
from the pool for quic_frame.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
c88df07bdd MINOR: quic: Make ->tx.frms quic_pktns struct member be thread safe
Replace this member which is a list struct by an mt_list struct.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
120ea6f169 MINOR: quic: Make qc_treat_rx_pkts() be thread safe.
Make quic_rx_packet_ref(inc|dec)() functions be thread safe.
Make use of ->rx.crypto.frms_rwlock RW lock when manipulating RX frames
from qc_treat_rx_crypto_frms().
Modify atomically several variables attached to RX part of quic_enc_level struct.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
98cdeb2f0c MINOR: quic: Rename ->rx.rwlock of quic_enc_level struct to ->rx.pkts_rwlock
As there are at two RW lock in this structure, let's the name of this lock
be more explicit.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
9054d1b564 MINOR: quic: Missing encryption level rx.crypto member initialization and lock.
->rx.crypto member of quic_enc_level struct was not initialized as
this was done for all other members of this structure. This patch
fixes this.
Also adds a RW lock for the frame of this member.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
01abc4612b MINOR: quic: Unitialized mux context upon Client Hello message receipt.
If we let the connection packet handler task (quic_conn_io_cb) process the first
client Initial packet which contain the TLS Client Hello message before the mux
context is initialized, quic_mux_transport_params_update() makes haproxy crash.
->start xprt callback already wakes up this task and is called after all the
connection contexts are initialized. So, this patch do not wakes up quic_conn_io_cb()
if the mux context is not initialized (this was already the case for the connection
context (conn_ctx)).
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
0eb60c5b4d MINOR: quic: Add TX packets at the very last time to their tree.
If we add TX packets to their trees before sending them, they may
be detected as lost before being sent. This may make haproxy crash
when it retreives the prepared packets from TX ring buffers, dereferencing
them after they have been freed.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
c8d3f873e8 MINOR: quic: Remove old TX buffer implementation
We use only ring buffers (struct qring) to prepare and send QUIC datagrams.
We can safely remove the old buffering implementation which was not thread safe.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
98ad56a049 MINOR: quic_tls: Make use of the QUIC V1 salt.
This salt is used to derive the Initial secrets.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
c5b0c93c26 MINOR: quic: Make use of TX ring buffers to send QUIC packets
We modify the functions responsible of building packets to put these latters
in ring buffers (qc_build_hdshk_pkt() during the handshake step, and
qc_build_phdshk_apkt() during the post-handshake step). These functions
remove a ring buffer from its list to build as much as possible datagrams.
Eache datagram is prepended of two field: the datagram length and the
first packet in the datagram. We chain the packets belonging to the same datagram
in a singly linked list to reach them from the first one: indeed we must
modify some members of each packet when we really send them from send_ppkts().
This function is also modified to retrieved the datagram from ring buffers.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
6b19764e3c MINOR: quic: Initialize pointers to TX ring buffer list
We initialize the pointer to the listener TX ring buffer list.
Note that this is not done for QUIC clients  as we do not fully support them:
we only have to allocate the list and attach it to server struct I guess.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
48f8e1925b MINOR: proto_quic: Allocate TX ring buffers for listeners
We allocate an array of QUIC ring buffer, one by thread, and arranges them in a
MT_LIST. Everything is allocated or nothing: we do not want to usse an incomplete
array of ring buffers to ensure that each thread may safely acquire one of these
buffers.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
ed9119f39e BUG/MINOR: quic: Too much reduced computed space to build handshake packets
Before this patch we reserved 16 bytes (QUIC_TLS_TAG_LEN) before building the
handshake packet to be sure to be able to add the tag which comes with the
the packet encryption, decreasing the end offset of the building buffer by 16 bytes.
But this tag length was taken into an account when calling qc_build_frms() which
computes and build crypto frames for the remaining available room thanks to <*len>
parameter which is the length of the already present bytes in the building buffer
before adding CRYPTO frames. This leaded us to waste the 16 last bytes of the buffer
which were not used.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
82d1daa268 MINOR: quic: Add the QUIC v1 initial salt.
See initial_salt value for QUIC-TLS RFC 9001 at
https://www.rfc-editor.org/rfc/rfc9001.html#name-initial-secrets
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
4b1fddcfcf MINOR: quic: Prefer x25519 as ECDH preferred parametes.
This make at least our listeners answer to ngtcp2 clients without
HelloRetryRequest message. It seems the server choses the first
group in the group list ordered by preference and set by
SSL_CTX_set1_curves_list() which match the client ones.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
c6bc185c18 MINOR: quic: Add a ring buffer implementation for QUIC
This implementation is inspired from Linux kernel circular buffer implementation
(see include/linux/circ-buf.h). Such buffers may be used at the same time both
by writer and reader (lock-free).
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
f3d078d22e MINOR: quic: Make qc_lstnr_pkt_rcv() be thread safe.
Modify the I/O dgram handler principal function used to parse QUIC packets
be thread safe. Its role is at least to create new incoming connections
add to two trees protected by the same RW lock. The packets are for now on
fully parsed before possibly creating new connections.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
01ab6618fe MINOR: quic: Move conn_prepare() to ->accept_conn() callback
The xprt context must be initialized before receiving further packets from
the I/O dgram handler.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
6de7287cc7 MINOR: quic: Connection allocations rework
Allocate everything needed for a connection (struct quic_conn) from the same
function.
Rename qc_new_conn_init() to qc_new_conn() to reflect these modifications.
Insert these connection objects in their tree after returning from this function.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
7fd59789f2 MINOR: quic: Do not wakeup the xprt task on ACK receipt
This is an old statement which was there before implemeting the PTO and
packet loss detection. There is no reason to keep for now on.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
2e7ffc9d31 MINOR: quic: Add useful traces for I/O dgram handler
This traces have already help in diagnosing multithreading issues.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
a11d0e26d4 MINOR: quic: Replace the RX unprotected packet list by a thread safety one.
This list is shared between the I/O dgram handler and the task responsible
for processing the QUIC packets inside.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
c28aba2a8d MINOR: quic: Replace the RX list of packet by a thread safety one.
This list is shared between the I/O dgram handler and the task responsible
for processing the QUIC packets.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
1eaec33cb5 MINOR: quic: Replace quic_conn_ctx struct by ssl_sock_ctx struct
Some SSL call may be called with pointer to ssl_sock_ctx struct as parameter
which does not match the quic_conn_ctx struct type (see ssl_sock_infocb()).
I am not sure we have to keep such callbacks for QUIC but we must ensure
the SSL and QUIC xprts use the same data structure as context.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
a5fe49f44a MINOR: quic: Move the connection state
Move the connection state from quic_conn_ctx struct to quic_conn struct which
is the structure which is used to store the QUIC connection part information.
This structure is initialized by the I/O dgram handler for each new connection
to QUIC listeners. This is needed for the multithread support so that to not
to have to depend on the connection context potentially initialized by another
thread.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
9fccace8b0 MINOR: quic: Add a lock for RX packets
We must protect from concurrent the tree which stores the QUIC packets received
by the dgram I/O handler, these packets being also parsed by the xprt task.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
654c691731 MINOR: quic: Do not stop the packet parsing too early in qc_treat_rx_packets()
Continue to parse the packets even if we will not be able to acknowledge them.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
6fe21b0dec BUG/MINOR: quic: Wrong RX packet reference counter usage
No need to call free_quic_rx_packet() after calling quic_rx_packet_eb64_delete()
as this latter already calls quic_rx_packet_refdec() also called by
free_quic_rx_packet().
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
c4b93ea57d CLEAUNUP: quic: Usage of a useless variable in qc_treat_rx_pkts()
The usage of a <drop> variable is unnecessary here.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
8ba4276d13 BUG/MINOR: quic: Missing cases treatement when updating ACK ranges
Let's say that we have to insert a range R between to others A and B
with A->first <= R->first <= B->first. We have to remove the ranges
which are overlapsed by R during. This was correctly done when
the intersection between A and R was not empty, but not when the
intersection between R and B was not empty. If this latter case
after having inserting a new range R we set <new> variable as the
node to consider to check the overlaping between R and its following
ranges.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
c825eba5f9 MINOR: quic: Remove a useless variable in quic_update_ack_ranges_list()
This very minor modification is there to ease the readibilyt of this function.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
d3f4dd8014 MINOR: quic: Useless test in quic_update_ack_ranges_list()
At this place, the condition "le_ar->first.key <= ar->first" is true because
<le_ar> is the ack-range just below <ar> ack range.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
9ef64cd078 MINOR: quic: quic_update_ack_ranges_list() code factorization
Very minor modification to avoid repeating the same code section in this function
when allocation new ack range.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
baea284c3c BUG/MINOR: quic: Wrong memory free in quic_update_ack_ranges_list()
Wrong call to free() in place of pool_free() for an object allocated from a pool
memory.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
1a5e88c86a MINOR: quic: Remove header protection also for Initial packets
Make qc_try_rm_hp() be able to remove the header protection of Initial packets
which are the first incoming packets of a connection without context.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
497fa78ad8 MINOR: quic: Derive the initial secrets asap
Make depends qc_new_isecs() only on quic_conn struct initialization only (no more
dependency on connection struct initialization) to be able to run it as soon as
the quic_conn struct is initialized (from the I/O handler) before running ->accept()
quic proto callback.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
d24c2ecb16 MINOR: quic: Remove header protection for conn with context
We remove the header protection of packet only for connection with already
initialized context. This latter keep traces of the connection state.
Furthermore, we enqueue the first Initial packet for a new connection
after having completely parsed the packet so that to not start the accept
process for nothing.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
3d77fa754d MINOR: quic: QUIC conn initialization from I/O handler
Move the QUIC conn (struct quic_conn) initialization from quic_sock_accept_conn()
to qc_lstnr_pkt_rcv() as this is done for the server part.
Move the timer initialization to ->start xprt callback to ensure the connection
context is done : it is initialized by the ->accept callback which may be run
by another thread than the one for the I/O handler which also run ->start.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
de935f34e5 BUG/MINOR: quic: Do not check the acception of a new conn from I/O handler.
As the ->conn member of quic_conn struct is reset to NULL value by the ->accept
callback potentially run by another thread, this check is irrelevant.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
ecb5872012 MINOR: quic: Initialize the session before starting the xprt.
We must ensure the session and the mux are initialized before starting the xprt.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
1e1aad4ff4 MINOR: quic: Move an SSL func call from QUIC I/O handler to the xprt init.
Move the call to SSL_set_quic_transport_params() from the listener I/O dgram
handler to the ->init() callback of the xprt (qc_conn_init()) which initializes
its context where is stored the SSL context itself, needed by
SSL_set_quic_transport_params(). Furthermore this is already what is done for the
server counterpart of ->init() QUIC xprt callback. As the ->init() may be run
by another thread than the one for the I/O handler, the xprt context could
not be potentially already initialized before calling SSL_set_quic_transport_params()
from the I/O handler.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
785c9c998a MINOR: quic: Replace max_packet_size by max_udp_payload size.
The name the maximum packet size transport parameter was ambiguous and replaced
by maximum UDP payload size. Our code would be also ambiguous if it does not
reflect this change.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
0faf8078a8 MINOR: quic: Update the streams transport parameters.
Set the streams transport parameters which could not be initialized because they
were not available during initializations. Indeed, the streams transport parameters
are provided by the peer during the handshake.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
e8139f3b37 BUG/MINOR: quic: Wrong ->accept() error handling
Really signal the caller that ->accept() has failed if the session could not
be initialized because conn_complete_session() has failed. This is the case
if the mux could not be initialized too.
When it fails an ->accept() must returns -1 in case of resource shortage.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
422a39cf2c MINOR: quic: Add callbacks for (un)scribing to QUIC xprt.
Add these callbacks so that the QUIC mux may (un)scribe to the read/write xprt
events.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
fbe3b77c4e MINOR: quic: Disable the action of ->rcv_buf() xprt callback
Deactivate the action of this callback at this time. I am not sure
we will keep it for QUIC as it does not really make sense for QUIC:
the QUIC packet are already recvfrom()'ed by the low level I/O handler
used for all the connections.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
27faba7240 MINOR: quic_sock: Finalize the QUIC connections.
Add a call to conn_connection_complete() so that to install the mux any
QUIC connection.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
b4672fb6f0 MINOR: qpack: Add QPACK compression.
Implement QPACK used for HTTP header compression by h3.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
ccac11f35a MINOR: h3: Add HTTP/3 definitions.
Add all the definitions for HTTP/3 implementation.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
dfbae766b2 MINOR: mux_quic: Add QUIC mux layer.
This file has been derived from mux_h2.c removing all h2 parts. At
QUIC mux layer, there must not be any reference to http. This will be the
responsability of the application layer (h3) to open streams handled by the mux.
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
5aa4143d6c MINOR: quic: Move transport parmaters to anynomous struct.
We move ->params transport parameters to ->rx.params. They are the
transport parameters which will be sent to the peer, and used for
the endpoint flow control. So, they will be used to received packets
from the peer (RX part).
Also move ->rx_tps transport parameters to ->tx.params. They are the
transport parameter which are sent by the peer, and used to respect
its flow control limits. So, they will be used when sending packets
to the peer (TX part).
2021-09-23 15:27:25 +02:00
Frédéric Lécaille
577fe48890 BUG/MINOR: quic: Possible NULL pointer dereferencing when dumping streams.
This bug may occur when displaying streams traces. It came with this commit:
242fb1b63 ("MINOR: quic: Drop packets with STREAM frames with wrong direction.").
2021-09-23 15:27:25 +02:00
Willy Tarreau
6f97b4ef33 BUG/MEDIUM: leastconn: fix rare possibility of divide by zero
An optimization was brought in commit 5064ab6a9 ("OPTIM: lb-leastconn:
do not unlink the server if it did not change") to avoid locking the
server just to discover it did not move. However a mistake was made
because the operation involves a divide with a value that is read
outside of its usual lock, which makes it possible to be zero at the
exact moment we watch it if another thread takes the server down under
the lbprm lock, resulting in a divide by zero.

Therefore we must check that the value is not null there.

This must be backported to 2.4.
2021-09-22 07:24:02 +02:00
Willy Tarreau
c8cac04bd5 MEDIUM: listener: deprecate "process" in favor of "thread" on bind lines
The "process" directive on "bind" lines becomes quite confusing considering
that the only allowed value is 1 for the process, and that threads are
optional and come after the mandatory "1/".

Let's introduce a new "thread" directive to directly configure thread
numbers, and mark "process" as deprecated. Now "process" will emit a
warning and will suggest how to be replaced with "thread" instead.
The doc was updated accordingly (mostly a copy-paste of the previous
description which was already up to date).

This is marked as MEDIUM as it will impact users having "zero-warning"
and "process" specified.
2021-09-21 14:35:42 +02:00
Amaury Denoyelle
cd8a6f28c6 MINOR: server: enable slowstart for dynamic server
Enable the 'slowstart' keyword for dynamic servers. The slowstart task
is allocated in 'add server' handler if slowstart is used.

As the server is created in disabled state, there is no need to start
the task. The slowstart task will be automatically started on the first
'enable server' invocation.
2021-09-21 14:00:32 +02:00
Amaury Denoyelle
29d1ac1330 REORG: server: move slowstart init outside of checks
'slowstart' can be used without check on a server, with the CLI handlers
'enable/disable server'. Move the code to initialize and start the
slowstart task outside of check.c.

This change will also be reused to enable slowstart for dynamic servers.
2021-09-21 14:00:32 +02:00
Amaury Denoyelle
725f8d29ff MINOR: server: enable more check related keywords for dynamic servers
Allow to use the check related keywords defined in server.c. These
keywords can be enabled now that checks have been implemented for
dynamic servers.

Here is the list of the new keywords supported :
- error-limit
- observe
- on-error
- on-marked-down
- on-marked-up
2021-09-21 14:00:32 +02:00
Amaury Denoyelle
79b90e8cd4 MINOR: server: enable more keywords for ssl checks for dynamic servers
Allow to configure ssl support for dynamic server checks independently
of the ssl server configuration. This is done via the keyword
"check-ssl". Also enable to configure the sni/alpn used for the check
via "check-sni/alpn".
2021-09-21 14:00:07 +02:00
Amaury Denoyelle
b621552ca3 BUG/MINOR: server: alloc dynamic srv ssl ctx if proxy uses ssl chk rule
The ssl context is not initialized for a dynamic server, even if there
is a tcpcheck rule which uses ssl on the related backed. This will cause
the check initialization to failed with the message :
  "Out of memory when initializing an SSL connection"

This can be reproduced by having the following config in the backend :
  option tcp-check
  tcp-check connect ssl
and create a dynamic server with check activated and a ca-file.

Fix this by calling the prepare_srv xprt callback when the proxy options
PR_O_TCPCKH_SSL is set.

Check support for dynamic servers has been merged in the current branch.
No backport needed.
2021-09-21 13:56:03 +02:00
Amaury Denoyelle
0f456d5029 BUG/MINOR: server: allow 'enable health' only if check configured
Test that checks have been configured on the server before enabling via
the 'enable health' CLI. This mirrors the 'enable agent' command.

Without this, a user can use the command on the server without checks.
This leaves the server in an undefined state. Notably, the stat page
reports the server in check transition.

This condition was left on the following reorg commit.
  2c04eda8b5
  REORG: cli: move "{enable|disable} health" to server.c

This should be backported up to 1.8.
2021-09-21 11:50:22 +02:00
Tim Duesterhus
4f065262e9 CLEANUP: Remove unreachable break from parse_time_err()
The `return` already leaves the function.
2021-09-20 18:37:32 +02:00
Tim Duesterhus
75e2f8dcdd CLEANUP: Include check.h in flt_spoe.c
This is required for the prototype of spoe_prepare_healthcheck_request().
2021-09-20 18:37:32 +02:00
William Lallemand
79a3478c24 MINOR: httpclient: add the EOH when no headers where provided
httpclient_req_gen() now adds the end of headers block when no header
was provided, which avoid adding it manually.
2021-09-20 16:24:54 +02:00
Dragan Dosen
a8018eb470 BUG/MINOR: flt-trace: fix an infinite loop when random-parsing is set
The issue is introduced with the commit c41d8bd65 ("CLEANUP: flt-trace:
Remove unused random-parsing option").

This must be backported everywhere the above commit is.
2021-09-20 16:06:58 +02:00
Tim Duesterhus
ec4a8754da CLEANUP: Apply xalloc_size.cocci
This fixes a few locations with a hardcoded type within `sizeof()`.
2021-09-17 17:22:05 +02:00
Tim Duesterhus
16554245e2 CLEANUP: Apply bug_on.cocci
The changes look safe to me, even if `DEBUG_STRICT` is not enabled.
2021-09-17 17:22:05 +02:00
Tim Duesterhus
b113b5ca24 CLEANUP: Apply ist.cocci
This cleans up ist handling.
2021-09-17 17:22:05 +02:00
Willy Tarreau
e61244631a MINOR: applet: remove the thread mask from appctx_new()
appctx_new() is exclusively called with tid_bit and it only uses the
mask to pass it to the accompanying task. There is no point requiring
the caller to know about a mask there, nor is there any point in
creating an applet outside of the context of its own thread anyway.
Let's drop this and pass tid_bit to task_new() directly.
2021-09-17 16:08:34 +02:00
Willy Tarreau
87063a7da1 BUILD: fd: remove unused variable totlen in fd_write_frag_line()
Ilya reports in GH #1392 that clang 13 complains about totlen being
calculated and not used in fd_write_frag_line(), which is true. It's
a leftover of some older code.
2021-09-17 12:00:27 +02:00
Willy Tarreau
b5d1141305 BUILD: proto_uxst: do not set unused flag
Similarly to previous patch for sockpair, UNIX sockets set the
CONNECT_HAS_DATA flag without using it later, we can drop it.
2021-09-17 11:59:15 +02:00
Willy Tarreau
0ce77ac204 BUILD: sockpair: do not set unused flag
Ilya reports in GH #1392 that clang 13 complains about a flag being added
to the "flags" parameter without being used later. That's generic code
that was shared from TCP but we can indeed drop this flag since it's used
for TFO which we don't have in socketpairs.
2021-09-17 11:56:25 +02:00
Willy Tarreau
f2dda52e78 BUG/MINOR: cli/payload: do not search for args inside payload
The CLI's payload parser is over-complicated and as such contains more
bugs than needed. One of them is that it uses strstr() to find the
ending tag, ignoring spaces before it, while the argument locator
creates a new arg on each space, without checking if the end of the
word appears past the previously found end. This results in "<<" being
considered as the start of a new argument if preceeded by more than
one space, and the payload being damaged with a \0 inserted at the
first space or tab.

Let's make an easily backportable fix for now. This fix makes sure that
the trailing zero from the first line is properly kept after '<<' and
that the end tag is looked for only as an isolated argument and nothing
else. This also gets rid of the unsuitable strstr() call and now makes
sure that strcspn() will not return elements that are found in the
payload.

For the long term the loop must be rewritten to get rid of those
unsuitable strcspn() and strstr() calls which work past each other, and
the cli_parse_request() function should be split into a tokenizer and
an executor that are used from the caller instead of letting the caller
play games with what it finds there.

This should be backported wherever CLI payload is supported, i.e. 2.0+.
2021-09-17 11:50:09 +02:00
Amaury Denoyelle
4837293ca0 BUG/MINOR: connection: prevent null deref on mux cleanup task allocation
Move the code to allocate/free the mux cleanup task outside of the polling
loop. A new thread_alloc/free handler is registered for this in
connection.c.

This has the benefit to clean up the polling loop code. And as another
benefit, if the task allocation fails, the handler can report an error
to exit the haproxy process. This prevents a potential null pointer
dereferencing.

This should fix the github issue #1389.

This must be backported up to 2.4.
2021-09-16 17:45:52 +02:00
Christopher Faulet
8a0e5f822b BUG/MINOR: tcpcheck: Improve LDAP response parsing to fix LDAP check
When the LDAP response is parsed, the message length is not properly
decoded. While it works for LDAP servers encoding it on 1 byte, it does not
work for those using a multi-bytes encoding. Among others, Active Directory
servers seems to encode messages or elements length on 4 bytes.

In this patch, we only handle length of BindResponse messages encoded on 1,
2 or 4 bytes. In theory, it may be encoded on any bytes number less than 127
bytes. But it is useless to make this part too complex. It should be ok this
way.

This patch should fix the issue #1390. It should be backported to all stable
versions. While it should be easy to backport it as far as 2.2, the patch
will have to be totally rewritten for lower versions.
2021-09-16 17:24:50 +02:00
Willy Tarreau
c2afb860f2 MINOR: pools: use mallinfo2() when available instead of mallinfo()
Ilya reported in issue #1391 a build warning on Fedora about mallinfo()
being deprecated in favor of mallinfo2() since glibc-2.33. Let's add
support for it. This should be backported where the following commit is
also backported: 157e39303 ("MINOR: pools: automatically disable
malloc_trim() with external allocators").
2021-09-16 09:20:16 +02:00
Christopher Faulet
ab7389dc3c BUG/MAJOR: mux-h1: Don't eval input data if an error was reported
If an error was already reported on the H1 connection, pending input data
must not be (re)evaluated in h1_process(). Otherwise an unexpected internal
error will be reported, in addition of the first one. And on some
conditions, this may generate an infinite loop because the mux tries to send
an internal error but it fails to do so thus it loops to retry.

This patch should fix the issue #1356. It must be backported to 2.4.
2021-09-16 08:31:46 +02:00
Christopher Faulet
51324b8720 CLEANUP: acl: Remove unused variable when releasing an acl expression
The "unresolved" variable is unused since commit 9fa0df5 ("BUG/MINOR: acl:
Fix freeing of expr->smp in prune_acl_expr").

This patch should fix the issue #1359.
2021-09-16 08:31:46 +02:00
Willy Tarreau
845b560f6a MINOR: pools: report it when malloc_trim() is enabled
Since we can detect it at runtime now, it could help to have it mentioned
in haproxy -vv.
2021-09-15 10:41:24 +02:00
Willy Tarreau
157e393039 MINOR: pools: automatically disable malloc_trim() with external allocators
Pierre Cheynier reported some occasional crashes in malloc_trim() on a
recent glibc when running with jemalloc(). While in theory there should
not be any link between the two, it remains plausible that something
allocated early with one is tentatively freed with the other and that
attempts to trim end up badly. There's no point calling the glibc specific
malloc_trim() with external allocators anyway. However these ones are often
enabled at link time or even at run time with LD_PRELOAD, so we cannot rely
on build options for this.

This patch implements runtime detection for the allocator in use by checking
with mallinfo() that a malloc() call is properly accounted for in glibc's
malloc. It only enables malloc_trim() in this case, and ignores it for
other cases. It's fine to proceed like this because mallinfo() is provided
by a wider range of glibcs than malloc_trim().

This could be backported to 2.4 and 2.3. If so, it will also need previous
patch "CLEANUP: pools: factor all malloc_trim() calls into trim_all_pools()".
2021-09-15 10:40:39 +02:00
Willy Tarreau
ea3323f62c CLEANUP: pools: factor all malloc_trim() calls into trim_all_pools()
The code was slightly cleaned up by removing repeated occurrences of ifdefs
and moving that into a single trim_all_pools() function.
2021-09-15 10:38:21 +02:00
Willy Tarreau
c5d0fc9b9f BUILD: sample: fix format warning on 32-bit archs in sample_conv_be2dec_check()
The sizeof() was printed as a long but it's just an unsigned on some
32-bit platforms, hence the format warning. No backport is needed, as
this arrived in 2.5 with commit  40ca09c7b ("MINOR: sample: Add be2dec
converter").
2021-09-15 10:32:12 +02:00
Tim Duesterhus
2281738256 BUG/MEDIUM lua: Add missing call to RESET_SAFE_LJMP in hlua_filter_new()
In one case before exiting leaving the function the panic handler was not
reset.

Introduced in 69c581a092, which is 2.5+.
No backport required.
2021-09-12 08:21:07 +02:00
Tim Duesterhus
d5fc8fcb86 CLEANUP: Add haproxy/xxhash.h to avoid modifying import/xxhash.h
This solves setting XXH_INLINE_ALL in a cleaner way, because the imported
header is not modified, easing future updates.

see 6f7cc11e6d
2021-09-11 19:58:45 +02:00
Christopher Faulet
949b6ca961 BUG/MINOR: filters: Set right FLT_END analyser depending on channel
A bug was introduced by the commit 26eb5ea35 ("BUG/MINOR: filters: Always
set FLT_END analyser when CF_FLT_ANALYZE flag is set"). Depending on the
channel evaluated, the rigth FLT_END analyser must be set. AN_REQ_FLT_END
for the request channel and AN_RES_FLT_END for the response one.

Ths patch must be backported everywhere the above commit was backported.
2021-09-10 10:35:53 +02:00
Christopher Faulet
2d56500826 BUG/MEDIUM: http-ana: Reset channels analysers when returning an error
When an error is returned to the client, via a call to
http_reply_and_close(), the request channel is flushed and shut down and
HTTP analysis on both direction is finished. So it is safer to centralize
reset of channels analysers at this place. It is especially important when a
filter is attached to the stream when a client abort is detected. Because,
otherwise, the stream remains blocked because request analysers are not
reset.

This bug was hidden for a while. But since the fix 6fcd2d328 ("BUG/MINOR:
stream: Don't release a stream if FLT_END is still registered"), it is
possible to trigger it.

This patch must be backported everywhere the above commit was backported.
2021-09-10 10:35:53 +02:00
Christopher Faulet
883d83e83c BUG/MEDIUM: stream-int: Don't block SI on a channel policy if EOI is reached
If the end of input is reported by the mux on the conn-stream during a
receive, we leave without evaluating the channel policies. It is especially
important to be able to catch client aborts during server connection
establishment. Indeed, in this case, without this patch, the
stream-interface remains blocked and read events are not forwarded to the
stream. It means it is not possible to detect client aborts.

Thanks to this fix, the abortonclose option should fixed for HAProxy 2.3 and
lower. On 2.4 and 2.5, it seems to work because the stream is created after
the request parsing.

Note that a previous fix of abortonclose option was reverted. This one
should be the right way to fix it. It must carefully be backported as far as
2.0. A observation period on the 2.3 is probably a good idea.
2021-09-10 10:35:53 +02:00
Christopher Faulet
0fa8007102 CLEANUP: mux-h1: Remove condition rejecting upgrade requests with payload
Now, "Upgrade:" header is removed from such requests. Thus, the condition to
reject them is now useless and can be removed. Code to handle unimplemented
features is now unused but is preserved for future uses.

This patch may be backported to 2.4.
2021-09-10 10:35:53 +02:00
Christopher Faulet
52a5ec2d18 BUG/MEDIUM: mux-h1: Remove "Upgrade:" header for requests with payload
Instead of returning a 501-Not-implemented error when "Ugrade:" header is
found for a request with a payload, the header is removed. This way, the
upgrade is disabled and the request is still sent to the server. It is
required because some frameworks seem to try to perform H2 upgrade on every
requests, including POST ones.

The h2 mux was slightly fixed to convert Upgrade requests to extended
connect ones only if the rigth HTX flag is set.

This patch should fix the issue #1381. It must be backported to 2.4.
2021-09-10 09:17:51 +02:00
Willy Tarreau
55f8a830dc OPTIM: vars: do not keep variables usage stats if no limit is set
The sole purpose of the variable's usage accounting is to enforce
limits at the session or process level, but very commonly these are not
set, yet the bookkeeping (especially at the process level) is extremely
expensive.

Let's simply disable it when the limits are not set. This further
increases the performance of 12 variables on 16-thread from 1.06M
to 1.24M req/s.
2021-09-08 15:53:07 +02:00
Willy Tarreau
3b78f2aa5d OPTIM: vars: remove internal bookkeeping for vars_global_size
Right now we have a per-process max variable size and a per-scope one,
with the proc scope covering all others. As such, the per-process global
one is always exactly equal to the per-proc-scope one. And bookkeeping
on these process-wide variables is extremely expensive (up to 38% CPU
seen in var_accounting_diff() just for them).

Let's kill vars_global_size and only rely on the proc one. Doing this
increased the request rate from 770k to 1.06M in a config having only
12 variables on a 16-thread machine.
2021-09-08 15:45:05 +02:00
Willy Tarreau
dc72fbb8e8 MINOR: vars: centralize the lock/unlock into static inlines
The goal it to simplify the variables locking in order to later
simplify it.
2021-09-08 15:19:57 +02:00
Willy Tarreau
3f120d2a58 CLEANUP: vars: remove the now unused var_names array
This was the table of all variable names known to the haproxy process.
It's not used anymore.
2021-09-08 15:09:22 +02:00
Willy Tarreau
3a4bedccc6 MEDIUM: vars: replace the global name index with a hash
The global table of known variables names can only grow and was designed
for static names that are registered at boot. Nowadays it's possible to
set dynamic variable names from Lua or from the CLI, which causes a real
problem that was partially addressed in 2.2 with commit 4e172c93f
("MEDIUM: lua: Add `ifexist` parameter to `set_var`"). Please see github
issue #624 for more context.

This patch simplifies all this by removing the need for a central
registry of known names, and storing 64-bit hashes instead. This is
highly sufficient given the low number of variables in each context.
The hash is calculated using XXH64() which is bijective over the 64-bit
space thus is guaranteed collision-free for 1..8 chars. Above that the
risk remains around 1/2^64 per extra 8 chars so in practice this is
highly sufficient for our usage. A random seed is used at boot to seed
the hash so that it's not attackable from Lua for example.

There's one particular nit though. The "ifexist" hack mentioned above
is now limited to variables of scope "proc" only, and will only match
variables that were already created or declared, but will now verify
the scope as well. This may affect some bogus Lua scripts and SPOE
agents which used to accidentally work because a similarly named
variable used to exist in a different scope. These ones may need to be
fixed to comply with the doc.

Now we can sum up the situation as this one:
  - ephemeral variables (scopes sess, txn, req, res) will always be
    usable, regardless of any prior declaration. This effectively
    addresses the most problematic change from the commit above that
    in order to work well could have required some script auditing ;

  - process-wide variables (scope proc) that are mentioned in the
    configuration, referenced in a "register-var-names" SPOE directive,
    or created via "set-var" in the global section or the CLI, are
    permanent and will always accept to be set, with or without the
    "ifexist" restriction (SPOE uses this internally as well).

  - process-wide variables (scope proc) that are only created via a
    set-var() tcp/http action, via Lua's set_var() calls, or via an
    SPOE with the "force-set-var" directive), will not be permanent
    but will always accept to be replaced once they are created, even
    if "ifexist" is present

  - process-wide variables (scope proc) that do not exist will only
    support being created via the set-var() tcp/http action, Lua's
    set_var() calls without "ifexist", or an SPOE declared with
    "force-set-var".

This means that non-proc variables do not care about "ifexist" nor
prior declaration, and that using "ifexist" should most often be
reliable in Lua and that SPOE should most often work without any
prior declaration. It may be doable to turn "ifexist" to 1 by default
in Lua to further ease the transition. Note: regtests were adjusted.

Cc: Tim Dsterhus <tim@bastelstu.be>
2021-09-08 15:06:11 +02:00
Willy Tarreau
2c897d9d1b MINOR: vars: preset a random seed to hash variables names
Variables names will be hashed, but for this we need a random seed.
The XXH3() algorithms is bijective over the whole 64-bit space, which
is great as it guarantees no collision for 1..8 byte names. But above
that even if the risk is extremely faint, it theoretically exists and
since variables may be set from Lua we'd rather do our best to limit
the risk of controlled collision, hence the random seed.
2021-09-08 15:06:11 +02:00
Willy Tarreau
df8eeb1619 MEDIUM: vars: pre-create parsed SCOPE_PROC variables as permanent ones
All variables whose names are parsed by the config parser, the
command-line parser or the SPOE's register-var-names parser are
now preset as permanent. This will guarantee that these variables
will exist through out all the process' life, and that it will be
possible to implement the "ifexist" feature by looking them up.

This was marked medium because pre-setting a variable with an empty
value may always have side effects, even though none was spotted at
this stage.
2021-09-08 15:06:11 +02:00
Willy Tarreau
c1c88f4809 MEDIUM: vars: make var_clear() only reset VF_PERMANENT variables
We certainly do not want that a permanent variable (one that is listed
in the configuration) be erased by accident by an "unset-var" action.
Let's make sure these ones are only reset to an empty sample, like at
the moment of their initial registration. One trick is that the same
function is used to purge the memory at the end and to delete, so we
need to add an extra "force" argument to make the choice.
2021-09-08 15:06:11 +02:00
Willy Tarreau
3dc6dc3178 MINOR: vars: store flags into variables and add VF_PERMANENT
In order to continue to honor the ifexist Lua option and prevent rogue
SPOA agents from creating too many variables, we'll need to keep the
ability to mark certain proc.* variables as permanent when they're
known from the config file.

Let's add a flag there for this. It's added to the variable when the
variable is created with this flag set by the caller.

Another approach could have been to use a distinct list or distinct
scope but that sounds complicated and bug-prone.
2021-09-08 14:06:34 +02:00
Willy Tarreau
63c30667d7 MINOR: vars: support storing empty sample data with a variable
Storing an unset sample (SMP_T_ANY == 0) will be used to only reserve
the variable's space but associate no value. We need to slightly adjust
var_to_smp() for this so that it considers a value-less variable as non
existent and falls back to the default value.
2021-09-08 13:59:43 +02:00
Willy Tarreau
4994b57728 MINOR: vars: add a VF_CREATEONLY flag for creation
Passing this flag to var_set() will result in the variable to only be
created if it did not exist, otherwise nothing is done (it's not even
updated). This will be used for pre-registering names.
2021-09-08 11:47:30 +02:00
Willy Tarreau
7978c5c422 MEDIUM: vars: make the ifexist variant of set-var only apply to the proc scope
When setting variables, there are currently two variants, one which will
always create the variable, and another one, "ifexist", which will only
create or update a variable if a similarly named variable in any scope
already existed before.

The goal was to limit the risk of injecting random names in the proc
scope, but it was achieved by making use of the somewhat limited name
indexing model, which explains the scope-agnostic restriction.

With this change, we're moving the check downwards in the chain, at the
variable level, and only variables under the scope "proc" will be subject
to the restriction. A new set of VF_* flags was added to adjust how
variables are set, and VF_UPDATEONLY is used to mention this restriction.

In this exact state of affairs, this is not completely exact, as if a
similar name was not known in any scope, the variable will continue to
be rejected like before, but this will change soon.
2021-09-08 11:47:06 +02:00
Willy Tarreau
f1cb0ebe3e REORG: vars: remerge sample_store{,_stream}() into var_set()
The names for these two functions are totally misleading, they have
nothing to do with samples, they're purely dedicated to variables. The
former is only used by the second one and makes no sense by itself, so
it cannot even get a meaningful name. Let's remerge them into a single
one called "var_set()" which, as its name tries to imply, sets a variable
to a given value.
2021-09-08 11:10:16 +02:00
Willy Tarreau
d378eb82d9 CLEANUP: vars: rename sample_clear_stream() to var_unset()
This name was quite misleading, as it has nothing to do with samples nor
streams. This function's sole purpose is to unset a variable, so let's
call it "var_unset()" and document it a little bit.
2021-09-08 11:10:16 +02:00
Willy Tarreau
b7bfcb3ff3 MINOR: vars: rename vars_init() to vars_init_head()
The vars_init() name is particularly confusing as it does not initialize
the variables code but the head of a list of variables passed in
arguments. And we'll soon need to have proper initialization code, so
let's rename it now.
2021-09-08 11:10:16 +02:00
Willy Tarreau
10080716bf MINOR: proxy: add a global "grace" directive to postpone soft-stop
In ticket #1348 some users expressed some concerns regarding the removal
of the "grace" directive from the proxies. Their use case very closely
mimmicks the original intent of the grace keyword, which is, let haproxy
accept traffic for some time when stopping, while indicating an external
LB that it's stopping.

This is implemented here by starting a task whose expiration triggers
the soft-stop for real. The global "stopping" variable is immediately
set however. For example, this below will be sufficient to instantly
notify an external check on port 9999 that the service is going down,
while other services remain active for 10s:

    global
      grace 10s

    frontend ext-check
      bind :9999
      monitor-uri /ext-check
      monitor fail if { stopping }
2021-09-07 17:34:29 +02:00
Christopher Faulet
b7308f00cb Revert "BUG/MINOR: stream-int: Don't block reads in si_update_rx() if chn may receive"
This reverts commit e0dec4b7b2.

At first glance, channel_is_empty() was used on purpose in si_update_rx(),
because of the HTX ("b3e0de46c" MEDIUM: stream-int: Rely only on
SI_FL_WAIT_ROOM to stop data receipt). It is not pretty clear for now why
channel_may_recv() sould not be used here but this change introduce a
possible infinite loop with the stats applet. So, it is safer to revert the
patch, waiting for a better understanding of the probelm.

This means the abortonclose option will be broken again on the 2.3 and lower
versions.

This patch should fix the issue #1360. It must be backported as far as 2.0.
2021-09-07 14:31:02 +02:00
Willy Tarreau
3d5f19e04d CLEANUP: htx: remove comments about "must be < 256 MB"
Since commit "BUG/MINOR: config: reject configs using HTTP with bufsize
>= 256 MB" we are now sure that it's not possible anymore to have an HTX
block of a size 256 MB or more, even after concatenation thanks to the
tests for len >= htx_free_data_space(). Let's remove these now obsolete
comments.

A BUG_ON() was added in htx_add_blk() to track any such exception if
the conditions would change later, to complete the one that is performed
on the start address that must remain within the buffer.
2021-09-03 16:15:29 +02:00
Willy Tarreau
32b51cdf30 BUG/MINOR: config: reject configs using HTTP with bufsize >= 256 MB
As seen in commit 5ef965606 ("BUG/MINOR: lua: use strlcpy2() not
strncpy() to copy sample keywords"), configs with large values of
tune.bufsize were not practically usable since Lua was introduced,
regardless of the machine's available memory.

In addition, HTX encoding already limits block sizes to 256 MB, thus
it is not technically possible to use that large a buffer size when
HTTP is in use. This is absurdly high anyway, and for example Lua
initialization would take around one minute on a 4 GHz CPU. Better
prevent such a config from starting than having to deal with bug
reports that make no sense.

The check is only enforced if at least one HTX proxy was found, as
there is no techincal reason to block it for configs that are solely
based on raw TCP, and it could still be imagined that some such might
exist with single connections (e.g. a log forwarder that buffers to
cover for the storage I/O latencies).

This should be backported to all HTX-enabled versions (2.0 and above).
2021-09-03 16:15:29 +02:00
Willy Tarreau
54496a6a5b MINOR: vars: make the vars() sample fetch function support a default value
It is quite common to see in configurations constructions like the
following one:

    http-request set-var(txn.bodylen) 0
    http-request set-var(txn.bodylen) req.hdr(content-length)
    ...
    http-request set-header orig-len %[var(txn.bodylen)]

The set-var() rules are almost always duplicated when manipulating
integers or any other value that is mandatory along operations. This is
a problem because it makes the configurations complicated to maintain
and slower than needed. And it becomes even more complicated when several
conditions may set the same variable because the risk of forgetting to
initialize it or to accidentally reset it is high.

This patch extends the var() sample fetch function to take an optional
argument which contains a default value to be returned if the variable
was not set. This way it becomes much simpler to use the variable, just
set it where needed, and read it with a fall back to the default value:

    http-request set-var(txn.bodylen) req.hdr(content-length)
    ...
    http-request set-header orig-len %[var(txn.bodylen,0)]

The default value is always passed as a string, thus it will experience
a cast to the output type. It doesn't seem userful to complicate the
configuration to pass an explicit type at this point.

The vars.vtc regtest was updated accordingly.
2021-09-03 12:08:54 +02:00
Willy Tarreau
e352b9dac7 MINOR: vars: make vars_get_by_* support an optional default value
In preparation for support default values when fetching variables, we
need to update the internal API to pass an extra argument to functions
vars_get_by_{name,desc} to provide an optional default value. This
patch does this and always passes NULL in this argument. var_to_smp()
was extended to fall back to this value when available.
2021-09-03 12:08:54 +02:00
Willy Tarreau
be7e00d134 CLEANUP: vars: factor out common code from vars_get_by_{desc,name}
The two functions vars_get_by_name() and vars_get_by_scope() perform
almost the same operations except that they differ from the way the
name and scope are retrieved. The second part in common is more
complex and involves locking, so better factor this one out into a
new function.

There is no other change than refactoring.
2021-09-03 11:43:35 +02:00
Willy Tarreau
e93bff4107 MEDIUM: vars: also support format strings in CLI's "set var" command
Most often "set var" on the CLI is used to set a string, and using only
expressions is not always convenient, particularly when trying to
concatenate variables sur as host names and paths.

Now the "set var" command supports an optional keyword before the value
to indicate its type. "expr" takes an expression just like before this
patch, and "fmt" a format string, making it work like the "set-var-fmt"
actions.

The VTC was updated to include a test on the format string.
2021-09-03 11:01:48 +02:00
Willy Tarreau
753d4db5f3 MINOR: vars: add a "set-var-fmt" directive to the global section
Just like the set-var-fmt action for tcp/http rules, the set-var-fmt
directive in global sections allows to pre-set process-wide variables
using a format string instead of a sample expression. This is often
more convenient when it is required to concatenate multiple fields,
or when emitting just one word.
2021-09-03 11:01:48 +02:00
Willy Tarreau
20b7a0f9ed MINOR: log: make log-format expressions completely usable outside of req/resp
The log-format strings are usable at plenty of places, but the expressions
using %[] were restricted to request or response context and nothing else.
This prevents from using them from the config context or the CLI, let's
relax this.
2021-09-03 11:01:48 +02:00
Willy Tarreau
9c20433aca CLEANUP: vars: name the temporary proxy "CFG" instead of "CLI" for global vars
We're using a dummy temporary proxy when creating global variables in
the configuration file, it was copied from the CLI's code and was
mistakenly called "CLI", better name it "CFG". It should not appear
anywhere except maybe when debugging cores.
2021-09-03 11:01:48 +02:00
Willy Tarreau
c767eebf1f BUG/MINOR: vars: do not talk about global section in CLI errors for set-var
When attempting to set a variable does not start with the "proc" scope on
the CLI, we used to emit "only proc is permitted in the global section"
which obviously is a leftover from the initial code.

This may be backported to 2.4.
2021-09-03 11:01:12 +02:00
Willy Tarreau
1402fef58a BUG/MINOR: vars: truncate the variable name in error reports about scope.
When a variable starts with the wrong scope, it is named without stripping
the extra characters that follow it, which usually are closing parenthesis.
Let's make sure we only report what is expected.

This may be backported to 2.4.
2021-09-03 11:01:12 +02:00
Willy Tarreau
c77bad2467 BUG/MEDIUM: vars: run over the correct list in release_store_rules()
In commit 9a621ae76 ("MEDIUM: vars: add a new "set-var-fmt" action")
we introduced the support for format strings in variables with the
ability to release them on exit, except that it's the wrong list that
was being scanned for the rule (http vs vars), resulting in random
crashes during deinit.

This was a recent commit in 2.5-dev, no backport is needed.
2021-09-03 11:01:12 +02:00
Willy Tarreau
9a621ae76d MEDIUM: vars: add a new "set-var-fmt" action
The set-var() action is convenient because it preserves the input type
but it's a pain to deal with when trying to concatenate values. The
most recurring example is when it's needed to build a variable composed
of the source address and the source port. Usually it ends up like this:

    tcp-request session set-var(sess.port) src_port
    tcp-request session set-var(sess.addr) src,concat(":",sess.port)

This is even worse when trying to aggregate multiple fields from stick-table
data for example. Due to this a lot of users instead abuse headers from HTTP
rules:

    http-request set-header(x-addr) %[src]:%[src_port]

But this requires some careful cleanups to make sure they won't leak, and
it's significantly more expensive to deal with. And generally speaking it's
not clean. Plus it must be performed for each and every request, which is
expensive for this common case of ip+port that doesn't change for the whole
session.

This patch addresses this limitation by implementing a new "set-var-fmt"
action which performs the same work as "set-var" but takes a format string
in argument instead of an expression. This way it becomes pretty simple to
just write:

    tcp-request session set-var-fmt(sess.addr) %[src]:%[src_port]

It is usable in all rulesets that already support the "set-var" action.
It is not yet implemented for the global "set-var" directive (which already
takes a string) and the CLI's "set var" command, which would definitely
benefit from it but currently uses its own parser and engine, thus it
must be reworked.

The doc and regtests were updated.
2021-09-02 21:22:22 +02:00
Willy Tarreau
54b96d9955 BUG/MINOR: vars: properly set the argument parsing context in the expression
When the expression called in "set-var" uses argments that require late
resolution, the context must be set. At the moment, any unknown argument
is misleadingly reported as "ACL":

    frontend f
        bind :8080
        mode http
        http-request set-var(proc.a) be_conn(foo)

   parsing [b1.cfg:4]: unable to find backend 'foo' referenced in arg 1 \
   of ACL keyword 'be_conn' in proxy 'f'.

Once the context is properly set, it now says the truth:

   parsing [b1.cfg:8]: unable to find backend 'foo' referenced in arg 1 \
   of sample fetch keyword 'be_conn' in http-request expression in proxy 'f'.

This may be backported but is not really important. If so, the preceeding
patches "BUG/MINOR: vars: improve accuracy of the rules used to check
expression validity" and "MINOR: sample: add missing ARGC_ entries" must
be backported as well.
2021-09-02 20:34:30 +02:00
Willy Tarreau
57467b8356 MINOR: sample: add missing ARGC_ entries
For a long time we couldn't have arguments in expressions used in
tcp-request, tcp-response etc rules. But now due to the variables
it's possible, and their context in case of failure to resolve an
argument (e.g. backend name not found) is not properly reported
because there is no arg context values in ARGC_* to report them.

Let's add a number of missing ones for tcp-request {connection,
session,content}, tcp-response content, tcp-check, the config
parser (for "set-var" in the global section) and the CLI parser
(for "set-var" on the CLI).
2021-09-02 19:43:20 +02:00
Willy Tarreau
843096d72a BUG/MINOR: vars: improve accuracy of the rules used to check expression validity
The set-var() expression naturally checks whether expressions are valid
in the context of the rule, but it fails to differentiate frontends from
backends. As such for tcp-content and http-request rules, it will only
accept frontend-compatible sample-fetches, excluding those declared with
SMP_UES_BKEND (a few such as be_id, be_name). For the response it accepts
the backend-compatible expressions only, though it seems that there are
no sample-fetch function that are valid only in the frontend's content,
so that should not cause any problem.

Note that while allowing valid configs to be used, the fix might also
uncover some incorrect configurations where some expressions currently
return nothing (e.g. something depending on frontend declared in a
backend), and which could be rejected, but there does not seem to be
any such keyword. Thus while it should be backported, better not backport
it too far (2.4 and possibly 2.3 only).
2021-09-02 19:23:43 +02:00
Willy Tarreau
2819210a83 BUG/MINOR: vars: fix set-var/unset-var exclusivity in the keyword parser
The parser checks first for "set-var" then "unset-var" from the updated
offset instead of testing it only when the other one fails, so it
validates this rule as "unset-var":

    http-request set-varunset-var(proc.a)

This should be backported everywhere relevant, though it's mostly harmless
as it's unlikely that some users are purposely writing this in their conf!
2021-09-02 18:46:22 +02:00
Willy Tarreau
bc1223be79 MINOR: http-rules: add a new "ignore-empty" option to redirects.
Sometimes it is convenient to remap large sets of URIs to new ones (e.g.
after a site migration for example). This can be achieved using
"http-request redirect" combined with maps, but one difficulty there is
that non-matching entries will return an empty response. In order to
avoid this, duplicating the operation as an ACL condition ending in
"-m found" is possible but it becomes complex and error-prone while it's
known that an empty URL is not valid in a location header.

This patch addresses this by improving the redirect rules to be able to
simply ignore the rule and skip to the next one if the result of the
evaluation of the "location" expression is empty. However in order not
to break existing setups, it requires a new "ignore-empty" keyword.

There used to be an ACT_FLAG_FINAL on redirect rules that's used during
the parsing to emit a warning if followed by another rule, so here we
only set it if the option is not there. The http_apply_redirect_rule()
function now returns a 3rd value to mention that it did nothing and
that this was not an error, so that callers can just ignore the rule.
The regular "redirect" rules were not modified however since this does
not apply there.

The map_redirect VTC was completed with such a test and updated to 2.5
and an example was added into the documentation.
2021-09-02 17:06:18 +02:00
Remi Tricot-Le Breton
942c167229 MINOR: connection: Add a connection error code sample fetch for backend side
The bc_conn_err and bc_conn_err_str sample fetches give the status of
the connection on the backend side. The error codes and error messages
are the same than the ones that can be raised by the fc_conn_err fetch.
2021-09-01 22:55:54 +02:00
Remi Tricot-Le Breton
163cdeba37 MINOR: ssl: Add new ssl_bc_hsk_err sample fetch
This new sample fetch along the ssl_bc_hsk_err_str fetch contain the
last SSL error of the error stack that occurred during the SSL
handshake (from the backend's perspective).
2021-09-01 22:55:39 +02:00
Willy Tarreau
87154e3010 BUG/MAJOR: queue: better protect a pendconn being picked from the proxy
The locking in the dequeuing process was significantly improved by commit
49667c14b ("MEDIUM: queue: take the proxy lock only during the px queue
accesses") in that it tries hard to limit the time during which the
proxy's queue lock is held to the strict minimum. Unfortunately it's not
enough anymore, because we take up the task and manipulate a few pendconn
elements after releasing the proxy's lock (while we're under the server's
lock) but the task will not necessarily hold the server lock since it may
not have successfully found one (e.g. timeout in the backend queue). As
such, stream_free() calling pendconn_free() may release the pendconn
immediately after the proxy's lock is released while the other thread
currently proceeding with the dequeuing tries to wake up the owner's
task and dies in task_wakeup().

One solution consists in releasing le proxy's lock later. But tests have
shown that we'd have to sacrifice a significant share of the performance
gained with the patch above (roughly a 20% loss).

This patch takes another approach. It adds a "del_lock" to each pendconn
struct, that allows to keep it referenced while the proxy's lock is being
released. It's mostly a serialization lock like a refcount, just to maintain
the pendconn alive till the task_wakeup() call is complete. This way we can
continue to release the proxy's lock early while keeping this one. It had
to be added to the few points where we're about to free a pendconn, namely
in pendconn_dequeue() and pendconn_unlink(). This way we continue to
release the proxy's lock very early and there is no performance degradation.

This lock may only be held under the queue's lock to prevent lock
inversion.

No backport is needed since the patch above was merged in 2.5-dev only.
2021-08-31 18:37:13 +02:00
Remi Tricot-Le Breton
fe21fe76bd MINOR: log: Add new "error-log-format" option
This option can be used to define a specific log format that will be
used in case of error, timeout, connection failure on a frontend... It
will be used for any log line concerned by the log-separate-errors
option. It will also replace the format of specific error messages
decribed in section 8.2.6.
If no "error-log-format" is defined, the legacy error messages are still
emitted and the other error logs keep using the regular log-format.
2021-08-31 12:13:08 +02:00
Remi Tricot-Le Breton
3d6350e108 MINOR: log: Remove log-error-via-logformat option
This option will be replaced by a "error-log-format" that enables to use
a dedicated log-format for connection error messages instead of the
regular log-format (in which most of the fields would be invalid in such
a case).
The "log-error-via-logformat" mechanism will then be replaced by a test
on the presence of such an error log format or not. If a format is
defined, it is used for connection error messages, otherwise the legacy
error log format is used.
2021-08-31 12:13:06 +02:00
Willy Tarreau
7b2108cad1 BUILD: tools: properly guard __GLIBC__ with defined()
The test on the glibc versions based on #if (__GLIBC > 2 ...) fails to
build under -Wundef, let's prepend defined(__GLIBC__) first.
2021-08-30 10:16:30 +02:00
Willy Tarreau
b131049eb5 BUILD: ssl: fix two remaining occurrences of #if USE_OPENSSL
One was in backend.c and the other one in hlua.c. No other candidate
was found with "git grep '^#if\s*USE'". It's worth noting that 3
other such tests exist for SSL_OP_NO_{SSLv3,TLSv1_1,TLSv1_2} but
that these ones are properly set to 0 in openssl-compat.h when not
defined.
2021-08-30 09:39:24 +02:00
Tim Duesterhus
18795d48a9 BUG/MINOR: tools: Fix loop condition in dump_text()
The condition should first check whether `bsize` is reached, before
dereferencing the offset. Even if this always works fine, due to the
string being null-terminated, this certainly looks odd.

Found using GitHub's CodeQL scan.

This bug traces back to at least 97c2ae13bc
(1.7.0+) and this patch should be backported accordingly.
2021-08-30 06:14:50 +02:00
Tim Duesterhus
1f269c12dc BUG/MINOR threads: Use get_(local|gm)time instead of (local|gm)time
Using localtime / gmtime is not thread-safe, whereas the `get_*` wrappers are.

Found using GitHub's CodeQL scan.

The use in sample_conv_ltime() can be traced back to at least
fac9ccfb70 (first appearing in 1.6-dev3), so all
supported branches with thread support are affected.
2021-08-30 06:14:32 +02:00
Willy Tarreau
fc89c3fd2b IMPORT: slz: silence a build warning with -Wundef
The test on FIND_OPTIMAL_MATCH for the experimental code can yield a
build warning when using -Wundef, let's turn it into a regular ifdef.

This is slz upstream commit 05630ae8f22b71022803809eb1e7deb707bb30fb
2021-08-28 12:47:57 +02:00
Willy Tarreau
e15615c1ff BUILD: activity: use #ifdef not #if on USE_MEMORY_PROFILING
This avoids most build warnings with -Wundef, and all other USE_* flags
are tested this way, let's do it there as well. See gh issue #1369.
2021-08-28 12:04:25 +02:00
Willy Tarreau
fe456c581f MINOR: time: add report_idle() to report process-wide idle time
Before threads were introduced in 1.8, idle_pct used to be a global
variable indicating the overall process idle time. Threads made it
thread-local, meaning that its reporting in the stats made little
sense, though this was not easy to spot. In 2.0, the idle_pct variable
moved to the struct thread_info via commit 81036f273 ("MINOR: time:
move the cpu, mono, and idle time to thread_info"). It made it more
obvious that the idle_pct was per thread, and also allowed to more
accurately measure it. But no more effort was made in that direction.

This patch introduces a new report_idle() function that accurately
averages the per-thread idle time over all running threads (i.e. it
should remain valid even if some threads are paused or stopped), and
makes use of it in the stats / "show info" reports.

Sending traffic over only two connections of an 8-thread process
would previously show this erratic CPU usage pattern:

  $ while :; do socat /tmp/sock1 - <<< "show info"|grep ^Idle;sleep 0.1;done
  Idle_pct: 30
  Idle_pct: 35
  Idle_pct: 100
  Idle_pct: 100
  Idle_pct: 100
  Idle_pct: 100
  Idle_pct: 100
  Idle_pct: 100
  Idle_pct: 35
  Idle_pct: 33
  Idle_pct: 100
  Idle_pct: 100
  Idle_pct: 100
  Idle_pct: 100
  Idle_pct: 100
  Idle_pct: 100

Now it shows this more accurate measurement:

  $ while :; do socat /tmp/sock1 - <<< "show info"|grep ^Idle;sleep 0.1;done
  Idle_pct: 83
  Idle_pct: 83
  Idle_pct: 83
  Idle_pct: 83
  Idle_pct: 83
  Idle_pct: 83
  Idle_pct: 83
  Idle_pct: 83
  Idle_pct: 83
  Idle_pct: 83
  Idle_pct: 83
  Idle_pct: 83
  Idle_pct: 83
  Idle_pct: 83
  Idle_pct: 83

This is not technically a bug but this lack of precision definitely affects
some users who rely on the idle_pct measurement. This should at least be
backported to 2.4, and might be to some older releases depending on users
demand.
2021-08-28 11:18:10 +02:00
Marcin Deranek
310a260e4a MEDIUM: config: Deprecate tune.ssl.capture-cipherlist-size
Deprecate tune.ssl.capture-cipherlist-size in favor of
tune.ssl.capture-buffer-size which better describes the purpose of the
setting.
2021-08-26 19:52:04 +02:00
Marcin Deranek
da0264a968 MINOR: sample: Add be2hex converter
Add be2hex converter to convert big-endian binary data into hex string
with optional string separators.
2021-08-26 19:48:34 +02:00
Marcin Deranek
40ca09c7bb MINOR: sample: Add be2dec converter
Add be2dec converter which allows to build JA3 compatible TLS
fingerprints by converting big-endian binary data into string
separated unsigned integers eg.

http-request set-header X-SSL-JA3 %[ssl_fc_protocol_hello_id],\
    %[ssl_fc_cipherlist_bin(1),be2dec(-,2)],\
    %[ssl_fc_extlist_bin(1),be2dec(-,2)],\
    %[ssl_fc_eclist_bin(1),be2dec(-,2)],\
    %[ssl_fc_ecformats_bin,be2dec(-,1)]
2021-08-26 19:48:34 +02:00
Marcin Deranek
959a48c116 MINOR: sample: Expose SSL captures using new fetchers
To be able to provide JA3 compatible TLS Fingerprints we need to expose
all Client Hello captured data using fetchers. Patch provides new
and modifies existing fetchers to add ability to filter out GREASE values:
- ssl_fc_cipherlist_*
- ssl_fc_ecformats_bin
- ssl_fc_eclist_bin
- ssl_fc_extlist_bin
- ssl_fc_protocol_hello_id
2021-08-26 19:48:34 +02:00
Marcin Deranek
769fd2e447 MEDIUM: ssl: Capture more info from Client Hello
When we set tune.ssl.capture-cipherlist-size to a non-zero value
we are able to capture cipherlist supported by the client. To be able to
provide JA3 compatible TLS fingerprinting we need to capture more
information from Client Hello message:
- SSL Version
- SSL Extensions
- Elliptic Curves
- Elliptic Curve Point Formats
This patch allows HAProxy to capture such information and store it for
later use.
2021-08-26 19:48:33 +02:00
Willy Tarreau
5ef965606b BUG/MINOR: lua: use strlcpy2() not strncpy() to copy sample keywords
The lua initialization code which creates the Lua mapping of all converters
and sample fetch keywords makes use of strncpy(), and as such can take ages
to start with large values of tune.bufsize because it spends its time zeroing
gigabytes of memory for nothing. A test performed with an extreme value of
16 MB takes roughly 4 seconds, so it's possible that some users with huge
1 MB buffers (e.g. for payload analysis) notice a small startup latency.
However this does not affect config checks since the Lua stack is not yet
started. Let's replace this with strlcpy2().

This should be backported to all supported versions.
2021-08-26 16:57:48 +02:00
Amaury Denoyelle
dd56520cdf BUG/MINOR: resolvers: mark servers with name-resolution as non purgeable
When a server is configured with name-resolution, resolvers objects are
created with reference to this server. Thus the server is marked as non
purgeable to prevent its removal at runtime.

This does not need to be backport.
2021-08-26 15:53:17 +02:00
William Lallemand
a39e6266d1 BUG/MINOR: proxy: don't dump servers of internal proxies
Patch 211c967 ("MINOR: httpclient: add the server to the proxy") broke
the reg-tests that do a "show servers state".

Indeed the servers of the proxies flagged with PR_CAP_INT are dumped in
the output of this CLI command.

This patch fixes the issue par ignoring the PR_CA_INT proxies in the
dump.
2021-08-25 18:15:31 +02:00
Dragan Dosen
61aa4428c1 BUG/MINOR: base64: base64urldec() ignores padding in output size check
Without this fix, the decode function would proceed even when the output
buffer is not large enough, because the padding was not considered. For
example, it would not fail with the input length of 23 and the output
buffer size of 15, even the actual decoded output size is 17.

This patch should be backported to all stable branches that have a
base64urldec() function available.
2021-08-25 16:14:14 +02:00
Amaury Denoyelle
14c3c5c121 MEDIUM: server: allow to remove servers at runtime except non purgeable
Relax the condition on "delete server" CLI handler to be able to remove
all servers, even non dynamic, except if they are flagged as non
purgeable.

This change is necessary to extend the use cases for dynamic servers
with reload. It's expected that each dynamic server created via the CLI
is manually commited in the haproxy configuration by the user. Dynamic
servers will be present on reload only if they are present in the
configuration file. This means that non-dynamic servers must be allowed
to be removable at runtime.

The dynamic servers removal reg-test has been updated and renamed to
reflect its purpose. A new test is present to check that non-purgeable
servers cannot be removed.
2021-08-25 15:53:54 +02:00
Amaury Denoyelle
86f3707d14 MINOR: server: mark servers referenced by LUA script as non purgeable
Each server that is retrieved by a LUA script is marked as non
purgeable. Note that for this to work, the script must have been
executed already once.
2021-08-25 15:53:54 +02:00
Amaury Denoyelle
0626961ad3 MINOR: server: mark referenced servers as non purgeable
Mark servers that are referenced by configuration elements as non
purgeable. This includes the following list :
- tracked servers
- servers referenced in a use-server rule
- servers referenced in a sample fetch
2021-08-25 15:53:54 +02:00
Amaury Denoyelle
bc2ebfa5a4 MEDIUM: server: extend refcount for all servers
In a future patch, it will be possible to remove at runtime every
servers, both static and dynamic. This requires to extend the server
refcount for all instances.

First, refcount manipulation functions have been renamed to better
express the API usage.

* srv_refcount_use -> srv_take
The refcount is always initialize to 1 on the server creation in
new_server. It's also incremented for each check/agent configured on a
server instance.

* free_server -> srv_drop
This decrements the refcount and if null, the server is freed, so code
calling it must not use the server reference after it. As a bonus, this
function now returns the next server instance. This is useful when
calling on the server loop without having to save the next pointer
before each invocation.

In these functions, remove the checks that prevent refcount on
non-dynamic servers. Each reference to "dynamic" in variable/function
naming have been eliminated as well.
2021-08-25 15:53:54 +02:00
Amaury Denoyelle
0a8d05d31c BUG/MINOR: stats: use refcount to protect dynamic server on dump
A dynamic server may be deleted at runtime at the same moment when the
stats applet is pointing to it. Use the server refcount to prevent
deletion in this case.

This should be backported up to 2.4, with an observability period of 2
weeks. Note that it requires the dynamic server refcounting feature
which has been implemented on 2.5; the following commits are required :

- MINOR: server: implement a refcount for dynamic servers
- BUG/MINOR: server: do not use refcount in free_server in stopping mode
- MINOR: server: return the next srv instance on free_server
2021-08-25 15:53:43 +02:00
Amaury Denoyelle
f5c1e12e44 MINOR: server: return the next srv instance on free_server
As a convenience, return the next server instance from servers list on
free_server.

This is particularily useful when using this function on the servers
list without having to save of the next pointer before calling it.
2021-08-25 15:29:19 +02:00
devnexen@gmail.com
21185970c1 MINOR: proc: setting the process to produce a core dump on FreeBSD.
using the procctl api to set the current process as traceable, thus being able to produce a core dump as well.

making it as compile option if not wished or using freebsd prior to 11.x (last no EOL release).
2021-08-25 05:14:27 +02:00
Ilya Shipitsin
ff0f278860 CLEANUP: assorted typo fixes in the code and comments
This is 26th iteration of typo fixes
2021-08-25 05:13:31 +02:00
William Lallemand
957ab13d7b BUILD: httpclient: fix build without OpenSSL
Add some defines around the ssl server so we can build without OpenSSL.
2021-08-24 18:33:28 +02:00
William Lallemand
4463b17fe3 BUG/MINOR: httpclient: fix Host header
THe http_update_update_host function takes an URL and extract the domain
to use as a host header. However it only update an existing host header
and does not create one.

This patch add an empty host header so the function can update it.
2021-08-24 17:53:03 +02:00
William Lallemand
211c9679c8 MINOR: httpclient: add the server to the proxy
Add the raw and ssl server to the proxy list so they can be freed during
the deinit() of HAProxy. As a side effect the 2 servers need to have a
different ID so the SSL one was renamed "<HTTPSCLIENT>".
2021-08-24 17:18:13 +02:00
William Lallemand
cfcbe9ebd9 MINOR: httpclient: set verify none on the https server
There is currently no way to specify the CA to verify from the
httpclient API. Sets the verify to none so we can still do https
request.
2021-08-24 17:15:58 +02:00
Dragan Dosen
f3899ddbcb BUG/MEDIUM: base64: check output boundaries within base64{dec,urldec}
Ensure that no more than olen bytes is written to the output buffer,
otherwise we might experience an unexpected behavior.

While the original code used to validate that the output size was
always large enough before starting to write, this validation was
later broken by the commit below, allowing to 3-byte blocks to
areas whose size is not multiple of 3:

  commit ed697e4856
  Author: Emeric Brun <ebrun@haproxy.com>
  Date:   Mon Jan 14 14:38:39 2019 +0100

    BUG/MINOR: base64: dec func ignores padding for output size checking

    Decode function returns an error even if the ouptut buffer is
    large enought because the padding was not considered. This
    case was never met with current code base.

For base64urldec(), it's basically the same problem except that since
the input format supports arbitrary lengths, the problem has always
been there since its introduction in 2.4.

This should be backported to all stable branches having a backport of
the patch above (i.e. 2.0), with some adjustments depending on the
availability of the base64dec() and base64urldec().
2021-08-24 16:10:49 +02:00
William Lallemand
76ad371b86 BUG/MINOR: httpclient: remove deinit of the httpclient
The httpclient does a free of the servers and proxies it uses, however
since we are including them in the global proxy list, haproxy already
free them during the deinit. We can safely remove these free.
2021-08-24 15:11:03 +02:00
Willy Tarreau
ece4c4a352 BUG/MINOR: stick-table: fix the sc-set-gpt* parser when using expressions
The sc-set-gpt0() parser was extended in 2.1 by commit 0d7712dff ("MINOR:
stick-table: allow sc-set-gpt0 to set value from an expression") to support
sample expressions in addition to plain integers. However there is a
subtlety there, which is that while the arg position must be incremented
when parsing an integer, it must not be touched when calling an expression
since the expression parser already does it.

The effect is that rules making use of sc-set-gpt0() followed by an
expression always ignore one word after that expression, and will typically
fail to parse if followed by an "if" as the parser will restart after the
"if". With no condition it's different because an empty condition doesn't
result in trying to parse anything.

This patch moves the increment at the right place and adds a few
explanations for a code part that was far from being obvious.

This should be backported to branches having the commit above (2.1+).
2021-08-24 15:05:48 +02:00
William Lallemand
8b673f0fe3 CLEANUP: ssl: remove useless check on p in openssl_version_parser()
Remove a  useless check on a pointer which reports a NULL dereference on
coverity.

Fixes issue #1358.
2021-08-22 13:36:11 +02:00
William Lallemand
3aeb3f9347 MINOR: cfgcond: implements openssl_version_atleast and openssl_version_before
Implements a way of checking the running openssl version:

If the OpenSSL support was not compiled within HAProxy it will returns a
error, so it's recommanded to do a SSL feature check before:

	$ ./haproxy -cc 'feature(OPENSSL) && openssl_version_atleast(0.9.8zh) && openssl_version_before(3.0.0)'

This will allow to select the SSL reg-tests more carefully.
2021-08-22 00:30:24 +02:00
William Lallemand
44d862d8d4 MINOR: ssl: add an openssl version string parser
openssl_version_parser() parse a string in the OpenSSL version format
which is documented here:

https://www.openssl.org/docs/man1.1.1/man3/OPENSSL_VERSION_NUMBER.html

The function returns an unsigned int that could be used for comparing
openssl versions.
2021-08-21 23:44:02 +02:00
devnexen@gmail.com
c4e5232db8 MINOR: tools: add FreeBSD support to get_exec_path()
FreeBSD stores the absolute path into the auxiliary vector as well.
The auxiliary vector is found in __elf_aux_vector there.
2021-08-20 17:33:32 +02:00
Willy Tarreau
1e7bef17df MINOR: hlua: take the global Lua lock inside a global function
Some users are facing huge CPU usage or even watchdog panics due to
the Lua global lock when many threads compete on it, but they have
no way to see that in the usual dumps. We take the lock at 2 or 3
places only, thus it's trivial to move it to a global function so
that stack dumps will now explicitly show it, increasing the change
that it rings a bell and someone suggests switch to lua-load-per-thread:

  Current executing Lua from a stream analyser -- stack traceback:
      loop.lua:1: in function line 1
  call trace(27):
  |       0x5ff157 [48 83 c4 10 5b 5d 41 5c]: wdt_handler+0xf7/0x104
  | 0x7fe37fe82690 [48 c7 c0 0f 00 00 00 0f]: libpthread:+0x13690
  |       0x614340 [66 48 0f 7e c9 48 01 c2]: main+0x1e8a40
  |       0x607b85 [48 83 c4 08 48 89 df 31]: main+0x1dc285
  |       0x6070bc [48 8b 44 24 20 48 8b 14]: main+0x1db7bc
  |       0x607d37 [41 89 c4 89 44 24 1c 83]: lua_resume+0xc7/0x214
  |       0x464ad6 [83 f8 06 0f 87 f1 01 00]: main+0x391d6
  |       0x4691a7 [83 f8 06 0f 87 03 20 fc]: main+0x3d8a7
  |       0x51dacb [85 c0 74 61 48 8b 5d 20]: sample_process+0x4b/0xf7
  |       0x51e55c [48 85 c0 74 3f 64 48 63]: sample_fetch_as_type+0x3c/0x9b
  |       0x525613 [48 89 c6 48 85 c0 0f 84]: sess_build_logline+0x2443/0x3cae
  |       0x4af0be [4c 63 e8 4c 03 6d 10 4c]: http_apply_redirect_rule+0xbfe/0xdf8
  |       0x4af523 [83 f8 01 19 c0 83 e0 03]: main+0x83c23
  |       0x4b2326 [83 f8 07 0f 87 99 00 00]: http_process_req_common+0xf6/0x15f6
  |       0x4d5b30 [85 c0 0f 85 9f f5 ff ff]: process_stream+0x2010/0x4e18

It also allows "perf top" to directly show the time spent on this lock.

This may be backported to some stable versions as it improves the
overall debuggability.
2021-08-20 17:33:26 +02:00
William Lallemand
2a8fe8bb48 MINOR: httpclient: cleanup the include files
Include the correct .h files in http_client.c and http_client.h.

The api.h is needed in http_client.c and http_client-t.h is now include
directly from http_client.h
2021-08-20 14:25:15 +02:00
William Lallemand
0d6f7790fb BUG/MINOR: httpclient: check if hdr_num is not 0
Check if hdr_num is not 0 before allocating or copying the headers to
the hc->hdrs space.
2021-08-20 11:59:49 +02:00
William Lallemand
dfc3f8906d BUG/MINOR: httpclient/cli: change the appctx test in the callbacks
The callbacks of the CLI httpclient are testing the appctx pointer
before doing the appctx_wakeup but was dereferencing the appctx pointer
before.
2021-08-20 11:53:16 +02:00
William Lallemand
b70203017b BUG/MINOR: httpclient: fix uninitialized sl variable
Reported by coverity in ticket #1355

  CID 1461505:  Memory - illegal accesses  (UNINIT)
  Using uninitialized value "sl".

Fix the problem by initializing sl to NULL.
2021-08-20 11:53:16 +02:00
Willy Tarreau
0e72e40f7e BUG/MINOR: http_client: make sure to preset the proxy's default settings
Proxies must call proxy_preset_defaults() to initialize their settings
that are usually learned from defaults sections (e.g. connection retries,
pool purge delay etc). At the moment there was likely no impact, but not
doing so could cause trouble soon when using the client more extensively
or when new defaults are introduced and failed to be initialized.

No backport is needed.
2021-08-20 10:23:12 +02:00
Willy Tarreau
d3dbfd9085 BUG/MEDIUM: cfgparse: do not allocate IDs to automatic internal proxies
Recent commit 83614a9fb ("MINOR: httpclient: initialize the proxy") broke
reg tests that match the output of "show stats" or "show servers state"
because it changed the proxies' numeric ID.

In fact it did nothing wrong, it just registers a proxy and adds it at
the head of the list. But the automatic numbering scheme, which was made
to make sure that temporarily disabled proxies in the config keep their
ID instead of shifting all others, sees one more proxy and increments
next_pxid for all subsequent proxies.

This patch avoids this by not assigning automatic IDs to such internal
proxies, leaving them with their ID of -1, and by not shifting next_pxid
for them. This is important because the user might experience them
appearing or disappearing depending on apparently unrelated config
options or build options, and this must not cause visible proxy IDs
to change (e.g. stats or minitoring may break).

Though the issue has always been there, it only became a problem with
the recent proxy additions so there is no need to backport this.
2021-08-20 10:22:41 +02:00
William Lallemand
b0281a4903 MINOR: proxy: check if p is NULL in free_proxy()
Check if p is NULL before trying to do anything  in free_proxy(),
like most free()-like function do.
2021-08-20 10:20:56 +02:00
William Lallemand
4c395fce21 MINOR: server: check if srv is NULL in free_server()
Check if srv is NULL before trying to do anything  in free_server(),
like most free()-like function do.
2021-08-20 10:20:51 +02:00
Remi Tricot-Le Breton
f95c29546c BUILD/MINOR: ssl: Fix compilation with OpenSSL 1.0.2
The X509_STORE_CTX_get0_cert did not exist yet on OpenSSL 1.0.2 and
neither did X509_STORE_CTX_get0_chain, which was not actually needed
since its get1 equivalent already existed.
2021-08-20 10:05:58 +02:00
Willy Tarreau
46b7dff8f0 BUG/MEDIUM: h2: match absolute-path not path-absolute for :path
RFC7540 states that :path follows RFC3986's path-absolute. However
that was a bug introduced in the spec between draft 04 and draft 05
of the spec, which implicitly causes paths starting with "//" to be
forbidden. HTTP/1 (and now HTTP core semantics) made it explicit
that the request-target in origin-form follows a purposely defined
absolute-path defined as 1*(/ segment) to explicitly allow "//".

http2bis now fixes this by relying on absolute-path so that "//"
becomes valid and matches other versions. Full discussion here:

  https://lists.w3.org/Archives/Public/ietf-http-wg/2021JulSep/0245.html

This issue appeared in haproxy with commit 4b8852c70 ("BUG/MAJOR: h2:
verify that :path starts with a '/' before concatenating it") when
making the checks on :path fully comply with the spec, and was backported
as far as 2.0, so this fix must be backported there as well to allow
"//" in H2 again.
2021-08-19 23:38:18 +02:00
Remi Tricot-Le Breton
74f6ab6e87 MEDIUM: ssl: Keep a reference to the client's certificate for use in logs
Most of the SSL sample fetches related to the client certificate were
based on the SSL_get_peer_certificate function which returns NULL when
the verification process failed. This made it impossible to use those
fetches in a log format since they would always be empty.

The patch adds a reference to the X509 object representing the client
certificate in the SSL structure and makes use of this reference in the
fetches.

The reference can only be obtained in ssl_sock_bind_verifycbk which
means that in case of an SSL error occurring before the verification
process ("no shared cipher" for instance, which happens while processing
the Client Hello), we won't ever start the verification process and it
will be impossible to get information about the client certificate.

This patch also allows most of the ssl_c_XXX fetches to return a usable
value in case of connection failure (because of a verification error for
instance) by making the "conn->flags & CO_FL_WAIT_XPRT" test (which
requires a connection to be established) less strict.

Thanks to this patch, a log-format such as the following should return
usable information in case of an error occurring during the verification
process :
    log-format "DN=%{+Q}[ssl_c_s_dn] serial=%[ssl_c_serial,hex] \
                hash=%[ssl_c_sha1,hex]"

It should answer to GitHub issue #693.
2021-08-19 23:26:05 +02:00
William Lallemand
2484da5ebc MINOR: httpclient/cli: change the User-Agent to "HAProxy"
Change the User-Agent from "HAProxy HTTP client" to "HAProxy" as the
previous name is not valid according to RFC 7231#5.5.3.

This patch fixes issue #1354.
2021-08-19 15:55:19 +02:00
William Lallemand
03a4eb154f MINOR: httpclient/cli: implement a simple client over the CLI
This commit implements an HTTP Client over the CLI, this was made as
working example for the HTTP Client API.

It usable over the CLI by specifying a method and an URL:

    echo "httpclient GET http://127.0.0.1:8000/demo.file" | socat /tmp/haproxy.sock -

Only IP addresses are accessibles since the API does not allow to
resolve addresses yet.
2021-08-18 18:25:05 +02:00
William Lallemand
33b0d095cc MINOR: httpclient: implement a simple HTTP Client API
This commit implements a very simple HTTP Client API.

A client can be operated by several functions:

    - httpclient_new(), httpclient_destroy(): create
      and destroy the struct httpclient instance.

    - httpclient_req_gen(): generate a complete HTX request using the
      the absolute URL, the method and a list of headers. This request
      is complete and sets the HTX End of Message flag. This is limited
      to small request we don't need a body.

    - httpclient_start() fill a sockaddr storage with a IP extracted
      from the URL (it cannot resolve an fqdm for now), start the
      applet. It also stores the ptr of the caller which could be an
      appctx or something else.

   - hc->ops contains a list of callbacks used by the
     HTTPClient, they should be filled manually after an
     httpclient_new():

        * res_stline(): the client received a start line, its content
          will be stored in hc->res.vsn, hc->res.status, hc->res.reason

        * res_headers(): the client received headers, they are stored in
          hc->res.hdrs.

        * res_payload(): the client received some payload data, they are
        stored in the hc->res.buf buffer and could be extracted with the
        httpclient_res_xfer() function, which takes a destination buffer
        as a parameter

        * res_end(): this callback is called once we finished to receive
        the response.
2021-08-18 17:36:32 +02:00
William Lallemand
83614a9fbe MINOR: httpclient: initialize the proxy
Initialize a proxy which contain a server for the raw HTTP, and another
one for the HTTPS. This proxy will use the global server log definition
and the 'option httplog' directive.

This proxy is internal and will only be used for the HTTP Client API.
2021-08-18 17:35:48 +02:00
Willy Tarreau
b5d2b9e154 BUG/MEDIUM: h2: give :authority precedence over Host
The wording regarding Host vs :authority in RFC7540 is ambiguous as it
says that an intermediary must produce a host header from :authority if
Host is missing, but, contrary to HTTP/1.1, doesn't say anything regarding
the possibility that Host and :authority differ, which leaves Host with
higher precedence there. In addition it mentions that clients should use
:authority *instead* of Host, and that H1->H2 should use :authority only
if the original request was in authority form. This leaves some gray
area in the middle of the chain for fully valid H2 requests arboring a
Host header that are forwarded to the other side where it's possible to
drop the Host header and use the authority only after forwarding to a
second H2 layer, thus possibly seeing two different values of Host at
a different stage. There's no such issue when forwarding from H2 to H1
as the authority is dropped only only the Host is kept.

Note that the following request is sufficient to re-normalize such a
request:

   http-request set-header host %[req.hdr(host)]

The new spec in progress (draft-ietf-httpbis-http2bis-03) addresses
this trouble by being a bit is stricter on these rules. It clarifies
that :authority must always be used instead of Host and that Host ought
to be ignored. This is much saner as it avoids to convey two distinct
values along the chain. This becomes the protocol-level equivalent of:

   http-request set-uri %[url]

So this patch does exactly this, which we were initially a bit reluctant
to do initially by lack of visibility about other implementations'
expectations. In addition it slightly simplifies the Host header field
creation by always placing it first in the list of headers instead of
last; this could also speed up the look up a little bit.

This needs to be backported to 2.0. Non-HTX versions are safe regarding
this because they drop the URI during the conversion to HTTP/1.1 so
only Host is used and transmitted.

Thanks to Tim Dsterhus for reporting that one.
2021-08-17 10:21:07 +02:00
Willy Tarreau
89265224d3 BUG/MAJOR: h2: enforce stricter syntax checks on the :method pseudo-header
Before HTX was introduced, all the HTTP request elements passed in
pseudo-headers fields were used to build an HTTP/1 request whose syntax
was then scrutinized by the HTTP/1 parser, leaving no room to inject
invalid characters.

While NUL, CR and LF are properly blocked, it is possible to inject
spaces in the method so that once translated to HTTP/1, fields are
shifted by one spcae, and a lenient HTTP/1 server could possibly be
fooled into using a part of the method as the URI. For example, the
following request:

   H2 request
     :method: "GET /admin? HTTP/1.1"
     :path:   "/static/images"

would become:

   GET /admin? HTTP/1.1 /static/images HTTP/1.1

It's important to note that the resulting request is *not* valid, and
that in order for this to be a problem, it requires that this request
is delivered to an already vulnerable HTTP/1 server.

A workaround here is to reject malformed methods by placing this rule
in the frontend or backend, at least before leaving haproxy in H1:

   http-request reject if { method -m reg [^A-Z0-9] }

Alternately H2 may be globally disabled by commenting out the "alpn"
directive on "bind" lines, and by rejecting H2 streams creation by
adding the following statement to the global section:

   tune.h2.max-concurrent-streams 0

This patch adds a check for each character of the method to make sure
they belong to the ones permitted in a token, as mentioned in RFC7231#4.1.

This should be backported to versions 2.0 and above. For older versions
not having HTX_FL_PARSING_ERROR, a "goto fail" works as well as it
results in a protocol error at the stream level. Non-HTX versions are
safe because the resulting invalid request will be rejected by the
internal HTTP/1 parser.

Thanks to Tim Dsterhus for reporting that one.
2021-08-17 10:18:52 +02:00
Willy Tarreau
4b8852c70d BUG/MAJOR: h2: verify that :path starts with a '/' before concatenating it
Tim Dsterhus found that while the H2 path is checked for non-emptiness,
invalid chars and '*', a test is missing to verify that except for '*',
it always starts with exactly one '/'. During the reconstruction of the
full URI when passing to HTX, this missing test allows to affect the
apparent authority by appending a port number or a suffix name.

This only affects H2-to-H2 communications, as H2-to-H1 do not use the
full URI. Like for previous fix, the following rule inserted before
other ones in the frontend is sufficient to renormalize the internal
URI and let haproxy see the same authority as the target server:

    http-request set-uri %[url]

This needs to be backported to 2.2. Earlier versions do not rebuild a
full URI using the authority and will fail on the malformed path at the
HTTP layer, so they are safe.
2021-08-17 10:16:22 +02:00
Willy Tarreau
a495e0d948 BUG/MAJOR: h2: verify early that non-http/https schemes match the valid syntax
While we do explicitly check for strict character sets in the scheme,
this is only done when extracting URL components from an assembled one,
and we have special handling for "http" and "https" schemes directly in
the H2-to-HTX conversion. Sadly, this lets all other ones pass through
if they start exactly with "http://" or "https://", allowing the
reconstructed URI to start with a different looking authority if it was
part of the scheme.

It's interesting to note that in this case the valid authority is in
the Host header and that the request will only be wrong if emitted over
H2 on the backend side, since H1 will not emit an absolute URI by
default and will drop the scheme. So in essence, this is a variant of
the scheme-based attack described below in that it only affects H2-H2
and not H2-H1 forwarding:

   https://portswigger.net/research/http2

As such, a simple workaround consists in just inserting the following
rule before other ones in the frontend, which will have for effect to
renormalize the authority in the request line according to the
concatenated version (making haproxy see the same authority and host
as what the target server will see):

   http-request set-uri %[url]

This patch simply adds the missing syntax checks for non-http/https
schemes before the concatenation in the H2 code. An improvement may
consist in the future in splitting these ones apart in the start
line so that only the "url" sample fetch function requires to access
them together and that all other places continue to access them
separately. This will then allow the core code to perform such checks
itself.

The patch needs to be backported as far as 2.2. Before 2.2 the full
URI was not being reconstructed so the scheme and authority part were
always dropped from H2 requests to leave only origin requests. Note
for backporters: this depends on this previous patch:

  MINOR: http: add a new function http_validate_scheme() to validate a scheme

Many thanks to Tim Dsterhus for figuring that one and providing a
reproducer.
2021-08-17 10:16:22 +02:00
Willy Tarreau
d3d8d03d98 MINOR: http: add a new function http_validate_scheme() to validate a scheme
While http_parse_scheme() extracts a scheme from a URI by extracting
exactly the valid characters and stopping on delimiters, this new
function performs the same on a fixed-size string.
2021-08-17 10:16:22 +02:00
David Carlier
bd2ccedcc5 BUILD: tools: get the absolute path of the current binary on NetBSD.
NetBSD stores the absolute path into the auxiliary vector as well.
2021-08-17 09:54:28 +02:00
Ilya Shipitsin
01881087fc CLEANUP: assorted typo fixes in the code and comments
This is 25th iteration of typo fixes
2021-08-16 12:37:59 +02:00
Christopher Faulet
e48d1dc2d9 BUG/MINOR: lua/filters: Return right code when txn:done() is called
txn functions can now be called from an action or a filter context. Thus the
return code must be adapted depending on this context. From an action, act.ABORT
is returned. From a filter, -1 is returned. It is the filter error code.

This bug only affects 2.5-dev. No backport needed.
2021-08-13 17:14:47 +02:00
Christopher Faulet
26eb5ea352 BUG/MINOR: filters: Always set FLT_END analyser when CF_FLT_ANALYZE flag is set
CF_FLT_ANALYZE flags may be set before the FLT_END analyser. Thus if an error is
triggered in the mean time, this may block the stream and prevent it to be
released. It is indeed a problem only for the response channel because the
response analysers may be skipped on early errors.

So, to prevent any issue, depending on the code path, the FLT_END analyser is
systematically set when the CF_FLT_ANALYZE flag is set.

This patch must be backported in all stable branches.
2021-08-13 17:14:47 +02:00
William Lallemand
2c04a5a03d MINOR: proxy: disable warnings for internal proxies
The internal proxies should be part of the proxies list, because of
this, the check_config_validity() fonction could emit warnings about
these proxies.

This patch disables 3 startup warnings for internal proxies:

- "has no 'bind' directive" (this one was already ignored for the CLI
frontend, but we made it generic instead)
- "missing timeouts"
- "log format ignored"
2021-08-13 15:34:16 +02:00
William Lallemand
6640dbb524 MINOR: cli: delare the CLI frontend as an internal proxy
Declare the CLI frontend as an internal proxy so we can check the
PR_CAP_INT flag instead of the global.fe_cli pointer for generic use
cases.
2021-08-13 15:34:10 +02:00
Emeric Brun
bc5c821cc2 BUG/MEDIUM: cfgcheck: verify existing log-forward listeners during config check
User reported that the config check returns an error with the message:
"Configuration file has no error but will not start (no listener) => exit(2)."
if the configuration present only a log-forward section with bind or dgram-bind
listeners but no listen/backend nor peer sections.

The process checked if there was 'peers' section avalaible with
an internal frontend (and so a listener) or a 'listen/backend'
section not disabled with at least one configured listener (into the
global proxies_list). Since the log-forward proxies appear in a
different list, they were not checked.

This patch adds a lookup on the 'log-forward' proxies list to check
if one of them presents a listener and is not disabled. And
this is done only if there was no available listener found into
'listen/backend' sections.

I have also studied how to re-work this check considering the 'listeners'
counter used after startup/init to keep the same algo and avoid further
mistakes but currently this counter seems increased during config parsing
and if a proxy is disabled, decreased during startup/init which is done
after the current config check. So the fix still not rely on this
counter.

This patch should fix the github issue #1346

This patch should be backported as far as 2.3 (so on branches
including the "log-forward" feature)
2021-08-13 11:21:57 +02:00
Christopher Faulet
c86bb87f10 BUG/MINOR: lua: Properly catch alloc errors when parsing lua filter directives
When a lua filter declaration is parsed, some allocation errors were not
properly handled. In addition, we must be sure the filter identifier is defined
in lua to duplicate it when the filter configuration is filled.

This patch fix a defect reported in the issue #1347. It only concerns
2.5-dev. No backport needed.
2021-08-13 08:42:00 +02:00
Christopher Faulet
70c4345dbc BUG/MINOR: lua: Properly check negative offset in Channel/HttpMessage functions
In Channel and HTTPMessage classes, several functions uses an offset that
may be negative to start from the end of incoming data. But, after
calculation, the offset must never be negative. However, there is a bug
because of a bad cast to unsigned when "input + offset" is performed. The
result must be a signed integer.

This patch should fix most of defects reported in the issue #1347. It only
affects 2.5-dev. No backport needed.
2021-08-13 08:36:42 +02:00
Christopher Faulet
eae8afaa60 MINOR: filters/lua: Support the HTTP filtering from filters written in lua
Now an HTTPMessage class is available to manipulate HTTP message from a filter
it is possible to bind HTTP filters callback function on lua functions. Thus,
following methods may now be defined by a lua filter:

  * Filter:http_headers(txn, http_msg)
  * Filter:http_payload(txn, http_msg, offset, len)
  * Filter:http_end(txn, http_msg)

http_headers() and http_end() may return one of the constant filter.CONTINUE,
filter.WAIT or filter.ERROR. If nothing is returned, filter.CONTINUE is used as
the default value. On its side, http_payload() may return the amount of data to
forward. If nothing is returned, all incoming data are forwarded.

For now, these functions are not allowed to yield because this interferes with
the filter workflow.
2021-08-12 08:57:07 +02:00
Christopher Faulet
78c35471f8 MINOR: filters/lua: Add request and response HTTP messages in the lua TXN
When a lua TXN is created from a filter context, the request and the response
HTTP message objects are accessible from ".http_req" and ".http_res" fields. For
an HTTP proxy, these objects are always defined. Otherwise, for a TCP proxy, no
object is created and nil is used instead. From any other context (action or
sample fetch), these fields don't exist.
2021-08-12 08:57:07 +02:00
Christopher Faulet
df97ac4584 MEDIUM: filters/lua: Add HTTPMessage class to help HTTP filtering
This new class exposes methods to manipulate HTTP messages from a filter
written in lua. Like for the HTTP class, there is a bunch of methods to
manipulate the message headers. But there are also methods to manipulate the
message payload. This part is similar to what is available in the Channel
class. Thus the payload can be duplicated, erased, modified or
forwarded. For now, only DATA blocks can be retrieved and modified because
the current API is limited. No HTTPMessage method is able to yield. Those
manipulating the headers are always called on messages containing all the
headers, so there is no reason to yield. Those manipulating the payload are
called from the http_payload filters callback function where yielding is
forbidden.

When an HTTPMessage object is instantiated, the underlying Channel object
can be retrieved via the ".channel" field.

For now this class is not used because the HTTP filtering is not supported
yet. It will be the purpose of another commit.

There is no documentation for now.
2021-08-12 08:57:07 +02:00
Christopher Faulet
c404f1126c MEDIUM: filters/lua: Support declaration of some filter callback functions in lua
It is now possible to write some filter callback functions in lua. All
filter callbacks are not supported yet but the mechanism to call them is now
in place. Following method may be defined in the Lua filter class to be
bound on filter callbacks:

  * Filter:start_analyse(txn, chn)
  * Filter:end_analyse(txn, chn)
  * Filter:tcp_payload(txn, chn, offset, length)

hlua_filter_callback() function is responsible to call the good lua function
depending on the filter callback function. Using some flags it is possible
to allow a lua call to yield or not, to retrieve a return value or not, and
to specify if a channel or an http message must be passed as second
argument. For now, the HTTP part has not been added yet. It is also possible
to add extra argument adding them on the stack before the call.

3 new functions are exposed by the global object "filter". The first one,
filter.wake_time(ms_delay), to set the wake_time when a Lua callback
function yields (if allowed). The two others,
filter.register_data_filter(filter, chn) and
filter.unregister_data_filter(filter, chn), to enable or disable the data
filtering on a channel for a specific lua filter instance.

start_analyse() and end_analyse() may return one of the constant
filter.CONTINUE, filter.WAIT or filter.ERROR. If nothing is returned,
filter.CONTINUE is used as the default value. On its side, tcp_payload() may
return the amount of data to forward. If nothing is returned, all incoming
data are forwarded.

For now, these functions are not allowed to yield because this interferes
with the filter workflow.

Here is a simple example :

    MyFilter = {}
    MyFilter.id = "My Lua filter"
    MyFilter.flags = filter.FLT_CFG_FL_HTX
    MyFilter.__index = MyFilter

    function MyFilter:new()
        flt = {}
        setmetatable(flt, MyFilter)
        flt.req_len = 0
        flt.res_len = 0
        return flt
     end

    function MyFilter:start_analyze(txn, chn)
        filter.register_data_filter(self, chn)
    end

    function MyFilter:end_analyze(txn, chn)
        print("<Total> request: "..self.req_len.." - response: "..self.res_len)
    end

    function MyFilter:tcp_payload(txn, chn)
        offset = chn:ouput()
	len    = chn:input()
        if chn:is_resp() then
            self.res_len = self.res_len + len
            print("<TCP:Response> offset: "..offset.." - length: "..len)
        else
            self.req_len = self.req_len + len
            print("<TCP:Request> offset: "..offset.." - length: "..len)
        end
    end
2021-08-12 08:57:07 +02:00
Christopher Faulet
a1ac5fb28e MEDIUM: filters/lua: Be prepared to filter TCP payloads
For filters written in lua, the tcp payloads will be filtered using methods
exposed by the Channel class. So the corrsponding C binding functions must
be prepared to process payload in a filter context and not only in an action
context.

The main change is the offset where to start to process data in the channel
buffer, and the length of these data. For an action, all input data are
considered. But for a filter, it depends on what the filter is allow to
forward when the tcp_payload callback function is called. It depends on
previous calls but also on other filters.

In addition, when the payload is modified by a lua filter, its context must
be updated. Note also that channel functions cannot yield when called from a
filter context.

For now, it is not possible to define callbacks to filter data and the
documentation has not been updated.
2021-08-12 08:57:07 +02:00
Christopher Faulet
8c9e6bba0f MINOR: lua: Add flags on the lua TXN to know the execution context
A lua TXN can be created when a sample fetch, an action or a filter callback
function is executed. A flag is now used to track the execute context.
Respectively, HLUA_TXN_SMP_CTX, HLUA_TXN_ACT_CTX and HLUA_TXN_FLT_CTX. The
filter flag is not used for now.
2021-08-12 08:57:07 +02:00
Christopher Faulet
9f55a5012e MINOR: lua: Add a function to get a filter attached to a channel class
For now, there is no support for filters written in lua. So this function,
if called, will always return NULL. But when it will be called in a filter
context, it will return the filter structure attached to a channel
class. This function is also responsible to set the offset of data that may
be processed and the length of these data. When called outside a filter
context (so from an action), the offset is the input data position and the
length is the input data length. From a filter, the offset and the length of
data that may be filtered are retrieved the filter context.
2021-08-12 08:57:07 +02:00
Christopher Faulet
69c581a092 MEDIUM: filters/lua: Add support for dummy filters written in lua
It is now possible to write dummy filters in lua. Only the basis to declare
such filters has been added for now. There is no way to declare callbacks to
filter anything. Lua filters are for now empty nutshells.

To do so, core.register_filter() must be called, with 3 arguments, the
filter's name (as it appears in HAProxy config), the lua class that will be
used to instantiate filters and a function to parse arguments passed on the
filter line in HAProxy configuration file. The lua filter class must at
least define the method new(), without any extra args, to create new
instances when streams are created. If this method is not found, the filter
will be ignored.

Here is a template to declare a new Lua filter:

    // haproxy.conf
    global
        lua-load /path/to/my-filter.lua
        ...

    frontend fe
        ...
        filter lua.my-lua-filter arg1 arg2 arg3
        filter lua.my-lua-filter arg4 arg5

    // my-filter.lua
    MyFilter = {}
    MyFilter.id = "My Lua filter"          -- the filter ID (optional)
    MyFilter.flags = filter.FLT_CFG_FL_HTX -- process HTX streams (optional)
    MyFilter.__index = MyFilter

    function MyFilter:new()
        flt = {}
        setmetatable(flt, MyFilter)
        -- Set any flt fields. self.args can be used
	flt.args = self.args

        return flt -- The new instance of Myfilter
    end

    core.register_filter("my-lua-filter", MyFilter, function(filter, args)
        -- process <args>, an array of strings. For instance:
        filter.args = args
        return filter
    end)

In this example, 2 filters are declared using the same lua class. The
parsing function is called for both, with its own copy of the lua class. So
each filter will be unique.

The global object "filter" exposes some constants and flags, and later some
functions, to help writting filters in lua.

Internally, when a lua filter is instantiated (so when new() method is
called), 2 lua contexts are created, one for the request channel and another
for the response channel. It is a prerequisite to let some callbacks yield
on one side independently on the other one.

There is no documentation for now.
2021-08-12 08:57:07 +02:00
Christopher Faulet
6a79fc16bd MEDIUM: lua: Improve/revisit the lua api to manipulate channels
First of all, following functions are now considered deprecated:

  * Channel:dup()
  * Channel:get()
  * Channel:getline()
  * Channel:get_in_len()
  * Cahnnel:get_out_len()

It is just informative, there is no warning and functions may still be
used. Howver it is recommended to use new functions. New functions are more
flexible and use a better naming pattern. In addition, the same names will
be used in the http_msg class to manipulate http messages from lua filters.

The new API is:

  * Channel:data()
  * Channel:line()
  * Channel:append()
  * Channel:prepend()
  * Channel:insert()
  * Channel:remove()
  * Channel:set()
  * Channel:input()
  * Channel:output()
  * Channel:send()
  * Channel:forward()
  * Channel:is_resp()
  * Channel:is_full()
  * Channel:may_recv()

The lua documentation was updated accordingly.
2021-08-12 08:57:07 +02:00
Christopher Faulet
9a6ffda795 MEDIUM: lua: Process buffer data using an offset and a length
The main change is that following functions will now process channel's data
using an offset and a length:

  * hlua_channel_dup_yield()
  * hlua_channel_get_yield()
  * hlua_channel_getline_yield()
  * hlua_channel_append_yield()
  * hlua_channel_set()
  * hlua_channel_send_yield()
  * hlua_channel_forward_yield()

So for now, the offset is always the input data position and the length is
the input data length. But with the support for filters, from a filter
context, these values will be relative to the filter.

To make all processing clearer, the function _hlua_channel_dup() has been
updated and _hlua_channel_dupline(), _hlua_channel_insert() and
_hlua_channel_delete() have been added.

This patch is mandatory to allow the support of the filters written in lua.
2021-08-12 08:57:07 +02:00
Christopher Faulet
ba9e21dc68 MINOR: lua: Add a function to get a reference on a table in the stack
The hlua_checktable() function may now be used to create and return a
reference on a table in stack, given its position. This function ensures it
is really a table and throws an exception if not.

This patch is mandatory to allow the support of the filters written in lua.
2021-08-12 08:57:07 +02:00
Christopher Faulet
03fb1b26f7 MINOR: filters/lua: Release filters before the lua context
This patch is mandatory to allow the support of the filters written in lua.
2021-08-12 08:57:07 +02:00
Christopher Faulet
23976d9e40 BUG/MINOR: lua: Don't yield in channel.append() and channel.set()
Lua functions to set or append data to the input part of a channel must not
yield because new data may be received while the lua script is suspended. So
adding data to the input part in several passes is highly unpredicatble and
may be interleaved with received data.

Note that if necessary, it is still possible to suspend a lua action by
returning act.YIELD. This way the whole action will be reexecuted later
because of I/O events or a timer. Another solution is to call core.yield().

This bug affects all stable versions. So, it may be backported. But it is
probably not necessary because nobody notice it till now.
2021-08-12 08:57:07 +02:00
Christopher Faulet
2e60aa4dee BUG/MINOR: lua: Yield in channel functions only if lua context can yield
When a script is executed, it is not always allowed to yield. Lua sample
fetches and converters cannot yield. For lua actions, it depends on the
context. When called from tcp content ruleset, an action may yield until the
expiration of the inspect-delay timeout. From http rulesets, yield is not
possible.

Thus, when channel functions (dup, get, append, send...) are called, instead
of yielding when it is not allowed and triggering an error, we just give
up. In this case, some functions do nothing (dup, append...), some others
just interrupt the in-progress job (send, forward...). But, because these
functions don't yield anymore when it is not allowed, the script regains the
control and can continue its execution.

This patch depends on "MINOR: lua: Add a flag on lua context to know the
yield capability at run time". Both may be backported in all stable
versions. However, because nobody notice this bug till now, it is probably
not necessary, excepted if someone ask for it.
2021-08-12 08:57:07 +02:00
Christopher Faulet
1f43a3430e MINOR: lua: Add a flag on lua context to know the yield capability at run time
When a script is executed, a flag is used to allow it to yield. An error is
returned if a lua function yield, explicitly or not. But there is no way to
get this capability in C functions. So there is no way to choose to yield or
not depending on this capability.

To fill this gap, the flag HLUA_NOYIELD is introduced and added on the lua
context if the current script execution is not authorized to yield. Macros
to set, clear and test this flags are also added.

This feature will be usefull to fix some bugs in lua actions execution.
2021-08-12 08:57:07 +02:00
Christopher Faulet
6fcd2d3280 BUG/MINOR: stream: Don't release a stream if FLT_END is still registered
When at least one filter is registered on a stream, the FLT_END analyzer is
called on both direction when all other analyzers have finished their
processing. During this step, filters may release any allocated elements if
necessary. So it is important to not skip it.

Unfortunately, if both stream interfaces are closed, it is possible to not
wait the end of this analyzer. It is possible to be in this situation if a
filter must wait and prevents the analyzer completion. To fix the bug, we
now wait FLT_END analyzer is no longer registered on both direction before
releasing the stream.

This patch may be backported as far as 1.7, but AFAIK, no filter is affected
by this bug. So the backport seems to be optional for now. In any case, it
should remain under observation for some weeks first.
2021-08-12 08:54:16 +02:00
Christopher Faulet
47bfd7b9b7 BUG/MINOR: tcpcheck: Properly detect pending HTTP data in output buffer
In tcpcheck_eval_send(), the condition to detect there are still pending
data in the output buffer is buggy. Presence of raw data must be tested for
TCP connection only. But a condition on the connection was missing to be
sure it is not an HTX connection.

This patch must be backported as far as 2.2.
2021-08-12 07:49:23 +02:00
William Lallemand
7e7765a451 BUG/MINOR: buffer: fix buffer_dump() formatting
The formatting of the buffer_dump() output must be calculated using the
relative counter, not the absolute one, or everything will be broken if
the <from> variable is not a multiple of 16.

Could be backported in all maintained versions.
2021-08-12 00:51:45 +02:00
Amaury Denoyelle
3eb42f91d9 BUG/MEDIUM: server: support both check/agent-check on a dynamic instance
A static server is able to support simultaneously both health chech and
agent-check. Adjust the dynamic server CLI handlers to also support this
configuration.

This should not be backported, unless dynamic server checks are
backported.
2021-08-11 14:41:47 +02:00
Amaury Denoyelle
26cb8342ad BUG/MEDIUM: check: fix leak on agent-check purge
There is currently a leak on agent-check for dynamic servers. When
deleted, the check rules and vars are not liberated. This leak grows
each time a dynamic server with agent-check is deleted.

Replace the manual purge code by a free_check invocation which
centralizes all the details on check cleaning.

There is no leak for health check because in this case the proxy is the
owner of the check vars and rules.

This should not be backported, unless dynamic server checks are
backported.
2021-08-11 14:40:21 +02:00
Amaury Denoyelle
6d7fc446b4 BUG/MINOR: check: fix leak on add dynamic server with agent-check error
If an error occured during a dynamic server creation, free_check is used
to liberate a possible agent-check. However, this does not free
associated vars and rules associated as this is done on another function
named deinit_srv_agent_check.

To simplify the check free and avoid a leak, move free vars/rules in
free_check. This is valid because deinit_srv_agent_check also uses
free_check.

This operation is done only for an agent-check because for a health
check, the proxy instance is the owner of check vars/rules.

This should not be backported, unless dynamic server checks are
backported.
2021-08-11 14:37:42 +02:00
Amaury Denoyelle
25fe1033cb BUG/MINOR: check: do not reset check flags on purge
Do not reset check flags when setting CHK_ST_PURGE.

Currently, this change has no impact. However, it is semantically wrong
to clear important flags such as CHK_ST_AGENT on purge.

Furthermore, this change will become mandatoy for a future fix to
properly free agent checks on dynamic servers removal. For this, it will
be needed to differentiate health/agent-check on purge via CHK_ST_AGENT
to properly free agent checks.

This must not be backported unless dynamic servers checks are
backported.
2021-08-11 14:33:34 +02:00
Amaury Denoyelle
13f2e2ceeb BUG/MINOR: server: do not use refcount in free_server in stopping mode
Currently there is a leak at process shutdown with dynamic servers with
check/agent-check activated. Check purges are not executed on process
stopping, so the server is not liberated due to its refcount.

The solution is simply to ignore the refcount on process stopping mode
and free the server on the first free_server invocation.

This should not be backported, unless dynamic server checks are
backported. In this case, the following commit must be backported first.
  7afa5c1843
  MINOR: global: define MODE_STOPPING
2021-08-09 17:53:30 +02:00
Amaury Denoyelle
7afa5c1843 MINOR: global: define MODE_STOPPING
Define a new mode MODE_STOPPING. It is used to indicate that the process
is in the stopping stage and no event loop runs anymore.
2021-08-09 17:51:55 +02:00
Amaury Denoyelle
9ba34ae710 BUG/MINOR: check: test if server is not null in purge
Test if server is not null before using free_server in the check purge
operation. Currently, the null server scenario should not occured as
purge is used with refcounted dynamic servers. However, this might not
be always the case if purge is use in the future in other cases; thus
the test is useful for extensibility.

No need to backport, unless dynamic server checks are backported.

This has been reported through a coverity report in github issue #1343.
2021-08-09 17:48:34 +02:00
Amaury Denoyelle
b65f4cab6a MEDIUM: server: implement agent check for dynamic servers
This commit is the counterpart for agent check of
"MEDIUM: server: implement check for dynamic servers".

The "agent-check" keyword is enabled for dynamic servers. The agent
check must manually be activated via "enable agent" CLI. This can
enable the dynamic server if the agent response is "ready" without an
explicit "enable server" CLI.
2021-08-06 11:09:48 +02:00
Amaury Denoyelle
2fc4d39577 MEDIUM: server: implement check for dynamic servers
Implement check support for dynamic servers. The "check" keyword is now
enabled for dynamic servers. If used, the server check is initialized
and the check task started in the "add server" CLI handler. The check is
explicitely disabled and must be manually activated via "enable health"
CLI handler.

The dynamic server refcount is incremented if a check is configured. On
"delete server" handler, the check is purged, which decrements the
refcount.
2021-08-06 11:09:48 +02:00
Amaury Denoyelle
9ecee0fa36 MINOR: check: enable safe keywords for dynamic servers
Implement a collection of keywords deemed safe and useful to dynamic
servers. The list of the supported keywords is :
- addr
- check-proto
- check-send-proxy
- check-via-socks4
- rise
- fall
- fastinter
- downinter
- port
- agent-addr
- agent-inter
- agent-port
- agent-send
2021-08-06 11:09:48 +02:00
Amaury Denoyelle
b33a0abc0b MEDIUM: check: implement check deletion for dynamic servers
Implement a mechanism to free a started check on runtime for dynamic
servers. A new function check_purge is created for this. The check task
will be marked for deletion and scheduled to properly close connection
elements and free the task/tasklet/buf_wait elements.

This function will be useful to delete a dynamic server wich checks.
2021-08-06 11:09:48 +02:00
Amaury Denoyelle
d6b7080cec MINOR: server: implement a refcount for dynamic servers
It is necessary to have a refcount mechanism on dynamic servers to be
able to enable check support. Indeed, when deleting a dynamic server
with check activated, the check will be asynchronously removed. This is
mandatory to properly free the check resources in a thread-safe manner.
The server instance must be kept alive for this.
2021-08-06 11:09:48 +02:00
Amaury Denoyelle
403dce8e5a MINOR: check: do not increment global maxsock at runtime
global maxsock is used to estimate a number of fd to reserve for
internal use, such as checks. It is incremented at startup with the info
from the config file.

Disable this incrementation in checks functions at runtime. First, it
currently serves no purpose to increment it after startup. Worse, it may
lead to out-of-bound accesse on the fdtab.

This will be useful to initiate checks for dynamic servers.
2021-08-06 11:08:24 +02:00
Amaury Denoyelle
3c2ab1a0d4 MINOR: check: export check init functions
Remove static qualifier on init_srv_check, init_srv_agent_check and
start_check_task. These functions will be called in server.c for dynamic
servers with checks.
2021-08-06 11:08:04 +02:00
Amaury Denoyelle
f2c27a5c67 MINOR: check: allocate default check ruleset for every backends
Allocate default tcp ruleset for every backend without explicit rules
defined, even if no server in the backend use check. This change is
required to implement checks for dynamic servers.

This allocation is done on check_config_validity. It must absolutely be
called before check_proxy_tcpcheck (called via post proxy check) which
allocate the implicit tcp connect rule.
2021-08-06 11:08:04 +02:00
Amaury Denoyelle
fca18172d9 MINOR: server: initialize fields for dynamic server check
Set default inter/rise/fall values for dynamic servers check/agent. This
is required because dynamic servers do not inherit from a
default-server.
2021-08-06 11:08:04 +02:00
Amaury Denoyelle
7b368339af MEDIUM: task: implement tasklet kill
Implement an equivalent of task_kill for tasklets. This function can be
used to request a tasklet deletion in a thread-safe way.

Currently this function is unused.
2021-08-06 11:07:48 +02:00
Amaury Denoyelle
c755efd5c6 MINOR: server: unmark deprecated on enable health/agent cli
Remove the "DEPRECATED" marker on "enable/disable health/agent"
commands. Their purpose is to toggle the check/agent on a server.

These commands are still useful because their purpose is not covered by
the "set server" command. Most there was confusion with the commands
'set server health/agent', which in fact serves another goal.

Note that the indication "use 'set server' instead" has been added since
2016 on the commit
  2c04eda8b5
  REORG: cli: move "{enable|disable} health" to server.c

and
  58d9cb7d22
  REORG: cli: move "{enable|disable} agent" to server.c

Besides, these commands will become required to enable check/agent on
dynamic servers which will be created with check disabled.

This should be backported up to 2.4.
2021-08-06 10:09:50 +02:00
Christopher Faulet
d7da3dd928 BUG/MEDIUM: spoe: Fix policy to close applets when SPOE connections are queued
It is the second part of the fix that should solve fairness issues with the
connections management inside the SPOE filter. Indeed, in multithreaded
mode, when the SPOE detects there are some connections in queue on a server,
it closes existing connections by releasing SPOE applets. It is mandatory
when a maxconn is set because few connections on a thread may prenvent new
connections establishment.

The first attempt to fix this bug (9e647e5af "BUG/MEDIUM: spoe: Kill applets
if there are pending connections and nbthread > 1") introduced a bug. In
pipelining mode, SPOE applets might be closed while some frames are pending
for the ACK reply. To fix the bug, in the processing stage, if there are
some connections in queue, only truly idle applets may process pending
requests. In this case, only one request at a time is processed. And at the
end of the processing stage, only truly idle applets may be released. It is
an empirical workaround, but it should be good enough to solve contention
issues when a low maxconn is set.

This patch should partely fix the issue #1340. It must be backported as far
as 2.0.
2021-08-05 10:07:43 +02:00
Christopher Faulet
6f1296b5c7 BUG/MEDIUM: spoe: Create a SPOE applet if necessary when the last one is released
On a thread, when the last SPOE applet is released, if there are still
pending streams, a new one is created. Of course, HAproxy must not be
stopping. It is important to start a new applet in this case to not abort
in-progress jobs, especially when a maxconn is set. Because applets may be
closed to be fair with connections waiting for a free slot.

This patch should partely fix the issue #1340. It depends on the commit
"MINOR: spoe: Create a SPOE applet if necessary when the last one on a
thread is closed". Both must be backported as far as 2.0.
2021-08-05 10:07:43 +02:00
Christopher Faulet
434b8525ee MINOR: spoe: Add a pointer on the filter config in the spoe_agent structure
There was no way to access the SPOE filter configuration from the agent
object. However it could be handy to have it. And in fact, this will be
required to fix a bug.
2021-08-05 10:07:43 +02:00
Willy Tarreau
d332f1396b BUG/MINOR: server: update last_change on maint->ready transitions too
Nenad noticed that when leaving maintenance, the servers' last_change
field was not updated. This is visible in the Status column of the stats
page in front of the state, as the cumuled time spent in the current state
is wrong, it starts from the last transition (typically ready->maint). In
addition, the backend's state was not updated either, because the down
transition is performed by set_backend_down() which also emits a log, and
it is this function which was extended to update the backend's last_change,
but it's not called for down->up transitions so that was not done.

The most visible (and unpleasant) effect of this bug is that it affects
slowstart so such a server could immediately restart with a significant
load ratio.

This should likely be backported to all stable releases.
2021-08-04 19:41:01 +02:00
Willy Tarreau
7b2ac29a92 CLEANUP: fd: remove the now unneeded fd_mig_lock
This is not needed anymore since we don't use it when setting the running
mask anymore.
2021-08-04 16:03:36 +02:00
Willy Tarreau
f69fea64e0 MAJOR: fd: get rid of the DWCAS when setting the running_mask
Right now we're using a DWCAS to atomically set the running_mask while
being constrained by the thread_mask. This DWCAS is annoying because we
may seriously need it later when adding support for thread groups, for
checking that the running_mask applies to the correct group.

It turns out that the DWCAS is not strictly necessary because we never
need it to set the thread_mask based on the running_mask, only the other
way around. And in fact, the running_mask is always cleared alone, and
the thread_mask is changed alone as well. The running_mask is only
relevant to indicate a takeover when the thread_mask matches it. Any
bit set in running and not present in thread_mask indicates a transition
in progress.

As such, it is possible to re-arrange this by using a regular CAS around a
consistency check between running_mask and thread_mask in fd_update_events
and by making a CAS on running_mask then an atomic store on the thread_mask
in fd_takeover(). The only other case is fd_delete() but that one already
sets the running_mask before clearing the thread_mask, which is compatible
with the consistency check above.

This change has happily survived 10 billion takeovers on a 16-thread
machine at 800k requests/s.

The fd-migration doc was updated to reflect this change.
2021-08-04 16:03:36 +02:00
Willy Tarreau
b1f29bc625 MINOR: activity/fd: remove the dead_fd counter
This one is set whenever an FD is reported by a poller with a null owner,
regardless of the thread_mask. It has become totally meaningless because
it only indicates a migrated FD that was not yet reassigned to a thread,
but as soon as a thread uses it, the status will change to skip_fd. Thus
there is no reason to distinguish between the two, it adds more confusion
than it helps. Let's simply drop it.
2021-08-04 16:03:36 +02:00
Amaury Denoyelle
bd8dd841e5 BUG/MINOR: server: remove srv from px list on CLI 'add server' error
If an error occured during the CLI 'add server' handler, the newly
created server must be removed from the proxy list if already inserted.
Currently, this can happen on the extremely rare error during server id
generation if there is no id left.

The removal operation is not thread-safe, it must be conducted before
releasing the thread isolation.

This can be backported up to 2.4. Please note that dynamic server track
is not implemented in 2.4, so the release_server_track invocation must
be removed for the backport to prevent a compilation error.
2021-08-04 14:57:06 +02:00
Willy Tarreau
ba3ab7907a MEDIUM: servers: make the server deletion code run under full thread isolation
In 2.4, runtime server deletion was brought by commit e558043e1 ("MINOR:
server: implement delete server cli command"). A comment remained in the
code about a theoretical race between the thread_isolate() call and another
thread being in the process of allocating memory before accessing the
server via a reference that was grabbed before the memory allocation,
since the thread_harmless_now()/thread_harmless_end() pair around mmap()
may have the effect of allowing cli_parse_delete_server() to proceed.

Now that the full thread isolation is available, let's update the code
to rely on this. Now it is guaranteed that competing threads will either
be in the poller or queued in front of thread_isolate_full().

This may be backported to 2.4 if any report of breakage suggests the bug
really exists, in which case the two following patches will also be
needed:
  MINOR: threads: make thread_release() not wait for other ones to complete
  MEDIUM: threads: add a stronger thread_isolate_full() call
2021-08-04 14:49:36 +02:00
Willy Tarreau
88d1c5d3fb MEDIUM: threads: add a stronger thread_isolate_full() call
The current principle of running under isolation was made to access
sensitive data while being certain that no other thread was using them
in parallel, without necessarily having to place locks everywhere. The
main use case are "show sess" and "show fd" which run over long chains
of pointers.

The thread_isolate() call relies on the "harmless" bit that indicates
for a given thread that it's not currently doing such sensitive things,
which is advertised using thread_harmless_now() and which ends usings
thread_harmless_end(), which also waits for possibly concurrent threads
to complete their work if they took this opportunity for starting
something tricky.

As some system calls were notoriously slow (e.g. mmap()), a bunch of
thread_harmless_now() / thread_harmless_end() were placed around them
to let waiting threads do their work while such other threads were not
able to modify memory contents.

But this is not sufficient for performing memory modifications. One such
example is the server deletion code. By modifying memory, it not only
requires that other threads are not playing with it, but are not either
in the process of touching it. The fact that a pool_alloc() or pool_free()
on some structure may call thread_harmless_now() and let another thread
start to release the same object's memory is not acceptable.

This patch introduces the concept of "idle threads". Threads entering
the polling loop are idle, as well as those that are waiting for all
others to become idle via the new function thread_isolate_full(). Once
thread_isolate_full() is granted, the thread is not idle anymore, and
it is released using thread_release() just like regular isolation. Its
users have to keep in mind that across this call nothing is granted as
another thread might have performed shared memory modifications. But
such users are extremely rare and are actually expecting this from their
peers as well.

Note that that in case of backport, this patch depends on previous patch:
  MINOR: threads: make thread_release() not wait for other ones to complete
2021-08-04 14:49:36 +02:00
Willy Tarreau
f519cfaa63 MINOR: threads: make thread_release() not wait for other ones to complete
The original intent of making thread_release() wait for other requesters to
proceed was more of a fairness trade, guaranteeing that a thread that was
granted an access to the CPU would be in turn giving back once its job is
done. But this is counter-productive as it forces such threads to spin
instead of going back to the poller, and it prevents us from implementing
multiple levels of guarantees, as a thread_release() call could spin
waiting for another requester to pass while that requester expects
stronger guarantees than the current thread may be able to offer.

Let's just remove that wait period and let the thread go back to the
poller, a-la "race to idle".

While in theory it could possibly slightly increase the perceived
latency of concurrent slow operations like "show fd" or "show sess",
it is not the case at all in tests, probably because the time needed
to reach the poller remains extremely low anyway.
2021-08-04 14:49:36 +02:00
Willy Tarreau
286363be08 CLEANUP: thread: fix fantaisist indentation of thread_harmless_till_end()
Probably due to a copy-paste, there were two indent levels in this function
since its introduction in 1.9 by commit 60b639ccb ("MEDIUM: hathreads:
implement a more flexible rendez-vous point"). Let's fix this.
2021-08-04 14:49:36 +02:00
Amaury Denoyelle
08be72b827 BUG/MINOR: server: fix race on error path of 'add server' CLI if track
If an error occurs during a dynamic server creation with tracking, it
must be removed from the tracked list. This operation is not thread-safe
and thus must be conducted under the thread isolation.

Track support for dynamic servers has been introduced in this release.
This does not need to be backported.
2021-08-04 09:18:12 +02:00
William Lallemand
85a16b2ba2 MINOR: stats: shows proxy in a stopped state
Previous patch b5c0d65 ("MINOR: proxy: disabled takes a stopping and a
disabled state") allows us to set 2 states for a stopped or a disabled
proxy. With this patch we are now able to show the stats of all proxies
when the process is in a stopping states, not only when there is some
activity on a proxy.

This patch should fix issue #1307.
2021-08-03 14:17:45 +02:00
William Lallemand
8e765b86fd MINOR: proxy: disabled takes a stopping and a disabled state
This patch splits the disabled state of a proxy into a PR_DISABLED and a
PR_STOPPED state.

The first one is set when the proxy is disabled in the configuration
file, and the second one is set upon a stop_proxy().
2021-08-03 14:17:45 +02:00
William Lallemand
56f1f75715 MINOR: log: rename 'dontloglegacyconnerr' to 'log-error-via-logformat'
Rename the 'dontloglegacyconnerr' option to 'log-error-via-logformat'
which is much more self-explanatory and readable.

Note: only legacy keywords don't use hyphens, it is recommended to
separate words with them in new keywords.
2021-08-02 10:42:42 +02:00
Willy Tarreau
55a0975b1e BUG/MINOR: freq_ctr: use stricter barriers between updates and readings
update_freq_ctr_period() was using relaxed atomics without using barriers,
which usually works fine on x86 but not everywhere else. In addition, some
values were read without being enclosed by barriers, allowing the compiler
to possibly prefetch them a bit earlier. Finally, freq_ctr_total() was also
reading these without enough barriers. Let's make explicit use of atomic
loads and atomic stores to get rid of this situation. This required to
slightly rearrange the freq_ctr_total() loop, which could possibly slightly
improve performance under extreme contention by avoiding to reread all
fields.

A backport may be done to 2.4 if a problem is encountered, but last tests
on arm64 with LSE didn't show any issue so this can possibly stay as-is.
2021-08-01 17:34:06 +02:00
Willy Tarreau
200bd50b73 MEDIUM: fd: rely more on fd_update_events() to detect changes
This function already performs a number of checks prior to calling the
IOCB, and detects the change of thread (FD migration). Half of the
controls are still in each poller, and these pollers also maintain
activity counters for various cases.

Note that the unreliable test on thread_mask was removed so that only
the one performed by fd_set_running() is now used, since this one is
reliable.

Let's centralize all that fd-specific logic into the function and make
it return a status among:

  FD_UPDT_DONE,        // update done, nothing else to be done
  FD_UPDT_DEAD,        // FD was already dead, ignore it
  FD_UPDT_CLOSED,      // FD was closed
  FD_UPDT_MIGRATED,    // FD was migrated, ignore it now

Some pollers already used to call it last and have nothing to do after
it, regardless of the result. epoll has to delete the FD in case a
migration is detected. Overall this removes more code than it adds.
2021-07-30 17:45:18 +02:00
Willy Tarreau
84c7922c52 REORG: fd: uninline fd_update_events()
This function has become a monster (80 lines and 2/3 of a kB), it doesn't
benefit from being static nor inline anymore, let's move it to fd.c.
2021-07-30 17:41:55 +02:00
Willy Tarreau
53a16187fd MINOR: poll/epoll: move detection of RDHUP support earlier
Let's move the detection of support for RDHUP earlier and out of the
FD update chain, as it complicates its simplification.
2021-07-30 17:41:55 +02:00
Willy Tarreau
79e90b9615 BUG/MINOR: pollers: always program an update for migrated FDs
If an MT-aware poller reports that a file descriptor was migrated, it
must stop reporting it. The simplest way to do this is to program an
update if not done yet. This will automatically mark the FD for update
on next round. Otherwise there's a risk that some events are reported
a bit too often and cause extra CPU usage with these pollers. Note
that epoll is currently OK regarding this. Select does not need this
because it uses a single shared events table, so in case of migration
no FD change is expected.

This should be backported as far as 2.2.
2021-07-30 14:21:43 +02:00
Willy Tarreau
177119bb11 BUG/MINOR: poll: fix abnormally high skip_fd counter
The skip_fd counter that is incremented when a migrated FD is reported
was abnormally high in with poll. The reason is that it was accounted
for before preparing the polled events instead of being measured from
the reported events.

This mistake was done when the counters were introduced in 1.9 with
commit d80cb4ee1 ("MINOR: global: add some global activity counters to
help debugging"). It may be backported as far as 2.0.
2021-07-30 14:04:28 +02:00
Willy Tarreau
fcc5281513 BUG/MINOR: select: fix excess number of dead/skip reported
In 1.8, commit ab62f5195 ("MINOR: polling: Use fd_update_events to update
events seen for a fd") updated the pollers to rely on fd_update_events(),
but the modification delayed the test of presence of the FD in the report,
resulting in owner/thread_mask and possibly event updates being performed
for each FD appearing in a block of 32 FDs around an active one. This
caused the request rate to be ~3 times lower with select() than poll()
under 6 threads.

This can be backported as far as 1.8.
2021-07-30 13:55:36 +02:00
Willy Tarreau
c37ccd70b4 BUG/MEDIUM: pollers: clear the sleeping bit after waking up, not before
A bug was introduced in 2.1-dev2 by commit 305d5ab46 ("MAJOR: fd: Get
rid of the fd cache."). Pollers "poll" and "evport" had the sleeping
bit accidentally removed before the syscall instead of after. This
results in them not being woken up by inter-thread wakeups, which is
particularly visible with the multi-queue accept() and with queues.

As a work-around, when these pollers are used, "nbthread 1" should
be used.

The fact that it has remained broken for 2 years is a great indication
that threads are definitely not enabled outside of epoll and kqueue,
hence why this patch is only tagged medium.

This must be backported as far as 2.2.
2021-07-30 10:57:09 +02:00
Remi Tricot-Le Breton
4a6328f066 MEDIUM: connection: Add option to disable legacy error log
In case of connection failure, a dedicated error message is output,
following the format described in section "Error log format" of the
documentation. These messages cannot be configured through a log-format
option.
This patch adds a new option, "dontloglegacyconnerr", that disables
those error logs when set, and "replaces" them by a regular log line
that follows the configured log-format (thanks to a call to sess_log in
session_kill_embryonic).
The new fc_conn_err sample fetch allows to add the legacy error log
information into a regular log format.
This new option is unset by default so the logging logic will remain the
same until this new option is used.
2021-07-29 15:40:45 +02:00
Remi Tricot-Le Breton
98b930d043 MINOR: ssl: Define a default https log format
This patch adds a new httpslog option and a new HTTP over SSL log-format
that expands the default HTTP format and adds SSL specific information.
2021-07-29 15:40:45 +02:00
Remi Tricot-Le Breton
7c6898ee49 MINOR: ssl: Add new ssl_fc_hsk_err sample fetch
This new sample fetch along the ssl_fc_hsk_err_str fetch contain the
last SSL error of the error stack that occurred during the SSL
handshake (from the frontend's perspective). The errors happening during
the client's certificate verification will still be given by the
ssl_c_err and ssl_c_ca_err fetches. This new fetch will only hold errors
retrieved by the OpenSSL ERR_get_error function.
2021-07-29 15:40:45 +02:00
Remi Tricot-Le Breton
89b65cfd52 MINOR: ssl: Enable error fetches in case of handshake error
The ssl_c_err, ssl_c_ca_err and ssl_c_ca_err_depth sample fetches values
were not recoverable when the connection failed because of the test
"conn->flags & CO_FL_WAIT_XPRT" (which required the connection to be
established). They could then not be used in a log-format since whenever
they would have sent a non-null value, the value fetching was disabled.
This patch ensures that all these values can be fetched in case of
connection failure.
2021-07-29 15:40:45 +02:00
Remi Tricot-Le Breton
3d2093af9b MINOR: connection: Add a connection error code sample fetch
The fc_conn_err and fc_conn_err_str sample fetches give information
about the problem that made the connection fail. This information would
previously only have been given by the error log messages meaning that
thanks to these fetches, the error log can now be included in a custom
log format. The log strings were all found in the conn_err_code_str
function.
2021-07-29 15:40:45 +02:00
William Lallemand
df9caeb9ae CLEANUP: mworker: PR_CAP already initialized with alloc_new_proxy()
Remove the PR_CAP initialization in mworker_cli_proxy_create() which is
already done in alloc_new_proxy().
2021-07-29 15:35:48 +02:00
William Lallemand
ae787bad80 CLEANUP: mworker: use the proxy helper functions in mworker_cli_proxy_create()
Cleanup the mworker_cli_proxy_create() function by removing the
allocation and init of the proxy which is done manually, and replace it
by alloc_new_proxy(). Do the same with the free_proxy() function.

This patch also move the insertion at the end of the function.
2021-07-29 15:13:22 +02:00
William Lallemand
e7f74623e4 MINOR: stats: don't output internal proxies (PR_CAP_INT)
Disable the output of the statistics of internal proxies (PR_CAP_INT),
wo we don't rely only on the px->uuid > 0. This will allow to hide more
cleanly the internal proxies in the stats.
2021-07-28 17:45:18 +02:00
William Lallemand
d11c5728b4 MINOR: mworker: the mworker CLI proxy is internal
Sets the mworker CLI proxy as a internal one (PR_CAP_INT) so we could
exlude it from stats and other tests.
2021-07-28 17:40:56 +02:00
William Lallemand
6bb77b9c64 MINOR: proxy: rename PR_CAP_LUA to PR_CAP_INT
This patch renames the proxy capability "LUA" to "INT" so it could be
used for any internal proxy.

Every proxy that are not user defined should use this flag.
2021-07-28 15:51:42 +02:00
Christopher Faulet
b5f7b52968 BUG/MEDIUM: mux-h2: Handle remaining read0 cases on partial frames
This part was fixed several times since commit aade4edc1 ("BUG/MEDIUM:
mux-h2: Don't handle pending read0 too early on streams") and there are
still some cases where a read0 event may be ignored because a partial frame
inhibits the event.

Here, we must take care to set H2_CF_END_REACHED flag if a read0 was
received while a partial frame header is received or if the padding length
is missing.

To ease partial frame detection, H2_CF_DEM_SHORT_READ flag is introduced. It
is systematically removed when some data are received and is set when a
partial frame is found or when dbuf buffer is empty. At the end of the
demux, if the connection must be closed ASAP or if data are missing to move
forward, we may acknowledge the pending read0 event, if any. For now,
H2_CF_DEM_SHORT_READ is not part of H2_CF_DEM_BLOCK_ANY mask.

This patch should fix the issue #1328. It must be backported as far as 2.0.
2021-07-27 09:26:02 +02:00
Christopher Faulet
cf30756f0c BUG/MINOR: mux-h1: Be sure to swap H1C to splice mode when rcv_pipe() is called
The splicing does not work anymore because the H1 connection is not swap to
splice mode when rcv_pipe() callback function is called. It is important to
set H1C_F_WANT_SPLICE flag to inhibit data receipt via the buffer
API. Otherwise, because there are always data in the buffer, it is not
possible to use the kernel splicing.

This bug was introduced by the commit 2b861bf72 ("MINOR: mux-h1: clean up
conditions to enabled and disabled splicing").

The patch must be backported to 2.4.
2021-07-26 15:14:35 +02:00
Christopher Faulet
3f35da296e BUG/MINOR: mux-h2: Obey dontlognull option during the preface
If a connection is closed during the preface while no data are received, if
the dontlognull option is set, no log message must be emitted. However, this
will still be handled as a protocol error. Only the log is omitted.

This patch should fix the issue #1336 for H2 sessions. It must be backported
to 2.4 and 2.3 at least, and probably as far as 2.0.
2021-07-26 15:14:35 +02:00
Christopher Faulet
07e10deb36 BUG/MINOR: mux-h1: Obey dontlognull option for empty requests
If a H1 connection is closed while no data are received, if the dontlognull
option is set, no log message must be emitted. Because the H1 multiplexer
handles early errors, it must take care to obey this option. It is true for
400-Bad-Request, 408-Request-Time-out and 501-Not-Implemented
responses. 500-Internal-Server-Error responses are still logged.

This patch should fix the issue #1336 for H1 sessions. It must be backported
to 2.4.
2021-07-26 15:14:35 +02:00
Amaury Denoyelle
2bf5d41ada MINOR: ssl: use __objt_* variant when retrieving counters
Use non-checked function to retrieve listener/server via obj_type. This
is done as a previous obj_type function ensure that the type is well
known and the instance is not NULL.

Incidentally, this should prevent the coverity report from the #1335
github issue which warns about a possible NULL dereference.
2021-07-26 09:59:06 +02:00
Christopher Faulet
1f923391d1 BUG/MINOR: resolvers: Use a null-terminated string to lookup in servers tree
When we evaluate a DNS response item, it may be necessary to look for a
server with a hostname matching the item target into the named servers
tree. To do so, the item target is transformed to a lowercase string. It
must be a null-terminated string. Thus we must explicitly set the trailing
'\0' character.

For a specific resolution, the named servers tree contains all servers using
this resolution with a hostname loaded from a state file. Because of this
bug, same entry may be duplicated because we are unable to find the right
server, assigning this way the item to a free server slot.

This patch should fix the issue #1333. It must be backported as far as 2.2.
2021-07-22 15:03:25 +02:00
Willy Tarreau
b3c4a8f59d BUILD: threads: fix pthread_mutex_unlock when !USE_THREAD
Commit 048368ef6 ("MINOR: deinit: always deinit the init_mutex on
failed initialization") added the missing unlock but forgot to
condition it on USE_THREAD, resulting in a build failure. No
backport is needed.

This addresses oss-fuzz issue 36426.
2021-07-22 14:43:21 +02:00
Willy Tarreau
acff309753 BUG/MINOR: check: fix the condition to validate a port-less server
A config like the below fails to validate because of a bogus test:

  backend b1
      tcp-check connect port 1234
      option tcp-check
      server s1 1.2.3.4 check

[ALERT] (18887) : config : config: proxy 'b1': server 's1' has neither
                  service port nor check port, and a tcp_check rule
                  'connect' with no port information.

A || instead of a && only validates the connect rule when both the
address *and* the port are set. A work around is to set the rule like
this:

      tcp-check connect addr 0:1234 port 1234

This needs to be backported as far as 2.2 (2.0 is OK).
2021-07-22 11:21:33 +02:00
Christopher Faulet
59bab61649 BUG/MINOR: stats: Add missing agent stats on servers
Agent stats were lost during the stats refactoring performed in the 2.4 to
simplify the Prometheus exporter. stats_fill_sv_stats() function must fill
ST_F_AGENT_* and ST_F_LAST_AGT stats.

This patch should fix the issue #1331. It must be backported to 2.4.
2021-07-22 08:47:55 +02:00
Amaury Denoyelle
5fcd428c35 BUG/MEDIUM: ssl_sample: fix segfault for srv samples on invalid request
Some ssl samples cause a segfault when the stream is not instantiated,
for example during an invalid HTTP request. A new check is added to
prevent the stream dereferencing if NULL.

This is the list of the affected samples :
- ssl_s_chain_der
- ssl_s_der
- ssl_s_i_dn
- ssl_s_key_alg
- ssl_s_notafter
- ssl_s_notbefore
- ssl_s_s_dn
- ssl_s_serial
- ssl_s_sha1
- ssl_s_sig_alg
- ssl_s_version

This bug can be reproduced easily by using one of these samples in a
log-format string. Emit an invalid HTTP request with an HTTP client to
trigger the crash.

This bug has been reported in redmine issue 3913.

This must be backported up to 2.2.
2021-07-21 14:23:06 +02:00
Willy Tarreau
3c032f2d4d BUG/MINOR: mworker: do not export HAPROXY_MWORKER_REEXEC across programs
This undocumented variable is only for internal use, and its sole
presence affects the process' behavior, as shown in bug #1324. It must
not be exported to workers, external checks, nor programs. Let's unset
it before forking programs and workers.

This should be backported as far as 1.8. The worker code might differ
a bit before 2.5 due to the recent removal of multi-process support.
2021-07-21 10:17:02 +02:00
Willy Tarreau
26146194d3 BUG/MEDIUM: mworker: do not register an exit handler if exit is expected
The master-worker code registers an exit handler to deal with configuration
issues during reload, leading to a restart of the master process in wait
mode. But it shouldn't do that when it's expected that the program stops
during config parsing or condition checks, as the reload operation is
unexpectedly called and results in abnormal behavior and even crashes:

  $ HAPROXY_MWORKER_REEXEC=1  ./haproxy -W -c -f /dev/null
  Configuration file is valid
  [NOTICE]   (18418) : haproxy version is 2.5-dev2-ee2420-6
  [NOTICE]   (18418) : path to executable is ./haproxy
  [WARNING]  (18418) : config : Reexecuting Master process in waitpid mode
  Segmentation fault

  $ HAPROXY_MWORKER_REEXEC=1 ./haproxy -W -cc 1
  [NOTICE]   (18412) : haproxy version is 2.5-dev2-ee2420-6
  [NOTICE]   (18412) : path to executable is ./haproxy
  [WARNING]  (18412) : config : Reexecuting Master process in waitpid mode
  [WARNING]  (18412) : config : Reexecuting Master process

Note that the presence of this variable happens by accident when haproxy
is called from within its own programs (see issue #1324), but this should
be the object of a separate fix.

This patch fixes this by preventing the atexit registration in such
situations. This should be backported as far as 1.8. MODE_CHECK_CONDITION
has to be dropped for versions prior to 2.5.
2021-07-21 10:01:36 +02:00
Willy Tarreau
dc70c18ddc BUG/MEDIUM: cfgcond: limit recursion level in the condition expression parser
Oss-fuzz reports in issue 36328 that we can recurse too far by passing
extremely deep expressions to the ".if" parser. I thought we were still
limited to the 1024 chars per line, that would be highly sufficient, but
we don't have any limit now :-/

Let's just pass a maximum recursion counter to the recursive parsers.
It's decremented for each call and the expression fails if it reaches
zero. On the most complex paths it can add 3 levels per parenthesis,
so with a limit of 1024, that's roughly 343 nested sub-expressions that
are supported in the worst case. That's more than sufficient, for just
a few kB of RAM.

No backport is needed.
2021-07-20 18:03:08 +02:00
jenny-cheung
048368ef6f MINOR: deinit: always deinit the init_mutex on failed initialization
The init_mutex was not unlocked in case an error is encountered during
a thread initialization, and the polling loop was aborted during startup.
In practise it does not have any observable effect since an explicit
exit() is placed there, but it could confuse some debugging tools or
some static analysers, so let's release it as expected.

This addresses issue #1326.
2021-07-20 16:38:23 +02:00
Christopher Faulet
b73f653d00 CLEANUP: http_ana: Remove now unused label from http_process_request()
Since last change on HTTP analysers (252412316 "MEDIUM: proxy: remove
long-broken 'option http_proxy'"), http_process_request() may only return
internal errors on failures. Thus the label used to handle bad requests may
be removed.

This patch should fix the issue #1330.
2021-07-19 10:32:17 +02:00
Willy Tarreau
252412316e MEDIUM: proxy: remove long-broken 'option http_proxy'
This option had always been broken in HTX, which means that the first
breakage appeared in 1.9, that it was broken by default in 2.0 and that
no workaround existed starting with 2.1. The way this option works is
praticularly unfit to the rest of the configuration and to the internal
architecture. It had some uses when it was introduced 14 years ago but
nowadays it's possible to do much better and more reliable using a
set of "http-request set-dst" and "http-request set-uri" rules, which
additionally are compatible with DNS resolution (via do-resolve) and
are not exclusive to normal load balancing. The "option-http_proxy"
example config file was updated to reflect this.

The option is still parsed so that an error message gives hints about
what to look for.
2021-07-18 19:35:32 +02:00
Willy Tarreau
f1db20c473 BUG/MINOR: cfgcond: revisit the condition freeing mechanism to avoid a leak
The cfg_free_cond_{term,and,expr}() functions used to take a pointer to
the pointer to be freed in order to replace it with a NULL once done.
But this doesn't cope well with freeing lists as it would require
recursion which the current code tried to avoid.

Let's just change the API to free the area and let the caller set the NULL.

This leak was reported by oss-fuzz (issue 36265).
2021-07-17 18:46:30 +02:00
Willy Tarreau
69a23ae091 BUG/MINOR: arg: free all args on make_arg_list()'s error path
While we do free the array containing the arguments, we do not free
allocated ones. Most of them are unresolved, but strings are allocated
and have to be freed as well. Note that for the sake of not breaking
the args resolution list that might have been set, we still refrain
from doing this if a resolution was already programmed, but for most
common cases (including the ones that can be found in config conditions
and at run time) we're safe.

This may be backported to stable branches, but it relies on the new
free_args() function that was introduced by commit ab213a5b6 ("MINOR:
arg: add a free_args() function to free an args array"), and which is
likely safe to backport as well.

This leak was reported by oss-fuzz (issue 36265).
2021-07-17 18:36:43 +02:00
Willy Tarreau
79c9bdf63d BUG/MEDIUM: init: restore behavior of command-line "-m" for memory limitation
The removal for the shared inter-process cache in commit 6fd0450b4
("CLEANUP: shctx: remove the different inter-process locking techniques")
accidentally removed the enforcement of rlimit_memmax_all which
corresponds to what is passed to the command-line "-m" argument.
Let's restore it.

Thanks to @nafets227 for spotting this. This fixes github issue #1319.
2021-07-17 12:31:08 +02:00
Willy Tarreau
316ea7ede5 MINOR: cfgcond: support terms made of parenthesis around expressions
Now it's possible to form a term using parenthesis around an expression.
This will soon allow to build more complex expressions. For now they're
still pretty limited but parenthesis do work.
2021-07-16 19:18:41 +02:00
Willy Tarreau
ca81887599 MINOR: cfgcond: insert an expression between the condition and the term
Now evaluating a condition will rely on an expression (or an empty string),
and this expression will support ORing a sub-expression with another
optional expression. The sub-expressions ANDs a term with another optional
sub-expression. With this alone precedence between && and || is respected,
and the following expression:

     A && B && C || D || E && F || G

will naturally evaluate as:

     (A && B && C) || D || (E && F) || G
2021-07-16 19:18:41 +02:00
Willy Tarreau
087b2d018f MINOR: cfgcond: make the conditional term parser automatically allocate nodes
It's not convenient to let the caller be responsible for node allocation,
better have the leaf function do that and implement the accompanying free
call. Now only a pointer is needed instead of a struct, and the leaf
function makes sure to leave the situation in a consistent way.
2021-07-16 19:18:41 +02:00
Willy Tarreau
ca56d3d28b MINOR: cfgcond: support negating conditional expressions
Now preceeding a config condition term with "!" will simply negate it.
Example:

   .if !feature(OPENSSL)
       .alert "SSL support is mandatory"
   .endif
2021-07-16 19:18:41 +02:00
Willy Tarreau
c8194c30df MINOR: cfgcond: remerge all arguments into a single line
Till now we were dealing with single-word expressions but in order to
extend the configuration condition language a bit more, we'll need to
support slightly more complex expressions involving operators, and we
must absolutely support spaces around them to keep them readable.

As all arguments are pointers to the same line with spaces replaced by
zeroes, we can trivially rebuild the whole line before calling the
condition evaluator, and remove the test for extraneous argument. This
is what this patch does.
2021-07-16 19:18:41 +02:00
Willy Tarreau
379ceeaaeb MEDIUM: cfgcond: report invalid trailing chars after expressions
Random characters placed after a configuration predicate currently do
not report an error. This is a problem because extra parenthesis,
commas or even other random left-over chars may accidently appear there.
Let's now report an error when this happens.

This is marked MEDIUM because it may break otherwise working configs
which are faulty.
2021-07-16 19:18:41 +02:00
Willy Tarreau
f869095df9 MINOR: cfgcond: start to split the condition parser to introduce terms
The purpose is to build a descendent parser that will split conditions
into expressions made of terms. There are two phases, a parsing phase
and an evaluation phase. Strictly speaking it's not required to cut
that in two right now, but it's likely that in the future we won't want
certain predicates to be evaluated during the parsing (e.g. file system
checks or execution of some external commands).

The cfg_eval_condition() function is now much simpler, it just tries to
parse a single term, and if OK evaluates it, then returns the result.
Errors are unchanged and may still be reported during parsing or
evaluation.

It's worth noting that some invalid expressions such as streq(a,b)zzz
continue to parse correctly for now (what remains after the parenthesis
is simply ignored as not necessary).
2021-07-16 19:18:41 +02:00
Willy Tarreau
66243b4273 REORG: config: move the condition preprocessing code to its own file
The .if/.else/.endif and condition evaluation code is quite dirty and
was dumped into cfgparse.c because it was easy. But it should be tidied
quite a bit as it will need to evolve.

Let's move all that to cfgcond.{c,h}.
2021-07-16 19:18:41 +02:00
Willy Tarreau
ee0d727989 CLEANUP: hlua: use free_args() to release args arrays
Argument arrays used in hlua_lua2arg_check() as well as in the functions
used to call sample fetches and converters were manually released, let's
use the cleaner and more reliable free_args() instead. The prototype of
hlua_lua2arg_check() was amended to mention that the function relies on
the final ARGT_STOP, which is already the case, and the pointless test
for this was removed.
2021-07-16 19:18:41 +02:00
Willy Tarreau
c15221b80c CLEANUP: config: use free_args() to release args array in cfg_eval_condition()
Doing so is cleaner than open-coding it and will support future extensions.
2021-07-16 19:18:41 +02:00
Willy Tarreau
ab213a5b6f MINOR: arg: add a free_args() function to free an args array
make_arg_list() can create an array of arguments, some of which remain
to be resolved, but all users had to deal with their own roll back on
error. Let's add a free_args() function to release all the array's
elements and let the caller deal with the array itself (sometimes it's
allocated in the stack).
2021-07-16 19:18:41 +02:00
Willy Tarreau
a87e782a2d MINOR: init: make -cc support environment variables expansion
I found myself a few times testing some conditoin examples from the doc
against command line's "-cc" to see that they didn't work with environment
variables expansion. Not being documented as being on purpose it looks like
a miss, so let's add PARSE_OPT_ENV and PARSE_OPT_WORD_EXPAND to be able to
test for example -cc "streq(${WITH_SSL},yes)" to help debug expressions.
2021-07-16 19:18:41 +02:00
Willy Tarreau
7edc0fde05 MINOR: init: verify that there is a single word on "-cc"
This adds the exact same restriction as commit 5546c8bdc ("MINOR:
cfgparse: Fail when encountering extra arguments in macro") but for
the "-cc" command line argument, for the sake of consistency.
2021-07-16 19:18:41 +02:00
Amaury Denoyelle
56eb8ed37d MEDIUM: server: support track keyword for dynamic servers
Allow the usage of the 'track' keyword for dynamic servers. On server
deletion, the server is properly removed from the tracking chain to
prevents NULL pointer dereferencing.
2021-07-16 10:22:58 +02:00
Amaury Denoyelle
79f68be207 MINOR: srv: do not allow to track a dynamic server
Prevents the use of the "track" keyword for a dynamic server. This
simplifies the deletion of a dynamic server, without having to worry
about servers which might tracked it.

A BUG_ON is present in the dynamic server delete function to validate
this assertion.
2021-07-16 10:08:55 +02:00
Amaury Denoyelle
669b620e5f MINOR: srv: extract tracking server config function
Extract the post-config tracking setup in a dedicated function
srv_apply_track. This will be useful to implement track support for
dynamic servers.
2021-07-16 10:08:55 +02:00
Willy Tarreau
6a51090780 BUILD: lua: silence a build warning with TCC
TCC doesn't have the equivalent of __builtin_unreachable() and complains
that hlua_panic_ljmp() may return no value. Let's add a return 0 there.
All compilers that know that longjmp() doesn't return will see no change
and tcc will be happy.
2021-07-14 19:41:25 +02:00
Willy Tarreau
1335da38f4 BUILD: add detection of missing important CFLAGS
Modern compilers love to break existing code, and some options detected
at build time (such as -fwrapv) are absolutely critical otherwise some
bad code can be generated.

Given that some users rely on packages that force CFLAGS without being
aware of this and can be hit by runtime bugs, we have to help packagers
figure that they need to be careful about their build options.

The test here consists in detecting correct wrapping of signed integers.
Some of the old code relies on it, and modern compilers recently decided
to break it. It's normally addressed using -fwrapv which users will
rarely enforce in their own flags. Thus it is a good indicator of missing
critical CFLAGS, and it happens to be very easy to detect at run time.
Note that the test uses argc in order to have a variable. While gcc
ignores wrapping even for constants, clang only ignores it for variables.
The way the code is constructed doesn't result in code being emitted for
optimized builds thanks to value range propagation.

This should address GitHub issue #1315, and should be backported to all
stable versions. It may result in instantly breaking binaries that seemed
to work fine (typically the ones suddenly showing a busy loop after a few
weeks of uptime), and require packagers to fix their flags. The vast
majority of distro packages are fine and will not be affected though.
2021-07-14 18:50:27 +02:00
Remi Tricot-Le Breton
0498fa4059 BUG/MINOR: ssl: Default-server configuration ignored by server
When a default-server line specified a client certificate to use, the
frontend would not take it into account and create an empty SSL context,
which would raise an error on the backend side ("peer did not return a
certificate").

This bug was introduced by d817dc733e in
which the SSL contexts are created earlier than before (during the
default-server line parsing) without setting it in the corresponding
server structures. It then made the server create an empty SSL context
in ssl_sock_prepare_srv_ctx because it thought it needed one.

It was raised on redmine, in Bug #3906.

It can be backported to 2.4.
2021-07-13 18:35:38 +02:00
Willy Tarreau
4c6986a6bc CLEANUP: applet: remove unused thread_mask
Since 1.9 with commit 673867c35 ("MAJOR: applets: Use tasks, instead
of rolling our own scheduler.") the thread_mask field of the appctx
became unused, but the code hadn't been cleaned for this. The appctx
has its own task and the task's thread_mask is the one to be displayed.

It's worth noting that all calls to appctx_new() pass tid_bit as the
thread_mask. This makes sense, and it could be convenient to decide
that this becomes the norm and to simplify the API.
2021-07-13 18:20:34 +02:00
Amaury Denoyelle
befeae88e8 MINOR: mux_h2: define config to disable h2 websocket support
Define a new global config statement named
"h2-workaround-bogus-websocket-clients".

This statement will disable the automatic announce of h2 websocket
support as specified in the RFC8441. This can be use to overcome clients
which fail to implement the relatively fresh RFC8441. Clients will in
his case automatically downgrade to http/1.1 for the websocket tunnel
if the haproxy configuration allows it.

This feature is relatively simple and can be backported up to 2.4, which
saw the introduction of h2 websocket support.
2021-07-12 10:41:45 +02:00
Amaury Denoyelle
b60fb8d5be BUG/MEDIUM: http_ana: fix crash for http_proxy mode during uri rewrite
Fix the wrong usage of http_uri_parser which is defined with an
uninitialized uri. This causes a crash which happens when forwarding a
request to a backend configured in plain proxy ('option http_proxy').

This has been reported through a clang warning on the CI.

This bug has been introduced by the refactoring of URI parser API.
  c453f9547e
  MINOR: http: use http uri parser for path
This does not need to be backported.

WARNING: although this patch fix the crash, the 'option http_proxy'
seems to be non buggy, possibly since quite a few stable versions.
Indeed, the URI rewriting is not functional : the path is written on the
beginning of the URI but the rest of the URI is not and this garbage is
passed to the server which does not understand the request.
2021-07-08 18:09:52 +02:00
Amaury Denoyelle
c453f9547e MINOR: http: use http uri parser for path
Replace http_get_path by the http_uri_parser API. The new functions is
renamed http_parse_path. Replace duplicated code for scheme and
authority parsing by invocations to http_parse_scheme/authority.

If no scheme is found for an URI detected as an absolute-uri/authority,
consider it to be an authority format : no path will be found. For an
absolute-uri or absolute-path, use the remaining of the string as the
path. A new http_uri_parser state is declared to mark the path parsing
as done.
2021-07-08 17:11:17 +02:00
Amaury Denoyelle
5a9bd375fd REORG: http_ana: split conditions for monitor-uri in wait for request
Split in two the condition which check if the monitor-uri is set for the
current request. This will allow to easily use the http_uri_parser type
for http_get_path.
2021-07-08 17:11:17 +02:00
Amaury Denoyelle
69294b20ac MINOR: http: use http uri parser for authority
Replace http_get_authority by the http_uri_parser API.

The new function is renamed http_parse_authority. Replace duplicated
scheme parsing code by http_parse_scheme invocation. A new
http_uri_parser state is declared to mark the authority parsing as done.
2021-07-08 17:11:17 +02:00
Amaury Denoyelle
8ac8cbfd72 MINOR: http: use http uri parser for scheme
Replace http_get_scheme by the http_uri_parser API. The new function is
renamed http_parse_scheme. A new http_uri_parser state is declared to
mark the scheme parsing as completed.
2021-07-08 17:11:17 +02:00
Amaury Denoyelle
164ae4ad55 BUILD: http_htx: fix ci compilation error with isdigit for Windows
The warning is encountered on platforms for which char type is signed by
default.

cf the following links
https://stackoverflow.com/questions/10186219/array-subscript-has-type-char

This must be backported up to 2.4.
2021-07-07 17:23:57 +02:00
Amaury Denoyelle
4ca0f363a1 MEDIUM: h2: apply scheme-based normalization on h2 requests
Apply the rfc 3986 scheme-based normalization on h2 requests. This
process will be executed for most of requests because scheme and
authority are present on every h2 requests, except CONNECT. However, the
normalization will only be applied on requests with defaults http port
(http/80 or https/443) explicitly specified which most http clients
avoid.

This change is notably useful for http2 websockets with Firefox which
explicitly specify the 443 default port on Extended CONNECT. In this
case, users can be trapped if they are using host routing without
removing the port. With the scheme-based normalization, the default port
will be removed.

To backport this change, it is required to backport first the following
commits:
* MINOR: http: implement http_get_scheme
* MEDIUM: http: implement scheme-based normalization
2021-07-07 15:34:01 +02:00
Amaury Denoyelle
852d78c232 MEDIUM: h1-htx: apply scheme-based normalization on h1 requests
Apply the rfc 3986 scheme-based normalization on h1 requests. It is
executed only for requests which uses absolute-form target URI, which is
not the standard case.
2021-07-07 15:34:01 +02:00
Amaury Denoyelle
4c0882b1b4 MEDIUM: http: implement scheme-based normalization
Implement the scheme-based uri normalization as described in rfc3986
6.3.2. Its purpose is to remove the port of an uri if the default one is
used according to the uri scheme : 80/http and 443/https. All other
ports are not touched.

This method uses an htx message as an input. It requires that the target
URI is in absolute-form with a http/https scheme. This represents most
of h2 requests except CONNECT. On the contrary, most of h1 requests
won't be elligible as origin-form is the standard case.

The normalization is first applied on the target URL of the start line.
Then, it is conducted on every Host headers present, assuming that they
are equivalent to the target URL.

This change will be notably useful to not confuse users who are
accustomed to use the host for routing without specifying default ports.
This problem was recently encountered with Firefox which specify the 443
default port for http2 websocket Extended CONNECT.
2021-07-07 15:34:01 +02:00
Amaury Denoyelle
ef08811240 MINOR: http: implement http_get_scheme
This method can be used to retrieve the scheme part of an uri, with the
suffix '://'. It will be useful to implement scheme-based normalization.
2021-07-07 15:34:01 +02:00
Willy Tarreau
5b654ad42c BUILD: stick-table: shut up invalid "uninitialized" warning in gcc 8.3
gcc 8.3.0 spews a bunch of:

  src/stick_table.c: In function 'action_inc_gpc0':
  include/haproxy/freq_ctr.h:66:12: warning: 'period' may be used uninitialized in this function [-Wmaybe-uninitialized]
    curr_tick += period;
            ^~
  src/stick_table.c:2241:15: note: 'period' was declared here
    unsigned int period;
               ^~~~~~
but they're incorrect because all accesses are guarded by the exact same
condition (ptr1 not being null), it's just the compiler being overzealous
about the uninitialized detection that seems to be stronger than its
ability to follow its own optimizations. This code path is not critical,
let's just pre-initialize the period to zero.

No backport is needed.
2021-07-06 18:54:07 +02:00
Marno Krahmer
07954fb069 MEDIUM: stats: include disabled proxies that hold active sessions to stats
After reloading HAProxy, the old process may still hold active sessions.
Currently there is no way to gather information, how many sessions such
a process still holds. This patch will not exclude disabled proxies from
stats output when they hold at least one active session. This will allow
sending `!@<PID> show stat` through a master socket to the disabled
process and have it returning its stats data.
2021-07-06 11:54:08 +02:00
Christopher Faulet
23048875a4 Revert "MINOR: tcp-act: Add set-src/set-src-port for "tcp-request content" rules"
This reverts commit 19bbbe0562.

For now, set-src/set-src-port actions are directly performed on the client
connection. Using these actions at the stream level is really a problem with
HTTP connection (See #90) because all requests are affected by this change
and not only the current request. And it is worse with the H2, because
several requests can set their source address into the same connection at
the same time.

It is already an issue when these actions are called from "http-request"
rules. It is safer to wait a bit before adding the support to "tcp-request
content" rules. The solution is to be able to set src/dst address on the
stream and not on the connection when the action if performed from the L7
level..

Reverting the above commit means the issue #1303 is no longer fixed.

This patch must be backported in all branches containing the above commit
(as far as 2.0 for now).
2021-07-06 11:44:04 +02:00
Willy Tarreau
dfb34a8f87 BUG/MINOR: cli: fix server name output in "show fd"
A server name was displayed as <srv>/<proxy> instead of the reverse.
It only confuses diagnostics. This was introduced by commit 7a4a0ac71
("MINOR: cli: add a new "show fd" command") so this fix can be backport
down to 1.8.
2021-07-06 11:41:10 +02:00
Willy Tarreau
5a9c637bf3 BUG/MEDIUM: sock: make sure to never miss early connection failures
As shown in issue #1251, it is possible for a connect() to report an
error directly via the poller without ever reporting send readiness,
but currentlt sock_conn_check() manages to ignore that situation,
leading to high CPU usage as poll() wakes up on these FDs.

The bug was apparently introduced in 1.5-dev22 with commit fd803bb4d
("MEDIUM: connection: add check for readiness in I/O handlers"), but
was likely only woken up by recent changes to conn_fd_handler() that
made use of wakeups instead of direct calls between 1.8 and 1.9,
voiding any chance to catch such errors in the early recv() callback.

The exact sequence that leads to this situation remains obscure though
because the poller does not report send readiness nor does it report an
error. Only HUP and IN are reported on the FD. It is also possible that
some recent kernel updates made this condition appear while it never
used to previously.

This needs to be backported to all stable branches, at least as far
as 2.0. Before 2.2 the code was in tcp_connect_probe() in proto_tcp.c.
2021-07-06 10:52:19 +02:00
Emeric Brun
726783db18 MEDIUM: stick-table: make the use of 'gpc' excluding the use of 'gpc0/1''
This patch makes the use of 'gpc' excluding the use of the legacy
types 'gpc0' and 'gpc1" on the same table.

It also makes the use of 'gpc_rate' excluding the use of the legacy
types 'gpc0_rate' and 'gpc1_rate" on the same table.

The 'gpc0' and 'gpc1' related fetches and actions will apply
to the first two elements of the 'gpc' array if stored in table.

The 'gpc0_rate' and 'gpc1_rate' related fetches and actions will apply
to the first two elements of the 'gpc_rate' array if stored in table.
2021-07-06 07:24:42 +02:00
Emeric Brun
4d7ada8f9e MEDIUM: stick-table: add the new arrays of gpc and gpc_rate
This patch adds the definition of two new array data_types:
'gpc': This is an array of 32bits General Purpose Counters.
'gpc_rate': This is an array on increment rates of General Purpose Counters.

Like for all arrays, they are limited to 100 elements.

This patch also adds actions and fetches to handle
elements of those arrays.

Note: As documented, those new actions and fetches won't
apply to the legacy 'gpc0', 'gpc1', 'gpc0_rate' nor 'gpc1_rate'.
2021-07-06 07:24:42 +02:00
Emeric Brun
f7ab0bfb62 MEDIUM: stick-table: make the use of 'gpt' excluding the use of 'gpt0'
This patch makes the use of 'gpt' excluding the use of the legacy
type 'gpt0' on the same table.

It also makes the 'gpt0' related fetches and actions applying
to the first element of the 'gpt' array if stored in table.
2021-07-06 07:24:42 +02:00
Emeric Brun
877b0b5a7b MEDIUM: stick-table: add the new array of gpt data_type
This patch adds the definition of a new array data_type
'gpt'. This is an array of 32bits General Purpose Tags.

Like for all arrays, it is limited to 100 elements.

This patch also adds actions and fetches to handle
elements of this array.

Note: As documented, those new actions and fetches won't
apply to the legacy 'gpt0' data type.
2021-07-06 07:24:42 +02:00
Emeric Brun
90a9b676a8 MEDIUM: peers: handle arrays of std types in peers protocol
This patch adds support of array data_types on the peer protocol.

The table definition message will provide an additionnal parameter
for array data-types: the number of elements of the array.

In case of array of frqp it also provides a second parameter:
the period used to compute freq counter.

The array elements are std_type values linearly encoded in
the update message.

Note: if a remote peer announces an array data_type without
parameters into the table definition message, all updates
on this table will be ignored because we can not
parse update messages consistently.
2021-07-06 07:24:42 +02:00
Emeric Brun
c64a2a307c MEDIUM: stick-table: handle arrays of standard types into stick-tables
This patch provides the code to handle arrays of some
standard types (SINT, UINT, ULL and FRQP) in stick table.

This way we could define new "array" data types.

Note: the number of elements of an array was limited
to 100 to put a limit and to ensure that an encoded
update message will continue to fit into a buffer
when the peer protocol will handle such data types.
2021-07-06 07:24:42 +02:00
Emeric Brun
0e3457b63a MINOR: stick-table: make skttable_data_cast to use only std types
This patch replaces all advanced data type aliases on
stktable_data_cast calls by standard types.

This way we could call the same stktable_data_cast
regardless of the used advanced data type as long they
are using the same std type.

It also removes all the advanced data type aliases.
2021-07-06 07:24:42 +02:00
Emeric Brun
08b0f6780c BUG/MINOR: peers: fix data_type bit computation more than 32 data_types
This patch fixes the computation of the bit of the current data_type
in some part of code of peer protocol where the computation is limited
to 32bits whereas the bitfield of data_types can support 64bits.

Without this patch it could result in bugs when we will define more
than 32 data_types.

Backport is useless because there is currently less than 32 data_types
2021-07-06 07:24:42 +02:00
Emeric Brun
01928ae56b BUG/MINOR: stick-table: fix several printf sign errors dumping tables
This patch fixes several errors printing integers
of stick table entry values and args during dump on cli.

This patch should be backported since the dump of entries
is supported.  [wt: roughly 1.5-dev1 hence all stable branches]
2021-07-06 07:24:42 +02:00
David Carlier
bae4cb2790 BUILD/MEDIUM: tcp: set-mark support for OpenBSD
set-mark support for this platform, for routing table purpose.
Follow-up from f7f53afcf9, this time for OpenBSD.
2021-07-05 10:53:18 +02:00
Emeric Brun
5ea07d9e91 CLEANUP: peers: re-write intdecode function comment.
The varint decoding function comment was not clear enough and
didn't reflect the current usage.

This patch re-writes this.
2021-06-30 13:49:12 +02:00
Christopher Faulet
81ba74ae50 BUG/MEDIUM: resolvers: Make 1st server of a template take part to SRV resolution
The commit 3406766d5 ("MEDIUM: resolvers: add a ref between servers and srv
request or used SRV record") introduced a regression. The first server of a
template based on SRV record is no longer resolved. The same bug exists for
a normal server based on a SRV record.

In fact, the server used during parsing (used as reference when a
server-template line is parsed) is never attached to the corresponding srvrq
object. Thus with following lines, no resolution is performed because
"srvrq->attached_servers" is empty:

  server-template test 1 _http.domain.tld resolvers dns ...
  server test1 _http.domain.tld resolvers dns ...

This patch should fix the issue #1295 (but not confirmed yet it is the same
bug). It must be backported everywhere the above commit is.
2021-06-29 20:52:37 +02:00
Christopher Faulet
0de0becf0b BUG/MINOR: mqtt: Support empty client ID in CONNECT message
As specified by the MQTT specification (MQTT-3.1.3-6), the client ID may be
empty. That means the length of the client ID string may be 0. However, The
MQTT parser does not support empty strings.

So, to fix the bug, the mqtt_read_string() function may now parse empty
string. 2 bytes must be found to decode the string length, but the length
may be 0 now. It is the caller responsibility to test the string emptiness
if necessary. In addition, in mqtt_parse_connect(), the client ID may be
empty now.

This patch should partely fix the issue #1310. It must be backported to 2.4.
2021-06-28 16:29:44 +02:00
Christopher Faulet
ca925c9c28 BUG/MINOR: mqtt: Fix parser for string with more than 127 characters
Parsing of too long strings (> 127 characters) was buggy because of a wrong
cast on the length bytes. To fix the bug, we rely on mqtt_read_2byte_int()
function. This way, the string length is properly decoded.

This patch should partely fix the issue #1310. It must be backported to 2.4.
2021-06-28 16:29:44 +02:00
Willy Tarreau
5bbfff107b BUILD: tcp-act: avoid warning when set-mark / set-tos are not supported
Since recent commit 469c06c30 ("MINOR: http-act/tcp-act: Add "set-mark"
and "set-tos" for tcp content rules") there's a build warning (or error)
on Windows due to static function tcp_action_set_mark() not being used
because the set-mark functionality is not supported there. It's caused
by the fact that only the parsing function uses it so if the code is
ifdefed out the function remains unused.

Let's surround it with ifdefs as well, and do the same for
tcp_action_set_tos() which could suffer the same fate on operating systems
not defining IP_TOS.

This may need to be backported if the patch above is backported. Also
be careful, the condition was adjusted to cover FreeBSD after commit
f7f53afcf ("BUILD/MEDIUM: tcp: set-mark setting support for FreeBSD.").
2021-06-28 07:12:22 +02:00
David Carlier
f7f53afcf9 BUILD/MEDIUM: tcp: set-mark setting support for FreeBSD.
This platform has a similar socket option from Linux's SO_MARK,
marking a socket with an id for packet filter purpose, DTrace
monitoring and so on.
2021-06-28 07:03:35 +02:00
Christopher Faulet
ee9c98d81b CLEANUP: tcp-act: Sort action lists
Sort the lists used to register tcp actions.
2021-06-25 16:12:02 +02:00
Christopher Faulet
469c06c30e MINOR: http-act/tcp-act: Add "set-mark" and "set-tos" for tcp content rules
It is now possible to set the Netfilter MARK and the TOS field value in all
packets sent to the client from any tcp-request rulesets or the "tcp-response
content" one. To do so, the parsing of "set-mark" and "set-tos" actions are
moved in tcp_act.c and the actions evaluation is handled in dedicated functions.

This patch may be backported as far as 2.2 if necessary.
2021-06-25 16:11:58 +02:00
Christopher Faulet
1da374af2f MINOR: http-act/tcp-act: Add "set-nice" for tcp content rules
It is now possible to set the "nice" factor of the current stream from a
"tcp-request content" or "tcp-response content" ruleset. To do so, the
action parsing is moved in stream.c and the action evaluation is handled in
a dedicated function.

This patch may be backported as far as 2.2 if necessary.
2021-06-25 16:11:53 +02:00
Christopher Faulet
551a641cff MINOR: http-act/tcp-act: Add "set-log-level" for tcp content rules
It is now possible to set the stream log level from a "tcp-request content"
or "tcp-response content" ruleset. To do so, the action parsing is moved in
stream.c and the action evaluation is handled in a dedicated function.

This patch should fix issue #1306. It may be backported as far as 2.2 if
necessary.
2021-06-25 16:11:46 +02:00
Christopher Faulet
fa5880bd53 BUG/MINOR: tcpcheck: Fix numbering of implicit HTTP send/expect rules
The index of the failing rule is reported in the health-check log message. The
rules index is also used in the check traces. But for implicit HTTP send/expect
rules, the index is wrong. It must be incremented by one compared to the
preceding rule.

This patch may be backported as far as 2.2.
2021-06-25 14:03:45 +02:00
Dirkjan Bussink
dfee217b68 BUG/MINOR: checks: return correct error code for srv_parse_agent_check
In srv_parse_agent_check the error code is not returned in case
something goes wrong. The value 0 is always return.

Additionally, there's a small cleanup of unreachable returns that in
most checks are not present either and removed in two places they were
present. This makes the code consistent across the different checks.
2021-06-25 08:55:39 +02:00
Christopher Faulet
07ecff589d MINOR: resolvers: Reset server IP on error in resolv_get_ip_from_response()
If resolv_get_ip_from_response() returns an error (or an unexpected return
value), the server is set to RMAINT status. However, its address must also
be reset. Otherwise, it is still reported by the cli on "show servers state"
commands. This may be confusing. Note that it is a theorical patch because
this code path does not exist. Thus it is not tagged as a BUG.

This patch may be backported as far as 2.0.
2021-06-24 17:22:36 +02:00
Christopher Faulet
a8ce497aac BUG/MINOR: resolvers: Reset server IP when no ip is found in the response
For A/AAAA resolution, if no ip is found for a server in the response, the
server is set to RMAINT status. However, its address must also be
reset. Otherwise, it is still reported by the cli on "show servers state"
commands. This may be confusing.

This patch may be backported as far as 2.0.
2021-06-24 17:22:36 +02:00
Christopher Faulet
d7bb23490c BUG/MINOR: resolvers: Always attach server on matching record on resolution
On A/AAAA resolution, for a given server, if a record is matching, we must
always attach the server to this record. Before it was only done if the
server IP was not the same than the record one. However, it is a problem if
the server IP was not set for a previous resolution. From the libc during
startup for instance. In this case, the server IP is not updated and the
server is not attached to any record. It remains in this state while a
matching record is found in the DNS response. It is especially a problem
when the resolution is used for server-templates.

This bug was introduced by the commit bd78c912f ("MEDIUM: resolvers: add a
ref on server to the used A/AAAA answer item").

This patch should solve the issue #1305. It must be backported to all
versions containing the above commit.
2021-06-24 17:15:33 +02:00
Willy Tarreau
47ee44fb71 BUG/MINOR: queue/debug: use the correct lock labels on the queue lock
A dedicated queue lock was added by commit 16fbdda3c ("MEDIUM: queue:
use a dedicated lock for the queues (v2)") but during its rebase, some
labels were lost and left to SERVER_LOCK / PROXY_LOCK instead of
QUEUE_LOCK. It's harmless but can confuse the lock debugger, so better
fix it.

No backport is needed.
2021-06-24 16:00:18 +02:00
Willy Tarreau
19c5581b43 BUG: backend: stop looking for queued connections once there's no more
Commit ae0b12ee0 ("MEDIUM: queue: use a trylock on the server's queue")
introduced a hard to trigger bug that's more visible with a single thread:
if a server dequeues a connection and finds another free slot with no
connection to place there, process_srv_queue() will never break out of
the loop. In multi-thread it almost does not happen because other threads
bring new connections.

No backport is needed as it's only in -dev.
2021-06-24 15:56:07 +02:00
Willy Tarreau
d03adce575 MINOR: queue: simplify pendconn_unlink() regarding srv vs px
Since the code paths became exactly the same except for what log field
to update, let's simplify the code and move further code out of the
lock. The queue position update and the test for server vs proxy do not
need to be inside the lock.
2021-06-24 10:52:31 +02:00
Willy Tarreau
51c63f0f0a MINOR: queue: remove the px/srv fields from pendconn
Now we directly use p->queue to get to the queue, which is much more
straightforward. The performance on 100 servers and 16 threads
increased from 560k to 574k RPS, or 2.5%.

A lot more simplifications are possible, but the minimum was done at
this point.
2021-06-24 10:52:31 +02:00
Willy Tarreau
8429097c61 MINOR: queue: store a pointer to the queue into the pendconn
By following the queue pointer in the pendconn it will now be possible
to always retrieve the elements (index, srv, px, etc).
2021-06-24 10:52:31 +02:00
Willy Tarreau
cdc83e0192 MINOR: queue: add a pointer to the server and the proxy in the queue
A queue is specific to a server or a proxy, so we don't need to place
this distinction inside all pendconns, it can be in the queue itself.
This commit adds the relevant fields "px" and "sv" into the struct
queue, and initializes them accordingly.
2021-06-24 10:52:31 +02:00
Willy Tarreau
df3b0cbe31 MINOR: queue: add queue_init() to initialize a queue
This is better and cleaner than open-coding this in the server and
proxy code, where it has all chances of becoming wrong once forgotten.
2021-06-24 10:52:31 +02:00
Willy Tarreau
ae0b12ee03 MEDIUM: queue: use a trylock on the server's queue
Doing so makes sure that threads attempting to wake up new connections
for a server will give up early if another thread is already in charge
of this. The goal is to avoid unneeded contention on low server counts.

Now with a single server with 16 threads in roundrobin we get the same
performance as with multiple servers, i.e. ~575kreq/s instead of ~496k
before. Leastconn is seeing a similar jump, from ~460 to ~560k (the
difference being the calls to fwlc_srv_reposition).

The overhead of process_srv_queue() is now around 2% instead of ~20%
previously.
2021-06-24 10:52:31 +02:00
Willy Tarreau
49667c14ba MEDIUM: queue: take the proxy lock only during the px queue accesses
There's no point keeping the proxy lock held for a long time, it's
only needed when checking the proxy's queue, and keeping it prevents
multiple servers from dequeuing in parallel. Let's move it into
pendconn_process_next_strm() and release it ASAP. The pendconn
remains under the server queue lock's protection, guaranteeing that
no stream will release it while it's being touched.

For roundrobin, the performance increases by 76% (327k to 575k) on
16 threads. Even with a single server and maxconn=100, the performance
increases from 398 to 496 kreq/s. For leastconn, almost no change is
visible (less than one percent) but this is expected since most of the
time there is spent in fwlc_reposition() and fwlc_get_next_server().
2021-06-24 10:52:31 +02:00
Willy Tarreau
98c8910be7 MINOR: queue: use atomic-ops to update the queue's index (v2)
Doing so allows to retrieve and update the pendconn's queue index outside
of the queue's lock and to save one more percent CPU on a highly-contented
backend.
2021-06-24 10:52:31 +02:00
Willy Tarreau
12529c0ed3 MINOR: queue: factor out the proxy/server queuing code (v2)
The code only differed by the nbpend_max counter. Let's have a pointer
to it and merge the two variants to always use a generic queue. It was
initially considered to put the max inside the queue structure itself,
but the stats support clearing values and maxes and this would have been
the only counter having to be handled separately there. Given that we
don't need this max anywhere outside stats, let's keep it where it is
and have a pointer to it instead.

The CAS loop to update the max remains. It was naively thought that it
would have been faster without atomic ops inside the lock, but this is
not the case for the simple reason that it is a max, it converges very
quickly and never has to perform the check anymore. Thus this code is
better out of the lock.

The queue_idx is still updated inside the lock since that's where the
idx is updated, though it could be performed using atomic ops given
that it's only used to roughly count places for logging.
2021-06-24 10:52:31 +02:00
Willy Tarreau
a0e9c55ab1 MEDIUM: queue: determine in process_srv_queue() if the proxy is usable (v2)
By doing so we can move some evaluations outside of the lock and the
loop.
2021-06-24 10:52:31 +02:00
Willy Tarreau
9ab78293bf MEDIUM: queue: simplify again the process_srv_queue() API (v2)
This basically undoes the API changes that were performed by commit
0274286dd ("BUG/MAJOR: server: fix deadlock when changing maxconn via
agent-check") to address the deadlock issue: since process_srv_queue()
doesn't use the server lock anymore, it doesn't need the "server_locked"
argument, so let's get rid of it before it gets used again.
2021-06-24 10:52:31 +02:00
Willy Tarreau
16fbdda3c3 MEDIUM: queue: use a dedicated lock for the queues (v2)
Till now whenever a server or proxy's queue was touched, this server
or proxy's lock was taken. Not only this requires distinct code paths,
but it also causes unnecessary contention with other uses of these locks.

This patch adds a lock inside the "queue" structure that will be used
the same way by the server and the proxy queuing code. The server used
to use a spinlock and the proxy an rwlock, though the queue only used
it for locked writes. This new version uses a spinlock since we don't
need the read lock part here. Tests have not shown any benefit nor cost
in using this one versus the rwlock so we could change later if needed.

The lower contention on the locks increases the performance from 362k
to 374k req/s on 16 threads with 20 servers and leastconn. The gain
with roundrobin even increases by 9%.

This is tagged medium because the lock is changed, but no other part of
the code touches the queues, with nor without locking, so this should
remain invisible.
2021-06-24 10:52:31 +02:00
Willy Tarreau
9cef43acab MEDIUM: queue: update px->served and lb's take_conn once per loop
There's no point doing atomic incs over px->served/px->totpend under the
locks from the inner loop, as this value is used by the LB algorithms but
not during the dequeuing step. In addition, the LB algo's take_conn()
doesn't need to be refreshed for each and every connection taken
under the lock, it can be performed once at the end and out of the
lock.

While the gain on roundrobin is not noticeable (only the atomic inc),
on leastconn which uses take_conn(), the performance increases from
355k to 362k req/s on 16 threads.
2021-06-24 10:09:40 +02:00
Willy Tarreau
a48905bad8 Revert "MEDIUM: queue: make pendconn_process_next_strm() only return the pendconn"
This reverts commit 5304669e1b.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
2021-06-24 09:55:59 +02:00
Willy Tarreau
d83c98eb14 Revert "MINOR: queue: update proxy->served once out of the loop"
This reverts commit 3e92a31783.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
2021-06-24 09:55:14 +02:00
Willy Tarreau
e76fc3253d Revert "MEDIUM: queue: refine the locking in process_srv_queue()"
This reverts commit 1b648c857b.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
2021-06-24 09:55:14 +02:00
Willy Tarreau
3f70fb9ea2 Revert "MEDIUM: queue: use a dedicated lock for the queues"
This reverts commit fcb8bf8650.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
2021-06-24 07:26:28 +02:00
Willy Tarreau
ccd85a3e08 Revert "MEDIUM: queue: simplify again the process_srv_queue() API"
This reverts commit c83e45e9b0.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
2021-06-24 07:22:18 +02:00
Willy Tarreau
58f4dfb2b0 Revert "MINOR: queue: factor out the proxy/server queuing code"
This reverts commit 3eecdb65c5.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
2021-06-24 07:22:15 +02:00
Willy Tarreau
a4a9bbadc6 Revert "MINOR: queue: use atomic-ops to update the queue's index"
This reverts commit 1335eb9867.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
2021-06-24 07:22:12 +02:00
Willy Tarreau
ddac4a1f35 Revert "MEDIUM: queue: determine in process_srv_queue() if the proxy is usable"
This reverts commit de814dd422.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
2021-06-24 07:22:08 +02:00
Willy Tarreau
5343d8ed6f Revert "MEDIUM: queue: move the queue lock manipulation to pendconn_process_next_strm()"
This reverts commit 9a6d0ddbd6.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
2021-06-24 07:22:03 +02:00
Willy Tarreau
90a160a465 Revert "MEDIUM: queue: unlock as soon as possible"
This reverts commit 5b39275311.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
2021-06-24 07:21:59 +02:00
Willy Tarreau
2bf3f2cf7f Revert "MINOR: queue: make pendconn_first() take the lock by itself"
This reverts commit 772e968b06.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
2021-06-24 07:20:26 +02:00
Christopher Faulet
c3fe968f22 CLEANUP: dns: Remove a forgotten debug message
A debug message was forgotten in the dns part.

This patch should fix the issue #1304. It must be backported to 2.4.
2021-06-23 12:21:47 +02:00
Christopher Faulet
19bbbe0562 MINOR: tcp-act: Add set-src/set-src-port for "tcp-request content" rules
If it possible to set source IP/Port from "tcp-request connection",
"tcp-request session" and "http-request" rules but not from "tcp-request
content" rules. There is no reason for this limitation and it may be a
problem for anyone wanting to call a lua fetch to dynamically set source
IP/Port from a TCP proxy. Indeed, to call a lua fetch, we must have a
stream. And there is no stream when "tcp-request connection/session" rules
are evaluated.

Thanks to this patch, "set-src" and "set-src-port" action are now supported
by "tcp_request content" rules.

This patch is related to the issue #1303. It may be backported to all stable
versions.
2021-06-23 12:07:24 +02:00
Willy Tarreau
5ffb045ed1 CLEANUP: backend: remove impossible case of round-robin + consistent hash
In 1.4, consistent hashing was brought by commit 6b2e11be1 ("[MEDIUM]
backend: implement consistent hashing variation") which took care of
replacing all direct calls to map_get_server_rr() with an alternate
call to chash_get_next_server() if consistent hash was being used.

One of them, however, cannot happen because a preliminary test for
static round-robin is being done prior to the call, so we're certain
that if it matches it cannot use a consistent hash tree.

Let's remove it.
2021-06-22 19:21:11 +02:00
Willy Tarreau
772e968b06 MINOR: queue: make pendconn_first() take the lock by itself
Dealing with the queue lock in the caller remains complicated. Let's
change pendconn_first() to take the queue instead of the tree head,
and handle the lock itself. It now returns an element with a locked
queue or no element with an unlocked queue. It can avoid locking if
the queue is already empty.
2021-06-22 18:57:18 +02:00
Willy Tarreau
5b39275311 MEDIUM: queue: unlock as soon as possible
There's no point keeping the server's queue lock after seeing that the
server's queue is empty, just like there's no need to keep the proxy's
lock when its queue is empty. This patch checks for emptiness and
releases these locks as soon as possible.

With this the performance increased from 524k to 530k on 16 threads
with round-robin.
2021-06-22 18:57:18 +02:00
Willy Tarreau
9a6d0ddbd6 MEDIUM: queue: move the queue lock manipulation to pendconn_process_next_strm()
By placing the lock there, it becomes possible to lock the proxy
later and to unlock it earlier. The server unlocking also happens slightly
earlier.

The performance on roundrobin increases from 481k to 524k req/s on 16
threads. Leastconn shows about 513k req/s (the difference being the
take_conn() call).

The performance profile changes from this:
   9.32%  hap-pxok            [.] process_srv_queue
   7.56%  hap-pxok            [.] pendconn_dequeue
   6.90%  hap-pxok            [.] pendconn_add

to this:
   7.42%  haproxy             [.] process_srv_queue
   5.61%  haproxy             [.] pendconn_dequeue
   4.95%  haproxy             [.] pendconn_add
2021-06-22 18:57:18 +02:00
Willy Tarreau
de814dd422 MEDIUM: queue: determine in process_srv_queue() if the proxy is usable
By doing so we can move some evaluations outside of the lock and the
loop. In the round robin case, the performance increases from 497k to
505k rps on 16 threads with 100 servers.
2021-06-22 18:57:18 +02:00
Willy Tarreau
1335eb9867 MINOR: queue: use atomic-ops to update the queue's index
Doing so allows to retrieve and update the pendconn's queue index outside
of the queue's lock and to save one more percent CPU on a highly-contented
backend.
2021-06-22 18:57:18 +02:00
Willy Tarreau
3eecdb65c5 MINOR: queue: factor out the proxy/server queuing code
The code only differed by the nbpend_max counter. Let's have a pointer
to it and merge the two variants to always use a generic queue. It was
initially considered to put the max inside the queue structure itself,
but the stats support clearing values and maxes and this would have been
the only counter having to be handled separately there. Given that we
don't need this max anywhere outside stats, let's keep it where it is
and have a pointer to it instead.

The CAS loop to update the max remains. It was naively thought that it
would have been faster without atomic ops inside the lock, but this is
not the case for the simple reason that it is a max, it converges very
quickly and never has to perform the check anymore. Thus this code is
better out of the lock.

The queue_idx is still updated inside the lock since that's where the
idx is updated, though it could be performed using atomic ops given
that it's only used to roughly count places for logging.
2021-06-22 18:57:18 +02:00
Willy Tarreau
c83e45e9b0 MEDIUM: queue: simplify again the process_srv_queue() API
This basically undoes the API changes that were performed by commit
0274286dd ("BUG/MAJOR: server: fix deadlock when changing maxconn via
agent-check") to address the deadlock issue: since process_srv_queue()
doesn't use the server lock anymore, it doesn't need the "server_locked"
argument, so let's get rid of it before it gets used again.
2021-06-22 18:57:15 +02:00
Willy Tarreau
fcb8bf8650 MEDIUM: queue: use a dedicated lock for the queues
Till now whenever a server or proxy's queue was touched, this server
or proxy's lock was taken. Not only this requires distinct code paths,
but it also causes unnecessary contention with other uses of these locks.

This patch adds a lock inside the "queue" structure that will be used
the same way by the server and the proxy queuing code. The server used
to use a spinlock and the proxy an rwlock, though the queue only used
it for locked writes. This new version uses a spinlock since we don't
need the read lock part here. Tests have not shown any benefit nor cost
in using this one versus the rwlock so we could change later if needed.

The lower contention on the locks increases the performance from 491k
to 507k req/s on 16 threads with 20 servers and leastconn. The gain
with roundrobin even increases by 6%.

The performance profile changes from this:
  13.03%  haproxy             [.] fwlc_srv_reposition
   8.08%  haproxy             [.] fwlc_get_next_server
   3.62%  haproxy             [.] process_srv_queue
   1.78%  haproxy             [.] pendconn_dequeue
   1.74%  haproxy             [.] pendconn_add

to this:
  11.95%  haproxy             [.] fwlc_srv_reposition
   7.57%  haproxy             [.] fwlc_get_next_server
   3.51%  haproxy             [.] process_srv_queue
   1.74%  haproxy             [.] pendconn_dequeue
   1.70%  haproxy             [.] pendconn_add

At this point the differences are mostly measurement noise.

This is tagged medium because the lock is changed, but no other part of
the code touches the queues, with nor without locking, so this should
remain invisible.
2021-06-22 18:43:56 +02:00
Willy Tarreau
a05704582c MINOR: server: replace the pendconns-related stuff with a struct queue
Just like for proxies, all three elements (pendconns, nbpend, queue_idx)
were moved to struct queue.
2021-06-22 18:43:14 +02:00
Willy Tarreau
7f3c1df248 MINOR: proxy: replace the pendconns-related stuff with a struct queue
All three elements (pendconns, nbpend, queue_idx) were moved to struct
queue.
2021-06-22 18:43:14 +02:00
Willy Tarreau
5941ef0a6c MINOR: lb/api: remove the locked argument from take_conn/drop_conn
This essentially reverts commit 2b4370078 ("MINOR: lb/api: let callers
of take_conn/drop_conn tell if they have the lock") that was merged
during 2.4 before the various locks could be eliminated at the lower
layers. Passing that information complicates the cleanup of the queuing
code and it's become useless.
2021-06-22 18:43:12 +02:00
Willy Tarreau
1b648c857b MEDIUM: queue: refine the locking in process_srv_queue()
The lock in process_srv_queue() was placed around the whole loop to
avoid the cost of taking/releasing it multiple times. But in practice
almost all calls to this function only dequeue a single connection, so
that argument doesn't really stand. However by placing the lock inside
the loop, we'd make it possible to release it before manipulating the
pendconn and waking the task up. That's what this patch does.

This increases the performance from 431k to 491k req/s on 16 threads
with 20 servers under leastconn.

The performance profile changes from this:
  14.09%  haproxy             [.] process_srv_queue
  10.22%  haproxy             [.] fwlc_srv_reposition
   6.39%  haproxy             [.] fwlc_get_next_server
   3.97%  haproxy             [.] pendconn_dequeue
   3.84%  haproxy             [.] pendconn_add

to this:
  13.03%  haproxy             [.] fwlc_srv_reposition
   8.08%  haproxy             [.] fwlc_get_next_server
   3.62%  haproxy             [.] process_srv_queue
   1.78%  haproxy             [.] pendconn_dequeue
   1.74%  haproxy             [.] pendconn_add

The difference is even slightly more visible in roundrobin which
does not have take_conn() call.
2021-06-22 18:41:55 +02:00
Willy Tarreau
3e92a31783 MINOR: queue: update proxy->served once out of the loop
It's not needed during all these operations and doesn't even affect
queueing in the LB algo, so we can safely update it out of the loop
and the lock.
2021-06-22 18:37:45 +02:00
Willy Tarreau
5304669e1b MEDIUM: queue: make pendconn_process_next_strm() only return the pendconn
It used to do far too much under the lock, including waking up tasks,
updating counters and repositionning entries in the load balancing algo.

This patch first moves all that stuff out of the function into the only
caller (process_srv_queue()). The decision to update the LB algo is now
taken out of the lock. The wakeups could be performed outside of the
loop by using a local list.

This increases the performance from 377k to 431k req/s on 16 threads
with 20 servers under leastconn.

The perf profile changes from this:
  23.17%  haproxy             [.] process_srv_queue
   6.58%  haproxy             [.] pendconn_add
   6.40%  haproxy             [.] pendconn_dequeue
   5.48%  haproxy             [.] fwlc_srv_reposition
   3.70%  haproxy             [.] fwlc_get_next_server

to this:
  13.95%  haproxy             [.] process_srv_queue
   9.96%  haproxy             [.] fwlc_srv_reposition
   6.21%  haproxy             [.] fwlc_get_next_server
   3.96%  haproxy             [.] pendconn_dequeue
   3.75%  haproxy             [.] pendconn_add
2021-06-22 18:37:41 +02:00
Amaury Denoyelle
0274286dd3 BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check
The server_parse_maxconn_change_request locks the server lock. However,
this function can be called via agent-checks or lua code which already
lock it. This bug has been introduced by the following commit :

  commit 79a88ba3d0
  BUG/MAJOR: server: prevent deadlock when using 'set maxconn server'

This commit tried to fix another deadlock with can occur because
previoulsy server_parse_maxconn_change_request requires the server lock
to be held. However, it may call internally process_srv_queue which also
locks the server lock. The locking policy has thus been updated. The fix
is functional for the CLI 'set maxconn' but fails to address the
agent-check / lua counterparts.

This new issue is fixed in two steps :
- changes from the above commit have been reverted. This means that
  server_parse_maxconn_change_request must again be called with the
  server lock.

- to counter the deadlock fixed by the above commit, process_srv_queue
  now takes an argument to render the server locking optional if the
  caller already held it. This is only used by
  server_parse_maxconn_change_request.

The above commit was subject to backport up to 1.8. Thus this commit
must be backported in every release where it is already present.
2021-06-22 11:39:20 +02:00
Willy Tarreau
901972e261 MINOR: queue: update the stream's pend_pos before queuing it
Since commit c7eedf7a5 ("MINOR: queue: reduce the locked area in
pendconn_add()") the stream's pend_pos is set out of the lock, after
the pendconn is queued. While this entry is only manipulated by the
stream itself and there is no bug caused by this right now, it's a
bit dangerous because another thread could decide to look at this
field during dequeuing and could randomly see something else. Also
in case of crashes, memory inspection wouldn't be as trustable.
Let's assign the pendconn before it can be found in the queue.
2021-06-18 18:21:18 +02:00
Amaury Denoyelle
34897d2eff MINOR: ssl: support ssl keyword for dynamic servers
Activate the 'ssl' keyword for dynamic servers. This is the final step
to have ssl dynamic servers feature implemented. If activated,
ssl_sock_prepare_srv_ctx will be called at the end of the 'add server'
CLI handler.

At the same time, update the management doc to list all ssl keywords
implemented for dynamic servers.
2021-06-18 16:42:26 +02:00
Amaury Denoyelle
71f9a06e4b MINOR: ssl: enable a series of ssl keywords for dynamic servers
These keywords are deemed safe-enough to be enable on dynamic servers.
Their parsing functions are simple and can be called at runtime.

- allow-0rtt
- alpn
- ciphers
- ciphersuites
- force-sslv3/tlsv10/tlsv11/tlsv12/tlsv13
- no-sslv3/tlsv10/tlsv11/tlsv12/tlsv13
- no-ssl-reuse
- no-tls-tickets
- npn
- send-proxy-v2-ssl
- send-proxy-v2-ssl-cn
- sni
- ssl-min-ver
- ssl-max-ver
- tls-tickets
- verify
- verifyhost

'no-ssl-reuse' and 'no-tls-tickets' are enabled to override the default
behavior.

'tls-tickets' is enable to override a possible 'no-tls-tickets' set via
the global option 'ssl-default-server-options'.

'force' and 'no' variants of tls method options are useful to override a
possible 'ssl-default-server-options'.
2021-06-18 16:42:26 +02:00
Amaury Denoyelle
fde82605cd MINOR: ssl: support crl arg for dynamic servers
File-access through ssl_store_load_locations_file is deactivated if
srv_parse_crl is used at runtime for a dynamic server. The crl must
have already been loaded either in the config or through the 'ssl crl'
CLI commands.
2021-06-18 16:42:26 +02:00
Amaury Denoyelle
93be21e0c6 MINOR: ssl: support crt arg for dynamic servers
File-access through ssl_store_load_locations_file is deactivated if
srv_parse_crt is used at runtime for a dynamic server. The cert must
have already been loaded either in the config or through the 'ssl cert'
CLI commands.
2021-06-18 16:42:26 +02:00
Amaury Denoyelle
482550280a MINOR: ssl: support ca-file arg for dynamic servers
File-access through ssl_store_load_locations_file is deactivated if
srv_parse_ca_file is used at runtime for a dynamic server. The ca-file
must have already been loaded either in the config or through the 'ssl
ca-file' CLI commands.
2021-06-18 16:42:26 +02:00
Amaury Denoyelle
7addf56b72 MINOR: ssl: split parse functions for alpn/check-alpn
This will be in preparation for support of ssl on dynamic servers. The
'alpn' keyword will be allowed for dynamic servers but not the
'check-alpn'.

The alpn parsing is extracted into a new function parse_alpn. Each
srv_parse_alpn and srv_parse_check_alpn called it.
2021-06-18 16:42:26 +02:00
Amaury Denoyelle
36aa451a4e MINOR: ssl: render file-access optional on server crt loading
The function ssl_sock_load_srv_cert will be used at runtime for dynamic
servers. If the cert is not loaded on ckch tree, we try to access it
from the file-system.

Now this access operation is rendered optional by a new function
argument. It is only allowed at parsing time, but will be disabled for
dynamic servers at runtime.
2021-06-18 16:42:25 +02:00
Amaury Denoyelle
b89d3d3de7 MINOR: server: disable CLI 'set server ssl' for dynamic servers
'set server ssl' uses ssl parameters from default-server. As dynamic
servers does not reuse any default-server parameters, this command has
no sense for them.
2021-06-18 16:42:25 +02:00
Amaury Denoyelle
1f9333b30e MINOR: ssl: check allocation in parse npn/sni
These checks are especially required now as this function will be used
at runtime for dynamic servers.
2021-06-18 16:42:25 +02:00
Amaury Denoyelle
cbbf87f119 MINOR: ssl: check allocation in parse ciphers/ciphersuites/verifyhost
These checks are especially required now as this function will be used
at runtime for dynamic servers.
2021-06-18 16:42:25 +02:00
Amaury Denoyelle
949c94e462 MINOR: ssl: check allocation in ssl_sock_init_srv
These checks are especially required now as this function will be used
at runtime for dynamic servers.
2021-06-18 16:42:25 +02:00
Amaury Denoyelle
c593bcdb43 MINOR: ssl: always initialize random generator
Explicitly call ssl_initialize_random to initialize the random generator
in init() global function. If the initialization fails, the startup is
interrupted.

This commit is in preparation for support of ssl on dynamic servers. To
be able to activate ssl on dynamic servers, it is necessary to ensure
that the random generator is initialized on startup regardless of the
config. It cannot be called at runtime as access to /dev/urandom is
required.

This also has the effect to fix the previous non-consistent behavior.
Indeed, if bind or server in the config are using ssl, the
initialization function was called, and if it failed, the startup was
interrupted. Otherwise, the ssl initialization code could have been
called through the ssl server for lua, but this times without blocking
the startup on error. Or not called at all if lua was deactivated.
2021-06-18 16:42:25 +02:00
Amaury Denoyelle
b11ad9ed61 MINOR: ssl: fix typo in usage for 'new ssl ca-file'
Fix the usage for the command new ssl ca-file, which has a missing '-'
dash separator.
2021-06-18 16:42:25 +02:00
Tim Duesterhus
3bc6af417d BUG/MINOR: cache: Correctly handle existing-but-empty 'accept-encoding' header
RFC 7231#5.3.4 makes a difference between a completely missing
'accept-encoding' header and an 'accept-encoding' header without any values.

This case was already correctly handled by accident, because an empty accept
encoding does not match any known encoding. However this resulted in the
'other' encoding being added to the bitmap. Usually this also succeeds in
serving cached responses, because the cached response likely has no
'content-encoding', thus matching the identity case instead of not serving the
response, due to the 'other' encoding. But it's technically not 100% correct.

Fix this by special-casing 'accept-encoding' values with a length of zero and
extend the test to check that an empty accept-encoding is correctly handled.
Due to the reasons given above the test also passes without the change in
cache.c.

Vary support was added in HAProxy 2.4. This fix should be backported to 2.4+.
2021-06-18 15:48:20 +02:00
Christopher Faulet
0ba54bb401 BUG/MINOR: server/cli: Fix locking in function processing "set server" command
The commit c7b391aed ("BUG/MEDIUM: server/cli: Fix ABBA deadlock when fqdn
is set from the CLI") introduced 2 bugs. The first one is a typo on the
server's lock label (s/SERVER_UNLOCK/SERVER_LOCK/). The second one is about
the server's lock itself. It must be acquired to execute the "agent-send"
subcommand.

The patch above is marked to be backported as far as 1.8. Thus, this one
must also backported as far 1.8.

BUG/MINOR: server/cli: Don't forget to lock server on agent-send subcommand
2021-06-18 09:16:32 +02:00
Christopher Faulet
e886dd5c32 BUG/MINOR: resolvers: Use resolver's lock in resolv_srvrq_expire_task()
The commit dcac41806 ("BUG/MEDIUM: resolvers: Add a task on servers to check
SRV resolution status") introduced a type. In resolv_srvrq_expire_task()
function, the resolver's lock must be used instead of the resolver itself.

This patch must be backported with the patch above (at least as far as 2.2).
2021-06-18 09:15:35 +02:00
Amaury Denoyelle
655dec81bd BUG/MINOR: backend: do not set sni on connection reuse
When reusing a backend connection, do not reapply the SNI on the
connection. It should already be defined when the connection was
instantiated on a previous connect_server invocation. As the SNI is a
parameter used to select a connection, only connection with same value
can be reused.

The impact of this bug is unknown and may be null. No memory leak has
been reported by valgrind. So this is more a cleaning fix.

This commit relies on the SF_SRV_REUSED flag and thus depends on the
following fix :
  BUG/MINOR: backend: restore the SF_SRV_REUSED flag original purpose

This should be backported up to 2.4.
2021-06-17 18:01:57 +02:00
Amaury Denoyelle
2b1d91758d BUG/MINOR: backend: restore the SF_SRV_REUSED flag original purpose
The SF_SRV_REUSED flag was set if a stream reused a backend connection.
One of its purpose is to count the total reuse on the backend in
opposition to newly instantiated connection.

However, the flag was diverted from its original purpose since the
following commit :

  e8f5f5d8b2
  BUG/MEDIUM: servers: Only set SF_SRV_REUSED if the connection if fully ready.

With this change, the flag is not set anymore if the mux is not ready
when a connection is picked for reuse. This can happen for multiplexed
connections which are inserted in the available list as soon as created
in http-reuse always mode. The goal of this change is to not retry
immediately this request in case on an error on the same server if the
reused connection is not fully ready.

This change is justified for the retry timeout handling but it breaks
other places which still uses the flag for its original purpose. Mainly,
in this case the wrong 'connect' backend counter is incremented instead
of the 'reuse' one. The flag is also used in http_return_srv_error and
may have an impact if a http server error is replied for this stream.

To fix this problem, the original purpose of the flag is restored by
setting it unconditionaly when a connection is reused. Additionally, a
new flag SF_SRV_REUSED_ANTICIPATED is created. This flag is set when the
connection is reused but the mux is not ready yet. For the timeout
handling on error, the request is retried immediately only if the stream
reused a connection without this newly anticipated flag.

This must be backported up to 2.1.
2021-06-17 17:58:50 +02:00
Christopher Faulet
dcac418062 BUG/MEDIUM: resolvers: Add a task on servers to check SRV resolution status
When a server relies on a SRV resolution, a task is created to clean it up
(fqdn/port and address) when the SRV resolution is considered as outdated
(based on the resolvers 'timeout' value). It is only possible if the server
inherits outdated info from a state file and is no longer selected to be
attached to a SRV item. Note that most of time, a server is attached to a
SRV item. Thus when the item becomes obsolete, the server is cleaned
up.

It is important to have such task to be sure the server will be free again
to have a chance to be resolved again with fresh information. Of course,
this patch is a workaround to solve a design issue. But there is no other
obvious way to fix it without rewritting all the resolvers part. And it must
be backportable.

This patch relies on following commits:
 * MINOR: resolvers: Clean server in a dedicated function when removing a SRV item
 * MINOR: resolvers: Remove server from named_servers tree when removing a SRV item

All the series must be backported as far as 2.2 after some observation
period. Backports to 2.0 and 1.8 must be evaluated.
2021-06-17 16:52:35 +02:00
Christopher Faulet
73001ab6e3 MINOR: resolvers: Remove server from named_servers tree when removing a SRV item
When a server is cleaned up because the corresponding SRV item is removed,
we always remove the server from the srvrq's name_servers tree. For now, it
is useless because, if a server was attached to a SRV item, it means it was
already removed from the tree. But it will be mandatory to fix a bug.
2021-06-17 16:52:35 +02:00
Christopher Faulet
11c6c39656 MINOR: resolvers: Clean server in a dedicated function when removing a SRV item
A dedicated function is now used to clean up servers when a SRV item becomes
obsolete or when a requester is removed from a resolution. This patch is
mandatory to fix a bug.
2021-06-17 16:52:35 +02:00
Christopher Faulet
c7b391aed2 BUG/MEDIUM: server/cli: Fix ABBA deadlock when fqdn is set from the CLI
To perform servers resolution, the resolver's lock is first acquired then
the server's lock when necessary. However, when the fqdn is set via the CLI,
the opposite is performed. So, it is possible to experience an ABBA
deadlock.

To fix this bug, the server's lock is acquired and released for each
subcommand of "set server" with an exception when the fqdn is set. The
resolver's lock is first acquired. Of course, this means we must be sure to
have a resolver to lock.

This patch must be backported as far as 1.8.
2021-06-17 16:52:14 +02:00
Christopher Faulet
a386e78823 BUG/MINOR: server: Forbid to set fqdn on the CLI if SRV resolution is enabled
If a server is configured to rely on a SRV resolution, we must forbid to
change its fqdn on the CLI. Indeed, in this case, the server retrieves its
fqdn from the SRV resolution. If the fqdn is changed via the CLI, this
conflicts with the SRV resolution and leaves the server in an undefined
state. Most of time, the SRV resolution remains enabled with no effect on
the server (no update). Some time the A/AAAA resolution for the new fqdn is
not enabled at all. It depends on the server state and resolver state when
the CLI command is executed.

This patch must be backported as far as 2.0 (maybe to 1.8 too ?) after some
observation period.
2021-06-17 16:17:14 +02:00
Miroslav Zagorac
8a8f270f6a CLEANUP: server: a separate function for initializing the per_thr field
To avoid repeating the same source code, allocating memory and initializing
the per_thr field from the server structure is transferred to a separate
function.
2021-06-17 16:07:10 +02:00
Ilya Shipitsin
213bb99f9e CLEANUP: assorted typo fixes in the code and comments
This is 24th iteration of typo fixes
2021-06-17 09:02:16 +02:00
Willy Tarreau
3a53707160 BUG/MINOR: mux-h2/traces: bring back the lost "sent H2 REQ/RES" traces
In 2.4, commit d1ac2b90c ("MAJOR: htx: Remove the EOM block type and
use HTX_FL_EOM instead") changed the HTX processing to destroy the
blocks as they are processed. So the traces that were emitted at the
end of the send headers functions didn't have anything to show.

Let's move these traces earlier in the function, right before the HTX
processing, so that everything is still in place.

This should be backported to 2.4.
2021-06-17 08:43:43 +02:00
Willy Tarreau
29268e9a3c BUG/MINOR: mux-h2/traces: bring back the lost "rcvd H2 REQ" trace
Since commit 7d013e796 ("BUG/MEDIUM: mux-h2: Xfer rxbuf to the upper
layer when creating a front stream"), the rxbuf is lost during the
call to h2c_frt_stream_new(), so the trace that happens later cannot
find a request there and we've lost the useful part indicating what
the request looked like. Let's move the trace before this call.

This should be backported to 2.4.
2021-06-17 08:43:27 +02:00
Willy Tarreau
ee4684f65b MINOR: mux-h2: obey http-ignore-probes during the preface
We're seeing some browsers setting up multiple connections and closing
some to just keep one. It looks like they do this in case they'd
negotiate H1. This results in aborted prefaces and log pollution about
bad requests and "PR--" in the status flags.

We already have an option to ignore connections with no data, it's called
http-ignore-probes. But it was not used by the H2 mux. However it totally
makes sense to use it during the preface.

This patch changes this so that connections aborted before sending the
preface can avoid being logged.

This should be backported to 2.4 and 2.3 at least, and probably even
as far as 2.0.
2021-06-17 08:08:48 +02:00
Willy Tarreau
fc8e438637 BUG/MINOR: stats: make "show stat typed desc" work again
As part of the changes to support per-module stats data in 2.3-dev6
with commit ee63d4bd6 ("MEDIUM: stats: integrate static proxies stats
in new stats"), a small change resulted in the description field to
be replaced by the name field, making it pointless. Let's fix this
back.

This should fix issue #1291. Thanks to Nick Ramirez for reporting this
issue.

This patch can be backported to 2.3.
2021-06-17 07:25:22 +02:00
Willy Tarreau
9abb317683 CLEANUP: mux-h2/traces: better align user messages
"sent H2 request" was already misaligned with the 3 other ones
(sent/rcvd, request/response), and now with "new H2 connection" that's
yet another alignment making the traces even less legible. Let's just
realign all 5 messages, this even eases quick pointer comparisons. This
should probably be backported to 2.4 as it's where it's the most likely
to be used in the mid-term.
2021-06-16 18:32:42 +02:00
Willy Tarreau
8e6f749f18 MINOR: mux-h2/trace: report a few connection-level info during h2_init()
It is currently very difficult to match some H2 trace outputs against
some log extracts because there's no exactly equivalent info.

This patch tries to address this by adding a TRACE_USER() call in h2_init()
that is matched in h2_trace() to report:
  - connection pointer and direction
  - frontend's name or server's name
  - transport layer and control layer (e.g. "SSL/tcpv4")
  - source and/or destination depending on what is set

This now permits to get something like this at verbosity level complete:

  <0>2021-06-16T18:30:19.810897+02:00 [00|h2|1|mux_h2.c:1006] new H2 connection : h2c=0x19fee50(F,PRF) : conn=0x7f373c026850(IN) fe=h2gw RAW/tcpv4 src=127.0.0.1:19540
  <0>2021-06-16T18:30:19.810919+02:00 [00|h2|1|mux_h2.c:2731] rcvd H2 request  : h2c=0x19fee50(F,FRH)
  <0>2021-06-16T18:30:19.810998+02:00 [00|h2|1|mux_h2.c:1006] new H2 connection : h2c=0x1a04ee0(B,PRF) : conn=0x1a04ce0(OUT) sv=h2gw/s1 RAW/tcpv4 dst=127.0.0.1:4446
2021-06-16 18:30:42 +02:00
Willy Tarreau
d943a044aa MINOR: connection: add helper conn_append_debug_info()
This function appends to a buffer some information from a connection.
This will be used by traces and possibly some debugging as well. A
frontend/backend/server, transport/control layers, source/destination
ip:port, connection pointer and direction are reported depending on
the available information.
2021-06-16 18:30:42 +02:00
Willy Tarreau
b74debd826 BUG/MINOR: mux-h1: do not skip the error response on bad requests
Since 2.4-dev3 with commit c4bfa59f1 ("MAJOR: mux-h1: Create the client
stream as later as possible"), a request error doesn't result in any
error response if "option http-ignore-probes" is set, there's just a
close. This is caused by an unneeded b_reset() in h1_process_demux()'s
error path, which makes h1_handle_bad_req() believe there was an empty
request. There is no reason for this reset to be there, it must have
been a leftover of an earlier attempt at dealing with the error, let's
drop it.

This should be backported to 2.4.
2021-06-16 15:06:43 +02:00
Willy Tarreau
f9a7c442f6 MINOR: backend: only skip LB when there are actual connections
In 2.3, a significant improvement was brought against situations where
the queue was heavily used, because some LB algos were still checked
for no reason before deciding to put the request into the queue. This
was commit 82cd5c13a ("OPTIM: backend: skip LB when we know the backend
is full").

As seen in previous commit ("BUG/MAJOR: queue: set SF_ASSIGNED when
setting strm->target on dequeue") the dequeuing code is extremely
tricky, and the optimization above tends to emphasize transient issues
by making them permanent until the next reload, which is not acceptable
as the code must always be robust against any bad situation.

This commit brings a protection against such a situation by slightly
relaxing the test. Instead of checking that there are pending connections
in the backend queue, it also verifies that the backend's connections are
not solely composed of queued connections, which would then indicate we
are in this situation. This is not rocket science, but at least if the
situation happens, we know that it will unlock by itself once the streams
have left, as new requests will be allowed to reach the servers and to
flush the queue again.

This needs to be backported to 2.4 and 2.3.
2021-06-16 09:05:35 +02:00
Willy Tarreau
7867cebf31 BUG/MAJOR: queue: set SF_ASSIGNED when setting strm->target on dequeue
Commit 82cd5c13a ("OPTIM: backend: skip LB when we know the backend is
full") has uncovered a long-burried bug in the dequeing code: when a
server releases a connection, it picks a new one from the proxy's or
its queue. Technically speaking it only picks a pendconn which is a
link between a position in the queue and a stream. It then sets this
pendconn's target to itself, and wakes up the stream's task so that
it can try to connect again.

The stream then goes through the regular connection setup phases,
calls back_try_conn_req() which calls pendconn_dequeue(), which
sets the stream's target to the pendconn's and releases the pendconn.
It then reaches assign_server() which sees no SF_ASSIGNED and calls
assign_server_and_queue() to perform load balancing or queuing. This
one first destroys the stream's target and gets ready to perform load
balancing. At this point we're load-balancing for no reason since we
already knew what server was available. And this is where the commit
above comes into play: the check for the backend's queue above may
detect other connections that arrived in between, and will immediately
return FULL, forcing this request back into the queue. If the server
had a very low maxconn (e.g. 1 due to a long slowstart), it's possible
that this evicted connection was the last one on the server and that
no other one will ever be present to process the queue. Usually a
regularly processed request will still have its own srv_conn that will
be used during stream_free() to dequeue other connections. But if the
server had a down-up cycle, then a call to pendconn_grab_from_px()
may start to dequeue entries which had no srv_conn and which will have
no server slot to offer when they expire, thus maintaining the situation
above forever. Worse, as new requests arrive, there are always some
requests in the queue and the situation feeds on itself.

The correct fix here is to properly set SF_ASSIGNED in pendconn_dequeue()
when the stream's target is assigned (as it's what this flag means), so
as to avoid a load-balancing pass when dequeuing.

Many thanks to Pierre Cheynier for the numerous detailed traces he
provided that helped narrow this problem down.

This could be backported to all stable versions, but in practice only
2.3 and above are really affected since the presence of the commit
above. Given how tricky this code is it's better to limit it to those
versions that really need it.
2021-06-16 09:05:35 +02:00
Willy Tarreau
6fd0450b47 CLEANUP: shctx: remove the different inter-process locking techniques
With a single process, we don't need to USE_PRIVATE_CACHE, USE_FUTEX
nor USE_PTHREAD_PSHARED anymore. Let's only keep the basic spinlock
to lock between threads.
2021-06-15 16:52:42 +02:00
Willy Tarreau
b54ca70e7c MEDIUM: config: warn about "bind-process" deprecation
Let's indicate that "bind-process" is deprecated and scheduled for
removal in 2.7, as it only supports "1".
2021-06-15 16:52:42 +02:00
Willy Tarreau
e8422bf56b MEDIUM: global: remove the relative_pid from global and mworker
The relative_pid is always 1. In mworker mode we also have a
child->relative_pid which is always equalt relative_pid, except for a
master (0) or external process (-1), but these types are usually tested
for, except for one place that was amended to carefully check for the
PROC_O_TYPE_WORKER option.

Changes were pretty limited as most usages of relative_pid were for
designating a process in stats output and peers protocol.
2021-06-15 16:52:42 +02:00
Willy Tarreau
06987f4238 CLEANUP: global: remove unused definition of MAX_PROCS
This one was forced to 1 and the only reference was a test to verify it
was comprised between 1 and LONGBITS.
2021-06-15 16:52:42 +02:00
Willy Tarreau
44ea631b77 MEDIUM: cpu-set: make the proc a single bit field and not an array
We only have a single process now so we don't need to store the per-proc
CPU binding anymore.
2021-06-15 16:52:42 +02:00
Willy Tarreau
bda7c1decd MEDIUM: config: simplify cpu-map handling
As there's no more nbproc>1, we can remove some loops and tests in cpu-map.
Both the lack of thread number and thread 1 can count as the whole process
now (which is still used for whole process binding when threads are disabled).
2021-06-15 16:52:42 +02:00
Willy Tarreau
72faef3866 MEDIUM: global: remove dead code from nbproc/bind_proc removal
Lots of places iterating over nbproc or comparing with nbproc could be
simplified. Further, "bind-process" and "process" parsing that was
already limited to process 1 or "all" or "odd" resulted in a bind_proc
field that was either 0 or 1 during the init phase and later always 1.

All the checks for compatibilities were removed since it's not possible
anymore to run a frontend and a backend on different processes or to
have peers and stick-tables bound on different ones. This is the largest
part of this patch.

The bind_proc field was removed from both the proxy and the receiver
structs.

Since the "process" and "bind-process" directives are still parsed,
configs making use of correct values allowing process 1 will continue
to work.
2021-06-15 16:52:42 +02:00
Willy Tarreau
5301f5d72a CLEANUP: global: remove pid_bit and all_proc_mask
They were already set to 1 and never changed. Let's remove them and
replace their references with 1.
2021-06-15 16:52:42 +02:00
Willy Tarreau
91358595f8 CLEANUP: global: remove the nbproc field from the global structure
Let's use 1 in the rare places where it was still referenced since it's
now its only possible value.
2021-06-15 16:52:42 +02:00
Willy Tarreau
6185a0343b MINOR: mworker: remove the initialization loop over processes
There was a loop used to prepare structures for all current processes.
Let's just assume there's a single iteration now.
2021-06-15 16:52:42 +02:00
Willy Tarreau
d67ff340a5 MEDIUM: init: remove the loop over processes during init
There was a loop iterating over all nbproc values during init that
couldn't be immediately removed because the loop's index was used
to distinguish a child from a parent. That's now fixed by replacing
the iterator with an in_parent flag. All bindings that were checking
(1UL << proc) or cpu_map.proc[proc] were adjusted to always use zero
for proc.
2021-06-15 16:52:42 +02:00
Willy Tarreau
e34cf28011 BUG/MINOR: mworker: fix typo in chroot error message
Since its introduction in 1.8 with commit 095ba4c24 ("MEDIUM: mworker:
replace systemd mode by master worker mode"), it says "cannot chroot1(...)"
which seems to be a leftover of a debug message. It could be backported but
probably nobody will notice.
2021-06-15 16:52:07 +02:00
Willy Tarreau
4c19e99621 BUG/MINOR: ssl: use atomic ops to update global shctx stats
The global shctx lookups and misses was updated without using atomic
ops, so the stats available in "show info" are very likely off by a few
units over time. This should be backported as far as 1.8. Versions
without _HA_ATOMIC_INC() can use HA_ATOMIC_ADD(,1).
2021-06-15 16:52:07 +02:00
Willy Tarreau
9e467af804 BUG/MEDIUM: shctx: use at least thread-based locking on USE_PRIVATE_CACHE
Since threads were introduced in 1.8, the USE_PRIVATE_CACHE mode of the
shctx was not updated to use locks. Originally it was meant to disable
sharing between processes, so it removes the lock/unlock instructions.
But with threads enabled, it's not possible to work like this anymore.

It's easy to see that once built with private cache and threads enabled,
sending violent SSL traffic to the the process instantly makes it die.
The HTTP cache is very likely affected as well.

This patch addresses this by falling back to our native spinlocks when
USE_PRIVATE_CACHE is used. In practice we could use them also for other
modes and remove all older implementations, but this patch aims at keeping
the changes very low and easy to backport. A new SHCTX_LOCK label was
added to help with debugging, but OTHER_LOCK might be usable as well
for backports.

An even lighter approach for backports may consist in always declaring
the lock (or reusing "waiters"), and calling pl_take_s() for the lock()
and pl_drop_s() for the unlock() operation. This could even be used in
all modes (process and threads), even when thread support is disabled.

Subsequent patches will further clean up this area.

This patch must be backported to all supported versions since 1.8.
2021-06-15 16:52:07 +02:00
Amaury Denoyelle
8ff0434b61 BUG/MEDIUM: server: do not auto insert a dynamic server in px addr_node
Until then, the servers were automatically attached on their creation
into the proxy addr_node tree via _srv_parse_init. In case of an invalid
dynamic server which is instantly freed, no detach operation was made
leaving a NULL server in the tree.

Change this mode of operation by marking the attach operation as
optional in _srv_parse_init. This operation is not conduct for a dynamic
server. The server is attached only at the end of the CLI handler when
it is marked as valid.

This must be backported up to 2.4.
2021-06-15 11:42:53 +02:00
Amaury Denoyelle
1613b4a75d BUG/MINOR: server: do not keep an invalid dynamic server in px ids tree
A bug is present when trying to create a dynamic server with a fixed id.
If the server is detected invalid due to a later parsing arguments
error, the server is not removed from the proxy used ids tree before
being freed.

Change the mode of operation of 'id' keyword parsing handler. The
insertion in the backend tree is removed from the handler and is not
taken in charge by parse_server for configuration parsing. For the
dynamic servers, the insertion is called at the end of the 'add server'
CLI handler when the server has been validated.

This must be backported up to 2.4.
2021-06-15 11:42:53 +02:00
Amaury Denoyelle
406aaef55a BUG/MEDIUM: server: do not forget to generate the dynamic servers ids
If no id is specified by the user for a dynamic server, it is necessary
to generate a new one. This operation is now done at the end of 'add
server' CLI handler. The server is then inserted into the proxy ids
tree.

Without this, several features may be broken for dynamic servers. Among
them, there is the "first" lb algorithm, the persistence using
stick-tables or the uniqueness internal check of srv_parse_id.

This must be backported up to 2.4.
2021-06-15 11:42:53 +02:00
Amaury Denoyelle
82d7f77463 BUG/MEDIUM: server: clear dynamic srv on delete from proxy id/name trees
Do not leave deleted server in used_server_id/used_server_addr backend
trees. This might lead to crashes if a deleted server is used through
these trees.

At this moment, dynamic servers are only added in used_server_id if they
have a fixed id. They are never inserted in used_server_addr as this
code is missing. So these new delete instructions are noop. However, a
fix will be provided soon to insert properly all dynamic servers in both
used_server_id and used_server_addr trees so the deletion counterpart
will be mandatory in the CLI server delete handler.

This must be backported to 2.4.
2021-06-15 11:38:06 +02:00
Amaury Denoyelle
31ddd76fef BUG/MEDIUM: server: extend thread-isolate over much of CLI 'add server'
Some config parsing handlers were designed to be run at startup on a
single-thread. When executing at runtime for dynamic servers,
thread-safety is not guaranteed. This is the case for example in
srv_parse_id which manipulates backend used_ids tree.

One solution could be to add locks but it might be tricky to found all
affected functions and it can be an easy source of deadlock. The other
solution which has been chosen is to use thread-isolation over almost
all of the cli_parse_add_server CLI handler.

For now this solution is sufficient. If some users make heavy use of the
'add server', hurting the overall performance, it will be necessary to
design a much thinner solution.

This must be backported up to 2.4.
2021-06-15 11:19:43 +02:00
Amaury Denoyelle
077c6b8d29 BUG/MINOR: stick-table: insert srv in used_name tree even with fixed id
If the server id is fixed in the configuration, it is immediately
inserted in the 'used_server_id' backend tree via srv_parse_id. On
check_config_validity, the dynamic id generation is thus skipped for
fixed-id servers. However, it must nevertheless be inserted in the
'used_server_name' backend tree.

This bug seems to be not noticeable for the user. Indeed, before the
fix, the search in sticking_rule_find_target always returned NULL for
the name, then the fallback search with server id succeeded, so the
persistence is properly applied. However with the fix the fallback
search is not executed anymore, which saves from the locking of
STK_SESS.

This should be backported up to 2.0.
2021-06-15 10:50:02 +02:00
Remi Tricot-Le Breton
6916493c29 MINOR: ssl: Use OpenSSL's ASN1_TIME convertor when available
The ASN1_TIME_to_tm function was added in OpenSSL1.1.1 so with this
version of the library we do not need our homemade time convertor
anymore.
2021-06-14 15:12:53 +02:00
Emeric Brun
caef19e0c7 BUG/MAJOR: resolvers: segfault using server template without SRV RECORDs
This patch fix the issue adding a test in srvrq before registering
the server on it during server template init.

This was a regression due to commit :
3406766d57

This should be backported with this previous commit (until 2.0)
2021-06-14 11:04:02 +02:00
Willy Tarreau
2a651e2d0d BUILD: log: remove unused fmt_directive()
fmt_directive() became unused after the removal of the deprecated
tags, and it emits a warning on some compilers. Let's drop it.
2021-06-11 17:32:03 +02:00
Willy Tarreau
3ae1d1eab9 BUILD: init: remove initialization of multi-process thread mappings
This broke the build with recent compilers and is not used anyway.
2021-06-11 17:28:19 +02:00
Willy Tarreau
b63dbb7b2e MAJOR: config: remove parsing of the global "nbproc" directive
This one was deprecated in 2.3 and marked for removal in 2.5. It suffers
too many limitations compared to threads, and prevents some improvements
from being engaged. Instead of a bypassable startup error, there is now
a hard error.

The parsing code was removed, and very few obvious cases were as well.
The code is deeply rooted at certain places (e.g. "for" loops iterating
from 0 to nbproc) so it will not be that trivial to remove everywhere.
The "bind" and "bind-process" parsers will have to be adjusted, though
maybe not completely changed if we later want to support thread groups
for large NUMA machines. Some stats socket restrictions were removed,
and the doc was updated according to what was done. A few places in the
doc still refer to nbproc and will have to be revisited. The master-worker
code also refers to the process number to distinguish between master and
workers and will have to be carefully adjusted. The MAX_PROCS macro was
reset to 1, this will at least reduce the size of some remaining arrays.

Two regtests were dependieng on this directive, one with an explicit
"nbproc 1" and another one testing the master's CLI using nbproc 4.
Both were adapted.
2021-06-11 17:02:13 +02:00
Willy Tarreau
eb778248d9 MEDIUM: proxy: remove the deprecated "grace" keyword
Commit ab0a5192a ("MEDIUM: config: mark "grace" as deprecated") marked
the "grace" keyword as deprecated in 2.3, tentative removal for 2.4
with a hard deadline in 2.5, so let's remove it and return an error now.
This old and outdated feature was incompatible with soft-stop, reload
and socket transfers, and keeping it forced ugly hacks in the lower
layers of the protocol stack.
2021-06-11 16:57:34 +02:00
Willy Tarreau
d2f2537d1b MINOR: config: remove deprecated option "http-tunnel"
It was marked as deprecated in 2.1-dev2 and for removal in 2.2, but it
was missed. A warning was already emitted and the doc didn't refer to
it any more, let's now get rid of it.
2021-06-11 16:57:34 +02:00
Willy Tarreau
6ba69841f8 MINOR: config: reject long-deprecated "option forceclose"
It's been warning as being deprecated since 2.0-dev4, it's about time
to drop it now. The error message recommends to either remove it or
use "option httpclose" instead. It's still referred to in the old
internal doc about the connection header, which itself seems highly
inaccurate by now.
2021-06-11 16:57:34 +02:00
Willy Tarreau
4a83977283 MINOR: http: remove the long deprecated "set-cookie()" sample fetch function
This one was marked as deprecated 9 years ago by commit 28376d62c
("MEDIUM: http: merge ACL and pattern cookie fetches into a single one")
and has disappeared from any documentation, so it never appeared in any
released version. Let's remove it now.
2021-06-11 16:57:34 +02:00
Willy Tarreau
fd6ab66041 MINOR: log: remove the long-deprecated early log-format tags
The following 10 log-format tags were implemented during log-format
development and changed before the release. They were marked as deprecated
in 2012 by commit 2beef5888 ("MEDIUM: log: change a few log tokens to make
them easier to remember") and were not documented. They've been emitting a
warning since then, with a suggestion of the one to use instead. Let's get
rid of them now.

      Bi => bi, Bp => bp, Ci => ci, Cp => cp, Fi => fi
      Fp => fp, Si => si, Sp => sp, cc => CC, cs => CS
2021-06-11 16:57:34 +02:00
Willy Tarreau
9862787e8f MINOR: config: completely remove support for "no option http-use-htx"
This one used to still be supported, emitting a warning about it being
deprecated and the default since 2.1. Let's remove it now.
2021-06-11 16:57:34 +02:00
Willy Tarreau
eb9d90a5a2 MINOR: config: remove support for deprecated option "tune.chksize"
It was marked as deprecated for immediate removal as it was not used,
let's reject it and remove it from the doc. A specific error suggests
to check tune.bufsize instead.
2021-06-11 16:57:34 +02:00
Christopher Faulet
85af93b8c7 BUG/MINOR: server-state: load SRV resolution only if params match the config
When the state of a server is loaded, if there is no hostname defined for
this server and if a fqdn and a server record are retrieved from the state
file, it means the server should rely on a SRV resolution. But we must be
sure the server is configured this way. A SRV resolution must be configured
with the same SRV record. This part must be skipped if there is no SRV
resolution configured for this server or if the SRV record used is not the
same.

This patch should be backported as far as 1.8 after some observation period.
2021-06-11 16:16:20 +02:00
Emeric Brun
3406766d57 MEDIUM: resolvers: add a ref between servers and srv request or used SRV record
This patch add a ref into servers to register them onto the
record answer item used to set their hostnames.

It also adds a head list into 'srvrq' to register servers free
to be affected to a SRV record.

A head of a tree is also added to srvrq to put servers which
present a hotname in server state file. To re-link them fastly
to the matching record as soon an item present the same name.

This results in better performances on SRV record response
parsing.

This is an optimization but it could avoid to trigger the haproxy's
internal wathdog in some circumstances. And for this reason
it should be backported as far we can (2.0 ?)
2021-06-11 16:16:16 +02:00
Emeric Brun
bd78c912fd MEDIUM: resolvers: add a ref on server to the used A/AAAA answer item
This patch adds a head list into answer items on servers which use
this record to set their IPs. It makes lookup on duplicated ip faster and
allow to check immediatly if an item is still valid renewing the IP.

This results in better performances on A/AAAA resolutions.

This is an optimization but it could avoid to trigger the haproxy's
internal wathdog in some circumstances. And for this reason
it should be backported as far we can (2.0 ?)
2021-06-11 16:16:16 +02:00
Emeric Brun
12ca658dbe BUG/MINOR: resolvers: answser item list was randomly purged or errors
In case of SRV records, The answer item list was purged by the
error callback of the first requester which considers the error
could not be safely ignored. It makes this item list unavailable
for subsequent requesters even if they consider the error
could be ignored.

On A resolution or do_resolve action error, the answer items were
never trashed.

This patch re-work the error callbacks and the code to check the return code
If a callback return 1, we consider the error was ignored and
the answer item list must be kept. At the opposite, If all error callbacks
of all requesters of the same resolution returns 0 the list will be purged

This patch should be backported as far as 2.0.
2021-06-11 16:16:16 +02:00
Christopher Faulet
0fe1864f7d CLEANUP: l7-retries: do not test the buffer before calling b_alloc()
The return value is enough now to know if the allocation succeeded or
failed.

This cleanup was already pushed by Willy (f499f50) but a revert crushed
it. It may be backported to the 2.4 because the original patch was done on
this version.
2021-06-11 16:04:28 +02:00
Christopher Faulet
bf76df12a6 BUG/MINOR: h1-htx: Fix a signess bug with char data type when parsing chunk size
On some platform, a char may be unsigned. Of course, we should not rely on
the signess of a char to be portable. Unfortunatly, since the commit
a835f3cb ("MINOR: h1-htx: Use a correlation table to speed-up small chunks
parsing") we rely on it to test the value retrieved from the hexadecimal
correlation table when the size of a chunk is parsed.

To fix the bug, we now test the result is in the range [0,15] with a bitwise
AND.

This patch should fix the issue #1272. It is 2.5-specific, no backport is
needed except if the commit above is backported.
2021-06-11 14:15:48 +02:00
Christopher Faulet
5cd0e528cf BUG/MINOR: mux-fcgi: Expose SERVER_SOFTWARE parameter by default
As specified in the RFC3875 (section 4.1.17), this parameter must be set to
the name and version of the information server software making the CGI
request. Thus, it is now added to the default parameters defined by
HAProxy. It is set to the string "HAProxy $version".

This patch should fix the issue #1285 and must be backported as far as 2.2.
2021-06-11 14:15:48 +02:00
Christopher Faulet
1cf414b522 BUG/MAJOR: htx: Fix htx_defrag() when an HTX block is expanded
When an HTX block is expanded, a defragmentation may be performed first to
have enough space to copy the new data. When it happens, the meta data of
the HTX message must take account of the new data length but copied data are
still unchanged at this stage (because we need more space to update the
message content). And here there is a bug because the meta data are updated
by the caller. It means that when the blocks content is copied, the new
length is already set. Thus a block larger than the reality is copied and
data outside the buffer may be accessed, leading to a crash.

To fix this bug, htx_defrag() is updated to use an extra argument with the
new meta data to use for the referenced block. Thus the caller does not need
to update the HTX message by itself. However, it still have to update the
data.

Most of time, the bug will be encountered in the HTTP compression
filter. But, even if it is highly unlikely, in theory it is also possible to
hit it when a HTTP header (or only its value) is replaced or when the
start-line is changed.

This patch must be backported as far as 2.0.
2021-06-11 14:05:34 +02:00
Remi Tricot-Le Breton
3faf0cbba6 BUILD: ssl: Fix compilation with BoringSSL
The ifdefs surrounding the "show ssl ocsp-response" functionality that
were supposed to disable the code with BoringSSL were built the wrong
way.

It does not need to be backported.
2021-06-10 19:01:13 +02:00
Willy Tarreau
8715dec6f9 MEDIUM: pools: remove the locked pools implementation
Now that the modified lockless variant does not need a DWCAS anymore,
there's no reason to keep the much slower locked version, so let's
just get rid of it.
2021-06-10 17:46:50 +02:00
Willy Tarreau
2a4523f6f4 BUG/MAJOR: pools: fix possible race with free() in the lockless variant
In GH issue #1275, Fabiano Nunes Parente provided a nicely detailed
report showing reproducible crashes under musl. Musl is one of the libs
coming with a simple allocator for which we prefer to keep the shared
cache. On x86 we have a DWCAS so the lockless implementation is enabled
for such libraries.

And this implementation has had a small race since day one: the allocator
will need to read the first object's <next> pointer to place it into the
free list's head. If another thread picks the same element and immediately
releases it, while both the local and the shared pools are too crowded, it
will be freed to the OS. If the libc's allocator immediately releases it,
the memory area is unmapped and we can have a crash while trying to read
that pointer. However there is no problem as long as the item remains
mapped in memory because whatever value found there will not be placed
into the head since the counter will have changed.

The probability for this to happen is extremely low, but as analyzed by
Fabiano, it increases with the buffer size. On 16 threads it's relatively
easy to reproduce with 2MB buffers above 200k req/s, where it should
happen within the first 20 seconds of traffic usually.

This is a structural issue for which there are two non-trivial solutions:
  - place a read lock in the alloc call and a barrier made of lock/unlock
    in the free() call to force to serialize operations; this will have
    a big performance impact since free() is already one of the contention
    points;

  - change the allocator to use a self-locked head, similar to what is
    done in the MT_LISTS. This requires two memory writes to the head
    instead of a single one, thus the overhead is exactly one memory
    write during alloc and one during free;

This patch implements the second option. A new POOL_DUMMY pointer was
defined for the locked pointer value, allowing to both read and lock it
with a single xchg call. The code was carefully optimized so that the
locked period remains the shortest possible and that bus writes are
avoided as much as possible whenever the lock is held.

Tests show that while a bit slower than the original lockless
implementation on large buffers (2MB), it's 2.6 times faster than both
the no-cache and the locked implementation on such large buffers, and
remains as fast or faster than the all implementations when buffers are
48k or higher. Tests were also run on arm64 with similar results.

Note that this code is not used on modern libcs featuring a fast allocator.

A nice benefit of this change is that since it removes a dependency on
the DWCAS, it will be possible to remove the locked implementation and
replace it with this one, that is then usable on all systems, thus
significantly increasing their performance with large buffers.

Given that lockless pools were introduced in 1.9 (not supported anymore),
this patch will have to be backported as far as 2.0. The code changed
several times in this area and is subject to many ifdefs which will
complicate the backport. What is important is to remove all the DWCAS
code from the shared cache alloc/free lockless code and replace it with
this one. The pool_flush() code is basically the same code as the
allocator, retrieving the whole list at once. If in doubt regarding what
barriers to use in older versions, it's safe to use the generic ones.

This patch depends on the following previous commits:

 - MINOR: pools: do not maintain the lock during pool_flush()
 - MINOR: pools: call malloc_trim() under thread isolation
 - MEDIUM: pools: use a single pool_gc() function for locked and lockless

The last one also removes one occurrence of an unneeded DWCAS in the
code that was incompatible with this fix. The removal of the now unused
seq field will happen in a future patch.

Many thanks to Fabiano for his detailed report, and to Olivier for
his help on this issue.
2021-06-10 17:46:50 +02:00
Willy Tarreau
9b3ed51371 MEDIUM: pools: use a single pool_gc() function for locked and lockless
Locked and lockless shared pools don't need to use a different pool_gc()
function because this function isolates itself during the operation, so
we do not need to rely on DWCAS nor any atomic operation in fact. Let's
just get rid of the lockless one in favor of the simple one. This should
even result in a faster execution.

The ifdefs were slightly moved so that we can have pool_gc() defined
as soon as there are global pools, this avoids duplicating the function.
2021-06-10 17:46:50 +02:00
Willy Tarreau
26ed183556 MINOR: pools: call malloc_trim() under thread isolation
pool_gc() was adjusted to run under thread isolation by commit c0e2ff202
("MEDIUM: memory: make pool_gc() run under thread isolation") so that the
underlying malloc() and free() don't compete between threads during these
potentially aggressive moments (especially when mmap/munmap are involved).

Commit 88366c292 ("MEDIUM: pools: call malloc_trim() from pool_gc()")
later added a call to malloc_trim() but made it outside of the thread
isolation, which is contrary to the principle explained above. Also it
missed it in the locked version, meaning that those without a lockless
implementation cannot benefit from trimming.

This patch fixes that by calling it before thread_release() in both
places.
2021-06-10 17:46:50 +02:00
Willy Tarreau
c88914379d MINOR: pools: do not maintain the lock during pool_flush()
The locked version of pool_flush() is absurd, it locks the pool for each
and every element to be released till the end. Not only this is extremely
inefficient, but it may even never finish if other threads spend their
time refilling the pool. The only case where this can happen is during
soft-stop so the risk remains limited, but it should be addressed.
2021-06-10 17:46:50 +02:00
Willy Tarreau
9a7aa3b4a1 BUG/MINOR: pools: make DEBUG_UAF always write to the to-be-freed location
Since the code was reorganized, DEBUG_UAF was still tested in the locked
pool code despite pools being disabled when DEBUG_UAF is used. Let's move
the test to pool_put_to_os() which is the one that is always called in
this condition.

The impact is only a possible misleading analysis during a troubleshooting
session due to a missing double-frees or free of const area test that is
normally already dealt with by the underlying code anyway. In practice it's
unlikely anyone will ever notice.

This should only be backported to 2.4.
2021-06-10 17:46:50 +02:00
Willy Tarreau
c239cde26f BUG/MINOR: pools: fix a possible memory leak in the lockless pool_flush()
The lockless version of pool_flush() had a leftover of the original
version causing the pool's first entry to be set to NULL at the end.
The problem is that it does this outside of any lock and in a non-
atomic way, so that any concurrent alloc+free would result in a lost
object.

The risk is low and the consequence even lower, given that pool_flush()
is only used in pool_destroy() (hence single-threaded) or by stream_free()
during a soft-stop (not the place where most allocations happen), so in
the worst case it could result in valgrind complaining on soft-stop.

The bug was introduced with the first version of the code, in 1.9, so
the fix can be backported to all stable versions.
2021-06-10 17:46:50 +02:00
Amaury Denoyelle
efbf35caf9 BUG/MINOR: server: explicitly set "none" init-addr for dynamic servers
Define srv.init_addr_methods to SRV_IADDR_NONE on 'add server' CLI
handler. This explicitly states that no resolution will be made on the
server creation.

This is not a real bug as the default value (SRV_IADDR_END) has the same
effect in practice. However the intent is clearer and prevent to use the
default "libc,last" by mistake which cannot execute on runtime (blocking
call + file access via gethostbyname/getaddrinfo).

The doc is also updated to reflect this limitation.

This should be backported up to 2.4.
2021-06-10 17:44:05 +02:00
Remi Tricot-Le Breton
6056e61ae2 MINOR: ssl: Add the "show ssl cert foo.pem.ocsp" CLI command
Add the ability to dump an OCSP response details through a call to "show
ssl cert cert.pem.ocsp". It can also be used on an ongoing transaction
by prefixing the certificate name with a '*'.
Even if the ckch structure holds an ocsp_response buffer, we still need
to look for the actual ocsp response entry in the ocsp response tree
rather than just dumping the ckch's buffer details because when updating
an ocsp response through a "set ssl ocsp-response" call, the
corresponding buffer in the ckch is not updated accordingly. So this
buffer, even if it is not empty, might hold an outdated ocsp response.
2021-06-10 16:44:11 +02:00
Remi Tricot-Le Breton
da968f69c7 MINOR: ssl: Add the OCSP entry key when displaying the details of a certificate
This patch adds an "OCSP Response Key" information in the output of a
"show ssl cert <certfile>" call. The key can then be used in a "show ssl
ocsp-response <key>" CLI command.
2021-06-10 16:44:11 +02:00
Remi Tricot-Le Breton
d92fd11c77 MINOR: ssl: Add new "show ssl ocsp-response" CLI command
This patch adds the "show ssl ocsp-response [<id>]" CLI command. This
command can be used to display the IDs of the OCSP tree entries along
with details about the entries' certificate ID (issuer's name and key
hash + serial number), or to display the details of a single
ocsp-response if an ID is given. The details displayed in this latter
case are the ones shown by a "openssl ocsp -respin <ocsp-response>
-text" call.
2021-06-10 16:44:11 +02:00
Remi Tricot-Le Breton
5aa1dce5ee MINOR: ssl: Keep the actual key length in the certificate_ocsp structure
The OCSP tree entry key is a serialized version of the OCSP_CERTID of
the entry which is stored in a buffer that can be at most 128 bytes.
Depending on the length of the serial number, the actual non-zero part
of the key can be smaller than 128 bytes and this new structure member
allows to know how many of the bytes are filled. It will be useful when
dumping the key (in a "show ssl cert <cert>" output for instance).
2021-06-10 16:44:11 +02:00
Christopher Faulet
12554d00f6 BUG/MEDIUM: compression: Add a flag to know the filter is still processing data
Since the commit acfd71b97 ("BUG/MINOR: http-comp: Preserve
HTTP_MSGF_COMPRESSIONG flag on the response"), there is no more flag to know
when the compression ends. This means it is possible to finish the
compression several time if there are trailers.

So, we reintroduce almost the same mechanism but with a dedicated flag. So
now, there is a bits field in the compression filter context.

The commit above is marked to be backported as far as 2.0. Thus this patch
must also be backported as far as 2.0.
2021-06-10 08:57:55 +02:00
Christopher Faulet
402740c3ad BUG/MEDIUM: compression: Properly get the next block to iterate on payload
When a DATA block is compressed, or when the compression context is finished
on a TLR/EOT block, the next block used to loop on the HTX message must be
refreshed because a defragmentation may have occurred.

This bug was introduced when the EOM block was removed in 2.4. Thus, this
patch must be backported to 2.4.
2021-06-10 08:57:55 +02:00
Christopher Faulet
86ca0e52f7 BUG/MEDIUM: compression: Fix loop skipping unused blocks to get the next block
In comp_http_payload(), the loop skipping unused blocks is buggy and may
lead to a infinite loop if the first next block is unused. Indeed instead of
iterating on blocks, we always retrieve the same one because <blk> is used
instead of <next> to get the next block.

This bug was introduced when the EOM block was removed in 2.4. Thus, this
patch must be backported to 2.4.
2021-06-10 08:57:55 +02:00
Remi Tricot-Le Breton
a3a0cce8ee BUG/MINOR: ssl: OCSP stapling does not work if expire too far in the future
The wey the "Next Update" field of the OCSP response is converted into a
timestamp relies on the use of signed integers for the year and month so
if the calculated timestamp happens to overflow INT_MAX, it ends up
being seen as negative and the OCSP response being dwignored in
ssl_sock_ocsp_stapling_cbk (because of the "ocsp->expire < now.tv_sec"
test).

It could be backported to all stable branches.
2021-06-09 17:49:00 +02:00
William Lallemand
722180aca8 BUILD: make tune.ssl.keylog available again
Since commit 04a5a44 ("BUILD: ssl: use HAVE_OPENSSL_KEYLOG instead of
OpenSSL versions") the "tune.ssl.keylog" feature is broken because
HAVE_OPENSSL_KEYLOG does not exist.

Replace this by a HAVE_SSL_KEYLOG which is defined in openssl-compat.h.
Also add an error when not built with the right openssl version.

Must be backported as far as 2.3.
2021-06-09 17:10:13 +02:00
Amaury Denoyelle
846830e47d BUG: errors: remove printf positional args for user messages context
Change the algorithm for the generation of the user messages context
prefix. Remove the dubious API relying on optional printf positional
arguments. This may be non portable, and in fact the CI glibc crashes
with the following error when some arguments are not present in the
format string :

"invalid %N$ use detected".

Now, a fixed buffer attached to the context instance is allocated once
for the program lifetime. Then call repeatedly snprintf with the
optional arguments of context if present to build the context string.
The buffer is deallocated via a per-thread free handler.

This does not need to be backported.
2021-06-08 11:40:44 +02:00
Maximilian Mader
fc0cceb08a MINOR: haproxy: Add -cc argument
This patch adds the `-cc` (check condition) argument to evaluate conditions on
startup and return the result as the exit code.

As an example this can be used to easily check HAProxy's version in scripts:

    haproxy -cc 'version_atleast(2.4)'

This resolves GitHub issue #1246.

Co-authored-by: Tim Duesterhus <tim@bastelstu.be>
2021-06-08 11:17:19 +02:00
Maximilian Mader
29c6cd7d8a CLEANUP: tools: Make errptr const in parse_line()
This change is for consistency with `cfg_eval_condition()`.
2021-06-08 10:56:10 +02:00
Tim Duesterhus
b3168b34a9 CLEANUP: cfgparse: Remove duplication of MAX_LINE_ARGS + 1
We can calculate the number of possible arguments based off the size of the
`args` array. We should do so to prevent the two values from getting out of
sync.
2021-06-08 10:54:30 +02:00
Amaury Denoyelle
5e560e80c7 MINOR: server: use ha_alert in server parsing functions
Replace memprintf usage in _srv_parse* functions by ha_alert calls. This
has the advantage to simplify the function prototype by removing an
extra char** argument.

As a consequence, the CLI handler of 'add server' is updated to output
the user messages buffers if not empty.
2021-06-07 17:19:33 +02:00
Amaury Denoyelle
9d0138ab08 MINOR: server: use parsing ctx for server init addr
Initialize the parsing context in srv_init_addr. This function is called
after configuration check.

This will standardize the stderr output on startup with the parse_server
function.
2021-06-07 17:19:30 +02:00
Amaury Denoyelle
e74cbc3227 REORG: config: use parsing ctx for server config check
Initialize the parsing context when checking server config validity.
Adjust the log messages to remove redundant config file/line and server
name. Do a similar cleaning in prepare_srv from ssl_sock as this
function is called at the same stage.

This will standardize the stderr output on startup with the parse_server
function.
2021-06-07 17:19:27 +02:00
Amaury Denoyelle
0fc136ce5b REORG: server: use parsing ctx for server parsing
Use the parsing context in parse_server. Remove redundant manual
format-string specifying the current file/line/server parsed.
2021-06-07 17:19:24 +02:00
Amaury Denoyelle
d0b237c713 MINOR: log: define server user message format
Define the format for user messages related to a server instance. It
contains the names of the backend and the server itself.
2021-06-07 17:19:23 +02:00
Amaury Denoyelle
111243003e MINOR: errors: specify prefix "config" for parsing output
Set "config :" as a prefix for the user messages context before starting
the configuration parsing. All following stderr output will be prefixed
by it.

As a consequence, remove extraneous prefix "config" already specified in
various ha_alert/warning/notice calls.
2021-06-07 17:19:16 +02:00
Amaury Denoyelle
da3d68111c MINOR: log: display exec path on first warning
Display process executable path on first warning if not already done in
ha_warning, as in ha_alert. The output is thus cleaner when ALERT and
WARN messages are mixed, with the executable path always on first
position.
2021-06-07 17:19:15 +02:00
Amaury Denoyelle
816281ff16 MINOR: errors: use user messages context in print_message
Prepend the user messages context to stderr output in print_message. It
is inserted between the output prefix (log level / pid) and the message
itself. Its content depends on the loaded context infos.
2021-06-07 17:19:10 +02:00
Amaury Denoyelle
6af81f80fb MEDIUM: errors: implement parsing context type
Create a parsing_ctx structure. This type is used to store information
about the current file/line parsed. A global context is created and
can be manipulated when haproxy is in STARTING mode. When starting is
over, the context is resetted and should not be accessed anymore.
2021-06-07 16:58:16 +02:00
Amaury Denoyelle
0a1cdccebd MINOR: log: do not discard stderr when starting is over
Always print message in ha_alert/warning/notice when starting is over,
regardless of quiet/verbose options.

This change is useful to retrieve the output via the newly implemented
user messages buffer at runtime, for the CLI handlers.
2021-06-07 16:58:16 +02:00
Amaury Denoyelle
1833e43c3e MEDIUM: errors: implement user messages buffer
The user messages buffer is used to store the stderr output after the
starting is over. Each thread has it own user messages buffer. Add some
functions to add a new message, retrieve and clear the content.

The user messages buffer primary goal is to be consulted by CLI
handlers. Each handlers using it must clear the buffer before starting
its operation.
2021-06-07 16:58:16 +02:00
Amaury Denoyelle
c008a63582 CLEANUP: server: fix cosmetic of error message on sni parsing
Fix memprintf used in server_parse_sni_expr. Error messages should not
be ending with a newline as it will be inserted in the parent function
on the ha_alert invocation.
2021-06-07 16:58:16 +02:00
Amaury Denoyelle
ce986e1ce8 REORG: errors: split errors reporting function from log.c
Move functions related to errors output on stderr from log.c to a newly
created errors.c file. It targets print_message and
ha_alert/warning/notice/diag functions and related startup_logs feature.
2021-06-07 16:58:15 +02:00
Willy Tarreau
63b3ae7ca3 CLEANUP: backend: fix incorrect comments on locking conditions for lb functions
The leastconn and roundrobin functions mention that the server's lock
must be held while this is not true at all and it is not used either.
The "first" algo doesn't mention anything about the need for locking,
so let's mention that it uses the lbprm lock.
2021-06-04 15:40:50 +02:00
Christopher Faulet
5e702fcadc MINOR: http-ana: Use -1 status for client aborts during queuing and connect
When a client aborts while the session is in the queue or during the connect
stage, instead of reporting a 503-Service-Unavailable error in logs, -1
status is used. It means -1 status is now reported with 'CC' and 'CQ'
termination state.

Indeed, when a client aborts before the server connection is established,
there is no reason to report a 503 because nothing is sent to the
server. And in this case, because it is a client abort, it is useless to
send any response to the client. Thus -1 status is approriate. This status
is used in log messages when the connection is closed and no response is
sent.

This patch should fix the issue #1266.
2021-06-02 17:17:34 +02:00
William Lallemand
f22b032956 BUILD: fix compilation for OpenSSL-3.0.0-alpha17
Some changes in the OpenSSL syntax API broke this syntax:
  #if SSL_OP_NO_TLSv1_3

OpenSSL made this change which broke our usage in commit f04bb0bce490de847ed0482b8ec9eabedd173852:

-# define SSL_OP_NO_TLSv1_3                               (uint64_t)0x20000000
+#define SSL_OP_BIT(n)  ((uint64_t)1 << (uint64_t)n)
+# define SSL_OP_NO_TLSv1_3                               SSL_OP_BIT(29)

Which can't be evaluated by the preprocessor anymore.
This patch replace the test by an openssl version test.

This fix part of #1276 issue.
2021-06-02 16:41:50 +02:00
Christopher Faulet
bf7743094e CLEANUP: mux-fcgi: Don't needlessly store result of data/trailers parsing
Return values of fcgi_strm_parse_data() and fcgi_strm_parse_trailers() are
no longer checked. Thus it is useless to store it.

This patch should fix the issues #1269 and #1268.
2021-06-02 12:04:42 +02:00
Christopher Faulet
c4439f71b0 BUG/MINOR: vars: Be sure to have a session to get checks variables
It is now possible to get any variables from the cli. Concretely, only
variables in the PROC scope can be retrieved because there is neither stream
nor session defined. But, nothing forbids anyone to try to get a variable in
any scope. No value will be found, but it is allowed. Thus, we must be sure
to not rely on an undefined session or stream in that case. Especially, the
session must be tested before retrieving variables in CHECK scope.

This patch should fix the issue #1249. It must be backported to 2.4.
2021-06-02 11:55:14 +02:00
Christopher Faulet
e9106d69cb MINOR: backend: Don't release SI endpoint anymore in connect_server()
Thanks to the previous patch (822decfd "BUG/MAJOR: stream-int: Release SI
endpoint on server side ASAP on retry"), it is now useless to release any
existing connection in connect_server() because it was already done in
back_handle_st_cer() if necessary.

This patch is not a CLEANUP because it may introduce some bugs in edge
cases. There is no reason to backport it for now except if it is required to
fix a bug.
2021-06-01 15:54:50 +02:00
Christopher Faulet
f822decfda BUG/MAJOR: stream-int: Release SI endpoint on server side ASAP on retry
When a connection attempt failed, if a retry is possible, the SI endpoint on
the server side is immediately released, instead of waiting to establish a
new connection to a server. Thus, when the backend SI is switched from
SI_ST_CER state to SI_ST_REQ, SI_ST_ASS or SI_ST_TAR, its endpoint is
released. It is expected because the SI is moved to a state prior to the
connection stage ( < SI_ST_CONN). So it seems logical to not have any server
connection.

It is especially important if the retry is delayed (SI_ST_TAR or
SI_ST_QUE). Because, if the server connection is preserved, any error at the
connection level is unexpectedly relayed to the stream, via the
stream-interface, leading to an infinite loop in process_stream(). if
SI_FL_ERR flag is set on the backend SI in another state than SI_ST_CLO, an
internal goto is performed to resync the stream-interfaces. In addtition,
some ressources are not released ASAP.

This bug is quite old and was reported 1 or 2 times per years since the 2.2
(at least) with not enough information to catch it. It must be backported as
far as 2.2 with a special care because this part has moved several times and
after some observation period and feedback from users to be sure. For info,
in 2.0 and prior, the connection is released when an error is encountered in
SI_ST_CON or SI_ST_RDY states.
2021-06-01 15:53:54 +02:00
Christopher Faulet
1a4449b0d0 CLEANUP: http-ana: Remove useless if statement about L7 retries
Thanks to the commit 1f08bffe0 ("MINOR: http-ana: Perform L7 retries because
of status codes in response analyser"), the L7 retries about the response
status code is now fully handled in the HTTP response analyser.
CF_READ_ERROR flag is no longer set on the response channel in this
case. Thus it is useless to try to catch L7 retries when CF_READ_ERROR is
set because it cannot happen.

The above commit was backported to 2.4, thus this one should also be
backported.
2021-05-31 11:45:26 +02:00
Remi Tricot-Le Breton
476462010e BUG/MINOR: proxy: Missing calloc return value check in chash_init_server_tree
A memory allocation failure happening in chash_init_server_tree while
trying to allocate a server's lb_nodes item used in consistent hashing
would have resulted in a crash. This function is only called during
configuration parsing.

It was raised in GitHub issue #1233.
It could be backported to all stable branches.
2021-05-31 10:55:51 +02:00
Remi Tricot-Le Breton
17acbab0ac BUG/MINOR: http: Missing calloc return value check in make_arg_list
A memory allocation failure happening in make_arg_list when trying to
allocate the argument list would have resulted in a crash. This function
is only called during configuration parsing.

It was raised in GitHub issue #1233.
It could be backported to all stable branches.
2021-05-31 10:51:09 +02:00
Remi Tricot-Le Breton
b6864a5b6f BUG/MINOR: http: Missing calloc return value check while parsing redirect rule
A memory allocation failure happening in http_parse_redirect_rule when
trying to allocate a redirect_rule structure would have resulted in a
crash. This function is only called during configuration parsing.

It was raised in GitHub issue #1233.
It could be backported to all stable branches.
2021-05-31 10:51:08 +02:00
Remi Tricot-Le Breton
1f4fa906c7 BUG/MINOR: worker: Missing calloc return value check in mworker_env_to_proc_list
A memory allocation failure happening in mworker_env_to_proc_list when
trying to allocate a mworker_proc would have resulted in a crash. This
function is only called during init.

It was raised in GitHub issue #1233.
It could be backported to all stable branches.
2021-05-31 10:51:06 +02:00
Remi Tricot-Le Breton
6443bcc2e1 BUG/MINOR: compression: Missing calloc return value check in comp_append_type/algo
A memory allocation failure happening in comp_append_type or
comp_append_algo called while parsing compression options would have
resulted in a crash. These functions are only called during
configuration parsing.

It was raised in GitHub issue #1233.
It could be backported to all stable branches.
2021-05-31 10:51:04 +02:00
Remi Tricot-Le Breton
8cb033643f BUG/MINOR: http: Missing calloc return value check while parsing tcp-request rule
A memory allocation failure happening in tcp_parse_request_rule while
processing the "capture" keyword and trying to allocate a cap_hdr
structure would have resulted in a crash. This function is only called
during configuration parsing.

It was raised in GitHub issue #1233.
It could be backported to all stable branches.
2021-05-31 10:51:02 +02:00
Remi Tricot-Le Breton
2ca42b4656 BUG/MINOR: http: Missing calloc return value check while parsing tcp-request/tcp-response
A memory allocation failure happening in tcp_parse_tcp_req or
tcp_parse_tcp_rep when trying to allocate an act_rule structure would
have resulted in a crash. These functions are only called during
configuration parsing.

It was raised in GitHub issue #1233.
It could be backported to all stable branches.
2021-05-31 10:51:00 +02:00
Remi Tricot-Le Breton
18a82ba690 BUG/MINOR: proxy: Missing calloc return value check in proxy_defproxy_cpy
A memory allocation failure happening in proxy_defproxy_cpy while
copying the default compression options would have resulted in a crash.
This function is called for every new proxy found while parsing the
configuration.

It was raised in GitHub issue #1233.
It could be backported to all stable branches.
2021-05-31 10:50:59 +02:00
Remi Tricot-Le Breton
55ba0d6865 BUG/MINOR: proxy: Missing calloc return value check in proxy_parse_declare
A memory allocation failure happening during proxy_parse_declare while
processing the "capture" keyword and allocating a cap_hdr structure
would have resulted in a crash. This function is only called during
configuration parsing.

It was raised in GitHub issue #1233.
It could be backported to all stable branches.
2021-05-31 10:50:57 +02:00
Remi Tricot-Le Breton
a4bf8a059d BUG/MINOR: http: Missing calloc return value check in parse_http_req_capture
A memory allocation failure happening in parse_http_req_capture while
processing a "len" keyword and allocating a cap_hdr structure would
have resulted in a crash. This function is only called during
configuration parsing.

It was raised in GitHub issue #1233.
It could be backported to all stable branches.
2021-05-31 10:50:55 +02:00
Remi Tricot-Le Breton
612b2c37be BUG/MINOR: ssl: Missing calloc return value check in ssl_init_single_engine
A memory allocation failure happening during ssl_init_single_engine
would have resulted in a crash. This function is only called during
init.

It was raised in GitHub issue #1233.
It could be backported to all stable branches.
2021-05-31 10:50:49 +02:00
Remi Tricot-Le Breton
208ff01b23 BUG/MINOR: peers: Missing calloc return value check in peers_register_table
A memory allocation failure happening during peers_register_table would
have resulted in a crash. This function is only called during init.

It was raised in GitHub issue #1233.
It could be backported to all stable branches.
2021-05-31 10:50:46 +02:00
Remi Tricot-Le Breton
f1800e64ef BUG/MINOR: server: Missing calloc return value check in srv_parse_source
Two calloc calls were not checked in the srv_parse_source function.
Considering that this function could be called at runtime through a
dynamic server creation via the CLI, this could lead to an unfortunate
crash.

It was raised in GitHub issue #1233.
It could be backported to all stable branches even though the runtime
crash could only happen on branches where dynamic server creation is
possible.
2021-05-31 10:50:32 +02:00
Tim Duesterhus
5546c8bdce MINOR: cfgparse: Fail when encountering extra arguments in macro
This resolves GitHub issue #1124.

This change should be backported as a *warning* to 2.4.
2021-05-27 07:54:21 +02:00
Christopher Faulet
1f08bffe0c MINOR: http-ana: Perform L7 retries because of status codes in response analyser
L7 retries because of status codes are now performed in the response
analyser. This way, it is no longer required to handle L7 retries in
si_cs_recv(). It is also useless to set CF_READ_ERROR on the response
channel to be able to trigger such retries.

In addition, if no L7 retries are performed when the response is received,
the L7 buffer is immediately released. Before in this case, it was only
released with the stream.
2021-05-26 13:56:06 +02:00
Christopher Faulet
d976923ab2 BUG/MINOR: http-ana: Handle L7 retries on refused early data before K/A aborts
When a network error occurred on the server side, if it is not the first
request (in case of keep-alive), nothing is returned to the client and its
connexion is closed to be sure it may retry. However L7 retries on refused
early data (0rtt-rejected) must be performed first.

In addition, such L7 retries must also be performed before incrementing the
failed responses counter.

This patch must be backported as far as 2.0.
2021-05-26 13:56:06 +02:00
Christopher Faulet
552601d5fd BUG/MINOR: http-ana: Send the right error if max retries is reached on L7 retry
This bug was introduced by the previous commit (9f5382e45 Revert "MEDIUM:
http-ana: Deal with L7 retries in HTTP analysers") because I failed the
revert.

On L7 retry, if the maximum connection retries is reached, an error must be
return to the client. Depending the situation, it may be a 502-Bad-Gateway
(empty-response or junk-response), a 504-Gateway-Timeout (response-timeout)
or a 425-Too-Early (0rtt-rejected). But contrary to what the comment says,
the do_l7_retry() function always returns a success.

Note it is not a problem for L7 retries on the response status code because
the stream-interface already takes care to have not reached the maximum
connection retries counter to trigger a L7 retry.

This patch must be backported to 2.4 because the commit must also be
backported to 2.4.
2021-05-26 10:31:11 +02:00
Christopher Faulet
9f5382e452 Revert "MEDIUM: http-ana: Deal with L7 retries in HTTP analysers"
This reverts commit 5b82cc5b5c. The purpose of
this commit was to fully handle L7 retries in HTTP analysers and stop to
deal with the L7 buffer in si_cs_send()/si_cs_recv(). It is of course
cleaner this way. But there is a huge drawback. The L7 buffer is reserved
from the time the request analysis is finished until the moment the response
is received. For a small request, the analysis is finished before the
connection to the server. Thus for the L7 buffer will be kept for queued
sessions while it is not mandatory.

So, for now, the commit is reverted to go back to the less expensive
solution. This patch must be backported to 2.4.
2021-05-25 10:51:20 +02:00
Christopher Faulet
44c0dcfe90 CLEANUP: mux-h1: Rename functions parsing input buf and filling output buf
Main functions are renamed h1_process_demux() and h1_process_mux() to be
consistent with the H2 mux. For the same reason,
h1_process_header/data/tralers) functions, responsible to parse incoming
data are renamed with "h1_handle_" prefix.
2021-05-25 10:41:50 +02:00
Christopher Faulet
00d7cde551 MINOR: muxes/h1-htx: Realign input buffer using b_slow_realign_ofs()
Input buffers have never output data. So, use b_slow_realign_ofs() function
instead of b_slow_realign(). It is a slighly simpler function. And in the H1
mux, it allows a realign by setting the input buffer head to permit
zero-copies.
2021-05-25 10:41:50 +02:00
Christopher Faulet
7a835f3cb0 MINOR: h1-htx: Use a correlation table to speed-up small chunks parsing
Instead of using hex2i() to convert an hexa digit to an integer in the
function parsing small chunks, we now use a table because it is faster.
2021-05-25 10:41:50 +02:00
Christopher Faulet
bdcefe58b7 MEDIUM: h1-htx: Add a function to parse contiguous small chunks
Add h1_parse_full_contig_chunks() function to parse full contiguous chunks.
This function neither handles incomplete chunks nor wrapping buffers. It is
designed to efficiently parse a buffer with several small chunks. Of course,
there is no zero copy here because it is not possible. This function is a
bit tricky and all changes may a have a impact. This one may probably be
optimized, but it is good enough for now and not too complex.

The main function (h1_parse_msg_chunks) always tries to use this function
when the HTTP parser is waiting for a chunk size. In this case, there is no
zero-copy, so there is no reason to call the generic version to parse the
chunk. However, if some unparsed data remain after this step, the generic
function is called. This way, wrapping data and incomplete chunks may be
parsed.

Quick tests show it is now slightly faster in all cases than the legacy
mode.
2021-05-25 10:41:50 +02:00
Christopher Faulet
0d4c924c34 MEDIUM: h1-htx: Split function to parse a chunk and the loop on the buffer
A generic function is now used to only parse the current chunk (h1_parse_chunk)
and the main one (h1_parse_msg_chunks) is used to loop on the buffer and relies
on the first one. This change is mandatory to be able to use an optimized
function to parse contiguous small chunks.
2021-05-25 10:41:50 +02:00
Christopher Faulet
140691baf9 MINOR: h1-htx: Move HTTP chunks parsing into a dedicated function
Chunked data are now parsed in a dedicated function. This way, it will be
possible to have two functions to parse chunked messages. The current one
for messages with large chunks and an other one to parse messages with small
chunks.

The parsing of small chunks is really sensitive because it may be used as a
DoS attack. So we must be carefull to have an optimized function to parse
such messages.
2021-05-25 10:41:50 +02:00
Christopher Faulet
16a524c9ea MINOR: mux-h1/mux-fcgi: Don't needlessly loop on data parsing
Because the function parsing H1 data is now able to handle wrapping input
buffers, there is no reason to loop anymore in the muxes to be sure to parse
wrapping data.
2021-05-25 10:41:50 +02:00
Christopher Faulet
f7c2044f8f MEDIUM: h1-htx: Adapt H1 data parsing to copy wrapping data in one call
Since the beginning, wrapping input data are parsed and copied in 2 steps to
not deal with the wrapping in H1 parsing functions. But there is no reason
to do so. This needs 2 calls to parsing functions. This also means, most of
time, when the input buffer does not wrap, there is an extra call for
nothing.

Thus, now, the data parsing functions try to copy as much data as possible,
handling wrapping buffer if necessary.
2021-05-25 10:41:50 +02:00
Christopher Faulet
de471a4a8d MINOR: h1-htx: Update h1 parsing functions to return result as a size_t
h1 parsing functions (h1_parse_msg_*) returns the number of bytes parsed or
0 if nothing is parsed because an error occurred or some data are
missing. But they never return negative values. Thus, instead of a signed
integer, these function now return a size_t value.

The H1 and FCGI muxes are updated accordingly. Note that h1_parse_msg_data()
has been slightly adapted because the parsing of chunked messages still need
to handle negative values when a parsing error is reported by
h1_parse_chunk_size() or h1_skip_chunk_crlf().
2021-05-25 10:41:50 +02:00
Dragan Dosen
a75eea78e2 MINOR: map/acl: print the count of all the map/acl entries in "show map/acl"
The output of "show map/acl" now contains the 'entry_cnt' value that
represents the count of all the entries for each map/acl, not just the
active ones, which means that it also includes entries currently being
added.
2021-05-25 08:44:45 +02:00
Christopher Faulet
acfd71b97a BUG/MINOR: http-comp: Preserve HTTP_MSGF_COMPRESSIONG flag on the response
This flag is set on the response when its payload is compressed by HAProxy.
It must be preserved because it may be used when the log message is emitted.

When the compression filter was refactored to support the HTX, an
optimization was added to not perform extra proessing on the trailers.
HTTP_MSGF_COMPRESSIONG flag is removed when the last data block is
compressed. It is not required, it is just an optimization and unfortunately
a bug. This optimization must be removed to preserve the flag.

This patch must be backported as far as 2.0. On the HTX is affected.
2021-05-21 09:59:00 +02:00
Christopher Faulet
a6d3704e38 BUG/MEDIUM: filters: Exec pre/post analysers only one time per filter
For each filter, pre and post callback functions must only be called one
time. To do so, when one of them is finished, the corresponding analyser bit
must be removed from pre_analyzers or post_analyzers bit field. It is only
an issue with pre-analyser callback functions if the corresponding analyser
yields. It may happens with lua action for instance. In this case, the
filters pre analyser callback function is unexpectedly called several times.

This patch should fix the issue #1263. It must be backported is all stable
versions.
2021-05-21 09:59:00 +02:00
Amaury Denoyelle
79a88ba3d0 BUG/MAJOR: server: prevent deadlock when using 'set maxconn server'
A deadlock is possible with 'set maxconn server' command, if there is
pending connection ready to be dequeued. This is caused by the locking
of server spinlock in both cli_parse_set_maxconn_server and
process_srv_queue.

Fix this by reducing the scope of the server lock into
server_parse_maxconn_change_request. If connection are dequeued, the
lock is taken a second time. This can be seen as suboptimal but as it
happens only during 'set maxconn server' it can be considered as
tolerable.

This issue was reported on the mailing list, for the 1.8.x branch.
It must be backported up to the 1.8.
2021-05-19 17:52:05 +02:00
Remi Tricot-Le Breton
a6b2784099 CLEANUP: ssl: Fix coverity issues found in CA file hot update code
Coverity found a few uninitialized values and some dead code in the
CA/CRL file hot update code as well as a missing return value check.
2021-05-18 10:52:54 +02:00
Remi Tricot-Le Breton
18c7d83934 BUILD/MINOR: ssl: Fix compilation with OpenSSL 1.0.2
The following functions used in CA/CRL file hot update were not defined
in OpenSSL 1.0.2 so they need to be defined in openssl-compat :
- X509_CRL_get_signature_nid
- X509_CRL_get0_lastUpdate
- X509_CRL_get0_nextUpdate
- X509_REVOKED_get0_serialNumber
- X509_REVOKED_get0_revocationDate
2021-05-18 00:28:31 +02:00
Remi Tricot-Le Breton
d75b99e69c BUILD/MINOR: ssl: Fix compilation with SSL enabled
The CA/CRL hot update patches did not compile on some targets of the CI
(mainly gcc + ssl). This patch should fix almost all of them. It adds
missing variable initializations and return value checks to the
BIO_reset calls in show_crl_detail.
2021-05-17 11:53:21 +02:00
Remi Tricot-Le Breton
51e28b6bee MEDIUM: ssl: Add "show ssl crl-file" CLI command
This patch adds the "show ssl crl-file [<crlfile>]" CLI command. This
command can be used to display the list of all the known CRL files when
no specific file name is specified, or to display the details of a
specific CRL file when a name is given.
The details displayed for a specific CRL file are inspired by the ones
shown by a "openssl crl -text -noout -in <filename>".
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
eef8e7b8bc MINOR: ssl: Add "abort ssl crl-file" CLI command
The "abort" command aborts an ongoing transaction started by a "set ssl
crl-file" command. Since the updated CRL file data is not pushed into
the CA file tree until a "commit ssl crl-file" call is performed, the
abort command simply deleted the new cafile_entry (storing the new CRL
file data) stored in the transaction.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
720e3b9f33 MEDIUM: ssl: Add "new+del crl-file" CLI commands
This patch adds the "new ssl crl-file" and "del ssl crl-file" CLI
commands.
The "new" command can be used to create a new empty CRL file that can be
filled in thanks to a "set ssl crl-file" command. It can then be used in
a new crt-list line.
The newly created CRL file is added to the CA file tree so any call to
"show ssl crl-file" will display its name.
The "del" command allows to delete an unused CRL file. A CRL file will
be considered unused if its list of ckch instances is empty. It does not
work on an uncommitted CRL file transaction created via a "set ssl
crl-file" command call.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
a51b339d95 MEDIUM: ssl: Add "set+commit ssl crl-file" CLI commands
This patch adds the "set ssl crl-file" and "commit ssl crl-file"
commands, following the same logic as the certificate and CA file update
equivalents.
When trying to update a Certificate Revocation List (CRL) file via a
"set" command, we start by looking for the entry in the CA file tree and
then building a new cafile_entry out of the payload, without adding it
to the tree yet. It will only be added when a "commit" command is
called.
During a "commit" command, we insert the newly built cafile_entry in the
CA file tree while keeping the previous entry. We then iterate over all
the instances that used the CRL file and rebuild a new one and its
dedicated SSL context for every one of them.
When all the contexts are properly created, the old instances get
replaced by the new ones and the old CRL file is removed from the tree.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
f81c70ceec MINOR: ssl: Chain instances in ca-file entries
In order for crl-file hot update to be possible, we need to add an extra
link between the CA file tree entries that hold Certificate Revocation
Lists and the instances that use them. This way we will be able to
rebuild each instance upon CRL modification.
This mechanism is similar to what was made for the actual CA file update
since both the CA files and the CRL files are stored in the same CA file
tree.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
c3a8477776 MINOR: ssl: Add "del ssl ca-file" CLI command
This patch adds the "del ssl ca-file <cafile>" CLI command which can be
used to delete an unused CA file.
The CA file will be considered unused if its list of ckch instances is
empty. This command cannot be used to delete the uncommitted CA file of
a previous "set ssl ca-file" without commit. It only acts on
CA file entries already inserted in the CA file tree.

This fixes a subpart of GitHub issue #1057.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
9f40fe0202 MEDIUM: ssl: Add "new ssl ca-file" CLI command
This patch adds the "new ssl ca-file <cafile>" CLI command. This command
can be used to create a new empty CA file that can be filled in thanks
to a "set ssl ca-file" command. It can then be used in a new crt-list
line.
The newly created CA file is added directly in the cafile tree so any
following "show ssl ca-file" call will display its name.

This fixes a subpart of GitHub issue #1057.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
2a22e16cb8 MEDIUM: ssl: Add "show ssl ca-file" CLI command
This patch adds the "show ssl ca-file [<cafile>[:index]]" CLI command.
This command can be used to display the list of all the known CA files
when no specific file name is specified, or to display the details of a
specific CA file when a name is given. If an index is given as well, the
command will only display the certificate having the specified index in
the CA file (if it exists).
The details displayed for each certificate are the same as the ones
showed when using the "show ssl cert" command on a single certificate.

This fixes a subpart of GitHub issue #1057.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
523f0e483a MINOR: ssl: Refactorize the "show certificate details" code
Move all the code that dumps the details of a specific certificate into
a dedicated function so that it can be used elsewhere.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
0bb482436c MINOR: ssl: Add a cafile_entry type field
The CA files and CRL files are stored in the same cafile_tree so this
patch adds a new field the the cafile_entry structure that specifies the
type of the entry. Since a ca-file can also have some CRL sections, the
type will be based on the option used to load the file and not on its
content (ca-file vs crl-file options).
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
d5fd09d339 MINOR: ssl: Add "abort ssl ca-file" CLI command
The "abort" command aborts an ongoing transaction started by a "set ssl
ca-file" command. Since the updated CA file data is not pushed into the
cafile tree until a "commit ssl ca-file" call is performed, the abort
command simply clears the new cafile_entry that was stored in the
cafile_transaction.

This fixes a subpart of GitHub issue #1057.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
a32a68bd3b MEDIUM: ssl: Add "set+commit ssl ca-file" CLI commands
This patch adds the "set ssl ca-file" and "commit ssl ca-file" commands,
following the same logic as the certificate update equivalents.
When trying to update a ca-file entry via a "set" command, we start by
looking for the entry in the cafile_tree and then building a new
cafile_entry out of the given payload. This new object is not added to
the cafile_tree until "commit" is called.
During a "commit" command, we insert the newly built cafile_entry in the
cafile_tree, while keeping the previous entry as well. We then iterate
over all the instances linked in the old cafile_entry and rebuild a new
ckch instance for every one of them. The newly inserted cafile_entry is
used for all those new instances and their respective SSL contexts.
When all the contexts are properly created, the old instances get
replaced by the new ones and the old cafile_entry is removed from the
tree.

This fixes a subpart of GitHub issue #1057.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
bfadc02f34 MINOR: ssl: Ckch instance rebuild and cleanup factorization in CLI handler
The process of rebuilding a ckch_instance when a certificate is updated
through a cli command will be roughly the same when a ca-file is updated
so this factorization will avoid code duplication.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
38c999b11c MINOR: ssl: Add helper function to add cafile entries
Adds a way to insert a new uncommitted cafile_entry in the tree. This
entry will be the one fetched by any lookup in the tree unless the
oldest cafile_entry is explicitely looked for. This way, until a "commit
ssl ca-file" command is completed, there could be two cafile_entries
with the same path in the tree, the original one and the newly updated
one.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
383fb1472e MEDIUM: ssl: Add a way to load a ca-file content from memory
The updated CA content coming from the CLI during a ca-file update will
directly be in memory and not on disk so the way CAs are loaded in a
cafile_entry for now (via X509_STORE_load_locations calls) cannot be
used.
This patch adds a way to fill a cafile_entry directly from memory and to
load the contained certificate and CRL sections into an SSL store.
CRL sections are managed as well as certificates in order to mimic the
way CA files are processed when specified in an option. Indeed, when
parsing a CA file given through a ca-file or ca-verify-file option, we
iterate over the different sections in ssl_set_cert_crl_file and load
them regardless of their type. This ensures that a file that was
properly parsed when given as an option will also be accepted by the
CLI.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
5daff3c8ab MINOR: ssl: Add helper functions to create/delete cafile entries
Add ssl_store_create_cafile_entry and ssl_store_delete_cafile_entry
functions.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
40ddea8222 MINOR: ssl: Add reference to default ckch instance in bind_conf
In order for the link between the cafile_entry and the default ckch
instance to be built, we need to give a pointer to the instance during
the ssl_sock_prepare_ctx call.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
4458b9732d MEDIUM: ssl: Chain ckch instances in ca-file entries
Each ca-file entry of the tree will now hold a list of the ckch
instances that use it so that we can iterate over them when updating the
ca-file via a cli command. Since the link between the SSL contexts and
the CA file tree entries is only built during the ssl_sock_prepare_ctx
function, which are called after all the ckch instances are created, we
need to add a little post processing after each ssl_sock_prepare_ctx
that builds the link between the corresponding ckch instance and CA file
tree entries.
In order to manage the ca-file and ca-verify-file options, any ckch
instance can be linked to multiple CA file tree entries and any CA file
entry can link multiple ckch instances. This is done thanks to a
dedicated list of ckch_inst references stored in the CA file tree
entries over which we can iterate (during an update for instance). We
avoid having one of those instances go stale by keeping a list of
references to those references in the instances.
When deleting a ckch_inst, we can then remove all the ckch_inst_link
instances that reference it, and when deleting a cafile_entry, we
iterate over the list of ckch_inst reference and clear the corresponding
entry in their own list of ckch_inst_link references.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
9f0c936057 MINOR: ssl: Allow duplicated entries in the cafile_tree
In order to ease ca-file hot update via the CLI, the ca-file tree will
need to allow duplicate entries for a given path. This patch simply
enables it and offers a way to select either the oldest entry or the
latest entry in the tree for a given path.
2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton
af8820a9a5 CLEANUP: ssl: Move ssl_store related code to ssl_ckch.c
This patch moves all the ssl_store related code to ssl_ckch.c since it
will mostly be used there once the CA file update CLI commands are all
implemented. It also makes the cafile_entry structure visible as well as
the cafile_tree.
2021-05-17 10:50:24 +02:00
Willy Tarreau
fb601956db BUILD: sample: use strtoll() instead of atoll()
atoll() is not portable, but strtoll() is more common. We must pass NULL
to the end pointer however since the parser must consume digits and stop
at the first non-digit char. No backport is needed as this was introduced
in 2.4-dev17 with commit 51c8ad45c ("MINOR: sample: converter: Add json_query
converter").
2021-05-14 08:51:53 +02:00
Willy Tarreau
388fc25915 IMPORT: slz: use inttypes.h instead of stdint.h
stdint.h is not as portable as inttypes.h. It doesn't exist at least
on AIX 5.1 and Solaris 7, while inttypes.h is present there and does
include stdint.h on platforms supporting it.

This is equivalent to libslz upstream commit e36710a ("slz: use
inttypes.h instead of stdint.h")
2021-05-14 08:44:52 +02:00
Willy Tarreau
6bfc10c392 BUILD: config: avoid a build warning on numa_detect_topology() without threads
The function is defined when using linux+cpu affinity but is only used
if threads are enabled, so let's add this condition to avoid aa build
warning about an unused function when building with thread disabled.
This came in 2.4-dev17 with commit b56a7c89a ("MEDIUM: cfgparse: detect
numa and set affinity if needed") so no backport is needed.
2021-05-14 08:30:46 +02:00
Willy Tarreau
26f42a0779 BUG/MAJOR: config: properly initialize cpu_map.thread[] up to MAX_THREADS
A mistake was introduced in 2.4-dev17 by commit 982fb5339 ("MEDIUM:
config: use platform independent type hap_cpuset for cpu-map"), it
initializes cpu_map.thread[] from 0 to MAX_PROCS-1 instead of
MAX_THREADS-1 resulting in crashes when the two differ, e.g. when
building with USE_THREAD= but still with USE_CPU_AFFINITY=1.

No backport is needed.
2021-05-14 08:26:38 +02:00
Willy Tarreau
89f6dedf48 BUG/MINOR: lua/vars: prevent get_var() from allocating a new name
Variable names are stored into a unified list that helps compare them
just based on a pointer instead of duplicating their name with every
variable. This is convenient for those declared in the configuration
but this started to cause issues with Lua when random names would be
created upon each access, eating lots of memory and CPU for lookups,
hence the work in 2.2 with commit 4e172c93f ("MEDIUM: lua: Add
`ifexist` parameter to `set_var`") to address this.

But there remains a corner case with get_var(), which also allocates
a new variables. After a bit of thinking and discussion, it never
makes sense to allocate a new variable name on get_var():
  - if the name exists, it will be returned ;
  - if it does not exist, then the only way for it to appear will
    be that some code calls set_var() on it
  - a call to get_var() after a careful set_var(ifexist) ruins the
    effort on set_var().

For this reason, this patch addresses this issue by making sure that
get_var() will never cause a variable to be allocated. This is done
by modifying vars_get_by_name() to always call register_name() with
alloc=0, since vars_get_by_name() is exclusively used by Lua and the
new CLI's "get/set var" which also benefit from this protection.

It probably makes sense to backport this as far as 2.2 after some
observation period and feedback from users.

For more context and discussions about the issues this was causing,
see https://www.mail-archive.com/haproxy@formilux.org/msg40451.html
and in issue #664.
2021-05-13 13:44:32 +02:00
Willy Tarreau
832e242b1f DEBUG: ssl: export ssl_sock_close() to see its symbol resolved in profiling
This function is one of the few high-profile, unresolved ones in the memory
profile output, let's have it resolve to ease matching of SSL allocations,
which are not easy to follow.
2021-05-13 10:11:03 +02:00
Willy Tarreau
f1c8a3846c MINOR: activity/cli: optionally support sorting by address on "show profiling"
"show profiling" by default sorts by usage/counts, which is suitable for
occasional use. But when called from scripts to monitor/search variations,
this is not very convenient. Let's add a new "byaddr" option to support
sorting the output by address. It also eases matching alloc/free calls
from within a same library, or reading grouped tasks costs by library.
2021-05-13 10:00:17 +02:00
Willy Tarreau
973a937c5f BUG/MINOR: stats: fix lastchk metric that got accidently lost
Commit d3a9a4992 ("MEDIUM: stats: allow to select one field in
`stats_fill_sv_stats`") left one occurrence of a direct assignment
of stats[] instead of placing it into the <metric> variable, and it
was on ST_F_CHECK_STATUS. This resulted in the field being overwritten
with an empty one immediately after being set in stats_fill_sv_stats()
and the field to appear empty on the stats page.

No backport is needed as this was only for 2.4.
2021-05-12 17:50:16 +02:00
Willy Tarreau
4263f68b65 CLEANUP: stick-table: remove a leftover of an old keyword declaration
There was a leftover of an antique declaration commented out that has
now been superseded by new ones, let's remove it.
2021-05-12 17:50:16 +02:00
Amaury Denoyelle
c460c70ab7 BUG/MEDIUM: stick_table: fix crash when using tcp smp_fetch_src
Since the introduction of bc_src, smp_fetch_src from tcp_sample inspect
the kw argument to choose between the frontend or the backend source
address. However, for the stick tables, the argument is left to NULL.
This causes a segfault.

Fix the crash by explicitely set the kw argument to "src" to retrieve
the source address of the frontend side.

This bug was introduced by the following commit :
  7d081f02a4
  MINOR: tcp_samples: Add samples to get src/dst info of the backend connection

It does not need a backport as it is integrated in the current 2.4-dev
branch.

To reproduce the crash, I used the following config :

frontend fe
	bind :20080
	http-request track-sc0 src table foo
	http-request reject if { src_conn_rate(foo) gt 10 }
	use_backend h1

backend foo
	stick-table type ip size 200k expire 30s store conn_rate(60s)

backend h1
	server nginx 127.0.0.1:30080 check

This should fix the github issue #1247.
2021-05-12 15:30:03 +02:00
Willy Tarreau
9e274280a4 IMPORT: slz: do not produce the crc32_fast table when CRC is natively supported
On ARM with native CRC support, no need to inflate the executable with
a 4kB CRC table, let's just drop it.

This is slz upstream commit d8715db20b2968d1f3012a734021c0978758f911.
2021-05-12 09:29:33 +02:00
Willy Tarreau
027fdcb168 IMPORT: slz: use the generic function for the last bytes of the crc32
This is the only place where we conditionally use the crc32_fast table,
better call the crc32_char inline function for this. This should also
reduce by ~1kB the L1 cache footprint of the compression when dealing
with small blocks, and at least shows a consistent 0.5% perf improvement.

This is slz upstream commit 075351b6c2513b548bac37d6582e46855bc7b36f.
2021-05-12 09:29:29 +02:00
Tim Duesterhus
dec1c36b3a MINOR: uri_normalizer: Add fragment-encode normalizer
This normalizer encodes '#' as '%23'.

See GitHub Issue #714.
2021-05-11 17:24:32 +02:00
Tim Duesterhus
c9e05ab2de MINOR: uri_normalizer: Add fragment-strip normalizer
This normalizer strips the URI's fragment component which should never be sent
to the server.

See GitHub Issue #714.
2021-05-11 17:23:46 +02:00
Tim Duesterhus
2f413136e9 BUG/MINOR: http_act: Fix normalizer names in error messages
These places were forgotten when the normalizers were renamed.

Bug introduced in 5be6ab269e, which is 2.4.
No backport needed.
2021-05-11 17:21:53 +02:00
Willy Tarreau
da7f11bfb5 CLEANUP: pattern: remove the unused and dangerous pat_ref_reload()
This function was not used anymore after the atomic updates were
implemented in 2.3, and it must not be used given that it does not
yield and can easily make the process hang for tens of seconds on
large acls/maps. Let's remove it before someone uses it as an
example to implement something else!
2021-05-11 16:49:55 +02:00
Willy Tarreau
f5fb858bb7 MINOR: memprof: also report the totals and delta alloc-free
Already had to perform too many additions by external scripts, it's
time to add the totals and delay alloc-free as a last line in the
output of the "show memory profiling".
2021-05-11 14:21:18 +02:00
Willy Tarreau
616491b7f7 MINOR: memprof: also report the method used by each call
This was planned but missing in the previous attempt, we really need to
see what is used at each place, especially due to realloc(). Now we
print the function used in front of the caller's address, as well as
the average alloc/free size per call.
2021-05-11 14:14:30 +02:00
Willy Tarreau
79acefa749 BUG/MINOR: memprof: properly account for differences for realloc()
The realloc() function checks if the size grew or reduced in order to
count an allocation or a free, but it does so with the absolute (new
or old) value instead of the difference, resulting in realloc() often
being credited for allocating too much.

No backport is needed.
2021-05-11 09:12:56 +02:00
Ilya Shipitsin
3df5989960 CLEANUP: assorted typo fixes in the code and comments
This is 23rd iteration of typo fixes
2021-05-10 23:05:08 +02:00
Daniel Corbett
67b3cefea3 CLEANUP: cli/activity: Remove double spacing in set profiling command
It was found that when viewing the help output from the CLI that
"set profiling" had 2 spaces in it, which was pushing it out from
the rest of similar commands.

i.e. it looked like this:
  prepare acl <acl>
  prepare map <acl>
  set  profiling  <what>  {auto|on|off}
  set dynamic-cookie-key backend <bk> <k>
  set map <map> [<key>|#<ref>] <value>
  set maxconn frontend <frontend> <value>

This patch removes all of the double spaces within the command and
unifies them to single spacing, which is what is observed within the
rest of the commands.
2021-05-10 22:29:12 +02:00
Amaury Denoyelle
c89d5337ee BUG/MINOR: http_fetch: fix possible uninit sockaddr in fetch_url_ip/port
Check the return value of url2sa in smp_fetch_url_ip/port. If negative,
the address result is uninitialized and the sample fetch is aborted.
Also, the sockaddr is prelimiary zero'ed before calling url2sa to ensure
that it is not used by upper functions even if the sample returns 0.

Without the check, the value returned by the url_ip/url_port fetches is
unspecified. This can be triggered with the following curl :
$ curl -iv --request-target "xxx://127.0.0.1:20080/" http://127.0.0.1:20080/

This should be backported to all stable branches. However, note that
between the 1.8 and 2.0, the targetted functions have been extracted
from proto_http.c to http_fetch.c.

This should fix in part coverity report from the github issue #1244.
2021-05-10 14:48:55 +02:00
Willy Tarreau
5db446d7e1 BUILD: cli: appease a null-deref warning in cli_gen_usage_msg()
The compiler sees the possibility of null-deref for which a path is
possible but which doesn't exist as we didn't pass a null args outside
of the help request. The test was introduced by the simplified test on
ishelp variable, so let's add it to shut the warning.
2021-05-10 07:47:05 +02:00
Willy Tarreau
7deb28ce65 BUG/MEDIUM: quic: fix null deref on error path in qc_conn_init()
When ctx is NULL, we go to the "err" label, which could dereference it.
No backport is needed.
2021-05-10 07:40:27 +02:00
Willy Tarreau
4a75328485 BUILD: memprof: make the old caller pointer a const in get_prof_bin()
It's a const void* in the target, we can't use a void* in the caller,
this causes a build warning with clang.
2021-05-09 23:18:50 +02:00
Willy Tarreau
23c740ea51 CLEANUP: cli/mworker: properly align the help messages
CLI help commands were re-aligned by commit b205bfdab but the
master-worker ones were not done, let's do it now.
2021-05-09 22:49:44 +02:00
Willy Tarreau
92fbbcc4c6 MINOR: cli: sort the output of the "help" keywords
It's still very difficult to find all commands starting with a given
keyword like "set", "show" etc. Let's sort the lines by usage message,
this is much more convenient.
2021-05-09 22:39:07 +02:00
Willy Tarreau
0b1b830e88 MINOR: cli: make "help" support a command in argument
With ~100 commands on the CLI, it's particularly difficult to find a
specific one in the "help" output. The function used to display the
help already supports filtering on certain commands, so in the end it's
just needed to pass the argument of the help command to enable the
automatic filtering. That's what this patch does so that "help clear"
only lists commands starting with "clear" and that "help map" lists
commands containing "map" in them.
2021-05-09 20:59:23 +02:00
Willy Tarreau
2a8a2f0223 BUILD: ssl: define HAVE_CRYPTO_memcmp() based on the library version
The build fails on versions older than 1.0.1d which is the first one
introducing CRYPTO_memcmp(), so let's have a define for this instead
of enabling it whenever USE_OPENSSL is set. One could also wonder why
we're relying on openssl for such a trivial thing, and a simple local
implementation could also allow to restore lexicographic ordering.
2021-05-09 12:10:36 +02:00
Willy Tarreau
48584645fb BUILD: http_fetch: address a few aliasing warnings with older compilers
gcc-4.4 complains about aliasing in smp_fetch_url_port() and
smp_fetch_url_ip() because the local addr variable is casted to sturct
sockaddr_in before being checked. The family should be checked on the
sockaddr_storage and we have a function to retrieve the port.
The compiler still sees some warnings but these ones are OK now.
2021-05-09 10:32:54 +02:00
Willy Tarreau
b2475a139e MINOR: tools/rnd: compute the result outside of the CAS loop
ha_random64() uses a DWCAS loop to produce the random, but it computes
the resulting value inside the loop while it doesn't change upon success,
so this is a needless overhead inside the critcal path that participates
to making threads fail the race and try again. Let's take the value out
of the loop.
2021-05-09 10:26:14 +02:00
Willy Tarreau
714f34580e DOC: fix a few remainig cases of "Haproxy" and "HAproxy" in doc and comments
Some of the Lua doc and a few places still used "Haproxy" or "HAproxy".
There was even one "HA proxy". A few of them were in an example of VTest
output, indicating that VTest ought to be fixed as well. No big deal but
better address all the remaining ones so that these inconsistencies stop
spreading around.
2021-05-09 06:50:46 +02:00
Willy Tarreau
64975cf2a4 MEDIUM: mailers: use "HAProxy" nor "HAproxy" in the subject of messages
It seems to be the last visible casing inconsistency, but better address
it for completeness otherwise we'll always have to deal with some
exceptions.
2021-05-09 06:45:16 +02:00
Willy Tarreau
a5357cdfa5 MINOR: version: report "HAProxy" not "HA-Proxy" in the version output
When running "haproxy -v", we still get "HA-Proxy" which is the last
place where this confusing oddity happens. Being so used to it I didn't
even notice it until it was reported to me just after 2.2 but it never
got fixed, despite the PRODUCT_NAME macro that is used to report the
name in the stats page and in "show info" being already set to "HAProxy"
15 years ago in 1.2.14 with commit e03312613. It's about time to
uniformize everything.
2021-05-09 06:14:25 +02:00
Willy Tarreau
c28aab05d8 BUILD: fd: include log.h from fd.c
It's needed for ha_alert() and the header was missing.
2021-05-08 20:35:39 +02:00
Willy Tarreau
202f93d885 BUILD: comp: include proxy.h from flt_http_comp.c
It's used for proxy_type_str() but the header was missing.
2021-05-08 20:35:39 +02:00
Willy Tarreau
11bd6f7296 BUILD: thread: include log.h from thread.c
It's needed for ha_alert(). Probably that a separate file for error
reporting at boot would be useful.
2021-05-08 20:35:39 +02:00
Willy Tarreau
d1dd2500f2 BUILD: http-rules: include proxy.h from http_rules.c
Many proxy functions are called there but the include was missing and
inherited via cfgparse.h.
2021-05-08 20:35:39 +02:00
Willy Tarreau
5958c43271 BUILD: listener: include proxy.h from listener.c
Many proxy functions are called there but the include was missing and
inherited via cfgparse.h.
2021-05-08 20:35:39 +02:00
Willy Tarreau
c5396bd673 BUILD: mux-fcgi: include proxy.h from mux-fcgi.c
proxy_capture_error() was called there without the include, which was
inherited via cfgparse.h.
2021-05-08 20:35:39 +02:00
Willy Tarreau
adc0240147 BUILD: mux-h1: include proxy.h from mux-h1.c
proxy_capture_error() was called there without the include, which was
inherited via cfgparse.h.
2021-05-08 20:35:39 +02:00
Willy Tarreau
3d6ee407e7 BUILD: hlua: include proxy.h from hlua.c
Many proxy functions are called there but the include was missing and
inherited via cfgparse.h.
2021-05-08 20:35:39 +02:00
Willy Tarreau
e08f4bf27f MINOR: task: stop including stream.h from task.c
This one comes with a very deep dependency hell, only to know that
process_stream() is a function. Dropping it reduces the preprocessed
output from 1.5MB to 640kB.
2021-05-08 20:27:08 +02:00
Willy Tarreau
c79e89853b BUILD: task: remove unused includes from task.c
freq_ctr.h and time.h are not used, let's drop them.
2021-05-08 20:27:08 +02:00
Willy Tarreau
08138612a4 REORG: config: uninline warnifnotcap() and failifnotcap()
These ones are used by virtually every config parser. Not only they
provide no benefit in being inlined, but they imply a very deep
dependency starting at proxy.h, which results for example in task.c
including openssl.

Let's move these two functions to cfgparse.c.
2021-05-08 20:27:08 +02:00
Willy Tarreau
3b63ca20f4 REORG: stick-table: uninline stktable_alloc_data_type()
This function has no business being inlined in stick_table.h since it's
only used at boot time by the config parser. In addition it causes an
undesired dependency on tools.h because it uses parse_time_err(). Let's
move it to stick_table.c.
2021-05-08 20:24:09 +02:00
Willy Tarreau
e59b5169b3 BUILD: connection: move list_mux_proto() to connection.c
No idea why this was put inlined into connection.h, it's used only once
for haproxy -vv, and requires tools.h, causing an undesired dependency
from connection.h. Let's move it to connection.c instead where it ought
to have been.
2021-05-08 20:24:09 +02:00
Willy Tarreau
03f839d0ea BUILD: fcgi-app: include proxy.h in fcgi-app.c
It's needed for proxies_list and used to be inherited via cfgparse.h.
2021-05-08 20:24:09 +02:00
Willy Tarreau
daa6f1a33d BUILD: filters: include proxy.h in filters.c
It's needed for proxies_list and used to be inherited via cfgparse.h.
2021-05-08 20:24:09 +02:00
Willy Tarreau
7c6685770d BUILD: mworker: include proxy.h in mworker.c
It's needed for proxies_list and used to be inherited via cfgparse.h.
2021-05-08 20:24:09 +02:00
Willy Tarreau
817538e397 BUILD: sink: include proxy.h in sink.c
It's needed for proxies_list but was missing.
2021-05-08 20:24:09 +02:00
Willy Tarreau
b00a8e30f1 BUILD: server: include missing proxy.h in server.c
It's needed for a number of functions and definitions but was missing.
2021-05-08 20:24:09 +02:00
Willy Tarreau
ba6300ea62 BUILD: server: include tools.h from server.c
A lot of functions from tools.h are used there but the file was only
inherited via other ones.
2021-05-08 19:37:41 +02:00
Willy Tarreau
ce65cbec38 BUILD: udp: include tools.h from proto_udp.c
A few functions are used from there for address conversion but the
file wasn't included.
2021-05-08 13:59:56 +02:00
Willy Tarreau
c1a689f2eb BUILD: queue: include tools.h from queue.c
It uses memprintf() without including the file because it inherited
it from other ones.
2021-05-08 13:59:05 +02:00
Willy Tarreau
745e98ce79 BUILD: mworker: include tools.h from mworker.c
It needs it for memprintf() but didn't include the file.
2021-05-08 13:58:19 +02:00
Willy Tarreau
c624da06c6 BUILD: compression: include tools.h in compression.c
It needs it for memprintf() but it wasn't included.
2021-05-08 13:57:19 +02:00
Willy Tarreau
67046bfc50 BUILD: vars: include tools.h in vars.c
A number of functions from tools.h are used there but the file was not
included.
2021-05-08 13:56:31 +02:00
Willy Tarreau
485261beab BUILD: payload: include tools.h in payload.c
It needs it for memprintf() but used to inherit it via other include files.
2021-05-08 13:55:40 +02:00
Willy Tarreau
9f9e9fc20c BUILD: dns: include tools.h in dns.c
It is used for get_addr_len() without being included. It could be worth
splitting address manipulation functions to a different set of files.
2021-05-08 13:09:46 +02:00
Willy Tarreau
bf1ae1a4b1 BUILD: server-state: include tools.h from server_state.c
Many functions from tools.h are called there without the file being
included.
2021-05-08 13:08:34 +02:00
Willy Tarreau
908908ef2a BUILD: connection: include tools.h in connection.c
Several functions from tools.h are called there without the file being
included.
2021-05-08 13:07:31 +02:00
Willy Tarreau
4bad5e2080 BUILD: sink: include tools.h in sink.c
Several functions from tools.h are used in sink.c without tools.h being
included.
2021-05-08 13:05:30 +02:00
Willy Tarreau
ce6700aec5 BUILD: cache: include tools.h in cache.c
cache.c uses a lot of functions from tools.h without including it.
2021-05-08 13:03:55 +02:00
Willy Tarreau
523ca9d102 BUILD: session: include tools.h in session.c
The file session.c calls plenty of functions from tools.h but did not
include it.
2021-05-08 13:03:04 +02:00
Willy Tarreau
e684483ec5 BUILD: proxy: include tools.h in proxy.c
Many functions are used from tools.h but the file wasn't included and
was inherited through others.
2021-05-08 13:02:07 +02:00
Willy Tarreau
4cbf62d48a BUILD: htx: include tools.h in http_htx.c
Several functions from tools.h are called there and it used to be
inherited through others.
2021-05-08 13:01:23 +02:00
Willy Tarreau
e9dcb3cd8a BUILD: config: include tools.h in cfgparse-listen.c
Many functions defined in tools.h were called there but the file used
to be inherited via others.
2021-05-08 13:00:23 +02:00
Willy Tarreau
ca14dd5537 BUILD: resolvers: include tools.h
Many functions from tools.h are called there but it was inherited via others.
2021-05-08 12:59:47 +02:00
Willy Tarreau
e16ada16d9 BUILD: spoe: flt_spoe.c needs tools.h
It uses many functions declared there but used to inherit it through others.
2021-05-08 12:57:17 +02:00
Willy Tarreau
cc81ecac44 BUILD: config: cfgparse-ssl.c needs tools.h
It calls parse_time_err() which is defined there but used to inherit it
through others.
2021-05-08 12:54:42 +02:00
Willy Tarreau
cb72b7e028 BUILD: ssl: ssl_utils requires chunk.h
It uses chunk_printf() so it needs it. Currently it gets it through
others.
2021-05-08 12:52:56 +02:00
Willy Tarreau
15f9ac3c59 REORG: mworker: move proc_self from global to mworker
Only mworker uses proc_self, and it was declared in global.h, forcing
users of global.h to include mworker and its dependencies.

Moving it to mworker reduces the preprocessed size of version.c from
170 to 125kB by shrinking the number of local includes from 30 to 16
and the number of system includes from 147 to 132.
2021-05-08 12:34:44 +02:00
Willy Tarreau
e8ceea1345 BUILD: auth: include missing list.h
list_for_each_entry() requires list.h but used to inherit it by accident
through global.h and mworker-t.h. Let's explicitly add it.
2021-05-08 12:29:51 +02:00
Willy Tarreau
7f673c2cde BUILD: wdt: include signal-t.h
WDT_SIG is used there, thus signal-t.h is required. Currently it's
retrieved by accident through global.h.
2021-05-08 12:29:01 +02:00
Willy Tarreau
cfc4f24d80 REORG: vars: move the "proc" scope variables out of the global struct
The presence of this field causes a long dependency chain because almost
everyone includes global-t.h, and vars include sample_data which include
some system includes as well as HTTP parts.

There is absolutely no reason for having the process-wide variables in
the global struct, let's just move them into vars.c and vars.h. This
reduces from ~190k to ~170k the preprocessed output of version.c.
2021-05-08 12:11:29 +02:00
Willy Tarreau
9eec7e206e MINOR: config: mark tune.fd.edge-triggered as experimental
This one is stated as experimental in the doc but could still be used
by accidental copy-paste. Let's mark it with KWF_EXPERIMENTAL so that
users have to opt-in to use it.
2021-05-08 11:06:32 +02:00
Willy Tarreau
c5977728b3 MINOR: stats: make "show info" able to report rates as floats when asked
Now "show info float" will also report SSL rates, connection rates and
key reuse ratios as floats. This can be convenient at very low rates.

Note that the SSL reuse ratio which used to commonly oscillate between
0 and 1 under load is now more often above zero with small values. It
indicates that for better stability we shouldn't be comparing a key rate
with a connection rate but instead we should measure the reuse rate at
its source.
2021-05-08 10:52:12 +02:00
Willy Tarreau
e8abc3293f MINOR: stats: report uptime and start time as floats with subsecond resolution
When "show info float" is used, the uptime and start time will be reported
with subsecond resolution (microsecond actually since timeval is used).
2021-05-08 10:52:12 +02:00
Willy Tarreau
d37e26eaa6 MINOR: stats: use tv_remain() to precisely compute the uptime
We'll have to support reporting sub-second uptimes, so let's use the
appropriate function which will automatically adjust the tv_usec field.
In addition to this, it will also report a more accurate uptime thanks
to considering the sub-second part in the result.
2021-05-08 10:52:12 +02:00
Willy Tarreau
2745620240 MINOR: stats: support an optional "float" option to "show info"
This will allow some fields to be produced with a higher accuracy when
the requester indicates being able to parse floats. Rates and times are
among the elements which can make sense.
2021-05-08 10:52:12 +02:00
Willy Tarreau
0b26b3866c MINOR: stats: pass the appctx flags to stats_fill_info()
Currently the stats filling function knows nothing about the caller's
needs, so let's pass the STAT_* flags so that it can adapt to the
requester's constraints.
2021-05-08 10:52:12 +02:00
Willy Tarreau
6004fb7681 MINOR: stats: add the HTML conversion for float types
For the prometheus exporter, a new float type was added for the fields
and its conversion was added everywhere except for the HTML output.
Now that we have F2H() we can implement it for consistency.
2021-05-08 10:48:17 +02:00
Willy Tarreau
065ba3186e MINOR: stats: avoid excessive padding of float values with trailing zeroes
When emitting stats, we don't need to have 6 zeroes after the decimal point
for each value, so let's trim floating point numbers to the longest needed
only.
2021-05-08 10:48:17 +02:00
Willy Tarreau
ae03d26eea MINOR: tools: add a float-to-ascii conversion function
We already had ultoa_r() and friends but nothing to emit inline floats.
This is now done with ftoa_r() and F2A/F2H. Note that the latter both use
the itoa_str[] as temporary storage and that the HTML format currently is
the exact same as the ASCII one. The trailing zeroes are always timmed so
these outputs are usable in user-visible output.
2021-05-08 10:48:17 +02:00
Willy Tarreau
56d1d8dab0 MINOR: tools: implement trimming of floating point numbers
When using "%f" to print a float, it automatically gets 6 digits after
the decimal point and there's no way to automatically adjust to the
required ones by dropping trailing zeroes. This function does exactly
this and automatically drops the decimal point if all digits after it
were zeroes. This will make numbers more friendly in stats and makes
outputs shorter (e.g. JSON where everything is just a "number").

The function is designed to be easy to use with snprint() and chunks:

  snprintf:
    flt_trim(buf, 0, snprintf(buf, sizeof(buf), "%f", x));

  chunk_printf:
    out->data = flt_trim(out->area, 0, chunk_printf(out, "%f", x));

  chunk_appendf:
    size_t prev_data = out->data;
    out->data = flt_trim(out->area, prev_data, chunk_appendf(out, "%f", x));
2021-05-08 10:42:11 +02:00
Willy Tarreau
a1169b6231 MINOR: sample: improve error reporting on missing arg to strcmp() converter
Calling the strcmp() converter with no argument yields this strange error:

  [ALERT]    (31439) : parsing [test.cfg:3] : error detected in frontend 'f' while parsing 'http-request redirect' rule : failed to parse sample expression <src,strcmp]> : invalid args in converter 'strcmp' : failed to register variable name ''.

This is because the vars name check tries to see if it can create such a
variable having an empty name. Let's at least make a special case of the
missing argument. Now we can read a more explicit:

  [ALERT]    (31655) : parsing [test.cfg:3] : error detected in frontend 'f' while parsing 'http-request redirect' rule : failed to parse sample expression <src,strcmp]> : invalid args in converter 'strcmp' : missing variable name.

This was done for secure_strcmp() as well.
2021-05-08 06:55:25 +02:00
Amaury Denoyelle
24abb0cdc1 BUG/MINOR: server: do not report diag for peer servers with null weight
Only check servers attached to a proxy with PR_CAP_LB.

This does not need to be backported as the diag message was added in the
current 2.4-dev branch.
2021-05-07 15:20:54 +02:00
Amaury Denoyelle
b979f59871 MINOR: proxy: define PR_CAP_LB
Add a new proxy capability for proxy with load-balancing capabilities.
This help to differentiate listen/frontend/backend with special proxies
such as peer proxies.
2021-05-07 15:12:20 +02:00
Amaury Denoyelle
86c1d0fddb BUILD: fix usage of ha_alert without format string
The compilation is failing due to no format string used in ha_alert.
This does not need to be backported.
2021-05-07 15:07:21 +02:00
Amaury Denoyelle
a9e639afe2 MINOR: http_act: mark normalize-uri as experimental
normalize-uri http rule is marked as experimental, so it cannot be
activated without the global 'expose-experimental-directives'. The
associated vtc is updated to be able to use it.
2021-05-07 14:35:02 +02:00
Amaury Denoyelle
5dfdf3e5b0 MINOR: stats: report tainted on show info
Add a new info field ST_F_TAINTED to dump tainted status at the end of
the 'show info' output.
2021-05-07 14:35:02 +02:00
Amaury Denoyelle
f492992065 MINOR: cli: set tainted when using CLI expert/experimental mode
Mark the process as tainted as soon as a command command only accessible
in expert or experimental mode is executed.
2021-05-07 14:35:02 +02:00
Amaury Denoyelle
0351773534 MINOR: action: implement experimental actions
Support experimental actions. It is mandatory to use
'expose-experimental-directives' before to be able to use them.

If such action is present in the config file, the tainted status of the
process is updated. Another tainted status is set when an experimental
action is executed.
2021-05-07 14:35:02 +02:00
Amaury Denoyelle
e4a617c931 MINOR: action: replace match_pfx by a keyword flags field
Define a new keyword flag KWF_MATCH_PREFIX. This is used to replace the
match_pfx field of action struct.

This has the benefit to have more explicit action declaration, and now
it is possible to quickly implement experimental actions.
2021-05-07 14:35:01 +02:00
Amaury Denoyelle
d2e53cd47e MINOR: cfgparse: implement experimental config keywords
Add a new flag to mark a keyword as experimental. An experimental
keyword cannot be used if the global 'expose-experimental-directives' is
not present first.

Only keywords parsed through a standard cfg_keywords lists in
global/proxies section will be automatically detected if declared
experimental. To support a keyword outside of these lists,
check_kw_experimental must be called manually during its parsing.

If an experimental keyword is present in the config, the tainted flag is
updated.

For the moment, no keyword is marked as experimental.
2021-05-07 14:34:41 +02:00
Amaury Denoyelle
484454d906 MINOR: global: define tainted flag
Add a global flag named 'tainted'. Its purpose is to report various
status about experimental features used for the current process
lifetime.

By default it is initialized to 0. It can be set/retrieve by a couple of
new functions mark_tainted()/get_tainted(). Once a flag is set, it
cannot be resetted.

Currently, no tainted status is implemented, it will be the subject of
the following commits.
2021-05-07 14:12:27 +02:00
Christopher Faulet
ea86083718 BUG/MINOR: checks: Reschedule check on observe mode only if fastinter is set
On observe mode, if a server is marked as DOWN, the server's health-check is
rescheduled using the fastinter timeout if the new expiration date is newer
that the current one. But this must only be performed if the fastinter
timeout is defined.

Internally, tick_is_lt() function only checks the date and does not perform any
verification on the provided args. Thus, we must take care of it. However, it is
possible to disable the server health-check by setting its task expiration date
to TICK_ETERNITY.

This patch must be backported as far as 2.2. It is related to
2021-05-07 12:10:30 +02:00
Christopher Faulet
92017a3215 BUG/MINOR: checks: Handle synchronous connect when a tcpcheck is started
A connection may be synchronously established. In the tcpcheck context, it
may be a problem if several connections come one after another. In this
case, there is no event to close the very first connection before starting
the next one. The checks is thus blocked and timed out, a L7 timeout error
is reported.

To fix the bug, when a tcpcheck is started, we immediately evaluate its
state. Most of time, nothing is performed and we must wait. But it is thus
possible to handle the result of a successfull connection.

This patch should fix the issue #1234. It must be backported as far as 2.2.
2021-05-07 12:00:56 +02:00
Christopher Faulet
30aa0da532 BUG/MINOR: stream: Reset stream final state and si error type on L7 retry
Thanks to a previous fix, the stream error mask is now cleared on L7
retry. But the stream final state (SF_FINST_*) and the stream-interface
error type must also be reset to properly restart a new connection and be
sure to not inherit errors from the previous connection attempt.

In addition, SF_ADDR_SET flag is not systematically removed.
stream_choose_redispatch() already takes care to unset it if necessary. When
the connection is not redispatch, the server address can be preserved.

This patch must be backported as far as 2.0.
2021-05-07 12:00:56 +02:00
Willy Tarreau
b205bfdab7 CLEANUP: cli/tree-wide: properly re-align the CLI commands' help messages
There were 102 CLI commands whose help were zig-zagging all along the dump
making them unreadable. This patch realigns all these messages so that the
command now uses up to 40 characters before the delimiting colon. About a
third of the commands did not correctly list their arguments which were
added after the first version, so they were all updated. Some abuses of
the term "id" were fixed to use a more explanatory term. The
"set ssl ocsp-response" command was not listed because it lacked a help
message, this was fixed as well. The deprecated enable/disable commands
for agent/health/server were prominently written as deprecated. Whenever
possible, clearer explanations were provided.
2021-05-07 11:51:26 +02:00
Willy Tarreau
7190b987ab MINOR: config: add a new message directive: .diag
This one works just like .notice/.warning/.alert except that it prints
the message at level "DIAG" only when haproxy runs in diagnostic mode
(-dD). This can be convenient for example to pass a few hints to help
locate certain config parts or to leave messages about certain temporary
workarounds.

Example:

  .diag "WTA/2021-05-07: $.LINE: replace 'redirect' with 'return' after final switch to 2.4"
         http-request redirect location /goaway if ABUSE
2021-05-07 09:06:40 +02:00
Willy Tarreau
9f903af510 MEDIUM: log: slightly refine the output format of alerts/warnings/etc
For about 20 years we've been emitting cryptic messages on warnings and
alerts, that nobody knows how to parse:

  [NOTICE] 126/080118 (3115) : haproxy version is 2.4-dev18-0b7c78-49
  [NOTICE] 126/080118 (3115) : path to executable is ./haproxy
  [WARNING] 126/080119 (3115) : Server default/srv1 is DOWN via static/srv1. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
  [ALERT] 126/080119 (3115) : backend 'default' has no server available!

Hint: the first 3-digit number is the day of year, and the 6 digits
after it represent the time of day in format HHMMSS, then the pid in
parenthesis. These are not quite user-friendly and such cryptic into
are not useful at all.

This patch slightly adjusts the output by performing these minimal changes:
  - removing the date/time, as they were added very early when haproxy
    was meant to be used in foreground as a debugging tool, and they're
    provided in more details in logs nowadays ;

  - better aligning the fields by padding the severity tag to 10 chars.
    The diag output was renamed to "DIAG" only.

Now the output provides this:

  [NOTICE]   (4563) : haproxy version is 2.4-dev18-75a428-51
  [NOTICE]   (4563) : path to executable is ./haproxy
  [WARNING]  (4563) : Server default/srv1 is DOWN via static/srv1. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
  [ALERT]    (4563) : backend 'default' has no server available!

The useless space before the colon was kept so as not to confuse any
possible output parser.

The few entries in the doc referring to this format were adjusted to
reflect the new one.

The change was tagged "MEDIUM" as it may have visible consequences on
home-grown monitoring tools, though it is extremely unlikely due to the
limited extent of these changes.
2021-05-07 08:55:11 +02:00
Willy Tarreau
75a4284bab BUG/MINOR: stream: properly clear the previous error mask on L7 retries
The cleanup of the previous error was incorrect on L7 retries, it would
OR two values while they're part of an enum, leaving some bits set.
Depending on the errors it was possible to occasionally see an internal
error ("I" flag) being logged.

This should be backported as far as 2.0, though the do_l7_retry() function
in in proto_htx.c in older versions.
2021-05-07 08:22:16 +02:00
Willy Tarreau
2639e2edc2 BUG/MINOR: activity: use the new pointer to calculate the new size in realloc()
When memory profiling is enabled, realloc() can occasionally get the area
size wrong due to the wrong pointer being used to check the new size. When
the old area gets unmapped in the operation, this may even result in a
crash. There's no impact without memory profiling though.

No backport is needed as this is exclusively 2.4-dev.
2021-05-07 08:01:35 +02:00
Willy Tarreau
0b7c78aa05 MINOR: config: add predicates "version_atleast" and "version_before" to cond blocks
These predicates respectively verify that the current version is at least
a given version or is before a specific one. The syntax is exactly the one
reported by "haproxy -v", though each component is optional, so both "1.5"
and "2.4-dev18-88910-48" are supported. Missing components equal zero, and
"dev" is below "pre" or "rc", which are both inferior to no such mention
(i.e. they are negative). Thus "2.4-dev18" is older than "2.4-rc1" which
is older than "2.4".
2021-05-06 17:04:45 +02:00
Willy Tarreau
58ca706e16 MINOR: config: add predicate "feature" to detect certain built-in features
The "feature(name)" predicate will return true if <name> corresponds to
a name listed after a '+' in the features list, that is it was enabled at
build time with USE_<name>=1. Typical use cases will include OPENSSL, LUA
and LINUX_SPLICE. But maybe it will also be convenient to use with optional
addons such as PROMEX and the device detection modules to help keeping the
same configs across various deployments.
2021-05-06 17:02:36 +02:00
Willy Tarreau
6492e87b0e MINOR: config: add predicates "streq()" and "strneq()" to conditional expressions
"streq(str1,str2)" will return true if the two strings match while
"strneq(str1,str2)" will return true only if they differ. This is
convenient to match an environment variable against a predefined value.
2021-05-06 17:02:36 +02:00
Willy Tarreau
42ed14b529 MINOR: config: add predicate "defined()" to conditional expression blocks
"defined(name)" will return true if <name> is a defined environment variable
otherwise false, regardless of its contents.
2021-05-06 17:02:36 +02:00
Willy Tarreau
732525fae7 MINOR: config: make cfg_eval_condition() support predicates with arguments
Now we can look up a list of known predicates and pre-parse their
arguments. For now the list is empty. The code needed to be arranged with
a common exit point to release all arguments because there's no default
argument freeing function (it likely only used to exist in the deinit
code). Since we only support simple arguments for now it's no big deal,
only a 2-liner loop.
2021-05-06 17:02:36 +02:00
Willy Tarreau
299bd1c3ae MINOR: config: improve .if condition error reporting
Let's return the position of the first unparsable character on error,
so that instead of just saying "unparsable conditional expression blah"
we can have:

  [ALERT] 125/150618 (13995) : parsing [test-conds2.cfg:1]: unparsable conditional expression '12/blah' in '.if' at position 1:
    .if 12/blah
        ^
This is important because conditions will be made from environment
variables or later from more complex expressions where the error will
not always be easy to locate.
2021-05-06 17:02:36 +02:00
Willy Tarreau
a43dfda4e1 MINOR: global: add version comparison functions
The new function split_version() converts a parsable haproxy version to
an array of integers. The function compare_current_version() compares an
arbitrary version to the current one. These two functions were written
by Thierry Fournier in 2013, and are still usable as-is. They will be
used to write config language predicates.
2021-05-06 17:02:36 +02:00
Willy Tarreau
f0d3b732fb MINOR: global: export the build features string list
Till now it was only presented in the version output but could not be
consulted outside of haproxy.c, let's export it as a variable, and set
it to an empty string if not defined.
2021-05-06 17:02:36 +02:00
Willy Tarreau
3e293a9135 MINOR: arg: improve the error message on missing closing parenthesis
When the closing brace is missing after an argument (acl, ...), the
error may report something like "expected ')' before ''". Let's just
drop "before ''" when the final word is empty to make the message a
bit clearer.
2021-05-06 17:02:36 +02:00
Willy Tarreau
7541056aa0 BUILD: activity: do not include malloc.h
It doesn't exist on MacOS and broke the build. We don't need it as it's
already included by compat.h when relevant. No backport is needed.
2021-05-06 11:38:41 +02:00
Willy Tarreau
a46f1af2b1 MINOR: config: support some pseudo-variables for file/line/section
The new pseudo-variables ".FILE", ".LINE" and ".SECTION" will be resolved
on the fly by the config parser and will respectively retrieve the current
configuration file name, the current line number and the current section
being parsed. This may help emit logs, errors, and debugging information
(e.g. which rule matched).

The '.' in the first char was reserved for such pseudo-variables and no
other variable is permitted. This will allow to add support for new ones
in the future if they prove to be useful (e.g. randoms/uuid for secret
keying or automatic naming of configuration objects).
2021-05-06 10:36:38 +02:00
Willy Tarreau
5150805a5c MINOR: config: keep up-to-date current file/line/section in the global struct
Let's add a few fields to the global struct to store information about
the current file being processed, the current line number and the current
section. This will be used to retrieve them using special variables.
2021-05-06 10:35:03 +02:00
Willy Tarreau
6a2110c717 MINOR: config: centralize the ".if"/".elif" condition parser and evaluator
Instead of duplicating the condition evaluations, let's have a single
function cfg_eval_condition() that returns true/false/error. It takes
less code and will ease its extension.
2021-05-06 10:35:03 +02:00
Willy Tarreau
71990e6bec BUG/MINOR: config: .if/.elif should also accept negative integers
The doc about .if/.elif config block conditions says:

  a non-nul integer (e.g. '1'), always returns "true"

So we must accept negative integers as well. The test was made on
atoi() > 0.

No backport is needed, this is only 2.4.
2021-05-06 10:35:03 +02:00
Willy Tarreau
f67ff02072 BUG/MINOR: config: add a missing "ELIF_TAKE" test for ".elif" condition evaluator
This missing state was causing a second elif condition to be evaluated
after a first one succeeded after a .if failed. For example in the test
below the else would be executed:

     .if    0
     .elif  1
     .elif  0
     .else
     .endif

No backport is needed, this is 2.4-only.
2021-05-06 10:35:03 +02:00
Willy Tarreau
6e647c94f2 BUG/MINOR: config: fix uninitialized initial state in ".if" block evaluator
The condition to skip the block in the ".if" evaluator forgot to check
that the level was high enough, resulting in rare cases where a random
value matched one of the 5 values that cause the block to be skipped.

No backport is needed as it's 2.4-only.
2021-05-06 10:35:03 +02:00
Christopher Faulet
e763c8c99f BUG/MINOR: stream: Decrement server current session counter on L7 retry
When a L7 retry is performed, we must not forget to decrement the current
session counter of the assigned server. Of course, it must only be done if
the current session is already counted on the server, thus if SF_CURR_SESS
flag is set on the stream.

This patch is related to the issue #1003. It must be backported as far as
2.0.
2021-05-06 09:21:12 +02:00
Christopher Faulet
10a8670f28 MINOR: mux-h1: Manage processing blocking flags on the H1 stream
Because H1C_F_RX_BLK and H1C_F_TX_BLK flags now only concerns data
processing, at the H1 stream level, there is no reason to still manage them
on the H1 connection. Thus, these flags are now set on the H1 stream.
2021-05-06 09:21:00 +02:00
Christopher Faulet
14ee9b8c8b CLEANUP: mux-h1: rename WAIT_INPUT/WAIT_OUTPUT flags
These flags are used to block, respectively, the output and the input
processing. Thus, to be more explicit, H1C_F_WAIT_INPUT is renamed to
H1C_F_TX_BLK and H1C_F_WAIT_OUTPUT is renamed to H1C_F_RX_BLK.
2021-05-06 09:21:00 +02:00
Christopher Faulet
02c92c3e6f MEDIUM: mux-h1: Wake H1 stream when both sides a synchronized
Instead of subscribing for reads or sends to restart data processing, when
both sides are synchronized, the H1 stream is woken up. This happens when
H1C_F_WAIT_INPUT or H1C_F_WAIT_OUTPUT flags are removed, Indeed, these flags
block the data processing and not raw data sending or receiving.
2021-05-06 09:21:00 +02:00
Christopher Faulet
94d35108b4 MINOR: mux-h1: Always subscribe for reads when splicing is disabled
In h1_rcv_pipe(), when the splicing is not possible or disabled at the end
of the fnuction, we make sure to subscribe for reads. It is not a bug but it
avoid an extra call to h1_rcv_pipe() to handle the subscription in some
cases (end of message, end of chunk or read0).

In addition, the condition to detect end of splicing has been simplified. We
now only rely on H1C_F_WANT_SPLICE flags.
2021-05-06 09:21:00 +02:00
Christopher Faulet
8454f2dbbc MINOR: mux-h1: Subscribe for sends if output buffer is not empty in h1_snd_pipe
In h1_snd_pipe(), before sending spliced data, we take care to flush the
output buffer by subscribing for sends. However, the condition to do so is
not accurate. We test data remaining in the pipe. It works but it also
unnecessarily subscribes H1C for sends when the output buffer is empty if we
are unable to send all spliced data in one time. Instead, H1C is now
subscribed for sends if output buffer is not empty.
2021-05-06 09:21:00 +02:00
Christopher Faulet
2b861bf723 MINOR: mux-h1: clean up conditions to enabled and disabled splicing
First, there is no reason to announce the splicing support at the
conn-stream level when it is created, at least for now. GTUNE_USE_SPLICE
option is already handled at the stream level.

Second, in h1_rcv_buf(), there is no reason to test the message state to
switch the H1C in splicing mode (via H1C_F_WANT_SPLICE flag).
h1_process_input() already takes care to set CS_FL_MAY_SPLICE flag on the
conn-stream when appropriate. Thus, in h1_rcv_buf(), we can rely on this
flag to change the H1C state.

Finally, if h1_rcv_pipe() is called, it means the H1C is already in the
splicing mode. H1C_F_WANT_SPLICE flag is necessarily already set. Thus no
reason to force it.
2021-05-06 09:21:00 +02:00
Christopher Faulet
1baef1523d BUG/MEDIUM: mux-h1: Properly report client close if abortonclose option is set
On client side, if CO_RFL_KEEP_RECV flags is set when h1_rcv_buf() is
called, we force subscription for reads to be able to catch read0. This way,
the event will be reported to upper layer to let the stream abort the
request.

This patch fixes the abortonclose option for H1 connections. It depends on
following patches :

  * MEDIUM: mux-h1: Don't block reads when waiting for the other side
  * MINOR: conn-stream: Force mux to wait for read events if abortonclose is set

But to be sure the event is handled by the stream, the following patches are
also required :

  * BUG/MINOR: stream-int: Don't block reads in si_update_rx() if chn may receive
  * MINOR: channel: Rely on HTX version if appropriate in channel_may_recv()

All the series must be backported with caution as far as 2.0, and only after
a period of observation to be sure nothing broke.
2021-05-06 09:19:06 +02:00
Christopher Faulet
ec4207cb68 MEDIUM: mux-h1: Don't block reads when waiting for the other side
When we are waiting for the other side to read more data, or to read the
next request, we must only stop the processing of input data and not the
data receipt. This patch don't change anything on the subscribes for
reads. So it should not change anything. The only difference is that the H1
connection will try to read data if it is woken up for an I/O event and if
it was subscribed for reads.

This patch is required to fix abortonclose option for H1 client connections.
2021-05-06 09:19:06 +02:00
Christopher Faulet
d8219b31e7 MINOR: conn-stream: Force mux to wait for read events if abortonclose is set
When the abortonclose option is enabled, to be sure to be immediately
notified when a shutdown is received from the client, the frontend
conn-stream must be sure the mux will wait for read events. To do so, the
CO_RFL_KEEP_RECV flag is set when mux->rcv_buf() is called. This new flag
instructs the mux to wait for read events, regardless its internal state.

This patch is required to fix abortonclose option for H1 client connections.
2021-05-06 09:19:05 +02:00
Christopher Faulet
e0dec4b7b2 BUG/MINOR: stream-int: Don't block reads in si_update_rx() if chn may receive
In si_update_rx() function, the reads may be blocked because we explicitly
don't want to read or because of a lack of room in the input buffer. The
first condition is valid. However the second one only test if the channel is
empty or not. It means the reads are blocked if there are still some output
data in the input channel, in its buffer or its pipe. This condition is not
accurate. The reads must not be blocked if the channel can still receive
data. Thus instead of relying on channel_is_empty() function, we now call
channel_may_recv().

This patch is especially useful to be able to catch read0 on client side
when we are waiting for a connection to the server, when abortonclose option
is enabled. Otherwise, the client abort is not detected.

This patch depends on "MINOR: channel: Rely on HTX version if appropriate in
channel_may_recv()". Both must be backported as far as 2.0 after a period of
observation to be sure nothing broke.
2021-05-06 09:19:05 +02:00
Willy Tarreau
ca3afc2456 MINOR: activity: add the profiling.memory global setting
This allows to enable/disable memory usage profiling very early, which
can be convenient to trace the memory usage in maps, certificates, Lua
etc.
2021-05-05 19:09:19 +02:00
Willy Tarreau
993d44d234 MINOR: activity: make "show profiling" also dump the memoery usage
Now the memory usage stats are dumped. They are first sorted by total
alloc+free so that the first ones are always the most relevant, and
that most symmetric alloc/free pairs appear next to each other. This
way it becomes convenient to only show a small part of them such as:

    show profiling memory 20

It's worth noting that the sorting is performed upon each call to the
iohandler so it is technically possible that an entry could appear
twice or be dropped if the ordering changes between two calls. In
practice it is not an issue but it's worth being mentioned.
2021-05-05 19:09:19 +02:00
Willy Tarreau
42712cb6d4 MINOR: activity: make "show profiling" support a few arguments
These ones allow to limit the output to only certain sections and/or
a number of lines per dump.
2021-05-05 19:09:19 +02:00
Willy Tarreau
637d85a93e MINOR: activity: clean up the show profiling io_handler a little bit
Let's rearrange it to make it more configurable and allow to iterate
over multiple parts (header, tasks, memory etc), to restart from a
given line number (previously it didn't work, though fortunately it
didn't happen), and to support dumping only certain parts and a given
number of lines. A few entries from ctx.cli are now used to store a
restart point and the current step.
2021-05-05 19:09:19 +02:00
Willy Tarreau
f93c7be87f MEDIUM: activity: collect memory allocator statistics with USE_MEMORY_PROFILING
When built with USE_MEMORY_PROFILING the main memory allocation functions
are diverted to collect statistics per caller. It is a bit tricky because
the only way to call the original ones is to find their pointer, which
requires dlsym(), and which is not available everywhere.

Thus all functions are designed to call their fallback function (the
original one), which is preset to an initialization function that is
supposed to call dlsym() to resolve the missing symbols, and vanish.
This saves expensive tests in the critical path.

A second problem is that dlsym() calls calloc() to initialize some
error messages. After plenty of tests with posix_memalign(), valloc()
and friends, it turns out that returning NULL still makes it happy.
Thus we currently use a visit counter (in_memprof) to detect if we're
reentering, in which case all allocation functions return NULL.

In order to convert a return address to an entry in the stats, we
perform a cheap hash consisting in multiplying the pointer by a
balanced number (as many zeros as ones) and keeping the middle bits.
The hash is already pretty good like this, achieving to store up to
638 entries in a 2048-entry table without collision. But in order to
further refine this and improve the fill ratio of the table, in case
of collision we move up to 16 adjacent entries to find a free place.
This remains quite cheap and manages to store all of these inside a
1024-entries hash table with even less risk of collision.

Also, free(NULL) does not produce any stats. By doing so we reduce
from 638 to 208 the average number of entries needed for a basic
config using SSL. free(NULL) not only provides no information as it's
a NOP, but keeping it is pure pollution as it happens all the time.

When DEBUG_MEM_STATS is enabled, malloc/calloc/realloc are redefined as
macros, preventing the code from compiling. Thus, when this option is
detected, the macros are undefined as they are pointless there anyway.

The functions are optimized to quickly jump to the fallback and as such
become almost invisible in terms of processing time, execpt an extra
"if" on a read_mostly variable and a jump. Considering that this only
happens for pool misses and library routines, this remains acceptable.

Performance tests in SSL (the most stressful test) shows less than 1%
performance loss when profiling is enabled on 2c4t.

The code was written in a way to ease backporting to modern versions
(2.2+) if needed, so it keeps the long names for integers and doesn't
use the _INC version of the atomic ops.
2021-05-05 19:09:19 +02:00
Willy Tarreau
db87fc7d36 MINOR: activity: declare the storage for memory usage statistics
We'll need to store for each call place, the pointer to the caller
(the return address to be more exact as with free() it's not uncommon
to see tail calls), the number of calls to alloc/free and the total
alloc/free bytes. realloc() will be counted either as alloc or free
depending on the balance of the size before vs after.

We store 1024+1 entries. The first ones are used as hashes and the
last one for collisions.

When profiling is enabled via the CLI, all the stats are reset.
2021-05-05 18:55:28 +02:00
Willy Tarreau
00dd44f67f MINOR: activity: add a "memory" entry to "profiling"
This adds the necessary flags to permit run-time enabling/disabling of
memory profiling. For now this is disabled.

A few words were added to the management doc about it and recalling that
this is limited to certain OSes.
2021-05-05 18:55:02 +02:00
Willy Tarreau
ef7380f916 CLEANUP: activity: mark the profiling and task_profiling_mask __read_mostly
These ones are only read by the scheduler and occasionally written to
by the CLI parser, so let's move them to read_mostly so that they do
not risk to suffer from cache line pollution.
2021-05-05 18:38:05 +02:00
Willy Tarreau
64192392c4 MINOR: tools: add functions to retrieve the address of a symbol
get_sym_curr_addr() will return the address of the first occurrence of
the given symbol while get_sym_next_addr() will return the address of
the next occurrence of the symbol. These ones return NULL on non-linux,
non-ELF, non-USE_DL.
2021-05-05 16:24:52 +02:00
Amaury Denoyelle
d3a88c1c32 MEDIUM: connection: close front idling connection on soft-stop
Implement a safe mechanism to close front idling connection which
prevents the soft-stop to complete. Every h1/h2 front connection is
added in a new per-thread list instance. On shutdown, a new task is
waking up which calls wake mux operation on every connection still
present in the new list.

A new stopping_list attach point has been added in the connection
structure. As this member is only used for frontend connections, it
shared the same union as the session_list reserved for backend
connections.
2021-05-05 14:39:23 +02:00
Amaury Denoyelle
efc6e95642 MEDIUM: mux_h1: release idling frontend conns on soft-stop
In h1_process, if the proxy of a frontend connection is disabled,
release the connection.

This commit is in preparation to properly close idling front connections
on soft-stop. h1_process must still be called, this will be done via a
dedicated task which monitors the global variable stopping.
2021-05-05 14:35:36 +02:00
Amaury Denoyelle
3109ccfe70 MINOR: srv: close all idle connections on shutdown
Implement a function to close all server idle connections. This function
is called via a global deinit server handler.

The main objective is to prevents from leaving sockets in TIME_WAIT
state. To limit the set of operations on shutdown and prevents
tasks rescheduling, only the ctrl stack closing is done.
2021-05-05 14:33:51 +02:00
Willy Tarreau
1ab6c0bfd2 MINOR: pools/debug: slightly relax DEBUG_DONT_SHARE_POOLS
The purpose of this debugging option was to prevent certain pools from
masking other ones when they were shared. For example, task, http_txn,
h2s, h1s, h1c, session, fcgi_strm, and connection are all 192 bytes and
would normally be mergedi, but not with this option. The problem is that
certain pools are declared multiple times with various parameters, which
are often very close, and due to the way the option works, they're not
shared either. Good examples of this are captures and stick tables. Some
configurations have large numbers of stick-tables of pretty similar types
and it's very common to end up with the following when the option is
enabled:

  $ socat - /tmp/sock1  <<< "show pools" | grep stick
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753800=56
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753880=57
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753900=58
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753980=59
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753a00=60
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753a80=61
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753b00=62
    - Pool sticktables (224 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753780=55

In addition to not being convenient, it can have important effects on the
memory usage because these pools will not share their entries, so one stick
table cannot allocate from another one's pool.

This patch solves this by going back to the initial goal which was not to
have different pools in the same list. Instead of masking the MAP_F_SHARED
flag, it simply adds a test on the pool's name, and disables pool sharing
if the names differ. This way pools are not shared unless they're of the
same name and size, which doesn't hinder debugging. The same test above
now returns this:

  $ socat - /tmp/sock1  <<< "show pools" | grep stick
    - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 7 users, @0x3fadb30 [SHARED]
    - Pool sticktables (224 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x3facaa0 [SHARED]

This is much better. This should probably be backported, in order to limit
the side effects of DEBUG_DONT_SHARE_POOLS being enabled in production.
2021-05-05 07:47:29 +02:00
Willy Tarreau
48129be18a MINOR: debug: add a new "debug dev sym" command in expert mode
This command attempts to resolve a pointer to a symbol name. This is
convenient during development as it's easier to get such pointers live
than by issuing a debugger or calling addr2line.
2021-05-05 07:47:29 +02:00
William Lallemand
5ba80d677d BUG/MINOR: ssl/cli: fix a lock leak when no memory available
This bug was introduced in e5ff4ad ("BUG/MINOR: ssl: fix a trash buffer
leak in some error cases").

When cli_parse_set_cert() returns because alloc_trash_chunk() failed, it
does not unlock the spinlock which can lead to a deadlock later.

Must be backported as far as 2.1 where e5ff4ad was backported.
2021-05-04 16:40:44 +02:00
Willy Tarreau
18b2a9dd87 BUG/MEDIUM: cli: prevent memory leak on write errors
Since the introduction of payload support on the CLI in 1.9-dev1 by
commit abbf60710 ("MEDIUM: cli: Add payload support"), a chunk is
temporarily allocated for the CLI to support defragmenting a payload
passed with a command. However it's only released when passing via
the CLI_ST_END state (i.e. on clean shutdown), but not on errors.
Something as trivial as:

  $ while :; do ncat --send-only -U /path/to/cli <<< "show stat"; done

with a few hundreds of servers is enough see the number of allocated
trash chunks go through the roof in "show pools".

This needs to be backported as far as 2.0.
2021-05-04 16:27:45 +02:00
Christopher Faulet
c31b200872 BUG/MINOR: hlua: Don't rely on top of the stack when using Lua buffers
When the lua buffers are used, a variable number of stack slots may be
used. Thus we cannot assume that we know where the top of the stack is. It
was not an issue for lua < 5.4.3 (at least for small buffers). But
'socket:receive()' now fails with lua 5.4.3 because a light userdata is
systematically pushed on the top of the stack when a buffer is initialized.

To fix the bug, in hlua_socket_receive(), we save the index of the top of
the stack before creating the buffer. This way, we can check the number of
arguments, regardless anything was pushed on the stack or not.

Note that the other buffer usages seem to be safe.

This patch should solve the issue #1240. It should be backport to all stable
branches.
2021-05-03 10:34:48 +02:00
Willy Tarreau
29202013c1 CLEANUP: map/cli: properly align the map/acl help
Due to extra options on some commands, the help started to become
a bit of a mess, so let's realign all the commands.
2021-04-30 15:36:31 +02:00
Willy Tarreau
bb51c44d64 MINOR: map/acl: make "add map/acl" support an optional version number
By passing a version number to "add map/acl", it becomes possible to
atomically replace maps and ACLs. The principle is that a new version
number is first retrieved by calling"prepare map/acl", and this version
number is used with "add map" and "add acl". Newly added entries then
remain invisible to the matching mechanism but are visible in "show
map/acl" when the version number is specified, or may be cleard with
"clear map/acl". Finally when the insertion is complete, a
"commit map/acl" command must be issued, and the version is atomically
updated so that there is no intermediate state with incomplete entries.
2021-04-30 15:36:31 +02:00
Willy Tarreau
7a562ca809 MINOR: map/acl: add the "commit map/acl" CLI command
The command is used to atomically replace a map/acl with the pending
contents of the designated version. The new version must have been
allocated by "prepare map/acl" prior to this. At the moment it is not
possible to force the version when adding new entries, so this may only
be used to atomically clear an ACL/map.
2021-04-30 15:36:31 +02:00
Willy Tarreau
97218ce3a9 MINOR: map/acl: add the "prepare map/acl" CLI command
This command allocates a new version for the map/acl, that will be usable
later to prepare the addition of new values to atomically replace existing
ones. Technically speaking the operation consists in atomically incrementing
the next version. There's no "undo" operation here, if a version is not
committed, it will automatically be trashed when committing a newer version.
2021-04-30 15:36:31 +02:00
Willy Tarreau
ff3feeb5cf MINOR: map/acl: add the possibility to specify the version in "clear map/acl"
This will ease maintenance of versionned maps by allowing to clear old or
failed updates instead of the current version. Nothing was done to allow
clearing everyhing, though if there was a need for this, implementing "@all"
or something equivalent wouldn't require more than 3 lines of code.
2021-04-30 15:36:31 +02:00
Willy Tarreau
a13afe6535 MINOR: pattern: support purging arbitrary ranges of generations
Instead of being able to purge only values older than a specific value,
let's support arbitrary ranges and make pat_ref_purge_older() just be
one special case of this one.
2021-04-30 15:36:31 +02:00
Willy Tarreau
95f753e403 MINOR: map/acl: add the possibility to specify the version in "show map/acl"
The maps and ACLs internally all have two versions, the "current" one,
which is the one being matched against, and the "next" one, the one being
filled during an atomic replacement. Till now the "show" commands only used
to show the current one but it can be convenient to be able to show other
ones as well, so let's add the ability to do this with "show map" and
"show acl". The method used here consists in passing the version number
as "@<ver>" before the map/acl name or ID. It would have been better after
it but that could create confusion with keys already using such a format.
2021-04-30 15:36:31 +02:00
Willy Tarreau
e3a42a6c2d MINOR: map: show the current and next pattern version in "show map"
The "show map" command wasn't updated when pattern generations were
added for atomic reloads, let's report them in the "show map" command
that lists all known maps. It will be useful for users.
2021-04-30 15:36:31 +02:00
Willy Tarreau
4053b03caa MINOR: map: get rid of map_add_key_value()
This function was only used once in cli_parse_add_map(), and half of the
work it used to do was already known from the caller or testable outside
of the lock. Given that we'll need to modify it soon to pass a generation
number, let's remerge it in the caller instead, using pat_ref_load() which
is the one we'll need.
2021-04-30 15:36:31 +02:00
Willy Tarreau
f7dd0e8796 CLEANUP: map: slightly reorder the add map function
The function uses two distinct code paths for single the key/value pair
and multiple pairs inserted as payload, each with a copy-paste of the
error handling. Let's modify the loop to factor them out.
2021-04-30 15:36:31 +02:00
Amaury Denoyelle
eafd701dc5 MINOR: server: fix doc/trace on lb algo for dynamic server creation
The text mentionned that only backends with consistent hash method were
supported for dynamic servers. In fact, it is only required that the lb
algorith is dynamic.
2021-04-29 14:59:42 +02:00
Willy Tarreau
7e702d13f4 CLEANUP: hlua: rename hlua_appctx* appctx to luactx
There is some serious confusion in the lua interface code related to
sockets and services coming from the hlua_appctx structs being called
"appctx" everywhere, and where the real appctx is reached using
appctx->appctx. This part is a bit of a pain to debug so let's rename
all occurrences of this local variable to "luactx".
2021-04-28 17:59:21 +02:00
Willy Tarreau
b4476c6a8c CLEANUP: freq_ctr: make arguments of freq_ctr_total() const
freq_ctr_total() doesn't modify the freq counters, it should take a
const argument.
2021-04-28 17:44:37 +02:00
Willy Tarreau
fe16126acc BUG/MEDIUM: time: fix updating of global_now upon clock drift
During commit 7e4a557f6 ("MINOR: time: change the global timeval and the
the global tick at once") the approach made sure that the new now_ms was
always higher than or equal to global_now_ms, but by forgetting the old
value. This can cause the first update to global_now_ms to fail if it's
already out of sync, going back into the loop, and the subsequent call
would then succeed due to commit 4d01f3dcd ("MINOR: time: avoid
overwriting the same values of global_now").

And if it goes out of sync, it will fail to update forever, as observed
by Ashley Penney in github issue #1194, causing incorrect freq counters
calculations everywhere. One possible trigger for this issue is one thread
spinning for a few milliseconds while the other ones continue to work.

The issue really is that old_now_ms ought not to be modified in the loop
as it's used for the CAS. But we don't need to structurally guarantee that
global_now_ms grows monotonically as it's computed from the new global_now
which is already verified for this via the __tv_islt() test. Thus, dropping
any corrections on global_now_ms in the loop is the correct way to proceed
as long as this one is always updated to follow global_now.

No backport is needed, this is only for 2.4-dev.
2021-04-28 17:43:55 +02:00
Emeric Brun
ccdfbae62c MINOR: peers: add informative flags about resync process for debugging
This patch adds miscellenous informative flags raised during the initial
full resync process performed during the reload for debugging purpose.

0x00000010: Timeout waiting for a full resync from a local node
0x00000020: Timeout waiting for a full resync from a remote node
0x00000040: Session aborted learning from a local node
0x00000080: Session aborted learning from a remote node
0x00000100: A local node teach us and was fully up to date
0x00000200: A remote node teach us and was fully up to date
0x00000400: A local node teach us but was partially up to date
0x00000800: A remote node teach us but was partially up to date
0x00001000: A local node was assigned for a full resync
0x00002000: A remote node was assigned for a full resync
0x00004000: A resync was explicitly requested

This patch could be backported on any supported branch
2021-04-28 14:23:10 +02:00
Emeric Brun
1a6b43e13e BUG/MEDIUM: peers: reset tables stage flags stages on new conns
Flags used as context to know current status of each table pushing a
full resync to a peer were correctly reset receiving a new resync
request or confirmation message but in case of local peer sync during
reload the resync request is implicit and those flags were not
correctly reset in this case.

This could result to a partial initial resync of some tables after reload
if the connection with the old process was broken and retried.

This patch reset those flags at the end of the handshake for all new
connections to be sure to push a entire full resync if needed.

This patch should be backported on all supported branches ( v >= 1.6 )
2021-04-28 14:23:10 +02:00
Emeric Brun
8e7a13ed66 BUG/MEDIUM: peers: re-work updates lookup during the sync on the fly
Only entries between the opposite of the last 'local update' rotating
counter were considered to be pushed. This processing worked in most
cases because updates are continually pushed trying to reach this point
but it remains some cases where updates id are more far away in the past
and appearing in futur and the push of updates is stuck until the head
reach again the tail which could take a very long time.

This patch re-work the lookup to consider that all positions on the
rotating counter is considered in the past until we reach exactly
the 'local update' value. Doing this, the updates push won't be stuck
anymore.

This patch should be backported on all supported branches ( >= 1.6 )
2021-04-28 14:23:10 +02:00
Emeric Brun
cc9cce9351 BUG/MEDIUM: peers: reset commitupdate value in new conns
The commitupdate value of the table is used to check if the update
is still pending for a push for all peers. To be sure to not miss a
push we reset it just after a handshake success.

This patch should be backported on all supported branches ( >= 1.6 )
2021-04-28 14:23:10 +02:00
Emeric Brun
d9729da982 BUG/MEDIUM: peers: reset starting point if peers appears longly disconnected
If two peers are disconnected and during this period they continue to
process a large amount of local updates, after a reconnection they
may take a long time before restarting to push their updates. because
the last pushed update would appear internally in futur.

This patch fix this resetting the cursor on acked updates at the maximum
point considered in the past if it appears in futur but it means we
may lost some updates. A clean fix would be to update the protocol to
be able to signal a remote peer that is was not updated for a too long
period and needs a full resync but this is not yet supported by the
protocol.

This patch should be backported on all supported branches ( >= 1.6 )
2021-04-28 14:23:10 +02:00
Emeric Brun
b0d60bed36 BUG/MEDIUM: peers: stop considering ack messages teaching a full resync
The re-con cursor was updated receiving any ack message
even if we are pushing a complete resync to a peer. This cursor
is reset at the end of the resync but if the connection is broken
during resync, we could re-start at an unwanted point.

With this patch, the peer stops to consider ack messages pushing
a resync since the resync process has is own acknowlegement and
is always restarted from the beginning in case of broken connection.

This patch should be backported on all supported branches ( >= 1.6 )
2021-04-28 14:23:10 +02:00
Emeric Brun
437e48ad92 BUG/MEDIUM: peers: register last acked value as origin receiving a resync req
Receiving a resync request, the origins to start the full sync and
to reset after the full resync are mistakenly computed based on
the last update on the table instead of computed based on the
the last update acked by the node requesting the resync.

It could result in disordered or missing updates pushing to the
requester

This patch sets correctly those origins.

This patch should be backported on all supported branches ( >= 1.6 )
2021-04-28 14:23:10 +02:00
Emeric Brun
2c4ab41816 BUG/MEDIUM: peers: initialize resync timer to get an initial full resync
If a reload is performed and there is no incoming connections
from the old process to push a full resync, the new process
can be stuck waiting indefinitely for this conn and it never tries a
fallback requesting a full resync from a remote peer because the resync
timer was init to TICK_ETERNITY.

This patch forces a reset of the resync timer to default value (5 secs)
if we detect value is TICK_ETERNITY.

This patch should be backported on all supported branches ( >= 1.6 )
2021-04-28 14:23:10 +02:00
Willy Tarreau
8a022d5049 MINOR: config: add a new "default-path" global directive
By default haproxy loads all files designated by a relative path from the
location the process is started in. In some circumstances it might be
desirable to force all relative paths to start from a different location
just as if the process was started from such locations. This is what this
directive is made for. Technically it will perform a temporary chdir() to
the designated location while processing each configuration file, and will
return to the original directory after processing each file. It takes an
argument indicating the policy to use when loading files whose path does
not start with a slash ('/').

A few options are offered, "current" (the default), "config" (files
relative to config file's dir), "parent" (files relative to config file's
parent dir), and "origin" with an absolute path.

This should address issue #1198.
2021-04-28 11:30:13 +02:00
Willy Tarreau
da543e130c CLEANUP: cfgparse: de-uglify early file error handling in readcfgfile()
In readcfgfile() when malloc() fails to allocate a buffer for the
config line, it currently says "parsing[<file>]: out of memory" while
the error is unrelated to the config file and may make one think it has
to do with the file's size. The second test (fopen() returning error)
needs to release the previously allocated line. Both directly return -1
which is not even documented as a valid error code for the function.

Let's simply make sure that the few variables freed at the end are
properly preset, and jump there upon error, after having displayed a
meaningful error message. Now at least we can get this:

  $ ./haproxy -f /dev/kmem
  [NOTICE] 116/191904 (23233) : haproxy version is 2.4-dev17-c3808c-13
  [NOTICE] 116/191904 (23233) : path to executable is ./haproxy
  [ALERT] 116/191904 (23233) : Could not open configuration file /dev/kmem : Permission denied
2021-04-28 11:21:32 +02:00
Christopher Faulet
925abdfdac BUG/MEDIUM: mux-h2: Handle EOM flag when sending a DATA frame with zero-copy
When a DATA frame is sent, we must take care to properly detect the EOM flag
on the HTX message to set ES flag on the frame when necessary, to finish the
stream. But it is only done when data are copied from the HTX message to the
mux buffer and not when the frame are sent via a zero-copy. This patch fixes
this bug.

It is a 2.4-specific bug. No backport is needed.
2021-04-28 11:08:35 +02:00
Christopher Faulet
bd878d2c73 BUG/MINOR: hlua: Don't consume headers when starting an HTTP lua service
When an HTTP lua service is started, headers are consumed before calling the
script. When it was initialized, the headers were stored in a lua array,
thus they can be removed from the HTX message because the lua service will
no longer access them. But it is a problem with bodyless messages because
the EOM flag is lost. Indeed, once the headers are consumed, the message is
empty and the buffer is reset, included the flags.

Now, the headers are not immediately consumed. We will skip them if
applet:receive() or applet:getline(). This way, the EOM flag is preserved.
At the end, when the script is finished, all output data are consumed, thus
this remains safe.

It is a 2.4-specific bug. No backport is needed.
2021-04-28 11:05:05 +02:00
Christopher Faulet
1eedf9b4cb BUG/MINOR: applet: Notify the other side if data were consumed by an applet
If an applet consumed output data (the amount of output data has changed
between before and after the call to the applet), the producer is
notified. It means CF_WRITE_PARTIAL and CF_WROTE_DATA are set on the output
channel and the opposite stream interface is notified some room was made in
its input buffer. This way, it is no longer the applet responsibility to
take care of it. However, it doesn't matter if the applet does the same.

Said like that, it looks like an improvement not a bug. But it really fixes
a bug in the lua, for HTTP applets. Indeed, applet:receive() and
applet:getline() are buggy for HTTP applets. Data are consumed but the
producer is not notified. It means if the payload is not fully received in
one time, the applet may be blocked because the producer remains blocked (it
is time dependent).

This patch must be backported as far as 2.0 (only for the HTX part).
2021-04-28 10:51:08 +02:00
Christopher Faulet
f506d96839 MEDIUM: http-ana: handle read error on server side if waiting for response
A read error on the server side is also reported as a write error on the
client side. It means some times, a server side error is handled on the
client side. Among others, it is the case when the client side is waiting
for the response while the request processing is already finished. In this
case, the error is not handled as a server error. It is not accurate.

So now, when the request processing is finished but not the response
processing and if a read error was encountered on the server side, the error
is not immediatly processed on the client side, to let a chance to response
analysers to properly catch the error.
2021-04-28 10:51:08 +02:00
Christopher Faulet
3d87558f35 BUG/MINOR: mux-h2: Don't encroach on the reserve when decoding headers
Since the input buffer is transferred to the stream when it is created,
there is no longer control on the request size to be sure the buffer's
reserve is still respected. It was automatically performed in h2_rcv_buf()
because the caller took care to provide the correct available space in the
buffer. The control is still there but it is no longer applied on the
request headers. Now, we should take care of the reserve when the headers
are decoded, before the stream creation.

The test is performed for the request and the response.

It is a 2.4-specific bug. No backport is needed.
2021-04-28 10:51:08 +02:00
Christopher Faulet
2b78f0bfc4 CLEANUP: htx: Remove unsued hdrs_bytes field from the HTX start-line
Thanks to the htx_xfer_blks() refactoring, it is now possible to remove
hdrs_bytes field from the start-line because no function rely on it anymore.
2021-04-28 10:51:08 +02:00
Christopher Faulet
c92ec0ba71 MEDIUM: htx: Refactor htx_xfer_blks() to not rely on hdrs_bytes field
It is the only function using the hdrs_bytes start-line field. Thus the
function has been refactored to no longer rely on it. To do so, we first
copy HTX blocks to the destination message, without removing them from the
source message. If the copy is interrupted on headers or trailers, we roll
back. Otherwise, data are drained from the source buffer.

Most of time, the copy will succeeds. So the roll back is only performed in
the worst but very rare case.
2021-04-28 10:51:08 +02:00
Christopher Faulet
5e9b24f4b4 BUG/MINOR: htx: Preserve HTX flags when draining data from an HTX message
When all data of an HTX message are drained, we rely on htx_reset() to
reinit the message state. However, the flags must be preserved. It is, among
other things, important to preserve processing or parsing errors.

This patch must be backported as far as 2.0.
2021-04-27 22:57:46 +02:00
Amaury Denoyelle
8f685c11e0 BUG/MEDIUM: cpuset: fix build on MacOS
The compilation fails due to the following commit:
fc6ac53dca
BUG/MAJOR: fix build on musl with cpu_set_t support

The new global variable cpu_map conflicted with a local variable of the
same name in the code path for the apple platform when setting the
process affinity.

This does not need to be backported.
2021-04-27 16:49:35 +02:00
Amaury Denoyelle
fc6ac53dca BUG/MAJOR: fix build on musl with cpu_set_t support
Move cpu_map structure outside of the global struct to a global
variable defined in cpuset.c compilation unit. This allows to reorganize
the includes without having to define _GNU_SOURCE everywhere for the
support of the cpu_set_t.

This fixes the compilation with musl libc, most notably used for the
alpine based docker image.

This fixes the github issue #1235.

No need to backport as this feature is new in the current
2.4-dev.
2021-04-27 14:11:26 +02:00
Remi Tricot-Le Breton
43899ec83d BUG/MINOR: ssl: ssl_sock_prepare_ssl_ctx does not return an error code
The return value check was wrongly based on error codes when the
function actually returns an error number.
This bug was introduced by f3eedfe195
which is a feature not present before branch 2.4.

It does not need to be backported.
2021-04-26 15:57:26 +02:00
Ilya Shipitsin
b2be9a1ea9 CLEANUP: assorted typo fixes in the code and comments
This is 22nd iteration of typo fixes
2021-04-26 10:42:58 +02:00
Christopher Faulet
df3db630e4 REORG: htx: Inline htx functions to add HTX blocks in a message
The HTX functions used to add new HTX blocks in a message have been moved to
the header file to inline them in calling functions. These functions are
small enough.
2021-04-26 10:24:57 +02:00
Christopher Faulet
fb38c910f8 BUG/MINOR: mux-fcgi: Don't send normalized uri to FCGI application
A normalized URI is the internal term used to specify an URI is stored using
the absolute format (scheme + authority + path). For now, it is only used
for H2 clients. It is the default and recommended format for H2 request.
However, it is unusual for H1 servers to receive such URI. So in this case,
we only send the path of the absolute URI. It is performed for H1 servers,
but not for FCGI applications. This patch fixes the difference.

Note that it is not a real bug, because FCGI applications should support
abosolute URI.

Note also a normalized URI is only detected for H2 clients when a request is
received. There is no such test on the H1 side. It means an absolute URI
received from an H1 client will be sent without modification to an H1 server
or a FCGI application.

To make it possible, a dedicated function has been added to get the H1
URI. This function is called by the H1 and the FCGI multiplexer when a
request is sent to a server.

This patch should fix the issue #1232. It must be backported as far as 2.2.
2021-04-26 10:23:18 +02:00
Tim Duesterhus
2e4a18e04a MINOR: uri_normalizer: Add a percent-decode-unreserved normalizer
This normalizer decodes percent encoded characters within the RFC 3986
unreserved set.

See GitHub Issue #714.
2021-04-23 19:43:45 +02:00
Willy Tarreau
07bf21cdcb BUG/MEDIUM: config: fix missing initialization in numa_detect_topology()
The error path of the NUMA topology detection introduced in commit
b56a7c89a ("MEDIUM: cfgparse: detect numa and set affinity if needed")
lacks an initialization resulting in possible crashes at boot. No
backport is needed since that was introduced in 2.4-dev.
2021-04-23 19:09:16 +02:00
Emeric Brun
2cc201f97e BUG/MEDIUM: peers: re-work refcnt on table to protect against flush
In proxy.c, when process is stopping we try to flush tables content
using 'stktable_trash_oldest'. A check on a counter "table->syncing" was
made to verify if there is no pending resync in progress.
But using multiple threads this counter can be increased by an other thread
only after some delay, so the content of some tables can be trashed earlier and
won't be pushed to the new process (after reload, some tables appear reset and
others don't).

This patch re-names the counter "table->syncing" to "table->refcnt" and
the counter is increased during configuration parsing (registering a table to
a peer section) to protect tables during runtime and until resync of a new
process has succeeded or failed.

The inc/dec operations are now made using atomic operations
because multiple peer sections could refer to the same table in futur.

This fix addresses github #1216.

This patch should be backported on all branches multi-thread support (v >= 1.8)
2021-04-23 18:03:06 +02:00
Emeric Brun
cbfe5ebc1c BUG/MEDIUM: peers: re-work connection to new process during reload.
The peers task handling the "stopping" could wake up multiple
times in stopping state with WOKEN_SIGNAL: the connection to the
local peer initiated on the first processing was immediatly
shutdown by the next processing of the task and the old process
exits considering it is unable to connect. It results on
empty stick-tables after a reload.

This patch checks the flag 'PEERS_F_DONOTSTOP' to know if the
signal is considered and if remote peers connections shutdown
is already done or if a connection to the local peer must be
established.

This patch should be backported on all supported branches (v >= 1.6)
2021-04-23 18:03:06 +02:00
Emeric Brun
1675ada4f4 BUG/MINOR: peers: remove useless table check if initial resync is finished
The old process checked each table resync status even if
the resync process is finished. This behavior had no known impact
except useless processing and was discovered during debugging on
an other issue.

This patch could be backported in all supported branches (v >= 1.6)
but once again, it has no impact except avoid useless processing.
2021-04-23 18:03:06 +02:00
Willy Tarreau
1f9e11e7f0 CLEANUP: time: use __tv_to_ms() in tv_update_date() instead of open-coding
Instead of calculating the current date in milliseconds by hand, let's
use __tv_to_ms() which was made exactly for this purpose.
2021-04-23 18:03:06 +02:00
Willy Tarreau
4d01f3dcdc MINOR: time: avoid overwriting the same values of global_now
In tv_update_date(), we calculate the new global date based on the local
one. It's very likely that other threads will end up with the exact same
now_ms date (at 1 million wakeups/s it happens 99.9% of the time), and
even the microsecond was measured to remain unchanged ~70% of the time
with 16 threads, simply because sometimes another thread already updated
a more recent version of it.

In such cases, performing a CAS to the global variable requires a cache
line flush which brings nothing. By checking if they're changed before
writing, we can divide by about 6 the number of writes to the global
variables, hence the overall contention.

In addition, it's worth noting that all threads will want to update at
the same time, so let's place a cpu relax call before trying again, this
will spread attempts apart.
2021-04-23 18:03:06 +02:00
Willy Tarreau
481795de13 MINOR: time: avoid unneeded updates to now_offset
The time adjustment is very rare, even at high pool rates. Tests show
that only 0.2% of tv_update_date() calls require a change of offset. Such
concurrent writes to a shared variable have an important impact on future
loads, so let's only update the variable if it changed.
2021-04-23 18:03:06 +02:00
Amaury Denoyelle
a6f9c5d2a7 BUG/MINOR: cpuset: fix compilation on platform without cpu affinity
The compilation is currently broken on platform without USE_CPU_AFFINITY
set. An error has been reported by the cygwin build of the CI.

This does not need to be backported.

In file included from include/haproxy/global-t.h:27,
                 from include/haproxy/global.h:26,
                 from include/haproxy/fd.h:33,
                 from src/ev_poll.c:22:
include/haproxy/cpuset-t.h:32:3: error: #error "No cpuset support implemented on this platform"
   32 | # error "No cpuset support implemented on this platform"
      |   ^~~~~
include/haproxy/cpuset-t.h:37:2: error: unknown type name ‘CPUSET_REPR’
   37 |  CPUSET_REPR cpuset;
      |  ^~~~~~~~~~~
make: *** [Makefile:944: src/ev_poll.o] Error 1
make: *** Waiting for unfinished jobs....
In file included from include/haproxy/global-t.h:27,
                 from include/haproxy/global.h:26,
                 from include/haproxy/fd.h:33,
                 from include/haproxy/connection.h:30,
                 from include/haproxy/ssl_sock.h:27,
                 from src/ssl_sample.c:30:
include/haproxy/cpuset-t.h:32:3: error: #error "No cpuset support implemented on this platform"
   32 | # error "No cpuset support implemented on this platform"
      |   ^~~~~
include/haproxy/cpuset-t.h:37:2: error: unknown type name ‘CPUSET_REPR’
   37 |  CPUSET_REPR cpuset;
      |  ^~~~~~~~~~~
make: *** [Makefile:944: src/ssl_sample.o] Error 1
2021-04-23 17:04:24 +02:00
Amaury Denoyelle
c5ed1f9d87 BUG/MINOR: haproxy: fix compilation on macOS
Fix the warning treated as error on the CI for the macOS compilation :
"src/haproxy.c:2939:23: error: unused variable 'set'
 [-Werror,-Wunused-variable]"

This does not need to be backported.
2021-04-23 16:41:22 +02:00
Amaury Denoyelle
0f50cb9c73 MINOR: global: add option to disable numa detection
Render numa detection optional with a global configuration statement
'no numa-cpu-mapping'. This can be used if the applied affinity of the
algorithm is not optimal. Also complete the documentation with this new
keyword.
2021-04-23 16:06:49 +02:00
Amaury Denoyelle
b56a7c89a8 MEDIUM: cfgparse: detect numa and set affinity if needed
On process startup, the CPU topology of the machine is inspected. If a
multi-socket CPU machine is detected, automatically define the process
affinity on the first node with active cpus. This is done to prevent an
impact on the overall performance of the process in case the topology of
the machine is unknown to the user.

This step is not executed in the following condition :
- a non-null nbthread statement is present
- a restrictive 'cpu-map' statement is present
- the process affinity is already restricted, for example via a taskset
  call

For the record, benchmarks were executed on a machine with 2 CPUs
Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz. In both clear and ssl
scenario, the performance were sub-optimal without the automatic
rebinding on a single node.
2021-04-23 16:06:49 +02:00
Amaury Denoyelle
a80823543c MINOR: cfgparse: support the comma separator on parse_cpu_set
Allow to specify multiple cpu ids/ranges in parse_cpu_set separated by a
comma. This is optional and must be activated by a parameter.

The comma support is disabled for the parsing of the 'cpu-map' config
statement. However, it will be useful to parse files in sysfs when
inspecting the cpus topology for NUMA automatic process binding.
2021-04-23 16:06:49 +02:00
Amaury Denoyelle
4c9efdecf5 MINOR: thread: implement the detection of forced cpu affinity
Create a function thread_cpu_mask_forced. Its purpose is to report if a
restrictive cpu mask is active for the current proces, for example due
to a taskset invocation. It is only implemented for the linux platform
currently.
2021-04-23 16:06:49 +02:00
Amaury Denoyelle
982fb53390 MEDIUM: config: use platform independent type hap_cpuset for cpu-map
Use the platform independent type hap_cpuset for the cpu-map statement
parsing. This allow to address CPU index greater than LONGBITS.

Update the documentation to reflect the removal of this limit except for
platforms without cpu_set_t type or equivalent.
2021-04-23 16:06:49 +02:00
Amaury Denoyelle
c90932bc8e MINOR: cfgparse: use hap_cpuset for parse_cpu_set
Replace the unsigned long parameter by a hap_cpuset. This allows to
address CPU with index greater than LONGBITS.

This function is used to parse the 'cpu-map' statement. However at the
moment, the result is casted back to a long to store it in the global
structure. The next step is to replace ulong in in cpu_map in the
global structure with hap_cpuset.
2021-04-23 16:06:49 +02:00
Amaury Denoyelle
f75c640f7b MINOR: cpuset: define a platform-independent cpuset type
This module can be used to manipulate a cpu sets in a platform agnostic
way. Use the type cpu_set_t/cpuset_t if available on the platform, or
fallback to unsigned long, which limits de facto the maximum cpu index
to LONGBITS.
2021-04-23 16:06:49 +02:00
Christopher Faulet
de9d605aa5 BUG/MEDIUM: mux-h2: Properly handle shutdowns when received with data
The H2_CF_RCVD_SHUT flag is used to report a read0 was encountered. It is
used by the H2 mux to properly handle shutdowns. However, this flag is only
set when no data are received. If it is detected at the socket level when
some data are received, it is not handled. And because the event was
reported on the connection, any other read attempts are blocked. In this
case, we are unable to close the connection and release the mux
immediately. We must wait the mux timeout expires.

This patch should fix the issue #1231. It must be backported as far as 2.0.
2021-04-23 15:42:39 +02:00
Willy Tarreau
5e65f4276b CLEANUP: compression: remove calls to SLZ init functions
As we now embed the library we don't need to support the older 1.0 API
any more, so we can remove the explicit calls to slz_make_crc_table()
and slz_prepare_dist_table().
2021-04-22 16:11:19 +02:00
Willy Tarreau
12840be005 BUILD: compression: switch SLZ from out-of-tree to in-tree
Now that SLZ is merged, let's update the makefile and compression
files to use it. As a result, SLZ_INC and SLZ_LIB are neither defined
nor used anymore.

USE_SLZ is enabled by default ("USE_SLZ=default") and can be disabled
by passing "USE_SLZ=" or by enabling USE_ZLIB=1.

The doc was updated to reflect the changes.
2021-04-22 16:08:25 +02:00
Willy Tarreau
ab2b7828e2 IMPORT: slz: import slz into the tree
SLZ is rarely packaged by distros and there have been complaints about
the CPU and memory usage of ZLIB, leading to some suggestions to better
address the issue by simply integrating SLZ into the tree (just 3 files).
See discussions below:

   https://www.mail-archive.com/haproxy@formilux.org/msg38037.html
   https://www.mail-archive.com/haproxy@formilux.org/msg40079.html
   https://www.mail-archive.com/haproxy@formilux.org/msg40365.html

This patch does just this, after minor adjustments to these files:
  - tables.h was renamed to slz-tables.h
  - tables.h had the precomputed tables removed since not used here
  - slz.c uses includes <import/slz*> instead of "slz*.h"

The slz commit imported here was b06c172 ("slz: avoid a build warning
with -Wimplicit-fallthrough"). No other change was performed either to
SLZ nor to haproxy at this point so that this operation may be replicated
if needed for a future version.
2021-04-22 15:50:41 +02:00
William Lallemand
aba7f8b313 BUG/MINOR: mworker: don't use oldpids[] anymore for reload
Since commit 3f12887 ("MINOR: mworker: don't use children variable
anymore"), the oldpids array is not used anymore to generate the new -sf
parameters. So we don't need to set nb_oldpids to 0 during the first
start of the master process.

This patch fixes a bug when 2 masters process tries to synchronize their
peers, there is a small chances that it won't work because nb_oldpids
equals 0.

Should be backported as far as 2.0.
2021-04-21 16:55:34 +02:00
William Lallemand
ea6bf83d62 BUG/MINOR: mworker/init: don't reset nb_oldpids in non-mworker cases
This bug affects the peers synchronisation code which rely on the
nb_oldpids variable to synchronize the peer from the old PID.

In the case the process is not started in master-worker mode and tries
to synchronize using the peers, there is a small chance that won't work
because nb_oldpids equals 0.

Fix the bug by setting the variable to 0 only in the case of the
master-worker when not reloaded.

It could also be a problem when trying to synchronize the peers between
2 masters process which should be fixed in another patch.

Bug exists since commit 8a361b5 ("BUG/MEDIUM: mworker: don't reuse PIDs
passed to the master").

Sould be backported as far as 1.8.
2021-04-21 16:42:18 +02:00
Amaury Denoyelle
a2944ecf5d MINOR: config: add a diag for invalid cpu-map statement
If a cpu-statement is refering to multiple processes and threads, it is
silently ignored. Add a diag message to report it to the user.
2021-04-21 15:18:57 +02:00
Amaury Denoyelle
af02c57406 BUG/MEDIUM: config: fix cpu-map notation with both process and threads
The application of a cpu-map statement with both process and threads
is broken (P-Q/1 or 1/P-Q notation).

For example, before the fix, when using P-Q/1, proc_t1 would be updated.
Then it would be AND'ed with thread which is still 0 and thus does
nothing.

Another problem is when using 1/1[-Q], thread[0] is defined. But if
there is multiple processes, every processes will use this define
affinity even if it should be applied only to 1st process.

The solution to the fix is a little bit too complex for my taste and
there is maybe a simpler solution but I did not wish to break the
storage of global.cpu_map, as it is quite painful to test all the
use-cases. Besides, this code will probably be clean up when
multiprocess support removed on the future version.

Let's try to explain my logic.

* either haproxy runs in multiprocess or multithread mode. If on
  multiprocess, we should consider proc_t1 (P-Q/1 notation). If on
  multithread, we should consider thread (1/P-Q notation). However
  during parsing, the final number of processes or threads is unknown,
  thus we have to consider the two possibilities.

* there is a special case for the first thread / first process which is
  present in both execution modes. And as a matter of fact cpu-map 1 or
  1/1 notation represents the same thing. Thus, thread[0] and proc_t1[0]
  represents the same thing. To solve this problem, only thread[0] is
  used for this special case.

This fix must be backported up to 2.0.
2021-04-21 15:18:57 +02:00
Maximilian Mader
ff3bb8b609 MINOR: uri_normalizer: Add a strip-dot normalizer
This normalizer removes "/./" segments from the path component.
Usually the dot refers to the current directory which renders those segments redundant.

See GitHub Issue #714.
2021-04-21 12:15:14 +02:00
Maximilian Mader
c9c79570d4 CLEANUP: uri_normalizer: Remove trailing whitespace
This patch removes a single trailing space.
2021-04-21 12:15:14 +02:00
Maximilian Mader
11f6f85c4b BUG/MINOR: uri_normalizer: Use delim parameter when building the sorted query in uri_normalizer_query_sort
Currently the delimiter is hardcoded as ampersand (&) but the function takes the delimiter as a paramter.
This patch replaces the hardcoded ampersand with the given delimiter.
2021-04-21 12:15:14 +02:00
Christopher Faulet
cb1847c772 BUG/MEDIUM: mux-h2: Fix dfl calculation when merging CONTINUATION frames
When header are splitted over several frames, payload of HEADERS and
CONTINUATION frames are merged to form a unique HEADERS frame before
decoding the payload. To do so, info about the current frame are updated
(dff, dfl..) with info of the next one. Here there is a bug when the frame
length (dfl) is update. We must add the next frame length (hdr.dfl) and not
only the amount of data found in the buffer (clen). Because HEADERS frames
are decoded in one pass, dfl value is the whole frame length or 0. nothing
intermediary.

This patch must be backported as far as 2.0.
2021-04-21 12:13:12 +02:00
Christopher Faulet
07f88d7582 BUG/MAJOR: mux-h2: Properly detect too large frames when decoding headers
In the function decoding payload of HEADERS frames, an internal error is
returned if the frame length is too large. it cannot exceed the buffer
size. The same is true when headers are splitted on several frames. The
payload of HEADERS and CONTINUATION frames are merged and the overall size
must not exceed the buffer size.

However, there is a bug when the current frame is big enough to only have
the space for a part of the header of the next frame. Because, in this case,
we wait for more data, to have the whole frame header. We don't properly
detect that the headers are too large to be stored in one buffer. In fact
the test to trigger this error is not accurate. When the buffer is full, the
error is reported if the frame length exceeds the amount of data in the
buffer. But in reality, an error must be reported when we are unable to
decode the current frame while the buffer is full. Because, in this case, we
know there is no way to change this state.

When the bug happens, the H2 connection is woken up in loop, consumming all
the CPU. But the traffic is not blocked for all that.

This patch must be backported as far as 2.0.
2021-04-21 12:13:12 +02:00
Amaury Denoyelle
d6b4b6da3f BUG/MINOR: server: fix potential null gcc error in delete server
gcc still reports a potential null pointer dereference in delete server
function event with a BUG_ON before it. Remove the misleading NULL check
in the for loop which should never happen.

This does not need to be backported.
2021-04-21 12:02:30 +02:00
Amaury Denoyelle
e558043e13 MINOR: server: implement delete server cli command
Implement a new CLI command 'del server'. It can be used to removed a
dynamically added server. Only servers in maintenance mode can be
removed, and without pending/active/idle connection on it.

Add a new reg-test for this feature. The scenario of the reg-test need
to first add a dynamic server. It is then deleted and a client is used
to ensure that the server is non joinable.

The management doc is updated with the new command 'del server'.
2021-04-21 11:00:31 +02:00
Amaury Denoyelle
d38e7fa233 MINOR: server: add log on dynamic server creation
Add a notice log to report the creation of a new server. The log is
printed at the end of the function.
2021-04-21 11:00:31 +02:00
Amaury Denoyelle
cece918625 BUG/MEDIUM: server: ensure thread-safety of server runtime creation
cli_parse_add_server can be executed in parallel by several CLI
instances and so must be thread-safe. The critical points of the
function are :
- server duplicate detection
- insertion of the server in the proxy list

The mode of operation has been reversed. The server is first
instantiated and parsed. The duplicate check has been moved at the end
just before the insertion in the proxy list, under the thread isolation.
Thus, the thread safety is guaranteed and server allocation is kept
outside of locks/thread isolation.
2021-04-21 11:00:30 +02:00
Amaury Denoyelle
d688e01032 BUG/MINOR: logs: free logsrv.conf.file on exit
Config information has been added into the logsrv struct. The filename
is duplicated and should be freed on exit.

Introduced in the current release.
This does not need to be backported.
2021-04-21 11:00:29 +02:00
Amaury Denoyelle
fb247946a1 BUG/MINOR: server: free srv.lb_nodes in free_server
lb_nodes is allocated for servers using lb_chash (balance random or
hash-type consistent).

It can be backported up to 1.8.
2021-04-21 11:00:03 +02:00
Willy Tarreau
2b71810cb3 CLEANUP: lists/tree-wide: rename some list operations to avoid some confusion
The current "ADD" vs "ADDQ" is confusing because when thinking in terms
of appending at the end of a list, "ADD" naturally comes to mind, but
here it does the opposite, it inserts. Several times already it's been
incorrectly used where ADDQ was expected, the latest of which was a
fortunate accident explained in 6fa922562 ("CLEANUP: stream: explain
why we queue the stream at the head of the server list").

Let's use more explicit (but slightly longer) names now:

   LIST_ADD        ->       LIST_INSERT
   LIST_ADDQ       ->       LIST_APPEND
   LIST_ADDED      ->       LIST_INLIST
   LIST_DEL        ->       LIST_DELETE

The same is true for MT_LISTs, including their "TRY" variant.
LIST_DEL_INIT keeps its short name to encourage to use it instead of the
lazier LIST_DELETE which is often less safe.

The change is large (~674 non-comment entries) but is mechanical enough
to remain safe. No permutation was performed, so any out-of-tree code
can easily map older names to new ones.

The list doc was updated.
2021-04-21 09:20:17 +02:00
Tim Duesterhus
3b9cdf1cb7 CLEANUP: sample: Use explicit return for successful json_querys
Move the `return 1` into each of the cases, instead of relying on the single
`return 1` at the bottom of the function.
2021-04-20 20:33:38 +02:00
Tim Duesterhus
8f3bc8ffca CLEANUP: sample: Explicitly handle all possible enum values from mjson
This makes it easier to find bugs, because -Wswitch can help us.
2021-04-20 20:33:34 +02:00
Tim Duesterhus
4809c8c955 CLEANUP: sample: Improve local variables in sample_conv_json_query
This improves the use of local variables in sample_conv_json_query:

- Use the enum type for the return value of `mjson_find`.
- Do not use single letter variables.
- Reduce the scope of variables that are only needed in a single branch.
- Add missing newlines after variable declaration.
2021-04-20 20:33:31 +02:00
Willy Tarreau
dcb121fd9c BUG/MINOR: server: make srv_alloc_lb() allocate lb_nodes for consistent hash
The test in srv_alloc_lb() to allocate the lb_nodes[] array used in the
consistent hash was incorrect, it wouldn't do it for consistent hash and
could do it for regular random.

No backport is needed as this was added for dynamic servers in 2.4-dev by
commit f99f77a50 ("MEDIUM: server: implement 'add server' cli command").
2021-04-20 11:39:54 +02:00
Willy Tarreau
942b89f7dc BUILD: pools: fix build with DEBUG_FAIL_ALLOC
Amaury noticed that I managed to break the build of DEBUG_FAIL_ALLOC
for the second time with 207c09509 ("MINOR: pools: move the fault
injector to __pool_alloc()"). The joy of endlessly reworking patch
sets... No backport is needed, that was in the just merged cleanup
series.
2021-04-19 18:36:48 +02:00
Willy Tarreau
b2a853d5f0 CLEANUP: pools: uninline pool_put_to_cache()
This function has become too big (251 bytes) and is now hurting
performance a lot, with up to 4% request rate being lost over the last
pool changes. Let's move it to pool.c as a regular function. Other
attempts were made to cut it in half but it's still inefficient. Doing
this results in saving ~90kB of object code, and even 112kB since the
pool changes, with code that is even slightly faster!

Conversely, pool_get_from_cache(), which remains half of this size, is
still faster inlined, likely in part due to the immediate use of the
returned pointer afterwards.
2021-04-19 15:24:33 +02:00
Willy Tarreau
fa19d20ac4 MEDIUM: pools: make pool_put_to_cache() always call pool_put_to_local_cache()
Till now it used to call it only if there were not too many objects into
the local cache otherwise would send the latest one directly into the
shared cache. Now it always sends to the local cache and it's up to the
local cache to free its oldest objects. From a cache freshness perspective
it's better this way since we always evict cold objects instead of hot
ones. From an API perspective it's better because it will help make the
shared cache invisible to the public API.
2021-04-19 15:24:33 +02:00
Willy Tarreau
87212036a1 MINOR: pools: evict excess objects using pool_evict_from_local_cache()
Till now we could only evict oldest objects from all local caches using
pool_evict_from_local_caches() until the cache size was satisfying again,
but there was no way to evict excess objects from a single cache, which
is the reason why pool_put_to_cache() used to refrain from putting into
the local cache and would directly write to the shared cache, resulting
in massive writes when caches were full.

Let's add this new function now. It will stop once the number of objects
in the local cache is no higher than 16+total/8 or the cache size is no
more than 75% full, just like before.

For now the function is not used.
2021-04-19 15:24:33 +02:00
Willy Tarreau
b8498e961a MEDIUM: pools: make CONFIG_HAP_POOLS control both local and shared pools
Continuing the unification of local and shared pools, now the usage of
pools is governed by CONFIG_HAP_POOLS without which allocations and
releases are performed directly from the OS using pool_alloc_nocache()
and pool_free_nocache().
2021-04-19 15:24:33 +02:00
Willy Tarreau
45e4e28161 MINOR: pools: factor the release code into pool_put_to_os()
There are two levels of freeing to the OS:
  - code that wants to keep the pool's usage counters updated uses
    pool_free_area() and handles the counters itself. That's what
    pool_put_to_shared_cache() does in the no-global-pools case.
  - code that does not want to update the counters because they were
    already updated only calls pool_free_area().

Let's extract these calls to establish the symmetry with pool_get_from_os()
and pool_alloc_nocache(), resulting in pool_put_to_os() (which only updates
the allocated counter) and pool_free_nocache() (which also updates the used
counter). This will later allow to simplify the generic code.
2021-04-19 15:24:33 +02:00
Willy Tarreau
2b5579f6da MINOR: pools: always use atomic ops to maintain counters
A part of the code cannot be factored out because it still uses non-atomic
inc/dec for pool->used and pool->allocated as these are located under the
pool's lock. While it can make sense in terms of bus cycles, it does not
make sense in terms of code normalization. Further, some operations were
still performed under a lock that could be totally removed via the use of
atomic ops.

There is still one occurrence in pool_put_to_shared_cache() in the locked
code where pool_free_area() is called under the lock, which must absolutely
be fixed.
2021-04-19 15:24:33 +02:00