10981 Commits

Author SHA1 Message Date
Christopher Faulet
ea009736d8 BUILD: debug: Avoid warnings in dev mode with -O2 because of some BUG_ON tests
Some BUG_ON() tests emit a warning because of a potential null pointer
dereference on an HTX block. In fact, it should never happen, but now, GCC is
happy.

This patch must be backported to 2.0.
2019-11-20 14:11:47 +01:00
Christopher Faulet
eba2294e5b MINOR: contrib/prometheus-exporter: Add a param to ignore servers in maintenance
By passing the parameter "no-maint" in the query-string, it is now possible to
ignore servers in maintenance. It means that the metrics for servers in this
state will not be exported.
2019-11-20 14:11:47 +01:00
Christopher Faulet
78407ce156 MINOR: contrib/prometheus-exporter: filter exported metrics by scope
Now, the prometheus exporter parses the HTTP query-string to filter or to adapt
the exported metrics. In this first version, it is only possible to select the
scopes of metrics to export. To do so, one or more parameters with "scope" as
name must be passed in the query-string, with one of those values: global,
frontend, backend, server or '*' (means all). A scope parameter with no value
means to filter out all scopes (nothing is returned). The scope parameters are
parsed in their appearance order in the query-string. So an empty scope will
reset all scopes already parsed. But it can be overridden by following scope
parameters in the query-string. By default everything is exported.

The filtering can also be done in the Prometheus scraping configuration, but the
general aim is to optimise the source of data to improve load and scraping time.
This is particularly true for huge configurations with thousands of backends and servers.
Also note that this configuration was possible on the previous official haproxy
exporter but with even more parameters to select the needed metrics. Here we
thought it was sufficient to simply avoid a given type of metric. However, more
filters are still possible.

Thanks to William Dauchy. This patch is based on his work.
2019-11-20 14:11:47 +01:00
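
The scope rules described in commit 78407ce156 can be sketched as follows. This is a minimal illustration, not the exporter's code: the function name `select_scopes` is hypothetical, and it assumes that the first "scope" parameter replaces the default "export everything" selection.

```python
from urllib.parse import parse_qsl

ALL_SCOPES = ("global", "frontend", "backend", "server")

def select_scopes(query_string):
    """Return the set of scopes to export for a given query-string."""
    selected = set(ALL_SCOPES)      # default: everything is exported
    seen_scope = False
    # keep_blank_values=True so that "scope=" (empty value) is not dropped
    for name, value in parse_qsl(query_string, keep_blank_values=True):
        if name != "scope":
            continue
        if not seen_scope:          # assumption: first scope drops the default
            selected = set()
            seen_scope = True
        if value == "":             # empty scope resets what was parsed so far
            selected = set()
        elif value == "*":          # '*' means all scopes
            selected = set(ALL_SCOPES)
        elif value in ALL_SCOPES:
            selected.add(value)
        else:
            raise ValueError("unknown scope: %r" % value)
    return selected
```

Because parameters are handled in order, "scope=server&scope=" selects nothing, while "scope=&scope=global" selects only the global metrics.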
Christopher Faulet
36b536d6c8 BUG/MEDIUM: stream-int: Don't lose events on the CS when an EOS is reported
In si_cs_recv(), when a shutdown for reads is handled, the conn-stream may be
closed. It happens when the output channel is closed for writes or if
SI_FL_NOHALF is set on the stream-interface. In this case, conn-stream's flags
are reset. Thus, if an error (CS_FL_ERROR) or an end of input (CS_FL_EOI) is
reported by the mux, the event is lost. si_cs_recv() does not report these
events by itself. It relies on si_cs_process() to report them to the
stream-interface and/or the channel.

For instance, if CS_FL_EOS and CS_FL_EOI are set by the H1 multiplexer during a
call to si_cs_recv() on the server side, if the conn-stream is closed (read0 +
SI_FL_NOHALF), the CS_FL_EOI flag is lost. Thus, this may lead the stream to
interpret it as a server abort.

Now, conn-stream's flags are processed at the end of si_cs_recv(). The function
is responsible for setting the right flags on the stream-interface and/or the
channel. Due to this patch, the function is now almost linear. Except for some early
checks at the beginning, there is only one return statement. It also fixes a
potential bug because of an inconsistency between the splicing and the buffered
receipt. In the first case, CS_FL_EOS is handled before errors on the connection
or the conn-stream. In the second one, it is the opposite.

This patch must be backported to 2.0 and 1.9.
2019-11-20 14:11:47 +01:00
Eric Salama
3c8bde88ca BUILD/MINOR: ssl: fix compiler warning about useless statement
There is a compiler warning after commit a9363eb6 ("BUG/MEDIUM: ssl:
'tune.ssl.default-dh-param' value ignored with openssl > 1.1.1"):

src/ssl_sock.c: In function 'ssl_sock_prepare_ctx':
src/ssl_sock.c:4481:4: error: statement with no effect [-Werror=unused-value]

Fix it by adding a (void)
2019-11-20 13:49:21 +01:00
Frédéric Lécaille
3585cab221 BUG/MINOR: peers: "peer alive" flag not reset when disconnecting.
The peer flags (->flags member of peer struct) are reset by __peer_session_deinit()
function. PEER_F_ALIVE flag which is used by the heartbeat part of the peer protocol
to mark a peer as being alive was not reset by this function. This simple patch
adds the statement to do this.

Note that, at this time, there was no identified issue due to this missing reset.

Must be backported to 2.0.
2019-11-20 13:38:13 +01:00
William Lallemand
677e2f2c35 BUG/MEDIUM: mworker: don't fill the -sf argument with -1 during the reexec
Upon a reexec_on_failure, if the process tried to exit after the
initialization of the process structure but before it was filled with a
PID, the PID in the mworker_proc structure is set to -1.

In this particular case the -sf argument is filled with -1 and haproxy
will exit with the usage message because of that argument.

Should be backported in 2.0.
2019-11-19 17:30:34 +01:00
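
The idea behind commit 677e2f2c35 can be sketched as below. The helper `build_reexec_argv` is hypothetical; it only shows that slots still holding the placeholder PID -1 are skipped when the -sf list is built. Omitting -sf entirely when no real PID remains is an assumption of this sketch.

```python
def build_reexec_argv(base_argv, child_pids):
    """Return the argv for a re-exec, listing only real PIDs after -sf."""
    # A slot that was initialised but never filled still holds pid == -1.
    pids = [str(pid) for pid in child_pids if pid > 0]
    argv = list(base_argv)
    if pids:                        # assumption: no -sf at all without PIDs
        argv.append("-sf")
        argv.extend(pids)
    return argv
```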
William Lallemand
0bc9c8a243 MINOR: ssl/cli: 'abort ssl cert' deletes an on-going transaction
This patch introduces the new CLI command 'abort ssl cert' which aborts
an on-going transaction and frees its content.

This command takes the filename of the transaction as an
argument.
2019-11-19 16:21:24 +01:00
Frédéric Lécaille
af9990f035 BUG/MINOR: peers: Wrong null "server_name" data field handling.
As the peers protocol expects to parse at least one encoded integer value for
each stick-table data field even when not configured on the local side,
we must emit something for the "server_name" data field even if it has
not been set (no server was configured, for instance).
As this data field starts with one encoded integer which is the length
of the remaining data (the dictionary cache entry), we encode the length 0
when emitting such an absent dictionary cache entry.
On the remote side, when we decode such an integer with 0 as value, we stop
parsing the data field and that's it.

Must be backported to 2.0.
2019-11-19 14:48:33 +01:00
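
The length-0 convention from commit af9990f035 can be illustrated with a small length-prefixed codec. This is only a sketch: the varint below is a LEB128-style stand-in, not the real peers wire encoding, and all function names are hypothetical.

```python
def encode_varint(n):
    """Stand-in LEB128-style varint (NOT the actual peers encoding)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_varint(buf, pos=0):
    """Decode a varint at buf[pos:], return (value, next position)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7

def encode_server_name(entry):
    """Always emit a length, even when no entry was set (entry is None)."""
    data = b"" if entry is None else entry
    return encode_varint(len(data)) + data

def decode_server_name(buf, pos=0):
    """Return (entry or None, next position); length 0 means 'absent'."""
    length, pos = decode_varint(buf, pos)
    if length == 0:
        return None, pos              # stop parsing this data field
    return buf[pos:pos + length], pos + length
```

The receiver therefore always consumes at least one integer for the field, which keeps it aligned with the next data field.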
Frédéric Lécaille
ec1c10b839 MINOR: peers: Add debugging information to "show peers".
This patch adds three counters to help in debugging peers protocol issues
to "peer" struct:
	->no_hbt counts the number of reconnection period without receiving heartbeat
	->new_conn counts the number of reconnections after ->reconnect timeout expirations.
	->proto_err counts the number of protocol errors.
2019-11-19 14:48:28 +01:00
Frédéric Lécaille
33cab3c0eb MINOR: peers: Add TX/RX heartbeat counters.
Add RX/TX heartbeat counters to "peer" struct to have an idea about which
peer is alive or not.
Dump these counters values on the CLI via "show peers" command.
2019-11-19 14:48:25 +01:00
Frédéric Lécaille
470502b420 MINOR: peers: Always show the table info for disconnected peers.
This patch enables us to dump the stick-table information of remote or local peers
without an already opened peer session. This may also be the case for the local peer
during synchronizations with an old process (reload).
2019-11-19 14:48:21 +01:00
Emmanuel Hocdet
c5fdf0f3dc BUG/MINOR: ssl: fix crt-list neg filter for openssl < 1.1.1
Certificate selection in client_hello_cb (openssl >= 1.1.1) correctly
handles crt-list neg filter. Certificate selection for openssl < 1.1.1
has not been touched for a while: crt-list neg filter is not the same
as its counterpart and is wrong. Fix it to mimic the same behavior
as its counterpart.

It should be backported as far as 1.6.
2019-11-18 14:58:27 +01:00
Emmanuel Hocdet
c3775d28f9 BUG/MINOR: ssl: ssl_pkey_info_index ex_data can store a dereferenced pointer
With CLI cert update, sni_ctx can be removed at runtime. ssl_pkey_info_index
ex_data is filled with one of sni_ctx.kinfo pointer but SSL_CTX can be shared
between sni_ctx. Removing and freeing a sni_ctx can lead to a segfault when
ssl_pkey_info_index ex_data is used (in ssl_sock_get_pkey_algo). Removing the
dependency on ssl_pkey_info_index ex_data is the easiest way to fix the issue.
2019-11-18 14:55:32 +01:00
William Dauchy
f9af9d7f3c MINOR: init: avoid code duplication while setting identity
since the introduction of mworker, the setuid/setgid was duplicated in
two places; try to improve that by creating a dedicated function.
this patch does not introduce any functional change.

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2019-11-17 16:55:50 +01:00
William Dauchy
e039f26ba4 BUG/MINOR: init: fix set-dumpable when using uid/gid
in mworker mode used with uid/gid settings, it was not possible to get
a coredump despite the set-dumpable option.
indeed the prctl(2) manual page specifies that the dumpable attribute is reverted
to `/proc/sys/fs/suid_dumpable` in a few conditions, such as when the process
effective user or group is changed.

this patch moves the whole set-dumpable logic before the polling code in
order to catch all possible cases where we could have changed the
uid/gid. It however does not cover the possible segfault at startup.

this patch should be backported in 2.0.

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2019-11-17 16:55:24 +01:00
Willy Tarreau
846813260d [RELEASE] Released version 2.1-dev5
Released version 2.1-dev5 with the following main changes :
    - BUG/MEDIUM: ssl/cli: don't alloc path when cert not found
    - BUG/MINOR: ssl/cli: unable to update a certificate without bundle extension
    - BUG/MINOR: ssl/cli: fix an error when a file is not found
    - MINOR: ssl/cli: replace the default_ctx during 'commit ssl cert'
    - DOC: fix date and http_date keywords syntax
    - MINOR: peers: Add "log" directive to "peers" section.
    - BUG/MEDIUM: mux-h1: Disable splicing for chunked messages
    - BUG/MEDIUM: stream: Be sure to support splicing at the mux level to enable it
    - MINOR: flt_trace: Rename macros to print trace messages
    - MINOR: trace: Add a set of macros to trace events if HA is compiled with debug
    - MEDIUM: stream/trace: Register a new trace source with its events
    - MINOR: doc: http-reuse connection pool fix
    - BUG/MEDIUM: stream: Be sure to release allocated captures for TCP streams
    - MINOR: http-ana: Remove the unused function http_reset_txn()
    - BUG/MINOR: action: do-resolve now use cached response
    - BUG: dns: timeout resolve not applied for valid resolutions
    - DOC: management: fix typo on "cache_lookups" stats output
    - BUG/MINOR: stream: init variables when the list is empty
    - BUG/MEDIUM: tasks: Make tasklet_remove_from_tasklet_list() no matter the tasklet.
    - BUG/MINOR: queue/threads: make the queue unlinking atomic
    - BUG/MEDIUM: Make sure we leave the session list in session_free().
    - CLEANUP: session: slightly simplify idle connection cleanup logic
    - MINOR: memory: also poison the area on freeing
    - CLEANUP: cli: use srv_shutdown_streams() instead of open-coding it
    - CLEANUP: stats: use srv_shutdown_streams() instead of open-coding it
    - BUG/MEDIUM: listeners: always pause a listener on out-of-resource condition
    - BUILD: contrib/da: remove an "unused" warning
    - BUG/MEDIUM: filters: Don't call TCP callbacks for HTX streams
    - MEDIUM: filters: Adapt filters API to allow again TCP filtering on HTX streams
    - MINOR: freq_ctr: Make the sliding window sums thread-safe
    - MINOR: stream: Remove the lock on the proxy to update time stats
    - MINOR: counters: Add fields to store the max observed for {q,c,d,t}_time
    - MINOR: stats: Report max times in addition of the averages for sessions
    - MINOR: contrib/prometheus-exporter: Report metrics about max times for sessions
    - BUG/MINOR: contrib/prometheus-exporter: Rename some metrics
    - MINOR: contrib/prometheus-exporter: report the number of idle conns per server
    - DOC: Add missing stats fields in the management manual
    - BUG/MINOR: mux-h1: Properly catch parsing errors on payload and trailers
    - BUG/MINOR: mux-h1: Don't set CS_FL_EOS on a read0 when receiving data to pipe
    - MINOR: mux-h1: Set EOI on the conn-stream when EOS is reported in TUNNEL state
    - MINOR: sink: Set the default max length for a message to BUFSIZE
    - MINOR: ring: make the parse function automatically set the handler/release
    - BUG/MINOR: log: make "show startup-log" use a ring buffer instead
    - MINOR: stick-table: allow sc-set-gpt0 to set value from an expression
2019-11-15 18:49:37 +01:00
Cédric Dufour
0d7712dff0 MINOR: stick-table: allow sc-set-gpt0 to set value from an expression
Allow the sc-set-gpt0 action to set GPT0 to a value dynamically evaluated from
its <expr> argument (in addition to the existing static <int> alternative).
2019-11-15 18:24:19 +01:00
Willy Tarreau
869efd5eeb BUG/MINOR: log: make "show startup-log" use a ring buffer instead
The copy of the startup logs used to rely on a re-allocated memory area
on the fly, that would attempt to be delivered at once over the CLI. But
if it's too large (too many warnings) it will take time to start up, and
may not even show up on the CLI as it doesn't fit in a buffer.

The ring buffer infrastructure solves all this with no more code, let's
switch to this instead. It simply requires a parsing function to attach
the ring via ring_attach_cli() and all the rest is automatically handled.

Initially this was imagined as a code cleanup, until a test with a config
involving 100k backends and just one occurrence of
"load-server-state-from-file global" in the defaults section took approx
20 minutes to parse due to the O(N^2) cost of concatenating the warnings
resulting in ~1 TB of data to be copied, while it took only 0.57s with
the ring.

Ideally this patch should be backported to 2.0 and 1.9, though it relies
on the ring infrastructure which will then also need to be backported.
Configs able to trigger the bug are uncommon, so another workaround for
older versions without backporting the rings would consist in simply
limiting the size of the error message in print_message() to something
always printable, which will only return the first errors.
2019-11-15 15:50:16 +01:00
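
The O(N^2) cost mentioned in commit 869efd5eeb can be reproduced with a simple cost model. This sketch assumes that every append to the re-allocated area copies the whole log accumulated so far, while an append-only buffer copies each message once; both function names are hypothetical.

```python
def bytes_copied_by_reallocating(message_sizes):
    """Total bytes copied if each append copies the whole log so far."""
    total = 0
    log_size = 0
    for size in message_sizes:
        log_size += size
        total += log_size           # old contents plus the new message
    return total

def bytes_copied_by_appending(message_sizes):
    """Total bytes copied if each message is written exactly once."""
    return sum(message_sizes)
```

With 100,000 warnings of an assumed 200 bytes each, the first model gives about 10^12 bytes copied, the same order of magnitude as the ~1 TB quoted in the commit message, against 20 MB for the second.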
Willy Tarreau
fcf94981e4 MINOR: ring: make the parse function automatically set the handler/release
ring_attach_cli() is called by the keyword parsing function to dump a
ring to the CLI. It can only work with a specific handler and release
function. Let's make it set them appropriately instead of having the
caller know these functions. This way adding a command to dump a ring
is as simple as declaring a parsing function calling ring_attach_cli().
2019-11-15 15:48:12 +01:00
Christopher Faulet
a63a5c2c65 MINOR: sink: Set the default max length for a message to BUFSIZE
It was set to MAX_SYSLOG_LEN (1K). It is a bit short to print debug
traces, especially when part of a buffer is dumped. Now, the maximum length is
set to BUFSIZE (16K).
2019-11-15 15:10:19 +01:00
Christopher Faulet
466080da0e MINOR: mux-h1: Set EOI on the conn-stream when EOS is reported in TUNNEL state
It could help to distinguish client/server aborts from legitimate shutdowns for
reads.
2019-11-15 14:24:06 +01:00
Christopher Faulet
3f21611bdd BUG/MINOR: mux-h1: Don't set CS_FL_EOS on a read0 when receiving data to pipe
This is mandatory to process input one more time to add the EOM in the HTX
message and to set CS_FL_EOI on the conn-stream. Otherwise, in the stream, a
SHUTR will be reported on the corresponding channel without the EOI. It may be
erroneously interpreted as an abort.

This patch must be backported to 2.0 and 1.9.
2019-11-15 14:24:06 +01:00
Christopher Faulet
02a0253888 BUG/MINOR: mux-h1: Properly catch parsing errors on payload and trailers
Errors during the payload or the trailers parsing are reported with the
HTX_FL_PARSING_ERROR flag on the HTX message and not a negative return
value. This change was introduced when the functions to convert an H1 message to
HTX one were moved to a dedicated file. But the h1 mux was not fully updated
accordingly.

No backport needed except if the commits about file h1_htx.c are backported.
2019-11-15 14:24:06 +01:00
Christopher Faulet
2ac2574409 DOC: Add missing stats fields in the management manual
The following fields were missing: srv_icur, src_ilim, qtime_max, ctime_max,
rtime_max and ttime_max.
2019-11-15 14:24:06 +01:00
Christopher Faulet
20ab80c0c0 MINOR: contrib/prometheus-exporter: report the number of idle conns per server
This adds two extra metrics per server, one for the current number of idle
connections and one for the configured limit :

 * haproxy_server_idle_connections_current
 * haproxy_server_idle_connections_limit
2019-11-15 14:24:06 +01:00
Christopher Faulet
68b6968ecd BUG/MINOR: contrib/prometheus-exporter: Rename some metrics
The following metrics have been renamed without the "_http" part :

 * http_queue_time_average_seconds     => queue_time_average_seconds
 * http_connect_time_average_seconds   => connect_time_average_seconds
 * http_response_time_average_seconds  => response_time_average_seconds
 * http_total_time_average_seconds     => total_time_average_seconds

These metrics are reported per backend and per server and are not specific to
HTTP sessions.
2019-11-15 14:24:06 +01:00
Christopher Faulet
8fc027d468 MINOR: contrib/prometheus-exporter: Report metrics about max times for sessions
Now, for the sessions, the maximum times (queue, connect, response, total) are
reported in addition to the averages over the last 1024 connections. These
metrics are reported per backend and per server. Here are the metric names :

  * haproxy_backend_max_queue_time_seconds
  * haproxy_backend_max_connect_time_seconds
  * haproxy_backend_max_response_time_seconds
  * haproxy_backend_max_total_time_seconds

and

  * haproxy_server_max_queue_time_seconds
  * haproxy_server_max_connect_time_seconds
  * haproxy_server_max_response_time_seconds
  * haproxy_server_max_total_time_seconds

This patch is related to #272.
2019-11-15 14:24:01 +01:00
Christopher Faulet
0d1c2a65e8 MINOR: stats: Report max times in addition of the averages for sessions
Now, for the sessions, the maximum times (queue, connect, response, total) are
reported in addition to the averages over the last 1024 connections. These
values are called qtime_max, ctime_max, rtime_max and ttime_max.

This patch is related to #272.
2019-11-15 14:23:54 +01:00
Christopher Faulet
efb41f0d8d MINOR: counters: Add fields to store the max observed for {q,c,d,t}_time
For backends and servers, some average times for last 1024 connections are
already calculated. For the moment, the averages for the time passed in the
queue, the connect time, the response time (for HTTP session only) and the total
time are calculated. Now, in addition, the maximum times observed for these
values are also stored.

In addition, these new counters are cleared like all other max values with the CLI
command "clear counters".

This patch is related to #272.
2019-11-15 14:23:21 +01:00
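
The pairing of a sliding average with an observed maximum, as described in commit efb41f0d8d, can be sketched as follows. The class `TimeStat` and its update formula are illustrative assumptions, not HAProxy's counters code.

```python
class TimeStat:
    """Sliding-window average plus maximum observed value (sketch)."""

    def __init__(self, window=1024):
        self.window = window
        self.sum = 0.0              # sliding sum over roughly `window` samples
        self.max = 0                # largest value seen since the last clear

    def add(self, value):
        # Drop one "average sample" from the sum, then add the new one.
        self.sum = self.sum - self.sum / self.window + value
        if value > self.max:
            self.max = value

    def average(self):
        return self.sum / self.window

    def clear_max(self):
        """Models what "clear counters" does to max values."""
        self.max = 0
```

The average smooths out spikes over the window, which is exactly why a separate maximum is useful: a single slow connection stays visible until the max is cleared.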
Christopher Faulet
b927a9d866 MINOR: stream: Remove the lock on the proxy to update time stats
swrate_add() is now thread-safe. So the lock on the proxy is no longer needed to
update q_time, c_time, d_time and t_time.
2019-11-15 13:43:08 +01:00
Christopher Faulet
e2e8c6779e MINOR: freq_ctr: Make the sliding window sums thread-safe
swrate_add() and swrate_add_scaled() now rely on the CAS atomic operation. So
the sliding window sums are atomically updated.
2019-11-15 13:43:08 +01:00
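
A compare-and-swap retry loop of the kind commit e2e8c6779e relies on can be sketched as below. Python has no native CAS, so `AtomicCell` emulates one with a lock; the class, the helper names and the exact sliding-sum formula are assumptions made for illustration.

```python
import threading

class AtomicCell:
    """Toy atomic integer; the lock only emulates a hardware CAS."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        """Store `new` only if the cell still holds `expected`."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True

def atomic_update(cell, compute):
    """Retry until the value computed from a snapshot is stored intact."""
    while True:
        old = cell.load()
        if cell.compare_and_swap(old, compute(old)):
            return

def sliding_add(cell, window, value):
    """Sliding-window sum update done through the CAS loop (assumed formula)."""
    atomic_update(cell, lambda s: s - s // window + value)
```

If another thread changes the sum between the load and the swap, the swap fails and the update is recomputed from the fresh value, so no update is lost and no lock has to be held by the caller.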
Christopher Faulet
b2e58492b1 MEDIUM: filters: Adapt filters API to allow again TCP filtering on HTX streams
This change makes the payload filtering uniform between TCP and HTTP
filters. Now, in TCP, like in HTTP, there is only one callback responsible to
forward data. Thus, old callbacks, tcp_data() and tcp_forward_data(), are
replaced by a single callback function, tcp_payload(). This new callback gets
the offset in the payload to (re)start the filtering and the maximum amount of
data it can forward. It is the filter's responsibility to be compatible with HTX
streams. If not, it must not set the flag FLT_CFG_FL_HTX.

Because of this change, nxt and fwd offsets are no longer needed. Thus they are
removed from the filter structure with their update functions,
flt_change_next_size() and flt_change_forward_size(). Moreover, the trace filter
has been updated accordingly.

This patch breaks the compatibility with the old API. Thus it should probably
not be backported. But, AFAIK, there is no TCP filter, thus the breakage is very
limited.
2019-11-15 13:43:08 +01:00
Christopher Faulet
bb9a7e04bd BUG/MEDIUM: filters: Don't call TCP callbacks for HTX streams
For now, TCP callbacks are incompatible with the HTX streams because they are
designed to manipulate raw buffers. A new callback will probably be added to be
used in both modes, raw and HTX. So, for HTX streams, these callbacks are
ignored. This should not be a real problem because there are no known filters,
except the trace filter, implementing these callbacks.

This patch must be backported to 2.0 and 1.9.
2019-11-15 13:43:08 +01:00
Willy Tarreau
ed295cc344 BUILD: contrib/da: remove an "unused" warning
The rcsid variable is static and unused, causing a build warning. Let's
just add __attribute__((unused)) to shut the warning.

This may be backported to 2.0.
2019-11-15 13:39:16 +01:00
Willy Tarreau
93604edb65 BUG/MEDIUM: listeners: always pause a listener on out-of-resource condition
A corner case was opened in the listener_accept() code by commit 3f0d02bbc2
("MAJOR: listener: do not hold the listener lock in listener_accept()"). The
issue is when one listener (or a group of) managed to eat all the proxy's or
all the process's maxconn, and another listener tries to accept a new socket.
This results in the atomic increment to detect the excess connection count
and immediately abort, without pausing the listener, thus the call is
immediately performed again. This doesn't happen when the test is run on a
single listener because this listener got limited when crossing the limit.
But with 2 or more listeners, we don't have this luxury.

The solution consists in limiting the listener as soon as we have to
decline accepting an incoming connection. This means that the listener
will not be marked full yet if it gets the exact connection count but
this is not a problem in practice since all other listeners will only be
marked full after their first attempt. Thus from now on, a listener is
only full once it has already failed taking an incoming connection.

This bug was definitely responsible for the unreproducible occasional
reports of high CPU usage showing epoll_wait() returning immediately
without accepting an incoming connection, like in bug #129.

This fix must be backported to 1.9 and 1.8.
2019-11-15 10:34:51 +01:00
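
The behaviour change in commit 93604edb65 can be modelled with two listeners sharing one proxy limit. This is a toy model under stated assumptions: `Proxy`, `Listener` and the `limited` flag are hypothetical names, and the plain increment stands in for the atomic one.

```python
class Proxy:
    def __init__(self, maxconn):
        self.maxconn = maxconn
        self.conns = 0

class Listener:
    def __init__(self, proxy):
        self.proxy = proxy
        self.limited = False        # True: no longer polled until a slot frees up

    def try_accept(self):
        """Try to take one connection slot; pause the listener on failure."""
        self.proxy.conns += 1       # stands in for the atomic increment
        if self.proxy.conns > self.proxy.maxconn:
            self.proxy.conns -= 1   # undo the excess count
            self.limited = True     # the fix: pause instead of just aborting
            return False
        return True
```

A listener that takes the last slot is not marked limited yet; it only becomes limited after its first declined attempt, so a poller that skips limited listeners no longer spins on them.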
Willy Tarreau
af7ea814f9 CLEANUP: stats: use srv_shutdown_streams() instead of open-coding it
The "shutdown sessions" admin-mode command used to open-code the list
traversal while there's already a function for this: srv_shutdown_streams().
Better use it.
2019-11-15 07:06:46 +01:00
Willy Tarreau
d9e26a7dd5 CLEANUP: cli: use srv_shutdown_streams() instead of open-coding it
The "shutdown session server" command used to open-code the list traversal
while there's already a function for this: srv_shutdown_streams(). Better
use it.
2019-11-15 07:06:46 +01:00
Willy Tarreau
da52035a45 MINOR: memory: also poison the area on freeing
Doing so sometimes helps detect some UAF situations without the overhead
associated with the DEBUG_UAF define.
2019-11-15 07:06:46 +01:00
Willy Tarreau
5de7817ae8 CLEANUP: session: slightly simplify idle connection cleanup logic
Since previous commit a132e5efa9 ("BUG/MEDIUM: Make sure we leave the
session list in session_free().") it's pointless to delete the conn
element inside "if" blocks given that the second test is always true
as well. Let's simplify this with a single LIST_DEL_INIT() before the
test.
2019-11-15 07:06:46 +01:00
Olivier Houchard
a132e5efa9 BUG/MEDIUM: Make sure we leave the session list in session_free().
In session_free(), if we're about to destroy a connection that had no mux,
make sure we leave the session_list before calling conn_free(). Otherwise,
conn_free() would call session_unown_conn(), which would potentially free
the associated srv_list, but session_free() also frees it, so that would
lead to a double free, and random memory corruption.

This should be backported to 1.9 and 2.0.
2019-11-14 19:25:49 +01:00
Willy Tarreau
9ada030697 BUG/MINOR: queue/threads: make the queue unlinking atomic
There is a very short race in the queues which happens in the following
situation:
  - stream A on thread 1 is being processed by a server
  - stream B on thread 2 waits in the backend queue for a server
  - stream B on thread 2 is fed up with waiting and expires, calls
    stream_free() which calls pendconn_free(), which sees the
    stream attached
  - at the exact same instant, stream A finishes on thread 1, sees
    one stream is waiting (B), detaches it and wakes it up
  - stream B continues pendconn_free() and calls pendconn_unlink()
  - pendconn_unlink() now detaches the node again and performs a
    second deletion (harmless since idempotent), and decrements
    srv/px->nbpend again

=> the number of connections on the proxy or server may reach -1 if/when
   this race occurs.

It is extremely tight as it can only occur during the test on p->leaf_p
though it has been witnessed at least once. The solution consists in
testing leaf_p again once the lock is held to make sure the element was
not removed in the meantime.

This should be backported to 2.0 and 1.9, probably even 1.8.
2019-11-14 14:58:39 +01:00
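
The fix in commit 9ada030697 follows the classic "test, lock, test again" pattern. The sketch below uses hypothetical Python classes; the `linked` flag stands in for the p->leaf_p test, and `nbpend` for the srv/px pending counter.

```python
import threading

class BackendQueue:
    def __init__(self):
        self.lock = threading.Lock()
        self.nbpend = 0             # number of pending connections

class PendConn:
    def __init__(self, queue):
        self.queue = queue
        self.linked = True          # stands in for p->leaf_p != NULL
        queue.nbpend += 1

def pendconn_unlink(p):
    """Detach p from its queue; return True only if this call detached it."""
    if not p.linked:                # cheap test without the lock
        return False
    with p.queue.lock:
        if not p.linked:            # re-test: someone may have detached it
            return False
        p.linked = False
        p.queue.nbpend -= 1         # decremented exactly once per element
        return True
```

Whichever side wins the lock performs the single decrement; the loser sees the element already detached and leaves the counter alone, so it can no longer reach -1.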
Olivier Houchard
7031e3dace BUG/MEDIUM: tasks: Make tasklet_remove_from_tasklet_list() no matter the tasklet.
In tasklet_remove_from_tasklet_list(), we can be called for a tasklet that is
either in the private task list, or in the shared tasklet list. Take that into
account and always use MT_LIST_DEL() to remove it, otherwise if we're in the
shared list and another thread attempts to add a tasklet in it, bad things
will happen.

__tasklet_remove_from_tasklet_list() is left unchanged, it's only supposed
to be used by process_runnable_task() to remove task/tasklets from the private
task list.

This should not be backported.
This should fix github issue #357.
2019-11-09 18:27:17 +01:00
Jerome Magnin
2f44e8843a BUG/MINOR: stream: init variables when the list is empty
We need to call vars_init() when the list is empty otherwise we
can't use variables in the response scope. This regression was
introduced by cda7f3f5 (MINOR: stream: don't prune variables if
the list is empty).

The following config reproduces the issue:

 defaults
   mode http

 frontend in
   bind *:11223
   http-request set-var(req.foo) str("foo")  if { path /bar }
   http-request set-header bar %[var(req.foo)]  if { var(req.foo) -m found }
   http-response set-var(res.bar) str("bar")
   http-response set-header foo %[var(res.bar)] if { var(res.bar) -m found }
   use_backend out

 backend out
   server s1 127.0.0.1:11224

 listen back
   bind *:11224
   http-request deny deny_status 200

 > GET /ba HTTP/1.1
 > Host: localhost:11223
 > User-Agent: curl/7.66.0
 > Accept: */*
 >
 < HTTP/1.0 200 OK
 < Cache-Control: no-cache
 < Content-Type: text/html

 > GET /bar HTTP/1.1
 > Host: localhost:11223
 > User-Agent: curl/7.66.0
 > Accept: */*
 >
 < HTTP/1.0 200 OK
 < Cache-Control: no-cache
 < Content-Type: text/html
 < foo: bar

This must be backported as far as 1.9.
2019-11-09 18:25:41 +01:00
Willy Tarreau
7297429fa5 DOC: management: fix typo on "cache_lookups" stats output
The trailing "s" was missing.
2019-11-08 07:29:34 +01:00
Baptiste Assmann
f50e1ac444 BUG: dns: timeout resolve not applied for valid resolutions
Documentation states that the interval between 2 DNS resolutions is
driven by "timeout resolve <time>" directive.
From a code point of view, this was applied unless the latest status of
the resolution was VALID. In such a case, "hold valid" was enforced.
This is a bug, because "hold" timers are not here to drive how often we
want to trigger a DNS resolution, but rather how long we want to keep a piece of
information when the status of the resolution itself has changed.
This avoids flapping and prevents shutting down an entire backend when a
DNS server is not answering.

This issue was reported by hamshiva in github issue #345.

Backport status: 1.8
2019-11-07 18:50:07 +01:00
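
The difference described in commit f50e1ac444 comes down to which timer drives the next query. The function below is a hypothetical model of that choice, with the `fixed` switch added purely to contrast the old and new behaviour.

```python
def next_resolution_delay(status, timeout_resolve, hold_valid, fixed=True):
    """Return the delay before the next DNS resolution is triggered."""
    if not fixed and status == "VALID":
        return hold_valid           # old behaviour, described as a bug
    return timeout_resolve          # "timeout resolve" always drives the interval
```

With "timeout resolve" at 1s and "hold valid" at 10s, a valid resolution used to be refreshed every 10s; after the fix it is refreshed every 1s, and the hold timer only governs how long the last valid answer is kept.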
Baptiste Assmann
7264dfe949 BUG/MINOR: action: do-resolve now use cached response
As reported by David Birdsong on the ML, the HTTP action do-resolve does
not use the DNS cache.
Actually, the action is "registered" to the resolution for said name to
be resolved and waits until another requester triggers it. Once the
resolution is finished, then the action is updated with the result.
To trigger this, you must have a server with runtime DNS resolution
enabled and run a do-resolve action with the same fqdn, and both must use the
same resolvers section.

This patch fixes this behavior by ensuring the resolution associated to
the action has a valid answer which is not considered as expired. If
those conditions are valid, then we can use it (it's the "cache").

Backport status: 2.0
2019-11-07 18:46:55 +01:00
Christopher Faulet
fee726ffa7 MINOR: http-ana: Remove the unused function http_reset_txn()
Since the legacy HTTP mode was removed, the stream is always released at the end
of each HTTP transaction and a new one is created to handle the next request for
keep-alive connections. So the HTTP transaction is no longer reset and the
function http_reset_txn() can be removed.
2019-11-07 15:32:52 +01:00
Christopher Faulet
5939925a38 BUG/MEDIUM: stream: Be sure to release allocated captures for TCP streams
All TCP and HTTP captures are stored in 2 arrays, one for the request and
another for the response. In HAProxy 1.5, these arrays are part of the HTTP
transaction and thus are released during its cleanup. Because in this version,
the transaction is part of the stream (in 1.5, streams are still called
sessions), the cleanup is always performed, for HTTP and TCP streams.

In HAProxy 1.6, the HTTP transaction was moved out from the stream and is now
dynamically allocated only when required (because of an HTTP proxy or an HTTP
sample fetch). In addition, still in 1.6, the captures arrays were moved from
the HTTP transaction to the stream. This way, it is still possible to capture
elements from TCP rules for a full TCP stream. Unfortunately, the release is
still exclusively performed during the HTTP transaction cleanup. Thus, for a TCP
stream where the HTTP transaction is not required, the TCP captures, if any, are
never released.

Now, all captures are released when the stream is freed. This fixes the memory
leak for TCP streams. For streams with an HTTP transaction, the captures are now
released when the transaction is reset and not systematically during its
cleanup.

This patch must be backported as far as 1.6.
2019-11-07 15:32:52 +01:00
Lukas Tribus
e8adfeb84b MINOR: doc: http-reuse connection pool fix
Since 1.9 we actually do use a connection pool, configurable with
pool-max-conn.

Update the documentation in this regard.

Must be backported to 1.9.
2019-11-06 11:52:07 +01:00