788 Commits

Author SHA1 Message Date
Olivier Houchard
602bf7d2ea MEDIUM: streams: Add a new http action, disable-l7-retry.
Add a new action for http-request, disable-l7-retry, that can be used to
disable any attempt at retry requests (see retry-on) if it fails for any
reason other than a connection failure.
This is useful for example to make sure POST requests aren't retried.
2019-05-10 17:49:09 +02:00
Christopher Faulet
7a3367cca0 BUG/MINOR: stream: Attach the read side on the response as soon as possible
A backend stream-interface attached to a reused connection remains in the state
SI_ST_CONN until some data are sent to validate the connection. But when the
url_param algorithm is used to balance connections, no data are sent while the
connection is not established. So it is a chicken and egg situation.

To solve the problem, if no error is detected and when the request channel is
waiting for the connect(), we mark the read side as attached on the response
channel as soon as possible and we wake the request channel up once. This
happens in 2 places. The first one is right after the connect(), when the
stream-interface is still in state SI_ST_CON, in the function
sess_update_st_con_tcp(). The second one is when an applet is used instead of a
real connection to a server, in the function sess_prepare_conn_req(). In fact,
it is done when the backend stream-interface is set to the state SI_ST_EST.

This patch must be backported to 1.9.
2019-05-10 11:47:00 +02:00
Olivier Houchard
a254a37ad7 MEDIUM: streams: Add the ability to retry a request on L7 failure.
When running in HTX mode, if we sent the request, but failed to get the
answer, either because the server just closed its socket, we hit a server
timeout, or we get a 404, 408, 425, 500, 501, 502, 503 or 504 error,
attempt to retry the request, exactly as if we just failed to connect to
the server.

To do so, add a new backend keyword, "retry-on".

It accepts a list of keywords, which can be "none" (never retry),
"conn-failure" (we failed to connect, or to do the SSL handshake),
"empty-response" (the server closed the connection without answering),
"response-timeout" (we timed out while waiting for the server response),
or "404", "408", "425", "500", "501", "502", "503" and "504".

The default is "conn-failure".
2019-05-04 10:19:56 +02:00
Olivier Houchard
f4bda993dd BUG/MEDIUM: streams: Don't add CF_WRITE_ERROR if early data were rejected.
In sess_update_st_con_tcp(), if we have an error on the stream_interface
because we tried to send early_data but failed, don't flag the request
channel as CF_WRITE_ERROR, or we will never reach the analyser that sends
back the 425 response.

This should be backported to 1.9.
2019-05-03 22:23:41 +02:00
Willy Tarreau
4ad574fbe2 MEDIUM: streams: measure processing time and abort when detecting bugs
On some occasions we've had loops happening when processing actions
(e.g. a yield not being well understood) resulting in analysers being
called in loops until the analysis timeout without incrementing the
stream's call count, thus this type of bug cannot be caught by the
current protection system.

What this patch proposes is to start to measure the time spent in analysers
when profiling is enabled on the thread, in order to detect if a stream is
really misbehaving. In this case we measured the consumed CPU time, not the
wall clock time, so as not to be affected by possible noisy neighbours
sharing the same CPU. When more than 100ms are spent in an analyser, we
trigger the stream_dump_and_crash() function to report the anomaly.

The choice of 100ms comes from the fact that regular calls only take around
1 microsecond and it seems reasonable to accept a degradation factor of
100000, which covers very slow machines such as home gateways running on
sub-ghz processors, with extremely heavy configurations. Some complete
tests show that even this common bogus map_regm() entry supposedly designed
to extract a port from an IP:port entry does not trigger the timeout (25 ms
evaluation time for a 4kB header, exercise left to the reader to spot the
mistake) :

   ([0-9]{0,3}).([0-9]{0,3}).([0-9]{0,3}).([0-9]{0,3}):([0-9]{0,5}) \5

However this one purposely designed to kill haproxy definitely dies as it
manages to completely freeze the whole process for more than one second
on a 4 GHz CPU for only 120 bytes in :

   (.{0,20})(.{0,20})(.{0,20})(.{0,20})(.{0,20})b \1

This protection will definitely help during the code stabilization period
and may possibly be left enabled later depending on reported issues or not.

If you've noticed that your workload is affected by this patch, please
report it as you have very likely found a bug. And in the mean time you
can turn profiling off to disable it.
2019-04-26 14:30:59 +02:00
Willy Tarreau
3d07a16f14 MEDIUM: stream/debug: force a crash if a stream spins over itself forever
If a stream is caught spinning over itself at more than 100000 loops per
second and for more than one second, the process will be aborted and the
offender reported on the console and logs. Typical figures usually are just
a few tens to hundreds per second over a very short time so there is a huge
margin here. Using even higher values could also work but there is the risk
of not being able to catch offenders if multiple ones start to bug at the
same time and share the load. This code should ideally be disabled for
stable releases, though in theory nothing should ever trigger it.
2019-04-26 13:16:14 +02:00
Willy Tarreau
71c07ac65a MINOR: stream/debug: make a stream dump and crash function
During 1.9 development (and even a bit after) we've started to face a
significant number of situations where streams were abusively spinning
due to an uncaught error flag or complex conditions that couldn't be
correctly identified. Sometimes streams wake appctx up and conversely
as well. More importantly when this happens the only fix is to restart.

This patch adds a new function to report a serious error, some relevant
info and to crash the process using abort() so that a core dump is
available. The purpose will be for this function to be called in various
situations where the process is unfixable. It will help detect these
issues much earlier during development and may even help fixing test
platforms which are able to automatically restart when such a condition
happens, though this is not the primary purpose.

This patch only provides the function and doesn't use it yet.
2019-04-26 13:15:56 +02:00
Willy Tarreau
22d63a24d9 MINOR: applet: measure and report an appctx's call rate in "show sess"
Very similarly to previous commit doing the same for streams, we now
measure and report an appctx's call rate. This will help catch applets
which do not consume all their data and/or which do not properly report
that they're waiting for something else. Some of them like peers might
theorically be able to exhibit some occasional peeks when teaching a
full table to a nearby peer (e.g. the new replacement process), but
nothing close to what a bogus service can do so there is no risk of
confusion.
2019-04-24 16:04:23 +02:00
Willy Tarreau
2e9c1d2960 MINOR: stream: measure and report a stream's call rate in "show sess"
Quite a few times some bugs have made a stream task incorrectly
handle a complex combination of events, which was often reported as
"100% CPU", and was usually caused by the event not being properly
identified and flushed, and the stream's handler called in loops.

This patch adds a call rate counter to the stream struct. It's not
huge, it's really inexpensive (especially compared to the rest of the
processing function) and will easily help spot such tasks in "show sess"
output, possibly even allowing to kill them.

A future patch should probably consist in alerting when they're above a
certain threshold, possibly sending a dump and killing them. Some options
could also consist in aborting in order to get an analyzable core dump
and let a service manager restart a fresh new process.
2019-04-24 16:04:23 +02:00
Willy Tarreau
69b5a7f1a3 CLEANUP: task: report calls as unsigned in show sess
The "show sess" output used signed ints to report the number of calls,
which is confusing for runaway tasks where the call count can turn
negative.
2019-04-24 16:04:23 +02:00
Christopher Faulet
5e1a9d715e BUG/MEDIUM: stream: Fix the way early aborts on the client side are handled
A regression was introduced with the commit c9aecc8ff ("BUG/MEDIUM: stream:
Don't request a server connection if a shutw was scheduled"). Among other this,
it breaks the CLI when the shutr on the client side is handled with the client
data. To depend on the flag CF_SHUTW_NOW to not establish the server connection
when an error on the client side is detected is the right way to fix the bug,
because this flag may be set without any error on the client side.

So instead, we abort the request where the error is handled and only when the
backend stream-interface is in the state SI_ST_INI. This way, there is no
ambiguity on the reason why the abort accurred. The stream-interface is also
switched to the state SI_ST_CLO.

This patch must be backported to 1.9. If the commit c9aecc8ff is backported to
previous versions, this one MUST also be backported. Otherwise, it MAY be
backported to older versions that 1.9 with caution.
2019-04-23 21:20:47 +02:00
Frédéric Lécaille
bed883abe8 BUG/MAJOR: stream: Missing DNS context initializations.
Fix some missing initializations wich came with 333939c commit (MINOR: action:
new '(http-request|tcp-request content) do-resolve' action). The DNS contexts of
streams which were allocated were not initialized by stream_new(). This leaded to
accesses to non-allocated memory when freeing these contexts with stream_free().
2019-04-23 20:24:11 +02:00
Baptiste Assmann
333939c2ee MINOR: action: new '(http-request|tcp-request content) do-resolve' action
The 'do-resolve' action is an http-request or tcp-request content action
which allows to run DNS resolution at run time in HAProxy.
The name to be resolved can be picked up in the request sent by the
client and the result of the resolution is stored in a variable.
The time the resolution is being performed, the request is on pause.
If the resolution can't provide a suitable result, then the variable
will be empty. It's up to the admin to take decisions based on this
statement (return 503 to prevent loops).

Read carefully the documentation concerning this feature, to ensure your
setup is secure and safe to be used in production.

This patch creates a global counter to track various errors reported by
the action 'do-resolve'.
2019-04-23 11:41:52 +02:00
Christopher Faulet
c54e4b053d BUG/MEDIUM: stream: Don't request a server connection if a shutw was scheduled
If a shutdown for writes was performed on the client side (CF_SHUTW is set on
the request channel) while the server connection is still unestablished (the
stream-int is in the state SI_ST_INI), then it is aborted. It must also be
aborted when the shudown for write is pending (only CF_SHUTW_NOW is
set). Otherwise, some errors on the request channel can be ignored, leaving the
stream in an undefined state.

This patch must be backported to 1.9. It may probably be backported to all
suported versions, but it is unclear if the bug is visbile for older versions
than 1.9. So it is probably safer to wait bug reports on these versions to
backport this patch.
2019-04-19 15:53:23 +02:00
Olivier Houchard
3f795f76e8 MEDIUM: tasks: Merge task_delete() and task_free() into task_destroy().
task_delete() was never used without calling task_free() just after, and
task_free() was only used on error pathes to destroy a just-created task,
so merge them into task_destroy(), that will remove the task from the
wait queue, and make sure the task is either destroyed immediately if it's
not in the run queue, or destroyed when it's supposed to run.
2019-04-18 10:10:04 +02:00
Christopher Faulet
0e160ff5bb MINOR: stream: Set a flag when the stream uses the HTX
The flag SF_HTX has been added to know when a stream uses the HTX or not. It is
set when an HTX stream is created. There are 2 conditions to set it. The first
one is when the HTTP frontend enables the HTX. The second one is when the attached
conn_stream uses an HTX multiplexer.
2019-04-12 22:06:53 +02:00
Olivier Houchard
56897e20a3 BUG/MEDIUM: streams: Only re-run process_stream if we're in a connected state.
In process_stream(), only try again when there's the SI_FL_ERR flag and we're
in a connected state, otherwise we can loop forever.
It used to work because si_update_both() bogusly removed the SI_FL_ERR flag,
and it would never be set at this point. Now it does, so take that into
account.
Many, many thanks to Maciej Zdeb for reporting the problem, and helping
investigating it.

This should be backported to 1.9.
2019-04-12 13:14:48 +02:00
Olivier Houchard
86dcad6c62 BUG/MEDIUM: stream: Don't clear the stream_interface flags in si_update_both.
In commit d7704b534, we introduced and expiration flag on the stream interface,
which is used for the connect, the queue and the turn around. Because the
turn around state isn't an error, the flag was reset in process_stream(), and
later in commit cff6411f9 when introducing the SI_FL_ERR flag, the cleanup
of the flag at this place was erroneously generalized.
To fix this, the SI_FL_EXP flag is only cleared at the end of the turn around
state, and nobody should clear the stream interface flags anymore.

This should be backported to 1.9, it has no known impact on older versions.
2019-04-09 19:31:22 +02:00
Olivier Houchard
120f64a8c4 BUG/MEDIUM: streams: Store prev_state before calling si_update_both().
As si_update_both() sets prev_state to state for each stream_interface, if
we want to check it changed, copy it before calling si_update_both().

This should be backported to 1.9.
2019-04-09 19:31:22 +02:00
Olivier Houchard
39cc020af1 BUG/MEDIUM: streams: Don't remove the SI_FL_ERR flag in si_update_both().
Don't inconditionally remove the SI_FL_ERR code in si_update_both(), which
is called at the end of process_stream(). Doing so was a bug that was there
since the flag was introduced, because we were always setting si->flags to
SI_FL_NONE, however we don't want to lose that one, except if we will retry
connecting, so only remove it in sess_update_st_cer().

This should be backported to 1.9.
2019-04-09 19:31:22 +02:00
Christopher Faulet
769d0e98b8 BUG/MEDIUM: http/htx: Fix handling of the option abortonclose
Because the flag CF_SHUTR is no more set to mark the end of the message by the
H2 multiplexer, we can rely on it again to detect aborts. there is no more need
to make a check on the flag SI_FL_CLEAN_ABRT when the option abortonclose is
enabled. So, this option should work as before for h2 clients.

This patch must be backported to 1.9 with the previous EOI patches.
2019-03-25 06:55:13 +01:00
Christopher Faulet
2571bc6410 MINOR: http/applets: Handle all applets intercepting HTTP requests the same way
In addition to stats and cache applets, there are also HTTP applet services
declared in an http-request rule. All these applets are now handled the same
way. Among other things, the header Expect is handled at the same place for all
these applets.
2019-03-19 09:54:20 +01:00
Willy Tarreau
679bba13f7 MINOR: init: report the list of optionally available services
It's never easy to guess what services are built in. We currently have
the prometheus exporter in contrib/ which is the only extension for now.
Let's enumerate all available ones just like we do for filterr and pollers.
2019-03-19 08:08:10 +01:00
Olivier Houchard
dc6111e864 MEDIUM: stream: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Willy Tarreau
1ef724e216 BUILD/MINOR: stream: avoid a build warning with threads disabled
gcc 6+ complains about a possible null-deref here due to the test in
objt_server() :

       if (objt_server(s->target))
           HA_ATOMIC_ADD(&objt_server(s->target)->counters.retries, 1);

Let's simply change it to __objt_server(). This can be backported to
1.9 and 1.8.
2019-02-12 10:59:32 +01:00
Willy Tarreau
09c4bab411 BUG/MAJOR: stream: avoid double free on unique_id
Commit 32211a1 ("BUG/MEDIUM: stream: Don't forget to free
s->unique_id in stream_free().") addressed a memory leak but in
exchange may cause double-free due to the fact that after freeing
s->unique_id it doesn't null it and then calls http_end_txn()
which frees it again. Thus the process quickly crashes at runtime.

This fix must be backported to all stable branches where the
aforementioned patch was backported.
2019-02-10 18:49:37 +01:00
Olivier Houchard
32211a17eb BUG/MEDIUM: stream: Don't forget to free s->unique_id in stream_free().
In stream_free(), free s->unique_id. We may still have one, because it's
allocated in log.c::strm_log() no matter what, even if it's a TCP connection
and thus it won't get free'd by http_end_txn().
Failure to do so leads to a memory leak.

This should probably be backported to all maintained branches.
2019-02-01 18:17:36 +01:00
Willy Tarreau
28e581b21c BUG/MINOR: stream: don't close the front connection when facing a backend error
In 1.5-dev13, a bug was introduced by commit e3224e870 ("BUG/MINOR:
session: ensure that we don't retry connection if some data were sent").
If a connection error is reported after some data were sent (and lost),
we used to accidently mark the front connection as being in error instead
of only the back one because the two direction flags were applied to the
same channel. This case is extremely rare with raw connections but can
happen a bit more often with multiplexed streams. This will result in
the error not being correctly reported to the client.

This patch can be backported to all supported versions.
2019-01-31 19:38:25 +01:00
Willy Tarreau
f7a259d46f MINOR: stream: don't wait before retrying after a failed connection reuse
When a connection reuse fails, we must not wait before retrying, as most
likely the issue is related to the reused connection and not to the server
itself.

This should be backported to 1.9, though it depends on previous patches
dealing with SI_ST_CON for connection reuse.
2019-01-24 19:06:43 +01:00
Willy Tarreau
bf66bd1b8b MEDIUM: stream-int: always mark pending outgoing SI_ST_CON
Before the first send() attempt, we should be in SI_ST_CON, not
SI_ST_EST, since we have not yet attempted to send and we are
allowed to retry. This is particularly important with complex
outgoing muxes which can fail during the first send attempt (e.g.
failed stream ID allocation).

It only requires that sess_update_st_con_tcp() knows about this
possibility, as we must not forcefully close a reused connection
when facing an error in this case, this will be handled later.

This may be backported to 1.9 with care after some observation period.
2019-01-24 19:06:43 +01:00
Willy Tarreau
64f6945fec BUG/MINOR: stream: take care of synchronous errors when trying to send
We currently detect a number of situations where we have to immediately
deal with a state change, but we failed to consider the case of the
synchronous error reported on the stream-interface. We definitely do not
want to have to wait for a timeout to handle this one, especially at the
beginning of the connection when it can lead to an immediate retry.

This should be backported to 1.9.
2019-01-24 19:06:43 +01:00
Willy Tarreau
7778b59be1 MINOR: stream/cli: report more info about the HTTP messages on "show sess all"
The "show sess all" command didn't allow to detect whether compression
is in use for a given stream, which is sometimes annoying. Let's add a
few more info about the HTTP messages, namely the flags, body len, chunk
len and the "next" pointer.
2019-01-07 10:38:10 +01:00
Willy Tarreau
adf7a15bd1 MINOR: stream/cli: fix the location of the waiting flag in "show sess all"
The "waiting" flag indicates if the stream is waiting for some memory,
and was placed on the same output line as the txn for ease of reading.
But since 1.6 the txn is not part of the stream anymore so this output
was placed under a condition, resulting in "waiting" to appear only
when a txn is present. Let's move it upper, closer to the stream's
flags to fix this.

This may safely be backported though it has little value for older
versions.
2019-01-07 10:10:07 +01:00
Willy Tarreau
b84e67fee9 MINOR: stream/htx: add the HTX flags output in "show sess all"
Commit b9af88151 ("MINOR: stream/htx: Add info about the HTX structs in
"show sess all" command") accidently forgot the flags on the request
path, it was only on the response path.

It makes sense to backport this to 1.9 so that both outputs are the same.
2019-01-07 10:01:34 +01:00
Willy Tarreau
e6e52366c1 BUG/MEDIUM: cli: make "show sess" really thread-safe
This one used to rely on a few spin locks around lists manipulations
only but 1) there were still a few races (e.g. when aborting, or
between STAT_ST_INIT and STAT_ST_LIST), and 2) after last commit
which dumps htx info it became obvious that dereferencing the buffer
contents is not safe at all.

This patch uses the thread isolation from the rendez-vous point
instead, to guarantee that nothing moves during the dump. It may
make the dump a bit slower but it will be 100% safe.

This fix must be backported to 1.9, and possibly to 1.8 which likely
suffers from the short races above, eventhough they're extremely
hard to trigger.
2019-01-04 18:06:49 +01:00
Christopher Faulet
224a2d705a MINOR: stream: Add the subscription events of SIs in "show sess all" command
It could be helpful to debug frozen sessions.

The patch may be backported to 1.9.
2019-01-04 15:23:02 +01:00
Christopher Faulet
b9af88151a MINOR: stream/htx: Add info about the HTX structs in "show sess all" command
For HTX streams, info about the HTX structure is now dumped for the request and
the response channels in "show sess all" command.

The patch may be backported to 1.9.
2019-01-04 15:21:03 +01:00
Willy Tarreau
14bfe9af12 CLEANUP: stream-int: consistently call the si/stream_int functions
As long-time changes have accumulated over time, the exported functions
of the stream-interface were almost all prefixed "si_<something>" while
most private ones (mostly callbacks) were called "stream_int_<something>".
There were still a few confusing exceptions, which were addressed to
follow this shcme :
  - stream_sock_read0(), only used internally, was renamed stream_int_read0()
    and made static
  - stream_int_notify() is only private and was made static
  - stream_int_{check_timeouts,report_error,retnclose,register_handler,update}
    were renamed si_<something>.

Now it is clearer when checking one of these if it risks to be used outside
or not.
2018-12-19 15:25:43 +01:00
Christopher Faulet
2dba1a50c3 BUG/MEDIUM: stream: Forward the right amount of data before infinite forwarding
Before setting the infinite forward, we first forward all remaining input data
from the channel. Of course for HTX streams, this must be done using the amount
of data in the HTX message not in the channel (which appears as full because of
the HTX).
2018-12-19 13:45:53 +01:00
Willy Tarreau
fb3b1b00e2 MINOR: cli/stream: add the conn_stream in "show sess" output
The "show sess" output didn't report the conn_stream nor its flags,
which was a bit problematic. Now it's done.
2018-12-18 14:30:09 +01:00
Olivier Houchard
25b4015363 BUG/MEDIUM: connection: Just make sure we closed the fd on connection failure.
When the connection failed, we don't really want to close the conn_stream,
as we're probably about to retry, so just make sure the file descriptor is
closed.
2018-12-13 17:32:15 +01:00
Willy Tarreau
0007d0afbc CLEANUP: stream: remove SF_TUNNEL, SF_INITIALIZED, SF_CONN_TAR
These flags haven't been used for a while. SF_TUNNEL was reintroduced
by commit d62b98c6e ("MINOR: stream: don't set backend's nor response
analysers on SF_TUNNEL") to handle the two-level streams needed to
deal with the first model for H2, and was not removed after this model
was abandonned. SF_INITIALIZED was only set. SF_CONN_TAR was never
referenced at all.
2018-12-11 18:01:38 +01:00
Willy Tarreau
b96b77ed6e REORG: htx: merge types+proto into common/htx.h
All the HTX definition is self-contained and doesn't really depend on
anything external since it's a mostly protocol. In addition, some
external similar files (like h2) also placed in common used to rely
on it, making it a bit awkward.

This patch moves the two htx.h files into a single self-contained one.
The historical dependency on sample.h could be also removed since it
used to be there only for http_meth_t which is now in http.h.
2018-12-11 17:15:04 +01:00
William Lallemand
459e18e9e7 MINOR: cli: use pcli_flags for prompt activation
Instead of using a variable to activate the prompt, we just use a flag.
2018-12-11 17:05:40 +01:00
William Lallemand
ebf61804ef MEDIUM: cli: handle payload in CLI proxy
The CLI proxy was not handling payload. To do that, we needed to keep a
connection active on a server and to transfer each new line over that
connection until we receive a empty line.

The CLI proxy handles the payload in the same way that the CLI do it.

Examples:

   $ echo -e "@1;add map #-1 <<\n$(cat data)\n" | socat /tmp/master-socket -

   $ socat /tmp/master-socket readline
   prompt
   master> @1
   25130> add map #-1 <<
   + test test
   + test2 test2
   + test3 test3
   +

   25130>
2018-12-11 17:05:36 +01:00
William Lallemand
5b80fa2864 MINOR: cli: parse prompt command in the CLI proxy
Handle the prompt command. Works the same way as the CLI.
2018-12-11 16:54:18 +01:00
Olivier Houchard
1909f6de18 BUG/MEDIUM: stream: Don't dereference s->txn when it is not there yet.
Test if s->txn is non-NULL before attempting to dereference it, it was lost
during the transition to HTX.
2018-12-06 15:02:25 +01:00
Christopher Faulet
b2aedea142 MEDIUM: channel/htx: Add functions for forward HTX data
To ease the fast forwarding and the infinte forwarding on HTX proxies, 2
functions have been added to let the channel be almost aware of the way data are
stored in its buffer. By calling these functions instead of legacy ones, we are
sure to forward the right amount of data.
2018-12-05 17:29:30 +01:00
Willy Tarreau
b54c40ac0b BUILD: threads: fix minor build warnings when threads are disabled
These potential null-deref warnings are emitted on gcc 7 and above
when threads are disabled due to the use of objt_server() after an
existing validity test. Let's switch to __objt_server() since we
know the pointer is valid, it will not confuse the compiler.

Some of these may be backported to 1.8.
2018-12-02 19:28:41 +01:00
Christopher Faulet
b3484d67d3 MINOR: stream: Rely on CS's info if it exists and fallback on session's ones
When the stream is created, If si_get_cs_info() returns valid info for the client
connection stream, we use it. Otherwise we use session' info.
2018-12-01 17:37:27 +01:00