Willy Tarreau 4173f4ea29 BUG/MINOR: conn_stream: do not confirm a connection from the frontend path
In issue #1468 it was reported that sometimes server-side connection
attempts were only validated after the "timeout connect" value, and
that would only happen with an H2 client. A long code analysis with the
output dumps showed only one possible call path: an I/O event on the
frontend while reading had just been disabled calls h2_wake() which in
turns wakes cs_conn_io_cb(), which tries cs_conn_process() and cs_notify(),
which sees that the other side is not blocked (already in CS_ST_CON)
and tries cs_chk_snd() on it. But on that side the connection had just
finished to be set up and not yet woken the stream up, cs_notify()
would then call cs_conn_send() which succeeds and passes the connection
to CS_ST_RDY. The problem is that nothing new happened on the frontend
side so there's no reason to wake the stream up and the backend-side
conn_stream remains in CS_ST_RDY state with the stream never being
woken up.

Once the "timeout connect" strikes, process_stream() is woken up and
finds the connection finally setup, so it ignores the timeout and goes
on.

The number of conditions to meet to reproduce this is huge, which also
explains why the reporter says it's "occasional" and we were never able
to reproduce it in the lab. It needs at least reads to be disabled and
immediately re-enabled on the frontend side (e.g. buffer full) with an
I/O even reported before the poller had an opportunity to be disabled
but with no subscribe being reinstalled, so that sock_conn_iocb() has
no other choice but calling h2_wake(), and exactly at the same time
the backend connection must finish to set up so that it was not yet
reported by the poller, the data were sent and the polling for writes
disabled.

Several factors are to be considered here:
  - h2_wake() should probably not call h2_wake_some_streams() for
    ret >= 0 (common case), but only if some special event is reported
    for at least one stream; that part is sensitive though as in the
    past we managed to lose some rare cases (e.g. restart processing
    after a pause), and such wakeups are extremely rare so we'd better
    make that effort once in a while.

  - letting a lazy forward attempt on the frontend confirm a backend
    connection establishment is too smart to be reliable. That wasn't
    in fact the intent and it's inherited from the very old code where
    muxes didn't exist and where it was guaranteed that an even at this
    layer would wake everyone up.

Here the best thing to do is to refrain from attempting to forward data
until the connection is confirmed. This will let the poller report the
connect() event to the backend side which will process it as it should
and does in all other cases.

Thanks to Jimmy Crutchfield for having reported useful traces and
tested patches.

This will have to be backported to all stable branches after some
observation. Before 2.6 the function is stream_int_chk_snd_conn(),
and the flag to remove is SI_SB_CON.
2022-04-29 15:32:14 +02:00
2022-04-13 15:10:16 +02:00
2022-04-12 07:59:06 +02:00
2021-09-16 09:14:14 +02:00
2022-04-23 04:38:36 +02:00
2019-06-15 21:59:54 +02:00
2022-04-23 04:38:36 +02:00
2022-04-23 04:38:36 +02:00

The HAProxy documentation has been split into a number of different files for
ease of use.

Please refer to the following files depending on what you're looking for :

  - INSTALL for instructions on how to build and install HAProxy
  - BRANCHES to understand the project's life cycle and what version to use
  - LICENSE for the project's license
  - CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located into the doc/ directory :

  - doc/intro.txt for a quick introduction on HAProxy
  - doc/configuration.txt for the configuration's reference manual
  - doc/lua.txt for the Lua's reference manual
  - doc/SPOE.txt for how to use the SPOE engine
  - doc/network-namespaces.txt for how to use network namespaces under Linux
  - doc/management.txt for the management guide
  - doc/regression-testing.txt for how to use the regression testing suite
  - doc/peers.txt for the peers protocol reference
  - doc/coding-style.txt for how to adopt HAProxy's coding style
  - doc/internals for developer-specific documentation (not all up to date)
Description
No description provided
Readme 51 MiB
Languages
C 98.1%
Shell 0.8%
Makefile 0.5%
Lua 0.2%
Python 0.2%