Mirror of https://git.haproxy.org/git/haproxy.git/ (synced 2025-08-07 15:47:01 +02:00)
commit fafa34e5f5
parent 18f2ccd244

    DOC: config: update the reminder on the HTTP model and add some terminology

    It was really necessary to try to clear the confusion between sessions and
    streams, so let's first lift the HTTP model part a little bit to better
    cover new protocols, and explain what a stream is and how it differs from
    the earlier sessions.
@@ -29,12 +29,13 @@ Summary
 
 1. Quick reminder about HTTP
 1.1. The HTTP transaction model
-1.2. HTTP request
-1.2.1. The request line
-1.2.2. The request headers
-1.3. HTTP response
-1.3.1. The response line
-1.3.2. The response headers
+1.2. Terminology
+1.3. HTTP request
+1.3.1. The request line
+1.3.2. The request headers
+1.4. HTTP response
+1.4.1. The response line
+1.4.2. The response headers
 
 2. Configuring HAProxy
 2.1. Configuration file format
@@ -138,6 +139,7 @@ Summary
 11.2. Socket type prefixes
 11.3. Protocol prefixes
 
+
 1. Quick reminder about HTTP
 ----------------------------
 
@@ -149,35 +151,65 @@ However, it is important to understand how HTTP requests and responses are
 formed, and how HAProxy decomposes them. It will then become easier to write
 correct rules and to debug existing configurations.
 
+First, HTTP is standardized by a series of RFCs that HAProxy follows as closely
+as possible:
+  - RFC 9110: HTTP Semantics (explains the meaning of protocol elements)
+  - RFC 9111: HTTP Caching (explains the rules to follow for an HTTP cache)
+  - RFC 9112: HTTP/1.1 (representation, interoperability rules, security)
+  - RFC 9113: HTTP/2 (representation, interoperability rules, security)
+  - RFC 9114: HTTP/3 (representation, interoperability rules, security)
+
+In addition to these, RFC 8999 to 9002 specify the QUIC transport layer used by
+the HTTP/3 protocol.
+
 
 1.1. The HTTP transaction model
 -------------------------------
 
 The HTTP protocol is transaction-driven. This means that each request will lead
-to one and only one response. Traditionally, a TCP connection is established
-from the client to the server, a request is sent by the client through the
-connection, the server responds, and the connection is closed. A new request
-will involve a new connection :
+to one and only one response. Originally, with version 1.0 of the protocol,
+there was a single request per connection: a TCP connection is established from
+the client to the server, a request is sent by the client over the connection,
+the server responds, and the connection is closed. A new request then involves
+a new connection :
 
     [CON1] [REQ1] ... [RESP1] [CLO1] [CON2] [REQ2] ... [RESP2] [CLO2] ...
 
-In this mode, called the "HTTP close" mode, there are as many connection
+In this mode, often called the "HTTP close" mode, there are as many connection
 establishments as there are HTTP transactions. Since the connection is closed
 by the server after the response, the client does not need to know the content
-length.
+length; it considers that the response is complete when the connection closes.
+This also means that if some responses are truncated due to network errors, the
+client could mistakenly think a response was complete, and this used to cause
+truncated images to be rendered on screen sometimes.
 
 Due to the transactional nature of the protocol, it was possible to improve it
 to avoid closing a connection between two subsequent transactions. In this mode
 however, it is mandatory that the server indicates the content length for each
 response so that the client does not wait indefinitely. For this, a special
-header is used: "Content-length". This mode is called the "keep-alive" mode :
+header is used: "Content-length". This mode is called the "keep-alive" mode,
+which arrived with HTTP/1.1 (some HTTP/1.0 agents support it); connections
+that are reused between requests are called "persistent connections":
 
     [CON] [REQ1] ... [RESP1] [REQ2] ... [RESP2] [CLO] ...
 
-Its advantages are a reduced latency between transactions, and less processing
-power required on the server side. It is generally better than the close mode,
-but not always because the clients often limit their concurrent connections to
-a smaller value.
+Its advantages are a reduced latency between transactions, less processing
+power required on the server side, and the ability to detect a truncated
+response. It is generally faster than the close mode, but not always because
+some clients often limit their concurrent connections to a smaller value, and
+this compensates less for poor network connectivity. Also, some servers have to
+keep the connection alive for a long time waiting for a possible new request,
+and may experience a high memory usage due to the high number of connections;
+closing too fast may also break some requests that arrived at the moment the
+connection was closed.
+
+In this mode, the response size needs to be known upfront, which is not always
+possible with dynamically generated or compressed contents. For this reason
+another mode was implemented, the "chunked mode": instead of announcing the
+size of the whole response at once, the sender only advertises the size of the
+next "chunk" of response it already has in a buffer, and can terminate at any
+moment with a zero-sized chunk. In this mode, the Content-Length header is not
+used.
 
 Another improvement in the communications is the pipelining mode. It still uses
 keep-alive, but the client does not wait for the first response to send the
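Aside from the patch: to make the framing modes above more concrete, here is an
editor's sketch of two possible responses carrying the same 17-byte body (the
header values are invented and CR/LF line endings are implied). With keep-alive
framing, the length is announced up front:

    HTTP/1.1 200 OK
    Content-Type: image/png
    Content-Length: 17

    <17 bytes of body>

With the chunked mode, each chunk is prefixed by its size in hexadecimal and a
zero-sized chunk marks the end, so no Content-Length header is needed:

    HTTP/1.1 200 OK
    Content-Type: image/png
    Transfer-Encoding: chunked

    a
    <10 bytes of body>
    7
    <7 more bytes of body>
    0

In both cases the recipient knows exactly where the message ends without having
to wait for the connection to close, which is what makes persistent connections
usable.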
@@ -190,19 +222,43 @@ This can obviously have a tremendous benefit on performance because the network
 latency is eliminated between subsequent requests. Many HTTP agents do not
 correctly support pipelining since there is no way to associate a response with
 the corresponding request in HTTP. For this reason, it is mandatory for the
-server to reply in the exact same order as the requests were received.
+server to reply in the exact same order as the requests were received. In
+practice, after several attempts by various clients to deploy it, it has been
+totally abandoned for its lack of reliability on certain servers. But it is
+mandatory for servers to support it.
 
-The next improvement is the multiplexed mode, as implemented in HTTP/2 and HTTP/3.
-This time, each transaction is assigned a single stream identifier, and all
-streams are multiplexed over an existing connection. Many requests can be sent in
-parallel by the client, and responses can arrive in any order since they also
-carry the stream identifier.
+The next improvement is the multiplexed mode, as implemented in HTTP/2 and
+HTTP/3. In this mode, multiple transactions (i.e. request-response pairs) are
+transmitted in parallel over a single connection, and they all progress at
+their own speed, independent from each other. With multiplexed protocols, a new
+notion of "stream" was introduced, to represent these parallel communications
+happening over the same connection. Each stream is generally assigned a unique
+identifier for a given connection, which is used by both endpoints to know
+where to deliver the data. It is fairly common for clients to start many (up to
+100, sometimes more) streams in parallel over the same connection, and let the
+server sort them out and respond in any order depending on what response is
+available. The main benefit of the multiplexed mode is that it significantly
+reduces the number of round trips, and speeds up page loading time over high
+latency networks. It is sometimes visible on sites using many images, where all
+images appear to load in parallel.
 
+These protocols have also improved their efficiency by adopting some mechanisms
+to compress header fields in order to reduce the number of bytes on the wire,
+so that without the appropriate tools, they are not realistically manipulable
+by hand nor readable to the naked eye like HTTP/1 was. For this reason, various
+examples of HTTP messages continue to be represented in literature (including
+this document) using the HTTP/1 syntax even for newer versions of the protocol.
+
+HTTP/2 suffers from some design limitations, such as packet losses affecting
+all streams at once, and if a client takes too much time to retrieve an object
+(e.g. needs to store it on disk), it may slow down its retrieval and make it
+impossible during this time to access the data that is pending behind it. This
+is called "head of line blocking" or "HoL blocking" or sometimes just "HoL".
 
 HTTP/3 is implemented over QUIC, itself implemented over UDP. QUIC solves the
-head of line blocking at transport level by means of independently treated
+head of line blocking at the transport level by means of independently handled
 streams. Indeed, when experiencing loss, an impacted stream does not affect the
-other streams.
+other streams, and all of them can be accessed in parallel.
 
 By default HAProxy operates in keep-alive mode with regards to persistent
 connections: for each connection it processes each request and response, and
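Aside from the patch: using the same bracket notation as the earlier examples,
a multiplexed connection could be pictured as below (editor's sketch, stream
numbers are arbitrary):

    [CON] [REQ1] [REQ2] [REQ3] ... [RESP2] [RESP1] [RESP3] ... [CLO]

All requests share the single connection, each on its own stream, and responses
may come back in any order; the stream identifier carried with each frame is
what lets the client match RESP2 with REQ2 despite the reordering.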
@@ -211,16 +267,91 @@ start of a new request. When it receives HTTP/2 connections from a client, it
 processes all the requests in parallel and leaves the connection idling,
 waiting for new requests, just as if it was a keep-alive HTTP connection.
 
-HAProxy supports 4 connection modes :
-  - keep alive : all requests and responses are processed (default)
-  - tunnel : only the first request and response are processed,
-             everything else is forwarded with no analysis (deprecated).
+HAProxy essentially supports 3 connection modes :
+  - keep alive : all requests and responses are processed, and the client
+                 facing and server facing connections are kept alive for new
+                 requests. This is the default and suits the modern web and
+                 modern protocols (HTTP/2 and HTTP/3).
+
   - server close : the server-facing connection is closed after the response.
-  - close : the connection is actively closed after end of response.
+
+  - close : the connection is actively closed after end of response on
+            both sides.
 
+In addition to this, by default, the server-facing connection is reusable by
+any request from any client, as mandated by the HTTP protocol specification, so
+any information pertaining to a specific client has to be passed along with
+each request if needed (e.g. the client's source address etc). When HTTP/2 is
+used with a server, by default HAProxy will dedicate this connection to the
+same client to avoid the risk of head of line blocking between clients.
+
 
-1.2. HTTP request
+1.2. Terminology
+----------------
+
+Inside HAProxy, the terminology has evolved a bit over the ages to follow the
+evolution of the HTTP protocol and its usages. While originally there was no
+significant difference between a connection, a session, a stream or a
+transaction, these notions were clarified over time to match closely what
+exists in the modern versions of the HTTP protocol, though some terms remain
+visible in the configuration or the command line interface for the purpose of
+historical compatibility.
+
+Here are some definitions that apply to the current version of HAProxy:
+
+  - connection: a connection is a single, bidirectional communication channel
+    between a remote agent (client or server) and haproxy, at the lowest level
+    possible. Usually it corresponds to a TCP socket established between a pair
+    of IP and ports. On the client-facing side, connections are the very first
+    entities that are instantiated when a client connects to haproxy, and rules
+    applying at the connection level are the earliest ones that apply.
+
+  - session: a session adds some context information associated with a
+    connection. This includes any information specific to the transport layer
+    (e.g. TLS keys etc), or variables. This term has long been used inside
+    HAProxy to denote end-to-end HTTP/1.0 communications between two ends, and
+    as such it remains visible in the name of certain CLI commands or
+    statistics, despite representing streams nowadays, but the help messages
+    and descriptions try to make this unambiguous. It is still valid when it
+    comes to network-level terminology (e.g. TCP sessions inside the operating
+    system, or TCP sessions across a firewall), or for non-HTTP user-level
+    applications (e.g. a telnet session or an SSH session). It must not be
+    confused with "application sessions" that are used to store a full user
+    context in a cookie and require to be sent to the same server.
+
+  - stream: a stream exactly corresponds to an end-to-end bidirectional
+    communication at the application level, where analysis and transformations
+    may be applied. In HTTP, it contains a single request and its associated
+    response, and is instantiated by the arrival of the request and is finished
+    with the end of delivery of the response. In this context there is a 1:1
+    relation between such a stream and the stream of a multiplexed protocol. In
+    TCP communications there is a single stream per connection.
+
+  - transaction: a transaction is only a pair of a request and the associated
+    response. The term was used in conjunction with sessions before streams
+    existed, but nowadays there is a 1:1 relation between a transaction and a
+    stream. It is essentially visible in the variables' scope "txn", which is
+    valid during the whole transaction, hence the stream.
+
+  - request: it designates the traffic flowing from the client to the server.
+    It is mainly used for HTTP to indicate where operations are performed. This
+    term also exists for TCP operations to indicate where data are processed.
+    Requests often appear in counters as a unit of traffic or activity. They do
+    not always imply a response (e.g. due to errors), but since there are no
+    spontaneous responses without requests, requests remain a relevant metric
+    of the overall activity. In TCP there are as many requests as connections.
+
+  - response: this designates the traffic flowing from the server to the
+    client, or sometimes from HAProxy to the client, when HAProxy produces the
+    response itself (e.g. an HTTP redirect).
+
+  - service: this generally indicates some internal processing in HAProxy that
+    does not require a server, such as the stats page, the cache, or some Lua
+    code to implement a small application. A service usually reads a request,
+    performs some operations and produces a response.
+
+
+1.3. HTTP request
 -----------------
 
 First, let's consider this HTTP request :
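Aside from the patch: the connection modes described in the hunk above map onto
configuration keywords documented later in this file ("option http-keep-alive",
"option http-server-close", "option httpclose", and "http-reuse" for server
connection sharing). The sketch below is editor-provided and the proxy names
and addresses are invented for illustration:

    defaults
        mode http
        option http-keep-alive      # default: keep both sides alive
        # option http-server-close  # close the server-facing side only
        # option httpclose          # actively close both sides
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    frontend fe_web
        bind :80
        default_backend be_app

    backend be_app
        http-reuse safe             # idle server connections may be shared
        server app1 192.0.2.10:8080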
@@ -234,7 +365,7 @@ First, let's consider this HTTP request :
     5     Accept: image/png
 
 
-1.2.1. The Request line
+1.3.1. The Request line
 -----------------------
 
 Line 1 is the "request line". It is always composed of 3 fields :
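Aside from the patch: taking a request line such as the one used in the sample
request of section 1.3, the three fields split as follows (editor's
annotation):

    GET /serv/login.php?lang=en&profile=2 HTTP/1.1
    ^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^
    METHOD             URI                VERSION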
@@ -288,7 +419,7 @@ HTTP/2 doesn't convey a version information with the request, so the version is
 assumed to be the same as the one of the underlying protocol (i.e. "HTTP/2").
 
 
-1.2.2. The request headers
+1.3.2. The request headers
 --------------------------
 
 The headers start at the second line. They are composed of a name at the
@@ -297,7 +428,7 @@ an LWS is added after the colon but that's not required. Then come the values.
 Multiple identical headers may be folded into one single line, delimiting the
 values with commas, provided that their order is respected. This is commonly
 encountered in the "Cookie:" field. A header may span over multiple lines if
-the subsequent lines begin with an LWS. In the example in 1.2, lines 4 and 5
+the subsequent lines begin with an LWS. In the example in 1.3, lines 4 and 5
 define a total of 3 values for the "Accept:" header.
 
 Contrary to a common misconception, header names are not case-sensitive, and
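Aside from the patch (editor's illustration): assuming lines 4 and 5 of the
sample request carry two "Accept:" headers such as:

    4     Accept: image/jpeg, image/gif
    5     Accept: image/png

then, per the folding rule above, they may be combined into a single header
that preserves the order of the values, and both forms are processed
identically:

    Accept: image/jpeg, image/gif, image/png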
@@ -324,7 +455,7 @@ Important note:
 correctly and not to be fooled by such complex constructs.
 
 
-1.3. HTTP response
+1.4. HTTP response
 ------------------
 
 An HTTP response looks very much like an HTTP request. Both are called HTTP
@@ -352,7 +483,7 @@ if a CONNECT had occurred. Then the Upgrade header would contain additional
 information about the type of protocol the connection is switching to.
 
 
-1.3.1. The response line
+1.4.1. The response line
 ------------------------
 
 Line 1 is the "response line". It is always composed of 3 fields :
@@ -405,11 +536,11 @@ The error 4xx and 5xx codes above may be customized (see "errorloc" in section
 4.2).
 
 
-1.3.2. The response headers
+1.4.2. The response headers
 ---------------------------
 
 Response headers work exactly like request headers, and as such, HAProxy uses
-the same parsing function for both. Please refer to paragraph 1.2.2 for more
+the same parsing function for both. Please refer to paragraph 1.3.2 for more
 details.
 
 