Some buffer operations were incorrect: they assumed that offsets were
relative to the beginning of the request, while they are in fact relative
to the beginning of the buffer. In practice this is not yet an issue
since both are currently the same... until we add support for keep-alive.
It's not enough to know whether the connection will be in CLOSE or TUNNEL
mode; we also need to know whether we want to read a full message of known
length, or read it until the end just as in TUNNEL mode. Some updates to
the RFC clarify the corner cases slightly better, in particular the case
where a non-chunked encoding is used last.
Now we also take care of adding a proper "Connection: close" header to
messages whose size could not be determined.
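For illustration only (the names below are invented, not haproxy's), the
length-determination logic described above roughly follows RFC2616,
section 4.4:

    /* Sketch of the body-length decision: chunked wins, then an explicit
     * Content-Length, otherwise the size is unknown and the message must
     * be read until the connection closes (hence "Connection: close"). */
    enum body_mode { BODY_CHUNKED, BODY_LENGTH, BODY_CLOSE };

    static enum body_mode pick_body_mode(int has_te_chunked,
                                         int has_content_length,
                                         long long content_length)
    {
        if (has_te_chunked)
            return BODY_CHUNKED;       /* read chunk by chunk */
        if (has_content_length && content_length >= 0)
            return BODY_LENGTH;        /* read exactly that many bytes */
        return BODY_CLOSE;             /* read until close; peer must be told */
    }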
Chunked encoding is slightly more complex than what was implemented.
Specifically, it supports optional extensions which were not parsed until
now and would have caused an error to be returned when present.
Also, we now check chunk sizes for overly large values in order to ensure
we never overflow.
Last, we're now able to return a request error if we can't read the
chunk size because the buffer is already full.
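A minimal sketch of such a chunk-size parser, working on a flat string
rather than on the message buffer and using invented names, could look
like this:

    #include <ctype.h>
    #include <stdint.h>

    /* Parse a hex chunk size and skip optional extensions up to the CRLF.
     * Returns 0 on success, -1 on overflow, bad syntax or missing data. */
    static int parse_chunk_size(const char *p, const char *end, uint64_t *out)
    {
        const char *start = p;
        uint64_t size = 0;

        for (; p < end && isxdigit((unsigned char)*p); p++) {
            int d = isdigit((unsigned char)*p) ? *p - '0'
                                               : tolower((unsigned char)*p) - 'a' + 10;
            if (size > (UINT64_MAX - d) / 16)
                return -1;              /* chunk size would overflow */
            size = size * 16 + d;
        }
        if (p == start)
            return -1;                  /* no hex digit at all */
        while (p < end && *p != '\r')
            p++;                        /* skip optional chunk extensions */
        if (end - p < 2 || p[0] != '\r' || p[1] != '\n')
            return -1;                  /* need more data or invalid syntax */
        *out = size;
        return 0;
    }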
This state indicates that an HTTP message (request or response) is
complete. It will be used to know when we can re-initialize the
transaction for a new request. Right now we only switch to it after the
end of headers if there is no data. When other analysers are implemented,
they will be able to switch to this state too.
The condition for reusing a connection is that the response finishes
after the request. This will have to be checked when setting the
state.
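Just to illustrate the point (state names below are hypothetical), the
check boils down to something like:

    /* Both sides must have reached their final state before the
     * transaction may be re-initialized for the next request. */
    enum msg_state { MSG_HEADERS, MSG_BODY, MSG_DONE };

    static int txn_may_be_reused(enum msg_state req_state,
                                 enum msg_state rsp_state)
    {
        return req_state == MSG_DONE && rsp_state == MSG_DONE;
    }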
The check for 1xx responses was set too low and required a lot of tests
along the code in order to avoid some processing. We still left the test
after the response rewrite rules so that we can eliminate unwanted
headers if required.
This code really belongs to the http part since it's transaction-specific.
This will also make it easier to later reinitialize a transaction in order
to support keepalive.
We used to apply a limit to each buffer's size in order to leave
some room to rewrite headers, then we used to remove this limit
once the session switched to a data state.
Proceeding that way becomes a problem with keep-alive because we have to
know when to stop reading data into the buffer so that we can leave some
room again to process the next requests.
The principle we adopt here consists in relying only on to_forward+send_max.
Indeed, both of these values define how many bytes will leave the buffer.
So as long as their sum is larger than maxrewrite, we can safely fill the
buffers. If it is smaller, then we refrain from filling the buffer. This
means that we won't risk filling buffers when reading the last data chunk
followed by a POST request and its contents.
The only impact identified so far is that we must ensure that the
BF_FULL flag is correctly dropped when starting to forward. Right
now this is OK because nobody inflates to_forward without using
buffer_forward().
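A rough sketch of that fill rule, with illustrative names rather than the
real buffer fields (and assuming maxrewrite is always smaller than the
buffer size):

    /* Bytes already scheduled to leave the buffer (to_forward + send_max)
     * count against the rewrite reserve: once the reserve is covered, the
     * buffer may be filled entirely; otherwise keep that much room free. */
    static unsigned int buffer_max_fill(unsigned int buf_size,
                                        unsigned int maxrewrite,
                                        unsigned int to_forward,
                                        unsigned int send_max)
    {
        unsigned int leaving = to_forward + send_max;

        if (leaving >= maxrewrite)
            return buf_size;
        return buf_size - (maxrewrite - leaving);
    }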
Up to now, we only had a flag in the session indicating if it had to
work in "connection: close" mode. This is not at all compatible with
keep-alive.
Now we ensure that both sides of a connection act independently and
only relative to the transaction. The HTTP version of the request
and response is also correctly considered. The connection already
knows several modes:
- tunnel (CONNECT or no option in the config)
- keep-alive (when permitted by configuration)
- server-close (close the server side, not the client)
- close (close both sides)
This change carefully detects all situations to find whether a request
can be fully processed in its mode according to the configuration. Then
the response is also checked and tested to fix corner cases which can
happen with different HTTP versions on both sides (eg: a 1.0 client
asks for explicit keep-alive, and the server responds with 1.1 without
a header).
The mode is selected by a capability elimination algorithm which
automatically focuses on the least capable agent between the client,
the frontend, the backend and the server. This ensures we won't get
undesired situations where one of the 4 "agents" is not able to
process a transaction.
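A sketch of that elimination with purely illustrative values, tunnel mode
being handled separately:

    /* Modes ordered from most to least capable; the weakest of the four
     * "agents" (client, frontend, backend, server) dictates the result. */
    enum conn_mode { MODE_KEEPALIVE = 0, MODE_SERVER_CLOSE = 1, MODE_CLOSE = 2 };

    static enum conn_mode pick_conn_mode(enum conn_mode client, enum conn_mode frontend,
                                         enum conn_mode backend, enum conn_mode server)
    {
        enum conn_mode mode = client;

        if (frontend > mode) mode = frontend;
        if (backend > mode)  mode = backend;
        if (server > mode)   mode = server;
        return mode;    /* the least capable agent wins */
    }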
No "Connection: close" header will be added anymore to HTTP/1.0 requests
or responses since they're already in close mode.
The server-close mode is still not completely implemented. The response
needs to be rewritten as keep-alive before being sent to the client if
the connection was already in server-close (which implies the request
was in keep-alive) and if the response has a content-length or a
transfer-encoding (but only if the client supports 1.1).
A later improvement in server-close mode would probably be to detect
some situations where it's interesting to close the response (eg:
redirections with remote locations). But even then, the client might
close by itself.
It's also worth noting that in tunnel mode, no Connection header is
affected in either direction. A tunnelled connection should theoretically
be notified at the session level, but this is useless since by definition
there will not be any more requests on it. Thus, we don't need to add a
flag into the session right now.
The POST body analysis was split between two analysers for historical
reasons. Now we only have one analyser which checks content length
and waits for enough data to come.
Right now this analyser waits for <url_param_post_limit> bytes of
body to reach the buffer, or the first chunk. But this could be
improved to wait for any other amount of data or any specific
contents.
Implement decreasing health based on observing communication between
HAProxy and servers.
Changes in this version 2:
- documentation
- close race between a started check and health analysis event
- don't force fastinter if it is not set
- better names for options
- layer4 support
Changes in this version 3:
- add stats
- port to the current 1.4 tree
In order to support keepalive, we'll have to differentiate
normal sessions from tunnel sessions, which are the ones we
don't want to analyse further.
Those are typically the CONNECT requests where we don't care
about any form of content-length, as well as the requests
which are forwarded on non-close and non-keepalive proxies.
To sum up:
- len: it is now the maximum number of characters for the value,
preventing garbled results.
- a new option "prefix" is added; it allows the use of dynamic cookie
names (e.g. ASPSESSIONIDXXX).
Previously in the thread, I wanted to use the value found with
"capture cookie" but when I started to update the documentation, I
found this solution quite weird. I've made a small rework so as not to
depend on "capture cookie".
- it is now possible to define the URL parser mode (path parameters
or query string).
We now set msg->col and msg->sov to the first byte of non-header.
They will be used later when parsing chunks. A new macro was added
to perform size additions on an http_msg in order to limit the risk
of copy-paste errors in the long term.
During this operation, it appeared that the http_msg struct was not
optimal on 64-bit, so it was re-ordered to fill the holes.
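The new macro is not reproduced here; just to give the idea, it could
look like this (struct fields and macro name are illustrative only):

    /* Shift all offsets following an insertion point by <bytes> in one
     * place instead of repeating the additions at every call site. */
    struct http_msg_sketch {
        unsigned int col;     /* set to the first byte of non-header */
        unsigned int sov;     /* set to the first byte of non-header */
        unsigned int eoh;     /* end of headers */
    };

    #define HTTP_MSG_ADD(msg, bytes) do { \
            (msg)->col += (bytes);        \
            (msg)->sov += (bytes);        \
            (msg)->eoh += (bytes);        \
    } while (0)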
An HTTP message can be decomposed into several sub-states depending
on the transfer-encoding. We'll have to keep this state information
while parsing chunks, so we must extend the values. In order not to
change everything, we'll now consider that anything >= MSG_BODY is
the body, and that the value indicates the precise state. The
MSG_ERROR status which was greater than MSG_BODY was moved for this.
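For illustration, the resulting layout looks roughly like this (names and
exact order are only indicative):

    enum http_msg_state_sketch {
        MSG_RQBEFORE,     /* before the request line */
        MSG_HDR,          /* parsing headers */
        MSG_ERROR,        /* kept below MSG_BODY so the ">= MSG_BODY" test holds */
        MSG_BODY,         /* anything from here on is body */
        MSG_CHUNK_SIZE,   /* chunked: reading a chunk size line */
        MSG_CHUNK_DATA,   /* chunked: reading chunk payload */
        MSG_TRAILERS,     /* chunked: reading trailers */
        MSG_DONE,         /* message complete */
    };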
This patch extends and corrects the functionality introduced by
"Collect & provide http response codes received from servers":
- responses are now also accounted for frontends
- backend's and frontend's counters are incremented based
on responses sent to the client, not those received from servers
We also check the close status and terminate the server persistent
connection if appropriate. Note that since this change, we'll not
get any "Connection: close" headers added to HTTP/1.0 responses
anymore, which is good.
The code part which waits for an HTTP response has been extracted
from the old function. We now have two analysers and the second one
may re-enable the first one when a 1xx response is encountered.
This has been tested and works.
The calls to stream_int_return() that were remaining in the wait
analyser have been converted to stream_int_retnclose().
Store those elements in the transaction. RFC2616 is strictly followed.
Note that requests containing two different content-length fields are
discarded as invalid.
This patch has 2 goals :
1. I wanted to test the appsession feature with a small PHP code,
using PHPSESSID. The problem is that when PHP gets an unknown session
id, it creates a new one with this ID. So, when sending an unknown
session to PHP, persistence is broken: haproxy won't see any new
cookie in the response and will never attach this session to a
specific server.
This also happens when you restart haproxy: the internal hash becomes
empty and all sessions lose their persistence (the requests are load
balanced across all backend servers, creating a new session on each one).
For a user, it's like the service is unusable.
The patch modifies the code to make haproxy also learn the persistence
from the client: if no session is sent by the server, then the
session id found in the client part (using the URI or the client cookie)
is used to associate the server that gave the response.
As it's probably not a feature usable in all cases, I added an option
to enable it (by default it's disabled). The syntax of appsession becomes:
appsession <cookie> len <length> timeout <holdtime> [request-learn]
This helps haproxy repair the persistence (with the risk of losing its
session at the next request, as the user will probably not be load
balanced to the same server the first time).
2. This patch also tries to reduce the memory usage.
Here is a little example to explain the current behaviour:
- Take a Tomcat server where /session.jsp is valid.
- Send a request using a cookie with an unknown value AND a path
parameter with another unknown value:
curl -b "JSESSIONID=12345678901234567890123456789012" http://<haproxy>/session.jsp;jsessionid=00000000000000000000000000000001
(I know, it's unexpected to have a request like that on a live service)
Here, haproxy finds the URI session ID and stores it in its internal
hash (with no server associated). But it also finds the cookie session
ID and stores it again.
- As a result, session.jsp sends a new session ID also stored in the
internal hash, with a server associated.
=> For 1 request, haproxy has stored 3 entries, only 1 of which will be usable
The patch modifies the behaviour to store only 1 entry (maximum).
When processing a GET or HEAD request in close mode, we know we don't
need to read anything anymore on the socket, so we can disable it.
Doing this can save up to 40% of the recv calls, and half of the
epoll_ctl calls.
For this we need a buffer flag indicating that we're not interested in
reading anymore. Right now, this flag also disables both polled reads.
We might benefit from disabling only speculative reads, but we will need
at least this flag when we want to support keepalive anyway.
Currently we don't disable the flag on completion, but it does not
matter as we close ASAP when performing the shutw().
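To make the intent concrete, here is a sketch with invented flag and
structure names (not the real buffer API):

    /* Once a GET or HEAD request has been fully parsed in close mode,
     * nothing more will be read from the client, so reads (including
     * polled ones) can be turned off. */
    #define BUF_DONT_READ  0x0100

    struct buf_sketch { unsigned int flags; };

    static void maybe_stop_reading(struct buf_sketch *req, int meth_is_get_or_head,
                                   int conn_mode_is_close)
    {
        if (meth_is_get_or_head && conn_mode_is_close)
            req->flags |= BUF_DONT_READ;   /* saves recv() and epoll_ctl() calls */
    }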
Till now we would only set SN_CONN_CLOSED after rewriting it. Now we
set it just after checking the Connection header so that we can use
the result later if required.
Recent "struct chunk rework" introduced a NULL pointer dereference
and now haproxy segfaults if auth is required for stats but not found.
The reason is that size_t cannot store negative values, but current
code assumes that "len < 0" == uninitialized.
This patch fixes it.
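A reduced illustration of the pitfall (not the actual stats code):

    #include <stddef.h>

    size_t len = (size_t)-1;   /* meant as "not initialized" */

    int looks_uninitialized(void)
    {
        return len < 0;        /* always false: size_t is unsigned */
    }

Comparing against an explicit sentinel value, or keeping the length
signed, avoids the trap.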
This patch allows to collect & provide separate statistics for each socket.
It can be very useful if you would like to distinguish between traffic
generated by local and remote users, or between different types of remote
clients (peerings, domestic, foreign).
Currently no "Session rate" is supported, but adding it should be possible
if we find it useful.
Doing this, we can remove the last BF_HIJACK user and remove
produce_content(). s->data_source could also be removed but
it is currently used to detect if the stats or a server was
used.
Due to a misplaced call to stream_int_retnclose(), the stats output
buffer was erased before each call to produce_content(), resulting
in missing pieces in the stats output if the connection was not
fast enough between haproxy and the client.
We will need to modify the stats dump functions so that they can
be used in interactive mode. For this, we want their caller to
prepare the connection for a close, not themselves to do it.
Let's simply move the stream_int_retnclose() out.
The BF_WRITE_ENA buffer flag became very complex to deal with, because
it was used to:
- enable automatic connection
- enable close forwarding
- enable data forwarding
The last point was not very true anymore since we introduced ->send_max,
but still the test remained everywhere. This was causing issues such as
the impossibility to connect without forwarding data, the impossibility to
prevent closing when data was forwarded, etc.
This patch clarifies the situation by getting rid of this multi-purpose
flag and replacing it with:
- data forwarding based only on ->send_max || ->pipe;
- a new BF_AUTO_CONNECT flag to allow automatic connection and only
that;
- ability to perform an automatic connection when ->send_max or ->pipe
indicate that data is waiting to leave the buffer;
- a new BF_AUTO_CLOSE flag to let the producer automatically set the
BF_SHUTW_NOW flag when it gets a BF_SHUTR.
During this cleanup, it was discovered that some tests were performed
twice, or that the BF_HIJACK flag was still tested, which is not needed
anymore since ->send_max replaced it. These places have been fixed too.
These cleanups have also revealed a few areas where the other flags
such as BF_EMPTY are not cleanly used. This will be an opportunity for
a second patch.
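For reference, the resulting behaviour can be summed up in this simplified
sketch (flag names and structure abridged, not the real buffer API):

    #define BUF_AUTO_CONNECT  0x01   /* may trigger an automatic connection */
    #define BUF_AUTO_CLOSE    0x02   /* propagate a read shutdown to the write side */
    #define BUF_SHUTR         0x04
    #define BUF_SHUTW_NOW     0x08

    struct buf_sk {
        unsigned int flags;
        unsigned int send_max;   /* bytes ready to leave the buffer */
        void *pipe;              /* spliced data pending, if any */
    };

    static int may_forward(const struct buf_sk *b)
    {
        return b->send_max || b->pipe;      /* forwarding depends only on this */
    }

    static int may_connect(const struct buf_sk *b)
    {
        return (b->flags & BUF_AUTO_CONNECT) || may_forward(b);
    }

    static void propagate_close(struct buf_sk *b)
    {
        if ((b->flags & (BUF_AUTO_CLOSE | BUF_SHUTR)) == (BUF_AUTO_CLOSE | BUF_SHUTR))
            b->flags |= BUF_SHUTW_NOW;
    }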
HTTP supports status codes 100 and 101 to report protocol indications,
which are followed by the request's response. Till now, haproxy would
only see those responses without parsing subsequent ones. That means
that cookie additions were only performed on 1xx messages for instance,
which does not work since headers must be ignored with 1xx messages.
Also, logs were not terribly useful with the common 100 status code
in response to "Expect: 100-continue" during some POST requests.
This change adds support for such messages. Now haproxy sees them,
forwards them and skips them until it finds a correct response, which
it logs and processes. As an exception, header removal/rewriting still
works on 1xx responses in order to be able to strip out sensitive
information that may accidentally have been left by another piece of
equipment (possibly an older haproxy itself). Header additions, however,
remain disabled.
This change brings the ability to loop on responses without data, which
is a starting point to support keepalive. The change is marked as major
as a few fixes had to be performed in the HTTP message parser.
Tarpit was broken by recent splitting of analysers. It would still
let the connection go to the server due to a missing buffer_write_dis().
Also, it was performed too late (after content switching rules).
The first step towards dynamic buffer size consists in removing
all static definitions of the buffer size. Instead, we store a
buffer's size in itself. Right now they're all preinitialized
to BUFSIZE, but we will change that.
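A minimal sketch of what this looks like, with illustrative field names:

    #include <stddef.h>

    struct buffer_sketch {
        size_t size;    /* the buffer's own size, formerly the BUFSIZE constant */
        size_t used;    /* bytes currently stored */
        char   data[];  /* storage, allocated to <size> bytes */
    };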
s->srv_error was set depending on the frontend's protocol. Now it is
set by the HTTP analyser, so that even when switching from a TCP
frontend to an HTTP backend, we can have HTTP error messages.
This Linux-specific option was never really used in production and
has since been superseded by new splicing options brought by recent
Linux kernels.
It caused several particular cases in the code because the kernel
would take care of the session without haproxy being able to do
anything on it, which became hard to handle in the new architecture.
Let's simply get rid of it now that there is a replacement available.
Since we can now switch from TCP to HTTP, we need to be able to apply
the HTTP request timeout after switching. That means we need to take
it from the backend and not from the frontend. Since the backend points
to the frontend before switching, that changes nothing for the normal
case.
This patch allows a TCP frontend to switch to an HTTP backend.
During the switch, missing structures are automatically allocated.
The HTTP parser is enabled so that the backend first waits for a
full HTTP request.
Now that we can perform TCP-based content switching, it makes sense
to be able to detect HTTP traffic and act accordingly. We already
have an HTTP decoder; we just have to call it in order to detect the HTTP
protocol. Note that since the decoder will automatically fill in the
interesting fields of the HTTP transaction, it would make sense to
use this parsing to extend HTTP matching to TCP.
The HTTP processing has been split into 7 steps, one of which
is no longer HTTP-specific (content-switching). That way, it
becomes possible to use "use_backend" rules in TCP mode. A new
"use_server" directive should follow soon.