These 4 combinations are needlessly complicated since the session already
has direct access to the associated stream interfaces without having to
check an indirect pointer.
The purpose of these two macros will be to pass via the session to
find the relevant stream interfaces so that we don't need to store
the ->cons nor ->prod pointers anymore. Currently they're only defined
so that all references could be removed.
Note that many places need a second pass of clean up so that we don't
have any chn_prod(&s->req) anymore and only &s->si[0] instead, and
conversely for the 3 other cases.
This new flag "SI_FL_ISBACK" is set only on the back SI and is cleared
on the front SI. That way it's possible only by looking at the SI to
know what side it is.
We'll soon remove direct references to the channels from the stream
interface since everything belongs to the same session, so let's
first not dereference si->ib / si->ob anymore and use macros instead.
The channels were pointers to outside structs and this is not needed
anymore since the buffers have moved, but this complicates operations.
Move them back into the session so that both channels and stream interfaces
are always allocated for a session. Some places (some early sample fetch
functions) used to validate that a channel was NULL prior to dereferencing
it. Now instead we check if chn->buf is NULL and we force it to remain NULL
until the channel is initialized.
Commit bc4c1ac ("MEDIUM: http/tcp: permit to resume http and tcp custom actions")
unfortunately broke the stats applet by moving the clearing of the analyser bit
after processing the applet headers. It used to work only in HTTP/1.1 and not
in HTTP/1.0. This is 1.6-specific, no backport is needed.
Later, the processing of some actions needs to be interrupted and resumed
later. This patch permit to resume the actions. The actions that needs
to run with the resume mode are not yet avalaible. It will be soon with
Lua patches. So the code added by this patch is untestable for the moment.
The list of "tcp_exec_req_rules" cannot resme because is called by the
unresumable function "accept_session".
Actually, this function returns a pointer on the rule that stop
the evaluation of the rule list. Later we integrate the possibility
of interrupt and resue the processsing of some actions. The current
response mode is not sufficient to returns the "interrupt" information.
The pointer returned is never used, so I change the return type of
this function by an enum. With this enum, the function is ready to
return the "interupt" value.
The functions "val_payload_lv" and "val_hdr" are useful with
lua. The lua automatic binding for sample fetchs needs to
compare check functions.
The "arg_type_names" permit to display error messages.
Some usages of the converters need to know the attached session. The Lua
needs the session for retrieving his running context. This patch adds
the "session" as an argument of the converters prototype.
These new sample fetches retrieve the list of header names as they appear
in the request or response. This can be used for debugging, for statistics
as well as an aid to better detect the presence of proxies or plugins on
some browsers, which alter the request compared to a regular browser by
adding or reordering headers.
Commit c600204 ("BUG/MEDIUM: regex: fix risk of buffer overrun in
exp_replace()") added a control of failure on the response headers,
but forgot to check for the error during request processing. So if
the filters fail to apply, we could keep the request. It might
cause some headers to silently fail to be added for example. Note
that it's tagged MINOR because a standard configuration cannot make
this case happen.
The fix should be backported to 1.5 and 1.4 though.
The two http-req/http-resp actions "replace-hdr" and "replace-value" were
expecting exactly one space after the colon, which is wrong. It was causing
the first char not to be seen/modified when no space was present, and empty
headers not to be modified either. Instead of using name->len+2, we must use
ctx->val which points to the first character of the value even if there is
no value.
This fix must be backported into 1.5.
This commit implements the following new actions :
- "set-method" rewrites the request method with the result of the
evaluation of format string <fmt>. There should be very few valid reasons
for having to do so as this is more likely to break something than to fix
it.
- "set-path" rewrites the request path with the result of the evaluation of
format string <fmt>. The query string, if any, is left intact. If a
scheme and authority is found before the path, they are left intact as
well. If the request doesn't have a path ("*"), this one is replaced with
the format. This can be used to prepend a directory component in front of
a path for example. See also "set-query" and "set-uri".
Example :
# prepend the host name before the path
http-request set-path /%[hdr(host)]%[path]
- "set-query" rewrites the request's query string which appears after the
first question mark ("?") with the result of the evaluation of format
string <fmt>. The part prior to the question mark is left intact. If the
request doesn't contain a question mark and the new value is not empty,
then one is added at the end of the URI, followed by the new value. If
a question mark was present, it will never be removed even if the value
is empty. This can be used to add or remove parameters from the query
string. See also "set-query" and "set-uri".
Example :
# replace "%3D" with "=" in the query string
http-request set-query %[query,regsub(%3D,=,g)]
- "set-uri" rewrites the request URI with the result of the evaluation of
format string <fmt>. The scheme, authority, path and query string are all
replaced at once. This can be used to rewrite hosts in front of proxies,
or to perform complex modifications to the URI such as moving parts
between the path and the query string. See also "set-path" and
"set-query".
All of them are handled by the same parser and the same exec function,
which is why they're merged all together. For once, instead of adding
even more entries to the huge switch/case, we used the new facility to
register action keywords. A number of the existing ones should probably
move there as well.
This function (and its sister regex_exec_match2()) abstract the regex
execution but make it impossible to pass flags to the regex engine.
Currently we don't use them but we'll need to support REG_NOTBOL soon
(to indicate that we're not at the beginning of a line). So let's add
support for this flag and update the API accordingly.
The way http-request/response set-header works is stupid. For a naive
reuse of the del-header code, it removes all occurrences of the header
to be set before computing the new format string. This makes it almost
unusable because it is not possible to append values to an existing
header without first copying them to a dummy header, performing the
copy back and removing the dummy header.
Instead, let's share the same code as add-header and perform the optional
removal after the string is computed. That way it becomes possible to
write things like :
http-request set-header X-Forwarded-For %[hdr(X-Forwarded-For)],%[src]
Note that this change is not expected to have any undesirable impact on
existing configs since if they rely on the bogus behaviour, they don't
work as they always retrieve an empty string.
This fix must be backported to 1.5 to stop the spreadth of ugly configs.
This fetch extracts the request's query string, which starts after the first
question mark. If no question mark is present, this fetch returns nothing. If
a question mark is present but nothing follows, it returns an empty string.
This means it's possible to easily know whether a query string is present
using the "found" matching method. This fetch is the completemnt of "path"
which stops before the question mark.
It applies to the channel and it doesn't erase outgoing data, only
pending unread data, which is strictly equivalent to what recv()
does with MSG_TRUNC, so that new name is more accurate and intuitive.
channel_reserved is confusingly named. It is used to know whether or
not the rewrite area is left intact for situations where we want to
ensure we can use it before proceeding. Let's rename it to fix this
confusion.
In 1.4-dev7, a header removal mechanism was introduced with commit 68085d8
("[MINOR] http: add http_remove_header2() to remove a header value."). Due
to a typo in the function, the beginning of the headers gets desynchronized
if the header preceeding the deleted one ends with an LF/CRLF combination
different form the one of the removed header. The reason is that while
rewinding the pointer, we go back by a number of bytes taking into account
the LF/CRLF status of the removed header instead of the previous one. The
case where it fails is in http-request del-header/set-header where the
multiple occurrences of a header are present and their LF/CRLF ending
differs from the preceeding header. The loop then stops because no more
headers are found given that the names and length do not match.
Another point to take into consideration is that removing headers using
a loop of http_find_header2() and this function is inefficient since we
remove values one at a time while it could be simpler and faster to
remove full header lines. This is something that should be addressed
separately.
This fix must be backported to 1.5 and 1.4. Note that http-send-name-header
relies on this function as well so it could be possible that some of the
issues encountered with it in 1.4 come from this bug.
Doing so ensures that even when no memory is available, we leave the
channel in a sane condition. There's a special case in proto_http.c
regarding the compression, we simply pre-allocate the tmpbuf to point
to the dummy buffer. Not reusing &buf_empty for this allows the rest
of the code to differenciate an empty buffer that's not used from an
empty buffer that results from a failed allocation which has the same
semantics as a buffer full.
b_alloc() now allocates a buffer and initializes it to the size specified
in the pool minus the size of the struct buffer itself. This ensures that
callers do not need to care about buffer details anymore. Also this never
applies memory poisonning, which is slow and useless on buffers.
Since during parsing stage, curproxy always represents a proxy to be operated,
it should be a mistake by referring proxy.
Signed-off-by: Godbach <nylzhaowei@gmail.com>
When enabling stats, response analysers were set on the request
analyser list, which 1) has no effect, and 2) means we don't have
the response analysers properly set.
In practice these response analysers are set when the connection
to the server or applet is established so we don't need/must not
set them here.
Fortunately this bug had no impact since the flags are distinct,
but it definitely is confusing.
It should be backported to 1.5.
Colin Ingarfield reported some unexplainable flags in the logs.
For example, a "LR" termination state was set on a request which was forwarded
to a server, where "LR" means that the request should have been handled
internally by haproxy.
This case happens when at least client side keep-alive is enabled. Next
requests in the connection will inherit the flags from the previous request.
2 fields are impacted : "termination_state" and "Tt" in the timing events,
where a "+" can be added, when a previous request was redispatched.
This is not critical for the service itself but can confuse troubleshooting.
The fix must be backported to 1.5 and 1.4.
When the HTTP parser is in state HTTP_MSG_ERROR, we don't know if it was
already initialized or not. If the error happens before HTTP_MSG_RQBEFORE,
random offsets might be present and we don't want to display such random
strings in debug mode.
While it's theorically possible to randomly crash the process when running
in debug mode here, this bug was not tagged MAJOR because it would not
make sense to run in debug mode in production.
This fix must be backported to 1.5 and 1.4.
MAX_SESS_STKCTR allows one to define the number of stick counters that can
be used in parallel in track-sc* rules. The naming of this macro creates
some confusion because the value there is sometimes used as a max instead
of a count, and the config parser accepts values from 0 to MAX_SESS_STKCTR
and the processing ignores anything tracked on the last one. This means
that by default, track-sc3 is allowed and ignored.
This fix must be backported to 1.5 where the problem there only affects
TCP rules.
Commit 179085c ("MEDIUM: http: move Connection header processing earlier")
introduced a regression : the backend's HTTP mode is not considered anymore
when setting the session's HTTP mode, because wait_for_request() is only
called once, when the frontend receives the request (or when the frontend
is in TCP mode, when the backend receives the request).
The net effect is that in some situations when the frontend and the backend
do not work in the same mode (eg: keep-alive vs close), the backend's mode
is ignored.
This patch moves all that processing to a dedicated function, which is
called from the original place, as well as from session_set_backend()
when switching from an HTTP frontend to an HTTP backend in different
modes.
This fix must be backported to 1.5.
Ryan Brock reported that server stickiness did not work for WebSocket
because the cookies and headers are not modified on 1xx responses. He
found that his browser correctly presents the cookies learned on 101
responses, which was not specifically defined in the WebSocket spec,
nor in the cookie spec. 101 is a very special case. Being part of 1xx,
it's an interim response. But within 1xx, it's special because it's
the last HTTP/1 response that transits on the wire, which is different
from 100 or 102 which may appear multiple times. So in that sense, we
can consider it as a final response regarding HTTP/1, and it makes
sense to allow header processing there. Note that we still ensure not
to mangle the Connection header, which is critical for HTTP upgrade to
continue to work smoothly with agents that are a bit picky about what
tokens are found there.
The rspadd rules are now processed for 101 responses as well, but the
cache-control checks are not performed (since no body is delivered).
Ryan confirmed that this patch works for him.
It would make sense to backport it to 1.5 given that it improves end
user experience on WebSocket servers.
Commit bb2e669 ("BUG/MAJOR: http: correctly rewind the request body
after start of forwarding") was incorrect/incomplete. It used to rely on
CF_READ_ATTACHED to stop updating msg->sov once data start to leave the
buffer, but this is unreliable because since commit a6eebb3 ("[BUG]
session: clear BF_READ_ATTACHED before next I/O") merged in 1.5-dev1,
this flag is only ephemeral and is cleared once all analysers have
seen it. So we can start updating msg->sov again each time we pass
through this place with new data. With a sufficiently large amount of
data, it is possible to make msg->sov wrap and validate the if()
condition at the top, causing the buffer to advance by about 2GB and
crash the process.
Note that the offset cannot be controlled by the attacker because it is
a sum of millions of small random sizes depending on how many bytes were
read by the server and how many were left in the buffer, only because
of the speed difference between reading and writing. Also, nothing is
written, the invalid pointer resulting from this operation is only read.
Many thanks to James Dempsey for reporting this bug and to Chris Forbes for
narrowing down the faulty area enough to make its root cause analysable.
This fix must be backported to haproxy 1.5.
pat_parse_meth() had some remains of an early implementation attempt for
the patterns, it initialises a trash and never sets the pattern value there.
The result is that a non-standard method cannot be matched anymore. The bug
appeared during the pattern rework in 1.5, so this fix must be backported
there. Thanks to Joe Williams of GitHub for reporting the bug.
This results in a string-based HTTP method match returning true when
it doesn't match and conversely. This bug was reported by Joe Williams.
The fix must be backported to 1.5, though it still doesn't work because
of at least 3-4 other bugs in the long path which leads to building this
pattern list.
Before the commit bbba2a8ecc35daf99317aaff7015c1931779c33b
(1.5-dev24-8), the tarpit section set timeout and return, after this
commit, the tarpit section set the timeout, and go to the "done" label
which reset the timeout.
Thanks Bryan Talbot for the bug report and analysis.
This should be backported in 1.5.
A couple of typo fixed in 'http-response replace-header':
- an error when counting the number of arguments
- a typo in the alert message
This should be backported to 1.5.
Add support for http-request track-sc, similar to what is done in
tcp-request for backends. A new act_prm field was added to HTTP
request rules to store the track params (table, counter). Just
like for TCP rules, the table is resolved while checking for
config validity. The code was mostly copied from the TCP code
with the exception that here we also count the HTTP request count
and rate by hand. Probably that something could be factored out in
the future.
It seems like tracking flags should be improved to mark each hook
which tracks a key so that we can have some check points where to
increase counters of the past if not done yet, a bit like is done
for TRACK_BACKEND.
We're using the internal memory representation of base32 here, which is
wrong since these data might be exported to headers for logs or be used
to stick to a server and replicated to other peers. Let's convert base32
to big endian (network representation) when building the binary block.
This mistake is also present in 1.5, it would be better to backport it.
Daniel Dubovik reported an interesting bug showing that the request body
processing was still not 100% fixed. If a POST request contained short
enough data to be forwarded at once before trying to establish the
connection to the server, we had no way to correctly rewind the body.
The first visible case is that balancing on a header does not always work
on such POST requests since the header cannot be found. But there are even
nastier implications which are that http-send-name-header would apply to
the wrong location and possibly even affect part of the request's body
due to an incorrect rewinding.
There are two options to fix the problem :
- first one is to force the HTTP_MSG_F_WAIT_CONN flag on all hash-based
balancing algorithms and http-send-name-header, but there's always a
risk that any new algorithm forgets to set it ;
- the second option is to account for the amount of skipped data before
the connection establishes so that we always know the position of the
request's body relative to the buffer's origin.
The second option is much more reliable and fits very well in the spirit
of the past changes to fix forwarding. Indeed, at the moment we have
msg->sov which points to the start of the body before headers are forwarded
and which equals zero afterwards (so it still points to the start of the
body before forwarding data). A minor change consists in always making it
point to the start of the body even after data have been forwarded. It means
that it can get a negative value (so we need to change its type to signed)..
In order to avoid wrapping, we only do this as long as the other side of
the buffer is not connected yet.
Doing this definitely fixes the issues above for the requests. Since the
response cannot be rewound we don't need to perform any change there.
This bug was introduced/remained unfixed in 1.5-dev23 so the fix must be
backported to 1.5.
As usual, when touching any is* function, Solaris complains about the
type of the element being checked. Better backport this to 1.5 since
nobody knows what the emitted code looks like since macros are used
instead of functions.
We used to only clear flags when reusing the static sample before calling
sample_process(), but that's not enough because there's a context in samples
that can be used by some fetch functions such as auth, headers and cookies,
and not reinitializing it risks that a pointer of a different type is used
in the wrong context.
An example configuration which triggers the case consists in mixing hdr()
and http_auth_group() which both make use of contexts :
http-request add-header foo2 %[hdr(host)],%[http_auth_group(foo)]
The solution is simple, initialize all the sample and not just the flags.
This fix must be backported into 1.5 since it was introduced in 1.5-dev19.
Baptiste Assmann reported a corner case in the releasing of stick-counters:
we release content-aware counters before logging. In the past it was not a
problem, but since now we can log them it, it prevents one from logging
their value. Simply switching the log production and the release of the
counter fixes the issue.
This should be backported into 1.5.
The sample fetch function "base" makes use of the trash which is also
used by set-header/add-header etc... everything which builds a formated
line. So we end up with some junk in the header if base is in use. Let's
fix this as all other fetches by using a trash chunk instead.
This bug was reported by Baptiste Assmann, and also affects 1.5.
This is the 3rd regression caused by the changes below. The latest to
date was reported by Finn Arne Gangstad. If a server responds with no
content-length and the client's FIN is never received, either we leak
the client-side FD or we spin at 100% CPU if timeout client-fin is set.
Enough is enough. The amount of tricks needed to cover these side-effects
starts to look like used toilet paper stacked over a chocolate cake. I
don't want to eat that cake anymore!
All this to avoid reporting a server-side timeout when a client stops
uploading data and haproxy expires faster than the server... A lot of
"ifs" resulting in a technically valid log that doesn't always please
users, and whose alternative causes that many issues for all others
users.
So let's revert this crap merged since 1.5-dev25 :
Revert "CLEANUP: http: don't clear CF_READ_NOEXP twice"
This reverts commit 1592d1e72a4a2d25a554c299ae95a3e6cad80bf1.
Revert "BUG/MEDIUM: http: clear CF_READ_NOEXP when preparing a new transaction"
This reverts commit 77d29029af1c44216b190dd7442964b9d8f45257.
Revert "BUG/MEDIUM: session: don't clear CF_READ_NOEXP if analysers are not called"
This reverts commit 0943757a2144761c60e416b5ed07baa76934f5a4.
Revert "BUG/MEDIUM: http: disable server-side expiration until client has sent the body"
This reverts commit 3bed5e9337fd6eeab0f0006ebefcbe98ee5c4f9f.
Revert "BUG/MEDIUM: http: correctly report request body timeouts"
This reverts commit b9edf8fbecc9d1b5c82794735adcc367a80a4ae2.
Revert "BUG/MEDIUM: http/session: disable client-side expiration only after body"
This reverts commit b1982e27aaff2a92a389a9f1bc847e3bb8fdb4f2.
If a cleaner AND SAFER way to do something equivalent in 1.6-dev, we *might*
consider backporting it to 1.5, but given the vicious bugs that have surfaced
since, I doubt it will happen any time soon.
Fortunately, that crap never made it into 1.4 so no backport is needed.
The new regex function can use string and length. The HAproxy buffer are
not null-terminated, and the use of the regex_exec* functions implies
the add of this null character. This patch replace these function by the
functions which takes a string and length as input.
Just the file "proto_http.c" is change because this one is more executed
than other. The file "checks.c" have a very low usage, and it is not
interesting to change it. Furthermore, the buffer used by "checks.c" are
null-terminated.
This patch remove all references of standard regex in haproxy. The last
remaining references are only in the regex.[ch] files.
In the file src/checks.c, the original function uses a "pmatch" array.
In fact this array is unused. This patch remove it.