This is similar to the commit 5ad6e1dc ("BUG/MINOR: http: base32+src
should use the big endian version of base32"). Now we convert url32 to big
endian when building the binary block.
This patch needs to be backported to 1.6 and 1.5.
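For illustration only, the big-endian conversion amounts to the following minimal sketch (simplified names, not the actual tree code):

    #include <arpa/inet.h>   /* htonl() */
    #include <string.h>      /* memcpy() */

    /* Store a 32-bit hash into a binary block in network (big endian)
     * order so the fetched value is identical on little and big endian
     * machines. 'buf' must be at least 4 bytes long.
     */
    static size_t append_url32_be(char *buf, unsigned int url32)
    {
            unsigned int be = htonl(url32);

            memcpy(buf, &be, sizeof(be));
            return sizeof(be);
    }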
If we use the action "http-request add-header" with a Lua sample-fetch or
converter, and the Lua function calls one of the Lua log functions, the
header name is corrupted: it contains an extract of the last logged data.
This is due to an overwrite of the trash buffer, because its scope is not
respected in the "add-header" function. The scope of the trash buffer must
be limited to the function using it. The build_logline() function can
execute a lot of other functions which may use the trash buffer.
This patch fixes the usage of the trash buffer. It limits the scope of this
global buffer to the local function: we first build the header value using
build_logline(), and only then do we store the header name.
Thanks to Michael Ezzell for the report.
This patch must be backported to 1.6.
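The hazard and the fixed ordering can be illustrated with a small self-contained sketch (the shared buffer and helper below are stand-ins, not HAProxy's actual code):

    #include <stdio.h>

    static char trash[256];   /* stand-in for the shared trash buffer */

    /* Stand-in for a sample fetch / Lua logger that also scribbles into
     * the shared buffer while producing the header value.
     */
    static const char *fetch_value(void)
    {
            snprintf(trash, sizeof(trash), "value-from-lua-sample");
            return trash;
    }

    /* Build the dynamic value first, then emit the constant header name
     * from its own storage, so nothing executed while building the value
     * can overwrite the name.
     */
    static void add_header(char *out, size_t outsz, const char *name)
    {
            const char *val = fetch_value();            /* may reuse 'trash' */

            snprintf(out, outsz, "%s: %s", name, val);  /* name kept apart */
    }

    int main(void)
    {
            char hdr[256];

            add_header(hdr, sizeof(hdr), "x-via-lua");
            puts(hdr);
            return 0;
    }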
The header name is copied twice into the buffer. The first copy is a printf-like
function writing the name and the http separators into the buffer, and the second
one is a memcpy. This seems to be inherited from some earlier changes. This patch
removes the printf-like call.
This patch must be backported to 1.6 and 1.5.
The 'set-src' action was not available for TCP rule sets. The action code
has been converted into a function in proto_tcp.c so it can be used by both
'http-request' and 'tcp-request connection' rules.
Both http and tcp keywords are registered in proto_tcp.c
Commit 108b1dd ("MEDIUM: http: configurable http result codes for
http-request deny") introduced in 1.6-dev2 was incomplete. It introduced
a new field "rule_deny_status" into struct http_txn, which is filled only
by actions "http-request deny" and "http-request tarpit". It's then used
in the deny code path to emit the proper error message, but is used
uninitialized when the deny comes from a "reqdeny" rule, causing random
behaviours ranging from returning a 200, an empty response, or crashing
the process. Often upon startup only 200 was returned but after the fields
are used the crash happens. This can be sped up using -dM.
There's no need at all for storing this status in the http_txn struct
anyway since it's used immediately after being set. Let's store it in
a temporary variable instead which is passed as an argument to function
http_req_get_intercept_rule().
As an extra benefit, removing it from struct http_txn reduced the size
of this struct by 8 bytes.
This fix must be backported to 1.6 where the bug was detected. Special
thanks to Falco Schmutz for his detailed report including an exploitable
core and a reproducer.
When compiled with GCC 6, the IP address specified for a frontend was
ignored and HAProxy was listening on all addresses instead. This is
caused by an incomplete copy of a "struct sockaddr_storage".
With the GNU Libc, "struct sockaddr_storage" is defined as this:
    struct sockaddr_storage
    {
            sa_family_t ss_family;
            unsigned long int __ss_align;
            char __ss_padding[(128 - (2 * sizeof (unsigned long int)))];
    };
Doing an aggregate copy (ss1 = ss2) is different from using memcpy():
only members of the aggregate have to be copied. Notably, padding may or
may not be copied. In GCC 6, some optimizations use this fact and if a
"struct sockaddr_storage" contains a "struct sockaddr_in", the port and
the address are part of the padding (between sa_family and __ss_align)
and may not be copied over.
Therefore, we replace any aggregate copy by a memcpy(). There is another
place using the same pattern. We also fix a function receiving a "struct
sockaddr_storage" by copy instead of by reference. Since it only needs a
read-only copy, the function is converted to request a reference.
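A minimal self-contained illustration of the fix (not the exact code from the tree):

    #include <string.h>
    #include <sys/socket.h>

    /* Copy a peer address without relying on aggregate assignment.
     * "dst = *src;" only has to copy the members, and with the GNU libc
     * the sockaddr_in port/address live in __ss_padding, which GCC 6 may
     * legitimately skip. memcpy() copies every byte.
     */
    static void copy_addr(struct sockaddr_storage *dst,
                          const struct sockaddr_storage *src)
    {
            memcpy(dst, src, sizeof(*dst));
    }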
Since client-side HTTP keep-alive was introduced in 1.4-dev, a good
number of corner cases had to be dealt with. One of them remained and
caused some occasional CPU spikes that Cyril Bonté had the opportunity
to observe from time to time and even recently to capture for deeper
analysis.
What happens is the following scenario :
1) a client sends a first request which causes the server to respond
using chunked encoding
2) the server starts to respond with a large response that doesn't
fit into a single buffer
3) the response buffer fills up with the response
4) the client reads the response while it is being sent by the server.
5) haproxy eventually receives the end of the response from the server,
which occupies the whole response buffer (must at least override the
reserve), parses it and prepares to receive a second request while
sending the last data blocks to the client. It then reinitializes the
whole http_txn and calls http_wait_for_request(), which arms the
http-request timeout.
6) at this exact moment the client emits a second request, probably
asking for an object referenced in the first page while parsing it
on the fly, and silently disappears from the internet (internet
access cut or software having trouble with pipelined request).
7) the second request arrives into the request buffer and the response
data stall in the response buffer, preventing the reserve from being
used for anything else.
8) haproxy calls http_wait_for_request() to parse the next request which
has just arrived, but since it sees the response buffer is full, it
refrains from doing so and waits for some data to leave or a timeout
to strike.
9) the http-request timeout strikes, causing http_wait_for_request() to
be called again, and the function immediately returns since it cannot
even produce an error message, and the loop is maintained here.
10) the client timeout strikes, aborting the loop.
At first glance a check for timeout would be needed *before* considering
the buffer in http_wait_for_request(), but the issue is not there in fact,
because when doing so we see in the logs a client timeout error while
waiting for a request, which is wrong. The real issue is that we must not
consider the first transaction terminated until at least the reserve is
released so that http_wait_for_request() has no problem starting to process
a new request. This allows the first response to be reported in timeout and
not the second request. A more restrictive control could consist in not
considering the request complete until the response buffer has no more
outgoing data but this brings no added value and needlessly increases the
number of wake-ups when dealing with pipelining.
Note that the same issue exists with the request, we must ensure that any
POST data are finished being forwarded to the server before accepting a new
request. This case is much harder to trigger however as servers rarely
disappear and if they do so, they impact all their sessions at once.
No specific reproducer is usable because the bug is very hard to reproduce
and depends on the system as well with settings varying across reboots. On
a test machine, socket buffers were reduced to 4096 (tcp_rmem and tcp_wmem),
haproxy's buffers were set to 16384 and tune.maxrewrite to 12288. The proxy
must work in http-server-close mode, with a request timeout smaller than the
client timeout. The test is run like this :
$ (printf "GET /15000 HTTP/1.1\r\n\r\n"; usleep 100000; \
printf "GET / HTTP/1.0\r\n\r\n"; sleep 15) | nc6 --send-only 0 8002
The server returns chunks of the requested size (15000 bytes here, but
78000 in a previous test was the only working value). Strace must show the
last recvfrom() succeed and the last sendto() being shorter than desired or
better, not being called.
This fix must be backported to 1.6, 1.5 and 1.4. It was made in a way that
should make it possible to backport it. It depends on channel_congested()
which also needs to be backported. Further cleanup of http_wait_for_request()
could be made given that some of the tests are now useless.
Commit dbe34eb ("MEDIUM: filters/http: Move body parsing of HTTP
messages in dedicated functions") introduced a bug in function
http_response_forward_body() by getting rid of the while(1) loop.
The code immediately following the loop was only reachable on missing
data but now it's also reachable under normal conditions, which used
to be dealt with by the skip_resync_state label returning zero. The
side effect is that in http_server_close situations, the channel's
SHUTR flag is seen and considered as a server error which is reported
if any other error happens (eg: client timeout).
This bug is specific to 1.7, no backport is needed.
The following patch allows logging the cookie for tarpitted and denied requests.
This minor bug affects at least the 1.5, 1.6 and 1.7 branches.
The solution is not perfect: maybe the cookie processing
(manage_client_side_cookies) could be moved into http_process_req_common.
First, we modify the signatures of http_msg_forward_body and
http_msg_forward_chunked_body as they are declared as inline
below. Secondly, we just verify the return value of the chunk initialization
which holds the Authorization method (although it is unlikely to fail...).
Both issues were reported by gcc warnings.
Instead of repeating the type of the LHS argument (sizeof(struct ...))
in calls to malloc/calloc, we directly use the pointer
name (sizeof(*...)). The following Coccinelle patch was used:
@@
type T;
T *x;
@@
x = malloc(
- sizeof(T)
+ sizeof(*x)
)
@@
type T;
T *x;
@@
x = calloc(1,
- sizeof(T)
+ sizeof(*x)
)
When the LHS is not just a variable name, no change is made. Moreover,
the following patch was used to ensure that "1" is consistently used as
a first argument of calloc, not the last one:
@@
@@
calloc(
+ 1,
...
- ,1
)
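For illustration, the shape of the resulting code with a hypothetical struct:

    #include <stdlib.h>

    struct item {
            int  id;
            char name[32];
    };

    static struct item *item_new(void)
    {
            /* Before: it = malloc(sizeof(struct item)); repeated the type.
             * After: the size always follows the pointed-to type of the LHS,
             * and "1" is consistently the first argument of calloc().
             */
            struct item *it = calloc(1, sizeof(*it));

            return it;
    }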
In C89, "void *" is automatically converted to any pointer type. Casting
the result of malloc/calloc to the type of the LHS variable is therefore
unneeded.
Most of this patch was built using this Coccinelle patch:
@@
type T;
@@
- (T *)
(\(lua_touserdata\|malloc\|calloc\|SSL_get_app_data\|hlua_checkudata\|lua_newuserdata\)(...))
@@
type T;
T *x;
void *data;
@@
x =
- (T *)
data
@@
type T;
T *x;
T *data;
@@
x =
- (T *)
data
Unfortunately, either Coccinelle or I is too limited to detect situations
where a complex RHS expression is of type "void *" and the cast is therefore
not needed. Those cases were manually examined and corrected.
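A tiny before/after sketch with a hypothetical type:

    #include <stdlib.h>

    struct ctx {
            int state;
    };

    static struct ctx *ctx_new(void)
    {
            /* Before: struct ctx *c = (struct ctx *)calloc(1, sizeof(*c));
             * After: the cast is dropped, since "void *" converts implicitly
             * to any object pointer type in C.
             */
            struct ctx *c = calloc(1, sizeof(*c));

            return c;
    }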
There are two issues with error captures. The first one is that the
capture size is still hard-coded to BUFSIZE regardless of any possible
tune.bufsize setting and of the fact that frontends only capture request
errors and that backends only capture response errors. The second is that
captures are allocated in both directions for all proxies, which start to
count a lot in configs using thousands of proxies.
This patch changes this so that error captures are allocated only when
needed, and of the proper size. It also refrains from dumping a buffer
that was not allocated, which still allows to emit all relevant info
such as flags and HTTP states. This way it is possible to save up to
32 kB of RAM per proxy in the default configuration.
This patch adds a sample fetch which returns the unique-id if it is
configured. If the unique-id has not yet been generated, it builds it. If
the unique-id is not configured, it returns nothing.
A similar issue was fixed in 67dad27, but that fix was incomplete. A crash still
happened when utilizing req.fhdr() and sending exactly MAX_HDR_HISTORY
headers.
This fix needs to be backported to 1.5 and 1.6.
Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>
Cyril reported that recent commit 320ec2a ("BUG/MEDIUM: chunks: always
reject negative-length chunks") introduced a build warning because gcc
cannot guess that we can't fall into the case where the auth_method
chunk is not initialized.
This patch addresses it, though for the long term it would be best
if chunk_initlen() would always initialize the result.
This fix must be backported to 1.6 and 1.5 where the aforementioned
fix was already backported.
The output for each field is :
field:<origin><nature><scope>:type:value
where "field" describes the type of the object being dumped as well as its
position (pid, iid, sid), field number and field name. This way a
monitoring utility may very well report all available information without
knowing new fields in advance.
This format is also supported in the HTTP version of the stats by adding
";typed" after the URI, instead of ";csv" for the CSV format.
The doc was not updated yet.
This is the continuation of previous patch called "BUG/MAJOR: samples:
check smp->strm before using it".
It happens that variables may have a session-wide scope, and that their
session is retrieved by dereferencing the stream. But nothing prevents them
from being used from a streamless context such as tcp-request connection,
thus crashing the process. Example :
tcp-request connection accept if { src,set-var(sess.foo) -m found }
In order to fix this, we have to always ensure that variable manipulation
only happens via the sample, which contains the correct owner and context,
and that we never use one from a different source. This results in quite a
large change since a lot of functions are indirectly involved in the call
chain, but the change is easy to follow.
This fix must be backported to 1.6, and requires the last two patches.
Since commit 6879ad3 ("MEDIUM: sample: fill the struct sample with the
session, proxy and stream pointers") merged in 1.6-dev2, the sample
contains the pointer to the stream and sample fetch functions as well
as converters use it heavily.
The problem is that earlier commit 87b0966 ("REORG/MAJOR: session:
rename the "session" entity to "stream"") had split the session and
stream resulting in the possibility for smp->strm to be NULL before
the stream was initialized. This is what happens in tcp-request
connection rulesets, as discovered by Baptiste.
The sample fetch functions must now check that smp->strm is valid
before using it. An alternative could consist in using a dummy stream
with nothing in it to avoid some checks but it would only result in
deferring them to the next step anyway, and making it harder to detect
that a stream is valid or the dummy one.
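As a hedged sketch (the exact 1.6 fetch prototype is recalled from memory and relies on HAProxy's internal types, so treat it as an assumption), a stream-dependent sample fetch now starts with such a guard:

    /* Sketch: a fetch that needs the stream must refuse to run when the
     * sample comes from a streamless context (e.g. tcp-request connection).
     */
    static int smp_fetch_needs_stream(const struct arg *args, struct sample *smp,
                                      const char *kw, void *private)
    {
            if (!smp->strm)
                    return 0;      /* no stream yet: the fetch cannot work */

            /* safe to dereference smp->strm past this point */
            return 1;
    }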
There is still an issue with variables which requires a complete
independent fix. They use strm->sess to find the session with strm
possibly NULL and passed as an argument. All call places indirectly
use smp->strm to build strm. So the problem is there but the API needs
to be changed to remove this duplicate argument that makes it much
harder to know what pointer to use.
This fix must be backported to 1.6, as well as the next one fixing
variables.
This is an improvement, especially when the message body is big. Before this
patch, remaining data were forwarded when there was no filter on the stream. Now,
the forwarding is triggered when there is no "data" filter on the channel. When
no filter is used, there is no difference, but when at least one filter is used,
it can be really significant.
Now, http_parse_chunk_size and http_skip_chunk_crlf return the number of bytes
parsed on success. http_skip_chunk_crlf does not use msg->sol anymore.
On the other hand, http_forward_trailers is unchanged. It returns >0 if the end
of trailers is reached and 0 if not. In all cases (except if an error is
encountered), msg->sol contains the length of the last parsed part of the
trailer headers.
Internal doc and comments about msg->sol have been updated accordingly.
Before, functions to filter HTTP body (and TCP data) were called from the moment
at least one filter was attached to the stream. If no filter is interested in
this data, it uselessly slows down data parsing.
A good example is the HTTP compression filter. Depending on the request and response
headers, the response compression can be enabled or not. So it could be really
nice to call it only when enabled.
So, now, to filter HTTP/TCP data, a filter must use the function
register_data_filter. For TCP streams, this function can be called only
once. But for HTTP streams, when needed, it must be called for each HTTP request
or HTTP response.
Only registered filters will be called during data parsing. At any time, a
filter can be unregistered by calling the function unregister_data_filter.
Now body parsing is done in http_msg_forward_body and
http_msg_forward_chunked_body functions, regardless of whether we parse a
request or a response.
Parsing result is still handled in http_request_forward_body and
http_response_forward_body functions.
This patch will ease future optimizations, mainly on filters.
This new analyzer will be called for each HTTP request/response, before the
parsing of the body. It is identified by AN_FLT_HTTP_HDRS.
Special care was taken about the following condition :
* the frontend is a TCP proxy
* filters are defined in the frontend section
* the selected backend is a HTTP proxy
So, this patch explicitly adds the AN_FLT_HTTP_HDRS analyzer on the request and the
response channels when the backend is an HTTP proxy and when there are filters
attached to the stream.
This patch simplifies http_request_forward_body and http_response_forward_body
functions.
For chunked HTTP requests/responses, the body filtering can be really
expensive. In the worst case (many 1-byte chunks), the filter overhead is
3 calls per chunk. While the http_data callback is useful, the others are just
informative.
So these callbacks have been removed. Of course, the existing filters (trace and
compression) have been updated accordingly. For the HTTP compression filter, the
update is quite huge. Its implementation is closer to the old one.
When no filter is attached to the stream, the CPU footprint due to the calls to
filters_* functions is huge, especially for chunk-encoded messages. Using macros
to check if we have some filters or not is a great improvement.
Furthermore, instead of checking the filter list emptiness, we introduce a flag
to know if filters are attached or not to a stream.
HTTP compression has been rewritten to use the filter API. This is more a PoC
than anything else for now. It allocates memory to work. So, if only for that, it
should be rewritten.
In the meantime, the implementation has been refactored to allow its use with
other filters. However, there are limitations that should be respected:
- No filter placed after the compression one is allowed to change input data
(in 'http_data' callback).
- No filter placed before the compression one is allowed to change forwarded
data (in 'http_forward_data' callback).
For now, these limitations are informal, so you should be careful when you use
several filters.
About the configuration, 'compression' keywords are still supported and must be
used to configure the HTTP compression behavior. In the absence of a 'filter' line
for the compression filter, it is added to the filter chain when the first
'compression' line is parsed. This is an easy way to do it when you do not use other
filters. But if another filter exists, an error is reported so that the user must
explicitly declare the filter.
For example:
listen tst
...
compression algo gzip
compression offload
...
filter flt_1
filter compression
filter flt_2
...
HTTP compression will be moved into a true filter. To prepare the ground, some
functions have been moved into a dedicated file. The idea is to keep everything about
compression algos in compression.c and everything related to the filtering in
flt_http_comp.c.
For now, a header has been added to help during the transition. It will be
removed later.
Unused empty ACL keyword list was removed. The "compression" keyword
parser was moved from cfgparse.c to flt_http_comp.c.
This patch adds the support of filters in HAProxy. The main idea is to have a
way to "easely" extend HAProxy by adding some "modules", called filters, that
will be able to change HAProxy behavior in a programmatic way.
To do so, many entry points have been added in the code to let filters hook into
different steps of the processing. A filter must define a flt_ops structure
(see include/types/filters.h for details). This structure contains all available
callbacks that a filter can define:
    struct flt_ops {
            /*
             * Callbacks to manage the filter lifecycle
             */
            int  (*init)  (struct proxy *p);
            void (*deinit)(struct proxy *p);
            int  (*check) (struct proxy *p);
            /*
             * Stream callbacks
             */
            void (*stream_start)     (struct stream *s);
            void (*stream_accept)    (struct stream *s);
            void (*session_establish)(struct stream *s);
            void (*stream_stop)      (struct stream *s);
            /*
             * HTTP callbacks
             */
            int  (*http_start)         (struct stream *s, struct http_msg *msg);
            int  (*http_start_body)    (struct stream *s, struct http_msg *msg);
            int  (*http_start_chunk)   (struct stream *s, struct http_msg *msg);
            int  (*http_data)          (struct stream *s, struct http_msg *msg);
            int  (*http_last_chunk)    (struct stream *s, struct http_msg *msg);
            int  (*http_end_chunk)     (struct stream *s, struct http_msg *msg);
            int  (*http_chunk_trailers)(struct stream *s, struct http_msg *msg);
            int  (*http_end_body)      (struct stream *s, struct http_msg *msg);
            void (*http_end)           (struct stream *s, struct http_msg *msg);
            void (*http_reset)         (struct stream *s, struct http_msg *msg);
            int  (*http_pre_process)   (struct stream *s, struct http_msg *msg);
            int  (*http_post_process)  (struct stream *s, struct http_msg *msg);
            void (*http_reply)         (struct stream *s, short status,
                                        const struct chunk *msg);
    };
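For instance, a filter only fills in the callbacks it needs; a hedged sketch based on the structure above (return conventions simplified and assumed):

    /* Minimal sketch of a filter definition; unneeded callbacks stay NULL. */
    static int my_filter_init(struct proxy *p)
    {
            return 0;   /* assumed convention: 0 on success */
    }

    static int my_filter_http_start(struct stream *s, struct http_msg *msg)
    {
            return 1;   /* assumed convention: positive value continues processing */
    }

    static struct flt_ops my_filter_ops = {
            .init       = my_filter_init,
            .http_start = my_filter_http_start,
    };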
To declare and use a filter, in the configuration, the "filter" keyword must be
used in a listener/frontend section:
frontend test
...
filter <FILTER-NAME> [OPTIONS...]
The filter referenced by the <FILTER-NAME> must declare a configuration parser
on its own name to fill the flt_ops and filter_conf fields in the proxy's
structure. An example will be provided later to make it perfectly clear.
For now, filters cannot be used in a backend section. But this is only a matter of
time. Documentation will also be added later. This is the first commit of a long
list about filters.
It is possible to have several filters on the same listener/frontend. These
filters are stored in an array of at most MAX_FILTERS elements (defined in
include/types/filters.h). Again, this will be replaced later by a list of
filters.
The filter API has been highly refactored. Main changes are:
* Now, HA supports an infinite number of filters per proxy. To do so, filters
are stored in a list.
* Because filters are stored in a list, the filter state has been moved from the
channel structure to the filter structure. This is cleaner because there is no
more info about filters in the channel structure.
* It is possible to define filters on backends only. For such filters,
stream_start/stream_stop callbacks are not called. Of course, it is possible
to mix frontend and backend filters.
* Now, TCP streams are also filtered. All callbacks without the 'http_' prefix
are called for all kinds of streams. In addition, 2 new callbacks were added to
filter data exchanged through a TCP stream:
- tcp_data: it is called when new data are available or when old unprocessed
data are still waiting.
- tcp_forward_data: it is called when some data can be consumed.
* New callbacks attached to channel were added:
- channel_start_analyze: it is called when a filter is ready to process data
exchanged through a channel. 2 new analyzers (a frontend and a backend)
are attached to channels to call this callback. For a frontend filter, it
is called before any other analyzer. For a backend filter, it is called
when a backend is attached to a stream. So some processing cannot be
filtered in that case.
- channel_analyze: it is called before each analyzer attached to a channel,
except analyzers responsible for data sending.
- channel_end_analyze: it is called when all other analyzers have finished
their processing. A new analyzer is attached to channels to call this
callback. For a TCP stream, this is always the last one called. For a HTTP
one, the callback is called when a request/response ends, so it is called
one time for each request/response.
* 'session_established' callback has been removed. Everything that is done in
this callback can be handled by 'channel_start_analyze' on the response
channel.
* 'http_pre_process' and 'http_post_process' callbacks have been replaced by
'channel_analyze'.
* 'http_start' callback has been replaced by 'http_headers'. This new one is
called just before headers sending and parsing of the body.
* 'http_end' callback has been replaced by 'channel_end_analyze'.
* It is possible to set a forwarder for TCP channels. It was already possible to
do it for HTTP ones.
* Forwarders can partially consume forwardable data. For this reason a new
HTTP message state was added before HTTP_MSG_DONE : HTTP_MSG_ENDING.
Now all filters can define corresponding callbacks (http_forward_data
and tcp_forward_data). Each filter owns 2 offsets relative to buf->p, next and
forward, to track, respectively, input data already parsed but not forwarded yet
by the filter and parsed data considered as forwarded by the filter. At any time,
we have the guarantee that a filter cannot parse or forward more input than
previous ones. And, of course, it cannot forward more input than it has
parsed. 2 macros have been added to retrieve these offsets: FLT_NXT and FLT_FWD.
In addition, 2 functions have been added to change the 'next size' and the
'forward size' of a filter. When a filter parses input data, it can alter these
data, so the size of these data can vary. This action has an effect on all
previous filters that must be handled. To do so, the function
'filter_change_next_size' must be called, passing the size variation. In the
same spirit, if a filter alter forwarded data, it must call the function
'filter_change_forward_size'. 'filter_change_next_size' can be called in
'http_data' and 'tcp_data' callbacks and only these ones. And
'filter_change_forward_size' can be called in 'http_forward_data' and
'tcp_forward_data' callbacks and only these ones. The data changes are the
filter's responsibility, but with some limitations. It must not change already
parsed/forwarded data or data that previous filters have not parsed/forwarded
yet.
Because filters can be used on backends, when the backend is set for a
stream, we add filters defined for this backend in the filter list of the
stream. But we must only do that when the backend and the frontend of the stream
are not the same. Otherwise the same filters are added a second time, leading to undefined
behavior.
The HTTP compression code had to be moved.
So it simplifies the http_response_forward_body function. To do so, the way the data
are forwarded has changed. Now, a filter (and only one) can forward data. In a
commit to come, this limitation will be removed to let all filters take part in
data forwarding. There are 2 new functions that filters should use to deal with
this feature:
* flt_set_http_data_forwarder: This function sets the filter (using its id)
that will forward data for the specified HTTP message. It is possible if it
was not already set by another filter _AND_ if no data was yet forwarded
(msg->msg_state <= HTTP_MSG_BODY). It returns -1 if an error occurs.
* flt_http_data_forwarder: This function returns the filter id that will
forward data for the specified HTTP message. If there is no forwarder set, it
returns -1.
When an HTTP data forwarder is set for the response, the HTTP compression is
disabled. Of course, this is not definitive.
When working on the previous bug, it appeared that the case that was
triggering the bug would also work between two backends, one of which
doesn't support http-reuse. The reason is that while the idle connection
is moved to the private pool, upon reuse we only check if it holds the
CO_FL_PRIVATE flag. And we don't set this flag when there's no reuse.
So let's always set it in this case, it will guarantee that no undesired
connection sharing may happen.
This fix must be backported to 1.6.
Gregor Kovač reported that http_date() did not return the right day of the
week. For example "Sat, 22 Jan 2016 17:43:38 GMT" instead of "Fri, 22 Jan
2016 17:43:38 GMT". Indeed, gmtime() returns a 'struct tm' result, where
tm_wday begins on Sunday, whereas the code assumed it began on Monday.
This patch must be backported to haproxy 1.5 and 1.6.
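A self-contained illustration of the off-by-one: tm_wday from gmtime() is 0 for Sunday, so the lookup table must start on Sunday:

    #include <stdio.h>
    #include <time.h>

    /* tm_wday is 0..6 and starts on Sunday, not Monday. */
    static const char day_names[7][4] =
            { "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };

    int main(void)
    {
            time_t now = time(NULL);
            struct tm *tm = gmtime(&now);

            if (tm)
                    printf("%s, %02d\n", day_names[tm->tm_wday], tm->tm_mday);
            return 0;
    }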
The function http_reply_and_close has been added in proto_http.c to wrap calls
to stream_int_retnclose. This function will be modified when the filters are
added.
When the response body is forwarded, if the server closes the input before the
end, an error is thrown. But if the data processing is too slow, all data could
already have been received and be pending in the input buffer. So it is a bug to stop
processing in this context: the server didn't really close the input before
the end.
As an example, this could happen when HAProxy is configured to do compression
offloading. If the server closes the connection explicitly after the response
(keep-alive disabled by the server) and if HAProxy receives the data faster than
they are compressed, then the response could be truncated.
This patch fixes the bug by checking if some pending data remain in the input
buffer before returning an error. If yes, the processing continues.
Several cases of "<=" instead of "<" were found in the url_param parser,
mostly affecting the case where the parameter is wrapping. They shouldn't
affect header operations, just body parsing in a wrapped pipelined request.
The code is a bit complicated with certain operations done multiple times
in multiple functions, so it's not sure others are not left. This code
must be re-audited.
It should only be backported to 1.6 once carefully tested, because it is
possible that other bugs relied on these ones.
Krishna Kumar reported that the following configuration doesn't permit
HTTP reuse between two clients :
frontend private-frontend
mode http
bind :8001
default_backend private-backend
backend private-backend
mode http
http-reuse always
server bck 127.0.0.1:8888
The reason for this is that in http_end_txn_clean_session() we check the
stream's backend's http-reuse option before deciding whether the
backend connection should be moved back to the server's pool or not. But
since we're doing this after the call to http_reset_txn(), the backend is
reset to match the frontend, which doesn't have the option. However it
will work fine in a setup involving a "listen" section.
We just need to keep a pointer to the current backend before calling
http_reset_txn(). The code does that and replaces the few remaining
references to s->be inside the same function so that if any part of
code were to be moved later, this trap doesn't happen again.
This fix must be backported to 1.6.
Previously, the urlp fetching samples were able to find parameters with an empty
value, but the return code depended on the value length. The final result was
that acls using urlp couldn't match empty values.
Example of acl which always returned "false":
acl MATCH_EMPTY urlp(foo) -m len 0
The fix consists in unconditionally returning 1 when the parameter is found.
This fix must be backported to 1.6 and 1.5.
There is a bug where "option http-keep-alive" doesn't force a response
to stay in keep-alive if the server sends the FIN along with the response
on the second or subsequent response. The reason is that the auto-close
was forced enabled when recycling the HTTP transaction and it's never
disabled along the response processing chain before the SHUTR gets a
chance to be forwarded to the client side. The MSG_DONE state of the
HTTP response properly disables it but too late.
There's no more reason for enabling auto-close here, because either it
doesn't matter in non-keep-alive modes because the connection is closed,
or it is automatically enabled by process_stream() when it sees there's
no analyser on the stream.
This bug also affects 1.5 so a backport is desired.
When a server timeout is detected on the second or nth request of a keep-alive
connection, HAProxy closes the connection without writing a response.
Some clients would fail with a remote disconnected exception and some
others would retry potentially unsafe requests.
This patch removes the special case and makes sure a 504 timeout is
written back whenever a server timeout is handled.
Signed-off-by: lsenta <laurent.senta@gmail.com>
Cyril Bonté reported a reproducible sequence which can lead to a crash
when using backend connection reuse. The problem comes from the fact that
we systematically add the server connection to an idle pool at the end of
the HTTP transaction regardless of the fact that it might already be there.
This is possible for example when processing a request which doesn't use
a server connection (typically a redirect) after a request which used a
connection. Then after the first request, the connection was already in
the idle queue and we're putting it a second time at the end of the second
request, causing a corruption of the idle pool.
Interestingly, the memory debugger in 1.7 immediately detected a suspicious
double free on the connection, leading to a very early detection of the
cause instead of its consequences.
Thanks to Cyril for quickly providing a working reproducer.
This fix must be backported to 1.6 since connection reuse was introduced
there.
The 'OPTIONS' method was not in the list of supported HTTP methods and
find_http_meth returned HTTP_METH_OTHER instead of HTTP_METH_OPTIONS.
[wt: this fix needs to be backported at least to 1.5, 1.4 and 1.3]
This flag is used by custom actions to know that they're called for the
first time. The only case where it's not set is when they're resuming
from a yield. It will be needed to let them know when they have to
allocate some resources.
In HTTP it's more difficult to know when to pass the flag or not
because all actions are supposed to be final and there's no inspection
delay. Also, the input channel may very well be closed without this
being an error. So we only set the flag when option abortonclose is
set and the input channel is closed, which is the only case where the
user explicitly wants to forward a close down the chain.
Since commit bc4c1ac ("MEDIUM: http/tcp: permit to resume http and tcp
custom actions"), some actions may yield and be called back when new
information is available. Unfortunately some of them may continue to
yield because they simply don't know that it's the last call from the
rule set. For this reason we'll need to pass a flag to the custom
action to pass such information, and possibly others at the same time.
When we call the function smp_prefetch_http(), if the txn is not initialized,
it doesn't work. This patch fixes this. Now, smp_prefetch_http() permits using
HTTP sample fetches with any proxy mode.
Added the definition of CHECK_HTTP_MESSAGE_FIRST and the declaration of
smp_prefetch_http to the header.
Changed smp_prefetch_http implementation to remove the static qualifier.
When converting the "method" fetch to a string, we used to get an empty
string if the first character was not an upper case. This was caused by
the lookup function which returns HTTP_METH_NONE when a lookup is not
possible, and this method being mapped to an empty string in the array.
This is a totally stupid mechanism, there's no reason for having the
result depend on the first char. In fact the message parser already
checks that the syntax matches an HTTP token so we can only land there
with a valid token, hence only HTTP_METH_OTHER should be returned.
This fix should be backported to all actively supported branches.
Before this patch, two types of custom actions existed: ACT_ACTION_CONT and
ACT_ACTION_STOP. ACT_ACTION_CONT is a non-terminal action and ACT_ACTION_STOP is
a terminal action.
Note that ACT_ACTION_STOP is not used in HAProxy.
This patch removes this behavior. Only one type of custom action exists, and it
is called ACT_CUSTOM. Now, the custom action can return a code indicating the
required behavior. ACT_RET_CONT tells HAProxy to continue the current rule
list evaluation, and ACT_RET_STOP tells HAProxy to stop the current rule
list evaluation.
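A hedged sketch of the new convention (the action prototype below is an approximation, and the helper is hypothetical):

    /* Sketch: a custom action returning one of the new codes. */
    static enum act_return my_custom_action(struct act_rule *rule,
                                            struct proxy *px,
                                            struct session *sess,
                                            struct stream *s)
    {
            if (block_this_stream(s))    /* hypothetical helper */
                    return ACT_RET_STOP; /* stop evaluating the current rule list */

            return ACT_RET_CONT;         /* continue with the next rule */
    }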
Jesse Hathaway reported a crash that Cyril Bonté diagnosed as being
caused by the manipulation of srv_conn after setting it to NULL. This
happens in http-server-close mode when the server returns either a 401
or a 407, because the connection was previously closed then it's being
assigned the CO_FL_PRIVATE flag.
This bug only affects 1.6-dev as it was introduced by connection reuse code
with commit 387ebf8 ("MINOR: connection: add a new flag CO_FL_PRIVATE").
This patch is inspired by Bowen Ni's proposal and it is based on his first
implementation:
With Lua integration in HAProxy 1.6, one can change the request method,
path, uri, header, response header etc except response line.
I'd like to contribute the following methods to allow modification of the
response line.
[...]
There are two new keywords in 'http-response' that allows you to rewrite
them in the native HAProxy config. There are also two new APIs in Lua that
allows you to do the same rewriting in your Lua script.
Example:
Use it in HAProxy config:
*http-response set-code 404*
Or use it in Lua script:
*txn.http:res_set_reason("Redirect")*
I didn't take the full patch because the manipulation of the "reason" is useless:
a standard reason is associated with each returned code, and unknown codes can
take a generic reason.
So, this patch can set the status code, and the reason is automatically adapted.
This patch normalizes the return codes of the configuration parsers. Before
these changes, the tcp action parser returned -1 on failure and 0 on
success. The http action parser returned 0 on failure and 1 on success.
The normalisation uses:
- ACT_RET_PRS_OK for success
- ACT_RET_PRS_ERR for failure
Each (http|tcp)-(request|response) action uses the same method
for looking up the action keyword during configuration parsing.
This patch shares this code between them.
This patch merges the configuration keyword structs. Each declared configuration
keyword struct is similar to the others. This patch simplifies the code.
Action functions can return 3 statuses:
- error, if the action encounters a fatal error (like out of memory)
- yield, if the action must finish its work later
- continue, in all other cases
For performance considerations, some actions are not processed by a separate
function; they are handled directly by the calling function. Some of these actions
do the same thing but for different processing parts (request / response).
This patch gives the same name to the same actions, and normalizes the
names of the other actions.
This patch is ONLY a rename, it doesn't modify the code.
This patch groups the action names in one file. Some actions are called
many times and need an action embedded in the action caller. The main
goal is to have only one header file grouping all definitions.
This mark permits detecting whether the action tag is outside the allowed range.
- Normally, this case doesn't happen
- If it happens, it is handled by the default case of the switch
This patch removes the generic opaque type for storing the configuration of the
acion "set-src" (HTTP_REQ_ACT_SET_SRC), and use the dedicated type "struct expr"
This patch is the first of a series which merges all the action structs. The
functions "tcp-request content", "tcp-response content", "http-request" and
"http-response" have the same values and the same processing for some defined
actions, but the structs and the prototypes of the declared functions are
different.
This patch tries to unify all of these entries.
The union name "data" is a little bit heavy while we read the source
code because we can read "data.data.sint". The rename from "data" to "u"
makes the read easiest like "data.u.sint".
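For example, code filling a sample now reads as follows (sketch against HAProxy's internal types; the constant name is assumed from the tree):

    /* Sketch: fill a sample with a signed integer using the renamed union. */
    static void set_sample_int(struct sample *smp, long long val)
    {
            smp->data.u.sint = val;            /* was: smp->data.data.sint */
            smp->data.type   = SMP_T_SINT;     /* assumed type constant */
    }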
This patch removes the data information that was stored both in the struct
sample_data and in the struct sample. Now, only the struct sample_data
contains data, and the struct sample uses a struct sample_data for storing
its own data.
appsessions started to be deprecated with the introduction of stick
tables, and the latter are much more powerful and flexible, and in
addition they are replicated between nodes and maintained across
reloads. Let's now remove appsession completely.
This strategy is less extreme than "always", it only dispatches first
requests to validated reused connections, and moves a connection from
the idle list to the safe list once it has seen a second request, thus
proving that it could be reused.
In connect_server(), if we don't have a connection attached to the
stream-int, we first look into the server's idle_conns list and we
pick the first one there, we detach it from its owner if it had one.
If we used to have a connection, we close it.
This mechanism works well but doesn't scale : as servers increase,
the likeliness that the connection attached to the stream interface
doesn't match the server and gets closed increases.
This flag is set on an outgoing connection when this connection gets
some properties that must not be shared with other connections, such
as dynamic transparent source binding, SNI or a proxy protocol header,
or an authentication challenge from the server. This will be needed
later to implement connection reuse.
This function is now dedicated to idle connections only, which means
that it must not be used without any endpoint nor anything not a
connection. The connection remains attached to the stream interface.
Since we now always call this function with the reuse parameter cleared,
let's simplify the function's logic as it cannot return the existing
connection anymore. The savings on this inline function are appreciable
(240 bytes) :
$ size haproxy.old haproxy.new
text data bss dec hex filename
1020383 40816 36928 1098127 10c18f haproxy.old
1020143 40816 36928 1097887 10c09f haproxy.new
This patch removes the 32-bit unsigned integer and the 32-bit signed
integer. It replaces these types with a single signed 64-bit type.
This makes integer usage easier and clarifies signed and unsigned use.
With the previous version, signed and unsigned were sometimes used in place of
one another, and sometimes a converter lost the sign. For example, divisions
were processed as "unsigned": if one operand was negative, the result was
wrong.
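A self-contained illustration of the sign problem the unified type avoids:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
            /* With unsigned arithmetic, -8 wraps to a huge value and the
             * division result is meaningless; with a signed 64-bit type the
             * result is what the user expects.
             */
            uint32_t bad  = (uint32_t)-8 / 2;   /* 2147483644, not -4 */
            int64_t  good = (int64_t)-8 / 2;    /* -4 */

            printf("unsigned: %u  signed 64-bit: %lld\n",
                   (unsigned)bad, (long long)good);
            return 0;
    }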
Note that the integer pattern matching and dotted version pattern matching
are already working with signed 64 bits integer values.
There is one user-visible change : the "uint()" and "sint()" sample fetch
functions which used to return a constant integer have been replaced with
a new more natural, unified "int()" function. These functions were only
introduced in the latest 1.6-dev2 so there's no impact on regular
deployments.
The man page says that gmtime() and localtime() can return a NULL value;
this was not tested. It appears that all the values of a 32-bit integer
are valid, but it is better to check the return value of these functions.
Moreover, now that the integer has moved from 32 bits to 64 bits, some 64-bit
values can be unsupported.
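A minimal sketch of the defensive pattern (standard C, not the tree's exact code):

    #include <stddef.h>
    #include <time.h>

    /* gmtime() may return NULL for out-of-range values, which becomes
     * plausible once 64-bit inputs are allowed, so check before use.
     */
    static int format_http_date(char *dst, size_t len, time_t t)
    {
            struct tm *tm = gmtime(&t);

            if (!tm) {
                    *dst = '\0';
                    return 0;            /* caller must handle the failure */
            }
            return (int)strftime(dst, len, "%a, %d %b %Y %H:%M:%S GMT", tm);
    }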
This option enables overriding the source IP address in an HTTP request. It is
useful when we want to set a custom source IP (e.g. a front proxy rewrites the address,
but provides the correct one in headers) or we want to mask the source IP address
for privacy or compliance.
It acts on any expression which produces a correct IP address.
This modification makes it possible to use sample_fetch_string() in more places,
where we might need to fetch sample values which are not plain strings. This
way we don't need to fetch a string and convert it into another type afterwards.
When using aliased types, the caller should explicitly check which exact type
was returned (e.g. SMP_T_IPV4 or SMP_T_IPV6 for SMP_T_ADDR).
All usages of sample_fetch_string() are converted to use new function.
This patch adds support of variables during the processing of each stream. The
variables scope can be set as 'session', 'transaction', 'request' or 'response'.
The variable type is the type returned by the assignment expression. The type
can change during processing.
The allocated memory can be controlled for each scope and each request, and for
the global process.
This patch permits registering a new keyword for "tcp-request content",
"tcp-request connection", "tcp-response content", "http-request" and "http-response"
which is identified only by matching the start of the keyword.
For example, we register the keyword "set-var" with the option "match_pfx"
and the configuration keyword "set-var(var_name)" matches this entry.
Commit 9fbe18e ("MEDIUM: http: add a new option http-buffer-request")
introduced a regression due to a misplaced check causing the admin
mode of the HTTP stats not to work anymore.
This patch tried to ensure that when we need a request body for the
stats applet, and we have already waited for this body, we don't wait
for it again, but the condition was applied too early, causing the
entire processing of the body to be disabled, and was based on the wrong
HTTP state (MSG_BODY), resulting in the test never matching.
Thanks to Chad Lavoie for reporting the problem.
This bug is 1.6-only, no backport is needed.
There are two reasons for not keeping the client connection alive upon a
redirect :
- save the client from uploading all data
- avoid keeping a connection alive if the redirect goes to another domain
The first case should consider an exception when all the data from the
client have been read already. This specifically happens on response
redirects after a POST to a server. This is an easy situation to detect.
It could later be improved to cover the cases where option
http-buffer-request is used.
Sometimes it's problematic not to have "http-response redirect" rules,
for example to perform a browser-based redirect based on certain server
conditions (eg: match of a header).
This patch adds "http-response redirect location <fmt>" which gives
enough flexibility for most imaginable operations. The connection to
the server is closed when this is performed so that we don't risk to
forward any pending data from the server.
Any pending response data are trimmed so that we don't risk to
forward anything pending to the client. It's harmless to also do that
for requests so we don't need to consider the direction.
In order to support http-response redirect, the parsing needs to be
adapted a little bit to only support the "location" type, and to
adjust the log-format parser so that it knows the direction of the
sample fetch calls.
This function was made to perform a redirect on requests only, it was
using a message or txn->req in an inconsistent way and did not consider
the possibility that it could be used for the other direction. Let's
clean it up to take both a request and a response message.
This patch adds an http response capture keyword with the same behavior
as the previous patch called "MEDIUM: capture: Allow capture with slot
identifier".
This patch modifies the current http-request capture function
and adds a new keyword "id" that permits to identify a capture slot.
If the identified doesn't exists, the action fails silently.
Note that this patch removs an unused list initilisation, which seems
to be inherited from a copy/paste. It's harmless and does not need to
be backported.
LIST_INIT((struct list *)&rule->arg.act.p[0]);
This patch adds "capture-req" and "capture-res". These two converters
capture their input in the allocated slot given as argument and pass
the input on to the output.
These ones were already obsoleted in 1.4, marked for removal in 1.5,
and not documented anymore. They used to emit warnings, and do still
require quite some code to stay in place. Let's remove them now.
The "name" and "name_len" arguments in function "smp_fetch_url_param"
could be left uninitialized for subsequent calls.
[wt: no backport needed, this is an 1.6 regression introduced by
commit 4fdc74c ("MINOR: http: split the url_param in two parts") ]
This patch is part of the body_param fetch work. The goal is to have a
generic url-encoded parser which can be used for parsing the query string
and the body.
There are some situations where it's desirable to scan multiple occurrences
of a same parameter name in the query string. This change ensures this can
work, even with an empty name which will then iterate over all parameters.
This patch removes the structs "session", "stream" and "proxy" from
the sample-fetches and converters function prototypes.
This permits removing some weight from the prototype calls.
There's an issue related with shutting down POST transfers or closing the
connection after the end of the upload : the shutdown is forwarded to the
server regardless of the abortonclose option. The problem it causes is that
during a scan, brute force or whatever, it becomes possible that all source
ports are exhausted with all sockets in TIME_WAIT state.
There are multiple issues at once in fact :
- no action is done for the close, it automatically happens at the lower
layers thanks to channel_auto_close(), so we cannot act on NOLINGER ;
- we *do* want to continue to send a clean shutdown in tunnel mode because
some protocols transported over HTTP may need this, regardless of option
abortonclose, thus we can't set the option unconditionally
- for all other modes, we do want to close the dirty way because we're
certain whether we've sent everything or not, and we don't want to eat
all source ports.
The solution is a bit complex and applies to DONE/TUNNEL states :
1) disable automatic close for everything not a tunnel and not just
keep-alive / server-close. Force-close is now covered, as is HTTP/1.0
which implicitly works in force-close mode ;
2) when processing option abortonclose, we know we can disable lingering
if the client has closed and the connection is not in tunnel mode.
Since the last case above leads to a situation where the client side reports
an error, we know the connection will not be reused, so leaving the flag on
the stream-interface is safe. A client closing in the middle of the data
transmission already aborts the transaction so this case is not a problem.
This fix must be backported to 1.5 where the problem was detected.
Due to the code being mostly inspired from the tcp-request parser, it
does some crap because both don't work the same way. The "len" argument
could be mismatched and then the length could be used uninitialized.
This is only possible in frontends of course, but it will finally
make it possible to capture arbitrary http parts, including URL
parameters or parts of the message body.
It's worth noting that an ugly (char **) cast had to be done to
call sample_fetch_string() which is caused by a 5- or 6- levels
of inheritance of this type in the API. Here it's harmless since
the function uses it as a const, but this API madness must be
fixed, starting with the one or two rare functions that modify
the args and inflict this on each and every keyword parser.
(cherry picked from commit 484a4f38460593919a1c1d9a047a043198d69f45)
Body processing is still fairly limited, but this is a start. It becomes
possible to apply regex to find contents in order to decide where to route
a request for example. Only the first chunk is parsed for now, and the
response is not yet available (the parsing function must be duplicated for
this).
req.body : binary
This returns the HTTP request's available body as a block of data. It
requires that the request body has been buffered and made available using
"option http-buffer-request". In case of chunked-encoded body, currently only
the first chunk is analyzed.
req.body_len : integer
This returns the length of the HTTP request's available body in bytes. It may
be lower than the advertised length if the body is larger than the buffer. It
requires that the request body has been buffered and made available using
"option http-buffer-request".
req.body_size : integer
This returns the advertised length of the HTTP request's body in bytes. It
will represent the advertised Content-Length header, or the size of the first
chunk in case of chunked encoding. In order to parse the chunks, it requires
that the request body has been buffered and made available using
"option http-buffer-request".
It is sometimes desirable to wait for the body of an HTTP request before
taking a decision. This is what is being done by "balance url_param" for
example. The first use case is to buffer requests from slow clients before
connecting to the server. Another use case consists in taking the routing
decision based on the request body's contents. This option placed in a
frontend or backend forces the HTTP processing to wait until either the whole
body is received, or the request buffer is full, or the first chunk is
complete in case of chunked encoding. It can have undesired side effects with
some applications abusing HTTP by expecting unbuffered transmissions between
the frontend and the backend, so this should definitely not be used by
default.
Note that it would not work for the response because we don't reset the
message state before starting to forward. For the response we need to
1) reset the message state to MSG_100_SENT or BODY, and 2) to reset
body_len in case of chunked encoding to avoid counting it twice.
Due to the fact that we were still considering only msg->sov for the
first byte of data after calling http_parse_chunk_size(), we used to
miscompute the input data size and to count the CRLF and the chunk size
as part of the input data. The effect is that it was possible to release
the processing with 3 or 4 missing bytes, especially if they're typed by
hand during debugging sessions. This can cause the stats page to return
some errors in admin mode, and the url_param balance algorithm to fail
to properly hash a body input.
This fix must be backported to 1.5.
Recently some browsers started to implement a "pre-connect" feature
consisting in speculatively connecting to some recently visited web sites
just in case the user would like to visit them. This results in many
connections being established to web sites, which end up in 408 Request
Timeout if the timeout strikes first, or 400 Bad Request when the browser
decides to close them first. These ones pollute the log and feed the error
counters. There was already "option dontlognull" but it's insufficient in
this case. Instead, this option does the following things :
- prevent any 400/408 message from being sent to the client if nothing
was received over a connection before it was closed ;
- prevent any log from being emitted in this situation ;
- prevent any error counter from being incremented
That way the empty connection is silently ignored. Note that it is better
not to use this unless it is clear that it is needed, because it will hide
real problems. The most common reason for not receiving a request and seeing
a 408 is due to an MTU inconsistency between the client and an intermediary
element such as a VPN, which blocks too large packets. These issues are
generally seen with POST requests as well as GET with large cookies. The logs
are often the only way to detect them.
This patch should be backported to 1.5 since it avoids false alerts and
makes it easier to monitor haproxy's status.
There's not much reason for continuing to accept HTTP/0.9 requests
nowadays except for manual testing. Now we disable support for these
by default, unless option accept-invalid-http-request is specified,
in which case they continue to be upgraded to 1.0.
While RFC2616 used to allow an undeterminate amount of digits for the
major and minor components of the HTTP version, RFC7230 has reduced
that to a single digit for each.
If a server can't properly parse the version string and falls back to 0.9,
it could then send a head-less response whose payload would be taken for
headers, which could confuse downstream agents.
Since there's no more reason for supporting a version scheme that was
never used, let's upgrade to the updated version of the standard. It is
still possible to enforce support for the old behaviour using options
accept-invalid-http-request and accept-invalid-http-response.
It would be wise to backport this to 1.5 as well just in case.
The spec mandates that content-length must be removed from messages if
Transfer-Encoding is present, not just for valid ones.
This must be backported to 1.5 and 1.4.
The rules related to how to handle a bad transfer-encoding header (one
where "chunked" is not at the final place) have evolved to mandate an
abort when this happens in the request. Previously it was only a close
(which is still valid for the server side).
This must be backported to 1.5 and 1.4.
While Transfer-Encoding is HTTP/1.1, we must still parse it in HTTP/1.0
in case an agent sends it, because it's likely that the other side might
use it as well, causing confusion. This will also result in getting rid
of the Content-Length header in such abnormal situations and in having
a clean connection.
This must be backported to 1.5 and 1.4.
RFC7230 clarified the behaviour to adopt when facing both a
content-length and a transfer-encoding: chunked in a message. While
haproxy already complied with the method for getting the message
length right, and used to detect improper content-length duplicates,
it still did not remove the content-length header when facing a
transfer-encoding: chunked. Usually it is not a problem since other
agents (clients and servers) are required to parse the message
according to the rules that have been in place since RFC2616 in
1999.
However, Régis Leroy reported the existence of at least one such
non-compliant agent, so haproxy could be abused to get out of sync
with it on pipelined requests (an HTTP request smuggling attack),
making it consider part of a payload as a subsequent request.
The best thing to do is then to remove the content-length according
to RFC7230. This used to be in the todo list, with a FIXME in the code
while waiting for the standard to stabilize; let's apply it now that
it's published.
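Schematically (an illustrative sketch, not the actual code path), the RFC7230 rule applied here is:
/* When Transfer-Encoding is present, it takes precedence and the
 * Content-Length header must be removed before forwarding, so both
 * sides always agree on where the message ends. */
enum msg_framing { FRAMING_NONE, FRAMING_CONTENT_LENGTH, FRAMING_CHUNKED };

static enum msg_framing resolve_framing(int has_te_chunked, int *has_content_length)
{
    if (has_te_chunked) {
        *has_content_length = 0;   /* drop the Content-Length header */
        return FRAMING_CHUNKED;
    }
    return *has_content_length ? FRAMING_CONTENT_LENGTH : FRAMING_NONE;
}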
Thanks to Régis for bringing that subject to our attention.
This fix must be backported to 1.5 and 1.4.
When one of these functions replaces a part of the query string with
a shorter or longer one, the header parsing is broken because the
start of the first header is not updated.
In the same way, the total length of the request line is not updated.
I don't see any bug caused by this omission, but I guess it is better
to store the correct length.
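The underlying arithmetic is simple; as an illustration (hypothetical field names), every offset recorded past the replaced area has to be shifted by the size difference:
/* after replacing a part of old_len bytes with new_len bytes inside
 * the request line, later offsets must be shifted by the delta */
struct req_offsets {
    long first_header_start;   /* start of the first header */
    long request_line_len;     /* total length of the request line */
};

static void fixup_offsets(struct req_offsets *ofs, long old_len, long new_len)
{
    long delta = new_len - old_len;

    ofs->first_header_start += delta;
    ofs->request_line_len   += delta;
}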
This bug is only in the development version.
Commit 350f487 ("CLEANUP: session: simplify references to chn_{prod,cons}(&s->{req,res})")
introduced a regression causing the cli_conn to be picked from the server
side instead of the client side, so the XFF header is not appended anymore
since the connection is NULL.
Thanks to Reinis Rozitis for reporting this bug. No backport is needed
as it's 1.6-specific.
Commit bc4c1ac ("MEDIUM: http/tcp: permit to resume http and tcp custom
actions") introduced the ability to interrupt and restart processing in
the middle of a TCP/HTTP ruleset. But it doesn't do it in a consistent
way : it checks current_rule_list, immediately dereferences current_rule,
which is only set in certain cases and never cleared. So that broke the
tcp-request content rules when the processing was interrupted due to
missing data, because current_rule was not yet set (segfault) or could
have been inherited from another ruleset if it was used in a backend
(random behaviour).
The proper way to do it is to always set current_rule before dereferencing
it. But we don't want to set it for all rules because we don't want any
action to provide a checkpointing mechanism. So current_rule is set to NULL
before entering the loop, and only used if not NULL and if current_rule_list
matches the current list. This way they both serve as a guard for the other
one. This fix also makes the current rule point to the rule instead of its
list element, as it's much easier to manipulate.
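A stripped-down sketch of the guard described above (illustrative types and names, not the real HAProxy structures):
struct act_rule { struct act_rule *next; };

struct rule_state {
    const void      *current_rule_list;  /* ruleset we were interrupted in */
    struct act_rule *current_rule;       /* rule to resume from, or NULL */
};

/* current_rule is only trusted when it is non-NULL AND current_rule_list
 * still points to the ruleset being evaluated; otherwise we start from
 * the first rule. current_rule is cleared before entering the loop. */
static struct act_rule *pick_start_rule(struct rule_state *st,
                                        const void *rules,
                                        struct act_rule *first)
{
    struct act_rule *start = first;

    if (st->current_rule && st->current_rule_list == rules)
        start = st->current_rule;
    st->current_rule = NULL;
    return start;
}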
No backport is needed, this is 1.6-specific.
This patch adds support for error codes 429 and 405 to Haproxy and a
"deny_status XXX" option to "http-request deny" with which you can specify
which code is returned, 403 being the default. We really want to do this
the "haproxy way" and hope to have this patch included in the mainline.
We'll be happy to address any feedback on how this is implemented.
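As a sketch of the behaviour being proposed (illustrative code, not the submitted patch), the configured status simply overrides the historical 403 default:
/* only the statuses for which an error message exists are honoured;
 * anything else falls back to the default 403 */
static int resolve_deny_status(int configured_status)
{
    switch (configured_status) {
    case 403:
    case 405:
    case 429:
        return configured_status;
    default:
        return 403;
    }
}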
Many such functions need a session, and till now they used to dereference
the stream. Once we remove the stream from the embryonic session, this
will not be possible anymore.
So as of now, sample fetch functions will be called with one of the
following combinations (see the sketch after this list) :
- sess = NULL, strm = NULL : never
- sess = valid, strm = NULL : tcp-req connection
- sess = valid, strm = valid, strm->txn = NULL : tcp-req content
- sess = valid, strm = valid, strm->txn = valid : http-req / http-res
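In practice, every sample fetch function ends up following the same defensive pattern; here is a sketch with simplified types (not HAProxy's real prototypes):
struct http_txn { int flags; };
struct stream   { struct http_txn *txn; };
struct session  { int flags; };

/* returns 0 when the required context is not available yet */
static int fetch_example(struct session *sess, struct stream *strm)
{
    if (!sess)
        return 0;        /* never happens, per the list above */
    if (!strm)
        return 0;        /* tcp-request connection: no stream yet */
    if (!strm->txn)
        return 0;        /* tcp-request content: no HTTP transaction yet */
    return 1;            /* http-request / http-response: all valid */
}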
The registerable http_req_rules / http_res_rules used to require a
struct http_txn at the end. It's redundant with struct stream and
propagates very deep into some parts (ie: it was the reason for lua
requiring l7). Let's remove it now.
All of them can now retrieve the HTTP transaction *if it exists* from
the stream and be sure to get NULL there when called with an embryonic
session.
The patch is a bit large because many locations were touched (all fetch
functions had to have their prototype adjusted). The opportunity was
taken to also uniformize the call names (the stream is now always "strm"
instead of "l4") and to fix indent where it was broken. This way when
we later introduce the session here there will be less confusion.
Now this one is dynamically allocated. It means that 280 bytes of memory
are saved per TCP stream, but more importantly that it will become
possible to remove the l7 pointer from fetches and converters since
it will be deduced from the stream and will support being null.
A lot of care was taken because it's easy to forget a test somewhere,
and the previous code used to always trust s->txn for being valid, but
all places seem to have been visited.
All HTTP fetch functions check the txn first so we shouldn't have any
issue there even when called from TCP. When branching from a TCP frontend
to an HTTP backend, the txn is properly allocated at the same time as the
hdr_idx.
This one will not necessarily be allocated for each stream, and we want
to use the fact that it equals null to know it's not present so that we
can always deduce its presence from the stream pointer.
This commit only creates the new pool.
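A minimal sketch of what creating such a pool looks like with the pool API HAProxy had at the time (create_pool()/pool_alloc2(), assuming the relevant internal headers are included; the exact variable name and the flags used by the real commit are assumptions here):
struct pool_head *pool2_http_txn;

/* called once at startup */
static void init_http_txn_pool(void)
{
    pool2_http_txn = create_pool("http_txn", sizeof(struct http_txn),
                                 MEM_F_SHARED);
}

/* the transaction is now allocated on demand and may legitimately
 * be NULL for a stream that never speaks HTTP */
static struct http_txn *alloc_http_txn(void)
{
    return pool_alloc2(pool2_http_txn);
}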
The header captures are now general purpose captures since tcp rules
can use them to capture various contents. That removes a dependency
on http_txn that appeared in some sample fetch functions and in the
order by which captures and http_txn were allocated.
Interestingly, the reset of the header captures was done in too many
places: http_init_txn() used to do it while it was already done at
every call place.
The stream may never be null given that all these functions are called
from sample_process(). Let's remove this now confusing test which
sometimes happens after a dereference was already done.
Wherever s->si[0].end was dereferenced as a connection or anything else
in order to retrieve information about the originating session, we now
use sess->origin instead so that when we have to chain multiple
streams in HTTP/2, we'll keep accessing the same origin.
Just like for the listener, the frontend is session-wide so let's move
it to the session. There are a lot of places which were changed but the
changes are minimal in fact.
With HTTP/2, we'll have to support multiplexed streams. A stream is in
fact the largest part of what we currently call a session, it has buffers,
logs, etc.
In order to catch any error, this commit removes any reference to the
struct session and tries to rename most "session" occurrences in function
names to "stream" and "sess" to "strm" when that's related to a session.
The files stream.{c,h} were added and session.{c,h} removed.
The session will be reintroduced later and a few parts of the stream
will progressively be moved over there. It will more or less contain
only what we need in an embryonic session.
Sample fetch functions and converters will have to change a bit so
that they'll use an L5 (session) instead of what's currently called
"L4" which is in fact L6 for now.
Once all changes are completed, we should see approximately this :
L7 - http_txn
L6 - stream
L5 - session
L4 - connection | applet
There will be at most one http_txn per stream, and a same session will
possibly be referenced by multiple streams. A connection will point to
a session and to a stream. The session will hold all the information
we need to keep even when we don't yet have a stream.
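Schematically, the target relationships can be pictured like this (illustrative struct layout only, not the real definitions):
struct http_txn;                                      /* L7 */
struct session;

struct connection { struct session *sess; };          /* L4: points to its session */
struct session    { void *origin; };                  /* L5: outlives its streams */
struct stream     { struct session  *sess;            /* L6: many streams may */
                    struct http_txn *txn; };          /*     share one session, at most one txn each */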
Some more cleanup is needed because some code was already far from
being clean. The server queue management still refers to sessions at
many places while comments talk about connections. This will have to
be cleaned up once we have a server-side connection pool manager.
Stream flags "SN_*" still need to be renamed, it doesn't seem like
any of them will need to move to the session.
Commit a0dc23f ("MEDIUM: http: implement http-request set-{method,path,query,uri}")
forgot to null-terminate the list, resulting in crashes when these actions
are used if the platform doesn't pad the struct with nulls.
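For illustration (simplified types; the real entries carry parsing callbacks), the fix amounts to adding the terminating entry the keyword scanner stops on:
#include <stddef.h>

struct action_kw {
    const char *kw;
    int (*parse)(const char **args);
};

static struct action_kw http_req_actions[] = {
    { "set-method", NULL },
    { "set-path",   NULL },
    { "set-query",  NULL },
    { "set-uri",    NULL },
    { NULL, NULL }          /* the missing terminator added by the fix */
};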
Thanks to Gunay Arslan for reporting a detailed trace showing the
origin of this bug.
No backport to 1.5 is needed.
It's documented that these sample fetch functions should count all headers
and/or all values when called with no name, but in practice that's not what
is being done: a missing name causes an immediate return and an absence of
result.
This bug is present in 1.5 as well and must be backported.
Thanks to MSIE/IIS, the "deflate" name is ambiguous. According to the RFC
it's a zlib-wrapped deflate stream, but IIS used to send only a raw deflate
stream, which is the only format MSIE understands for "deflate". The other
widely used browsers do support both formats. For this reason some people
prefer to emit a raw deflate stream on "deflate" to serve more users even
if that means violating the standards. Haproxy only follows the standard,
so they cannot do this.
This patch makes it possible to have one algorithm name in the configuration
and another one in the protocol. This will make it possible to have a new
configuration token to add a different algorithm so that users can decide if
they want a raw deflate or the standard one.
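Conceptually (the field names here are assumptions, not the exact ones from the patch), each algorithm now carries two names:
struct comp_algo_name {
    const char *cfg_name;    /* token written in the configuration */
    const char *http_name;   /* token advertised in Content-Encoding */
};

/* a future "raw-deflate" config token could then still advertise
 * "deflate" on the wire for MSIE/IIS compatibility */
static const struct comp_algo_name example = { "raw-deflate", "deflate" };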
This function is a callback for HTTP actions. It creates the replacement
string from a build_logline() format and transforms the header.
This patch splits this function in two parts. With this modification,
the header transformation and the building of the replacement string
are separated. We can now transform the header with a replacement
string coming from another source than a build_logline() format.
The first part is the replacement engine. It takes a replacement action
number and a replacement string and processes the action.
The second part is the function which is called by the 'http-request
action' to replace a request line part. This function builds the
string used as the replacement.
This split permits the replacement engine to be used in other parts of
the code than the request action. The Lua code uses it for its own HTTP
action.
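A rough sketch of the resulting structure (every identifier here is hypothetical; only the two-part split mirrors the patch):
#include <string.h>

/* part 1: the replacement engine. It only needs the action number and a
 * ready-to-use replacement string, so Lua can call it directly. */
static int replace_req_part(int action, const char *repl, size_t len)
{
    (void)action;
    (void)repl;
    (void)len;
    /* rewrite the designated part of the request line here */
    return 0;
}

/* part 2: the 'http-request' callback: it only builds the replacement
 * string (from a build_logline() format in HAProxy) and delegates */
static int action_replace_req_part(int action, const char *built_string)
{
    return replace_req_part(action, built_string, strlen(built_string));
}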
These functions used an invalid header parser:
- the trailing white-spaces were embedded in the replacement regex,
- values enclosed in double-quotes (") and containing commas (,) were
not respected.
This patch replaces this parser with the "official" parser http_find_header2().
The function http_replace_value used the wrong variable to detect the end
of the input string.
This is a regression introduced by the patch "MEDIUM: regex: Remove null
terminated strings." (c9c2daf2)
We need to backport this patch into the 1.5 stable branch.
WT: there is no possibility to overwrite existing data as we only read
past the end of the request buffer, to copy into the trash. The copy
is bounded by buffer_replace2(), just like the replacement performed
by exp_replace(). However, if a buffer happens to contain non-zero data
up to the next unmapped page boundary, there's a theoretical risk of
crashing the process, despite this not being reproducible in tests.
The risk is low because "http-request replace-value" did not work due
to this bug, which probably means it's not used yet.