Commit Graph

1231 Commits

Author SHA1 Message Date
Thierry FOURNIER
85c6c97830 MINOR: action: add reference to the original keywork matched for the called parser.
This is usefull because the keyword can contains some condifiguration
data set while the keyword registration.
2015-09-23 21:44:23 +02:00
Willy Tarreau
c29d0cda4b BUG/MEDIUM: http: do not dereference strm_li(stream)
Some streams do not have a listener (eg: Lua's cosockets) so
let's check for this. For now this problem cannot happen but
it's definitely unsafe.
2015-09-23 13:42:08 +02:00
James Rosewell
91a41cb32d MINOR: http: made CHECK_HTTP_MESSAGE_FIRST accessible to other functions
Added the definition of CHECK_HTTP_MESSAGE_FIRST and the declaration of
smp_prefetch_http to the header.

Changed smp_prefetch_http implementation to remove the static qualifier.
2015-09-21 12:05:26 +02:00
Willy Tarreau
b7ce424be2 BUG/MINOR: http: remove stupid HTTP_METH_NONE entry
When converting the "method" fetch to a string, we used to get an empty
string if the first character was not an upper case. This was caused by
the lookup function which returns HTTP_METH_NONE when a lookup is not
possible, and this method being mapped to an empty string in the array.

This is a totally stupid mechanism, there's no reason for having the
result depend on the first char. In fact the message parser already
checks that the syntax matches an HTTP token so we can only land there
with a valid token, hence only HTTP_METH_OTHER should be returned.

This fix should be backported to all actively supported branches.
2015-09-03 17:15:21 +02:00
Thierry FOURNIER
42148735bc MEDIUM: actions: remove ACTION_STOP
Before this patch, two type of custom actions exists: ACT_ACTION_CONT and
ACT_ACTION_STOP. ACT_ACTION_CONT is a non terminal action and ACT_ACTION_STOP is
a terminal action.

Note that ACT_ACTION_STOP is not used in HAProxy.

This patch remove this behavior. Only type type of custom action exists, and it
is called ACT_CUSTOM. Now, the custion action can return a code indicating the
required behavior. ACT_RET_CONT wants that HAProxy continue the current rule
list evaluation, and ACT_RET_STOP wants that HAPRoxy stops the the current rule
list evaluation.
2015-09-02 18:36:38 +02:00
Willy Tarreau
bd99d5818d BUG/MAJOR: http: don't manipulate the server connection if it's killed
Jesse Hathaway reported a crash that Cyril Bonté diagnosed as being
caused by the manipulation of srv_conn after setting it to NULL. This
happens in http-server-close mode when the server returns either a 401
or a 407, because the connection was previously closed then it's being
assigned the CO_FL_PRIVATE flag.

This bug only affects 1.6-dev as it was introduced by connection reuse code
with commit 387ebf8 ("MINOR: connection: add a new flag CO_FL_PRIVATE").
2015-09-02 10:52:05 +02:00
Thierry FOURNIER
35d70efc33 MINOR: http: Action for manipulating the returned status code.
This patch is inspired by Bowen Ni's proposal and it is based on his first
implementation:

   With Lua integration in HAProxy 1.6, one can change the request method,
   path, uri, header, response header etc except response line.
   I'd like to contribute the following methods to allow modification of the
   response line.

   [...]

   There are two new keywords in 'http-response' that allows you to rewrite
   them in the native HAProxy config. There are also two new APIs in Lua that
   allows you to do the same rewriting in your Lua script.

   Example:
   Use it in HAProxy config:
   *http-response set-code 404*
   Or use it in Lua script:
   *txn.http:res_set_reason("Redirect")*

I dont take the full patch because the manipulation of the "reason" is useless.
standard reason are associated with each returned code, and unknown code can
take generic reason.

So, this patch can set the status code, and the reason is automatically adapted.
2015-08-27 14:29:44 +02:00
Thierry FOURNIER
3f4bc65a22 DOC: fix "http_action_set_req_line()" comments
Bowen repports errors about http_action_set_req_line() comments.
Some other errors appears from the patches about "actions" reorganisation.
2015-08-27 11:31:19 +02:00
Thierry FOURNIER
afa80496db MEDIUM: actions: Normalize the return code of the configuration parsers
This patch normalize the return code of the configuration parsers. Before
these changes, the tcp action parser returned -1 if fail and 0 for the
succes. The http action returned 0 if fail and 1 if succes.

The normalisation does:
 - ACT_RET_PRS_OK for succes
 - ACT_RET_PRS_ERR for failure
2015-08-20 17:13:47 +02:00
Thierry FOURNIER
322a124867 MINOR: actions: mutualise the action keyword lookup
Each (http|tcp)-(request|response) action use the same method
for looking up the action keyword during the cofiguration parsing.

This patch mutualize the code.
2015-08-20 17:13:47 +02:00
Thierry FOURNIER
36481b8667 MEDIUM: actions: Merge (http|tcp)-(request|reponse) keywords structs
This patch merges the conguration keyword struct. Each declared configuration
keyword struct are similar with the others. This patch simplify the code.
2015-08-20 17:13:47 +02:00
Thierry FOURNIER
24ff6c6fce MEDIUM: actions: Add standard return code for the action API
Action function can return 3 status:
 - error if the action encounter fatal error (like out of memory)
 - yield if the action must terminate his work later
 - continue in other cases
2015-08-20 17:13:47 +02:00
Thierry FOURNIER
0ea5c7fafa MINOR: actions: change actions names
For performances considerations, some actions are not processed by remote
function. They are directly processed by the function. Some of these actions
does the same things but for different processing part (request / response).

This patch give the same name for the same actions, and change the normalization
of the other actions names.

This patch is ONLY a rename, it doesn't modify the code.
2015-08-20 17:13:47 +02:00
Thierry FOURNIER
91f6ba0f2c MINOR: actions: Declare all the embedded actions in the same header file
This patch group the action name in one file. Some action are called
many times and need an action embedded in the action caller. The main
goal is to have only one header file grouping all definitions.
2015-08-20 17:13:47 +02:00
Thierry FOURNIER
22e49011b1 MINOR: actions: remove the mark indicating the last entry in enum
This mark permit to detect if the action tag is over the allowed range.
 - Normally, this case doesn't appear
 - If it appears, it is processed by ded fault case of the switch
2015-08-20 17:13:47 +02:00
Thierry FOURNIER
5563e4b469 MINOR: actions: add "from" information
This struct member is used to specify who is the rule caller. It permits
to use one function for differents callers.
2015-08-20 17:13:47 +02:00
Thierry FOURNIER
5ec63e008d MEDIUM: track-sc: Move the track-sc configuration storage in the union
This patch moves the track-sc configuration struct (track_ctr_prm) in the main
"arg" union. This reduce the size od the struct.
2015-08-20 17:13:47 +02:00
Thierry FOURNIER
e209797ef0 MINOR: proto_http: replace generic opaque types by real used types in "http_capture" by id
This patch removes the generic opaque type for storing the configuration of the
action "http_capture" by id.
2015-08-20 17:13:46 +02:00
Thierry FOURNIER
32b15003fe MINOR: proto_http: replace generic opaque types by real used types in "http_capture"
This patch removes the generic opaque type for storing the configuration of the
action "http_capture"".
2015-08-20 17:13:46 +02:00
Thierry FOURNIER
8855a92d8c MINOR: proto_http: replace generic opaque types by real used types for the actions on thr request line
This patch removes the generic opaque type for storing the configuration of the
action "set-method", "set-path", "set-query" and "set-uri".
2015-08-20 17:13:46 +02:00
Thierry FOURNIER
a002dc9df8 MINOR: proto_http: use an "expr" type in place of generic opaque type.
This patch removes the generic opaque type for storing the configuration of the
acion "set-src" (HTTP_REQ_ACT_SET_SRC), and use the dedicated type "struct expr"
2015-08-20 17:13:46 +02:00
Thierry FOURNIER
a28a9429b2 MEDIUM: actions: Merge (http|tcp)-(request|reponse) action structs
This patch is the first of a serie which merge all the action structs. The
function "tcp-request content", "tcp-response-content", "http-request" and
"http-response" have the same values and the same process for some defined
actions, but the struct and the prototype of the declared function are
different.

This patch try to unify all of these entries.
2015-08-20 17:13:46 +02:00
Thierry FOURNIER
136f9d34a9 MINOR: samples: rename union from "data" to "u"
The union name "data" is a little bit heavy while we read the source
code because we can read "data.data.sint". The rename from "data" to "u"
makes the read easiest like "data.u.sint".
2015-08-20 17:13:46 +02:00
Thierry FOURNIER
8c542cac07 MEDIUM: samples: Use the "struct sample_data" in the "struct sample"
This patch remove the struct information stored both in the struct
sample_data and in the striuct sample. Now, only thestruct sample_data
contains data, and the struct sample use the struct sample_data for storing
his own data.
2015-08-20 17:13:46 +02:00
Thierry FOURNIER
a6b6343cff CLEANUP: http/tcp actions: remove the scope member
The scope member is not used. This patch removes this entry.
2015-08-11 13:44:53 +02:00
Thierry FOURNIER
9b49f589ed CLEANUP: proto_http: remove useless initialisation
This initialisation of the opaque array is useless.
2015-08-11 13:44:51 +02:00
Willy Tarreau
53a09d520e MAJOR: http: remove references to appsession
appsessions started to be deprecated with the introduction of stick
tables, and the latter are much more powerful and flexible, and in
addition they are replicated between nodes and maintained across
reloads. Let's now remove appsession completely.
2015-08-10 19:16:18 +02:00
Willy Tarreau
449d74a906 MEDIUM: backend: add the "http-reuse aggressive" strategy
This strategy is less extreme than "always", it only dispatches first
requests to validated reused connections, and moves a connection from
the idle list to the safe list once it has seen a second request, thus
proving that it could be reused.
2015-08-06 16:29:01 +02:00
Willy Tarreau
8dff998b91 MAJOR: backend: initial work towards connection reuse
In connect_server(), if we don't have a connection attached to the
stream-int, we first look into the server's idle_conns list and we
pick the first one there, we detach it from its owner if it had one.
If we used to have a connection, we close it.

This mechanism works well but doesn't scale : as servers increase,
the likeliness that the connection attached to the stream interface
doesn't match the server and gets closed increases.
2015-08-06 11:34:21 +02:00
Willy Tarreau
387ebf84dd MINOR: connection: add a new flag CO_FL_PRIVATE
This flag is set on an outgoing connection when this connection gets
some properties that must not be shared with other connections, such
as dynamic transparent source binding, SNI or a proxy protocol header,
or an authentication challenge from the server. This will be needed
later to implement connection reuse.
2015-08-06 11:14:17 +02:00
Willy Tarreau
4320eaac62 MINOR: stream-int: make si_idle_conn() only accept valid connections
This function is now dedicated to idle connections only, which means
that it must not be used without any endpoint nor anything not a
connection. The connection remains attached to the stream interface.
2015-08-06 11:11:10 +02:00
Willy Tarreau
323a2d925c MEDIUM: stream-int: queue idle connections at the server
Now we get a per-server list of all idle connections. That way we'll
be able to reclaim them upon shortage later.
2015-08-06 11:06:25 +02:00
Willy Tarreau
973a54235f MEDIUM: stream-int: simplify si_alloc_conn()
Since we now always call this function with the reuse parameter cleared,
let's simplify the function's logic as it cannot return the existing
connection anymore. The savings on this inline function are appreciable
(240 bytes) :

$ size haproxy.old haproxy.new
   text    data     bss     dec     hex filename
1020383   40816   36928 1098127  10c18f haproxy.old
1020143   40816   36928 1097887  10c09f haproxy.new
2015-08-05 21:51:09 +02:00
Thierry FOURNIER
bf65cd4d77 MAJOR: arg: converts uint and sint in sint
This patch removes the 32 bits unsigned integer and the 32 bit signed
integer. It replaces these types by a unique type 64 bit signed.
2015-07-22 00:48:23 +02:00
Thierry FOURNIER
07ee64ef4d MAJOR: sample: converts uint and sint in 64 bits signed integer
This patch removes the 32 bits unsigned integer and the 32 bit signed
integer. It replaces these types by a unique type 64 bit signed.

This makes easy the usage of integer and clarify signed and unsigned use.
With the previous version, signed and unsigned are used ones in place of
others, and sometimes the converter loose the sign. For example, divisions
are processed with "unsigned", if one entry is negative, the result is
wrong.

Note that the integer pattern matching and dotted version pattern matching
are already working with signed 64 bits integer values.

There is one user-visible change : the "uint()" and "sint()" sample fetch
functions which used to return a constant integer have been replaced with
a new more natural, unified "int()" function. These functions were only
introduced in the latest 1.6-dev2 so there's no impact on regular
deployments.
2015-07-22 00:48:23 +02:00
Thierry FOURNIER
fac9ccfb70 BUG/MINOR: http/sample: gmtime/localtime can fail
The man said that gmtime() and localtime() can return a NULL value.
This is not tested. It appears that all the values of a 32 bit integer
are valid, but it is better to check the return of these functions.

However, if the integer move from 32 bits to 64 bits, some 64 values
can be unsupported.
2015-07-20 12:21:35 +02:00
Adis Nezirovic
2fbcafc9ce MEDIUM: http: Add new 'set-src' option to http-request
This option enables overriding source IP address in a HTTP request. It is
useful when we want to set custom source IP (e.g. front proxy rewrites address,
but provides the correct one in headers) or we wan't to mask source IP address
for privacy or compliance.

It acts on any expression which produces correct IP address.
2015-07-06 16:17:28 +02:00
Adis Nezirovic
79beb248b9 CLEANUP: sample: generalize sample_fetch_string() as sample_fetch_as_type()
This modification makes possible to use sample_fetch_string() in more places,
where we might need to fetch sample values which are not plain strings. This
way we don't need to fetch string, and convert it into another type afterwards.

When using aliased types, the caller should explicitly check which exact type
was returned (e.g. SMP_T_IPV4 or SMP_T_IPV6 for SMP_T_ADDR).

All usages of sample_fetch_string() are converted to use new function.
2015-07-06 16:17:25 +02:00
Thierry FOURNIER
4834bc773c MEDIUM: vars: adds support of variables
This patch adds support of variables during the processing of each stream. The
variables scope can be set as 'session', 'transaction', 'request' or 'response'.
The variable type is the type returned by the assignment expression. The type
can change while the processing.

The allocated memory can be controlled for each scope and each request, and for
the global process.
2015-06-13 23:01:37 +02:00
Thierry FOURNIER
0e11863a6f MINOR: tcp/http/conf: extends the keyword registration options
This patch permits to register a new keyword with the keyword "tcp-request content"
'tcp-request connection", tcp-response content", http-request" and "http-response"
which is identified only by matching the start of the keyword.

for example, we register the keyword "set-var" with the option "match_pfx"
and the configuration keyword "set-var(var_name)" matchs this entry.
2015-06-13 23:01:37 +02:00
Willy Tarreau
b8cdf52da0 BUG/MEDIUM: http: fix body processing for the stats applet
Commit 9fbe18e ("MEDIUM: http: add a new option http-buffer-request")
introduced a regression due to a misplaced check causing the admin
mode of the HTTP stats not to work anymore.

This patch tried to ensure that when we need a request body for the
stats applet, and we have already waited for this body, we don't wait
for it again, but the condition was applied too early causing a
disabling of the entire processing the body, and based on the wrong
HTTP state (MSG_BODY) resulting in the test never matching.

Thanks to Chad Lavoie for reporting the problem.

This bug is 1.6-only, no backport is needed.
2015-05-29 01:12:38 +02:00
Willy Tarreau
2de8a50918 MEDIUM: http: no need to close the request on redirect if data was parsed
There are two reasons for not keeping the client connection alive upon a
redirect :
  - save the client from uploading all data
  - avoid keeping a connection alive if the redirect goes to another domain

The first case should consider an exception when all the data from the
client have been read already. This specifically happens on response
redirects after a POST to a server. This is an easy situation to detect.

It could later be improved to cover the cases where option
http-buffer-request is used.
2015-05-28 17:45:43 +02:00
Willy Tarreau
51d861a44f MEDIUM: http: implement http-response redirect rules
Sometimes it's problematic not to have "http-response redirect" rules,
for example to perform a browser-based redirect based on certain server
conditions (eg: match of a header).

This patch adds "http-response redirect location <fmt>" which gives
enough flexibility for most imaginable operations. The connection to
the server is closed when this is performed so that we don't risk to
forward any pending data from the server.

Any pending response data are trimmed so that we don't risk to
forward anything pending to the client. It's harmless to also do that
for requests so we don't need to consider the direction.
2015-05-28 17:45:43 +02:00
Willy Tarreau
be4653b6d4 MINOR: http: prepare support for parsing redirect actions on responses
In order to support http-response redirect, the parsing needs to be
adapted a little bit to only support the "location" type, and to
adjust the log-format parser so that it knows the direction of the
sample fetch calls.
2015-05-28 17:43:11 +02:00
Willy Tarreau
b329a312e3 CLEANUP: http: explicitly reference request in http_apply_redirect_rules()
This function was made to perform a redirect on requests only, it was
using a message or txn->req in an inconsistent way and did not consider
the possibility that it could be used for the other direction. Let's
clean it up to have both a request and a response messages.
2015-05-28 17:42:16 +02:00
Thierry FOURNIER
e80fadaaca MEDIUM: capture: adds http-response capture
This patch adds a http response capture keyword with the same behavior
as the previous patch called "MEDIUM: capture: Allow capture with slot
identifier".
2015-05-28 13:51:00 +02:00
Thierry FOURNIER
82bf70dff4 MEDIUM: capture: Allow capture with slot identifier
This patch modifies the current http-request capture function
and adds a new keyword "id" that permits to identify a capture slot.
If the identified doesn't exists, the action fails silently.

Note that this patch removs an unused list initilisation, which seems
to be inherited from a copy/paste. It's harmless and does not need to
be backported.

   LIST_INIT((struct list *)&rule->arg.act.p[0]);
2015-05-28 13:50:29 +02:00
Thierry FOURNIER
35ab27561e MINOR: capture: add two "capture" converters
This patch adds "capture-req" and "capture-res". These two converters
capture their entry in the allocated slot given in argument and pass
the input on the output.
2015-05-28 13:50:29 +02:00
Willy Tarreau
98d0485a90 MAJOR: config: remove the deprecated reqsetbe / reqisetbe actions
These ones were already obsoleted in 1.4, marked for removal in 1.5,
and not documented anymore. They used to emit warnings, and do still
require quite some code to stay in place. Let's remove them now.
2015-05-26 12:18:29 +02:00
Dragan Dosen
26f77e534c BUG/MEDIUM: http: fix the url_param fetch
The "name" and "name_len" arguments in function "smp_fetch_url_param"
could be left uninitialized for subsequent calls.

[wt: no backport needed, this is an 1.6 regression introduced by
 commit 4fdc74c ("MINOR: http: split the url_param in two parts") ]
2015-05-25 19:01:39 +02:00
Thierry FOURNIER
8be451c52a MEDIUM: http: url-encoded parsing function can run throught wrapped buffer
The functions smp_fetch_param(), find_next_url_param() and
find_url_param_pos() can look for argument in 2 chunks and not only
one.
2015-05-20 16:05:38 +02:00
Thierry FOURNIER
e28c49975a MINOR: http: add body_param fetch
This fetch returns one body param or the list of each body param.
This first version runs only with one chunk.
2015-05-20 15:56:23 +02:00
Thierry FOURNIER
0948d41a12 CLEANUP: http: bad indentation
Some function argument uses space in place of tabulation
for the indentation.
2015-05-20 15:56:23 +02:00
Thierry FOURNIER
4fdc74c22c MINOR: http: split the url_param in two parts
This patch is the part of the body_param fetch. The goal is to have
generic url-encoded parser which can used for parsing the query string
and the body.
2015-05-20 15:56:23 +02:00
Willy Tarreau
1ede1daab6 MEDIUM: http: make url_param iterate over multiple occurrences
There are some situations hwere it's desirable to scan multiple occurrences
of a same parameter name in the query string. This change ensures this can
work, even with an empty name which will then iterate over all parameters.
2015-05-19 13:16:07 +02:00
Thierry FOURNIER
0786d05a04 MEDIUM: sample: change the prototype of sample-fetches functions
This patch removes the "opt" entry from the prototype of the
sample-fetches fucntions. This permits to remove some weight
in the prototype call.
2015-05-11 20:03:08 +02:00
Thierry FOURNIER
0a9a2b8cec MEDIUM: sample change the prototype of sample-fetches and converters functions
This patch removes the structs "session", "stream" and "proxy" from
the sample-fetches and converters function prototypes.

This permits to remove some weight in the prototype call.
2015-05-11 20:01:42 +02:00
Willy Tarreau
bbfb6c4085 BUG/MEDIUM: http: don't forward client shutdown without NOLINGER except for tunnels
There's an issue related with shutting down POST transfers or closing the
connection after the end of the upload : the shutdown is forwarded to the
server regardless of the abortonclose option. The problem it causes is that
during a scan, brute force or whatever, it becomes possible that all source
ports are exhausted with all sockets in TIME_WAIT state.

There are multiple issues at once in fact :
  - no action is done for the close, it automatically happens at the lower
    layers thanks for channel_auto_close(), so we cannot act on NOLINGER ;

  - we *do* want to continue to send a clean shutdown in tunnel mode because
    some protocols transported over HTTP may need this, regardless of option
    abortonclose, thus we can't set the option inconditionally

  - for all other modes, we do want to close the dirty way because we're
    certain whether we've sent everything or not, and we don't want to eat
    all source ports.

The solution is a bit complex and applies to DONE/TUNNEL states :

  1) disable automatic close for everything not a tunnel and not just
     keep-alive / server-close. Force-close is now covered, as is HTTP/1.0
     which implicitly works in force-close mode ;

  2) when processing option abortonclose, we know we can disable lingering
     if the client has closed and the connection is not in tunnel mode.

Since the last case above leads to a situation where the client side reports
an error, we know the connection will not be reused, so leaving the flag on
the stream-interface is safe. A client closing in the middle of the data
transmission already aborts the transaction so this case is not a problem.

This fix must be backported to 1.5 where the problem was detected.
2015-05-11 19:05:42 +02:00
Thierry FOURNIER
82ff3c9b05 MINOR: sample: add url_dec converter
This converter decodes an url-encoded string. It takes a string as
input and returns string as output.
2015-05-11 11:40:36 +02:00
Willy Tarreau
3986ac1860 BUG/MEDIUM: http: fix the http-request capture parser
Due to the code being mostly inspired from the tcp-request parser, it
does some crap because both don't work the same way. The "len" argument
could be mismatched and then the length could be used uninitialized.
2015-05-08 16:13:42 +02:00
Willy Tarreau
a9083d0722 MEDIUM: http: add new "capture" action for http-request
This is only possible in frontends of course, but it will finally
make it possible to capture arbitrary http parts, including URL
parameters or parts of the message body.

It's worth noting that an ugly (char **) cast had to be done to
call sample_fetch_string() which is caused by a 5- or 6- levels
of inheritance of this type in the API. Here it's harmless since
the function uses it as a const, but this API madness must be
fixed, starting with the one or two rare functions that modify
the args and inflict this on each and every keyword parser.
(cherry picked from commit 484a4f38460593919a1c1d9a047a043198d69f45)
2015-05-08 15:43:54 +02:00
Willy Tarreau
a5910cc6ef MEDIUM: http: provide 3 fetches for the body
Body processing is still fairly limited, but this is a start. It becomes
possible to apply regex to find contents in order to decide where to route
a request for example. Only the first chunk is parsed for now, and the
response is not yet available (the parsing function must be duplicated for
this).

req.body : binary
  This returns the HTTP request's available body as a block of data. It
  requires that the request body has been buffered made available using
  "option http-buffer-request". In case of chunked-encoded body, currently only
  the first chunk is analyzed.

req.body_len : integer
  This returns the length of the HTTP request's available body in bytes. It may
  be lower than the advertised length if the body is larger than the buffer. It
  requires that the request body has been buffered made available using
  "option http-buffer-request".

req.body_size : integer
  This returns the advertised length of the HTTP request's body in bytes. It
  will represent the advertised Content-Length header, or the size of the first
  chunk in case of chunked encoding. In order to parse the chunks, it requires
  that the request body has been buffered made available using
  "option http-buffer-request".
2015-05-02 00:46:08 +02:00
Willy Tarreau
9fbe18e174 MEDIUM: http: add a new option http-buffer-request
It is sometimes desirable to wait for the body of an HTTP request before
taking a decision. This is what is being done by "balance url_param" for
example. The first use case is to buffer requests from slow clients before
connecting to the server. Another use case consists in taking the routing
decision based on the request body's contents. This option placed in a
frontend or backend forces the HTTP processing to wait until either the whole
body is received, or the request buffer is full, or the first chunk is
complete in case of chunked encoding. It can have undesired side effects with
some applications abusing HTTP by expecting unbufferred transmissions between
the frontend and the backend, so this should definitely not be used by
default.

Note that it would not work for the response because we don't reset the
message state before starting to forward. For the response we need to
1) reset the message state to MSG_100_SENT or BODY , and 2) to reset
body_len in case of chunked encoding to avoid counting it twice.
2015-05-02 00:10:44 +02:00
Willy Tarreau
e115b49c39 BUG/MEDIUM: http: wait for the exact amount of body bytes in wait_for_request_body
Due to the fact that we were still considering only msg->sov for the
first byte of data after calling http_parse_chunk_size(), we used to
miscompute the input data size and to count the CRLF and the chunk size
as part of the input data. The effect is that it was possible to release
the processing with 3 or 4 missing bytes, especially if they're typed by
hand during debugging sessions. This can cause the stats page to return
some errors in admin mode, and the url_param balance algorithm to fail
to properly hash a body input.

This fix must be backported to 1.5.
2015-05-01 23:24:32 +02:00
Willy Tarreau
0f228a037a MEDIUM: http: add option-ignore-probes to get rid of the floods of 408
Recently some browsers started to implement a "pre-connect" feature
consisting in speculatively connecting to some recently visited web sites
just in case the user would like to visit them. This results in many
connections being established to web sites, which end up in 408 Request
Timeout if the timeout strikes first, or 400 Bad Request when the browser
decides to close them first. These ones pollute the log and feed the error
counters. There was already "option dontlognull" but it's insufficient in
this case. Instead, this option does the following things :
   - prevent any 400/408 message from being sent to the client if nothing
     was received over a connection before it was closed ;
   - prevent any log from being emitted in this situation ;
   - prevent any error counter from being incremented

That way the empty connection is silently ignored. Note that it is better
not to use this unless it is clear that it is needed, because it will hide
real problems. The most common reason for not receiving a request and seeing
a 408 is due to an MTU inconsistency between the client and an intermediary
element such as a VPN, which blocks too large packets. These issues are
generally seen with POST requests as well as GET with large cookies. The logs
are often the only way to detect them.

This patch should be backported to 1.5 since it avoids false alerts and
makes it easier to monitor haproxy's status.
2015-05-01 15:39:23 +02:00
Willy Tarreau
13317669d5 MEDIUM: http: disable support for HTTP/0.9 by default
There's not much reason for continuing to accept HTTP/0.9 requests
nowadays except for manual testing. Now we disable support for these
by default, unless option accept-invalid-http-request is specified,
in which case they continue to be upgraded to 1.0.
2015-05-01 14:57:54 +02:00
Willy Tarreau
91852eb428 MEDIUM: http: restrict the HTTP version token to 1 digit as per RFC7230
While RFC2616 used to allow an undeterminate amount of digits for the
major and minor components of the HTTP version, RFC7230 has reduced
that to a single digit for each.

If a server can't properly parse the version string and falls back to 0.9,
it could then send a head-less response whose payload would be taken for
headers, which could confuse downstream agents.

Since there's no more reason for supporting a version scheme that was
never used, let's upgrade to the updated version of the standard. It is
still possible to enforce support for the old behaviour using options
accept-invalid-http-request and accept-invalid-http-response.

It would be wise to backport this to 1.5 as well just in case.
2015-05-01 14:57:01 +02:00
Willy Tarreau
b4d0c03aee BUG/MEDIUM: http: remove content-length form responses with bad transfer-encoding
The spec mandates that content-length must be removed from messages if
Transfer-Encoding is present, not just for valid ones.

This must be backported to 1.5 and 1.4.
2015-05-01 13:56:11 +02:00
Willy Tarreau
34dfc60571 BUG/MEDIUM: http: incorrect transfer-coding in the request is a bad request
The rules related to how to handle a bad transfer-encoding header (one
where "chunked" is not at the final place) have evolved to mandate an
abort when this happens in the request. Previously it was only a close
(which is still valid for the server side).

This must be backported to 1.5 and 1.4.
2015-05-01 13:56:10 +02:00
Willy Tarreau
4979d5c5d1 BUG/MEDIUM: http: do not restrict parsing of transfer-encoding to HTTP/1.1
While Transfer-Encoding is HTTP/1.1, we must still parse it in HTTP/1.0
in case an agent sends it, because it's likely that the other side might
use it as well, causing confusion. This will also result in getting rid
of the Content-Length header in such abnormal situations and in having
a clean connection.

This must be backported to 1.5 and 1.4.
2015-05-01 13:56:10 +02:00
Willy Tarreau
557f199fb7 DOC: http: update the comments about the rules for determining transfer-length
Let's now use the text from RFC7230 which is stricter and more precise.

This must be backported to 1.5 and 1.4.
2015-05-01 13:56:10 +02:00
Willy Tarreau
1c91391df4 BUG/MEDIUM: http: remove content-length from chunked messages
RFC7230 clarified the behaviour to adopt when facing both a
content-length and a transfer-encoding: chunked in a message. While
haproxy already complied with the method for getting the message
length right, and used to detect improper content-length duplicates,
it still did not remove the content-length header when facing a
transfer-encoding: chunked. Usually it is not a problem since other
agents (clients and servers) are required to parse the message
according to the rules that have been in place since RFC2616 in
1999.

However Régis Leroy reported the existence of at least one such
non-compliant agent so haproxy could be abused to get out of sync
with it on pipelined requests (HTTP request smuggling attack),
it consider part of a payload as a subsequent request.

The best thing to do is then to remove the content-length according
to RFC7230. It used to be in the todo list with a fixme in the code
while waiting for the standard to stabilize, let's apply it now that
it's published.

Thanks to Régis for bringing that subject to our attention.

This fix must be backported to 1.5 and 1.4.
2015-05-01 13:56:10 +02:00
Thierry FOURNIER
7f6192c0d3 BUG/MEDIUM: http: functions set-{path,query,method,uri} breaks the HTTP parser
When one of these functions replaces a part of the query string by
a shorter or longer new one, the header parsing is broken. This is
because the start of the first header is not updated.

In the same way, the total length of the request line is not updated.
I dont see any bug caused by this miss, but I guess than it is better
to store the good length.

This bug is only in the development version.
2015-04-27 11:56:52 +02:00
Willy Tarreau
ee335e65dc BUG/MEDIUM: http: properly retrieve the front connection
Commit 350f487 ("CLEANUP: session: simplify references to chn_{prod,cons}(&s->{req,res})")
introduced a regression causing the cli_conn to be picked from the server
side instead of the client side, so the XFF header is not appended anymore
since the connection is NULL.

Thanks to Reinis Rozitis for reporting this bug. No backport is needed
as it's 1.6-specific.
2015-04-21 18:15:13 +02:00
Willy Tarreau
152b81e7b2 BUG/MAJOR: tcp/http: fix current_rule assignment when restarting over a ruleset
Commit bc4c1ac ("MEDIUM: http/tcp: permit to resume http and tcp custom
actions") introduced the ability to interrupt and restart processing in
the middle of a TCP/HTTP ruleset. But it doesn't do it in a consistent
way : it checks current_rule_list, immediately dereferences current_rule,
which is only set in certain cases and never cleared. So that broke the
tcp-request content rules when the processing was interrupted due to
missing data, because current_rule was not yet set (segfault) or could
have been inherited from another ruleset if it was used in a backend
(random behaviour).

The proper way to do it is to always set current_rule before dereferencing
it. But we don't want to set it for all rules because we don't want any
action to provide a checkpointing mechanism. So current_rule is set to NULL
before entering the loop, and only used if not NULL and if current_rule_list
matches the current list. This way they both serve as a guard for the other
one. This fix also makes the current rule point to the rule instead of its
list element, as it's much easier to manipulate.

No backport is needed, this is 1.6-specific.
2015-04-20 13:46:20 +02:00
CJ Ess
108b1dd69d MEDIUM: http: configurable http result codes for http-request deny
This patch adds support for error codes 429 and 405 to Haproxy and a
"deny_status XXX" option to "http-request deny" where you can specify which
code is returned with 403 being the default. We really want to do this the
"haproxy way" and hope to have this patch included in the mainline. We'll
be happy address any feedback on how this is implemented.
2015-04-11 10:34:54 +02:00
Willy Tarreau
d0d8da989b MINOR: stream: provide a few helpers to retrieve frontend, listener and origin
Expressions are quite long when using strm_sess(strm)->whatever, so let's
provide a few helpers : strm_fe(), strm_li(), strm_orig().
2015-04-06 11:37:29 +02:00
Willy Tarreau
192252e2d8 MAJOR: sample: pass a pointer to the session to each sample fetch function
Many such function need a session, and till now they used to dereference
the stream. Once we remove the stream from the embryonic session, this
will not be possible anymore.

So as of now, sample fetch functions will be called with this :

   - sess = NULL,  strm = NULL                     : never
   - sess = valid, strm = NULL                     : tcp-req connection
   - sess = valid, strm = valid, strm->txn = NULL  : tcp-req content
   - sess = valid, strm = valid, strm->txn = valid : http-req / http-res
2015-04-06 11:37:25 +02:00
Willy Tarreau
987e3fb868 MEDIUM: http: remove the now useless http_txn from {req/res} rules
The registerable http_req_rules / http_res_rules used to require a
struct http_txn at the end. It's redundant with struct stream and
propagates very deep into some parts (ie: it was the reason for lua
requiring l7). Let's remove it now.
2015-04-06 11:35:53 +02:00
Willy Tarreau
15e91e1b36 MAJOR: sample: don't pass l7 anymore to sample fetch functions
All of them can now retrieve the HTTP transaction *if it exists* from
the stream and be sure to get NULL there when called with an embryonic
session.

The patch is a bit large because many locations were touched (all fetch
functions had to have their prototype adjusted). The opportunity was
taken to also uniformize the call names (the stream is now always "strm"
instead of "l4") and to fix indent where it was broken. This way when
we later introduce the session here there will be less confusion.
2015-04-06 11:35:53 +02:00
Willy Tarreau
eee5b51248 MAJOR: http: move http_txn out of struct stream
Now this one is dynamically allocated. It means that 280 bytes of memory
are saved per TCP stream, but more importantly that it will become
possible to remove the l7 pointer from fetches and converters since
it will be deduced from the stream and will support being null.

A lot of care was taken because it's easy to forget a test somewhere,
and the previous code used to always trust s->txn for being valid, but
all places seem to have been visited.

All HTTP fetch functions check the txn first so we shouldn't have any
issue there even when called from TCP. When branching from a TCP frontend
to an HTTP backend, the txn is properly allocated at the same time as the
hdr_idx.
2015-04-06 11:35:52 +02:00
Willy Tarreau
63986c72c8 MINOR: http: create a dedicated pool for http_txn
This one will not necessarily be allocated for each stream, and we want
to use the fact that it equals null to know it's not present so that we
can always deduce its presence from the stream pointer.

This commit only creates the new pool.
2015-04-06 11:35:52 +02:00
Willy Tarreau
cb7dd015be MEDIUM: http: move header captures from http_txn to struct stream
The header captures are now general purpose captures since tcp rules
can use them to capture various contents. That removes a dependency
on http_txn that appeared in some sample fetch functions and in the
order by which captures and http_txn were allocated.

Interestingly the reset of the header captures were done at too many
places as http_init_txn() used to do it while it was done previously
in every call place.
2015-04-06 11:35:52 +02:00
Willy Tarreau
53c9b4db41 CLEANUP: sample: remove useless tests in fetch functions for l4 != NULL
The stream may never be null given that all these functions are called
from sample_process(). Let's remove this now confusing test which
sometimes happens after a dereference was already done.
2015-04-06 11:35:52 +02:00
Willy Tarreau
9ad7bd48d2 MEDIUM: session: use the pointer to the origin instead of s->si[0].end
When s->si[0].end was dereferenced as a connection or anything in
order to retrieve information about the originating session, we'll
now use sess->origin instead so that when we have to chain multiple
streams in HTTP/2, we'll keep accessing the same origin.
2015-04-06 11:34:29 +02:00
Willy Tarreau
e36cbcb3b0 MEDIUM: stream: move the frontend's pointer to the session
Just like for the listener, the frontend is session-wide so let's move
it to the session. There are a lot of places which were changed but the
changes are minimal in fact.
2015-04-06 11:23:58 +02:00
Willy Tarreau
fb0afa77c9 MEDIUM: stream: move the listener's pointer to the session
The listener is session-specific, move it there.
2015-04-06 11:23:57 +02:00
Willy Tarreau
e7dff02dd4 REORG/MEDIUM: stream: rename stream flags from SN_* to SF_*
This is in order to keep things consistent.
2015-04-06 11:23:57 +02:00
Willy Tarreau
87b09668be REORG/MAJOR: session: rename the "session" entity to "stream"
With HTTP/2, we'll have to support multiplexed streams. A stream is in
fact the largest part of what we currently call a session, it has buffers,
logs, etc.

In order to catch any error, this commit removes any reference to the
struct session and tries to rename most "session" occurrences in function
names to "stream" and "sess" to "strm" when that's related to a session.

The files stream.{c,h} were added and session.{c,h} removed.

The session will be reintroduced later and a few parts of the stream
will progressively be moved overthere. It will more or less contain
only what we need in an embryonic session.

Sample fetch functions and converters will have to change a bit so
that they'll use an L5 (session) instead of what's currently called
"L4" which is in fact L6 for now.

Once all changes are completed, we should see approximately this :

   L7 - http_txn
   L6 - stream
   L5 - session
   L4 - connection | applet

There will be at most one http_txn per stream, and a same session will
possibly be referenced by multiple streams. A connection will point to
a session and to a stream. The session will hold all the information
we need to keep even when we don't yet have a stream.

Some more cleanup is needed because some code was already far from
being clean. The server queue management still refers to sessions at
many places while comments talk about connections. This will have to
be cleaned up once we have a server-side connection pool manager.
Stream flags "SN_*" still need to be renamed, it doesn't seem like
any of them will need to move to the session.
2015-04-06 11:23:56 +02:00
Willy Tarreau
cb703b0352 BUG/MAJOR: http: null-terminate the http actions keywords list
Commit a0dc23f ("MEDIUM: http: implement http-request set-{method,path,query,uri}")
forgot to null-terminate the list, resulting in crashes when these actions
are used if the platform doesn't pad the struct with nulls.

Thanks to Gunay Arslan for reporting a detailed trace showing the
origin of this bug.

No backport to 1.5 is needed.
2015-04-03 09:58:02 +02:00
Willy Tarreau
601a4d1741 BUG/MEDIUM: http: hdr_cnt would not count any header when called without name
It's documented that these sample fetch functions should count all headers
and/or all values when called with no name but in practice it's not what is
being done as a missing name causes an immediate return and an absence of
result.

This bug is present in 1.5 as well and must be backported.
2015-04-01 19:16:09 +02:00
Willy Tarreau
615105e7e8 MEDIUM: compression: add a distinction between UA- and config- algorithms
Thanks to MSIE/IIS, the "deflate" name is ambigous. According to the RFC
it's a zlib-wrapped deflate stream, but IIS used to send only a raw deflate
stream, which is the only format MSIE understands for "deflate". The other
widely used browsers do support both formats. For this reason some people
prefer to emit a raw deflate stream on "deflate" to serve more users even
it that means violating the standards. Haproxy only follows the standard,
so they cannot do this.

This patch makes it possible to have one algorithm name in the configuration
and another one in the protocol. This will make it possible to have a new
configuration token to add a different algorithm so that users can decide if
they want a raw deflate or the standard one.
2015-03-28 16:46:38 +01:00
Willy Tarreau
e7e49a8d0b MINOR: http: check the algo name "identity" instead of the function pointer
Next patch will statity all compression functions, so let's stop relying
on a function pointer comparison and use the algo name instead.
2015-03-28 15:43:17 +01:00
Thierry FOURNIER
7fe75e0dab MINOR: http: export function inet_set_tos()
This is used by Lua.
2015-03-18 11:34:06 +01:00
Thierry FOURNIER
5531f87ace MINOR: http: split http_transform_header() function in two parts.
This function is a callback for HTTP actions. This function
creates the replacement string from a build_logline() format
and transform the header.

This patch split this function in two part. With this modification,
the header transformation and the replacement string are separed.

We can now transform the header with another replacement string
source than a build_logline() format.
2015-03-18 11:34:06 +01:00
Thierry FOURNIER
b77aece24a MINOR: http: split the function http_action_set_req_line() in two parts
The first part is the replacement engine. It take a replacement action
number and a replacement string and process the action.

The second part is the function which is called by the 'http-request
action' to replace a request line part. This function makes the
string used as replacement.

This split permits to use the replacement engine in other parts of the
code than the request action. The Lua use it for his own http action.
2015-03-18 11:34:06 +01:00
Thierry FOURNIER
63d692c037 MEDIUM: http: allows 'R' and 'S' in the protocol alphabet
This patch allow the 'R' and the 'S' in the protocol/version
alphabet. It permits to process RTSP requests like HTTP.
2015-03-17 16:19:52 +01:00
Thierry FOURNIER
5a33ac78ad MEDIUM/CLEANUP: http: rewrite and lighten http_transform_header() prototype
The http_transform_header() function prototype uses some parameter
which can be guessed from other parameer. This patch removes
theses parameters.
2015-03-17 11:42:43 +01:00
Thierry FOURNIER
191f9efdc5 BUG/MEDIUM: http: the function "(req|res)-replace-value" doesn't respect the HTTP syntax
These function used an invalid header parser.
 - The trailing white-spaces were embedded in the replacement regex,
 - The double-quote (") containing comma (,) were not respected.

This patch replace this parser by the "official" parser http_find_header2().
2015-03-17 11:42:43 +01:00
Thierry FOURNIER
534101658d BUG/MAJOR: http: don't read past buffer's end in http_replace_value
The function http_replace_value use bad variable to detect the end
of the input string.

Regression introduced by the patch "MEDIUM: regex: Remove null
terminated strings." (c9c2daf2)

We need to backport this patch int the 1.5 stable branch.

WT: there is no possibility to overwrite existing data as we only read
    past the end of the request buffer, to copy into the trash. The copy
    is bounded by buffer_replace2(), just like the replacement performed
    by exp_replace(). However if a buffer happens to contain non-zero data
    up to the next unmapped page boundary, there's a theorical risk of
    crashing the process despite this not being reproducible in tests.
    The risk is low because "http-request replace-value" did not work due
    to this bug so that probably means it's not used yet.
2015-03-16 14:20:07 +01:00
Thierry FOURNIER
01c30124ae BUG/MEDIUM: http: the action set-{method|path|query|uri} doesn't run.
This bug is introduced by the commit "MEDIUM: http/tcp: permit to
resume http and tcp custom actions" ( bc4c1ac6ad ).

Before this patch, the return code of the function was ignored.
After this path, if the function returns 0, it wats a YIELD.

The function http_action_set_req_line() retunrs 0, in succes case.

This patch changes the return code of this function.
2015-03-14 15:53:31 +01:00
Jesse Hathaway
2468d4e4f7 MEDIUM: http: Compress HTTP responses with status codes 201,202,203 in addition to 200
It is common for rest applications to return status codes other than
200, so compress the other common 200 level responses which might
contain content.
2015-03-11 23:23:41 +01:00
Willy Tarreau
350f487300 CLEANUP: session: simplify references to chn_{prod,cons}(&s->{req,res})
These 4 combinations are needlessly complicated since the session already
has direct access to the associated stream interfaces without having to
check an indirect pointer.
2015-03-11 20:41:47 +01:00
Willy Tarreau
73796535a9 REORG/MEDIUM: channel: only use chn_prod / chn_cons to find stream-interfaces
The purpose of these two macros will be to pass via the session to
find the relevant stream interfaces so that we don't need to store
the ->cons nor ->prod pointers anymore. Currently they're only defined
so that all references could be removed.

Note that many places need a second pass of clean up so that we don't
have any chn_prod(&s->req) anymore and only &s->si[0] instead, and
conversely for the 3 other cases.
2015-03-11 20:41:47 +01:00
Willy Tarreau
a5f5d8dc69 MEDIUM: stream-int: add a flag indicating which side the SI is on
This new flag "SI_FL_ISBACK" is set only on the back SI and is cleared
on the front SI. That way it's possible only by looking at the SI to
know what side it is.
2015-03-11 20:41:46 +01:00
Willy Tarreau
2bb4a96f8f REORG/MEDIUM: stream-int: introduce si_ic/si_oc to access channels
We'll soon remove direct references to the channels from the stream
interface since everything belongs to the same session, so let's
first not dereference si->ib / si->ob anymore and use macros instead.
2015-03-11 20:41:46 +01:00
Willy Tarreau
22ec1eadd0 REORG/MAJOR: move session's req and resp channels back into the session
The channels were pointers to outside structs and this is not needed
anymore since the buffers have moved, but this complicates operations.
Move them back into the session so that both channels and stream interfaces
are always allocated for a session. Some places (some early sample fetch
functions) used to validate that a channel was NULL prior to dereferencing
it. Now instead we check if chn->buf is NULL and we force it to remain NULL
until the channel is initialized.
2015-03-11 20:41:46 +01:00
Willy Tarreau
612adb8459 BUG/MAJOR: http: fix stats regression consecutive to HTTP_RULE_RES_YIELD
Commit bc4c1ac ("MEDIUM: http/tcp: permit to resume http and tcp custom actions")
unfortunately broke the stats applet by moving the clearing of the analyser bit
after processing the applet headers. It used to work only in HTTP/1.1 and not
in HTTP/1.0. This is 1.6-specific, no backport is needed.
2015-03-10 15:33:55 +01:00
Thierry FOURNIER
bc4c1ac6ad MEDIUM: http/tcp: permit to resume http and tcp custom actions
Later, the processing of some actions needs to be interrupted and resumed
later. This patch permit to resume the actions. The actions that needs
to run with the resume mode are not yet avalaible. It will be soon with
Lua patches. So the code added by this patch is untestable for the moment.

The list of "tcp_exec_req_rules" cannot resme because is called by the
unresumable function "accept_session".
2015-02-28 23:12:33 +01:00
Thierry FOURNIER
9e2ef999a9 MEDIUM: http: change the code returned by the response processing rule functions
Actually, this function returns a pointer on the rule that stop
the evaluation of the rule list. Later we integrate the possibility
of interrupt and resue the processsing of some actions. The current
response mode is not sufficient to returns the "interrupt" information.

The pointer returned is never used, so I change the return type of
this function by an enum. With this enum, the function is ready to
return the "interupt" value.
2015-02-28 23:12:33 +01:00
Thierry FOURNIER
49f45af9aa MINOR: global: export many symbols.
The functions "val_payload_lv" and "val_hdr" are useful with
lua. The lua automatic binding for sample fetchs needs to
compare check functions.

The "arg_type_names" permit to display error messages.
2015-02-28 23:12:32 +01:00
Thierry FOURNIER
f41a809dc9 MINOR: sample: add private argument to the struct sample_fetch
The add of this private argument is to prepare the integration
of the lua fetchs.
2015-02-28 23:12:31 +01:00
Thierry FOURNIER
68a556e282 MINOR: converters: give the session pointer as converter argument
Some usages of the converters need to know the attached session. The Lua
needs the session for retrieving his running context. This patch adds
the "session" as an argument of the converters prototype.
2015-02-28 23:12:31 +01:00
Thierry FOURNIER
1edc971919 MINOR: converters: add a "void *private" argument to converters
This permits to store specific configuration pointer. It is useful
with future Lua integration.
2015-02-28 23:12:31 +01:00
Willy Tarreau
eb27ec7569 MINOR: http: add the new sample fetches req.hdr_names and res.hdr_names
These new sample fetches retrieve the list of header names as they appear
in the request or response. This can be used for debugging, for statistics
as well as an aid to better detect the presence of proxies or plugins on
some browsers, which alter the request compared to a regular browser by
adding or reordering headers.
2015-02-20 14:00:44 +01:00
Willy Tarreau
c90dc23e99 MINOR: http: add a new function to iterate over each header line
New function http_find_next_header() will be used to scan all the input
headers for various processing and for http/1 to http/2 header mapping.
2015-02-20 14:00:44 +01:00
Willy Tarreau
34d4c3c13f BUG/MINOR: http: abort request processing on filter failure
Commit c600204 ("BUG/MEDIUM: regex: fix risk of buffer overrun in
exp_replace()") added a control of failure on the response headers,
but forgot to check for the error during request processing. So if
the filters fail to apply, we could keep the request. It might
cause some headers to silently fail to be added for example. Note
that it's tagged MINOR because a standard configuration cannot make
this case happen.

The fix should be backported to 1.5 and 1.4 though.
2015-01-30 20:58:58 +01:00
Willy Tarreau
aa435e7d7e BUG/MINOR: http: fix incorrect header value offset in replace-hdr/replace-value
The two http-req/http-resp actions "replace-hdr" and "replace-value" were
expecting exactly one space after the colon, which is wrong. It was causing
the first char not to be seen/modified when no space was present, and empty
headers not to be modified either. Instead of using name->len+2, we must use
ctx->val which points to the first character of the value even if there is
no value.

This fix must be backported into 1.5.
2015-01-29 14:01:34 +01:00
Willy Tarreau
a0dc23f093 MEDIUM: http: implement http-request set-{method,path,query,uri}
This commit implements the following new actions :

- "set-method" rewrites the request method with the result of the
  evaluation of format string <fmt>. There should be very few valid reasons
  for having to do so as this is more likely to break something than to fix
  it.

- "set-path" rewrites the request path with the result of the evaluation of
  format string <fmt>. The query string, if any, is left intact. If a
  scheme and authority is found before the path, they are left intact as
  well. If the request doesn't have a path ("*"), this one is replaced with
  the format. This can be used to prepend a directory component in front of
  a path for example. See also "set-query" and "set-uri".

  Example :
      # prepend the host name before the path
      http-request set-path /%[hdr(host)]%[path]

- "set-query" rewrites the request's query string which appears after the
  first question mark ("?") with the result of the evaluation of format
  string <fmt>. The part prior to the question mark is left intact. If the
  request doesn't contain a question mark and the new value is not empty,
  then one is added at the end of the URI, followed by the new value. If
  a question mark was present, it will never be removed even if the value
  is empty. This can be used to add or remove parameters from the query
  string. See also "set-query" and "set-uri".

  Example :
      # replace "%3D" with "=" in the query string
      http-request set-query %[query,regsub(%3D,=,g)]

- "set-uri" rewrites the request URI with the result of the evaluation of
  format string <fmt>. The scheme, authority, path and query string are all
  replaced at once. This can be used to rewrite hosts in front of proxies,
  or to perform complex modifications to the URI such as moving parts
  between the path and the query string. See also "set-path" and
  "set-query".

All of them are handled by the same parser and the same exec function,
which is why they're merged all together. For once, instead of adding
even more entries to the huge switch/case, we used the new facility to
register action keywords. A number of the existing ones should probably
move there as well.
2015-01-23 20:27:41 +01:00
Willy Tarreau
15a53a4384 MEDIUM: regex: add support for passing regex flags to regex_exec_match()
This function (and its sister regex_exec_match2()) abstract the regex
execution but make it impossible to pass flags to the regex engine.
Currently we don't use them but we'll need to support REG_NOTBOL soon
(to indicate that we're not at the beginning of a line). So let's add
support for this flag and update the API accordingly.
2015-01-22 14:24:53 +01:00
Willy Tarreau
8560328211 BUG/MEDIUM: http: make http-request set-header compute the string before removal
The way http-request/response set-header works is stupid. For a naive
reuse of the del-header code, it removes all occurrences of the header
to be set before computing the new format string. This makes it almost
unusable because it is not possible to append values to an existing
header without first copying them to a dummy header, performing the
copy back and removing the dummy header.

Instead, let's share the same code as add-header and perform the optional
removal after the string is computed. That way it becomes possible to
write things like :

   http-request set-header X-Forwarded-For %[hdr(X-Forwarded-For)],%[src]

Note that this change is not expected to have any undesirable impact on
existing configs since if they rely on the bogus behaviour, they don't
work as they always retrieve an empty string.

This fix must be backported to 1.5 to stop the spreadth of ugly configs.
2015-01-21 20:45:00 +01:00
Willy Tarreau
49ad95cc8e MINOR: http: add a new fetch "query" to extract the request's query string
This fetch extracts the request's query string, which starts after the first
question mark. If no question mark is present, this fetch returns nothing. If
a question mark is present but nothing follows, it returns an empty string.
This means it's possible to easily know whether a query string is present
using the "found" matching method. This fetch is the completemnt of "path"
which stops before the question mark.
2015-01-20 19:47:47 +01:00
Willy Tarreau
319f745ba0 MINOR: channel: rename bi_erase() to channel_truncate()
It applies to the channel and it doesn't erase outgoing data, only
pending unread data, which is strictly equivalent to what recv()
does with MSG_TRUNC, so that new name is more accurate and intuitive.
2015-01-14 20:32:59 +01:00
Willy Tarreau
ba0902ede4 CLEANUP: channel: rename channel_reserved -> channel_is_rewritable
channel_reserved is confusingly named. It is used to know whether or
not the rewrite area is left intact for situations where we want to
ensure we can use it before proceeding. Let's rename it to fix this
confusion.
2015-01-14 18:41:33 +01:00
Willy Tarreau
7c1c217426 BUG/MEDIUM: http: fix header removal when previous header ends with pure LF
In 1.4-dev7, a header removal mechanism was introduced with commit 68085d8
("[MINOR] http: add http_remove_header2() to remove a header value."). Due
to a typo in the function, the beginning of the headers gets desynchronized
if the header preceeding the deleted one ends with an LF/CRLF combination
different form the one of the removed header. The reason is that while
rewinding the pointer, we go back by a number of bytes taking into account
the LF/CRLF status of the removed header instead of the previous one. The
case where it fails is in http-request del-header/set-header where the
multiple occurrences of a header are present and their LF/CRLF ending
differs from the preceeding header. The loop then stops because no more
headers are found given that the names and length do not match.

Another point to take into consideration is that removing headers using
a loop of http_find_header2() and this function is inefficient since we
remove values one at a time while it could be simpler and faster to
remove full header lines. This is something that should be addressed
separately.

This fix must be backported to 1.5 and 1.4. Note that http-send-name-header
relies on this function as well so it could be possible that some of the
issues encountered with it in 1.4 come from this bug.
2015-01-07 17:23:50 +01:00
Willy Tarreau
f2f7d6b27b MEDIUM: buffer: add a new buf_wanted dummy buffer to report failed allocations
Doing so ensures that even when no memory is available, we leave the
channel in a sane condition. There's a special case in proto_http.c
regarding the compression, we simply pre-allocate the tmpbuf to point
to the dummy buffer. Not reusing &buf_empty for this allows the rest
of the code to differenciate an empty buffer that's not used from an
empty buffer that results from a failed allocation which has the same
semantics as a buffer full.
2014-12-24 23:47:32 +01:00
Willy Tarreau
e583ea583a MEDIUM: buffer: use b_alloc() to allocate and initialize a buffer
b_alloc() now allocates a buffer and initializes it to the size specified
in the pool minus the size of the struct buffer itself. This ensures that
callers do not need to care about buffer details anymore. Also this never
applies memory poisonning, which is slow and useless on buffers.
2014-12-24 23:47:32 +01:00
Godbach
d972203fbc BUG/MINOR: parse: refer curproxy instead of proxy
Since during parsing stage, curproxy always represents a proxy to be operated,
it should be a mistake by referring proxy.

Signed-off-by: Godbach <nylzhaowei@gmail.com>
2014-12-18 11:01:51 +01:00
Godbach
1f1fae6202 BUG/MINOR: http: fix typo: "401 Unauthorized" => "407 Unauthorized"
401 Unauthorized => 407 Unauthorized

Signed-off-by: Godbach <nylzhaowei@gmail.com>
2014-12-17 17:05:49 +01:00
Willy Tarreau
5506e3f8b6 BUG/MINOR: stats: correctly set the request/response analysers
When enabling stats, response analysers were set on the request
analyser list, which 1) has no effect, and 2) means we don't have
the response analysers properly set.

In practice these response analysers are set when the connection
to the server or applet is established so we don't need/must not
set them here.

Fortunately this bug had no impact since the flags are distinct,
but it definitely is confusing.

It should be backported to 1.5.
2014-11-21 17:53:08 +01:00
Cyril Bonté
a83a50bd7d BUG/MINOR: log: fix request flags when keep-alive is enabled
Colin Ingarfield reported some unexplainable flags in the logs.
For example, a "LR" termination state was set on a request which was forwarded
to a server, where "LR" means that the request should have been handled
internally by haproxy.

This case happens when at least client side keep-alive is enabled. Next
requests in the connection will inherit the flags from the previous request.

2 fields are impacted : "termination_state" and "Tt" in the timing events,
where a "+" can be added, when a previous request was redispatched.

This is not critical for the service itself but can confuse troubleshooting.

The fix must be backported to 1.5 and 1.4.
2014-10-22 22:37:30 +02:00
Willy Tarreau
7d59e90473 BUG/MEDIUM: http: don't dump debug headers on MSG_ERROR
When the HTTP parser is in state HTTP_MSG_ERROR, we don't know if it was
already initialized or not. If the error happens before HTTP_MSG_RQBEFORE,
random offsets might be present and we don't want to display such random
strings in debug mode.

While it's theorically possible to randomly crash the process when running
in debug mode here, this bug was not tagged MAJOR because it would not
make sense to run in debug mode in production.

This fix must be backported to 1.5 and 1.4.
2014-10-22 19:25:09 +02:00
Willy Tarreau
e1cfc1f2b4 BUG/MINOR: config: do not accept more track-sc than configured
MAX_SESS_STKCTR allows one to define the number of stick counters that can
be used in parallel in track-sc* rules. The naming of this macro creates
some confusion because the value there is sometimes used as a max instead
of a count, and the config parser accepts values from 0 to MAX_SESS_STKCTR
and the processing ignores anything tracked on the last one. This means
that by default, track-sc3 is allowed and ignored.

This fix must be backported to 1.5 where the problem there only affects
TCP rules.
2014-10-17 11:53:05 +02:00
Willy Tarreau
4e21ff9244 BUG/MEDIUM: http: adjust close mode when switching to backend
Commit 179085c ("MEDIUM: http: move Connection header processing earlier")
introduced a regression : the backend's HTTP mode is not considered anymore
when setting the session's HTTP mode, because wait_for_request() is only
called once, when the frontend receives the request (or when the frontend
is in TCP mode, when the backend receives the request).

The net effect is that in some situations when the frontend and the backend
do not work in the same mode (eg: keep-alive vs close), the backend's mode
is ignored.

This patch moves all that processing to a dedicated function, which is
called from the original place, as well as from session_set_backend()
when switching from an HTTP frontend to an HTTP backend in different
modes.

This fix must be backported to 1.5.
2014-09-30 18:44:22 +02:00
Willy Tarreau
ce730de867 MEDIUM: http: enable header manipulation for 101 responses
Ryan Brock reported that server stickiness did not work for WebSocket
because the cookies and headers are not modified on 1xx responses. He
found that his browser correctly presents the cookies learned on 101
responses, which was not specifically defined in the WebSocket spec,
nor in the cookie spec. 101 is a very special case. Being part of 1xx,
it's an interim response. But within 1xx, it's special because it's
the last HTTP/1 response that transits on the wire, which is different
from 100 or 102 which may appear multiple times. So in that sense, we
can consider it as a final response regarding HTTP/1, and it makes
sense to allow header processing there. Note that we still ensure not
to mangle the Connection header, which is critical for HTTP upgrade to
continue to work smoothly with agents that are a bit picky about what
tokens are found there.

The rspadd rules are now processed for 101 responses as well, but the
cache-control checks are not performed (since no body is delivered).

Ryan confirmed that this patch works for him.

It would make sense to backport it to 1.5 given that it improves end
user experience on WebSocket servers.
2014-09-16 10:40:38 +02:00
Willy Tarreau
9dc1c61c43 BUG/CRITICAL: http: don't update msg->sov once data start to leave the buffer
Commit bb2e669 ("BUG/MAJOR: http: correctly rewind the request body
after start of forwarding") was incorrect/incomplete. It used to rely on
CF_READ_ATTACHED to stop updating msg->sov once data start to leave the
buffer, but this is unreliable because since commit a6eebb3 ("[BUG]
session: clear BF_READ_ATTACHED before next I/O") merged in 1.5-dev1,
this flag is only ephemeral and is cleared once all analysers have
seen it. So we can start updating msg->sov again each time we pass
through this place with new data. With a sufficiently large amount of
data, it is possible to make msg->sov wrap and validate the if()
condition at the top, causing the buffer to advance by about 2GB and
crash the process.

Note that the offset cannot be controlled by the attacker because it is
a sum of millions of small random sizes depending on how many bytes were
read by the server and how many were left in the buffer, only because
of the speed difference between reading and writing. Also, nothing is
written, the invalid pointer resulting from this operation is only read.

Many thanks to James Dempsey for reporting this bug and to Chris Forbes for
narrowing down the faulty area enough to make its root cause analysable.

This fix must be backported to haproxy 1.5.
2014-09-02 16:48:54 +02:00
Willy Tarreau
912c119557 BUG/MEDIUM: http: fix improper parsing of HTTP methods for use with ACLs
pat_parse_meth() had some remains of an early implementation attempt for
the patterns, it initialises a trash and never sets the pattern value there.
The result is that a non-standard method cannot be matched anymore. The bug
appeared during the pattern rework in 1.5, so this fix must be backported
there. Thanks to Joe Williams of GitHub for reporting the bug.
2014-08-29 15:15:50 +02:00
Willy Tarreau
4de2a94165 BUG/MEDIUM: http: fix inverted condition in pat_match_meth()
This results in a string-based HTTP method match returning true when
it doesn't match and conversely. This bug was reported by Joe Williams.

The fix must be backported to 1.5, though it still doesn't work because
of at least 3-4 other bugs in the long path which leads to building this
pattern list.
2014-08-28 20:42:57 +02:00
Thierry FOURNIER
7566e30477 BUG/MEDIUM: http: tarpit timeout is reset
Before the commit bbba2a8ecc
(1.5-dev24-8), the tarpit section set timeout and return, after this
commit, the tarpit section set the timeout, and go to the "done" label
which reset the timeout.

Thanks Bryan Talbot for the bug report and analysis.

This should be backported in 1.5.
2014-08-22 11:58:02 +02:00
Baptiste Assmann
12cb00b216 BUG: config: error in http-response replace-header number of arguments
A couple of typo fixed in 'http-response replace-header':
- an error when counting the number of arguments
- a typo in the alert message

This should be backported to 1.5.
2014-08-08 17:50:57 +02:00
Willy Tarreau
09448f7d7c MEDIUM: http: add the track-sc* actions to http-request rules
Add support for http-request track-sc, similar to what is done in
tcp-request for backends. A new act_prm field was added to HTTP
request rules to store the track params (table, counter). Just
like for TCP rules, the table is resolved while checking for
config validity. The code was mostly copied from the TCP code
with the exception that here we also count the HTTP request count
and rate by hand. Probably that something could be factored out in
the future.

It seems like tracking flags should be improved to mark each hook
which tracks a key so that we can have some check points where to
increase counters of the past if not done yet, a bit like is done
for TRACK_BACKEND.
2014-07-16 17:26:40 +02:00
Willy Tarreau
5ad6e1dc09 BUG/MINOR: http: base32+src should use the big endian version of base32
We're using the internal memory representation of base32 here, which is
wrong since these data might be exported to headers for logs or be used
to stick to a server and replicated to other peers. Let's convert base32
to big endian (network representation) when building the binary block.

This mistake is also present in 1.5, it would be better to backport it.
2014-07-15 21:36:10 +02:00
Thierry FOURNIER
055b9d5c63 MINOR: http: export the function 'smp_fetch_base32'
It's sometimes useful outside of proto_http.c.
2014-07-15 19:09:36 +02:00
Willy Tarreau
bb2e669f9e BUG/MAJOR: http: correctly rewind the request body after start of forwarding
Daniel Dubovik reported an interesting bug showing that the request body
processing was still not 100% fixed. If a POST request contained short
enough data to be forwarded at once before trying to establish the
connection to the server, we had no way to correctly rewind the body.

The first visible case is that balancing on a header does not always work
on such POST requests since the header cannot be found. But there are even
nastier implications which are that http-send-name-header would apply to
the wrong location and possibly even affect part of the request's body
due to an incorrect rewinding.

There are two options to fix the problem :
  - first one is to force the HTTP_MSG_F_WAIT_CONN flag on all hash-based
    balancing algorithms and http-send-name-header, but there's always a
    risk that any new algorithm forgets to set it ;

  - the second option is to account for the amount of skipped data before
    the connection establishes so that we always know the position of the
    request's body relative to the buffer's origin.

The second option is much more reliable and fits very well in the spirit
of the past changes to fix forwarding. Indeed, at the moment we have
msg->sov which points to the start of the body before headers are forwarded
and which equals zero afterwards (so it still points to the start of the
body before forwarding data). A minor change consists in always making it
point to the start of the body even after data have been forwarded. It means
that it can get a negative value (so we need to change its type to signed)..

In order to avoid wrapping, we only do this as long as the other side of
the buffer is not connected yet.

Doing this definitely fixes the issues above for the requests. Since the
response cannot be rewound we don't need to perform any change there.

This bug was introduced/remained unfixed in 1.5-dev23 so the fix must be
backported to 1.5.
2014-07-10 19:29:45 +02:00
Willy Tarreau
506c69a50e BUILD: http: fix isdigit & isspace warnings on Solaris
As usual, when touching any is* function, Solaris complains about the
type of the element being checked. Better backport this to 1.5 since
nobody knows what the emitted code looks like since macros are used
instead of functions.
2014-07-08 01:13:34 +02:00
Willy Tarreau
6c616e0b96 BUG/MAJOR: sample: correctly reinitialize sample fetch context before calling sample_process()
We used to only clear flags when reusing the static sample before calling
sample_process(), but that's not enough because there's a context in samples
that can be used by some fetch functions such as auth, headers and cookies,
and not reinitializing it risks that a pointer of a different type is used
in the wrong context.

An example configuration which triggers the case consists in mixing hdr()
and http_auth_group() which both make use of contexts :

     http-request add-header foo2 %[hdr(host)],%[http_auth_group(foo)]

The solution is simple, initialize all the sample and not just the flags.
This fix must be backported into 1.5 since it was introduced in 1.5-dev19.
2014-06-25 17:12:08 +02:00
Willy Tarreau
d713bcc326 BUG/MINOR: counters: do not untrack counters before logging
Baptiste Assmann reported a corner case in the releasing of stick-counters:
we release content-aware counters before logging. In the past it was not a
problem, but since now we can log them it, it prevents one from logging
their value. Simply switching the log production and the release of the
counter fixes the issue.

This should be backported into 1.5.
2014-06-25 15:36:04 +02:00
Willy Tarreau
3caf2afabe BUG/MEDIUM: http: fetch "base" is not compatible with set-header
The sample fetch function "base" makes use of the trash which is also
used by set-header/add-header etc... everything which builds a formated
line. So we end up with some junk in the header if base is in use. Let's
fix this as all other fetches by using a trash chunk instead.

This bug was reported by Baptiste Assmann, and also affects 1.5.
2014-06-24 17:27:02 +02:00
Baptiste Assmann
92df370621 BUG/MINOR: config: http-request replace-header arg typo
http-request replace-header was introduced with a typo which prevents it
to be conditionned by an ACL.
This patch fixes this issue.
2014-06-24 11:13:33 +02:00
Willy Tarreau
6f0a7bac28 BUG/MAJOR: session: revert all the crappy client-side timeout changes
This is the 3rd regression caused by the changes below. The latest to
date was reported by Finn Arne Gangstad. If a server responds with no
content-length and the client's FIN is never received, either we leak
the client-side FD or we spin at 100% CPU if timeout client-fin is set.

Enough is enough. The amount of tricks needed to cover these side-effects
starts to look like used toilet paper stacked over a chocolate cake. I
don't want to eat that cake anymore!

All this to avoid reporting a server-side timeout when a client stops
uploading data and haproxy expires faster than the server... A lot of
"ifs" resulting in a technically valid log that doesn't always please
users, and whose alternative causes that many issues for all others
users.

So let's revert this crap merged since 1.5-dev25 :
  Revert "CLEANUP: http: don't clear CF_READ_NOEXP twice"
    This reverts commit 1592d1e72a.
  Revert "BUG/MEDIUM: http: clear CF_READ_NOEXP when preparing a new transaction"
    This reverts commit 77d29029af.
  Revert "BUG/MEDIUM: session: don't clear CF_READ_NOEXP if analysers are not called"
    This reverts commit 0943757a21.
  Revert "BUG/MEDIUM: http: disable server-side expiration until client has sent the body"
    This reverts commit 3bed5e9337.
  Revert "BUG/MEDIUM: http: correctly report request body timeouts"
    This reverts commit b9edf8fbec.
  Revert "BUG/MEDIUM: http/session: disable client-side expiration only after body"
    This reverts commit b1982e27aa.

If a cleaner AND SAFER way to do something equivalent in 1.6-dev, we *might*
consider backporting it to 1.5, but given the vicious bugs that have surfaced
since, I doubt it will happen any time soon.

Fortunately, that crap never made it into 1.4 so no backport is needed.
2014-06-23 15:47:00 +02:00
Thierry FOURNIER
c9c2daf283 MEDIUM: regex: Remove null terminated strings.
The new regex function can use string and length. The HAproxy buffer are
not null-terminated, and the use of the regex_exec* functions implies
the add of this null character. This patch replace these function by the
functions which takes a string and length as input.

Just the file "proto_http.c" is change because this one is more executed
than other. The file "checks.c" have a very low usage, and it is not
interesting to change it. Furthermore, the buffer used by "checks.c" are
null-terminated.
2014-06-18 15:12:51 +02:00
Thierry FOURNIER
09af0d6d43 MEDIUM: regex: replace all standard regex function by own functions
This patch remove all references of standard regex in haproxy. The last
remaining references are only in the regex.[ch] files.

In the file src/checks.c, the original function uses a "pmatch" array.
In fact this array is unused. This patch remove it.
2014-06-18 15:07:57 +02:00
Willy Tarreau
b854392824 BUG/MINOR: http: fix typos in previous patch
When I renamed the modify-header action to replace-value, one of them
was mistakenly set to "replace-val" instead. Additionally, differentiation
of the two actions must be done on args[0][8] and not *args[8]. Thanks
Thierry for spotting...
2014-06-17 19:03:56 +02:00
Sasha Pachev
218f064f55 MEDIUM: http: add actions "replace-header" and "replace-values" in http-req/resp
This patch adds two new actions to http-request and http-response rulesets :
  - replace-header : replace a whole header line, suited for headers
                     which might contain commas
  - replace-value  : replace a single header value, suited for headers
                     defined as lists.

The match consists in a regex, and the replacement string takes a log-format
and supports back-references.
2014-06-17 18:34:32 +02:00
Willy Tarreau
4bfc580dd3 MEDIUM: session: maintain per-backend and per-server time statistics
Using the last rate counters, we now compute the queue, connect, response
and total times per server and per backend with a 95% accuracy over the last
1024 samples. The operation is cheap so we don't need to condition it.
2014-06-17 17:15:56 +02:00
Willy Tarreau
54da8db40b MINOR: capture: extend the captures to support non-header keys
This patch adds support for captures with no header name. The purpose
is to allow extra captures to be defined and logged along with the
header captures.
2014-06-13 16:32:48 +02:00
Willy Tarreau
1592d1e72a CLEANUP: http: don't clear CF_READ_NOEXP twice
Last patch cleared the flag twice in the response, which is useless.
Thanks Lukas for spotting it :-)
2014-06-11 16:49:14 +02:00
Willy Tarreau
77d29029af BUG/MEDIUM: http: clear CF_READ_NOEXP when preparing a new transaction
Commit b1982e2 ("BUG/MEDIUM: http/session: disable client-side expiration
only after body") was tricky and caused an issue which was fixed by commit
0943757 ("BUG/MEDIUM: session: don't clear CF_READ_NOEXP if analysers are
not called"). But that's not enough, another issue was introduced and further
emphasized by last fix.

The issue is that the CF_READ_NOEXP flag needs to be cleared when waiting
for a new request over that connection, otherwise we cannot expire anymore
an idle connection waiting for a new request.

This explains the neverending keepalives reported by at least 3 different
persons since dev24. No backport is needed.
2014-06-11 14:11:44 +02:00
Sasha Pachev
c600204ddf BUG/MEDIUM: regex: fix risk of buffer overrun in exp_replace()
Currently exp_replace() (which is used in reqrep/reqirep) is
vulnerable to a buffer overrun. I have been able to reproduce it using
the attached configuration file and issuing the following command:

  wget -O - -S -q http://localhost:8000/`perl -e 'print "a"x4000'`/cookie.php

Str was being checked only in in while (str) and it was possible to
read past that when more than one character was being accessed in the
loop.

WT:
   Note that this bug is only marked MEDIUM because configurations
   capable of triggering this bug are very unlikely to exist at all due
   to the fact that most rewrites consist in static string additions
   that largely fit into the reserved area (8kB by default).

   This fix should also be backported to 1.4 and possibly even 1.3 since
   it seems to have been present since 1.1 or so.

Config:
-------

  global
        maxconn         500
        stats socket /tmp/haproxy.sock mode 600

  defaults
        timeout client      1000
        timeout connect      5000
        timeout server      5000
        retries         1
        option redispatch

  listen stats
        bind :8080
        mode http
        stats enable
        stats uri /stats
        stats show-legends

  listen  tcp_1
        bind :8000
        mode            http
        maxconn 400
        balance roundrobin
        reqrep ^([^\ :]*)\ /(.*)/(.*)\.php(.*) \1\ /\3.php?arg=\2\2\2\2\2\2\2\2\2\2\2\2\2\4
        server  srv1 127.0.0.1:9000 check port 9000 inter 1000 fall 1
        server  srv2 127.0.0.1:9001 check port 9001 inter 1000 fall 1
2014-05-27 14:36:06 +02:00
Willy Tarreau
892337c8e1 MAJOR: server: use states instead of flags to store the server state
Servers used to have 3 flags to store a state, now they have 4 states
instead. This avoids lots of confusion for the 4 remaining undefined
states.

The encoding from the previous to the new states can be represented
this way :

  SRV_STF_RUNNING
   |  SRV_STF_GOINGDOWN
   |   |  SRV_STF_WARMINGUP
   |   |   |
   0   x   x     SRV_ST_STOPPED
   1   0   0     SRV_ST_RUNNING
   1   0   1     SRV_ST_STARTING
   1   1   x     SRV_ST_STOPPING

Note that the case where all bits were set used to exist and was randomly
dealt with. For example, the task was not stopped, the throttle value was
still updated and reported in the stats and in the http_server_state header.
It was the same if the server was stopped by the agent or for maintenance.

It's worth noting that the internal function names are still quite confusing.
2014-05-22 11:27:00 +02:00
Willy Tarreau
c93cd16b6c REORG/MEDIUM: server: split server state and flags in two different variables
Till now, the server's state and flags were all saved as a single bit
field. It causes some difficulties because we'd like to have an enum
for the state and separate flags.

This commit starts by splitting them in two distinct fields. The first
one is srv->state (with its counter-part srv->prev_state) which are now
enums, but which still contain bits (SRV_STF_*).

The flags now lie in their own field (srv->flags).

The function srv_is_usable() was updated to use the enum as input, since
it already used to deal only with the state.

Note that currently, the maintenance mode is still in the state for
simplicity, but it must move as well.
2014-05-22 11:27:00 +02:00
Willy Tarreau
3bed5e9337 BUG/MEDIUM: http: disable server-side expiration until client has sent the body
It's the final part of the 2 previous patches. We prevent the server from
timing out if we still have some data to pass to it. That way, even if the
server runs with a short timeout and the client with a large one, the server
side timeout will only start to count once the client sends everything. This
ensures we don't report a 504 before the server gets the whole request.

It is not certain whether the 1.4 state machine is fully compatible with
this change. Since the purpose is only to ensure that we never report a
server error before a client error if some data are missing from the client
and when the server-side timeout is smaller than or equal to the client's,
it's probably not worth attempting the backport.
2014-05-07 15:23:52 +02:00
Willy Tarreau
b9edf8fbec BUG/MEDIUM: http: correctly report request body timeouts
This is the continuation of previous patch "BUG/MEDIUM: http/session:
disable client-side expiration only after body".

This one takes care of properly reporting the client-side read timeout
when waiting for a body from the client. Since the timeout may happen
before or after the server starts to respond, we have to take care of
the situation in three different ways :
  - if the server does not read our data fast enough, we emit a 504
    if we're waiting for headers, or we simply break the connection
    if headers were already received. We report either sH or sD
    depending on whether we've seen headers or not.

  - if the server has not yet started to respond, but has read all of
    the client's data and we're still waiting for more data from the
    client, we can safely emit a 408 and abort the request ;

  - if the server has already started to respond (thus it's a transfer
    timeout during a bidirectional exchange), then we silently break
    the connection, and only the session flags will indicate in the
    logs that something went wrong with client or server side.

This bug is tagged MEDIUM because it touches very sensible areas, however
its impact is very low. It might be worth performing a careful backport
to 1.4 once it has been confirmed that everything is correct and that it
does not introduce any regression.
2014-05-07 15:22:27 +02:00
Willy Tarreau
b1982e27aa BUG/MEDIUM: http/session: disable client-side expiration only after body
For a very long time, back in the v1.3 days, we used to rely on a trick
to avoid expiring the client side while transferring a payload to the
server. The problem was that if a client was able to quickly fill the
buffers, and these buffers took some time to reach the server, the
client should not expire while not sending anything.

In order to cover this situation, the client-side timeout was disabled
once the connection to the server was OK, since it implied that we would
at least expire on the server if required.

But there is a drawback to this : if a client stops uploading data before
the end, its timeout is not enforced and we only expire on the server's
timeout, so the logs report a 504.

Since 1.4, we have message body analysers which ensure that we know whether
all the expected data was received or not (HTTP_MSG_DATA or HTTP_MSG_DONE).
So we can fix this problem by disabling the client-side or server-side
timeout at the end of the transfer for the respective side instead of
having it unconditionally in session.c during all the transfer.

With this, the logs now report the correct side for the timeout. Note that
this patch is not enough, because another issue remains : the HTTP body
forwarders do not abort upon timeout, they simply rely on the generic
handling from session.c. So for now, the session is still aborted when
reaching the server timeout, but the culprit is properly reported. A
subsequent patch will address this specific point.

This bug was tagged MEDIUM because of the changes performed. The issue
it fixes is minor however. After some cooling down, it may be backported
to 1.4.

It was reported by and discussed with Rachel Chavez and Patrick Hemmer
on the mailing list.
2014-05-07 14:21:47 +02:00
William Lallemand
07c8b24edb MINOR: http: export the smp_fetch_cookie function
Remove the static attribute of smp_fetch_cookie, and declare the
function in proto/proto_http.h for future use.
2014-05-02 18:05:15 +02:00
Willy Tarreau
644c101e2d BUG/MAJOR: http: connection setup may stall on balance url_param
On the mailing list, seri0528@naver.com reported an issue when
using balance url_param or balance uri. The request would sometimes
stall forever.

Cyril Bonté managed to reproduce it with the configuration below :

  listen test :80
    mode http
    balance url_param q
    hash-type consistent
    server s demo.1wt.eu:80

and found it appeared with this commit : 80a92c0 ("BUG/MEDIUM: http:
don't start to forward request data before the connect").

The bug is subtle but real. The problem is that the HTTP request
forwarding analyzer refrains from starting to parse the request
body when some LB algorithms might need the body contents, in order
to preserve the data pointer and avoid moving things around during
analysis in case a redispatch is later needed. And in order to detect
that the connection establishes, it watches the response channel's
CF_READ_ATTACHED flag.

The problem is that a request analyzer is not subscribed to a response
channel, so it will only see changes when woken for other (generally
correlated) reasons, such as the fact that part of the request could
be sent. And since the CF_READ_ATTACHED flag is cleared once leaving
process_session(), it is important not to miss it. It simply happens
that sometimes the server starts to respond in a sequence that validates
the connection in the middle of process_session(), that it is detected
after the analysers, and that the newly assigned CF_READ_ATTACHED is
not used to detect that the request analysers need to be called again,
then the flag is lost.

The CF_WAKE_WRITE flag doesn't work either because it's cleared upon
entry into process_session(), ie if we spend more than one call not
connecting.

Thus we need a new flag to tell the connection initiator that we are
specifically interested in being notified about connection establishment.
This new flag is CF_WAKE_CONNECT. It is set by the requester, and is
cleared once the connection succeeds, where CF_WAKE_ONCE is set instead,
causing the request analysers to be scanned again.

For future versions, some better options will have to be considered :
  - let all analysers subscribe to both request and response events ;
  - let analysers subscribe to stream interface events (reduces number
    of useless calls)
  - change CF_WAKE_WRITE's semantics to persist across calls to
    process_session(), but that is different from validating a
    connection establishment (eg: no data sent, or no data to send)

The bug was introduced in 1.5-dev23, no backport is needed.
2014-04-30 20:02:02 +02:00
Willy Tarreau
0b7483385e MEDIUM: http: make http-request rules processing return a verdict instead of a rule
Till now we used to return a pointer to a rule, but that makes it
complicated to later add support for registering new actions which
may fail. For example, the redirect may fail if the response is too
large to fit into the buffer.

So instead let's return a verdict. But we needed the pointer to the
last rule to get the address of a redirect and to get the realm used
by the auth page. So these pieces of code have moved into the function
and they produce a verdict.
2014-04-29 00:46:01 +02:00
Willy Tarreau
ae3c010226 MEDIUM: http: factorize the "auth" action of http-request and stats
Both use exactly the same mechanism, except for the choice of the
default realm to be emitted when none is selected. It can be achieved
by simply comparing the ruleset with the stats' for now. This achieves
a significant code reduction and further, removes the dependence on
the pointer to the final rule in the caller.
2014-04-29 00:46:01 +02:00
Willy Tarreau
f75e5c3d84 MINOR: http: remove the now unused loop over "block" rules
This ruleset is now always empty, simply remove it.
2014-04-28 22:15:00 +02:00
Willy Tarreau
353bc9f43f CLEANUP: proxy: rename "block_cond" to "block_rules"
Next patch will make them real rules, not only conditions. This separate
patch makes the next one more readable.
2014-04-28 22:05:31 +02:00
Willy Tarreau
5bd6759a19 MINOR: http: silently support the "block" action for http-request
This one will be used to convert "block" rules into "http-request block".
2014-04-28 22:00:46 +02:00
Willy Tarreau
5254259609 MEDIUM: http: remove even more of the spaghetti in the request path
Some of the remaining interleaving of request processing after the
http-request rules can now safely be removed, because all remaining
actions are mutually exclusive.

So we can move together all those related to an intercepting rule,
then proceed with stats, then with req*.

We still keep an issue with stats vs reqrep which forces us to
keep the stats split in two (detection and action). Indeed, from the
beginning, stats are detected before rewriting and not after. But a
reqdeny rule would stop stats, so in practice we have to first detect,
then perform the action. Maybe we'll be able to kill this in version
1.6.
2014-04-28 21:35:30 +02:00
Willy Tarreau
179085ccac MEDIUM: http: move Connection header processing earlier
Till now the Connection header was processed in the middle of the http-request
rules and some reqadd rules. It used to force some http-request actions to be
cut in two parts.

Now with keep-alive, not only that doesn't make any sense anymore, but it's
becoming a total mess, especially since we need to know the headers contents
before proceeding with most actions.

The real reason it was not moved earlier is that the "block" or "http-request"
rules can see a different version if some fields are changed there. But that
is already not reliable anymore since the values observed by the frontend
differ from those in the backend.

This patch is the equivalent of commit f118d9f ("REORG: http: move HTTP
Connection response header parsing earlier") but for the request side. It
has been tagged MEDIUM as it could theorically slightly affect some setups
relying on corner cases or invalid setups, though this does not make real
sense and is highly unlikely.
2014-04-28 21:35:29 +02:00
Willy Tarreau
65410831a1 BUG/MINOR: http: block rules forgot to increment the session's request counter
The session's backend request counters were incremented after the block
rules while these rules could increment the session's error counters,
meaning that we could have more errors than requests reported in a stick
table! Commit 5d5b5d8 ("MEDIUM: proto_tcp: add support for tracking L7
information") is the most responsible for this.

This bug is 1.5-specific and does not need any backport.
2014-04-28 21:34:43 +02:00
Willy Tarreau
5fa7082911 BUG/MINOR: http: block rules forgot to increment the denied_req counter
"block" rules used to build the whole response and forgot to increment
the denied_req counters. By jumping to the general "deny" label created
in previous patch, it's easier to fix this.

The issue was already present in 1.3 and remained unnoticed, in part
because few people use "block" nowadays.
2014-04-28 18:46:40 +02:00
Willy Tarreau
bbba2a8ecc MEDIUM: http: jump to dedicated labels after http-request processing
Continue the cleanup of http-request post-processing to remove some
of the interleaved tests. Here we set up a few labels to deal with
the deny and tarpit actions and avoid interleaved ifs.
2014-04-28 18:46:20 +02:00
Willy Tarreau
5e9edce0f0 MEDIUM: http: move reqadd after execution of http_request redirect
We still have a plate of spaghetti in the request processing rules.
All http-request rules are executed at once, then some responses are
built interlaced with other rules that used to be there in the past.
Here, reqadd is executed after an http-req redirect rule is *decided*,
but before it is *executed*.

So let's match the doc and config checks, to put the redirect actually
before the reqadd completely.
2014-04-28 17:25:40 +02:00
Willy Tarreau
cfe7fdd02d MINOR: http: rely on the message body parser to send 100-continue
There's no point in open-coding the sending of 100-continue in
the stats initialization code, better simply rely on the function
designed to process the message body which already does it.
2014-04-28 17:25:40 +02:00
Willy Tarreau
e6d24163e5 BUG/MINOR: http: log 407 in case of proxy auth
Commit 844a7e7 ("[MEDIUM] http: add support for proxy authentication")
merged in v1.4-rc1 added the ability to emit a status code 407 in auth
responses, but forgot to set the same status in the logs, which still
contain 401.

The bug is harmless, no backport is needed.
2014-04-28 17:24:42 +02:00
Thierry FOURNIER
e47e4e2385 BUG/MEDIUM: patterns: last fix was still not enough
Last fix did address the issue for inlined patterns, but it was not
enough because the flags are lost as well when updating patterns
dynamically over the CLI.

Also if the same file was used once with -i and another time without
-i, their references would have been merged and both would have used
the same matching method.

It's appear that the patterns have two types of flags. The first
ones are relative to the pattern matching, and the second are
relative to the pattern storage. The pattern matching flags are
the same for all the patterns of one expression. Now they are
stored in the expression. The storage flags are information
returned by the pattern mathing function. This information is
relative to each entry and is stored in the "struct pattern".

Now, the expression matching flags are forwarded to the parse
and index functions. These flags are stored during the
configuration parsing, and they are used during the parse and
index actions.

This issue was introduced in dev23 with the major pattern rework,
and is a continuation of commit a631fc8 ("BUG/MAJOR: patterns: -i
and -n are ignored for inlined patterns"). No backport is needed.
2014-04-28 14:19:17 +02:00
Willy Tarreau
a631fc8de8 BUG/MAJOR: patterns: -i and -n are ignored for inlined patterns
These flags are only passed to pattern_read_from_file() which
loads the patterns from a file. The functions used to parse the
patterns from the current line do not provide the means to pass
the pattern flags so they're lost.

This issue was introduced in dev23 with the major pattern rework,
and was reported by Graham Morley. No backport is needed.
2014-04-27 09:21:08 +02:00
Willy Tarreau
6c09c2ceae BUILD: http: remove a warning on strndup
The latest commit about set-map/add-acl/... causes this warning for
me :

src/proto_http.c: In function 'parse_http_req_cond':
src/proto_http.c:8863: warning: implicit declaration of function 'strndup'
src/proto_http.c:8863: warning: incompatible implicit declaration of built-in function 'strndup'
src/proto_http.c:8890: warning: incompatible implicit declaration of built-in function 'strndup'
src/proto_http.c:8917: warning: incompatible implicit declaration of built-in function 'strndup'
src/proto_http.c:8944: warning: incompatible implicit declaration of built-in function 'strndup'

Use my_strndup() instead of strndup() which is not portable. No backport
needed.
2014-04-25 21:39:17 +02:00
William Lallemand
73025dd7e2 MEDIUM: http: register http-request and http-response keywords
The http_(res|req)_keywords_register() functions allow to register
new keywords.

You need to declare a keyword list:

struct http_req_action_kw_list test_kws = {
	.scope = "testscope",
	.kw = {
		{ "test", parse_test },
		{ NULL, NULL },
	}
};

and a parsing function:

int parse_test(const char **args, int *cur_arg, struct proxy *px, struct http_req_rule *rule, char **err)
{
	rule->action = HTTP_REQ_ACT_CUSTOM_STOP;
	rule->action_ptr = action_function;

	return 0;
}

http_req_keywords_register(&test_kws);

The HTTP_REQ_ACT_CUSTOM_STOP action stops evaluation of rules after
your rule, HTTP_REQ_ACT_CUSTOM_CONT permits the evaluation of rules
after your rule.
2014-04-25 18:48:35 +02:00
Baptiste Assmann
fabcbe0de6 MEDIUM: http: ACL and MAP updates through http-(request|response) rules
This patch allows manipulation of ACL and MAP content thanks to any
information available in a session: source IP address, HTTP request or
response header, etc...

It's an update "on the fly" of the content  of the map/acls. This means
it does not resist to reload or restart of HAProxy.
2014-04-25 18:48:35 +02:00
Willy Tarreau
6d8bac7ddc BUG/MAJOR: http: fix the 'next' pointer when performing a redirect
Commit bed410e ("MAJOR: http: centralize data forwarding in the request path")
has woken up an issue in redirects, where msg->next is not reset when flushing
the input buffer. The result is an attempt to forward a negative amount of
data, making haproxy crash.

This bug does not seem to affect versions prior to dev23, so no backport is
needed.
2014-04-25 12:21:09 +02:00
Willy Tarreau
3c1b5ec29c MINOR: http: add capture.req.ver and capture.res.ver
These ones report a string as "HTTP/1.0" or "HTTP/1.1" depending on the
version of the request message or the response message, respectively.
The purpose is to be able to emit custom log lines reporting this version
in a persistent way.
2014-04-24 23:41:57 +02:00
Willy Tarreau
f118d9f507 REORG: http: move HTTP Connection response header parsing earlier
Currently, the parsing of the HTTP Connection header for the response
is performed at the same place as the rule sets, which means that after
parsing the beginning of the response, we still have no information on
whether the response is keep-alive compatible or not. Let's do that
earlier.

Note that this is the same code that was moved in the previous function,
both of them are always called in a row so no change of behaviour is
expected.

A future change might consist in having a late analyser to perform the
late header changes such as mangling the connection header. It's quite
painful that currently this is mixed with the rest of the processing
such as filters.
2014-04-24 22:34:30 +02:00
Willy Tarreau
70730dddbd MEDIUM: http: enable analysers to have keep-alive on stats
This allows the stats page to work in keep-alive mode and to be
compressed. At compression ratios up to 80%, it's quite interesting
for large pages.

We ensure to skip filters because we don't want to unexpectedly block
a response nor to mangle response headers.
2014-04-24 22:32:12 +02:00
Willy Tarreau
5897567273 CLEANUP: http: remove the useless "if (1)" inherited from version 1.4
This block has been enclosed inside an "if (1)" statement when migrating
1.3 to 1.4 to avoid a massive reindent. Let's get rid of it now.
2014-04-24 21:26:23 +02:00
Willy Tarreau
f1fd9dc8fb CLEANUP: general: get rid of all old occurrences of "session *t"
All the code inherited from version 1.1 still holds a lot ot sessions
called "t" because in 1.1 they were tasks. This naming is very annoying
and sometimes even confusing, for example in code involving tables.
Let's get rid of this once for all and before 1.5-final.

Nothing changed beyond just carefully renaming these variables.
2014-04-24 21:25:50 +02:00
Willy Tarreau
628c40cd96 MEDIUM: http: move skipping of 100-continue earlier
It's useless to process 100-continue in the middle of response filters
because there's no info in the 100 response itself, and it could even
make things worse. So better use it as it is, an interim response
waiting for the next response, thus we just have to put it into
http_wait_for_response(). That way we ensure to have a valid response
in this function.
2014-04-24 20:21:56 +02:00
Willy Tarreau
4d1f128a18 BUG/MEDIUM: http: 100-continue responses must process the next part immediately
Since commit d7ad9f5 ("MAJOR: channel: add a new flag CF_WAKE_WRITE to
notify the task of writes"), we got another bug with 100-continue responses.
If the final response comes in the same packet as the 100, then the rest of
the buffer is not processed since there is no wake-up event.

In fact the change above uncoverred the real culprit which is more
likely session.c which should detect that an earlier analyser was set
and should loop back to it.

A cleaner fix would be better, but setting the flag works fine.
This issue was introduced in 1.5-dev22, no backport is needed.
2014-04-24 20:21:56 +02:00
Willy Tarreau
efdf094df2 BUG/MAJOR: http: fix timeouts during data forwarding
Patches c623c17 ("MEDIUM: http: start to centralize the forwarding code")
and bed410e ("MAJOR: http: centralize data forwarding in the request path")
merged into 1.5-dev23 cause transfers to be silently aborted after the
server timeout due to the fact that the analysers are woken up when the
timeout strikes and they believe they have nothing more to do, so they're
terminating the transfer.

No backport is needed.
2014-04-24 20:21:56 +02:00
Willy Tarreau
af3cf70d7c MEDIUM: stats: reimplement HTTP keep-alive on the stats page
This basically reimplements commit f3221f9 ("MEDIUM: stats: add support
for HTTP keep-alive on the stats page") which was reverted by commit
51437d2 after Igor Chan reported a broken stats page caused by the bug
fix by previous commit.
2014-04-24 17:24:56 +02:00
Willy Tarreau
b2c6a786f7 BUG/MINOR: http: don't report server aborts as client aborts
Commit f003d37 ("BUG/MINOR: http: don't report client aborts as server errors")
attempted to fix a longstanding issue by which some client aborts could be
logged as server errors. Unfortunately, one of the tests involved there also
catches truncated server responses, which are reported as client aborts.

Instead, only check that the client has really closed using the abortonclose
option, just as in done in the request path (which means that the close was
propagated to the server).

The faulty fix above was introduced in 1.5-dev15, and was backported into
1.4.23.

Thanks to Patrick Hemmer for reporting this issue with traces showing the
root cause of the problem.
2014-04-23 20:29:01 +02:00
Willy Tarreau
38b3aa5646 BUG/MAJOR: http: fix bug in parse_qvalue() when selecting compression algo
Commit ad90351 ("MINOR: http: Add the "language" converter to for use with accept-language")
introduced a typo in parse_qvalue :

	if (*end)
		*end = qvalue;

while it should be :

	if (end)
		*end = qvalue;

Since end is tested for being NULL. This crashes when selecting the
compression algorithm since end is NULL here. No backport is needed,
this is just in latest 1.5-dev.
2014-04-22 23:32:05 +02:00
Willy Tarreau
3ce10ff9f0 CLEANUP: http: remove all calls to http_silent_debug()
This macro has long remained unused and calls are unevenly spread over
the code, so it's totally useless and pollutes the code. Remove it now.
2014-04-22 23:15:29 +02:00
Willy Tarreau
d351021860 CLEANUP: http: document the response forwarding states
The forwarding code is never obvious to enter into for newcomers, so
better improve the documentation about how states are chained and what
happens for each of them.
2014-04-22 23:15:29 +02:00
Willy Tarreau
bed410e0e8 MAJOR: http: centralize data forwarding in the request path
It is the same principle as what was just done for the response.
It makes the code cleaner, faster, and more maintainable.
2014-04-22 23:15:29 +02:00
Willy Tarreau
32b5ab2a28 MEDIUM: http: only allocate the temporary compression buffer when needed
Since we know when the buffer is needed, only check for its allocation
at the same place in order to avoid useless tests on the normal path.
2014-04-22 23:15:29 +02:00