Commit Graph

917 Commits

Author SHA1 Message Date
Willy Tarreau
068621e4ad MINOR: http: try to stick to same server after status 401/407
In HTTP keep-alive mode, if we receive a 401, we still have a chance
of being able to send the visitor again to the same server over the
same connection. This is required by some broken protocols such as
NTLM, and anyway whenever there is an opportunity for sending the
challenge to the proper place, it's better to do it (at least it
helps with debugging).
2013-12-23 15:12:44 +01:00
Willy Tarreau
2737562e43 MEDIUM: stream-int: implement a very simplistic idle connection manager
Idle connections are not monitored right now. So if a server closes after
a response without advertising it, it won't be detected until a next
request wants to use the connection. This is a bit problematic because
it unnecessarily maintains file descriptors and sockets in an idle
state.

This patch implements a very simple idle connection manager for the stream
interface. It presents itself as an I/O callback. The HTTP engine enables
it when it recycles a connection. If a close or an error is detected on the
underlying socket, it tries to drain as much data as possible from the socket,
detect the close and responds with a close as well, then detaches from the
stream interface.
2013-12-17 00:00:28 +01:00
Willy Tarreau
b169eba58d BUG/MEDIUM: http: cook_cnt() forgets to set its output type
Since comit b805f71 (MEDIUM: sample: let the cast functions set their
output type), the output type of a fetch function is automatically
considered and passed to the next converter. A bug introduced in
1.5-dev9 with commit f853c46 (MEDIUM: pattern/acl: get rid of
temp_pattern in ACLs) was revealed by this last one : the output type
remained string instead of UINT, causing the cast function to try to
cast the contents and to crash on a NULL deref.

Note: this fix was made after a careful review of all fetch functions.
A few non-trivial ones had their comments amended to clearly indicate
the output type.
2013-12-16 15:21:29 +01:00
Willy Tarreau
e8df1e128d MEDIUM: http: make option http_proxy automatically rewrite the URL
There are very few users of http_proxy, and all of them complain about
the same thing : the request is passed unmodified to the server (in its
proxy form), and it is not possible to fix it using reqrep rules because
http_proxy happens after.

So let's have http_proxy fix the URL it has analysed to get rid of the
scheme and the host part. This will do what users of this feature expect.
2013-12-16 14:30:55 +01:00
Willy Tarreau
6b726adb35 MEDIUM: http: do not report connection errors for second and further requests
In HTTP keep-alive, if we face a connection error to the server while sending
the request, the error should not be reported, and the client-side connection
should simply be closed, so that client knows it can retry. This can happen if
the server has too short a keep-alive timeout and quits at the same moment the
new request comes in.
2013-12-16 02:23:54 +01:00
Willy Tarreau
4213a11df9 MAJOR: http: add the keep-alive transition on the server side
When a connection to the server is complete, if the transaction
requests keep-alive mode, we don't shut the connection and we just
reinitialize the stream interface in order to be able to reuse the
connection afterwards.

Note that the server connection count is decremented, just like the
backend's, and that we still try to wake up waiters. But that makes
sense considering that we'll eventually be able to immediately pass
idle connections to waiters.
2013-12-16 02:23:54 +01:00
Willy Tarreau
9471b8ced9 MEDIUM: connection: inform si_alloc_conn() whether existing conn is OK or not
When allocating a new connection, only the caller knows whether it's
acceptable to reuse the previous one or not. Let's pass this information
to si_alloc_conn() which will do the cleanup if the connection is not
acceptable.
2013-12-16 02:23:53 +01:00
Willy Tarreau
2e7a165899 OPTIM: http: do not re-enable reading on client side while closing the server side
It's common to observe a an recv() call on the client side just after
the connect() to has been issued to the server side when running in
server close mode. The reason is that the whole request has been sent
and the shutw() has been queued in the channel, so the request message
switches to the MSG_CLOSED state, which didn't disable reading. Let's
do it now. That way the reading will only be re-enabled after the
response is transferred to the client. However if abortonclose is set,
we still leave it enabled.
2013-12-16 02:23:53 +01:00
Willy Tarreau
3f3997e6c6 OPTIM: http: set CF_READ_DONTWAIT on response message
strace shows a lot of EAGAIN on small response messages. This
is caused by the fact that the READ_DONTWAIT flag is not set
on response message, it's only there when we want to flush
pending data.

For small responses, it's a waste of CPU cycles to call recv()
for nothing since most of the time, everything we'll need will
be in the first response. Also, this will offer more opportunities
for using splice() to transfer data.
2013-12-16 02:23:52 +01:00
Willy Tarreau
89efaed6b6 BUILD: definitely silence some stupid GCC warnings
It's becoming increasingly difficult to ignore unwanted function returns in
debug code with gcc. Now even when you try to work around it, it suggests a
way to write your code differently. For example :

    src/frontend.c:187:65: warning: if statement has empty body [-Wempty-body]
                if (write(1, trash.str, trash.len) < 0) /* shut gcc warning */;
                                                                              ^
    src/frontend.c:187:65: note: put the semicolon on a separate line to silence this warning
    1 warning generated.

This is totally unacceptable, this code already had to be written this way
to shut it up in earlier versions. And now it comments the form ? What's the
purpose of the C language if you can't write anymore the code that does what
you want ?

Emeric proposed to just keep a global variable to drain such useless results
so that gcc stops complaining all the time it believes people who write code
are monkeys. The solution is acceptable because the useless assignment is done
only in debug code so it will not impact performance. This patch implements
this, until gcc becomes even "smarter" to detect that we tried to cheat.
2013-12-13 15:21:36 +01:00
Thierry FOURNIER
0b2fe4a5cd MINOR: pattern: add support for compiling patterns for lookups
With this patch, patterns can be compiled for two modes :
  - match
  - lookup

The match mode is used for example in ACLs or maps. The lookup mode
is used to lookup a key for pattern maintenance. For example, looking
up a network is different from looking up one address belonging to
this network.

A special case is made for regex. In lookup mode they return the input
regex string and do not compile the regex.
2013-12-12 15:44:02 +01:00
Thierry FOURNIER
7148ce6ef4 MEDIUM: pattern: Extract the index process from the pat_parse_*() functions
Now, the pat_parse_*() functions parses the incoming data. The input
"pattern" struct can be preallocated. If the parser needs to add some
buffers, it allocates memory.

The function pattern_register() runs the call to the parser, process
the key indexation and associate the "sample_storage" used by maps.
2013-12-12 15:42:11 +01:00
Thierry FOURNIER
cc0e0b3dbb MINOR: pattern: Each pattern sets the expected input type
This is used later for increasing the compability with incoming
sample types. When multiple compatible types are supported, one
is arbitrarily used (eg: UINT).
2013-12-12 11:07:33 +01:00
Willy Tarreau
2d400bb931 MINOR: stream_interface: add reporting of ressouce allocation errors
SSL and keep-alive will need to be able to fail on allocation errors,
and the stream interface did not allow to report such a cause. The flag
will then be "RC" as already documented.
2013-12-09 17:12:18 +01:00
Willy Tarreau
3770f23a3a MINOR: http: switch the http state to an enum
This reduces its size which is not reused by anything else. However it
will significantly improve the debugger's output since we'll now get
real state values.

The default case had to be enabled in the parsers because gcc tries
to optimize the switch/case and noticed some values were missing from
the enums and emitted a warning.
2013-12-09 16:06:22 +01:00
Willy Tarreau
c8987b3664 DIET/MINOR: http: reduce the size of struct http_txn by 8 bytes
Here again we had some oversized and misaligned entries. The method
and the status don't need 4 bytes each, and there was a hole after
the status that does not exist anymore. That's 8 additional bytes
saved from http_txn and as much for the session.

Also some fields were slightly moved to present better memory access
patterns resulting in a steady 0.5% performance increase.
2013-12-09 16:06:22 +01:00
Willy Tarreau
1fbe1c9ec8 MEDIUM: stream-int: return the allocated appctx in stream_int_register_handler()
The task returned by stream_int_register_handler() is never used, however we
always need to access the appctx afterwards. So make it return the appctx
instead. We already plan for it to fail, which is the reason for the addition
of a few tests and the possibility for the HTTP analyser to return a status
code 500.
2013-12-09 15:40:23 +01:00
Willy Tarreau
7b4b499fde MEDIUM: stream-int: replace occurrences of si->appctx with si_appctx()
We're about to remove si->appctx, so first let's replace all occurrences
of its usage with a dynamic extract from si->end. A lot of code was changed
by search-n-replace, but the behaviour was intentionally not altered.

The code surrounding calls to stream_int_register_handler() was slightly
changed since we can only use si->end *after* the registration.
2013-12-09 15:40:23 +01:00
Willy Tarreau
32e3c6a607 MAJOR: stream interface: dynamically allocate the outgoing connection
The outgoing connection is now allocated dynamically upon the first attempt
to touch the connection's source or destination address. If this allocation
fails, we fail on SN_ERR_RESOURCE.

As we didn't use si->conn anymore, it was removed. The endpoints are released
upon session_free(), on the error path, and upon a new transaction. That way
we are able to carry the existing server's address across retries.

The stream interfaces are not initialized anymore before session_complete(),
so we could even think about allocating them dynamically as well, though
that would not provide much savings.

The session initialization now makes use of conn_new()/conn_free(). This
slightly simplifies the code and makes it more logical. The connection
initialization code is now shorter by about 120 bytes because it's done
at once, allowing the compiler to remove all redundant initializations.

The si_attach_applet() function now takes care of first detaching the
existing endpoint, and it is called from stream_int_register_handler(),
so we can safely remove the calls to si_release_endpoint() in the
application code around this call.

A call to si_detach() was made upon stream_int_unregister_handler() to
ensure we always free the allocated connection if one was allocated in
parallel to setting an applet (eg: detect HTTP proxy while proceeding
with stats maybe).
2013-12-09 15:40:23 +01:00
Willy Tarreau
f79c8171b2 MAJOR: connection: add two new flags to indicate readiness of control/transport
Currently the control and transport layers of a connection are supposed
to be initialized when their respective pointers are not NULL. This will
not work anymore when we plan to reuse connections, because there is an
asymmetry between the accept() side and the connect() side :

  - on accept() side, the fd is set first, then the ctrl layer then the
    transport layer ; upon error, they must be undone in the reverse order,
    then the FD must be closed. The FD must not be deleted if the control
    layer was not yet initialized ;

  - on the connect() side, the fd is set last and there is no reliable way
    to know if it has been initialized or not. In practice it's initialized
    to -1 first but this is hackish and supposes that local FDs only will
    be used forever. Also, there are even less solutions for keeping trace
    of the transport layer's state.

Also it is possible to support delayed close() when something (eg: logs)
tracks some information requiring the transport and/or control layers,
making it even more difficult to clean them.

So the proposed solution is to add two flags to the connection :

  - CO_FL_CTRL_READY is set when the control layer is initialized (fd_insert)
    and cleared after it's released (fd_delete).

  - CO_FL_XPRT_READY is set when the control layer is initialized (xprt->init)
    and cleared after it's released (xprt->close).

The functions have been adapted to rely on this and not on the pointers
anymore. conn_xprt_close() was unused and dangerous : it did not close
the control layer (eg: the socket itself) but still marks the transport
layer as closed, preventing any future call to conn_full_close() from
finishing the job.

The problem comes from conn_full_close() in fact. It needs to close the
xprt and ctrl layers independantly. After that we're still having an issue :
we don't know based on ->ctrl alone whether the fd was registered or not.
For this we use the two new flags CO_FL_XPRT_READY and CO_FL_CTRL_READY. We
now rely on this and not on conn->xprt nor conn->ctrl anymore to decide what
remains to be done on the connection.

In order not to miss some flag assignments, we introduce conn_ctrl_init()
to initialize the control layer, register the fd using fd_insert() and set
the flag, and conn_ctrl_close() which unregisters the fd and removes the
flag, but only if the transport layer was closed.

Similarly, at the transport layer, conn_xprt_init() calls ->init and sets
the flag, while conn_xprt_close() checks the flag, calls ->close and clears
the flag, regardless xprt_ctx or xprt_st. This also ensures that the ->init
and the ->close functions are called only once each and in the correct order.
Note that conn_xprt_close() does nothing if the transport layer is still
tracked.

conn_full_close() now simply calls conn_xprt_close() then conn_full_close()
in turn, which do nothing if CO_FL_XPRT_TRACKED is set.

In order to handle the error path, we also provide conn_force_close() which
ignores CO_FL_XPRT_TRACKED and closes the transport and the control layers
in turns. All relevant instances of fd_delete() have been replaced with
conn_force_close(). Now we always know what state the connection is in and
we can expect to split its initialization.
2013-12-09 15:40:23 +01:00
Willy Tarreau
4bd33a9e15 MINOR: http: use conn_init() to reinitialize the server connection
It's safer and easier to proceed using this function which sets all
the required fields.
2013-12-09 15:40:23 +01:00
Willy Tarreau
b363a1f469 MAJOR: stream-int: stop using si->conn and use si->end instead
The connection will only remain there as a pre-allocated entity whose
goal is to be placed in ->end when establishing an outgoing connection.
All connection initialization can be made on this connection, but all
information retrieved should be applied to the end point only.

This change is huge because there were many users of si->conn. Now the
only users are those who initialize the new connection. The difficulty
appears in a few places such as backend.c, proto_http.c, peers.c where
si->conn is used to hold the connection's target address before assigning
the connection to the stream interface. This is why we have to keep
si->conn for now. A future improvement might consist in dynamically
allocating the connection when it is needed.
2013-12-09 15:40:22 +01:00
Willy Tarreau
9b6c2c721e MINOR: stream-int: rename ->applet to ->appctx
Since this is the applet context, call it ->appctx to avoid the confusion
with the pointer to the applet. Many places were changed but it's only a
renaming.
2013-12-09 15:40:22 +01:00
Willy Tarreau
1e6902fd6a MINOR: connection: always initialize conn->objt_type to OBJ_TYPE_CONN
We do this everywhere we prepare a connection so that we can safely
switch to objt_conn() next.
2013-12-09 15:40:22 +01:00
Willy Tarreau
414e9bb806 MEDIUM: stats: move request argument processing to the final step
At the moment, stats require some preliminary storage just to store
some flags and codes that are parsed very early and used later. In
fact that doesn't make much sense and makes it very hard to allocate
the applet dynamically.

This patch changes this. Now stats_check_uri() only checks for the
validity of the request and the fact that it matches the stats uri.

It's handle_stats() which parses it. It makes more sense because
handle_stats() used to already perform some preliminary processing
such as verifying that POST contents are not missing, etc...

There is only one minor hiccup in doing so : the reqrep rules might
be processed in between. This has been addressed by moving
http_handle_stats() just after stats_check_uri() and setting s->target
at the same time. Now that s->target is totally operational, it's used
to mark the current request as being targetted at the stats, and this
information is used after the request processing to remove the HTTP
analysers and only let the applet handle the request.

Thus we guarantee that the storage for the applet is filled with the
relevant information and not overwritten when we switch to the applet.
2013-12-09 15:40:22 +01:00
Willy Tarreau
347a35d19e MAJOR: stats: move the HTTP stats handling to its applet
There is a big trouble with the way POST is handled for the admin
stats page. The POST parameters are extracted from some http-request
rules, and if not round they return zero hoping for being called again
when more data passes. This results in the HTTP analyser being called
several times and all the rules prior to the stats being executed
multiple times as well. That includes rewrite rules.

So instead of doing this, we now move all the processing of the stats
into the stats applet.

That way we just set the stats applet in the HTTP analyser when a stats
request is detected, and the applet takes the time it needs to read the
arguments and respond. We could even imagine improving the applet to
support requests larger than a single buffer.

The code was almost only moved and minimally changed. Several new HTTP
states were added to the stats applet to emit headers, redirects and
to read POST. It was necessary to do this because the headers sent
depend on the parsing of the POST request. In the end it's beneficial
because we removed two stream_int_retnclose() calls.
2013-12-09 15:40:22 +01:00
Willy Tarreau
96d44918f7 MEDIUM: stats: prepare the HTTP stats I/O handler to support more states
In preparation for moving the POST processing to the applet, we first
add new states to the HTTP I/O handler. Till now st0 was only 0/1 for
start/end. We now replace it with an enum.
2013-12-09 15:40:22 +01:00
Willy Tarreau
4c804ec6ee MINOR: http: prevent smp_fetch_url_{ip,port} from using si->conn
These two fetch methods predate the samples and used to store the
destination address into the server-facing connection's address field
because we had no other place at this time.

This will become problematic with the current connection changes, so
let's fix this.
2013-12-09 15:40:22 +01:00
Willy Tarreau
306f8306cb MEDIUM: stats: don't use conn->xprt_ctx anymore
This field was used by dumpstats to retrieve a pointer to the current
session, which may already be found from ->owner. With this change,
the stats code doesn't need the connection at all anymore.
2013-12-09 15:40:21 +01:00
Willy Tarreau
a94d2d7653 MEDIUM: stats: don't use conn->xprt_st anymore
We're trying to move the applets out of the struct connection. So
let's remove the dependence on xprt_st and introduce si->applet.st2
to store the missing contextual data instead.
2013-12-09 15:40:21 +01:00
Willy Tarreau
08382955fe CLEANUP: stream_interface: remove unused field err_loc
This field was still fed with a pointer to the server that caught an
error but was not used anymore. Let's remove it.
2013-12-09 15:40:21 +01:00
Willy Tarreau
0900bcbdbb BUG/MEDIUM: checks: also update the DRAIN state from the web interface
In commit 8c3d0be (MEDIUM: Add DRAIN state and report it on the stats page),
the drain state was updated on every weight change except those that can be
sent via the web interface. This caused inconsistent state combinations to
be reported in the stats depending on the sequence (web then cli vs cli
then web).

It would seem that a call to set_server_drain_state() from within
server_recalc_eweight() would simplify things but that's not completely
certain yet.
2013-12-04 00:54:18 +01:00
Willy Tarreau
60e0838f60 BUG/MINOR: http: usual deinit stuff in last commit
We need to initialize the rdr_fmt list inconditionally. Using only
a redirect rule without an http-redirect may cause a crash during
deinit because of the list iterating from null.
2013-12-03 00:48:45 +01:00
Thierry FOURNIER
d18cd0f110 MEDIUM: http: The redirect strings follows the log format rules.
We handle "http-request redirect" with a log-format string now, but we
leave "redirect" unaffected.

Note that the control of the special "/" case is move from the runtime
execution to the configuration parsing. If the format rule list is
empty, the build_logline() function does nothing.
2013-12-02 23:31:33 +01:00
Willy Tarreau
0cba607400 MINOR: acl/pattern: use types different from int to clarify who does what.
We now have the following enums and all related functions return them and
consume them :

   enum pat_match_res {
	PAT_NOMATCH = 0,         /* sample didn't match any pattern */
	PAT_MATCH = 3,           /* sample matched at least one pattern */
   };

   enum acl_test_res {
	ACL_TEST_FAIL = 0,           /* test failed */
	ACL_TEST_MISS = 1,           /* test may pass with more info */
	ACL_TEST_PASS = 3,           /* test passed */
   };

   enum acl_cond_pol {
	ACL_COND_NONE,		/* no polarity set yet */
	ACL_COND_IF,		/* positive condition (after 'if') */
	ACL_COND_UNLESS,	/* negative condition (after 'unless') */
   };

It's just in order to avoid doubts when reading some code.
2013-12-02 23:31:33 +01:00
Thierry FOURNIER
a65b343eee MEDIUM: pattern: rename "acl" prefix to "pat"
This patch just renames functions, types and enums. No code was changed.
A significant number of files were touched, especially the ACL arrays,
so it is likely that some external patches will not apply anymore.

One important thing is that we had to split ACL_PAT_* into two groups :
  - ACL_TEST_{PASS|MISS|FAIL}
  - PAT_{MATCH|UNMATCH}

A future patch will enforce enums on all these places to avoid confusion.
2013-12-02 23:31:33 +01:00
Thierry FOURNIER
ed66c297c2 REORG: acl/pattern: extract pattern matching from the acl file and create pattern.c
This patch just moves code without any change.

The ACL are just the association between sample and pattern. The pattern
contains the match method and the parse method. These two things are
different. This patch cleans the code by splitting it.
2013-12-02 23:31:33 +01:00
Thierry FOURNIER
dd69a04666 MEDIUM: acl: associate "struct sample_storage" to each "struct acl_pattern"
This will be used later with maps. Each map will associate an entry with
a sample_storage value.

This patch changes the "parse" prototype and all the parsing methods.
The goal is to associate "struct sample_storage" to each entry of
"struct acl_pattern". Only the "parse" function can add the sample value
into the "struct acl_pattern".
2013-12-02 23:31:33 +01:00
Thierry FOURNIER
1c0054fe83 BUG/MINOR: arg: fix error reporting for add-header/set-header sample fetch arguments
The 'add-header %[samples]' parsing errors associated to http-request
and http-response are displayed with the wrong keyword.

Configuration entry:

   http-request set-header mon-header %[res.hdr(user-agent)]

Original error message:

   [WARNING] 323/150920 (16559) : parsing [haproxy.conf:36] : 'log-format' : sample fetch <res.hdr ...

After commit error message:

   [WARNING] 323/150929 (16580) : parsing [haproxy.conf:36] : 'http-request' : sample fetch <res.hdr ...
2013-11-28 18:25:18 +01:00
Simon Horman
58c32978b2 MEDIUM: Set rise and fall of agent checks to 1
This is achieved by moving rise and fall from struct server to struct check.

After this move the behaviour of the primary check, server->check is
unchanged. However, the secondary agent check, server->agent now has
independent rise and fall values each of which are set to 1.

The result is that receiving "fail", "stopped" or "down" just once from the
agent will mark the server as down. And receiving a weight just once will
allow the server to be marked up if its primary check is in good health.

This opens up the scope to allow the rise and fall values of the agent
check to be configurable, however this has not been implemented at this
stage.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-25 07:31:16 +01:00
Willy Tarreau
004e045f31 BUG/MAJOR: server: weight calculation fails for map-based algorithms
A crash was reported by Igor at owind when changing a server's weight
on the CLI. Lukas Tribus could reproduce a related bug where setting
a server's weight would result in the new weight being multiplied by
the initial one. The two bugs are the same.

The incorrect weight calculation results in the total farm weight being
larger than what was initially allocated, causing the map index to be out
of bounds on some hashes. It's easy to reproduce using "balance url_param"
with a variable param, or with "balance static-rr".

It appears that the calculation is made at many places and is not always
right and not always wrong the same way. Thus, this patch introduces a
new function "server_recalc_eweight()" which is dedicated to this task
of computing ->eweight from many other elements including uweight and
current time (for slowstart), and all users now switch to use this
function.

The patch is a bit large but the code was not trivially fixable in a way
that could guarantee this situation would not occur anymore. The fix is
much more readable and has been verified to work with all algorithms,
with both consistent and map-based hashes, and even with static-rr.

Slowstart was tested as well, just like enable/disable server.

The same bug is very likely present in 1.4 as well, so the patch will
probably need to be backported eventhough it will not apply as-is.

Thanks to Lukas and Igor for the information they provided to reproduce it.
2013-11-21 15:09:02 +01:00
Simon Horman
125d099662 MEDIUM: Move health element to struct check
This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-19 09:36:07 +01:00
Simon Horman
4a741432be MEDIUM: Paramatise functions over the check of a server
Paramatise the following functions over the check of a server

* set_server_down
* set_server_up
* srv_getinter
* server_status_printf
* set_server_check_status
* set_server_disabled
* set_server_enabled

Generally the server parameter of these functions has been removed.
Where it is still needed it is obtained using check->server.

This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.
By paramatising these functions they may act on each of the checks
without further significant modification.

Explanation of the SSP_O_HCHK portion of this change:

* Prior to this patch SSP_O_HCHK serves a single purpose which
  is to tell server_status_printf() weather it should print
  the details of the check of a server or not.

  With the paramatisation that this patch adds there are two cases.
  1) Printing the details of the check in which case a
     valid check parameter is needed.
  2) Not printing the details of the check in which case
     the contents check parameter are unused.

  In case 1) we could pass SSP_O_HCHK and a valid check and;
  In case 2) we could pass !SSP_O_HCHK and any value for check
  including NULL.

  If NULL is used for case 2) then SSP_O_HCHK becomes supurfulous
  and as NULL is used for case 2) SSP_O_HCHK has been removed.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-19 09:35:54 +01:00
Willy Tarreau
e155ec245a BUG/MINOR: http: fix build warning introduced with url32/url32_src
commit 39c63c5 "url32+src - like base32+src but whole url including parameters"
was missing the last argument "const char *kw", resulting in the build warning
below :

src/proto_http.c:10351:2: warning: initialization from incompatible pointer type [enabled by default]
src/proto_http.c:10351:2: warning: (near initialization for 'sample_fetch_keywords.kw[50].process') [enabled by default]
src/proto_http.c:10352:2: warning: initialization from incompatible pointer type [enabled by default]
src/proto_http.c:10352:2: warning: (near initialization for 'sample_fetch_keywords.kw[51].process') [enabled by default]

It's harmless since it's not needed there anyway.
2013-11-18 18:33:32 +01:00
Willy Tarreau
6d4890cfea BUG/MEDIUM: http: fix possible parser crash when parsing erroneous "http-request redirect" rules
Baptiste Assmann reported a bug affecting the "http-request redirect"
parser. It may randomly crash when reporting an error message if the
syntax is not OK. It happens that this is caused by the output error
message pointer which was not initialized to NULL.

This bug is 1.5-specific (introduced in dev17), no backport is needed.
2013-11-18 18:07:35 +01:00
Neil - HAProxy List
39c63c56d2 url32+src - like base32+src but whole url including parameters
I have a need to limit traffic to each url from each source address. much
like base32+src but the whole url including parameters (this came from
looking at the recent 'Haproxy rate limit per matching request' thread)

attached is patch that seems to do the job, its a copy and paste job of the
base32 functions

the url32 function seems to work too and using 2 machines to request the
same url locks me out of both if I abuse from either with the url32 key
function and only the one if I use url32_src.

Neil
2013-11-18 06:50:38 +01:00
Willy Tarreau
3b44e729e5 CLEANUP: http: merge error handling for req* and http-request *
The reqdeny/reqtarpit and http-request deny/tarpit were using
a copy-paste of the error handling code because originally the
req* actions used to maintain their own stats. This is not the
case anymore so we can use the same error blocks for both.

The http-request rulesets still has precedence over req* so no
functionality was changed.
2013-11-16 10:30:14 +01:00
Willy Tarreau
687ba13e92 CLEANUP: http: homogenize processing of denied req counter
The reqdeny/reqideny and reqtarpit/reqitarpit rules used to maintain
the stats counters themselves while http-request deny/tarpit and
rspdeny/rspideny used to centralize them at the point where the
error is processed.

Thus, let's do the same for reqdeny/reqtarpit so that the functions
which iterate over the rules do not have to deal with these counters
anymore.
2013-11-16 10:13:35 +01:00
Willy Tarreau
8ac7249611 BUG/MINOR: stats: don't count tarpitted connections twice
When a connection is tarpitted, a denied req is counted once when the
action is applied, and then a failed req is counted when the tarpit
timeout expires. This is completely wrong as the tarpit is exactly
equivalent to a deny since it's a disguised deny.

So let's not increment the failed req anymore.

This fix may be backported to 1.4 which has the same issue.
2013-11-16 10:06:44 +01:00
Thierry FOURNIER
5068d96ac1 MINOR: http: change url_decode to return the size of the decoded string.
Currently url_decode returns 1 or 0 depending on whether it could decode
the string or not. For some future use cases, it will be needed to get the
decoded string length after a successful decoding, so let's make it return
that value, and fall back to a negative one in case of error.
2013-10-23 12:26:50 +02:00
Willy Tarreau
472b1ee115 BUG/MEDIUM: http: accept full buffers on smp_prefetch_http
Bertrand Jacquin reported a but when using tcp_request content rules
on large POST HTTP requests. The issue is that smp_prefetch_http()
first tries to validate an input buffer, but only if the buffer is
not full. This test is wrong since it must only be performed after
the parsing has failed, otherwise we don't accept POST requests which
fill the buffer as valid HTTP requests.

This bug is 1.5-specific, no backport needed.
2013-10-14 22:47:00 +02:00
Willy Tarreau
7959a55e15 MINOR: http: compute response time before processing headers
At the moment, HTTP response time is computed after response headers are
processed. This can misleadingly assign to the server some heavy local
processing (eg: regex), and also prevents response headers from passing
information related to the response time (which can sometimes be useful
for stats).

Let's retrieve the reponse time before processing the headers instead.

Note that in order to remain compatible with what was previously done,
we disable the response time when we get a 502 or any bad response. This
should probably be changed in 1.6 since it does not make sense anymore
to lose this information.
2013-09-23 16:53:11 +02:00
William Lallemand
5b7ea3afa1 BUG/MEDIUM: unique_id: junk in log on empty unique_id
When a request fail, the unique_id was allocated but not generated.
The string was not initialized and junk was printed in the log with %ID.

This patch changes the behavior of the unique_id. The unique_id is now
generated when a request failed.

This bug was reported by Patrick Hemmer.
2013-08-31 08:01:14 +02:00
Willy Tarreau
9f09521f2d BUG/MEDIUM: unique_id: HTTP request counter must be unique!
The HTTP request counter is incremented non atomically, which means that
many requests can log the same ID. Let's increment it when it is consumed
so that we avoid this case.

This bug was reported by Patrick Hemmer. It's 1.5-specific and does not
need to be backported.
2013-08-13 17:52:20 +02:00
Willy Tarreau
ef38c39287 MEDIUM: sample: systematically pass the keyword pointer to the keyword
We're having a lot of duplicate code just because of minor variants between
fetch functions that could be dealt with if the functions had the pointer to
the original keyword, so let's pass it as the last argument. An earlier
version used to pass a pointer to the sample_fetch element, but this is not
the best solution for two reasons :
  - fetch functions will solely rely on the keyword string
  - some other smp_fetch_* users do not have the pointer to the original
    keyword and were forced to pass NULL.

So finally we're passing a pointer to the keyword as a const char *, which
perfectly fits the original purpose.
2013-08-01 21:17:13 +02:00
Willy Tarreau
276fae9ab9 MINOR: samples: add the http_date([<offset>]) sample converter.
Converts an integer supposed to contain a date since epoch to
a string representing this date in a format suitable for use
in HTTP header fields. If an offset value is specified, then
it is a number of seconds that is added to the date before the
conversion is operated. This is particularly useful to emit
Date header fields, Expires values in responses when combined
with a positive offset, or Last-Modified values when the
offset is negative.
2013-07-25 15:00:38 +02:00
Willy Tarreau
506d050600 BUG/MAJOR: http: sample prefetch code was not properly migrated
When ACLs and samples were converged in 1.5-dev18, function
"acl_prefetch_http" was not properly converted after commit 8ed669b1.
It used to return -1 when contents did not match HTTP traffic, which
was considered as a "true" boolean result by the ACL execution code,
possibly causing crashes due to missing data when checking for HTTP
traffic in TCP rules.

Another issue is that when the function returned zero, it did not
set tje SMP_F_MAY_CHANGE flag, so it could randomly exit on partial
requests before waiting for a complete one.

Last issue is that when it returned 1, it did not set smp->data.uint,
so this last one would retain a random value from a past execution.
This could randomly cause some matches to fail as well.

Thanks to Remo Eichenberger for reporting this issue with a detailed
explanation and configuration.

This bug is 1.5-specific, no backport is needed.
2013-07-06 13:36:34 +02:00
Willy Tarreau
5b15f9004d BUG/MEDIUM: http: "option checkcache" fails with the no-cache header
The checkcache option checks for cacheable responses with a set-cookie
header. Since the response processing code was refactored in 1.3.8
(commit a15645d4), the check was broken because the no-cache value
is only checked as no-cache="set-cookie", and not alone.

Thanks to Herv Commowick for reporting this stupid bug!

The fix should be backported to 1.4 and 1.3.
2013-07-04 12:49:28 +02:00
Lukas Tribus
67db8df12b MEDIUM: http: add IPv6 support for "set-tos"
As per RFC3260 #4 and BCP37 #4.2 and #5.2, the IPv6 counterpart of TOS
is "traffic class".

Add support for IPv6 traffic class in "set-tos" by moving the "set-tos"
related code to the new inline function inet_set_tos(), handling IPv4
(IP_TOS), IPv6 (IPV6_TCLASS) and IPv4-mapped sockets (IP_TOS, like
::ffff:127.0.0.1).

Also define - if missing - the IN6_IS_ADDR_V4MAPPED() macro in
include/common/compat.h for compatibility.
2013-06-23 18:01:38 +02:00
Lukas Tribus
2dd1d1a93f BUG/MINOR: http: fix "set-tos" not working in certain configurations
s->req->prod->conn->addr.to.ss_family contains only useful data if
conn_get_to_addr() is called early. If thats not the case (nothing in the
configuration needs the destination address like logs, transparent, ...)
then "set-tos" doesn't work.

Fix this by checking s->req->prod->conn->addr.from.ss_family instead.
Also fix a minor doc issue about set-tos in http-response.
2013-06-23 18:01:31 +02:00
Willy Tarreau
dc13c11c1e BUG/MEDIUM: prevent gcc from moving empty keywords lists into BSS
Benoit Dolez reported a failure to start haproxy 1.5-dev19. The
process would immediately report an internal error with missing
fetches from some crap instead of ACL names.

The cause is that some versions of gcc seem to trim static structs
containing a variable array when moving them to BSS, and only keep
the fixed size, which is just a list head for all ACL and sample
fetch keywords. This was confirmed at least with gcc 3.4.6. And we
can't move these structs to const because they contain a list element
which is needed to link all of them together during the parsing.

The bug indeed appeared with 1.5-dev19 because it's the first one
to have some empty ACL keyword lists.

One solution is to impose -fno-zero-initialized-in-bss to everyone
but this is not really nice. Another solution consists in ensuring
the struct is never empty so that it does not move there. The easy
solution consists in having a non-null list head since it's not yet
initialized.

A new "ILH" list head type was thus created for this purpose : create
an Initialized List Head so that gcc cannot move the struct to BSS.
This fixes the issue for this version of gcc and does not create any
burden for the declarations.
2013-06-21 23:29:02 +02:00
Willy Tarreau
67dad2715b BUG/CRITICAL: fix a possible crash when using negative header occurrences
When a config makes use of hdr_ip(x-forwarded-for,-1) or any such thing
involving a negative occurrence count, the header is still parsed in the
order it appears, and an array of up to MAX_HDR_HISTORY entries is created.
When more entries are used, the entries simply wrap and continue this way.

A problem happens when the incoming header field count exactly divides
MAX_HDR_HISTORY, because the computation removes the number of requested
occurrences from the count, but does not care about the risk of wrapping
with a negative number. Thus we can dereference the array with a negative
number and randomly crash the process.

The bug is located in http_get_hdr() in haproxy 1.5, and get_ip_from_hdr2()
in haproxy 1.4. It affects configurations making use of one of the following
functions with a negative <value> occurence number :

   - hdr_ip(<name>, <value>)  (in 1.4)
   - hdr_*(<name>, <value>)   (in 1.5)

It also affects "source" statements involving "hdr_ip(<name>)" since that
statement implicitly uses -1 for <value> :

   - source 0.0.0.0 usesrc hdr_ip(<name>)

A workaround consists in rejecting dangerous requests early using
hdr_cnt(<name>), which is available both in 1.4 and 1.5 :

   block if { hdr_cnt(<name>) ge 10 }

This bug has been present since the introduction of the negative offset
count in 1.4.4 via commit bce70882. It has been reported by David Torgerson
who offered some debugging traces showing where the crash happened, thus
making it significantly easier to find the bug!

CVE-2013-2175 was assigned to this bug.

This fix must absolutely be backported to 1.4.
2013-06-17 12:00:22 +02:00
Willy Tarreau
3c4beb1feb CLEANUP: http: remove the bogus urlp_ip ACL match
This one is wrong, never matches and cannot work. It was brought by a blind
copy-paste from the url_* version in 1.5-dev9, but there is no underlying
fetch returning an IP type for this.
2013-06-12 22:26:04 +02:00
Willy Tarreau
c32484ed35 MEDIUM: acl: remove 15 additional useless ACLs that are equivalent to their fetches
The following 15 ACLs were missed from previous review, and are not needed
either.

hdr_cnt, hdr_ip, hdr_val, rep_ssl_hello_type, req_len, req_ssl_hello_type,
scook_cnt, scook_val, shdr_cnt, shdr_ip, shdr_val, url_ip, url_port,
urlp_val, req_proto_http.
2013-06-12 22:23:40 +02:00
Willy Tarreau
6d4e4e8dd2 MEDIUM: acl: remove a lot of useless ACLs that are equivalent to their fetches
The following 116 ACLs were removed because they're redundant with their
fetch function since last commit which allows the fetch function to be
used instead for types BOOL, INT and IP. Most places are now left with
an empty ACL keyword list that was not removed so that it's easier to
add other ACLs later.

always_false, always_true, avg_queue, be_conn, be_id, be_sess_rate, connslots,
nbsrv, queue, srv_conn, srv_id, srv_is_up, srv_sess_rate, res.comp, fe_conn,
fe_id, fe_sess_rate, dst_conn, so_id, wait_end, http_auth, http_first_req,
status, dst, dst_port, src, src_port, sc1_bytes_in_rate, sc1_bytes_out_rate,
sc1_clr_gpc0, sc1_conn_cnt, sc1_conn_cur, sc1_conn_rate, sc1_get_gpc0,
sc1_gpc0_rate, sc1_http_err_cnt, sc1_http_err_rate, sc1_http_req_cnt,
sc1_http_req_rate, sc1_inc_gpc0, sc1_kbytes_in, sc1_kbytes_out, sc1_sess_cnt,
sc1_sess_rate, sc1_tracked, sc1_trackers, sc2_bytes_in_rate,
sc2_bytes_out_rate, sc2_clr_gpc0, sc2_conn_cnt, sc2_conn_cur, sc2_conn_rate,
sc2_get_gpc0, sc2_gpc0_rate, sc2_http_err_cnt, sc2_http_err_rate,
sc2_http_req_cnt, sc2_http_req_rate, sc2_inc_gpc0, sc2_kbytes_in,
sc2_kbytes_out, sc2_sess_cnt, sc2_sess_rate, sc2_tracked, sc2_trackers,
sc3_bytes_in_rate, sc3_bytes_out_rate, sc3_clr_gpc0, sc3_conn_cnt,
sc3_conn_cur, sc3_conn_rate, sc3_get_gpc0, sc3_gpc0_rate, sc3_http_err_cnt,
sc3_http_err_rate, sc3_http_req_cnt, sc3_http_req_rate, sc3_inc_gpc0,
sc3_kbytes_in, sc3_kbytes_out, sc3_sess_cnt, sc3_sess_rate, sc3_tracked,
sc3_trackers, src_bytes_in_rate, src_bytes_out_rate, src_clr_gpc0,
src_conn_cnt, src_conn_cur, src_conn_rate, src_get_gpc0, src_gpc0_rate,
src_http_err_cnt, src_http_err_rate, src_http_req_cnt, src_http_req_rate,
src_inc_gpc0, src_kbytes_in, src_kbytes_out, src_sess_cnt, src_sess_rate,
src_updt_conn_cnt, table_avl, table_cnt, ssl_c_ca_err, ssl_c_ca_err_depth,
ssl_c_err, ssl_c_used, ssl_c_verify, ssl_c_version, ssl_f_version, ssl_fc,
ssl_fc_alg_keysize, ssl_fc_has_crt, ssl_fc_has_sni, ssl_fc_use_keysize,
2013-06-11 21:22:58 +02:00
Willy Tarreau
51347ed94c MEDIUM: http: add the "set-mark" action on http-request/http-response rules
"set-mark" is used to set the Netfilter MARK on all packets sent to the
client to the value passed in <mark> on platforms which support it. This
value is an unsigned 32 bit value which can be matched by netfilter and
by the routing table. It can be expressed both in decimal or hexadecimal
format (prefixed by "0x"). This can be useful to force certain packets to
take a different route (for example a cheaper network path for bulk
downloads). This works on Linux kernels 2.6.32 and above and requires
admin privileges.
2013-06-11 19:34:13 +02:00
Willy Tarreau
42cf39e3b9 MEDIUM: http: add support for "set-tos" in http-request/http-response
This manipulates the TOS field of the IP header of outgoing packets sent
to the client. This can be used to set a specific DSCP traffic class based
on some request or response information. See RFC2474, 2597, 3260 and 4594
for more information.
2013-06-11 19:04:37 +02:00
Willy Tarreau
9a355ec257 MEDIUM: http: add support for action "set-log-level" in http-request/http-response
Some users want to disable logging for certain non-important requests such as
stats requests or health-checks coming from another equipment. Other users want
to log with a higher importance (eg: notice) some special traffic (POST requests,
authenticated requests, requests coming from suspicious IPs) or some abnormally
large responses.

This patch responds to all these needs at once by adding a "set-log-level" action
to http-request/http-response. The 8 syslog levels are supported, as well as "silent"
to disable logging.
2013-06-11 17:50:26 +02:00
Willy Tarreau
abcd5145f8 MEDIUM: log: add a log level override value in struct session
This log level will be used in a further patch to change the log level
depending on the request or response.
2013-06-11 17:50:26 +02:00
Willy Tarreau
f4c43c13be MEDIUM: http: add the "set-nice" action to http-request and http-response
This new action changes the nice factor of the task processing the current
request.
2013-06-11 17:50:26 +02:00
Willy Tarreau
e365c0b92b MEDIUM: http: add a new "http-response" ruleset
Some actions were clearly missing to process response headers. This
patch adds a new "http-response" ruleset which provides the following
actions :
  - allow : stop evaluating http-response rules
  - deny : stop and reject the response with a 502
  - add-header : add a header in log-format mode
  - set-header : set a header in log-format mode
2013-06-11 16:06:12 +02:00
Willy Tarreau
04ff9f105f MINOR: http: add full-length header fetch methods
The req.hdr and res.hdr fetch methods do not work well on headers which
are allowed to contain commas, such as User-Agent, Date or Expires.
More specifically, full-length matching is impossible if a comma is
present.

This patch introduces 4 new fetch functions which are designed to work
with these full-length headers :
  - req.fhdr, req.fhdr_cnt
  - res.fhdr, res.fhdr_cnt

These ones do not stop at commas and permit to return full-length header
values.
2013-06-10 18:39:42 +02:00
Willy Tarreau
570f221cbb MINOR: log: add a new flag 'L' for locally processed requests
People who use "option dontlog-normal" are bothered with redirects and
stats being logged and reported as errors in the logs ("PR" = proxy
blocked the request).

This patch introduces a new flag 'L' for when a request is locally
processed, that is not considered as an error by the log filters. That
way we know a request was intercepted and processed by haproxy without
logging the line when "option dontlog-normal" is in effect.
2013-06-10 16:42:09 +02:00
Willy Tarreau
379357af58 BUG/MAJOR: http: always ensure response buffer has some room for a response
Since 1.5-dev12 and commit 3bf1b2b8 (MAJOR: channel: stop relying on
BF_FULL to take action), the HTTP parser switched to channel_full()
instead of BF_FULL to decide whether a buffer had enough room to start
parsing a request or response. The problem is that channel_full()
intentionally ignores outgoing data, so a corner case exists where a
large response might still be left in a response buffer with just a
few bytes left (much less than the reserve), enough to accept a second
response past the last data, but not enough to permit the HTTP processor
to add some headers. Since all the processing relies on this space being
available, we can get some random crashes when clients pipeline requests.

The analysis of a core from haproxy configured with 20480 bytes buffers
shows this : with enough "luck", when sending back the response for the
first request, the client is slow, the TCP window is congested, the socket
buffers are full, and haproxy's buffer fills up. We still have 20230 bytes
of response data in a 20480 response buffer. The second request is sent to
the server which returns 214 bytes which fit in the small 250 bytes left
in this buffer. And the buffer arrangement makes it possible to escape all
the controls in http_wait_for_response() :

    |<------ response buffer = 20480 bytes ------>|
    [ 2/2  | 3 | 4 |          1/2                 ]
           ^ start of circular buffer

      1/2 = beginning of previous response (18240)
      2/2 = end of previous response       (1990)
        3 = current response               (214)
        4 = free space                     (36)

  - channel_full() returns false (20230 bytes are going to leave)
  - the response headers does not wrap at the end of the buffer
  - the remaining linear room after the headers is larger than the
    reserve, because it's the previous response which wraps :
  => response is processed

Header rewriting causes it to reach 260 bytes, 10 bytes larger than what
the buffer could hold. So all computations during header addition are
wrong and lead to the corruption we've observed.

All the conditions are very hard to meet (which explains why it took
almost one year for this bug to show up) and are almost impossible to
reproduce on purpose on a test platform. But the bug is clearly there.

This issue was reported by Dinko Korunic who kindly devoted a lot of
time to provide countless traces and cores, and to experiment with
troubleshooting patches to knock the bug down. Thanks Dinko!

No backport is needed, but all 1.5-dev versions between dev12 and dev18
included must be upgraded. A workaround consists in setting option
forceclose to prevent pipelined requests from being processed.
2013-06-08 13:14:17 +02:00
Willy Tarreau
7fe3300b76 BUG/MEDIUM: stats: fix a regression when dealing with POST requests
In 1.5-dev17 (commit 1facd6d6), we reorganized the way HTTP stats
requests are handled. When moving the code, we dropped a "return 0"
which happens upon incomplete POST request, so we now end up with
the next return 1 which causes processing to go on with next
analyser. This causes incomplete POST requests to try to forward
the request to servers, resulting in either a 404 or a 503 depending
on the configuration.

This patch fixes this regression to restore the previous behaviour.
It's not enough though, as it happens that the stats code is handled
after all http header processing but in the same function. The net
effect is that incomplete requests cause the headers manipulation to
be performed multiple times, possibly resulting in multiple headers
in the request buffer. Since the stats requests are not meant to be
forwarded, it's not an issue yet but this is something to take care
of later.

A remaining issue that's not handled yet is that if the client does
not send the complete POST headers, then the request is finally
forwarded. This is not a regression, it has always been there and
seems to be caused by the lack of timeout processing when waiting
for the POST body. The solution to this issue would be to move the
handling of stats requests into a dedicated analyser placed after
http_process_request_body().

Bug reported by Guillaume de Lafond.
2013-04-21 08:16:10 +02:00
de Lafond Guillaume
88c278fadf MEDIUM: stats: add proxy name filtering on the statistic page
This patch adds a "scope" box in the statistics page in order to
display only proxies with a name that contains the requested value.
The scope filter is preserved across all clicks on the page.
2013-04-15 22:50:33 +02:00
Willy Tarreau
667c2a3d2a BUG/MAJOR: http: compression still has defects on chunked responses
The compression state machine happens to start work it cannot undo if
there's no more data in the input buffer, and has trouble accounting
for it. Fixing it requires more than a few lines, as the confusion is
in part caused by the way the pointers to the various places in the
message are handled internally. So as a temporary fix, let's disable
compression on chunk-encoded responses. This will give us more time
to perform the required changes.
2013-04-14 23:32:53 +02:00
Willy Tarreau
8d1c5164f3 BUG/MINOR: http: add-header/set-header did not accept the ACL condition
Sander Klein reported this bug. The test for the extra argument on these
rules prevent any condition from being added. The bug was introduced with
the feature itself in 1.5-dev16.
2013-04-03 14:13:58 +02:00
Willy Tarreau
a4312fa28e MAJOR: sample: maintain a per-proxy list of the fetch args to resolve
While ACL args were resolved after all the config was parsed, it was not the
case with sample fetch args because they're almost everywhere now.

The issue is that ACLs now solely rely on sample fetches, so their args
resolving doesn't work anymore. And many fetches involving a server, a
proxy or a userlist don't work at all.

The real issue is that at the bottom layers we have no information about
proxies, line numbers, even ACLs in order to report understandable errors,
and that at the top layers we have no visibility over the locations where
fetches are referenced (think log node).

After failing multiple unsatisfying solutions attempts, we now have a new
concept of args list. The principle is that every proxy has a list head
which contains a number of indications such as the config keyword, the
context where it's used, the file and line number, etc... and a list of
arguments. This list head is of the same type as the elements, so it
serves as a template for adding new elements. This way, it is filled from
top to bottom by the callers with the information they have (eg: line
numbers, ACL name, ...) and the lower layers just have to duplicate it and
add an element when they face an argument they cannot resolve yet.

Then at the end of the configuration parsing, a loop passes over each
proxy's list and resolves all the args in sequence. And this way there is
all necessary information to report verbose errors.

The first immediate benefit is that for the first time we got very precise
location of issues (arg number in a keyword in its context, ...). Second,
in order to do this we had to parse log-format and unique-id-format a bit
earlier, so that was a great opportunity for doing so when the directives
are encountered (unless it's a default section). This way, the recorded
line numbers for these args are the ones of the place where the log format
is declared, not the end of the file.

Userlists report slightly more information now. They're the only remaining
ones in the ACL resolving function.
2013-04-03 02:13:02 +02:00
Willy Tarreau
0a0daecbb2 MEDIUM: http: remove val_usr() to validate user_lists
This one was incorrect since it tried to validate the user-lists
before end of parsing.
2013-04-03 02:13:02 +02:00
Willy Tarreau
ff5afcc32b MINOR: http: replace acl_parse_ver with acl_parse_str
The HTTP version parser used in ACLs has long been a string and
still had its own parser. This makes no sense, switch it to use
the standard string parser.
2013-04-03 02:13:01 +02:00
Willy Tarreau
d86e29d2a1 CLEANUP: acl: remove unused references to ACL_USE_*
Now that acl->requires is not used anymore, we can remove all references
to it as well as all ACL_USE_* flags.
2013-04-03 02:13:00 +02:00
Willy Tarreau
18ed2569f5 MINOR: http: add new direction-explicit sample fetches for headers and cookies
Since "hdr" and "cookie" were ambiguously referring to the request or response
depending on the context, we need a way to explicitly specify the direction.
By prefixing the fetches names with "req." and "res.", we can now restrict such
fetches to the appropriate direction. At the moment the fetches are explicitly
declared by later we might think about having an automatic match when "req." or
"res." appears. These explicit fetches are now used by the relevant ACLs.
2013-04-03 02:12:59 +02:00
Willy Tarreau
9baae63d8d MAJOR: acl: remove fetch argument validation from the ACL struct
ACL fetch being inherited from the sample fetch keyword, we don't need
anymore to specify what function to use to validate the fetch arguments.

Note that the job is still done in the ACL parsing code based on elements
from the sample fetch structs.
2013-04-03 02:12:59 +02:00
Willy Tarreau
c48c90dfa5 MAJOR: acl: remove the arg_mask from the ACL definition and use the sample fetch's
Now that ACLs solely rely on sample fetch functions, make them use the
same arg mask. All inconsistencies have been fixed separately prior to
this patch, so this patch almost only adds a new pointer indirection
and removes all references to ARG*() in the definitions.

The parsing is still performed by the ACL code though.
2013-04-03 02:12:58 +02:00
Willy Tarreau
8ed669b12a MAJOR: acl: make all ACLs reference the fetch function via a sample.
ACL fetch functions used to directly reference a fetch function. Now
that all ACL fetches have their sample fetches equivalent, we can make
ACLs reference a sample fetch keyword instead.

In order to simplify the code, a sample keyword name may be NULL if it
is the same as the ACL's, which is the most common case.

A minor change appeared, http_auth always expects one argument though
the ACL allowed it to be missing and reported as such afterwards, so
fix the ACL to match this. This is not really a bug.
2013-04-03 02:12:58 +02:00
Willy Tarreau
409bcde176 MEDIUM: http: unify acl and sample fetch functions
The following sample fetch functions were only usable by ACLs but are now
usable by sample fetches too :

    cook, cook_cnt, cook_val, hdr_cnt, hdr_ip, hdr_val, http_auth,
    http_auth_group, http_first_req, method, req_proto_http, req_ver,
    resp_ver, scook, scook_cnt, scook_val, shdr, shdr_cnt, shdr_ip,
    shdr_val, status, urlp, urlp_val,

Most of them won't bring much benefit at the moment, or are even aliases of
existing ones, however they'll be needed for ACL->SMP convergence.

A new val_usr() function was added to resolve userlist names into pointers.

The http_auth_group ACL forgot to make its first argument mandatory, so
there was a check in cfgparse to report a vague error. Now that args are
correctly parsed, let's report something more precise.

All urlp* ACLs now support an optional 3rd argument like their sample
counter-part which is the optional delimiter.

The fetch functions have been renamed "smp_fetch_*".

Some args controls on the sample keywords have been relaxed so that we
can soon use them for ACLs :

  - cookie now accepts to have an optional name ; it will return the
    first matching cookie if the name is not set ;
  - same for set-cookie and hdr
2013-04-03 02:12:57 +02:00
Willy Tarreau
434c57c95c MINOR: log: indicate it when some unreliable sample fetches are logged
If a log-format involves some sample fetches that may not be present at
the logging instant, we can now report a warning.

Note that this is done both for log-format and for add-header and carefully
respects the original fetch keyword's capabilities.
2013-04-03 02:12:56 +02:00
Willy Tarreau
80aca90ad2 MEDIUM: samples: use new flags to describe compatibility between fetches and their usages
Samples fetches were relying on two flags SMP_CAP_REQ/SMP_CAP_RES to describe
whether they were compatible with requests rules or with response rules. This
was never reliable because we need a finer granularity (eg: an HTTP request
method needs to parse an HTTP request, and is available past this point).

Some fetches are also dependant on the context (eg: "hdr" uses request or
response depending where it's involved, causing some abiguity).

In order to solve this, we need to precisely indicate in fetches what they
use, and their users will have to compare with what they have.

So now we have a bunch of bits indicating where the sample is fetched in the
processing chain, with a few variants indicating for some of them if it is
permanent or volatile (eg: an HTTP status is stored into the transaction so
it is permanent, despite being caught in the response contents).

The fetches also have a second mask indicating their validity domain. This one
is computed from a conversion table at registration time, so there is no need
for doing it by hand. This validity domain consists in a bitmask with one bit
set for each usage point in the processing chain. Some provisions were made
for upcoming controls such as connection-based TCP rules which apply on top of
the connection layer but before instantiating the session.

Then everywhere a fetch is used, the bit for the control point is checked in
the fetch's validity domain, and it becomes possible to finely ensure that a
fetch will work or not.

Note that we need these two separate bitfields because some fetches are usable
both in request and response (eg: "hdr", "payload"). So the keyword will have
a "use" field made of a combination of several SMP_USE_* values, which will be
converted into a wider list of SMP_VAL_* flags.

The knowledge of permanent vs dynamic information has disappeared for now, as
it was never used. Later we'll probably reintroduce it differently when
dealing with variables. Its only use at the moment could have been to avoid
caching a dynamic rate measurement, but nothing is cached as of now.
2013-04-03 02:12:56 +02:00
Willy Tarreau
e0db1e8946 MEDIUM: acl: remove flag ACL_MAY_LOOKUP which is improperly used
This flag is used on ACL matches that support being looking up patterns
in trees. At the moment, only strings and IPs support tree-based lookups,
but the flag is randomly set also on integers and binary data, and is not
even always set on strings nor IPs.

Better get rid of this mess by only relying on the matching function to
decide whether or not it supports tree-based lookups, this is safer and
easier to maintain.
2013-04-03 02:12:56 +02:00
Willy Tarreau
aae75e3279 BUG/CRITICAL: using HTTP information in tcp-request content may crash the process
During normal HTTP request processing, request buffers are realigned if
there are less than global.maxrewrite bytes available after them, in
order to leave enough room for rewriting headers after the request. This
is done in http_wait_for_request().

However, if some HTTP inspection happens during a "tcp-request content"
rule, this realignment is not performed. In theory this is not a problem
because empty buffers are always aligned and TCP inspection happens at
the beginning of a connection. But with HTTP keep-alive, it also happens
at the beginning of each subsequent request. So if a second request was
pipelined by the client before the first one had a chance to be forwarded,
the second request will not be realigned. Then, http_wait_for_request()
will not perform such a realignment either because the request was
already parsed and marked as such. The consequence of this, is that the
rewrite of a sufficient number of such pipelined, unaligned requests may
leave less room past the request been processed than the configured
reserve, which can lead to a buffer overflow if request processing appends
some data past the end of the buffer.

A number of conditions are required for the bug to be triggered :
  - HTTP keep-alive must be enabled ;
  - HTTP inspection in TCP rules must be used ;
  - some request appending rules are needed (reqadd, x-forwarded-for)
  - since empty buffers are always realigned, the client must pipeline
    enough requests so that the buffer always contains something till
    the point where there is no more room for rewriting.

While such a configuration is quite unlikely to be met (which is
confirmed by the bug's lifetime), a few people do use these features
together for very specific usages. And more importantly, writing such
a configuration and the request to attack it is trivial.

A quick workaround consists in forcing keep-alive off by adding
"option httpclose" or "option forceclose" in the frontend. Alternatively,
disabling HTTP-based TCP inspection rules enough if the application
supports it.

At first glance, this bug does not look like it could lead to remote code
execution, as the overflowing part is controlled by the configuration and
not by the user. But some deeper analysis should be performed to confirm
this. And anyway, corrupting the process' memory and crashing it is quite
trivial.

Special thanks go to Yves Lafon from the W3C who reported this bug and
deployed significant efforts to collect the relevant data needed to
understand it in less than one week.

CVE-2013-1912 was assigned to this issue.

Note that 1.4 is also affected so the fix must be backported.
2013-04-03 02:12:55 +02:00
Willy Tarreau
2d43e18b69 BUG/MAJOR: http: fix regression introduced by commit d655ffe
Sander Klein reported that since last snapshot, some downloads would
hang from nginx but succeed from apache. The culprit was not too hard
to find given the low number of recent changes affecting the data path.

Commit d655ffe slightly reorganized the HTTP state machine and
introduced this regression. The reason is that we must never jump
into the MSG_DONE case without first flushing remaining data because
this is not done anymore afterwards. This part is scheduled for
being reorganized since it's totally ugly especially since we added
compression, and this regression is an illustration of its readability.

The issue is entirely dependant on the server close sequence, which
explains why it was reproducible only with nginx here.
2013-04-03 00:22:25 +02:00
Willy Tarreau
ffb6f08bab BUG/MAJOR: http: fix regression introduced by commit a890d072
This commit fixed a bug and introduced a new one at the same time.
It's a stupid typo, the index to store the context is [0], not [2].

The effect is that parsing the header can loop forever if multiple
headers are found. This issue was reported by Lukas Tribus.
2013-04-02 23:19:30 +02:00
Willy Tarreau
a890d072fc BUG/MAJOR: http: use a static storage for sample fetch context
Baptiste Assmann reported that the cook*() ACLs do not work anymore.

The reason is the way we store the hdr_ctx between subsequent calls
to smp_fetch_cookie() since commit 3740635b (1.5-dev10).

The smp->ctx.a[] storage holds up to 8 pointers. It is not meant for
generic storage. We used to store hdr_ctx in the ctx, but while it used
to just fit for smp_fetch_hdr(), it does not for smp_fetch_cookie()
since we stored it at offset 2.

The correct solution is to use this storage to store a pointer to the
current hdr_ctx struct which is statically allocated.
2013-04-02 12:01:06 +02:00
Willy Tarreau
d655ffe863 OPTIM: http: optimize the response forward state machine
By replacing the if/else series with a switch/case, we could save
another 20% on the worst case (chunks of 1 byte).
2013-04-02 02:01:00 +02:00
Willy Tarreau
0161d62d23 OPTIM: http: improve branching in chunk size parser
By tweaking a bit some conditions in http_parse_chunk_size(), we could
improve the overall performance in the worst case by 15%.
2013-04-02 02:00:57 +02:00
Yves Lafon
e267421e93 MINOR: http: status 301 should not be marked non-cacheable
Also, browsers behaviour is inconsistent regarding the Cache-Control
header field on a 301.
2013-03-30 11:22:41 +01:00
Yves Lafon
3e8d1ae2d2 MEDIUM: http: implement redirect 307 and 308
I needed to emit a 307 and noticed it was not available so I did it,
as well as 308.
2013-03-29 19:17:41 +01:00
Yves Lafon
4e8ec500e5 MINOR: http: status code 303 is HTTP/1.1 only
Don't return a 303 redirect with "HTTP/1.0" as it's HTTP/1.1 only.
2013-03-29 19:08:09 +01:00
Willy Tarreau
2fef9b1ef6 BUG/MEDIUM: http: fix another issue caused by http-send-name-header
An issue reported by David Coulson is that when using http-send-name-header,
the response processing would randomly be performed. The issue was first
diagnosed by Cyril Bont as being related to a time race when processing
the closing of the response.

In practice, the issue is a bit trickier. It happens that
http_send_name_header() did not update msg->sol after a rewrite. This
counter is supposed to point to the beginning of the message's body
once headers are scheduled for being forwarded. And not updating it
means that the first forwarding of the request headers in
http_request_forward_body() does not send the correct count, leaving
some bytes in chn->to_forward.

Then if the server sends its response in a single packet with the
close, the stream interface switches to state SI_ST_DIS which in
turn moves to SI_ST_CLO in process_session(), and to close the
outgoing connection. This is detected by http_request_forward_body(),
which then switches the request message to the error state, and syncs
all FSMs and removes any response analyser.

The response analyser being removed, no processing is performed on
the response buffer, which is tunnelled as-is to the client.

Of course, the correct fix consists in having http_send_name_header()
update msg->sol. Normally this ought not to have been needed, but it
is an abuse to modify data already scheduled for being forwarded, so
it is expected that such specific handling has to be done there. Better
not have generic functions deal with such cases, so that it does not
become the standard.

Note: 1.4 does not have this issue even if it does not update the
pointer either, because it forwards from msg->som which is not
updated at the moment the connect() succeeds. So no backport is
required.
2013-03-26 01:21:47 +01:00
Willy Tarreau
3bfeadb3f6 BUG/MEDIUM: http: add-header should not emit "-" for empty fields
Patch 6cbbdbf3 fixed the missing "-" delimitors in logs but it caused
them to be emitted with "http-request add-header", eventhough it was
correctly fixed for the unique-id format. Fix this by simply removing
LOG_OPT_MANDATORY in this case.
2013-03-24 07:33:22 +01:00
Willy Tarreau
6cbbdbf3f3 BUG/MEDIUM: log: emit '-' for empty fields again
Commit 2b0108ad accidently got rid of the ability to emit a "-" for
empty log fields. This can happen for captured request and response
cookies, as well as for fetches. Since we don't want to have this done
for headers however, we set the default log method when parsing the
format. It is still possible to force the desired mode using +M/-M.
2013-02-05 18:55:09 +01:00
Willy Tarreau
192e59fb07 CLEANUP: http: don't try to deinitialize http compression if it fails before init
In select_compression_response_header(), some tests are rather confusing
as the "fail" label is used to deinitialize the compression context for
the session while it's branched only before initialization succeeds. The
test is always false here and the dereferencing of the comp_algo pointer
which might be null is also confusing. Remove that code which is not needed
anymore since commit ec3e3890 got rid of the latest issues.

Reported-by: Dinko Korunic <dkorunic@reflected.net>
2013-01-24 16:19:19 +01:00
Willy Tarreau
4521ba689c CLEANUP: http: remove a useless null check
srv cannot be null in http_perform_server_redirect(), as it's taken
from the stream interface's target which is always valid for a
server-based redirect, and it was already dereferenced above, so in
practice, gcc already removes the test anyway.

Reported-by: Dinko Korunic <dkorunic@reflected.net>
2013-01-24 16:19:18 +01:00
Baptiste Assmann
116eefed8f MINOR: config: http-request configuration error message misses new keywords
"redirect" and "tarpit" keywords were missing from http-request configuration
error message.
2013-01-05 16:53:49 +01:00
Willy Tarreau
56e9ffa6a6 BUG/MINOR: http-compression: lookup Cache-Control in the response, not the request
As stated in both RFC2616 and the http-bis drafts, Cache-Control:
no-transform must be looked up in the response since we're modifying
the response. However, its presence in the request is irrelevant to
any changes in the response :

  7.2.1.6. no-transform
   The "no-transform" request directive indicates that an intermediary
   (whether or not it implements a cache) MUST NOT change the Content-
   Encoding, Content-Range or Content-Type request header fields, nor
   the request representation.

  7.2.2.9. no-transform
   The "no-transform" response directive indicates that an intermediary
   (regardless of whether it implements a cache) MUST NOT change the
   Content-Encoding, Content-Range or Content-Type response header
   fields, nor the response representation.

Note: according to the specs, we're supposed to emit the following
response header :

  Warning: 214 transformation applied

However no other product seems to do it, so the effect on user agents
is unclear.
2013-01-05 16:31:58 +01:00
Willy Tarreau
ccbcc37a01 MEDIUM: http: add support for "http-request tarpit" rule
The "reqtarpit" rule is not very handy to use. Now that we have more
flexibility with "http-request", let's finally make the tarpit rules
usable there.

There are still semantical differences between apply_filters_to_request()
and http_req_get_intercept_rule() because the former updates the counters
while the latter does not. So we currently have almost similar code leafs
for similar conditions, but this should be cleaned up later.
2012-12-28 14:47:19 +01:00
Willy Tarreau
81499eb67d MEDIUM: http: add support for "http-request redirect" rules
These are exactly the same as the classic redirect rules except
that they can be interleaved with other http-request rules for
more flexibility.

The redirect parser should probably be changed to stop at the condition
so that the caller puts its own condition pointer. At the moment, the
redirect rule and condition are parsed at once by build_redirect_rule()
and the condition is assigned to the http_req_rule.
2012-12-28 14:47:19 +01:00
Willy Tarreau
4baae248fc REORG: config: move the http redirect rule parser to proto_http.c
We'll have to use this elsewhere soon, let's move it to the proper
place.
2012-12-28 14:47:19 +01:00
Willy Tarreau
71241abfd3 MINOR: http: move redirect rule processing to its own function
We now have http_apply_redirect_rule() which does all the redirect-specific
job instead of having this inside http_process_req_common().

Also one of the benefit gained from uniformizing this code is that both
keep-alive and close response do emit the PR-- flags. The fix for the
flags could probably be backported to 1.4 though it's very minor.

The previous function http_perform_redirect() was becoming confusing
so it was renamed http_perform_server_redirect() since it only applies
to server-based redirection.
2012-12-28 14:47:19 +01:00
Willy Tarreau
96257ec5c8 CLEANUP: http: rename the misleading http_check_access_rule
Several bugs were introduced recently due to a misunderstanding of how
this function works and what it was supposed to do. Since it's supposed
to only return the pointer to a rule which aborts further processing of
the request, let's rename it to avoid further issues.

The function was also slightly cleaned up without any functional change.
2012-12-28 14:47:19 +01:00
Willy Tarreau
d79a3b248e BUG/MINOR: log: make log-format, unique-id-format and add-header more independant
It happens that all of them call parse_logformat_line() which sets
proxy->to_log with a number of flags affecting the line format for
all three users. For example, having a unique-id specified disables
the default log-format since fe->to_log is tested when the session
is established.

Similarly, having "option logasap" will cause "+" to be inserted in
unique-id or headers referencing some of the fields depending on
LW_BYTES.

This patch first removes most of the dependency on fe->to_log whenever
possible. The first possible cleanup is to stop checking fe->to_log
for being null, considering that it always contains at least LW_INIT
when any such usage is made of the log-format!

Also, some checks are wrong. s->logs.logwait cannot be nulled by
"logwait &= ~LW_*" since LW_INIT is always there. This results in
getting the wrong log at the end of a request or session when a
unique-id or add-header is set, because logwait is still not null
but the log-format is not checked.

Further cleanups are required. Most LW_* flags should be removed or at
least replaced with what they really mean (eg: depend on client-side
connection, depend on server-side connection, etc...) and this should
only affect logging, not other mechanisms.

This patch fixes the default log-format and tries to limit interferences
between the log formats, but does not pretend to do more for the moment,
since it's the most visible breakage.
2012-12-28 09:51:00 +01:00
Willy Tarreau
cbc743e36c BUG/MEDIUM: stats: disable request analyser when processing POST or HEAD
After the response headers are sent and the request processing is done,
the buffers are wiped out and the stream interface is closed. We must
then disable the request analysers, otherwise some processing will
happen on a closed stream interface and empty buffers which do not
match, causing all sort of crashes. This issue was introduced with
recent work on the stats, and was reported by Seri.
2012-12-28 08:36:50 +01:00
Willy Tarreau
1a1e8072f9 BUG/MINOR: stats: http-request rules still don't cope with stats
Since commit 20b0de5, we also had another remaining issue : an
"http-request allow" rule would prevent a stats rule from being
processed.
2012-12-27 10:34:21 +01:00
Willy Tarreau
8b80f0c9a2 BUG/MINOR: stats: last fix was still wrong
Previous commit was still wrong, it broke add-header and set-header
because we don't want to leave on these actions.

The http_check_access_rule() function should be redesigned, it was
initially thought for allow/deny rules but now it is executing other
non-final rules and at the same time returning a pointer to the last
final rule. That becomes a bit confusing and will need to be addressed
before we implement redirect and return.
2012-12-25 21:55:37 +01:00
Willy Tarreau
418c1a0a95 BUG/MEDIUM: stats: fix stats page regression introduced by commit 20b0de5
This commit adding http-request add-header/set-header unfortunately introduced
a regression to the handling of the stats page which is not matched anymore.

Thanks to Dmitry Sivachenko for reporting this.
2012-12-25 20:52:58 +01:00
Willy Tarreau
20b0de56d4 MEDIUM: http: add http-request 'add-header' and 'set-header' to build headers
These two new statements allow to pass information extracted from the request
to the server. It's particularly useful for passing SSL information to the
server, but may be used for various other purposes such as combining headers
together to emulate internal variables.
2012-12-24 15:56:20 +01:00
Willy Tarreau
5c2e198390 MINOR: http: prepare to support more http-request actions
We'll need to support per-action arguments, so we need to have an
"arg" union in http_req_rule.
2012-12-24 12:26:26 +01:00
Willy Tarreau
354898bba9 MINOR: stats: replace STAT_FMT_CSV with STAT_FMT_HTML
We need to switch the default mode if we want to add new output formats
later. Let CSV be the default and HTML be an option.
2012-12-23 21:46:30 +01:00
Willy Tarreau
47ca54505c MINOR: chunks: centralize the trash chunk allocation
At the moment, we need trash chunks almost everywhere and the only
correctly implemented one is in the sample code. Let's move this to
the chunks so that all other places can use this allocator.

Additionally, the get_trash_chunk() function now really returns two
different chunks. Previously it used to always overwrite the same
chunk and point it to a different buffer, which was a bit tricky
because it's not obvious that two consecutive results do alias each
other.
2012-12-23 21:46:07 +01:00
Willy Tarreau
1facd6d67e REORG: stats: move the HTTP header injection to proto_http
The HTTP header injection that are performed in dumpstats when responding
or when redirecting a POST request have nothing to do in dumpstats. They
do not use any state from the stats, and are 100% HTTP. Let's make the
headers there in the HTTP core, and have dumpstats only produce stats.
2012-12-22 22:50:01 +01:00
Willy Tarreau
d9bdcd5139 REORG: stats: massive code reorg and cleanup
The dumpstats code looks like a spaghetti plate. Several functions are
supposed to be able to do several things but rely on complex states to
dispatch the work to independant functions. Most of the HTML output is
performed within the switch/case statements of the whole state machine.

Let's clean this up by adding new functions to emit the data and have
a few more iterators to avoid relying on so complex states.

The new stats dump sequence looks like this for CLI and for HTTP :

  cli_io_handler()
      -> stats_dump_sess_to_buffer()      // "show sess"
      -> stats_dump_errors_to_buffer()    // "show errors"
      -> stats_dump_raw_info_to_buffer()  // "show info"
         -> stats_dump_raw_info()
      -> stats_dump_raw_stat_to_buffer()  // "show stat"
         -> stats_dump_csv_header()
         -> stats_dump_proxy()
            -> stats_dump_px_hdr()
            -> stats_dump_fe_stats()
            -> stats_dump_li_stats()
            -> stats_dump_sv_stats()
            -> stats_dump_be_stats()
            -> stats_dump_px_end()

  http_stats_io_handler()
      -> stats_http_redir()
      -> stats_dump_http()              // also emits the HTTP headers
         -> stats_dump_html_head()      // emits the HTML headers
         -> stats_dump_csv_header()     // emits the CSV headers (same as above)
         -> stats_dump_http_info()      // note: ignores non-HTML output
         -> stats_dump_proxy()          // same as above
         -> stats_dump_http_end()       // emits HTML trailer
2012-12-22 20:45:02 +01:00
Willy Tarreau
40f151aa79 BUG/MINOR: http: don't abort client connection on premature responses
When a server responds prematurely to a POST request, haproxy used to
cause the transfer to be aborted before the end. This is problematic
because this causes the client to receive a TCP reset when it tries to
push more data, generally preventing it from receiving the response
which contain the reason for the premature reponse (eg: "entity too
large" or an authentication request).

From now on we take care of allowing the upload traffic to flow to the
server even when the response has been received, since the server is
supposed to drain it. That way the client receives the server response.

This bug has been present since 1.4 and the fix should probably be
backported there.
2012-12-20 12:10:09 +01:00
Willy Tarreau
f26b252ee4 MINOR: http: make resp_ver and status ACLs check for the presence of a response
The two ACL fetches "resp_ver" and "status", if used in a request despite
the warning, would return a match of zero length. This is inappropriate,
better return a non-match to be more consistent with other ACL processing.
2012-12-14 08:35:45 +01:00
Willy Tarreau
4a55060aa6 MINOR: http: add the "base32+src" fetch method.
This returns the concatenation of the base32 fetch and the src fetch.
The resulting type is of type binary, with a size of 8 or 20 bytes
depending on the source address family. This can be used to track
per-IP, per-URL counters.
2012-12-09 14:53:32 +01:00
Willy Tarreau
ab1f7b72fb MINOR: http: add the "base32" pattern fetch function
This returns a 32-bit hash of the value returned by the "base"
fetch method above. This is useful to track per-URL activity on
high traffic sites without having to store all URLs. Instead a
shorter hash is stored, saving a lot of memory. The output type
is an unsigned integer.
2012-12-09 14:08:48 +01:00
Willy Tarreau
5d5b5d8eaf MEDIUM: proto_tcp: add support for tracking L7 information
Until now it was only possible to use track-sc1/sc2 with "src" which
is the IPv4 source address. Now we can use track-sc1/sc2 with any fetch
as well as any transformation type. It works just like the "stick"
directive.

Samples are automatically converted to the correct types for the table.

Only "tcp-request content" rules may use L7 information, and such information
must already be present when the tracking is set up. For example it becomes
possible to track the IP address passed in the X-Forwarded-For header.

HTTP request processing now also considers tracking from backend rules
because we want to be able to update the counters even when the request
was already parsed and tracked.

Some more controls need to be performed (eg: samples do not distinguish
between L4 and L6).
2012-12-09 14:08:47 +01:00
Willy Tarreau
dc979f2492 BUG/MINOR: http: don't log a 503 on client errors while waiting for requests
If a client aborts a request with an error (typically a TCP reset), we must
log a 400. Till now we did not set the status nor close the stream interface,
causing the request to attempt to be forwarded and logging a 503.

Should be backported to 1.4 which is affected as well.
2012-12-04 10:52:22 +01:00
Willy Tarreau
14cba4b0b1 MEDIUM: connection: add an error code in connections
This will be needed to improve error reporting, especially for SSL.
2012-12-03 14:22:13 +01:00
Willy Tarreau
8139b9959f MINOR: compression: make the stats a bit more robust
To ensure that we only count when a response was compressed, we also
check for the SN_COMP_READY flag which indicates that the compression
was effectively initialized. Comp_algo alone is meaningless.
2012-11-27 09:34:00 +01:00
Willy Tarreau
9101535038 BUG/MINOR: http: disable compression when message has no body
Compression was not disabled on 1xx, 204, 304 nor HEAD requests. This
is not really a problem, but it reports more compressed responses than
really done.
2012-11-27 09:34:00 +01:00
Willy Tarreau
0a80a8dbb2 MINOR: http: factor out the content-type checks
Let's only look up the content-type header once. This involves
inverting the condition which is not dramatic.

Also, we now always check the value length before comparing it, and we
always reset the ctx.idx before looking a header up. Otherwise that
could make header lookups depend on their on-wire order. It would be
a minor issue however since at worst it would cause some responses not
to be compressed.
2012-11-26 16:36:00 +01:00
William Lallemand
d300261bab MINOR: compression: disable on multipart or status != 200
The compression is disabled when the HTTP status code is not 200, indeed
compression on some HTTP code can create issues (ex: 206, 416).

Multipart message should not be compressed eitherway.
2012-11-26 16:02:58 +01:00
William Lallemand
859550e068 BUG/MINOR: compression: Content-Type is case insensitive
The Content-Type parameter must be case insensitive.
2012-11-26 16:02:58 +01:00
Willy Tarreau
f003d375ec BUG/MINOR: http: don't report client aborts as server errors
If a client aborts with an abortonclose flag, the close is forwarded
to the server and when server response is processed, the analyser thinks
it's the server who has closed first, and logs flags "SD" or "SH" and
counts a server error. In order to avoid this, we now first detect that
the client has closed and log a client abort instead.

This likely is the reason why many people have been observing a small rate
of SD/SH flags without being able to find what the error was.

This fix should probably be backported to 1.4.
2012-11-26 13:50:02 +01:00
Willy Tarreau
5e16cbc3bd MINOR: stats: report the total number of compressed responses per front/back
Depending on the content-types and accept-encoding fields, some responses
might or might not be compressed. Let's have a counter of the number of
compressed responses and report it in the stats to help improve compression
usage.

Some cosmetic issues were fixed in the CSV output too (missing commas at the
end).
2012-11-24 14:54:13 +01:00
William Lallemand
00bf1dee9c BUG/MEDIUM: compression: does not forward trailers
The commit bf3ae617 introduced a regression about the forward of the
trailers in compression mode.
2012-11-23 11:12:33 +01:00
Willy Tarreau
193b8c6168 MINOR: http: allow the cookie capture size to be changed
Some users need more than 64 characters to log large cookies. The limit
was set to 63 characters (and not 64 as previously documented). Now it
is possible to change this using the global "tune.http.cookielen" setting
if required.
2012-11-22 00:44:27 +01:00
William Lallemand
072a2bf537 MINOR: compression: CPU usage limit
New option 'maxcompcpuusage' in global section.
Sets the maximum CPU usage HAProxy can reach before stopping the
compression for new requests or decreasing the compression level of
current requests.  It works like 'maxcomprate' but with the Idle.
2012-11-21 02:15:16 +01:00
William Lallemand
8b52bb3878 MEDIUM: compression: use pool for comp_ctx
Use pool for comp_ctx, it is allocated during the comp_algo->init().
The allocation of comp_ctx is accounted for in the zlib_memory_available.
2012-11-21 01:56:47 +01:00
William Lallemand
bf3ae61789 MEDIUM: compression: don't compress when no data
This patch makes changes in the http_response_forward_body state
machine. It checks if the compress algorithm had consumed data before
swapping the temporary and the input buffer. So it prevents null sized
zlib chunks.
2012-11-19 14:57:29 +01:00
Willy Tarreau
b97b6190e1 BUG: compression: properly disable compression when content-type does not match
Disabling compression based on the content-type was improperly done since the
introduction of the COMP_READY flag, sometimes resulting in truncated responses.
2012-11-19 14:55:02 +01:00
Willy Tarreau
543db62e1f BUG/MEDIUM: compression: release the zlib pools between keep-alive requests
There was a possible memory leak in the zlib code when the first response of
a keep-alive session was compressed, because the next request would reset the
compression algo, preventing a later call to session_free() from releasing it.
The reason is that it is necessary to release the assigned resources in
http_end_txn_clean_session().
2012-11-15 16:41:22 +01:00
William Lallemand
ec3e3890f0 BUG/MINOR: compression: deinit zlib only when required
The zlib stream was deinitialized even when the init failed.
2012-11-15 15:42:17 +01:00
William Lallemand
c04ca58222 BUG/MEDIUM: compression: no Content-Type header but type in configuration
HAProxy was compressing data when there was no Content-Type header in
the response but a compression type specified in the configuration.
2012-11-15 15:42:11 +01:00
Willy Tarreau
3fdb366885 MAJOR: connection: replace struct target with a pointer to an enum
Instead of storing a couple of (int, ptr) in the struct connection
and the struct session, we use a different method : we only store a
pointer to an integer which is stored inside the target object and
which contains a unique type identifier. That way, the pointer allows
us to retrieve the object type (by dereferencing it) and the object's
address (by computing the displacement in the target structure). The
NULL pointer always corresponds to OBJ_TYPE_NONE.

This reduces the size of the connection and session structs. It also
simplifies target assignment and compare.

In order to improve the generated code, we try to put the obj_type
element at the beginning of all the structs (listener, server, proxy,
si_applet), so that the original and target pointers are always equal.

A lot of code was touched by massive replaces, but the changes are not
that important.
2012-11-12 00:42:33 +01:00
Willy Tarreau
50fc7777c6 MEDIUM: http: refrain from sending "Connection: close" when Upgrade is present
Some servers are not totally HTTP-compliant when it comes to parsing the
Connection header. This is particularly true with WebSocket where it happens
from time to time that a server doesn't support having a "close" token along
with the "Upgrade" token in the Connection header. This broken behaviour has
also been noticed on some clients though the problem is less frequent on the
response path.

Sometimes the workaround consists in enabling "option http-pretend-keepalive"
to leave the request Connection header untouched, but this is not always the
most convenient solution. This patch introduces a new solution : haproxy now
also looks for the "Upgrade" token in the Connection header and if it finds
it, then it refrains from adding any other token to the Connection header
(though "keep-alive" and "close" may still be removed if found). The same is
done for the response headers.

This way, WebSocket much with less changes even when facing non-compliant
clients or servers. At least it fixes the DISCONNECT issue that was seen
on the websocket.org test.

Note that haproxy does not change its internal mode, it just refrains from
adding new tokens to the connection header.
2012-11-11 22:40:00 +01:00
Willy Tarreau
7f7ad91056 BUILD: stream_interface: remove si_fd() and its references
si_fd() is not used a lot, and breaks builds on OpenBSD 5.2 which
defines this name for its own purpose. It's easy enough to remove
this one-liner function, so let's do it.
2012-11-11 20:53:29 +01:00
William Lallemand
d85f917daf MINOR: compression: maximum compression rate limit
This patch adds input and output rate calcutation on the HTTP compresion
feature.

Compression can be limited with a maximum rate value in kilobytes per
second. The rate is set with the global 'maxcomprate' option. You can
change this value dynamicaly with 'set rate-limit http-compression
global' on the UNIX socket.
2012-11-10 17:47:27 +01:00
William Lallemand
f3747837e5 MINOR: compression: tune.comp.maxlevel
This option allows you to set the maximum compression level usable by
the compression algorithm. It affects CPU usage.
2012-11-10 17:47:07 +01:00