Commit Graph

1257 Commits

Author SHA1 Message Date
Willy Tarreau
8740c8b1b2 REORG: http: move the HTTP/1 header block parser to h1.c
Since it still depends on http_msg, it was not renamed yet.
2017-10-22 09:54:13 +02:00
Willy Tarreau
db4893d6a4 REORG: http: move the HTTP/1 chunk parser to h1.{c,h}
Functions http_parse_chunk_size(), http_skip_chunk_crlf() and
http_forward_trailers() were moved to h1.h and h1.c respectively so
that they can be called from outside. The parts that were inline
remained inline as it's critical for performance (+41% perf
difference reported in an earlier test). For now the "http_" prefix
remains in their name since they still depend on the http_msg type.
2017-10-22 09:54:13 +02:00
Willy Tarreau
0da5b3bddc REORG: http: move some very http1-specific parts to h1.{c,h}
Certain types and enums are very specific to the HTTP/1 parser, and we'll
need to share them with the HTTP/2 to HTTP/1 translation code. Let's move
them to h1.c/h1.h. Those with very few occurrences or only used locally
were renamed to explicitly mention the relevant HTTP version :

  enum ht_state      -> h1_state.
  http_msg_state_str -> h1_msg_state_str
  HTTP_FLG_*         -> H1_FLG_*
  http_char_classes  -> h1_char_classes

Others like HTTP_IS_*, HTTP_MSG_* are left to be done later.
2017-10-22 09:54:13 +02:00
Willy Tarreau
06d80a9a9c REORG: channel: finally rename the last bi_* / bo_* functions
For HTTP/2 we'll need some buffer-only equivalent functions to some of
the ones applying to channels and still squatting the bi_* / bo_*
namespace. Since these names have kept being misleading for quite some
time now and are really getting annoying, it's time to rename them. This
commit will use "ci/co" as the prefix (for "channel in", "channel out")
instead of "bi/bo". The following ones were renamed :

  bi_getblk_nc, bi_getline_nc, bi_putblk, bi_putchr,
  bo_getblk, bo_getblk_nc, bo_getline, bo_getline_nc, bo_inject,
  bi_putchk, bi_putstr, bo_getchr, bo_skip, bi_swpbuf
2017-10-19 15:01:08 +02:00
Christopher Faulet
a258479e3f BUG/MEDIUM: http: Return an error when url_dec sample converter failed
url_dec sample converter uses url_decode function to decode an URL. This
function fails by returning -1 when an invalid character is found. But the
sample converter never checked the return value and it used it as length for the
decoded string. Because it always succeeded, the invalid sample (with a string
length set to -1) could be used by other sample fetches or sample converters,
leading to undefined behavior like segfault.

The fix is pretty simple, url_dec sample converter just needs to return an error
when url_decode fails.

This patch must be backported in 1.7 and 1.6.
2017-10-05 11:11:34 +02:00
Andjelko Iharos
c3680ecdf8 MINOR: add severity information to cli feedback messages 2017-09-13 13:38:32 +02:00
Christopher Faulet
5d468ca97b BUG/MEDIUM: http: Close streams for connections closed before a redirect
A previous fix was made to prevent the connection to a server if a redirect was
performed during the request processing when we wait to keep the client
connection alive. This fix introduced a pernicious bug. If a client closes its
connection immediately after sending a request, it is possible to keep stream
alive infinitely. This happens when the connection closure is caught when the
request is received, before the request parsing.

To be more specific, this happens because the close event is not "forwarded",
first because of the call to "channel_dont_connect" in the function
"http_apply_redirect_rule", then because we want to keep the client connection
alive, we explicitly call "channel_dont_close" in the function
"http_request_forward_body".

So, to fix the bug, instead of blocking the server connection, we force its
shutdown. This will force the stream to re-evaluate all connexions states. So it
will detect the client has closed its connection.

This patch must be backported in 1.7.
2017-09-11 17:39:21 +02:00
Emeric Brun
52a91d3d48 MEDIUM: check: server states and weight propagation re-work
The server state and weight was reworked to handle
"pending" values updated by checks/CLI/LUA/agent.
These values are commited to be propagated to the
LB stack.

In further dev related to multi-thread, the commit
will be handled into a sync point.

Pending values are named using the prefix 'next_'
Current values used by the LB stack are named 'cur_'
2017-09-05 15:23:16 +02:00
Christopher Faulet
6988f678cd MINOR: http: Use a trash chunk to store decoded string of the HTTP auth header
This string is used in sample fetches so it is safe to use a preallocated trash
chunk instead of a buffer dynamically allocated during HAProxy startup.
2017-09-05 10:36:28 +02:00
Willy Tarreau
585744bf2e REORG/MEDIUM: connection: introduce the notion of connection handle
Till now connections used to rely exclusively on file descriptors. It
was planned in the past that alternative solutions would be implemented,
leading to member "union t" presenting sock.fd only for now.

With QUIC, the connection will need to continue to exist but will not
rely on a file descriptor but a connection ID.

So this patch introduces a "connection handle" which is either a file
descriptor or a connection ID, to replace the existing "union t". We've
now removed the intermediate "struct sock" which was never used. There
is no functional change at all, though the struct connection was inflated
by 32 bits on 64-bit platforms due to alignment.
2017-08-24 19:30:04 +02:00
Willy Tarreau
a5480694bf MINOR: http: export some of the HTTP parser macros
The two macros EXPECT_LF_HERE and EAT_AND_JUMP_OR_RETURN were exported
for use outside the HTTP parser. They now take extra arguments to avoid
implicit pointers and jump labels. These will be used to reimplement a
minimalist HTTP/1 parser in the H1->H2 gateway.
2017-08-18 13:38:47 +02:00
Christopher Faulet
56d260916f BUG/MAJOR: http: Fix possible infinity loop in http_sync_(req|res)_state
In commit "MINOR: http: Switch requests/responses in TUNNEL mode only by
checking txn flags", it is possible to have an infinite loop on HTTP_MSG_CLOSING
state.
2017-07-20 11:44:28 +02:00
Christopher Faulet
a81ff60454 BUG/MINOR: http: Fix bug introduced in previous patch in http_resync_states
The previous patch ("MINOR: http: Rely on analyzers mask to end processing in
forward_body functions") contains a bug for keep-alive transactions.

For these transactions, AN_REQ_FLT_END and AN_RES_FLT_END analyzers must be
removed only when all outgoing data was forwarded.
2017-07-19 10:57:53 +02:00
Christopher Faulet
894da4c8ea MINOR: http: Rely on analyzers mask to end processing in forward_body functions
Instead of relying on request or response state, we use "chn->analysers" mask as
all other analyzers. So now, http_resync_states does not return anything
anymore.

The debug message in http_resync_states has been improved.
2017-07-18 15:24:05 +02:00
Christopher Faulet
1486b0ab6d BUG/MEDIUM: http: Switch HTTP responses in TUNNEL mode when body length is undefined
When the body length of a HTTP response is undefined, the HTTP parser is blocked
in the body parsing. Before HAProxy 1.7, in this case, because
AN_RES_HTTP_XFER_BODY is never set, there is no visible effect. When the server
closes its connection to terminate the response, HAProxy catches it as a normal
closure. Since 1.7, we always set this analyzer to enter at least once in
http_response_forward_body. But, in the present case, when the server connection
is closed, http_response_forward_body is called one time too many. The response
is correctly sent to the client, but an error is catched and logged with "SD--"
flags.

To reproduce the bug, you can use the configuration "tests/test-fsm.cfg". The
tests 3 and 21 hit the bug.

Idea to fix the bug is to switch the response in TUNNEL mode without switching
the request. This is possible because of previous patches.

First, we need to detect responses with undefined body length during states
synchronization. Excluding tunnelled transactions, when the response length is
undefined, TX_CON_WANT_CLO is always set on the transaction. So, when states are
synchronized, if TX_CON_WANT_CLO is set, the response is switched in TUNNEL mode
and the request remains unchanged.

Then, in http_msg_forward_body, we add a specific check to switch the response
in DONE mode if the body length is undefined and if there is no data filter.

This patch depends on following previous commits:

  * MINOR: http: Switch requests/responses in TUNNEL mode only by checking txn flags
  * MINOR: http: Reorder/rewrite checks in http_resync_states

This patch must be backported in 1.7 with 2 previous ones.
2017-07-18 15:22:26 +02:00
Christopher Faulet
4be9803914 MINOR: http: Switch requests/responses in TUNNEL mode only by checking txn flags
Today, the only way to have a request or a response in HTTP_MSG_TUNNEL state is
to have the flag TX_CON_WANT_TUN set on the transaction. So this is a symmetric
state. Both the request and the response are switch in same time in this
state. This can be done only by checking transaction flags instead of relying on
the other side state. This is the purpose of this patch.

This way, if for any reason we need to switch only one side in TUNNEL mode, it
will be possible. And to prepare asymmetric cases, we check channel flags in
DONE _AND_ TUNNEL states.

WARNING: This patch will be used to fix a bug. The fix will be commited in a
very next commit. So if the fix is backported, this one must be backported too.
2017-07-18 15:15:12 +02:00
Christopher Faulet
f77bb539d4 MINOR: http: Reorder/rewrite checks in http_resync_states
The previous patch removed the forced symmetry of the TUNNEL mode during the
state synchronization. Here, we take care to remove body analyzer only on the
channel in TUNNEL mode. In fact, today, this change has no effect because both
sides are switched in same time. But this way, with some changes, it will be
possible to keep body analyzer on a side (to finish the states synchronization)
with the other one in TUNNEL mode.

WARNING: This patch will be used to fix a bug. The fix will be commited in a
very next commit. So if the fix is backported, this one must be backported too.
2017-07-18 15:09:54 +02:00
Christopher Faulet
a3992e06a6 BUG/MINOR: http: Set the response error state in http_sync_res_state
This is just typo. It may only report a wrong response message state in
"show errors" on the CLI.

This patch must be backported in 1.7.
2017-07-18 15:09:10 +02:00
Willy Tarreau
a14ad72d30 BUG/MINOR: http: properly handle all 1xx informational responses
Only 100 was considered informational instead of all 1xx. This can be
a problem when facing a 102 ("progress") or with the upcoming 103 for
early hints. Let's properly handle all 1xx now, leaving a special case
for 101 which is used for the upgrade.

This fix should be backported to 1.7, 1.6 and 1.5. In 1.4 the code is
different but the backport should be made there as well.
2017-07-07 11:36:32 +02:00
Christopher Faulet
a94fda30bd BUG/MINOR: http: Don't reset the transaction if there are still data to send
To reset an HTTP transaction, we need to be sure all data were sent, for the
request and the response. There are tests on request and response buffers for
that in http_resync_states function. But the return code was wrong. We must
return 0 to wait.

This patch must be backported in 1.7
2017-07-06 23:06:57 +02:00
Emeric Brun
96fd926ccc BUG/MAJOR: http: fix buffer overflow on loguri buffer.
The pool used to log the uri was created with a size of 0 because the
configuration and 'tune.http.logurilen' were parsed too earlier.

The fix consist to postpone the pool_create as it is done for
cookie captures.

Regression introduced with 'MINOR: log: Add logurilen tunable'
2017-07-05 13:59:29 +02:00
Christopher Faulet
a33510b215 BUG/MINOR: http/filters: Be sure to wait if a filter loops in HTTP_MSG_ENDING
A filter can choose to loop when a HTTP message is in the state
HTTP_MSG_ENDING. But the transaction is terminated with an error if the input is
closed (CF_SHUTR set on the channel). At this step, we have received all data,
so we can wait.

So now, we also check the parser state before leaving. This fix only affects
configs that use a filter that can wait in http_forward_data or http_end
callbacks, when all data were parsed.
2017-06-14 16:46:21 +02:00
Willy Tarreau
6a0bca9e78 BUG/MAJOR: http: call manage_client_side_cookies() before erasing the buffer
Jean Lubatti reported a crash on haproxy using a config involving cookies
and tarpit rules. It just happens that since 1.7-dev3 with commit 83a2c3d
("BUG/MINOR : allow to log cookie for tarpit and denied request"), function
manage_client_side_cookies() was called after erasing the request buffer in
case of a tarpit action. The problem is that this function must absolutely
not be called with an empty buffer since it moves parts of it. A typical
reproducer consists in sending :

    "GET / HTTP/1.1\r\nCookie: S=1\r\n\r\n"

On such a config :

    listen crash
        bind :8001
        mode http
        reqitarpit .
        cookie S insert indirect
        server s1 127.0.0.1:8000 cookie 1

The fix simply consists in moving the call to the function before the call
to buffer_erase().

Many thanks to Jean for testing instrumented code and providing a usable
core.

This fix must be backported to all stable versions since the fix introducing
this bug was backported as well.
2017-06-11 18:08:18 +02:00
Stphane Cottin
23e9e93128 MINOR: log: Add logurilen tunable.
The default len of request uri in log messages is 1024. In some use
cases, you need to keep the long trail of GET parameters. The only
way to increase this len is to recompile with DEFINE=-DREQURI_LEN=2048.

This commit introduces a tune.http.logurilen configuration directive,
allowing to tune this at runtime.
2017-06-02 11:06:36 +02:00
Lukas Tribus
23953686da DOC: update RFC references
A few doc and code comment updates bumping RFC references to the new
ones.
2017-04-28 18:58:11 +02:00
Thierry FOURNIER
d7d8881543 MINOR: proto-http: Add sample fetch wich returns all HTTP headers
The sample fetch returns all headers including the last jump line.
The last jump line is used to determine if the block of headers is
truncated or not.
2017-04-27 11:56:11 +02:00
Thierry FOURNIER
5617dce27d MINOR: Add binary encoding request header sample fetch
This sample fetch encodes the http request headers in binary
format. This sample-fetch is useful with SPOE.
2017-04-27 11:54:54 +02:00
Christopher Faulet
9f724edbd8 BUG/MEDIUM: http: Drop the connection establishment when a redirect is performed
This bug occurs when a redirect rule is applied during the request analysis on a
persistent connection, on a proxy without any server. This means, in a frontend
section or in a listen/backend section with no "server" line.

Because the transaction processing is shortened, no server can be selected to
perform the connection. So if we try to establish it, this fails and a 503 error
is returned, while a 3XX was already sent. So, in this case, HAProxy generates 2
replies and only the first one is expected.

Here is the configuration snippet to easily reproduce the problem:

    listen www
        bind :8080
        mode http
        timeout connect 5s
        timeout client 3s
        timeout server 6s
        redirect location /

A simple HTTP/1.1 request without body will trigger the bug:

    $ telnet 0 8080
    Trying 0.0.0.0...
    Connected to 0.
    Escape character is '^]'.
    GET / HTTP/1.1

    HTTP/1.1 302 Found
    Cache-Control: no-cache
    Content-length: 0
    Location: /

    HTTP/1.0 503 Service Unavailable
    Cache-Control: no-cache
    Connection: close
    Content-Type: text/html

    <html><body><h1>503 Service Unavailable</h1>
    No server is available to handle this request.
    </body></html>
    Connection closed by foreign host.

[wt: only 1.8-dev is impacted though the bug is present in older ones]
2017-04-21 07:37:45 +02:00
Christopher Faulet
69744d92a3 BUG/MEDIUM: http: Fix blocked HTTP/1.0 responses when compression is enabled
When the compression filter is enabled, if a HTTP/1.0 response is received (no
content-length and no transfer-encoding), no data are forwarded to the client
because of a bug and the transaction is blocked indefinitly.

The bug comes from the fact we need to synchronize the end of the request and
the response because of the compression filter. This inhibits the infinite
forwarding of data. But for these responses, the compression is not
activated. So the response body is not analyzed. This leads to a deadlock.

The solution is to enable the analyze of the response body in all cases and
handle this one to enable the infinite forwarding. All other cases should
already by handled.

This fix should be backported to 1.7.
2017-03-31 14:40:42 +02:00
Christopher Faulet
814d270360 MINOR: http: Add debug messages when HTTP body analyzers are called
Only DPRINTF() for developers.
2017-03-31 14:38:33 +02:00
Christopher Faulet
be821b9f40 MINOR: http: remove useless check on HTTP_MSGF_XFER_LEN for the request
The flag HTTP_MSGF_XFER_LEN is always set for an HTTP request because we always
now the body length. So there is no need to do check on it.
2017-03-31 14:38:33 +02:00
Christopher Faulet
c0c672a2ab BUG/MINOR: http: Fix conditions to clean up a txn and to handle the next request
To finish a HTTP transaction and to start the new one, we check, among other
things, that there is enough space in the reponse buffer to eventually inject a
message during the parsing of the next request. Because these messages can reach
the maximum buffers size, it is mandatory to have an empty response
buffer. Remaining input data are trimmed during the txn cleanup (in
http_reset_txn), so we just need to check that the output data were flushed.

The current implementation depends on channel_congested, which does check the
reserved area is available. That's not of course good enough. There are other
tests on the reponse buffer is http_wait_for_request. But conditions to move on
are almost the same. So, we can imagine some scenarii where some output data
remaining in the reponse buffer during the request parsing prevent any messages
injection.

To fix this bug, we just wait that output data were flushed before cleaning up
the HTTP txn (ie. s->res.buf->o == 0). In addition, in http_reset_txn we realign
the response buffer (note the buffer is empty at this step).

Thanks to this changes, there is no more need to set CF_EXPECT_MORE on the
response channel in http_end_txn_clean_session. And more important, there is no
more need to check the response buffer state in http_wait_for_request. This
remove a workaround on response analysers to handle HTTP pipelining.

This patch can be backported in 1.7, 1.6 and 1.5.
2017-03-31 14:36:20 +02:00
Hongbo Long
e39683c4d4 BUG/MEDIUM: stream: fix client-fin/server-fin handling
A tcp half connection can cause 100% CPU on expiration.

First reproduced with this haproxy configuration :

  global
      tune.bufsize 10485760
  defaults
      timeout server-fin 90s
      timeout client-fin 90s
  backend node2
      mode tcp
      timeout server 900s
      timeout connect 10s
      server def 127.0.0.1:3333
  frontend fe_api
      mode  tcp
      timeout client 900s
      bind :1990
      use_backend node2

Ie timeout server-fin shorter than timeout server, the backend server
sends data, this package is left in the cache of haproxy, the backend
server continue sending fin package, haproxy recv fin package. this
time the session information is as follows:

  time the session information is as follows:
      0x2373470: proto=tcpv4 src=127.0.0.1:39513 fe=fe_api be=node2
      srv=def ts=08 age=1s calls=3 rq[f=848000h,i=0,an=00h,rx=14m58s,wx=,ax=]
      rp[f=8004c020h,i=0,an=00h,rx=,wx=14m58s,ax=] s0=[7,0h,fd=6,ex=]
      s1=[7,18h,fd=7,ex=] exp=14m58s

  rp has set the CF_SHUTR state, next, the client sends the fin package,

  session information is as follows:
      0x2373470: proto=tcpv4 src=127.0.0.1:39513 fe=fe_api be=node2
      srv=def ts=08 age=38s calls=4 rq[f=84a020h,i=0,an=00h,rx=,wx=,ax=]
      rp[f=8004c020h,i=0,an=00h,rx=1m11s,wx=14m21s,ax=] s0=[7,0h,fd=6,ex=]
      s1=[9,10h,fd=7,ex=] exp=1m11s

  After waiting 90s, session information is as follows:
      0x2373470: proto=tcpv4 src=127.0.0.1:39513 fe=fe_api be=node2
      srv=def ts=04 age=4m11s calls=718074391 rq[f=84a020h,i=0,an=00h,rx=,wx=,ax=]
      rp[f=8004c020h,i=0,an=00h,rx=?,wx=10m49s,ax=] s0=[7,0h,fd=6,ex=]
      s1=[9,10h,fd=7,ex=] exp=? run(nice=0)

  cpu information:
      6899 root      20   0  112224  21408   4260 R 100.0  0.7   3:04.96 haproxy

Buffering is set to ensure that there is data in the haproxy buffer, and haproxy
can receive the fin package, set the CF_SHUTR flag, If the CF_SHUTR flag has been
set, The following code does not clear the timeout message, causing cpu 100%:

stream.c:process_stream:

        if (unlikely((res->flags & (CF_SHUTR|CF_READ_TIMEOUT)) == CF_READ_TIMEOUT)) {
            if (si_b->flags & SI_FL_NOHALF)
            si_b->flags |= SI_FL_NOLINGER;
            si_shutr(si_b);
        }

If you have closed the read, set the read timeout does not make sense.
With or without cf_shutr, read timeout is set:

       if (tick_isset(s->be->timeout.serverfin)) {
           res->rto = s->be->timeout.serverfin;
           res->rex = tick_add(now_ms, res->rto);
       }

After discussion on the mailing list, setting half-closed timeouts the
hard way here doesn't make sense. They should be set only at the moment
the shutdown() is performed. It will also solve a special case which was
already reported of some half-closed timeouts not working when the shutw()
is performed directly at the stream-interface layer (no analyser involved).
Since the stream interface layer cannot know the timeout values, we'll have
to store them directly in the stream interface so that they are used upon
shutw(). This patch does this, fixing the problem.

An easier reproducer to validate the fix is to keep the huge buffer and
shorten all timeouts, then call it under tcploop server and client, and
wait 3 seconds to see haproxy run at 100% CPU :

  global
      tune.bufsize 10485760

  listen px
      bind :1990
      timeout client 90s
      timeout server 90s
      timeout connect 1s
      timeout server-fin 3s
      timeout client-fin 3s
      server def 127.0.0.1:3333

  $ tcploop 3333 L W N20 A P100 F P10000 &
  $ tcploop 127.0.0.1:1990 C S10000000 F
2017-03-21 15:04:43 +01:00
Christopher Faulet
014e39c0b6 BUG/MAJOR: http: fix typo in http_apply_redirect_rule
Because of this typo, AN_RES_FLT_END was never called when a redirect rule is
applied on a keep-alive connection. In almost all cases, this bug has no
effect. But, it leads to a memory leak if a redirect is done on a http-response
rule when the HTTP compression is enabled.

This patch should be backported in 1.7.
2017-03-21 11:21:18 +01:00
Christopher Faulet
e6006245de BUG/MEDIUM: filters: Fix channels synchronization in flt_end_analyze
When a filter is used, there are 2 channel's analyzers to surround all the
others, flt_start_analyze and flt_end_analyze. This is the good place to acquire
and release resources used by filters, when needed. In addition, the last one is
used to synchronize the both channels, especially for HTTP streams. We must wait
that the analyze is finished for the both channels for an HTTP transaction
before restarting it for the next one.

But this part was buggy, leading to unexpected behaviours. First, depending on
which channel ends first, the request or the response can be switch in a
"forward forever" mode. Then, the HTTP transaction can be cleaned up too early,
while a processing is still in progress on a channel.

To fix the bug, the flag CF_FLT_ANALYZE has been added. It is set on channels in
flt_start_analyze and is kept if at least one filter is still analyzing the
channel. So, we can trigger the channel syncrhonization if this flag was removed
on the both channels. In addition, the flag TX_WAIT_CLEANUP has been added on
the transaction to know if the transaction must be cleaned up or not during
channels syncrhonization. This way, we are sure to reset everything once all the
processings are finished.

This patch should be backported in 1.7.
2017-03-15 19:09:06 +01:00
Willy Tarreau
2019f95997 CLEANUP: http: make http_server_error() not set the status anymore
Given that all call places except one had to set txn->status prior to
calling http_server_error(), it's simpler to make this function rely
on txn->status than have it store it from an argument.
2017-03-14 11:09:04 +01:00
Jarno Huuskonen
800d1761d0 MINOR: http-request tarpit deny_status.
Implements deny_status for http-request tarpit rule (allows setting
custom http status code). This commit depends on:
MEDIUM: http_error_message: txn->status / http_get_status_idx.
2017-03-14 10:41:54 +01:00
Jarno Huuskonen
9e6906b9ec MEDIUM: http_error_message: txn->status / http_get_status_idx.
This commit removes second argument(msgnum) from http_error_message and
changes http_error_message to use s->txn->status/http_get_status_idx for
mapping status code from 200..504 to HTTP_ERR_200..HTTP_ERR_504(enum).

This is needed for http-request tarpit deny_status commit.
2017-03-14 10:41:41 +01:00
Willy Tarreau
19b1412e02 MINOR: http: don't close when redirect location doesn't start with "/"
In 1.4-dev5 when we started to implement keep-alive, commit a9679ac
("[MINOR] http: make the conditional redirect support keep-alive")
added a specific check was added to support keep-alive on redirect
rules but only when the location would start with a "/" indicating
the client would come back to the same server.

But nowadays most applications put http:// or https:// in front of
each and every location, and continuing to perform a close there is
counter-efficient, especially when multiple objects are fetched at
once from a same origin which redirects them to the correct origin
(eg: after an http to https forced upgrade).

It's about time to get rid of this old trick as it causes more harm
than good at an era where persistent connections are omnipresent.

Special thanks to Ciprian Dorin Craciun for providing convincing
arguments with a pretty valid use case and proposing this draft
patch which addresses the issue he was facing.

This change although not exactly a bug fix should be backported
to 1.7 to adapt better to existing infrastructure.
2017-02-28 09:48:11 +01:00
Christopher Faulet
cdade94cf5 BUG/MINOR: http: Return an error when a replace-header rule failed on the response
Historically, http-response rules couldn't produce errors generating HTTP
responses during their evaluation. This possibility was "implicitly" added with
http-response redirect rules (51d861a4). But, at the time, replace-header rules
were kept untouched. When such a rule failed, the rules processing was just
stopped (like for an accept rule).

Conversely, when a replace-header rule fails on the request, it generates a HTTP
response (400 Bad Request).

With this patch, errors on replace-header rule are now handled in the same way
for HTTP requests and HTTP responses.

This patch should be backported in 1.7 and 1.6.
2017-02-08 19:08:38 +01:00
Christopher Faulet
07a0fecced BUG/MEDIUM: http: Prevent replace-header from overwriting a buffer
This is the same fix as which concerning the redirect rules (0d94576c).

The buffer used to expand the <replace-fmt> argument must be protected to
prevent it being overwritten during build_logline() execution (the function used
to expand the format string).

This patch should be backported in 1.7, 1.6 and 1.5. It relies on commit b686afd
("MINOR: chunks: implement a simple dynamic allocator for trash buffers") for
the trash allocator, which has to be backported as well.
2017-02-08 19:08:38 +01:00
Christopher Faulet
f1cc5d0eaf BUG/MEDIUM: filters: Do not truncate HTTP response when body length is undefined
Some users have experienced some troubles using the compression filter when the
HTTP response body length is undefined. They complained about receiving
truncated responses.

In fact, the bug can be triggered if there is at least one filter attached to
the stream but none registered to analyze the HTTP response body. In this case,
when the body length is undefined, data should be forwarded without any
parsing. But, because of a wrong check, we were starting to parse them. Because
it was not expected, the end of response was not correctly detected and the
response could be truncted. So now, we rely on HAS_DATA_FILTER macro instead of
HAS_FILTER one to choose to parse HTTP response body or not.

Furthermore, in http_response_forward_body, the test to not forward the server
closure to the client has been updated to reflect conditions listed in the
associated comment.

And finally, in http_msg_forward_body, when the body length is undefined, we
continue the parsing it until the server closes the connection without any on
filters. So filters can safely stop to filter data during their parsing.

This fix should be backported in 1.7
2017-02-08 19:08:38 +01:00
Thierry FOURNIER
0d94576c74 BUG/MEDIUM: http: prevent redirect from overwriting a buffer
See 4b788f7d34

If we use the action "http-request redirect" with a Lua sample-fetch or
converter, and the Lua function calls one of the Lua log function, the
header name is corrupted, it contains an extract of the last loggued data.

This is due to an overwrite of the trash buffer, because his scope is not
respected in the "add-header" function. The scope of the trash buffer must
be limited to the function using it. The build_logline() function can
execute a lot of other function which can use the trash buffer.

This patch fix the usage of the trash buffer. It limits the scope of this
global buffer to the local function, we build first the header value using
build_logline, and after we store the header name.

Thanks Jesse Schulman for the bug repport.

This patch must be backported in 1.7, 1.6 and 1.5 version, and it relies
on commit b686afd ("MINOR: chunks: implement a simple dynamic allocator for
trash buffers") for the trash allocator, which has to be backported as well.
2017-02-08 11:16:46 +01:00
Jarno Huuskonen
59af2df102 MINOR: proto_http.c 502 error txt typo.
[wt: should be backported to 1.7 and 1.6 as it was introduced in 1.6-dev4]
2017-01-11 12:44:41 +01:00
Jarno Huuskonen
16ad94adf6 MINOR: Use "500 Internal Server Error" for 500 error/status code message.
Internal Server Error is what is in RFC 2616/7231.
2017-01-11 12:44:40 +01:00
Robin H. Johnson
52f5db2a44 MINOR: http: custom status reason.
The older 'rsprep' directive allows modification of the status reason.

Extend 'http-response set-status' to take an optional string of the new
status reason.

  http-response set-status 418 reason "I'm a coffeepot"

Matching updates in Lua code:
- AppletHTTP.set_status
- HTTP.res_set_status

Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
2017-01-06 11:57:44 +01:00
Willy Tarreau
2afff9c2d6 BUG/MAJOR: http: fix risk of getting invalid reports of bad requests
Commits 5f10ea3 ("OPTIM: http: improve parsing performance of long
URIs") and 0431f9d ("OPTIM: http: improve parsing performance of long
header lines") introduced a bug in the HTTP parser : when a partial
request is read, the first part ends up on a 8-bytes boundary (or 4-byte
on 32-bit machines), the end lies in the header field value part, and
the buffer used to contain a CR character exactly after the last block,
then the parser could be confused and read this CR character as being
part of the current request, then switch to a new state waiting for an
LF character. Then when the next part of the request appeared, it would
read the character following what was erroneously mistaken for a CR,
see that it is not an LF and fail on a bad request. In some cases, it
can even be worse and the header following the hole can be improperly
indexed causing all sort of unexpected behaviours like a content-length
being ignored or a header appended at the wrong position. The reason is
that there's no control of and of parsing just after breaking out of the
loop.

One way to reproduce it is with this config :

  global
      stats socket /tmp/sock1 mode 666 level admin
      stats timeout 1d

  frontend  px
      bind :8001
      mode http
      timeout client 10s
      redirect location /

And sending requests this way :

  $ tcploop 8001 C P S:"$(dd if=/dev/zero bs=16384 count=1 2>/dev/null | tr '\000' '\r')"
  $ tcploop 8001 C P S:"$(dd if=/dev/zero bs=16384 count=1 2>/dev/null | tr '\000' '\r')"
  $ tcploop 8001 C P \
    S:"GET / HTTP/1.0\r\nX-padding: 0123456789.123456789.123456789.123456789.123456789.123456789.1234567" P \
    S:"89.123456789\r\n\r\n" P

Then a "show errors" on the socket will report :

  $ echo "show errors" | socat - /tmp/sock1
  Total events captured on [04/Jan/2017:15:09:15.755] : 32

  [04/Jan/2017:15:09:13.050] frontend px (#2): invalid request
    backend <NONE> (#-1), server <NONE> (#-1), event #31
    src 127.0.0.1:59716, session #91, session flags 0x00000080
    HTTP msg state 17, msg flags 0x00000000, tx flags 0x00000000
    HTTP chunk len 0 bytes, HTTP body len 0 bytes
    buffer flags 0x00808002, out 0 bytes, total 111 bytes
    pending 111 bytes, wrapping at 16384, error at position 107:

    00000  GET / HTTP/1.0\r\n
    00016  X-padding: 0123456789.123456789.123456789.123456789.123456789.12345678
    00086+ 9.123456789.123456789\r\n
    00109  \r\n

This fix must be backported to 1.7.

Many thanks to Aleksey Gordeev and Axel Reinhold for providing detailed
network captures and configurations exhibiting the issue.
2017-01-05 20:26:55 +01:00
Willy Tarreau
10e61cbf41 BUG/MINOR: http: report real parser state in error captures
Error captures almost always report a state 26 (MSG_ERROR) making it
very hard to know what the parser was expecting. The reason is that
we have to switch to MSG_ERROR to trigger the dump, and then during
the dump we capture the current state which is already MSG_ERROR. With
this change we now copy the current state into an err_state field that
will be reported as the faulty state.

This patch looks a bit large because the parser doesn't update the
current state until it runs out of data so the current state is never
known when jumping to ther error label! Thus the code had to be updated
to take copies of the current state before switching to MSG_ERROR based
on the switch/case values.

As a bonus, it now shows the current state in human-readable form and
not only in numeric form ; in the past it was not an issue since it was
always 26 (MSG_ERROR).

At least now we can get exploitable invalid request/response reports :

  [05/Jan/2017:19:28:57.095] frontend f (#2): invalid request
    backend <NONE> (#-1), server <NONE> (#-1), event #1
    src 127.0.0.1:39894, session #4, session flags 0x00000080
    HTTP msg state MSG_RQURI(4), msg flags 0x00000000, tx flags 0x00000000
    HTTP chunk len 0 bytes, HTTP body len 0 bytes
    buffer flags 0x00908002, out 0 bytes, total 20 bytes
    pending 20 bytes, wrapping at 16384, error at position 5:

    00000  GET /\e HTTP/1.0\r\n
    00017  \r\n
    00019  \n

  [05/Jan/2017:19:28:33.827] backend b (#3): invalid response
    frontend f (#2), server s1 (#1), event #0
    src 127.0.0.1:39718, session #0, session flags 0x000004ce
    HTTP msg state MSG_HDR_NAME(17), msg flags 0x00000000, tx flags 0x08300000
    HTTP chunk len 0 bytes, HTTP body len 0 bytes
    buffer flags 0x80008002, out 0 bytes, total 59 bytes
    pending 59 bytes, wrapping at 16384, error at position 31:

    00000  HTTP/1.1 200 OK\r\n
    00017  Content-length : 10\r\n
    00038  \r\n
    00040  0a\r\n
    00044  0123456789\r\n
    00056  0\r\n

This should be backported to 1.7 and 1.6 at least to help with bug
reports.
2017-01-05 19:48:50 +01:00
Christopher Faulet
0184ea71a6 BUG/MAJOR: channel: Fix the definition order of channel analyzers
It is important to defined analyzers (AN_REQ_* and AN_RES_*) in the same order
they are evaluated in process_stream. This order is really important because
during analyzers evaluation, we run them in the order of the lower bit to the
higher one. This way, when an analyzer adds/removes another one during its
evaluation, we know if it is located before or after it. So, when it adds an
analyzer which is located before it, we can switch to it immediately, even if it
has already been called once but removed since.

With the time, and introduction of new analyzers, this order was broken up. the
main problems come from the filter analyzers. We used values not related with
their evaluation order. Furthermore, we used same values for request and response
analyzers.

So, to fix the bug, filter analyzers have been splitted in 2 distinct lists to
have different analyzers for the request channel than those for the response
channel. And of course, we have moved them to the right place.

Some other analyzers have been reordered to respect the evaluation order:

  * AN_REQ_HTTP_TARPIT has been moved just before AN_REQ_SRV_RULES
  * AN_REQ_PRST_RDP_COOKIE has been moved just before AN_REQ_STICKING_RULES
  * AN_RES_STORE_RULES has been moved just after AN_RES_WAIT_HTTP

Note today we have 29 analyzers, all stored into a 32 bits bitfield. So we can
still add 4 more analyzers before having a problem. A good way to fend off the
problem for a while could be to have a different bitfield for request and
response analyzers.

[wt: all of this must be backported to 1.7, and part of it must be backported
 to 1.6 and 1.5]
2017-01-05 17:58:22 +01:00
Thierry FOURNIER / OZON.IO
43ad11dc75 MINOR: Do not forward the header "Expect: 100-continue" when the option http-buffer-request is set
When the option "http-buffer-request" is set, HAProxy send itself the
"HTTP/1.1 100 Continue" response in order to retrieve the post content.
When HAProxy forward the request, it send the body directly after the
headers. The header "Expect: 100-continue" was sent with the headers.
This header is useless because the body will be sent in all cases, and
the server reponse is not removed by haproxy.

This patch removes the header "Expect: 100-continue" if HAProxy sent it
itself.
2016-12-12 17:33:42 +01:00
Matthieu Guegan
35088f960d BUG/MINOR: http: don't send an extra CRLF after a Set-Cookie in a redirect
By investigating a keep-alive issue with CloudFlare, we[1] found that
when using the 'set-cookie' option in a redirect (302) HAproxy is adding
an extra `\r\n`.

Triggering rule :

`http-request redirect location / set-cookie Cookie=value if [...]`

Expected result :

```
HTTP/1.1 302 Found
Cache-Control: no-cache
Content-length: 0
Location: /
Set-Cookie: Cookie=value; path=/;
Connection: close
```

Actual result :

```
HTTP/1.1 302 Found
Cache-Control: no-cache
Content-length: 0
Location: /
Set-Cookie: Cookie=value; path=/;

Connection: close
```

This extra `\r\n` seems to be harmless with another HAproxy instance in
front of it (sanitizing) or when using a browser. But we confirm that
the CloudFlare NGINX implementation is not able to handle this. It
seems that both 'Content-length: 0' and extra carriage return broke RFC
(to be confirmed).

When looking into the code, this carriage-return was already present in
1.3.X versions but just before closing the connection which was ok I
think. Then, with 1.4.X the keep-alive feature was added and this piece
of code remains unchanged.

[1] all credit for the bug finding goes to CloudFlare Support Team

[wt: the bug was indeed present since the Set-Cookie was introduced
 in 1.3.16, by commit 0140f25 ("[MINOR] redirect: add support for
 "set-cookie" and "clear-cookie"") so backporting to all supported
 versions is desired]
2016-12-05 19:27:48 +01:00
Christopher Faulet
6962f4e0d6 BUG/MINOR: http: Call XFER_DATA analyzer when HTTP txn is switched in tunnel mode
This allow a filter to start to analyze data in HTTP and to fallback in TCP when
data are tunneled.

[wt: backport desired in 1.7 - no impact right now but may impact the ability
 to backport future fixes]
2016-11-29 17:03:04 +01:00
Christopher Faulet
3235957685 BUG/MINOR: http: Keep the same behavior between 1.6 and 1.7 for tunneled txn
In HAProxy 1.6, When "http-tunnel" option is enabled, HTTP transactions are
tunneled as soon as possible after the headers parsing/forwarding. When the
transfer length of the response can be determined, this happens when all data
are forwarded. But for responses with an undetermined transfer length this
happens when headers are forwarded. This behavior is questionable, but this is
not the purpose of this fix...

In HAProxy 1.7, the first use-case works like in 1.6. But the second one not
because of the data filtering. HAProxy was always trying to forward data until
the server closes the connection. So the transaction was never switched in
tunnel mode. This is the expected behavior when there is a data filter. But in
the default case (no data filter), it should work like in 1.6.

This patch fixes the bug. We analyze response data until the server closes the
connection only when there is a data filter.

[wt: backport needed in 1.7]
2016-11-29 17:03:01 +01:00
Christopher Faulet
d1cd209b21 BUG/MEDIUM: http: Fix tunnel mode when the CONNECT method is used
When a 2xx response to a CONNECT request is returned, the connection must be
switched in tunnel mode immediatly after the headers, and Transfer-Encoding and
Content-Length headers must be ignored. So from the HTTP parser point of view,
there is no body.

The bug comes from the fact the flag HTTP_MSGF_XFER_LEN was not set on the
response (This flag means that the body size can be determined. In our case, it
can, it is 0). So, during data forwarding, the connection was never switched in
tunnel mode and we were blocked in a state where we were waiting that the
server closes the connection to ends the response.

Setting the flag HTTP_MSGF_XFER_LEN on the response fixed the bug.

The code of http_wait_for_response has been slightly updated to be more
readable.

[wt: 1.7-only, this is not needed in 1.6]
2016-11-29 17:00:14 +01:00
Willy Tarreau
35069f84af MINOR: cli: make "show errors" capable of dumping only request or response
When dealing with many proxies, it's hard to spot response errors because
all internet-facing frontends constantly receive attacks. This patch now
makes it possible to demand that only request or response errors are dumped
by appending "request" or "reponse" to the show errors command.
2016-11-25 09:16:37 +01:00
Willy Tarreau
234ba2d8eb MINOR: cli: make "show errors" support a proxy name
Till now it was needed to know the proxy's ID while we do have the
ability to look up a proxy by its name now.
2016-11-25 08:56:55 +01:00
Thierry FOURNIER / OZON.IO
8a4e4420fb MEDIUM: log-format: Use standard HAProxy log system to report errors
The function log format emit its own error message using Alert(). This
patch replaces this behavior and uses the standard HAProxy error system
(with memprintf).

The benefits are:
 - cleaning the log system

 - the logformat can ignore the caller (actually the caller must set
   a flag designing the caller function).

 - Make the usage of the logformat function easy for future components.
2016-11-25 07:32:58 +01:00
Thierry FOURNIER / OZON.IO
59fd511555 MEDIUM: log-format/conf: take into account the parse_logformat_string() return code
This patch takes into account the return code of the parse_logformat_string()
function. Now the configuration parser will fail if the log_format is not
strict.
2016-11-24 18:54:26 +01:00
Thierry FOURNIER / OZON.IO
6fe0e1b977 CLEANUP: log-format: remove unused arguments
The log-format function parse_logformat_string() takes file and line
for building parsing logs. These two parameters are embedded in the
struct proxy curproxy, which is the current parsing context.

This patch removes these two unused arguments.
2016-11-24 18:54:26 +01:00
Willy Tarreau
30e5e18bbb CLEANUP: cli: remove assignments to st0 and st2 in keyword parsers
Now it's not needed anymore to set STAT_ST_INIT nor CLI_ST_CALLBACK
in the parsers, remove it in the various places.
2016-11-24 16:59:28 +01:00
Willy Tarreau
12207b360a REORG: cli: move "show errors" out of cli.c
It really belongs to proto_http.c since it's a dump for HTTP request
and response errors. Note that it's possible that some parts do not
need to be exported anymore since it really is the only place where
errors are manipulated.
2016-11-24 16:59:28 +01:00
William Lallemand
9ed6203aef REORG: cli: split dumpstats.h in stats.h and cli.h
proto/dumpstats.h has been split in 4 files:

  * proto/cli.h  contains protypes for the CLI
  * proto/stats.h contains prototypes for the stats
  * types/cli.h contains definition for the CLI
  * types/stats.h contains definition for the stats
2016-11-24 16:59:27 +01:00
Willy Tarreau
e6d9c21059 OPTIM: http: optimize lookup of comma and quote in header values
http_find_header2() relies on find_hdr_value_end() to find the comma
delimiting a header field value, which also properly handles double
quotes and backslashes within quotes. In fact double quotes are very
rare, and commas happen once every multiple characters, especially
with cookies where a full block can be found at once. So it makes
sense to optimize this function to speed up the lookup of the first
block before the quote.

This change increases the performance from 212k to 217k req/s when
requests contain a 1kB cookie (+2.5%). We don't care about going
back into the fast parser after the first quote, as it may
needlessly make the parser more complex for very marginal gains.
2016-11-05 18:23:38 +01:00
Willy Tarreau
5f10ea30f4 OPTIM: http: improve parsing performance of long URIs
Searching the trailing space in long URIs takes some time. This can
happen especially on static files and some blogs. By skipping valid
character ranges by 32-bit blocks, it's possible to increase the
HTTP performance from 212k to 216k req/s on requests features a
100-character URI, which is an increase of 2%. This is done for
architectures supporting unaligned accesses (x86_64, x86, armv7a).
There's only a 32-bit version because URIs are rarely long and very
often short, so it's more efficient to limit the systematic overhead
than to try to optimize for the rarest requests.
2016-11-05 18:00:35 +01:00
Willy Tarreau
0431f9d476 OPTIM: http: improve parsing performance of long header lines
A performance test with 1kB cookies was capping at 194k req/s. After
implementing multi-byte skipping, the performance increased to 212k req/s,
or 9.2% faster. This patch implements this for architectures supporting
unaligned accesses (x86_64, x86, armv7a). Maybe other architectures can
benefit from this but they were not tested yet.
2016-11-05 18:00:17 +01:00
Willy Tarreau
2235b261b6 OPTIM: http: move all http character classs tables into a single one
We used to have 7 different character classes, each was 256 bytes long,
resulting in almost 2kB being used in the L1 cache. It's as cheap to
test a bit than to check the byte is not null, so let's store a 7-bit
composite value and check for the respective bits there instead.

The executable is now 4 kB smaller and the performance on small
objects increased by about 1% to 222k requests/second with a config
involving 4 http-request rules including 1 header lookup, one header
replacement, and 2 variable assignments.
2016-11-05 15:58:08 +01:00
Erwan Velu
b12ff9a201 CLEANUP: proto_http: Removing useless variable assignation
delta is set to 0 just before being assigned to a buffer.
This patch is just removing this useless line, shorted is better.
2016-08-30 14:24:48 +02:00
Thierry FOURNIER / OZON.IO
4cac359a39 MEDIUM: log: Decompose %Tq in %Th %Ti %TR
Tq is the time between the instant the connection is accepted and a
complete valid request is received. This time includes the handshake
(SSL / Proxy-Protocol), the idle when the browser does preconnect and
the request reception.

This patch decomposes %Tq in 3 measurements names %Th, %Ti, and %TR
which returns respectively the handshake time, the idle time and the
duration of valid request reception. It also adds %Ta which reports
the request's active time, which is the total time without %Th nor %Ti.
It replaces %Tt as the total time, reporting accurate measurements for
HTTP persistent connections.

%Th is avalaible for TCP and HTTP sessions, %Ti, %TR and %Ta are only
avalaible for HTTP connections.

In addition to this, we have new timestamps %tr, %trg and %trl, which
log the date of start of receipt of the request, respectively in the
default format, in GMT time and in local time (by analogy with %t, %T
and %Tl). All of them are obviously only available for HTTP. These values
are more relevant as they more accurately represent the request date
without being skewed by a browser's preconnect nor a keep-alive idle
time.

The HTTP log format and the CLF log format have been modified to
use %tr, %TR, and %Ta respectively instead of %t, %Tq and %Tt. This
way the default log formats now produce the expected output for users
who don't want to manually fiddle with the log-format directive.

Example with the following log-format :

   log-format "%ci:%cp [%tr] %ft %b/%s h=%Th/i=%Ti/R=%TR/w=%Tw/c=%Tc/r=%Tr/a=%Ta/t=%Tt %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r"

The request was sent by hand using "openssl s_client -connect" :

   Aug 23 14:43:20 haproxy[25446]: 127.0.0.1:45636 [23/Aug/2016:14:43:20.221] test~ test/test h=6/i=2375/R=261/w=0/c=1/r=0/a=262/t=2643 200 145 - - ---- 1/1/0/0/0 0/0 "GET / HTTP/1.1"

=> 6 ms of SSL handshake, 2375 waiting before sending the first char (in
fact the time to type the first line), 261 ms before the end of the request,
no time spent in queue, 1 ms spend connecting to the server, immediate
response, total active time for this request = 262ms. Total time from accept
to close : 2643 ms.

The timing now decomposes like this :

                 first request               2nd request
      |<-------------------------------->|<-------------- ...
      t         tr                       t    tr ...
   ---|----|----|----|----|----|----|----|----|--
      : Th   Ti   TR   Tw   Tc   Tr   Td : Ti   ...
      :<---- Tq ---->:                   :
      :<-------------- Tt -------------->:
                :<--------- Ta --------->:
2016-08-23 15:18:08 +02:00
Willy Tarreau
3146a4cde2 BUG/MINOR: peers: don't count track-sc multiple times on errors
Ruoshan Huang found that the call to session_inc_http_err_ctr() in the
recent http-response patch was not a good idea because it also increments
counters that are already tracked (eg: http-request track-sc or previous
http-response track-sc). Better open-code the update, it's simple.
2016-07-26 15:25:32 +02:00
Ruoshan Huang
e4edc6b628 MEDIUM: http: implement http-response track-sc* directive
This enables tracking of sticky counters from current response. The only
difference from "http-request track-sc" is the <key> sample expression
can only make use of samples in response (eg. res.*, status etc.) and
samples below Layer 6.
2016-07-26 14:31:14 +02:00
Christopher Faulet
a9300a3d5a BUG/MINOR: Rework slightly commit 9962f8fc to clean code and avoid mistakes
In commit 9962f8fc (BUG/MEDIUM: http: unbreak uri/header/url_param hashing), we
take care to update 'msg->sov' value when the parser changes to state
HTTP_MSG_DONE. This works when no filter is used. But, if a filter is used and
if it loops on 'http_end' callback, the following block is evaluated two times
consecutively:

    if (unlikely(!(chn->flags & CF_WROTE_DATA) || msg->sov > 0))
            msg->sov -= ret;

Today, in practice, because this happens when all data are parsed and forwarded,
the second test always fails (after the first update, msg->sov is always lower
or equal to 0). But it is useless and error prone. So to avoid misunderstanding
the code has been slightly changed. Now, in all cases, we try to update msg->sov
only once per iteration.

No backport is needed.
2016-06-28 16:34:50 +02:00
Willy Tarreau
9962f8fc44 BUG/MEDIUM: http: unbreak uri/header/url_param hashing
Vedran Furac reported that "balance uri" doesn't work anymore in recent
1.7-dev versions. Dragan Dosen found that the first faulty commit was
dbe34eb ("MEDIUM: filters/http: Move body parsing of HTTP messages in
dedicated functions"), merged in 1.7-dev2.

After this patch, the hashing is performed on uninitialized data,
indicating that the buffer is not correctly rewound. In fact, all forms
of content-based hashing are broken since the commit above. Upon code
inspection, it appears that the new functions http_msg_forward_chunked_body()
and http_msg_forward_body() forget to rewind the buffer in the success
case, when the parser changes to state HTTP_MSG_DONE. The rewinding code
was reinserted in both functions and the fix was confirmed by two test,
with and without chunking.

No backport it needed.
2016-06-28 11:57:06 +02:00
Willy Tarreau
29bdb1c7ff BUG/MINOR: http: fix misleading error message for response captures
Kay Fuchs reported that the error message is misleading in response
captures because it suggests that "len" is accepted while it's not.

This needs to be backported to 1.6.
2016-06-24 15:36:34 +02:00
Christopher Faulet
1eea6d7ba8 BUG/MINOR: filters: Fix HTTP parsing when a filter loops on data forwarding
A filter can choose to loop on data forwarding. When this loop occurs in
HTTP_MSG_ENDING state, http_foward_data callbacks are called twice because of a
goto on the wrong label.

A filter can also choose to loop at the end of a HTTP message, in http_end
callback. Here the goto is good but the label is not at the right place. We must
be sure to upate msg->sov value.
2016-06-21 18:53:09 +02:00
Ruoshan Huang
dd01678a79 BUG/MINOR: fix http-response set-log-level parsing error
hi,
    `http-response set-log-level` doesn't work, as the config parsing always set the log level to -1.

From 2b183447c5b37c19aae5d596871fc0b9004c87b4 Mon Sep 17 00:00:00 2001
From: Ruoshan Huang <ruoshan.huang@gmail.com>
Date: Wed, 15 Jun 2016 22:07:58 +0800
Subject: [PATCH] BUG/MINOR: fix http-response set-log-level parsing error

http-response set-log-level can't parse the log level correctly,
as the level argument ptr is one byte ahead when passed to get_log_level
---
 src/proto_http.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
2016-06-17 17:57:58 +02:00
Dragan Dosen
db5af61f3c BUG/MINOR: http: url32+src should check cli_conn before using it
In function smp_fetch_url32_src(), it's better to check the value of
cli_conn before we go any further.

This patch needs to be backported to 1.6 and 1.5.
2016-06-16 12:53:25 +02:00
Dragan Dosen
e5f4133b19 BUG/MINOR: http: url32+src should use the big endian version of url32
This is similar to the commit 5ad6e1dc ("BUG/MINOR: http: base32+src
should use the big endian version of base32"). Now we convert url32 to big
endian when building the binary block.

This patch needs to be backported to 1.6 and 1.5.
2016-06-16 12:53:25 +02:00
Thierry Fournier
4b788f7d34 BUG/MEDIUM: http: add-header: buffer overwritten
If we use the action "http-request add-header" with a Lua sample-fetch or
converter, and the Lua function calls one of the Lua log function, the
header name is corrupted, it contains an extract of the last loggued data.

This is due to an overwrite of the trash buffer, because his scope is not
respected in the "add-header" function. The scope of the trash buffer must
be limited to the function using it. The build_logline() function can
execute a lot of other function which can use the trash buffer.

This patch fix the usage of the trash buffer. It limits the scope of this
global buffer to the local function, we build first the header value using
build_logline, and after we store the header name.

Thanks Michael Ezzell for the repporting.

This patch must be backported in 1.6 version
2016-06-08 10:34:22 +02:00
Thierry Fournier
53c1a9b7cb BUG/MINOR: http: add-header: header name copied twice
The header name is copied two time in the buffer. The first copy is a printf-like
function writing the name and the http separators in the buffer, and the second
form is a memcopy. This seems to be inherited from some changes. This patch
removes the printf like, format.

This patch must be backported in 1.6 and 1.5 versions
2016-06-08 10:34:07 +02:00
William Lallemand
2e785f23cb MEDIUM: tcp: add 'set-src' to 'tcp-request connection'
The 'set-src' action was not available for tcp actions The action code
has been converted into a function in proto_tcp.c to be used for both
'http-request' and 'tcp-request connection' actions.

Both http and tcp keywords are registered in proto_tcp.c
2016-06-01 11:44:11 +02:00
Willy Tarreau
58727ec088 BUG/MAJOR: http: fix breakage of "reqdeny" causing random crashes
Commit 108b1dd ("MEDIUM: http: configurable http result codes for
http-request deny") introduced in 1.6-dev2 was incomplete. It introduced
a new field "rule_deny_status" into struct http_txn, which is filled only
by actions "http-request deny" and "http-request tarpit". It's then used
in the deny code path to emit the proper error message, but is used
uninitialized when the deny comes from a "reqdeny" rule, causing random
behaviours ranging from returning a 200, an empty response, or crashing
the process. Often upon startup only 200 was returned but after the fields
are used the crash happens. This can be sped up using -dM.

There's no need at all for storing this status in the http_txn struct
anyway since it's used immediately after being set. Let's store it in
a temporary variable instead which is passed as an argument to function
http_req_get_intercept_rule().

As an extra benefit, removing it from struct http_txn reduced the size
of this struct by 8 bytes.

This fix must be backported to 1.6 where the bug was detected. Special
thanks to Falco Schmutz for his detailed report including an exploitable
core and a reproducer.
2016-05-25 16:23:59 +02:00
Vincent Bernat
6e61589573 BUG/MAJOR: fix listening IP address storage for frontends
When compiled with GCC 6, the IP address specified for a frontend was
ignored and HAProxy was listening on all addresses instead. This is
caused by an incomplete copy of a "struct sockaddr_storage".

With the GNU Libc, "struct sockaddr_storage" is defined as this:

    struct sockaddr_storage
      {
        sa_family_t ss_family;
        unsigned long int __ss_align;
        char __ss_padding[(128 - (2 * sizeof (unsigned long int)))];
      };

Doing an aggregate copy (ss1 = ss2) is different than using memcpy():
only members of the aggregate have to be copied. Notably, padding can be
or not be copied. In GCC 6, some optimizations use this fact and if a
"struct sockaddr_storage" contains a "struct sockaddr_in", the port and
the address are part of the padding (between sa_family and __ss_align)
and can be not copied over.

Therefore, we replace any aggregate copy by a memcpy(). There is another
place using the same pattern. We also fix a function receiving a "struct
sockaddr_storage" by copy instead of by reference. Since it only needs a
read-only copy, the function is converted to request a reference.
2016-05-19 10:43:24 +02:00
Willy Tarreau
3de5bd603c BUG/MEDIUM: http: fix risk of CPU spikes with pipelined requests from dead client
Since client-side HTTP keep-alive was introduced in 1.4-dev, a good
number of corner cases had to be dealt with. One of them remained and
caused some occasional CPU spikes that Cyril Bont had the opportunity
to observe from time to time and even recently to capture for deeper
analysis.

What happens is the following scenario :
  1) a client sends a first request which causes the server to respond
     using chunked encoding

  2) the server starts to respond with a large response that doesn't
     fit into a single buffer

  3) the response buffer fills up with the response

  4) the client reads the response while it is being sent by the server.

  5) haproxy eventually receives the end of the response from the server,
     which occupies the whole response buffer (must at least override the
     reserve), parses it and prepares to receive a second request while
     sending the last data blocks to the client. It then reinitializes the
     whole http_txn and calls http_wait_for_request(), which arms the
     http-request timeout.

  6) at this exact moment the client emits a second request, probably
     asking for an object referenced in the first page while parsing it
     on the fly, and silently disappears from the internet (internet
     access cut or software having trouble with pipelined request).

  7) the second request arrives into the request buffer and the response
     data stall in the response buffer, preventing the reserve from being
     used for anything else.

  8) haproxy calls http_wait_for_request() to parse the next request which
     has just arrived, but since it sees the response buffer is full, it
     refrains from doing so and waits for some data to leave or a timeout
     to strike.

  9) the http-request timeout strikes, causing http_wait_for_request() to
     be called again, and the function immediately returns since it cannot
     even produce an error message, and the loop is maintained here.

 10) the client timeout strikes, aborting the loop.

At first glance a check for timeout would be needed *before* considering
the buffer in http_wait_for_request(), but the issue is not there in fact,
because when doing so we see in the logs a client timeout error while
waiting for a request, which is wrong. The real issue is that we must not
consider the first transaction terminated until at least the reserve is
released so that http_wait_for_request() has no problem starting to process
a new request. This allows the first response to be reported in timeout and
not the second request. A more restrictive control could consist in not
considering the request complete until the response buffer has no more
outgoing data but this brings no added value and needlessly increases the
number of wake-ups when dealing with pipelining.

Note that the same issue exists with the request, we must ensure that any
POST data are finished being forwarded to the server before accepting a new
request. This case is much harder to trigger however as servers rarely
disappear and if they do so, they impact all their sessions at once.

No specific reproducer is usable because the bug is very hard to reproduce
and depends on the system as well with settings varying across reboots. On
a test machine, socket buffers were reduced to 4096 (tcp_rmem and tcp_wmem),
haproxy's buffers were set to 16384 and tune.maxrewrite to 12288. The proxy
must work in http-server-close mode, with a request timeout smaller than the
client timeout. The test is run like this :

  $ (printf "GET /15000 HTTP/1.1\r\n\r\n"; usleep 100000; \
     printf "GET / HTTP/1.0\r\n\r\n"; sleep 15) | nc6 --send-only 0 8002

The server returns chunks of the requested size (15000 bytes here, but
78000 in a previous test was the only working value). Strace must show the
last recvfrom() succeed and the last sendto() being shorter than desired or
better, not being called.

This fix must be backported to 1.6, 1.5 and 1.4. It was made in a way that
should make it possible to backport it. It depends on channel_congested()
which also needs to be backported. Further cleanup of http_wait_for_request()
could be made given that some of the tests are now useless.
2016-05-02 16:39:22 +02:00
Willy Tarreau
f51d03cf14 BUG/MEDIUM: http: fix incorrect reporting of server errors
Commit dbe34eb ("MEDIUM: filters/http: Move body parsing of HTTP
messages in dedicated functions") introduced a bug in function
http_response_forward_body() by getting rid of the while(1) loop.
The code immediately following the loop was only reachable on missing
data but now it's also reachable under normal conditions, which used
to be dealt with by the skip_resync_state label returning zero. The
side effect is that in http_server_close situations, the channel's
SHUTR flag is seen and considered as a server error which is reported
if any other error happens (eg: client timeout).

This bug is specific to 1.7, no backport is needed.
2016-05-02 16:39:22 +02:00
Thierry Fournier
0e00dca58b DOC: http: rename the unique-id sample and add the documentation
This patch renames the ssample fetch from "uniqueid" to "unique-id".
It also adds the documentation associated with this sample fetch.
2016-04-07 19:14:58 +02:00
Bertrand Paquet
83a2c3d4d7 BUG/MINOR : allow to log cookie for tarpit and denied request
The following patch allow to log cookie for tarpit and denied request.
This minor bug affect at least 1.5, 1.6 and 1.7 branch.

The solution is not perfect : may be the cookie processing
(manage_client_side_cookies) can be moved into http_process_req_common.
2016-04-06 14:58:41 +02:00
David Carlier
7365f7d41b CLEANUP: proto_http: few corrections for gcc warnings.
first, we modify the signatures of http_msg_forward_body and
http_msg_forward_chunked_body as they are declared as inline
below. Secondly, just verify the returns of the chunk initialization
which holds the Authorization Method (althought it is unlikely to fail  ...).
Both from gcc warnings.
2016-04-05 18:05:24 +02:00
Vincent Bernat
02779b6263 CLEANUP: uniformize last argument of malloc/calloc
Instead of repeating the type of the LHS argument (sizeof(struct ...))
in calls to malloc/calloc, we directly use the pointer
name (sizeof(*...)). The following Coccinelle patch was used:

@@
type T;
T *x;
@@

  x = malloc(
- sizeof(T)
+ sizeof(*x)
  )

@@
type T;
T *x;
@@

  x = calloc(1,
- sizeof(T)
+ sizeof(*x)
  )

When the LHS is not just a variable name, no change is made. Moreover,
the following patch was used to ensure that "1" is consistently used as
a first argument of calloc, not the last one:

@@
@@

  calloc(
+ 1,
  ...
- ,1
  )
2016-04-03 14:17:42 +02:00
Vincent Bernat
3c2f2f207f CLEANUP: remove unneeded casts
In C89, "void *" is automatically promoted to any pointer type. Casting
the result of malloc/calloc to the type of the LHS variable is therefore
unneeded.

Most of this patch was built using this Coccinelle patch:

@@
type T;
@@

- (T *)
  (\(lua_touserdata\|malloc\|calloc\|SSL_get_app_data\|hlua_checkudata\|lua_newuserdata\)(...))

@@
type T;
T *x;
void *data;
@@

  x =
- (T *)
  data

@@
type T;
T *x;
T *data;
@@

  x =
- (T *)
  data

Unfortunately, either Coccinelle or I is too limited to detect situation
where a complex RHS expression is of type "void *" and therefore casting
is not needed. Those cases were manually examined and corrected.
2016-04-03 14:17:42 +02:00
Willy Tarreau
f3764b7993 MEDIUM: proxy: use dynamic allocation for error dumps
There are two issues with error captures. The first one is that the
capture size is still hard-coded to BUFSIZE regardless of any possible
tune.bufsize setting and of the fact that frontends only capture request
errors and that backends only capture response errors. The second is that
captures are allocated in both directions for all proxies, which start to
count a lot in configs using thousands of proxies.

This patch changes this so that error captures are allocated only when
needed, and of the proper size. It also refrains from dumping a buffer
that was not allocated, which still allows to emit all relevant info
such as flags and HTTP states. This way it is possible to save up to
32 kB of RAM per proxy in the default configuration.
2016-03-31 13:49:23 +02:00
Thierry Fournier
f4011ddcf5 MINOR: http: sample fetch which returns unique-id
This patch adds a sample fetch which returns the unique-id if it is
configured. If the unique-id is not yet generated, it build it. If
the unique-id is not configured, it returns none.
2016-03-30 17:19:45 +02:00
Nenad Merdanovic
69ad4b9977 BUG/MAJOR: Fix crash in http_get_fhdr with exactly MAX_HDR_HISTORY headers
Similar issue was fixed in 67dad27, but the fix is incomplete. Crash still
happened when utilizing req.fhdr() and sending exactly MAX_HDR_HISTORY
headers.

This fix needs to be backported to 1.5 and 1.6.

Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>
2016-03-29 16:03:41 +02:00
Willy Tarreau
5c557d14d5 CLEANUP: http: fix a build warning introduced by a recent fix
Cyril reported that recent commit 320ec2a ("BUG/MEDIUM: chunks: always
reject negative-length chunks") introduced a build warning because gcc
cannot guess that we can't fall into the case where the auth_method
chunk is not initialized.

This patch addresses it, though for the long term it would be best
if chunk_initlen() would always initialize the result.

This fix must be backported to 1.6 and 1.5 where the aforementionned
fix was already backported.
2016-03-13 08:17:02 +01:00
Willy Tarreau
1e62df92e3 MEDIUM: stats: implement a typed output format for stats
The output for each field is :
  field:<origin><nature><scope>:type:value

where field reminds the type of the object being dumped as well as its
position (pid, iid, sid), field number and field name. This way a
monitoring utility may very well report all available information without
knowing new fields in advance.

This format is also supported in the HTTP version of the stats by adding
";typed" after the URI, instead of ";csv" for the CSV format.

The doc was not updated yet.
2016-03-11 17:24:15 +01:00
Willy Tarreau
6204cd9f27 BUG/MAJOR: vars: always retrieve the stream and session from the sample
This is the continuation of previous patch called "BUG/MAJOR: samples:
check smp->strm before using it".

It happens that variables may have a session-wide scope, and that their
session is retrieved by dereferencing the stream. But nothing prevents them
from being used from a streamless context such as tcp-request connection,
thus crashing the process. Example :

    tcp-request connection accept if { src,set-var(sess.foo) -m found }

In order to fix this, we have to always ensure that variable manipulation
only happens via the sample, which contains the correct owner and context,
and that we never use one from a different source. This results in quite a
large change since a lot of functions are inderctly involved in the call
chain, but the change is easy to follow.

This fix must be backported to 1.6, and requires the last two patches.
2016-03-10 17:28:04 +01:00
Willy Tarreau
be508f1580 BUG/MAJOR: samples: check smp->strm before using it
Since commit 6879ad3 ("MEDIUM: sample: fill the struct sample with the
session, proxy and stream pointers") merged in 1.6-dev2, the sample
contains the pointer to the stream and sample fetch functions as well
as converters use it heavily.

The problem is that earlier commit 87b0966 ("REORG/MAJOR: session:
rename the "session" entity to "stream"") had split the session and
stream resulting in the possibility for smp->strm to be NULL before
the stream was initialized. This is what happens in tcp-request
connection rulesets, as discovered by Baptiste.

The sample fetch functions must now check that smp->strm is valid
before using it. An alternative could consist in using a dummy stream
with nothing in it to avoid some checks but it would only result in
deferring them to the next step anyway, and making it harder to detect
that a stream is valid or the dummy one.

There is still an issue with variables which requires a complete
independant fix. They use strm->sess to find the session with strm
possibly NULL and passed as an argument. All call places indirectly
use smp->strm to build strm. So the problem is there but the API needs
to be changed to remove this duplicate argument that makes it much
harder to know what pointer to use.

This fix must be backported to 1.6, as well as the next one fixing
variables.
2016-03-10 16:42:58 +01:00
Christopher Faulet
75e2eb66e5 MINOR: filters/http: Forward remaining data when a channel has no "data" filters
This is an improvement, especially when the message body is big. Before this
patch, remaining data were forwarded when there is no filter on the stream. Now,
the forwarding is triggered when there is no "data" filter on the channel. When
no filter is used, there is no difference, but when at least one filter is used,
it can be really significative.
2016-02-09 14:53:15 +01:00
Christopher Faulet
113f7decfc MINOR: filters/http: Slightly update the parsing of chunks
Now, http_parse_chunk_size and http_skip_chunk_crlf return the number of bytes
parsed on success. http_skip_chunk_crlf does not use msg->sol anymore.

On the other hand, http_forward_trailers is unchanged. It returns >0 if the end
of trailers is reached and 0 if not. In all cases (except if an error is
encountered), msg->sol contains the length of the last parsed part of the
trailer headers.

Internal doc and comments about msg->sol has been updated accordingly.
2016-02-09 14:53:15 +01:00
Christopher Faulet
da02e17d42 MAJOR: filters: Require explicit registration to filter HTTP body and TCP data
Before, functions to filter HTTP body (and TCP data) were called from the moment
at least one filter was attached to the stream. If no filter is interested by
these data, this uselessly slows data parsing.
A good example is the HTTP compression filter. Depending of request and response
headers, the response compression can be enabled or not. So it could be really
nice to call it only when enabled.

So, now, to filter HTTP/TCP data, a filter must use the function
register_data_filter. For TCP streams, this function can be called only
once. But for HTTP streams, when needed, it must be called for each HTTP request
or HTTP response.
Only registered filters will be called during data parsing. At any time, a
filter can be unregistered by calling the function unregister_data_filter.
2016-02-09 14:53:15 +01:00
Christopher Faulet
dbe34eb8cb MEDIUM: filters/http: Move body parsing of HTTP messages in dedicated functions
Now body parsing is done in http_msg_forward_body and
http_msg_forward_chunked_body functions, regardless of whether we parse a
request or a response.
Parsing result is still handled in http_request_forward_body and
http_response_forward_body functions.

This patch will ease futur optimizations, mainly on filters.
2016-02-09 14:53:15 +01:00