Commit Graph

7942 Commits

Author SHA1 Message Date
Christopher Faulet
174bfb163c BUG/MEDIUM: htx: Set the right start-line offset after a defrag
The offset was always wrong after an HTX defragmentation because the wrong
address was used and because the update could occcur several time on the same
defragmentation.
2018-12-06 15:01:40 +01:00
William Lallemand
27f3fa56f5 BUG/MEDIUM: mworker: stop every tasks in the master
The master is not supposed to run (at the moment) any task before the
polling loop, the created tasks should be run only in the workers but in
the master they should be disabled or removed.

No backport needed.
2018-12-06 14:12:58 +01:00
Christopher Faulet
145aa4772c MINOR: mux-h1: Drain obuf if the output is closed after sending data
It avoids to subscribe to send events because some may remain in the output
buffer. If the output is closed or if an error occurred, there is no way to send
these data anyway, so it is safe to drain them.
2018-12-06 14:11:29 +01:00
Willy Tarreau
c14999b3bc BUG/MEDIUM: mux-h2: stop sending using HTX on errors
We didn't take care of the stream error in the HTX send loop, causing
some errors (like buffer full) to provoke 100% CPU.

No backport is needed.
2018-12-06 14:09:09 +01:00
Willy Tarreau
8e162ee1f9 BUG/MEDIUM: mux-h2: use the correct offset for the HTX start line
Due to a thinko, I used sl_off as the start line index number but it's
not it, it's its offset. The first index is obtained using htx_get_head(),
and the start line is obtained using htx_get_sline(). This caused crashes
to happen when forwarding HTX traffic via the H2 mux once the HTX buffer
started to wrap.

No backport is needed.
2018-12-06 14:07:27 +01:00
Christopher Faulet
b2e841681a MINOR: mux-h1: Allow partial data consumption during outgoing data processing
In h1_process_output(), instead of waiting to have enough data to send to
consume a full block of data, we are now able consume partially these blocks.
2018-12-06 13:26:16 +01:00
Christopher Faulet
aa75b3d2d5 CLEANUP: htx: Fix indentation here and there in HTX files 2018-12-05 17:33:14 +01:00
Christopher Faulet
56df0a82ea MINOR: mux-h1: Don't adjust anymore the amount of data sent in h1_snd_buf()
Because the infinite forward is now HTX aware, it is now useless to tinker with
the number of bytes really sent.
2018-12-05 17:32:10 +01:00
Christopher Faulet
b2aedea142 MEDIUM: channel/htx: Add functions for forward HTX data
To ease the fast forwarding and the infinte forwarding on HTX proxies, 2
functions have been added to let the channel be almost aware of the way data are
stored in its buffer. By calling these functions instead of legacy ones, we are
sure to forward the right amount of data.
2018-12-05 17:29:30 +01:00
Christopher Faulet
27ba2dc6d6 MEDIUM: htx: Rework conversion from a buffer to an htx structure
Now, the function htx_from_buf() will set the buffer's length to its size
automatically. In return, the caller should call htx_to_buf() at the end to be
sure to leave the buffer hosting the HTX message in the right state. When the
caller can use the function htxbuf() to get the HTX message without any update
on the underlying buffer.
2018-12-05 17:10:16 +01:00
Christopher Faulet
7003378eac BUG/MINOR: mux-h1: Check h1m flags to set the server conn_mode on request path
On the server side, we must test the request headers to deduce if we able to do
keepalive or not. Otherwise, by default, the keepalive will be enabled on the
server's connection, whatever the client said.
2018-12-05 16:46:44 +01:00
Willy Tarreau
674e0addc4 BUG/MEDIUM: stream-int: don't mark as blocked an empty buffer on Rx
After 8706c8131 ("BUG/MEDIUM: mux_pt: Always set CS_FL_RCV_MORE."), a
side effect caused failed receives to mark the buffer as missing room,
a flag that no other place can remove since it's empty. Ideally we need
a separate flag to mean "failed to deliver data by lack of room", but
in the mean time at the very least we must not mark as blocked an
empty buffer.

No backport is needed.
2018-12-05 13:45:41 +01:00
Willy Tarreau
c5efa33021 MEDIUM: mux-h1: avoid a double copy on the Tx path whenever possible
In order to properly deal with unaligned contents, the output data are
currently copied into a temporary buffer, to be copied into the mux's
output buffer at the end. The new buffer API allows several buffers to
share the same data area, so we're using this here to make the temporary
buffer point to the same area as the output buffer when that one is
empty. This is enough to avoid the copy at the end, only pointers and
lengths have to be adjusted. In addition the output buffer's head is
advanced by the HTX header size so that the remaining copy is aligned.

By doing this we improve the large object performance by an extra 10%,
which is 64% above the 1.9-dev9 state. It's worth noting that there are
no more calls to __memcpy_sse2_unaligned() now.

Since this code deals with various block types, it appears difficult to
adjust it to be smart enough to even avoid the first copy. However a
distinct approach could consist in trying to detect a single blocked
HTX and jump to dedicated code in this case.
2018-12-05 11:23:41 +01:00
Willy Tarreau
78f548f49e MEDIUM: mux-h1: attempt to zero-copy Rx DATA transfers
When transferring large objects, most calls are made between a full
buffer and an empty buffer. In this case there is a large opportunity
for performing zero-copy calls, with a few exceptions : the input data
must fit into the output buffer, and the data need to be properly
aligned and formated to let the HTX header fit before and the HTX
block(s) fit after.

This patch does two things :

1) it makes sure that we prepare an empty input buffer before an recv()
   call so that it appears as holding an HTX block at the front, which is
   removed afterwards. This way the data received using recv() are placed
   exactly at the target position in the input buffer for a later cast to
   HTX.

2) when receiving data in h1_process_data(), if it appears that the input
   buffer can be cast to an HTX buffer and the target buffer is empty,
   then the buffers are swapped, an HTX block is prepended in front of the
   data area, and the HTX block is appended to reference this data block.

In practice, this ensures that in most cases when transferring large files,
calls to h1_rcv_buf() are made using zero copy and a little bit of buffer
preparation (~40 bytes to be written).

Doing this adds an extra 13% performance boost on top of previous patch,
resulting in a total of 50% speed up on large transfers.
2018-12-05 11:10:24 +01:00
Willy Tarreau
45f2b89156 MEDIUM: mux-h1: make use of buf_room_for_htx_data() instead of b_room()
Just by using this buffer room estimation for the demux buffer, the large
object performance has increased by up to 33%. This is mostly due to less
recv() calls and unaligned copies.
2018-12-05 10:57:42 +01:00
Olivier Houchard
8706c81316 BUG/MEDIUM: mux_pt: Always set CS_FL_RCV_MORE.
When using the mux_pt, as we can't know if there's more data to be read,
always set CS_FL_RCV_MORE, and only remove it if we got an error or a shutr
and rcv_buf() returned 0.
2018-12-04 19:23:56 +01:00
Olivier Houchard
29a22bc0a7 MEDIUM: h1: Realign the ibuf before calling rcv_buf if needed.
If the ibuf only contains a small amount of data, realign it
before calling rcv_buf(), as it's probably going to be cheaper
to do so than to do 2 calls to recv().
2018-12-04 18:42:33 +01:00
Olivier Houchard
cf42d5afa0 BUG/MEDIUM: h1: Correctly report used data with no len.
When we have no content-length, and not in chunk mode, correctly
report the used data. We really used "ret", and not "max".
2018-12-04 18:32:39 +01:00
Willy Tarreau
2fb1d4caaa MINOR: mux-h2: stop on non-DATA and non-EOM HTX blocks
We don't want to send such blocks as DATA frames if they were ever to
appear, let's quit when meeting them.
2018-12-04 18:32:39 +01:00
Willy Tarreau
ee57376ffb BUG/MEDIUM: mux-h2: don't send more HTX data than requested
It's incorrect to send more bytes than requested, because some filters
(e.g. compression) might intentionally hold on some blocks, so DATA
blocks must not be processed past the advertised byte count. It is not
the case for headers however.

No backport is needed.
2018-12-04 18:32:39 +01:00
Willy Tarreau
b08d91fbc5 BUG/MEDIUM: mux-h2: stop sending HTX once the mux is blocked
If we're blocking on mux full, mux busy or whatever, we must get out of
the loop. In legacy mode this problem doesn't exist as we can normally
return 0 but here it's not a sufficient condition to stop sending, so
we must inspect the blocking flags as well.

No backport is needed.
2018-12-04 18:32:39 +01:00
Willy Tarreau
0c22fa7d6f BUG/MEDIUM: mux-h2: make sure to always report HTX EOM when consumed by headers
The way htx_xfer_blks() was used is wrong, if we receive data, we must
report everything we found, not just the headers blocks. This ways causing
the EOM to be postponed and some fast responses (or errors) to be incorrectly
delayed.

No backport is needed.
2018-12-04 18:32:39 +01:00
Willy Tarreau
0f799ca4df BUG/MEDIUM: mux-h2: properly update the window size in HTX mode
When sending data in HTX mode, we forgot to update the window size, it
was the cause of the limitation to 1 GB in testing.

No backport is needed.
2018-12-04 18:32:39 +01:00
Olivier Houchard
8122a8d681 BUG/MEDIUM: h2: When sending in HTX, make sure the caller knows we sent all.
In h2_snd_buf(), when running with htx, make sure we return the amount of
data the caller specified, if we emptied the buffer, as it is what the
caller expects, and will lead to him properly consider the buffer to be
empty.
2018-12-04 18:32:39 +01:00
Christopher Faulet
f3d480517f BUG/MINOR: proto_htx: Truncate the request when an error is detected
When HTTP_MSGF_ERROR is set on a channel (the request or the response), the
request must be truncated, not the response.
2018-12-04 16:43:30 +01:00
Christopher Faulet
1a7ad7ad18 BUG/MEDIUM: mux-h1: Release the mux H1 in h1_process() if there is no h1s
With the current design, there is always an H1 stream attached to the mux. So
after the conn_stream is detached, if we don't create a new H1 stream in
h1_process(), it is important to release the mux.
2018-12-04 16:43:30 +01:00
Christopher Faulet
c386a8851e MINOR: mux-h1: Make sure to return 1 in h1_recv() when needed
In h1_recv(), return 1 if we have data available, or if h1_recv_allowed()
failed, to be sure h1_process() is called. Also don't subscribe if our buffer
is full.
2018-12-04 16:43:30 +01:00
Christopher Faulet
37e3607e37 BUG/MEDIUM: mux-h1: Always set CS_FL_RCV_MORE when data are received in h1_recv()
It is a warranty that the data will be handled by the stream, even if an error
is reported on the connection or on the conn_stream.
2018-12-04 16:43:30 +01:00
Olivier Houchard
75159a96de MEDIUM: mux-h1: Revamp the way subscriptions are handled.
Don't always wake the tasklets subscribed to recv or send events as soon as
we had any I/O event, and don't call the wake() method if there were no
subscription, instead, wake the recv tasklet if we received data in h2_recv(),
and wake the send tasklet if we were able to send data in h2_send(), and the
buffer is not full anymore.
Only call the data_cb->wake() method if we get an error/a read 0, just in
case the stream was not subscribed to receive events.
2018-12-04 16:43:30 +01:00
Olivier Houchard
c490efd625 BUG/MEDIUM: stream_interface: Make REALLY sure we read all the data.
In si_cs_recv(), try inconditionally to recv as long as the CS_FL_RCV_MORE is
set on the conn_stream, or we will miss some data.
2018-12-04 16:43:30 +01:00
Christopher Faulet
6e54095d0a BUG/MINOR: flt_trace/compression: Use the right flag to add the HTX support
Of course, the flag FLT_CFG_FL_HTX must be used and not
STRM_FLT_FL_HAS_FILTERS. "Fortunately", these 2 flags have the same value, so
everything worked as expected.
2018-12-04 16:43:30 +01:00
Olivier Houchard
435ce2d71d BUG/MEDIUM: h2: Don't forget to wake the tasklet after shutr/shutw.
When reaching h2_shutr/h2_shutw, as we may have generated an empty frame,
a goaway or a rst, make sure we wake the I/O tasklet, or we may not send
what we just generated.
Also in h2_shutw(), don't forget to return if all went well, we don't want
to subscribe the h2s to wait events.
2018-12-04 05:57:34 +01:00
Olivier Houchard
7ccff1a3d3 BUG/MEDIUM: h1: Destroy a connection after detach if it has no owner.
Destroy the connection while detaching, even if it has keep alive, if it has
no owner, or nobody else will be able to do so.
2018-12-04 05:57:31 +01:00
William Lallemand
2fd45fae46 BUG/MEDIUM: mworker: stop proxies which have no listener in the master
The previous code was only stopping the listeners in the master, not the
entire proxy.

Since we now have a polling loop in the master, there might be some side
effects, indeed some things that are still initialized. For example the
checks were still running.
2018-12-04 05:54:33 +01:00
Jérôme Magnin
e064a80fa3 BUG/MINOR: fix ssl_fc_alpn and actually add ssl_bc_alpn
When ssl_bc_alpn was meant to be added, a typo slipped in and as a result ssl_fc_alpn behaved as ssl_bc_alpn,
and ssl_bc_alpn was not a valid keyword. this patch aims at fixing this.
2018-12-04 05:53:45 +01:00
Christopher Faulet
1e7af46aae BUG/MINOR: htx: Force HTTP/1.1 on H1 formatting when version is 1.1 or above
This only happens for connections using the h1 mux. We must be sure to force the
version to HTTP/1.1 when the version of the message is 1.1 or above. It is
important for H2 messages to not send an invalid version string (HTTP/2.0) to
peers.
2018-12-04 05:51:39 +01:00
Christopher Faulet
c59ff23804 MINOR: htx: Rename functions htx_*_to_str() to be H1 specific
"_to_h1" suffix is now used because these function produce H1 strings. It avoids
any ambiguity on the output format.
2018-12-04 05:51:37 +01:00
Willy Tarreau
b54c40ac0b BUILD: threads: fix minor build warnings when threads are disabled
These potential null-deref warnings are emitted on gcc 7 and above
when threads are disabled due to the use of objt_server() after an
existing validity test. Let's switch to __objt_server() since we
know the pointer is valid, it will not confuse the compiler.

Some of these may be backported to 1.8.
2018-12-02 19:28:41 +01:00
Willy Tarreau
c8b476d952 BUG/MINOR: lb-map: fix unprotected update to server's score
The loop trying to figure the best server is theorically capable of
finishing the loop with best == NULL, causing the HA_ATOMIC_SUB()
to fail there. However for this to happen the list should be empty,
which is avoided at the beginning of the function. As it is, the
function still remains at risk so better address this now.

This patch should be backported to 1.8.
2018-12-02 19:22:55 +01:00
Joseph Herlant
3b4e8e118f CLEANUP: Fix a typo in the base64 subsystem
Fixes a typo in the code comments of the base64 subsystem.
2018-12-02 18:42:08 +01:00
Joseph Herlant
31019eebe9 CLEANUP: Fix typo in the 51d subsystem
Fixes a typo in the code comments of the 51d subsystem.
2018-12-02 18:41:54 +01:00
Joseph Herlant
008b3cefa1 CLEANUP: Fix typos in the cli subsystem
Fixes typos in the code comments of the cli subsystem.
2018-12-02 18:41:44 +01:00
Joseph Herlant
40650960fd CLEANUP: Fix typo in the fwrr subsystem
Fixes a typo in the code comment of the fwrr subsystem.
2018-12-02 18:40:53 +01:00
Joseph Herlant
f43b88bc09 CLEANUP: Fix typos in the map management functions
Fixes typos in the code comments of the MAP management functions.
2018-12-02 18:40:38 +01:00
Joseph Herlant
8bb32ae8c6 CLEANUP: Fix typos in the socket pair protocol subsystem
Fixes typos in the code comments of the socket pair protocol subsystem.
2018-12-02 18:40:33 +01:00
Joseph Herlant
3952643b35 CLEANUP: Fix typos in the shctx subsystem
Fixes typos in the code comments of the shctx subsystem.
2018-12-02 18:40:29 +01:00
Joseph Herlant
d8499ecb6e CLEANUP: Fix a typo in the queue subsystem
Fixes a typo in the code comments of the queue subsystem.
2018-12-02 18:40:11 +01:00
Joseph Herlant
d091bfbc6f CLEANUP: Fix a typo in the session subsystem
Fixes a typo in the code comments of the session subsystem.
2018-12-02 18:39:57 +01:00
Joseph Herlant
f6989ca056 CLEANUP: Fix a typo in the signal subsystem
Fixes a typo in the code comments of the signal subsystem.
2018-12-02 18:39:52 +01:00
Joseph Herlant
a6331475e0 CLEANUP: Fix typos in the proto_tcp subsystem
Fixes typos in the code comments of the proto_tcp subsystem.
2018-12-02 18:39:05 +01:00
Joseph Herlant
e9d5c727c1 CLEANUP: Fix a typo in the proto_htx subsystem
Fixes a typo in the code comments of the proto_htx subsystem.
2018-12-02 18:38:48 +01:00
Joseph Herlant
d77575d03e CLEANUP: Fix typos in the h2 subsystem
Fixes typos in the code comments of the h2 subsystem.
2018-12-02 18:38:08 +01:00
Joseph Herlant
30bc509c40 CLEANUP: Fix typos in the h1 subsystem
Fixes typos in the code comments of the h1 subsystem.
2018-12-02 18:38:02 +01:00
Joseph Herlant
c42c0e9969 CLEANUP: fix typos in the htx subsystem
Fix typos detected in the code comments of the htx subsystem.
2018-12-02 18:37:50 +01:00
Olivier Houchard
0c18a6fe34 MEDIUM: servers: Add a way to keep idle connections alive.
Add a new keyword for servers, "idle-timeout". If set, unused connections are
kept alive until the timeout happens, and will be picked for reuse if no
other connection is available.
2018-12-02 18:16:53 +01:00
Olivier Houchard
8defe4b51a MINOR: mux: add a "max_streams" method.
Add a new method to muxes, "max_streams", that returns the max number of
streams the mux can handle. This will be used to know if a mux is in use
or not.
2018-12-02 17:48:32 +01:00
Olivier Houchard
a6cf7112bb MEDIUM: mux-h2: Don't bother flagging outgoing connections as TOOMANY.
When creating a new stream, don't bother flagging a connection with
H2_CF_DEM_TOOMANY if we created the last available stream. We won't create
any other anyway, because h2_avail_streams() would return 0 available streams,
and has it is a blocking flag, it prevents us from reading data after.
2018-12-02 13:31:53 +01:00
Olivier Houchard
7a57e8a67a MEDIUM: mux-h2: Implement h2_attach().
Implement h2_attach(), so that we can have multiple streams in one outgoin
h2 connection.
2018-12-02 13:31:53 +01:00
Willy Tarreau
c12f38fe32 MEDIUM: mux-h2: make h2_process_demux() capable of processing responses as well
The function now calls h2c_bck_handle_headers() or h2c_frt_handle_headers()
depending on the connection's side. The former doesn't create a new stream
but feeds an existing one. At this point it's possible to forward an H2
request to a backend server and retrieve the response headers.
2018-12-02 13:31:52 +01:00
Willy Tarreau
c3e18f3448 MEDIUM: mux-h2: make h2_frt_decode_headers() direction-agnostic
This function does not really depend on the request, all it does is
also valid for H2 responses found on the backend side, so this patch
renames it and makes it call the appropriate decoder based on the
direction.
2018-12-02 13:31:52 +01:00
Willy Tarreau
8073969376 MEDIUM: mux-h2: implement encoding of H2 request on the backend side
This creates an H2 HEADERS frame from an HTX request. The code is
very similar to the response encoding, so probably that in the future
we'll have to factor these functions differently. The HTX's start line
type is used to decide on the direction. We also purposely error out
when trying to encode an H2 request from an H1 message since it's not
implemented.
2018-12-02 13:31:52 +01:00
Willy Tarreau
01b4482b46 MEDIUM: mux-h2: start to create the outgoing mux
For now it reports an immediate error when trying to encode the request
since it doesn't parse as a response. We take care of sending the preface
and settings frame with the outgoing connection, and not to wait for a
preface during the H2_CS_PREFACE phase for outgoing connections.
2018-12-02 13:31:51 +01:00
Willy Tarreau
751f2d0ddf MINOR: mux-h2: implement an outgoing stream allocator : h2c_bck_stream_new()
For the backend we'll need to allocate streams as well. Let's do this
with h2c_bck_stream_new(). The stream ID allocator was split from it
so that the caller can decide whether or not to stay on the same
connection or create a new one. It possibly isn't the best way to do
this as once we're on the mux it's too late to give up creation of a
new stream. Another approach would possibly consist in detaching muxes
that reached their connection count limit before they can be reused.

Instead of choosing the stream id as soon as the stream is created, wait
until data is about to be sent. If we don't do that, the stream may send
data out of order, and so the stream 3 may send data before the stream 1,
and then when the stream 1 will try to send data, the other end will
consider that an error, as stream ids should always be increased.

Cc: Olivier Houchard <ohouchard@haproxy.com>
2018-12-02 13:31:51 +01:00
Willy Tarreau
f8957277ff MINOR: mux-h2: mention that the mux is compatible with both sides
We declare two configurations for the H2 mux. One supporting only
the frontend in HTTP mode and one supporting both sides in HTX mode.

This is only to ease development at this point. Trying to assign an h2
mux on the server side will still fail during h2_init() anyway instead
of at config parsing time.
2018-12-02 13:31:03 +01:00
Willy Tarreau
1329b5be71 MINOR: h2: add new functions to produce an HTX message from an H2 response
The new function h2_prepare_htx_stsline() produces an HTX response message
from an H2 response presented as a list of header fields.
2018-12-02 13:30:17 +01:00
Willy Tarreau
a40782bb24 MINOR: hpack: add ":path" to the list of common header fields
The ":path" header field will be used a lot with outgoing requests,
let's encode it with its index.
2018-12-02 13:30:17 +01:00
Willy Tarreau
76a551de2e MINOR: config: make sure to associate the proper mux to bind and servers
Currently a mux may be forced on a bind or server line by specifying the
"proto" keyword. The problem is that the mux may depend on the proxy's
mode, which is not known when parsing this keyword, so a wrong mux could
be picked.

Let's simply update the mux entry while checking its validity. We do have
the name and the side, we only need to see if a better mux fits based on
the proxy's mode. It also requires to remove the side check while parsing
the "proto" keyword since a wrong mux could be picked.

This way it becomes possible to declare multiple muxes with the same
protocol names and different sides or modes.
2018-12-02 13:29:35 +01:00
Willy Tarreau
c5753aedf7 BUG/MEDIUM: mux-h2: remove the HTX EOM block on H2 response headers
If we decided to emit the end of stream flag on the H2 response headers
frame, we must remove the EOM block from the HTX stream, otherwise it
will lead to an extra DATA frame being sent with the ES flag and will
violate the protocol.
2018-12-02 12:31:51 +01:00
Willy Tarreau
fab9bb08fc BUG/MEDIUM: mux-h2: don't lose the first response header in HTX mode
When converting response headers from HTX to H2, we accidently skipped
the first header block.
2018-12-02 12:31:20 +01:00
Christopher Faulet
bf7a9597e2 BUG/MINOR: cfgparse: Fix the call to post parser of the last sections parsed
Wrong variable was used to know if we need to call the callback
post_section_parser() or not. We must use 'cs' and not 'pcs'.

This patch must be backported in 1.8 with the commit 7805e2b ("BUG/MINOR:
cfgparse: Fix transition between 2 sections with the same name").
2018-12-02 10:21:47 +01:00
Willy Tarreau
61ea7dc005 MEDIUM: mux-h2: support passing H2 DATA frames to HTX blocks
This is used for uploads, we can now convert H2 DATA frames to HTX
DATA blocks. It's uncertain whether it's better to reuse the same
function or to split it in two at this point. For now the same
function was added with some paths specific to HTX. In this mode
we loop back to the same or next frame in order to try to complete
DATA blocks.
2018-12-01 23:31:13 +01:00
Willy Tarreau
0c535fd1b5 MEDIUM: mux-h2: implement the emission of DATA frames from HTX DATA blocks
At the moment the way it's done is not optimal. We should aggregate multiple
blocks into a single DATA frame, and we should merge the ES flag with the
last one when we already know we've reached the end. For now and for an
easier tracking of the HTX stream, an individual empty DATA frame is sent
with the ES bit when EOM is met.

The DATA function is called for DATA, EOD and EOM since these stats indicate
that a previous frame was already produced without the ES flag (typically a
headers frame or another DATA frame). Thus it makes sense to handle all these
blocks there.

There's still an uncertainty on the way the EOD and EOM HTX blocks must be
accounted for, as they're counted as one byte in the HTX stream, but if we
count that byte off when parsing these blocks, we end up sending too much
and desynchronizing the HTX stream. Maybe it hides an issue somewhere else.

At least it's possible to reliably retrieve payloads up to 1 GB over H2/HTX
now. It's still unclear why larger ones are interrupted at 1 GB.
2018-12-01 23:27:08 +01:00
Willy Tarreau
115e83b071 MEDIUM: mux-h2: implement emission of H2 headers frames from HTX blocks
When using HTX, we need a separate function to emit a headers frame.
The code is significantly different from the H1 to H2 conversion, though
it borrows some parts there. It looks like the part building the H2 frame
from the headers list could be factored out, however some of the logic
around dealing with end of stream or block sizes remains different.

With this patch it becomes possible to retrieve bodyless HTTP responses
using H2 over HTX.
2018-12-01 23:27:08 +01:00
Willy Tarreau
bd4a6b675c MEDIUM: mux-h2: add basic H2->HTX transcoding support for headers
When the proxy is configured to use HTX mode, the headers frames
will be converted to HTX header blocks instead of HTTP/1 messages.
This requires very little modifications to the existing function
so it appeared better to do it this way than to duplicate it.

Only the request headers are handled, responses are not processed
yet and data frames are not processed yet either. The return value
is inaccurate but this is not an issue since we're using it as a
boolean : data received or not.
2018-12-01 23:27:08 +01:00
Willy Tarreau
bcd3bb3ca2 MEDIUM: mux-h2: make h2_snd_buf() HTX-aware
Now h2_snd_buf() will check the proxy's mode to decide whether to use
HTX-specific send functions or legacy functions. In HTX mode, the HTX
blocks of the output buffer will be parsed and the related functions
will be called accordingly based on the block type, and unimplemented
blocks will be skipped. For now all blocks are skipped, this is only
helpful for debugging.
2018-12-01 23:27:07 +01:00
Willy Tarreau
86724e2e8a MEDIUM: mux-h2: make h2_rcv_buf() support HTX transfers
The function needs to be slightly adapted to transfer HTX blocks, since
it may face a full buffer on the receive path, thus it needs to transfer
HTX blocks between the two sides ignoring the <count> argument in this
mode.
2018-12-01 23:25:55 +01:00
Willy Tarreau
5ae9600950 MEDIUM: mux-h2: register mux for both HTTP and HTX modes
The H2 mux will now be called for both HTTP and HTX modes. For now the
data transferr functions are not HTX-aware so this will lead to problems
if used as-is but it's convenient for development and debugging.
2018-12-01 19:03:20 +01:00
Willy Tarreau
6deb4129de MINOR: h2: implement H2->HTX request header frame transcoding
Till now we could only produce an HTTP/1 request from a list of H2
request headers. Now the new function h2_make_htx_request() does the
same but using the HTX encoding instead, while respecting the H2
semantics. The code is not much different from the first version,
only the encoding differs.

For now it's not used.
2018-12-01 17:38:32 +01:00
Christopher Faulet
e6902cd57c MEDIUM: compression: Adapt to be compatible with the HTX representation
Functions analyzing request and response headers have been duplicated and
adapted to support HTX messages. The callback http_payload have been implemented
to handle the data compression itself. It loops on HTX blocks and replace
uncompressed value of DATA block by compressed one. Unlike the HTTP legacy
version, there is no chunk at all. So HTX version is significantly easier.
2018-12-01 17:37:27 +01:00
Christopher Faulet
e0aa6f7a9a MINOR: flt_trace: Adapt to be compatible with the HTX representation
The callback http_headers has been updated to dump HTX headers when the HTX
internal representation is in use. And the callback http_payload has been
implemented with its hexdump function.
2018-12-01 17:37:27 +01:00
Christopher Faulet
aed82cfb04 MEDIUM: proto_htx/filters: Add data filtering during the forwarding
If there is data filters registered on the stream, the function
flt_http_payload() is called before forwarding any data. And the function
flt_http_end() is called when all data are forwarded. While at least one data
filter reamins registered on the stream, no fast forwarding is used.
2018-12-01 17:37:27 +01:00
Christopher Faulet
75bc913d23 MAJOR: filters: Adapt filters API to be compatible with the HTX represenation
First, to be called on HTX streams, a filter must explicitly be declared as
compatible by setting the flag STRM_FLT_FL_HAS_FILTERS on the filter's config at
HAProxy startup. This flag is checked when a filter implementation is attached
to a stream.

Then, some changes have been made on HTTP callbacks. The callback http_payload
has been added to filter HTX data. It will be called on HTX streams only. It
replaces the callbacks http_data, http_chunk_trailers and http_forward_data,
called on legacy HTTP streams only and marked as deprecated. The documention
(once updated)) will give all information to implement this new callback. Other
HTTP callbacks will be called for HTX and HTTP legacy streams. So it is the
filter's responsibility to known which kind of data it handles. The macro
IS_HTX_STRM should be used in such cases.

There is at least a noticeable changes in the way data are forwarded. In HTX,
after the call to the callback http_headers, all the headers are considered as
forwarded. So, in http_payload, only the body and eventually the trailers will
be filtered.
2018-12-01 17:37:27 +01:00
Christopher Faulet
e44769b4fa MINOR: mux-h1: Capture bad H1 messages
First of all, an dedicated error snapshot, h1_snapshot, has been added. It
contains more or less the some info than http_snapshot but adapted for H1
messages. Then, the function h1_capture_bad_message() has been added to capture
bad H1 messages. And finally, the function h1_show_error_snapshot() is used to
dump these errors. Only Headers or data parsing are captured.
2018-12-01 17:37:27 +01:00
Christopher Faulet
bd44ca6ede MINOR: mux-h1: Change client conn_mode on an explicit close for the response
in h1_set_cli_conn_mode(), on the response path, If the response's connection
header is explicitly set to close and if the request is unfinished (state !=
DONE), then the client connection is marked as WANT_CLO.
2018-12-01 17:37:27 +01:00
Christopher Faulet
d1ebb1eeb5 MINOR: mux-h1: Process conn_mode on the EOH when no connection header is found
Instead of looking for a connection header just after the start line to know if
we must process the conn_mode by hand or if we wait to parse the connection
header, we now delay this processing when the end of headers is reached. A flag
is used to know if it was already done (or skipped) or not. This save a lookup
on headers.
2018-12-01 17:37:27 +01:00
Christopher Faulet
a7b677cd0d MEDIUM: proto_htx: Convert all HTTP error messages into HTX
During startup, after the configuration parsing, all HTTP error messages
(errorloc, errorfile or default messages) are converted into HTX messages and
stored in dedicated buffers. We use it to return errors in the HTX analyzers
instead of using ugly OOB blocks.
2018-12-01 17:37:27 +01:00
Christopher Faulet
99daf28a76 MINOR: proto_htx: Send valid HTX message to send 30x responses
The function htx_apply_redirect_rule() has been rewritten to send a valid
HTX message.
2018-12-01 17:37:27 +01:00
Christopher Faulet
0eaed6bb08 MINOR: proto_htx: Send valid HTX message when redir mode is enabled on a server
The function htx_perform_server_redirect() has been rewritten to send a valid
HTX message.
2018-12-01 17:37:27 +01:00
Christopher Faulet
12c51e28dd MINOR: proto_htx: Use full HTX messages to send 401 and 407 responses
Instead of replying by adding an OOB block in the HTX structure, we now add a
valid HTX message. The old code relied on the function http_reply_and_close() to
send 401/407 responses. Now, we push it in the response's buffer. So we take
care to drain the request's channel and to shutdown the response's channel for
the read.
2018-12-01 17:37:27 +01:00
Christopher Faulet
ee9b5bfe89 MINOR: proto_htx: Use full HTX messages to send 103-Early-Hints responses
Instead of replying by adding an OOB block in the HTX structure, we now add a
valid HTX message. A header block is added to each early-hint rule, prefixed by
the start line if it is the first one. The response is terminated and forwarded
when the rules execution is stopped or when a rule of another type is applied.
2018-12-01 17:37:27 +01:00
Christopher Faulet
23a3c790e6 MINOR: proto_htx: Use full HTX messages to send 100-Continue responses
Instead of replying by adding an OOB block in the HTX structure, we now add a
valid HTX message.
2018-12-01 17:37:27 +01:00
Christopher Faulet
b2db4fa016 MINOR: htx: Add BODYLESS flags on the HTX start-line and the HTTP message
the flags HTX_SL_F_BODYLESS and HTTP_MSGF_BODYLESS have been added. These flags
are set when the corresponding HTTP message has no body at all.
2018-12-01 17:37:27 +01:00
Christopher Faulet
0359911935 MINOR: proto-htx: Use the start-line flags to set the HTTP messsage ones
the flags of the HTX start-line (HTX_SL_F_*) are mapped on ones of the HTTP
message (HTTP_MSGS_*). So we can easily retrieve info from the parsing in HTX
analyzers.
2018-12-01 17:37:27 +01:00
Christopher Faulet
f1ba18d7b3 MEDIUM: htx: Don't rely on h1_sl anymore except during H1 header parsing
Instead, we now use the htx_sl coming from the HTX message. It avoids to have
too H1 specific code in version-agnostic parts. Of course, the concept of the
start-line is higly influenced by the H1, but the structure htx_sl can be
adapted, if necessary. And many things depend on a start-line during HTTP
analyzis. Using the structure htx_sl also avoid boring conversions between HTX
version and H1 version.
2018-12-01 17:37:27 +01:00
Christopher Faulet
54483df5ba MINOR: htx: Add the start-line offset for the HTX message in the HTX structure
If there is no start-line, this offset is set to -1. Otherwise, it is the
relative address where the start-line is stored in the data block. When the
start-line is added, replaced or removed, this offset is updated accordingly. On
remove, if the start-line is no set and if the next block is a start-line, the
offset is updated. Finally, when an HTX structure is defragmented, the offset is
also updated accordingly.
2018-12-01 17:37:27 +01:00
Christopher Faulet
570d1614fa MEDIUM: htx: Change htx_sl to be a struct instead of an union
The HTX start-line is now a struct. It will be easier to extend, if needed. Same
info can be found, of course. In addition it is now possible to set flags on
it. It will be used to set some infos about the message.

Some macros and functions have been added in proto/htx.h to help accessing
different parts of the start-line.
2018-12-01 17:37:27 +01:00
Christopher Faulet
b1b0821e8e MINOR: stats: Don't add end-of-data marker and trailers in the HTX response
Because the mux H1 is able to handle these blocks by itself, it is easier to
ignore them in the stats applet.
2018-12-01 17:37:27 +01:00
Christopher Faulet
24ed835129 MINOR: htx: Add function to add an HTX block just before another one
The function htx_add_data_before() can be used to add an HTX block before
another one. For instance, it could be used to add some data before the
end-of-message marker.
2018-12-01 17:37:27 +01:00
Christopher Faulet
9400a3924d MEDIUM: mux-h1: Add keep-alive outgoing connections in connections list
With the legacy representation, keep-alive outgoing connections are added in
private/idle/safe connections list when the transaction is cleaned up. But this
stage does not exist with the HTX representaion because a new stream, and
therefore a new transaction, is created for each request. So it is now handled
when the stream is detached from the connection.
2018-12-01 17:37:27 +01:00
Christopher Faulet
5d37dac785 MINOR: mux-h1: Consume channel's data in a loop in h1_snd_buf()
In h1_snd_buf(), the data sending is done synchronously, as much as possible. So
if some data remains in the channel's buffer, because there was not enougth
place in the output buffer, it may be good the retry after a send because some
space may have been released when sending. Most of time the output buffer is
empty and all channel's data are consumed the first time. And if no data are
sent, we don't retry to do more. So the loop is just here to optimize edge cases
without any cost for all others.
2018-12-01 17:37:27 +01:00
Christopher Faulet
f96c322664 MINOR: mux-h1: Subscribe to send in h1_snd_buf() when not all data have been sent
After a call to snd_buf, if some data remain in the channel's buffer, this means
the system buffers are full or we are unable to fully consume an HTX block for
any reason. In the last case, we need to wakeup the stream to process more data
as soon as possible. We do it subscribing to send at the end of h1_snd_buf().
2018-12-01 17:37:27 +01:00
Christopher Faulet
1727648e10 MINOR: mux-h1: Be prepare to fail when EOM is added during trailers parsing
When trailers are parsed, we must add the corrresponsing HTX block and then we
must add the block end-of-message. But this last operation can failed because
there is not enough space the HTX message. This case was left aside till
now. Now, we stay in the state H1_MSG_TRAILERS with the warranty we will be able
to restart at the right stage.
2018-12-01 17:37:27 +01:00
Christopher Faulet
3218821b70 MINOR: mux-h1: Write last chunk and trailers if not found in the HTX message
For chunked messages, during output process, the mux is now able to write the
last empty chunk and empty trailers when corrsponding blocks have not been found
in the HTX message. It is handy for filters changing a not-chunked message into
a chunked one (like the compression filter).
2018-12-01 17:37:27 +01:00
Christopher Faulet
a1692f51a5 MINOR: mux-h1: Don't rely on the stream anymore in h1_set_srv_conn_mode()
In h1_set_srv_conn_mode(), we need to get the frontend proxy of a server
connection. untill now, we relied on the stream to get it. But it was a bit
dirty. The stream always exists at this stage but to get it, we also need to get
the stream-interface. Since the commit 7c6f8b146 ("MAJOR: connections: Detach
connections from streams."), the connection's owner is always the session, even
for outgoing connections. So now, we rely on the session to get the frontend
proxy in h1_set_srv_conn_mode().

Use the session instead of the stream to get
the frontend on the server connection
2018-12-01 17:37:27 +01:00
Christopher Faulet
870aad9116 MINOR: proto_htx: Use conn_stream's info to set t_idle duration when possible
On the client side, if si_get_cs_info() returns valid info, we use it to set
t_idle duration. Otherwise, we compute it using the stream's logs info.
2018-12-01 17:37:27 +01:00
Christopher Faulet
b3484d67d3 MINOR: stream: Rely on CS's info if it exists and fallback on session's ones
When the stream is created, If si_get_cs_info() returns valid info for the client
connection stream, we use it. Otherwise we use session' info.
2018-12-01 17:37:27 +01:00
Christopher Faulet
feb1174be0 MINOR: mux-h1: Implement get_cs_info() callback
When the connection client is accepted, the info of the client conn_stream are
filled with the session info (accept_date, tv_accept and t_handshake). For all
other conn_streams, on client and server side, their info are filled using
global values (date and now).
2018-12-01 17:37:27 +01:00
Christopher Faulet
573fe735f4 BUG/MINOR: htx: Stop a header or a start line lookup on the first EOH or EOM
Because several messages can be stored in the HTX structure, it is important to
restrict searches to the current message.
2018-12-01 17:20:36 +01:00
Christopher Faulet
72b6273b5b BUG/MINOR: proto_htx: Send outgoing data to client to start response processing
In http_wait_for_response(), we wait that all outgoing data have really been
sent (from the channel's point of view) to start the processing of the
response. In fact, it is used to send all intermediate 10x responses. For now
the HTX api is not really handy when multiple messages are stored in the HTX
structure.
2018-12-01 17:20:36 +01:00
Christopher Faulet
66229af8df BUG/MEDIUM: mux-h1: Reset the H1 parser when an outgoing message is processed
Because multiple HTTP messages can be stored in an HTX structure, it is
important to not forget to reset the H1 parser at the beginning of each
one. With the current version, this case only happens on the response, when
multiple HTTP-1XX responses are forwarded to the client (for instance
103-Early-Hints). So strickly speaking, it is the same message. But for now,
internally, each one is a standalone message. Note that it might change in a
future version of the HTX.
2018-12-01 17:20:36 +01:00
Christopher Faulet
5999b86500 BUG/MINOR: mux-h1: Fix processing of "Connection: " header on outgoing messages
in h1_process_output(), before formatting the headers, we need to find and check
the "Connection: " header to update the connection mode. But, the context used
to do so was not correctly initialized. We must explicitly set ctx.value to NULL
to be sure to rescan the current header.
2018-12-01 17:20:36 +01:00
Christopher Faulet
53ad16a0ef BUG/MINOR: htx: Fix block size calculation when a start-line is added/replaced
What we store in the buffer is a union htx_sl, not an h1_sl, so the
computed size was not correct.
2018-12-01 17:20:36 +01:00
Christopher Faulet
ed26fb8ac8 BUG/MINOR: http: Use out buffer instead of trash to display error snapshot
the function http_show_error_snapshot() must not use the trash buffer to append
the HTTP error description. Instead, it must use the <out> buffer, its first
argument. Note that concretely, this function always succeeds because <out> is
always the trash buffer.
2018-12-01 17:20:36 +01:00
Christopher Faulet
7805e2bc1f BUG/MINOR: cfgparse: Fix transition between 2 sections with the same name
When a section's parser is registered, it can also define a post section
callback, called at the end of the section parsing. But when 2 sections with the
same name followed each other, the transition between them was missed. This
induced 2 bugs. First, the call to the post section callback was skipped. Then,
the parsing of the second section was mixed with the first one.

This patch must be backported in 1.8.
2018-12-01 17:20:36 +01:00
Olivier Houchard
2442f68dd3 BUG/MEDIUM: Special-case http_proxy when dealing with outgoing connections.
http_proxy is special, because it creates its connection and conn_stream
earlier. So in assign_server(), check that the connection associated with
the conn_stream has a destination address set, and in connect_server(),
use the connection and the conn_stream already attached to the
stream_interface, instead of looking for a connection in the session, and
creating a new conn_stream.
2018-12-01 17:20:03 +01:00
Olivier Houchard
ba4fff5fd2 MEDIUM: server: Be smarter about deciding to reuse the last server.
Instead of parsing all the available connections owned by the session
each time we choose a server, even if prefer-last-server is not set,
just do it if prefer-last-server is used, and check if the server is usable,
before checking the connections.
2018-12-01 15:45:30 +01:00
Olivier Houchard
985f139aa2 MEDIUM: session: Steal owner-less connections on end of transaction.
When a transaction ends, if we want to do keepalive, and the connection we
used didn't have an owner, attach the connection to the session, so that we
don't have to destroy it, and we can reuse it later.
2018-12-01 10:47:19 +01:00
Olivier Houchard
00cf70f28b MAJOR: sessions: Store multiple outgoing connections in the session.
Instead of just storing the last connection in the session, store all of
the connections, for at most MAX_SRV_LIST (currently 5) targets.
That way we can do keepalive on more than 1 outgoing connection when the
client uses HTTP/2.
2018-12-01 10:47:18 +01:00
Olivier Houchard
93c8852572 MEDIUM: h2: Destroy a connection with no stream if it has no owner.
In h2_detach(), if the connection has no stream left, and no associated
owner, then destroy it, as nobody else will be able to.
2018-12-01 10:47:18 +01:00
Olivier Houchard
bf024f0a15 MEDIUM: connections: Put H2 connections in the idle list if http-reuse always.
When creating a new outgoing H2 connection, put it in the idle list so that
it's immediately available for others to use, if http-reuse always is used.
2018-12-01 10:47:18 +01:00
Olivier Houchard
b72d98a619 BUG/MEDIUM: mux_pt: Don't try to send if handshake is not done.
While it is true the SSL code will do the right thing if the SSL handshake
is not done, we have other types of handshake to deal with (proxy protocol,
netscaler, ...). For those we definitively don't want to try to send data
before it's done. All handshakes but SSL will go through the mux_pt, so in
mux_pt_snd_buf, don't try to send while a handshake is pending.
2018-12-01 10:47:17 +01:00
Olivier Houchard
d7d627c0b9 BUG/MEDIUM: session: properly clean the outgoing connection before freeing.
In session_free(), make sure the outgoing connection is not in the idle list
anymore, and it does no longer have an owner, so that it will properly be
destroyed and nobody will be able to access it.
2018-12-01 10:47:17 +01:00
Olivier Houchard
a30a40bcca BUG/MEDIUM: connections: Remove the connection from the idle list before destroy.
Before calling the destroy() method, remove the connection from the idle list,
so that no new session will pick it.
2018-12-01 10:47:16 +01:00
Olivier Houchard
a49d41a9af BUG/MEDIUM: connections: Don't assume we have a mux in connect_server().
When dealing with the previous connection, don't assume it has a mux, as it
may not yet be the case if we're waiting for the ALPN.
2018-12-01 10:47:16 +01:00
Olivier Houchard
14547b2e1c BUG/MEDIUM: streams: Don't assume we have a CS in sess_update_st_con_tcp.
We can reach sess_update_st_con_tcp() while we still have a connection
attached, so take that into account, and free the connection, instead of
assuming it's always a conn_stream.
2018-12-01 10:47:16 +01:00
Olivier Houchard
5c6109691a BUG/MEDIUM: session: Remove the session from the session_list in session_free.
When freeing the session, we may fail to free the outgoing connection,
because it still has streams attached. So remove ourself from the session
list, so that the connection doesn't try to access it later.
2018-12-01 10:47:15 +01:00
Olivier Houchard
4667773a8a BUG/MEDIUM: h2: Call h2_process() if there's an error on the connection.
In h2_recv(), return 1 if there's an error on the connection, not just if
there's a read0 pending, so that h2_process() can be called and act as a
janitor.
2018-11-29 17:39:04 +01:00
Olivier Houchard
24b8fe874e BUG/MEDIUM: stream_interface: Make sure we read all the data available.
In si_cs_recv(), when there's an error on the connection or the conn_stream,
don't give up if CS_FL_RCV_MORE is set on the conn_stream, as it means there's
still data available.
2018-11-29 17:39:04 +01:00
Olivier Houchard
3e1f68bcf9 BUG/MEDIUM: stream_interface: Don't check if the handshake is done.
In si_cs_send(), don't give up and subscribe if the connection is still
waiting for a SSL handshake. We will never be woken up once the handshake is
done if we're using HTTP/2. Instead, directly try to send data. When using
the mux_pt, if the handshake is not done yet, snd_buf() would return 0 and
we will subscribe anyway.
2018-11-29 17:39:04 +01:00
Olivier Houchard
d76bd2d40b BUG/MEDIUM: connections: Don't forget to detach the connection from the SI.
When we're deferring the mux choice until the ALPN is negociated, we
attach the connection to the stream_interface until it's done, so that we
can destroy it if something goes wrong and the stream is destroy.
Before calling si_attach_cs() to attach the conn_stream once we have it,
call si_detach_endpoint(), or is_attach_cs() would destroy the connection.
2018-11-29 17:39:04 +01:00
Olivier Houchard
70d9b2fdb0 BUG/MEDIUM: connections: Wake the stream once the mux is chosen.
When we defer the mux choice until the ALPN is negociated, don't forget
to wake the stream once it's done, or it will never have the opportunity
to send data.
2018-11-29 17:39:04 +01:00
Baptiste Assmann
6be139f867 BUG/MINOR: ssl: ssl_sock_parse_clienthello ignores session id
In ssl_sock_parse_clienthello(), the code considers that SSL Sessionid
size is '1', and then considers that the SSL cipher suite is availble
right after the session id size information.
This actually works in a single case, when the client does not send a
session id.

This patch fixes this issue by introducing the a propoer way to parse
the session id and move forward the cursor by the session id length when
required.

Need to be backported to 1.8.
2018-11-29 16:55:29 +01:00
Olivier Houchard
1ced485b29 BUG/MEDIUM: mux_pt: Don't forget to unsubscribe() on attach.
In the mux_pt, when we're attaching a new conn_stream, don't forget to
unsubscribe from the connection. Failure to do so may lead to the mux_pt
freeing the connection while the conn_stream can still want to access it.
2018-11-29 13:52:31 +01:00
Olivier Houchard
0024a98640 BUG/MEDIUM: h2: Don't bogusly error if the previous stream was closed.
In h2_process_demux(), if we're demuxing multiple frames, and the previous
frame led to a stream getting closed, don't bogusly consider that an error,
and destroy the next stream, as there are valid cases where the stream could
be closed.
2018-11-28 14:09:55 +01:00
Tim Duesterhus
3f024f3be5 CLEANUP: http: Fix typo in init_http's comment
It read "non-zero" where it should read zero.
2018-11-28 04:20:51 +01:00
William Lallemand
d913800a7d BUG/MEDIUM: listeners: CLOEXEC flag is not correctly set
The CLOEXEC flag was set using a F_SETFL which can't work.
To set the CLOEXEC flag F_SETFD should be used, the problem is that it
needs a new call to fcntl() and it's on the path of every accept.

This flag was only needed in the case of the master, so the patch was
reverted and the flag set only in this case.

The bug was introduced by 0b3e849 ("MEDIUM: listeners: set O_CLOEXEC on
the accepted FDs").

No backport needed.
2018-11-27 19:34:00 +01:00
William Lallemand
4b58c80ee2 REORG: mworker: declare master variable in global.h
This variable is used at several places, better declare it in global.h.
2018-11-27 19:34:00 +01:00
William Lallemand
c03eb01c1a BUG/MEDIUM: mworker: avoid leak of client socket
If the master was reloaded and there was a established connection to a
server, the FD resulting from the accept was leaking.

There was no CLOEXEC flag set on the FD of the socketpair created during
a connect call. This is specific to the socketpair in the master process
but it should be applied to every protocol in case we use them in the
master at some point.

No backport needed.
2018-11-27 19:34:00 +01:00
Willy Tarreau
680b2bdf2f MINOR: h2: make struct h2_ops static
There's no reason to export this descriptor, it used to be needed during
early H2 development and will complicate porting to HTX.
2018-11-27 09:59:48 +01:00
Christopher Faulet
6160832bf7 BUG/MINOR: proto_htx: only mark connections private if NTLM is detected
The commit fd9b68c48 ("BUG/MINOR: only mark connections private if NTLM is
detected") was forgotten when HTX analyzers were added.
2018-11-27 09:25:35 +01:00
Lukas Tribus
7706b85e0c MINOR: ssl: free ctx when libssl doesn't support NPN
The previous fix da95fd90 ("BUILD/MINOR: ssl: fix build with non-alpn/
non-npn libssl") does fix the build in old OpenSSL release, but I
overlooked that the ctx is only freed when NPN is supported.

Fix this by moving the #endif to the proper place (this was broken in
c7566001 ("MINOR: server: Add "alpn" and "npn" keywords")).
2018-11-27 04:32:32 +01:00
Willy Tarreau
7f0165e399 MEDIUM: memory: make the pool cache an array and not a thread_local
Having a thread_local for the pool cache is messy as we need to
initialize all elements upon startup, but we can't until the threads
are created, and once created it's too late. For this reason, the
allocation code used to check for the pool's initialization, and
it was the release code which used to detect the first call and to
initialize the cache on the fly, which is not exactly optimal.

Now that we have initcalls, let's turn this into a per-thread array.
This array is initialized very early in the boot process (STG_PREPARE)
so that pools are always safe to use. This allows to remove the tests
from the alloc/free calls.

Doing just this has removed 2.5 kB of code on all cumulated pool_alloc()
and pool_free() paths.
2018-11-26 19:50:32 +01:00
Willy Tarreau
b6b3df3ed3 MEDIUM: initcall: use initcalls for a few initialization functions
signal_init(), init_log(), init_stream(), and init_task() all used to
only preset some values and lists. This needs to be done very early to
provide a reliable interface to all other users. The calls used to be
explicit in haproxy.c:init(). Now they're placed in initcalls at the
STG_PREPARE stage. The functions are not exported anymore.
2018-11-26 19:50:32 +01:00
Willy Tarreau
2455cebe00 MEDIUM: memory: use pool_destroy_all() to destroy all pools on deinit()
Instead of exporting a number of pools and having to manually delete
them in deinit() or to have dedicated destructors to remove them, let's
simply kill all pools on deinit().

For this a new function pool_destroy_all() was introduced. As its name
implies, it destroys and frees all pools (provided they don't have any
user anymore of course).

This allowed to remove 4 implicit destructors, 2 explicit ones, and 11
individual calls to pool_destroy(). In addition it properly removes
the mux_pt_ctx pool which was not cleared on exit (no backport needed
here since it's 1.9 only). The sig_handler pool doesn't need to be
exported anymore and became static now.
2018-11-26 19:50:32 +01:00
Willy Tarreau
8ceae72d44 MEDIUM: init: use initcall for all fixed size pool creations
This commit replaces the explicit pool creation that are made in
constructors with a pool registration. Not only this simplifies the
pools declaration (it can be done on a single line after the head is
declared), but it also removes references to pools from within
constructors. The only remaining create_pool() calls are those
performed in init functions after the config is parsed, so there
is no more user of potentially uninitialized pool now.

It has been the opportunity to remove no less than 12 constructors
and 6 init functions.
2018-11-26 19:50:32 +01:00
Willy Tarreau
7107c8b494 MINOR: memory: add a callback function to create a pool
The new function create_pool_callback() takes 3 args including the
return pointer, and creates a pool with the specified name and size.
In case of allocation error, it emits an error message and returns.

The new macro REGISTER_POOL() registers a callback using this function
and will be usable to request some pools creation and guarantee that
the allocation will be checked. An even simpler approach is to use
DECLARE_POOL() and DECLARE_STATIC_POOL() which declare and register
the pool.
2018-11-26 19:50:32 +01:00
Willy Tarreau
e655251e80 MINOR: initcall: use initcalls for section parsers
The two calls to cfg_register_section() and cfg_register_postparser()
are now supported by initcalls. This allowed to remove two other
constructors.
2018-11-26 19:50:32 +01:00
Willy Tarreau
172f5ce948 MINOR: initcall: use initcalls for most post_{check,deinit} and per_thread*
Most calls to hap_register_post_check(), hap_register_post_deinit(),
hap_register_per_thread_init(), hap_register_per_thread_deinit() can
be done using initcalls and will not require a constructor anymore.
Let's create a set of simplified macros for this, called respectively
REGISTER_POST_CHECK, REGISTER_POST_DEINIT, REGISTER_PER_THREAD_INIT,
and REGISTER_PER_THREAD_DEINIT.

Some files were not modified because they wouldn't benefit from this
or because they conditionally register (e.g. the pollers).
2018-11-26 19:50:32 +01:00
Willy Tarreau
8071338c78 MINOR: initcall: apply initcall to all register_build_opts() calls
Most register_build_opts() calls use static strings. These ones were
replaced with a trivial REGISTER_BUILD_OPTS() statement adding the string
and its call to the STG_REGISTER section. A dedicated section could be
made for this if needed, but there are very few such calls for this to
be worth it. The calls made with computed strings however, like those
which retrieve OpenSSL's version or zlib's version, were moved to a
dedicated function to guarantee they are called late in the process.
For example, the SSL call probably requires that SSL_library_init()
has been called first.
2018-11-26 19:50:32 +01:00
Willy Tarreau
86abe44e42 MEDIUM: init: use self-initializing spinlocks and rwlocks
This patch replaces a number of __decl_hathread() followed by HA_SPIN_INIT
or HA_RWLOCK_INIT by the new __decl_spinlock() or __decl_rwlock() which
automatically registers the lock for initialization in during the STG_LOCK
init stage. A few static modifiers were lost in the process, but since they
were not essential at all it was not worth extending the API to provide such
a variant.
2018-11-26 19:50:32 +01:00
Willy Tarreau
a8ae77da61 MINOR: thread: provide a set of lock initialisers
This patch adds ha_spin_init() and ha_rwlock_init() which are used as
a callback to initialise locks at boot time. They perform exactly the
same as HA_SPIN_INIT() or HA_RWLOCK_INIT() but from within a real
function.
2018-11-26 19:50:32 +01:00
Willy Tarreau
0108d90c6c MEDIUM: init: convert all trivial registration calls to initcalls
This switches explicit calls to various trivial registration methods for
keywords, muxes or protocols from constructors to INITCALL1 at stage
STG_REGISTER. All these calls have in common to consume a single pointer
and return void. Doing this removes 26 constructors. The following calls
were addressed :

- acl_register_keywords
- bind_register_keywords
- cfg_register_keywords
- cli_register_kw
- flt_register_keywords
- http_req_keywords_register
- http_res_keywords_register
- protocol_register
- register_mux_proto
- sample_register_convs
- sample_register_fetches
- srv_register_keywords
- tcp_req_conn_keywords_register
- tcp_req_cont_keywords_register
- tcp_req_sess_keywords_register
- tcp_res_cont_keywords_register
- flt_register_keywords
2018-11-26 19:50:32 +01:00
Willy Tarreau
5794fb0c22 MINOR: init: process all initcalls in order at boot time
main() now iterates over all initcall stages at boot time. This will allow
to move init code from constructors to initcalls.
2018-11-26 19:50:32 +01:00
William Lallemand
7c756a8ccc BUG/MEDIUM: mworker: fix FD leak upon reload
We reintroduced some FDs leaking by using a poller and some listeners in
the master.

The master proxy needs to be stopped to avoid leaking its listeners, the
polling loop needs to be deinit, and the thread waker pipe need to be
closed too.

No backport needed.
2018-11-26 19:31:17 +01:00
Willy Tarreau
e548974ca8 MINOR: compression: always create the compression pool
Surprisingly, the compression pool was created at runtime on first use,
which is not very convenient, has performance and reliability impacts,
and even makes monitoring less easy. Let's move the pool creation at
startup time instead. This even removes the need for the spinlock in
case USE_ZLIB is not defined.
2018-11-26 14:46:55 +01:00
Willy Tarreau
3bfcd10218 BUILD: compression: fix build error with DEFAULT_MAXZLIBMEM
The tune.maxzlibmem setting was moved with commit 368780334 ("MEDIUM:
compression: move the zlib-specific stuff from global.h to compression.c")
but the preset value using DEFAULT_MAXZLIBMEM was incorrectly moved :
  - the field is in "global" and not "global.tune"
  - the trailing comma instead of semi-colon will make it either zero
    (threads enabled), break (threads enabled with debugging), or cast
    the memprintf's return pointer to int (threads disabled)

It simply proves that nobody ever used DEFAULT_MAXZLIBMEM since 1.8!

This needs to be backported to 1.8.
2018-11-26 10:27:51 +01:00
Tim Duesterhus
742e0f9f1f BUG/MINOR: mworker: Do not attempt to close(2) fd -1
Valgrind reports:

==3389== Warning: invalid file descriptor -1 in syscall close()

Check for >= 0 before closing.

This bug was introduced in commit ce83b4a5dd
and is specific to 1.9. No backport needed.
2018-11-26 08:35:41 +01:00
Lukas Tribus
da95fd901b BUILD/MINOR: ssl: fix build with non-alpn/non-npn libssl
In commit c7566001 ("MINOR: server: Add "alpn" and "npn" keywords") and
commit 201b9f4e ("MAJOR: connections: Defer mux creation for outgoing
connection if alpn is set"), the build was broken on older OpenSSL
releases.

Move the #ifdef's around so that we build again with older OpenSSL
releases (0.9.8 was tested).
2018-11-26 08:34:40 +01:00
Willy Tarreau
082f559d36 BUG/MEDIUM: h2: restart demuxing after releasing buffer space
Since the connection changes in 1.9, some breakage happened to the H2 mux
whose initial design was heavily relying on the fact that connection-level
functions were woken up after data were transferred to the stream layer.

We need to wake the demux up after receiving such data if the demux is
blocked. This at least allows to receive POSTs again. One issue remains,
it looks like the end of the uploaded data is silently discarded if the
server responds before the end of the transfer (H2 in half-closed(local)
state), which doesn't happen with 1.8.14 and nghttp as the client.

No backport is needed.
2018-11-25 09:06:42 +01:00
Willy Tarreau
1ed87b77b4 BUG/MEDIUM: h2: wake the processing task up after demuxing
After the changes to the connection layer in 1.9, some wake up calls
need to be introduced to re-activate reading from the connection. One
such place is at the end of h2_process_demux(), otherwise processing
of input data stops after a few frames.

No backport is needed.
2018-11-25 08:52:11 +01:00
Olivier Houchard
ee23b2a1e3 MEDIUM: servers: Store the connection in the SI until we have a mux.
When we create a connection, if we have to defer the conn_stream and the
mux creation until we can decide it (ie until the SSL handshake is done, and
the ALPN is decided), store the connection in the stream_interface, so that
we're sure we can destroy it if needed.
2018-11-23 19:11:14 +01:00
Olivier Houchard
25607afa0a BUG/MEDIUM: sessions: Set sess->origin to NULL if the origin was destroyed.
When ending a stream, if the origin is an appctx, the appctx will have been
destroyed already, but it does not destroy the session. So later, when we
try to destroy the session, we try to dereference sess->origin and die
trying.
Fix this by explicitely setting sess->origin to NULL before calling
session_free().
2018-11-23 14:56:46 +01:00
Olivier Houchard
1295016873 BUG/MEDIUM: servers: Don't check if we have a conn_stream too soon.
The creation of the conn_stream for an outgoing connection has been delayed
a bit, and when using dispatch, a check was made to see if a conn_stream
was attached before the conn_stream was created, so remove the test, as
it's done later anyway, and create and install the conn_stream right away
when we don't have a server, as is done when we don't have an alpn/npn
defined.
2018-11-23 14:56:21 +01:00
Olivier Houchard
c6e0bb4944 MINOR: server: Only defined conn_complete_server if USE_OPENSSL is set.
conn_complete_server() is only used when using ALPN/NPN, so only define it
if USE_OPENSSL is set.
2018-11-23 14:56:13 +01:00
Olivier Houchard
637b695d6a BUG/MEDIUM: connections: Don't reset the conn flags in *connect_server().
In the various connect_server() functions, don't reset the connection flags,
as some may have been set before. The flags are initialized in conn_init(),
anyway.
2018-11-23 14:55:18 +01:00
Olivier Houchard
7fc3be76c7 MINOR: servers: Free [idle|safe|priv]_conns on exit.
Don't forget to free idle_conns, safe_conns and priv_conns on exit.

This can be backported to 1.8.
2018-11-22 19:53:03 +01:00
Olivier Houchard
6b77f49e78 MEDIUM: ssl: Add ssl_bc_alpn and ssl_bc_npn sample fetches.
Add 2 new sample fetches, ssl_bc_alpn and ssl_bc_npn, that provides the
ALPN and the NPN for an outgoing connection.
2018-11-22 19:52:44 +01:00
Olivier Houchard
201b9f4eb5 MAJOR: connections: Defer mux creation for outgoing connection if alpn is set.
If an ALPN (or a NPN) was chosen for a server, defer choosing the mux until
after the SSL handshake is done, and the ALPN/NPN has been negociated, so
that we know which mux to pick.
2018-11-22 19:52:23 +01:00
Olivier Houchard
66b5166af9 MEDIUM: connection: Don't bother reactivating polling after connection retry.
As we now will no longer try tro subscribe to recv/send events before the
connection is established, there's no need to reactivate polling on the fd
when retrying connection. It will be activated later on subscribe.
2018-11-22 19:50:39 +01:00
Olivier Houchard
c756600103 MINOR: server: Add "alpn" and "npn" keywords.
Add new keywords to "server" lines, alpn and npn.
If set, when connecting through SSL, those alpn/npn will be negociated
during the SSL handshake.
2018-11-22 19:50:08 +01:00
Willy Tarreau
beb859abce MINOR: polling: add an option to support busy polling
In some situations, especially when dealing with low latency on processors
supporting a variable frequency or when running inside virtual machines,
each time the process waits for an I/O using the poller, the processor
goes back to sleep or is offered to another VM for a long time, and it
causes excessively high latencies.

A solution to this provided by this patch is to enable busy polling using
a global option. When busy polling is enabled, the pollers never sleep and
loop over themselves waiting for an I/O event to happen or for a timeout
to occur. On multi-processor machines it can significantly overheat the
processor but it usually results in much lower latencies.

A typical test consisting in injecting traffic over a single connection at
a time over the loopback shows a bump from 4640 to 8540 connections per
second on forwarded connections, indicating a latency reduction of 98
microseconds for each connection, and a bump from 12500 to 21250 for
locally terminated connections (redirects), indicating a reduction of
33 microseconds.

It is only usable with epoll and kqueue because select() and poll()'s
API is not convenient for such usages, and the level of performance they
are used in doesn't benefit from this anyway.

The option, which obviously remains disabled by default, can be turned
on using "busy-polling" in the global section, and turned off later
using "no busy-polling". Its status is reported in "show info" to help
troubleshooting suspicious CPU spikes.
2018-11-22 19:47:30 +01:00
Willy Tarreau
48f8bc1368 MINOR: poller: move the call of tv_update_date() back to the pollers
The reason behind this will be to be able to compute a timeout when
busy polling.
2018-11-22 18:57:37 +01:00
William Lallemand
744a08903e BUG/MINOR: mworker: fix FD leak and memory leak in error path
Fix some memory leak and a FD leak in the error path of the master proxy
initialisation. It's a really minor issue since the process is exiting
when taking those error paths.
2018-11-22 17:34:12 +01:00
Tim Duesterhus
4cae3b2f33 BUG/MINOR: cli: Fix memory leak
Valgrind's memcheck reports memory leaks in cli.c, because
the out parameter of memprintf is not properly freed:

  ==31035== 11 bytes in 1 blocks are definitely lost in loss record 16 of 101
  ==31035==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
  ==31035==    by 0x4C2FDEF: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
  ==31035==    by 0x4A3C72: my_realloc2 (standard.h:1364)
  ==31035==    by 0x4A3C72: memvprintf (standard.c:3459)
  ==31035==    by 0x4A3D93: memprintf (standard.c:3482)
  ==31035==    by 0x4AF77E: mworker_cli_sockpair_new (cli.c:2324)
  ==31035==    by 0x48E826: init (haproxy.c:1749)
  ==31035==    by 0x408BBC: main (haproxy.c:2725)
  ==31035==
  ==31035== 11 bytes in 1 blocks are definitely lost in loss record 17 of 101
  ==31035==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
  ==31035==    by 0x4C2FDEF: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
  ==31035==    by 0x4A3C72: my_realloc2 (standard.h:1364)
  ==31035==    by 0x4A3C72: memvprintf (standard.c:3459)
  ==31035==    by 0x4A3D93: memprintf (standard.c:3482)
  ==31035==    by 0x4AF071: mworker_cli_proxy_create (cli.c:2172)
  ==31035==    by 0x48EC89: init (haproxy.c:1760)
  ==31035==    by 0x408BBC: main (haproxy.c:2725)

These leaks were introduced in commits
ce83b4a5dd and
8a02257d88
which are specific to haproxy 1.9 dev.
2018-11-22 17:34:12 +01:00
Willy Tarreau
4f93e0c280 MINOR: cli/activity: rename the stolen CPU time fields to mention milliseconds
The "cpust_{tot,1s,15s}" fields used to report milliseconds but nothing
in the value's title made this explicit. Let's rename the field to report
"cpust_ms_{tot,1s,15s}" to more easily remind that the unit represents
milliseconds.
2018-11-22 16:13:17 +01:00
Willy Tarreau
70fe94419c MINOR: sample: add cpu_calls, cpu_ns_avg, cpu_ns_tot, lat_ns_avg, lat_ns_tot
These sample fetch keywords report performance metrics about the task calling
them. They are useful to report in logs which requests consume too much CPU
time and what negative performane impact it has on other requests. Typically
logging cpu_ns_avg and lat_ns_avg will show culprits and victims.
2018-11-22 16:07:39 +01:00
Willy Tarreau
9efd7456e0 MEDIUM: tasks: collect per-task CPU time and latency
Right now we measure for each task the cumulated time spent waiting for
the CPU and using it. The timestamp uses a 64-bit integer to report a
nanosecond-level date. This is only enabled when "profiling.tasks" is
enabled, and consumes less than 1% extra CPU on x86_64 when enabled.
The cumulated processing time and wait time are reported in "show sess".

The task's counters are also reset when an HTTP transaction is reset
since the HTTP part pretends to restart on a fresh new stream. This
will make sure we always report correct numbers for each request in
the logs.
2018-11-22 15:44:21 +01:00
Willy Tarreau
75c62c2793 MINOR: activity: add configuration and CLI support for "profiling.tasks"
This is a new global setting which enables or disables CPU profiling
per task. For now it only sets/resets the variable based on the global
option "profiling.tasks" and supports showing it as well as setting it
from the CLI using "show profiling" and "set profiling". The option will
be used by a future commit. It was done in a way which should ease future
addition of profiling options.
2018-11-22 11:48:51 +01:00
Willy Tarreau
baba82fe70 MINOR: activity: report the average loop time in "show activity"
Since we know the time it takes to process everything between two poll()
calls, we can use this as the max latency measurement any task will
experience and average it.

This code does this, and reports in "show activity" the average of this
loop time over the last 1024 poll() loops, for each thread. It will vary
quickly at high loads and slowly under low to moderate loads, depending
on the rate at which poll() is called. The latency a task experiences
is expected to be half of this on average.
2018-11-22 11:48:41 +01:00
Willy Tarreau
609aad9e73 REORG: time/activity: move activity measurements to activity.{c,h}
At the moment the situation with activity measurement is quite tricky
because the struct activity is defined in global.h and declared in
haproxy.c, with operations made in time.h and relying on freq_ctr
which are defined in freq_ctr.h which itself includes time.h. It's
barely possible to touch any of these files without breaking all the
circular dependency.

Let's move all this stuff to activity.{c,h} and be done with it. The
measurement of active and stolen time is now done in a dedicated
function called just after tv_before_poll() instead of mixing the two,
which used to be a lazy (but convenient) decision.

No code was changed, stuff was just moved around.
2018-11-22 11:48:41 +01:00
William Lallemand
0564d41333 BUG/MEDIUM: mworker: unregister the signals of main()
The signal_register_fct() does not remove the handlers assigned to a
signal, but add a new handler to a list.

We accidentality inherited the handlers of the main() function in the
master process which is a problem because they act on the proxies.

The side effect was to stop the MASTER proxy which handle the master CLI
on a SIGUSR1, and to display some debug info when doing a SIGHUP and a
SIGQUIT.
2018-11-22 11:42:51 +01:00
William Lallemand
31a1c1d5e7 MEDIUM: signal: signal_unregister() removes every handlers
The new function signal_unregister() removes every handlers assigned to
a signal. Once the handler list of the signal is empty, the signal is
ignored with SIG_IGN.
2018-11-22 11:42:51 +01:00
William Lallemand
db6bdfbf68 MINOR: cli: add mworker_accept_wrapper to 'show fd'
In the output of 'show fd', the worker CLI's socketpair was still
handled by an "unknown" function. That can be really confusing during
debug. Fixed it by showing "mworker_accept_wrapper" instead.
2018-11-22 11:42:51 +01:00
Olivier Houchard
e2c78cd3e8 BUG/MEDIUM: http_fetch: Make sure name is initialized before http_find_header.
Before calling http_find_header, make sure name is initialized properly, or
its value would be random.
2018-11-22 10:09:58 +01:00
William Lallemand
220567ec34 MINOR: mworker: use ha_notice to announce a new worker
Displays the PID and the relative PID when we fork a new worker with
 ha_notice().
2018-11-21 19:02:23 +01:00
William Lallemand
9c56a22b20 MINOR: log: introduce ha_notice()
It's like ha_warning() or ha_alert() but with a NOTICE prefix.
2018-11-21 19:02:23 +01:00
William Lallemand
944e619b64 MEDIUM: mworker: wait mode use standard init code path
The mworker waitpid mode (which is used when a reload failed to apply
the new configuration) was still using a specific initialisation path.
That's a problem since we use a polling loop in the master now, the
master proxy is not initialized and the master CLI is not activated.

This patch removes the initialisation code of the wait mode and
introduce the MODE_MWORKER_WAIT in order to use the same init path as
the MODE_MWORKER with some exceptions. It allows to use the master proxy
and the master CLI during the waitpid mode.
2018-11-21 17:05:30 +01:00
Christopher Faulet
7e346f3694 BUG/MINOR: mux-htx: Fix bad test on h1c flags in h1_recv_allowed()
A logical OR was used instead of a binary OR. Thanks to David Carlier to spot
and report this bug.
2018-11-20 17:22:37 +01:00
Christopher Faulet
7ff4f14204 BUG/MINOR: config: Be aware of the HTX during the check of mux protocols
Because the HTX is still experimental, we must add special cases during the
configuration check to be sure it is not enabled on a proxy with incompatible
options. Here, for HTX proxies, when a mux protocol is specified on a bind line
or a server line, we must force the HTX mode (PROTO_MODE_HTX).

Concretely, H2 is the only mux protocol that can be forced. And it doesn't yet
support the HTX. So forcing the H2 on an HTX proxy will always fail.
2018-11-20 14:31:44 +01:00
Christopher Faulet
55dec0dca4 MINOR: stream-int: remove useless checks on CS and conn flags in si_cs_send()
In si_cs_send(), some checks are done the CS flags en the connection flags
before calling snd_buf(). But these checks are useless because they have already
been done earlier in the function. The harder to figure out is the flag
CO_FL_SOCK_WR_SH. So it is now tested with CF_SHUTW at the beginning.
2018-11-20 14:31:44 +01:00
Christopher Faulet
3f76f4ccf7 BUG/MINOR: stream-int: Don't call snd_buf() if there are still data in the pipe
In si_cs_send, as said in comments, snd_buf() should only be called if there is
no data in the pipe anymore. But actually, this condition was not respected.
2018-11-20 14:31:44 +01:00
Christopher Faulet
e4acd5e471 MINOR: stream-int: Notify caller when an error is reported after a rcv_buf()
For the same reason than for the commit b46784b1c ("MINOR: stream-int: Notify
caller when an error is reported after a rcv_pipe()"), we return 1 after the
call to rcv_buf() in si_cs_send() to notify the caller some processing may be
triggered.

This patch is not flagged as a bug because no strange behaviour was yet observed
without it. It is just a proactive fix to be consistent.
2018-11-20 14:31:44 +01:00
Christopher Faulet
5ed7aab68a MINOR: stream-int: Notify caller when an error is reported after a rcv_pipe()
In si_cs_send(), when an error is found on the CS or the connection at the
beginning of the function, we return 1 to notify the caller some processing may
be triggered. So, it seems logical to do the same after the call to rcv_pipe().

This patch is not flagged as a bug because no strange behaviour was yet observed
without it. It is just a proactive fix to be consistent.
2018-11-20 14:31:44 +01:00
Christopher Faulet
b42a8b6c61 BUG/MINOR: proto_htx: Fix request/response synchronisation on error
The HTTP transaction must be aborted if an error is detected on any one
side.
2018-11-20 14:31:44 +01:00
Christopher Faulet
9b95d31122 BUG/MINOR: stats/htx: Remove channel's output when the request is eaten
The request is eaten when the stats applet have finished to send its
response. It was removed from the channel's buffer, removing all HTX blocks till
the EOM. But the channel's output was not reset, leaving the request channel in
an undefined state.
2018-11-20 14:31:44 +01:00
Christopher Faulet
9c38840055 BUG/MEDIUM: mux-h1: Don't set the flag CS_FL_RCV_MORE when nothing was parsed
When we start to parse a new message, if all headers have not been received,
nothing is copied in the channel's buffer. In this situation, we must not set
the flag CS_FL_RCV_MORE on the conn-stream. If we do so, the connection freezes
because there is no data to send that can reenable the reads
2018-11-20 14:31:44 +01:00
Christopher Faulet
d44ad5b8bd BUG/MEDIUM: mux-h1: Fix freeze when the kernel splicing is used
First of all, we need to be sure to keep the flag H1S_F_BUF_FLUSH on the H1S
reading data until all data was flushed from the buffer. Then we need to know
when the kernel splicing is in use and when it ends. This is handled with the
new flag H1S_F_SPLICED_DATA.

Then, we must subscribe to send when some data remain in the pipe after a
snd_pipe(). It is mandatory to wakeup the stream and avoid a freeze.

Finally, we must be sure to update the message state when we restart to use the
channel's buffer. Among other things, it is mandatory to swith the message from
DATA to DONE state when all data were sent using the kernel splicing.
2018-11-20 14:31:44 +01:00
Christopher Faulet
81d484326b BUG/MINOR: mux-h1: Enable keep-alive on server side
Don't force the close on server side anymore. Since commit 7c6f8b146 ("MAJOR:
connections: Detach connections from streams"), it is possible to release a
stream without the underlying connection.

Because of this change, we must be sure to create a new stream to handle the
next HTTP transaction only on the client side. And we must be sure to correctly
handle the read0 event in h1_recv, to be sure to call h1_process().
2018-11-20 14:31:44 +01:00
Christopher Faulet
539e029cc5 MAJOR: mux-h1: Remove the rxbuf and decode HTTP messages in channel's buffer
It avoids a copy between the rxbuf and the channel's buffer. It means the
parsing is done in h1_rcv_buf(). So we need to have a stream to start the
parsing. This change should improve the overall performances. It also implies a
better split between the connection layer and the applicative layer. Now, on the
connection layer, only raw data are manipulated. Raw data received from the
socket are stored in ibuf and those sent are get from obuf. On the applicative
layer, data in ibuf are parsed and copied into the channel's buffer. And on the
other side, those structured data are formatted and copied into obuf.
2018-11-20 14:31:44 +01:00
Willy Tarreau
4bf194cbdb BUG/MEDIUM: hpack: fix encoding of "accept-ranges" field
James Brown reported that when an "accept-ranges" header field is sent
through haproxy and converted from HTTP/1.1 to H2, it's mis-encoded as
"accept-language". It happens that it's one of the few very common header
fields encoded using its index value and that this index value was misread
in the spec as 17 instead of 18, resulting in the wrong name being sent.
Thanks to Lukas for spotting the issue in the HPACK encoder itself.

This fix must be backported to 1.8.
2018-11-20 04:47:38 +01:00
William Lallemand
16dd1b3ead MINOR: cli: show master information in 'show proc'
Displays the master information in show proc.
2018-11-20 04:43:54 +01:00
William Lallemand
e368330128 MINOR: cli: displays uptime in show proc
Displays the uptime of the workers in `show proc`
2018-11-20 04:43:54 +01:00
William Lallemand
e09cdc6d48 MINOR: cli: format show proc to be more readable
Add more space on the output to be more readable and separate old
processes from current ones.
2018-11-20 04:43:54 +01:00
Willy Tarreau
3a1f5fda10 REORG: config: extract the proxy parser into cfgparse-listen.c
This was the largest function of the whole file, taking a rough second
to build alone. Let's move it to a distinct file along with a few
dependencies. Doing so saved about 2 seconds on the total build time.
2018-11-19 06:47:09 +01:00
Willy Tarreau
36b9e222bb REORG: config: extract the global section parser into cfgparse-global
The config parser is the largest file to build and its build dominates
the total project's build time. Let's start to split it into multiple
smaller pieces by extracting the "global" section parser into a new
file called "cfgparse-global.c". This removes 1/4th of the file's build
time.
2018-11-19 06:41:57 +01:00
Joseph Herlant
f1da69d9ea MINOR: Fix a typo in a warning message in the spoe subsystem
Fix a typo in a user-facing message of the spoe subsystem.
2018-11-18 22:29:19 +01:00
Joseph Herlant
7fe1577cc7 MINOR: Fix typo in the error 500 output of hlua
Fixes a common typo in the output generated by the hlua subsystem when
emitting an error 500 page.
2018-11-18 22:28:09 +01:00
Joseph Herlant
b8f9c5e634 CLEANUP: fix typos in the comments of hlua
Fix typos in the code comments of the hlua subsystem.
2018-11-18 22:28:09 +01:00
Joseph Herlant
76dbe785b5 MINOR: Fix typo in error message in the standard subsystem
Fix a typo in an error message that could be user-visible when running
out of memory in the parse_binary function.
2018-11-18 22:26:42 +01:00
Joseph Herlant
cf92b6d332 CLEANUP: Fix typos in the task subsystem
Fix typos in the code comments of the task subsystem.
2018-11-18 22:26:42 +01:00
Joseph Herlant
5662fa4707 CLEANUP: Fix typos in the stick_table subsystem
Fix some typos in the code comments of the stick_table subsystem.
2018-11-18 22:26:42 +01:00
Joseph Herlant
32b8327266 CLEANUP: Fix typos in the standard subsystem
Fix typos in the code comments of the standard subsystem.
2018-11-18 22:26:42 +01:00
Joseph Herlant
f7f6031184 CLEANUP: Fix typos in the spoe subsystem
Fix typos in the code comments of the spoe subsystem.
2018-11-18 22:26:42 +01:00
Joseph Herlant
757f5ad73a CLEANUP: Fix typos in the sample subsystem
Fix some typos in the code comment of the sample subsystem.
2018-11-18 22:26:42 +01:00
Joseph Herlant
eda75484a8 CLEANUP: Fix typos in the regex subsystem
Fix typos in the code comment of the regex subsystem.
2018-11-18 22:26:42 +01:00
Joseph Herlant
82b2f54d4c CLEANUP: Fix typos in the peers subsystem
Fix some typos in the code comments of the peers subsystem.
2018-11-18 22:26:42 +01:00
Joseph Herlant
85b4059b82 CLEANUP: Fix typos in the log subsystem
Fix some misspells in the code comments of the log subsystem.
2018-11-18 22:26:42 +01:00
Joseph Herlant
942eea3f5c CLEANUP: Fix typos in the http subsystem
Fix typos in code comment of the http subsystem.
2018-11-18 22:26:42 +01:00
Joseph Herlant
b35ea68081 CLEANUP: Fix typos in the filters subsystem
Fix typos in the code comments of the filters subsystems.
2018-11-18 22:26:42 +01:00
Joseph Herlant
a14c03ef43 CLEANUP: Fix typos in the cfgparse subsystem
Fix typos in the code comments of the cfgpase subsystem.
2018-11-18 22:26:42 +01:00
Joseph Herlant
8dae5b38b8 CLEANUP: Fix typos in the cache subsystem
Fix common misspells in the code comments of the cache subsystem.
2018-11-18 22:26:42 +01:00
Joseph Herlant
6808279b2a CLEANUP: Fix typos in the acl subsystem
Fix typos in the code comments of the acl subsystem.
2018-11-18 22:26:26 +01:00
Joseph Herlant
29023ec5d9 CLEANUP: Fix a typo in the stats subsystem
Fix a typo in a code comment of the stats subsystem.
2018-11-18 22:26:26 +01:00
Joseph Herlant
9edebb8568 MINOR: Fix typos in error messages in the proxy subsystem
Fix typos in error messages that will be user-visible in the proxy
subsystem.
2018-11-18 22:23:15 +01:00
Joseph Herlant
07a0834635 MINOR: Fix an error message thrown when we run out of memory
Fixes a typo in an error message that can be seen by the end user when
the haproxy subsystem runs out of memory.
2018-11-18 22:23:15 +01:00
Joseph Herlant
017b3da94e CLEANUP: fix typos in the ssl_sock subsystem
Fix some typos found in the code comments of the ssl_sock subsystem.
2018-11-18 22:23:15 +01:00
Joseph Herlant
59dd295926 CLEANUP: fix typos in the proxy subsystem
Fix typos in the code comments of the proxy subsystem.
2018-11-18 22:23:15 +01:00
Joseph Herlant
5ba8025976 CLEANUP: fix typos in the proto_http subsystem
Fixes typos in the code comments of the proto_http subsystem.
2018-11-18 22:23:15 +01:00
Joseph Herlant
b3d92e3b9f CLEANUP: fix typos in the hlua_fcn subsystem
Fixes typos detected in the code comments of the hlua_fcn subsystem using
the misspell tool and other ones detected manually.
2018-11-18 22:23:15 +01:00
Joseph Herlant
0767689e93 CLEANUP: fix typos in the comments of the vars subsystem
Those are mostly misspells of the words available and variable.
2018-11-18 22:23:15 +01:00
Joseph Herlant
4cc8d0d60c CLEANUP: fix a typo found in the stream subsystem
This typo is in a code comment so not end-user visible.
2018-11-18 22:23:15 +01:00
Joseph Herlant
44466826b1 CLEANUP: fix a few typos in the comments of the server subsystem
A few misspells where detected in the server subsystem. This commit
fixes them.
2018-11-18 22:23:15 +01:00
Joseph Herlant
b2db6a00f9 CLEANUP: fix 2 typos in the xxhash subsystem
Fixes 2 typos in the comments of the xxhash subsystem.
2018-11-18 22:23:15 +01:00
Joseph Herlant
4189d671b7 CLEANUP: Fix typos in the pattern subsystem
Fixes typos in the code comments of the pattern subsystem.
2018-11-18 22:23:15 +01:00
Joseph Herlant
42cf6395c4 CLEANUP: Fix typos in the dns subsystem
Fix misspells in the code comments of the dns subsystem.
2018-11-18 22:23:15 +01:00
Joseph Herlant
0342090ed7 CLEANUP: Fix some typos in the haproxy subsystem
Fix some typos in the code comments of the haproxy subsystem.
2018-11-18 22:23:15 +01:00
Christopher Faulet
afd8f10be9 MINOR: lua/htx: Forbid lua usage when the HTX is enabled on a proxy
For now, the lua scripts are not compatible with the new HTX internal
representation of HTTP messages. Thus, for a given proxy, when the option
"http-use-htx" is enabled, an error is triggered if any lua's
action/service/sample-fetch/converter is also configured.
2018-11-18 22:10:09 +01:00
Christopher Faulet
0c859127f1 MINOR: filters/htx: Forbid filters when the HTX is enabled on a proxy
For now, the filters are not compatible with the new HTX internal representation
of HTTP messages. Thus, for a given proxy, when the option "http-use-htx" is
enabled, an error is triggered if any filter is also configured.
2018-11-18 22:10:09 +01:00
Christopher Faulet
473652733a MEDIUM: mux-h1: Handle errors and timeouts in the stream
To do so, the stream is created as earlier as possible. It means, during the mux
creation for the first request, and for others, just at the end of the previous
transaction. Because all timeouts are handled by the strream, the mux's task is
now useless, so it is removed. Finally, to report errors, flags are set on the
HTX message. The HTX message is passed to the stream if there is some content to
analyse or if there is some error to handle.

All of this will probably be reworked later to handle errors and timeouts
directly in the mux. For now, it is the simpler way to handle all of this.
2018-11-18 22:10:08 +01:00
Christopher Faulet
ed28da534a MINOR: stream: Don't reset sov value with HTX messages 2018-11-18 22:10:08 +01:00
Christopher Faulet
ef77922776 MINOR: stats/htx: Adapt the stats applet to handle HTX messages
Switches between the HTX version of the code and the legacy one have been added
to let the stats applet work with both.
2018-11-18 22:10:08 +01:00
Christopher Faulet
3b88b8d02e MEDIUM: mux-h1: Wait for connection establishment before consuming channel's data
When a server is down, the channel's data must not be consumed. This is
required to allow redispatch and connection retry. So now, we wait for
the connection to be marked as connected, with the flag CO_FL_CONNECTED,
before starting to consume channel's data. In the mux, this event is
tracked with the flag H1C_F_CS_WAIT_CONN.
2018-11-18 22:10:04 +01:00
Christopher Faulet
311c7eaad0 MEDIUM: http_fetch: Adapt all fetches to handle HTX messages
For HTTP proxies, when the HTX internal representation is used, or for all TCP
proxies, we use the HTX version of sample fetches.
2018-11-18 22:09:01 +01:00
Christopher Faulet
ef453ed9b0 MINOR: http_fetch: Add smp_prefetch_htx
It does the same than smp_prefetch_http but for HTX messages. It can be called
from an HTTP proxy or a TCP proxy. For HTTP proxies, the parsing is handled by
the mux, so it does nothing but wait. For TCP proxies, it tries to parse an HTTP
message and to convert it in a temporary HTX message. Sample fetches will use
this temporary variable to do their job.
2018-11-18 22:09:00 +01:00
Christopher Faulet
fec7bd16de MEDIUM: proto_htx: Adapt htx_process_res_common to handle HTX messages 2018-11-18 22:09:00 +01:00
Christopher Faulet
f76ebe8bc7 MEDIUM: proto_htx: Adapt htx_wait_for_request_body to handle HTX messages
This version is simpler than the legacy one because the parsing is no more
handled by the analyzer. So now we just need to wait to have more data to move
on.
2018-11-18 22:09:00 +01:00
Christopher Faulet
8137c27094 MINOR: proto_htx: Adapt htx_process_tarpit to handle HTX messages 2018-11-18 22:09:00 +01:00
Christopher Faulet
d7bdfb1e7b MEDIUM: proto_htx: Adapt htx_process_request to handle HTX messages 2018-11-18 22:08:59 +01:00
Christopher Faulet
ff2759febd MEDIUM: proto_htx: Adapt htx_process_req_common to handle HTX messages
Here, the only real change is that the stats and cache applets are disabled.
2018-11-18 22:08:59 +01:00
Christopher Faulet
377c5a508c MINOR: proto_htx: Add functions to handle the stats applet
For now, the call to the stats applet is disabled for HTX messages. But HTX
versions of the function to check the request URI against the stats URI and the
fnuction to prepare the call to the stats applet have been added.
2018-11-18 22:08:59 +01:00
Christopher Faulet
fefc73da34 MINOR: proto_htx: Add functions htx_perform_server_redirect
It is more or less the same than legacy version but adapted to be called from
HTX analyzers. In the legacy version of this function, we switch on the HTX code
when applicable.
2018-11-18 22:08:58 +01:00
Christopher Faulet
64159df1fb MINOR: proto_htx: Add functions htx_send_name_header
It is more or less the same than legacy version but adapted to be called from
HTX analyzers. In the legacy version of this function, we switch on the HTX code
when applicable.
2018-11-18 22:08:58 +01:00
Christopher Faulet
25a02f65b1 MINOR: proto_htx: Add functions to check the cacheability of HTX messages
It is more or less the same than legacy versions but adapted to be called from
HTX analyzers. In the legacy versions of these functions, we switch on the HTX
code when applicable.
2018-11-18 22:08:58 +01:00
Christopher Faulet
fcda7c6850 MINOR: proto_htx: Add functions to manage cookies on HTX messages
It is more or less the same than legacy versions but adapted to be called from
HTX analyzers.
2018-11-18 22:08:57 +01:00
Christopher Faulet
336400861e MINOR: proto_htx: Add functions to apply req* and rsp* rules on HTX messages
It is more or less the same than legacy versions but adapted to be called from
HTX analyzers.
2018-11-18 22:08:57 +01:00
Christopher Faulet
3e9641951f MINOR: proto_htx: Add functions htx_req_get_intercept_rule and htx_res_get_intercept_rule
It is more or less the same than legacy versions but adapted to be called from
HTX analyzers.
2018-11-18 22:08:57 +01:00
Christopher Faulet
6eb92897f7 MINOR: proto_htx: Add function to build and send HTTP 103 responses
It is more or less the same than legacy versions but adapted to be called from
HTX analyzers.
2018-11-18 22:08:56 +01:00
Christopher Faulet
8d8ac191a7 MINOR: proto_htx: Add functions htx_req_replace_stline and htx_res_set_status
It is more or less the same than legacy versions but adapted to be called from
HTX analyzers. In the legacy versions of these functions, we switch on the HTX
code when applicable.
2018-11-18 22:08:56 +01:00
Christopher Faulet
7233352fe4 MINOR: proto_htx: Add functions htx_transform_header and htx_transform_header_str
It is more or less the same than legacy versions but adapted to be called from
HTX analyzers.
2018-11-18 22:08:56 +01:00
Christopher Faulet
f052354892 MINOR: proto_htx: Add the internal function htx_fmt_res_line 2018-11-18 22:08:55 +01:00
Christopher Faulet
0b6bdc55af MINOR: proto_htx: Add the internal function htx_del_hdr_value
It is more or less the same than del_hdr_value but adapted to be called from HTX
analyzers. The main changes is that it takes pointers on the start and the end
of the header value.
2018-11-18 22:08:55 +01:00
Christopher Faulet
80f14bffc7 MINOR: proto_htx: Rewrite htx_apply_redirect_rule to handle HTX messages 2018-11-18 22:08:55 +01:00
Christopher Faulet
7ff1ceaa5e MINOR: http_htx: Add functions to retrieve a specific occurrence of a header
There are 2 functions. The first one considers any comma as a delimiter for
distinct values. The second one considers full-line headers.
2018-11-18 22:08:55 +01:00
Christopher Faulet
e010c80753 MINOR: http_htx: Add functions to replace part of the start-line 2018-11-18 22:08:54 +01:00
Christopher Faulet
9768c2660e MAJOR: mux-h1/proto_htx: Switch mux-h1 and HTX analyzers on the HTX representation
The mux-h1 now parses and formats HTTP/1 messages using the HTX
representation. The HTX analyzers have been updated too. For now, only
htx_wait_for_{request/response} and http_{request/response}_forward_body have
been adapted. Others are disabled for now.

Now, the HTTP messages are parsed by the mux on a side and then, after analysis,
formatted on the other side. In the middle, in the stream, there is no more
parsing. Among other things, the version parsing is now handled by the
mux. During the data forwarding, depending the value of the "extra" field, we
are able to know if the body length is known or not and if yes, how many bytes
are still expected.
2018-11-18 22:08:54 +01:00
Christopher Faulet
0f226958b7 MINOR: proto_htx: Add some functions to handle HTX messages
More functions will come, but it is the minimum to switch HTX analyzers on the
HTX internal representation.
2018-11-18 22:08:54 +01:00
Christopher Faulet
47596d3787 MINOR: http_htx: Add functions to manipulate HTX messages in http_htx.c
This file will host all functions to manipulate HTTP messages using the HTX
representation. Functions in this file will be able to be called from anywhere
and are mainly related to the HTTP semantics.
2018-11-18 22:08:53 +01:00
Christopher Faulet
a3d2a16fad MEDIUM: htx: Add API to deal with the internal representation of HTTP messages
The internal representation of an HTTP message, called HTX, is a structured
representation, unlike the old one which is a raw representation of
messages. Idea is to have a version-agnostic representation of the HTTP
messages, which can be easily used by to handle HTTP/1, HTTP/2 and hopefully
QUIC messages, and communication from one of them to another.

In this patch, we add types to define the internal representation itself and the
main functions to manipulate them.
2018-11-18 22:08:53 +01:00
Christopher Faulet
1be55f9eb2 MEDIUM: mux-h1: Add support of the kernel TCP splicing to forward data
The mux relies on the flag CO_RFL_BUF_FLUSH during a call to h1_rcv_buf to know
if it needs to stop reads and to flush its internal buffers to use kernel tcp
splicing. It is the caller responsibility (here the SI) to know when it must
come back on buffered exchanges.
2018-11-18 22:08:53 +01:00
Christopher Faulet
f2824e6e10 MAJOR: mux-h1/proto_htx: Handle keep-alive connections in the mux
Now, the connection mode is detected in the mux and not in HTX analyzers
anymore. Keep-alive connections are now managed by the mux. A new stream is
created for each transaction. This removes the most important part of the
synchronization between channels and the HTTP transaction cleanup. These changes
only affect the HTX part (proto_htx.c). Legacy HTTP analyzers remain untouched
for now.

On the client-side, the mux is responsible to create new streams when a new
request starts. It is also responsible to parse and update the "Connection:"
header of the response. On the server-side, the mux is responsible to parse and
update the "Connection:" header of the request. Muxes on each side are
independent. For now, there is no connection pool on the server-side, so it
always close the server connection.
2018-11-18 22:02:42 +01:00
Christopher Faulet
129817b394 MEDIUM: mux-h1: Add parsing of incoming and ougoing HTTP messages
For now, it only parses and transfers data. There is no internal representation
yet. It means the stream still need to parse it too. So a message is parsed 3
times today: one time by each muxes (the client one and the server one) and
another time by the stream. This is of course inefficient. But don't worry, it
is only a transitionnal state. And this mux is optional for now.

BTW, headers and body parsing are now handled using same functions than the mux
H2. Request/Response synchronization is also handled. The mux's task is now used
to catch client/http-request timeouts. Others timeouts are still handled by the
stream. On the clien-side, the stream is created once headers are fully parsed
and body parsing starts only when heeaders are transferred to the stream (ie,
copied into channel buffer).

There is still some known limitations here and there. But, it works in the
common cases. Bad message are not captured and some logs are emitted when errors
occur, only if no stream are attached to the mux. Otherwise, data are
transferred and we let the stream handles errors itself.
2018-11-18 22:02:41 +01:00
Christopher Faulet
51dbc94d48 MEDIUM: mux-h1: Add dummy mux to handle HTTP/1.1 connections
For now, it is just an other kind of passthrough multiplexer, but with internal
buffers to be prepared to parse incoming messages and to format outgoing
ones. There is also a task attached to it to handle timeouts. However, because
it does not handle any timeout for now, this task is unused. And finally,
because it handles internal buffers, it also handles retries on recv/send. To
use this multiplexer, you must use the option "http-use-htx" both on the
frontend and the backend.

It does not support keep-alive and will freeze connections after the first
request/response.
2018-11-18 22:02:11 +01:00
Christopher Faulet
e0768ebabc MEDIUM: proto_htx: Add HTX analyzers and use it when the mux H1 is used
For now, these analyzers are just copies of the legacy HTTP analyzers. But,
during the HTTP refactoring, it will be the main place where it will be
visible. And in legacy analyzers, the macro IS_HTX_STRM is used to know if the
HTX version should be called or not.

Note: the following commits were applied to proto_http.c after this patch
      was developed and need to be studied to see if an adaptation to htx
      is required :

  fd9b68c BUG/MINOR: only mark connections private if NTLM is detected
2018-11-18 21:45:50 +01:00
Christopher Faulet
effc3750cc MINOR: conn_stream: Add a flag to notify the SI some data were received
The flag CS_FL_READ_PARTIAL can be set by the mux on the conn_stream to notify
the stream interface that some data were received. Is is used in si_cs_recv to
re-arm read timeout on the channel.
2018-11-18 21:45:49 +01:00
Christopher Faulet
27a3dc8fb2 MINOR: http: Call http_send_name_header with the stream instead of the txn
This is just a minor change to ease integrartion of the HTX.
2018-11-18 21:45:49 +01:00
Christopher Faulet
8277ca72b1 MINOR: http: Add standalone functions to parse a start-line or a header
These 2 functions are pretty naive. They only split a start-line into its 3
substrings or a header line into its name and value. Spaces before and after
each part are skipped. No CRLF at the end are expected.
2018-11-18 21:45:49 +01:00
Christopher Faulet
72d9125efb MINOR: conn_stream: Add a flag to notify the mux it must respect the reserve
By setting the flag CO_RFL_KEEP_RSV when calling mux->rcv_buf, the
stream-interface notifies the mux it must keep some space to preserve the
buffer's reserve. This flag is only useful for multiplexers handling structured
data, because in such case, the stream-interface cannot know the real amount of
free space in the channel's buffer.
2018-11-18 21:45:48 +01:00
Christopher Faulet
f4eb75d177 MINOR: htx: Add proto_htx.c file
This file is empty for now. But it will be used to add new versions of the HTTP
analyzers based on the internal representation of HTTP messages (not implemented
yet but called HTX).
2018-11-18 21:45:48 +01:00
Christopher Faulet
c6618d6835 MINOR: conn_stream: Add a flag to notify the mux it should flush its buffers
By setting the flag CO_RFL_BUF_FLUSH when calling mux->rcv_buf, the
stream-interface notifies the mux it should flush its buffers without reading
more data. This flag is set when the SI want to use the kernel TCP splicing to
forward data. Of course, the mux can respect it or not, depending on its
state. It's just an information.
2018-11-18 21:45:48 +01:00
Olivier Houchard
7c6f8b146d MAJOR: connections: Detach connections from streams.
Do not destroy the connection when we're about to destroy a stream. This
prevents us from doing keepalive on server connections when the client is
using HTTP/2, as a new stream is created for each request.
Instead, the session is now responsible for destroying connections.
When reusing connections, the attach() mux method is now used to create a new
conn_stream.
2018-11-18 21:45:45 +01:00
Olivier Houchard
131fd89d5a MINOR: sessions: Start to store the outgoing connection in sessions.
Introduce a new field in session, "srv_conn", and a linked list of sessions
in the connection. It will be used later when we'll switch connections
from being managed by the stream, to being managed by the session.
2018-11-18 21:44:56 +01:00
Olivier Houchard
060ed43361 MINOR: mux: Add a destroy() method.
Add a new method to muxes, destroy(), that is responsible for destroying
the mux and the associated connection, to be used for server connections.
2018-11-18 21:44:53 +01:00
Olivier Houchard
d540b36e8a MINOR: mux: Add a new "avail_streams" method.
Add a new method for mux, avail_streams, that returns the number of streams
still available for a mux.
For the mux_pt, it'll return 1 if the connection is in idle, or 0. For
the H2 mux, it'll return the max number of streams allowed, minus the number
of streams currently in use.
2018-11-18 21:44:06 +01:00
Olivier Houchard
b6c32ee4c2 MEDIUM: mux: Teach the mux_pt how to deal with idle connections.
In order to make the mux_pt able to handle idle connections, give it its
own context, where it'll stores the connection, the current conn_stream if
any, and a wait_event, so that it can subscribe to I/O events.
Add a new parameter to the detach() method, that gives the mux a hint
if it should destroy the connection or not when detaching a conn_stream.
If 1, then the mux_pt immediately destroys the connecion, if 0, then it
just subscribes to any read event. If a read happens, it will call
conn_sock_drain(), and if there's a connection error, it'll free the
connection, after removing it from the idle list.
2018-11-18 21:44:03 +01:00
Olivier Houchard
47e9a1ad4e MEDIUM: connections: Wait until the connection is established to try to recv.
Instead of trying to receive as soon as the connection is created, and to
eventually have to transfer subscription if we move connections, wait
until the connection is established before attempting to recv.
2018-11-18 21:41:50 +01:00
Willy Tarreau
db398435aa MINOR: stream-int: replace si_cant_put() with si_rx_room_{blk,rdy}()
Remaining calls to si_cant_put() were all for lack of room and were
turned to si_rx_room_blk(). A few places where SI_FL_RXBLK_ROOM was
cleared by hand were converted to si_rx_room_rdy().

The now unused si_cant_put() function was removed.
2018-11-18 21:41:50 +01:00
Willy Tarreau
b26a6f9708 MEDIUM: stream-int: make use of si_rx_chan_{rdy,blk} to control the stream-int from the channel
The channel can disable reading from the stream-interface using various
methods, such as :
  - CF_DONT_READ
  - !channel_may_recv()
  - and possibly others

Till now this was done by mangling SI_FL_RX_WAIT_EP which is not
appropriate at all since it's not the stream interface which decides
whether it wants to deliver data or not. Some places were also wrongly
relying on SI_FL_RXBLK_ROOM since it was the only other alternative,
but it's not suitable for CF_DONT_READ.

Let's use the SI_FL_RXBLK_CHAN flag for this instead. It will properly
prevent the stream interface from being woken up and reads from
subscribing to more receipt without being accidently removed. It is
automatically reset if CF_DONT_READ is not set in stream_int_notify().

The code is not trivial because it splits the logic between everything
related to buffer contents (channel_is_empty(), CF_WRITE_PARTIAL, etc)
and buffer policy (CF_DONT_READ). Also it now needs to decide timeouts
based on any blocking flag and not just SI_FL_RXBLK_ROOM anymore.

It looks like this patch has caused a minor performance degradation on
connection rate, which possibly deserves being investigated deeper as
the test conditions are uncertain (e.g. slightly more subscribe calls?).
2018-11-18 21:41:49 +01:00
Willy Tarreau
47baeb85d4 MEDIUM: stream-int: unconditionally call si_chk_rcv() in update and notify
For a long time, stream_int_update() and stream_int_notify() used to only
conditionally call si_chk_rcv() based on state change detection. This
detection is not reliable and quite complex. With the new blocked flags
that si_chk_rcv() checks, it's much more reliable to always call the
function to take into account recent changes,and let it decide if it needs
to wake something up or not.

This also removes the calls to si_chk_rcv() that were performed in
si_update_both() since these ones are systematically performed in
stream_int_update() after updating the Rx flags.
2018-11-18 21:41:49 +01:00
Willy Tarreau
abb5d4202f MEDIUM: stream-int: use si_rx_shut_blk() to indicate the SI is closed
Till now we were using si_done_put() upon shutr, but these flags could
be reset upon next activity. Now let's switch to SI_FL_RXBLK_SHUT which
doesn't go away. It's also set in stream_int_update() in case a shutr
condition is detected.

The now unused si_done_put() was removed.
2018-11-18 21:41:49 +01:00
Willy Tarreau
4b962a4179 MEDIUM: stream-int: fix the si_cant_put() calls used for buffer readiness
A number of calls to si_cant_put() were used in fact to request being
called back once a buffer is available. These ones are not needed anymore
since si_alloc_ibuf() already sets the SI_FL_RXBLK_BUFF flag when called
in appctx context. Those called with a foreign stream-int are simply turned
to si_rx_buff_blk().
2018-11-18 21:41:48 +01:00
Willy Tarreau
3367d4156d MEDIUM: stream-int: fix the si_cant_put() calls used for end point readiness
A number of si_cant_put() calls were still present to in fact indicate
that the end point is ready (thus should be turned to si_rx_endp_more()).

One other call in the Lua handler indicates that the endpoint wanted to
be blocked until some room is made in the Rx buffer in order to detect
that the connection happened, which is in fact an indication that it
wants to be called once the endpoint is ready, this is the default case
for an applet so this call was removed.

A useless call to si_cant_put() before appctx_wakeup() in the Lua
applet wakeup call was removed as well since the first thing that will
be done there will be to set end ENDP blocking flag.
2018-11-18 21:41:48 +01:00
Willy Tarreau
186dcdd128 MINOR: stream-int: automatically mark applets as ready if they block on the channel
If an applet reports being blocked due to any of the channel-side flags,
it's reportedly ready to deliver incoming data. It's better to do this
after the return from the applet handler so that applet developers don't
have to worry about details related to flags ordering.
2018-11-18 21:41:48 +01:00
Willy Tarreau
dd5621ab80 MEDIUM: stream-int: update the endp polling status only at the end of si_cs_recv()
Instead of first indicating that there's more data to read from the
conn_stream then re-adjusting this info along the function, we now
instead set the status according to the subscription status at the
end. It's easier, more accurate, and less sensitive to intermediary
changes.

This will soon allow to remove all the si_cant_put() calls that were
placed in the middle to force a subsequent callback and prevent the
function from subscribing to the mux layer.
2018-11-18 21:41:47 +01:00
Willy Tarreau
8bb2ffb831 MINOR: stream-int: replace si_{want,stop}_put() with si_rx_endp_{more,done}()
Here it's only a 1-to-1 replacement.
2018-11-18 21:41:47 +01:00
Willy Tarreau
8be7cd7b92 MEDIUM: stream-int: use si_rx_buff_{rdy,blk} to report buffer readiness
The stream interface used to conflate a missing buffer and lack of
buffer space into SI_FL_WAIT_ROOM but this causes difficulties as
these cannot be checked at the same moment and are not resolved at
the same moment either. Now we instead mark the buffer as presumably
available using si_rx_buff_rdy() and mark it as unavailable+requested
using si_rx_buff_blk().

The call to si_alloc_buf() was moved after si_stop_put(). This makes
sure that the SI_FL_RX_WAIT_EP flag is cleared on allocation failure so
that the function is called again if the callee fails to do its work.
2018-11-18 21:41:47 +01:00
Willy Tarreau
32742fdf45 MINOR: stream-int: use si_rx_blocked()/si_tx_blocked() to check readiness
This way we don't limit ourselves to random flags only and the code
is more readable and safer for the long term.
2018-11-18 21:41:46 +01:00
Willy Tarreau
05b9b64afb MINOR: stream-int: replace SI_FL_WANT_PUT with !SI_FL_RX_WAIT_EP
The SI_FL_WANT_PUT flag is used in an awkward way, sometimes it's
set by the stream-interface to mean "I have something to deliver",
sometimes it's cleared by the channel to say "I don't want you to
send what you have", and it has to be set back once CF_DONT_READ
is cleared. This will have to be split between SI_FL_RX_WAIT_EP
and SI_FL_RXBLK_CHAN. This patch only replaces all uses of the
flag with its natural (but negated) replacement SI_FL_RX_WAIT_EP.
The code is expected to be strictly equivalent. The now unused flag
was completely removed.
2018-11-18 21:41:46 +01:00
Willy Tarreau
d0f5bbcd64 MINOR: stream-int: rename SI_FL_WAIT_ROOM to SI_FL_RXBLK_ROOM
This flag is not enough to describe all blocking situations, as can be
seen in each case we remove it. The muxes has taught us that using multiple
blocking flags in parallel will be much easier, so let's start to do this
now. This patch only renames this flags in order to make next changes more
readable.
2018-11-18 21:41:45 +01:00
Willy Tarreau
89b6a2b4fd MINOR: stream-int: relax the forwarding rules in stream_int_notify()
There currently is an optimization in stream_int_notify() consisting
in not trying to forward small bits of data if extra data remain to be
processed. The purpose is to avoid forwarding one chunk at a time if
multiple chunks are available to be parsed at once. It consists in
avoiding sending pending output data if there are still data to be
parsed in the channel's buffer, since process_stream() will have the
opportunity to deal with them all at once.

Not only this optimization is less useful with the new way the connections
work, but it even causes problems like lost events since WAIT_ROOM will
not be removed. And with HTX, it will never be able to update the input
buffer after the first read.

Let's relax the rules now, by always sending if we don't have the
CF_EXPECT_MORE flag (used to group writes), or if the buffer is
already full.
2018-11-18 21:41:44 +01:00
Willy Tarreau
6b1379fb8a MINOR: stream-int: make conn_si_send_proxy() use cs_get_first()
The function used to abuse the internals of mux_pt to retrieve a
conn_stream, which will not work anymore after the idle connection
changes. Let's make it rely on the more reliable cs_get_first()
instead.
2018-11-18 21:38:19 +01:00
Willy Tarreau
fafd3984b9 MINOR: mux: implement a get_first_cs() method
This method is used to retrieve the first known good conn_stream from
the mux. It will be used to find the other end of a connection when
dealing with the proxy protocol for example.
2018-11-18 21:29:20 +01:00
Willy Tarreau
479998adbf CLEANUP: h2: minimum documentation for recent API changes
Commit d4dd22d ("MINOR: h2: Let user of h2_recv() and h2_send() know xfer
has been done") changed the API without documenting the expected returned
values which appear to come out of nowhere in the code :-(  Please don't
do that anymore! The description was recovered from the commit message.
2018-11-18 06:35:29 +01:00
Christopher Faulet
6b44975fbd BUG/MINOR: config: Copy default error messages when parsing of a backend starts
To be used, error messages declared in a default section must be copied when the
parsing of a proxy section starts. But this was only done for frontends.

This patch may be backported to older versions.
2018-11-18 06:17:03 +01:00
Willy Tarreau
ade6478a8c MINOR: stream: move the conn_stream specific calls to the stream-int
There are still some unwelcome synchronous calls to si_cs_recv() in
process_stream(). Let's have a new function si_sync_recv() to perform
a synchronous receive call on a stream interface regardless of the type
of its endpoint, and move these calls there. For now it only implements
conn_streams since it doesn't seem useful to support applets there. The
function implements an extra check for the stream interface to be in an
established state before attempting anything.
2018-11-17 19:53:45 +01:00
Willy Tarreau
00b3b8c361 BUG/MINOR: stream-int: set SI_FL_WANT_PUT in sess_establish()
In commit f26c26c ("BUG/MEDIUM: stream-int: change the way buffer room
is requested by a stream-int") we used to call si_want_put() at the
end of sess_update_st_con_tcp(), when switching to SI_ST_EST state.
But this is incorrect as there are a few other situations where we
can switch to this state, such as in si_connect() where a connection
reuse is detected, or when directly calling an applet (in which case
that was already covered anyway). For now it doesn't have any side
effect but it could impact connection reuse after the stream-int
changes by stalling an immediately reused connection.

Let's move this flag change to sess_establish() instead, which is the
only place which is always called exactly once on connection setup.

No backport is needed, this is purely 1.9.
2018-11-17 19:20:01 +01:00
William Lallemand
a337229ac2 MEDIUM: cli: worker socketpair is unstoppable
In master-worker mode, the socketpair CLI listener of the worker is now
marked unstoppable, which allows to connect to the CLI of an old process
which is in a leaving state, allowing to debug it.
2018-11-16 17:05:40 +01:00
William Lallemand
c59f9884d7 MEDIUM: listeners: support unstoppable listener
An unstoppable listener is a listener which won't be stop during a soft
stop. The unstoppable_jobs variable is incremented and the listener
won't prevent the process to leave properly.

It is not a good idea to use this feature (the LI_O_NOSTOP flag) with a
listener that need to be bind again on another process during a soft
reload.
2018-11-16 17:05:40 +01:00
William Lallemand
a719926cf8 MEDIUM: jobs: support unstoppable jobs for soft stop
This patch allows a process to properly quit when some jobs are still
active, this feature is handled by the unstoppable_jobs variable, which
must be atomically incremented.

During each new iteration of run_poll_loop() the break condition of the
loop is now (jobs - unstoppable_jobs) == 0.

The unique usage of this at the moment is to handle the socketpair CLI
of a the worker during the stopping of the process.  During the soft
stop, we could mark the CLI listener as an unstoppable job and still
handle new connections till every other jobs are stopped.
2018-11-16 17:05:40 +01:00
Christopher Faulet
3c0544efbf BUG/MINOR: http: Be sure to sent fully formed HTTP 103 responses
The previous commit fedceaf33 ("MINOR: http: Regroup return statements of
http_req_get_intercept_rule at the end") partly fixes the problem. But not
entierly. Because HTTP 103 reponses were sent line by line it is possible to mix
them with others. For instance, an early-hint rule followed by a redirect rule
leaving the response buffer totally messed up. Furthermore, if we fail to add
the last CRLF to finish the HTTP 103 response because there is no more space in
the buffer, it leave the buffer with an unfinished and invalid message.

This patch fixes the bug by creating a fully formed HTTP 103 response before
trying to push it in the response buffer. If an error occurred during the copy
or if another response was already sent, the HTTP 103 response is
ignored. However, the last point should never happened because, for redirects
and authentication errors, we first try to copy any pending HTTP 103 response.
2018-11-16 16:05:51 +01:00
Christopher Faulet
6c243ebb9f MINOR: http: Regroup return statements of http_res_get_intercept_rule at the end
Instead of having multiple return statements spreaded here and there in middle
of the function, we just exit from the loop setting the right return code. It
let a chance to do some work before leaving the function. It is also less error
prone.
2018-11-16 16:05:51 +01:00
Christopher Faulet
ea827bdcbc MINOR: http: Regroup return statements of http_req_get_intercept_rule at the end
Instead of having multiple return statements spreaded here and there in middle
of the function, we just exit from the loop setting the right return code. It
let a chance to do some work before leaving the function. It is also less error
prone.
2018-11-16 16:05:51 +01:00
Christopher Faulet
78337bbbaa BUG/MINOR: http_fetch: Remove the version part when capturing the request uri
This patch fixes a bug introduced in the commit 6b952c810 ("REORG: http: move
http_get_path() to http.c"). In the reorg, the code responsible to skip the
version to only extract the path in the HTTP request was dropped.

No backport is needed, this only affects 1.9.
2018-11-16 16:05:51 +01:00
Willy Tarreau
ffb1205a47 BUG/MINOR: stream-int: make sure not to go through the rcv_buf path after splice()
When splice() reports a pipe full condition, we go through the common
code used to release a possibly empty pipe (which we don't have) and which
immediately tries to allocate a buffer that will never be used. Further,
it may even subscribe to get this buffer if the resources are low. Let's
simply get out of this way if the pipe is full.

This fix could be backported to 1.8 though the code is a bit different
overthere.
2018-11-15 17:00:08 +01:00
Willy Tarreau
81464b4e4d BUG/MEDIUM: stream-int: clear CO_FL_WAIT_ROOM after splicing data in
Since we don't necessarily pass through conn_fd_handler() when reading,
conn_refresh_polling_flags() is not necessarily called when performing
a recv() operation, thus flags like CO_FL_WAIT_ROOM are not cleared.

It happens that si_cs_recv() checks CO_FL_WAIT_ROOM before deciding to
receive into a buffer, to see if the previous rcv_pipe() call failed by
lack of pipe room. The combined effect of these two statements is that
at the end of a file transmission, when there's too little data to
warrant the use of a pipe and the pipe is empty, we refrain from using
rcv_pipe() for the last few bytes, but since CO_FL_WAIT_ROOM is still
present, we don't use rcv_buf() either, and the connection remains
frozen in this state with si_cs_recv() called in loops.

In order to fix this we can simply manually clear CO_FL_WAIT_ROOM when
not using pipe so that the next check sees the result of the previous
operation and not an old one. We could equally call
cond_refresh_polling_flags() but that would be overkill and dangerous
given that it would manipulate the connection's flags under the mux.
By the way ideally the mux should report this flag into the connstream
for cleaner manipulation.

No backport is needed as this is only post 1.9-dev2.
2018-11-15 17:00:08 +01:00
Willy Tarreau
f6975aa920 BUG/MEDIUM: stream-int: make failed splice_in always subscribe to recv
As part of the changes that went into 1.9-dev2 regarding the polling
modifications, the changes consecutive to the removal of the wait_list
from the conn_streams (commit 71384551a) made si_cs_recv() occasionally
return without subscribing to receive events, causing spliced transfers
to randomly fail if the client was at least as fast as the server. This
may remain unnoticed on most deployments since servers are usually close
to haproxy with higher bandwidth than clients have, resulting in buffers
always being full.

In order to reproduce his effect, it is better to do it on the local
machine and to transfer very large objects (hundreds of gigs) over a
single connection, to see it suddenly stall after a few tens of gigs.
Now with this fix it's fine even after 3 TB over a single connection.

No backport is needed.
2018-11-15 14:39:03 +01:00
Olivier Houchard
52dabbc4fa BUG/MEDIUM: Make sure stksess is properly aligned.
When we allocate struct stksess, we also allocate memory to store the
associated data before the struct itself.
As the data can be of different types, they can have different size. However,
we need the struct stksess to be properly aligned, as it can do 64bits
load/store (including atomic load/stores) on 64bits platforms, and some of
them doesn't support unaligned access.
So, when allocating the struct stksess, round the size up to the next
multiple of sizeof(void *), and make sure the struct stksess itself is
properly aligned.
Many thanks to Paul Martin for investigating and reporting that bug.

This should be backported to earlier releases.
2018-11-15 14:24:05 +01:00
William Lallemand
a8b2671cf6 BUG/MEDIUM: log: don't CLOEXEC the inherited FDs
When configuring the logs with a FD and using the master worker, the FD
was closed upon a reload because it was configured with CLOEXEC. It
leads to using the wrong FD for the logs and to close them. Which is
unfortunate since the master rely on the FD left opened during a reload.

The fix is to stop doing a CLOEXEC when the FD is inherited.
No backport needed.
2018-11-13 19:32:45 +01:00
William Lallemand
2e8fad9c30 MINOR: mworker: only close std{in,out,err} in daemon mode
This allows to output messages when we are not in daemon mode which is
useful to use log stdout in master worker mode.
2018-11-13 16:21:15 +01:00
Frédéric Lécaille
9ca51aa288 MINOR: http: Implement "early-hint" http request rules.
This patch implements http_apply_early_hint_rule() function is responsible of
building HTTP 103 Early Hint responses each time a "early-hint" rule is matched.
2018-11-12 21:08:55 +01:00
Frédéric Lécaille
0ebbcb663c MINOR: http: Make new "early-hint" http-request action really be parsed.
This patch adds a "early_hint" struct to "arg" union of "act_rule" struct
and parse "early-hint" http-request keyword with it using the same
code as for "(add|set)-header" parser.
2018-11-12 21:08:55 +01:00
Frédéric Lécaille
a985e3875b MINOR: http: Add new "early-hint" http-request action.
This patch adds the new "early-hint" action to "http-request" rules parser.
This action should be parsed the same way as "(add|set)-header" actions.
2018-11-12 21:08:55 +01:00
David Carlier
42d9e5ae68 BUILD/MEDIUM: threads/affinity: DragonFly build fix
DragonFlyBSD does not have a build on its own, it has always
used the FreeBSD's. To be able to support the cpu affinity,
it needs few more headers.
2018-11-12 19:16:00 +01:00
Willy Tarreau
7520e4ff57 MINOR: namespaces: don't build namespace.c if disabled
When namespaces are disabled, support is still reported because the file
is built with almost nothing in it but built anyway. Instead of extending
the scope of the numerous ifdefs in this file, better avoid building it
when namespaces are diabled. In this case we define my_socketat() as an
inline function mapping directly to socket(). The struct netns_entry
still needs to be defined because it's used by various other functions
in the code.
2018-11-12 19:15:15 +01:00
Willy Tarreau
691fe39284 BUG/MEDIUM: stream-int: convert some co_data() checks to channel_is_empty()
Splicing was in great part broken over the last few development version
due to the use of co_data() to detect if data are available in the channel.
But co_data() only looks at buffered data, not spliced data.

Channel_is_empty() takes care of both and should be used. With this,
splicing restarts to work but there are still a few cases where transfers
may stall.

No backport is needed.
2018-11-12 19:00:22 +01:00
Willy Tarreau
f26c26cca2 BUG/MEDIUM: stream-int: change the way buffer room is requested by a stream-int
Subsequent to the recent stream-int updates, we started to consider that
SI_FL_WANT_PUT needs to be set when receipt is enabled, but this is wrong
and results in 100% CPU when an HTTP client stays idle after a keep-alive
request because the stream-int has nothing to provide and nothing to send.

In fact just like for applets this flag should reflect the continuation
of an attempt. So it's si_cs_recv() which should set the flag, and clear
it if it has nothing more to provide. This function is called the first
time in process_stream()), and called again during transfers, so it will
always be up to date during stream_int_update() and stream_int_notify().

As a special case, it should also be set when a connection switches to
the established state. And we should absolutely refrain from calling
si_cs_recv() to re-enable reading, normally just setting this flag
(from within the stream-int's handler or prior to calling si_chk_rcv())
is expected to be OK.

A corner case remains where it was observed that in stream_int_notify() we
can sometimes be called with an empty output channel with SI_FL_WAIT_ROOM
and no CF_WRITE_PARTIAL, so there's no way to detect that we should
re-enable receiving. It's easy to also take care of this condition
there for the time it takes to figure if this situation is expected
or not.

Now it becomes more obvious that relying on a single flag to request
room (or on two flags to arbiter activity) is not workable given the
autonomy of both sides. The mux_h2 has taught us that blocking flags
are much more reliable, require much less condition and are much easier
to deal with. That's probably something to consider quickly in this
area.

No backport is needed.
2018-11-12 18:58:45 +01:00
Willy Tarreau
c1b0645dac MEDIUM: log: add a new "raw" format
This format is pretty similar to the previous "short" format except
that it also removes the severity level. Thus only the raw message is
sent. This is suitable for use in containers, where only the raw
information is expected and where the severity is supposed to come
from the file descriptor used.
2018-11-12 18:37:55 +01:00
Willy Tarreau
e8746a08b2 MEDIUM: log: support a new "short" format
This format is meant to be used with local file descriptors. It emits
messages only prefixed with a level, removing all the process name,
system name, date and so on. It is similar to the printk() format used
on Linux. It's suitable to be sent to a local logger compatible with
systemd's output format.

Note that the facility is still required but not used, hence it is
suggested to use "daemon" to remind that it's a local logger.
Example :

    log stdout format short daemon          # send everything to stdout
    log stderr format short daemon notice   # send important events to stderr
2018-11-12 18:37:55 +01:00
Willy Tarreau
5a32ecc6cf MEDIUM: log: add support for logging to existing file descriptors
In certain situations it would be desirable to log to an existing file
descriptor, the most common case being a pipe between containers or
processes. The main issue with pipes is that using write() on them will
randomly truncate messages. But there is a trick. By using writev(), we
can atomically deliver or drop a message, which perfectly fits the
purpose. The only caveat is that large messages (4096 bytes on modern
operating systems) may be interleaved with messages from other processes
if using nbproc for example. In practice such messages are rare and most
of the time when users need such type of logging, the load is low enough
for a single process to be running so this is not really a problem.

This logging method thus uses unbuffered writev() calls and is uses more
CPU than if it used its own buffer with large writes at once, though this
is not a problem for moderate loads.

Logging to a file descriptor attached to a file also works with the side
effect that the process is significantly slowed down during disk accesses
and that it's not possible to rotate the file without restarting the
process. For this reason this option is not offered as a configuration
option, since it would confuse most users, but one could decide to
redirect haproxy's output to a file during debugging sessions. Two aliases
"stdout" and "stderr" are provided, but keep in mind that these are closed
by default in daemon mode.

When logging to a pipe or socket at a high enough rate, some logs will be
dropped and the number of dropped messages is reported in "show info".
2018-11-12 18:37:55 +01:00
Willy Tarreau
13ef773722 MINOR: log: report the number of dropped logs in the stats
It's easy to detect when logs on some paths are lost as sendmsg() will
return EAGAIN. This is particularly true when sending to /dev/log, which
often doesn't support a big logging capacity. Let's keep track of these
and report the total number of dropped messages in "show info".
2018-11-12 18:37:55 +01:00
Willy Tarreau
251fe34ca2 MINOR: log: slightly improve error message syntax on log failure
The error messages used to say something along "socket logger 2 failed"
or "sendmsg logger 2 failed" which are confusing. Let's rephrase this
"sendmsg() failed for logger 2".
2018-11-12 18:37:55 +01:00
Willy Tarreau
96062a181d BUILD: cache: fix a build warning regarding too large an integer for the age
Building on 32 bit gives this :

  src/cache.c: In function 'http_action_store_cache':
  src/cache.c:466:4: warning: this decimal constant is unsigned only in ISO C90 [enabled by default]
  src/cache.c:467:5: warning: this decimal constant is unsigned only in ISO C90 [enabled by default]
  src/cache.c: In function 'cache_channel_append_age_header':
  src/cache.c:578:2: warning: this decimal constant is unsigned only in ISO C90 [enabled by default]
  src/cache.c:579:3: warning: this decimal constant is unsigned only in ISO C90 [enabled by default]

It's because of the definition below added in commit e7a770c ("MINOR:
cache: Add "Age" header.") :

  #define CACHE_ENTRY_MAX_AGE 2147483648

Just appending "U" to mark it unsigned is enough to fix it. This only
affects 1.9, no backport is needed.
2018-11-11 14:03:02 +01:00
Willy Tarreau
4db49c0704 BUG/MINOR: config: better detect the presence of the h2 pattern in npn/alpn
In 1.8, commit 45a66cc ("MEDIUM: config: ensure that tune.bufsize is at
least 16384 when using HTTP/2") tried to avoid an annoying issue making
H2 fail when haproxy is built with default buffer sizes smaller than 16kB,
which used to be the case for a very long time. Sadly, the test only sees
when NPN/ALPN exactly match "h2" and not when it's combined like
"h2,http/1.1" nor "http/1.1,h2". We can safely use strstr() there because
the string is prefixed by the token's length (0x02) which is unambiguous
as it cannot be part of any other token.

This fix should be backported to 1.8 as a safety guard against bad
configurations.
2018-11-11 10:42:37 +01:00
Christopher Faulet
4eb7d745e2 MEDIUM: stream-int: Try to read data even if channel's buffer seems to be full
Before calling the mux to get incoming data, we get the amount of space
available at the input of the buffer. If there is no space, we don't try to read
more data. This is good enough when raw data are stored in the buffer. But this
info has no meaning when structured data are stored. Because with the HTTP
refactoring, such kind of data will be stored in buffers, it is a bit annoying.

So, to avoid any problems, we always call the mux. It is the mux's responsiblity
to notify the stream interface it needs more space to store more data. This must
be done by setting the flag CS_FL_RCV_MORE on the conn_stream.

This is exactly what we do in the pass-through mux when <count> is null.
2018-11-11 10:18:37 +01:00
Christopher Faulet
b3e0de46ce MEDIUM: stream-int: Rely only on SI_FL_WAIT_ROOM to stop data receipt
This flag is set on the stream interface when we should wait for more space in
the channel's buffer to store more incoming data. This means we should wait some
outgoing data are sent before retrying to receive more data.

But in stream interface functions, at many places, instead of checking this
flag, we use the function channel_may_recv to know if we can (re)start
reading. This currently works but it is not really consistent. And, it works
because only raw data are stored in buffers. But it will be a problem when we
start to store structured data in buffers.

So to avoid any problems with futur implementations, we now rely only on
SI_FL_WAIT_ROOM. The function channel_may_recv can still be called, but only
when we are sure to handle raw data (for instance in functions ci_put*). To do
so, among other things, we must be sure to unset SI_FL_WAIT_ROOM and offer an
opportunity to call chk_rcv() on a stream interface when some data are sent
on the other end, which is now granted by the previous patch series.
2018-11-11 10:18:37 +01:00
Willy Tarreau
d0d40ebf5e CLEANUP: stream-int: remove the now unused si->update() function
We exclusively use stream_int_update() now, the lower layers are not
called anymore so let's remove them, as well as si_update() which used
to be their wrapper.
2018-11-11 10:18:37 +01:00
Willy Tarreau
bf89ff3db8 MEDIUM: stream-int: make stream_int_update() aware of the lower layers
It's far from being clean, but at least it allows to resync both CS and
applets from the same place, taking into account the fact that CS are
processed synchronously for the send side while appletx are processed
outside of the process_stream() loop. The arrangement is optimised to
minimize the amount of iteration by handling send first, then updating
the SI_FL_WAIT_ROOM flags and only then dealing with si_chk_rcv() on
both sides. The SI_FL_WANT_PUT flag is set if needed before calling
si_chk_rcv() since this is done prior to calling stream_int_update().

Now there's no risk that stream_int_notify() is called anymore during
such operations, thus we cannot have any spurious wake-up anymore. The
case where a successful send() could complete a pending connect() is
handled by taking any stream-int state changes into account at the
call place, which is normal since process_stream() is designed to
iterate till stabilisation.

Doing this solves most of the remaining inconsistencies between CS and
applets.
2018-11-11 10:18:37 +01:00
Willy Tarreau
d14844a734 MINOR: stream-int: replace si_update() with si_update_both()
The function used to be called in turn for each side of the stream, but
since it's called exclusively from process_stream(), it prevents us from
making use of the knowledge we have of the operations in progress for
each side, resulting in having to go all the way through functions like
stream_int_notify() which are not appropriate there.

That patch creates a new function, si_update_both() which takes two
stream interfaces expected to belong to the same stream, and processes
their flags in a more suitable order, but for now doesn't change the
logic at all.

The next step will consist in trying to reinsert the rest of the socket
layer-specific update code to ultimately update the flags correctly at
the end of the operation.
2018-11-11 10:18:37 +01:00
Willy Tarreau
abf531caa0 MEDIUM: stream-int: always call si_chk_rcv() when we make room in the buffer
Instead of clearing the SI_FL_WAIT_ROOM flag and losing the information
about the need from the producer to be woken up, we now call si_chk_rcv()
immediately. This is cheap to do and it could possibly be further improved
by only doing it when SI_FL_WAIT_ROOM was still set, though this will
require some extra auditing of the code paths.

The only remaining place where the flag was cleared without a call to
si_chk_rcv() is si_alloc_ibuf(), but since this one is called from a
receive path woken up from si_chk_rcv() or not having failed, the
clearing was not necessary anymore either.

And there was one place in stream_int_notify() where si_chk_rcv() was
called with SI_FL_WAIT_ROOM still explicitly set so this place was
adjusted in order to clear the flag prior to calling si_chk_rcv().

Now we don't have any situation where we randomly clear SI_FL_WAIT_ROOM
without trying to wake the other side up, nor where we call si_chk_rcv()
with the flag set, so this flag should accurately represent a failed
attempt at putting data into the buffer.
2018-11-11 10:18:37 +01:00
Willy Tarreau
1f9de21c38 MEDIUM: stream-int: make SI_FL_WANT_PUT reflect CF_DONT_READ
When CF_DONT_READ is set, till now we used to set SI_FL_WAIT_ROOM, which
is not appropriate since it would lose the subscribe status. Instead let's
clear SI_FL_WANT_PUT (just like applets do), and set the flag only when
CF_DONT_READ is cleared.

We have to do this in stream_int_update(), and in si_cs_io_cb() after
returning from si_cs_recv() since it would be a bit invasive to hack
this one for now. It must not be done in stream_int_notify() otherwise
it would re-enable blocked applets.

Last, when si_chk_rcv() is called, it immediately clears the flag before
calling ->chk_rcv() so that we are not tempted to uselessly loop on the
same call until the receive function is called. This is the same principle
as what is done with the applet scheduler.
2018-11-11 10:18:37 +01:00
Willy Tarreau
1bdb598a55 MINOR: stream-int: factor the SI_ST_EST state test into si_chk_rcv()
This test is made in each implementation of the function, better to
merge it.
2018-11-11 10:18:37 +01:00
Willy Tarreau
96aadd5c55 MEDIUM: stream-int: temporarily make si_chk_rcv() take care of SI_FL_WAIT_ROOM
This flag should already be cleared before calling the *chk_rcv() functions.
Before adapting all call places, let's first make sure si_chk_rcv() clears
it before calling them so that these functions do not have to check it again
and so that they do not adjust it. This function will only call the lower
layers if the SI_FL_WANT_PUT flag is present so that the endpoint can decide
not to be called (as done with applets).
2018-11-11 10:18:37 +01:00
Willy Tarreau
43e69dc1eb MINOR: stream-int: make use of si_done_{get,put}() in shut{w,r}
It's cleaner to use these functions there to properly clear the flags.
2018-11-11 10:18:37 +01:00
Willy Tarreau
af4f6f6d2f MINOR: stream-int: use si_cant_put() instead of setting SI_FL_WAIT_ROOM
We now do this on the si_cs_recv() path so that we always have
SI_FL_WANT_PUT properly set when there's a need to receive and
SI_FL_WAIT_ROOM upon failure.
2018-11-11 10:18:37 +01:00
Willy Tarreau
0cd3bd628a MINOR: stream-int: rename si_applet_{want|stop|cant}_{get|put}
It doesn't make sense to limit this code to applets, as any stream
interface can use it. Let's rename it by simply dropping the "applet_"
part of the name. No other change was made except updating the comments.
2018-11-11 10:18:37 +01:00
Willy Tarreau
21028b5e7f MEDIUM: appctx: check for allocation attempts in buffer allocation callbacks
The buffer allocation callback appctx_res_wakeup() used to rely on old
tricks to detect if a buffer was already granted to an appctx, namely
by checking the task's state. Not only this test is not valid anymore,
but it's inaccurate.

Let's solely on SI_FL_WAIT_ROOM that is now set on allocation failure by
the functions trying to allocate a buffer. The buffer is now allocated on
the fly and the flag removed so that the consistency between the two
remains granted. The patch also fixes minor issues such as the function
being improperly declared inline(!) and the fact that using appctx_wakeup()
sets the wakeup reason to TASK_WOKEN_OTHER while we try to use TASK_WOKEN_RES
when waking up consecutive to a ressource allocation such as a buffer.
2018-11-11 10:18:37 +01:00
Willy Tarreau
b882dd88cc MEDIUM: stream: implement stream_buf_available()
This function replaces stream_res_available(), which is used as a callback
for the buffer allocator. It now carefully checks which stream interface
was blocked on a buffer allocation, tries to allocate the input buffer to
this stream interface, and wakes the task up once such a buffer was found.
It will automatically remove the SI_FL_WAIT_ROOM flag upon success since
the info this flag indicates becomes wrong as soon as the buffer is
allocated.

The code is still far from being perfect because if a call to si_cs_recv()
fails to allocate a buffer, we'll still end up passing via process_stream()
again, but this could be improved in the future by using finer-grained
wake-up notifications.
2018-11-11 10:18:37 +01:00
William Lallemand
e260e0df44 BUG/MEDIUM: cli: crash when trying to access a worker
When using the CLI proxy of the master and trying to access a worker
with the @ prefix, the worker just crash.

The commit 7216032 ("MEDIUM: mworker: leave when the master die")
reintroduced the old code of the pipe, which was not trying to access
the pointers before. The owner of the FD was modified to a different
value, this is a problem since we call listener_accept() in most cases
now from the mworker_accept_wrapper() and it casts the owner variable to
get the listener.

This patch fix the issue by setting back the previous owner of the FD.
2018-11-08 14:48:06 +01:00
Willy Tarreau
b69f1713af BUG/MEDIUM: stream-int: don't wake up for nothing during SI_ST_CON
Commit eafd8ebcf ("MEDIUM: stream-int: call si_cs_process() in
stream_int_update_conn") uncovered a sleeping bug. By calling
si_cs_process() within si_update(), we end up calling stream_int_notify().
We rely on it to update the stream-int before quitting as a hack, but
it happens to immediately wake the task up while the stream int's
state is still SI_ST_CON (during the connection establishment). The
observable effect is that an unreachable server causes haproxy to
use 100% CPU until the connection timeout strikes.

This patch fixes this by not causing the wake up for the SI_ST_CON
state. It would equally be possible to check for states higher than
SI_ST_EST as is done in other places, but for now better stay on the
safe side by covering the only issue that can be triggered. It's
suspected that this issue slightly affects older versions by causing
one extra call to process_stream() during the connection setup for
each activity change on the other side, but this should not have
any observable effect.

No backport is needed.
2018-11-08 14:47:24 +01:00
William Lallemand
a90cacfd70 BUG/MEDIUM: mworker: does not abort() in mworker_pipe_register()
The process was aborting with nbthread > 1.

The mworker_pipe_register() could be called several time in multithread
mode, we don't want to abort() there.
2018-11-07 09:26:36 +01:00
Willy Tarreau
8ccd2081f5 CLEANUP: stream-int: retro-document si_cs_io_cb()
It took me 17 minutes this morning to figure where si->wait_event was
set (it's in si_reset() which should now probably be renamed since it
doesn't just perform a reset anymore but also an allocation) and what
its task was assigned to (si_cs_io_cb() even for applets and empty SI).

This is too confusing and not intuitive enough, let's at least add a
few comments for now to help figure how this stuff works next time.
2018-11-07 07:54:26 +01:00
William Lallemand
7216032e6f MEDIUM: mworker: leave when the master die
When the master die, the worker should exit too, this is achieved by
checking if the FD of the socketpair/pipe was closed between the master
and the worker.

In the former architecture of the master-worker, there was only a pipe
between the master and the workers, and it was easy to check an EOF on
the pipe FD to exit() the worker.

With the new architecture, we use a socketpair by process, and this
socketpair is also used to accept new connections with the
listener_accept() callback.

This accept callback can't handle the EOF and the exit of the process,
because it's very specific to the master worker. This is why we
transformed the mworker_pipe_handler() function in a wrapper which check
if there is an EOF and exit the process, and if not call
listener_accept() to achieve the accept.
2018-11-06 18:30:57 +01:00
William Lallemand
5d05db8ce1 MINOR: mworker: displays a message when a worker is forked
Displays the PID and the relative PID when we fork a new worker.
2018-11-06 18:30:50 +01:00
William Lallemand
91723745c6 MEDIUM: mworker: exit with the incriminated exit code
The former behavior was to exit() the master process with the latest
status code known, which was the one of the last process to exit.

The problem is that the master process was not exiting with the status
code which provoked the exit-on-failure.
2018-11-06 18:28:33 +01:00
William Lallemand
18e52a8834 MINOR: mworker: displays more information when leaving
When a worker is leaving, we display the relative PID and the result of
the strsignal() function if it was killed by a signal.
2018-11-06 18:28:33 +01:00
William Lallemand
550db6d188 MEDIUM: mworker: does not create the CLI proxy when no listener
Does not create the CLI proxy if no -S argument was specified. It
prevents a warning that says that the MASTER proxy does not have any
bind option.
2018-11-06 18:28:33 +01:00
William Lallemand
6b7cd0a72b MINOR: cli: can't connect to the target CLI
Return an error and quit if the CLI proxy is not able to connect to a
target.
2018-11-06 18:28:33 +01:00
William Lallemand
adbce8e0dd MINOR: cli: show the number of reload in 'show proc'
Displays the number of reload in the life of each worker.
2018-11-06 18:28:33 +01:00
Willy Tarreau
2d372c2aa1 MINOR: stats: report the number of currently connected peers
The active peers output indicates both the number of established peers
connections and the number of peers connection attempts. The new counter
"ConnectedPeers" also indicates the number of currently connected peers.
This helps detect that some peers cannot be reached for example. It's
worth mentioning that this value changes over time because unused peers
are often disconnected and reconnected. Most of the time it should be
equal to ActivePeers.
2018-11-05 17:15:21 +01:00
Willy Tarreau
199ad24661 MINOR: stats: report the number of active peers in "show info"
Peers are the last type of activity which can maintain a job present, so
it's important to report that such an entity is still active to explain
why the job count may be higher than zero. Here by "ActivePeers" we report
peers sessions, which include both established connections and outgoing
connection attempts.
2018-11-05 17:15:21 +01:00
Willy Tarreau
00098ea034 MINOR: stats: report the number of active jobs and listeners in "show info"
When an haproxy process doesn't stop after a reload, it's because it
still has some active "jobs", which mainly are active sessions, listeners,
peers or other specific activities. Sometimes it's difficult to troubleshoot
the cause of these issues (which generally are the result of a bug) only
because some indicators are missing.

This patch add the number of listeners, the number of jobs, and the stopping
status to the output of "show info". This way it becomes a bit easier to try
to narrow down the cause of such an issue should it happen. A typical use
case is to connect to the CLI before reloading, then issuing the "show info"
command to see what happens. In the normal situation, stopping should equal
1, jobs should equal 1 (meaning only the CLI is still active) and listeners
should equal zero.

The patch is so trivial that it could make sense to backport it to 1.8 in
order to help with troubleshooting.
2018-11-05 17:15:21 +01:00
Willy Tarreau
086735a688 BUG/MINOR: tasks: make sure wakeup events are properly reported to subscribers
The tasks API was changed in 1.9-dev1 with commit 9f6af3322 ("MINOR: tasks:
Change the task API so that the callback takes 3 arguments."), causing the
task's state not to be usable anymore and to have been replaced with an
explicit argument in the callee. The task's state doesn't contain any trace
of the wakeup cause anymore. But there were two places where the old task's
state remained in use :
  - sessions, used to more accurately report timeouts in logs when seeing
    TASK_WOKEN_TIMEOUT ;
  - peers, used to finish resynchronization when seeing TASK_WOKEN_SIGNAL

This commit fixes both occurrences by making sure we don't access task->state
directly (should we rename it by the way ?).

No backport is needed.
2018-11-05 17:15:21 +01:00
Willy Tarreau
1d0b7069f2 BUG/MAJOR: stream-int: don't call si_cs_recv() in stream_int_chk_rcv_conn()
This one causes some events to be lost. It has already been tested in
an experimental branch but was not merged until being certain it was
needed. Fred figured that requesting /?k=1&s=447392 from httpterm through
haproxy-master was enough to stall the transfer.

No backport is needed, this only affects 1.9-dev5.
2018-10-30 11:05:24 +01:00
Willy Tarreau
943e7ec025 MEDIUM: auth/threads: make use of crypt_r() on systems supporting it
On systems where crypt_r() is available, prefer it over a locked crypt().
This improves performance especially on very slow crypto algorithms.
2018-10-29 19:17:39 +01:00
Willy Tarreau
34d4b525a1 BUG/MEDIUM: auth/threads: use of crypt() is not thread-safe
It was reported here that authentication may fail when threads are
enabled :

    https://bugzilla.redhat.com/show_bug.cgi?id=1643941

While I couldn't reproduce the issue, it's obvious that there is a
problem with the use of the non-reentrant crypt() function there.
On Linux systems there's crypt_r() but not on the vast majority of
other ones. Thus a first approach consists in placing a lock around
this crypt() call. Another patch may relax it when crypt_r() is
available.

This fix must be backported to 1.8. Thanks to Ryan O'Hara for the
quick notification.
2018-10-29 18:06:02 +01:00
William Lallemand
744de5b52a BUG/MINOR: cli: forward the whole command on master CLI
A bug occurs when the CLI proxy of the master received a command which
is prefixed by some spaces but without a routing prefix (@).
In this case the pcli_parse_request() was returning a wrong number of
data to forward.

The response analyzer was called twice and the prompt displayed twice.
2018-10-29 17:23:27 +01:00
Willy Tarreau
cde1bc64cb BUG/MINOR: backend: assign the wait list after the error check
Commit 85b73e9 ("BUG/MEDIUM: stream: Make sure polling is right on retry.")
introduced a possible null dereference on the error path detected by gcc-7.
Let's simply assign srv_conn after checking the error and not before.

No backport is needed.
2018-10-28 20:36:00 +01:00
Willy Tarreau
9d9ccdbf8b BUG/MAJOR: http: http_txn_get_path() may deference an inexisting buffer
When the "path" sample fetch function is called without any path, the
function doesn't check that the request buffer is allocated. While this
doesn't happen with the request during processing, it can definitely
happen when mistakenly trying to reference a path from the response
since the request channel is not allocated anymore.

It's certain that this bug was emphasized by the buffer changes that
went in 1.9 and the HTTP refactoring, but at first glance, 1.8 doesn't
seem 100% safe either so it's possible that older version are affected
as well.

Thanks to PiBa-NL for reporting this bug with a reproducer.
2018-10-28 20:16:12 +01:00
Frédéric Lécaille
e7a770ce80 MINOR: cache: Add "Age" header.
This patch makes the cache capable of adding an "Age" header as defined by
rfc7234.

During the storage of new HTTP objects we memorize ->eoh value and
the value of the "Age" header coming from the origin server.
These information may then be reused to return the cached HTTP objects
with a new "Age" header.

May be backported to 1.8.
2018-10-28 19:06:59 +01:00
William Lallemand
deeaa593f3 MINOR: cli: helper to write an response message and close
pcli_reply_and_close() writes a message to the client and close the
connection. To be used only in the CLI proxy.
2018-10-28 14:13:35 +01:00
William Lallemand
2f4ce202d7 MEDIUM: cli: write a prompt for the CLI proxy of the master
Write a prompt with the PID of the target or master.
It's always activated for now.

Example:
    1234>
    master>
2018-10-28 14:13:34 +01:00
William Lallemand
309dc9adec MEDIUM: mworker: stop the master proxy in the workers
The master proxy which handles the CLI should not be used or shown in
the stats of the workers. This proxy is now disabled after the fork.
2018-10-28 14:03:31 +01:00
William Lallemand
0b3e849a48 MEDIUM: listeners: set O_CLOEXEC on the accepted FDs
Set the O_CLOEXEC flag on the accept, useful to avoid an FD leak in the
master process, since it reexecutes itself during a reload
2018-10-28 14:03:31 +01:00
William Lallemand
4e8450b7d6 MINOR: cli: put @master @<relative pid> @!<pid> in the help
Add help for the prefix command of the CLI. These help only displays
from the CLI of the master.
2018-10-28 14:03:30 +01:00
William Lallemand
35851fbaf4 MEDIUM: cli: enable "show cli sockets" for the master
Enable the keyword on the master CLI.
2018-10-28 14:03:30 +01:00
William Lallemand
2631434b4b MINOR: cli: displays sockpair@ in "show cli sockets"
The 'show cli sockets' was not handling the sockpairs, it now displays
the fd of the socket and also show the unknown protocols.
2018-10-28 14:03:30 +01:00
William Lallemand
cf62f7e3cb MEDIUM: cli: implement 'mode cli' proxy analyzers
This patch implements analysers for parsing the CLI and extra features
for the master's CLI.

For each command (sent alone, or separated by ; or \n) the request
analyser will determine to which server it should send the request.

The 'mode cli' proxy is able to parse a prefix for each command which is
used to select the apropriate server. The prefix start by @ and is
followed by "master", the PID preceded by ! or the relative PID. (e.g.
@master, @1, @!1234). The servers are not round-robined anymore.

The command is sent with a SHUTW which force the server to close the
connection after sending its response. However the proxy allows a
keepalive connection on the client side and does not close.

The response analyser does not do much stuff, it only reinits the
connection when it received a close from the server, and forward the
response. It does not analyze the response data.
The only guarantee of the end of the response is the close of the
server, we can't rely on the double \n since it's not send by every
command.

This could be reimplemented later as a filter.
2018-10-28 14:03:06 +01:00
William Lallemand
b9f9e3bc17 MEDIUM: cli: 'show proc' displays processus
This patch implements a command which displays the current processes.

It only works in the CLI of the master.
2018-10-28 13:51:39 +01:00
William Lallemand
291810d8f8 MEDIUM: mworker: find the server ptr using a CLI prefix
Add a struct server pointer in the mworker_proc struct so we can easily
use it as a target for the mworker proxy.

pcli_prefix_to_pid() is used to find the right PID of the worker
when using a prefix in the CLI. (@master, @#<relative pid> , @<pid>)

pcli_pid_to_server() is used to find the right target server for the
CLI proxy.
2018-10-28 13:51:39 +01:00
William Lallemand
14721be11f MEDIUM: cli: disable some keywords in the master
The master process does not need all the keywords of the cli, add 2
flags to chose which keyword to use.

It might be useful to activate some of them in a debug mode later...
2018-10-28 13:51:39 +01:00
William Lallemand
e736115d3a MEDIUM: mworker: create CLI listeners from argv[]
This patch introduces mworker_cli_proxy_new_listener() which allows the
creation of new listeners for the CLI proxy.

Using this function it is possible to create new listeners from the
program arguments with -Sa <unix_socket>. It is allowed to create
multiple listeners with several -Sa.
2018-10-28 13:51:39 +01:00
William Lallemand
8a02257d88 MEDIUM: mworker: proxy for the master CLI
This patch implements a listen proxy within the master. It uses the
sockpair of all the workers as servers.

In the current state of the code, the proxy is only doing round robin on
the CLI of the workers. A CLI mode will be needed to know to which CLI
send the requests.
2018-10-28 13:51:39 +01:00
William Lallemand
1b66361f8d MEDIUM: mworker: move proc_list gen before proxies startup
We need to generate the process list before starting the proxies,
because it will be used to create a proxy in the master
2018-10-28 13:51:38 +01:00
William Lallemand
313bfd18c1 MINOR: server: export new_server() function
The new_server() function will be useful to create a proxy for the
master-worker.
2018-10-28 13:51:38 +01:00
William Lallemand
7e1299bb3a REORG: mworker: move struct mworker_proc to global.h
Move the definition of the mworker_proc structure in types/global.h.
2018-10-28 13:51:38 +01:00
William Lallemand
ce83b4a5dd MEDIUM: mworker: each worker socketpair is a CLI listener
The init code of the mworker_proc structs has been moved before the
init of the listeners.

Each socketpair is now connected to a CLI within the workers, which
allows the master to access their CLI.

The inherited flag of the worker side socketpair is removed so the
socket can be closed in the master.
2018-10-28 13:51:38 +01:00
William Lallemand
f1a62860c8 MINOR: mworker: number of reload in the life of a worker
This patch adds a field in the mworker_proc structure which contains how
much time the master reloaded during the life of a worker.
2018-10-28 13:51:38 +01:00
Willy Tarreau
908d26fd03 MINOR: stream-int: don't needlessly call si_cs_send() in si_cs_process()
There's a call there to si_cs_send() while we're supposed to come from
si_cs_io_cb() which has just done it. But in fact we can also come here
as a lower layer callback from ->wake() after a connection is established.
Since most of the time we'll end up here with either no data in the buffer
or a blocked output, let's simply check if we're already susbcribed to send
events before calling si_cs_send().
2018-10-28 13:50:02 +01:00
Willy Tarreau
0dfccb20f5 MINOR: stream-int: make stream_int_notify() not wake the tasklet up
stream_int_notify() is I/O agnostic and should not wake up the tasklet,
it's up to si_cs_process() to do that, just like si_applet_wake_cb()
does it for the applet.
2018-10-28 13:50:01 +01:00
Willy Tarreau
33a09a5f2a MINOR: stream-int: don't needlessly call tasklet_wakeup() in stream_int_chk_snd_conn()
This one was added by commit 53216e7db ("MEDIUM: connections: Don't
directly mess with the polling from the upper layers.") after the
removal of the conditional cs_want_send() call. But after analysis
it turned out that it's not needed since the si_cs_send() call will
either succeed or subscribe.
2018-10-28 13:50:01 +01:00
Willy Tarreau
eafd8ebcfe MEDIUM: stream-int: call si_cs_process() in stream_int_update_conn
Calling si_cs_send() alone is always dangerous because it can result
in the loss of an event if it manages to empty the buffer. Indeed, in
this case it's critical to call si_chk_rcv() on the opposite stream-int.
Given that si_cs_process() takes care of all this, let's call it instead.
All this code could possibly be refined soon to avoid redoing the whole
stream_int_notify() and do it only after a send(), but at the moment it's
not important.
2018-10-28 13:48:06 +01:00
Willy Tarreau
85f890174a MEDIUM: stream-int: make si_update() synchronize flag changes before the I/O
With the new synchronous si_cs_send() at the end of process_stream(),
we're seeing re-appear the I/O layer specific part of the stream interface
which is supposed to deal with I/O event subscription. The only difference
is that now we subscribe to I/Os only after having attempted (and failed)
them.

This patch brings a cleanup in this by reintroducing stream_int_update_conn()
with the send code from process_stream(). However this alone would not be
enough because the flags which are cleared afterwards would result in the
loss of the possible events (write events only at the moment). So the flags
clearing and stream-int state updates are also performed inside si_update()
between the generic code and the I/O specific code. This definitely makes
sense as after this call we can simply check again for channel and SI flag
changes and decide to loop once again or not.
2018-10-28 13:47:00 +01:00
Willy Tarreau
0f8d3ab362 MEDIUM: stream: don't try to send first in process_stream()
The rationale here is that we should never need to try to send() at the
beginning of process_stream() because :
  - if something was pending, it's very unlikely that it was unblocked
    and not sent just between the last poll() and the wakeup instant.
  - if something pending was recently sent, then we don't have anything
    to send anymore.

So at first glance it doesn't seem like there could be any valid case
where trying to send before entering the function brings any benefit.
2018-10-28 13:47:00 +01:00
Willy Tarreau
18e066c2e7 MEDIUM: stream: always call si_cs_recv() after a failed buffer allocation
If a buffer allocation failed, we have SI_FL_WAIT_ROOM set and c_size(buf)
being zero. It's the only moment where we have a new opportunity to try to
allocate this buffer. However we don't want to waste our time trying this
if both are non-null since it indicates missing room without any changed
condition.
2018-10-28 13:47:00 +01:00
Willy Tarreau
581abd3f99 MEDIUM: stream-int: replace channel_alloc_buffer() with si_alloc_ibuf() everywhere
Well that's only 3 places (applet.c, stream_interface.c, hlua.c). This
ensures we always clear SI_FL_WAIT_ROOM before setting it on failure,
so that it is granted that SI_FL_WAIT_ROOM always indicates a lack of
room for doing an operation, including the inability to allocate a
buffer for this.
2018-10-28 13:47:00 +01:00
Willy Tarreau
cda7f3f5c2 MINOR: stream: don't prune variables if the list is empty
The vars_prune() and vars_init() functions involve locking while most of
the time there is no variable at all in streams nor sessions. Let's check
for emptiness before calling these functions. Simply doing this has
increased the multithreaded performance from 1.5 to 5% depending on the
workload.
2018-10-28 13:46:47 +01:00
Lukas Tribus
80512b186f BUG/MINOR: only auto-prefer last server if lb-alg is non-deterministic
While "option prefer-last-server" only applies to non-deterministic load
balancing algorithms, 401/407 responses actually caused haproxy to prefer
the last server unconditionally.

As this breaks deterministic load balancing algorithms like uri, this
patch applies the same condition here.

Should be backported to 1.8 (together with "BUG/MINOR: only mark
connections private if NTLM is detected").
2018-10-27 22:10:32 +02:00
Lukas Tribus
fd9b68c48e BUG/MINOR: only mark connections private if NTLM is detected
Instead of marking all connections that see a 401/407 response private
(for connection reuse), this patch detects a RFC4559/NTLM authentication
scheme and restricts the private setting to those connections.

This is so we can reuse connections with 401/407 responses with
deterministic load balancing algorithms later (which requires another fix).

This fixes the problem reported here by Elliot Barlas :

  https://discourse.haproxy.org/t/unable-to-configure-load-balancing-per-request-over-persistent-connection/3144

Should be backported to 1.8.
2018-10-27 22:10:29 +02:00
Willy Tarreau
ede3d884fc MEDIUM: channel: merge back flags CF_WRITE_PARTIAL and CF_WRITE_EVENT
The behaviour of the flag CF_WRITE_PARTIAL was modified by commit
95fad5ba4 ("BUG/MAJOR: stream-int: don't re-arm recv if send fails") due
to a situation where it could trigger an immediate wake up of the other
side, both acting in loops via the FD cache. This loss has caused the
need to introduce CF_WRITE_EVENT as commit c5a9d5bf, to replace it, but
both flags express more or less the same thing and this distinction
creates a lot of confusion and complexity in the code.

Since the FD cache now acts via tasklets, the issue worked around in the
first patch no longer exists, so it's more than time to kill this hack
and to restore CF_WRITE_PARTIAL's semantics (i.e.: there has been some
write activity since we last left process_stream).

This patch mostly reverts the two commits above. Only the part making
use of CF_WROTE_DATA instead of CF_WRITE_PARTIAL to detect the loss of
data upon connection setup was kept because it's more accurate and
better suited.
2018-10-26 08:32:57 +02:00
Frédéric Lécaille
b80bc273a3 MINOR: shctx: Change max. object size type to unsigned int.
This change is there to prevent implicit conversions when comparing
shctx maximum object sizes with other unsigned values.
2018-10-26 04:54:40 +02:00
Frédéric Lécaille
4eba544e24 MINOR: cache: Avoid usage of atoi() when parsing "max-object-size".
With this patch we avoid parsing "max-object-size" with atoi() and we store its
value as an unsigned int to prevent bad implicit conversion issues especially
when we compare it with others unsigned value (content length).
2018-10-26 04:54:40 +02:00
Frédéric Lécaille
4c8aa117f9 BUG/MINOR: ssl: Wrong usage of shctx_init().
With this patch we check that shctx_init() does not return 0.

Must be backported to 1.8.
2018-10-26 04:54:40 +02:00
Frédéric Lécaille
bc584494e6 BUG/MINOR: cache: Wrong usage of shctx_init().
With this patch we check that shctx_init() does not returns 0.
This is possible if the maxblocks argument, which is passed as an
int, is negative due to an implicit conversion.

Must be backported to 1.8.
2018-10-26 04:54:40 +02:00
Frédéric Lécaille
b9b8b6b6be BUG/MINOR: cache: Crashes with "total-max-size" > 2047(MB).
With this patch we support cache size larger than 2047 (MB) and prevent haproxy from crashing when "total-max-size" is parsed as negative values by atoi().

The limit at parsing time is 4095 MB (UINT_MAX >> 20).

May be backported to 1.8.
2018-10-26 04:54:40 +02:00
Frédéric Lécaille
a2219f5e3b MINOR: cache: Add "max-object-size" option.
This patch adds "max-object-size" option to the cache to limit
the size in bytes of the HTTP objects to be cached. When not provided,
the maximum size of an HTTP object is a 256th of the cache size.
2018-10-24 04:40:03 +02:00
Frédéric Lécaille
b7838afe6f MINOR: shctx: Add a maximum object size parameter.
This patch adds a new parameter to shctx_init() function to be used to
limit the size of each shared object, -1 value meaning "no limit".
2018-10-24 04:39:44 +02:00
Frédéric Lécaille
8df65ae5e2 MINOR: cache: Larger HTTP objects caching.
This patch makes the capable of storing HTTP objects larger than a buffer.
It makes usage of the "block by block shared object allocation" new shctx API.

A new pointer to struct shared_block has been added to the cache applet
context to memorize the next block to be used by the HTTP cache I/O handler
http_cache_io_handler() to emit the data. Another member, named "sent" memorize
the number of bytes already sent by this handler. So, to send an object from cache,
http_cache_io_handler() must be called until "sent" counter reaches the size
of this object.
2018-10-24 04:37:12 +02:00
Frédéric Lécaille
0bec807e08 MINOR: shctx: Shared objects block by block allocation.
This patch makes shctx capable of storing objects in several parts,
each parts being made of several blocks. There is no more need to
walk through until reaching the end of a row to append new blocks.

A new pointer to a struct shared_block member, named last_reserved,
has been added to struct shared_block so that to memorize the last block which was
reserved by shctx_row_reserve_hot(). Same thing about "last_append" pointer which
is used to memorize the last block used by shctx_row_data_append() to store the data.
2018-10-24 04:35:53 +02:00
Willy Tarreau
30f931ead2 BUG/MEDIUM: pools: fix the minimum allocation size
Fred reported a random crash related to the pools. This was introduced
by commit e18db9e98 ("MEDIUM: pools: implement a thread-local cache for
pool entries") because the minimum pool item size should have been
increased to 32 bytes to accommodate the 2 double-linked lists.

No backport is needed.
2018-10-23 14:40:23 +02:00
Willy Tarreau
68ad3a42f7 MINOR: proxy: add a new option "http-use-htx"
This option makes a proxy use only HTX-compatible muxes instead of the
HTTP-compatible ones for HTTP modes. It must be set on both ends, this
is checked at parsing time.
2018-10-23 10:22:36 +02:00
Christopher Faulet
955188d37d BUG/MEDIUM: stream-int: don't set SI_FL_WAIT_ROOM on CF_READ_DONTWAIT
With the previous connection model, when we purposely decided to stop
receiving in order to avoid polling after a complete request was received
for example, it was needed to set SI_FL_WAIT_ROOM to prevent receive
polling from being re-armed. Now with the new subscription-based model
there is no such thing anymore and there is noone to remove this flag
either. Thus if a request takes more than one packet to come in or
spans over too many packets, this flag will cause it to wait forever.
Let's simply remove this flag now.

This patch should not be backported since older versions still need
that this flag is set here to stop receiving.
2018-10-23 10:22:36 +02:00
Christopher Faulet
66943a4903 CLEANUP: http: Remove the unused function http_find_header 2018-10-23 10:22:36 +02:00
Olivier Houchard
31f04e4416 MINOR: stream_interface: Avoid calling si_cs_send/recv if not needed.
Don't bother calling si_cs_send and si_cs_recv if we're either already
subscribe, or if the output buffer is empty for si_cs_send.
2018-10-22 16:05:08 +02:00
Olivier Houchard
d846c267d5 MINOR: h2: Don't run tasks that are waiting to send if mux in full.
We wake up all the streams waiting to send data when we have space available
in the mux buffer. Doing so means we probably wake way too many streams,
because after a few the buffer will probably be full instead. So keep a
list of all the streams that are about to send data, and if we detect that
the buffer is full, unschedule the tasks and put the streams back to the
send_list.
2018-10-21 06:00:13 +02:00
Olivier Houchard
d7bd3e3c4c MINOR: streams: Call tasklet_free() after si_release_endpoint().
Make sure we call tasklet_free() only after si_release_endpoint(), when the
unsubscribe() method has been called, so that we're sure the mux won't
attempt to access the taslet.
2018-10-21 05:59:55 +02:00
Olivier Houchard
53216e7db9 MEDIUM: connections: Don't directly mess with the polling from the upper layers.
Avoid using conn_xprt_want_send/recv, and totally nuke cs_want_send/recv,
from the upper layers. The polling is now directly handled by the connection
layer, it is activated on subscribe(), and unactivated once we got the event
and we woke the related task.
2018-10-21 05:58:40 +02:00
Olivier Houchard
81a15af6bc MINOR: h2: Make sure to return 1 in h2_recv() when needed.
In h2_recv(), return 1 if we have data available, or if h2_recv_allowed()
failed, to be sure h2_process() is called.
Also don't subscribe if our buffer is full.
2018-10-21 05:58:33 +02:00
Olivier Houchard
85b73e9427 BUG/MEDIUM: stream: Make sure polling is right on retry.
When retrying to connect to a server, because the previous connection failed,
make sure if we subscribed to the previous connection, the polling flags will
be true for the new fd.

No backport is needed.
2018-10-21 05:55:32 +02:00
Olivier Houchard
52b946686c BUG/MEDIUM: h2: Close connection if no stream is left an GOAWAY was sent.
When we're closing a stream, is there's no stream left and a goaway was sent,
close the connection, there's no reason to keep it open.

[wt: it's likely that this is needed in 1.8 as well, though it's unclear
 how to trigger this issue, some tests are needed]
2018-10-21 05:53:09 +02:00
Olivier Houchard
8b2c8a7894 BUILD: memory: fix free_list pointer declaration again for atomic CAS
Similary to what's been done in 7a6ad88b02,
take into account that free_list that free_list is a void **, and so use
a void ** too when attempting to do a CAS.
2018-10-21 05:44:38 +02:00
Willy Tarreau
4e7cc3381b BUILD: compiler: rename __unreachable() to my_unreachable()
Olivier reported that on FreeBSD __unreachable is already defined
and causes build warnings. Let's rename it then.
2018-10-20 17:45:48 +02:00
Willy Tarreau
ed72d82827 MEDIUM: time: measure the time stolen by other threads
The purpose is to detect if threads or processes are competing for the
same CPU. This can happen when threads are incorrectly bound, or after a
reload if the previous process still has an important activity. With
threads this situation is problematic because a preempted thread holding
a lock will block other ones waiting for this lock to be released.

A first attempt consisted in measuring the cumulated lost time more
precisely but the system's scheduler is smart enough to try to limit the
thread preemption rate by mostly context switching during poll()'s blank
periods, so most of the time lost is not seen. In essence this is good
because it means a thread is not preempted with a lock held, and even
regarding the rendez-vous point it cannot prevent the other ones from
making progress. But still it happens tens to hundreds of times per
second that a thread might be preempted, so it's still possible to detect
that the situation is happening, thus it's interesting to measure and
report its frequency.

Each time we enter the poller, we check the CPU time spent working and
see if we've lost time doing something else. To limit false positives,
we're only interested in losses of 500 microseconds or more (i.e. half
a clock tick on a 1 kHz system). If so, it indicates that some time was
stolen by another thread or process. Note that we purposely store some
sub-millisecond counters so that under heavy traffic with a 1 kHz clock,
it's still possible to measure something without being subject to the
risk of rounding errors (i.e. if exactly 1 ms is stolen it's possible
that the time difference could often be slightly lower).

This counter of lost CPU time slots time is reported in "show activity"
in numbers of milliseconds of CPU lost per second, per 15s, and total
over the process' life. By definition, the per-second counter cannot
report values larger than 1000 per thread per second and the 15s one
will be limited to 15000/s in the worst case, but it's possible that
peak values exceed such thresholds after long pauses.
2018-10-19 08:51:59 +02:00
Willy Tarreau
ac6c8805be BUILD: memory: fix pointer declaration for atomic CAS
The calls to HA_ATOMIC_CAS() on the lockfree version of the pool allocator
were mistakenly done on (void*) for the old value instead of (void **).
While this has no impact on "recent" gcc, it does have one for gcc < 4.7
since the CAS was open coded and it's not possible to assign a temporary
variable of type "void".

No backport is needed, this only affects 1.9.
2018-10-18 16:12:28 +02:00
Willy Tarreau
7e9c4ae4de MINOR: poller: move time and date computation out of the pollers
By placing this code into time.h (tv_entering_poll() and tv_leaving_poll())
we can remove the logic from the pollers and prepare for extending this to
offer more accurate time measurements.
2018-10-17 19:59:43 +02:00
Willy Tarreau
f37ba94768 MINOR: fd: centralize poll timeout computation in compute_poll_timeout()
The 4 pollers all contain the same code used to compute the poll timeout.
This is pointless, let's centralize this into fd.h. This also gets rid of
the useless SCHEDULER_RESOLUTION macro which used to work arond a very old
linux 2.2 bug causing select() to wake up slightly before the timeout.
2018-10-17 19:59:43 +02:00
Olivier Houchard
33992267aa MINOR: peers: use defines instead of enums to appease clang.
Clang (rightfully) warns that we're trying to set chars to values >= 128.
Use defines with hex values instead of an enum to address this.
2018-10-16 19:31:15 +02:00
Olivier Houchard
3332090a2d MINOR: cfgparse: Write 130 as 128 as 0x82 and 0x80.
Write 130 and 128 as 8x82 and 0x80, to avoid warnings about casting from
int to size. "check_req" should probably be unsigned, but it's hard to do so.
2018-10-16 19:28:35 +02:00
Willy Tarreau
5dfb6c4cc9 CLEANUP: state-file: make the path concatenation code a bit more consistent
There are as many ways to build the globalfilepathlen variable as branches
in the if/then/else, creating lots of confusion. Address the most obvious
parts, but some polishing definitely is still needed.
2018-10-16 19:26:12 +02:00
Olivier Houchard
17f8b90736 MINOR: server: Use memcpy() instead of strncpy().
Use memcpy instead of strncpy, strncpy buys us nothing, and gcc is being
annoying.
2018-10-16 19:22:20 +02:00
Willy Tarreau
b059b894cd BUILD: lua: silence some compiler warnings after WILL_LJMP
These ones are on error paths that are properly handled by luaL_error()
which does a longjmp() but the compiler cannot know it. By adding an
__unreachable() statement in WILL_LJMP(), there is no ambiguity anymore.

This may be backported to 1.8 but these previous patches are needed first :
  - BUILD: compiler: add a new statement "__unreachable()"
  - MINOR: lua: all functions calling lua_yieldk() may return
  - BUILD: lua: silence some compiler warnings about potential null derefs (#2)
2018-10-16 17:57:36 +02:00
Willy Tarreau
9635e03c41 MINOR: lua: all functions calling lua_yieldk() may return
There was a mistake when tagging functions which always use longjmp and
those which may use it in that all those supposed to call lua_yieldk()
may return without calling longjmp. Thus they must not use WILL_LJMP()
but MAY_LJMP(). It has zero impact on the code emitted as such, but
prevents other fixes from being properly implemented : this was the
cause of the previous failure with the __unreachable() calls.

This may be backported to older versions. It may or may not apply
well depending on the context, though the change simply consists in
replacing "WILL_LJMP(hlua_yieldk" with "MAY_LJMP(hlua_yieldk", and
same with the single call to lua_yieldk() in hlua_yieldk().
2018-10-16 17:56:20 +02:00
Willy Tarreau
e09101e8d9 BUILD: lua: silence some compiler warnings about potential null derefs (#2)
Here we make sure that appctx is always taken from the unchecked value
since we know it's an appctx, which explains why it's immediately
dereferenced. A missing test was added to ensure that task_new() does
not return a NULL.

This may be backported to 1.8.
2018-10-16 17:39:05 +02:00
Willy Tarreau
526aed219f Revert "BUILD: lua: silence some compiler warnings about potential null derefs"
This reverts commit f1ffb39b61.

It breaks Lua causing some timeouts. Removing the __unreachable() statement
from WILL_LJMP() fixes it. It's very strange and unclear whether it's an
issue with WILL_LJMP() not fullfilling its promise of not returning, if
the code emitted with __unreachable() gets broken, or anything else. Let's
revert this for now.
2018-10-16 17:32:55 +02:00
Willy Tarreau
a9c0252b2e BUG/MEDIUM: threads: fix thread_release() at the end of the rendez-vous point
There is a bug in this function used to release other threads. It leaves
the current thread marked as harmless. If after this another thread does
a thread_isolate(), but before the first one reaches poll(), the second
thread will believe it's alone while it's not.

This must be backported to 1.8 since the rendez-vous point was merged
into 1.8.14.
2018-10-16 17:03:16 +02:00
Willy Tarreau
e18db9e984 MEDIUM: pools: implement a thread-local cache for pool entries
Each thread now keeps the last ~512 kB of freed objects into a local
cache. There are some heuristics involved so that a specific pool cannot
use more than 1/8 of the total cache in number of objects. Tests have
shown that 512 kB is an optimal size on a 24-thread test running on a
dual-socket machine, resulting in an overall 7.5% performance increase
and a cache miss ratio reducing from 19.2 to 17.7%. Anyway it seems
pointless to keep more than an L2 cache, which probably explains why
sizes between 256 and 512 kB are optimal.

Cached objects appear in two lists, one per pool and one LRU to help
with fair eviction. Currently there is no way to check each thread's
cache state nor to flush it. This cache cannot be disabled and is
enabled as soon as the lockless pools are enabled (i.e.: threads are
enabled, no pool debugging is in use and the CPU supports a double word
CAS).
2018-10-16 13:46:08 +02:00
Willy Tarreau
0a93b6413f MINOR: pools: allocate most memory pools from an array
For caching it will be convenient to have indexes associated with pools,
without having to dereference the pool itself. One solution could consist
in replacing all pool pointers with integers but this would limit the
number of allocatable pools. Instead here we allocate the 32 first pools
from a pre-allocated array whose base address is known so that it's trivial
to convert a pool to an index in this array. Pools that cannot fit there
will be allocated normally.
2018-10-16 10:29:26 +02:00
Willy Tarreau
8d8747abe0 OPTIM: tasks: group all tree roots per cache line
Currently we have per-thread arrays of trees and counts, but these
ones unfortunately share cache lines and are accessed very often. This
patch moves the task-specific stuff into a structure taking a multiple
of a cache line, and has one such per thread. Just doing this has
reduced the cache miss ratio from 19.2% to 18.7% and increased the
12-thread test performance by 3%.

It starts to become visible that we really need a process-wide per-thread
storage area that would cover more than just these parts of the tasks.
The code was arranged so that it's easy to move the pieces elsewhere if
needed.
2018-10-15 19:06:13 +02:00
Willy Tarreau
b20aa9eef3 MAJOR: tasks: create per-thread wait queues
Now we still have a main contention point with the timers in the main
wait queue, but the vast majority of the tasks are pinned to a single
thread. This patch creates a per-thread wait queue and queues a task
to the local wait queue without any locking if the task is bound to a
single thread (the current one) otherwise to the shared queue using
locking. This significantly reduces contention on the wait queue. A
test with 12 threads showed 11 ms spent in the WQ lock compared to
4.7 seconds in the same test without this change. The cache miss ratio
decreased from 19.7% to 19.2% on the 12-thread test, and its performance
increased by 1.5%.

Another indirect benefit is that the average queue size is divided
by the number of threads, which roughly removes log(nbthreads) levels
in the tree and further speeds up lookups.
2018-10-15 19:04:40 +02:00
Willy Tarreau
87d54a9a6d MEDIUM: fd/threads: only grab the fd's lock if the FD has more than one thread
The vast majority of FDs are only seen by one thread. Currently the lock
on FDs costs a lot because it's touched often, though there should be very
little contention. This patch ensures that the lock is only grabbed if the
FD is shared by more than one thread, since otherwise the situation is safe.
Doing so resulted in a 15% performance boost on a 12-threads test.
2018-10-15 13:25:06 +02:00
Willy Tarreau
9504dd64c6 MINOR: config: use atleast2() instead of my_popcountl() where relevant
Quite often we used my_popcountl() just to check for > 1 bit set. Now
we have an easier solution, let's use it.
2018-10-15 13:25:06 +02:00
Willy Tarreau
d944344f01 BUILD: peers: check allocation error during peers_init_sync()
peers_init_sync() doesn't check task_new()'s return value and doesn't
return any result to indicate success or failure. Let's make it return
an int and check it from the caller.

This can be backported as far as 1.6.
2018-10-15 13:24:43 +02:00
Willy Tarreau
848522f05d BUILD: stick-table: make sure not to fail on task_new() during initialization
Gcc reports a potential null-deref error in the stick-table init code.
While not critical there, it's trivial to fix. This check has been
missing since 1.4 so this fix can be backported to all supported versions.
2018-10-15 13:24:43 +02:00
Willy Tarreau
a8825520b7 BUILD: ssl: fix another null-deref warning in ssl_sock_switchctx_cbk()
This null-deref cannot happen either as there necesarily is a listener
where this function is called. Let's use __objt_listener() to address
this.

This may be backported to 1.8.
2018-10-15 13:24:43 +02:00
Willy Tarreau
b729077710 BUILD: ssl: fix null-deref warning in ssl_fc_cipherlist_str sample fetch
Gcc 6.4 detects a potential null-deref warning in smp_fetch_ssl_fc_cl_str().
This one is not real since already addressed a few lines above. Let's use
__objt_conn() instead of objt_conn() to avoid the extra test that confuses
it.

This could be backported to 1.8.
2018-10-15 13:24:43 +02:00
Willy Tarreau
f1ffb39b61 BUILD: lua: silence some compiler warnings about potential null derefs
These ones are on error paths that are properly handled by luaL_error()
which does a longjmp() but the compiler cannot know it. By adding an
__unreachable() statement in WILL_LJMP(), there is no ambiguity anymore.

This may be backported to 1.8 but the previous patch (BUILD: compiler:
add a new statement "__unreachable()") is needed for this.
2018-10-15 13:24:43 +02:00
Willy Tarreau
e5f229e639 BUG/MEDIUM: stream: don't crash on out-of-memory
In case pool_alloc() fails in stream_new(), we try to detach the stream
from the list before it has been added, dereferencing a NULL. In order
to fix it, simply move the LIST_DEL call upwards.

This must be backported to 1.8.
2018-10-15 13:24:43 +02:00
William Lallemand
dd319a5b1d BUG/MEDIUM: mworker: don't poll on LI_O_INHERITED listeners
The listeners with the LI_O_INHERITED flag were deleted but not unbound
which is a problem since we have a polling in the master.

This patch unbind every listeners which are not require for the master,
but does not close the FD of those that have a LI_O_INHERITED flag.
2018-10-12 19:30:18 +02:00
Willy Tarreau
b3fb56db10 MINOR: h2: add a new flag to quickly distinguish front vs back connection
We will need to know if a mux was created for a front or a back
connection and once it's established it's much harder, so let's
introduce H2_CF_IS_BACK for this.
2018-10-12 16:58:41 +02:00
Willy Tarreau
a8e4954856 MINOR: h2: split h2c_stream_new() into h2s_new() + h2c_frt_stream_new()
For backend connections we'll have to initialize streams but not allocate
conn_streams since they'll already be there. Thus this patch splits the
h2c_stream_new() function into one dedicated to allocation of a new stream
and another one supposed to attach this stream to an existing frontend
connection.
2018-10-12 16:58:01 +02:00
Willy Tarreau
0b37d658e6 MINOR: h2: retrieve the front proxy from the caller instead of the session
Till now in order to figure the timeouts, we used to retrieve the proxy
from the session's owner, but the new API provides it so it's better to
simply take it from the caller at init time. We take this opportunity to
store the pointer to the proxy into the h2 connection so that we can
reuse it later when needed.
2018-10-12 16:58:01 +02:00
Willy Tarreau
7dc24e49cc MINOR: h2: unify the mux init function
The init function was split into the mux init and the front init, but it
appears that most of the code will be common between the two sides when
implementing the backend init. Thus let's simply make this a unique
h2_init() function.
2018-10-12 16:58:01 +02:00
Willy Tarreau
6bf641a61d MINOR: h2: don't try to send data before preface
h2_snd_buf() must not accept to send data if the preface was not yet
received nor sent. At the moment it doesn't happen but it can with
server-side H2.
2018-10-12 16:58:01 +02:00
Willy Tarreau
7f0cc49645 CLEANUP: h2: rename h2c_snd_settings() to h2c_send_settings()
It's the only function not called h2c_send_<something>() and it took me
a while to find it.
2018-10-12 16:58:01 +02:00
Willy Tarreau
ab0e1da3a9 MEDIUM: h2: stop relying on H2_SS_IDLE / H2_SS_CLOSED
At a few places we check these states to detect if a stream has valid
data/errcode or is one of the two dummy streams (idle or closed). It
will become problematic for outgoing streams as it will not be possible
to report errors for example since the stream will switch from IDLE
state only after sending a HEADERS frame.

There is a safer solution consisting in checking the stream ID, which
may only be zero in the dummy streams. This patch changes the test to
only rely on the stream ID.
2018-10-12 16:58:01 +02:00
Willy Tarreau
9fa267dada MINOR: log: make sess_log() support sess=NULL
At many places in muxes we'll have to add tests to check if the
connection is front or back before deciding to log. Instead let's
centralize this test in sess_log() to simply do nothing when sess=NULL.
2018-10-12 16:58:01 +02:00
Christopher Faulet
25da9e34f1 MINOR: h1: Add the flag H1_MF_NO_PHDR to not add pseudo-headers during parsing
Some pseudo-headers are added during the headers parsing, mainly for the mux
H2. With this flag, it is possible to not add them. This avoid some boring
filtering in the mux H1.
2018-10-12 16:15:18 +02:00
Christopher Faulet
1dc2b49556 MINOR: h1: Change the union h1_sl to use indirect strings to store infos
Instead of using offsets relating to the parsed buffer to store start line
infos, we now use indirect strings. So now, these infos remain valid only if the
origin buffer remains untouched. But it's not a real problem because this union
is used during the parsing and never stored to a later use.
2018-10-12 16:14:57 +02:00
Christopher Faulet
ff08a92797 MINOR: h1: Add EOH marker during headers parsing
When headers parsing ends, a pseudo header with an empty name and an empty value
is added to the array of parsed headers to mark its end. It is convenient to
loop on this array, but not really useful if we want remove the last header or
add a new one, because we don't really know where is the last CRLF (the empty
line ending the headers block). So now, instead the name of this pseudo header
points on this last CRLF. Its length is still 0 and its value is still empty, so
loops on the array remains unchanged.
2018-10-12 16:08:27 +02:00
Christopher Faulet
315b39c391 MINOR: http: Use same flag for httpclose and forceclose options
Since keep-alive mode is the default mode, the passive close has disappeared,
and in the code, httpclose and forceclose options are handled the same way:
connections with the client and the server are closed as soon as the request and
the response are received and missing "Connection: close" header is added in
each direction.

So to make things clearer, forceclose is now an alias for httpclose. And
httpclose is explicitly an active close. So the old passive close does not exist
anymore. Internally, the flag PR_O_HTTP_PCL has been removed and PR_O_HTTP_FCL
has been replaced by PR_O_HTTP_CLO. In HTTP analyzers, the checks done to find
the right mode to use, depending on proxies options and "Connection: " header
value, have been simplified.

This should only be a cleanup and no changes are expected.
2018-10-12 16:07:56 +02:00
Christopher Faulet
4212a30ad1 MEDIUM: http: Ignore http-tunnel option on backend
This option is frontends specific, so there is no reason to support it on
backends. So now, it is ignored if it is set on a backend and a warning is
emitted during the startup. The change is quite trivial, but the commit is
tagged as MEDIUM because it is a small breakage with previous versions and
configurations using this options could emit a warning now.
2018-10-12 16:05:53 +02:00
Christopher Faulet
98db9768e5 MEDIUM: http: Ignore http-pretend-keepalive option on frontend
This option is backends specific, so there is no reason to support it on
frontends. So now, it is ignored if it is set on a frontend and a warning is
emitted during the startup. The change is quite trivial, but the commit is
tagged as MEDIUM because it is a small breakage with previous versions and
configurations using this options could emit a warning now.
2018-10-12 16:01:26 +02:00
Christopher Faulet
10079f59b7 MINOR: http: Export some functions and do cleanup to prepare HTTP refactoring
To ease the refactoring, the function "http_header_add_tail" have been
remove. Now, "http_header_add_tail2" is always used. And the function
"capture_headers" have been renamed into "http_capture_headers". Finally, some
functions have been exported.
2018-10-12 16:00:45 +02:00
Olivier Houchard
4fdec7aafa BUG/MEDIUM: stream: Make sure to unsubscribe before si_release_endpoint.
Make sure we unsubscribe from events before si_release_endpoint destroys
the conn_stream, or it will be never called. To do so, move the call to
unsubscribe to si_release_endpoint() directly.

This is 1.9-specific and shouldn't be backported.
2018-10-11 17:16:43 +02:00
Emeric Brun
c8c0ed91cb BUG/MEDIUM: mworker: segfault receiving SIGUSR1 followed by SIGTERM.
This bug appeared only if nbthread > 1. Handling the pipe with the
master, multiple threads of the same worker could process the deinit().

In addition, deinit() was called while some other threads were still
performing some tasks.

This patch assign the handler of the pipe with master to only the first
thread and removes the call to deinit() before exiting with an error.

This patch should be backported in v1.8.
2018-10-11 16:29:38 +02:00
Olivier Houchard
dddfe31265 BUG/MEDIUM: h2: Make sure we're not in the send list on flow control.
If we can't send data for a stream because of its flow control, make sure
not to put it in the send_list, until the flow control lets it send again.

This is specific to 1.9, and should not be backported.
2018-10-11 15:35:05 +02:00
Olivier Houchard
fa8aa867b9 MEDIUM: connections: Change struct wait_list to wait_event.
When subscribing, we don't need to provide a list element, only the h2 mux
needs it. So instead, Add a list element to struct h2s, and use it when a
list is needed.
This forces us to use the unsubscribe method, since we can't just unsubscribe
by using LIST_DEL anymore.
This patch is larger than it should be because it includes some renaming.
2018-10-11 15:34:39 +02:00
Olivier Houchard
83a0cd8a36 MINOR: connections: Introduce an unsubscribe method.
As we don't know how subscriptions are handled, we can't just assume we can
use LIST_DEL() to unsubscribe, so introduce a new method to mux and connections
to do so.
2018-10-11 15:34:21 +02:00
mildis
5ab01cb011 BUG/MINOR: checks: queues null-deref
queues can be null if calloc() failed.
Bypass free* calls when calloc did fail.
2018-10-11 15:17:47 +02:00
mildis
cd2d7de44e BUG/MINOR: h2: null-deref
h2c can be null if pool_alloc() failed.
Bypass tasklet_free and pool_free if pool_alloc did fail.
2018-10-11 15:17:27 +02:00
Emeric Brun
7ad43e7928 BUG/MEDIUM: Cur/CumSslConns counters not threadsafe.
CurSslConns inc/dec operations are not threadsafe. The unsigned CurSslConns
counter can wrap to a negative value. So we could notice connection rejects
because of MaxSslConns limit artificially exceeded.

CumSslConns inc operation are also not threadsafe so we could miss
some connections and show inconsistenties values compared to CumConns.

This fix should be backported to v1.8.
2018-10-10 18:05:33 +02:00
Willy Tarreau
0b25d5e99f MEDIUM: task: perform a single tree lookup per run queue batch
The run queue is designed to perform a single tree lookup and to
use multiple passes to eb32sc_next(). The scheduler rework took a
conservative approach first but this is not needed anymore and it
increases the processing cost of process_runnable_tasks() and even
the time during which the RQ lock is held if the global queue is
heavily loaded. Let's simply move the initial lookup to the entry
of the loop like the previous scheduler used to do. This has reduced
by a factor of 5.5 the number of calls to eb32sc_lookup_get() there.
2018-10-10 16:42:46 +02:00
Dirkjan Bussink
ff57f1bbcf CLEANUP: stick-tables: Remove unneeded double (()) around conditional clause
In the past this conditional had multiple conditionals which is why the
additional parentheses were needed. The conditional was simplified but
the duplicate parentheses were not cleaned up.
2018-10-09 15:09:59 +02:00
Dirkjan Bussink
c26c72d89b CLEANUP: h1: Fix debug warnings for h1 headers
The wrong method was used to debug the h1m state here. This fixes both
the signature of the h1m method and also fixes the invocation to be
correct.
2018-10-09 15:09:29 +02:00
Dirkjan Bussink
1d323de5e1 CLEANUP: haproxy: Remove unused variable
Looking at the code, this variable is no longer used and referenced
nowhere. That means it can be safely removed.
2018-10-09 15:09:25 +02:00
Dirkjan Bussink
415150f764 MEDIUM: ssl: add support for ciphersuites option for TLSv1.3
OpenSSL released support for TLSv1.3. It also added a separate function
SSL_CTX_set_ciphersuites that is used to set the ciphers used in the
TLS 1.3 handshake. This change adds support for that new configuration
option by adding a ciphersuites configuration variable that works
essentially the same as the existing ciphers setting.

Note that it should likely be backported to 1.8 in order to ease usage
of the now released openssl-1.1.1.
2018-10-08 19:20:13 +02:00
Olivier Houchard
363c745569 BUG/MEDIUM: buffers: Make sure we don't wrap in ci_insert_line2/b_rep_blk.
In ci_insert_line2() and b_rep_blk(), we can't afford to wrap, so don't use
b_tail() to check if we do, use __b_tail() instead.

This should be backported to previous versions.
2018-10-08 16:11:54 +02:00
Emmanuel Hocdet
747ca61693 MINOR: ssl: generate-certificates for BoringSSL 2018-10-08 09:42:34 +02:00
Emmanuel Hocdet
a9b84028e6 MINOR: ssl: cleanup old openssl API call
For generate-certificates, X509V3_EXT_conf is used but it's an old API
call: X509V3_EXT_nconf must be preferred. Openssl compatibility is ok
because it's inside #ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME, introduce 5
years after X509V3_EXT_nconf.
2018-10-08 09:42:28 +02:00
Willy Tarreau
45efc07cb5 BUG/MEDIUM: h2: make h2_stream_new() return an error on memory allocation failure
Commit 8ae735da0 ("MEDIUM: mux_h2: Revamp the send path when blocking.")
added a tasklet allocation in h2_stream_new(), however the error exit path
fails to reset h2s in case the tasklet cannot be allocated, resulting in
the h2s pointer to be returned as valid to the caller. Let's readjust the
exit path to always return NULL on error and to always log as well (since
there is no reason for not logging on such important errors).

No backport is needed, this is strictly 1.9-dev.
2018-10-03 18:30:39 +02:00
Willy Tarreau
0f3835878d BUG/MEDIUM: h2: check that the connection is still valid at the end of init()
Since commit 7505f94f9 ("MEDIUM: h2: Don't use a wake() method anymore."),
the H2 mux's init() calls h2_process(). But this last one may detect an
early error and call h2_release(), destroying the connection, and return
-1. At this point we're screwed because the caller will still dereference
the connection for various things ranging from the configuration of the
proxy protocol header to the retries. We could simply return -1 here upon
failure but that's not enough since the stream layer really needs to keep
its connection structure allocated (to clean it up in session_kill_embryonic
or for example because it holds the destination address to reconnect to
when the connection goes to the backend). Thus the correct solution here is
to only schedule a wakeup of the I/O callback so that the init succeeds,
and that the connection is only handled later.

No backport is needed, this is 1.9-specific.
2018-10-03 18:09:58 +02:00
Willy Tarreau
33dd4ef812 BUG/MINOR: backend: check that the mux installed properly
The return value from conn_install_mux() was not checked, so if an
inconsistency happens in the code, or a memory allocation fails while
initializing the mux, we can crash while using an uninitialized mux.
In practice the code inconsistency does not really happen since we
cannot configure such a situation, except during development, but
the out of memory condition could definitely happen.

This should be backported to 1.8 (the code is a bit different there,
there are two calls to conn_install_mux()).
2018-10-03 10:24:05 +02:00
Willy Tarreau
491cec20be CLEANUP: http: remove some leftovers from recent cleanups
The prototypes of functions find_hdr_value_end(), extract_cookie_value()
and http_header_match2() were still in proto_http.h while some of them
don't exist anymore and the others were just moved. Let's remove them.
In addition, da.c was updated to use http_extract_cookie_value() which
is the correct one.
2018-10-02 18:37:27 +02:00
Willy Tarreau
61c112aa5b REORG: http: move HTTP rules parsing to http_rules.c
These ones are mostly called from cfgparse.c for the parsing and do
not depend on the HTTP representation. The functions's prototypes
were moved to proto/http_rules.h, making this file work exactly like
tcp_rules. Ideally we should stop calling these functions directly
from cfgparse and register keywords, but there are a few cases where
that wouldn't work (stats http-request) so it's probably not worth
trying to go this far.
2018-10-02 18:28:05 +02:00
Willy Tarreau
79e57336b5 REORG: http: move the code to different files
The current proto_http.c file is huge and contains different processing
domains making it very difficult to work on an alternative representation.
This commit moves some parts to other files :

  - ACL registration code => http_acl.c
    This code only creates some ACL mappings and doesn't know anything
    about HTTP nor about the representation. This code could even have
    moved to acl.c but it was not worth polluting it again.

  - HTTP sample conversion => http_conv.c
    This code doesn't depend on the internal representation but definitely
    manipulates some HTTP elements, such as dates. It also has access to
    captures.

  - HTTP sample fetching => http_fetch.c
    This code does depend entirely on the internal representation but is
    totally independent on the analysers. Placing it into a different
    file will ease the transition to the new representation and the
    creation of a wrapper if required. An include file was created due
    to CHECK_HTTP_MESSAGE_FIRST() being used at various places.

  - HTTP action registration => http_act.c
    This code doesn't directly interact with the messages nor the
    transaction but it does so via some exported http functions like
    http_replace_req_line() or http_set_status() so it will be easier
    to change only this after the conversion.

  - a few very generic parts were found and moved to http.{c,h} as
    relevant.

It is worth noting that the functions moved to these new files are not
referenced anywhere outside of the files and are only called as registered
callbacks, so these files do not even require associated include files.
2018-10-02 18:26:59 +02:00
Ilya Shipitsin
ca56fce8bd BUG/MINOR: connection: avoid null pointer dereference in send-proxy-v2
found by coverity.

[wt: this bug was introduced by commit 404d978 ("MINOR: add ALPN
 information to send-proxy-v2"). It might be triggered by a health
 check on a server using ppv2 or by an applet making use of such a
 server, if at all configurable].

This needs to be backported to 1.8.
2018-10-02 04:07:43 +02:00
Adis Nezirovic
8878f8eb3d MEDIUM: lua: Add stick table support for Lua.
This ads support for accessing stick tables from Lua. The supported
operations are reading general table info, lookup by string/IP key, and
dumping the table.

Similar to "show table", a data filter is available during dump, and as
an improvement over "show table" it's possible to use up to 4 filter
expressions instead of just one (with implicit AND clause binding the
expressions). Dumping with/without filters can take a long time for
large tables, and should be used sparingly.
2018-09-29 20:15:01 +02:00
Olivier Houchard
d48d6d284e BUG/MEDIUM: process_stream(): Don't wake the task if no new data was received.
At the eand of process_stream(), we wake the task if there's something in
the input buffer, after attempting a recv. However this is wrong, and we should
only do so if we received new data. Just check the CF_READ_PARTIAL flag.

This is 1.9-specific and should not be backported.
2018-09-28 15:12:12 +02:00
Olivier Houchard
61d322fa9e BUG/MEDIUM: h2: Wake the task instead of calling h2_recv()/h2_process().
In a number of cases, we may end up recursively calling h2_recv() via
h2_process(), so just wake the tasklet up instead.
2018-09-26 14:21:54 +02:00
Olivier Houchard
21df6cc2f9 MINOR: h2/stream_interface: Reintroduce te wake() method.
For the time being, reintroduce the wake methods, it may be revisited later.h
2018-09-26 14:21:54 +02:00
Olivier Houchard
0e367bbb01 BUG/MEDIUM: process_stream: Don't use si_cs_io_cb() in process_stream().
Instead of using si_cs_io_cb() in process_stream()  use si_cs_send/si_cs_recv
instead, as si_cs_io_cb() may lead to process_stream being woken up when it
shouldn't be, and thus timeout would never get triggered.
2018-09-26 14:21:54 +02:00
Christopher Faulet
ca874b8d92 BUG/MEDIUM: http: Don't parse chunked body if there is no input data
With recent modifications on the buffers API, when a buffer is released (calling
b_free), we replace it by BUF_NULL where the area pointer is NULL. So many
operations, like b_peek, must be avoided on a released or not allocated
buffer. These changes were mainly made in the commit c9fa048 ("MAJOR: buffer:
finalize buffer detachment").

Since this commit, HAProxy can crash during the body parsing of chunked HTTP
messages because there is no check on the channel's buffer in HTTP analyzers
(http_request_forward_body and http_response_forward_body) nor in H1 functions
reponsible to parse chunked content (h1_skip_chunk_crlf & co). If a stream is
woken up after all input data were forwarded, its input channel's buffer is
released (so set to BUF_NULL). In this case, if we resume the parsing of a
chunk, HAProxy crashes.

To fix this issue, we just skip the parsing of chunks if there is no input data
for the corresponding channel. This is only done if the message state is
strickly lower to HTTP_MSG_ENDING.
2018-09-20 14:37:58 +02:00
Willy Tarreau
7f2a44d319 BUG/CRITICAL: hpack: fix improper sign check on the header index value
Tim Düsterhus found using afl-fuzz that some parts of the HPACK decoder
use incorrect bounds checking which do not catch negative values after
a type cast. The first culprit is hpack_valid_idx() which takes a signed
int and is fed with an unsigned one, but a few others are affected as
well due to being designed to work with an uint16_t as in the table
header, thus not being able to detect the high offset bits, though they
are not exposed if hpack_valid_idx() is fixed.

The impact is that the HPACK decoder can be crashed by an out-of-bounds
read. The only work-around without this patch is to disable H2 in the
configuration.

CVE-2018-14645 was assigned to this bug.

This patch addresses all of these issues at once. It must be backported
to 1.8.
2018-09-20 11:45:56 +02:00
Willy Tarreau
7d7ab43a33 BUILD: sockpair: silence a build warning at -Wextra
An invalid null-deref warning is emitted because cmsg is not checked,
though it definitely is valid given the test performed 10 lines above,
but the compiler cannot necessarily guess this. Adding a null test to
the problematic condition is enough to get rid of it and cheap enough.
2018-09-20 11:42:15 +02:00
Willy Tarreau
1e582e5e5c BUILD: backend: fix 3 build warnings related to null-deref at -Wextra
These ones are not valid either since the checks are performed a few
lines above the call. Let's switch to __objt_server() instead.
2018-09-20 11:42:15 +02:00
Willy Tarreau
55e0da664e BUILD: connection: silence a couple of null-deref build warnings at -Wextra
These ones don't need to be checked either.
2018-09-20 11:42:15 +02:00
Willy Tarreau
543abd4027 BUILD: checks: silence a null-deref build warning at -Wextra
Simply don't use cs_conn() on a valid CS.
2018-09-20 11:42:15 +02:00
Willy Tarreau
433c16ffea BUILD: dns: fix null-deref build warning at -Wextra
Like for the other checks, the type is being tested just before calling
objt_{server,dns_srvrq}() so let's use the unguarded version instead to
silence the warning.
2018-09-20 11:42:15 +02:00
Willy Tarreau
1aaf324227 BUILD: log: silent build warnings due to unchecked __objt_{server,applet}
These ones are safe to use there since the same check is performed in
the switch/case they're used it. Let's use the unguarded versions
instead.
2018-09-20 11:42:15 +02:00
Willy Tarreau
b05e48a54d BUILD: http: address a couple of null-deref warnings at -Wextra
These two warnings are caused by the use of objt_server() without
checking its result. These are turned to __objt_server() which is
safe there.
2018-09-20 11:42:15 +02:00
Willy Tarreau
b8d42d0210 BUILD: stream: address null-deref build warnings at -Wextra
These warnings are caused by the improper use of stktable_data_ptr()
whose result is not checked instead of using __stktable_data_ptr().
2018-09-20 11:42:15 +02:00
Willy Tarreau
21ff2c46b7 BUILD: stats: remove build warnings on potential null-derefs
A couple of objt_appctx() could be replaced with their unchecked
equivalent since the pointer is guaranteed and not checked there.
2018-09-20 11:42:15 +02:00
Willy Tarreau
07d94e48d9 BUILD: ssl_sock: remove build warnings on potential null-derefs
When building with -Wnull-dereferences, gcc sees some cases where a
pointer is dereferenced after a check may set it to null. While all of
these are already guarded by either a preliminary test or the code's
construction (eg: listeners code being called only on listeners), it
cannot be blamed for not "seeing" this, so better use the unguarded
calls everywhere this happens, particularly after checks. This is a
step towards building with -Wextra.
2018-09-20 11:42:15 +02:00
Willy Tarreau
c2b7f80a91 BUG/MINOR: cli: make sure the "getsock" command is only called on connections
Theorically nothing would prevent a front applet form connecting to a stats
socket, and if a "getsock" command was issued, it would cause a crash. Right
now nothing in the code does this so in its current form there is no impact.

It may or may not be backported to 1.8.
2018-09-20 11:42:15 +02:00
Christopher Faulet
2912f87443 BUG/MEDIUM: h1: Really skip all updates when incomplete messages are parsed
In h1_headers_to_hdr_list, when an incomplete message is parsed, all updates
must be skipped until the end of the message is found. Then the parsing is
restarted from the beginning. But not all updates were skipped, leading to
invalid rewritting or segfault.

No backport is needed.
2018-09-19 15:08:05 +02:00
Dragan Dosen
f147479bd5 BUG/MEDIUM: patterns: fix possible double free when reloading a pattern list
A null pointer assignment was missing after free() in function
pat_ref_reload() which can lead to segfault.

This bug was introduced in commit b5997f7 ("MAJOR: threads/map: Make
acls/maps thread safe").

Must be backported to 1.8.
2018-09-19 06:46:51 +02:00
Willy Tarreau
73373ab43a MEDIUM: h1: deduplicate the content-length header
Just like we used to do in proto_http, we now check that each and every
occurrence of the content-length header field and each of its values are
exactly identical, and we normalize the header to return the last value
of the first header with spaces trimmed.
2018-09-14 19:04:28 +02:00
Willy Tarreau
2557f6a3e2 MEDIUM: h1: better handle transfer-encoding vs content-length
The transfer-encoding header processing was a bit lenient in this part
because it was made to read messages already validated by haproxy. We
absolutely need to reinstate the strict processing defined in RFC7230
as is currently being done in proto_http.c. That is, transfer-encoding
presence alone is enough to cancel content-length, and must be
terminated by the "chunked" token, except in the response where we
can fall back to the close mode if it's not last.

For this we now use a specific parsing function which updates the
flags and we introduce a new flag H1_MF_XFER_ENC indicating that the
transfer-encoding header is present.

Last, if such a header is found, we delete all content-length header
fields found in the message.
2018-09-14 17:40:35 +02:00
Willy Tarreau
2ea6bb5c31 MINOR: h1: add headers to the list after controls, not before
This will ease removal/skipping of duplicates such as content-length.
2018-09-14 17:40:35 +02:00
Bertrand Jacquin
874a35cb55 DOC: Fix typos in lua documentation 2018-09-14 09:31:34 +02:00
Willy Tarreau
98f5cf7a59 MINOR: h1: parse the Connection header field
The new function h1_parse_connection_header() is called when facing a
connection header in the generic parser, and it will set up to 3 bits
in h1m->flags indicating if at least one "close", "keep-alive" or "upgrade"
tokens was seen.
2018-09-13 14:52:31 +02:00
Willy Tarreau
ba5fbca33f MINOR: h1: report in the h1m struct if the HTTP version is 1.1 or above
This will be needed for the mux to know how to process the Connection
header, and will save it from having to re-parse the request line since
it's captured on the fly.
2018-09-13 14:34:09 +02:00
Willy Tarreau
db72da0432 BUG/MINOR: h1: don't consider the status for each header
While it was possible to consider the status before parsing response
headers, it's wrong to do it for request headers and could lead to
random behaviours due to this status matching other fields instead.
Additionnally there is little to no value in doing this for each and
every new header field. It's much better to reset the content-length
at once in the callerwhen seeing such statuses (which currently is only
the H2 mux).

No backport is needed, this is purely 1.9.
2018-09-13 14:30:23 +02:00
Willy Tarreau
b5b7d4a532 BUG/MAJOR: h2: reset the parser's state on mux buffer full
The h2 parser has this specificity that if it cannot send the headers
frame resulting from the headers it just parsed, it needs to drop it
and parse it again later. Since commit 8852850 ("MEDIUM: h1: let the
caller pass the initial parser's state"), when this happens the parser
remains in the data state and the headers are not parsed again next
time, resulting in a parse error. Let's reset the parser on exit there.

No backport is needed.
2018-09-12 18:55:29 +02:00
Olivier Houchard
70d0d18d41 BUG/MEDIUM: h2: Don't forget to set recv_wait_list to NULL in h2_detach.
If we're detaching the conn_stream, and it was subscribed to be waken up
when more data was available to receive, unsubscribe it.

No backport is needed.
2018-09-12 18:55:25 +02:00
Olivier Houchard
251f6a23ad BUG/MEDIUM: h2: Don't forget to empty the wait lists on destroy.
Empty both send_list and fctl_list when destroying the h2 context, so that
if we're freeing the stream after, it doesn't try to remove itself from the
now-deleted list.

No backport is needed.
2018-09-12 18:55:18 +02:00
Willy Tarreau
175a2bb507 MINOR: connection: pass the proxy when creating a connection
Till now it was very difficult for a mux to know what proxy it was
working for. Let's pass the proxy when the mux is instanciated at
init() time. It's not yet used but the H1 mux will definitely need
it, just like the H2 mux when dealing with backend connections.
2018-09-12 17:39:22 +02:00
Willy Tarreau
eb528db60b MINOR: h1: add H1_MF_TOLOWER to decide when to turn header names to lower case
The h1 parser used to systematically turn header field names to lower
case because it was designed for H2. Let's add a flag which is off by
default to condition this behaviour so that when using it from an H1
parser it will not affect the message.
2018-09-12 17:38:26 +02:00
Willy Tarreau
c2ab9f5163 MEDIUM: h1: implement the request parser as well
The original H1 request parsing code was reintroduced into the generic
H1 parser so that it can be used regardless of the direction. If the
parser is interrupted and restarts, it makes use of the H1_MF_RESP
flag to decide whether to re-parse a request or a response. While
parsing the request, the method is decoded and set into the start line
structure.
2018-09-12 17:38:25 +02:00
Willy Tarreau
11da5674c3 MINOR: h1: remove the HTTP status from the H1M struct
It has nothing to do there and is not used from there anymore, let's
get rid of it.
2018-09-12 17:38:25 +02:00
Willy Tarreau
9c5e22e436 MINOR: h2: store the HTTP status into the H2S, not the H1M
The HTTP status is not relevant to the H1 message but to the H2 stream
itself. It used to be placed there by pure convenience but better move
it before it's too hard to remove.
2018-09-12 17:38:25 +02:00
Willy Tarreau
001823c304 MEDIUM: h1: remove the useless H1_MSG_BODY state
This state was only a delimiter between headers and body but it now
causes more harm than good because it requires someone to change it.
Since the H1 parser knows if we're in DATA or CHUNK_SIZE, simply let
it set the right next state so that h1m->state constantly matches
what is expected afterwards.
2018-09-12 17:38:25 +02:00
Willy Tarreau
4c34c0e74a MEDIUM: h1: support partial message parsing
While it was not needed in the H2 mux which was reading full H1 messages
from the channel, it is mandatory for the H1 mux reading contents from
outside to be able to restart on a message. The problem is that the
headers are indexed on the fly, and it's not fun to have to store
everything between calls.

The solution here is to complete the first pass doing a partial restart,
and only once the end of message was found, to start over it again at
once, filling entries. This way there is a bounded number of passes on
the contents and no need to store an intermediary result anymore. Later
this principle could even be used to decide to completely drop an output
buffer to save memory.
2018-09-12 17:38:25 +02:00
Willy Tarreau
5384aac0cb MINOR: h1: make the message parser support a null <hdr> argument
This will allow some iterative calls to be made on incomplete messages
without having to store all the headers.
2018-09-12 17:38:25 +02:00
Willy Tarreau
4433c083ec MEDIUM: h1: let the caller pass the initial parser's state
This way the caller controls if it's the request or response which has
to be used, and it will allow to restart after an incomplete parsing.
2018-09-12 17:38:25 +02:00
Willy Tarreau
a41393fc61 MEDIUM: h1: make the parser support a pointer to a start line
This will allow the parser to fill some extra fields like the method or
status without having to store them permanently in the HTTP message. At
this point however the parser cannot restart from an interrupted read.
2018-09-12 17:38:25 +02:00
Willy Tarreau
9aec30557b MEDIUM: h1: consider err_pos before deciding to accept a header name or not
Till now the H1 parser made for H2 used to be lenient on invalid header
field names because they were supposed to be produced by haproxy. Now
instead we'll rely on err_pos to know how to act (ie: -2 == must block).
2018-09-12 17:38:25 +02:00
Willy Tarreau
9b8cd1f183 MINOR: h2: pre-initialize h1m->err_pos to -1 on the output path
We don't want to trigger an error while parsing a response coming from
haproxy (it could be an errorfile for example), so let's set this to
-1.
2018-09-12 17:38:25 +02:00
Willy Tarreau
a40704ab05 MINOR: mux_h2: replace the req,res h1 messages with a single h1 message
There's no reason to have the two sides in H1 format since we only use
one at a time (the response at the moment). While completely removing
the request declaration, let's rename the response to "h1m" to clarify
that it's the unique h1 message there.
2018-09-12 17:38:25 +02:00
Willy Tarreau
25173a7bcc MINOR: h2: make sure h1m->err_pos field is correct on chunk error
This never happens but in case it would, it's better to report the
correct offset of the error instead of a negative value.
2018-09-12 17:38:25 +02:00
Willy Tarreau
7f437ff81c MINOR: h1: provide a distinct init() function for request and response
h1m_init() used to handle response only since it was used by the H1
client code. Let's have one init per direction.
2018-09-12 17:38:25 +02:00
Willy Tarreau
801250e07d REORG: h1: create a new h1m_state
This is the *parsing* state of an HTTP/1 message. Currently the h1_state
is composite as it's made both of parsing and control (100SENT, BODY,
DONE, TUNNEL, ENDING etc). The purpose here is to have a purely H1 state
that can be used by H1 parsers. For now it's equivalent to h1_state.
2018-09-12 17:38:25 +02:00
Olivier Houchard
71384551fe MINOR: conn_streams: Remove wait_list from conn_streams.
The conn_streams won't be used for subscribing/waiting for I/O events, after
all, so just remove its wait_list, and send/recv/_wait_list.
2018-09-12 17:37:55 +02:00
Olivier Houchard
26e1a8f2bf MINOR: checks: Give checks their own wait_list.
Instead of (ab)using the conn_stream's wait_list, which should disappear,
give the checks their own wait_list.
2018-09-12 17:37:55 +02:00
Olivier Houchard
c2aa71108a MEDIUM: stream_interfaces: Starts receiving from the upper layers.
Instead of waiting for the connection layer to let us know we can read,
attempt to receive as soon as process_stream() is called, and subscribe
to receive events if we can't receive yet.

Now, except for idle connections, the recv(), send() and wake() methods are
no more, all the lower layers do is waking tasklet for anybody waiting
for I/O events.
2018-09-12 17:37:55 +02:00
Olivier Houchard
8ae735da05 MEDIUM: mux_h2: Revamp the send path when blocking.
Change fctl_list and send_list to be lists of struct wait_list, and nuke
send_wait_list, as it's now redundant.
Make the code responsible for shutr/shutw subscribe to those lists.
2018-09-12 17:37:55 +02:00
Olivier Houchard
f653528dc1 MEDIUM: stream_interface: Make recv() subscribe when more data is needed.
Refactor the code so that si_cs_recv() subscribes to receive events.
2018-09-12 17:37:55 +02:00
Olivier Houchard
7505f94f90 MEDIUM: h2: Don't use a wake() method anymore.
Instead of having our wake() method called each time a fd event happens,
just subscribe to recv/send events, and get our tasklet called when that
happens. If any recv/send was possible, the equivalent of what h2_wake_cb()
will be done.
2018-09-12 17:37:55 +02:00
Olivier Houchard
a1411e62e4 MEDIUM: h2: always subscribe to receive if allowed.
Let the connection layer know we're always interested in getting more data,
so that we get scheduled as soon as data is available, instead of relying
on the wake() method.
2018-09-12 17:37:55 +02:00
Olivier Houchard
d4dd22d0ab MINOR: h2: Let user of h2_recv() and h2_send() know xfer has been done.
Make h2_recv() and h2_send() return 1 if data has been sent/received, or 0
if it did not. That way the caller will be able to know if more work may
have to be done.
2018-09-12 17:37:55 +02:00
Olivier Houchard
af4021e680 MEDIUM: connections: Get rid of the recv() method.
Remove the recv() method from mux and conn_stream.
The goal is to always receive from the upper layers, instead of waiting
for the connection later. For now, recv() is still called from the wake()
method, but that should change soon.
2018-09-12 17:37:55 +02:00
Olivier Houchard
4cf7fb148f MEDIUM: connections/mux: Add a recv and a send+recv wait list.
For struct connection, struct conn_stream, and for the h2 mux, add 2 new
lists, one that handles waiters for recv, and one that handles waiters for
recv and send. That way we can ask to subscribe for either recv or send.
2018-09-12 17:37:55 +02:00
Olivier Houchard
524344b4e0 MEDIUM: connections: Don't reset the polling flags in conn_fd_handler().
Resetting the polling flags at the end of conn_fd_handler() shouldn't be
needed anymore, and it will create problem when we won't handle send/recv
from conn_fd_handler() anymore.
2018-09-12 17:37:55 +02:00
William Lallemand
cd5c944ea5 BUILD: fix build without thread
Cyril Bonté reported that commit f9cc07c25b broke the build without
thread.

We don't need to initialise tid = 0 in mworker_loop, so we could
completely remove it.
2018-09-12 13:59:00 +02:00
Willy Tarreau
2c096c3b7a BUG/MINOR: h2: report asynchronous end of stream on closed connections
Christopher noticed that the CS_FL_EOS to CS_FL_REOS conversion was
incomplete : when the connectionis closed, we mark the streams with EOS
instead of REOS, causing the loss of any possibly pending data. At the
moment it's not an issue since H2 is used only with a client, but with
servers it could be a real problem if servers close the connection right
after sending their response.

This patch should be backported to 1.8.
2018-09-12 09:45:54 +02:00
Frédéric Lécaille
5afb3cfbcc BUG/MINOR: server: Crash when setting FQDN via CLI.
This patch ensures that a DNS resolution may be launched before
setting a server FQDN via the CLI. Especially, it checks that
resolvers was set.

A LEVEL 4 reg testing file is provided.

Thanks to Lukas Tribus for having reported this issue.

Must be backported to 1.8.
2018-09-12 07:41:41 +02:00
William Lallemand
2fe7dd0b2e MEDIUM: protocol: sockpair protocol
This protocol is based on the uxst one, but it uses socketpair and FD
passing insteads of a connect()/accept().

The "sockpair@" prefix has been implemented for both bind and server
keywords.

When HAProxy wants to connect through a sockpair@, it creates 2 new
sockets using the socketpair() syscall and pass one of the socket
through the FD specified on the server line.

On the bind side, haproxy will receive the FD, and will use it like it
was the FD of an accept() syscall.

This protocol was designed for internal communication within HAProxy
between the master and the workers, but it's possible to use it
externaly with a wrapper and pass the FD through environment variabls.
2018-09-12 07:20:17 +02:00
William Lallemand
2d3f8a411f MEDIUM: protocol: use a custom AF_MAX to help protocol parser
It's possible to have several protocols per family which is a problem
with the current way the protocols are stored.

This allows to register a new protocol in HAProxy which is not a
protocol in the strict socket definition. It will be used to register a
SOCK_STREAM protocol using socketpair().
2018-09-12 07:12:27 +02:00
Olivier Houchard
5ab33944cd BUG/MAJOR: kqueue: Don't reset the changes number by accident.
In _update_fd(), if the fd wasn't polled, and we don't want it to be polled,
we just returned 0, however, we should return changes instead, or all previous
changes will be lost.

This should be backported to 1.8.
2018-09-11 14:53:00 +02:00
Willy Tarreau
ab813a4b05 REORG: http: move some header value processing functions to http.c
The following functions only deal with header field values and are agnostic
to the HTTP version so they were moved to http.c :

http_header_match2(), find_hdr_value_end(), find_cookie_value_end(),
extract_cookie_value(), parse_qvalue(), http_find_url_param_pos(),
http_find_next_url_param().

Those lacking the "http_" prefix were modified to have it.
2018-09-11 10:30:25 +02:00
Willy Tarreau
e10cd48a83 REORG: http: move the log encoding tables to log.c
There are 3 tables in proto_http which are used exclusively by logs :
hdr_encode_map[], url_encode_map[] and http_encode_map[]. They indicate
what characters are safe to be emitted in logs depending on the part of
the message where they are placed. Let's move this to log.c, as well as
its initialization. It's worth noting that the rfc5424 map was already
initialized there.
2018-09-11 10:30:25 +02:00
Willy Tarreau
04f1e2d202 REORG: http: move error codes production and processing to http.c
These error codes and messages are agnostic to the version, even if
they are represented as HTTP/1.0 messages. Ultimately they will have
to be transformed into internal HTTP messages to be used everywhere.

The HTTP/1.1 100 Continue message was turned to an IST and the local
copy in the Lua code was removed.
2018-09-11 10:30:25 +02:00
Willy Tarreau
6b952c8101 REORG: http: move http_get_path() to http.c
This function is purely HTTP once http_txn is put aside. So the original
one was renamed to http_txn_get_path() and it extracts the relevant offsets
from the txn to pass them to http_get_path(). One benefit of the new version
is that it returns the length at the same time so that allowed to slightly
simplify http_get_path_from_string() which had to look up the end pointer
previously and which is not needed anymore.
2018-09-11 10:30:25 +02:00
Willy Tarreau
35b51c6e5b REORG: http: move the HTTP semantics definitions to http.h/http.c
It's a bit painful to have to deal with HTTP semantics for each protocol
version (H1 and H2), and working on the version-agnostic code further
emphasizes the problem.

This patch creates http.h and http.c which are agnostic to the version
in use, and which borrow a few parts from proto_http and from h1. For
example the once thought h1-specific h1_char_classes array is in fact
dictated by RFC7231 and is used to parse HTTP headers. A few changes
were made to a few files which were including proto_http.h while they
only needed http.h.

Certain string definitions pre-dated the introduction of indirect
strings (ist) so some were used to simplify the definition of the known
HTTP methods. The current lookup code saves 2 kB of a heavily used table
and is faster than the previous table based lookup (typ. 14 ns vs 16
before).
2018-09-11 10:30:25 +02:00
William Lallemand
123f1f6441 MEDIUM: mworker: call per_thread deinit in mworker_reload()
We need to clean the FDs registered manually in the poller to avoid FD
leaking during a reload of the master.

This patch call the per thread deinit function which close the thread
waker pipe.
2018-09-11 10:23:24 +02:00
William Lallemand
333d7979cd MEDIUM: threads: close the thread-waker pipe during deinit
In order to avoid FD leaking, we close the pipe used to wake the threads
up during per thread deinit.
2018-09-11 10:23:24 +02:00
William Lallemand
e22f11ff47 MINOR: mworker: keep and clean the listeners
Keep the listeners that should be used in the master process and clean
them in the workers.
2018-09-11 10:23:24 +02:00
William Lallemand
bc19305e53 MEDIUM: mworker: replace the master pipe by socketpairs
In order to communicate with the workers, the master pipe has been
replaced by a socketpair() per worker.

The goal is to use these sockets as stats sockets and be able to access
them from the master.

When reloading, the master serialize the information of the workers and
put them in a environment variable. Once the master has been reexecuted
it unserialize that information and it is capable of closing the FDs of
the leaving children.
2018-09-11 10:21:58 +02:00
William Lallemand
f9cc07c25b MEDIUM: mworker: master wait mode use its own initialization
The master now use a poll loop, which should be initialized even in wait
mode. We need to init some variables if we didn't success to load the
configuration file.
2018-09-11 10:21:58 +02:00
William Lallemand
de0ff5ab20 MINOR: mworker: don't deinit the poller fd when in wait mode
If haproxy failed to load its configuration, the process is reexecuted
and it did not init the poller. So we must not try to deinit the poller
before the exec().
2018-09-11 10:21:58 +02:00
William Lallemand
d3801c1c21 MEDIUM: startup: unify signal init between daemon and mworker mode
The signals are now unblocked only once the configuration have been
parsed.
2018-09-11 10:21:58 +02:00
William Lallemand
242aae96c7 MEDIUM: mworker: never block SIG{TERM,INT} during reload
The master should be able to be killed even if the reload is not
finished.
2018-09-11 10:21:58 +02:00
William Lallemand
ebf304f8dd MEDIUM: mworker: block SIGCHLD until the master is ready
With the new way of handling the signals in the master worker, we are
are not staying in a waitpid() loop. Which means that we need to catch the
SIGCHLD signals to call waitpid().

The problem is when the master is reloading, this signal is neither
registered nor blocked so we lost all signals between the restart and
the call to mworker_loop().

This patch blocks the SIGCHLD signals before the reloading and ensure
it's not unblocked before the master registered the SIGCHLD handler.
2018-09-11 10:21:58 +02:00
William Lallemand
91c13b696a MINOR: mworker: mworker_cleanlisteners() delete the listeners
The mworker_cleanlisteners() function now remove the listeners, we don't
need them in the master for now.
2018-09-11 10:21:58 +02:00
William Lallemand
3da9769ee4 BUG/MINOR: mworker: no need to stop peers for each proxy
The mworker_cleanlisteners() was cleaning the peers in the proxy loop,
which is useless since we need to stop the peers only once.
2018-09-11 10:21:58 +02:00
William Lallemand
b3f2be338b MEDIUM: mworker: use the haproxy poll loop
In order to reorganize the code of the master worker, the mworker_wait()
function which was the main function was split. This function was
handling a wait() loop, but it does not need it anymore since the code
will use the poll loop of haproxy instead.

The function was split in several functions:

- mworker_catch_sigterm() which is a signal handler for SIGTERM ans
SIGUSR1 that sends the signals to the workers
- mworker_catch_sigchld() which is the code handling the leaving of a
child
- mworker_catch_sighup which basically call the mworker_restart()
function
- mworker_loop() which is the function calling the main poll loop in the
master
2018-09-11 10:21:58 +02:00
William Lallemand
73e1dfcfdf MEDIUM: mworker: remove register/unregister signal functions
Remove the register and unregister signal functions specifics to the
master worker, because that should be done with the generic ones.
2018-09-11 10:21:58 +02:00
Willy Tarreau
4bc7d90d3b MEDIUM: snapshot: merge the captured data after the descriptor
Instead of having a separate area for the captured data, we now have a
contigous block made of the descriptor and the data. At the moment, since
the area is dynamically allocated, we can adjust its size to what is
needed, but the idea is to quickly switch to a pool and an LRU list.
2018-09-07 20:07:17 +02:00
Willy Tarreau
c55015ee5b MEDIUM: snapshots: dynamically allocate the snapshots
Now upon error we dynamically allocate the snapshot instead of overwriting
it. This way there is no more memory wasted in the proxy to hold the two
error snapshot descriptors. Also an appreciable side effect of this is that
the proxy's lock is only taken during the pointer swap, no more while copying
the buffer's contents. This saves 480 bytes of memory per proxy.
2018-09-07 19:59:58 +02:00
Willy Tarreau
36b2736a69 BUG/MEDIUM: snapshot: take the proxy's lock while dumping errors
The proxy's lock it held while filling the error but not while dumping
it, so it's possible to dereference pointers being replaced, typically
server pointers. The risk is very low and unlikely but not inexistent.

Since "show errors" is rarely used in parallel, let's simply grab the
proxy's lock while dumping. Ideally we should use an R/W lock here but
it will not make any difference.

This patch must be backported to 1.8, but the code is in proto_http.c
there, though mostly similar.
2018-09-07 19:55:44 +02:00
Willy Tarreau
ddb68ac69e REORG: cli: move the "show errors" handler from http to proxy
There's nothing HTTP-specific there anymore at all, let's move this
to the proxy where it belongs.
2018-09-07 18:36:50 +02:00
Willy Tarreau
fd9419d560 MINOR: http: remove the pointer to the error snapshot in http_capture_bad_message()
It's not needed anymore as we know the side thanks to the channel. This
will allow the proxy generic code to better manage the error snapshots.
2018-09-07 18:36:04 +02:00
Willy Tarreau
ef3ca73fc3 MINOR: http: make the HTTP error capture rely on the generic proxy code
Now that we have a generic error capture function, let's simplify
http_capture_bad_message() to make use of it. At this point the API
is not changed at all, but it could be further simplified.
2018-09-07 18:36:04 +02:00
Willy Tarreau
75fb65a51f MINOR: proxy: add a new generic proxy_capture_error()
This function now captures an error regardless of its side and protocol.
The caller must pass a number of elements and may pass a protocol-specific
structure and a callback to display it. Later this function may deal with
more advanced allocation techniques to avoid allocating as many buffers
as proxies.
2018-09-07 18:36:04 +02:00
Willy Tarreau
7ccdd8dad9 MEDIUM: snapshot: implement a show() callback and use it for HTTP
The HTTP dumps are now configurable in the code : "show errors" now
calls a protocol-specific function to emit the decoded output. For
now only HTTP is implemented.
2018-09-07 18:36:01 +02:00
Willy Tarreau
0b5b480594 MEDIUM: snapshot: start to reorder the HTTP snapshot output a little bit
The output of "show errors" was slightly reordered to split the HTTP part
in a single chunk_appendf() call. The useless buffer total input was
replaced to report the buffer's start offset, which is the offset in the
stream of the first input byte (thus not counting output). Also it was
the opportunity to stop calling the stream "session".
2018-09-07 17:48:14 +02:00
Willy Tarreau
7480f323ff MINOR: snapshot: split the error snapshots into common and proto-specific parts
The idea will be to make the error snapshot feature accessible to other
protocols than just HTTP. This patch only introduces an "http_snapshot"
structure and renames a few fields to make things more explicit. The
HTTP part was installed inside a union so that we can easily add more
protocols in the future.
2018-09-07 16:13:45 +02:00
Willy Tarreau
5865a8fe69 MINOR: snapshot: restart on the event ID and not the stream ID
The snapshots have the ability to restart a partial dump and they use
the stream ID as the restart point. Since it's purely HTTP, let's use
the event ID instead.
2018-09-07 15:00:43 +02:00
Willy Tarreau
e9e878a056 BUG/MINOR: http/threads: atomically increment the error snapshot ID
Let's use an atomic increment for the error snapshot, as we'd rather
not assign the same ID to two errors happening in parallel. It's very
unlikely that it will ever happen though.

This patch must be backported to 1.8 with the other one it relies on
("MINOR: thread: implement HA_ATOMIC_XADD()").
2018-09-07 11:31:58 +02:00
Baptiste Assmann
044fd5bc2c BUG/MINOR: dns: check and link servers' resolvers right after config parsing
On the Mailing list, Marcos Moreno reported that haproxy configuration
validation (through "haproxy -c cfgfile") does not detect when a
resolvers section does not exist for a server.
That said, this checking is done after HAProxy has started up.

The problem is that this can create production issue, since init
script can't detect the problem before starting / reloading HAProxy.

To fix this issue, this patch registers the function which validates DNS
configuration validity and run it right after configuration parsing is
finished (through cfg_register_postparser()).
Thanks to it, now "haproxy -c cfgfile" will fail when a server
points to a non-existing resolvers section (or any other validation made
by the function above).

Backport status: 1.8
2018-09-06 19:41:30 +02:00
Willy Tarreau
be373150c7 MINOR: connection: make the initialization more consistent
Sometimes a connection is prepared before the target is set, sometimes
after. There's no real rule since the few functions involved operate on
different and independent fields. Soon we'll benefit from knowing the
target at the connection layer, in order to figure the associated proxy
and retrieve the various parameters (timeouts etc). This patch slightly
reorders a few calls to conn_prepare() so that we can make sure that the
target is always known to the mux.
2018-09-06 11:45:30 +02:00
Willy Tarreau
950a8a6fde BUG/MINOR: h1: fix buffer shift after realignment
Commit 5e74b0b ("MEDIUM: h1: port to new buffer API.") introduced a
minor bug by which a buffer's head could stay shifted by the amount
of removed CRLF if it started with empty lines. This would cause the
second request (or response) not to work until it would receive a few
extra characters. This most only impacts requests sent by hand though.

This is purely 1.9, no backport is needed.
2018-09-06 10:48:15 +02:00
Willy Tarreau
22de8d3e01 MEDIUM: h2: produce some logs on early errors that prevent streams from being created
The h2 mux currently lacks some basic transparency. Some errors cause the
connection to be aborted but they couldn't be reported. With this patch,
almost all situations where an error will cause a stream or connection to
be aborted without the ability for an existing stream to report it will be
reported in the logs. This at least provides a solution to monitor the
activity and abnormal traffic.
2018-09-06 09:43:41 +02:00
Willy Tarreau
5383935856 MINOR: log: provide a function to emit a log for a session
The new function sess_log() only needs a session to emit a log. It will
ignore the parts that depend on the stream. It is usable to emit a log
to report early errors in muxes. These ones will typically mention
"<BADREQ>" for the request and 0 for the HTTP status code.
2018-09-06 09:43:41 +02:00
Willy Tarreau
09bb27cdea MEDIUM: log: make sess_build_logline() support being called with no stream
Till now it was impossible to emit logs from the lower layers only because
a stream was mandatory. From now on it will at least be possible to emit a
log to report a bad request or some timings for example. When the stream
is null, sess_build_logline() will use default values and will extract the
timing information from the session just like stream_new() does, so the
resulting log line is perfectly valid.

The termination state will indicate a proxy error during the request phase
since it is the only realistic use for such a call with no stream.
2018-09-06 09:43:06 +02:00
Willy Tarreau
5cacab63e1 MINOR: log: use zero as the request counter if there is no stream
When s==NULL we don't have any assigned request counter. Ideally we
should proceed exactly like when a stream is initialized and assign
a unique value. For now we only place it into a local variable.
2018-09-05 20:01:23 +02:00
Willy Tarreau
b8bc52522c MINOR: log: keep a copy of s->flags early to avoid a dereference
By placing s->flags into a local variable we'll be able to force it new
values when s is NULL.
2018-09-05 20:01:23 +02:00
Willy Tarreau
02fdf4f77b MINOR: log: use NULL for the unique_id if there is no stream
Now s->unique_id is used as NULL (not set) if s==NULL.
2018-09-05 20:01:23 +02:00
Willy Tarreau
abd71a5c2e MINOR: log: don't check the stream-int's conn_retries if the stream is NULL
Let's simply forget the conn_retries when there is no stream since we
haven't tried to connect yet.
2018-09-05 20:01:23 +02:00
Willy Tarreau
e1809dfdaf MINOR: log: be sure not to dereference a null stream for a target
The supported targets are either a server or an applet, so both are
NULL if the stream is NULL.
2018-09-05 20:01:23 +02:00
Willy Tarreau
d4f9166f4e MINOR: log: do not dereference a null stream to access captures
If the stream is null, let's simply not check captures. That's already
done if there is no capture.
2018-09-05 20:01:23 +02:00
Willy Tarreau
2393c5b6a9 MINOR: log: keep a copy of the backend connection early in sess_build_logline()
This way we can avoid dereferencing a possibly inexisting stream.
2018-09-05 20:01:23 +02:00
Willy Tarreau
26ffa8544d CLEANUP: log: make the low_level lf_{ip,port,text,text_len} functions take consts
These ones were abusively relying on variables making it hard to integrate
with const arguments.
2018-09-05 20:01:23 +02:00
Willy Tarreau
372ac5abff MINOR: log: don't unconditionally pick log info from s->logs
We'll soon support s==NULL so let's use an intermediary variable for the
logs structure. For now it only points to s->logs but will support a local
variable as an alternative later.
2018-09-05 20:01:23 +02:00
Willy Tarreau
56a91dddc6 MINOR: log: make sess_build_logline() not dereference a NULL stream for txn
If the stream is NULL, the txn is NULL as well. This condition is already
handled everywhere else.
2018-09-05 20:01:23 +02:00
Willy Tarreau
a21c0e60d2 MINOR: log: make the backend fall back to the frontend when there's no stream
This is already what happens before the backend is assigned, except that
now we don't need to dereference a NULL stream to figure this.
2018-09-05 20:01:23 +02:00
Willy Tarreau
43c538eab6 MINOR: log: move the log code to sess_build_logline() to add extra arguments
The current build_logline() can only be used with valid streams, which
means it is not suitable for use from muxes. We start by moving it into
another more generic function which takes the session as an argument,
to avoid complexifying all the internal API for jsut a few use cases.
This new function is not supposed to be called directly from outside so
we'll be able to instrument it to support several calling conventions.

For now the behaviour and conditions remain unchanged.
2018-09-05 20:01:23 +02:00
Willy Tarreau
a0d11b6fd5 BUG/MEDIUM: h2: fix risk of memory leak on malformated wrapped frames
While parsing a headers frame, if the frame is wrapped in the buffer
and needs to be unwrapped, it will be duplicated before being processed.
But if it contains certain combinations of invalid flags, the parser
returns without releasing the temporary buffer leading to a memory
leak.

This fix needs to be backported to 1.8.
2018-09-05 20:01:14 +02:00
Willy Tarreau
590a0514f2 BUG/MEDIUM: session: fix reporting of handshake processing time in the logs
The handshake processing time used to be stored per stream, which was
valid when there was exactly one stream per session. With H2 and
multiplexing it's not the case anymore and the reported handshake times
are wrong in the logs as it's computed between the TCP accept() and the
stream creation. Let's first move the handshake where it belongs, which
is the session.

However, this is not enough because we don't want to report an excessive
idle time either for H2 (since many requests use the connection).

So the solution used here is to have the stream retrieve sess->tv_accept
and the handshake duration when the stream is created, and let the mux
immediately reset them. This way, the handshake time becomes zero for the
second and subsequent requests in H2 (which was already the case in H1),
and the idle time exactly counts how long the connection remained unused
while it could be used, so in H1 it runs from the end of the previous
response and in H2 it runs from the end of the previous request since the
channel is already available.

This patch will need to be backported to 1.8.
2018-09-05 16:30:23 +02:00
Willy Tarreau
90a7c03ec0 BUG/MINOR: stream: use atomic increments for the request counter
The request counter is incremented when creating a new stream and when
resetting a stream, preparing for a new request. Unfortunately during
the thread migration this was missed, leading to non-atomic increments
in case threads are in use. The most visible side effect is that two
requests may have the same ID from time to time in the logs. However
the SPOE also uses this ID to route responses back to the stream so it
may also lead to occasional spurious SPOE timeouts.

Note that it still doesn't guarantee temporal unicity in the stream
identifiers since a long and a short connection could technically use
the same ID. The likeliness that this happens at the same time is almost
null (roughly threads*runqueue_depth/2^32 that it happens in the same
poll loop), but it will have to be addressed later anyway.

This patch must be backported to 1.8 with the other one it relies on
("MINOR: thread: implement HA_ATOMIC_XADD()").
2018-09-05 16:30:19 +02:00
Willy Tarreau
f16cb41d19 MINOR: tools: make date2str_log() take some consts
The "tm" and "date" field are not modified, they can be const instead
of forcing their callers to use vars.
2018-09-05 16:30:11 +02:00
Emmanuel Hocdet
9f9b0c6a7f BUG/MEDIUM: ECC cert should work with TLS < v1.2 and openssl >= 1.1.1
With openssl >= 1.1.1 and boringssl multi-cert is natively supported.
ECDSA/RSA selection is done and work correctly with TLS >= v1.2.
TLS < v1.2 have no TLSEXT_TYPE_signature_algorithms extension: ECC
certificate can't be selected, and handshake fail if no RSA cert
is present. Safe ECC certificate selection without client announcement
can be very tricky (browser compatibilty). The safer approach is to
select ECDSA certificate if no other certificate matches, like it is
with openssl < 1.1.1: certificate selection is only done via the SNI.

Thanks to Lukas Tribus for reporting this and analysing the problem.

This patch should be backported to 1.8
2018-09-04 17:47:10 +02:00
Baptiste Assmann
6d0f38f00d BUG/MEDIUM: dns/server: fix incomatibility between SRV resolution and server state file
Server state file has no indication that a server is currently managed
by a DNS SRV resolution.
And thus, both feature (DNS SRV resolution and server state), when used
together, does not provide the expected behavior: a smooth experience...

This patch introduce the "SRV record name" in the server state file and
loads and applies it if found and wherever required.

This patch applies to haproxy-dev branch only. For backport, a specific patch
is provided for 1.8.
2018-09-04 17:40:22 +02:00
Olivier Houchard
9e643ea172 BUG/MEDIUM: hlua: Don't call RESET_SAFE_LJMP if SET_SAFE_LJMP returns 0.
If SET_SAFE_LJMP returns 0, the spinlock is already unlocked, and lua_atpanic
is already set back to hlua_panic_safe, so there's no need to call
RESET_SAFE_LJMP.

This should be MFC'd into 1.8.
2018-08-31 16:14:58 +02:00
Frédéric Lécaille
54f2bcf22b BUG/MAJOR: thread: lua: Wrong SSL context initialization.
When calling ->prepare_srv() callback for SSL server which
depends on global "nbthread" value, this latter was not already parsed,
so equal to 1 default value. This lead to bad memory accesses.

Thank you to Pieter (PiBa-NL) for having reported this issue and
for having provided a very helpful reg testing file to reproduce
this issue (reg-test/lua/b00002.*).

Must be backported to 1.8.
2018-08-30 10:06:45 +02:00
Olivier Houchard
c7ffa91763 BUG/MEDIUM: stream_interface: try to call si_cs_send() earlier.
Call si_cs_send() at the beginning of si_cs_wake_cb(), instead of from
stream_int_notify-), so that if we get a connection error while trying to
send, the stream_interface will get SI_FL_ERR, the associated task will
be woken up, and the connection will be properly destroyed.

No backport needed.
2018-08-28 19:46:45 +02:00
Olivier Houchard
4501c3e099 MINOR: checks: Call wake_srv_chk() when we can finally send data.
Instead of calling __event_srv_chk_w, call wake_srv_chk(), which will then
either call tcpcheck_main() or __event_srv_chk_w().
Also make tcpcheck_main() subscribe if it can't send.
2018-08-28 19:43:57 +02:00
Olivier Houchard
594c8c5015 BUG/MEDIUM: hlua: Make sure we drain the output buffer when done.
In hlua_applet_tcp_fct(), drain the output buffer when the applet is done
running, every time we're called.
Overwise, there's a race condition, and the output buffer could be filled
after the applet ran, and as it is never cleared, the stream interface
will never be destroyed.

This should be backported to 1.8 and 1.7.
2018-08-28 16:18:34 +02:00
Patrick Hemmer
155e93e570 MINOR: Add srv_conn_free sample fetch
This adds the 'srv_conn_free([<backend>/]<server>)' sample fetch. This fetch
provides the number of available connections on the designated server.
2018-08-27 16:38:56 +02:00
Patrick Hemmer
4cdf3abaa0 MINOR: add be_conn_free sample fetch
This adds the sample fetch 'be_conn_free([<backend>])'. This sample fetch
provides the total number of unused connections across available servers in the
specified backend.
2018-08-27 14:10:16 +02:00
Patrick Hemmer
e3faf02581 BUG/MEDIUM: lua: reset lua transaction between http requests
Previously LUA code would maintain the transaction state between http
requests, resulting in things like txn:get_priv() retrieving data from
a previous request. This addresses the issue by ensuring the LUA state
is reset between requests.

Co-authored-by: Tim Düsterhus <tim@bastelstu.be>
2018-08-25 07:51:02 +02:00
Willy Tarreau
ad7f0ad1c3 BUG/MEDIUM: mux_pt: dereference the connection with care in mux_pt_wake()
mux_pt_wake() calls data->wake() which can return -1 indicating that the
connection was just destroyed. We need to check for this condition and
immediately exit in this case otherwise we dereference a just freed
connection. Note that this mainly happens on idle connections between
two HTTP requests. It can have random implications between requests as
it may lead a wrong connection's polling to be re-enabled or disabled
for example, especially with threads.

This patch must be backported to 1.8.
2018-08-24 15:48:59 +02:00
Frédéric Lécaille
83ed5d58d2 BUG/MINOR: lua: Bad HTTP client request duration.
HTTP LUA applet callback should not update the date on which the HTTP client requests
arrive. This was done just after the LUA applet has completed its job.

This patch simply removes the affected statement. The same fixe has been applied
to TCP LUA applet callback.

To reproduce this issue, as reported by Patrick Hemmer, implement an HTTP LUA applet
which sleeps a bit before replying:

  core.register_service("foo", "http", function(applet)
      core.msleep(100)
      applet:set_status(200)
      applet:start_response()
  end)

This had as a consequence to log %TR field with approximatively the same value as
the LUA sleep time.

Thank you to Patrick Hemmer for having reported this issue.

Must be backported to 1.8, 1.7 and 1.6.
2018-08-24 14:49:30 +02:00
Willy Tarreau
e215bba956 MINOR: connection: make conn_sock_drain() work for all socket families
This patch improves the previous fix by implementing the socket draining
code directly in conn_sock_drain() so that it always applies regardless
of the protocol's family. Thus it gets rid of tcp_drain().
2018-08-24 14:45:46 +02:00
Willy Tarreau
fe5d2ac65f BUG/MEDIUM: unix: provide a ->drain() function
Right now conn_sock_drain() calls the protocol's ->drain() function if
it exists, otherwise it simply tries to disable polling for receiving
on the connection. This doesn't work well anymore since we've implemented
the muxes in 1.8, and it has a side effect with keep-alive backend
connections established over unix sockets. What happens is that if
during the idle time after a request, a connection reports some data,
si_idle_conn_null_cb() is called, which will call conn_sock_drain().
This one sees there's no drain() on unix sockets and will simply disable
polling for data on the connection. But it doesn't do anything on the
conn_stream. Thus while leaving the conn_fd_handler, the mux's polling
is updated and recomputed based on the conn_stream's polling state,
which is still enabled, and nothing changes, so we see the process
use 100% CPU in this case because the FD remains active in the cache.

There are several issues that need to be addressed here. The first and
most important is that we cannot expect some protocols to simply stop
reading data when asked to drain pending data. So this patch make the
unix sockets rely on tcp_drain() since the functions are the same. This
solution is appropriate for backporting, but a better one is desired for
the long term. The second issue is that si_idle_conn_null_cb() shouldn't
drain the connection but the conn_stream.

At the moment we don't have any way to drain a conn_stream, though a flag
on rcv_buf() will do it well. Until we support muxes on the server side
it is not a problem so this part can be addressed later.

This fix must be backported to 1.8.
2018-08-24 14:42:50 +02:00
Willy Tarreau
bba81563cf MINOR: chunk: remove impossible tests on negative chunk->data
Since commit 843b7cb ("MEDIUM: chunks: make the chunk struct's fields
match the buffer struct") a chunk length is unsigned so we can remove
negative size checks.
2018-08-22 05:28:32 +02:00
Willy Tarreau
1c913e4232 BUG/MEDIUM: cli/ssl: don't store base64dec() result in the trash's length
By convenience or laziness we used to store base64dec()'s return code
into trash.data and to compare it against 0 to check for conversion
failure, but it's now unsigned since commit 843b7cb ("MEDIUM: chunks:
make the chunk struct's fields match the buffer struct"). Let's clean
this up and test the result itself without storing it first.

No backport is needed.
2018-08-22 05:28:32 +02:00
Willy Tarreau
b406b8708f BUG/MEDIUM: connection: don't store recv() result into trash.data
Cyril Bonté discovered that the proxy protocol randomly fails since
commit 843b7cb ("MEDIUM: chunks: make the chunk struct's fields match
the buffer struct"). This is because we used to store recv()'s return
code into trash.data which is now unsigned, so it never compares as
negative against 0. Let's clean this up and test the result itself
without storing it first.

No backport is needed.
2018-08-22 05:28:32 +02:00
Willy Tarreau
2842e05c7c BUG/MEDIUM: map: don't store exp_replace() result in the trash's length
By convenience or laziness we used to store exp_replace()'s return code
into str->data. The result checks applied there compare str->data to -1
while it's now unsigned since commit 843b7cb ("MEDIUM: chunks: make the
chunk struct's fields match the buffer struct"). Let's clean this up
and test the result itself without storing it first.

No backport is needed.
2018-08-22 05:16:33 +02:00
Willy Tarreau
f6ee9dc616 BUG/MEDIUM: dns: don't store dns_build_query() result in the trash's length
By convenience or laziness we used to store dns_build_query()'s return code
into trash.data. The result checks applied there compare trash.data to -1
while it's now unsigned since commit 843b7cb ("MEDIUM: chunks: make the
chunk struct's fields match the buffer struct"). Let's clean this up
and test the result itself without storing it first.

No backport is needed.
2018-08-22 05:16:32 +02:00
Willy Tarreau
9c768fdca1 BUG/MEDIUM: http: don't store url_decode() result in the samples's length
By convenience or laziness we used to store url_decode()'s return code
into smp->data.u.str.data. The result checks applied there compare it
to 0 while it's now unsigned since commit 843b7cb ("MEDIUM: chunks: make
the chunk struct's fields match the buffer struct "). Let's clean this up
and test the result itself without storing it first.

No backport is needed.
2018-08-22 05:16:32 +02:00
Willy Tarreau
6e27be1a5d BUG/MEDIUM: http: don't store exp_replace() result in the trash's length
By convenience or laziness we used to store exp_replace()'s return code
into trash.data. The result checks applied there compare trash.data to -1
while it's now unsigned since commit 843b7cb ("MEDIUM: chunks: make the
chunk struct's fields match the buffer struct "). Let's clean this up
and test the result itself without storing it first.

No backport is needed.
2018-08-22 05:16:32 +02:00
Willy Tarreau
5f6333caca BUG/MINOR: chunks: do not store -1 into chunk_printf() in case of error
Since commit 843b7cb ("MEDIUM: chunks: make the chunk struct's fields
match the buffer struct") a chunk length is unsigned so we can't reliably
store -1 and check for negative values in the caller. Only one such
location was found in proto_http's http-request auth rules (which cannot
realistically fail).

No backport is needed.
2018-08-22 05:16:31 +02:00
Willy Tarreau
49725a0977 BUG/MEDIUM: check/threads: do not involve the rendez-vous point for status updates
thread_isolate() is currently being called with the server lock held.
This is not acceptable because it prevents other threads from reaching
the rendez-vous point. Now that the LB algos are thread-safe, let's get
rid of this call.

No backport is nedeed.
2018-08-21 19:54:09 +02:00
Willy Tarreau
1b87748ff5 BUG/MEDIUM: lb/threads: always properly lock LB algorithms on maintenance operations
Since commit 3ff577e ("MAJOR: server: make server state changes
synchronous again"), srv_update_status() calls the various maintenance
operations of the LB algorithms (->set_server_up, ->set_server_down,
->update_server_weight()). These ones are called with a single thread
guaranteed by the rendez-vous point, so the fact that they're lacking
some locks has no effect. However we'll need to remove the rendez-vous
point so we have to take care of properly locking all the LB algos.

The comments have been properly updated on the various functions to
mention their locking expectations. All these functions are called
with the server lock held, and all of them now support concurrent
calls by using the lbprm's lock.

This fix doesn't need to be backported at the moment, though if any
check-specific issue surfaced in 1.8, it could make sense to reuse it.
2018-08-21 19:44:53 +02:00
Willy Tarreau
deca26c452 BUG/MAJOR: queue/threads: make pendconn_redistribute not lock the server
Since commit 3ff577e ("MAJOR: server: make server state changes
synchronous again"), srv_update_status() is called with the server
lock held. It calls (among others) pendconn_redistribute() which used
to take this lock, causing CPU loops by default, or crashes if build
with -DDEBUG_THREAD. Since this function is not called from any other
place anymore, it doesn't require the lock on its own so let's simply
drop it from there.

No backport is needed, this is 1.9-specific.
2018-08-21 18:11:03 +02:00
Olivier Houchard
80c56790d9 BUG/MEDIUM: stream_interface: Call the wake callback after sending.
If we subscribed to send, and the callback is called, call the wake callback
after, so that process_stream() may be woken up if needed.

This is 1.9-specific, no backport is needed.
2018-08-21 18:06:57 +02:00
Olivier Houchard
fab7c7e91c BUG/MEDIUM: H2: Activate polling after successful h2_snd_buf().
Make sure h2_send() is called after h2_snd_buf() by activating polling.

This is 1.9-specific, no backport is needed.
2018-08-21 18:06:57 +02:00
Olivier Houchard
a6ff035770 BUG/MEDIUM: stream-int: Check if the conn_stream exist in si_cs_io_cb.
It is possible that the conn_stream gets detached from the stream_interface,
and as it subscribed to the wait list, si_cs_io_cb() gets called anyway,
so make sure we have a conn_stream before attempting to send more data.

This is 1.9-specific, no backport is needed.
2018-08-21 18:06:54 +02:00
Olivier Houchard
18a85fe602 BUG/MEDIUM: streams: Don't forget to remove the si from the wait list.
When freeing the stream, make sure we remove the stream interfaces from the
wait lists, in case it was in there.

This is 1.9-specific, no backport is needed.
2018-08-21 18:06:33 +02:00
Willy Tarreau
3bcc2699ba BUG/MEDIUM: cli/threads: protect some server commands against concurrent operations
The server-specific CLI commands "set weight", "set maxconn",
"disable agent", "enable agent", "disable health", "enable health",
"disable server" and "enable server" were not protected against
concurrent accesses. Now they take the server lock around the
sensitive part.

This patch must be backported to 1.8.
2018-08-21 15:35:31 +02:00
Willy Tarreau
46b7f53ad9 DOC: server/threads: document which functions need to be called with/without locks
At the moment it's totally unclear while reading the server's code which
functions require to be called with the server lock held and which ones
grab it and cannot be called this way. This commit simply inventories
all of them to indicate what is detected depending on how these functions
use the struct server. Only functions used at runtime were checked, those
dedicated to config parsing were skipped. Doing so already has uncovered
a few bugs on some CLI actions.
2018-08-21 14:58:25 +02:00
Willy Tarreau
a275a3710e BUG/MEDIUM: cli/threads: protect all "proxy" commands against concurrent updates
The proxy-related commands like "{enable|disable|shutdown} frontend",
"{enable|disable} dynamic-cookie", "set dynamic-cookie-key" were not
protected against concurrent accesses making their use dangerous with
threads.

This patch must be backported to 1.8.
2018-08-21 14:58:25 +02:00
Willy Tarreau
eeba36b3af BUG/MEDIUM: server: update our local state before propagating changes
Commit 3ff577e ("MAJOR: server: make server state changes synchronous again")
reintroduced synchronous server state changes. However, during the previous
change from synchronous to asynchronous, the server state propagation was
placed at the end of the function to ease the code changes, and the commit
above didn't put it back at its place. This has resulted in propagated
states to be incomplete. For example, making a server leave maintenance
would make it up but would leave its tracking servers down because they
see their tracked server is still down.

Let's just move the status update right to its place. It also adds the
benefit of reporting state changes in the order they appear and not in
reverse.

No backport is needed.
2018-08-21 08:29:25 +02:00
Cyril Bonté
7ee465f1ad BUG/MINOR: lua: fix extra 500ms added to socket timeouts
Since commit #56cc12509, haproxy accepts double values for timeouts. The
value is then converted to  milliseconds before being rounded up and cast
to int. The issue is that to round up the value, a constant value of 0.5
is added to it, but too early in the conversion, resulting in an
additional 500ms to the value. We are talking about a precision of 1ms,
so we can safely get rid of this rounding trick and adjust resulting
timeouts equal to 0 to a minimum of 1ms.

This patch is specific to the 1.9 branch and doesn't require to be
backported.
2018-08-19 22:11:28 +02:00
Cyril Bonté
7bb6345497 BUG/MEDIUM: lua: socket timeouts are not applied
Sachin Shetty reported that socket timeouts set in LUA code have no effect.
Indeed, connect timeout is never modified and is always set to its default,
set to 5 seconds. Currently, this patch will apply the specified timeout
value to the connect timeout.
For the read and write timeouts, the issue is that the timeout is updated but
the expiration dates were not updated.

This patch should be backported up to the 1.6 branch.
2018-08-18 00:23:52 +02:00
Olivier Houchard
19bdf2428d MINOR: tasks: Don't special-case when nbthreads == 1
Instead of checking if nbthreads == 1, just and thread_mask with
all_threads_mask to know if we're supposed to add the task to the local or
the global runqueue.
2018-08-17 14:50:37 +02:00
Bertrand Jacquin
a25282bb39 DOC: ssl: Use consistent naming for TLS protocols
In most cases, "TLSv1.x" naming is used across and documentation, lazy
people tend to grep too much and may not find what they are looking for.

Fixing people is hard.
2018-08-16 20:20:26 +02:00
Emeric Brun
271022150d BUG/MINOR: map: fix map_regm with backref
Due to a cascade of get_trash_chunk calls the sample is
corrupted when we want to read it.

The fix consist to use a temporary chunk to copy the sample
value and use it.

[wt: for 1.8 and older, a backport was successfully tested here :
 https://www.mail-archive.com/haproxy@formilux.org/msg30694.html]
2018-08-16 19:44:04 +02:00
Emeric Brun
e1b4ed4352 BUG/MEDIUM: ssl: loading dh param from certifile causes unpredictable error.
If the dh parameter is not found, the openssl's error global
stack was not correctly cleared causing unpredictable error
during the following parsing (chain cert parsing for instance).

This patch should be backported in 1.8 (and perhaps 1.7)
2018-08-16 19:36:08 +02:00
Emeric Brun
eb155b6ca6 BUG/MEDIUM: ssl: fix missing error loading a keytype cert from a bundle.
If there was an issue loading a keytype's part of a bundle, the bundle
was implicitly ignored without errors.

This patch should be backported in 1.8 (and perhaps 1.7)
2018-08-16 19:36:06 +02:00
Olivier Houchard
fde2a09a15 BUG/MEDIUM: sessions: Don't use t->state.
In session_expire_embryonic(), don't use t->state, use the "state" argument
instead, as t->state has been cleaned before we're being called.
2018-08-16 19:25:56 +02:00
Olivier Houchard
d8b7a4701d BUG/MEDIUM: tasks: Don't insert in the global rqueue if nbthread == 1
Make sure we don't insert a task in the global run queue if nbthread == 1,
as, as an optimisation, we avoid reading from it if nbthread == 1.
2018-08-16 19:25:46 +02:00
Olivier Houchard
5c110b924e MINOR: checks: Add event_srv_chk_io().
In checks, introduce event_srv_chk_io() as a callback to be called if data
can be sent again, instead of abusing event_srv_chk_w.
2018-08-16 17:29:54 +02:00
Olivier Houchard
29fb89dc5e MINOR: mux_h2: Don't use h2_send() as a callback.
Instead of using h2_send() directly as a callback, introcude h2_io_cb(), that
will call h2_send() if it is possible to send data.
2018-08-16 17:29:54 +02:00
Olivier Houchard
8f0b4c66f5 MINOR: stream_interface: Give stream_interface its own wait_list.
Instead of just using the conn_stream wait_list, give the stream_interface
its own. When the conn_stream will have its own buffers, the stream_interface
may have to wait on it.
2018-08-16 17:29:54 +02:00
Olivier Houchard
91894cbf4c MINOR: stream_interface: Don't use si_cs_send() as a task handler.
Instead of using si_cs_send() as a task handler, define a new function,
si_cs_io_cb(), and give si_cs_send() its original prototype. Right now
si_cs_io_cb() just handles send, but later it'll handle recv() too.
2018-08-16 17:29:54 +02:00
Olivier Houchard
e1c6dbcd70 MINOR: connections/mux: Add the wait reason(s) to wait_list.
Add a new element to the wait_list, that let us know which event(s) we are
waiting on.
2018-08-16 17:29:53 +02:00
Olivier Houchard
ed0f207ef5 MINOR: connections: Get rid of txbuf.
Remove txbuf from conn_stream. It is not used yet, and its only user will
probably be the mux_h2, so it will be better suited in the struct h2s.
2018-08-16 17:29:51 +02:00
Olivier Houchard
638b799b09 MINOR: connections: Move rxbuf from the conn_stream to the h2s.
As the mux_h2 is the only user of rxbuf, move it to the struct h2s, instead
of conn_stream.
2018-08-16 17:28:11 +02:00
Olivier Houchard
511efeae7e MINOR: connections: Make rcv_buf mandatory and nuke cs_recv().
Reintroduce h2_rcv_buf(), right now it just does what cs_recv() did, but
should be modified later.
2018-08-16 17:23:44 +02:00
Emeric Brun
77e8919fc6 BUG/MINOR: ssl: empty connections reported as errors.
Empty connection is reported as handshake error
even if dont-log-null is specified.

This bug affect is a regression du to:

BUILD: ssl: fix to build (again) with boringssl

New openssl 1.1.1 defines OPENSSL_NO_HEARTBEATS as boring ssl
so the test was replaced by OPENSSL_IS_BORINGSSL

This fix should be backported on 1.8
2018-08-16 11:59:59 +02:00
Patrick Hemmer
248cb4c503 MEDIUM: queue: adjust position based on priority-class and priority-offset
The priority values are used when connections are queued to determine
which connections should be served first. The lowest priority class is
served first. When multiple requests from the same class are found, the
earliest (according to queue_time + offset) is served first. The queue
offsets can span over roughly 17 minutes after which the offsets will
wrap around. This allows up to 8 minutes spent in the queue with no
reordering.
2018-08-10 15:06:48 +02:00
Patrick Hemmer
268a707a3d MEDIUM: add set-priority-class and set-priority-offset
This adds the set-priority-class and set-priority-offset actions to
http-request and tcp-request content. At this point they are not used
yet, which is the purpose of the next commit, but all the logic to
set and clear the values is there.
2018-08-10 15:06:31 +02:00
Patrick Hemmer
0355dabd7c MINOR: queue: replace the linked list with a tree
We'll need trees to manage the queues by priorities. This change replaces
the list with a tree based on a single key. It's effectively a list but
allows us to get rid of the list management right now.
2018-08-10 15:06:27 +02:00
Patrick Hemmer
da282f4a8f MINOR: queue: store the queue index in the stream when enqueuing
We store the queue index in the stream and check it on dequeueing to
figure how many entries were processed in between. This way we'll be
able to count the elements that may later be added before ours.
2018-08-10 15:06:25 +02:00
Patrick Hemmer
ffe5e8c638 MINOR: stream: rename {srv,prx}_queue_size to *_queue_pos
The current name is misleading as it implies a queue size, but the value
instead indicates a position in the queue.
The value is only the queue size at the exact moment the element is enqueued.
Soon we will gain the ability to insert anywhere into the queue, upon which
clarity of the name is more important.
2018-08-10 15:04:14 +02:00
Willy Tarreau
66425e31b5 MINOR: queue: make sure the pendconn is released before logging
We'll soon need to rely on the pendconn position at the time of dequeuing
to figure the position a stream took in the queue. Usually it's not a
problem since pendconn_free() is called once the connection starts, but
it will make a difference for failed dequeues (eg: queue timeout reached).
Thus it's important to call pendconn_free() before logging in cases we are
not certain whether it was already performed, and to call pendconn_unlink()
after we know the pendconn will not be used so that we collect the queue
state as accurately as possible. As a benefit it will also make the
server's and backend's queues count more accurate in these cases.
2018-08-10 15:04:08 +02:00
Christopher Faulet
7ce0c891ab MEDIUM: mux: Use the mux protocol specified on bind/server lines
To do so, mux choices are split to handle incoming and outgoing connections in a
different way. The protocol specified on the bind/server line is used in
priority. Then, for frontend connections, the ALPN is retrieved and used to
choose the best mux. For backend connection, there is no ALPN. Finaly, if no
protocol is specified and no protocol matches the ALPN, we fall back on a
default mux, choosing in priority the first mux with exactly the same mode.
2018-08-08 10:42:08 +02:00
Christopher Faulet
8ed0a3e32a MINOR: mux/server: Add 'proto' keyword to force the multiplexer's protocol
For now, it is parsed but not used. Tests are done on it to check if the side
and the mode are compatible with the server's definition.
2018-08-08 10:42:08 +02:00
Christopher Faulet
a717b99284 MINOR: mux/frontend: Add 'proto' keyword to force the mux protocol
For now, it is parsed but not used. Tests are done on it to check if the side
and the mode are compatible with the proxy's definition.
2018-08-08 10:41:11 +02:00
Christopher Faulet
7c42eacbe9 BUG/MEDIUM: stream_int: Don't check CO_FL_SOCK_RD_SH flag to trigger cs receive
It is mandatory to be sure to process data blocked in the RX buffer of the
conn_stream while the shutr/read0 was already processed. The stream interface
doesn't need to rely on this flags because it already tests CS_FL_EOS.
2018-08-08 10:41:11 +02:00
Willy Tarreau
91c2826e1d CLEANUP: server: remove the update list and the update lock
These ones are not more used, let's get rid of them.
2018-08-08 09:57:45 +02:00
Willy Tarreau
3ff577e165 MAJOR: server: make server state changes synchronous again
Now we try to synchronously push updates as they come using the new rdv
point, so that the call to the server update function from the main poll
loop is not needed anymore.

It further reduces the apparent latency in the health checks as the response
time almost always appears as 0 ms, resulting in a slightly higher check rate
of ~1960 conn/s. Despite this, the CPU consumption has slightly dropped again
to ~32% for the same test.

The only trick is that the checks code is built with a bit of recursivity
because srv_update_status() calls server_recalc_eweight(), and the latter
needs to signal srv_update_status() in case of updates. Thus we added an
extra argument to this function to indicate whether or not it must
propagate updates (no if it comes from srv_update_status).
2018-08-08 09:57:45 +02:00
Willy Tarreau
647c70b681 MINOR: threads: remove the previous synchronization point
It's not needed anymore as it is fully covered by the new rendez-vous
point. This also removes the pipe and its polling.
2018-08-08 09:57:45 +02:00
Willy Tarreau
85c459d7e8 MEDIUM: haproxy: don't use sync_poll_loop() anymore in the main loop
This partially reverts commit d8fd2af ("BUG/MEDIUM: threads: Use the sync
point to check active jobs and exit") which used to address an issue in
the way the sync point used to check for present threads, which was later
addressed by commit ddb6c16 ("BUG/MEDIUM: threads: Fix the exit condition
of the thread barrier"). Thus there is no need anymore to use the sync
point for exiting and we can completely remove this call in the main loop.
2018-08-08 09:56:32 +02:00
Willy Tarreau
3d3700f216 MEDIUM: checks: use the new rendez-vous point to spread check result
The current sync point causes some important stress when a high number
of threads is in use on a config with lots of checks, because it wakes
up all threads every time a server state changes.

A config like the following can easily saturate a 4-core machine reaching
only 750 checks per second out of the ~2000 configured :

    global
        nbthread 4

    defaults
        mode    http
        timeout connect 5s
        timeout client  5s
        timeout server  5s

    frontend srv
        bind :8001 process 1/1
        redirect location / if { method OPTIONS } { rand(100) ge 50 }
        stats uri /

    backend chk
        option httpchk
        server-template srv 1-100 127.0.0.1:8001 check rise 1 fall 1 inter 50

The reason is that the random on the fake server causes the responses
to randomly match an HTTP check, and results in a lot of up/down events
that are broadcasted to all threads. It's worth noting that the CPU usage
already dropped by about 60% between 1.8 and 1.9 just due to the scheduler
updates, but the sync point remains expensive.

In addition, it's visible on the stats page that a lot of requests end up
with an L7TOUT status in ~60ms. With smaller timeouts, it's even L4TOUT
around 20-25ms.

By not using THREAD_WANT_SYNC() anymore and only calling the server updates
under thread_isolate(), we can avoid all these wakeups. The CPU usage on
the same config drops to around 44% on the same machine, with all checks
being delivered at ~1900 checks per second, and the stats page shows no
more timeouts, even at 10 ms check interval. The difference is mainly
caused by the fact that there's no more need to wait for a thread to wake
up from poll() before starting to process check results.
2018-08-08 09:56:32 +02:00
Christopher Faulet
98d9fe21e0 MINOR: mux: Print the list of existing mux protocols during HA startup
This is done in verbose/debug mode and when build options are reported.
2018-08-08 09:54:22 +02:00
Christopher Faulet
32f61c0421 MINOR: mux: Unlink ALPN and multiplexers to rather speak of mux protocols
Multiplexers are not necessarily associated to an ALPN. ALPN is a TLS extension,
so it is not always defined or used. Instead, we now rather speak of
multiplexer's protocols. So in this patch, there are no significative changes,
some structures and functions are just renamed.
2018-08-08 09:54:22 +02:00
Christopher Faulet
2d5292a412 MINOR: mux: Add info about the supported side in alpn_mux_list structure
Now, a multiplexer can specify if it can be install on incoming connections
(ALPN_SIDE_FE), on outgoing connections (ALPN_SIDE_BE) or both
(ALPN_SIDE_BOTH). These flags are compatible with proxies' ones.
2018-08-08 09:54:22 +02:00
Christopher Faulet
b75bb21092 MEDIUM: backend: don't rely on mux_pt_ops in connect_server()
The comment above the change remains true. We assume there is always 1
conn_stream per outgoing connectionq. Today, it is always true because H2 is not
supported yet for server connections.
2018-08-08 09:54:22 +02:00
Christopher Faulet
6cc7afa04e MINOR: backend: Try to find the best mux for outgoing connections
For now, there is no effect. mux-pt will always be used because this is only
available mux for backend connections.
2018-08-08 09:54:22 +02:00
Christopher Faulet
063f786553 MINOR: conn_stream: add cs_send() as a default snd_buf() function
This function is generic and is able to automatically transfer data from a
buffer to the conn_stream's tx buffer. It does this automatically if the mux
doesn't define another snd_buf() function.

It cannot yet be used as-is with the conn_stream's txbuf without risking to
lose data on close since conn_streams need to be orphaned for this.
2018-08-08 09:53:58 +02:00
Christopher Faulet
2bf88c05d0 CLEANUP: backend: Move mux install to call it at only one place
It makes the code readability simpler. It will also ease futur changes.
2018-08-07 14:37:37 +02:00
Christopher Faulet
d44a9b3627 MEDIUM: mux: Remove const on the buffer in mux->snd_buf()
This is a partial revert of the commit deccd1116 ("MEDIUM: mux: make
mux->snd_buf() take the byte count in argument"). It is a requirement to do
zero-copy transfers. This will be mandatory when the TX buffer of the
conn_stream will be used.

So, now, data are consumed by mux->snd_buf() and not only sent. So it needs to
update the buffer state. On its side, the caller must be aware the buffer can be
replaced y an empty or unallocated one.

As a side effet of this change, the function co_set_data() is now only responsible
to update the channel set, by update ->output field.
2018-08-07 14:36:52 +02:00
Willy Tarreau
a8694654ba BUG/MEDIUM: queue: prevent a backup server from draining the proxy's connections
When switching back from a backup to an active server, the backup server
currently continues to drain the proxy's connections, which is a problem
because it's not expected to be able to pick them.

This patch ensures that a backup server will only pick backend connections
if there is no active server and it is the selected backup server or all
backup servers are supposed to be used.

This issue seems to have existed forever, so this fix should be backported
to all stable versions.
2018-08-07 10:52:01 +02:00
Willy Tarreau
6a78e61694 BUG/MEDIUM: servers: check the queues once enabling a server
Commit 64cc49c ("MAJOR: servers: propagate server status changes
asynchronously.") heavily changed the way the server states are
updated since they became asynchronous. During this change, some
code was lost, which is used to shut down some sessions from a
backup server and to pick pending connections from a proxy once
a server is turned back from maintenance to ready state. The
effect is that when temporarily disabling a server, connections
stay in the backend's queue, and when re-enabling it, they are
not picked and they expire in the backend's queue. Now they're
properly picked again.

This fix must be backported to 1.8.
2018-08-07 10:14:53 +02:00
Willy Tarreau
ab657ce251 BUG/MEDIUM: threads: fix the no-thread case after the change to the sync point
In commit 0c026f4 ("MINOR: threads: add more consistency between certain
variables in no-thread case"), we ensured that we don't have all_threads_mask
zeroed anymore. But one test was missed for the write() to the sync pipe.
This results in a situation where when running single-threaded, once a
server status changes, a wake-up message is written to the pipe and never
consumed, showing a 100% CPU usage.

No backport is needed.
2018-08-07 10:07:15 +02:00
Willy Tarreau
65e94d1ce9 [RELEASE] Released version 1.9-dev1
Released version 1.9-dev1 with the following main changes :
    - BUG/MEDIUM: kqueue: Don't bother closing the kqueue after fork.
    - DOC: cache: update sections and fix some typos
    - BUILD/MINOR: deviceatlas: enable thread support
    - BUG/MEDIUM: tcp-check: Don't lock the server in tcpcheck_main
    - BUG/MEDIUM: ssl: don't allocate shctx several time
    - BUG/MEDIUM: cache: bad computation of the remaining size
    - BUILD: checks: don't include server.h
    - BUG/MEDIUM: stream: fix session leak on applet-initiated connections
    - BUILD/MINOR: haproxy : FreeBSD/cpu affinity needs pthread_np header
    - BUILD/MINOR: Makefile : enabling USE_CPU_AFFINITY
    - BUG/MINOR: ssl: CO_FL_EARLY_DATA removal is managed by stream
    - BUG/MEDIUM: threads/peers: decrement, not increment jobs on quitting
    - BUG/MEDIUM: h2: don't report an error after parsing a 100-continue response
    - BUG/MEDIUM: peers: fix some track counter rules dont register entries for sync.
    - BUG/MAJOR: thread/peers: fix deadlock on peers sync.
    - BUILD/MINOR: haproxy: compiling config cpu parsing handling when needed
    - MINOR: config: report when "monitor fail" rules are misplaced
    - BUG/MINOR: mworker: fix validity check for the pipe FDs
    - BUG/MINOR: mworker: detach from tty when in daemon mode
    - MINOR: threads: Fix pthread_setaffinity_np on FreeBSD.
    - BUG/MAJOR: thread: Be sure to request a sync between threads only once at a time
    - BUILD: Fix LDFLAGS vs. LIBS re linking order in various makefiles
    - BUG/MEDIUM: checks: Be sure we have a mux if we created a cs.
    - BUG/MINOR: hpack: fix debugging output of pseudo header names
    - BUG/MINOR: hpack: must reject huffman literals padded with more than 7 bits
    - BUG/MINOR: hpack: reject invalid header index
    - BUG/MINOR: hpack: dynamic table size updates are only allowed before headers
    - BUG/MAJOR: h2: correctly check the request length when building an H1 request
    - BUG/MINOR: h2: immediately close if receiving GOAWAY after the last stream
    - BUG/MINOR: h2: try to abort closed streams as soon as possible
    - BUG/MINOR: h2: ":path" must not be empty
    - BUG/MINOR: h2: fix a typo causing PING/ACK to be responded to
    - BUG/MINOR: h2: the TE header if present may only contain trailers
    - BUG/MEDIUM: h2: enforce the per-connection stream limit
    - BUG/MINOR: h2: do not accept SETTINGS_ENABLE_PUSH other than 0 or 1
    - BUG/MINOR: h2: reject incorrect stream dependencies on HEADERS frame
    - BUG/MINOR: h2: properly check PRIORITY frames
    - BUG/MINOR: h2: reject response pseudo-headers from requests
    - BUG/MEDIUM: h2: remove connection-specific headers from request
    - BUG/MEDIUM: h2: do not accept upper case letters in request header names
    - BUG/MINOR: h2: use the H2_F_DATA_* macros for DATA frames
    - BUG/MINOR: action: Don't check http capture rules when no id is defined
    - BUG/MAJOR: hpack: don't pretend large headers fit in empty table
    - BUG/MINOR: ssl: support tune.ssl.cachesize 0 again
    - BUG/MEDIUM: mworker: also close peers sockets in the master
    - BUG/MEDIUM: ssl engines: Fix async engines fds were not considered to fix fd limit automatically.
    - BUG/MEDIUM: checks: a down server going to maint remains definitely stucked on down state.
    - BUG/MEDIUM: peers: set NOLINGER on the outgoing stream interface
    - BUG/MEDIUM: h2: fix handling of end of stream again
    - MINOR: mworker: Update messages referencing exit-on-failure
    - MINOR: mworker: Improve wording in `void mworker_wait()`
    - CONTRIB: halog: Add help text for -s switch in halog program
    - BUG/MEDIUM: email-alert: don't set server check status from a email-alert task
    - BUG/MEDIUM: threads/vars: Fix deadlock in register_name
    - MINOR: systemd: remove comment about HAPROXY_STATS_SOCKET
    - DOC: notifications: add precisions about thread usage
    - BUG/MEDIUM: lua/notification: memory leak
    - MINOR: conn_stream: add new flag CS_FL_RCV_MORE to indicate pending data
    - BUG/MEDIUM: stream-int: always set SI_FL_WAIT_ROOM on CS_FL_RCV_MORE
    - BUG/MEDIUM: h2: automatically set CS_FL_RCV_MORE when the output buffer is full
    - BUG/MEDIUM: h2: enable recv polling whenever demuxing is possible
    - BUG/MEDIUM: h2: work around a connection API limitation
    - BUG/MEDIUM: h2: debug incoming traffic in h2_wake()
    - MINOR: h2: store the demux padding length in the h2c struct
    - BUG/MEDIUM: h2: support uploading partial DATA frames
    - MINOR: h2: don't demand that a DATA frame is complete before processing it
    - BUG/MEDIUM: h2: don't switch the state to HREM before end of DATA frame
    - BUG/MEDIUM: h2: don't close after the first DATA frame on tunnelled responses
    - BUG/MEDIUM: http: don't disable lingering on requests with tunnelled responses
    - BUG/MEDIUM: h2: fix stream limit enforcement
    - BUG/MINOR: stream-int: don't try to receive again after receiving an EOS
    - MINOR: sample: add len converter
    - BUG: MAJOR: lb_map: server map calculation broken
    - BUG: MINOR: http: don't check http-request capture id when len is provided
    - MINOR: sample: rename the "len" converter to "length"
    - BUG/MEDIUM: mworker: Set FD_CLOEXEC flag on log fd
    - DOC/MINOR: intro: typo, wording, formatting fixes
    - MINOR: netscaler: respect syntax
    - MINOR: netscaler: remove the use of cip_magic only used once
    - MINOR: netscaler: rename cip_len to clarify its uage
    - BUG/MEDIUM: netscaler: use the appropriate IPv6 header size
    - BUG/MAJOR: netscaler: address truncated CIP header detection
    - MINOR: netscaler: check in one-shot if buffer is large enough for IP and TCP header
    - MEDIUM: netscaler: do not analyze original IP packet size
    - MEDIUM: netscaler: add support for standard NetScaler CIP protocol
    - MINOR: spoe: add force-set-var option in spoe-agent configuration
    - CONTRIB: iprange: Fix compiler warning in iprange.c
    - CONTRIB: halog: Fix compiler warnings in halog.c
    - BUG/MINOR: h2: properly report a stream error on RST_STREAM
    - MINOR: mux: add flags to describe a mux's capabilities
    - MINOR: stream-int: set flag SI_FL_CLEAN_ABRT when mux supports clean aborts
    - BUG/MEDIUM: stream: don't consider abortonclose on muxes which close cleanly
    - BUG/MEDIUM: checks: a server passed in maint state was not forced down.
    - BUG/MEDIUM: lua: fix crash when using bogus mode in register_service()
    - MINOR: http: adjust the list of supposedly cacheable methods
    - MINOR: http: update the list of cacheable status codes as per RFC7231
    - MINOR: http: start to compute the transaction's cacheability from the request
    - BUG/MINOR: http: do not ignore cache-control: public
    - BUG/MINOR: http: properly detect max-age=0 and s-maxage=0 in responses
    - BUG/MINOR: cache: do not force the TX_CACHEABLE flag before checking cacheability
    - MINOR: http: add a function to check request's cache-control header field
    - BUG/MEDIUM: cache: do not try to retrieve host-less requests from the cache
    - BUG/MEDIUM: cache: replace old object on store
    - BUG/MEDIUM: cache: respect the request cache-control header
    - BUG/MEDIUM: cache: don't cache the response on no-cache="set-cookie"
    - BUG/MAJOR: connection: refine the situations where we don't send shutw()
    - BUG/MEDIUM: checks: properly set servers to stopping state on 404
    - BUG/MEDIUM: h2: properly handle and report some stream errors
    - BUG/MEDIUM: h2: improve handling of frames received on closed streams
    - DOC/MINOR: configuration: typo, formatting fixes
    - BUG/MEDIUM: h2: ensure we always know the stream before sending a reset
    - BUG/MEDIUM: mworker: don't close stdio several time
    - MINOR: don't close stdio anymore
    - BUG/MEDIUM: http: don't automatically forward request close
    - BUG/MAJOR: hpack: don't return direct references to the dynamic headers table
    - MINOR: h2: add a function to report pseudo-header names
    - DEBUG: hpack: make hpack_dht_dump() expose the output file
    - DEBUG: hpack: add more traces to the hpack decoder
    - CONTRIB: hpack: add an hpack decoder
    - MEDIUM: h2: prepare a graceful shutdown when the frontend is stopped
    - BUG/MEDIUM: h2: properly handle the END_STREAM flag on empty DATA frames
    - BUILD: ssl: silence a warning when building without NPN nor ALPN support
    - CLEANUP: rbtree: remove
    - BUG/MEDIUM: ssl: cache doesn't release shctx blocks
    - BUG/MINOR: lua: Fix default value for pattern in Socket.receive
    - DOC: lua: Fix typos in comments of hlua_socket_receive
    - BUG/MEDIUM: lua: Fix IPv6 with separate port support for Socket.connect
    - BUG/MINOR: lua: Fix return value of Socket.settimeout
    - MINOR: dns: Handle SRV record weight correctly.
    - BUG/MEDIUM: mworker: execvp failure depending on argv[0]
    - MINOR: hathreads: add support for gcc < 4.7
    - BUILD/MINOR: ancient gcc versions atomic fix
    - BUG/MEDIUM: stream: properly handle client aborts during redispatch
    - MINOR: spoe: add register-var-names directive in spoe-agent configuration
    - MINOR: spoe: Don't queue a SPOE context if nothing is sent
    - DOC: clarify the scope of ssl_fc_is_resumed
    - CONTRIB: debug: fix a few flags definitions
    - BUG/MINOR: poll: too large size allocation for FD events
    - MINOR: sample: add date_us sample
    - BUG/MEDIUM: peers: fix expire date wasn't updated if entry is modified remotely.
    - MINOR: servers: Don't report duplicate dyncookies for disabled servers.
    - MINOR: global/threads: move cpu_map at the end of the global struct
    - MINOR: threads: add a MAX_THREADS define instead of LONGBITS
    - MINOR: global: add some global activity counters to help debugging
    - MINOR: threads/fd: Use a bitfield to know if there are FDs for a thread in the FD cache
    - BUG/MEDIUM: threads/polling: Use fd_cache_mask instead of fd_cache_num
    - BUG/MEDIUM: fd: maintain a per-thread update mask
    - MINOR: fd: add a bitmask to indicate that an FD is known by the poller
    - BUG/MEDIUM: epoll/threads: use one epoll_fd per thread
    - BUG/MEDIUM: kqueue/threads: use one kqueue_fd per thread
    - BUG/MEDIUM: threads/mworker: fix a race on startup
    - BUG/MINOR: mworker: only write to pidfile if it exists
    - MINOR: threads: Fix build when we're not compiling with threads.
    - BUG/MINOR: threads: always set an owner to the thread_sync pipe
    - BUG/MEDIUM: threads/server: Fix deadlock in srv_set_stopping/srv_set_admin_flag
    - BUG/MEDIUM: checks: Don't try to release undefined conn_stream when a check is freed
    - BUG/MINOR: kqueue/threads: Don't forget to close kqueue_fd[tid] on each thread
    - MINOR: threads: Use __decl_hathreads instead of #ifdef/#endif
    - BUILD: epoll/threads: Add test on MAX_THREADS to avoid warnings when complied without threads
    - BUILD: kqueue/threads: Add test on MAX_THREADS to avoid warnings when complied without threads
    - CLEANUP: sample: Fix comment encoding of sample.c
    - CLEANUP: sample: Fix outdated comment about sample casts functions
    - BUG/MINOR: sample: Fix output type of c_ipv62ip
    - CLEANUP: Fix typo in ARGT_MSK6 comment
    - CLEANUP: standard: Use len2mask4 in str2mask
    - MINOR: standard: Add str2mask6 function
    - MINOR: config: Add support for ARGT_MSK6
    - MEDIUM: sample: Add IPv6 support to the ipmask converter
    - MINOR: config: Enable tracking of up to MAX_SESS_STKCTR stick counters.
    - BUG/MINOR: cli: use global.maxsock and not maxfd to list all FDs
    - MINOR: polling: make epoll and kqueue not depend on maxfd anymore
    - MINOR: fd: don't report maxfd in alert messages
    - MEDIUM: polling: start to move maxfd computation to the pollers
    - CLEANUP: fd/threads: remove the now unused fdtab_lock
    - MINOR: poll: more accurately compute the new maxfd in the loop
    - CLEANUP: fd: remove the unused "new" field
    - MINOR: fd: move the hap_fd_{clr,set,isset} functions to fd.h
    - MEDIUM: select: make use of hap_fd_* functions
    - MEDIUM: fd: use atomic ops for hap_fd_{clr,set} and remove poll_lock
    - MEDIUM: select: don't use the old FD state anymore
    - MEDIUM: poll: don't use the old FD state anymore
    - MINOR: fd: pass the iocb and owner to fd_insert()
    - BUG/MINOR: threads: Update labels array because of changes in lock_label enum
    - MINOR: stick-tables: Adds support for new "gpc1" and "gpc1_rate" counters.
    - BUG/MINOR: epoll/threads: only call epoll_ctl(DEL) on polled FDs
    - DOC: don't suggest using http-server-close
    - MINOR: introduce proxy-v2-options for send-proxy-v2
    - BUG/MEDIUM: spoe: Always try to receive or send the frame to detect shutdowns
    - BUG/MEDIUM: spoe: Allow producer to read and to forward shutdown on request side
    - MINOR: spoe: Remove check on min_applets number when a SPOE context is queued
    - MINOR: spoe: Always link a SPOE context with the applet processing it
    - MINOR: spoe: Replace sending_rate by a frequency counter
    - MINOR: spoe: Count the number of frames waiting for an ack for each applet
    - MEDIUM: spoe: Use an ebtree to manage idle applets
    - MINOR: spoa_example: Count the number of frames processed by each worker
    - MINOR: spoe: Add max-waiting-frames directive in spoe-agent configuration
    - MINOR: init: make stdout unbuffered
    - MINOR: early data: Don't rely on CO_FL_EARLY_DATA to wake up streams.
    - MINOR: early data: Never remove the CO_FL_EARLY_DATA flag.
    - MINOR: compiler: introduce offsetoff().
    - MINOR: threads: Introduce double-width CAS on x86_64 and arm.
    - MINOR: threads: add test and set/reset operations
    - MINOR: pools/threads: Implement lockless memory pools.
    - MAJOR: fd/threads: Make the fdcache mostly lockless.
    - MEDIUM: fd/threads: Make sure we don't miss a fd cache entry.
    - MAJOR: fd: compute the new fd polling state out of the fd lock
    - MINOR: epoll: get rid of the now useless fd_compute_new_polled_status()
    - MINOR: kqueue: get rid of the now useless fd_compute_new_polled_status()
    - MINOR: poll: get rid of the now useless fd_compute_new_polled_status()
    - MINOR: select: get rid of the now useless fd_compute_new_polled_status()
    - CLEANUP: fd: remove the now unused fd_compute_new_polled_status() function
    - MEDIUM: fd: make updt_fd_polling() use atomics
    - MEDIUM: poller: use atomic ops to update the fdtab mask
    - MINOR: fd: move the fd_{add_to,rm_from}_fdlist functions to fd.c
    - BUG/MINOR: fd/threads: properly dereference fdcache as volatile
    - MINOR: fd: remove the unneeded last CAS when adding an fd to the list
    - MINOR: fd: reorder fd_add_to_fd_list()
    - BUG/MINOR: time/threads: ensure the adjusted time is always correct
    - BUG/MEDIUM: standard: Fix memory leak in str2ip2()
    - MINOR: init: emit warning when -sf/-sd cannot parse argument
    - BUILD: fd/threads: fix breakage build breakage without threads
    - DOC: Describe routing impact of using interface keyword on bind lines
    - DOC: Mention -Ws in the list of available options
    - BUG/MINOR: config: don't emit a warning when global stats is incompletely configured
    - BUG/MINOR: fd/threads: properly lock the FD before adding it to the fd cache.
    - BUG/MEDIUM: threads: fix the double CAS implementation for ARMv7
    - BUG/MEDIUM: ssl: Don't always treat SSL_ERROR_SYSCALL as unrecovarable.
    - BUILD/MINOR: memory: stdint is needed for uintptr_t
    - BUG/MINOR: init: Add missing brackets in the code parsing -sf/-st
    - DOC: lua: new prototype for function "register_action()"
    - DOC: cfgparse: Warn on option (tcp|http)log in backend
    - BUG/MINOR: ssl/threads: Make management of the TLS ticket keys files thread-safe
    - MINOR: sample: add a new "concat" converter
    - BUG/MEDIUM: ssl: Shutdown the connection for reading on SSL_ERROR_SYSCALL
    - BUG/MEDIUM: http: Switch the HTTP response in tunnel mode as earlier as possible
    - BUG/MEDIUM: ssl/sample: ssl_bc_* fetch keywords are broken.
    - MINOR: ssl/sample: adds ssl_bc_is_resumed fetch keyword.
    - CLEANUP: cfgparse: Remove unused label end
    - CLEANUP: spoe: Remove unused label retry
    - CLEANUP: h2: Remove unused labels from mux_h2.c
    - CLEANUP: pools: Remove unused end label in memory.h
    - CLEANUP: standard: Fix typo in IPv6 mask example
    - BUG/MINOR: pools/threads: don't ignore DEBUG_UAF on double-word CAS capable archs
    - BUG/MINOR: debug/pools: properly handle out-of-memory when building with DEBUG_UAF
    - MINOR: debug/pools: make DEBUG_UAF also detect underflows
    - MINOR: stats: display the number of threads in the statistics.
    - BUG/MINOR: h2: Set the target of dbuf_wait to h2c
    - BUG/MEDIUM: h2: always consume any trailing data after end of output buffers
    - BUG/MEDIUM: buffer: Fix the wrapping case in bo_putblk
    - BUG/MEDIUM: buffer: Fix the wrapping case in bi_putblk
    - BUG/MEDIUM: spoe: Remove idle applets from idle list when HAProxy is stopping
    - Revert "BUG/MINOR: send-proxy-v2: string size must include ('\0')"
    - MINOR: ssl: extract full pkey info in load_certificate
    - MINOR: ssl: add ssl_sock_get_pkey_algo function
    - MINOR: ssl: add ssl_sock_get_cert_sig function
    - MINOR: connection: add proxy-v2-options ssl-cipher,cert-sig,cert-key
    - MINOR: connection: add proxy-v2-options authority
    - MINOR: systemd: Add section for SystemD sandboxing to unit file
    - MINOR: systemd: Add SystemD's Protect*= options to the unit file
    - MINOR: systemd: Add SystemD's SystemCallFilter option to the unit file
    - CLEANUP: h2: rename misleading h2c_stream_close() to h2s_close()
    - MINOR: h2: provide and use h2s_detach() and h2s_free()
    - MEDIUM: h2: use a single buffer allocator
    - MINOR/BUILD: fix Lua build on Mac OS X
    - BUILD/MINOR: fix Lua build on Mac OS X (again)
    - BUG/MINOR: session: Fix tcp-request session failure if handshake.
    - CLEANUP: .gitignore: Ignore binaries from the contrib directory
    - BUG/MINOR: unix: Don't mess up when removing the socket from the xfer_sock_list.
    - DOC: buffers: clarify the purpose of the <from> pointer in offer_buffers()
    - BUG/MEDIUM: h2: also arm the h2 timeout when sending
    - BUG/MINOR: cli: Fix a crash when passing a negative or too large value to "show fd"
    - CLEANUP: ssl: Remove a duplicated #include
    - CLEANUP: cli: Remove a leftover debug message
    - BUG/MINOR: cli: Fix a typo in the 'set rate-limit' usage
    - BUG/MEDIUM: fix a 100% cpu usage with cpu-map and nbthread/nbproc
    - BUG/MINOR: force-persist and ignore-persist only apply to backends
    - BUG/MEDIUM: threads/unix: Fix a deadlock when a listener is temporarily disabled
    - BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management
    - BUG/MINOR: dns: don't downgrade DNS accepted payload size automatically
    - TESTS: Add a testcase for multi-port + multi-server listener issue
    - CLEANUP: dns: remove duplicate code in src/dns.c
    - BUG/MINOR: seemless reload: Fix crash when an interface is specified.
    - BUG/MINOR: cli: Ensure all command outputs end with a LF
    - BUG/MINOR: cli: Fix a crash when sending a command with too many arguments
    - BUILD: ssl: Fix build with OpenSSL without NPN capability
    - BUG/MINOR: spoa-example: unexpected behavior for more than 127 args
    - BUG/MINOR: lua: return bad error messages
    - CLEANUP: lua/syntax: lua is a name and not an acronym
    - BUG/MEDIUM: tcp-check: single connect rule can't detect DOWN servers
    - BUG/MINOR: tcp-check: use the server's service port as a fallback
    - BUG/MEDIUM: threads/queue: wake up other threads upon dequeue
    - MINOR: log: stop emitting alerts when it's not possible to write on the socket
    - BUILD/BUG: enable -fno-strict-overflow by default
    - BUG/MEDIUM: fd/threads: ensure the fdcache_mask always reflects the cache contents
    - DOC: log: more than 2 log servers are allowed
    - MINOR: hash: add new function hash_crc32c
    - MINOR: proxy-v2-options: add crc32c
    - MINOR: accept-proxy: support proxy protocol v2 CRC32c checksum
    - REORG: compact "struct server"
    - MINOR: samples: add crc32c converter
    - BUG/MEDIUM: h2: properly account for DATA padding in flow control
    - BUG/MINOR: h2: ensure we can never send an RST_STREAM in response to an RST_STREAM
    - BUG/MINOR: listener: Don't decrease actconn twice when a new session is rejected
    - CLEANUP: map, stream: remove duplicate code in src/map.c, src/stream.c
    - BUG/MINOR: lua: the function returns anything
    - BUG/MINOR: lua funtion hlua_socket_settimeout don't check negative values
    - CLEANUP: lua: typo fix in comments
    - BUILD/MINOR: fix build when USE_THREAD is not defined
    - MINOR: lua: allow socket api settimeout to accept integers, float, and doubles
    - BUG/MINOR: hpack: fix harmless use of uninitialized value in hpack_dht_insert
    - MINOR: cli/threads: make "show fd" report thread_sync_io_handler instead of "unknown"
    - MINOR: cli: make "show fd" report the mux and mux_ctx pointers when available
    - BUILD/MINOR: cli: fix a build warning introduced by last commit
    - BUG/MAJOR: h2: remove orphaned streams from the send list before closing
    - MINOR: h2: always call h2s_detach() in h2_detach()
    - MINOR: h2: fuse h2s_detach() and h2s_free() into h2s_destroy()
    - BUG/MEDIUM: h2/threads: never release the task outside of the task handler
    - BUG/MEDIUM: h2: don't consider pending data on detach if connection is in error
    - BUILD/MINOR: threads: always export thread_sync_io_handler()
    - MINOR: mux: add a "show_fd" function to dump debugging information for "show fd"
    - MINOR: h2: implement a basic "show_fd" function
    - MINOR: cli: report cache indexes in "show fd"
    - BUG/MINOR: h2: remove accidental debug code introduced with show_fd function
    - BUG/MEDIUM: h2: always add a stream to the send or fctl list when blocked
    - BUG/MINOR: checks: check the conn_stream's readiness and not the connection
    - BUG/MINOR: fd: Don't clear the update_mask in fd_insert.
    - BUG/MINOR: email-alert: Set the mailer port during alert initialization
    - BUG/MINOR: cache: fix "show cache" output
    - BUG/MAJOR: cache: fix random crashes caused by incorrect delete() on non-first blocks
    - BUG/MINOR: spoe: Initialize variables used during conf parsing before any check
    - BUG/MINOR: spoe: Don't release the context buffer in .check_timeouts callbaclk
    - BUG/MINOR: spoe: Register the variable to set when an error occurred
    - BUG/MINOR: spoe: Don't forget to decrement fpa when a processing is interrupted
    - MINOR: spoe: Add metrics in to know time spent in the SPOE
    - MINOR: spoe: Add options to store processing times in variables
    - MINOR: log: move 'log' keyword parsing in dedicated function
    - MINOR: log: Keep the ref when a log server is copied to avoid duplicate entries
    - MINOR: spoe: Add loggers dedicated to the SPOE agent
    - MINOR: spoe: Add support for option dontlog-normal in the SPOE agent section
    - MINOR: spoe: use agent's logger to log SPOE messages
    - MINOR: spoe: Add counters to log info about SPOE agents
    - BUG/MAJOR: cache: always initialize newly created objects
    - MINOR: servers: Support alphanumeric characters for the server templates names
    - BUG/MEDIUM: threads: Fix the max/min calculation because of name clashes
    - BUG/MEDIUM: connection: Make sure we have a mux before calling detach().
    - BUG/MINOR: http: Return an error in proxy mode when url2sa fails
    - MINOR: proxy: Add fe_defbe fetcher
    - MINOR: config: Warn if resolvers has no nameservers
    - BUG/MINOR: cli: Guard against NULL messages when using CLI_ST_PRINT_FREE
    - MINOR: cli: Ensure the CLI always outputs an error when it should
    - MEDIUM: sample: Extend functionality for field/word converters
    - MINOR: export localpeer as an environment variable
    - BUG/MEDIUM: kqueue: When adding new events, provide an output to get errors.
    - BUILD: sample: avoid build warning in sample.c
    - BUG/CRITICAL: h2: fix incorrect frame length check
    - DOC: lua: update the links to the config and Lua API
    - BUG/MINOR: pattern: Add a missing HA_SPIN_INIT() in pat_ref_newid()
    - BUG/MAJOR: channel: Fix crash when trying to read from a closed socket
    - BUG/MINOR: log: t_idle (%Ti) is not set for some requests
    - BUG/MEDIUM: lua: Fix segmentation fault if a Lua task exits
    - MINOR: h2: detect presence of CONNECT and/or content-length
    - BUG/MEDIUM: h2: implement missing support for chunked encoded uploads
    - BUG/MINOR: spoe: Fix counters update when processing is interrupted
    - BUG/MINOR: spoe: Fix parsing of dontlog-normal option
    - MEDIUM: cli: Add payload support
    - MINOR: map: Add payload support to "add map"
    - MINOR: ssl: Add payload support to "set ssl ocsp-response"
    - BUG/MINOR: lua/threads: Make lua's tasks sticky to the current thread
    - MINOR: sample: Add strcmp sample converter
    - MINOR: http: Add support for 421 Misdirected Request
    - BUG/MINOR: config: disable http-reuse on TCP proxies
    - MINOR: ssl: disable SSL sample fetches when unsupported
    - MINOR: ssl: add fetch 'ssl_fc_session_key' and 'ssl_bc_session_key'
    - BUG/MINOR: checks: Fix check->health computation for flapping servers
    - BUG/MEDIUM: threads: Fix the sync point for more than 32 threads
    - BUG/MINOR, BUG/MINOR: lua: Put tasks to sleep when waiting for data
    - MINOR: backend: implement random-based load balancing
    - DOC/MINOR: clean up LUA documentation re: servers & array/table.
    - MINOR: lua: Add server name & puid to LUA Server class.
    - MINOR: lua: add get_maxconn and set_maxconn to LUA Server class.
    - BUG/MINOR: map: correctly track reference to the last ref_elt being dumped
    - BUG/MEDIUM: task: Don't free a task that is about to be run.
    - MINOR: fd: Make the lockless fd list work with multiple lists.
    - BUG/MEDIUM: pollers: Use a global list for fd shared between threads.
    - MINOR: pollers: move polled_mask outside of struct fdtab.
    - BUG/MINOR: lua: schedule socket task upon lua connect()
    - BUG/MINOR: lua: ensure large proxy IDs can be represented
    - BUG/MEDIUM: pollers/kqueue: use incremented position in event list
    - BUG/MINOR: cli: don't stop cli_gen_usage_msg() when kw->usage == NULL
    - BUG/MEDIUM: http: don't always abort transfers on CF_SHUTR
    - BUG/MEDIUM: ssl: properly protect SSL cert generation
    - BUG/MINOR: lua: Socket.send threw runtime error: 'close' needs 1 arguments.
    - BUG/MINOR: spoe: Mistake in error message about SPOE configuration
    - BUG/MEDIUM: spoe: Flags are not encoded in network order
    - CLEANUP: spoe: Remove unused variables the agent structure
    - DOC: spoe: fix a typo
    - BUG/MEDIUM: contrib/mod_defender: Use network order to encode/decode flags
    - BUG/MEDIUM: contrib/modsecurity: Use network order to encode/decode flags
    - DOC: add some description of the pending rework of the buffer structure
    - BUG/MINOR: ssl/lua: prevent lua from affecting automatic maxconn computation
    - MINOR: lua: Improve error message
    - BUG/MEDIUM: cache: don't cache when an Authorization header is present
    - MINOR: ssl: set SSL_OP_PRIORITIZE_CHACHA
    - BUG/MEDIUM: dns: Delay the attempt to run a DNS resolution on check failure.
    - BUG/BUILD: threads: unbreak build without threads
    - BUG/MEDIUM: servers: Add srv_addr default placeholder to the state file
    - BUG/MEDIUM: lua/socket: Length required read doesn't work
    - MINOR: tasks: Change the task API so that the callback takes 3 arguments.
    - MAJOR: tasks: Create a per-thread runqueue.
    - MAJOR: tasks: Introduce tasklets.
    - MINOR: tasks: Make the number of tasks to run at once configurable.
    - MAJOR: applets: Use tasks, instead of rolling our own scheduler.
    - BUG/MEDIUM: stick-tables: Decrement ref_cnt in table_* converters
    - MINOR: http: Log warning if (add|set)-header fails
    - DOC: management: add the new wrew stats column
    - MINOR: stats: also report the failed header rewrites warnings on the stats page
    - BUG/MEDIUM: tasks: Don't forget to increase/decrease tasks_run_queue.
    - BUG/MEDIUM: task: Don't forget to decrement max_processed after each task.
    - MINOR: task: Also consider the task list size when getting global tasks.
    - MINOR: dns: Implement `parse-resolv-conf` directive
    - BUG/MEDIUM: spoe: Return an error when the wrong ACK is received in sync mode
    - MINOR: task/notification: Is notifications registered ?
    - BUG/MEDIUM: lua/socket: wrong scheduling for sockets
    - BUG/MAJOR: lua: Dead lock with sockets
    - BUG/MEDIUM: lua/socket: Notification error
    - BUG/MEDIUM: lua/socket: Sheduling error on write: may dead-lock
    - BUG/MEDIUM: lua/socket: Buffer error, may segfault
    - DOC: contrib/modsecurity: few typo fixes
    - DOC: SPOE.txt: fix a typo
    - MAJOR: spoe: upgrade the SPOP version to 2.0 and remove the support for 1.0
    - BUG/MINOR: contrib/spoa_example: Don't reset the status code during disconnect
    - BUG/MINOR: contrib/mod_defender: Don't reset the status code during disconnect
    - BUG/MINOR: contrib/modsecurity: Don't reset the status code during disconnect
    - BUG/MINOR: contrib/mod_defender: update pointer on the end of the frame
    - BUG/MINOR: contrib/modsecurity: update pointer on the end of the frame
    - MINOR: task: Fix a compiler warning by adding a cast.
    - MINOR: stats: also report the nice and number of calls for applets
    - MINOR: applet: assign the same nice value to a new appctx as its owner task
    - MINOR: task: Fix compiler warning.
    - BUG/MEDIUM: tasks: Use the local runqueue when building without threads.
    - MINOR: tasks: Don't define rqueue if we're building without threads.
    - BUG/MINOR: unix: Make sure we can transfer abns sockets on seamless reload.
    - MINOR: lua: Increase debug information
    - BUG/MEDIUM: threads: handle signal queue only in thread 0
    - BUG/MINOR: don't ignore SIG{BUS,FPE,ILL,SEGV} during signal processing
    - BUG/MINOR: signals: ha_sigmask macro for multithreading
    - BUG/MAJOR: map: fix a segfault when using http-request set-map
    - DOC: regression testing: Add a short starting guide.
    - MINOR: tasks: Make sure we correctly init and deinit a tasklet.
    - BUG/MINOR: tasklets: Just make sure we don't pass a tasklet to the handler.
    - BUG/MINOR: lua: Segfaults with wrong usage of types.
    - BUG/MAJOR: ssl: Random crash with cipherlist capture
    - BUG/MAJOR: ssl: OpenSSL context is stored in non-reserved memory slot
    - BUG/MEDIUM: ssl: do not store pkinfo with SSL_set_ex_data
    - MINOR: tests: First regression testing file.
    - MINOR: reg-tests: Add reg-tests/README file.
    - MINOR: reg-tests: Add a few regression testing files.
    - DOC: Add new REGTEST tag info about reg testing.
    - BUG/MEDIUM: fd: Don't modify the update_mask in fd_dodelete().
    - MINOR: Some spelling cleanup in the comments.
    - BUG/MEDIUM: threads: Use the sync point to check active jobs and exit
    - MINOR: threads: Be sure to remove threads from all_threads_mask on exit
    - REGTEST/MINOR: Wrong URI in a reg test for SSL/TLS.
    - REGTEST/MINOR: Set HAPROXY_PROGRAM default value.
    - REGTEST/MINOR: Add levels to reg-tests target.
    - BUG/MAJOR: Stick-tables crash with segfault when the key is not in the stick-table
    - BUG/BUILD: threads: unbreak build without threads
    - BUG/MAJOR: stick_table: Complete incomplete SEGV fix
    - MINOR: stick-tables: make stktable_release() do nothing on NULL
    - BUG/MEDIUM: lua: possible CLOSE-WAIT state with '\n' headers
    - MINOR: startup: change session/process group settings
    - MINOR: systemd: consider exit status 143 as successful
    - REGTEST/MINOR: Wrong URI syntax.
    - CLEANUP: dns: remove obsolete macro DNS_MAX_IP_REC
    - CLEANUP: dns: inacurate comment about prefered IP score
    - MINOR: dns: fix wrong score computation in dns_get_ip_from_response
    - MINOR: dns: new DNS options to allow/prevent IP address duplication
    - REGTEST/MINOR: Unexpected curl URL globling.
    - BUG/MINOR: ssl: properly ref-count the tls_keys entries
    - MINOR: h2: keep a count of the number of conn_streams attached to the mux
    - BUG/MEDIUM: h2: don't accept new streams if conn_streams are still in excess
    - MINOR: h2: add the mux and demux buffer lengths on "show fd"
    - BUG/MEDIUM: h2: never leave pending data in the output buffer on close
    - BUG/MEDIUM: h2: make sure the last stream closes the connection after a timeout
    - MINOR: tasklet: Set process to NULL.
    - MINOR: buffer: implement a new file for low-level buffer manipulation functions
    - MINOR: buffer: switch buffer sizes and offsets to size_t
    - MINOR: buffer: add a few basic functions for the new API
    - MINOR: buffer: Introduce b_sub(), b_add(), and bo_add()
    - MINOR: buffer: Add b_set_data().
    - MINOR: buffer: introduce b_realign_if_empty()
    - MINOR: compression: pass the channel to http_compression_buffer_end()
    - MINOR: channel: add a few basic functions for the new buffer API
    - MINOR: channel/buffer: use c_realign_if_empty() instead of buffer_realign()
    - MINOR: channel/buffer: replace buffer_slow_realign() with channel_slow_realign() and b_slow_realign()
    - MEDIUM: channel: make channel_slow_realign() take a swap buffer
    - MINOR: h2: use b_slow_realign() with the trash as a swap buffer
    - MINOR: buffer: remove buffer_slow_realign() and the swap_buffer allocation code
    - MINOR: channel/buffer: replace b_{adv,rew} with c_{adv,rew}
    - MINOR: buffer: replace calls to buffer_space_wraps() with b_space_wraps()
    - MINOR: buffer: remove bi_getblk() and bi_getblk_nc()
    - MINOR: buffer: split bi_contig_data() into ci_contig_data and b_config_data()
    - MINOR: buffer: remove bi_ptr()
    - MINOR: buffer: remove bo_ptr()
    - MINOR: buffer: remove bo_end()
    - MINOR: buffer: remove bi_end()
    - MINOR: buffer: remove bo_contig_data()
    - MINOR: buffer: merge b{i,o}_contig_space()
    - MINOR: buffer: replace bo_getblk() with direction agnostic b_getblk()
    - MINOR: buffer: replace bo_getblk_nc() with b_getblk_nc() which takes an offset
    - MINOR: buffer: replace bi_del() and bo_del() with b_del()
    - MINOR: buffer: convert most b_ptr() calls to c_ptr()
    - MINOR: h1: make h1_measure_trailers() take the byte count in argument
    - MINOR: h2: clarify the fact that the send functions are unsigned
    - MEDIUM: h2: prevent the various mux encoders from modifying the buffer
    - MINOR: h1: make h1_skip_chunk_crlf() not depend on b_ptr() anymore
    - MINOR: h1: make h1_parse_chunk_size() not depend on b_ptr() anymore
    - MINOR: h1: make h1_measure_trailers() use an offset and a count
    - MEDIUM: h2: do not use buf->o anymore inside h2_snd_buf's loop
    - MEDIUM: h2: don't use b_ptr() nor b_end() anymore
    - MINOR: buffer: get rid of b_end() and b_to_end()
    - MINOR: buffer: make b_getblk_nc() take const pointers
    - MINOR: buffer: make b_getblk_nc() take size_t for the block sizes
    - MEDIUM: connection: make xprt->snd_buf() take the byte count in argument
    - MEDIUM: mux: make mux->snd_buf() take the byte count in argument
    - MEDIUM: connection: make xprt->rcv_buf() use size_t for the count
    - MEDIUM: mux: make mux->rcv_buf() take a size_t for the count
    - MINOR: connection: add a flags argument to rcv_buf()
    - MINOR: connection: add a new receive flag : CO_RFL_BUF_WET
    - MINOR: buffer: get rid of b_ptr() and convert its last users
    - MINOR: buffer: use b_room() to determine available space in a buffer
    - MINOR: buffer: replace buffer_not_empty() with b_data() or c_data()
    - MINOR: buffer: replace buffer_empty() with b_empty() or c_empty()
    - MINOR: buffer: make bo_putchar() use b_tail()
    - MINOR: buffer: replace buffer_full() with channel_full()
    - MINOR: buffer: replace bi_space_for_replace() with ci_space_for_replace()
    - MINOR: buffer: replace buffer_pending() with ci_data()
    - MINOR: buffer: replace buffer_flush() with c_adv(chn, ci_data(chn))
    - MINOR: buffer: use c_head() instead of buffer_wrap_sub(c->buf, p-o)
    - MINOR: buffer: use b_orig() to replace most references to b->data
    - MINOR: buffer: Use b_add()/bo_add() instead of accessing b->i/b->o.
    - MINOR: channel: remove almost all references to buf->i and buf->o
    - MINOR: channel: Add co_set_data().
    - MEDIUM: channel: adapt to the new buffer API
    - MINOR: checks: adapt to the new buffer API
    - MEDIUM: h2: update to the new buffer API
    - MINOR: buffer: remove unused bo_add()
    - MEDIUM: spoe: use the new buffer API for the SPOE buffer
    - MINOR: stats: adapt to the new buffers API
    - MINOR: cli: use the new buffer API
    - MINOR: cache: use the new buffer API
    - MINOR: stream-int: use the new buffer API
    - MINOR: stream: use wrappers instead of directly manipulating buffers
    - MINOR: backend: use new buffer API
    - MEDIUM: http: use wrappers instead of directly manipulating buffers states
    - MINOR: filters: convert to the new buffer API
    - MINOR: payload: convert to the new buffer API
    - MEDIUM: h1: port to new buffer API.
    - MINOR: flt_trace: adapt to the new buffer API
    - MEDIUM: compression: start to move to the new buffer API
    - MINOR: lua: use the wrappers instead of directly manipulating buffer states
    - MINOR: buffer: convert part bo_putblk() and bi_putblk() to the new API
    - MINOR: buffer: adapt buffer_slow_realign() and buffer_dump() to the new API
    - MAJOR: start to change buffer API
    - MINOR: buffer: remove the check for output on b_del()
    - MINOR: buffer: b_set_data() doesn't truncate output data anymore
    - MINOR: buffer: rename the "data" field to "area"
    - MEDIUM: buffers: move "output" from struct buffer to struct channel
    - MINOR: buffer: replace bi_fast_delete() with b_del()
    - MINOR: buffer: replace b{i,o}_put* with b_put*
    - MINOR: buffer: add a new file for ist + buffer manipulation functions
    - MINOR: checks: use b_putist() instead of b_putstr()
    - MINOR: buffers: remove b_putstr()
    - CLEANUP: buffer: minor cleanups to buffer.h
    - MINOR: buffers/channel: replace buffer_insert_line2() with ci_insert_line2()
    - MINOR: buffer: replace buffer_replace2() with b_rep_blk()
    - MINOR: buffer: rename the data length member to '->data'
    - MAJOR: buffer: finalize buffer detachment
    - MEDIUM: chunks: make the chunk struct's fields match the buffer struct
    - MAJOR: chunks: replace struct chunk with struct buffer
    - DOC: buffers: document the new buffers API
    - DOC: buffers: remove obsolete docs about buffers
    - MINOR: tasklets: Don't attempt to add a tasklet in the list twice.
    - MINOR: connections/mux: Add a new "subscribe" method.
    - MEDIUM: connections/mux: Revamp the send direction.
    - MINOR: connection: simplify subscription by adding a registration function
    - BUG/MINOR: http: Set brackets for the unlikely macro at the right place
    - BUG/MINOR: build: Fix compilation with debug mode enabled
    - BUILD: Generate sha256 checksums in publish-release
    - MINOR: debug: Add check for CO_FL_WILL_UPDATE
    - MINOR: debug: Add checks for conn_stream flags
    - MINOR: ist: Add the function isteqi
    - BUG/MEDIUM: threads: Fix the exit condition of the thread barrier
    - BUG/MEDIUM: mux_h2: Call h2_send() before updating polling.
    - MINOR: buffers: simplify b_contig_space()
    - MINOR: buffers: split b_putblk() into __b_putblk()
    - MINOR: buffers: add b_xfer() to transfer data between buffers
    - DOC: add some design notes about the new layering model
    - MINOR: conn_stream: add a new CS_FL_REOS flag
    - MINOR: conn_stream: add an rx buffer to the conn_stream
    - MEDIUM: conn_stream: add cs_recv() as a default rcv_buf() function
    - MEDIUM: stream-int: automatically call si_cs_recv_cb() if the cs has data on wake()
    - MINOR: h2: make each H2 stream support an intermediary input buffer
    - MEDIUM: h2: make h2_frt_decode_headers() use an intermediary buffer
    - MEDIUM: h2: make h2_frt_transfer_data() copy via an intermediary buffer
    - MEDIUM: h2: centralize transfer of decoded frames in h2_rcv_buf()
    - MEDIUM: h2: move headers and data frame decoding to their respective parsers
    - MEDIUM: buffers: make b_xfer() automatically swap buffers when possible
    - MEDIUM: h2: perform a single call to the data layer in demux()
    - MEDIUM: h2: don't call data_cb->recv() anymore
    - MINOR: h2: make use of CS_FL_REOS to indicate that end of stream was seen
    - MEDIUM: h2: use the default conn_stream's receive function
    - DOC: add more design feedback on the new layering model
    - MINOR: h2: add the error code and the max/last stream IDs to "show fd"
    - BUG/MEDIUM: stream-int: don't immediately enable reading when the buffer was reportedly full
    - BUG/MEDIUM: stats: don't ask for more data as long as we're responding
    - BUG/MINOR: servers: Don't make "server" in a frontend fatal.
    - BUG/MEDIUM: tasks: make sure we pick all tasks in the run queue
    - BUG/MEDIUM: tasks: Decrement rqueue_size at the right time.
    - BUG/MEDIUM: tasks: use atomic ops for active_tasks_mask
    - BUG/MEDIUM: tasks: Make sure there's no task left before considering inactive.
    - MINOR: signal: don't pass the signal number anymore as the wakeup reason
    - MINOR: tasks: extend the state bits from 8 to 16 and remove the reason
    - MINOR: tasks: Add a flag that tells if we're in the global runqueue.
    - BUG/MEDIUM: tasks: make __task_unlink_rq responsible for the rqueue size.
    - MINOR: queue: centralize dequeuing code a bit better
    - MEDIUM: queue: make pendconn_free() work on the stream instead
    - DOC: queue: document the expected locking model for the server's queue
    - MINOR: queue: make sure pendconn->strm->pend_pos is always valid
    - MINOR: queue: use a distinct variable for the assigned server and the queue
    - MINOR: queue: implement pendconn queue locking functions
    - MEDIUM: queue: get rid of the pendconn lock
    - MINOR: tasks: Make active_tasks_mask volatile.
    - MINOR: tasks: Make global_tasks_mask volatile.
    - MINOR: pollers: Add a way to wake a thread sleeping in the poller.
    - MINOR: threads/queue: Get rid of THREAD_WANT_SYNC in the queue code.
    - BUG/MEDIUM: threads/sync: use sched_yield when available
    - MINOR: ssl: BoringSSL matches OpenSSL 1.1.0
    - BUG/MEDIUM: h2: prevent orphaned streams from blocking a connection forever
    - BUG/MINOR: config: stick-table is not supported in defaults section
    - BUILD/MINOR: threads: unbreak build with threads disabled
    - BUG/MINOR: threads: Handle nbthread == MAX_THREADS.
    - BUG/MEDIUM: threads: properly fix nbthreads == MAX_THREADS
    - MINOR: threads: move "nbthread" parsing to hathreads.c
    - BUG/MEDIUM: threads: unbreak "bind" referencing an incorrect thread number
    - MEDIUM: proxy_protocol: Convert IPs to v6 when protocols are mixed
    - BUILD/MINOR: compiler: fix offsetof() on older compilers
    - SCRIPTS: git-show-backports: add missing quotes to "echo"
    - MINOR: threads: add more consistency between certain variables in no-thread case
    - MEDIUM: hathreads: implement a more flexible rendez-vous point
    - BUG/MEDIUM: cli: make "show fd" thread-safe
2018-08-02 18:12:50 +02:00
Willy Tarreau
bf9fd65088 BUG/MEDIUM: cli: make "show fd" thread-safe
The "show fd" command was implemented as a debugging aid but it's not
thread safe. Its features have grown, it can now dump some mux-specific
parts and is being used in production to capture some useful debugging
traces. But it will quickly crash the process when used during an H2 load
test for example, especially when haproxy is built with the DEBUG_UAF
option. It cannot afford not to be thread safe anymore. Let's make use
of the new rendez-vous point using thread_isolate() / thread_release()
to ensure that the data being dumped are not changing under us. The dump
becomes slightly slower under load but now it's safe.

This should be backported to 1.8 along with the rendez-vous point code
once considered stable enough.
2018-08-02 17:51:49 +02:00
Willy Tarreau
60b639ccbe MEDIUM: hathreads: implement a more flexible rendez-vous point
The current synchronization point enforces certain restrictions which
are hard to workaround in certain areas of the code. The fact that the
critical code can only be called from the sync point itself is a problem
for some callback-driven parts. The "show fd" command for example is
fragile regarding this.

Also it is expensive in terms of CPU usage because it wakes every other
thread just to be sure all of them join to the rendez-vous point. It's a
problem because the sleeping threads would not need to be woken up just
to know they're doing nothing.

Here we implement a different approach. We keep track of harmless threads,
which are defined as those either doing nothing, or doing harmless things.
The rendez-vous is used "for others" as a way for a thread to isolate itself.
A thread then requests to be alone using thread_isolate() when approaching
the dangerous area, and then waits until all other threads are either doing
the same or are doing something harmless (typically polling). The function
only returns once the thread is guaranteed to be alone, and the critical
section is terminated using thread_release().
2018-08-02 17:51:45 +02:00
Willy Tarreau
0c026f49e7 MINOR: threads: add more consistency between certain variables in no-thread case
When threads are disabled, some variables such as tid and tid_bit are
still checked everywhere, the MAX_THREADS_MASK macro is ~0UL while
MAX_THREADS is 1, and the all_threads_mask variable is replaced with a
macro forced to zero. The compiler cannot optimize away all this code
involving checks on tid and tid_bit, and we end up in special cases
where all_threads_mask has to be specifically tested for being zero or
not. It is not even certain the code paths are always equivalent when
testing without threads and with nbthread 1.

Let's change this to make sure we always present a single thread when
threads are disabled, and have the relevant values declared as constants
so that the compiler can optimize all the tests away. Now we have
MAX_THREADS_MASK set to 1, all_threads_mask set to 1, tid set to zero
and tid_bit set to 1. Doing just this has removed 4 kB of code in the
no-thread case.

A few checks for all_threads_mask==0 have been removed since it never
happens anymore.
2018-08-02 17:48:09 +02:00
Tim Duesterhus
7fec021537 MEDIUM: proxy_protocol: Convert IPs to v6 when protocols are mixed
http-request set-src possibly creates a situation where src and dst
are from different address families. Convert both addresses to IPv6
to avoid a PROXY UNKNOWN.

This patch should be backported to haproxy 1.8.
2018-07-30 11:23:30 +02:00
Willy Tarreau
c477b6fcc9 BUG/MEDIUM: threads: unbreak "bind" referencing an incorrect thread number
The "process" directive on "bind" lines supports process references and
thread references. No check is performed on the thread number validity,
so that if a listener is only bound to non-existent threads, the traffic
will never be processed. It easily happens when setting one bind line per
thread with an incorrect (or reduced) thread count. No warning appears
and some random connections are never served. It also happens when setting
thread references with threads support disabled at build time.

This patch makes use of the all_threads_mask variable to detect if some
referenced threads don't exist, to emit a warning and fix this.

This patch needs to be backported to 1.8, just like the previous one which
it depends on (MINOR: threads: move "nbthread" parsing to hathreads.c).
2018-07-30 11:10:46 +02:00
Willy Tarreau
0ccd32285f MINOR: threads: move "nbthread" parsing to hathreads.c
The purpose is to make sure that all variables which directly depend
on this nbthread argument are set at the right moment. For now only
all_threads_mask needs to be set. It used to be set while calling
thread_sync_init() which is called too late for certain checks. The
same function handles threads and non-threads, which removes the need
for some thread-specific knowledge from cfgparse.c.
2018-07-30 11:10:46 +02:00
Willy Tarreau
5e954e1f27 BUG/MEDIUM: threads: properly fix nbthreads == MAX_THREADS
While moving Olivier's patch for nbthread==MAX_THREADS in commit
3e12304 ("BUG/MINOR: threads: Handle nbthread == MAX_THREADS.") to
hathreads.c, I missed one place resulting in the computed thread mask
being used as the thread count, which is worse than the initial bug.
Let's fix it properly this time.

This fix must be backported to 1.8 just like the other one.
2018-07-30 11:10:26 +02:00
Olivier Houchard
3e12304ae0 BUG/MINOR: threads: Handle nbthread == MAX_THREADS.
If nbthread is MAX_THREADS, the shift operation needed to compute
all_threads_mask fails in thread_sync_init(). Instead pass a number
of threads to this function and let it compute the mask without
overflowing.

This should be backported to 1.8.
2018-07-27 17:18:22 +02:00
Willy Tarreau
85d9b84eb1 BUILD/MINOR: threads: unbreak build with threads disabled
Depending on the optimization level, gcc may complain that wake_thread()
uses an invalid array index for poller_wr_pipe[] when called from
__task_wakeup(). Normally the condition to get there never happens,
but it's simpler to ifdef out this part of the code which is only
used to wake other threads up. No backport is needed, this was brought
by the recent introduction of the ability to wake a sleeping thread.
2018-07-27 17:18:22 +02:00
Willy Tarreau
c786768dba BUG/MINOR: config: stick-table is not supported in defaults section
Thierry discovered that the following config crashes haproxy while
parsing the config (it's probably the smallest crasher) :

   defaults
       stick-table type ip size 1M

And indeed it does because it looks for the current proxy's name which it
does not have as it's the default one. This affects all versions since 1.6.

This fix must be backported to all versions back to 1.6.
2018-07-27 10:26:22 +02:00
Willy Tarreau
a2b5181e7a BUG/MEDIUM: h2: prevent orphaned streams from blocking a connection forever
Some h2 connections remaining in CLOSE_WAIT state forever have been
reported for a while. Thanks to detailed captures provided by Milan
Petruzelka, the sequence where this happens became clearer :

  1) multiple streams compete for the mux and are queued in the send_list

  2) at this point the mux has to emit a GOAWAY for any reason (for
     example because it received a bad message)

  3) the streams are woken up, notified about the error

  4) h2_detach() is called for each of them

  5) the CS they are detached from the H2S

  6) since the streams are marked as blocked for some room, they are
     orphaned and nothing more is done on them.

  7) at this point, any activity on the connection goes through h2_wake()
     which sees the conneciton in ERROR2 state, tries again to release
     the streams, cannot, and stops polling (thus even connection errors
     cannot be detected anymore).

=> from this point, no more events can be received on the connection, and
   the streams remain orphaned forever.

This patch makes sure that we never return without doing anything once
an error was met. It has to act both on the h2_detach() side (for h2
streams being detached after the error was emitted) and on the h2_wake()
side (for errors reported after h2s have already been orphaned).

Many thanks to Milan Petruzelka and Janusz Dziemidowicz for their
awesome work on this issue, collecting traces and testing patches,
and to Olivier Doucet for extra testing and confirming the fix.

This fix must be backported to 1.8.
2018-07-27 09:55:14 +02:00
Willy Tarreau
3ea2490b48 BUG/MEDIUM: threads/sync: use sched_yield when available
There is a corner case with the sync point which can significantly
degrade performance. The reason is that it forces all threads to busy
spin there, and that if there are less CPUs available than threads,
this busy activity from some threads will force others to wait longer
in epoll() or to simply be scheduled out while doing something else,
and will increase the time needed to reach the sync point.

Given that the sync point is not expected to be stressed *that* much,
better call sched_yield() while waiting there to release the CPU and
offer it to waiting threads.

On a simple test with 4 threads bound to two cores using "maxconn 1"
on the server line, the performance was erratic before the recent
scheduler changes (between 40 and 200 conn/s with hundreds of ms
response time), and it jumped to 7200 with 12ms response time with
this fix applied.

It should be backported to 1.8 since 1.8 is affected as well.
2018-07-27 07:54:08 +02:00
Olivier Houchard
ecfe673f61 MINOR: threads/queue: Get rid of THREAD_WANT_SYNC in the queue code.
Now that we can wake one thread sleeping in the poller, we don't have to
use THREAD_WANT_SYNC any more.

This gives a significant performance boost on highly contended accesses
(servers with maxconn 1), showing a jump from 21k to 31k conn/s on a
test involving 8 threads.
2018-07-26 20:55:02 +02:00
Olivier Houchard
79321b95a8 MINOR: pollers: Add a way to wake a thread sleeping in the poller.
Add a new pipe, one per thread, so that we can write on it to wake a thread
sleeping in a poller, and use it to wake threads supposed to take care of a
task, if they are all sleeping.
2018-07-26 19:09:50 +02:00
Olivier Houchard
eba0c0b51d MINOR: tasks: Make global_tasks_mask volatile.
In order to make sure modifications are noticed by other threads when needed,
make global_tasks_mask volatile.
2018-07-26 19:09:50 +02:00
Olivier Houchard
9b03c0c9a7 MINOR: tasks: Make active_tasks_mask volatile.
To be sure we have the relevant informations, make active_tasks_mask volatile
2018-07-26 19:09:50 +02:00
Willy Tarreau
3201e4e428 MEDIUM: queue: get rid of the pendconn lock
This lock was necessary to manipulate the pendconn element between
concurrent places, but was causing great difficulties in the list walk
by having to iterate over multiple entries instead of being able to
safely pick the first one (in fact the first element was always the
right one but the locking model was hard to prove).

Here since we know we can always rely on the queue's locks, we take
the queue's lock every time we need to modify the element. In practice
it was already the case everywhere except in pendconn_dequeue() which
only works on an element that was already detached. This function had
to be protected against the risk of meeting an incompletely detached
element (which could be unlinked but not yet assigned). By taking the
queue lock around the LIST_ISEMPTY test, it's enough to ensure that a
concurrent thread either didn't begin or had completed the operation.

The true benefit really is in pendconn_process_next_strm() where we
can again safely work with the first element of each queue. This will
significantly simplify next updates to this code.
2018-07-26 17:32:51 +02:00
Willy Tarreau
7c6f8a2b0d MINOR: queue: implement pendconn queue locking functions
The new pendconn_queue_lock() and pendconn_queue_unlock() functions are
made to make it more convenient to lock or unlock the pendconn queue
either at the proxy or the server depending on pendconn->srv. This way
it is possible to remove the open-coding of these locks at various places.
These ones have been used in pendconn_unlink() and pendconn_add(), thus
significantly simplifying the logic there.
2018-07-26 17:32:51 +02:00
Willy Tarreau
88930dd364 MINOR: queue: use a distinct variable for the assigned server and the queue
The pendconn struct uses ->px and ->srv to designate where the element is
queued. There is something confusing regarding threads though, because we
have to lock the appropriate queue before inserting/removing elements, and
this queue may only be determined by looking at ->srv (if it's not NULL
it's the server, otherwise use the proxy). But pendconn_grab_from_px() and
pendconn_process_next_strm() both assign this ->srv field, making it
complicated to know what queue to lock before manipulating the element,
which is exactly why we have the pendconn_lock in the first place.

This commit introduces pendconn->target which is the target server that
the two aforementioned functions will set when assigning the server.
Thanks to this, the server pointer may always be relied on to determine
what queue to use.
2018-07-26 17:32:51 +02:00
Willy Tarreau
c1a60d6218 MINOR: queue: make sure pendconn->strm->pend_pos is always valid
pendconn_add() used to assign strm->pend_pos very late, after unlocking
the queue, so that a watching thread could see a random value in
pendconn->strm->pend_pos even while holding the lock on the element and
the queue itself. While there's currently nothing wrong with this, it
costs nothing to arrange it and will simplify code analysis later.
2018-07-26 17:32:51 +02:00
Willy Tarreau
6bdd05c0ef DOC: queue: document the expected locking model for the server's queue
The locking model is not trivial and is worth documenting to avoid seeing
apparent bugs everywhere while they are not.
2018-07-26 17:32:51 +02:00
Willy Tarreau
d0ad4a87f0 MEDIUM: queue: make pendconn_free() work on the stream instead
Now pendconn_free() takes a stream, checks that pend_pos is set, clears
it, and uses pendconn_unlink() to complete the job. It's cleaner and
centralizes all the bookkeeping work in pendconn_unlink() only and
ensures that there's a single place where the stream's position in the
queue is manipulated.
2018-07-26 17:32:51 +02:00
Willy Tarreau
9624faec86 MINOR: queue: centralize dequeuing code a bit better
For now the pendconns may be dequeued at two places :
  - pendconn_unlink(), which operates on a locked queue
  - pendconn_free(), which operates on an unlocked queue and frees
    everything.

Some changes are coming to the queue and we'll need to be able to be a
bit stricter regarding the places where we dequeue to keep the accounting
accurate. This first step renames the locked function __pendconn_unlink()
as it's for use by those aware of it, and introduces a new general purpose
pendconn_unlink() function which automatically grabs the necessary locks
before calling the former, and pendconn_cond_unlink() which additionally
checks the pointer and the presence in the queue.
2018-07-26 17:32:48 +02:00
Olivier Houchard
77551ee8a7 BUG/MEDIUM: tasks: make __task_unlink_rq responsible for the rqueue size.
As __task_wakeup() is responsible for increasing
rqueue_local[tid]/global_rqueue_size, make __task_unlink_rq responsible for
decreasing it, as process_runnable_tasks() isn't the only one that removes
tasks from runqueues.
2018-07-26 16:33:29 +02:00
Olivier Houchard
76e45181b2 MINOR: tasks: Add a flag that tells if we're in the global runqueue.
How that we have bits available in task->state, add a flag that tells if we're
in the global runqueue or not.
2018-07-26 16:33:10 +02:00
Willy Tarreau
ad8bd2467c MINOR: signal: don't pass the signal number anymore as the wakeup reason
This is never used and would even be wrong since the reasons are ORed
so two signals would be turned into a third value, just like if any
other reason was used at the same time.
2018-07-26 16:12:48 +02:00
Olivier Houchard
c4aac9effe BUG/MEDIUM: tasks: Make sure there's no task left before considering inactive.
We may remove the thread's bit in active_tasks_mask despite tasks for that
thread still being present in the global runqueue. To fix that, introduce
global_tasks_mask, and set the correspnding bits when we add a task to the
runqueue.
2018-07-26 15:40:22 +02:00
Willy Tarreau
189ea856a7 BUG/MEDIUM: tasks: use atomic ops for active_tasks_mask
We don't have the lock anymore so we need to protect it.
2018-07-26 15:16:43 +02:00
Olivier Houchard
e85ee7b663 BUG/MEDIUM: tasks: Decrement rqueue_size at the right time.
We need to decrement requeue_size when we remove a task form rqueue_local,
not when we remove if from the task list, or we'd also decrement it for any
tasklet, that was never in the rqueue in the first place.
2018-07-26 15:00:58 +02:00
Willy Tarreau
9a77186cb0 BUG/MEDIUM: tasks: make sure we pick all tasks in the run queue
Commit 09eeb76 ("BUG/MEDIUM: tasks: Don't forget to increase/decrease
tasks_run_queue.") addressed a count issue in the run queue and uncovered
another issue with the way the tasks are dequeued from the global run
queue. The number of tasks to pick is computed using an integral divide,
which results in up to nbthread-1 tasks never being run. The fix simply
consists in getting rid of the divide and checking the task count in the
loop.

No backport is needed, this is 1.9-specific.
2018-07-26 14:24:46 +02:00
Olivier Houchard
306e653331 BUG/MINOR: servers: Don't make "server" in a frontend fatal.
When parsing the configuration, if "server", "default-server" or
"server-template" are found in a frontend, we first warn that it will be
ignored, only to be considered a fatal error later. Be true to our word, and
just ignore it.

This should be backported to 1.8 and 1.7.
2018-07-24 17:13:54 +02:00
Willy Tarreau
055ba4f505 BUG/MEDIUM: stats: don't ask for more data as long as we're responding
The stats applet is still a bit hackish. It uses the HTTP txn to parse
the POST contents. Due to this it pretends not having parsed the request
from the buffer so that the HTTP parser continues to work fine on these
data. This comes with a side effect : the request lies pending in the
channel's buffer, and because of this, stream_int_update_applet() always
wakes the applet up. It's very visible when retrieving a large stats page
over a slow link as haproxy eats 100% of the CPU waiting for the data to
leave.

While the proper long term solution definitely is to consume these data
and parse the body from the applet, changing this is not suitable for a
fix.

What this patch does instead is to disable request polling as long as there
are pending data in the response buffer. Given that for almost all cases,
the applet remains busy sending data, this is at least enough to ensure
that we don't wake up for the pending request data while we're waiting for
the client to receive these data. Now a 5k backend stats page is dumped at
1% CPU over a 10 Mbps link instead of 100%, using 1500 epoll_wait() calls
instead of 80000.

Note that the previous fix (BUG/MEDIUM: stream-int: don't immediately
enable reading when the buffer was reportedly full) is necessary for the
effects of the fix to be noticed since both bugs have the exact same
effect.

This fix must be backported at least as far as 1.5.
2018-07-24 17:13:32 +02:00
Willy Tarreau
171d5f203a BUG/MEDIUM: stream-int: don't immediately enable reading when the buffer was reportedly full
There is a long-time issue which affects some applets, at least the stats
applet. If a large stats page is read over a slow link, regularly the
channel's buffer contains too many response data to allow another round
of ci_putblk() to copy a new message. In this case the applet calls
si_applet_cant_put() to mention that it failed to emit data into the
channel's buffer, and wants to be called only once some room is made.

The problem is that stream_int_update(), which is called from
process_stream(), will clear this flag whenever it sees there's some
spare room in the channel's buffer. It causes the applet to be woken
again immediately. This is very visible when reading a large stats
page over a slow link, because in this case haproxy will run at 100%
CPU and strace shows mostly epoll_wait(0). It is very likely that some
other applets like CLI, Lua, peers or SPOE have also been affected but
that the effect were less noticeable because it was mixed with traffic.

Ideally stream_int_update() should not touch these flags, but changing
this would require a very careful auditing of all users. Instead here
what we do is that we respect the flag if the channel still has output
data. This way the flag will automatically disappear once the buffer is
empty, and the applet function will be called only when input data
remains, if at all.

This patch alone is not enough to observe the behaviour change on the
stats page because another bug takes over, addressed by next patch
(BUG/MEDIUM: stats: don't ask for more data as long as we're responding).
When both are applied, dumping stats for 5k backends over a 10 Mbps link
take 1% CPU instead of 100%, with 1.5k epoll_wait() calls instead of 80k.

This fix should be backported at least as far as 1.5.
2018-07-24 17:12:38 +02:00
Willy Tarreau
616ac81dec MINOR: h2: add the error code and the max/last stream IDs to "show fd"
This is intented to help debugging H2 in field.
2018-07-24 14:12:42 +02:00
Willy Tarreau
842ed9b1cb MEDIUM: h2: use the default conn_stream's receive function
This removes h2_rcv_buf() now that the generic code can handle it fine.
2018-07-20 19:37:12 +02:00
Willy Tarreau
39d68508c3 MINOR: h2: make use of CS_FL_REOS to indicate that end of stream was seen
This allows h2_rcv_buf() not to depend anymore on h2s at all and to become
generic.
2018-07-20 19:35:14 +02:00
Willy Tarreau
2df65e7194 MEDIUM: h2: don't call data_cb->recv() anymore
Now we simply call data_cb->wake() which will automatically perform the
recv() call if required.
2018-07-20 19:31:36 +02:00
Willy Tarreau
2a761dcf0d MEDIUM: h2: perform a single call to the data layer in demux()
Instead of calling the data layer from each individual frame processing
function, we now call it from demux. This requires to know the h2s that
was created inside h2c_frt_handle_headers(), which is why the pointer is
now returned. This results in a small performance boost from 58k to 60k
POST requests/s compared to -master, thanks to half the number of
si_cs_recv_cb() calls and 66% calls to si_cs_wake_cb().

It's interesting to note that all calls to data_cb->recv() are now always
immediately followed by a call to data_cb->wake(). The next step should
be to let the ->wake handler perform the recv() call itself. For this it
will be useful to have some info on the CS to indicate whether or not it
is ready to be read (ie: contains a non-empty input buffer).
2018-07-20 19:30:03 +02:00
Willy Tarreau
a56a6def91 MEDIUM: h2: move headers and data frame decoding to their respective parsers
Now we entirely process the input frame before transfering it above, so
that h2_rcv_buf() doesn't have to "speak" h2 anymore.
2018-07-20 19:21:43 +02:00
Willy Tarreau
454b57b347 MEDIUM: h2: centralize transfer of decoded frames in h2_rcv_buf()
We still call the parser but it should soon not be needed anymore. The
decode functions don't need the buffer nor the max size anymore. They
must also not touch the CS_FL_EOS or CS_FL_RCV_MORE flags either, so
this is done within h2_rcv_buf() after transmission.

The "flags" argument to h2_frt_decode_headers() and h2_frt_transfer_data()
has been removed since it's not used anymore.
2018-07-20 19:21:43 +02:00
Willy Tarreau
d755ea6c7d MEDIUM: h2: make h2_frt_transfer_data() copy via an intermediary buffer
The purpose here is also to ensure we can split the lower from the top
layers. The way the CS_FL_MSG_MORE flag is set was updated so that it's
set or cleared upon exit depending on the buffer's remaining contents.
2018-07-20 19:21:43 +02:00
Willy Tarreau
937f760e1e MEDIUM: h2: make h2_frt_decode_headers() use an intermediary buffer
The purpose is to decode to a temporary buffer and then to copy this buffer
to the caller. This double-copy definitely has an impact on performance, the
test code goes down from 220k to 140k req/s, but this memcpy() will disappear
soon.

The test on CO_RFL_BUF_WET has become irrelevant now since we only use
the cs' rxbuf, so we cannot be blocked by "output" data that has to be
forwarded first. Thus instead we don't start until the rxbuf is empty
(it will be drained from any input data once the stream processes it).
2018-07-20 19:21:43 +02:00
Willy Tarreau
0b559071dd MINOR: h2: make each H2 stream support an intermediary input buffer
The purpose is to decode to a temporary buffer and then to copy this buffer
to the caller upon request to avoid having to process frames on the fly
when called from the higher level. For now the buffer is only initialized
on stream creation via cs_new() and allocated if the buffer_wait's callback
is called.
2018-07-20 19:21:43 +02:00
Willy Tarreau
67b1e78f68 MEDIUM: stream-int: automatically call si_cs_recv_cb() if the cs has data on wake()
If the cs has data pending or shutdown and the input channel is still
waiting for reads, let's simply call the recv() function from the wake()
callback. This will allow the lower layers to simply wake the upper one
up without having to consider the recv() nor anything else.
2018-07-20 19:21:43 +02:00
Willy Tarreau
11c9aa424e MEDIUM: conn_stream: add cs_recv() as a default rcv_buf() function
This function is generic and is able to automatically transfer data
from a conn_stream's rx buffer to the destination buffer. It does this
automatically if the mux doesn't define another rcv_buf() function.
2018-07-20 19:21:43 +02:00
Olivier Houchard
f495fc460e BUG/MEDIUM: mux_h2: Call h2_send() before updating polling.
In h2_wake(), make sure we call h2_send() before we try to update the
polling flags, and detect connection errors, or errors will never be
detected.
2018-07-20 19:07:49 +02:00
Christopher Faulet
ddb6c16576 BUG/MEDIUM: threads: Fix the exit condition of the thread barrier
In thread_sync_barrier, we exit when all threads have set their own bit in the
barrier mask. It is done by comparing it to all_threads_mask. But we must not
use a simple equality to do so, becaue all_threads_mask may change. Since commit
ba86c6c25 ("MINOR: threads: Be sure to remove threads from all_threads_mask on
exit"), when a thread exit, its bit is removed from all_threads_mask. Instead,
we must use a bitwise AND to test is all bits of all_threads_mask are set.

This also requires that all_threads_mask is set to volatile if we want to
catch changes.

This patch must be backported in 1.8.
2018-07-20 14:24:41 +02:00
Christopher Faulet
4507351a2f BUG/MINOR: build: Fix compilation with debug mode enabled
It remained some fragments of the old buffers API in debug messages, here and
there.

This was caused by the recent buffer API changes, no backport is needed.
2018-07-20 10:45:20 +02:00
Christopher Faulet
005e79e5dd BUG/MINOR: http: Set brackets for the unlikely macro at the right place
When test on the header "Early-Data" is made, the unlikely macro must encompass
the condition.

This patch must be backported in 1.8.
2018-07-20 10:42:53 +02:00
Willy Tarreau
8318885487 MINOR: connection: simplify subscription by adding a registration function
This new function wl_set_waitcb() prepopulates a wait_list with a tasklet
and a context and returns it so that it can be passed to ->subscribe() to
be added to a connection or conn_stream's wait_list. The caller doesn't
need to know all the insiders details anymore this way.
2018-07-19 18:31:07 +02:00
Olivier Houchard
910b2bc829 MEDIUM: connections/mux: Revamp the send direction.
Totally nuke the "send" method, instead, the upper layer decides when it's
time to send data, and if it's not possible, uses the new subscribe() method
to be called when it can send data again.
2018-07-19 18:31:07 +02:00
Olivier Houchard
6ff2039d13 MINOR: connections/mux: Add a new "subscribe" method.
Add a new "subscribe" method for connection, conn_stream and mux, so that
upper layer can subscribe to them, to be called when the event happens.
Right now, the only event implemented is "SUB_CAN_SEND", where the upper
layer can register to be called back when it is possible to send data.

The connection and conn_stream got a new "send_wait_list" entry, which
required to move a few struct members around to maintain an efficient
cache alignment (and actually this slightly improved performance).
2018-07-19 16:23:43 +02:00
Willy Tarreau
83061a820e MAJOR: chunks: replace struct chunk with struct buffer
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
2018-07-19 16:23:43 +02:00
Willy Tarreau
843b7cbe9d MEDIUM: chunks: make the chunk struct's fields match the buffer struct
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.

Most of the changes were made with spatch using this coccinelle script :

  @rule_d1@
  typedef chunk;
  struct chunk chunk;
  @@
  - chunk.str
  + chunk.area

  @rule_d2@
  typedef chunk;
  struct chunk chunk;
  @@
  - chunk.len
  + chunk.data

  @rule_i1@
  typedef chunk;
  struct chunk *chunk;
  @@
  - chunk->str
  + chunk->area

  @rule_i2@
  typedef chunk;
  struct chunk *chunk;
  @@
  - chunk->len
  + chunk->data

Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
2018-07-19 16:23:43 +02:00
Willy Tarreau
c9fa0480af MAJOR: buffer: finalize buffer detachment
Now the buffers only contain the header and a pointer to the storage
area which can be anywhere. This will significantly simplify buffer
swapping and will make it possible to map chunks on buffers as well.

The buf_empty variable was removed, as now it's enough to have size==0
and area==NULL to designate the empty buffer (thus a non-allocated head
is the empty buffer by default). buf_wanted for now is indicated by
size==0 and area==(void *)1.

The channels and the checks now embed the buffer's head, and the only
pointer is to the storage area. This slightly increases the unallocated
buffer size (3 extra ints for the empty buffer) but considerably
simplifies dynamic buffer management. It will also later permit to
detach unused checks.

The way the struct buffer is arranged has proven quite efficient on a
number of tests, which makes sense given that size is always accessed
and often first, followed by the othe ones.
2018-07-19 16:23:43 +02:00
Willy Tarreau
e3128024bf MINOR: buffer: replace buffer_replace2() with b_rep_blk()
This one is more generic and designed to work on a random block. It
may later get a b_rep_ist() variant since many strings are already
available as (ptr,len).
2018-07-19 16:23:43 +02:00
Willy Tarreau
4d893d440c MINOR: buffers/channel: replace buffer_insert_line2() with ci_insert_line2()
There was no point keeping that function in the buffer part since it's
exclusively used by HTTP at the channel level, since it also automatically
appends the CRLF. This further cleans up the buffer code.
2018-07-19 16:23:43 +02:00
Willy Tarreau
a094fde2b6 MINOR: checks: use b_putist() instead of b_putstr()
This slightly simplifies the code.
2018-07-19 16:23:43 +02:00
Willy Tarreau
ea1b06d5bb MINOR: buffer: add a new file for ist + buffer manipulation functions
The new file istbuf.h links the indirect strings (ist) with the buffers.
The purpose is to encourage addition of more standard buffer manipulation
functions that rely on this in order to improve the overall ease of use
along all the code. Just like ist.h and buf.h, this new file is not
expected to depend on anything beyond these two files.

A few functions were added and/or converted from buffer.h :
  - b_isteq()  : indicates if a buffer and a string match
  - b_isteat() : consumes a string from the buffer if it matches
  - b_istput() : appends a small string to a buffer (all or none)
  - b_putist() : appends part of a large string to a buffer

The equivalent functions were removed from buffer.h and changed at the
various call places.
2018-07-19 16:23:43 +02:00
Willy Tarreau
55372f646f MINOR: buffer: replace b{i,o}_put* with b_put*
The two variants now do exactly the same (appending at the tail of the
buffer) so let's not keep the distinction between these classes of
functions and have generic ones for this. It's also worth noting that
b{i,o}_putchk() wasn't used at all and was removed.
2018-07-19 16:23:43 +02:00
Willy Tarreau
72a100b386 MINOR: buffer: replace bi_fast_delete() with b_del()
There's no distinction between in and out data now. The latter covers
the needs of the former and supports wrapping. The extra cost is
negligible given the locations where it's used.
2018-07-19 16:23:43 +02:00
Olivier Houchard
08afac0fd7 MEDIUM: buffers: move "output" from struct buffer to struct channel
Since we never access this field directly anymore, but only through the
channel's wrappers, it can now move to the channel. The buffers are now
completely free from the distinction between input and output data.
2018-07-19 16:23:43 +02:00
Willy Tarreau
892f1dbe4f MINOR: buffer: rename the "data" field to "area"
Since we use "_data" for the amount of data at many places, as opposed to
"_space" for the amount of space, let's rename the "data" field to "area"
so that we can reuse "data" later for the amount of data in the buffer
(currently called "len" despite not being contigous).
2018-07-19 16:23:43 +02:00
Willy Tarreau
d54a8ceb97 MAJOR: start to change buffer API
This is intentionally the minimal and safest set of changes, some cleanups
area still required. These changes are quite tricky and cannot be
independantly tested, so it's important to keep this patch as bisectable
as possible.

buf_empty and buf_wanted were changed and are now exactly similar since
there's no <p> member in the structure anymore. Given that no test is
ever made in the code to check that buf == &buf_wanted, it may be possible
that we don't need to have two anymore, unless some buf_empty tests have
precedence. This will have to be investigated.

A significant part of this commit affects the HTTP compression code,
which used to deeply manipulate the input and output buffers without
any reasonable solution for a better abstraction. For this reason, if
any regression is met and designates this patch as the culprit, it is
important to run tests which specifically involve compression or which
definitely don't use it in order to spot the issue.

Cc: Olivier Houchard <ohouchard@haproxy.com>
2018-07-19 16:23:42 +02:00
Willy Tarreau
81521ed850 MINOR: buffer: adapt buffer_slow_realign() and buffer_dump() to the new API
These are the last ones which need to be adapted.
2018-07-19 16:23:42 +02:00
Willy Tarreau
a79021af6f MINOR: lua: use the wrappers instead of directly manipulating buffer states
This replaces chn->buf->p with ci_head(chn), chn->buf->o with co_data(chn)
and chn->buf->i with ci_data(chn). This is in order to help porting to the
new buffer API.
2018-07-19 16:23:42 +02:00
Olivier Houchard
0b662843c8 MEDIUM: compression: start to move to the new buffer API
This part is tricky, it passes a channel where we used to have a buffer,
in order to reduce the API changes during the big switch. This way all
the channel's wrappers to distinguish between input and output are
available. It also makes sense given that the compression applies on
a channel since it's in the forwarding path.
2018-07-19 16:23:42 +02:00
Willy Tarreau
f158937620 MINOR: flt_trace: adapt to the new buffer API
The trace_hexdump() function now takes a count in argument to know where
to start dumping from.
2018-07-19 16:23:42 +02:00
Willy Tarreau
5e74b0ba3b MEDIUM: h1: port to new buffer API.
The parser now uses the channel exclusively to access the data. In order
to avoid the cost of indirection, a local variable "input" was added to
the function that replaces buf->p. Given that this part is on the critical
path, it will have to be tested again for any visible performance loss.
2018-07-19 16:23:42 +02:00
Willy Tarreau
fc0785d26c MINOR: payload: convert to the new buffer API
Mostly mechanical changes. It seems that some of them could be further
factored out by adding a few more wrappers at the channel level.
2018-07-19 16:23:42 +02:00
Willy Tarreau
44a41a83fb MINOR: filters: convert to the new buffer API
Use b_set_data() to modify the buffer size, and use the usual wrappers.
2018-07-19 16:23:42 +02:00
Willy Tarreau
f37954d4da MEDIUM: http: use wrappers instead of directly manipulating buffers states
This is aimed at easing the transition to the new API. There are a few places
which deserve some simplifications afterwards because ci_head() is called
often and may be placed into a local pointer.
2018-07-19 16:23:42 +02:00
Willy Tarreau
6a445ebc8a MINOR: backend: use new buffer API
The few locations dealing with the buffer rewind were updated not to
touch ->o nor ->p anymore and to use the channel's functions instead.
2018-07-19 16:23:42 +02:00
Willy Tarreau
7e9c30a7e0 MINOR: stream: use wrappers instead of directly manipulating buffers
This will help transitioning to the new API. These changes are very
scarce limited.
2018-07-19 16:23:42 +02:00
Willy Tarreau
77e478c56e MINOR: stream-int: use the new buffer API
A few locations still accessing ->i and ->o directly were changed to
use ci_data() and co_data() respectively. A call to b_del() was replaced
with co_set_data() in si_cs_send() so that ->o will is automatically be
decremented after the migration.
2018-07-19 16:23:42 +02:00
Willy Tarreau
178b987025 MINOR: cache: use the new buffer API
A few direct accesses to buf->p now use ci_head() instead.
2018-07-19 16:23:42 +02:00
Willy Tarreau
851d12c3d4 MINOR: cli: use the new buffer API
Almost nothing required to be touched.
2018-07-19 16:23:42 +02:00
Willy Tarreau
97f538b895 MINOR: stats: adapt to the new buffers API
The changes are fairly straightforward. Some places require to trim
the length. Maybe we'd need a b_extend() or b_adjust() for this.
2018-07-19 16:23:42 +02:00
Willy Tarreau
4fca5a9940 MEDIUM: spoe: use the new buffer API for the SPOE buffer
The buffer is not used as a forwarding buffer so we can simply map ->i
to ->len and ->p to b_head(). It *seems* that p is never modified, so
that we could even always use b_orig(). This needs to be rechecked.
2018-07-19 16:23:42 +02:00
Willy Tarreau
b7b5fe1a14 MEDIUM: h2: update to the new buffer API
There is no more distinction between ->i and ->o for the mux's buffers,
we always use b_data() to know the buffer's length since only one side
is used for each direction.
2018-07-19 16:23:42 +02:00
Willy Tarreau
876171e636 MINOR: checks: adapt to the new buffer API
The code exclusively used ->i for data received and ->o for data sent. Now
it always uses b_data(), b_head() and b_tail() so that there is no more
distinction between ->i and ->o.
2018-07-19 16:23:42 +02:00
Willy Tarreau
cd9e60db00 MEDIUM: channel: adapt to the new buffer API
Also, ci_swpbuf() was removed (unused).
2018-07-19 16:23:42 +02:00
Olivier Houchard
acd1403794 MINOR: buffer: Use b_add()/bo_add() instead of accessing b->i/b->o.
Use the newly available functions instead of using the buffer fields directly.
2018-07-19 16:23:42 +02:00
Willy Tarreau
591d445049 MINOR: buffer: use b_orig() to replace most references to b->data
This patch updates most users of b->data to use b_orig().
2018-07-19 16:23:42 +02:00
Willy Tarreau
144c5c4d21 MINOR: buffer: replace buffer_flush() with c_adv(chn, ci_data(chn))
It used to forward some input into output.
2018-07-19 16:23:41 +02:00
Willy Tarreau
5ba65521a3 MINOR: buffer: replace buffer_pending() with ci_data()
It used to return b->i for channels, which is what ci_data() does.
2018-07-19 16:23:41 +02:00
Willy Tarreau
3f6799975f MINOR: buffer: replace bi_space_for_replace() with ci_space_for_replace()
This one computes the size that can be overwritten over the input part
of the buffer, so it's channel-specific.
2018-07-19 16:23:41 +02:00
Willy Tarreau
2375233ef0 MINOR: buffer: replace buffer_full() with channel_full()
It's only used by channels since we need to know the amount of output
data.
2018-07-19 16:23:41 +02:00
Willy Tarreau
0c7ed5d264 MINOR: buffer: replace buffer_empty() with b_empty() or c_empty()
For the same consistency reasons, let's use b_empty() at the few places
where an empty buffer is expected, or c_empty() if it's done on a channel.
Some of these places were there to realign the buffer so
{b,c}_realign_if_empty() was used instead.
2018-07-19 16:23:41 +02:00
Willy Tarreau
d760eecf61 MINOR: buffer: replace buffer_not_empty() with b_data() or c_data()
It's mostly for consistency as many places already use one of these instead.
2018-07-19 16:23:41 +02:00
Willy Tarreau
eac5259888 MINOR: buffer: use b_room() to determine available space in a buffer
We used to have variations around buffer_total_space() and
size-buffer_len() or size-b_data(). Let's simplify all this. buffer_len()
was also removed as not used anymore.
2018-07-19 16:23:41 +02:00
Willy Tarreau
337ea57cfc MINOR: connection: add a new receive flag : CO_RFL_BUF_WET
With this flag we introduce the notion of "dry" vs "wet" buffers : some
demultiplexers like the H2 mux require as much room as possible for some
operations that are not retryable like decoding a headers frame. For this
they need to know if the buffer is congested with data scheduled for
leaving soon or not. Since the new API will not provide this information
in the buffer itself, the caller must indicate it. We never need to know
the amount of such data, just the fact that the buffer is not in its
optimal condition to be used for receipt. This "CO_RFL_BUF_WET" flag is
used to mention that such outgoing data are still pending in the buffer
and that a sensitive receiver should better let it "dry" before using it.
2018-07-19 16:23:41 +02:00
Willy Tarreau
7f3225f251 MINOR: connection: add a flags argument to rcv_buf()
The mux and transport rcv_buf() now takes a "flags" argument, just like
the snd_buf() one or like the equivalent syscall lower part. The upper
layers will use this to pass some information such as indicating whether
the buffer is free from outgoing data or if the lower layer may allocate
the buffer itself.
2018-07-19 16:23:41 +02:00
Willy Tarreau
d9cf540457 MEDIUM: mux: make mux->rcv_buf() take a size_t for the count
It also returns a size_t. This is in order to clean the API. Note
that the H2 mux still uses some ints in the functions called from
h2_rcv_buf(), though it's not really a problem given that H2 frames
are smaller. It may deserve a general cleanup later though.
2018-07-19 16:23:41 +02:00
Willy Tarreau
bfc4d77ad3 MEDIUM: connection: make xprt->rcv_buf() use size_t for the count
Just like we have a size_t for xprt->snd_buf(), we adjust to use size_t
for rcv_buf()'s count argument and return value. It also removes the
ambiguity related to the possibility to see a negative value there.
2018-07-19 16:23:41 +02:00
Willy Tarreau
deccd1116d MEDIUM: mux: make mux->snd_buf() take the byte count in argument
This way the mux doesn't need to modify the buffer's metadata anymore
nor to know the output's size. The mux->snd_buf() function now takes a
const buffer and it's up to the caller to update the buffer's state.

The return type was updated to return a size_t to comply with the count
argument.
2018-07-19 16:23:41 +02:00
Willy Tarreau
787db9a6a4 MEDIUM: connection: make xprt->snd_buf() take the byte count in argument
This way the senders don't need to modify the buffer's metadata anymore
nor to know about the output's split point. This way the functions can
take a const buffer and it's clearer who's in charge of updating the
buffer after a send. That's why the buffer realignment is now performed
by the caller of the transport's snd_buf() functions.

The return type was updated to return a size_t to comply with the count
argument.
2018-07-19 16:23:41 +02:00
Willy Tarreau
55f3ce1c91 MINOR: buffer: make b_getblk_nc() take size_t for the block sizes
Till now we used to reimplement it using ints to limit external changes
but we must adjust it and the various users to switch to size_t.
2018-07-19 16:23:41 +02:00
Willy Tarreau
206ba834ef MINOR: buffer: make b_getblk_nc() take const pointers
Now that there are no more users requiring to modify the buffer anymore,
switch these ones to const char and const buffer. This will make it more
obvious next time send functions are tempted to modify the buffer's output
count. Minor adaptations were necessary at a few call places which were
using char due to the function's previous prototype.
2018-07-19 16:23:41 +02:00
Willy Tarreau
9c7f2d19bf MEDIUM: h2: don't use b_ptr() nor b_end() anymore
The few places where they were still used were replaced with b_peek() and
b_wrap() respectively. The parts making use of ->i and ->o should now be
convertible to the new API.
2018-07-19 16:23:41 +02:00
Willy Tarreau
0bad0439f4 MEDIUM: h2: do not use buf->o anymore inside h2_snd_buf's loop
buf->o is only retrieved at the loop entry and modified using b_del()
on exit. We're close to being able to change the API to take a count
argument.
2018-07-19 16:23:41 +02:00
Willy Tarreau
f40e68227b MINOR: h1: make h1_measure_trailers() use an offset and a count
This will be needed by the H2 encoder to restart after wrapping.
2018-07-19 16:23:41 +02:00
Willy Tarreau
84d6b7af87 MINOR: h1: make h1_parse_chunk_size() not depend on b_ptr() anymore
It's similar to the previous commit so that the function doesn't rely
on buf->p anymore.
2018-07-19 16:23:41 +02:00
Willy Tarreau
c0973c6742 MINOR: h1: make h1_skip_chunk_crlf() not depend on b_ptr() anymore
It now takes offsets relative to the buffer's head. It's up to the
callers to add this offset which corresponds to the buffer's output
size.
2018-07-19 16:23:41 +02:00
Willy Tarreau
5dd17353d5 MEDIUM: h2: prevent the various mux encoders from modifying the buffer
Functions h2s_frt_make_resp_headers() and h2s_frt_make_resp_data() used
to modify the buffer's output data count. This is problematic for the
buffer's rework as we don't want to rely on this anymore. This commit
modifies these functions to take an offset (relative to the buffer's
head) and a maximum byte count. Thus h2_snd_buf() now calls them with
buf->o and takes care of removing deleted data itself. The send functions
now almost support being passed const buffers (except for the data part
which is still embedded).
2018-07-19 16:23:41 +02:00
Willy Tarreau
1dc41e75d8 MINOR: h2: clarify the fact that the send functions are unsigned
There's no more error return combined with the send output, though
the comments were misleading. Let's fix this as well as the functions'
prototypes. h2_snd_buf()'s return value wasn't changed yet since it
has to match the ->snd_buf prototype.
2018-07-19 16:23:40 +02:00
Willy Tarreau
7314be8e2c MINOR: h1: make h1_measure_trailers() take the byte count in argument
The principle is that it should not have to take this value from the
buffer itself anymore.
2018-07-19 16:23:40 +02:00
Willy Tarreau
188e230704 MINOR: buffer: convert most b_ptr() calls to c_ptr()
The latter uses the channel wherever a channel is known.
2018-07-19 16:23:40 +02:00
Willy Tarreau
e5f12ce7f2 MINOR: buffer: replace bi_del() and bo_del() with b_del()
Till now the callers had to know which one to call for specific use cases.
Let's fuse them now since a single one will remain after the API migration.
Given that bi_del() may only be used where o==0, just combine the two tests
by first removing output data then only input.
2018-07-19 16:23:40 +02:00
Willy Tarreau
a1f78fb652 MINOR: buffer: replace bo_getblk_nc() with b_getblk_nc() which takes an offset
This will be important so that we can parse a buffer without touching it.
Now we indicate where from the buffer's head we plan to start to copy, and
for how many bytes. This will be used by send functions to loop at the end
of the buffer without having to update the buffer's output byte count.
2018-07-19 16:23:40 +02:00
Willy Tarreau
90ed3836db MINOR: buffer: replace bo_getblk() with direction agnostic b_getblk()
This new functoin limits itself to the amount of data available in the
buffer and doesn't care about the direction anymore. It's only called
from co_getblk() which already checks that no more than the available
output bytes is requested.
2018-07-19 16:23:40 +02:00
Willy Tarreau
e4d5a036ed MINOR: buffer: merge b{i,o}_contig_space()
These ones were merged into a single b_contig_space() that covers both
(the bo_ case was a simplified version of the other one). The function
doesn't use ->i nor ->o anymore.
2018-07-19 16:23:40 +02:00
Willy Tarreau
0e11d59af6 MINOR: buffer: remove bo_contig_data()
The two call places now make use of b_contig_data(0) and check by
themselves that the returned size is no larger than the scheduled
output data.
2018-07-19 16:23:40 +02:00
Willy Tarreau
8f9c72d301 MINOR: buffer: remove bi_end()
It was replaced by ci_tail() when the channel is known, or b_tail() in
other cases.
2018-07-19 16:23:40 +02:00
Willy Tarreau
41e38ac0ee MINOR: buffer: remove bo_end()
It was replaced by either b_tail() when the buffer has no input data, or
b_peek(b, b->o).
2018-07-19 16:23:40 +02:00
Willy Tarreau
89faf5d7c3 MINOR: buffer: remove bo_ptr()
It was replaced by co_head() when a channel was known, otherwise b_head().
2018-07-19 16:23:40 +02:00
Willy Tarreau
dda2e41881 MINOR: buffer: remove bi_ptr()
It's now been replaced by b_head() when b->o is null, ci_head() when
the channel is known, or b_peek(b, b->o) in other situations.
2018-07-19 16:23:40 +02:00
Willy Tarreau
7194d3cc3b MINOR: buffer: split bi_contig_data() into ci_contig_data and b_config_data()
This function was sometimes used from a channel and sometimes from a buffer.
In both cases it requires knowledge of the size of the output data (to skip
them). Here the split ensures the channel can deal with this point, and that
other places not having output data can continue to work.
2018-07-19 16:23:40 +02:00
Willy Tarreau
aa7af7213d MINOR: buffer: replace calls to buffer_space_wraps() with b_space_wraps()
And remove the unused function.
2018-07-19 16:23:40 +02:00
Willy Tarreau
bcbd39370f MINOR: channel/buffer: replace b_{adv,rew} with c_{adv,rew}
These ones manipulate the output data count which will be specific to
the channel soon, so prepare the call points to use the channel only.
The b_* functions are now unused and were removed.
2018-07-19 16:23:40 +02:00
Willy Tarreau
c0a51c51b1 MINOR: buffer: remove buffer_slow_realign() and the swap_buffer allocation code
Since all call places can use the trash now, this is not needed anymore.
2018-07-19 16:23:40 +02:00
Willy Tarreau
0db4d10efc MINOR: h2: use b_slow_realign() with the trash as a swap buffer
H2 doesn't use the trash so it can make use of it as a swap area when
calling b_slow_realign(). This way we don't need buffer_slow_realign()
anymore.
2018-07-19 16:23:40 +02:00
Willy Tarreau
fd8d42f496 MEDIUM: channel: make channel_slow_realign() take a swap buffer
The few call places where it's used can use the trash as a swap buffer,
which is made for this exact purpose. This way we can rely on the
generic b_slow_realign() call.
2018-07-19 16:23:40 +02:00
Willy Tarreau
4cf1300e6a MINOR: channel/buffer: replace buffer_slow_realign() with channel_slow_realign() and b_slow_realign()
Where relevant, the channel version is used instead. The buffer version
was ported to be more generic and now takes a swap buffer and the output
byte count to know where to set the alignment point. The H2 mux still
uses buffer_slow_realign() with buf->o but it will change later.
2018-07-19 16:23:40 +02:00
Willy Tarreau
d5b343bf9e MINOR: channel/buffer: use c_realign_if_empty() instead of buffer_realign()
This patch removes buffer_realign() and replaces it with c_realign_if_empty()
instead.
2018-07-19 16:23:40 +02:00
Willy Tarreau
4d452384a3 MINOR: compression: pass the channel to http_compression_buffer_end()
This will be needed to access the output data count from the channel
after the buffer/channel changes.
2018-07-19 16:23:39 +02:00
Willy Tarreau
506a29ac6e MINOR: buffer: switch buffer sizes and offsets to size_t
Passing unsigned ints everywhere is painful, and will cause some headache
later when we'll want to integrate better with struct ist which already
uses size_t. Let's switch buffers to use size_t instead.
2018-07-19 16:23:39 +02:00
Willy Tarreau
42d55b9b6a BUG/MEDIUM: h2: make sure the last stream closes the connection after a timeout
If a timeout strikes on the connection side with some active streams,
there is a corner case which can sometimes cause the following sequence
to happen :

  - There are active streams but there are data in the mux buffer
    (eg: a client suddenly disconnected during a download with pending
    requests). The timeout is active.

  - The timeout strikes, h2_timeout_task() is called, kills the task and
    doesn't close the connection since there are streams left ; The
    connection is marked in H2_CS_ERROR ;

  - the streams are woken up and closed ;

  - when the last stream closes, calling h2_detach(), it sees the
    tree list is empty, but there is no condition allowing the
    connection to be closed (mbuf->o > 0), thus it does nothing ;

  - since the task is dead, there's no more hope to clear this
    situation later

For now we can take care of this by adding a test for the presence of
H2_CS_ERROR and !task, implying the timeout task triggered already
and will not be able to handle this again.

Over the long term it seems like a more reliable test on should be
made, so that it is possible to know whether or not someone is still
able to close this connection.

A big thanks to Janusz Dziemidowicz and Milan Petruzelka for providing
many details helping in figuring this bug.
2018-07-19 14:31:47 +02:00
Willy Tarreau
00610960a1 BUG/MEDIUM: h2: never leave pending data in the output buffer on close
We currently don't process trailers on H2, but this has an impact : on
chunked HTTP/1 responses, we decide to emit the ES bit once we see the
0CRLF. From this point the stream switches to the CLOSED state, which
aborts processing of the remaining bytes. Thus the extra CRLF which ends
trailers is not processed and remains in the buffer. This prevents the
stream from being notified about end of transmission, which in turn keeps
the mux busy and prevents the connection from quitting.

The case of the trailers is not the root cause of this issue, though it
is what triggers it. The root cause is that upon error and/or close, once
we know we're not going to process any more data, we must absolutely flush
any remaining bytes from the output buffer, otherwise there is no way the
stream can quit. This is what this patch does.

It looks very likely related to the issues reported and debugged by
Janusz Dziemidowicz and Milan Petruzelka.

One way to reproduce it is to chain two proxies with the last one emitting
chunked data (typically using the stats page) :

    global
        stats socket /tmp/sock1 mode 666 level admin
        stats timeout 1h
        tune.ssl.default-dh-param 1024
        tune.bufsize 16384

    defaults
        mode http
        timeout connect 4s
        timeout client 10s
        timeout server 20s

    listen px1
        bind :4443 ssl crt rsa+dh2048.pem npn h2 alpn h2
        server s1 127.0.0.1:4445

    listen px2
        bind :4444 ssl crt rsa+dh2048.pem npn h2 alpn h2
        bind :4445
        stats uri /

Then use curl to fetch the stats through px1 :

    curl --http2 -k "https://127.0.0.1:4443/"

When curl is sent to the first one, "show sess" issued to the CLI will
show a remaining session during the client timeout. When curl is aimed at
port 4444 (px2), there is no such remaining session.

This fix needs to be backported to 1.8.
2018-07-19 11:09:12 +02:00
Willy Tarreau
c65edac804 MINOR: h2: add the mux and demux buffer lengths on "show fd"
It is convenient during debugging sessions to know if the mux and demux
buffers are empty/full/other. Let's report this on "show fd" output.
2018-07-19 10:54:43 +02:00
Willy Tarreau
f210191dcd BUG/MEDIUM: h2: don't accept new streams if conn_streams are still in excess
The streams bookkeeping made in H2 is used for protocol compliance only
but it doesn't consider the number of conn_streams still attached to the
mux. It causes an issue when http-request set-nice rules are applied on
H2 requests processed on a saturated machine. Indeed, in this case, the
requests are accepted and assigned a default nice value of zero. When
they are processed, their nice value changes to a higher one (say 1024).
The response is sent through the H2 mux, which detects the end of stream
and decrements the protocol-level stream count (h2c->nb_streams). The
client may then send a new request. But the conn_stream is still attached
and will require a new call to process_stream() to finish, which is made
through the scheduler. Given that the machine is saturated, it is assumed
that many tasks are present in the scheduler. Thus the closing tasks holding
a higher nice value will pass after the new stream creations. If the client
is fast enough with a low latency link, it may add a lot of new stream
creations before the stream terminations have a chance to disappear due
to their high nice value, resulting in a huge amount of memory being used.

The solution consists in letting a mux always monitor its conn_streams and
refrain from creating new ones when it is full. Here the H2 mux checks the
nb_cs counter and sets a new blocked flag (H2_CF_DEM_TOOMANY) if the limit
was reached, so that the frame parser requests a pause in the new stream
creation, leaving some time for the pending conn_streams to vanish.

Several experiments were made using varying thresholds to see if
overbooking would provide any benefit here but it turned out not to be
the case, so the conn_stream limit remains set to the exact streams
limit. Interestingly various performance measurements showed that the
code tends to be slightly faster now than without the limit, probably
due to the smoother memory usage.

This commit requires previous patch ("MINOR: h2: keep a count of the number
of conn_streams attached to the mux"). It needs to be backported to 1.8.
2018-07-19 10:23:15 +02:00
Willy Tarreau
7ac60e836a MINOR: h2: keep a count of the number of conn_streams attached to the mux
The h2 mux only knows about the number of H2 streams which are not in a
CLOSED state. This is used for protocol compliance. But it doesn't hold
the number of really attached streams. It is a problem because depending
on scheduling, it is possible that more streams are attached to the mux
than the ones seen at the protocol level, due to some streams taking some
time to be detached. Let's add this count based on the conn_streams.

Note: this patch is part of a series of fixes which will have to be
backported to 1.8.
2018-07-19 09:06:37 +02:00
Willy Tarreau
17b4aa1adc BUG/MINOR: ssl: properly ref-count the tls_keys entries
Commit 200b0fa ("MEDIUM: Add support for updating TLS ticket keys via
socket") introduced support for updating TLS ticket keys from the CLI,
but missed a small corner case : if multiple bind lines reference the
same tls_keys file, the same reference is used (as expected), but during
the clean shutdown, it will lead to a double free when destroying the
bind_conf contexts since none of the lines knows if others still use
it. The impact is very low however, mostly a core and/or a message in
the system's log upon old process termination.

Let's introduce some basic refcounting to prevent this from happening,
so that only the last bind_conf frees it.

Thanks to Janusz Dziemidowicz and Thierry Fournier for both reporting
the same issue with an easy reproducer.

This fix needs to be backported from 1.6 to 1.8.
2018-07-18 08:59:50 +02:00
Baptiste Assmann
8e2d9430c0 MINOR: dns: new DNS options to allow/prevent IP address duplication
By default, HAProxy's DNS resolution at runtime ensure that there is no
IP address duplication in a backend (for servers being resolved by the
same hostname).
There are a few cases where people want, on purpose, to disable this
feature.

This patch introduces a couple of new server side options for this purpose:
"resolve-opts allow-dup-ip" or "resolve-opts prevent-dup-ip".
2018-07-12 17:56:44 +02:00
Baptiste Assmann
84221b4e90 MINOR: dns: fix wrong score computation in dns_get_ip_from_response
dns_get_ip_from_response() is used to compare the caller current IP to
the IP available in the records returned by the DNS server.
A scoring system is in place to get the best IP address available.
That said, in the current implementation, there are a couple of issues:
1. a comment does not match what the code does
2. the code does not match what the commet says (score value is not
   incremented with '2')

This patch fixes both issues.

Backport status: 1.8
2018-07-12 17:56:34 +02:00
Baptiste Assmann
741e00a820 CLEANUP: dns: inacurate comment about prefered IP score
The comment was about "prefered network ip version" while it's actually
"prefered ip version" in the code.
Fixed

Backport status: 1.7 and 1.8
  Be careful, this patch may not apply on 1.7, since the score was '4'
  for this item at that time.
2018-07-12 17:55:16 +02:00
Baptiste Assmann
e56fffd896 CLEANUP: dns: remove obsolete macro DNS_MAX_IP_REC
Since a8c6db8d2d, this macro is not used
anymore and can be safely removed.

Backport status: 1.8
2018-07-12 17:55:05 +02:00
William Lallemand
bfd8eb5909 MINOR: startup: change session/process group settings
Change the way the process groups are set. Indeed setsid() was called
for every processes which caused the worker to have a different process
group than the master.

This patch behave in a better way:

- In daemon mode only, each child do a setsid()
- In master worker + daemon mode, the setsid() is done in the master before
forking the children
- In any foreground mode, we don't do a setsid()

Could be backported in 1.8 but the master-worker mode is mostly used
with systemd which rely on cgroups so that won't affect much people.
2018-07-04 19:29:56 +02:00
Thierry FOURNIER
70d318ccb7 BUG/MEDIUM: lua: possible CLOSE-WAIT state with '\n' headers
The Lua parser doesn't takes in account end-of-headers containing
only '\n'. It expects always '\r\n'. If a '\n' is processes the Lua
parser considers it miss 1 byte, and wait indefinitely for new data.

When the client reaches their timeout, it closes the connection.
This close is not detected and the connection keep in CLOSE-WAIT
state.

I guess that this patch fix only a visible part of the problem.
If the Lua HTTP parser wait for data, the timeout server or the
connectio closed by the client may stop the applet.

How reproduce the problem:

HAProxy conf:

   global
      lua-load bug38.lua
   frontend frt
      timeout client 2s
      timeout server 2s
      mode http
      bind *:8080
      http-request use-service lua.donothing

Lua conf

   core.register_service("donothing", "http", function(applet) end)

Client request:

   echo -ne 'GET / HTTP/1.1\n\n' | nc 127.0.0.1 8080

Look for CLOSE-WAIT in the connection with "netstat" or "ss". I
use this script:

   while sleep 1; do ss | grep CLOSE-WAIT; done

This patch must be backported in 1.6, 1.7 and 1.8

Workaround: enable the "hard-stop-after" directive, and perform
periodic reload.
2018-07-01 06:08:43 +02:00
Willy Tarreau
43e903553e MINOR: stick-tables: make stktable_release() do nothing on NULL
stktable_release() has been involved in two recent crashes by being
used without enough care. Just like any free() function this one is
often called on an exit path with a possibly unsafe argument. Given
that there is another case (smp_fetch_sc_trackers()) which theorically
could call it with an unchecked NULL, though it cannot happen since
the function doesn't support being called with src_* hence cannot make
use of tmpstkctr, let's rather move the check into the function itself
to make it safer for the long term.

This patch could be backported to 1.8 as a strengthening measure.
2018-06-27 06:33:20 +02:00
Tim Duesterhus
65189c17c6 BUG/MAJOR: stick_table: Complete incomplete SEGV fix
This commit completes the incomplete segmentation fault fix
in commit ac1f3ed64b.

Likewise it must be backported to haproxy 1.8.
2018-06-26 20:29:36 +02:00
William Lallemand
091d827e09 BUG/BUILD: threads: unbreak build without threads
The build without threads was once again broken.

This issue was introduced in commit ba86c6c ("MINOR: threads: Be sure to
remove threads from all_threads_mask on exit").

This is exactly the same problem as last time it happened, because of
all_threads_mask not being defined with USE_THREAD=

This must be backported in 1.8
2018-06-26 14:15:12 +02:00
Thierry FOURNIER
ac1f3ed64b BUG/MAJOR: Stick-tables crash with segfault when the key is not in the stick-table
When a lookup is done on a key not present in the stick-table the "st"
pointer is NULL and it is used to return the converter result, but it
is used untested with stktable_release().

This regression was introduced in 1.8.10 here:

   BUG/MEDIUM: stick-tables: Decrement ref_cnt in table_* converters
   commit d7bd88009d88dd413e01bc0baa90d6662a3d7718
   Author: Daniel Corbett <dcorbett@haproxy.com>
   Date:   Sun May 27 09:47:12 2018 -0400

Minimal conf for reproducong the problem:

   frontend test
      mode http
      stick-table type ip size 1m expire 1h store gpc0
      bind *:8080
      http-request redirect location /a if { src,in_table(test) }

The segfault is triggered using:

   curl -i http://127.0.0.1:8080/

This patch must be backported in 1.8
2018-06-26 13:51:46 +02:00
Christopher Faulet
ba86c6c25b MINOR: threads: Be sure to remove threads from all_threads_mask on exit
When HAProxy is started with several threads, Each running thread holds a bit in
the bitfiled all_threads_mask. This bitfield is used here and there to check
which threads are registered to take part in a specific processing. So when a
thread exits, it seems normal to remove it from all_threads_mask.

No direct impact could be identified with this right now but it would
be better to backport it to 1.8 as a preventive measure to avoid complex
situations like the one in previous bug.
2018-06-22 14:55:15 +02:00
Christopher Faulet
d8fd2af882 BUG/MEDIUM: threads: Use the sync point to check active jobs and exit
When HAProxy is shutting down, it exits the polling loop when there is no jobs
anymore (jobs == 0). When there is no thread, it works pretty well, but when
HAProxy is started with several threads, a thread can decide to exit because
jobs variable reached 0 while another one is processing a task (e.g. a
health-check). At this stage, the running thread could decide to request a
synchronization. But because at least one of them has already gone, the others
will wait infinitly in the sync point and the process will never die.

To fix the bug, when the first thread (and only this one) detects there is no
active jobs anymore, it requests a synchronization. And in the sync point, all
threads will check if jobs variable reached 0 to exit the polling loop.

This patch must be backported in 1.8.
2018-06-22 10:16:26 +02:00
Dave Chiluk
8618a6a5e2 MINOR: Some spelling cleanup in the comments.
Signed-off-by: Dave Chiluk <chiluk+haproxy@indeed.com>
2018-06-21 20:43:52 +02:00
Olivier Houchard
d0e60d852a BUG/MEDIUM: fd: Don't modify the update_mask in fd_dodelete().
Only the pollers should remove bits in the update_mask. Removing it will
mean if the fd is currently in the global update list, it will never be
removed, and while it's mostly harmless in 1.9, in 1.8, only update_mask
is checked to know if the fd is already in the list or not, so we can end
up trying to add a fd that is already in the list, and corrupt it, which
means some fd may not be added to the poller.

This should be backported to 1.8.
2018-06-20 10:21:44 +02:00
Emmanuel Hocdet
3448c490ca BUG/MEDIUM: ssl: do not store pkinfo with SSL_set_ex_data
Bug from 96b7834e: pkinfo is stored on SSL_CTX ex_data and should
not be also stored on SSL ex_data without reservation.
Simply extract pkinfo from SSL_CTX in ssl_sock_get_pkey_algo.

No backport needed.
2018-06-18 13:34:09 +02:00
Thierry FOURNIER
28962c9941 BUG/MAJOR: ssl: OpenSSL context is stored in non-reserved memory slot
We never saw unexplicated crash with SSL, so I suppose that we are
luck, or the slot 0 is always reserved. Anyway the usage of the macro
SSL_get_app_data() and SSL_set_app_data() seem wrong. This patch change
the deprecated functions SSL_get_app_data() and SSL_set_app_data()
by the new functions SSL_get_ex_data() and SSL_set_ex_data(), and
it reserves the slot in the SSL memory space.

For information, this is the two declaration which seems wrong or
incomplete in the OpenSSL ssl.h file. We can see the usage of the
slot 0 whoch is hardcoded, but never reserved.

   #define SSL_set_app_data(s,arg)     (SSL_set_ex_data(s,0,(char *)arg))
   #define SSL_get_app_data(s)      (SSL_get_ex_data(s,0))

This patch must be backported at least in 1.8, maybe in other versions.
2018-06-18 10:32:14 +02:00
Thierry FOURNIER
16ff050478 BUG/MAJOR: ssl: Random crash with cipherlist capture
The cipher list capture struct is stored in the SSL memory space,
but the slot is reserved in the SSL_CTX memory space. This causes
ramdom crashes.

This patch should be backported to 1.8
2018-06-18 10:32:12 +02:00
Frédéric Lécaille
f874a83b57 BUG/MINOR: lua: Segfaults with wrong usage of types.
Patrick reported that this simple configuration made haproxy segfaults:

    global
        lua-load /tmp/haproxy.lua

    frontend f1
        mode http
        bind :8000
        default_backend b1

        http-request lua.foo

    backend b1
        mode http
        server s1 127.0.0.1:8080

with this '/tmp/haproxy.lua' script:

    core.register_action("foo", { "http-req" }, function(txn)
        txn.sc:ipmask(txn.f:src(), 24, 112)
    end)

This is due to missing initialization of the array of arguments
passed to hlua_lua2arg_check() which makes it enter code with
corrupted arguments.

Thanks a lot to Patrick Hemmer for having reported this issue.

Must be backported to 1.8, 1.7 and 1.6.
2018-06-18 10:23:47 +02:00
Olivier Houchard
9db0fedb59 BUG/MINOR: tasklets: Just make sure we don't pass a tasklet to the handler.
We can't just set t to NULL if it's a tasklet, or we'd have a hard time
accessing to t->process, so just make sure we pass NULL as the first parameter
of t->process if it's a tasklet.
This should be a non-issue at this point, as tasklets aren't used yet.
2018-06-14 18:57:26 +02:00
William Lallemand
579fb25b62 BUG/MAJOR: map: fix a segfault when using http-request set-map
The bug happens with an existing entry, when you try to overwrite the
value with wrong data, for example, a string when the type is INT.

The code path was not secure and tried to set *err and *merr while
err = merr = NULL when performing an http action.

Must be backported in 1.6, 1.7, 1.8.
2018-06-11 11:02:06 +02:00
William Lallemand
6e1796e85d BUG/MINOR: signals: ha_sigmask macro for multithreading
The behavior of sigprocmask in an multithreaded environment is
undefined.

The new macro ha_sigmask() calls either pthreads_sigmask() or
sigprocmask() if haproxy was built with thread support or not.

This should be backported to 1.8.
2018-06-08 18:24:53 +02:00
William Lallemand
933642c6ef BUG/MINOR: don't ignore SIG{BUS,FPE,ILL,SEGV} during signal processing
We don't have any reason of blocking those signals.

If SIGBUS, SIGFPE, SIGILL, or SIGSEGV are generated while they are blocked, the
result is undefined, unless the signal was generated by kill(2), sigqueue(3), or
raise(3).

This should be backported to 1.8.
2018-06-08 18:22:43 +02:00
William Lallemand
1aab50bb4a BUG/MEDIUM: threads: handle signal queue only in thread 0
Signals were handled in all threads which caused some signals to be lost
from time to time. To avoid complicated lock system (threads+signals),
we prefer handling the signals in one thread avoiding concurrent access.

The side effect of this bug was that some process were not leaving from
time to time during a reload.

This patch must be backported in 1.8.
2018-06-08 18:22:31 +02:00
Thierry FOURNIER
fc044c98e4 MINOR: lua: Increase debug information
When an unrecoverable error raises, the user receive poor information
for the trouble shooting. For example:

   [ALERT] 157/143755 (21212) : Lua function 'hello-world': runtime error: memory allocation error: block too big.

Unfortunately, the memory allocation error can be throwed by many
function, and we have no informatio to reach the original cause.
This patch add the list of function called from the entry point to
the function in error, like this:

   [ALERT] 157/143755 (21212) : Lua function 'hello-world': runtime error: memory allocation error: block too big from [C] method 'req_get_headers', bug35.lua:2 global 'ee', bug35.lua:6 global 'ff', bug35.lua:10 C function line 9.
2018-06-08 18:18:33 +02:00
Olivier Houchard
b4dd15bd6f BUG/MINOR: unix: Make sure we can transfer abns sockets on seamless reload.
When checking if a socket we got from the parent is suitable for a listener,
we just checked that the path matched sockname.tmp, however this is
unsuitable for abns sockets, where we don't have to create a temporary
file and rename it later.
To detect that, check that the first character of the sun_path is 0 for
both, and if so, that &sun_path[1] is the same too.

This should be backported to 1.8.
2018-06-07 14:33:44 +02:00
Olivier Houchard
b1ca58b245 MINOR: tasks: Don't define rqueue if we're building without threads.
To make sure we don't inadvertently insert task in the global runqueue,
while only the local runqueue is used without threads, make its definition
and usage conditional on USE_THREAD.
2018-06-06 16:35:12 +02:00
David Carlier
cc0a957a50 MINOR: task: Fix compiler warning.
Waking up task, when checking if it is a valid entry.
Similarly to commit caa8a37ffe,
casting explicitally to void pointer as HA_ATOMIC_CAS needs.
2018-06-05 13:55:57 +02:00
Willy Tarreau
34b1facbcf MINOR: stats: also report the nice and number of calls for applets
Since applets are now part of the main scheduler, it's useful to report
their nice value and the number of calls to the applet handler, to see
where the CPU is spent.
2018-06-05 11:18:21 +02:00
Christopher Faulet
6381650516 MAJOR: spoe: upgrade the SPOP version to 2.0 and remove the support for 1.0
The commit c4dcaff3 ("BUG/MEDIUM: spoe: Flags are not encoded in network order")
introduced an incompatibility with older agents. So the major version of the
SPOP is increased to make the situation unambiguous. And because before the fix,
the protocol is buggy, the support of the version 1.0 is removed to be sure to
not continue to support buggy agents.

The agents in the contrib folder (spoa_example, modsecurity and mod_defender)
are also updated to announce the SPOP version 2.0.

So, to be clear, from the patch, connections to agents announcing the SPOP
version 1.0 will be rejected.

This patch must be backported in 1.8.
2018-06-04 17:33:48 +02:00
Thierry FOURNIER
66b8919b10 BUG/MEDIUM: lua/socket: Buffer error, may segfault
The buffer pointer is already updated. It is again updated
when it is given to the function ci_putblk().

This patch must be backported in 1.6, 1.7 and 1.8
2018-05-31 10:58:41 +02:00
Thierry FOURNIER
101b97619a BUG/MEDIUM: lua/socket: Sheduling error on write: may dead-lock
When we write data, we risk to encounter a dead-loack. The
function "stream_int_notify()" cannot be called the the
cosocket because the caller acquire a lock and when the socket
is closed, the cleanup function try to acquire the same lock.,
so a dead-lock raises.

In other way, the function stream_int_update_applet() can't
be called because it schedumes the applet only if some activity
in the buffers were detected. It is not always the case. We
replace this function by appctx_wakeup() which wake up the
applet inconditionnaly.

The last part of the fix is setting right signals. the applet
call the stream_int_update() function if the output buffer si
not empty, and ask for put data if some rite signals are
registered.

This patch must be backported in 1.6, 1.7 and 1.8. Note that it requires
patch "MINOR: task/notification: Is notifications registered" to be
applied.
2018-05-31 10:58:41 +02:00
Thierry FOURNIER
ba42fcd064 BUG/MEDIUM: lua/socket: Notification error
Each time the send function yields, a notification must be registered.
Without this notification, the task is never wakeup when data arrives.

Today, the notification is registered only if the buffer is not available.
Other cases like the buffer is too small for all data are not processed.

This patch must be backported in 1.6, 1.7 and 1.8
2018-05-31 10:58:41 +02:00
Thierry FOURNIER
7e4ee47acc BUG/MAJOR: lua: Dead lock with sockets
In some cases, when we are waiting for data and the socket
timeout expires, we have a dead lock. The Lua socket locks
the applet socket, and call for a notify. The notify
immediately executes code and try to acquire the same lock,
so ... dead lock.

stream_int_notify() cant be used because it wakeup the applet
task only if the stream have changes. The changes are forces
by Lua, but not repported on the stream.

stream_int_update_applet() cant be used because the deadlock.

So, I inconditionnaly wakeup the applet. This wake is performed
asynchronously, and will call a stream_int_notify().

This patch must be backported in 1.6, 1.7 and 1.8
2018-05-31 10:58:41 +02:00
Thierry FOURNIER
af4bd0867a BUG/MEDIUM: lua/socket: wrong scheduling for sockets
The appctx pointer is given from any variable which are wrong.
This implies the wakeup of wrong applet, and the socket are no
longer responsive.

This behavior is hidden by another inherited error which is
fixed in the next patch.

This patch remove all wrong appctx affectations.

This patch must be backported in 1.6, 1.7 and 1.8
2018-05-31 10:58:41 +02:00
Christopher Faulet
3a47e5e25c BUG/MEDIUM: spoe: Return an error when the wrong ACK is received in sync mode
This is required to let a message processing timed out. Because, when it
happens, there is no more context attached to the SPOE applet that sent the
NOTIFY frame. So when the ACK is received, it is too late. This is the same
situation when we receive the wrong ACK. It is invalid in sync mode. Otherwise,
the SPOE applet remains in the state "WAITING_SYNC_ACK" until the idle timeout
is reached. In such case, the applet is seen as busy and it is unusable. If this
happens too often, more and more applets will be created because some others are
blocked. If there is a maxconn on the SPOE backend, all processings will be
drastically slowdown.

Returning an error in such cases, in sync mode, allow us to terminate the SPOE
applet. Because it means the agent is unresponsive or too slow.

Note this bug exists only if the sync mode is used.

This patch must be backported in 1.8.
2018-05-30 15:34:48 +02:00
Ben Draut
44e609bfa5 MINOR: dns: Implement parse-resolv-conf directive
This introduces a new directive for the `resolvers` section:
`parse-resolv-conf`. When present, it will attempt to add any
nameservers in `/etc/resolv.conf` to the list of nameservers
for the current `resolvers` section.

[Mailing list thread][1].

[1]: https://www.mail-archive.com/haproxy@formilux.org/msg29600.html
2018-05-30 05:17:16 +02:00
Olivier Houchard
082627af77 MINOR: task: Also consider the task list size when getting global tasks.
We're taking tasks from the global runqueue based on the number of tasks
the thread already have in its local runqueue, but now that we have a task
list, we also have to take that into account.
2018-05-28 15:20:59 +02:00
Olivier Houchard
736ea41c6c BUG/MEDIUM: task: Don't forget to decrement max_processed after each task.
When the task list was introduced, we bogusly lost max_processed--, that means
we would execute as much tasks as present in the list, and we would never
set active_tasks_mask, so the thread would go to sleep even if more tasks were
to be executed.

1.9-dev only, no backport is needed.
2018-05-28 15:20:57 +02:00
Willy Tarreau
1b0f85e47f MINOR: stats: also report the failed header rewrites warnings on the stats page
These ones concern the warnings detected during header addition/insertion.
They are visible in the tooltip reporting the per-status codes stats. The
frontend and backend contain a total of request+response warnings, while
server only has the response warnings.
2018-05-28 15:16:23 +02:00
Tim Duesterhus
3fd1973d37 MINOR: http: Log warning if (add|set)-header fails
This patch adds a warning if an http-(request|reponse) (add|set)-header
rewrite fails to change the respective header in a request or response.

This usually happens when tune.maxrewrite is not sufficient to hold all
the headers that should be added.
2018-05-28 14:53:59 +02:00
Daniel Corbett
3e60b11100 BUG/MEDIUM: stick-tables: Decrement ref_cnt in table_* converters
When using table_* converters ref_cnt was incremented
and never decremented causing entries to not expire.

The root cause appears to be that stktable_lookup_key()
was called within all sample_conv_table_* functions which was
incrementing ref_cnt and not decrementing after completion.

Added stktable_release() to the end of each sample_conv_table_*
function and reworked the end logic to ensure that ref_cnt is
always decremented after use.

This should be backported to 1.8
2018-05-28 10:36:20 +02:00
Olivier Houchard
673867c357 MAJOR: applets: Use tasks, instead of rolling our own scheduler.
There's no real reason to have a specific scheduler for applets anymore, so
nuke it and just use tasks. This comes with some benefits, the first one
being that applets cannot induce high latencies anymore since they share
nice values with other tasks. Later it will be possible to configure the
applets' nice value. The second benefit is that the applet scheduler was
not very thread-friendly, having a big lock around it in prevision of this
change. Thus applet-intensive workloads should now scale much better with
threads.

Some more improvement is possible now : some applets also use a task to
handle timers and timeouts. These ones could now be simplified to use only
one task.
2018-05-26 20:03:30 +02:00
Olivier Houchard
1599b80360 MINOR: tasks: Make the number of tasks to run at once configurable.
Instead of hardcoding 200, make the number of tasks to be run configurable
using tune.runqueue-depth. 200 is still the default.
2018-05-26 20:03:24 +02:00
Olivier Houchard
b0bdae7b88 MAJOR: tasks: Introduce tasklets.
Introduce tasklets, lightweight tasks. They have no notion of priority,
they are just run as soon as possible, and will probably be used for I/O
later.

For the moment they're used to replace the temporary thread-local list
that was used in the scheduler. The first part of the struct is common
with tasks so that tasks can be cast to tasklets and queued in this list.
Once a task is in the tasklet list, it has its leaf_p set to 0x1 so that
it cannot accidently be confused as not in the queue.

Pure tasklets are identifiable by their nice value of -32768 (which is
normally not possible).
2018-05-26 20:03:19 +02:00
Olivier Houchard
f6e6dc12cd MAJOR: tasks: Create a per-thread runqueue.
A lot of tasks are run on one thread only, so instead of having them all
in the global runqueue, create a per-thread runqueue which doesn't require
any locking, and add all tasks belonging to only one thread to the
corresponding runqueue.

The global runqueue is still used for non-local tasks, and is visited
by each thread when checking its own runqueue. The nice parameter is
thus used both in the global runqueue and in the local ones. The rare
tasks that are bound to multiple threads will have their nice value
used twice (once for the global queue, once for the thread-local one).
2018-05-26 19:27:29 +02:00
Olivier Houchard
9f6af33222 MINOR: tasks: Change the task API so that the callback takes 3 arguments.
In preparation for thread-specific runqueues, change the task API so that
the callback takes 3 arguments, the task itself, the context, and the state,
those were retrieved from the task before. This will allow these elements to
change atomically in the scheduler while the application uses the copied
value, and even to have NULL tasks later.
2018-05-26 19:23:57 +02:00
Thierry FOURNIER
8c126c7235 BUG/MEDIUM: lua/socket: Length required read doesn't work
The limit of data read works only if all the data is in the
input buffer. Otherwise (if the data arrive in chunks), the
total amount of data is not taken in acount.

Only the current read data are compared to the expected amout
of data.

This patch must be backported from 1.9 to 1.6
2018-05-26 08:51:05 +02:00
Daniel Corbett
9215ffa6b2 BUG/MEDIUM: servers: Add srv_addr default placeholder to the state file
When creating a state file using "show servers state" an empty field is
created in the srv_addr column if the server is from the socket family
AF_UNIX.  This leads to a warning on start up when using
"load-server-state-from-file". This patch defaults srv_addr to "-" if
the socket family is not covered.

This patch should be backported to 1.8.
2018-05-24 22:06:08 +02:00
Olivier Houchard
f3d9e608d7 BUG/MEDIUM: dns: Delay the attempt to run a DNS resolution on check failure.
When checks fail, the code tries to run a dns resolution, in case the IP
changed.
The old way of doing that was to check, in case the last dns resolution
hadn't expired yet, if there were an applicable IP, which should be useless,
because it has already be done when the resolution was first done, or to
run a new resolution.
Both are a locking nightmare, and lead to deadlocks, so instead, just wake the
resolvers task, that should do the trick.

This should be backported to 1.8.
2018-05-23 16:57:15 +02:00
Lukas Tribus
926594f606 MINOR: ssl: set SSL_OP_PRIORITIZE_CHACHA
Sets OpenSSL 1.1.1's SSL_OP_PRIORITIZE_CHACHA unconditionally, as per [1]:

When SSL_OP_CIPHER_SERVER_PREFERENCE is set, temporarily reprioritize
ChaCha20-Poly1305 ciphers to the top of the server cipher list if a
ChaCha20-Poly1305 cipher is at the top of the client cipher list. This
helps those clients (e.g. mobile) use ChaCha20-Poly1305 if that cipher
is anywhere in the server cipher list; but still allows other clients to
use AES and other ciphers. Requires SSL_OP_CIPHER_SERVER_PREFERENCE.

[1] https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_clear_options.html
2018-05-23 16:55:15 +02:00
William Lallemand
8a16fe0d05 BUG/MEDIUM: cache: don't cache when an Authorization header is present
RFC 7234 says:

A cache MUST NOT store a response to any request, unless:
[...] the Authorization header field (see Section 4.2 of [RFC7235]) does
      not appear in the request, if the cache is shared, unless the
      response explicitly allows it (see Section 3.2), [...]

In this patch we completely disable the cache upon the receipt of an
Authorization header in the request. In this case it's not possible to
either use the cache or store into the cache anymore.

Thanks to Adam Eijdenberg of Digital Transformation Agency for raising
this issue.

This patch must be backported to 1.8.
2018-05-23 10:36:44 +02:00
Thierry Fournier
d5b073cf1f MINOR: lua: Improve error message
The function hlua_ctx_resume return less text message and more error
code. These error code allow the caller to return appropriate
message to the user.
2018-05-22 18:57:46 +02:00
Willy Tarreau
cbe6da5eb0 BUG/MINOR: ssl/lua: prevent lua from affecting automatic maxconn computation
Since commit 36d1374 ("BUG/MINOR: lua: Fix SSL initialisation") in 1.6, the
Lua code always initializes an SSL server. It caused a small visible side
effect which is that by calling ssl_sock_prepare_srv_ctx(), it forces
global.ssl_used_backend to 1 and makes the initialization code believe that
there are some SSL servers in certain backends. This detection is used to
figure how to set the global maxconn value when only the memory usage is
limited. As such, even a configuration with no SSL at all will have a very
conservative maxconn.

The configuration below exhibits this :

   global
        ssl-server-verify none
        stats socket /tmp/sock1 mode 666 level admin
        tune.bufsize 16384

   listen  px
        timeout client  5s
        timeout server  5s
        timeout connect 5s
        bind :4445
        #bind :4443 ssl crt rsa+dh2048.pem
        #server s1 127.0.0.1:8003 ssl

Starting it with "-m 200" to limit it to 200 MB of RAM reports 1500 for
Maxconn, the same when uncommenting the "server" line, and 1300 when
uncommenting the "bind" line, regardless of the "server" line's status.

In practice it doesn't make sense to consider that Lua's server template
counts for one regular SSL server, because even if used for SSL, it will
not take large connection counts, compared to a backend relaying traffic.
Thus the solution consists in resetting the ssl_used_backend to its
previous value after creating the server_ctx from the Lua code. With the
fix, the same config with the same parameters now show :
  - maxconn=5700 when neither side uses SSL
  - maxconn=1500 when only one side uses SSL
  - maxconn=1300 when both sides use SSL

This fix can be backported to versions 1.6 and beyond.
2018-05-18 17:09:35 +02:00
Christopher Faulet
68db0235fd CLEANUP: spoe: Remove unused variables the agent structure
applets_act and applets_idle were used for debugging purpose. Now, these values
are part of the agent's counters.
2018-05-18 15:04:46 +02:00
Thierry FOURNIER
c4dcaff3f0 BUG/MEDIUM: spoe: Flags are not encoded in network order
The flags are direct copy of the "unsigned int" in the network stream,
so the stream contains a 32 bits field encoded with the host endian.
 - This is not reliable for stream betwen different architecture host
 - For x86, the bits doesn't correspond to the documentation.

This patch add some precision in the documentation and put the bitfield
in the stream usig network butes order.

Warning: this patch can break compatibility with existing agents.

This patch should be backported in all version supporing SPOE

Original network capture:

   12:28:16.181343 IP 127.0.0.1.46782 > 127.0.0.1.12345: Flags [P.], seq 134:168, ack 59, win 342, options [nop,nop,TS val 2855241281 ecr 2855241281], length 34
           0x0000:  4500 0056 6b94 4000 4006 d10b 7f00 0001  E..Vk.@.@.......
           0x0010:  7f00 0001 b6be 3039 a3d1 ee54 7d61 d6f7  ......09...T}a..
           0x0020:  8018 0156 fe4a 0000 0101 080a aa2f 8641  ...V.J......./.A
           0x0030:  aa2f 8641 0000 001e 0301 0000 0000 010f  ./.A............
                                          ^^^^^^^^^^
           0x0040:  6368 6563 6b2d 636c 6965 6e74 2d69 7001  check-client-ip.
           0x0050:  0006 7f00 0001                           ......

Fixed network capture:

   12:24:26.948165 IP 127.0.0.1.46706 > 127.0.0.1.12345: Flags [P.], seq 4066280627:4066280661, ack 3148908096, win 342, options [nop,nop,TS val 2855183972 ecr 2855177690], length 34
           0x0000:  4500 0056 0538 4000 4006 3768 7f00 0001  E..V.8@.@.7h....
           0x0010:  7f00 0001 b672 3039 f25e 84b3 bbb0 8640  .....r09.^.....@
           0x0020:  8018 0156 fe4a 0000 0101 080a aa2e a664  ...V.J.........d
           0x0030:  aa2e 8dda 0000 001e 0300 0000 0114 010f  ................
                                          ^^^^^^^^^^
           0x0040:  6368 6563 6b2d 636c 6965 6e74 2d69 7001  check-client-ip.
           0x0050:  0006 7f00 0001                           ......
2018-05-18 13:50:53 +02:00
Thierry FOURNIER
01a3f20740 BUG/MINOR: spoe: Mistake in error message about SPOE configuration
The announced accepted chars are "[a-zA-Z_-.]", but
the real accepted alphabet is "[a-zA-Z0-9_.]".

Numbers are supported and "-" is not supported.

This patch should be backported to 1.8 and 1.7
2018-05-18 13:50:40 +02:00
sada
05ed330d72 BUG/MINOR: lua: Socket.send threw runtime error: 'close' needs 1 arguments.
Function `hlua_socket_close` expected exactly one argument on the Lua stack.
But when `hlua_socket_close` was called from `hlua_socket_write_yield`,
Lua stack had 3 arguments. So `hlua_socket_close` threw the exception with
message "'close' needs 1 arguments".

Introduced new helper function `hlua_socket_close_helper`, which removed the
Lua stack argument count check and only checked if the first argument was
a socket.

This fix should be backported to 1.8, 1.7 and 1.6.
2018-05-18 13:48:21 +02:00
Willy Tarreau
03f4ec47d9 BUG/MEDIUM: ssl: properly protect SSL cert generation
Commit 821bb9b ("MAJOR: threads/ssl: Make SSL part thread-safe") added
insufficient locking to the cert lookup and generation code : it uses
lru64_lookup(), which will automatically remove and add a list element
to the LRU list. It cannot be simply read-locked.

A long-term improvement should consist in using a lockless mechanism
in lru64_lookup() to safely move the list element at the head. For now
let's simply use a write lock during the lookup. The effect will be
minimal since it's used only in conjunction with automatically generated
certificates, which are much more expensive and rarely used.

This fix must be backported to 1.8.
2018-05-17 10:56:47 +02:00
Willy Tarreau
ba20dfc501 BUG/MEDIUM: http: don't always abort transfers on CF_SHUTR
Pawel Karoluk reported on Discourse[1] that HTTP/2 breaks url_param.

Christopher managed to track it down to the HTTP_MSGF_WAIT_CONN flag
which is set there to ensure the connection is validated before sending
the headers, as we may need to rewind the stream and hash again upon
redispatch. What happens is that in the forwarding code we refrain
from forwarding when this flag is set and the connection is not yet
established, and for this we go through the missing_data_or_waiting
path. This exit path was initially designed only to wait for data
from the client, so it rightfully checks whether or not the client
has already closed since in that case it must not wait for more data.
But it also has the side effect of aborting such a transfer if the
client has closed after the request, which is exactly what happens
in H2.

A study on the code reveals that this whole combined check should
be revisited : while it used to be true that waiting had the same
error conditions as missing data, it's not true anymore. Some other
corner cases were identified, such as the risk to report a server
close instead of a client timeout when waiting for the client to
read the last chunk of data if the shutr is already present, or
the risk to fail a redispatch when a client uploads some data and
closes before the connection establishes. The compression seems to
be at risk of rare issues there if a write to a full buffer is not
yet possible but a shutr is already queued.

At the moment these risks are extremely unlikely but they do exist,
and their impact is very minor since it mostly concerns an issue not
being optimally handled, and the fixes risk to cause more serious
issues. Thus this patch only focuses on how the HTTP_MSGF_WAIT_CONN
is handled and leaves the rest untouched.

This patch needs to be backported to 1.8, and could be backported to
earlier versions to properly take care of HTTP/1 requests passing via
url_param which are closed immediately after the headers, though this
is unlikely as this behaviour is only exhibited by scripts.

[1] https://discourse.haproxy.org/t/haproxy-1-8-x-url-param-issue-in-http2/2482/13
2018-05-16 11:35:05 +02:00
William Lallemand
0154edc96f BUG/MINOR: cli: don't stop cli_gen_usage_msg() when kw->usage == NULL
In commit abbf607 ("MEDIUM: cli: Add payload support") some cli keywords
without usage message have been added at the beginning of the keywords
array.

cli_gen_usage_usage_msg() use the kw->usage == NULL to stop generating
the usage message for the current keywords array. With those keywords at
the beginning, the whole array in cli.c was ignored in the usage message
generation.

This patch now checks the keyword itself, allowing a keyword without
usage message anywhere in the array.
2018-05-15 15:16:23 +02:00
PiBa-NL
c55b88ece6 BUG/MEDIUM: pollers/kqueue: use incremented position in event list
When composing the event list for subscribe to kqueue events, the index
where the new event is added must be after the previous events, as such
the changes counter should continue counting.

This caused haproxy to accept connections but not try read and process
the incoming data.

This patch is for 1.9 only
2018-05-11 14:08:56 +02:00
Willy Tarreau
29d698040d BUG/MINOR: lua: ensure large proxy IDs can be represented
In function hlua_fcn_new_proxy() too small a buffer was passed to
snprintf(), resulting in large proxy or listener IDs to make
snprintf() fail. It is unlikely to meet this case but let's fix it
anyway.

This fix must be backported to all stable branches where it applies.
2018-05-06 14:50:09 +02:00
PiBa-NL
706d5ee0c3 BUG/MINOR: lua: schedule socket task upon lua connect()
The parameters like server-address, port and timeout should be set before
process_stream task is called to avoid the stream being 'closed' before it
got initialized properly. This is most clearly visible when running with
tune.lua.forced-yield=1.. So scheduling the task should not be done when
creating the lua socket, but when connect is called. The error
"socket: not yet initialised, you can't set timeouts." would then appear.

Below code for example also shows this issue, as the sleep will
yield the lua code:
  local con = core.tcp()
  core.sleep(1)
  con:settimeout(10)
2018-05-06 14:36:41 +02:00
Olivier Houchard
cb92f5cae4 MINOR: pollers: move polled_mask outside of struct fdtab.
The polled_mask is only used in the pollers, and removing it from the
struct fdtab makes it fit in one 64B cacheline again, on a 64bits machine,
so make it a separate array.
2018-05-06 06:27:34 +02:00
Olivier Houchard
6b96f7289c BUG/MEDIUM: pollers: Use a global list for fd shared between threads.
With the old model, any fd shared by multiple threads, such as listeners
or dns sockets, would only be updated on one threads, so that could lead
to missed event, or spurious wakeups.
To avoid this, add a global list for fd that are shared, using the same
implementation as the fd cache, and only remove entries from this list
when every thread as updated its poller.

[wt: this will need to be backported to 1.8 but differently so this patch
 must not be backported as-is]
2018-05-06 06:27:09 +02:00
Olivier Houchard
6a2cf8752c MINOR: fd: Make the lockless fd list work with multiple lists.
Modify fd_add_to_fd_list() and fd_rm_from_fd_list() so that they take an
offset in the fdtab to the list entry, instead of hardcoding the fd cache,
so we can use them with other lists.
2018-05-06 06:25:49 +02:00
Olivier Houchard
9b36cb4a41 BUG/MEDIUM: task: Don't free a task that is about to be run.
While running a task, we may try to delete and free a task that is about to
be run, because it's part of the local tasks list, or because rq_next points
to it.
So flag any task that is in the local tasks list to be deleted, instead of
run, by setting t->process to NULL, and re-make rq_next a global,
thread-local variable, that is modified if we attempt to delete that task.

Many thanks to PiBa-NL for reporting this and analysing the problem.

This should be backported to 1.8.
2018-05-04 20:11:04 +02:00
Dragan Dosen
336a11f755 BUG/MINOR: map: correctly track reference to the last ref_elt being dumped
The bug was introduced in the commit 8d85aa4 ("BUG/MAJOR: map: fix
segfault during 'show map/acl' on cli").

This patch should be backported to 1.8, 1.7 and 1.6.
2018-05-04 17:14:39 +02:00
Patrick Hemmer
32d539fa88 MINOR: lua: add get_maxconn and set_maxconn to LUA Server class. 2018-05-03 18:53:42 +02:00
Patrick Hemmer
a62ae7ed9a MINOR: lua: Add server name & puid to LUA Server class. 2018-05-03 18:44:44 +02:00
Willy Tarreau
760e81d356 MINOR: backend: implement random-based load balancing
For large farms where servers are regularly added or removed, picking
a random server from the pool can ensure faster load transitions than
when using round-robin and less traffic surges on the newly added
servers than when using leastconn.

This commit introduces "balance random". It internally uses a random as
the key to the consistent hashing mechanism, thus all features available
in consistent hashing such as weights and bounded load via hash-balance-
factor are usable. It is extremely convenient because one common concern
when using random is what happens when a server is hammered a bit too
much. Here that can trivially be avoided, like in the configuration below :

    backend bk0
        balance random
        hash-balance-factor 110
        server-template s 1-100 127.0.0.1:8000 check inter 1s

Note that while "balance random" internally relies on a hash algorithm,
it holds the same properties as round-robin and as such is compatible with
reusing an existing server connection with "option prefer-last-server".
2018-05-03 07:20:40 +02:00
PiBa-NL
fe971b35ae BUG/MINOR, BUG/MINOR: lua: Put tasks to sleep when waiting for data
If a lua socket is waiting for data it currently spins at 100% cpu usage.
This because the TICK_ETERNITY returned by the socket is ignored when
setting the 'expire' time of the task.

Fixed by removing the check for yields that return TICK_ETERNITY.

This should be backported to at least 1.8.
2018-05-03 05:00:25 +02:00
Christopher Faulet
148b16e1ce BUG/MEDIUM: threads: Fix the sync point for more than 32 threads
In the sync point, to know if a thread has requested a synchronization, we call
the function thread_need_sync(). It should return 1 if yes, otherwise it should
return 0. It is intended to return a signed integer.

But internally, instead of returning 0 or 1, it returns 0 or tid_bit
(threads_want_sync & tid_bit). So, tid_bit is casted in integer. For the first
32 threads, it's ok, because we always check if thread_need_sync() returns
something else than 0. But this is a problem if HAProxy is started with more
than 32 threads, because for threads 33 to 64 (so for tid 32 to 63), their
tid_bit casted to integer are evaluated to 0. So the sync point does not work for
more than 32 threads.

Now, the function thread_need_sync() respects its contract, returning 0 or
1. the function thread_no_sync() has also been updated to avoid any ambiguities.

This patch must be backported in HAProxy 1.8.
2018-05-02 17:58:36 +02:00
Christopher Faulet
b119a79fc3 BUG/MINOR: checks: Fix check->health computation for flapping servers
This patch fixes an old bug introduced in the commit 7b1d47ce ("MAJOR: checks:
move health checks changes to set_server_check_status()"). When a DOWN server is
flapping, everytime a check succeds, check->health is incremented. But when a
check fails, it is decremented only when it is higher than the rise value. So if
only one check succeds for a DOWN server, check->health will remain set to 1 for
all subsequent failing checks.

So, at first glance, it seems not that terrible because the server remains
DOWN. But it is reported in the transitional state "DOWN server, going up". And
it will remain in this state until it is UP again. And there is also an
insidious side effect. If a DOWN server is flapping time to time, It will end to
be considered UP after a uniq successful check, , regardless the rise threshold,
because check->health will be increased slowly and never decreased.

To fix the bug, we just need to reset check->health to 0 when a check fails for
a DOWN server. To do so, we just need to relax the condition to handle a failure
in the function set_server_check_status.

This patch must be backported to haproxy 1.5 and newer.
2018-05-02 14:57:58 +02:00
Patrick Hemmer
e027547f8d MINOR: ssl: add fetch 'ssl_fc_session_key' and 'ssl_bc_session_key'
These fetches return the SSL master key of the front/back connection.
This is useful to decrypt traffic encrypted with ephemeral ciphers.
2018-04-30 14:56:19 +02:00
Patrick Hemmer
419667746b MINOR: ssl: disable SSL sample fetches when unsupported
Previously these fetches would return empty results when HAProxy was
compiled
without the requisite SSL support. This results in confusion and problem
reports from people who unexpectedly encounter the behavior.
2018-04-30 14:56:19 +02:00
Willy Tarreau
46deab6e64 BUG/MINOR: config: disable http-reuse on TCP proxies
Louis Chanouha reported an inappropriate warning when http-reuse is
present in a defaults section while a TCP proxy accidently inherits
it and finds a conflict with other options like the use of the PROXY
protocol. To fix this patch removes the http-reuse option for TCP
proxies.

This fix needs to be backported to 1.8, 1.7 and possibly 1.6.
2018-04-28 07:18:15 +02:00
Tim Duesterhus
e2b10bf491 MINOR: http: Add support for 421 Misdirected Request
This makes haproxy aware of HTTP 421 Misdirected Request, which
is defined in RFC 7540, section 9.1.2.
2018-04-28 07:03:39 +02:00
Tim Duesterhus
ca097c16a8 MINOR: sample: Add strcmp sample converter
This converter supplements the existing string matching by allowing
strings to be converted to a variable.

Example usage:

  http-request set-var(txn.host) hdr(host)
  # Check whether the client is attempting domain fronting.
  acl ssl_sni_http_host_match ssl_fc_sni,strcmp(txn.host) eq 0
2018-04-28 07:03:39 +02:00
Christopher Faulet
5bc9972ed8 BUG/MINOR: lua/threads: Make lua's tasks sticky to the current thread
PiBa-NL reported a bug with tasks registered in lua when HAProxy is started with
serveral threads. These tasks have not specific affinity with threads so they
can be woken up on any threads. So, it is impossbile for these tasks to handled
cosockets or applets, because cosockets and applets are sticky on the thread
which created them. It is forbbiden to manipulate a cosocket from another
thread.

So to fix the bug, tasks registered in lua are now sticky to the current
thread. Because these tasks can be registered before threads creation, the
affinity is set the first time a lua's task is processed.

This patch must be backported in HAProxy 1.8.
2018-04-26 22:58:16 +02:00
Aurélien Nephtali
1e0867cfbc MINOR: ssl: Add payload support to "set ssl ocsp-response"
It is now possible to use a payload with the "set ssl ocsp-response"
command.  These syntaxes will work the same way:

 # echo "set ssl ocsp-response $(base64 -w 10000 ocsp.der)" | \
     socat /tmp/sock1 -

 # echo -e "set ssl ocsp-response <<\n$(base64 ocsp.der)\n" | \
     socat /tmp/sock1 -

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-26 14:20:09 +02:00
Aurélien Nephtali
25650ce513 MINOR: map: Add payload support to "add map"
It is now possible to use a payload with the "add map" command.
These syntaxes will work the same way:

 # echo "add map #-1 key value" | socat /tmp/sock1 -

 # echo -e "add map #-1 <<\n$(cat data)\n" | socat /tmp/sock1 -

with

 # cat data
 key1 value1 with spaces
 key2 value2
 key3 value3 also with spaces

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-26 14:20:01 +02:00
Aurélien Nephtali
abbf607105 MEDIUM: cli: Add payload support
In order to use arbitrary data in the CLI (multiple lines or group of words
that must be considered as a whole, for example), it is now possible to add a
payload to the commands. To do so, the first line needs to end with a special
pattern: <<\n. Everything that follows will be left untouched by the CLI parser
and will be passed to the commands parsers.

Per-command support will need to be added to take advantage of this
feature.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-26 14:19:33 +02:00
Christopher Faulet
799f51801a BUG/MINOR: spoe: Fix parsing of dontlog-normal option
A missing goto led to a parsing error when line "option dontlog-normal" was
parsed.
2018-04-26 11:50:30 +02:00
Christopher Faulet
ebe1399efe BUG/MINOR: spoe: Fix counters update when processing is interrupted
When the processing is interrupted, because of a typo, <nb_sending> was
incremented instead of decremented.
2018-04-26 11:50:18 +02:00
Willy Tarreau
eba10f24b7 BUG/MEDIUM: h2: implement missing support for chunked encoded uploads
Upload requests not carrying a content-length nor tunnelling data must
be sent chunked-encoded over HTTP/1. The code was planned but for some
reason forgotten during the implementation, leading to such payloads to
be sent as tunnelled data.

Browsers always emit a content length in uploads so this problem doesn't
happen for most sites. However some applications may send data frames
after a request without indicating it earlier.

The only way to detect that a client will need to send data is that the
HEADERS frame doesn't hold the ES bit. In this case it's wise to look
for the content-length header. If it's not there, either we're in tunnel
(CONNECT method) or chunked-encoding (other methods).

This patch implements this.

The following request is sent using content-length :

    curl --http2 -sk https://127.0.0.1:4443/s2 -XPOST -T /large/file

and these ones using chunked-encoding :

    curl --http2 -sk https://127.0.0.1:4443/s2 -XPUT -T /large/file
    curl --http2 -sk https://127.0.0.1:4443/s2 -XPUT -T - < /dev/urandom

Thanks to Robert Samuel Newson for raising this issue with details.
This fix must be backported to 1.8.
2018-04-26 10:20:44 +02:00
Willy Tarreau
174b06a572 MINOR: h2: detect presence of CONNECT and/or content-length
We'll need this in order to support uploading chunks. The h2 to h1
converter checks for the presence of the content-length header field
as well as the CONNECT method and returns these information to the
caller. The caller indicates whether or not a body is detected for
the message (presence of END_STREAM or not). No transfer-encoding
header is emitted yet.
2018-04-26 10:15:14 +02:00
Tim Duesterhus
cd235c6042 BUG/MEDIUM: lua: Fix segmentation fault if a Lua task exits
PiBa-NL reported that haproxy crashes with a segmentation fault
if a function registered using `core.register_task` returns.

An example Lua script that reproduces the bug is:

  mytask = function()
  	core.Info("Stopping task")
  end
  core.register_task(mytask)

The Valgrind output is as follows:

  ==6759== Process terminating with default action of signal 11 (SIGSEGV)
  ==6759==  Access not within mapped region at address 0x20
  ==6759==    at 0x5B60AA9: lua_sethook (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==6759==    by 0x430264: hlua_ctx_resume (hlua.c:1009)
  ==6759==    by 0x43BB68: hlua_process_task (hlua.c:5525)
  ==6759==    by 0x4FED0A: process_runnable_tasks (task.c:231)
  ==6759==    by 0x4B2256: run_poll_loop (haproxy.c:2397)
  ==6759==    by 0x4B2256: run_thread_poll_loop (haproxy.c:2459)
  ==6759==    by 0x41A7E4: main (haproxy.c:3049)

Add the missing `task = NULL` for the `HLUA_E_OK` case. The error cases
have been fixed as of 253e53e661 which
first was included in haproxy v1.8-dev3. This bugfix should be backported
to haproxy 1.8.
2018-04-25 11:30:56 +02:00
Rian McGuire
89fcb7d929 BUG/MINOR: log: t_idle (%Ti) is not set for some requests
If TCP content inspection is used, msg_state can be >= HTTP_MSG_ERROR
the first time http_wait_for_request is called. t_idle was being left
unset in that case.

In the example below :
     stick-table type string len 64 size 100k expire 60s
     tcp-request inspect-delay 1s
     tcp-request content track-sc1 hdr(X-Session)

%Ti will always be -1, because the msg_state is already at HTTP_MSG_BODY
when http_wait_for_request is called for the first time.

This patch should backported to 1.8 and 1.7.
2018-04-25 08:59:23 +02:00
Tim Duesterhus
45be38c9c7 BUG/MAJOR: channel: Fix crash when trying to read from a closed socket
When haproxy is compiled using GCC <= 3.x or >= 5.x the `unlikely`
macro performs a comparison with zero: `(x) != 0`, thus returning
either 0 or 1.

In `int co_getline_nc()` this macro was accidentally applied to
the variable `retcode` itself, instead of the result of the
comparison `retcode <= 0`. As a result any negative `retcode`
is converted to `1` for purposes of the comparison.
Thus never taking the branch (and exiting the function) for
negative values.

This in turn leads to reads of uninitialized memory in the for-loop
below:

  ==12141== Conditional jump or move depends on uninitialised value(s)
  ==12141==    at 0x4EB6B4: co_getline_nc (channel.c:346)
  ==12141==    by 0x421CA4: hlua_socket_receive_yield (hlua.c:1713)
  ==12141==    by 0x421F6F: hlua_socket_receive (hlua.c:1896)
  ==12141==    by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B497: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529711A: lua_pcallk (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52ABDF0: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529A9F1: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B523: lua_resume (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==
  ==12141== Use of uninitialised value of size 8
  ==12141==    at 0x4EB6B9: co_getline_nc (channel.c:346)
  ==12141==    by 0x421CA4: hlua_socket_receive_yield (hlua.c:1713)
  ==12141==    by 0x421F6F: hlua_socket_receive (hlua.c:1896)
  ==12141==    by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B497: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529711A: lua_pcallk (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52ABDF0: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529A9F1: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B523: lua_resume (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==
  ==12141== Invalid read of size 1
  ==12141==    at 0x4EB6B9: co_getline_nc (channel.c:346)
  ==12141==    by 0x421CA4: hlua_socket_receive_yield (hlua.c:1713)
  ==12141==    by 0x421F6F: hlua_socket_receive (hlua.c:1896)
  ==12141==    by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B497: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529711A: lua_pcallk (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52ABDF0: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529A9F1: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B523: lua_resume (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==  Address 0x8637171e928bb500 is not stack'd, malloc'd or (recently) free'd

Fix this bug by correctly applying the `unlikely` macro to the result of the comparison.

This bug exists as of commit ca16b03813
which is the first commit adding this function.

v1.6-dev1 is the first tag containing this commit, the fix should
be backported to haproxy 1.6 and newer.
2018-04-25 05:39:49 +02:00
Aurélien Nephtali
564d15a71e BUG/MINOR: pattern: Add a missing HA_SPIN_INIT() in pat_ref_newid()
pat_ref_newid() is lacking a spinlock init. It was probably forgotten
in b5997f740b ("MAJOR: threads/map: Make acls/maps thread safe").

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-19 17:49:48 +02:00
Willy Tarreau
3f0e1ec701 BUG/CRITICAL: h2: fix incorrect frame length check
The incoming H2 frame length was checked against the max_frame_size
setting instead of being checked against the bufsize. The max_frame_size
only applies to outgoing traffic and not to incoming one, so if a large
enough frame size is advertised in the SETTINGS frame, a wrapped frame
will be defragmented into a temporary allocated buffer where the second
fragment my overflow the heap by up to 16 kB.

It is very unlikely that this can be exploited for code execution given
that buffers are very short lived and their address not realistically
predictable in production, but the likeliness of an immediate crash is
absolutely certain.

This fix must be backported to 1.8.

Many thanks to Jordan Zebor from F5 Networks for reporting this issue
in a responsible way.
2018-04-19 10:35:30 +02:00
Willy Tarreau
9eb2a4addf BUILD: sample: avoid build warning in sample.c
Recent commit 9631a28 ("MEDIUM: sample: Extend functionality for field/word
converters") introduced this minor build warning that this patch addresses :

 src/sample.c: In function 'sample_conv_word':
 src/sample.c:2108:8: warning: suggest explicit braces to avoid ambiguous 'else' [-Wparentheses]
 src/sample.c:2137:8: warning: suggest explicit braces to avoid ambiguous 'else' [-Wparentheses]

No backport is needed.
2018-04-19 10:33:28 +02:00
Olivier Houchard
ebaba75429 BUG/MEDIUM: kqueue: When adding new events, provide an output to get errors.
When adding new events using kevent(), if there's an error, because we're
trying to delete an event that wasn't there, or because the fd has already
been closed, kevent() will either add an event in the eventlist array if
there's enough room for it, and keep on handling other events, or stop and
return -1.
We want it to process all the events, so give it a large-enough array to
store any error.

Special thanks to PiBa-NL for diagnosing the root cause of this bug.

This should be backported to 1.8.
2018-04-17 17:46:56 +02:00
William Lallemand
daf4cd209a MINOR: export localpeer as an environment variable
Export localpeer as the environment variable $HAPROXY_LOCALPEER,
allowing to use this variable in the configuration file.

It's useful to use this variable in the case of synchronized
configuration between peers.
2018-04-17 17:17:58 +02:00
Marcin Deranek
9631a28275 MEDIUM: sample: Extend functionality for field/word converters
Extend functionality of field/word converters, so it's possible
to extract field(s)/word(s) counting from the beginning/end and/or
extract multiple fields/words (including separators) eg.

str(f1_f2_f3__f5),field(2,_,2)  # f2_f3
str(f1_f2_f3__f5),field(2,_,0)  # f2_f3__f5
str(f1_f2_f3__f5),field(-2,_,3) # f2_f3_
str(f1_f2_f3__f5),field(-3,_,0) # f1_f2_f3

str(w1_w2_w3___w4),word(3,_,2)  # w3___w4
str(w1_w2_w3___w4),word(2,_,0)  # w2_w3___w4
str(w1_w2_w3___w4),word(-2,_,3) # w1_w2_w3
str(w1_w2_w3___w4),word(-3,_,0) # w1_w2

Change is backward compatible.
2018-04-17 11:27:48 +02:00
Aurélien Nephtali
9a4da683a6 MINOR: cli: Ensure the CLI always outputs an error when it should
When using the CLI_ST_PRINT_FREE state, always output something back
if the faulty function did not fill the 'err' variable.
The map/acl code could lead to a crash whereas the SSL code was silently
failing.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-16 19:23:16 +02:00
Aurélien Nephtali
c511b7cc97 BUG/MINOR: cli: Guard against NULL messages when using CLI_ST_PRINT_FREE
Some error paths (especially those followed when running out of memory)
can set the error message to NULL. In order to avoid a crash, use a
generic message ("Out of memory") when this case arises.

It should be backported to 1.8.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-16 19:22:42 +02:00
Ben Draut
054fbee67a MINOR: config: Warn if resolvers has no nameservers
Today, a `resolvers` section may be configured without any `nameserver`
directives, which is useless. This implements a warning when such
sections are detected.

[List thread][1].

[1]: https://www.mail-archive.com/haproxy@formilux.org/msg29600.html
2018-04-16 15:58:23 +02:00
Marcin Deranek
9a66dfbd6c MINOR: proxy: Add fe_defbe fetcher
Patch adds ability to fetch frontend's default backend name in your
logic, so it can be used later to derive other backend names to make routing
decisions.
2018-04-16 15:51:57 +02:00
Christopher Faulet
11ebb2080e BUG/MINOR: http: Return an error in proxy mode when url2sa fails
In proxy mode, the result of url2sa is never checked. So when the function fails
to resolve the destination server from the URL, we continue. Depending on the
internal state of the connection, we get different behaviours. With a newly
allocated connection, the field <addr.to> is not set. So we will get a HTTP
error. The status code is 503 instead of 400, but it's not really critical. But,
if it's a recycled connection, we will reuse the previous value of <addr.to>,
opening a connection on an unexpected server.

To fix the bug, we return an error when url2sa fails.

This patch should be backported in all version from 1.5.
2018-04-16 15:31:18 +02:00
Thierry Fournier
f7b7c3e2f2 MINOR: servers: Support alphanumeric characters for the server templates names
'server-template' directive doesn't support the same name alphabet as
the 'server' directive. This patch allows the usage of chars [0-9].

[wt: let's backport this to 1.8 to apply the principle of least surprize
 to people migrating to server templates]
2018-04-06 19:16:18 +02:00
Willy Tarreau
1093a4586c BUG/MAJOR: cache: always initialize newly created objects
Recent commit 5bd37fa ("BUG/MAJOR: cache: fix random crashes caused by
incorrect delete() on non-first blocks") addressed an issue where
dangling objects could be deleted in the cache, but even after this fix
some similar segfaults were reported at the same place (cache_free_blocks()).
The tree was always corrupted as well. Placing some traces revealed
that this time it's caused by a missing initialization in
http_action_store_cache() : while object->eb.key is used to note that
the object is not in the tree, the first retrieved block may contain
random data and is not initialized. Further, this entry can be updated
later without the object being inserted into the tree. Thus, if at the
end the object is not stored and the blocks are put back to the avail
list, the next attempt to use them will find eb.key != 0 and will try
to delete the uninitialized block, will see that eb.node.leaf_p is not
NULL (random data), and will dereference it as well as a few other
uninitialized pointers. It was harder to trigger than the previous one,
despite being very closely related. This time the following config
was used :

  listen l1
    mode http
    bind :8888
    http-request cache-use c1
    http-response cache-store c1
    server s1 127.0.0.1:8000

  cache c1
    total-max-size 4
    max-age 10

Httpterm was running on port 8000.

And it was stressed this way :

  $ inject -o 1 -u 500 -P 1 -G '127.0.0.1:8888/?s=4097&p=1&x=%s'
  ... wait 5 seconds then Ctrl-C ...
  # wait 3 seconds doing nothing
  $ inject -o 1 -u 500 -P 1 -G '127.0.0.1:8888/?s=4097&p=1&x=%s'
  => segfault

Other values don't work well. The size and the small pieces in the
responses (p=1) are critical to make it work.

Here the fix consists in pre-zeroing object->eb.key AND object->eb.leaf_p
just after the object is allocated so as to stay consistent with other
locations. Ideally this could be simplified later by only relying on
eb->node.leaf_p everywhere since in the end the key alone is not a
reliable indicator, so that we use only one indicator of being part of
the tree or not.

This fix needs to be backported to 1.8.
2018-04-06 19:02:25 +02:00
Christopher Faulet
caf2feca62 MINOR: spoe: Add counters to log info about SPOE agents
In addition to metrics about time spent in the SPOE, following counters have
been added:

  * applets : number of SPOE applets.
  * idles : number of idle applets.
  * nb_sending : number of streams waiting to send data.
  * nb_waiting : number of streams waiting for a ack.
  * nb_processed : number of events/groups processed by the SPOE (from the
                   stream point of view).
  * nb_errors : number of errors during the processing (from the stream point of
                view).

Log messages has been updated to report these counters. Following pattern has
been added at the end of the log message:

    ... <idles>/<applets> <nb_sending>/<nb_waiting> <nb_error>/<nb_processed>
2018-04-05 15:13:54 +02:00
Christopher Faulet
3b8e34902b MINOR: spoe: use agent's logger to log SPOE messages
Instead of using the logger of the stream, we now use dedicated logger of the
SPOE. This means a logger should be defined.
2018-04-05 15:13:54 +02:00
Christopher Faulet
0e0f085a73 MINOR: spoe: Add support for option dontlog-normal in the SPOE agent section
It does the same than for proxies.
2018-04-05 15:13:54 +02:00
Christopher Faulet
7250b8fb5c MINOR: spoe: Add loggers dedicated to the SPOE agent
Now it is possible to configure a logger in a spoe-agent section using a "log"
line, as for a proxy. "no log", "log global" and "log <address> ..." syntaxes
are supported.
2018-04-05 15:13:54 +02:00
Christopher Faulet
28ac099907 MINOR: log: Keep the ref when a log server is copied to avoid duplicate entries
With "log global" line, the global list of loggers are copied into the proxy's
struct. The list coming from the default section is also copied when a frontend
or a backend section is parsed. So it is possible to have duplicate entries in
the proxy's list. For instance, with this following config, all messages will be
logged twice:

    global
        log 127.0.0.1 local0 debug
        daemon

    defaults
        mode   http
        log    global
        option httplog

    frontend front-http
        log global
        bind *:8888
        default_backend back-http

    backend back-http
        server www 127.0.0.1:8000
2018-04-05 15:13:54 +02:00
Christopher Faulet
4b0b79dd56 MINOR: log: move 'log' keyword parsing in dedicated function
Now, the function parse_logsrv should be used to parse a "log" line. This
function will update the list of loggers passed in argument. It can release all
log servers when "no log" line was parsed (by the caller) or it can parse "log
global" or "log <address> ... " lines. It takes care of checking the caller
context (global or not) to prohibit "log global" usage in the global section.
2018-04-05 15:13:54 +02:00
Christopher Faulet
36bda1cd4a MINOR: spoe: Add options to store processing times in variables
"set-process-time" and "set-total-time" options have been added to store
processing times in the transaction scope, at each event and group processing,
the current one and the total one. So it is possible to get them.

TODO: documentation
2018-04-05 15:13:54 +02:00
Christopher Faulet
b2dd1e034c MINOR: spoe: Add metrics in to know time spent in the SPOE
Following metrics are added for each event or group of messages processed in the
SPOE:

  * processing time: the delay to process the event or the group. From the
                     stream point of view, it is the latency added by the SPOE
                     processing.
  * request time : It is the encoding time. It includes ACLs processing, if
                   any. For fragmented frames, it is the sum of all fragments.
  * queue time : the delay before the request gets out the sending queue. For
                 fragmented frames, it is the sum of all fragments.
  * waiting time: the delay before the reponse is received. No fragmentation
                  supported here.
  * response time: the delay to process the response. No fragmentation supported
                   here.
  * total time: (unused for now). It is the sum of all events or groups
                processed by the SPOE for a specific threads.

Log messages has been updated. Before, only errors was logged (status_code !=
0). Now every processing is logged, following this format:

  SPOE: [AGENT] <TYPE:NAME> sid=STREAM-ID st=STATUC-CODE reqT/qT/wT/resT/pT

where:

  AGENT              is the agent name
  TYPE               is EVENT of GROUP
  NAME               is the event or the group name
  STREAM-ID          is an integer, the unique id of the stream
  STATUS_CODE        is the processing's status code
  reqT/qT/wT/resT/pT are delays descrive above

For all these delays, -1 means the processing was interrupted before the end. So
-1 for the queue time means the request was never dequeued. For fragmented
frames it is harder to know when the interruption happened.

For now, messages are logged using the same logger than the backend of the
stream which initiated the request.
2018-04-05 15:13:53 +02:00
Christopher Faulet
879dca9a76 BUG/MINOR: spoe: Don't forget to decrement fpa when a processing is interrupted
In async or pipelining mode, we count the number of NOTIFY frames sent waiting
for their corresponding ACK frames. This is a way to evaluate the "load" of a
SPOE applet. For pipelining mode, it is easy to make the link between a NOTIFY
frame and its ACK one, because exchanges are done using the same TCP connection.

For async mode, it is harder because a ACK frame can be received on another
connection than the one sending the NOTIFY frame. So to decrement the fpa of the
right applet, we need to keep it in the SPOE context. Most of time, it works
expect when the processing is interrupted by the stream, because of a timeout.

This patch fixes this issue. If a SPOE applet is still link to a SPOE context
when the processing is interrupted by the stream, the applet's fpa is
decremented. This is only done for unfragmented frames.
2018-04-05 15:13:53 +02:00
Christopher Faulet
b7426d1562 BUG/MINOR: spoe: Register the variable to set when an error occurred
Variables referenced in HAProxy's configuration file are registered during the
configuration parsing (during parsing of "var", "set-var" or "unset-var"
keywords). For the SPOE, you can use "register-var-names" directive to
explicitly register variable names. All unknown variables will be rejected
(unless you set "force-set-var" option). But, the variable set when an error
occurred (when "set-on-error" option is defined) should also be regiestered by
default. This is done with this patch.
2018-04-05 15:13:53 +02:00
Christopher Faulet
ac580608d7 BUG/MINOR: spoe: Don't release the context buffer in .check_timeouts callbaclk
It is better to let spoe_stop_processing release this buffer because, in
.check_timeouts callback, we lack information to know if it should be release or
not. For instance, if the processing timeout is reached while the SPOE applet
receives the reply, it is preferable to ignore the timeout and process the
result.

This patch should be backported in 1.8.
2018-04-05 15:13:53 +02:00
Christopher Faulet
84c844eb12 BUG/MINOR: spoe: Initialize variables used during conf parsing before any check
Some initializations must be done at the beginning of parse_spoe_flt to avoid
segmentaion fault when first errors are catched, when the "filter spoe" line is
parsed.

This patch must be backported in 1.8.
[cf: the variable "curvars" doesn't exist in 1.8. So the patch must be adapted.]
2018-04-05 15:13:53 +02:00
Willy Tarreau
5bd37fa625 BUG/MAJOR: cache: fix random crashes caused by incorrect delete() on non-first blocks
Several segfaults were reported in the cache, each time in eb_delete()
called from cache_free_blocks() itself called from shctx_row_reserve_hot().
Each time the tree node was corrupted with random cached data (often JS or
HTML contents).

The problem comes from an incompatibility between the cache's expectations
and the recycling algorithm used in the shctx. The shctx allocates and
releases a chain of blocks at once. And when it needs to allocate N blocks
from the avail list while a chain of M>N is found, it picks the first N
from the list, moves them to the hot list, and marks all remaining M-N
blocks as isolated blocks (chains of 1).

For each such released block, the shctx->free_block() callback is used
and passed a pointer to the first and current block of the chain. For
the cache, it's cache_free_blocks(). What this function does is check
that the current block is the first one, and in this case delete the
object from the tree and mark it as not in tree by setting key to zero.

The problem this causes is that the tail blocks when M>N become first
blocks for the next call to shctx_row_reserve_hot(), these ones will
be passed to cache_free_blocks() as list heads, and will be sent to
eb_delete() despite containing only cached data.

The simplest solution for now is to mark each block as holding no cache
object by setting key to zero all the time. It keeps the principle used
elsewhere in the code. The SSL code is not subject to this problem
because it relies on the block's len not being null, which happens
immediately after a block was released. It was uncertain however whether
this method is suitable for the cache. It is not critical though since
this code is going to change soon in 1.9 to dynamically allocate only
the number of required blocks.

This fix must be backported to 1.8. Thanks to Thierry for providing
exploitable cores.
2018-04-04 20:17:03 +02:00
Willy Tarreau
afe1de5d98 BUG/MINOR: cache: fix "show cache" output
The "show cache" command used to dump the header for each entry into into
the handler loop, making it repeated every ~16kB of output data. Additionally
chunk_appendf() was used instead of chunk_printf(), causing the output to
repeat already emitted lines, and the output size to grow in O(n^2). It used
to take several minutes to report tens of millions of objects from a small
cache containing only a few thousands. There was no more impact though.

This fix must be backported to 1.8.
2018-04-04 11:56:43 +02:00
Christopher Faulet
b797ae1f15 BUG/MINOR: email-alert: Set the mailer port during alert initialization
Since the commit 2f3a56b4f ("BUG/MINOR: tcp-check: use the server's service port
as a fallback"), email alerts stopped working because the mailer's port was
overriden by the server's port. Remember, email alerts are defined as checks
with specific tcp-check rules and triggered on demand to send alerts. So to send
an email, a check is executed. Because no specific port's was defined, the
server's one was used.

To fix the bug, the ports used for checks attached an email alert are explicitly
set using the mailer's port. So this port will be used instead of the server's
one.

In this patch, the assignement to a default port (587) when an email alert is
defined has been removed. Indeed, when a mailer is defined, the port must be
defined. So the default port was never used.

This patch must be backported in 1.8.
2018-04-04 10:36:50 +02:00
Olivier Houchard
8ef1a6b0d8 BUG/MINOR: fd: Don't clear the update_mask in fd_insert.
Clearing the update_mask bit in fd_insert may lead to duplicate insertion
of fd in fd_updt, that could lead to a write past the end of the array.
Instead, make sure the update_mask bit is cleared by the pollers no matter
what.

This should be backported to 1.8.
[wt: warning: 1.8 doesn't have the lockless fdcache changes and will
 require some careful changes in the pollers]
2018-04-03 19:38:15 +02:00
Willy Tarreau
2500fc2c34 BUG/MINOR: checks: check the conn_stream's readiness and not the connection
Since commit 9aaf778 ("MAJOR: connection : Split struct connection into
struct connection and struct conn_stream."), the checks use a conn_stream
and not directly the connection anymore. However wake_srv_chk() still used
to verify the connection's readiness instead of the conn_stream's. Due to
the existence of a mux, the connection is always waiting for receiving
something, and doesn't reflect the changes made in event_srv_chk_{r,w}(),
causing the connection appear as not ready yet, and the check to be
validated only after its timeout. The difference is only visible when
sending pure TCP checks, and simply adding a "tcp-check connect" line
is enough to work around it.

This fix must be backported to 1.8.
2018-04-03 19:31:38 +02:00
Willy Tarreau
b2e290acb6 BUG/MEDIUM: h2: always add a stream to the send or fctl list when blocked
When a stream blocks on a mux buffer full/unallocated or on connection
flow control, a flag among H2_SF_MUX_M* is set, but the stream is not
always added to the connection's list. It's properly done when the
operations are performed from the connection handler but not always when
done from the stream handler. For instance, a simple shutr or shutw may
fail by lack of room. If it's immediately followed by a call to h2_detach(),
the stream remains lying around in no list at all, and prevents the
connection from ending. This problem is actually quite difficult to
trigger and seems to require some large objects and low server-side
timeouts.

This patch covers all identified paths. Some are redundant but since the
code will change and will be simplified in 1.9, it's better to stay on
the safe side here for now. It must be backported to 1.8.
2018-03-30 17:43:49 +02:00
Willy Tarreau
1a1dd6066f BUG/MINOR: h2: remove accidental debug code introduced with show_fd function
Commit e3f36cd ("MINOR: h2: implement a basic "show_fd" function")
accidently brought one surrounding debugging part that was in the same
context. No backport needed.
2018-03-30 17:41:19 +02:00
Willy Tarreau
c754b343a2 MINOR: cli: report cache indexes in "show fd"
Instead of just indicating "cache={0,1}" we now report cache.next and
cache.prev since they are the ones used with the lockless fd cache.
2018-03-30 15:00:15 +02:00
Willy Tarreau
e3f36cd479 MINOR: h2: implement a basic "show_fd" function
The purpose here is to dump some information regarding an H2 connection,
and a few statistics about its streams. The output looks like this :

     35 : st=0x55(R:PrA W:PrA) ev=0x00(heopi) [lc] cache=0 owner=0x7ff49ee15e80 iocb=0x588a61(conn_fd_handler) tmask=0x1 umask=0x0 cflg=0x00201366 fe=decrypt mux=H2 mux_ctx=0x7ff49ee16f30 st0=2 flg=0x00000002 fctl_cnt=0 send_cnt=33 tree_cnt=33 orph_cnt=0

- st0 is the connection's state (FRAME_H here)
- flg is the connection's flags (MUX_MFULL here)
- fctl_cnt is the number of streams in the fctl_list
- send_cnt is the number of streams in the send_list
- tree_cnt is the number of streams in the streams_by_id tree
- orph_cnt is the number of orphaned streams (cs==0) in the tree
2018-03-30 14:43:13 +02:00
Willy Tarreau
b011d8f4c4 MINOR: mux: add a "show_fd" function to dump debugging information for "show fd"
This function will be called from the CLI's "show fd" command to append some
extra mux-specific information that only the mux handler can decode. This is
supposed to help collect various hints about what is happening when facing
certain anomalies.
2018-03-30 14:41:19 +02:00
Willy Tarreau
e96e61cadc BUILD/MINOR: threads: always export thread_sync_io_handler()
Otherwise it doesn't build again without threads.
2018-03-29 18:54:33 +02:00
Willy Tarreau
3041fcc2fd BUG/MEDIUM: h2: don't consider pending data on detach if connection is in error
Interrupting an h2load test shows that some connections remain active till
the client timeout. This is due to the fact that h2_detach() immediately
returns if the h2s flags indicate that the h2s is still waiting for some
buffer room in the output mux (possibly to emit a response or to send some
window updates). If the connection is broken, these data will never leave
and must not prevent the stream from being terminated nor the connection
from being released.

This fix must be backported to 1.8.
2018-03-29 15:41:32 +02:00
Willy Tarreau
0975f11d55 BUG/MEDIUM: h2/threads: never release the task outside of the task handler
Currently, h2_release() will release all resources assigned to the h2
connection, including the timeout task if any. But since the multi-threaded
scheduler, the timeout task could very well be queued in the thread-local
list of running tasks without any way to remove it, so task_delete() will
have no effect and task_free() will cause this undefined object to be
dereferenced.

In order to prevent this from happening, we never release the task in
h2_release(), instead we wake it up after marking its context NULL so that
the task handler can release the task.

Future improvements could consist in modifying the scheduler so that a
task_wakeup() has to be done on any task having to be killed, letting
the scheduler take care of it.

This fix must be backported to 1.8. This bug was apparently not reported
so far.
2018-03-29 15:22:59 +02:00
Willy Tarreau
71049cce3f MINOR: h2: fuse h2s_detach() and h2s_free() into h2s_destroy()
Since these two functions are always used together, let's simplify
the code by having a single one for both operations. It also ensures
we don't leave wandering elements that risk to leak later.
2018-03-29 13:22:15 +02:00
Willy Tarreau
e323f3458c MINOR: h2: always call h2s_detach() in h2_detach()
The code is safer and more robust this way, it avoids multiple paths.
This is possible due to the idempotence of LIST_DEL() and eb32_delete()
that are called in h2s_detach().
2018-03-29 13:22:15 +02:00
Willy Tarreau
4a333d3d53 BUG/MAJOR: h2: remove orphaned streams from the send list before closing
Several people reported very strange occasional crashes when using H2.
Every time it appeared that either an h2s or a task was corrupted. The
outcome is that a missing LIST_DEL() when removing an orphaned stream
from the list in h2_wake_some_streams() can cause this stream to
remain present in the send list after it was freed. This may happen
when receiving a GOAWAY frame for example. In the mean time the send
list may be processed due to pending streams, and the just released
stream is still found. If due to a buffer full condition we left the
h2_process_demux() loop before being able to process the pending
stream, the pool entry may be reassigned somewhere else. Either another
h2 connection will get it, or a task, since they are the same size and
are shared. Then upon next pass in h2_process_mux(), the stream is
processed again. Either it crashes here due to modifications, or the
contents are harmless to it and its last changes affect the other object
reasigned to this area (typically a struct task). In the case of a
collision with struct task, the LIST_DEL operation performed on h2s
corrupts the task's wait queue's leaf_p pointer, thus all the wait
queue's structure.

The fix consists in always performing the LIST_DEL in h2s_detach().
It will also make h2s_stream_new() more robust against a possible
future situation where stream_create_from_cs() could have sent data
before failing.

Many thanks to all the reporters who provided extremely valuable
information, traces and/or cores, namely Thierry Fournier, Yves Lafon,
Holger Amann, Peter Lindegaard Hansen, and discourse user "slawekc".

This fix must be backported to 1.8. It is probably better to also
backport the following code cleanups with it as well to limit the
divergence between master and 1.8-stable :

  00dd078 CLEANUP: h2: rename misleading h2c_stream_close() to h2s_close()
  0a10de6 MINOR: h2: provide and use h2s_detach() and h2s_free()
2018-03-29 13:22:15 +02:00
Willy Tarreau
a833cd90b2 BUILD/MINOR: cli: fix a build warning introduced by last commit
Commit 35b1b48 ("MINOR: cli: make "show fd" report the mux and mux_ctx
pointers when available") introduced an accidental build warning due to
a missing const statement.
2018-03-29 13:19:37 +02:00
Willy Tarreau
35b1b48c75 MINOR: cli: make "show fd" report the mux and mux_ctx pointers when available
This is handy to quickly distinguish H2 connections as well as to easily
access the h2c context. It could be backported to 1.8 to help during
troubleshooting sessions.
2018-03-28 18:41:30 +02:00
Willy Tarreau
4037a3f904 MINOR: cli/threads: make "show fd" report thread_sync_io_handler instead of "unknown"
The output was confusing when the sync point's dummy handler was shown.

This patch should be backported to 1.8 to help with troubleshooting.
2018-03-28 18:06:47 +02:00
Willy Tarreau
a7394e1b72 BUG/MINOR: hpack: fix harmless use of uninitialized value in hpack_dht_insert
A warning is reported here by valgrind on first pass in hpack_dht_insert().
The cause is that the not-yet-initialized dht->head is checked in
hpack_dht_get_tail(), though the result is not used, making it have
no impact. At the very least it confuses valgrind, and maybe it makes
it harder for gcc to optimize the code path. Let's move the variable
initialization around to shut it up. Thanks to Olivier for reporting
this one.

This fix may be backported to 1.8 at least to make valgrind usage less
painful.
2018-03-27 20:05:13 +02:00
Mark Lakes
56cc12509c MINOR: lua: allow socket api settimeout to accept integers, float, and doubles
Instead of hlua_socket_settimeout() accepting only integers, allow user
to specify float and double as well. Convert to milliseconds much like
cli_parse_set_timeout but also sanity check the value.

http://w3.impa.br/~diego/software/luasocket/tcp.html#settimeout

T. Fournier edit:

The main goal is to keep compatibility with the LuaSocket API. This
API only accept seconds, so using a float to specify milliseconds is
an acceptable way.

Update doc.
2018-03-27 14:17:02 +02:00
Ilya Shipitsin
7741c854cd BUILD/MINOR: fix build when USE_THREAD is not defined
src/queue.o: In function `pendconn_redistribute':
/home/ilia/haproxy/src/queue.c:272: undefined reference to `thread_want_sync'
src/queue.o: In function `pendconn_grab_from_px':
/home/ilia/haproxy/src/queue.c:311: undefined reference to `thread_want_sync'
src/queue.o: In function `process_srv_queue':
/home/ilia/haproxy/src/queue.c:184: undefined reference to `thread_want_sync'
collect2: error: ld returned 1 exit status
make: *** [Makefile:900: haproxy] Error 1

To be backported to 1.8.
2018-03-26 17:17:59 +02:00
Mark Lakes
22154b437d CLEANUP: lua: typo fix in comments
Some typo fixes in comments.
2018-03-26 11:12:41 +02:00
Thierry Fournier
17a921b799 BUG/MINOR: lua funtion hlua_socket_settimeout don't check negative values
Negatives timeouts doesn't have sense. A negative timeout doesn't cause
a crash, but the connection expires before the system try to extablish it.

This patch should be backported in all versions from 1.6
2018-03-26 11:11:49 +02:00
Thierry Fournier
e9636f192a BUG/MINOR: lua: the function returns anything
The output of these function indicates that one element is pushed in
the stack, but no element is set in the stack. Actually, if anyone
read the value returned by this function, is gets "something"
present in the stack.

This patch is a complement of these one: 119a5f10e4

The LuaSocket documentation tell anything about the returned value,
but the effective code set an integer of value one.

   316a9455b9/src/timeout.c (L172)

Thanks to Tim for the bug report.

This patch should be backported in all version from 1.6
2018-03-26 11:11:23 +02:00
Ilya Shipitsin
f93f0935c9 CLEANUP: map, stream: remove duplicate code in src/map.c, src/stream.c
issue was identified by cppcheck

[src/map.c:372] -> [src/map.c:376]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?
[src/map.c:433] -> [src/map.c:437]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?
[src/map.c:555] -> [src/map.c:559]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?
[src/stream.c:3264] -> [src/stream.c:3268]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?

Signed-off-by: Ilya Shipitsin <chipitsine@gmail.com>
2018-03-23 18:00:09 +01:00
Christopher Faulet
fe234281d6 BUG/MINOR: listener: Don't decrease actconn twice when a new session is rejected
When a freshly created session is rejected, for any reason, during the accept in
the function "session_accept_fd", the variable "actconn" is decreased twice. The
first time when the rejected session is released, then in the function
"listener_accpect", because of the failure. So it is possible to have an
negative value for actconn. Note that, in this case, we will also have a negatve
value for the current number of connections on the listener rejecting the
session (actconn and l->nbconn are in/decreased in same time).

It is easy to reproduce the bug with this small configuration:

  global
      stats socket /tmp/haproxy

  listen test
      bind *:12345
      tcp-request connection reject if TRUE

A "show info" on the stat socket, after a connection attempt, will show a very
high value (the unsigned representation of -1).

To fix the bug, if the function "session_accept_fd" returns an error, it
decrements the right counters and "listener_accpect" leaves them untouched.

This patch must be backported in 1.8.
2018-03-23 16:21:50 +01:00
Willy Tarreau
8adae7c15f BUG/MINOR: h2: ensure we can never send an RST_STREAM in response to an RST_STREAM
There are some corner cases where this could happen by accident. Since
the spec explicitly forbids this (RFC7540#5.4.2), let's add a test in
the two only functions which make the RST to avoid this. Thanks to user
klzgrad for reporting this problem. Usually it is expected to be harmless
but may result in browsers issuing a warning.

This fix must be backported to 1.8.
2018-03-22 17:37:05 +01:00
Willy Tarreau
d1023bbab3 BUG/MEDIUM: h2: properly account for DATA padding in flow control
Recent fixes made to process partial frames broke the flow control on
DATA frames, as the padding is not considered anymore, only the actual
data is. Let's simply take account of the padding once the transfer
ends. The probability to meet this bug is low because, when used, padding
is small and it can require a large number of padded transfers before the
window is completely depleted.

Thanks to user klzgrad for reporting this bug and confirming the fix.

This fix must be backported to 1.8.
2018-03-22 16:53:12 +01:00
Emmanuel Hocdet
50791a7df3 MINOR: samples: add crc32c converter
This patch adds the support of CRC32c (rfc4960).
2018-03-21 16:17:00 +01:00
Emmanuel Hocdet
115df3e38e MINOR: accept-proxy: support proxy protocol v2 CRC32c checksum
When proxy protocol v2 CRC32c tlv is received, check it before accept
connection (as describe in "doc/proxy-protocol.txt").
2018-03-21 05:04:01 +01:00
Emmanuel Hocdet
4399c75f6c MINOR: proxy-v2-options: add crc32c
This patch add option crc32c (PP2_TYPE_CRC32C) to proxy protocol v2.
It compute the checksum of proxy protocol v2 header as describe in
"doc/proxy-protocol.txt".
2018-03-21 05:04:01 +01:00
Emmanuel Hocdet
6afd898988 MINOR: hash: add new function hash_crc32c
This function will be used to perform CRC32c computations. This is
required to compute proxy protocol v2 CRC32C tlv (PP2_TYPE_CRC32C).
2018-03-21 05:04:01 +01:00
Willy Tarreau
c98aebcdb8 MINOR: log: stop emitting alerts when it's not possible to write on the socket
This is a recurring pain when using certain unix domain sockets or when
sending to temporarily unroutable addresses, if the process remains in
the foreground, the console is full of error which it's impossible to
do anything about. It's even worse when the process is remote, or when
run from a serial console which will slow the whole process down. Let's
send them only once now to warn about a possible config issue, and not
pollute the system nor slow everything down.
2018-03-20 16:44:25 +01:00
Christopher Faulet
fd83f0bfa4 BUG/MEDIUM: threads/queue: wake up other threads upon dequeue
The previous patch about queues (5cd4bbd7a "BUG/MAJOR: threads/queue: Fix
thread-safety issues on the queues management") revealed a performance drop when
multithreading is enabled (nbthread > 1). This happens when pending connections
handled by other theads are dequeued. If these other threads are blocked in the
poller, we have to wait the poller's timeout (or any I/O event) to process the
dequeued connections.

To fix the problem, at least temporarly, we "wake up" the threads by requesting
a synchronization. This may seem a bit overkill to use the sync point to do a
wakeup on threads, but it fixes this performance issue. So we can now think
calmly on the good way to address this kind of issues.

This patch should be backported in 1.8 with the commit 5cd4bbd7a ("BUG/MAJOR:
threads/queue: Fix thread-safety issues on the queues management").
2018-03-19 22:16:58 +01:00
Baptiste Assmann
2f3a56b4ff BUG/MINOR: tcp-check: use the server's service port as a fallback
When running tcp-check scripts, one must ensure we can establish a tcp
connection first.
When doing this action, HAProxy needs a TCP port configured either on
the server or on the check itself or on the connect rule itself.
For some reasons, the connect code did not evaluate the service port on
the server structure...

this patch fixes this error.

Backport status: 1.8
2018-03-19 13:55:55 +01:00
Baptiste Assmann
248f1173f2 BUG/MEDIUM: tcp-check: single connect rule can't detect DOWN servers
When tcpcheck is used to do TCP port monitoring only and the script is
composed by a single "tcp-check connect" rule (whatever port and ssl
options enabled), then the server can't be seen as DOWN.
Simple configuration to reproduce:

  backend b
    [...]
    option tcp-check
    tcp-check connect
    server s1 127.0.0.1:22 check

The main reason for this issue is that the piece of code which validates
that we're not at the end of the chained list (of rules) prevents
executing the validation of the establishment of the TCP connection.
Since validation is not executed, the rule is terminated and the report
says no errors were encountered, hence the server is UP all the time.

The workaround is simple: move the connection validation outsied the
CONNECT rule processing loop, into the main function.
That way, if the connection status is not CONNECTED, then HAProxy will
now add more time to wait for it. If the time is expired, an error is
now well reported.

Backport status: 1.8
2018-03-19 13:53:59 +01:00
Thierry FOURNIER
2986c0db88 CLEANUP: lua/syntax: lua is a name and not an acronym
This patch fix some first letter upercase for Lua messages.
2018-03-19 12:59:26 +01:00
Thierry FOURNIER
fd1e955a56 BUG/MINOR: lua: return bad error messages
The returned type is the type of the top of stack value and
not the type of the checked argument.

[wt: this can be backported to 1.8, 1.7 and 1.6]
2018-03-19 12:59:19 +01:00
Bernard Spil
13c53f8cc2 BUILD: ssl: Fix build with OpenSSL without NPN capability
OpenSSL can be built without NEXTPROTONEG support by passing
-no-npn to the configure script. This sets the
OPENSSL_NO_NEXTPROTONEG flag in opensslconf.h

Since NEXTPROTONEG is now considered deprecated, it is superseeded
by ALPN (Application Layer Protocol Next), HAProxy should allow
building withough NPN support.
2018-03-19 12:43:15 +01:00
Aurélien Nephtali
6a61e968ac BUG/MINOR: cli: Fix a crash when sending a command with too many arguments
This bug was introduced in 48bcfdab2 ("MEDIUM: dumpstat: make the CLI
parser understand the backslash as an escape char").

This should be backported to 1.8.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-19 12:15:38 +01:00
Aurélien Nephtali
6e8a41d8fc BUG/MINOR: cli: Ensure all command outputs end with a LF
Since 200b0fac ("MEDIUM: Add support for updating TLS ticket keys via
socket"), 4147b2ef ("MEDIUM: ssl: basic OCSP stapling support."),
4df59e9 ("MINOR: cli: add socket commands and config to prepend
informational messages with severity") and 654694e1 ("MEDIUM: stats/cli:
add support for "set table key" to enter values"), commands
'set ssl tls-key', 'set ssl ocsp-response', 'set severity-output' and
'set table' do not always send an extra LF at the end of their outputs.

This is required as mentioned in doc/management.txt:

"Since multiple commands may be issued at once, haproxy uses the empty
line as a delimiter to mark an end of output for each command"

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-19 12:13:02 +01:00
Olivier Houchard
33e083c92e BUG/MINOR: seemless reload: Fix crash when an interface is specified.
When doing a seemless reload, while receiving the sockets from the old process
the new process will die if the socket has been bound to a specific
interface.
This happens because the code that tries to parse the informations bogusly
try to set xfer_sock->namespace, while it should be setting wfer_sock->iface.

This should be backported to 1.8.
2018-03-19 12:10:53 +01:00
Ilya Shipitsin
210eb259bf CLEANUP: dns: remove duplicate code in src/dns.c
issue was identified by cppcheck

[src/dns.c:2037] -> [src/dns.c:2041]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?
2018-03-19 12:09:16 +01:00
Baptiste Assmann
1fa7d2acce BUG/MINOR: dns: don't downgrade DNS accepted payload size automatically
Automatic downgrade of DNS accepted payload size may have undesired side
effect, which could make a backend with all servers DOWN.

After talking with Lukas on the ML, I realized this "feature" introduces
more issues that it fixes problem.
The "best" way to handle properly big responses will be to implement DNS
over TCP.

To be backported to 1.8.
2018-03-19 11:41:52 +01:00
Christopher Faulet
5cd4bbd7ab BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management
The management of the servers and the proxies queues was not thread-safe at
all. First, the accesses to <strm>->pend_pos were not protected. So it was
possible to release it on a thread (for instance because the stream is released)
and to use it in same time on another one (because we redispatch pending
connections for a server). Then, the accesses to stream's information (flags and
target) from anywhere is forbidden. To be safe, The stream's state must always
be updated in the context of process_stream.

So to fix these issues, the queue module has been refactored. A lock has been
added in the pendconn structure. And now, when we try to dequeue a pending
connection, we start by unlinking it from the server/proxy queue and we wake up
the stream. Then, it is the stream reponsibility to really dequeue it (or
release it). This way, we are sure that only the stream can create and release
its <pend_pos> field.

However, be careful. This new implementation should be thread-safe
(hopefully...). But it is not optimal and in some situations, it could be really
slower in multi-threaded mode than in single-threaded one. The problem is that,
when we try to dequeue pending connections, we process it from the older one to
the newer one independently to the thread's affinity. So we need to wait the
other threads' wakeup to really process them. If threads are blocked in the
poller, this will add a significant latency. This problem happens when maxconn
values are very low.

This patch must be backported in 1.8.
2018-03-19 10:03:06 +01:00
Christopher Faulet
510c0d67ef BUG/MEDIUM: threads/unix: Fix a deadlock when a listener is temporarily disabled
When a listener is temporarily disabled, we start by locking it and then we call
.pause callback of the underlying protocol (tcp/unix). For TCP listeners, this
is not a problem. But listeners bound on an unix socket are in fact closed
instead. So .pause callback relies on unbind_listener function to do its job.

Unfortunatly, unbind_listener hold the listener's lock and then call an internal
function to unbind it. So, there is a deadlock here. This happens during a
reload. To fix the problemn, the function do_unbind_listener, which is lockless,
is now exported and is called when a listener bound on an unix socket is
temporarily disabled.

This patch must be backported in 1.8.
2018-03-16 11:19:07 +01:00
Cyril Bonté
4288c5a9d8 BUG/MINOR: force-persist and ignore-persist only apply to backends
>From the very first day of force-persist and ignore-persist features,
they only applied to backends, except that the documentation stated it
could also be applied to frontends.

In order to make it clear, the documentation is updated and the parser
will raise a warning if the keywords are used in a frontend section.

This patch should be backported up to the 1.5 branch.
2018-03-12 22:52:24 +01:00
Cyril Bonté
d400ab3a36 BUG/MEDIUM: fix a 100% cpu usage with cpu-map and nbthread/nbproc
Krishna Kumar reported a 100% cpu usage with a configuration using
cpu-map and a high number of threads,

Indeed, this minimal configuration to reproduce the issue :
  global
    nbthread 40
    cpu-map auto:1/1-40 0-39

  frontend test
    bind :8000

This is due to a wrong type in a shift operator (int vs unsigned long int),
causing an endless loop while applying the cpu affinity on threads. The same
issue may also occur with nbproc under FreeBSD. This commit addresses both
cases.

This patch must be backported to 1.8.
2018-03-12 22:52:24 +01:00
Aurélien Nephtali
b53e20826e BUG/MINOR: cli: Fix a typo in the 'set rate-limit' usage
The correct keyword is 'ssl-sessions' (vs. 'ssl-session').
The typo was introduced in 45c742be05 ('REORG: cli: move the "set
rate-limit" functions to their own parser').

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-12 07:49:08 +01:00
Aurélien Nephtali
bca08762d2 CLEANUP: cli: Remove a leftover debug message
This printf() was added in f886e3478d ("MINOR: cli: Add a command to
send listening sockets.").

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-12 07:49:05 +01:00
Aurélien Nephtali
76de95a4c0 CLEANUP: ssl: Remove a duplicated #include
openssl/x509.h is included twice since commit fc0421fde ("MEDIUM: ssl:
add support for SNI and wildcard certificates").

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-12 07:49:01 +01:00
Aurélien Nephtali
498a115727 BUG/MINOR: cli: Fix a crash when passing a negative or too large value to "show fd"
This bug is present since 7a4a0ac71d ("MINOR: cli: add a new "show fd"
command").

This should be backported to 1.8.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-12 07:47:26 +01:00
Willy Tarreau
84b118f312 BUG/MEDIUM: h2: also arm the h2 timeout when sending
Right now the h2 idle timeout is only set when there is no stream. If we
fail to send because the socket buffers are full (generally indicating
the client has left), we also need to arm it so that we can properly
expire such connections, otherwise some failed transfers might leave
H2 connections pending forever.

Thanks to Thierry Fournier for the diag and the traces.

This patch needs to be backported to 1.8.
2018-03-08 18:43:56 +01:00
Willy Tarreau
c41b3e8dff DOC: buffers: clarify the purpose of the <from> pointer in offer_buffers()
This one is only used to compare pointers and NULL is permitted though
this is far from being clear.
2018-03-08 18:33:48 +01:00
Olivier Houchard
ec9516a6dc BUG/MINOR: unix: Don't mess up when removing the socket from the xfer_sock_list.
When removing the socket from the xfer_sock_list, we want to set
next->prev to prev, not to next->prev, which is useless.

This should be backported to 1.8.
2018-03-08 18:33:11 +01:00
Emeric Brun
1738e86771 BUG/MINOR: session: Fix tcp-request session failure if handshake.
Some sample fetches check if session is established using
the flag CO_FL_CONNECTED. But in some cases, when a handshake
is performed this flag is set too late, after the process
of the tcp-request session rules.

This fix move the raising of the flag at the beginning of the
conn_complete_session function which processes the tcp-request
session rules.

This fix must be backported to 1.8 (and perhaps 1.7)
2018-03-06 14:04:45 +01:00
Willy Tarreau
44e973f508 MEDIUM: h2: use a single buffer allocator
We used to have one buffer allocator per direction while we can never
block on two buffers at once. Let's have a single one and rely on the
connection's flags to know which one we're waitinf for.
2018-03-01 17:58:15 +01:00
Willy Tarreau
0a10de6066 MINOR: h2: provide and use h2s_detach() and h2s_free()
These ones save us from open-coding the cleanup functions on each and
every error path. The code was updated to use them with no functional
change.
2018-03-01 16:35:01 +01:00
Willy Tarreau
00dd07895a CLEANUP: h2: rename misleading h2c_stream_close() to h2s_close()
This function takes an h2c and an h2s but it never uses the h2c, which
is a bit confusing at some places in the code. Let's make it clear that
it only operates on the h2s instead by renaming it and removing the
unused h2c argument.
2018-03-01 16:31:34 +01:00
Emmanuel Hocdet
253c3b7516 MINOR: connection: add proxy-v2-options authority
This patch add option PP2_TYPE_AUTHORITY to proxy protocol v2 when a TLS
connection was negotiated. In this case, authority corresponds to the sni.
2018-03-01 11:38:32 +01:00
Emmanuel Hocdet
fa8d0f1875 MINOR: connection: add proxy-v2-options ssl-cipher,cert-sig,cert-key
This patch implement proxy protocol v2 options related to crypto information:
ssl-cipher (PP2_SUBTYPE_SSL_CIPHER), cert-sig (PP2_SUBTYPE_SSL_SIG_ALG) and
cert-key (PP2_SUBTYPE_SSL_KEY_ALG).
2018-03-01 11:38:28 +01:00
Emmanuel Hocdet
283e004a85 MINOR: ssl: add ssl_sock_get_cert_sig function
ssl_sock_get_cert_sig can be used to report cert signature short name
to log and ppv2 (RSA-SHA256).
2018-03-01 11:34:08 +01:00
Emmanuel Hocdet
96b7834e98 MINOR: ssl: add ssl_sock_get_pkey_algo function
ssl_sock_get_pkey_algo can be used to report pkey algorithm to log
and ppv2 (RSA2048, EC256,...).
Extract pkey information is not free in ssl api (lock/alloc/free):
haproxy can use the pkey information computed in load_certificate.
Store and use this information in a SSL ex_data when available,
compute it if not (SSL multicert bundled and generated cert).
2018-03-01 11:34:05 +01:00
Emmanuel Hocdet
ddc090bc55 MINOR: ssl: extract full pkey info in load_certificate
Private key information is used in switchctx to implement native multicert
selection (ecdsa/rsa/anonymous). This patch extract and store full pkey
information: dsa type and pkey size in bits. This can be used for switchctx
or to report pkey informations in ppv2 and log.
2018-03-01 11:33:18 +01:00
Emmanuel Hocdet
8c0c34b6e7 Revert "BUG/MINOR: send-proxy-v2: string size must include ('\0')"
This reverts commit 82913e4f79.
TLV string value should not be null-terminated.

This should be backported to 1.8.
2018-03-01 06:48:05 +01:00
Christopher Faulet
7d9f1ba246 BUG/MEDIUM: spoe: Remove idle applets from idle list when HAProxy is stopping
In the SPOE applet's handler, when an applet is switched from the state IDLE to
PROCESSING, it is removed for the list of idle applets. But when HAProxy is
stopping, this applet can be switched to DISCONNECT. In this case, we also need
to remove it from the list of idle applets. Else the applet is removed but still
present in the list. It could lead to a segmentation fault or an infinite loop,
depending the code path.
2018-02-28 16:20:33 +01:00
Willy Tarreau
35a62705df BUG/MEDIUM: h2: always consume any trailing data after end of output buffers
In case a stream tries to emit more data than advertised by the chunks
or content-length headers, the extra data remains in the channel's output
buffer until the channel's timeout expires. It can easily happen when
sending malformed error files making use of a wrong content-length or
having extra CRLFs after the empty chunk. It may also be possible to
forge such a bad response using Lua.

The H1 to H2 encoder must protect itself against this by marking the data
presented to it as consumed if it decides to discard them, so that the
sending stream doesn't wait for the timeout to trigger.

The visible effect of this problem is a huge memory usage and a high
concurrent connection count during benchmarks when using such bad data
(a typical place where this easily happens).

This fix must be backported to 1.8.
2018-02-27 15:37:25 +01:00
Christopher Faulet
929b52d8a1 BUG/MINOR: h2: Set the target of dbuf_wait to h2c
In h2_get_dbuf, when the buffer allocation was failing, dbuf_wait.target was
errornously set to the connection (h2c->conn) instead of the h2 connection
descriptor (h2c).

This patch must be backported to 1.8.
2018-02-26 17:33:16 +01:00
Yves Lafon
95317289e9 MINOR: stats: display the number of threads in the statistics.
Add the nbthread global variable to the output, matching nbproc.

This may be backported to 1.8
2018-02-26 11:53:46 +01:00
Willy Tarreau
f161d0f51e BUG/MINOR: pools/threads: don't ignore DEBUG_UAF on double-word CAS capable archs
Since commit cf975d4 ("MINOR: pools/threads: Implement lockless memory
pools."), we support lockless pools. However the parts dedicated to
detecting use-after-free are not present in this part, making DEBUG_UAF
useless in this situation.

The present patch sets a new define CONFIG_HAP_LOCKLESS_POOLS when such
a compatible architecture is detected, and when pool debugging is not
requested, then makes use of this everywhere in pools and buffers
functions. This way enabling DEBUG_UAF will automatically disable the
lockless version.

No backport is needed as this is purely 1.9-dev.
2018-02-22 14:18:45 +01:00
Tim Duesterhus
5e64286bab CLEANUP: standard: Fix typo in IPv6 mask example
IPv6 addresses with two double colons are invalid.

This typo was introduced in commit 471851713a.
2018-02-21 05:07:35 +01:00
Tim Duesterhus
66888f907c CLEANUP: h2: Remove unused labels from mux_h2.c
This removes the unused next_header_block and try_again labels
from mux_h2.c.

try_again is unused as of a76e4c2183,
which first appeared in haproxy 1.8.0.
next_header_block is unused as of 872855998b,
which was backported to haproxy 1.8.0 as
59fcb216085a7aa9744cffe39567c80de4ebd6bf.
2018-02-20 08:30:13 +01:00
Tim Duesterhus
932bb289dd CLEANUP: spoe: Remove unused label retry
This removes the retry labels from spoe_send_frame and spoe_recv_frame
which are unused since d5216d474d, which
is unreleased, but was backported to haproxy 1.8 as
f13f3a4babdb1ce23a7e982c765704bca728111a.
2018-02-20 08:30:12 +01:00
Tim Duesterhus
9619e72c6b CLEANUP: cfgparse: Remove unused label end
This removes the end label from parse_process_number() which
is unused since 5ab51775e7, which
first was released in haproxy 1.8.0.
2018-02-20 08:30:12 +01:00
Emeric Brun
74f7ffa229 MINOR: ssl/sample: adds ssl_bc_is_resumed fetch keyword.
Returns true when the back connection was made over an SSL/TLS transport
layer and the newly created SSL session was resumed using a cached
session or a TLS ticket.
2018-02-19 16:50:20 +01:00
Emeric Brun
eb8def9f34 BUG/MEDIUM: ssl/sample: ssl_bc_* fetch keywords are broken.
Since the split between connections and conn-stream objects, this
keywords are broken.

This patch must be backported in 1.8
2018-02-19 16:50:05 +01:00
Christopher Faulet
fd04fcf5ed BUG/MEDIUM: http: Switch the HTTP response in tunnel mode as earlier as possible
When the body length is undefined (no Content-Length or Transfer-Encoding
headers), The reponse remains in ending mode, waiting the request is done. So,
most of time this is not a problem because the resquest is done before the
response. But when a client sends data to a server that replies without waiting
all the data, it is really not desirable to wait the end of the request to
finish the response.

This bug was introduced when the tunneling of the request and the reponse was
refactored, in commit 4be980391 ("MINOR: http: Switch requests/responses in
TUNNEL mode only by checking txn flag").

This patch should be backported in 1.8 and 1.7.
2018-02-19 16:47:12 +01:00
Christopher Faulet
4ac77a98cd BUG/MEDIUM: ssl: Shutdown the connection for reading on SSL_ERROR_SYSCALL
When SSL_read returns SSL_ERROR_SYSCALL and errno is unset or set to EAGAIN, the
connection must be shut down for reading. Else, the connection loops infinitly,
consuming all the CPU.

The bug was introduced in the commit 7e2e50500 ("BUG/MEDIUM: ssl: Don't always
treat SSL_ERROR_SYSCALL as unrecovarable."). This patch must be backported in
1.8 too.
2018-02-19 15:37:47 +01:00
Willy Tarreau
280f42b99e MINOR: sample: add a new "concat" converter
It's always a pain not to be able to combine variables. This commit
introduces the "concat" converter, which appends a delimiter, a variable's
contents and another delimiter to an existing string. The result is a string.
This makes it easier to build composite variables made of other variables.
2018-02-19 15:34:12 +01:00
Christopher Faulet
16f45c87d5 BUG/MINOR: ssl/threads: Make management of the TLS ticket keys files thread-safe
A TLS ticket keys file can be updated on the CLI and used in same time. So we
need to protect it to be sure all accesses are thread-safe. Because updates are
infrequent, a R/W lock has been used.

This patch must be backported in 1.8
2018-02-19 14:15:38 +01:00
Tim Duesterhus
9ad9f3517e DOC: cfgparse: Warn on option (tcp|http)log in backend
The option does not seem to have any effect since at least haproxy
1.3. Also the `log-format` directive already warns when being used
in a backend.
2018-02-19 13:57:32 +01:00
Aurélien Nephtali
39b89889e7 BUG/MINOR: init: Add missing brackets in the code parsing -sf/-st
The codes tries to strip trailing spaces of arguments but due to missing
brackets, it will always exit.

It can be reproduced with this (silly) example:

$ haproxy -f /etc/haproxy/haproxy.cfg -sf 1234 "1235 " 1236
$ echo $?
1

This was introduced in commit 236062f7c ("MINOR: init: emit warning when
-sf/-sd cannot parse argument")

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@gmail.com>
2018-02-19 08:02:21 +01:00
Olivier Houchard
7e2e505006 BUG/MEDIUM: ssl: Don't always treat SSL_ERROR_SYSCALL as unrecovarable.
Bart Geesink reported some random errors appearing under the form of
termination flags SD in the logs for connections involving SSL traffic
to reach the servers.

Tomek Gacek and Mateusz Malek finally narrowed down the problem to commit
c2aae74 ("MEDIUM: ssl: Handle early data with OpenSSL 1.1.1"). It happens
that the special case of SSL_ERROR_SYSCALL isn't handled anymore since
this commit.

SSL_read() might return <= 0, and SSL_get_erro() return SSL_ERROR_SYSCALL,
without meaning the connection is gone. Before flagging the connection
as in error, check the errno value.

This should be backported to 1.8.
2018-02-14 18:44:28 +01:00
Olivier Houchard
1ff9104117 BUG/MINOR: fd/threads: properly lock the FD before adding it to the fd cache.
It was believed that it was useless to lock the "prev" field when adding a
FD. However, if there's only one element in the FD cache, and that element
removes itself from the fd cache, and another FD is added before the first
add completed, there's a risk of losing elements. To prevent that, lock the
"prev" field, so that such a removal will wait until the add completed.
2018-02-08 17:24:06 +01:00
Willy Tarreau
58aa5ccd76 BUG/MINOR: config: don't emit a warning when global stats is incompletely configured
Martin Brauer reported an unexpected warning when some parts of the
global stats are defined but not the listening address, like below :

  global
    #stats socket run/admin.sock mode 660 level admin
    stats timeout 30s

Then haproxy complains :
  [WARNING] 334/150131 (23086) : config : frontend 'GLOBAL' has no
'bind' directive. Please declare it as a backend if this was intended.

This is because of the check for a bind-less frontend (the global section
creates a frontend for the stats). There's no clean fix for this one, so
here we're simply checking that the frontend is not the global stats one
before emitting the warning.

This patch should be backported to all stable versions.
2018-02-08 09:55:09 +01:00
Willy Tarreau
821069832e BUILD: fd/threads: fix breakage build breakage without threads
The last fix for the volatile dereference made use of pl_deref_int()
which is unknown when building without threads. Let's simply open-code
it instead. No backport needed.
2018-02-06 12:00:27 +01:00
Chris Lane
236062f7ce MINOR: init: emit warning when -sf/-sd cannot parse argument
Previously, -sf and -sd command line parsing used atol which cannot
detect errors.  I had a problem where I was doing -sf "$pid1 $pid2 $pid"
and it was sending the gracefully terminate signal only to the first pid.
The change uses strtol and checks endptr and errno to see if the parsing
worked.  It will exit when the pid list is not parsed.

[wt: this should be backported to 1.8]
2018-02-06 07:23:32 +01:00
Tim Duesterhus
7d58b4d156 BUG/MEDIUM: standard: Fix memory leak in str2ip2()
An haproxy compiled with:

> make -j4 all TARGET=linux2628 USE_GETADDRINFO=1

And running with a configuration like this:

  defaults
  	log	global
  	mode	http
  	option	httplog
  	option	dontlognull
  	timeout connect 5000
  	timeout client  50000
  	timeout server  50000

  frontend fe
  	bind :::8080 v4v6

  	default_backend be

  backend be
  	server s example.com:80 check

Will leak memory inside `str2ip2()`, because the list `result` is not
properly freed in success cases:

==18875== 140 (76 direct, 64 indirect) bytes in 1 blocks are definitely lost in loss record 87 of 111
==18875==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==18875==    by 0x537A565: gaih_inet (getaddrinfo.c:1223)
==18875==    by 0x537DD5D: getaddrinfo (getaddrinfo.c:2425)
==18875==    by 0x4868E5: str2ip2 (standard.c:733)
==18875==    by 0x43F28B: srv_set_addr_via_libc (server.c:3767)
==18875==    by 0x43F50A: srv_iterate_initaddr (server.c:3879)
==18875==    by 0x43F50A: srv_init_addr (server.c:3944)
==18875==    by 0x475B30: init (haproxy.c:1595)
==18875==    by 0x40406D: main (haproxy.c:2479)

The exists as long as the usage of getaddrinfo in that function exists,
it was introduced in commit:
d5f4328efd

v1.5-dev8 is the first tag containing this comment, the fix
should be backported to haproxy 1.5 and newer.
2018-02-05 21:04:15 +01:00
Willy Tarreau
a331544c33 BUG/MINOR: time/threads: ensure the adjusted time is always correct
In the time offset calculation loop, we ensure we only commit the new
date once it's futher in the future than the current one. However there
is a small issue here on 32-bit platforms : if global_now is written in
two cycles by another thread, starting with the tv_sec part, and the
current thread reads it in the middle of a change, it may compute a
wrong "adjusted" value on the first round, with the new (larger) tv_sec
and the old (large) tv_usec. This will be detected as the CAS will fail,
and another attempt will be made, but this time possibly with too large
an adusted value, pushing the date further than needed (at worst almost
one second).

This patch addresses this by using a temporary adjusted time in the loop
that always restarts from the last known one, and by assigning the result
to the final value only once the CAS succeeds.

The impact is very limited, it may cause the time to advance in small
jumps on 32 bit platforms and in the worst case some timeouts might
expire 1 second too early.

This fix should be backported to 1.8.
2018-02-05 20:11:38 +01:00
Willy Tarreau
11559a7530 MINOR: fd: reorder fd_add_to_fd_list()
The function was cleaned up a bit from duplicated parts inherited from
the initial attempt at getting it to work. It's a bit smaller and cleaner
this way.
2018-02-05 19:45:41 +01:00
Willy Tarreau
3a8263f86b MINOR: fd: remove the unneeded last CAS when adding an fd to the list
This was a leftover from the initial code where two threads could fight
for the list's tail.
2018-02-05 19:45:39 +01:00
Willy Tarreau
abeaff2d54 BUG/MINOR: fd/threads: properly dereference fdcache as volatile
In fd_rm_from_fd_list(), we have loops waiting for another change to
complete, in case we don't have support for a double CAS. But these
ones fail to place a compiler barrier or to dereference the fdcache
as a volatile, resulting in an endless loop on the first collision,
which is visible when run on MIPS32.

No backport needed.
2018-02-05 19:45:31 +01:00
Willy Tarreau
4cc67a2782 MINOR: fd: move the fd_{add_to,rm_from}_fdlist functions to fd.c
There's not point inlining these huge functions, better move them to real
functions in fd.c.
2018-02-05 17:19:40 +01:00
Willy Tarreau
62a627ac19 MEDIUM: poller: use atomic ops to update the fdtab mask
We don't need to lock the fdtab[].lock anymore since we only have one
modification left (update update_mask). Let's use an atomic AND instead.
2018-02-05 16:02:22 +01:00
Willy Tarreau
d4daeac7f1 MINOR: select: get rid of the now useless fd_compute_new_polled_status()
Do not call it anymore and avoid updating the fdstate.
2018-02-05 16:02:22 +01:00
Willy Tarreau
1394eb0120 MINOR: poll: get rid of the now useless fd_compute_new_polled_status()
Do not call it anymore and avoid updating the fdstate.
2018-02-05 16:02:22 +01:00
Willy Tarreau
7d24fadf7c MINOR: kqueue: get rid of the now useless fd_compute_new_polled_status()
Do not call it anymore and avoid updating the fdstate.
2018-02-05 16:02:22 +01:00
Willy Tarreau
038e54cb3c MINOR: epoll: get rid of the now useless fd_compute_new_polled_status()
Do not call it anymore and avoid updating the fdstate. We're not very far
from removing the fd lock it seems.
2018-02-05 16:02:22 +01:00
Olivier Houchard
1256836ebf MEDIUM: fd/threads: Make sure we don't miss a fd cache entry.
An fd cache entry might be removed and added at the end of the list, while
another thread is parsing it, if that happens, we may miss fd cache entries,
to avoid that, add a new field in the struct fdtab, "added_mask", which
contains a mask for potentially affected threads, if it is set, the
corresponding thread will set its bit in fd_cache_mask, to avoid waiting in
poll while it may have more work to do.
2018-02-05 16:02:22 +01:00
Olivier Houchard
4815c8cbfe MAJOR: fd/threads: Make the fdcache mostly lockless.
Create a local, per-thread, fdcache, for file descriptors that only belongs
to one thread, and make the global fd cache mostly lockless, as we can get
a lot of contention on the fd cache lock.
2018-02-05 16:02:22 +01:00
Olivier Houchard
cf975d46bc MINOR: pools/threads: Implement lockless memory pools.
On CPUs that support a double-width compare-and-swap, implement lockless
pools.
2018-02-05 16:02:22 +01:00
Olivier Houchard
25ae45a078 MINOR: early data: Never remove the CO_FL_EARLY_DATA flag.
It may be useful to keep the CO_FL_EARLY_DATA flag, so that we know early
data were used, so instead of doing this, only add the Early-data header,
and have the sample fetch ssl_fc_has_early return 1, if CO_FL_EARLY_DATA is
set, and if the handshake isn't done yet.
2018-02-05 14:24:50 +01:00
Olivier Houchard
6fa63d9852 MINOR: early data: Don't rely on CO_FL_EARLY_DATA to wake up streams.
Instead of looking for CO_FL_EARLY_DATA to know if we have to try to wake
up a stream, because it is waiting for a SSL handshake, instead add a new
conn_stream flag, CS_FL_WAIT_FOR_HS. This way we don't have to rely on
CO_FL_EARLY_DATA, and we will only wake streams that are actually waiting.
2018-02-05 14:24:50 +01:00
Olivier Houchard
5fa300da89 MINOR: init: make stdout unbuffered
printf is unusable for debugging without this, and printf() is not used
for anything else.
2018-02-05 14:15:20 +01:00
Christopher Faulet
e8ade385b4 MINOR: spoe: Add max-waiting-frames directive in spoe-agent configuration
This is the maximum number of frames waiting for an acknowledgement on the same
connection. This value is only used when the pipelinied or asynchronus exchanges
between HAProxy and SPOA are enabled. By default, it is set to 20.
2018-02-02 16:00:32 +01:00
Christopher Faulet
b077cdc012 MEDIUM: spoe: Use an ebtree to manage idle applets
Instead of using a list of applets with idle ones in front, we now use an
ebtree. Aapplets in the tree are idle by definition. And the key is the applet's
weight. When a new frame is queued, the first idle applet (with the lowest
weight) is woken up and its weight is increased by one. And when an applet sends
a frame to a SPOA, its weight is decremented by one.

This is empirical, but it should avoid to overuse a very few number of applets
and increase the balancing between idle applets.
2018-02-02 16:00:32 +01:00
Christopher Faulet
8f82b203d5 MINOR: spoe: Count the number of frames waiting for an ack for each applet
So it is easier to respect the max_fpa value. This is no more the maximum frames
processed by an applet at each loop but the maximum frames waiting for an ack
for a specific applet.

The function spoe_handle_processing_appctx has been rewritten accordingly.
2018-02-02 16:00:32 +01:00
Christopher Faulet
6f9ea4f87b MINOR: spoe: Replace sending_rate by a frequency counter
sending_rate was a counter used to evaluate the SPOE capacity to process
frames. Because it was not really accurrate, it has been replaced by a frequency
counter representing the number of frames handled by the SPOE per second. We
just check this counter is higher than the number of streams waiting for a
reply. If not, a new applet is created.
2018-02-02 16:00:32 +01:00
Christopher Faulet
fce747bbaa MINOR: spoe: Always link a SPOE context with the applet processing it
This was already done for fragmented frames. Now, this is true for all
frames.
2018-02-02 16:00:32 +01:00
Christopher Faulet
420977903b MINOR: spoe: Remove check on min_applets number when a SPOE context is queued
The calculation of a minimal number of active applets was really empirical and
finally useless. On heavy load, there are always many active applets (most of
time, more than the minimal required) and when the load is low, there is no
reason to keep unused applets opened.

Because of this change, the flag SPOE_APPCTX_FL_PERSIST is now unused. So it has
been removed.
2018-02-02 16:00:32 +01:00
Christopher Faulet
9cdca976d3 BUG/MEDIUM: spoe: Allow producer to read and to forward shutdown on request side
This is mandatory to correctly set right timeout on the stream. Else the client
timeout is never set. So only SPOE processing timeout will be evaluated. If it
is not defined (ie infinity), the stream can be blocked for a while, waiting the
SPOA reply. Of course, this is not a good idea to let the SPOE processing
timeout undefined, but it can happen.

This patch must be backported in 1.8.
2018-02-02 16:00:31 +01:00
Christopher Faulet
d5216d474d BUG/MEDIUM: spoe: Always try to receive or send the frame to detect shutdowns
Before, we checked if the buffer was allocated or not to avoid sending or
receiving a frame. This was done to not call ci_putblk or co_getblk if there is
nothing to do. But the checks on the buffers are also done in these
functions. So this is not mandatory here. But in these functions, the channel
state is also checked, so an error is returned if it is closed. By skipping the
call, we also skip the checks on the channel state, delaying shutdowns
detection.

Now, we always try to send or receive a frame. So if the corresponding channel
is closed, we can immediatly handle the error.

This patch must be backported in 1.8
2018-02-02 16:00:31 +01:00
Emmanuel Hocdet
f643b80429 MINOR: introduce proxy-v2-options for send-proxy-v2
Proxy protocol v2 can transport many optional informations. To avoid
send-proxy-v2-* explosion, this patch introduce proxy-v2-options parameter
and will allow to write: "send-proxy-v2 proxy-v2-options ssl,cert-cn".
2018-02-02 05:52:51 +01:00
Willy Tarreau
4979592907 BUG/MINOR: epoll/threads: only call epoll_ctl(DEL) on polled FDs
Commit d9e7e36 ("BUG/MEDIUM: epoll/threads: use one epoll_fd per thread")
addressed an issue with the polling and required that cloned FDs are removed
from all polling threads on close. But in fact it does it for all bound
threads, some of which may not necessarily poll the FD. This is harmless,
but it may also make it harder later to deal with FD migration between
threads. Better use polled_mask which only reports threads still aware
of the FD instead of thread_mask.

This fix should be backported to 1.8.
2018-01-31 09:49:29 +01:00
Frédéric Lécaille
6778b27542 MINOR: stick-tables: Adds support for new "gpc1" and "gpc1_rate" counters.
Implement exactly the same code as this has been done for "gpc0" and "gpc0_rate"
counters.
2018-01-31 09:40:05 +01:00
Willy Tarreau
a9786b6f04 MINOR: fd: pass the iocb and owner to fd_insert()
fd_insert() is currently called just after setting the owner and iocb,
but proceeding like this prevents the operation from being atomic and
requires a lock to protect the maxfd computation in another thread from
meeting an incompletely initialized FD and computing a wrong maxfd.
Fortunately for now all fdtab[].owner are set before calling fd_insert(),
and the first lock in fd_insert() enforces a memory barrier so the code
is safe.

This patch moves the initialization of the owner and iocb to fd_insert()
so that the function will be able to properly arrange its operations and
remain safe even when modified to become lockless. There's no other change
beyond the internal API.
2018-01-29 16:07:25 +01:00
Willy Tarreau
fc6eea4de2 MEDIUM: poll: don't use the old FD state anymore
The polling updates are now performed exactly like the epoll/kqueue
ones : only the new polled state is considered, and the previous one
is checked using polled_mask. The only specific stuff here is that
the fd state is shared between all threads, so an FD removal has to
be done only once.
2018-01-29 16:03:15 +01:00
Willy Tarreau
56dd12a7f0 MEDIUM: select: don't use the old FD state anymore
The polling updates are now performed exactly like the epoll/kqueue
ones : only the new polled state is considered, and the previous one
is checked using polled_mask. The only specific stuff here is that
the fd state is shared between all threads, so an FD removal has to
be done only once.
2018-01-29 16:03:15 +01:00
Willy Tarreau
82b37d74d2 MEDIUM: fd: use atomic ops for hap_fd_{clr,set} and remove poll_lock
Now that we can use atomic ops to set/clear an fd occurrence in an
fd_set, we don't need the poll_lock anymore. Let's remove it.
2018-01-29 16:03:15 +01:00
Willy Tarreau
d51a507dbd MEDIUM: select: make use of hap_fd_* functions
Given that FD_{CLR,SET} are not always guaranteed to be thread safe,
let's fall back to using the hap_fd_* functions as we used to till
1.5-dev18 and as poll() continues to use. This will make it easier
to remove the poll_lock.
2018-01-29 16:03:15 +01:00
Willy Tarreau
322e6c7e73 MINOR: fd: move the hap_fd_{clr,set,isset} functions to fd.h
These functions were created for poll() in 1.5-dev18 (commit 80da05a4) to
replace the previous FD_{CLR,SET,ISSET} that were shared with select()
because some libcs enforce a limit on FD_SET. But FD_SET doesn't seem
to be universally MT-safe, requiring locks in the select() code that
are not needed in the poll code. So let's move back to the initial
situation where we used to only use bit fields, since that has been in
use since day one without a problem, and let's use these hap_fd_*
functions instead of FD_*.

This patch only moves the functions to fd.h and revives hap_fd_isset()
that was recently removed to kill an "unused" warning.
2018-01-29 16:03:15 +01:00
Willy Tarreau
745c60eac6 CLEANUP: fd: remove the unused "new" field
This field has been unused since 1.6, it's only updated and never
tested. Let's remove it.
2018-01-29 16:02:59 +01:00
Willy Tarreau
2d3c2db868 MINOR: poll: more accurately compute the new maxfd in the loop
Last commit 173d995 ("MEDIUM: polling: start to move maxfd computation
to the pollers") moved the maxfd computation to the polling loop, but
it still adds an entry when removing an fd, forcing the next loop to
seek from further away than necessary. Let's only update the max when
actually adding an entry.
2018-01-29 16:00:28 +01:00
Willy Tarreau
f2b5c99b4c CLEANUP: fd/threads: remove the now unused fdtab_lock
It was only used to protect maxfd computation and is not needed
anymore.
2018-01-29 15:25:35 +01:00
Willy Tarreau
173d9951e2 MEDIUM: polling: start to move maxfd computation to the pollers
Since only select() and poll() still make use of maxfd, let's move
its computation right there in the pollers themselves, and only
during each fd update pass. The computation doesn't need a lock
anymore, only a few atomic ops. It will be accurate, be done much
less often and will not be required anymore in the FD's fast patch.

This provides a small performance increase of about 1% in connection
rate when using epoll since we get rid of this computation which was
performed under a lock.
2018-01-29 15:22:57 +01:00
Willy Tarreau
c5532acb4d MINOR: fd: don't report maxfd in alert messages
The listeners and connectors may complain that process-wide or
system-wide FD limits have been reached and will in this case report
maxfd as the limit. This is wrong in fact since there's no reason for
the whole FD space to be contiguous when the total # of FD is reached.
A better approach would consist in reporting the accurate number of
opened FDs, but this is pointless as what matters here is to give a
hint about what might be wrong. So let's simply report the configured
maxsock, which will generally explain why the process' limits were
reached, which is the most common reason. This removes another
dependency on maxfd.
2018-01-29 15:18:54 +01:00
Willy Tarreau
ce036bc2da MINOR: polling: make epoll and kqueue not depend on maxfd anymore
Maxfd is really only useful to poll() and select(), yet epoll and
kqueue reference it almost by mistake :
  - cloning of the initial FDs (maxsock should be used here)
  - max polled events, it's maxpollevents which should be used here.

Let's fix these places.
2018-01-29 15:18:54 +01:00
Willy Tarreau
ccea35c980 BUG/MINOR: cli: use global.maxsock and not maxfd to list all FDs
The "show fd" command on the CLI doesn't list the last FD in use since
it doesn't include maxfd. We don't need to use maxfd here anyway as
global.maxsock will do the job pretty well and removes this dependency.
This patch may be backported to 1.8.
2018-01-29 15:18:54 +01:00
Frédéric Lécaille
a41d531e4e MINOR: config: Enable tracking of up to MAX_SESS_STKCTR stick counters.
This patch really adds support for up to MAX_SESS_STKCTR stick counters.
2018-01-29 13:53:56 +01:00
Tim Duesterhus
1478aa795e MEDIUM: sample: Add IPv6 support to the ipmask converter
Add an optional second parameter to the ipmask converter that specifies
the number of bits to mask off IPv6 addresses.

If the second parameter is not given IPv6 addresses fail to mask (resulting
in an empty string), preserving backwards compatibility: Previously
a sample like `src,ipmask(24)` failed to give a result for IPv6 addresses.

This feature can be tested like this:

  defaults
  	log	global
  	mode	http
  	option	httplog
  	option	dontlognull
  	timeout connect 5000
  	timeout client  50000
  	timeout server  50000

  frontend fe
  	bind :::8080 v4v6

  	# Masked IPv4 for IPv4, empty for IPv6 (with and without this commit)
  	http-response set-header Test %[src,ipmask(24)]
  	# Correctly masked IP addresses for both IPv4 and IPv6
  	http-response set-header Test2 %[src,ipmask(24,ffff:ffff:ffff:ffff::)]
  	# Correctly masked IP addresses for both IPv4 and IPv6
  	http-response set-header Test3 %[src,ipmask(24,64)]

  	default_backend be

  backend be
  	server s example.com:80

Tested-By: Jarno Huuskonen <jarno.huuskonen@uef.fi>
2018-01-25 22:25:40 +01:00
Tim Duesterhus
b814da6c5c MINOR: config: Add support for ARGT_MSK6
This commit adds support for ARGT_MSK6 to make_arg_list().
2018-01-25 22:25:40 +01:00
Tim Duesterhus
471851713a MINOR: standard: Add str2mask6 function
This new function mirrors the str2mask() function for IPv4 addresses.

This commit is in preparation to support ARGT_MSK6.
2018-01-25 22:25:40 +01:00
Tim Duesterhus
8575f72e93 CLEANUP: standard: Use len2mask4 in str2mask
The len2mask4 function was introduced in commit:
70473a5f8c
which is about six years later than the commit that introduced the
str2mask function:
2937c0dd20

This is a clean up in preparation for a str2mask6 function which
will use len2mask6.
2018-01-25 22:25:40 +01:00
Tim Duesterhus
bf5ce02eff BUG/MINOR: sample: Fix output type of c_ipv62ip
c_ipv62ip failed to set the output type of the cast to SMP_T_IPV4
even for a successful conversion.

This bug exists as of commit cc4d1716a2
which is the first commit adding this function.

v1.6-dev4 is the first tag containing this commit, the fix should
be backported to haproxy 1.6 and newer.
2018-01-25 22:25:40 +01:00
Tim Duesterhus
ec6b0a2d18 CLEANUP: sample: Fix outdated comment about sample casts functions
The cast functions modify their output type as of commit:
b805f71d1b

v1.5-dev20 is the first tag containing this comment, the fix
should be backported to haproxy 1.5 and newer.
2018-01-25 22:25:40 +01:00
Tim Duesterhus
c555ee0c45 CLEANUP: sample: Fix comment encoding of sample.c
The file contained an 'e' with an gravis accent and thus was
not US-ASCII, but ISO-8859-1.

Also correct the spelling in the incorrect comment.

The incorrect character was introduced in commit:
4d9a1d1a5c

v1.6-dev1 is the first tag containing this comment, the fix
should be backported to haproxy 1.6 and newer.
2018-01-25 22:25:40 +01:00
Christopher Faulet
727c89b3df BUILD: kqueue/threads: Add test on MAX_THREADS to avoid warnings when complied without threads
This is the same patch than the previous one ("BUILD: epoll/threads: Add test on
MAX_THREADS to avoid warnings when complied without threads ").

It should be backported in 1.8 with the commit 7a2364d4 ("BUG/MEDIUM:
kqueue/threads: use one kqueue_fd per thread").
2018-01-25 17:52:57 +01:00
Christopher Faulet
3e805ed08e BUILD: epoll/threads: Add test on MAX_THREADS to avoid warnings when complied without threads
When HAProxy is complied without threads, gcc throws following warnings:

  src/ev_epoll.c:222:3: warning: array subscript is outside array bounds [-Warray-bounds]
  ...
  src/ev_epoll.c:199:11: warning: array subscript is outside array bounds [-Warray-bounds]
  ...

Of course, this is not a bug. In such case, tid is always equal to 0. But to
avoid the noise, a check on MAX_THREADS in "if (tid)" lines makes gcc happy.

This patch should be backported in 1.8 with the commit d9e7e36c ("BUG/MEDIUM:
epoll/threads: use one epoll_fd per thread").
2018-01-25 17:52:57 +01:00
Christopher Faulet
da18b9db7b MINOR: threads: Use __decl_hathreads instead of #ifdef/#endif
A #ifdef/#endif on USE_THREAD was added in the commit 0048dd04 ("MINOR: threads:
Fix build when we're not compiling with threads.") to conditionally define the
start_lock variable, because HA_SPINLOCK_T is only defined when HAProxy is
compiled with threads.

If fact, to do that, we should use the macro __decl_hathreads instead.

If commit 0048dd04 is backported in 1.8, this one can also be backported.
2018-01-25 17:52:57 +01:00
Christopher Faulet
13b007d583 BUG/MINOR: kqueue/threads: Don't forget to close kqueue_fd[tid] on each thread
in deinit_kqueue_per_thread, kqueue_fd[tid] must be closed, except for the main
thread (the first one, tid==0).

This patch must be backported in 1.8 with commit 7a2364d4.
2018-01-25 17:52:57 +01:00
Christopher Faulet
23d86d157e BUG/MEDIUM: checks: Don't try to release undefined conn_stream when a check is freed
When a healt-check is released, the attached conn_stream may be undefined. For
instance, this happens when 'no-check' option is used on a server line. So we
must check it is defined before trying to release it.

This patch must be backported in 1.8.
2018-01-25 13:51:23 +01:00
Christopher Faulet
8d01fd6b3c BUG/MEDIUM: threads/server: Fix deadlock in srv_set_stopping/srv_set_admin_flag
Because of a typo (HA_SPIN_LOCK instead of HA_SPIN_UNLOCK), there is a deadlock
in srv_set_stopping and srv_set_admin_flag when there is at least one trackers.

This patch must be backported in 1.8.
2018-01-25 13:51:23 +01:00
Willy Tarreau
c20d737338 BUG/MINOR: threads: always set an owner to the thread_sync pipe
The owner of the fd used by the synchronization pipe was set to NULL,
making it ignored by maxfd computation. The risk would be that some
synchronization events get delayed between threads when using poll()
or select(). However this is only theorical since the pipe is created
before listeners are bound so normally its FD should be lower and
this should normally not happen. The only possible situation would
be if all listeners are bound to inherited FDs which are lower than
the pipe's.

This patch must be backported to 1.8.
2018-01-25 07:31:08 +01:00
Olivier Houchard
0048dd04c9 MINOR: threads: Fix build when we're not compiling with threads.
Only declare the start_lock if threads are compiled in, otherwise
HA_SPINLOCK_T won't be defined.
This should be backported to 1.8 when/if
1605c7ae61 is backported.
2018-01-24 21:41:29 +01:00
Willy Tarreau
46ec48bc1a BUG/MINOR: mworker: only write to pidfile if it exists
A missing test causes a write(-1, $PID) to appear in strace output when
in master-worker mode. This is totally harmless though.

This fix must be backported to 1.8.
2018-01-23 19:20:19 +01:00
Willy Tarreau
1605c7ae61 BUG/MEDIUM: threads/mworker: fix a race on startup
Marc Fournier reported an interesting case when using threads with the
master-worker mode : sometimes, a listener would have its FD closed
during startup. Sometimes it could even be health checks seeing this.

What happens is that after the threads are created, and the pollers
enabled on each threads, the master-worker pipe is registered, and at
the same time a close() is performed on the write side of this pipe
since the children must not use it.

But since this is replicated in every thread, what happens is that the
first thread closes the pipe, thus releases the FD, and the next thread
starting a listener in parallel gets this FD reassigned. Then another
thread closes the FD again, which this time corresponds to the listener.
It can also happen with the health check sockets if they're started
early enough.

This patch splits the mworker_pipe_register() function in two, so that
the close() of the write side of the FD is performed very early after the
fork() and long before threads are created (we don't need to delay it
anyway). Only the pipe registration is done in the threaded code since
it is important that the pollers are properly allocated for this.
The mworker_pipe_register() function now takes care of registering the
pipe only once, and this is guaranteed by a new surrounding lock.

The call to protocol_enable_all() looks fragile in theory since it
scans the list of proxies and their listeners, though in practice
all threads scan the same list and take the same locks for each
listener so it's not possible that any of them escapes the process
and finishes before all listeners are started. And the operation is
idempotent.

This fix must be backported to 1.8. Thanks to Marc for providing very
detailed traces clearly showing the problem.
2018-01-23 19:18:57 +01:00
Willy Tarreau
7a2364d474 BUG/MEDIUM: kqueue/threads: use one kqueue_fd per thread
This is the same principle as the previous patch (BUG/MEDIUM:
epoll/threads: use one epoll_fd per thread) except that this time it's
for kqueue. We don't want all threads to wake up because of activity on
a single other thread that the other ones are not interested in.

Just like with previous patch, this one shows that the polling state
doesn't need to be changed here and that some simplifications are now
possible. This patch only implements the minimum required for a stable
backport.

This should be backported to 1.8.
2018-01-23 15:50:03 +01:00
Willy Tarreau
d9e7e36c6e BUG/MEDIUM: epoll/threads: use one epoll_fd per thread
There currently is a problem regarding epoll(). While select() and poll()
compute their polling state on the fly upon each call, epoll() keeps a
shared state between all threads via the epoll_fd. The problem is that
once an fd is registered on *any* thread, all other threads receive
events for that FD as well. It is clearly visible when binding a listener
to a single thread like in the configuration below where all 4 threads
will work, 3 of them simply spinning to skip the event :

    global
        nbthread 4

    frontend foo
        bind :1234 process 1/1

The worst case happens when some slow operations are in progress on a
busy thread, preventing it from processing its task and causing the
other ones to wake up not being able to do anything with this event.
Typically computing a large TLS key will delay processing of next
events on the same thread while others will still wake up.

All this simply shows that the poller must remain thread-specific, with
its own events and its own ability to sleep when it doesn't have anyhing
to do.

This patch does exactly this. For this, it proceeds like this :

   - have one epoll_fd per thread instead of one per process
   - initialize these epoll_fd when threads are created.
   - mark all known FDs as updated so that the next invocation of
     _do_poll() recomputes their polling status (including a possible
     removal of undesired polling from the original FD) ;
   - use each fd's polled_mask to maintain an accurate status of
     the current polling activity for this FD.
   - when scanning updates, only focus on events whose new polling
     status differs from the existing one
   - during updates, always verify the thread_mask to resist migration
   - on __fd_clo(), for cloned FDs (typically listeners inherited
     from the parent during a graceful shutdown), run epoll_ctl(DEL)
     on all epoll_fd. This is the reason why epoll_fd is stored in a
     shared array and not in a thread_local storage. Note: maybe this
     can be moved to an update instead.

Interestingly, this shows that we don't need the FD's old state anymore
and that we only use it to convert it to the new state based on stable
information. It appears clearly that the FD code can be further improved
by computing the final state directly when manipulating it.

With this change, the config above goes from 22000 cps at 380% CPU to
43000 cps at 100% CPU : not only the 3 unused threads are not activated,
but they do not disturb the activity anymore.

The output of "show activity" before and after the patch on a 4-thread
config where a first listener on thread 2 forwards over SSL to threads
3 & 4 shows this a much smaller amount of undesired events (thread 1
doesn't wake up anymore, poll_skip remains zero, fd_skip stays low) :

  // before: 400% CPU, 7700 cps, 13 seconds
  loops: 11380717 65879 5733468 5728129
  wake_cache: 0 63986 317547 314174
  wake_tasks: 0 0 0 0
  wake_applets: 0 0 0 0
  wake_signal: 0 0 0 0
  poll_exp: 0 63986 317547 314174
  poll_drop: 1 0 49981 48893
  poll_dead: 65514 0 31334 31934
  poll_skip: 46293690 34071 22867786 22858208
  fd_skip: 66068135 174157 33732685 33825727
  fd_lock: 0 2 2809 2905
  fd_del: 0 494361 80890 79464
  conn_dead: 0 0 0 0
  stream: 0 407747 50526 49474
  empty_rq: 11380718 1914 5683023 5678715
  long_rq: 0 0 0 0

  // after: 200% cpu, 9450 cps, 11 seconds
  loops: 17 66147 1001631 450968
  wake_cache: 0 66119 865139 321227
  wake_tasks: 0 0 0 0
  wake_applets: 0 0 0 0
  wake_signal: 0 0 0 0
  poll_exp: 0 66119 865139 321227
  poll_drop: 6 5 38279 60768
  poll_dead: 0 0 0 0
  poll_skip: 0 0 0 0
  fd_skip: 54 172661 4411407 2008198
  fd_lock: 0 0 10890 5394
  fd_del: 0 492829 58965 105091
  conn_dead: 0 0 0 0
  stream: 0 406223 38663 61338
  empty_rq: 18 40 962999 390549
  long_rq: 0 0 0 0

This patch presents a few risks but fixes a real problem with threads,
and as such it needs be backported to 1.8. It depends on previous patch
("MINOR: fd: add a bitmask to indicate that an FD is known by the poller").

Special thanks go to Samuel Reed for providing a large amount of useful
debugging information and for testing fixes.
2018-01-23 15:48:08 +01:00
Willy Tarreau
c9c8378c2b MINOR: fd: add a bitmask to indicate that an FD is known by the poller
Some pollers like epoll() need to know if the fd is already known or
not in order to compute the operation to perform (add, mod, del). For
now this is performed based on the difference between the previous FD
state and the new state but this will not be usable anymore once threads
become responsible for their own polling.

Here we come with a different approach : a bitmask is stored with the
fd to indicate which pollers already know it, and the pollers will be
able to simply perform the add/mod/del operations based on this bit
combined with the new state.

This patch only adds the bitmask declaration and initialization, it
is it not yet used. It will be needed by the next two fixes and will
need to be backported to 1.8.
2018-01-23 15:42:57 +01:00
Willy Tarreau
ebc78d78a2 BUG/MEDIUM: fd: maintain a per-thread update mask
Since the fd update tables are per-thread, we need to have a bit per
thread to indicate whether an update exists, otherwise this can lead
to lost update events every time multiple threads want to update the
same FD. In practice *for now*, it only happens at start time when
listeners are enabled and ask for polling after facing their first
EAGAIN. But since the pollers are still shared, a lost event is still
recovered by a neighbor thread. This will not reliably work anymore
with per-thread pollers, where it has been observed a few times on
startup that a single-threaded listener would not always accept
incoming connections upon startup.

It's worth noting that during this code review it appeared that the
"new" flag in the fdtab isn't used anymore.

This fix should be backported to 1.8.
2018-01-23 15:41:19 +01:00
Christopher Faulet
32467fef98 BUG/MEDIUM: threads/polling: Use fd_cache_mask instead of fd_cache_num
fd_cache_num is the number of FDs in the FD cache. It is a global variable. So
it is underoptimized because we may be lead to consider there are waiting FDs
for the current thread in the FD cache while in fact all FDs are assigned to the
other threads. So, in such cases, the polling loop will be evaluated many more
times than necessary.

Instead, we now check if the thread id is set in the bitfield fd_cache_mask.

[wt: it's not exactly a bug, rather a design limitation of the thread
 which was not addressed in time for the 1.8 release. It can appear more
 often than we initially predicted, when more threads are running than
 the number of assigned CPU cores, or when certain threads spend
 milliseconds computing crypto keys while other threads spin on
 epoll_wait(0)=0]

This patch should be backported to 1.8.
2018-01-23 15:39:51 +01:00
Christopher Faulet
69553fe62c MINOR: threads/fd: Use a bitfield to know if there are FDs for a thread in the FD cache
A bitfield has been added to know if there are some FDs processable by a
specific thread in the FD cache. When a FD is inserted in the FD cache, the bits
corresponding to its thread_mask are set. On each thread, the bitfield is
updated when the FD cache is processed. If there is no FD processed, the thread
is removed from the bitfield by unsetting its tid_bit.

Note that this bitfield is updated but not checked in
fd_process_cached_events. So, when this function is called, the FDs cache is
always processed.

[wt: should be backported to 1.8 as it will help fix a design limitation]
2018-01-23 15:39:10 +01:00
Willy Tarreau
d80cb4ee13 MINOR: global: add some global activity counters to help debugging
A number of counters have been added at special places helping better
understanding certain bug reports. These counters are maintained per
thread and are shown using "show activity" on the CLI. The "clear
counters" commands also reset these counters. The output is sent as a
single write(), which currently produces up to about 7 kB of data for
64 threads. If more counters are added, it may be necessary to write
into multiple buffers, or to reset the counters.

To backport to 1.8 to help collect more detailed bug reports.
2018-01-23 15:38:33 +01:00
Willy Tarreau
421f02e738 MINOR: threads: add a MAX_THREADS define instead of LONGBITS
This one allows not to inflate some structures when threads are
disabled. Now struct global is 1.4 kB instead of 33 kB.

Should be backported to 1.8 for ease of backporting of upcoming
patches.
2018-01-23 15:28:20 +01:00
Olivier Houchard
e9bad0a936 MINOR: servers: Don't report duplicate dyncookies for disabled servers.
Especially with server-templates, it can happen servers starts with a
placeholder IP, in the disabled state. In this case, we don't want to report
that the same cookie was generated for multiple servers. So defer the test
until the server is enabled.

This should be backported to 1.8.
2018-01-23 14:05:17 +01:00
Emeric Brun
5548291395 BUG/MEDIUM: peers: fix expire date wasn't updated if entry is modified remotely.
The stktable_touch_remote considers the expire field stored in the stksess
struct.
The expire field was updated on the a newly created stksess to store.

But if the stksess with a same key is still present the expire was not updated.

This patch postpones the update of the expire field of the stksess just before
processing the "touch".

These bug was introduced in commit:

MEDIUM: threads/stick-tables: handle multithreads on stick tables.

And the fix should be backported on 1.8.
2018-01-22 16:03:25 +01:00
Etienne Carriere
a792a0aa93 MINOR: sample: add date_us sample
Add date_us sample that returns the microsecond part of the timeval
structure representing the date of the structure. The "second" part of
the timeval can already be fetched by the "date" sample
2018-01-21 07:56:42 +01:00
Willy Tarreau
cc35923c32 BUG/MINOR: poll: too large size allocation for FD events
Commit 80da05a ("MEDIUM: poll: do not use FD_* macros anymore") which
appeared in 1.5-dev18 and which was backported to 1.4.23 made explicit
use of arrays of FDs mapped to unsigned ints. The problem lies in the
allocated size for poll(), as the resulting size is in bits and not
bytes, resulting in poll() arrays being 8 times larger than necessary!

In practice poll() is not used on highly loaded systems, explaining why
nobody noticed. But it definetely has to be addressed.

This fix needs to be backported to all stable versions.
2018-01-17 15:52:11 +01:00
Christopher Faulet
333694d771 MINOR: spoe: Don't queue a SPOE context if nothing is sent
When some messages must be sent to an agent, the SPOE context of the stream is
queued to be handled by an SPOE applet. If there is no available applet, a new
one is created, thus opening a connection with the agent.

Since the support of ACLs on messages, some processing can now be discarded. So,
to avoid opening a connection for nothing, the SPOE context is now queued after
the messages encoding.
2018-01-15 13:48:03 +01:00
Christopher Faulet
336d3ef0e7 MINOR: spoe: add register-var-names directive in spoe-agent configuration
In addition to "option force-set-var", recently added, this directive can be
used to selectivelly register unknown variable names, without totally relaxing
their registration during the runtime, like "option force-set-var" does.

So there is no way for a malicious agent to exhaust memory by defining a too
high number of variable names. In other hand, you need to enumerate all
variable names. This could be painfull in some circumstances.

Remember, this directive is only usefull when the variable names are not
referenced anywhere in the HAProxy configuration or the SPOE one.

Thanks to Etienne Carrière for his help on this part.
2018-01-15 13:47:27 +01:00
Willy Tarreau
d651ba14d4 BUG/MEDIUM: stream: properly handle client aborts during redispatch
James Mc Bride reported an interesting case affecting all versions since
at least 1.5 : if a client aborts a connection on an empty buffer at the
exact moment a server redispatch happens, the CF_SHUTW_NOW flag on the
channel is immediately turned into CF_SHUTW, which is not caught by
check_req_may_abort(), leading the redispatch to be performed anyway
with the channel marked as shut in both directions while the stream
interface correctly establishes. This situation makes no sense.
Ultimately the transfer times out and the server-side stream interface
remains in EST state while the client is in CLO state, and this case
doesn't correspond to anything we can handle in process_stream, leading
to poll() being woken up all the time without any progress being made.
And the session cannot even be killed from the CLI.

So we must ensure that check_req_may_abort() also considers the case
where the channel is already closed, which is what this patch does.
Thanks to James for providing detailed captures allowing to diagnose
the problem.

This fix must be backported to all maintained versions.
2018-01-12 10:47:48 +01:00
William Lallemand
29f690c945 BUG/MEDIUM: mworker: execvp failure depending on argv[0]
The copy_argv() function lacks a check on '-' to remove the -x, -sf and
-st parameters.

When reloading a master process with a path starting by /st, /sf, or
/x..  the copy_argv() function skipped argv[0] leading to an execvp()
without the binary.
2018-01-09 23:44:18 +01:00
Olivier Houchard
2ec2db9725 MINOR: dns: Handle SRV record weight correctly.
A SRV record weight can range from 0 to 65535, while haproxy weight goes
from 0 to 256, so we have to divide it by 256 before handing it to haproxy.
Also, a SRV record with a weight of 0 doesn't mean the server shouldn't be
used, so use a minimum weight of 1.

This should probably be backported to 1.8.
2018-01-09 15:43:11 +01:00
Tim Duesterhus
119a5f10e4 BUG/MINOR: lua: Fix return value of Socket.settimeout
The `socket.tcp.settimeout` method of Lua returns `1` in all cases,
while the `Socket.settimeout` method of haproxy returns `0` in all
cases. This breaks the `socket.http` module, because it validates
the return value of `settimeout`.

This bug was introduced in commit 7e7ac32dad
(which is the very first commit adding the Socket class to Lua). This
bugfix should be backported to every branch containing that commit:
- 1.6
- 1.7
- 1.8

A test case for this bug is as follows:

The 'Test' response header will contain an HTTP status code with the
patch applied and will be zero (nil) without the patch applied.

http.lua:
  http = require("socket.http")

  core.register_action("bug", { "http-req" }, function(txn)
  	local b, c, h = http.request {
  		url = "http://93.184.216.34",
  		headers = {
  			Host = "example.com"
  		},
  		create = core.tcp,
  		redirect = false
  	}

  	txn:set_var("txn.foo", c)
  end)

haproxy.cfg:
  global
  	lua-load /scratch/haproxy/http.lua

  frontend fe
  	bind 127.0.0.1:8080
  	http-request lua.bug
  	http-response set-header Test %[var(txn.foo)]

  	default_backend be

  backend be
  	server s example.com:80
2018-01-09 15:22:55 +01:00
Tim Duesterhus
6edab865f6 BUG/MEDIUM: lua: Fix IPv6 with separate port support for Socket.connect
The `socket.tcp.connect` method of Lua requires at least two parameters:
The host and the port. The `Socket.connect` method of haproxy requires
only one when a host with a combined port is provided. This stems from
the fact that `str2sa_range` is used internally in `hlua_socket_connect`.
This very fact unfortunately causes a diversion in the behaviour of
Lua's socket class and haproxy's for IPv6 addresses:

  sock:connect("::1", "80")

works fine with Lua, but fails with:

  connect: cannot parse destination address '::1'

in haproxy, because `str2sa_range` parses the trailing `:1` as the port.

This patch forcefully adds a `:` to the end of the address iff a port
number greater than `0` is given as the second parameter.

Technically this breaks backwards compatibility, because the docs state:

> The syntax "127.0.0.1:1234" is valid. in this case, the
> parameter *port* is ignored.

But: The connect() call can only succeed if the second parameter is left
out (which causes no breakage) or if the second parameter is an integer
or a numeric string.

It seems unlikely that someone would provide an address with a port number
and would also provide a second parameter containing a number other than
zero. Thus I feel this breakage is warranted to fix the mismatch between
haproxy's socket class and Lua's one.

This commit should be backported to haproxy 1.8 only, because of the
possible breakage of existing Lua scripts.
2018-01-09 15:22:55 +01:00
Tim Duesterhus
b33754ce86 DOC: lua: Fix typos in comments of hlua_socket_receive 2018-01-09 15:22:49 +01:00
Tim Duesterhus
c6e377e6bb BUG/MINOR: lua: Fix default value for pattern in Socket.receive
The default value of the pattern in `Socket.receive` is `*l` according
to the documentation and in the `socket.tcp.receive` method of Lua.

The default value of `wanted` in `int hlua_socket_receive(struct lua_State *)`
reflects this requirement, but the function fails to ensure this
nonetheless:

If no parameter is given the top of the Lua stack will have the index 1.
`lua_pushinteger(L, wanted);` then pushes the default value onto the stack
(with index 2).
The following `lua_replace(L, 2);` then pops the top index (2) and tries to
replace the index 2 with it.
I am not sure why exactly that happens (possibly, because one cannot replace
non-existent stack indicies), but this causes the stack index to be lost.

`hlua_socket_receive_yield` then tries to read the stack index 2, to
determine what to read and get the value `0`, instead of the correct
HLSR_READ_LINE, thus taking the wrong branch.

Fix this by ensuring that the top of the stack is not replaced by itself.

This bug was introduced in commit 7e7ac32dad
(which is the very first commit adding the Socket class to Lua). This
bugfix should be backported to every branch containing that commit:
- 1.6
- 1.7
- 1.8

A test case for this bug is as follows:

The 'Test' response header will contain an HTTP status line with the
patch applied and will be empty without the patch applied. Replacing
the `sock:receive()` with `sock:receive("*l")` will cause the status
line to appear with and without the patch

http.lua:
  core.register_action("bug", { "http-req" }, function(txn)
  	local sock = core.tcp()
  	sock:settimeout(60)
  	sock:connect("127.0.0.1:80")
  	sock:send("GET / HTTP/1.0\r\n\r\n")
  	response = sock:receive()
  	sock:close()
  	txn:set_var("txn.foo", response)
  end)

haproxy.cfg (bits omitted for brevity):
  global
  	lua-load /scratch/haproxy/http.lua

  frontend fe
  	bind 127.0.0.1:8080
  	http-request lua.bug
  	http-response set-header Test %[var(txn.foo)]

  	default_backend be

  backend be
  	server s 127.0.0.1:80
2018-01-09 15:22:46 +01:00
William Lallemand
99b90af621 BUG/MEDIUM: ssl: cache doesn't release shctx blocks
Since the rework of the shctx with the hot list system, the ssl cache
was putting session inside the hot list, without removing them.
Once all block were used, they were all locked in the hot list, which
was forbiding to reuse them for new sessions.

Bug introduced by 4f45bb9 ("MEDIUM: shctx: separate ssl and shctx")

Thanks to Jeffrey J. Persch for reporting this bug.

Must be backported to 1.8.
2018-01-05 11:46:54 +01:00
Olivier Houchard
e2a34967a9 CLEANUP: rbtree: remove
Remove the rbtree implementation. It's not used, it's not even connected to
the build, and we probably have no use for it .
2018-01-05 10:56:32 +01:00
Willy Tarreau
5d4cafb610 BUILD: ssl: silence a warning when building without NPN nor ALPN support
When building with a library not offering any of these, ssl_conf_cur
is not used.

Can be backported to 1.8.
2018-01-04 19:04:08 +01:00
Willy Tarreau
4a28da1e9d BUG/MEDIUM: h2: properly handle the END_STREAM flag on empty DATA frames
Peter Lindegaard Hansen reported a problem affecting some POST requests
sent by MSIE on 1.8.3. Lukas found that we incorrectly dealt with the
END_STREAM flag on empty DATA frames.

What happens in fact is that while we correctly report that we've read a
zero-byte frame, since commit 8fc016d ("BUG/MEDIUM: h2: support uploading
partial DATA frames") backported into 1.8.2, we've been able to return
without updating the parser's state nor checking the frame flags in this
case.

The fix is trival, we just need not to return too early.

This fix must be backported to 1.8.
2018-01-04 14:41:00 +01:00
Willy Tarreau
8ec140604a MEDIUM: h2: prepare a graceful shutdown when the frontend is stopped
During a reload operation, instead of keeping the H2 connections opened
forever causing confusion during configuration changes, let's send a
graceful shutdown so that the client knows that it would better open a
new connection for future requests. We can't really catch the signal
from H2, but we can advertise this graceful shutdown upon the next I/O
event (eg: a WINDOW_UPDATE from the client or a new request). One of
the visible effect is that the old process quits much faster.

This patch should be backported to 1.8 since it is affected by this
problem.
2017-12-30 18:08:13 +01:00
Willy Tarreau
c775f8372b DEBUG: hpack: add more traces to the hpack decoder
These ones are only enabled when DEBUG_HPACK is defined so they have no
effect on the production code.
2017-12-30 17:37:08 +01:00
Willy Tarreau
4f03436c48 DEBUG: hpack: make hpack_dht_dump() expose the output file
It's more convenient to be able to choose between stdout and stderr.
2017-12-30 17:17:07 +01:00
Willy Tarreau
bb39b4945b BUG/MAJOR: hpack: don't return direct references to the dynamic headers table
Maximilian Böhm and Lucas Rolff both reported some random failed requests
with HTTP/2. Upon deep investigation on detailed traces provided by Lucas,
it turned out that some header names were occasionally corrupted and used
to point to random strings within the dynamic headers table.

The HPACK decoder must always return copies of header names that point
to the dynamic headers table. Otherwise, the insertion of a header after
the current one leading to a reorganization of the table will change the
data the pointer designates. Unfortunately, one such copy was missing for
indexed names, leading to random request failures due to invalid header
names.

Many thanks to Lucas who ran a large number of tests with full traces
helping to capture a reproduceable sequence exhibiting this issue.

This patch must be backported to 1.8.
2017-12-30 17:17:06 +01:00
Willy Tarreau
ff47b3f41d BUG/MEDIUM: http: don't automatically forward request close
Maximilian Böhm, and Lucas Rolff reported some frequent HTTP/2 POST
failures affecting version 1.8.2 that were not affecting 1.8.1. Lukas
Tribus determined that these ones appeared consecutive to commit a48c141
("BUG/MAJOR: connection: refine the situations where we don't send shutw()").

It turns out that the HTTP request forwarding engine lets a shutr from
the client be automatically forwarded to the server unless chunked
encoding is in use. It's a bit tricky to meet this condition as it only
happens if the shutr is not reported in the initial request. So if a
request is large enough or the body is delayed after the headers (eg:
Expect: 100-continue), the the function quits with channel_auto_close()
left enabled. The patch above was not really related in fact. It's just
that a previous bug was causing this shutw to be skipped at the lower
layers, and the two bugs used to cancel themselves.

In the HTTP request we should only pass the close in tunnel mode, as
other cases either need to keep the connection alive (eg: for reuse)
or will force-close it. Also the forced close will properly take care
of avoiding the painful time-wait, which is not possible with the early
close.

This patch must be backported to 1.8 as it directly impacts HTTP/2, and
may be backported to older version to save them from being abused by
clients causing TIME_WAITs between haproxy and the server.

Thanks to Lukas and Lucas for running many tests with captures allowing
the bug to be narrowed down.
2017-12-29 17:23:40 +01:00
William Lallemand
e134041910 MINOR: don't close stdio anymore
Closing the standard IO FDs (0,1,2) can be troublesome, especially in
the case of the master-worker.

Instead of closing those FDs, they are now pointing to /dev/null which
prevents sending debugging messages to the wrong FDs.

This patch could be backported in 1.8.
2017-12-29 16:33:41 +01:00
PiBa-NL
149a81a443 BUG/MEDIUM: mworker: don't close stdio several time
This patch makes sure that a frontend socket that gets created after
initialization won't be closed when the master gets re-executed.

When used in daemon mode, the master-worker is closing the FDs 0, 1, 2
after the fork of the children.

When the master was reloading, those FDs were assigned again during the
parsing of the configuration (probably for some listeners), and the
workers were closing them thinking it was the stdio.

This patch must be backported to 1.8.
2017-12-29 16:31:10 +01:00
Willy Tarreau
d790143d99 BUG/MEDIUM: h2: ensure we always know the stream before sending a reset
The recent patch introducing the H2_CS_FRAME_E state to emit stream
resets was not totally correct in that in the rare case where there is
no room left to emit the reset, the next call to process it later could
use an uninitialized stream. This only affects responses to frames that
are sent on closed streams though.

This fix must be backported to 1.8.
2017-12-29 11:34:40 +01:00
Willy Tarreau
ab83750a29 BUG/MEDIUM: h2: improve handling of frames received on closed streams
The h2spec utility found certain situations where we're returning an
RST_STREAM while a GOAWAY is expected. While we can't always reliably
decide which one to use (eg: after a stream has been closed for a long
time), in practice we often still have the stream available until it's
destroyed at the application level. This provides the flags we need to
verify the conditions that led to its closure, namely if RST was sent
or received, or if it was regularly closed using a double ES.

The first step consists in marking all closed streams as having already
sent an RST_STREAM frame. This will ensure that we can send an RST_STREAM
for a late transmission on a stream we have forgotten about instead of
risking to break the connection. The next steps consist in re-arranging
the H2_SS_CLOSED checks so that we can deliver a GOAWAY frame for the
few cases where an unexpected frame was received after a double ES.

By carefully taking care of these specificities, we can reduce by 4 the
number of remaining compliance issues.

Note: some tests start to become a bit long and to be repeated at various
places. Probably that adding a bitmask of allowed/forbidden frame types
per state and/or per situation could significantly help. It's likely
that some deeper tests in the frame handlers could also be removed now
as they can't be triggered anymore.

This fix should be backported to 1.8.
2017-12-27 18:44:22 +01:00
Willy Tarreau
a20a519b8f BUG/MEDIUM: h2: properly handle and report some stream errors
Some stream errors applied to half-closed and closed streams are not
properly reported, especially after the stream transistions to the
closed state. The reason is that the code checks for this "error"
stream state in order to send an RST frame. But if the stream was
just closed or was already closed, there's no way to validate this
condition, and the error is never reported to the peer.

In order to address this situation, we'll add a new FRAME_E demux state
which indicates that the previously parsed frame triggered a stream error
of type STREAM CLOSED that needs to be reported. Proceeding like this
will ensure that we don't lose that information even if we can't
immediately send the message. It also removes the confusion where FRAME_A
could be used either for ACKs or for RST.

The state transition has been added after every h2s_error() on the demux
path. It seems that we might need to have two distinct h2s_error()
functions, one for the mux and another one for the demux, though it
would provide little benefit. It also becomes more apparent that the
H2_SS_ERROR state is only used to detect the need to report an error
on the mux direction. Maybe this will have to be revisited later.

This simple change managed to eliminate 5 bugs reported by h2spec.

This fix must be backported to 1.8.
2017-12-27 18:34:50 +01:00
Willy Tarreau
b26881a5d5 BUG/MEDIUM: checks: properly set servers to stopping state on 404
Paul Lockaby reported that since 1.8, disable-on-404 doesn't work
anymore in that the server stay up despite returning 404. Cyril spotted
that this was caused by a copy-paste error introduced by commit 5a13351
("BUG/MEDIUM: log: check result details truncated.") causing
set_server_running() to be called instead of set_server_stopping() in
this case.

It can be reproduced with the simple test config below :

  defaults
     mode http
     timeout connect 1s
     timeout client  10s
     timeout server  10s

  listen http
     bind :8888
     option httpchk GET /
     http-check disable-on-404
     server s1 127.0.0.1:9001 check
     server s2 127.0.0.1:9002 check
     http-response add-header x-served-by %s

  listen s1
     bind :9001
     server next 127.0.0.1:9002
     http-response set-status 404

  frontend s2
     bind :9002
     http-request redirect location /

S1 is supposed to be stopping and s2 up, which is not the case. After
calling the correct function, only S2 is used now.

This needs to be backported to 1.8.
2017-12-23 11:16:49 +01:00
Willy Tarreau
a48c141f44 BUG/MAJOR: connection: refine the situations where we don't send shutw()
Since commit f9ce57e ("MEDIUM: connection: make conn_sock_shutw() aware
of lingering"), we refrain from performing the shutw() on the socket if
there is no lingering risk. But there is a problem with this in tunnel
and in TCP modes where a client is explicitly allowed to send a shutw
to the server, eventhough it it risky.

Not doing it creates this situation reported by Ricardo Fraile and
diagnosed by Christopher : a typical HTTP client (eg: curl) connecting
via the config below to an HTTP server would receive its response,
immediately close while the server remains in keep-alive mode. The
shutr() received by haproxy from the client is "propagated" to the
server side but not acted upon because fdtab[fd].linger_risk is set,
so we expect that the next close will immediately complete this
operation.

  listen proxy-tcp
    bind 127.0.0.1:8888
    mode tcp
    timeout connect 5s
    timeout server  10s
    timeout client  10s
    server server1 127.0.0.1:8000

But since the whole stream will not end until the server closes in
turn, the server doesn't close and haproxy expires on server timeout.
This problem has already struck by waking up an older bug and was
partially fixed with commit 8059351 ("BUG/MEDIUM: http: don't disable
lingering on requests with tunnelled responses") though it was not
enough.

The problem is that linger_risk is not suited here. In fact we need to
know whether or not it is desired to close normally or silently, and
whether or not a shutr() has already been received on this connection.

This is the approach this patch takes, and it solves the problem for
the various difficult modes (tcp, http-server-close, pretend-keepalive).

This fix needs to be backported to 1.8. Many thanks to Ricardo for
providing very detailed traces and configurations.
2017-12-22 18:54:05 +01:00
Willy Tarreau
d4569d1937 BUG/MEDIUM: cache: don't cache the response on no-cache="set-cookie"
If the server mentions no-cache="set-cookie" in the response headers,
we must guarantee that any set-cookie field will not be stored. We
cannot edit the stored response on the fly to trim the set-cookie
header so we can refrain from storing a response containing such a
header. In theory we could use TX_SCK_PRESENT for this but this one
is only set when the cookie is being watched by the configuration.
Since these responses are not very frequent and often accompanied
with a set-cookie header, let's simply refrain from caching whenever
such directive is present.

This needs to be backported to 1.8.
2017-12-22 18:03:04 +01:00
Willy Tarreau
504455c533 BUG/MEDIUM: cache: respect the request cache-control header
Till now if a client emitted a request featureing a cache-control header,
this one was not respected and a stale object could still be delievered.r
 This patch ensures that :
  - cache-control: no-cache disables retrieval from the cache but does
    not prevent the newly fetched object from being stored ;
  - cache-control: no-store can safely retrieve from the cache but prevents
    from storing any fetched object
  - cache-control: max-age/max-stale/min-fresh act like no-cache
  - pragma: no-cache acts like cache-control: no-cache.

This needs to be backported to 1.8.
2017-12-22 17:56:18 +01:00
Willy Tarreau
c9bd34c7e0 BUG/MEDIUM: cache: replace old object on store
Currently the cache aborts a store operation if the object to store
already exists in the cache. This is used to avoid storing multiple
copies at the same time on concurrent accesses. It causes an issue
though, which is that existing unexpired objects cannot be updated.
This happens when any request criterion disables the retrieval from
the cache (eg: with max-age or any other cache-control condition).

For now, let's simply replace the previous existing entry by unlinking
it from the index. This could possibly be improved in the future if
needed.

This fix needs to be backported to 1.8.
2017-12-22 17:56:18 +01:00
Willy Tarreau
7704b1e89a BUG/MEDIUM: cache: do not try to retrieve host-less requests from the cache
All HTTP/1.1 requests the Host header share the same hash key 0 and
will be return the first cached object. Let's add the check on the call
to sha1_hosturi() to prevent this from happening.

This must be backported to 1.8.
2017-12-22 17:56:17 +01:00
Willy Tarreau
0ad8e0dfea MINOR: http: add a function to check request's cache-control header field
The new function check_request_for_cacheability() is used to check if
a request may be served from the cache, and/or allows the response to
be stored into the cache. For this it checks the cache-control and
pragma header fields, and adjusts the existing TX_CACHEABLE and a new
TX_CACHE_IGNORE flags.

For now, just like its response side counterpart, it only checks the
first value of the header field. These functions should be reworked to
improve their parsers and validate all elements.
2017-12-22 17:56:17 +01:00
Willy Tarreau
faf2909f9f BUG/MINOR: cache: do not force the TX_CACHEABLE flag before checking cacheability
The cache used to set this flag before calling
check_response_for_cacheability() due to the way the flags were previously
set (too late), but this is a bad idea as it loses the information of the
implicit caching rules related to the method and the status code. Let's
only rely on what was determined during the request and response parsing
instead and not change it.

This fix must be backported to 1.8, and it requires that the following
patches are also merged :
 - MINOR: http: adjust the list of supposedly cacheable methods
 - MINOR: http: update the list of cacheable status codes as per RFC7231
 - MINOR: http: start to compute the transaction's cacheability from the request
 - BUG/MINOR: http: do not ignore cache-control: public
2017-12-22 15:49:15 +01:00
Willy Tarreau
d3900cc31d BUG/MINOR: http: properly detect max-age=0 and s-maxage=0 in responses
In 1.3.8, commit a15645d ("[MAJOR] completed the HTTP response processing.")
improved the response parser by taking care of the cache-control header
field. The parser is wrong because it is split in two parts, one checking
for elements containing an equal sign and the other one for those without.
The "max-age=0" and "s-maxage=0" tests were located at the wrong place and
thus have never matched. In practice the side effect was very minimal given
that this code used to be enabled only when checking if a cookie had the
risk of being cached or not. Recently in 1.8 it was also used to decide if
the response could be cached but in practice the cache takes care of these
values by itself so there is very limited impact.

This fix can be backported to all stable versions.
2017-12-22 15:49:15 +01:00
Willy Tarreau
12b32f212f BUG/MINOR: http: do not ignore cache-control: public
In check_response_for_cacheability(), we don't check the
cache-control flags if the response is already supposed not to be
cacheable. This was introduced very early when cache-control:public
was not checked, and it basically results in this last one not being
able to properly mark the response as cacheable if it uses a status
code which is non-cacheable by default. Till now the impact is very
limited as it doesn't check that cookies set on non-default status
codes are not cacheable, and it prevents the cache from caching such
responses.

Let's fix this by doing two things :
  - remove the test for !TX_CACHEABLE in the aforementionned function
  - however take care of 1xx status codes here (which used to be
    implicitly dealt with by the test above) and remove the explicit
    check for 101 in the caller

This fix must be backported to 1.8.
2017-12-22 14:43:26 +01:00
Willy Tarreau
83ece462b4 MINOR: http: start to compute the transaction's cacheability from the request
There has always been something odd with the way the cache-control flags
are checked. Since it was made for checking for the risk of leaking cookies
only, all the processing was done in the response. Because of this it is not
possible to reuse the transaction flags correctly for use with the cache.

This patch starts to change this by moving the method check in the request
so that we know very early whether the transaction is expected to be cacheable
and that this status evolves along with checked headers. For now it's not
enough to use from the cache yet but at least it makes the flag more
consistent along the transaction processing.
2017-12-22 14:43:26 +01:00
Willy Tarreau
c55ddce65c MINOR: http: update the list of cacheable status codes as per RFC7231
Since RFC2616, the following codes were added to the list of codes
cacheable by default : 204, 404, 405, 414, 501. For now this it only
checked by the checkcache option to detect cacheable cookies.
2017-12-22 14:43:26 +01:00
Willy Tarreau
24ea0bcb1d MINOR: http: adjust the list of supposedly cacheable methods
We used to have a rule inherited from RFC2616 saying that the POST
method was the only uncacheable one, but things have changed since
and RFC7231+7234 made it clear that in fact only GET/HEAD/OPTIONS/TRACE
are cacheable. Currently this rule is only used to detect cacheable
cookies.
2017-12-22 14:43:26 +01:00
Eric Salama
fe7456f3b7 BUG/MEDIUM: lua: fix crash when using bogus mode in register_service()
When using an incorrect 'mode' as 2nd argument of core.register_service(),
HAProxy crashes while displaying the error message.

To be backported to 1.8, 1.7 and 1.6.
2017-12-22 14:34:54 +01:00
Emeric Brun
e31148031f BUG/MEDIUM: checks: a server passed in maint state was not forced down.
Setting a server in maint mode, the required next_state was not set
before calling the 'lb_down' function and so the system state was never
commited.

This patch should be backported in 1.8
2017-12-21 15:23:55 +01:00
Willy Tarreau
7aa15b072e BUG/MEDIUM: stream: don't consider abortonclose on muxes which close cleanly
The H2 mux can cleanly report an error when a client closes, which is not
the case for the pass-through mux which only reports shutr. That was the
reason why "option abortonclose" was created since there was no way to
distinguish a clean shutdown after sending the request from an abort.

The problem is that in case of H2, the streams are always shut read after
the request is complete (when the END_STREAM flag is received), and that
when this lands on a backend configured with "option abortonclose", this
aborts the request. Disabling abortonclose is not always an option when
H1 and H2 have to coexist.

This patch makes use of the newly introduced mux capabilities reported
via the stream interface's SI_FL_CLEAN_ABRT indicating that the mux is
safe and that there is no need to turn a clean shutread into an abort.
This way abortonclose has no effect on requests initiated from an H2
mux.

This patch as well as these 3 previous ones need to be backported to
1.8 :
 - BUG/MINOR: h2: properly report a stream error on RST_STREAM
 - MINOR: mux: add flags to describe a mux's capabilities
 - MINOR: stream-int: set flag SI_FL_CLEAN_ABRT when mux supports clean aborts
2017-12-20 17:01:24 +01:00
Willy Tarreau
984fca9363 MINOR: stream-int: set flag SI_FL_CLEAN_ABRT when mux supports clean aborts
By copying the info in the stream interface that the mux cleanly reports
aborts, we'll have the ability to check this flag wherever needed regardless
of the presence of a mux or not.
2017-12-20 16:56:32 +01:00
Willy Tarreau
28f1cb9da2 MINOR: mux: add flags to describe a mux's capabilities
This new field will be used to describe certain properties of some
muxes. For now we only add MX_FL_CLEAN_ABRT to indicate that a mux
is able to unambiguously report aborts using CS_FL_ERROR contrary
to others who may only report it via a read0. This will be used to
improve handling of the abortonclose option with H2. Other flags
may come later to report multiplexing capabilities or not, support
of client/server sides etc.
2017-12-20 16:31:30 +01:00
Willy Tarreau
2153d3ce73 BUG/MINOR: h2: properly report a stream error on RST_STREAM
We want to report such an error since H2 allows to differenciate
between an end of stream and an abort.

To be backported to 1.8.
2017-12-20 14:38:19 +01:00
Etienne Carriere
aec8989e53 MINOR: spoe: add force-set-var option in spoe-agent configuration
For security reasons, the spoe filter was only able to change values of
existing variables. In specific cases (ex : with LUA code), the name of
variables are unknown at the configuration parsing phase.
The force-set-var option can be enabled to register all variables.
2017-12-20 08:55:18 +01:00
Bertrand Jacquin
72fa1ec24e MEDIUM: netscaler: add support for standard NetScaler CIP protocol
It looks like two version of the protocol exist as reported by
Andreas Mahnke. This patch add support for both legacy and standard CIP
protocol according to NetScaler specifications.
2017-12-20 07:04:07 +01:00
Bertrand Jacquin
a341a2f479 MEDIUM: netscaler: do not analyze original IP packet size
Original informations about the client are stored in the CIP encapsulated
IP header, hence there is no need to consider original IP packet length
to determine if data are missing. Instead this change detect missing
data if the remaining buffer is large enough to contain a minimal IP and
TCP header and if the buffer has as much data as CIP is telling.
2017-12-20 07:04:07 +01:00
Bertrand Jacquin
67de5a295c MINOR: netscaler: check in one-shot if buffer is large enough for IP and TCP header
There is minimal gain in checking first the IP header length and then
the TCP header length since we always want to capture information about
both protocols.

IPv4 length calculation was incorrect since IPv4 ip_len actually defines
the total length of IPv4 header and following data.
2017-12-20 07:04:07 +01:00
Bertrand Jacquin
43a66a96b3 BUG/MAJOR: netscaler: address truncated CIP header detection
Buffer line is manually incremented in order to progress in the trash
buffer but calculation are made omitting this manual offset.

This leads to random packets being rejected with the following error:

  HTTP/1: Truncated NetScaler Client IP header received

Instead, once original IP header is found, use the IP header length
without considering the CIP encapsulation.
2017-12-20 07:04:07 +01:00
Bertrand Jacquin
c7cc69ac36 BUG/MEDIUM: netscaler: use the appropriate IPv6 header size
IPv6 header has a fixed size of 40 bytes, not 20.
2017-12-20 07:04:07 +01:00
Bertrand Jacquin
7d668f9e76 MINOR: netscaler: rename cip_len to clarify its uage
cip_len was meant to be the length of the data encapsulated in the CIP
protocol, the size the IP and TCP header
2017-12-20 07:04:07 +01:00
Bertrand Jacquin
4b4c286bee MINOR: netscaler: remove the use of cip_magic only used once 2017-12-20 07:04:07 +01:00
Bertrand Jacquin
b387591f32 MINOR: netscaler: respect syntax
As per doc/coding-style.txt
2017-12-20 07:04:07 +01:00
Christopher Faulet
789691778f BUG/MEDIUM: mworker: Set FD_CLOEXEC flag on log fd
A log socket (UDP or UNIX) is opened by the master during its startup, when the
first log message is sent. So, to prevent FD leaks, we must ensure we correctly
close it during a reload. By setting FD_CLOEXEC bit on it, we are sure it will
be automatically closed it during a reload.

This patch must be backported in 1.8.
2017-12-19 14:03:30 +01:00
Willy Tarreau
60a2ee7945 MINOR: sample: rename the "len" converter to "length"
This converter was recently introduced by commit ed0d24e ("MINOR:
sample: add len converter").

As found by Cyril, it causes an issue in "http-request capture"
statements. The non-obvious problem is that an old syntax for sample
expressions and converters used to support a series of words, each
representing a converter. This used to be how the "stick" directives
were created initially. By having a converter called "len", a
statement such as "http-request capture foo len 10" considers "len"
as a converter and not as the capture length.

This obsolete syntax needs to be changed in 1.9 but it's too late
for other versions. It's worth noting that the same problem can
happen if converters are registered on the fly using Lua. Other
language keywords that currently have to be avoided in converters
include "id", "table", "if", "unless".
2017-12-15 07:13:48 +01:00
Cyril Bonté
9fc9e53763 BUG: MINOR: http: don't check http-request capture id when len is provided
Randomly, haproxy could fail to start when a "http-request capture"
action is defined, without any change to the configuration. The issue
depends on the memory content, which may raise a fatal error like :
  unable to find capture id 'xxxx' referenced by http-request capture
rule

Commit fd608dd2 already prevents the condition to happen, but this one
should be included for completeness and to reclect the code on the
response side.

The issue was introduced recently by commit 29730ba5 and should only be
backported to haproxy 1.8.
2017-12-14 22:46:27 +01:00
Cyril Bonté
3906d5739c BUG: MAJOR: lb_map: server map calculation broken
Adrian Williams reported that several balancing methods were broken and
sent all requests to one backend. This is a regression in haproxy 1.8 where
the server score was not correctly recalculated.

This fix must be backported to the 1.8 branch.
2017-12-14 17:36:39 +01:00
Etienne Carriere
ed0d24ebed MINOR: sample: add len converter
Add len converter that returns the length of a string
2017-12-14 14:36:10 +01:00
Willy Tarreau
b78b80efe5 BUG/MINOR: stream-int: don't try to receive again after receiving an EOS
When an end of stream has been reported, we should not try to receive again
as the mux layer might not be prepared to this and could report unexpected
errors.

This is more of a strengthening measure that follows the introduction of
conn_stream that came in 1.8. It's desired to backport this into 1.8 though
it's uncertain at this time whether it may have caused real issues.
2017-12-14 13:43:52 +01:00
Willy Tarreau
91bfdd7e04 BUG/MEDIUM: h2: fix stream limit enforcement
Commit 4974561 ("BUG/MEDIUM: h2: enforce the per-connection stream limit")
implemented a stream limit enforcement on the connection but it was not
correctly done as it would count streams still known by the connection,
which includes the lingering ones that are already marked close. We need
to count only the non-closed ones, which this patch does. The effect is
that some streams are rejected a bit before the limit.

This fix needs to be backported to 1.8.
2017-12-14 13:43:52 +01:00
Willy Tarreau
805935147a BUG/MEDIUM: http: don't disable lingering on requests with tunnelled responses
The HTTP forwarding engine needs to disable lingering on requests in
case the connection to the server has to be suddenly closed due to
http-server-close being used, so that we don't accumulate lethal
TIME_WAIT sockets on the outgoing side. A problem happens when the
server doesn't advertise a response size, because the response
message quickly goes through the MSG_DONE and MSG_TUNNEL states,
and once the client has transferred all of its data, it turns to
MSG_DONE and immediately sets NOLINGER and closes before the server
has a chance to respond. The problem is that this destroys some of
the pending DATA being uploaded, the server doesn't receive all of
them, detects an error and closes.

This early NOLINGER is inappropriate in this situation because it
happens before the response is transmitted. This state transition
to MSG_TUNNEL doesn't happen when the response size is known since
we stay in MSG_DATA (and related states) during all the transfer.

Given that the issue is only related to connections not advertising
a response length and that by definition these connections cannot be
reused, there's no need for NOLINGER when the response's transfer
length is not known, which can be verified when entering the CLOSED
state. That's what this patch does.

This fix needs to be backported to 1.8 and very likely to 1.7 and
older as it affects the very rare case where a client immediately
closes after the last uploaded byte (typically a script). However
given that the risk of occurrence in HTTP/1 is extremely low, it is
probably wise to wait before backporting it before 1.8.
2017-12-14 13:43:52 +01:00
Willy Tarreau
13e4e94dae BUG/MEDIUM: h2: don't close after the first DATA frame on tunnelled responses
Tunnelled responses are those without a content-length nor a chunked
encoding. They are specially dealt with in the current code but the
behaviour is not correct. The fact that the chunk size is left to zero
with a state artificially set to CHUNK_SIZE validates the test on
whether or not to set the end of stream flag. Thus the first DATA
frame always carries the ES flag and subsequent ones remain blocked.

This patch fixes it in two ways :
  - update h1m->curr_len to the size of the current buffer so that it
    is properly subtracted later to find the real end ;
  - don't set the state to CHUNK_SIZE when there's no content-length
    and instead set it to CHUNK_SIZE only when there's chunking.

This fix needs to be backported to 1.8.
2017-12-14 13:43:52 +01:00
Willy Tarreau
c4134ba8b0 BUG/MEDIUM: h2: don't switch the state to HREM before end of DATA frame
We used to switch the stream's state to HREM when seeing and ES bit on
the DATA frame before actually being able to process that frame, possibly
resulting in the DATA frame being processed after the stream was seen as
half-closed and possibly being rejected. The state must not change before
the frame is really processed.

Also fixes a harmless typo in the flag name which should have DATA and
not HEADERS in its name (but all values are equal).

Must be backported to 1.8.
2017-12-14 13:43:52 +01:00
Willy Tarreau
6847262211 MINOR: h2: don't demand that a DATA frame is complete before processing it
Since last commit it's not required that the DATA frames are complete anymore
so better start with what we have. Only the HEADERS frame requires this. This
may be backported as part of the upload fixes.
2017-12-14 13:43:52 +01:00
Willy Tarreau
8fc016d0fe BUG/MEDIUM: h2: support uploading partial DATA frames
We currently have a problem with DATA frames when they don't fit into
the destination buffer. While it was imagined that in theory this never
happens, in practice it does when "option http-buffer-request" is set,
because the headers don't leave the target buffer before trying to read
so if the frame is full, there's never enough room.

This fix consists in reading what can be read from the frame and advancing
the input buffer. Once the contents left are only the padding, the frame
is completely processed. This also solves another problem we had which is
that it was possible to fill a request buffer beyond its reserve because
the <count> argument was not respected in h2_rcv_buf(). Thus it's possible
that some POST requests sent at once with a headers+body filling exactly a
buffer could result in "400 bad req" when trying to add headers.

This fix must be backported to 1.8.
2017-12-14 13:43:52 +01:00
Willy Tarreau
05e5dafe9a MINOR: h2: store the demux padding length in the h2c struct
We'll try to process partial frames and for this we need to know the
padding length. The first step requires to extract it during the parsing
and store it in the demux context in the connection. Till now it was only
processed at once.
2017-12-14 13:43:52 +01:00
Willy Tarreau
d13bf27e78 BUG/MEDIUM: h2: debug incoming traffic in h2_wake()
Even after previous commit ("BUG/MEDIUM: h2: work around a connection
API limitation") there is still a problem with some requests. Sometimes
when polling for more request data while some pending data lies in the
buffer, there's no way to enter h2_recv() because the FD is not marked
ready for reading.

We need to slightly change the approach and make h2_recv() only receive
from the buffer and h2_wake() always attempt to demux if the demux is not
blocked.

However, if the connection is already being polled for reading, it will
not wake up from polling. For this reason we need to cheat and also
pretend a request for sending data, which ensures that as soon as any
direction may move, we can continue to demux. This shows that in the
long term we probably need a better way to resume an interrupted
operation at the mux level.

With this fix, no more hangups happen during uploads. Note that this
time the setup required to provoke the hangups was a bit complex :
  - client is "curl" running on local host, uploading 1.7 MB of
    data via haproxy
  - haproxy running on local host, forwarding to a remote server
    through a 100 Mbps only switch
  - timeouts disabled on haproxy
  - remote server made of thttpd executing a cgi reading request data
    through "dd bs=10" to slow down everything.

With such a setup, around 3-5% of the connections would hang up.

This fix needs to be backported to 1.8.
2017-12-14 13:43:24 +01:00
Willy Tarreau
6042aeb1e8 BUG/MEDIUM: h2: work around a connection API limitation
The connection API permits us to enable or disable receiving on a
connection. The underlying FD layer arranges this with the polling
and the fd cache. In practice, if receiving was allowed and an end
of buffer was reached, the FD is subscribed to the polling. If later
we want to process pending data from the buffer, we have to enable
receiving again, but since it's already enabled (in polled mode),
nothing happens and the pending data remain stuck until a new event
happens on the connection to wake the FD up. This is a limitation of
the internal connection API which is not very friendly to the new mux
architecture.

The visible effect is that certain uploads to slow servers experience
truncation on timeout on their last blocks because nothing new comes
from the connection to wake it up while it's being polled.

In order to work around this, there are two solutions :
  - either cheat on the connection so that conn_update_xprt_polling()
    always performs a call to fd_may_recv() after fd_want_recv(), that
    we can trigger from the mux by always calling conn_xprt_stop_recv()
    before conn_xprt_want_recv(), but that's a bit tricky and may have
    side effects on other parts (eg: SSL)

  - or we refrain from receiving in the mux as soon as we're busy on
    anything else, regardless of whether or not some room is available
    in the receive buffer.

This patch takes the second approach above. This way once we read some
data, as soon as we detect that we're stuck, we immediately stop receiving.
This ensures the event doesn't go into polled mode for this period and
that as soon as we're unstuck we can continue. In fact this guarantees
that we can only wait on one side of the mux for a given direction. A
future improvement of the connection layer should make it possible to
resume processing of an interrupted receive operation.

This fix must be backported to 1.8.
2017-12-14 13:43:24 +01:00