Commit Graph

6022 Commits

Author SHA1 Message Date
Eric Salama
fe7456f3b7 BUG/MEDIUM: lua: fix crash when using bogus mode in register_service()
When using an incorrect 'mode' as 2nd argument of core.register_service(),
HAProxy crashes while displaying the error message.

To be backported to 1.8, 1.7 and 1.6.
2017-12-22 14:34:54 +01:00
Emeric Brun
e31148031f BUG/MEDIUM: checks: a server passed in maint state was not forced down.
Setting a server in maint mode, the required next_state was not set
before calling the 'lb_down' function and so the system state was never
commited.

This patch should be backported in 1.8
2017-12-21 15:23:55 +01:00
Willy Tarreau
7aa15b072e BUG/MEDIUM: stream: don't consider abortonclose on muxes which close cleanly
The H2 mux can cleanly report an error when a client closes, which is not
the case for the pass-through mux which only reports shutr. That was the
reason why "option abortonclose" was created since there was no way to
distinguish a clean shutdown after sending the request from an abort.

The problem is that in case of H2, the streams are always shut read after
the request is complete (when the END_STREAM flag is received), and that
when this lands on a backend configured with "option abortonclose", this
aborts the request. Disabling abortonclose is not always an option when
H1 and H2 have to coexist.

This patch makes use of the newly introduced mux capabilities reported
via the stream interface's SI_FL_CLEAN_ABRT indicating that the mux is
safe and that there is no need to turn a clean shutread into an abort.
This way abortonclose has no effect on requests initiated from an H2
mux.

This patch as well as these 3 previous ones need to be backported to
1.8 :
 - BUG/MINOR: h2: properly report a stream error on RST_STREAM
 - MINOR: mux: add flags to describe a mux's capabilities
 - MINOR: stream-int: set flag SI_FL_CLEAN_ABRT when mux supports clean aborts
2017-12-20 17:01:24 +01:00
Willy Tarreau
984fca9363 MINOR: stream-int: set flag SI_FL_CLEAN_ABRT when mux supports clean aborts
By copying the info in the stream interface that the mux cleanly reports
aborts, we'll have the ability to check this flag wherever needed regardless
of the presence of a mux or not.
2017-12-20 16:56:32 +01:00
Willy Tarreau
28f1cb9da2 MINOR: mux: add flags to describe a mux's capabilities
This new field will be used to describe certain properties of some
muxes. For now we only add MX_FL_CLEAN_ABRT to indicate that a mux
is able to unambiguously report aborts using CS_FL_ERROR contrary
to others who may only report it via a read0. This will be used to
improve handling of the abortonclose option with H2. Other flags
may come later to report multiplexing capabilities or not, support
of client/server sides etc.
2017-12-20 16:31:30 +01:00
Willy Tarreau
2153d3ce73 BUG/MINOR: h2: properly report a stream error on RST_STREAM
We want to report such an error since H2 allows to differenciate
between an end of stream and an abort.

To be backported to 1.8.
2017-12-20 14:38:19 +01:00
Etienne Carriere
aec8989e53 MINOR: spoe: add force-set-var option in spoe-agent configuration
For security reasons, the spoe filter was only able to change values of
existing variables. In specific cases (ex : with LUA code), the name of
variables are unknown at the configuration parsing phase.
The force-set-var option can be enabled to register all variables.
2017-12-20 08:55:18 +01:00
Bertrand Jacquin
72fa1ec24e MEDIUM: netscaler: add support for standard NetScaler CIP protocol
It looks like two version of the protocol exist as reported by
Andreas Mahnke. This patch add support for both legacy and standard CIP
protocol according to NetScaler specifications.
2017-12-20 07:04:07 +01:00
Bertrand Jacquin
a341a2f479 MEDIUM: netscaler: do not analyze original IP packet size
Original informations about the client are stored in the CIP encapsulated
IP header, hence there is no need to consider original IP packet length
to determine if data are missing. Instead this change detect missing
data if the remaining buffer is large enough to contain a minimal IP and
TCP header and if the buffer has as much data as CIP is telling.
2017-12-20 07:04:07 +01:00
Bertrand Jacquin
67de5a295c MINOR: netscaler: check in one-shot if buffer is large enough for IP and TCP header
There is minimal gain in checking first the IP header length and then
the TCP header length since we always want to capture information about
both protocols.

IPv4 length calculation was incorrect since IPv4 ip_len actually defines
the total length of IPv4 header and following data.
2017-12-20 07:04:07 +01:00
Bertrand Jacquin
43a66a96b3 BUG/MAJOR: netscaler: address truncated CIP header detection
Buffer line is manually incremented in order to progress in the trash
buffer but calculation are made omitting this manual offset.

This leads to random packets being rejected with the following error:

  HTTP/1: Truncated NetScaler Client IP header received

Instead, once original IP header is found, use the IP header length
without considering the CIP encapsulation.
2017-12-20 07:04:07 +01:00
Bertrand Jacquin
c7cc69ac36 BUG/MEDIUM: netscaler: use the appropriate IPv6 header size
IPv6 header has a fixed size of 40 bytes, not 20.
2017-12-20 07:04:07 +01:00
Bertrand Jacquin
7d668f9e76 MINOR: netscaler: rename cip_len to clarify its uage
cip_len was meant to be the length of the data encapsulated in the CIP
protocol, the size the IP and TCP header
2017-12-20 07:04:07 +01:00
Bertrand Jacquin
4b4c286bee MINOR: netscaler: remove the use of cip_magic only used once 2017-12-20 07:04:07 +01:00
Bertrand Jacquin
b387591f32 MINOR: netscaler: respect syntax
As per doc/coding-style.txt
2017-12-20 07:04:07 +01:00
Christopher Faulet
789691778f BUG/MEDIUM: mworker: Set FD_CLOEXEC flag on log fd
A log socket (UDP or UNIX) is opened by the master during its startup, when the
first log message is sent. So, to prevent FD leaks, we must ensure we correctly
close it during a reload. By setting FD_CLOEXEC bit on it, we are sure it will
be automatically closed it during a reload.

This patch must be backported in 1.8.
2017-12-19 14:03:30 +01:00
Willy Tarreau
60a2ee7945 MINOR: sample: rename the "len" converter to "length"
This converter was recently introduced by commit ed0d24e ("MINOR:
sample: add len converter").

As found by Cyril, it causes an issue in "http-request capture"
statements. The non-obvious problem is that an old syntax for sample
expressions and converters used to support a series of words, each
representing a converter. This used to be how the "stick" directives
were created initially. By having a converter called "len", a
statement such as "http-request capture foo len 10" considers "len"
as a converter and not as the capture length.

This obsolete syntax needs to be changed in 1.9 but it's too late
for other versions. It's worth noting that the same problem can
happen if converters are registered on the fly using Lua. Other
language keywords that currently have to be avoided in converters
include "id", "table", "if", "unless".
2017-12-15 07:13:48 +01:00
Cyril Bont
9fc9e53763 BUG: MINOR: http: don't check http-request capture id when len is provided
Randomly, haproxy could fail to start when a "http-request capture"
action is defined, without any change to the configuration. The issue
depends on the memory content, which may raise a fatal error like :
  unable to find capture id 'xxxx' referenced by http-request capture
rule

Commit fd608dd2 already prevents the condition to happen, but this one
should be included for completeness and to reclect the code on the
response side.

The issue was introduced recently by commit 29730ba5 and should only be
backported to haproxy 1.8.
2017-12-14 22:46:27 +01:00
Cyril Bont
3906d5739c BUG: MAJOR: lb_map: server map calculation broken
Adrian Williams reported that several balancing methods were broken and
sent all requests to one backend. This is a regression in haproxy 1.8 where
the server score was not correctly recalculated.

This fix must be backported to the 1.8 branch.
2017-12-14 17:36:39 +01:00
Etienne Carriere
ed0d24ebed MINOR: sample: add len converter
Add len converter that returns the length of a string
2017-12-14 14:36:10 +01:00
Willy Tarreau
b78b80efe5 BUG/MINOR: stream-int: don't try to receive again after receiving an EOS
When an end of stream has been reported, we should not try to receive again
as the mux layer might not be prepared to this and could report unexpected
errors.

This is more of a strengthening measure that follows the introduction of
conn_stream that came in 1.8. It's desired to backport this into 1.8 though
it's uncertain at this time whether it may have caused real issues.
2017-12-14 13:43:52 +01:00
Willy Tarreau
91bfdd7e04 BUG/MEDIUM: h2: fix stream limit enforcement
Commit 4974561 ("BUG/MEDIUM: h2: enforce the per-connection stream limit")
implemented a stream limit enforcement on the connection but it was not
correctly done as it would count streams still known by the connection,
which includes the lingering ones that are already marked close. We need
to count only the non-closed ones, which this patch does. The effect is
that some streams are rejected a bit before the limit.

This fix needs to be backported to 1.8.
2017-12-14 13:43:52 +01:00
Willy Tarreau
805935147a BUG/MEDIUM: http: don't disable lingering on requests with tunnelled responses
The HTTP forwarding engine needs to disable lingering on requests in
case the connection to the server has to be suddenly closed due to
http-server-close being used, so that we don't accumulate lethal
TIME_WAIT sockets on the outgoing side. A problem happens when the
server doesn't advertise a response size, because the response
message quickly goes through the MSG_DONE and MSG_TUNNEL states,
and once the client has transferred all of its data, it turns to
MSG_DONE and immediately sets NOLINGER and closes before the server
has a chance to respond. The problem is that this destroys some of
the pending DATA being uploaded, the server doesn't receive all of
them, detects an error and closes.

This early NOLINGER is inappropriate in this situation because it
happens before the response is transmitted. This state transition
to MSG_TUNNEL doesn't happen when the response size is known since
we stay in MSG_DATA (and related states) during all the transfer.

Given that the issue is only related to connections not advertising
a response length and that by definition these connections cannot be
reused, there's no need for NOLINGER when the response's transfer
length is not known, which can be verified when entering the CLOSED
state. That's what this patch does.

This fix needs to be backported to 1.8 and very likely to 1.7 and
older as it affects the very rare case where a client immediately
closes after the last uploaded byte (typically a script). However
given that the risk of occurrence in HTTP/1 is extremely low, it is
probably wise to wait before backporting it before 1.8.
2017-12-14 13:43:52 +01:00
Willy Tarreau
13e4e94dae BUG/MEDIUM: h2: don't close after the first DATA frame on tunnelled responses
Tunnelled responses are those without a content-length nor a chunked
encoding. They are specially dealt with in the current code but the
behaviour is not correct. The fact that the chunk size is left to zero
with a state artificially set to CHUNK_SIZE validates the test on
whether or not to set the end of stream flag. Thus the first DATA
frame always carries the ES flag and subsequent ones remain blocked.

This patch fixes it in two ways :
  - update h1m->curr_len to the size of the current buffer so that it
    is properly subtracted later to find the real end ;
  - don't set the state to CHUNK_SIZE when there's no content-length
    and instead set it to CHUNK_SIZE only when there's chunking.

This fix needs to be backported to 1.8.
2017-12-14 13:43:52 +01:00
Willy Tarreau
c4134ba8b0 BUG/MEDIUM: h2: don't switch the state to HREM before end of DATA frame
We used to switch the stream's state to HREM when seeing and ES bit on
the DATA frame before actually being able to process that frame, possibly
resulting in the DATA frame being processed after the stream was seen as
half-closed and possibly being rejected. The state must not change before
the frame is really processed.

Also fixes a harmless typo in the flag name which should have DATA and
not HEADERS in its name (but all values are equal).

Must be backported to 1.8.
2017-12-14 13:43:52 +01:00
Willy Tarreau
6847262211 MINOR: h2: don't demand that a DATA frame is complete before processing it
Since last commit it's not required that the DATA frames are complete anymore
so better start with what we have. Only the HEADERS frame requires this. This
may be backported as part of the upload fixes.
2017-12-14 13:43:52 +01:00
Willy Tarreau
8fc016d0fe BUG/MEDIUM: h2: support uploading partial DATA frames
We currently have a problem with DATA frames when they don't fit into
the destination buffer. While it was imagined that in theory this never
happens, in practice it does when "option http-buffer-request" is set,
because the headers don't leave the target buffer before trying to read
so if the frame is full, there's never enough room.

This fix consists in reading what can be read from the frame and advancing
the input buffer. Once the contents left are only the padding, the frame
is completely processed. This also solves another problem we had which is
that it was possible to fill a request buffer beyond its reserve because
the <count> argument was not respected in h2_rcv_buf(). Thus it's possible
that some POST requests sent at once with a headers+body filling exactly a
buffer could result in "400 bad req" when trying to add headers.

This fix must be backported to 1.8.
2017-12-14 13:43:52 +01:00
Willy Tarreau
05e5dafe9a MINOR: h2: store the demux padding length in the h2c struct
We'll try to process partial frames and for this we need to know the
padding length. The first step requires to extract it during the parsing
and store it in the demux context in the connection. Till now it was only
processed at once.
2017-12-14 13:43:52 +01:00
Willy Tarreau
d13bf27e78 BUG/MEDIUM: h2: debug incoming traffic in h2_wake()
Even after previous commit ("BUG/MEDIUM: h2: work around a connection
API limitation") there is still a problem with some requests. Sometimes
when polling for more request data while some pending data lies in the
buffer, there's no way to enter h2_recv() because the FD is not marked
ready for reading.

We need to slightly change the approach and make h2_recv() only receive
from the buffer and h2_wake() always attempt to demux if the demux is not
blocked.

However, if the connection is already being polled for reading, it will
not wake up from polling. For this reason we need to cheat and also
pretend a request for sending data, which ensures that as soon as any
direction may move, we can continue to demux. This shows that in the
long term we probably need a better way to resume an interrupted
operation at the mux level.

With this fix, no more hangups happen during uploads. Note that this
time the setup required to provoke the hangups was a bit complex :
  - client is "curl" running on local host, uploading 1.7 MB of
    data via haproxy
  - haproxy running on local host, forwarding to a remote server
    through a 100 Mbps only switch
  - timeouts disabled on haproxy
  - remote server made of thttpd executing a cgi reading request data
    through "dd bs=10" to slow down everything.

With such a setup, around 3-5% of the connections would hang up.

This fix needs to be backported to 1.8.
2017-12-14 13:43:24 +01:00
Willy Tarreau
6042aeb1e8 BUG/MEDIUM: h2: work around a connection API limitation
The connection API permits us to enable or disable receiving on a
connection. The underlying FD layer arranges this with the polling
and the fd cache. In practice, if receiving was allowed and an end
of buffer was reached, the FD is subscribed to the polling. If later
we want to process pending data from the buffer, we have to enable
receiving again, but since it's already enabled (in polled mode),
nothing happens and the pending data remain stuck until a new event
happens on the connection to wake the FD up. This is a limitation of
the internal connection API which is not very friendly to the new mux
architecture.

The visible effect is that certain uploads to slow servers experience
truncation on timeout on their last blocks because nothing new comes
from the connection to wake it up while it's being polled.

In order to work around this, there are two solutions :
  - either cheat on the connection so that conn_update_xprt_polling()
    always performs a call to fd_may_recv() after fd_want_recv(), that
    we can trigger from the mux by always calling conn_xprt_stop_recv()
    before conn_xprt_want_recv(), but that's a bit tricky and may have
    side effects on other parts (eg: SSL)

  - or we refrain from receiving in the mux as soon as we're busy on
    anything else, regardless of whether or not some room is available
    in the receive buffer.

This patch takes the second approach above. This way once we read some
data, as soon as we detect that we're stuck, we immediately stop receiving.
This ensures the event doesn't go into polled mode for this period and
that as soon as we're unstuck we can continue. In fact this guarantees
that we can only wait on one side of the mux for a given direction. A
future improvement of the connection layer should make it possible to
resume processing of an interrupted receive operation.

This fix must be backported to 1.8.
2017-12-14 13:43:24 +01:00
Willy Tarreau
315d807cbc BUG/MEDIUM: h2: enable recv polling whenever demuxing is possible
In order to allow demuxing when the dmux buffer is full, we need to
enable data receipt in multiple conditions. Since the conditions are a
bit complex, they have been delegated to a new function h2_recv_allowed()
which follows these rules :

  - if an error or a shutdown was detected on the connection and the buffer
    is empty, we must not attempt to receive
  - if the demux buf failed to be allocated, we must not try to receive and
    we know there is nothing pending
  - if the buffer is not full, we may attempt to receive
  - if no flag indicates a blocking condition, we may attempt to receive
  - otherwise must may not attempt

No more truncated payloads are detected in tests anymore, which seems to
indicate that the issue was worked around. A better connection API will
have to be created for new versions to make this stuff simpler and more
intuitive.

This fix needs to be backported to 1.8 along with the rest of the patches
related to CS_FL_RCV_MORE.
2017-12-10 22:17:57 +01:00
Willy Tarreau
c9ede6c43e BUG/MEDIUM: h2: automatically set CS_FL_RCV_MORE when the output buffer is full
If we can't demux pending data due to a stream buffer full condition, we
now set CS_FL_RCV_MORE on the conn_stream so that the stream layer knows
it must call back as soon as possible to restart demuxing. Without this,
some uploaded payloads are truncated if the server does not consume them
fast enough and buffers fill up.

Note that this is still not enough to solve the problem, some changes are
required on the recv() and update_poll() paths to allow to restart reading
even with a buffer full condition.

This patch must be backported to 1.8.
2017-12-10 21:28:43 +01:00
Willy Tarreau
6577b48613 BUG/MEDIUM: stream-int: always set SI_FL_WAIT_ROOM on CS_FL_RCV_MORE
When a stream interface tries to read data from a mux using rcv_buf(),
sometimes it sees 0 as the return value and concludes that there's no
more data while there are, resulting in the connection being polled for
more data and no new attempt being made at reading these pending data.

Now it will automatically check for flag CS_FL_RCV_MORE to know if the
mux really did not have anything available or was not able to provide
these data by lack of room in the destination buffer, and will set
SI_FL_WAIT_ROOM accordingly. This will ensure that once current data
lying in the buffer are forwarded to the other side, reading chk_rcv()
will be called to re-enable reading.

It's important to note that in practice it will rely on the mux's
update_poll() function to re-enable reading and that where the calls
are placed in the stream interface, it's not possible to perform a
new synchronous rcv_buf() call. Thus a corner case remains where the
mux cannot receive due to a full buffer or any similar condition, but
needs to be able to wake itself up to deliver pending data. This is a
limitation of the current connection/conn_stream API which will likely
need a new event subscription to at least call ->wake() asynchronously
(eg: mux->{kick,restart,touch,update} ?).

For now the affected mux (h2 only) will have to take care of the extra
logic to carefully enable polling to restart processing incoming data.

This patch relies on previous one (MINOR: conn_stream: add new flag
CS_FL_RCV_MORE to indicate pending data) and both must be backported to
1.8.
2017-12-10 21:19:33 +01:00
Thierry FOURNIER
cb14688496 BUG/MEDIUM: lua/notification: memory leak
The thread patches adds refcount for notifications. The notifications are
used with the Lua cosocket. These refcount free the notifications when
the session is cleared. In the Lua task case, it not have sessions, so
the nofications are never cleraed.

This patch adds a garbage collector for signals. The garbage collector
just clean the notifications for which the end point is disconnected.

This patch should be backported in 1.8
2017-12-10 19:38:58 +01:00
Christopher Faulet
eb3e276d39 BUG/MEDIUM: threads/vars: Fix deadlock in register_name
In register_name, before locking the var_names array, we check the variable name
validity. So if we try to register an invalid or empty name, we need to return
without unlocking it (because it was never locked).

This patch must be backported in 1.8.
2017-12-08 10:37:24 +01:00
PiBa-NL
1714b9f286 BUG/MEDIUM: email-alert: don't set server check status from a email-alert task
This avoids possible 100% cpu usage deadlock on a EMAIL_ALERTS_LOCK and
avoids sending lots of emails when 'option log-health-checks' is used.
It is avoided to change the server state and possibly queue a new email
while processing the email alert by setting check->status to
HCHK_STATUS_UNKNOWN which will exit the set_server_check_status(..) early.

This needs to be backported to 1.8.
2017-12-08 05:58:56 +01:00
Tim Duesterhus
d16f450c98 MINOR: mworker: Improve wording in void mworker_wait()
Replace "left" / "leaving" with "exit" / "exiting".

This should be backported to haproxy 1.8.
2017-12-07 19:21:25 +01:00
Tim Duesterhus
c578d9acfa MINOR: mworker: Update messages referencing exit-on-failure
Commit 4cfede87a3 removed
`exit-on-failure` in favor of `no-exit-on-failure`, but failed
to update references to the former in user facing messages.

This should be backported to haproxy 1.8.
2017-12-07 19:21:14 +01:00
Willy Tarreau
0249219be8 BUG/MEDIUM: h2: fix handling of end of stream again
Commit 9470d2c ("BUG/MINOR: h2: try to abort closed streams as
soon as possible") tried to address the situations where a stream
is closed by the client, but caused a side effect which is that in
some cases, a regularly closed stream reports an error to the stream
layer. The reason is that we purposely matched H2_SS_CLOSED in the
test for H2_SS_ERROR to report this so that we can check for RST,
but it accidently catches certain end of transfers as well. This
results in valid requests to report flags "CD" in the logs.

Instead, let's roll back to detecting H2_SS_ERROR and explicitly check
for a received RST. This way we can correctly abort transfers without
mistakenly reporting errors in normal situations.

This fix needs to be backported to 1.8 as the fix above was merged into
1.8.1.
2017-12-07 19:20:35 +01:00
Willy Tarreau
dbd026792a BUG/MEDIUM: peers: set NOLINGER on the outgoing stream interface
Since peers were ported to an applet in 1.5, an issue appeared which
is that certain attempts to close an outgoing connection are a bit
"too nice". Specifically, protocol errors and stream timeouts result
in a clean shutdown to be sent, waiting for the other side to confirm.
This is particularly problematic in the case of timeouts since by
definition the other side will not confirm as it has disappeared.

As found by Fred, this issue was further emphasized in 1.8 by commit
f9ce57e ("MEDIUM: connection: make conn_sock_shutw() aware of
lingering") which causes clean shutdowns not to be sent if the fd is
marked as linger_risk, because now even a clean timeout will not be
sent on an idle peers session, and the other one will have nothing
to respond to.

The solution here is to set NOLINGER on the outgoing stream interface
to ensure we always close whenever we attempt a simple shutdown.

However it is important to keep in mind that this also underlines
some weaknesses of the shutr/shutw processing inside process_stream()
and that all this part needs to be reworked to clearly consider the
abort case, and to stop the confusion between linger_risk and NOLINGER.

This fix needs to be backported as far as 1.5 (all versions are affected).
However, during testing of the backport it was found that 1.5 never tries
to close the peers connection on timeout, so it suffers for another issue.
2017-12-06 17:48:36 +01:00
Emeric Brun
8f29829e24 BUG/MEDIUM: checks: a down server going to maint remains definitely stucked on down state.
The new admin state was not correctly commited in this case.
Checks were fully disabled but the server was not marked in MAINT state.
It results with a server definitely stucked on the DOWN state.

This patch should be backported on haproxy 1.8
2017-12-06 17:01:00 +01:00
Emeric Brun
ece0c334bd BUG/MEDIUM: ssl engines: Fix async engines fds were not considered to fix fd limit automatically.
The number of async fd is computed considering the maxconn, the number
of sides using ssl and the number of engines using async mode.

This patch should be backported on haproxy 1.8
2017-12-06 14:17:41 +01:00
Willy Tarreau
473cf5d0cd BUG/MEDIUM: mworker: also close peers sockets in the master
There's a nasty case related to signaling all processes via SIGUSR1.
Since the master process still holds the peers sockets, the old process
trying to connect to the new one to teach it its tables has a risk to
connect to the master instead, which will not do anything, causing the
old process to hang instead of quitting.

This patch ensures we correctly close the peers in the master process
on startup, just like it is done for proxies. Ultimately we would rather
have a complete list of listeners to avoid such issues. But that's a bit
trickier as it would require using unbind_all() and avoiding side effects
the master could cause to other processes (like unlinking unix sockets).

To be backported to 1.8.
2017-12-06 11:14:08 +01:00
William Lallemand
c61c0b371b BUG/MINOR: ssl: support tune.ssl.cachesize 0 again
Since the split of the shctx and the ssl cache, we lost the ability to
disable the cache with tune.ssl.cachesize 0.

Worst than that, when using this configuration, haproxy segfaults during
the configuration parsing.

Must be backported to 1.8.
2017-12-04 18:48:26 +01:00
Willy Tarreau
6c71e4696b BUG/MAJOR: hpack: don't pretend large headers fit in empty table
In hpack_dht_make_room(), we try to fulfill this rule form RFC7541#4.4 :

 "It is not an error to attempt to add an entry that is larger than the
  maximum size; an attempt to add an entry larger than the maximum size
  causes the table to be emptied of all existing entries and results in
  an empty table."

Unfortunately it is not consistent with the way it's used in
hpack_dht_insert() as this last one will consider a success as a
confirmation it can copy the header into the table, and a failure as
an indexing error. This results in the two following issues :
  - if a client sends too large a header into an empty table, this
    header may overflow the table. Fortunately, most clients send
    small headers like :authority first, and never mark headers that
    don't fit into the table as indexable since it is counter-productive ;

  - if a client sends too large a header into a populated table, the
    operation fails after the table is totally flushed and the request
    is not processed.

This patch fixes the two issues at once :
  - a header not fitting into an empty table is always a sign that it
    will never fit ;
  - not fitting into the table is not an error

Thanks to Yves Lafon for reporting detailed traces demonstrating this
issue. This fix must be backported to 1.8.
2017-12-04 18:06:51 +01:00
Christopher Faulet
fd608dd2d2 BUG/MINOR: action: Don't check http capture rules when no id is defined
This is a regression in the commit 29730ba5 ("MINOR: action: Add a functions to
check http capture rules"). We must check the capture id only when an id is
defined.

This patch must be backported in 1.8.
2017-12-04 10:39:56 +01:00
Willy Tarreau
7912781a30 BUG/MINOR: h2: use the H2_F_DATA_* macros for DATA frames
A typo resulted in H2_F_HEADERS_* being used there, but it's harmless
as they are equal. Better fix the confusion though.

Should be backported to 1.8.
2017-12-03 21:09:38 +01:00
Willy Tarreau
637f64d565 BUG/MEDIUM: h2: do not accept upper case letters in request header names
This is explicitly forbidden by 7540#8.1.2, and may be used to bypass
some of the other filters, so they must be blocked early. It removes
another issue reported by h2spec.

To backport to 1.8.
2017-12-03 21:09:38 +01:00
Willy Tarreau
fe7c356be6 BUG/MEDIUM: h2: remove connection-specific headers from request
h2spec rightfully outlines that we used not to reject these ones, and
they may cause trouble if presented, especially "upgrade".

Must be backported to 1.8.
2017-12-03 21:09:18 +01:00
Willy Tarreau
520886990f BUG/MINOR: h2: reject response pseudo-headers from requests
At the moment there's only ":status". Let's block it early when parsing
the request. Otherwise it would be blocked by the HTTP/1 code anyway.
This silences another h2spec issue.

To backport to 1.8.
2017-12-03 21:08:43 +01:00
Willy Tarreau
92153fccd3 BUG/MINOR: h2: properly check PRIORITY frames
We don't use them right now but it's better to ensure they're properly
checked. This removes another 3 warnings in h2spec.

To backport to 1.8.
2017-12-03 21:08:43 +01:00
Willy Tarreau
18b86cd074 BUG/MINOR: h2: reject incorrect stream dependencies on HEADERS frame
We currently don't use stream dependencies, but as reported by h2spec,
the spec requires that we reject streams that depend on themselves in
HEADERS frames.

To backport to 1.8.
2017-12-03 21:08:42 +01:00
Willy Tarreau
1b38b46ab7 BUG/MINOR: h2: do not accept SETTINGS_ENABLE_PUSH other than 0 or 1
We don't use yet it but for correctness, let's enforce the check.

To backport to 1.8.
2017-12-03 21:08:42 +01:00
Willy Tarreau
497456154e BUG/MEDIUM: h2: enforce the per-connection stream limit
h2spec reports that we unfortunately didn't enforce the per-connection
stream limit that we advertise. It's important to ensure it's never
crossed otherwise it's cheap for a client to create many streams. This
requires the addition of a stream count. The h2c struct could be cleaned
up a bit, just like the h2_detach() function where an "if" block doesn't
make sense anymore since it's always true.

To backport to 1.8.
2017-12-03 21:08:42 +01:00
Willy Tarreau
d8d2ac75e8 BUG/MINOR: h2: the TE header if present may only contain trailers
h2spec reports this issue which has no side effect for now, but is
better cleared.

To backport to 1.8.
2017-12-03 21:08:42 +01:00
Willy Tarreau
68ed64148a BUG/MINOR: h2: fix a typo causing PING/ACK to be responded to
The ACK flag was tested on the frame type instead of the frame flag.

To backport to 1.8.
2017-12-03 21:08:41 +01:00
Willy Tarreau
cd4fe17a26 BUG/MINOR: h2: ":path" must not be empty
As reported by h2spec, the h2->h1 gateway doesn't verify that ":path"
is not empty. This is harmless since the H1 parser will reject such a
request, but better fix it anyway.

To backport to 1.8.
2017-12-03 21:08:41 +01:00
Willy Tarreau
9470d2cd35 BUG/MINOR: h2: try to abort closed streams as soon as possible
The purpose here is to be able to signal receipt of RST_STREAM to
streams when they start to provide a response so that the response
can be aborted ASAP. Given that RST_STREAM immediately switches the
stream to the CLOSED state, we must check for CLOSED in addition to
the existing ERROR check.

To be backported to 1.8.
2017-12-03 21:08:41 +01:00
Willy Tarreau
11cc2d6031 BUG/MINOR: h2: immediately close if receiving GOAWAY after the last stream
The h2spec test suite reveals that a GOAWAY frame received after the
last stream doesn't cause an immediate close, because we count on the
last stream to quit to do so. By simply setting the last_sid to the
received value in case it was not set, we can ensure to properly close
an idle connection during h2_wake().

To be backported to 1.8.
2017-12-03 21:08:40 +01:00
Willy Tarreau
811ad12414 BUG/MAJOR: h2: correctly check the request length when building an H1 request
Due to a typo in the request maximum length calculation, we count the
request path twice instead of counting it added to the method's length.
This has two effects, the first one being that a path cannot be larger
than half a buffer, and the second being that the method's length isn't
properly checked. Due to the way the temporary buffers are used internally,
it is quite difficult to meet this condition. In practice, the only
situation where this can cause a problem is when exactly one of either
the method or the path are compressed and the other ones is sent as a
literal.

Thanks to Yves Lafon for providing useful traces exhibiting this issue.

To be backported to 1.8.
2017-12-03 21:08:40 +01:00
Willy Tarreau
c611e6681b BUG/MINOR: hpack: dynamic table size updates are only allowed before headers
h2spec reports that we used to support a dynamic table size update
anywhere in the header block but it's only allowed before other
headers (cf RFC7541#4.2.1). In practice we don't use these for now
since we only use literals in responses.

To backport to 1.8.
2017-12-03 21:08:40 +01:00
Willy Tarreau
d85ba4e092 BUG/MINOR: hpack: reject invalid header index
If the hpack decoder sees an invalid header index, it emits value
"### ERR ###" that was used during debugging instead of rejecting the
block. This is harmless, and was detected by h2spec.

To backport to 1.8.
2017-12-03 21:08:39 +01:00
Willy Tarreau
4235d18214 BUG/MINOR: hpack: must reject huffman literals padded with more than 7 bits
h2spec reported that we didn't check that no more than 7 bits of padding
were left after decoding an huffman-encoded literal. This is harmless but
better fix it now.

To backport to 1.8.
2017-12-03 21:08:39 +01:00
Willy Tarreau
9e28f459b4 BUG/MINOR: hpack: fix debugging output of pseudo header names
When a pseudo header is used, name.ptr is NULL and we must replace it
with hpack_idx_to_name(). This only affects code built with DEBUG_HPACK.

To be backported to 1.8.
2017-12-03 09:43:38 +01:00
Olivier Houchard
6377a0004f BUG/MEDIUM: checks: Be sure we have a mux if we created a cs.
In connect_conn_chk(), there were one case we could return with a new
conn_stream created, but no mux attached. With no mux, cs_destroy() would
segfault. Fix that by setting the mux before we can fail.

This should be backported to 1.8.
2017-12-03 08:13:22 +01:00
Christopher Faulet
81991d3285 BUG/MAJOR: thread: Be sure to request a sync between threads only once at a time
The first thread requesting a synchronization is responsible to write in the
"sync" pipe to notify all others. But we must write only once in the pipe
between two synchronizations to have exactly one character in the pipe. It is
important because we only read 1 character in return when the last thread exits
from the sync-point.

Here there is a bug. If two threads request a synchronization, only the first
writes in the pipe. But, if the same thread requests several times a
synchronization before entering in the sync-point (because, for instance, it
detects many servers down), it writes as many as characters in the pipe. And
only one of them will be read. Repeating this bug many times will block HAProxy
on the write because the pipe is full.

To fix the bug, we just check if the current thread has already requested a
synchronization before trying to notify all others.

The patch must be backported in 1.8
2017-12-02 14:31:01 +01:00
Olivier Houchard
829aa24459 MINOR: threads: Fix pthread_setaffinity_np on FreeBSD.
As with the call to cpuset_setaffinity(), FreeBSD expects the argument to
pthread_setaffinity_np() to be a cpuset_t, not an unsigned long, so the call
was silently failing.

This should probably be backported to 1.8.
2017-12-02 14:23:12 +01:00
PiBa-NL
baf6ea4bd5 BUG/MINOR: mworker: detach from tty when in daemon mode
This allows a calling script to show the first startup output and
know when to stop reading from stdout so haproxy can daemonize.

To be backpored to 1.8.
2017-12-02 14:13:40 +01:00
PiBa-NL
4763ffdf04 BUG/MINOR: mworker: fix validity check for the pipe FDs
Check if master-worker pipe getenv succeeded, also allow pipe fd 0 as
valid. On FreeBSD in quiet mode the stdin/stdout/stderr are closed
which lets the mworker_pipe to use fd 0 and fd 1. Additionally exit()
upon failure to create or get the master-worker pipe.

This needs to be backported to 1.8.
2017-12-02 13:24:47 +01:00
Willy Tarreau
721d8e0286 MINOR: config: report when "monitor fail" rules are misplaced
"monitor-uri" may rely on "monitor fail" rules, which are processed
very early, immediately after the HTTP request is parsed and before
any http rulesets. It's not reported by the config parser when this
ruleset is misplaces, causing some configurations not to work like
users would expect. Let's just add the warning for a misplaced rule.
2017-12-01 18:25:08 +01:00
David Carlier
7e351eefe5 BUILD/MINOR: haproxy: compiling config cpu parsing handling when needed
parse_cpu_set is only relevant where there is cpu affinity,
avoiding in the process compilation warning as well.
2017-12-01 16:05:35 +01:00
Emeric Brun
088c9b73ca BUG/MAJOR: thread/peers: fix deadlock on peers sync.
Table lock was not released on an error path (if there is no
enough room to write table switch message).

[wt: needs to be backported to 1.8]
2017-12-01 15:06:43 +01:00
Emeric Brun
0fed0b0a38 BUG/MEDIUM: peers: fix some track counter rules dont register entries for sync.
This BUG was introduced with:
'MEDIUM: threads/stick-tables: handle multithreads on stick tables'

The API was reviewed to handle stick table entry updates
asynchronously and the caller must now call a 'stkable_touch_*'
function each time the content of an entry is modified to
register the entry to be synced.

There was missing call to stktable_touch_* resulting in
not propagated entries to remote peers (or local one during reload)
2017-11-29 19:16:22 +01:00
Willy Tarreau
872855998b BUG/MEDIUM: h2: don't report an error after parsing a 100-continue response
Yves Lafon reported a breakage with 100-continue. In fact the problem
is caused when an 1xx is the last response in the buffer (which commonly
is the case). We loop back immediately into the parser with what remains
of the input buffer (ie: nothing), while it is not expected to be called
with an empty response, so it fails.

Let's simply get back to the caller to decide whether or not more data
are expected to be sent.

This fix needs to be backported to 1.8.
2017-11-29 15:41:32 +01:00
Willy Tarreau
cea8537efd BUG/MEDIUM: threads/peers: decrement, not increment jobs on quitting
Commit 8d8aa0d ("MEDIUM: threads/listeners: Make listeners thread-safe")
mistakenly placed HA_ATOMIC_ADD(job, 1) to replace a job--, so it maintains
the job count too high preventing the process from cleanly exiting on
reload.

This needs to be backported to 1.8.
2017-11-29 14:51:20 +01:00
Emmanuel Hocdet
cebd7962e2 BUG/MINOR: ssl: CO_FL_EARLY_DATA removal is managed by stream
Manage BoringSSL early_data as it is with openssl 1.1.1.
2017-11-29 14:34:47 +01:00
David Carlier
6d5c841d24 BUILD/MINOR: haproxy : FreeBSD/cpu affinity needs pthread_np header
for pthread_*_np calls, pthread_np.h is needed under FreeBSD.
2017-11-29 14:30:38 +01:00
Willy Tarreau
5bcfd56519 BUG/MEDIUM: stream: fix session leak on applet-initiated connections
Commit 3e13cba ("MEDIUM: session: make use of the connection's destroy
callback") ensured that connections could be autonomous to destroy the
session they initiated, but it didn't take care of doing the same for
applets. Such applets are used for peers, Lua and SPOE outgoing
connections. In this case, once the stream ends, it closes everything
and nothing takes care of releasing the session. The problem is not
immediately obvious since the only visible effect is that older
processes will not quit on reload after having leaked one such session.

For now we check in stream_free() if the session's origin is the applet
we're releasing, and then free the session as well. Something more
uniform should probably be done once we manage to unify applets and
connections a bit more.

This fix needs to be backported to 1.8. Thanks to Emmanuel Hocdet for
reporting the problem.
2017-11-29 14:26:11 +01:00
William Lallemand
bcd9101a66 BUG/MEDIUM: cache: bad computation of the remaining size
The cache was not setting the hdrs_len to zero when we are called
in the http_forward_data with headers + body.

The consequence is to always try to store a size - the size of headers,
during the calls to http_forward_data even when it has already forwarded
the headers.

Thanks to Cyril Bont for reporting this bug.

Must be backported to 1.8.
2017-11-28 12:06:06 +01:00
William Lallemand
c3cd35f96c BUG/MEDIUM: ssl: don't allocate shctx several time
The shctx_init() function does not check anymore if the pointer is not
NULL, this check must be done is the caller.

The consequence was to allocate one shctx per ssl bind.

Bug introduced by 4f45bb9 ("MEDIUM: shctx: separate ssl and shctx")

Thanks to Maciej Zdeb for reporting this bug.

Must be backported to 1.8.
2017-11-28 12:04:16 +01:00
Christopher Faulet
b61028549e BUG/MEDIUM: tcp-check: Don't lock the server in tcpcheck_main
There was a deadlock in tcpcheck_main function. The server's lock was already
acquired by the caller (process_chk_conn or wake_srv_chk).

This patch must be backported in 1.8.
2017-11-28 10:43:23 +01:00
David Carlier
e78915a47a BUILD/MINOR: deviceatlas: enable thread support
DeviceAtlas detection being multi-thread safe, we enable the
new thread feature support.
Needs to be backported to 1.8 branch.
2017-11-27 14:22:21 +01:00
Olivier Houchard
ba8e8c3518 BUG/MEDIUM: kqueue: Don't bother closing the kqueue after fork.
kqueue fd's are not shared with children after fork(), so the children
don't have to close them, and it may in fact be dangerous, because we may
end up closing a totally unrelated fd.

[wt: to be backported to 1.8 where master-worker broke on this, and
 likely to older versions for completeness]
2017-11-26 20:10:58 +01:00
Willy Tarreau
103e5663c8 BUG/MAJOR: threads/queue: avoid recursive locking in pendconn_get_next_strm()
pendconn_get_next_strm() is called from process_srv_queue() under the
server lock, and calls stream_add_srv_conn() with this lock held, while
the latter tries to take it again. This results in a deadlock when
a server's maxconn is reached and haproxy is built with thread support.
2017-11-26 18:50:30 +01:00
Willy Tarreau
fd5efb5936 CLEANUP: cache: more efficiently pack the struct cache
By having the cache id on 33 bytes as the first member, it was
creating a hole and forcing the "hot" remaining part to be split
across two cache lines. Let's move the id at the end as it's used
only during config parsing.
2017-11-26 11:10:53 +01:00
Willy Tarreau
b6a2f58993 MINOR: buffers: cache-align buffer_wq_lock
This lock is highly stressed, avoid cache-line sharing to limit stress.
2017-11-26 11:10:51 +01:00
Willy Tarreau
a24d1d0be4 MINOR: task: align the rq and wq locks
We really don't want them to share the same cache line as they are
expected to be used in parallel. Adding a 64-byte alignment here shows
a performance increase of about 4.5% on task-intensive workloads with
2 to 4 threads.
2017-11-26 11:10:51 +01:00
Willy Tarreau
6d1222ce73 MINOR: task: keep a pointer to the currently running task
Very often when debugging, the current task's pointer isn't easy to
recover (eg: from a core file). Let's keep a copy of it, it will
likely help, especially with threads.
2017-11-26 11:10:50 +01:00
William Lallemand
4cfede87a3 MAJOR: mworker: exits the master on failure
This patch changes the behavior of the master during the exit of a
worker.

When a worker exits with an error code, for example in the case of a
segfault, all workers are now killed and the master leaves.

If you don't want this behavior you can use the option
"master-worker no-exit-on-failure".
2017-11-24 22:48:27 +01:00
William Lallemand
49b4453b58 MEDIUM: cache: max-age configuration keyword
Add a configuration keyword to change the max-age.
The default one is still 60s.
2017-11-24 19:31:01 +01:00
William Lallemand
a71cd1d407 MINOR: cache: replace a fprint() by an abort()
In the applet I/O handler we can never get an object bigger than a
buffer, so we should never reach this case.
2017-11-24 19:00:07 +01:00
Willy Tarreau
bafbe01028 CLEANUP: pools: rename all pool functions and pointers to remove this "2"
During the migration to the second version of the pools, the new
functions and pool pointers were all called "pool_something2()" and
"pool2_something". Now there's no more pool v1 code and it's a real
pain to still have to deal with this. Let's clean this up now by
removing the "2" everywhere, and by renaming the pool heads
"pool_head_something".
2017-11-24 17:49:53 +01:00
Olivier Houchard
fbc74e8556 MINOR/CLEANUP: proxy: rename "proxy" to "proxies_list"
Rename the global variable "proxy" to "proxies_list".
There's been multiple proxies in haproxy for quite some time, and "proxy"
is a potential source of bugs, a number of functions have a "proxy" argument,
and some code used "proxy" when it really meant "px" or "curproxy". It worked
by pure luck, because it usually happened while parsing the config, and thus
"proxy" pointed to the currently parsed proxy, but we should probably not
rely on this.

[wt: some of these are definitely fixes that are worth backporting]
2017-11-24 17:21:27 +01:00
Christopher Faulet
767a84bcc0 CLEANUP: log: Rename Alert/Warning in ha_alert/ha_warning 2017-11-24 17:19:12 +01:00
Christopher Faulet
56803b1c98 CLEANUP: debug: Use DPRINTF instead of fprintf into #ifdef DEBUG_FULL/#endif 2017-11-24 17:19:03 +01:00
Christopher Faulet
165f07e7b4 MEDIUM: listener: Bind listeners on a thread subset if specified
If a "process" option with a thread set is used on the bind line, we use the
corresponding bitmask when the listener's FD is created.
2017-11-24 15:38:50 +01:00
Christopher Faulet
c644fa9bf5 MINOR: config: Add threads support for "process" option on "bind" lines
It is now possible on a "bind" line (or a "stats socket" line) to specify the
thread set allowed to process listener's connections. For instance:

    # HTTPS connections will be processed by all threads but the first and HTTP
    # connection will be processed on the first thread.
    bind *:80 process 1/1
    bind *:443 ssl crt mycert.pem process 1/2-
2017-11-24 15:38:50 +01:00
Christopher Faulet
cb6a94510d MINOR: config: Add the threads support in cpu-map directive
Now, it is possible to bind CPU at the thread level instead of the process level
by defining a thread set in "cpu-map" directives. Thus, its format is now:

  cpu-map [auto:]<process-set>[/<thread-set>] <cpu-set>...

where <process-set> and <thread-set> must follow the format:

  all | odd | even | number[-[number]]

Having a process range and a thread range in same time with the "auto:" prefix
is not supported. Only one range is supported, the other one must be a fixed
number. But it is allowed when there is no "auto:" prefix.

Because it is possible to define a mapping for a process and another for a
thread on this process, threads will be bound on the intersection of their
mapping and the one of the process on which they are attached. If the
intersection is null, no specific binding will be set for the threads.
2017-11-24 15:38:50 +01:00
Christopher Faulet
11da456e77 MINOR:: config: Remove thread-map directive
It was a temporary directive used for development purpose. Now, CPU mapping for
at the thread level should be done using the cpu-map directive. This feature
will be added in a next commit.
2017-11-24 15:38:50 +01:00
Christopher Faulet
ff4121f741 MINOR: config: Support partial ranges in cpu-map directive
Now, processa and CPU ranges can be partially defined. The higher bound can be
omitted. In such case, it is replaced by the corresponding maximum value, 32 or
64 depending on the machine's word size.

By extension, It is also true for the "bind-process" directive and "process"
parameter on a "bind" or a "stats socket" line.
2017-11-24 15:38:50 +01:00
Christopher Faulet
26028f6209 MINOR: config: Add auto-increment feature for cpu-map
The prefix "auto:" can be added before the process set to let HAProxy
automatically bind a process to a CPU by incrementing process and CPU sets. To
be valid, both sets must have the same size. No matter the declaration order of
the CPU sets, it will be bound from the lower to the higher bound.

  Examples:
      # all these lines bind the process 1 to the cpu 0, the process 2 to cpu 1
      #  and so on.
      cpu-map auto:1-4   0-3
      cpu-map auto:1-4   0-1 2-3
      cpu-map auto:1-4   3 2 1 0

      # bind each process to exaclty one CPU using all/odd/even keyword
      cpu-map auto:all   0-63
      cpu-map auto:even  0-31
      cpu-map auto:odd   32-63

      # invalid cpu-map because process and CPU sets have different sizes.
      cpu-map auto:1-4   0    # invalid
      cpu-map auto:1     0-3  # invalid
2017-11-24 15:38:49 +01:00
Christopher Faulet
f1f0c5f591 MINOR: config: Export parse_process_number and use it wherever it's applicable
This function is used when "bind-process" directive is parsed and when "process"
parameter on a "bind" or a "stats socket" line is parsed.
2017-11-24 15:38:49 +01:00
Christopher Faulet
5ab51775e7 MINOR: config: Slightly change how parse_process_number works
Now, this function returns a status code to indicate a success (0) or a failure
(1) and the error message in set in <err> parameter. And the result of the parsing
is set in <proc> parameter.
2017-11-24 15:38:49 +01:00
Christopher Faulet
1dcb9cb81c MINOR: config: Support a range to specify processes in "cpu-map" parameter
Now, you can define processes concerned by a cpu-map line using a range. For
instance, the following line binds the first 32 processes on CPUs 0 to 3:

  cpu-map 1-32 0-3
2017-11-24 15:38:49 +01:00
Christopher Faulet
15eb3a9a08 BUG/MINOR: listener: Allow multiple "process" options on "bind" lines
The documentation specifies that you can have several "process" options to
define several ranges on "bind" lines (or "stats socket" lines). It is uncommon,
but it should be possible. So the bind_proc bitmask in bind_conf structure must
not be overwritten at each new "process" option parsed.

This bug also exists in 1.7, 1.6 and 1.5. So it may be backported. But no one
seems to have noticed it, so it was probably never hitted.
2017-11-24 15:38:49 +01:00
William Lallemand
ecb73b12c1 MINOR: cache: move the refcount decrease in the applet release
Move the refcount decrease of the cache in the release callback of the
applet. We don't need to decrease it in the applet code.
2017-11-24 15:04:36 +01:00
William Lallemand
49dc048c25 BUG/MEDIUM: cache: free ressources in chn_end_analyze
Upon an aborted HTTP connection, or an error, the filter cache does not
decrement the refcount and does not free the allocated ressources.
2017-11-24 15:04:36 +01:00
Willy Tarreau
0542c8b39a BUG/MEDIUM: stream: always release the stream-interface on abort
The cache exhibited a but in process_stream() where upon abort it is
possible to switch the stream-int's state to SI_ST_CLO without calling
si_release_endpoint(), resulting in a possibly missing ->release() for
the applet.

It should affect all other applets as well (eg: lua, spoe, peers) and
should carefully be backported to stable branches after some observation
period.
2017-11-24 15:04:36 +01:00
Emmanuel Hocdet
ca6a957c5d MINOR: ssl: Handle early data with BoringSSL
BoringSSL early data differ from OpenSSL 1.1.1 implementation. When early
handshake is done, SSL_in_early_data report if SSL_read will be done on early
data. CO_FL_EARLY_SSL_HS and CO_FL_EARLY_DATA can be adjust accordingly.
2017-11-24 13:50:02 +01:00
Willy Tarreau
45a66ccc55 MEDIUM: config: ensure that tune.bufsize is at least 16384 when using HTTP/2
HTTP/2 mandates the support of 16384 bytes frames by default, so we need
a large enough buffer to process them. Till now if tune.bufsize was too
small, H2 connections were simply rejected during their establishment,
making it quite hard to troubleshoot the issue.

Now we detect when HTTP/2 is enabled on an HTTP frontend and emit an
error if tune.bufsize is not large enough, with the appropriate
recommendation.
2017-11-24 11:28:00 +01:00
Willy Tarreau
599391a7c2 MINOR: h2: make use of client-fin timeout after GOAWAY
At the moment, the "client" timeout is used on an HTTP/2 connection once
it's idle with no active stream. With this patch, this timeout is replaced
by client-fin once a GOAWAY frame is sent. This closely matches what is
done on HTTP/1 since the principle is the same, as it indicates a willing
ness to quickly close a connection on which we don't expect to see anything
anymore.
2017-11-24 10:16:00 +01:00
Willy Tarreau
a76e4c2183 MEDIUM: h2: don't gracefully close the connection anymore on Connection: close
As reported by Lukas, it causes more harm than good, for example on
prompt for authentication. Now we have an "http-request reject" rule
to use instead of "http-request deny" if we absolutely want to close
the connection.
2017-11-24 08:17:28 +01:00
Willy Tarreau
90c3232e54 MINOR: h2: send RST_STREAM before GOAWAY on reject
Apparently the h2c client has trouble reading the RST_STREAM frame after
a GOAWAY was sent, so it's likely that other clients may face the same
difficulty. Curl and Firefox don't care about this ordering, so let's
send it first.
2017-11-24 08:00:30 +01:00
Willy Tarreau
53275e8b02 MINOR: http: implement the "http-request reject" rule
This one acts similarly to its tcp-request counterpart. It immediately
closes the request without emitting any response. It can be suitable in
certain DoS conditions, as well as to close an HTTP/2 connection.
2017-11-24 07:52:01 +01:00
William Lallemand
f528fff46b MEDIUM: cache: store sha1 for hashing the cache key
The cache was relying on the txn->uri for creating its key, which was a
big problem when there was no log activated.

This patch does a sha1 of the host + uri, and stores it in the txn.
When a object is stored, the eb32node uses the first 32 bits of the hash
as a key, and the whole hash is stored in the cache entry.

During a lookup, the truncated hash is used, and when it matches an
entry we check the real sha1.
2017-11-23 20:20:04 +01:00
Olivier Houchard
7fc96d5a01 MINOR: mux: Make sure every string is woken up after the handshake.
In case any stream was waiting for the handshake after receiving early data,
we have to wake all of them. Do so by making the mux responsible for
removing the CO_FL_EARLY_DATA flag after all of them are woken up, instead
of doing it in si_cs_wake_cb(), which would then only work for the first one.
This makes wait_for_handshake work with HTTP/2.
2017-11-23 19:35:42 +01:00
Olivier Houchard
90084a133d MINOR: ssl: Handle reading early data after writing better.
It can happen that we want to read early data, write some, and then continue
reading them.
To do so, we can't reuse tmp_early_data to store the amount of data sent,
so introduce a new member.
If we read early data, then ssl_sock_to_buf() is now the only responsible
for getting back to the handshake, to make sure we don't miss any early data.
2017-11-23 19:35:28 +01:00
Willy Tarreau
51753458c4 BUG/MAJOR: threads/task: dequeue expired tasks under the WQ lock
There is a small unprotected window for a task between the wait queue
and the run queue where a task could be woken up and destroyed at the
same time. What typically happens is that a timeout is reached at the
same time an I/O completes and wakes it up, and the I/O terminates the
task, causing a use after free in wake_expired_tasks() possibly causing
a crash and/or memory corruption :

       thread 1                             thread 2
  (wake_expired_tasks)                (stream_int_notify)

 HA_SPIN_UNLOCK(TASK_WQ_LOCK, &wq_lock);
                              task_wakeup(task, TASK_WOKEN_IO);
                              ...
                              process_stream()
                                stream_free()
                                   task_free()
                                      pool_free(task)
 task_wakeup(task, TASK_WOKEN_TIMER);

This case is reasonably easy to reproduce with a config using very short
server timeouts (100ms) and client timeouts (10ms), while injecting on
httpterm requesting medium sized objects (5kB) over SSL. All this is
easier done with more threads than allocated CPUs so that pauses can
happen anywhere and last long enough for process_stream() to kill the
task.

This patch inverts the lock and the wakeup(), but requires some changes
in process_runnable_tasks() to ensure we never try to grab the WQ lock
while having the RQ lock held. This means we have to release the RQ lock
before calling task_queue(), so we can't hold the RQ lock during the
loop and must take and drop it.

It seems that a different approach with the scope-aware trees could be
easier, but it would possibly not cover situations where a task is
allowed to run on multiple threads. The current solution covers it and
doesn't seem to have any measurable performance impact.
2017-11-23 18:47:04 +01:00
Willy Tarreau
541dd82879 BUG/MAJOR: h2: always remove a stream from the send list before freeing it
When a stream is aborted on timeout or any reason initiated by the stream,
and this stream was subscribed to the send list, we forgot to detach it
when freeing it, resulting in a dead node remaining present in the send
list with all usual funny consequences (memory corruption, crashes, etc).
Let's simply unconditionally delete the stream.
2017-11-23 18:12:50 +01:00
Willy Tarreau
ee8269e84d BUG/MINOR: stream: fix tv_request calculation for applets
When the stats code was moved to an applet, it wasn't completely
cleaned of its usage of the HTTP transaction and it used to store
the HTTP status in txn->status and to set the HTTP request date to
<now> from within the applet. This is totally wrong because the
applet is seen as a server from the HTTP engine, which parses its
response, so the http_txn must not be touched there.

This was made visible by the cache which would always exhibit a
negative TR log, indicating that nowhere in the code we took care of
setting s->logs.tv_request while the code above used to continue to
hide this. Another side effect of this issue is that under load, if
the stats applet call risks to be delayed, the reported t_queue can
appear negative by being below tv_request-tv_accept.

This patch removes the assignment of tv_request and txn->status from
the applet code and instead sets the tv_request if still unset when
connecting to the applet. This ensures that all applets report correct
request timers now.
2017-11-23 17:34:29 +01:00
Christopher Faulet
ff3a41eb3f BUG/MINOR: Use crt_base instead of ca_base when crt is parsed on a server line
In srv_parse_crt, crt_base was checked but ca_base was used to build the
certifacte path.

This patch must be backported in 1.7, 1.6 and 1.5.
2017-11-23 16:34:10 +01:00
Christopher Faulet
34adb2af96 MINOR: sample: Add "thread" sample fetch
It returns id of the thread calling the function.
2017-11-23 16:33:13 +01:00
Willy Tarreau
9fefc51c56 BUG/MEDIUM: threads/time: maintain a common time reference between all threads
During high loads it becomes visible that the time drifts between threads,
sometimes showing tens of seconds after several minutes. The root cause is
the per-thread correction which is performed based on a local offset and
local time. But we can't use a unique global time either as we need the
thread-local time to be stable between two poll() calls.

This commit takes a stab at this problem by proceeding this way :

  - a global "global_now" date is monotonous and common between all threads.
  - each thread has its own local <now> which is resynced with <global_now>
    on each invocation of tv_update_date()
  - each thread detects its own drift based on its poll() timeout and its
    local <now>, and recalculates its adjusted local time
  - each thread then ensures its new local time is no older than the current
    global time, otherwise it readjusts its local time to match this one
  - finally threads do atomically update the global time to match its own
    local one

This guarantees a monotonous global time and a monotonous+stable local time.

It is still possible by definition for two threads to report a minor time
variation on subsequent events but that variation will only be caused by
the moment they watched the time and are very small. When a common global
time is needed between all threads, global_now could be used as a reference
(with care). The wallclock time used in logs is still <date> anyway.
2017-11-23 16:32:32 +01:00
Willy Tarreau
7649aacf7f BUG/MEDIUM: threads/time: fix time drift correction
With threads, it became mandatory to implement a thread-local time with
its own correction. However, it was noticed that during high thread
contention, the time correction could occasionally be wrong, reporting
huge negative or positive timers in logs. This was caused by the
conversion between struct timeval and a single 64-bit offset, due to
an erroneous shift and due to a loss of sign during the conversion.

Given that time_t is not always signed, and that timeval is not really
needed here, better avoid playing dangerous games with these operations
and use a single 64-bit offset representing a signed 32-bit offset, for
the seconds part and an unsigned offset for the microsecond part.
It still supports atomic updates and doesn't cause issues anymore.
2017-11-23 16:32:32 +01:00
Willy Tarreau
f13322ede1 MINOR: pools: prepare functions to override malloc/free in pools
This will be useful to add some debugging capabilities. For now it
changes nothing.
2017-11-22 19:27:44 +01:00
Olivier Houchard
424ecfb33c MINOR: ssl: Don't disable early data handling if we could not write.
If we can't write early data, for some reason, don't give up on reading them,
they may still be early data to be read, and if we don't do so, openssl
internal states might be inconsistent, and the handshake will fail.
2017-11-22 19:27:14 +01:00
Olivier Houchard
777e4b98a3 BUG/MINOR: ssl: Always start the handshake if we can't send early data.
The current code only tries to do the handshake in case we can't send early
data if we're acting as a client, which is wrong, it has to be done on the
server side too, or we end up in an infinite loop.
2017-11-22 19:27:09 +01:00
Willy Tarreau
1f89b1805b BUG/MEDIUM: deinit: correctly deinitialize the proxy and global listener tasks
While using mmap() to allocate pools for debugging purposes, kill -USR1 caused
libc aborts in deinit() on two calls to free() on proxies' tasks and the global
listener task. The issue comes from the fact that we're using free() to release
a task instead of task_free(), so the task was allocated from a pool and released
using a different method.

This bug has been there since at least 1.5, so a backport is desirable to all
maintained versions.
2017-11-22 16:57:05 +01:00
William Lallemand
e899af89b5 BUG/MEDIUM: cache fix cli_kws structure
The cli_kws structure was not ended and was causing undefined behavior.
2017-11-22 16:56:58 +01:00
William Lallemand
55e7674bc4 BUG/MEDIUM: cache: refcount forbids to free the objects
Some refcount decrementation were forgotten and they were forbidding to
reuse the objects in some cases.
2017-11-22 15:13:54 +01:00
William Lallemand
0872766e31 BUG/MEDIUM: cache: use key=0 as a condition for freeing
The cache was trying to remove objects from the tree while they were
already removed from it. We set the key to 0 as a check for not trying
to remove the object from the tree when we are still using the object.
2017-11-22 15:13:54 +01:00
William Lallemand
1f49a366fd MEDIUM: cache: "show cache" on the cli
The cli command "show cache" displays the status of the cache, the first
displayed line is the shctx informations with how much blocks available
blocks it contains (blocks are 1k by default).

The next lines are the objects stored in the cache tree, the pointer,
the size of the object and how much blocks it uses, a refcount for the
number of users of the object, and the remaining expiration time (which
can be negative if expired)

Example:

    $ echo "show cache" | socat - /run/haproxy.sock
    0x7fa54e9ab03a: foobar (shctx:0x7fa54e9ab000, available blocks:3921)
    0x7fa54ed65b8c (size: 43190 (43 blocks), refcount:2, expire: 2)
    0x7fa54ecf1b4c (size: 45238 (45 blocks), refcount:0, expire: 2)
    0x7fa54ed70cec (size: 61622 (61 blocks), refcount:0, expire: 2)
    0x7fa54ecdbcac (size: 42166 (42 blocks), refcount:1, expire: 2)
    0x7fa54ec9736c (size: 44214 (44 blocks), refcount:2, expire: 2)
    0x7fa54eca28ec (size: 46262 (46 blocks), refcount:2, expire: -2)
2017-11-21 21:35:04 +01:00
William Lallemand
75d93291c9 CLEANUP: cache: reorder includes 2017-11-21 21:35:04 +01:00
Lukas Tribus
f46bf95d2b BUG/MINOR: systemd: ignore daemon mode
Since we switched to notify mode in the systemd unit file in commit
d6942c8, haproxy won't start if the daemon keyword is present in the
configuration.

This change makes sure that haproxy remains in foreground when using
systemd mode and adds a note in the documentation.
2017-11-21 21:21:35 +01:00
Willy Tarreau
2fb986ccb8 BUG/MEDIUM: h2: always reassemble the Cookie request header field
The special case of the Cookie header field was overlooked in the
implementation, considering that most servers do handle cookie lists,
but as reported here on discourse it's not the case at all :

  https://discourse.haproxy.org/t/h2-cookie-header-splitted-header/1742

This patch fixes this by skipping all occurences of the Cookie header
in the request while building the H1 request, and then building a single
Cookie header with all values appended at once, according to what is
requested in RFC7540#8.1.2.5.

In order to build the list of values, the list struct is used as a linked
list (as there can't be more cookies than headers). This makes the list
walking quite efficient and ensures all values are quickly found without
having to rescan the list.

A test case provided by Lukas shows that it properly works :

 > GET /? HTTP/1.1
 > user-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0
 > accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
 > accept-language: en-US,en;q=0.5
 > accept-encoding: gzip, deflate
 > referer: https://127.0.0.1:4443/?expectValue=1511294406
 > host: 127.0.0.1:4443

 < HTTP/1.1 200 OK
 < Server: nginx
 < Date: Tue, 21 Nov 2017 20:00:13 GMT
 < Content-Type: text/html; charset=utf-8
 < Transfer-Encoding: chunked
 < Connection: keep-alive
 < X-Powered-By: PHP/5.3.10-1ubuntu3.26
 < Set-Cookie: HAPTESTa=1511294413
 < Set-Cookie: HAPTESTb=1511294413
 < Set-Cookie: HAPTESTc=1511294413
 < Set-Cookie: HAPTESTd=1511294413
 < Content-Encoding: gzip

 > GET /?expectValue=1511294413 HTTP/1.1
 > user-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0
 > accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
 > accept-language: en-US,en;q=0.5
 > accept-encoding: gzip, deflate
 > host: 127.0.0.1:4443
 > cookie: SERVERID=s1; HAPTESTa=1511294413; HAPTESTb=1511294413; HAPTESTc=1511294413; HAPTESTd=1511294413

Many thanks to @Nurza, @adrianw and @lukastribus for their helpful reports
and investigations here.
2017-11-21 21:13:36 +01:00
Willy Tarreau
59a10fb53d MEDIUM: h2: change hpack_decode_headers() to only provide a list of headers
The current H2 to H1 protocol conversion presents some issues which will
require to perform some processing on certain headers before writing them
so it's not possible to convert HPACK to H1 on the fly.

This commit modifies the headers decoding so that it now works in two
phases : hpack_decode_headers() only decodes the HPACK stream in the
HEADERS frame and puts the result into a list. Headers which require
storage (huffman-compressed or from the dynamic table) are stored in
a chunk allocated by the H2 demuxer. Then once the headers are properly
decoded into this list, h2_make_h1_request() is called with this list
to produce the HTTP/1.1 request into the destination buffer. The list
necessarily enforces a limit. Here we use 2*MAX_HTTP_HDR, which means
that we can have as many individual cookies as we have regular headers
if a client decides to break their cookies into multiple values. This
seams reasonable and will allow the H1 parser to decide whether it's
too much or not.

Thus the output stream is not produced on the fly anymore and this will
permit to deal with certain corner cases like reparing the Cookie header
(which for now is not done).

In order to limit header duplication and parsing, the known pseudo headers
continue to be passed by their index : the name element in the list then
has a NULL pointer and the value is the pseudo header's index. Given that
these ones represent about half of the incoming requests and need to be
found quickly, it maintains an acceptable level of performance.

The code was significantly reduced by doing this because the orignal code
had to deal with HPACK and H1 combinations (eg: index vs not indexed, etc)
and now the HPACK decoding is totally focused on the decompression, and
the H1 encoding doesn't have to deal with the issue of wrapping input for
example.

One bug was addressed here (though it couldn't happen at the moment). The
H2 demuxer used to detect a failure to write the request into the H1 buffer
and would then detect if the output buffer wraps, realign it and try again.
The problem by doing so was that the HPACK context was already modified and
not rewindable. Thus the size check is now performed first and a failure is
reported if it doesn't fit.
2017-11-21 21:13:36 +01:00
Willy Tarreau
f24ea8e45e MEDIUM: h2: add a function to emit an HTTP/1 request from a headers list
The current H2 to H1 protocol conversion presents some issues which will
require to perform some processing on certain headers before writing them
so it's not possible to convert HPACK to H1 on the fly.

Here we introduce a function which performs half of what hpack_decode_header()
used to do, which is to take a list of headers on input and emit the
corresponding request in HTTP/1.1 format. The code is the same and functions
were renamed to be prefixed with "h2" instead of "hpack", though it ends
up being simpler as the various HPACK-specific cases could be fused into
a single one (ie: add header).

Moving this part here makes a lot of sense as now this code is specific to
what is documented in HTTP/2 RFC 7540 and will be able to deal with special
cases related to H2 to H1 conversion enumerated in section 8.1.

Various error codes which were previously assigned to HPACK were never
used (aside being negative) and were all replaced by -1 with a comment
indicating what error was detected. The code could be further factored
thanks to this but this commit focuses on compatibility first.

This code is not yet used but builds fine.
2017-11-21 21:13:33 +01:00
Willy Tarreau
8f650c369d BUG/MEDIUM: h2: properly report connection errors in headers and data handlers
We used to return >0 indicating a success when an error was present on the
connection, preventing the caller from detecting and handling it. This for
example happens when sending too many headers in a frame, making the request
impossible to decompress.
2017-11-21 19:36:21 +01:00
Willy Tarreau
358847f026 BUILD: server: check->desc always exists
Clang reports this warning :

  src/server.c:872:14: warning: address of array 'check->desc' will
  always evaluate to 'true' [-Wpointer-bool-conversion]

Indeed, check->desc used to be a pointer to a dynamically allocated area
a long time ago and is now an array. Let's remove the useless test.
2017-11-20 21:33:21 +01:00
Willy Tarreau
1f09467114 BUILD: h2: mark some inlined functions "unused"
Clang complains that h2_get_n64() is not used, and a few other protocol
specific functions may fall in that category depending on how the code
evolves. Better mark them unused to silence the warning since it's on
purpose.
2017-11-20 21:27:45 +01:00
William Lallemand
eee5c39715 CLEANUP: cache: remove wrong comment 2017-11-20 19:22:27 +01:00
William Lallemand
71bd11a1f3 MEDIUM: cache: enable the HTTP analysers
Enable the same analysers as the stats applet.
Allows keepalive and termination flags to work.
2017-11-20 19:22:27 +01:00
William Lallemand
a400a3a6d0 BUG/MEDIUM: cache: free callback to remove from tree
Call the shctx free_blocks callback in order to remove the row from the
cache tree.

Put the row in the hot list during allocation, forbid the blocks to be
stolen by a free or a row_reserve
2017-11-20 19:22:27 +01:00
Tim Duesterhus
d6942c8297 MEDIUM: mworker: Add systemd Type=notify support
This patch adds support for `Type=notify` to the systemd unit.

Supporting `Type=notify` improves both starting as well as reloading
of the unit, because systemd will be let known when the action completed.

See this quote from `systemd.service(5)`:
> Note however that reloading a daemon by sending a signal (as with the
> example line above) is usually not a good choice, because this is an
> asynchronous operation and hence not suitable to order reloads of
> multiple services against each other. It is strongly recommended to
> set ExecReload= to a command that not only triggers a configuration
> reload of the daemon, but also synchronously waits for it to complete.

By making systemd aware of a reload in progress it is able to wait until
the reload actually succeeded.

This patch introduces both a new `USE_SYSTEMD` build option which controls
including the sd-daemon library as well as a `-Ws` runtime option which
runs haproxy in master-worker mode with systemd support.

When haproxy is running in master-worker mode with systemd support it will
send status messages to systemd using `sd_notify(3)` in the following cases:

- The master process forked off the worker processes (READY=1)
- The master process entered the `mworker_reload()` function (RELOADING=1)
- The master process received the SIGUSR1 or SIGTERM signal (STOPPING=1)

Change the unit file to specify `Type=notify` and replace master-worker
mode (`-W`) with master-worker mode with systemd support (`-Ws`).

Future evolutions of this feature could include making use of the `STATUS`
feature of `sd_notify()` to send information about the number of active
connections to systemd. This would require bidirectional communication
between the master and the workers and thus is left for future work.
2017-11-20 18:39:41 +01:00
Willy Tarreau
62dd698070 BUG/MINOR: stream-int: don't try to read again when CF_READ_DONTWAIT is set
Commit 9aaf778 ("MAJOR: connection : Split struct connection into struct
connection and struct conn_stream.") had to change the way the stream
interface deals with incoming data to accomodate the mux. A break
statement got lost during a change, leading to the receive call being
performed twice even when CF_READ_DONTWAIT is set. The most noticeable
effect is that it made the bug described in commit 33982cb ("BUG/MAJOR:
stream: ensure analysers are always called upon close") much easier to
reproduce as it would appear even with an HTTP frontend.

Let's just restore the stream-interface flag and the break here, as in
the previous code.

No backport is needed as this was introduced during 1.8-dev.
2017-11-20 16:13:16 +01:00
Willy Tarreau
33982cbdc0 BUG/MAJOR: stream: ensure analysers are always called upon close
A recent issue affecting HTTP/2 + redirect + cache has uncovered an old
problem affecting all existing versions regarding the way events are
reported to analysers.

It happens that when an event is reported, analysers see it and may
decide to temporarily pause processing and prevent other analysers from
processing the same event. Then the event may be cleared and upon the
next call to the analysers, some of them will never see it.

This is exactly what happens with CF_READ_NULL if it is received before
the request is processed, like during redirects : the first time, some
analysers see it, pause, then the event may be converted to a SHUTW and
cleared, and on next call, there's nothing to process. In practice it's
hard to get the CF_READ_NULL flag during the request because requests
have CF_READ_DONTWAIT, preventing the read0 from happening. But on
HTTP/2 it's presented along with any incoming request. Also on a TCP
frontend the flag is not set and it's possible to read the NULL before
the request is parsed.

This causes a problem when filters are present because flt_end_analyse
needs to be called to release allocated resources and remove the
CF_FLT_ANALYZE flag. And the loss of this event prevents the analyser
from being called and from removing itself, preventing the connection
from ever ending.

This problem just shows that the event processing needs a serious revamp
after 1.8. In the mean time we can deal with the really problematic case
which is that we *want* to call analysers if CF_SHUTW is set on any side
ad it's the last opportunity to terminate a processing. It may
occasionally result in some analysers being called for nothing in half-
closed situations but it will take care of the issue.

An example of problematic configuration triggering the bug in 1.7 is :

    frontend tcp
        bind :4445
        default_backend http

    backend http
        redirect location /
        compression algo identity

Then submitting requests which immediately close will have for effect
to accumulate streams which will never be freed :

   $ printf "GET / HTTP/1.1\r\n\r\n" >/dev/tcp/0/4445

This fix must be backported to 1.7 as well as any version where commit
c0c672a ("BUG/MINOR: http: Fix conditions to clean up a txn and to
handle the next request") was backported. This commit didn't cause the
bug but made it much more likely to happen.
2017-11-20 15:58:22 +01:00
Willy Tarreau
e223e3bc85 BUG/MEDIUM: stream: don't automatically forward connect nor close
Upon stream instanciation, we used to enable channel auto connect
and auto close to ease TCP processing. But commit 9aaf778 ("MAJOR:
connection : Split struct connection into struct connection and
struct conn_stream.") has revealed that it was a bad idea because
this commit enables reading of the trailing shutdown that may follow
a small requests, resulting in a read and a shutr turned into shutw
before the stream even has a chance to apply the filters. This
causes an issue with impossible situations where the backend stream
interface is still in SI_ST_INI with a closed output, which blocks
some streams for example when performing a redirect with filters
enabled.

Let's change this so that we only enable these two flags if there is
no analyser on the stream. This way process_stream() has a chance to
let the analysers decide whether or not to allow the shutdown event
to be transferred to the other side.

It doesn't seem possible to trigger this issue before 1.8, so for now
it is preferable not to backport this fix.
2017-11-20 15:58:22 +01:00
David Carlier
91a88b0c25 BUG/MEDIUM: deviceatlas: ignore not valuable HTTP request data
A customer reported a crash when within the HTTP request some headers
were not set leading to the module to crash. So the module ignore them
since empty data have no value for the detection.
Needs to be backported to 1.7.
2017-11-17 10:41:40 +01:00
Olivier Houchard
e9bed53486 MINOR: ssl: Make sure we don't shutw the connection before the handshake.
Instead of trying to finish the handshake in ssl_sock_shutw, which may
fail, try not to shutdown until the handshake is finished.
2017-11-16 19:04:10 +01:00
Olivier Houchard
e6060c5d87 MINOR: SSL: Store the ASN1 representation of client sessions.
Instead of storing the SSL_SESSION pointer directly in the struct server,
store the ASN1 representation, otherwise, session resumption is broken with
TLS 1.3, when multiple outgoing connections want to use the same session.
2017-11-16 19:03:32 +01:00
Christopher Faulet
f02050662b MINOR: stream: Add thread-mask of tasks/FDs/applets in "show sess all" command 2017-11-16 11:19:46 +01:00
Christopher Faulet
b4a4d9aed4 MEDIUM: applets: Don't process more than 200 active applets at once
Now, we process at most 200 active applets per call to applet_run_active. We use
the same limit as the tasks. With the cache filter and the SPOE, the number of
active applets can now be huge. So, it is important to limit the number of
applets processed in applet_run_active.
2017-11-16 11:19:46 +01:00
Christopher Faulet
7163056dc5 MAJOR: polling: Use active_appels_mask instead of applets_active_queue
applets_active_queue is the active queue size. It is a global variable. So it is
underoptimized because we may be lead to consider there are active applets for a
thread while in fact all active applets are assigned to the otherthreads. So, in
such cases, the polling loop will be evaluated many more times than necessary.

Instead, we now check if the thread id is set in the bitfield active_applets_mask.

This is specific to threads, no backport is needed.
2017-11-16 11:19:46 +01:00
Christopher Faulet
595d7b72a6 MINOR: applets: Use a bitfield to track applets activity per-thread
a bitfield has been added to know if there are runnable applets for a
thread. When an applet is woken up, the bits corresponding to its thread_mask
are set. When all active applets for a thread is get to be processed, the thread
is removed from active ones by unsetting its tid_bit from the bitfield.
2017-11-16 11:19:46 +01:00
Christopher Faulet
8a48f67526 MAJOR: polling: Use active_tasks_mask instead of tasks_run_queue
tasks_run_queue is the run queue size. It is a global variable. So it is
underoptimized because we may be lead to consider there are active tasks for a
thread while in fact all active tasks are assigned to the other threads. So, in
such cases, the polling loop will be evaluated many more times than necessary.

Instead, we now check if the thread id is set in the bitfield active_tasks_mask.

Another change has been made in process_runnable_tasks. Now, we always limit the
number of tasks processed to 200.

This is specific to threads, no backport is needed.
2017-11-16 11:19:46 +01:00
Christopher Faulet
3911ee85df MINOR: tasks: Use a bitfield to track tasks activity per-thread
a bitfield has been added to know if there are runnable tasks for a thread. When
a task is woken up, the bits corresponding to its thread_mask are set. When all
tasks for a thread have been evaluated without any wakeup, the thread is removed
from active ones by unsetting its tid_bit from the bitfield.
2017-11-16 11:19:46 +01:00
Christopher Faulet
96d4483df7 BUG/MINOR: Allocate the log buffers before the proxies startup
Since the commit cd7879adc ("BUG/MEDIUM: threads: Run the poll loop on the main
thread too"), the log buffers are allocated after the proxies startup. So log
messages produced during this startup was ignored.

To fix the bug, we restore the initialization of these buffers before proxies
startup.

This is specific to threads, no backport is needed.
2017-11-16 11:19:46 +01:00
William Lallemand
75ea0a06b0 BUG/MEDIUM: mworker: does not close inherited FD
At the end of the master initialisation, a call to protocol_unbind_all()
was made, in order to close all the FDs.

Unfortunately, this function closes the inherited FDs (fd@), upon reload
the master wasn't able to reload a configuration with those FDs.

The create_listeners() function now store a flag to specify if the fd
was inherited or not.

Replace the protocol_unbind_all() by  mworker_cleanlisteners() +
deinit_pollers()
2017-11-15 19:53:33 +01:00
William Lallemand
fade49d8fb BUG/MEDIUM: mworker: does not deinit anymore
Does not use the deinit() function during a reload, it's dangerous and
might be subject to double free, segfault and hazardous behavior if
it's called twice in the case of a execvp fail.
2017-11-15 19:53:31 +01:00
William Lallemand
2f8b31c2c6 BUG/MEDIUM: mworker: wait again for signals when execvp fail
After execvp fails, the signals were ignored, preventing to try a reload
again. It is now fixed by reaching the top of the mworker_wait()
function once the execvp failed.
2017-11-15 19:52:06 +01:00
William Lallemand
722d4ca0dd MINOR: mworker: display an accurate error when the reexec fail
When the master worker fail the execvp, it returns the wrong error
"Cannot allocate memory".

We now display the accurate error corresponding to the errno value.
2017-11-15 19:52:06 +01:00
Willy Tarreau
9c1e15d8cd MINOR: tools: emphasize the node being worked on in the tree dump
Now we can show in dotted red the node being removed or surrounded in red
a node having been inserted, and add a description on the graph related to
the operation in progress for example.
2017-11-15 19:43:05 +01:00
Willy Tarreau
6c7f4deb21 MINOR: tools: improve the DOT dump of the ebtree
Use a smaller and cleaner fixed font, use upper case to indicate sides on
branches, remove the useless node/leaf markers on branches since the colors
already indicate them, and show the node's key as it helps spot the matching
leaf.
2017-11-15 19:43:05 +01:00
Willy Tarreau
ed3cda02ae MINOR: tools: add a function to dump a scope-aware tree to a file
It emits a dump in DOT format for graphing purposes during debugging
sessions. It's convenient to dump the run queue.
2017-11-15 16:07:15 +01:00
Christopher Faulet
99bca65f53 BUG/MEDIUM: standard: itao_str/idx and quote_str/idx must be thread-local
This bug has an impact on the stats applet and easily leads to a crash of HAProxy.

This is specific to threads, no backport is needed.
2017-11-14 18:11:57 +01:00
Christopher Faulet
919b739862 CLEANUP: tasks: Remove useless double test on rq_next
No backport is needed, this is purely 1.8-specific.
2017-11-14 18:11:34 +01:00
Christopher Faulet
e9a896e09e BUG/MINOR: threads: tid_bit must be a unsigned long
This is specific to threads, no backport is needed.
2017-11-14 18:11:28 +01:00
William Lallemand
e1533f5790 MINOR: cache: disable cache if shctx_row_data_append fail
Disable the cache if the append of data failed, it should never happen
because the allocated row size is at least equal to the size of the
object to allocate.
2017-11-14 15:20:44 +01:00
William Lallemand
10935bc547 MINOR: cache: forward data with headers
Forward the remaining headers with the data in the first call of
cache_store_http_forward_data().

Previously the headers were forwarded first, and the function left,
implying an additionnal call to cache_store_http_forward_data() for the
data.

Cc: Christopher Faulet <cfaulet@haproxy.com>
2017-11-14 15:20:44 +01:00
William Lallemand
9d5f54daad BUG/MEDIUM: cache: use msg->sov to forward header
Use msg->sov to forward headers instead of msg->eoh. It can causes some
problem because eoh does not contains the last \r\n, and the filter does
not support to send the headers partially.

Cc: Christopher Faulet <cfaulet@haproxy.com>
2017-11-14 15:20:44 +01:00
Tim Duesterhus
0436ab7841 BUG/MEDIUM: mworker: Fix re-exec when haproxy is started from PATH
If haproxy is started using the name of the binary only (i.e.
not using a relative or absolute path) the `execv` in
`mworker_reload` fails with `ENOENT`, because it does not
examine the `PATH`:

  [WARNING] 315/161139 (7) : Reexecuting Master process
  [WARNING] 315/161139 (7) : Cannot allocate memory
  [WARNING] 315/161139 (7) : Failed to reexecute the master processs [7]

The error messages are misleading, because the return value of
`execv` is not checked. This should be fixed in a separate commit.

Once this happened the master process ignores any further
signals sent by the administrator.

Replace `execv` with `execvp` to establish the expected
behaviour.

This bug was introduced in commit 73b85e75b3.
2017-11-14 15:11:24 +01:00
Christopher Faulet
9dcf9b6f03 MINOR: threads: Use __decl_hathreads to declare locks
This macro should be used to declare variables or struct members depending on
the USE_THREAD compile option. It avoids the encapsulation of such declarations
between #ifdef/#endif. It is used to declare all lock variables.
2017-11-13 11:38:17 +01:00
Christopher Faulet
600d37edda BUG/MINOR: spoe: check buffer size before acquiring or releasing it
In spoe_acquire_buffer and spoe_release_buffer, instead of checking the buffer
against buf_empty, we now check its size. It is important because when an
allocation fails, it will be set to buf_wanted. In both cases, the size is 0.

It is a proactive bug fix, no real problem was observed till now. It cannot be
backported as is in 1.7 because of all changes made on the SPOE in 1.8.
2017-11-13 11:38:12 +01:00
Willy Tarreau
3e5e417060 BUILD: thread/pipe: fix build without threads
Marcus Rckert reported that commit d8b3b65 ("BUG/MEDIUM: splice/threads:
pipe reuse list was not protected.") broke threadless support. Add the
required #ifdef.
2017-11-11 18:00:24 +01:00
William Lallemand
18f133adb3 BUG/MEDIUM: cache: does not cache if no Content-Length
In the case of Transfer-Encoding: chunked, there is no Content-Length
which causes the cache to allocate a too small shctx row for the data.

It's not possible to allocate a shctx row for the chunks, we need to be
able to allocate on-the-fly the shctx blocks during the data transfer.
2017-11-11 14:01:21 +01:00
Willy Tarreau
916597903c MEDIUM: http: always reject the "PRI" method
This method was reserved for the HTTP/2 connection preface, must never
be used and must be rejected. In normal situations it doesn't happen,
but it may be visible if a TCP frontend has alpn "h2" enabled, and
forwards to an HTTP backend which tries to parse the request. Before
this patch it would pass the wrong request to the backend server, now
it properly returns 400 bad req.

This patch should probably be backported to stable versions.
2017-11-10 19:38:10 +01:00
Willy Tarreau
387bd4f69f CLEANUP: global: introduce variable pid_bit to avoid shifts with relative_pid
At a number of places, bitmasks are used for process affinity and to map
listeners to processes. Every time 1UL<<(relative_pid-1) is used. Let's
create a "pid_bit" variable corresponding to this value to clean this up.
2017-11-10 19:08:14 +01:00
Willy Tarreau
9a398beac3 BUG/MEDIUM: stream: don't ignore res.analyse_exp anymore
It happens that no single analyser has ever needed to set res.analyse_exp,
so that process_stream() didn't consider it when computing the next task
expiration date. Since Lua actions were introduced in 1.6, this can be
needed on http-response actions for example, so let's ensure it's properly
handled.

Thanks to Nick Dimov for reporting this bug. The fix needs to be
backported to 1.7 and 1.6.
2017-11-10 17:14:23 +01:00
Willy Tarreau
5d9846f4b3 MINOR: cli: make "show fd" report the fd's thread mask
This is useful to know what thread(s) an fd is scheduled to be
handled on. It's worth noting that at the moment the "show fd"d
doesn't seem totally thread-safe.
2017-11-10 16:53:09 +01:00
Willy Tarreau
28b55c6fed CLEANUP: mux: remove the unused "release()" function
In commit 53a4766 ("MEDIUM: connection: start to introduce a mux layer
between xprt and data") we introduced a release() function which ends
up never being used. Let's get rid of it now.
2017-11-10 16:43:05 +01:00
Willy Tarreau
7ce3f09513 BUG/MEDIUM: threads/cli: fix "show sess" locking on release
The recent thread updates on the CLI broke "show sess" by unlocking
the stream twice instead of lock+unlock. No backport is needed.
2017-11-10 16:24:41 +01:00
Willy Tarreau
22cf59bbba BUG/MEDIUM: h2: support orphaned streams
When a stream_interface performs a shutw() then a shutr(), the stream
is marked closed. Then cs_destroy() calls h2_detach() and it cannot
fail since we're on the leaving path of the caller. The problem is that
in order to close streams we usually have to send either an emty DATA
frame with the ES flag set or an RST_STREAM frame, and the mux buffer
might already be full, forcing the stream to be queued. The forced
removal of this stream causes this last message to silently disappear,
and the client to wait forever for a response.

This commit ensures we can detach the conn_stream from the h2 stream
if the stream is blocked, effectively making the h2 stream an orphan,
ensures that the mux can deal with orphaned streams after processing
them, and that the demux can kill them upon receipt of GOAWAY.
2017-11-10 11:48:15 +01:00
Willy Tarreau
8c0ea7d21a BUG/MEDIUM: h2: split the function to send RST_STREAM
There is an issue with how the RST_STREAM frames are sent. Some of
them are sent from the demux, either for valid or for closed streams,
and some are sent from the mux always for valid streams. At the moment
the demux stream ID is used, which is wrong for all streams being muxed,
and sometimes results in certain bad HTTP responses causing the emission
of an RST_STREAM referencing stream zero. In addition, the stream's
blocked flags could be updated even if the stream was the closed or
idle ones.

We really need to split the function for the two distinct use cases where
one is used to send an RST on a condition detected at the connection level
(such as a closed stream) and the other one is used to send an RST for a
condition detected at the stream level. The first one is used only in the
demux, and the other one only by a valid stream.
2017-11-10 10:05:24 +01:00
Christopher Faulet
09fdf4b112 BUG/MINOR: pattern: Rely on the sample type to copy it in pattern_exec_match
To be thread safe, the function pattern_exec_match copy data (the pattern and
the inner sample) in thread-local variables. But when the sample is duplicated,
we must check its type and not the pattern one.

This is specific to threads, no backport is needed.
2017-11-09 17:19:20 +01:00
Christopher Faulet
c5a9d5bf23 BUG/MEDIUM: stream-int: Don't loss write's notifs when a stream is woken up
When a write activity is reported on a channel, it is important to keep this
information for the stream because it take part on the analyzers' triggering.
When some data are written, the flag CF_WRITE_PARTIAL is set. It participates to
the task's timeout updates and to the stream's waking. It is also used in
CF_MASK_ANALYSER mask to trigger channels anaylzers. In the past, it was cleared
by process_stream. Because of a bug (fixed in commit 95fad5ba4 ["BUG/MAJOR:
stream-int: don't re-arm recv if send fails"]), It is now cleared before each
send and in stream_int_notify. So it is possible to loss this information when
process_stream is called, preventing analyzers to be called, and possibly
leading to a stalled stream.

Today, this happens in HTTP2 when you call the stat page or when you use the
cache filter. In fact, this happens when the response is sent by an applet. In
HTTP1, everything seems to work as expected.

To fix the problem, we need to make the difference between the write activity
reported to lower layers and the one reported to the stream. So the flag
CF_WRITE_EVENT has been added to notify the stream of the write activity on a
channel. It is set when a send succedded and reset by process_stream. It is also
used in CF_MASK_ANALYSER. finally, it is checked in stream_int_notify to wake up
a stream and in channel_check_timeouts.

This bug is probably present in 1.7 but it seems to have no effect. So for now,
no needs to backport it.
2017-11-09 15:16:05 +01:00
Willy Tarreau
a87f202b49 BUG/MEDIUM: h2: reject non-3-digit status codes
If the H1 parser would report a status code length not consisting in
exactly 3 digits, the error case was confused with a lack of buffer
room and was causing the parser to loop infinitely.
2017-11-09 11:23:00 +01:00
Willy Tarreau
1b4cf9b754 BUG/MINOR: h1: the HTTP/1 make status code parser check for digits
The H1 parser used by the H2 gateway was a bit lax and could validate
non-numbers in the status code. Since it computes the code on the fly
it's problematic, as "30:" is read as status code 310. Let's properly
check that it's a number now. No backport needed.
2017-11-09 11:15:45 +01:00
Willy Tarreau
ddfbd83780 BUILD: shctx: do not depend on openssl anymore
The build breaks on a machine without openssl/crypto.h because shctx
still loads openssl-compat.h while it doesn't need it anymore since
the code was moved :

In file included from src/shctx.c:20:0:
include/proto/openssl-compat.h:3:28: fatal error: openssl/crypto.h: No such file or directory
 #include <openssl/crypto.h>

Just remove include openssl-compat from shctx.
2017-11-08 14:33:36 +01:00
Willy Tarreau
46c9d3e6cb BUILD: ssl: fix build of backend without ssl
Commit 522eea7 ("MINOR: ssl: Handle sending early data to server.") added
a dependency on SRV_SSL_O_EARLY_DATA which only exists when USE_OPENSSL
is defined (which is probably not the best solution) and breaks the build
when ssl is not enabled. Just add an ifdef USE_OPENSSL around the block
for now.
2017-11-08 14:28:08 +01:00
Olivier Houchard
522eea7110 MINOR: ssl: Handle sending early data to server.
This adds a new keyword on the "server" line, "allow-0rtt", if set, we'll try
to send early data to the server, as long as the client sent early data, as
in case the server rejects the early data, we no longer have them, and can't
resend them, so the only option we have is to send back a 425, and we need
to be sure the client knows how to interpret it correctly.
2017-11-08 14:11:10 +01:00
Olivier Houchard
cfdef2e312 MINOR: ssl: Spell 0x10101000L correctly.
Issue added in 1.8-dev by c2aae74 ("MEDIUM: ssl: Handle early data with
OpenSSL 1.1.1"), no impact on older versions.
2017-11-08 14:10:02 +01:00
Olivier Houchard
bd84ac8737 MINOR: ssl: Handle session resumption with TLS 1.3
With TLS 1.3, session aren't established until after the main handshake
has completed. So we can't just rely on calling SSL_get1_session(). Instead,
we now register a callback for the "new session" event. This should work for
previous versions of TLS as well.
2017-11-08 14:08:07 +01:00
Olivier Houchard
35a63cc1c7 BUG/MINOR; ssl: Don't assume we have a ssl_bind_conf because a SNI is matched.
We only have a ssl_bind_conf if crt-list is used, however we can still
match a certificate SNI, so don't assume we have a ssl_bind_conf.
2017-11-08 14:08:07 +01:00
Willy Tarreau
9e45b33f7e BUG/MAJOR: threads/tasks: fix the scheduler again
My recent change in commit ce4e0aa ("MEDIUM: task: change the construction
of the loop in process_runnable_tasks()") was bogus as it used to keep the
rq_next across an unlock/lock sequence, occasionally leading to crashes for
tasks that are eligible to any thread. We must use the lookup call for each
new batch instead. The problem is easily triggered with such a configuration :

    global
        nbthread 4

    listen check
        mode http
        bind 0.0.0.0:8080
        redirect location /
        option httpchk GET /
        server s1 127.0.0.1:8080 check inter 1
        server s2 127.0.0.1:8080 check inter 1

Thanks to Olivier for diagnosing this one. No backport is needed.
2017-11-08 14:05:19 +01:00
Willy Tarreau
ecd2e15919 BUG/MINOR: stream-int: don't set MSG_MORE on closed request path
Commit 4ac4928 ("BUG/MINOR: stream-int: don't set MSG_MORE on SHUTW_NOW
without AUTO_CLOSE") was incomplete. H2 reveals another situation where
the input stream is marked closed with the request and we set MSG_MORE,
causing a delay before the request leaves.

Better avoid setting the flag on the request path for close cases in
general.
2017-11-07 15:07:25 +01:00
Emeric Brun
11f5886e5c BUG/MINOR: comp: fix compilation warning compiling without compression.
This is specific to threads, no backport is needed.
2017-11-07 14:48:13 +01:00
Emeric Brun
d8b3b65faa BUG/MEDIUM: splice/threads: pipe reuse list was not protected.
The list is now protected using a global spinlock.
2017-11-07 14:47:28 +01:00
Willy Tarreau
926fa4c098 BUG/MINOR: h2: don't send GOAWAY on failed response
As part of the detection for intentional closes, we can kill the
connection if a shutw() happens before the headers. But it can also
happen that an invalid response is not properly parsed, preventing
any headers frame from being sent and making the function believe
it was an abort. Now instead we check if any response was received
from the stream, regardless of the fact that it was properly
converted.
2017-11-07 14:47:04 +01:00
Willy Tarreau
c4312d3dfd MINOR: h2: add new stream flag H2_SF_OUTGOING_DATA
This one indicates whether we've received data to mux out. It helps
make the difference between a clean close and a an erroneous one.
2017-11-07 14:47:04 +01:00
Willy Tarreau
58e3208714 BUG/MINOR: h2: correctly check for H2_SF_ES_SENT before closing
In h2_shutw() we must not send another empty frame (nor RST) after
one has been sent, as the stream is already in HLOC/CLOSED state.
2017-11-07 14:47:04 +01:00
Willy Tarreau
6d8b682f9a BUG/MEDIUM: h2: properly set H2_SF_ES_SENT when sending the final frame
When sending DATA+ES, it's important to set H2_SF_ES_SENT as we don't
want to emit is several times nor to send an RST afterwards.
2017-11-07 14:47:04 +01:00
Willy Tarreau
e6ae77f64f MINOR: h2: don't re-enable the connection's task when we're closing
It's pointless to requeue the task when we're closing, so swap the
order of the task_queue() and h2_release(). It also matches what
was written in the comment regarding re-arming the timer.
2017-11-07 14:47:04 +01:00
Willy Tarreau
83906c2f91 BUG/MEDIUM: h2: don't close the connection is there are data left
h2_detach() is called after a stream was closed, and it evaluates if it's
worth closing the connection. The issue there is that the connection is
closed too early in case there's demand for closing after the last stream,
even if some data remain in the mux. Let's change the condition to check
for this.
2017-11-07 14:47:04 +01:00
Christopher Faulet
2a944ee16b BUILD: threads: Rename SPIN/RWLOCK macros using HA_ prefix
This remove any name conflicts, especially on Solaris.
2017-11-07 11:10:24 +01:00
Willy Tarreau
7d8e4af46a BUG/MEDIUM: h2: fix some wrong error codes on connections
When the assignment of the connection state was moved into h2c_error(),
3 of them were missed because they were wrong, using H2_SS_ERROR instead.
This resulted in the connection's state being set to H2_CS_ERROR2 in fact,
so the error was not properly sent.
2017-11-07 11:08:28 +01:00
Willy Tarreau
721c974e5e MEDIUM: h2: remove the H2_SS_RESET intermediate state
This one was created to maintain the knowledge that a stream was closed
after having sent an RST_STREAM frame but that's not needed anymore and
it confuses certain conditions on the error processing path. It's time
to get rid of it.
2017-11-07 11:05:42 +01:00
Willy Tarreau
319994a2e9 BUG/MEDIUM: h2: don't try (and fail) to send non-existing data in the mux
The call to xprt->snd_buf() was not conditionned on the presence of
data in the buffer, resulting in snd_buf() returning 0 and never
disabling the polling. It was revealed by the previous bug on error
processing but must properly be handled.
2017-11-07 11:03:56 +01:00
Willy Tarreau
3eabe9b174 BUG/MEDIUM: h2: properly send the GOAWAY frame in the mux
A typo on a condition prevented H2_CS_ERROR from being processed,
leading to an infinite loop on connection error.
2017-11-07 11:03:01 +01:00
Willy Tarreau
c6795ca7c1 BUG/MEDIUM: h2: properly send an RST_STREAM on mux stream error
Some stream errors are detected on the MUX path (eg: H1 response
encoding). The ones forgot to emit an RST_STREAM frame, causing the
client to wait and/or to see the connection being immediately closed.
This is now fixed.
2017-11-07 09:43:06 +01:00
Willy Tarreau
6743420778 BUG/MINOR: h2: set the "HEADERS_SENT" flag on stream, not connection
This flag was added after the GOAWAY flags were introduced and mistakenly
placed in the connection, but that doesn't make sense as it's specific to
the stream. The main impact is the risk of returning a DATA0+ES frame for
an error instead of an RST_STREAM.
2017-11-06 20:20:51 +01:00
Olivier Houchard
283810773a BUG/MINOR: dns: Don't lock the server lock in snr_check_ip_callback().
snr_check_ip_callback() may be called with the server lock, so don't attempt
to lock it again, instead, make sure the callers always have the lock before
calling it.
2017-11-06 18:34:42 +01:00
Olivier Houchard
55dcdf4c39 BUG/MINOR: dns: Don't try to get the server lock if it's already held.
dns_link_resolution() can be called with the server lock already held, so
don't attempt to lock it again in that case.
2017-11-06 18:34:24 +01:00
Willy Tarreau
f0c531ab55 MEDIUM: tasks: implement a lockless scheduler for single-thread usage
The scheduler is complex and uses local queues to amortize the cost of
locks. But all this comes with a cost that is quite observable with
single-thread workloads.

The purpose of this patch is to reimplement the much simpler scheduler
for the case where threads are not used. The code is very small and
simple. It doesn't impact the multi-threaded performance at all, and
provides a nice 10% performance increase in single-thread by reaching
606kreq/s on the tests that showed 550kreq/s before.
2017-11-06 11:20:11 +01:00
Willy Tarreau
9d4b56b88e MINOR: tasks: only visit filled task slots after processing them
process_runnable_tasks() needs to requeue or wake up tasks after
processing them in batches. By only refilling the existing ones, we
avoid revisiting all the queue. The performance gain is measurable
starting with two threads, where the request rate climbs to 657k/s
compared to 644k.
2017-11-06 11:20:11 +01:00
Willy Tarreau
ce4e0aa7f3 MEDIUM: task: change the construction of the loop in process_runnable_tasks()
This patch slightly rearranges the loop to pack the locked code a little
bit, and to try to concentrate accesses to the tree together to benefit
more from the cache.

It also fixes how the loop handles the right margin : now that is guaranteed
that the retrieved nodes are filtered to only match the current thread, we
don't need to rewind every 16 entries. Instead we can rewind each time we
reach the right margin again.

With this change, we now achieve the following performance for 10 H2 conns
each containing 100 streams :

   1 thread : 550kreq/s
   2 thread : 644kreq/s
   3 thread : 598kreq/s
2017-11-06 11:20:11 +01:00
Willy Tarreau
b992ba16ef MINOR: task: simplify wake_expired_tasks() to avoid unlocking in the loop
This function is sensitive, let's make it shorter by factoring out the
unlock and leave code. This reduced the function's size by a few tens
of bytes and increased the overall performance by about 1%.
2017-11-06 11:20:11 +01:00
Willy Tarreau
8d38805d3d MAJOR: task: make use of the scope-aware ebtree functions
Currently the task scheduler suffers from an O(n) lookup when
skipping tasks that are not for the current thread. The reason
is that eb32_lookup_ge() has no information about the current
thread so it always revisits many tasks for other threads before
finding its own tasks.

This is particularly visible with HTTP/2 since the number of
concurrent streams created at once causes long series of tasks
for the same stream in the scheduler. With only 10 connections
and 100 streams each, by running on two threads, the performance
drops from 640kreq/s to 11.2kreq/s! Lookup metrics show that for
only 200000 task lookups, 430 million skips had to be performed,
which means that on average, each lookup leads to 2150 nodes to
be visited.

This commit backports the principle of scope lookups for ebtrees
from the ebtree_v7 development tree. The idea is that each node
contains a mask indicating the union of the scopes for the nodes
below it, which is fed during insertion, and used during lookups.

Then during lookups, branches that do not contain any leaf matching
the requested scope are simply ignored. This perfectly matches a
thread mask, allowing a thread to only extract the tasks it cares
about from the run queue, and to always find them in O(log(n))
instead of O(n). Thus the scheduler uses tid_bit and
task->thread_mask as the ebtree scope here.

Doing this has recovered most of the performance, as can be seen on
the test below with two threads, 10 connections, 100 streams each,
and 1 million requests total :

                              Before     After    Gain
              test duration : 89.6s      4.73s     x19
    HTTP requests/s (DEBUG) : 11200     211300     x19
     HTTP requests/s (PROD) : 15900     447000     x28
             spin_lock time : 85.2s      0.46s    /185
            time per lookup : 13us       40ns     /325

Even when going to 6 threads (on 3 hyperthreaded CPU cores), the
performance stays around 284000 req/s, showing that the contention
is much lower.

A test showed that there's no benefit in using this for the wait queue
though.
2017-11-06 11:20:11 +01:00
William Lallemand
92159b2901 MINOR: mworker: do not store child pid anymore in the pidfile
The parent process supervises itself the children, we don't need to
store the children pids anymore in the pidfile in master-worker mode.
2017-11-06 11:19:53 +01:00
William Lallemand
deed780a22 MINOR: mworker: write parent pid in the pidfile
The first pid in the pidfile is now the parent, it's more convenient for
supervising the processus.

You can now reload haproxy in master-worker mode with convenient command
like: kill -USR2 $(head -1 /tmp/haproxy.pid)
2017-11-06 11:08:38 +01:00
William Lallemand
8029300df6 MINOR: mworker: allow pidfile in mworker + foreground
This patch allows the use of the pidfile in master-worker mode without
using the background option.
2017-11-06 11:08:38 +01:00
William Lallemand
cc113822a7 MINOR: add master-worker in the warning about nbproc 2017-11-06 11:08:38 +01:00
Willy Tarreau
6dbd3e963b BUG/MEDIUM: threads: don't try to free build option message on exit
Commit 0493149 ("MINOR: thread: report multi-thread support in haproxy -vv")
added information about thread support in haproxy -vv output but accidently
marked the message as "must_free" while it's a constant. This causes a segv
on the old process on clean exit if threads are enabled. It doesn't affect
the stability during operations however.
2017-11-05 11:51:48 +01:00
Willy Tarreau
bbd09b9306 BUG/MAJOR: thread/listeners: enable_listener must not call unbind_listener()
unbind_listener() takes the listener lock, which is already held by
enable_listener(). This situation happens when starting with nbproc > 1
with some bind lines limited to a certain process, because in this case
enable_listener() tries to stop unneeded listeners.

This commit introduces __do_unbind_listeners() which must be called with
the lock held, and makes enable_listener() use this one. Given that the
only return code has never been used and that it starts to make the code
more complicated to propagate it before throwing it to the trash, the
function's return type was changed to void.
2017-11-05 11:38:44 +01:00
Willy Tarreau
3340029b97 BUG/MAJOR: h2: set the connection's task to NULL when no client timeout is set
If "timeout client" is missing from the frontend, the task is not initialized,
causing a crash on connection teardown.
2017-11-05 11:23:40 +01:00
Willy Tarreau
4d5f13cab3 BUG/MEDIUM: threads/stick-tables: close a race condition on stktable_trash_expired()
The spin_unlock() was called just before setting the expiry to
TICK_ETERNITY, so if another thread has the time to perform its
update and set a timeout, this would would clear it.
2017-11-05 11:04:47 +01:00
Willy Tarreau
03071f6937 BUG/MAJOR: threads/lb: fix missing unlock on map-based hash LB
We often left the function with the lock held on success.
2017-11-05 10:59:12 +01:00
Willy Tarreau
1ed90ac377 BUG/MAJOR: threads/lb: fix missing unlock on consistent hash LB
If no matching node was found, the function was left without unlocking
the tree.
2017-11-05 10:54:50 +01:00
Willy Tarreau
5ec84574c7 BUG/MAJOR: threads/dns: add missing unlock on allocation failure path
An unlock was missing when a memory allocation failure is detected.
2017-11-05 10:35:57 +01:00
Willy Tarreau
70124ce3e1 BUG/MAJOR: cli/streams: missing unlock on exit "show sess"
An unlock was missing on the situation where the session disappeared
while watching it.
2017-11-05 10:31:10 +01:00
Willy Tarreau
6ce38f3eab CLEANUP: server: get rid of return statements in the CLI parser
There were two many return, some of them missing a spin_unlock call,
let's use a goto to a central place instead.
2017-11-05 10:19:23 +01:00
Willy Tarreau
a075258a2c BUG/MINOR: cli: add severity in "set server addr" parser
Commit c3680ec ("MINOR: add severity information to cli feedback messages")
introduced a severity level to CLI messages, but one of them was missed
on "set server addr". No backport is needed.
2017-11-05 10:17:49 +01:00
Willy Tarreau
62ac84f843 CLEANUP: checks: remove return statements in locked functions
Given that all spinning loops we've had since 1.8-rc1 were caused by
unbalanced lock/unlock, let's get rid of all return statements in the
locked check functions and only exit via a a single unlock place.
2017-11-05 10:13:38 +01:00
Willy Tarreau
73247e0757 BUG/MAJOR: threads/checks: wrong use of SPIN_LOCK instead of SPIN_UNLOCK
Must unlock on exit, copy-paste error.
2017-11-05 10:13:37 +01:00
Willy Tarreau
1c8980f9b5 BUG/MINOR: cli: do not perform an invalid action on "set server check-port"
The "set server <srv> check-port" CLI handler forgot to return after
detecting an error on the port number, and still proceeds with the action.
This needs to be backported to 1.7.
2017-11-05 10:13:37 +01:00
Willy Tarreau
2a858a82ec BUG/MAJOR: threads/server: missing unlock in CLI fqdn parser
This one didn't properly unlock before returning an error message.
2017-11-05 10:13:37 +01:00
Willy Tarreau
1cd153aa89 BUG/MAJOR: threads/checks: add 4 missing spin_unlock() in various functions
Some unlocks were missing, resulting in deadlocks even with a single thread.
We really need to make these functions safer by getting rid of all those
remaining "return" calls and only leave using a goto!
2017-11-05 10:13:28 +01:00
Olivier Houchard
f143b8040b BUILD: use MAXPATHLEN instead of NAME_MAX.
This fixes building on at least Solaris, where NAME_MAX doesn't exist.
2017-11-04 17:09:23 +01:00
Willy Tarreau
0493149ac3 MINOR: thread: report multi-thread support in haproxy -vv
Otherwise it's hard to know if it was enabled or not.
2017-11-03 23:39:25 +01:00
Willy Tarreau
ed339a375c BUG/MAJOR: mux_pt: don't dereference a connstream after ->wake()
The wake() callback may destroy a connstream, so it must not be
dereferenced in case wake() returns negative. No backport needed,
this is 1.8-only.
2017-11-03 15:55:24 +01:00
Emeric Brun
8c4954c5c2 BUG/MINOR: lua: fix missing lock protection on server.
To avoid inconsistencies server's attributes must be read
or updated under lock.
2017-11-03 15:17:59 +01:00
Emeric Brun
e9fd6b5916 BUG/MINOR: dns: fix missing lock protection on server.
To avoid inconsistencies server's attributes must be read
or updated under lock.
2017-11-03 15:17:55 +01:00
Emeric Brun
f2fc1fda80 BUG/MINOR: freq: fix infinite loop on freq_ctr_period.
Using peers or stick table we could update an freq_ctr
using a tick value with the first bit set but this
bit is reserved for lock since multithreading support.
2017-11-02 18:09:58 +01:00
William Lallemand
9c54c53f2f BUG/MEDIUM: cache: don't try to resolve wrong filters
Don't try to resolve wrong filters which are not cache filters during
the post configuration callback.
2017-11-02 16:58:25 +01:00
Emeric Brun
f6ba17da20 BUG/MAJOR: fix deadlock on healthchecks.
Fix bugs due to missing unlock and recursive lock performing
http health check.

The server's lock scope was enlarged to protect all callers
of 'set_server_check_status' and 'chk_report_conn_err'.

This fix also protects tcpcheck against concurrency.
2017-11-02 16:24:37 +01:00
Willy Tarreau
16257f648f BUG/MEDIUM: checks/mux: always enable send-polling after connecting
Before introducing the mux layer, tcp_connect() would poll for sending
to detect the connection establishment. It happens that the health
checks have apparently never explicitly enabled this polling and have
been relying on this implicit one.

Now that there's the mux layer, the conn_stream needs to be enabled
for polling as well and since it's not done in the checks, it's never
done and the check's request doesn't leave the machine, as can be
noticed with http checks.

The solution simply consists in going back to the well-known case
where we enable polling after connecting using cs_want_send() if we
have anything but just a plain connect(). The regular data path is
not affected because the stream interface code automatically computes
the polling needs based on buffer contents.
2017-11-02 15:53:04 +01:00
Willy Tarreau
f13ef96e70 BUG/MEDIUM: h2: don't try to parse incomplete H1 responses
This situation which must not happen does in fact happen when feeding
artificial responses using errorfiles, Lua or an applet. For now it
causes the H1 response parser to loop forever trying to get a more
complete response. Since it cannot progress, let's return an error.
2017-11-02 15:53:04 +01:00
Olivier Houchard
fccf840cdf MINOR: cache: Don't confuse act_return and act_parse_ret. 2017-11-01 15:10:51 +01:00
Olivier Houchard
cd2867a012 MINOR: cache: Remove useless test for nonzero.
Don't bother testing if len is nonzero, we know it is, as we're in the
"else" part of a if (!len), and testing it confuses clang into thinking
ret may be left uninitialized.
2017-11-01 15:10:51 +01:00
Olivier Houchard
7da120bb0e MINOR: mux: Only define pipe functions on linux.
Only define mux_pt_snd_pipe() and mux_pt_rcv_pipe() if splicing is
available.
2017-11-01 15:10:51 +01:00
Emmanuel Hocdet
82913e4f79 BUG/MINOR: send-proxy-v2: string size must include ('\0')
strlen() exclude the terminating null byte ('\0'), add it.
2017-11-01 07:58:20 +01:00
Emmanuel Hocdet
571c7ac0a5 BUG/MINOR: send-proxy-v2: fix dest_len in make_tlv call
Subtract already allocated size from buf_len.
2017-11-01 07:57:42 +01:00
William Lallemand
77c1197bfb MEDIUM: cache: deliver objects from cache
Lookup objects in the cache and deliver them using the http-request
action "cache-use".
2017-10-31 21:17:19 +01:00
William Lallemand
4da3f8a1f2 MEDIUM: cache: store objects in cache
Store object in the cache. The cache use an shctx for storage.

It uses an http-response action to store the headers and a filter to
store the body. The http-response action is used in order to allow
modifications by other actions before caching.
2017-10-31 21:17:19 +01:00
William Lallemand
41db46035e MEDIUM: cache: configuration parsing and initialization
Parse a configuration section "cache" and a http-{response,request}
actions.

Example:

    listen frt
        mode http
        http-response cache-store foobar
        http-request cache-use foobar

    cache foobar
        total-max-size 4   # size in megabytes
2017-10-31 21:17:19 +01:00
William Lallemand
7217c46dfe MEDIUM: shctx: forbid shctx to read more than expected
Forbid shctx to read more than expected, it allows you to use a greater
value as a len with shctx_row_data_get(), the size of the destination
buffer for example.
2017-10-31 21:17:19 +01:00
Willy Tarreau
3f133570b8 BUG/MEDIUM: h2: fix incorrect timeout handling on the connection
Previous commit ea3928 (MEDIUM: h2: apply a timeout to h2 connections)
was wrong for two reasons. The first one is that if the client timeout
is not set, it's used as zero, preventing connections from establishing.
The second reason is that if the timeout triggers with active streams
(normally it should not since the task is supposed to be disabled), the
task is removed (h2c->task=NULL), and the last quitting stream might
try to dereference it.

Instead of doing this, we simply not register the task if there's no
timeout (it's useless) and we always control its presence in the streams.
2017-10-31 19:21:06 +01:00
Willy Tarreau
ea39282e85 MEDIUM: h2: apply a timeout to h2 connections
Till now there was no way to deal with a dead H2 connection. Now each
connection creates a task that wakes up to kill the connection. Its
timeout is constantly refreshed when there's some activity. In case
the timeout triggers, the best effort attempts are made at sending a
clean GOAWAY message before closing and signaling the streams.

The timeout is automatically disabled when there's an active stream on
the connection, and restarted when the last stream finishes. This way
it should not affect long sessions.
2017-10-31 18:16:19 +01:00
Willy Tarreau
a1349f0207 MEDIUM: h2: send a GOAWAY frame when dealing with an empty response
Given that we're processing data produced by haproxy, we know that the
situations where haproxy doesn't return anything are :
  - request timeout with option http-ignore-probes : there's no reason to
    hit this since we're creating the stream with the request into it ;

  - tcp-request content reject : this definitely means we want to kill the
    connection and abort keep-alive and any further processing ;

  - using /dev/null as the error file to hide an error

In practice it appears that using the abort on empty response as a hint to
trigger a connection close is very appropriate to continue to give the
control over the connection management. This patch thus tries to send a
GOAWAY frame with the max_id presented as the last stream ID, then sends
an RST_STREAM for the current stream. For the client, this means that the
connection must be shut down immediately after processing the last pending
streams and that the current stream is aborted. This way it's still possible
to force connections to be closed using tcp-request rules.
2017-10-31 18:16:19 +01:00
Willy Tarreau
af1e4f5167 MEDIUM: h2: perform a graceful shutdown on "Connection: close"
After some long brainstorming sessions, it appears that "Connection: close"
seems to be the best signal from the L7 layer to indicate the need to close
the connection. Indeed, in H1 it is only present in very rare cases (eg:
certain unrecoverable errors, some of which could remove it now by the way).
It will also be added when the L7 layer wants to force the connection to
terminate. By default when running in keep-alive mode it is not present.
It's worth mentionning that in H1 with persistent connections, we have sort
of a concurrency-1 mux and this header field is used the same way.

Thus here this patch detects "Connection: close" in response headers and
if seen, sends a GOAWAY frame with the highest possible ID so that the
client knows that it can quit whenever it wants to. If more aggressive
closures are needed in the future, we may decide to advertise the max_id
to abort after the current requests and better honor "http-request deny".
2017-10-31 18:16:19 +01:00
Willy Tarreau
1c661986a8 MINOR: h2: properly reject PUSH_PROMISE frames coming from the client
These ones deserve a connection error as per 5.1.
2017-10-31 18:16:19 +01:00
Willy Tarreau
c0da1964ba MEDIUM: h2: silently ignore frames higher than last_id after GOAWAY
For a graceful shutdown, the specs requries to discard frames with a
stream ID higher than the advertised last_id. (RFC7540#6.8). Well,
finally for now the code is disabled (see last page of #6.8). Some
frames need to be processed anyway to maintain the compression state
and the flow control window state, but we don't have any trivial way
to do this and ignore them at the same time. For the headers it's
the worst case where we can't parse headers frames without coming
from the streams, and we don't want to create such streams as we'd
have to abort them, and aborting would cause errors to flow back.

Possibly that a longterm solution might involve using some dummy
streams and dummy buffers for this and calling the parsers directly.
2017-10-31 18:16:19 +01:00
Willy Tarreau
f182a9a8b4 MINOR: h2: centralize the check for the half-closed(remote) streams
RFC7540#5.1 is pretty clear : "any frame other than WINDOW_UPDATE,
PRIORITY, or RST_STREAM in this state MUST be treated as a connection
error of type STREAM_CLOSED". Instead of dealing with this for each
and every frame type, let's do it once for all in the main demux loop.
2017-10-31 18:16:19 +01:00
Willy Tarreau
f65b80dd47 MINOR: h2: centralize the check for the idle streams
RFC7540#5.1 is pretty clear : "any frame other than HEADERS or PRIORITY
in this state MUST be treated as a connection error". Instead of dealing
with this for each and every frame type, let's do it once for all in the
main demux loop.
2017-10-31 18:16:19 +01:00
Willy Tarreau
e96b0922e9 MEDIUM: h2: handle GOAWAY frames
The ID is respected, and only IDs greater than the advertised last_id
are woken up, with a CS_FL_ERROR flag to signal that the stream is
aborted. This is necessary for a browser to abort a download or to
reject a bad response that affects the connection's state.
2017-10-31 18:16:19 +01:00
Willy Tarreau
23b92aa2bb MINOR: h2: use a common function to signal some and all streams.
Let's replace h2_wake_all_streams() with h2_wake_some_streams(), to
support signaling only streams by their ID (for GOAWAY frames) and
to pass the flags to add on the conn_stream.
2017-10-31 18:16:19 +01:00
Willy Tarreau
c7576eac46 MEDIUM: h2: send DATA+ES or RST_STREAM on shutw/shutr
When a stream sends a shutw, we send an empty DATA frame with the ES
flag set, except if no HEADERS were sent, in which case we rather send
RST_STREAM. On shutr(1) to abort a request, an RST_STREAM frame is sent
if the stream is OPEN and the stream is closed. Care is taken to switch
the stream's state accordingly and to avoid sending an ES bit again or
another RST once already done.
2017-10-31 18:16:19 +01:00
Willy Tarreau
cd234e9fb0 MINOR: h2: handle RST_STREAM frames
These ones are received when the browser aborts a page load, it's the
only moment we can abort the stream.
2017-10-31 18:16:19 +01:00
Willy Tarreau
454f905084 MEDIUM: h2: handle request body in DATA frames
Data frames are received and transmitted. The per-connection and
per-stream amount of data to ACK is automatically updated. Each
DATA frame is ACKed because usually the downstream link is large
and the upstream one is small, so it seems better to waste a few
bytes every few kilobytes to maintain a low ACK latency and help
the sender keep the link busy. The connection's ACK however is
sent at the end of the demux loop and at the beginning of the mux
loop so that a single aggregated one is emitted (connection
windows tend to be much larger than stream windows).

A future improvement would consist in sending a single ACK for
multiple subsequent DATA frames of the same stream (possibly
interleaved with window updates frames), but this is much trickier
as it also requires to remember the ID of the stream for which
DATA frames have to be sent.

Ideally in the near future we should chunk-encode the body sent
to HTTP/1 when there's no content length and when the request is
not a CONNECT. It's just uncertain whether it's the best option
or not for now.
2017-10-31 18:16:19 +01:00
Willy Tarreau
cc0b8c34a6 MEDIUM: h2: send WINDOW_UPDATE frames for connection
When it is detected that the number of received bytes is > 0 on the
connection at the end of the demux call or before starting to process
pending output data, an attempt is made at sending a WINDOW UPDATE on
the connection. In case of failure, it's attempted later.
2017-10-31 18:16:19 +01:00
Willy Tarreau
c199faf5bd MEDIUM: h2: properly continue to parse header block when facing a 1xx response
We still didn't handle the 1xx responses properly.
2017-10-31 18:16:19 +01:00
Willy Tarreau
9d89ac8f42 MEDIUM: h2: skip the response trailers if any
For now we don't build a HEADERS frame with them, but at least we remove
them from the response so that the L7 chunk parser inside isn't blocked
on these (often two) remaining bytes that don't want to leave the buffer.
It also ensures that trailers delivered progressively will correctly be
skipped.
2017-10-31 18:16:19 +01:00
Willy Tarreau
c652dbde9d MEDIUM: h2: send the H1 response body as DATA frames
The H1 response data are processed (either following content-length or
chunks) and emitted as H2 DATA frames. In the case of content-length,
the maximum size permitted by the mux buffer, the max frame size, the
connection's window and the stream's window it used to determine the
frame size. For chunked encoding, the same limitation applies, but in
addition, each chunk leads to a distinct frame. This could be improved
in the future to aggregate chunks into larger frames.

Streams blocked on the connection's flow control subscribe to the
connection's fctl_list to be woken up when the window opens again.

Streams blocked on their own flow control don't subscribe to anything,
they just sit waiting for window update frames to reopen the window.

The connection-close mode (without content-length) partially works thanks
to the fact that the SHUTW event leads to a close of the stream. In
practice an empty DATA frame should be sent in this case though.
2017-10-31 18:16:19 +01:00
Willy Tarreau
9e5ae1d721 MEDIUM: h2: implement the response HEADERS frame to encode the H1 response
This calls the h1 response parser and feeds the output through the hpack
encoder to produce stateless HPACK bytecode into an output chunk. For now
it's a bit naive but reasonably efficient.

The HPACK encoder relies on hpack_encode_header() so that the most common
response header fields are encoded based on the static header table. The
forbidden header field names (connection, proxy-connection, upgrade,
transfer-encoding, keep-alive) are dropped before calling the hpack
encoder.

A new flag (H2_CF_HEADERS_SENT) is set once such a frame is emitted. It
will be used to know if we can send an empty DATA+ES frame to use as a
shutdown() signal or if we have to use RST_STREAM.
2017-10-31 18:16:19 +01:00
Willy Tarreau
68dd9856ce MEDIUM: h2: don't use trash to decode headers!
The trash is already used by the hpack layer and for Huffman decoding,
it's unsafe to use here as a buffer and results in corrupted data. Use
a safely allocated trash instead.
2017-10-31 18:16:18 +01:00
Willy Tarreau
13278b44b1 MEDIUM: h2: basic processing of HEADERS frame
This takes care of creating a new h2s and a new conn_stream when a
HEADERS frame arrives. The recv() callback from the data layer is then
called to extract the frame into the stream's buffer. It is verified
that the stream ID is strictly greater than the known max stream ID.
And the last_id is updated if the current request is properly converted.
The streams are created in open or half-closed(remote) states.

For now there are some limitations :
  - frames without END_HEADERS are rejected (CONTINUATION not supported
    yet, will require some more changes so that the stream processor
    checks the H2 frame header by itself and steals the frames from the
    connection)
  - padding/stream_dep/priority are currently ignored
  - limited error handling, could be improved

But at least the request is properly decoded, transcoded and processed.
2017-10-31 18:16:18 +01:00
Willy Tarreau
45f752e037 MEDIUM: h2: unblock a connection when its current stream detaches
If a stream is killed for whatever reason and it happens to be the one
currently blocking the connection, we must unblock the connection and
enable polling again so that it can attempt to make progress. This may
happen for example on upload timeout, where the demux is blocked due to
a full stream buffer, and the stream dies on server timeout and quits.
2017-10-31 18:16:18 +01:00
Willy Tarreau
6093514933 MEDIUM: h2: partial implementation of h2_detach()
This does the very minimum required to release a stream and/or a connection
upon the stream's request. The only thing is that it doesn't kill the
connection unless it's already closed or in error or the stream ID reached
the one specified in GOAWAY frame. We're supposed to arm a timer to close
after some idle timeout but it's not done.
2017-10-31 18:16:18 +01:00
Willy Tarreau
61290ec774 MINOR: h2: handle CONTINUATION frames
For now we have nowhere to store partial header frames so we can't
handle CONTINUATION frames and we must reject them. In this case we
respond with a stream error of type INTERNAL_ERROR.
2017-10-31 18:16:18 +01:00
Willy Tarreau
27a84c90ce MINOR: h2: implement h2_send_rst_stream() to send RST_STREAM frames
This one sends an RST_STREAM for a given stream, using the current
demux stream ID. It's also used to send RST_STREAM for streams which
have lost their CS part (ie were aborted).
2017-10-31 18:16:18 +01:00
Willy Tarreau
26f95954fe MEDIUM: h2: honor WINDOW_UPDATE frames
Now they really increase the window size of connections and streams.
If a stream was not queued but requested to send, it means it was
flow-controlled so it's added again into the connection's send list.
2017-10-31 18:16:18 +01:00
Willy Tarreau
f3ee0697f3 MINOR: h2: lookup the stream during demuxing
Several stream-oriented functions will need to perform this lookup, so
better centralize it.
2017-10-31 18:16:18 +01:00
Willy Tarreau
3421aba3de MEDIUM: h2: decode SETTINGS frames and extract relevant settings
The INITIAL_WINDOW_SIZE and MAX_FRAME_SIZE settings are now extracted
from the settings frame, assigned to the connection, and attempted to
be propagated to all existing streams as per the specification. In
practice clients rarely update the settings after sending the first
stream, so the propagation will rarely be used. The ACK is properly
sent after the frame is completely parsed.
2017-10-31 18:16:18 +01:00
Willy Tarreau
cf68c787ae MINOR: h2: implement PING frames
Now we can detect and properly parse PING frames as well as emit a
response containing the same payload.
2017-10-31 18:16:18 +01:00
Willy Tarreau
7e98c057ff MINOR: h2: create a stream parser for the demuxer
The function h2_process_demux() now tries to parse the incoming bytes
to process as many streams as possible. For now it does nothing but
dropping all incoming frames.
2017-10-31 18:16:18 +01:00
Willy Tarreau
4c3690bf96 MEDIUM: h2: detect the presence of the first settings frame
Instead of doing a special processing of the first SETTINGS frame, we
simply parse its header, check that it matches the expected frame type
and flags (ie no ACK), and switch to FRAME_P to parse it as any regular
frame. The regular frame parser will take care of decoding it.
2017-10-31 18:16:18 +01:00
Willy Tarreau
be5b715fb2 MINOR: h2: send a real SETTINGS frame based on the configuration
An initial settings frame is emitted upon receipt of the connection
preface, which takes care of configured values. These settings are
only emitted when they differ from the protocol's default value :

  - header_table_size (defaults to 4096)
  - initial_window_size (defaults to 65535)
  - max_concurrent_streams (defaults to unlimited)
  - max_frame_size (defaults to 16384)

The max frame size is a copy of tune.bufsize. Clients will most often
reject values lower than 16384 and currently there's no trivial way to
check if H2 is going to be used at boot time.
2017-10-31 18:16:18 +01:00
Willy Tarreau
bacdf5a49b MEDIUM: h2: process streams pending for sending
The send() callback calls h2_process_mux() which iterates over the list
of flow controlled streams first, then streams waiting for room in the
send_list. If a stream from the send_list ends up being flow controlled,
it is then moved to the fctl_list. This way we can maintain the most
accurate fairness by ensuring that flows are always processed in order
of arrival except when they're blocked by flow control, in which case
only the other ones may pass in front of them.

It's a bit tricky as we want to remove a stream from the active lists
if it doesn't block (ie it has no reason for staying there).
2017-10-31 18:16:18 +01:00
Willy Tarreau
d7739c8820 MEDIUM: h2: enable reading again on the connection if it was blocked on stream buffer full
If the polling update function is called with RD_ENA while H2_CF_DEM_SFULL
indicates the demux had to block on a stream buffer full condition, we can
remove the flag and re-enable polling for receiving because this is the
indication that a consumer stream has made some room in the buffer. Probably
that we should improve this to ensure that h2s->id == h2c->dsi and avoid
trying to receive multiple times in a row for the wrong stream.
2017-10-31 18:16:18 +01:00
Willy Tarreau
1d393228e0 MEDIUM: h2: enable connection polling for send when a cs wants to emit
A conn_stream indicates its intent to send by setting the WR_ENA flag
and calling mux->update_poll(). There's no synchronous write so the only
way to emit a response from a stream is to proceed this way. The sender
h2s is then queued into the h2c's send_list if it was not yet queued.

Once the connection is ready, it will enter its send() callback to visit
writers, calling their data->send_cb() callback to complete the operation
using mux->snd_buf().

Also we enable polling if the mux contains data and wasn't enabled. This
may happen just after a response has been transmitted using chk_snd().
It likely is incomplete for now and should probably be refined.
2017-10-31 18:16:18 +01:00
Willy Tarreau
52eed75ced MINOR: h2: match the H2 connection preface on init
The H2 preface is properly detected to switch to the settings state.
It's important to note that for now we don't send out settings frame
so the operation is not complete yet.
2017-10-31 18:16:18 +01:00
Willy Tarreau
081d472f79 MINOR: h2: add a function to send a GOAWAY error frame
For now it's only used to report immediate errors by announcing the
highest known stream-id on the mux's error path. The function may be
used both while processing a stream or directly in relation with the
connection. The wake() callback will automatically ask for send access
if an error is reported. The function should be usable for graceful
shutdowns as well by simply setting h2c->last_sid to the highest
acceptable stream-id (2^31-1) prior to calling the function.

A connection flag (H2_CF_GOAWAY_SENT) is set once the frame was
successfully sent. It will be usable to detect when it's safe to
close the connection.

Another flag (H2_CF_GOAWAY_FAILED) is set in case of unrecoverable
error while trying to send. It will also be used to know when it's safe
to close the connection.
2017-10-31 18:16:18 +01:00
Willy Tarreau
bc933930a7 MEDIUM: h2: start to implement the frames processing loop
The rcv_buf() callback now calls h2_process_demux() after an recv() call
leaving some data in the buffer, and the snd_buf() callback calls
h2_process_mux() to try to process pending data from streams.
2017-10-31 18:16:18 +01:00
Willy Tarreau
5160683fc7 MEDIUM: h2: wake the connection up for send on pending streams
If some streams were blocked on flow control and the connection's
window was recently opened, or if some streams are waiting while
no block flag remains, we immediately want to try to send again.
This can happen if a recv() for a stream wants to send after the
send() loop has already been processed.
2017-10-31 18:16:17 +01:00
Willy Tarreau
29a9824144 MEDIUM: h2: properly consider all conditions for end of connection
During h2_wake(), there are various situations that can lead to the
connection being closed :
  - low-level connection error
  - read0 received
  - fatal error (ERROR2)
  - failed to emit a GOAWAY
  - empty stream list with max_id >= last_sid

In such cases, all streams are notified and we have to wait for all
streams to leave while doing nothing, or if the last stream is gone,
we can simply terminate the connection.

It's important to do this test there again because an error might arise
while trying to send a pending GOAWAY after the last stream for example,
thus there's possibly no way to get notified of a closing stream.
2017-10-31 18:16:17 +01:00
Willy Tarreau
26bd761f01 MINOR: h2: also terminate the connection on shutr
It happens that an H2 mux is totally unusable once the client has shut,
so we must consider this situation equivalent to the connection error,
and let the possible streams drain their data if needed then stop.
2017-10-31 18:16:17 +01:00
Willy Tarreau
fbe3b4fcbe MEDIUM: h2: start to consider the H2_CF_{MUX,DEM}_* flags for polling
Now we start to set the flags to indicate that the response buffer is
being awaited or that it is full, it makes it possible to centralize a
little bit the polling management into the wake() callback.

In case of error, we wake all the streams up so that they are aware of
the nature of the event and are able to detach if needed.
2017-10-31 18:16:17 +01:00
Willy Tarreau
1b62c5caef MINOR: h2: update the {MUX,DEM}_{M,D}ALLOC flags on buffer availability
Flag H2_CF_DEM_DALLOC is set when the demux buffer fails to be allocated
in the recv() callback, and is cleared when it succeeds.

Both flags H2_CF_MUX_MALLOC and H2_CF_DEM_MROOM are cleared when the mux
buffer allocation succeeds.

In both cases it will be up to the callers to report allocation failures.
2017-10-31 18:16:17 +01:00
Willy Tarreau
3ccf4b2a20 MINOR: h2: add the function to create a new stream
This one will be used by the HEADERS frame handler and maybe later by
the PUSH frame handler. It creates a conn_stream in the mux's connection.

The create streams are inserted in the h2c's tree sorted by IDs. The
caller is expected to have verified that the stream doesn't exist yet.
2017-10-31 18:16:17 +01:00
Willy Tarreau
2a8561895d MINOR: h2: create dummy idle and closed streams
It will be more convenient to always manipulate existing streams than
null pointers. Here we create one idle stream and one closed stream.
The idea is that we can easily point any stream to one of these states
in order to merge maintenance operations.
2017-10-31 18:15:51 +01:00
Willy Tarreau
2373acc384 MINOR: h2: add stream lookup function based on the stream ID
The function performs a simple lookup in the tree and returns
either the matching h2s or NULL if not found.
2017-10-31 18:12:14 +01:00
Willy Tarreau
54c150653d MINOR: h2: add a few functions to retrieve contents from a wrapping buffer
Functions h2_get_buf_n{16,32,64}() and h2_get_buf_bytes() respectively
extract a network-ordered 16/32/64 bit value from a possibly wrapping
buffer, or any arbitrary size. They're convenient to retrieve a PING
payload or to parse SETTINGS frames. Since they copy one byte at a time,
they will be less efficient than a memcpy-based implementation on large
blocks.
2017-10-31 18:12:14 +01:00
Willy Tarreau
715d5316e5 MINOR: h2: new function h2_peek_frame_hdr() to retrieve a new frame header
This function extracts the next frame header but doesn't consume it.
This will allow to detect a stream-id change and to perform a yielding
window update without losing information. The result is stored into a
temporary frame descriptor. We could also store the next frame header
into the connection but parsing the header again is much cheaper than
wasting bytes in the connection for a rare use case.

A function (h2_skip_frame_hdr()) is also provided to skip the parsed
header (always 9 bytes) and another one (h2_get_frame_hdr()) to do both
at once.
2017-10-31 18:12:14 +01:00
Willy Tarreau
e482074c96 MINOR: h2: add h2_set_frame_size() to update the size in a binary frame
This function is called after preparing a frame, in order to update the
frame's size in the frame header. It takes the frame payload length in
argument.

It simply writes a 24-bit frame size into a buffer, making use of the
net_helper functions which try to optimize per platform (this is a
frequently used operation).
2017-10-31 18:12:14 +01:00
Willy Tarreau
2e43f08c60 MINOR: h2: new function h2s_error() to mark an error on a stream
This one will store the error into the stream's errcode if it's neither
idle nor closed (since these ones are read-only) and switch its state to
H2_SS_ERROR. If a conn_stream is attached, it will be flagged with
CS_FL_ERROR.
2017-10-31 18:12:14 +01:00
Willy Tarreau
741d6df870 MINOR: h2: new function h2c_error to mark an error on the connection
This one sets the error code in h2c->errcode and changes the connection's
stat to H2_CS_ERROR.
2017-10-31 18:12:14 +01:00
Willy Tarreau
5b5e68741a MINOR: h2: small function to know when the mux is busy
A mux is busy when any stream id >= 0 is currently being handled
and the current stream's id doesn't match. When no stream is
involved (ie: demuxer), stream 0 is considered. This will be
necessary to know when it's possible to send frames.
2017-10-31 18:12:14 +01:00
Willy Tarreau
71681174f3 MINOR: h2: add function h2s_id() to report a stream's ID
This one supports being called with NULL and returns 0 in this case,
making it easier to check for stream IDs in various send functions.
2017-10-31 18:12:14 +01:00
Willy Tarreau
2e5b60ee18 MINOR: h2: add the connection and stream flags listing the causes for blocking
A demux may be prevented from receiving for the following reasons :
  - no receive buffer could be allocated
  - the receive buffer is full
  - a response is needed and the mux is currently being used by a stream
  - a response is needed and some room could not be found in the mux
    buffer (either full or waiting for allocation)
  - the stream buffer is waiting for allocation
  - the stream buffer is full

A mux may stop accepting data for the following reasons :
  - the buffer could not be allocated
  - the buffer is full

A stream may stop sending data to a mux for the following reaons :
  - the mux is busy processing another stream
  - the mux buffer lacks room (full or not allocated)
  - the mux's flow control prevents from sending
  - the stream's flow control prevents from sending

All these conditions were turned into flags for use by the respective
places.
2017-10-31 18:12:14 +01:00
Willy Tarreau
1439812da8 MEDIUM: h2: implement the mux buffer allocator
The idea is that we may need a mux buffer for anything, ranging from
receiving to sending traffic. For now it's unclear where exactly the
calls will be placed so let's block both send and recv when a buffer
is missing, and re-enable both of them at the end. This will have to
be changed later.
2017-10-31 18:12:14 +01:00
Willy Tarreau
35dbd5d719 MEDIUM: h2: dynamically allocate the demux buffer on Rx
This patch implements a very basic Rx buffer management. The mux needs
an rx buffer to decode the connection's stream. If this buffer it
available upon Rx events, we fill it with whatever input data are
available. Otherwise we try to allocate it and subscribe to the buffer
wait queue in case of failure. In such a situation, a function
"h2_dbuf_available()" will be called once a buffer may be allocated.
The buffer is released if it's still empty after recv().
2017-10-31 18:12:14 +01:00
Willy Tarreau
a2af51291f MEDIUM: h2: implement basic recv/send/wake functions
For now they don't do much since the buffers are not yet allocated, but
the squeletton is here.
2017-10-31 18:12:14 +01:00
Willy Tarreau
32218eb344 MEDIUM: h2: allocate and release the h2c context on connection init/end
The connection's h2c context is now allocated and initialized on mux
initialization, and released on mux destruction. Note that for now the
release() code is never called.
2017-10-31 18:12:14 +01:00
Willy Tarreau
c64051404d MINOR: h2: add a frame header descriptor for incoming frames
This descriptor will be used by the frame parser, it's designed to ease
manipulation of frame length, type, flags and sid.
2017-10-31 18:03:24 +01:00
Willy Tarreau
96060bad26 MINOR: h2: handle two extra stream states for errors
We need to deal with stream error notifications (RST_STREAM) as well as
internal reporting. The problem is that we don't know in which order
this will be done so we can't unilaterally decide to deallocate the
stream. In order to help, we add two extra stream states, H2_SS_ERROR
and H2_SS_RESET. The former mentions that the stream has an error pending
and the latter indicates that the error was already sent and that the
stream is now closed. It's equivalent to H2_SS_CLOSED except that in this
state we'll avoid sending new RST_STREAM as per RFC7540#5.4.2.

With this it will be possible to only detach or deallocate the h2s once
the stream is closed.
2017-10-31 18:03:24 +01:00
Willy Tarreau
183126488b MINOR: h2: create the h2s struct and the associated pool
This describes an HTTP/2 stream with its relation to the connection
and to the conn_stream on the other side.

For now we also allocate request and response state for HTTP/1 because
the internal HTTP representation is HTTP/1 at the moment. Later this
should evolve towards a version-agnostic representation and this H1
message state will disappear.

It's important to consider that the streams are necessarily polarized
depending on h2c : if the connection is incoming, streams initiated by
the connection receive requests and send responses. Otherwise it's the
other way around. Such information is known during the connection
instanciation by h2c_frt_init() and will normally be reflected in the
stream ID (odd=demux from client, even=demux from server). The initial
H2_CS_PREFACE state will also depend on the direction. The current h2c
state machine doesn't allow for outgoing connections as it uses a single
state for both (rx state only). It should be the demux state only.
2017-10-31 18:03:24 +01:00
Willy Tarreau
5ab6b57c6f MINOR: h2: create the h2c struct and allocate its pool
The h2c struct describes an H2 connection context and is assigned as the
mux's context. It has its own pool, allocated at boot time and released
after deinit().
2017-10-31 18:03:24 +01:00
Willy Tarreau
5242ef8095 MINOR: h2: expose tune.h2.max-concurrent-streams to limit the number of streams
This will be advertised in the settings frame.
2017-10-31 18:03:24 +01:00
Willy Tarreau
e6baec0e23 MINOR: h2: expose tune.h2.initial-window-size to configure the window size
This will be advertised in the settings frame.
2017-10-31 18:03:24 +01:00
Willy Tarreau
fe20e5b8c7 MINOR: h2: expose tune.h2.header-table-size to configure the table size
It's the HPACK header table size which is to be advertised in the settings
frames. It defaults to 4096.
2017-10-31 18:03:24 +01:00
Willy Tarreau
62f5269d05 MINOR: h2: create a very minimalistic h2 mux
This one currently does nothing and rejects every connection. It
registers ALPN token "h2".
2017-10-31 18:03:24 +01:00
Willy Tarreau
1be4f3d8af MEDIUM: hpack: implement basic hpack encoding
For now it only supports literals and a bit of static header table
references for the 9 most common header field names (date, server,
content-type, content-length, last-modified, accept-ranges, etag,
cache-control, location).

A previous incarnation of this commit used to strip the forbidden H2
header names (connection, proxy-connection, upgrade, transfer-encoding,
keep-alive) but this is no longer the case as this filtering is irrelevant
to HPACK encoding and is specific to H2, so this will have to be done by
the caller.

It's quite not optimal but works fine enough to prepare some valid and
partially compressed responses during development.
2017-10-31 18:03:24 +01:00
Willy Tarreau
679790baae MINOR: hpack: implement the decoder
The decoder is now fully functional. It makes use of the dynamic header
table. Dynamic header table size updates are currently ignored, as our
initially advertised value is the highest we support. Strictly speaking,
the impact is that a client referencing a header field after such an
update wouldn't observe an error instead of the connection being dropped
if it was implemented.

Decoded header fields are copied into a target buffer in HTTP/1 format
using HTTP/1.1 as the version. The Host header field is automatically
appended if a ":authority" header field is present.

All decoded header fields can be displayed if the file is compiled with
DEBUG_HPACK.
2017-10-31 18:03:24 +01:00
Willy Tarreau
ce04094c4a MINOR: hpack: implement the header tables management
This code deals with header insertion, retrieval and eviction, as well
as with dynamic header table defragmentation. It is functional for use
as a decoder and was heavily tested in this context. There's still some
room for optimization (eg: the defragmentation code currently does it
in place using a memcpy).

Also for now the dynamic header table is allocated using malloc() while
a pool needs to be created instead.

This code was mostly imported from https://github.com/wtarreau/http2-exp
with "hpack_" prepended in front of most names to avoid risks of conflicts.
Some small cleanups and renamings were applied during the import. This
version must be considered more recent.

Some HPACK error codes were placed here (HPACK_ERR_*), not exactly because
they're needed by the decoder but they'll be needed by all callers. Maybe
a different location should be found.
2017-10-31 18:03:24 +01:00
Willy Tarreau
a004ade512 MINOR: hpack: implement the HPACK Huffman table decoder
The code was borrowed from the HPACK experimental implementations
available here :

    https://github.com/wtarreau/http2-exp

It contains the Huffman table as specified in RFC7541 Appendix B, and a
set of reverse tables used to decode a Huffman byte stream, and produced
by contrib/h2/gen-rht. The encoder is not finalized, it doesn't emit the
byte stream but this is not needed for now.
2017-10-31 18:03:24 +01:00
Willy Tarreau
3e13cbafe2 MEDIUM: session: make use of the connection's destroy callback
Now we don't remove the session when a stream dies, instead we
detach the stream and let the mux decide to release the connection
and call session_free() instead.
2017-10-31 18:03:24 +01:00
Willy Tarreau
4f0c64cad7 MINOR: session: release the listener with the session, not the stream
Since multiple streams can share one session attached to one listener,
the listener_release() call must be done in session_free() and not in
stream_free(), otherwise we end up with a negative count in H2.
2017-10-31 18:03:24 +01:00
Willy Tarreau
436d333124 MEDIUM: connection: add a destroy callback
This callback will be used to release upper layers when a mux is in
use. Given that the mux can be asynchronously deleted, we need a way
to release the extra information such as the session.

This callback will be called directly by the mux upon releasing
everything and before the connection itself is released, so that
the callee can find its information inside the connection if needed.

The way it currently works is not perfect, and most likely this should
instead become a mux release callback, but for now we have no easy way
to add mux-specific stuff, and since there's one mux per connection,
it works fine this way.
2017-10-31 18:03:24 +01:00
Willy Tarreau
ac59f361ba MEDIUM: checks: exclusively use cs_destroy() to release a connection
This way we're using the more consistent API everywhere.
2017-10-31 18:03:24 +01:00
Willy Tarreau
3256073976 MEDIUM: stream: do not forcefully close the client connection anymore
Now that the mux will take care of closing the client connection at the
right moment, we don't need to close the client connection anymore, and
we just need to close the conn_stream.
2017-10-31 18:03:24 +01:00
Willy Tarreau
2c52a2b9ee MEDIUM: connection: make mux->detach() release the connection
For H2, only the mux's timeout or other conditions might cause a
release of the mux and the connection, no stream should be allowed
to kill such a shared connection. So a stream will only detach using
cs_destroy() which will call mux->detach() then free the cs.

For now it's only handled by mux_pt. The goal is that the data layer
never has to care about the connection, which will have to be released
depending on the mux's mood.
2017-10-31 18:03:24 +01:00
Willy Tarreau
a553ae96f5 MEDIUM: connection: replace conn_full_close() with cs_close()
At all call places where a conn_stream is in use, we can now use
cs_close() to get rid of a conn_stream and of its underlying connection
if the mux estimates it makes sense. This is what is currently being
done for the pass-through mux.
2017-10-31 18:03:24 +01:00
Willy Tarreau
4b79524591 MEDIUM: mux_pt: make cs_shutr() / cs_shutw() properly close the connection
Now these functions are able to automatically close both the transport
and the socket layer, causing the whole connection to be torn down if
needed.

The two shutdown modes are implemented for both directions, and when
a direction is closed, if it sees the other one is closed as well, it
completes by closing the connection. This is similar to what is performed
in the stream interface.

It's not deployed yet but the purpose is to get rid of conn_full_close()
where only conn_stream should be known.
2017-10-31 18:03:24 +01:00
Willy Tarreau
9fbbff6de4 MEDIUM: connection: make conn_sock_shutw() aware of lingering
Instead of having to manually handle lingering outside, let's make
conn_sock_shutw() check for it before calling shutdown(). We simply
don't want to emit the FIN if we're going to reset the connection
due to lingering. It's particularly important for silent-drop where
it's absolutely mandatory that no packet leaves the machine.
2017-10-31 18:03:24 +01:00
Willy Tarreau
ecdb3fe9f4 MINOR: conn_stream: modify cs_shut{r,w} API to pass the desired mode
Now we can specify how we want to shutdown (drain vs reset, and normal
vs silent), and this propagates to the mux then the transport layer.
2017-10-31 18:03:23 +01:00
Willy Tarreau
4ff3b89643 MINOR: connection: make conn_stream users also check for per-stream error flag
In a 1:1 connection:stream there's no problem relying on the connection
flags alone to check for errors. But in a mux, it will be possible to mark
certain streams in error without having to mark all of them. An example is
an H2 client sending RST_STREAM frames to abort a long download, or a parse
error requiring to abort only this specific stream.

This commit ensures that stream-interface and checks properly check for
CS_FL_ERROR in cs->flags wherever CO_FL_ERROR was in use. Most likely over
the long term, any check for CO_FL_ERROR will have to disappear.
2017-10-31 18:03:23 +01:00
Olivier Houchard
9aaf778129 MAJOR: connection : Split struct connection into struct connection and struct conn_stream.
All the references to connections in the data path from streams and
stream_interfaces were changed to use conn_streams. Most functions named
"something_conn" were renamed to "something_cs" for this. Sometimes the
connection still is what matters (eg during a connection establishment)
and were not always renamed. The change is significant and minimal at the
same time, and was quite thoroughly tested now. As of this patch, all
accesses to the connection from upper layers go through the pass-through
mux.
2017-10-31 18:03:23 +01:00
Olivier Houchard
7a3f0dfb7b MINOR: mux_pt: implement remaining mux_ops methods
This is a basic pass-through implementation which is now basic but
complete and operational, just not used yet.
2017-10-31 18:03:23 +01:00
Olivier Houchard
e2b40b9eab MINOR: connection: introduce conn_stream
This patch introduces a new struct conn_stream. It's the stream-side of
a multiplexed connection. A pool is created and destroyed on exit. For
now the conn_streams are not used at all.
2017-10-31 18:03:23 +01:00
Willy Tarreau
60ca10a372 MINOR: connection: report the major HTTP version from the MUX for logging (fc_http_major)
A new sample fetch function reports either 1 or 2 for the on-wire encoding,
to indicate if the request was received using the HTTP/1.x format or HTTP/2
format. Note that it reports the on-wire encoding, not the version presented
in the request header.

This will possibly have to evolve if it becomes necessary to report the
encoding on the server side as well.
2017-10-31 18:03:23 +01:00
Willy Tarreau
2e0b2b5f83 MEDIUM: session: use the ALPN token and proxy mode to select the mux
When an incoming connection is made on an HTTP mode frontend, the
session now looks up the mux to use based on the ALPN token and the
proxy mode. This will allow easier mux registration, and we don't
need to hard-code the mux_pt_ops anymore.
2017-10-31 18:03:23 +01:00
Willy Tarreau
f64908294c MINOR: mux: register the pass-through mux for any ALPN string
The pass-through mux is the fallback used on any incoming connection
unless another mux claims the ALPN name and the proxy mode. Thus mux_pt
registers ALPN token "" (empty name) which catches everything.
2017-10-31 18:03:23 +01:00
Willy Tarreau
2386be64ba MINOR: connection: implement alpn registration of muxes
Selecting a mux based on ALPN and the proxy mode will quickly become a
pain. This commit provides new functions to register/lookup a mux based
on the ALPN string and the proxy mode to make this easier. Given that
we're not supposed to support a wide range of muxes, the lookup should
not have any measurable performance impact.
2017-10-31 18:03:23 +01:00
Willy Tarreau
53a4766e40 MEDIUM: connection: start to introduce a mux layer between xprt and data
For HTTP/2 and QUIC, we'll need to deal with multiplexed streams inside
a connection. After quite a long brainstorming, it appears that the
connection interface to the existing streams is appropriate just like
the connection interface to the lower layers. In fact we need to have
the mux layer in the middle of the connection, between the transport
and the data layer.

A mux can exist on two directions/sides. On the inbound direction, it
instanciates new streams from incoming connections, while on the outbound
direction it muxes streams into outgoing connections. The difference is
visible on the mux->init() call : in one case, an upper context is already
known (outgoing connection), and in the other case, the upper context is
not yet known (incoming connection) and will have to be allocated by the
mux. The session doesn't have to create the new streams anymore, as this
is performed by the mux itself.

This patch introduces this and creates a pass-through mux called
"mux_pt" which is used for all new connections and which only
calls the data layer's recv,send,wake() calls. One incoming stream
is immediately created when init() is called on the inbound direction.
There should not be any visible impact.

Note that the connection's mux is purposely not set until the session
is completed so that we don't accidently run with the wrong mux. This
must not cause any issue as the xprt_done_cb function is always called
prior to using mux's recv/send functions.
2017-10-31 18:03:23 +01:00
Christopher Faulet
d7bddda151 BUG/MEDIUM: threads: Initialize the sync-point
The sync point must be initialized before starting threads. This line was lost
in one of merges preparing the threads support integration.
2017-10-31 18:03:06 +01:00
Willy Tarreau
a06a580941 BUG/MAJOR: threads/freq_ctr: use a memory barrier to detect changes
commit 6e01286 (BUG/MAJOR: threads/freq_ctr: fix lock on freq counters)
attempted to fix the loop using volatile but that doesn't work depending
on the level of optimization, resulting in situations where the threads
could remain looping forever. Here we use memory barriers between reads
to enforce a strict ordering and the asm code produced does exactly what
the C code does and works perfectly, with a 3-digit measurement accuracy
observed during a test.
2017-10-31 18:01:18 +01:00
Willy Tarreau
2510f702f9 MINOR: h1: add a function to measure the trailers length
This is needed in the H2->H1 gateway so that we know how long the trailers
block is in chunked encoding. It returns the number of bytes, or 0 if some
are missing, or -1 in case of parse error.
2017-10-31 17:18:10 +01:00
Willy Tarreau
f65610a83d CLEANUP: threads: rename process_mask to thread_mask
It was a leftover from the last cleaning session; this mask applies
to threads and calling it process_mask is a bit confusing. It's the
same in fd, task and applets.
2017-10-31 16:06:06 +01:00
Willy Tarreau
5f4a47b701 CLEANUP: threads: replace the last few 1UL<<tid with tid_bit
There were a few occurences left, better replace them now.
2017-10-31 15:59:32 +01:00
Olivier Houchard
79a481ddde MINOR: ssl: Remove the global allow-0rtt option. 2017-10-31 15:48:42 +01:00
Olivier Houchard
d16bfe6c01 BUG/MINOR: dns: Fix SRV records with the new thread code.
srv_set_fqdn() may be called with the DNS lock already held, but tries to
lock it anyway. So, add a new parameter to let it know if it was already
locked or not;
2017-10-31 15:47:55 +01:00