Commit Graph

9436 Commits

Author SHA1 Message Date
Willy Tarreau
5909380c05 BUG/MINOR: checks: stop polling for write when we have nothing left to send
Since the change of I/O direction, we perform the connect() call and
the send() call together from the top. But the send call must at least
disable polling for writes once it does not have anything left to send.

This bug is partially responsible for the waste of resources described
in issue #253.

This must be backported to 2.0.
2019-09-06 08:13:15 +02:00
Willy Tarreau
dbe3060e81 MINOR: fd: make updt_fd_polling() a normal function
It's called from many places, better use a real function than an inline.
2019-09-05 09:31:18 +02:00
Willy Tarreau
5bee3e2f47 MEDIUM: fd: remove the FD_EV_POLLED status bit
Since commit 7ac0e35f2 in 1.9-dev1 ("MAJOR: fd: compute the new fd polling
state out of the fd lock") we've started to update the FD POLLED bit a
bit more aggressively. Lately with the removal of the FD cache, this bit
is always equal to the ACTIVE bit. There's no point continuing to watch
it and update it anymore, all it does is create confusion and complicate
the code. One interesting side effect is that it now becomes visible that
all fd_*_{send,recv}() operations systematically call updt_fd_polling(),
except fd_cant_recv()/fd_cant_send() which never saw it change.
2019-09-05 09:31:18 +02:00
Christopher Faulet
51bb185618 BUG/MINOR: mux-h1: Fix a possible null pointer dereference in h1_subscribe()
This patch fixes the github issue #243. No backport needed.
2019-09-04 10:30:11 +02:00
Christopher Faulet
b066747107 BUG/MEDIUM: cache: Don't cache objects if the size of headers is too big
HTTP responses with headers than impinge upon the reserve must not be
cached. Otherwise, there is no warranty to have enough space to add the header
"Age" when such cached responses are delivered.

This patch must be backported to 2.0 and 1.9. For these versions, the same must
be done for the legacy HTTP mode.
2019-09-04 10:30:11 +02:00
Christopher Faulet
15a4ce870a BUG/MEDIUM: cache: Properly copy headers splitted on several shctx blocks
In the cache, huge HTTP headers will use several shctx blocks. When a response
is returned from the cache, these headers must be properly copied in the
corresponding HTX message by updating the pointer where to copied a header
part.

This patch must be backported to 2.0 and 1.9.
2019-09-04 10:30:11 +02:00
Christopher Faulet
f1ef7f641d BUG/MINOR: mux-h1: Be sure to update the count before adding EOM after trailers
Otherwise, an EOM may be added in a full buffer.

This patch must be backported to 2.0.
2019-09-04 10:30:11 +02:00
Christopher Faulet
6b32192cfb BUG/MINOR: mux-h1: Don't stop anymore input processing when the max is reached
The loop is now stopped only when nothing else is consumed from the input buffer
or if a parsing error is encountered. This will let a chance to detect cases
when we fail to add the EOM.

For instance, when the max is reached after the headers parsing and all the
message is received. In this case, we may have the flag H1S_F_REOS set without
the flag H1S_F_APPEND_EOM and no pending input data, leading to an error because
we think it is an abort.

This patch must be backported to 2.0. This bug does not affect 1.9.
2019-09-04 10:30:11 +02:00
Christopher Faulet
8427d0d6f8 BUG/MINOR: mux-h1: Fix size evaluation of HTX messages after headers parsing
The block size of the start-line was not counted.

This patch must be backported to 2.0.
2019-09-04 10:30:11 +02:00
Christopher Faulet
84f06533e1 BUG/MINOR: h1: Properly reset h1m when parsing is restarted
Otherwise some processing may be performed twice. For instance, if the header
"Content-Length" is parsed on the first pass, when the parsing is restarted, we
skip it because we think another header with the same value was already seen. In
fact, it is currently the only existing bug that can be encountered. But it is
safer to reset all the h1m on restart to avoid any future bugs.

This patch must be backported to 2.0 and 1.9
2019-09-04 10:30:11 +02:00
Christopher Faulet
3499f62b59 BUG/MINOR: http-ana: Reset response flags when 1xx messages are handled
Otherwise, the following final response could inherit of some of these
flags. For instance, because informational responses have no body, the flag
HTTP_MSGF_BODYLESS is set for 1xx messages. If it is not reset, this flag will
be kept for the final response.

One of visible effect of this bug concerns the HTTP compression. When the final
response is preceded by an 1xx message, the compression is not performed. This
was reported in github issue #229.

This patch must be backported to 2.0 and 1.9. Note that the file http_ana.c does
not exist for these branches, the patch must be applied on proto_htx.c instead.
2019-09-04 10:29:55 +02:00
Jerome Magnin
78891c7e71 BUILD: connection: silence gcc warning with extra parentheses
Commit 8a4ffa0a ("MINOR: send-proxy-v2: sends authority TLV according
to TLV received") is missing parentheses around a variable assignment
used as condition in an if statement, and gcc isn't happy about it.
2019-09-02 16:59:32 +02:00
Frédéric Lécaille
9c3a0ceeac BUG/MEDIUM: peers: local peer socket not bound.
This bug came with 015e4d7 commit: "MINOR: stick-tables: Add peers process
binding computing" where the "stick" rules cases were missing when computing
the peer local listener process binding. At parsing time we store in the
stick-table struct ->proxies_list the proxies which refer to this stick-table.
The process binding is computed after having parsed the entire configuration file
with this simple loop in cfgparse.c:

     /* compute the required process bindings for the peers from <stktables_list>
      * for all the stick-tables, the ones coming with "peers" sections included.
      */
     for (t = stktables_list; t; t = t->next) {
             struct proxy *p;

             for (p = t->proxies_list; p; p = p->next_stkt_ref) {
                     if (t->peers.p && t->peers.p->peers_fe) {
                             t->peers.p->peers_fe->bind_proc |= p->bind_proc;
                     }
             }
     }

Note that if this process binding is not correctly initialized, the child forked
by the master-worker stops the peer local listener. Should be also the case
when daemonizing haproxy.

Must be backported to 2.0.
2019-09-02 14:39:38 +02:00
Emmanuel Hocdet
8a4ffa0aab MINOR: send-proxy-v2: sends authority TLV according to TLV received
Since patch "7185b789", the authority TLV in a PROXYv2 header from a
client connection is stored. Authority TLV sends in PROXYv2 should be
taken into account to allow chaining PROXYv2 without droping it.
2019-08-31 12:28:33 +02:00
Willy Tarreau
c046d167e4 MEDIUM: log: add support for logging to a ring buffer
Now by prefixing a log server with "ring@<name>" it's possible to send
the logs to a ring buffer. One nice thing is that it allows multiple
sessions to consult the logs in real time in parallel over the CLI, and
without requiring file system access. At the moment, ring0 is created as
a default sink for tracing purposes and is available. No option is
provided to create new rings though this is trivial to add to the global
section.
2019-08-30 15:24:59 +02:00
Willy Tarreau
f3dc30f6de MINOR: log: add a target type instead of hacking the address family
Instead of detecting an AF_UNSPEC address family for a log server and
to deduce a file descriptor, let's create a target type field and
explicitly mention that the socket is of type FD.
2019-08-30 15:07:25 +02:00
Willy Tarreau
d52a7f8c8d MEDIUM: log: use the new generic fd_write_frag_line() function
When logging to a file descriptor, we'd rather use the unified
fd_write_frag_line() which uses the FD's lock than perform the
writev() ourselves and use a per-server lock, because if several
loggers point to the same output (e.g. stdout) they are still
not locked and their logs may interleave. The function above
instead relies on the fd's lock so this is safer and will even
protect against concurrent accesses from other areas (e.g traces).
The function also deals with the FD's non-blocking mode so we do
not have to keep specific code for this anymore in the logs.
2019-08-30 15:07:25 +02:00
Willy Tarreau
7e9776ad7b MINOR: fd/log/sink: make the non-blocking initialization depend on the initialized bit
Logs and sinks were resorting to dirty hacks to initialize an FD to
non-blocking mode. Now we have a bit for this in the fd tab so we can
do it on the fly on first use of the file descriptor. Previously it was
set per log server by writing value 1 to the port, or during a sink
initialization regardless of the usage of the fd.
2019-08-30 15:07:25 +02:00
Willy Tarreau
76913d3ef4 CLEANUP: fd: remove leftovers of the fdcache
The "cache" entry was still present in the fdtab struct and it was
reported in "show sess". Removing it broke the cache-line alignment
on 64-bit machines which is important for threads, so it was fixed
by adding an attribute(aligned()) when threads are in use. Doing it
only in this case allows 32-bit thread-less platforms to see the
struct fit into 32 bytes.
2019-08-30 15:07:25 +02:00
Willy Tarreau
30362908d8 BUG/MINOR: ring: b_peek_varint() returns a uint64_t, not a size_t
The difference matters when building on 32-bit architectures and a
warning was rightfully emitted.

No backport is needed.
2019-08-30 15:07:25 +02:00
Willy Tarreau
e7bbbca781 BUG/MEDIUM: mux-h2/trace: fix missing braces added with traces
Ilya reported in issue #242 that h2c_handle_priority() was having
unreachable code...  Obviously, I missed the braces around the "if",
leaving an unconditional return.

No backport is needed.
2019-08-30 15:03:58 +02:00
Willy Tarreau
fe1c908744 BUG/MEDIUM: mux-h2/trace: do not dereference h2c->conn after failed idle
In h2_detach(), if session_check_idle_conn() returns <0 we must not
dereference it since it has been freed.

No backport is needed.
2019-08-30 15:00:42 +02:00
Willy Tarreau
1d181e489c MEDIUM: ring: implement a wait mode for watchers
Now it is possible for a reader to subscribe and wait for new events
sent to a ring buffer. When new events are written to a ring buffer,
the applets that are subscribed are woken up to display new events.
For now we only support this with the CLI applet called by "show events"
since the I/O handler is indeed a CLI I/O handler. But it's not
complicated to add other mechanisms to consume events and forward them
to external log servers for example. The wait mode is enabled by adding
"-w" after "show events <sink>". An extra "-n" was added to directly
seek to new events only.
2019-08-30 11:58:58 +02:00
Willy Tarreau
70b1e50feb MINOR: mux-h2/trace: report the connection pointer and state before FRAME_H
Initially we didn't report anything before FRAME_H but at least the
connection's pointer and its state are desirable.
2019-08-30 11:58:58 +02:00
Willy Tarreau
300decc8d9 MINOR: cli: extend the CLI context with a list and two offsets
Some CLI parsers are currently abusing the CLI context types such as
pointers to stuff longs into them by lack of room. But the context is
80 bytes while cli is only 48, thus there's some room left. This patch
adds a list element and two size_t usable as various offsets. The list
element is initialized.
2019-08-30 11:58:58 +02:00
Willy Tarreau
13696ffba2 BUG/MINOR: ring: fix the way watchers are counted
There are two problems with the way we count watchers on a ring:
  - the test for >=255 was accidently kept to 1 used during QA
  - if the producer deletes all data up to the reader's position
    and the reader is called, cannot write, and is called again,
    it will have a zero offset, causing it to initialize itself
    multiple times and each time adding a new refcount.

Let's change this and instead use ~0 as the initial offset as it's
not possible to have it as a valid one. We now store it into the
CLI context's size_t o0 instead of casting it to a void*.

No backport needed.
2019-08-30 11:58:58 +02:00
Willy Tarreau
99282ddb2c MINOR: trace: extend default event names to 12 chars
With "tx_settings" the 10-chars limit was already passed, thus it sounds
reasonable to push this slightly.
2019-08-30 07:39:59 +02:00
Willy Tarreau
8795194f79 CLEANUP: mux-h2/trace: lower-case event names
I wanted to do it before pushing and forgot. It's easier to type lower-
case event names and more consistent with the "none" and "any" keywords.
2019-08-30 07:39:59 +02:00
Willy Tarreau
8fecec2839 CLEANUP: mux-h2/trace: reformat the "received" messages for better alignment
user-level traces are more readable when visually aligned. This is easily
done by writing "rcvd" instead of "received" to align with "sent" :

  $ socat - /tmp/sock1 <<< "show events buf0"
  [00|h2|0|mux_h2.c:2465] rcvd H2 request  : [1] H2 REQ: GET /?s=10k HTTP/2.0
  [00|h2|0|mux_h2.c:4563] sent H2 response : [1] H2 RES: HTTP/1.1 200
2019-08-30 07:39:59 +02:00
Willy Tarreau
c067a3ac8f MINOR: mux-h2/trace: report h2s->id before h2c->dsi for the stream ID
h2c->dsi is only for demuxing, and needed while decoding a new request.
But if we already have a valid stream ID (e.g. response or outgoing
request), we should use it instead. This avoids seeing [0] in front of
the responses at user level.
2019-08-30 07:39:59 +02:00
Willy Tarreau
17104d46be MINOR: mux-h2/trace: always report the h2c/h2s state and flags
There's no limitation to just "state" trace level anymore, we're
expected to always show these internal states at verbosity levels
above "clean".
2019-08-30 07:39:59 +02:00
Willy Tarreau
94f1dcf119 MINOR: mux-h2/trace: only decode the start-line at verbosity other than "minimal"
This is as documented in "trace h2 verbosity", level "minimal" only
features flags and doesn't perform any decoding at all, "simple" does,
just like "clean" which is the default for end uesrs.
2019-08-30 07:39:59 +02:00
Willy Tarreau
f7dd5191cd MINOR: mux-h2/trace: add a new verbosity level "clean"
The "clean" output will be suitable for user and proto-level output
where the internal stuff (state, pointers, etc) is not desired but
just the basic protocol elements.
2019-08-30 07:38:42 +02:00
Willy Tarreau
ab2ec45403 MINOR: mux-h2: add functions to convert an h2c/h2s state to a string
We need this all the time in traces, let's have it now. For the sake
of compact outputs, the strings are all 3-chars long. The "show fd"
output was improved to make use of this.
2019-08-30 07:10:46 +02:00
Willy Tarreau
7838a79bac MEDIUM: mux-h2/trace: add lots of traces all over the code
All functions of the h2 data path were updated to receive one or multiple
TRACE() calls, at least one pair of TRACE_ENTER()/TRACE_LEAVE(), and those
manipulating protocol elements have been improved to report frame types,
special state transitions or detected errors. Even with careful tests, no
performance impact was measured when traces are disabled.

They are not completely exploited yet, the callback function tries to
dump a lot about them, but still doesn't provide buffer dumps, nor does
it indicate the stream or connection error codes.

The first argument is always set to the connection when known. The second
argument is set to the h2s when known, sometimes a 3rd argument is set to
a buffer, generally the rxbuf or htx, and occasionally the 4th argument
points to an integer (number of bytes read/sent, error code).

Retrieving a 10kB object produces roughly 240 lines when at developer
level, 35 lines at data level, 27 at state level, and 10 at proto level
and 2 at user level.

For now the headers are not dumped, but the start line are emitted in
each direction at user level.

The patch is marked medium because it touches lots of places, though
it takes care not to change the execution path.
2019-08-29 18:22:12 +02:00
Willy Tarreau
db3cfff200 MINOR: mux-h2/trace: add the default decoding callback
The new function h2_trace() is called when relevant by the trace subsystem
in order to provide extra information about the trace being produced. It
can for example display the connection pointer, the stream pointer, etc.
It is declared in the trace source as the default callback as we expect
it to be versatile enough to enrich most traces.

In addition, for requests and responses, if we have a buffer and we can
decode it as an HTX buffer, we can extract user-friendly information from
the start line.
2019-08-29 18:19:11 +02:00
Willy Tarreau
12ae212837 MINOR: mux-h2/trace: register a new trace source with its events
For now the traces are not used. Supported events are categorized by
where the activity comes from (h2c, h2s, stream, etc), a direction
(send/recv/wake), and a list of possibilities for each of them (frame
types, errors, shut, ...). This results in ~50 different events that
try to cover a lot of possibilities when it's needed to filter on
something specific. Special events like protocol error are handled.
A few aggregate events like "rx_frame" or "tx_frame" are planed to
cover all frame types at once by being placed all the time with any
of the other ones.

We also state that the first argument is always the connection. This way
the trace subsystem will be able to safely retrieve some useful info, and
we'll still be able to get the h2c from there (conn->ctx) in a pretty
print function. The second argument will always be an h2s, and in order
to propose it for tracking, we add its description. We also define 4
verbosity levels, which seems more than enough.
2019-08-29 17:14:35 +02:00
Willy Tarreau
370a694879 MINOR: trace: change the detail_level to per-source verbosity
The detail level initially based on syslog levels is not used, while
something related is missing, trace verbosity, to indicate whether or
not we want to call the decoding callback and what level of decoding
we want (raw captures etc). Let's change the field to "verbosity" for
this. A verbosity of zero means that the decoding callback is not
called, and all other levels are handled by this callback and are
source-specific. The source is now prompted to list the levels that
are proposed to the user. When the source doesn't define anything,
"quiet" and "default" are available.
2019-08-29 17:11:25 +02:00
Willy Tarreau
052ad360cd MINOR: trace: also report the trace level in the output
It's very convenient to know the trace level in the output, at least to
grep/grep -v on it. The main usage is to filter/filter out the developer
traces at level DEVEL. For this we now add the numeric level (0 to 4) just
after the source name in the brackets. The output now looks like this :

  [00|h2|4|mux_h2.c:3174] h2_send(): entering : h2c=0x27d75a0 st=2
   -, -, | ------------,  --------,  -------------------> message
    |  | |             |          '-> function name
    |  | |             '-> source file location
    |  | '-> trace level (4=dev)
    |  '-> trace source
    '-> thread number
2019-08-29 17:11:25 +02:00
Willy Tarreau
09fb0df6fd MINOR: trace: prepend the function name for developer level traces
Working on adding traces to mux-h2 revealed that the function names are
manually copied a lot in developer traces. The reason is that they are
not preprocessor macros and as such cannot be concatenated. Let's
slightly adjust the trace() function call to take a function name just
after the file:line argument. This argument is only added for the
TRACE_DEVEL and 3 new TRACE_ENTER, TRACE_LEAVE, and TRACE_POINT macros
and left NULL for others. This way the function name is only reported
for traces aimed at the developers. The pretty-print callback was also
extended to benefit from this. This will also significantly shrink the
data segment as the "entering" and "leaving" strings will now be merged.

One technical point worth mentioning is that the function name is *not*
passed as an ist to the inline function because it's not considered as
a builtin constant by the compiler, and would lead to strlen() being
run on it from all call places before calling the inline function. Thus
instead we pass the const char * (that the compiler knows where to find)
and it's the __trace() function that converts it to an ist for internal
consumption and for the pretty-print callback. Doing this avoids losing
5-10% peak performance.
2019-08-29 17:09:13 +02:00
Willy Tarreau
2ea549bc43 MINOR: trace: change the "payload" level to "data" and move it
The "payload" trace level was ambigous because its initial purpose was
to be able to dump received data. But it doesn't make sense to force to
report data transfers just to be able to report state changes. For
example, all snd_buf()/rcv_buf() operations coming from the application
layer should be tagged at this level. So here we move this payload level
above the state transitions and rename it to avoid the ambiguity making
one think it's only about request/response payload. Now it clearly is
about any data transfer and is thus just below the developer level. The
help messages on the CLI and the doc were slightly reworded to help
remove this ambiguity.
2019-08-29 10:46:11 +02:00
Geoff Simmons
7185b789f9 MINOR: connection: add the fc_pp_authority fetch -- authority TLV, from PROXYv2
Save the authority TLV in a PROXYv2 header from the client connection,
if present, and make it available as fc_pp_authority.

The fetch can be used, for example, to set the SNI for a backend TLS
connection.
2019-08-28 17:16:20 +02:00
Willy Tarreau
a9f5b96e02 MINOR: trace: show thread number and source name in the trace
Traces were missing the thread number and the source name, which was
really annoying. Now the thread number is emitted on two digits inside
the square brackets, followed by the source name then the line location,
each delimited with a vertical bar, such as below :

  [00|h2|mux_h2.c:2651] Notifying stream about SID change : h2c=0x7f3284581ae0 st=3 h2s=0x7f3284297f00 id=523 st=4
  [00|h2|mux_h2.c:2708] receiving H2 HEADERS frame : h2c=0x7f3284581ae0 st=3 dsi=525 (st=0)
  [02|h2|mux_h2.c:2194] Received H2 request : h2c=0x7f328d3d1ae0 st=2 : [525] H2 REQ: GET / HTTP/2.0
  [02|h2|mux_h2.c:2561] Expecting H2 frame header : h2c=0x7f328d3d1ae0 st=2
2019-08-28 10:10:50 +02:00
Willy Tarreau
b3f7a72c27 MINOR: trace: extend the source location to 13 chars
With 4-digit line numbers, this allows to emit up to 6 chars of file
name before extension, instead of 3 previously.
2019-08-28 10:10:50 +02:00
Willy Tarreau
3da0026d25 MINOR: trace: support a default callback for the source
It becomes apparent that most traces will use a single trace pretty
print callback, so let's allow the trace source to declare a default
one so that it can be omitted from trace calls, and will be used if
no other one is specified.
2019-08-28 07:06:23 +02:00
Willy Tarreau
8f24023ba0 MINOR: sink: now report the number of dropped events on output
The principle is that when emitting a message, if some dropped events
were logged, we first attempt to report this counter before going
further. This is done under an exclusive lock while all logs are
produced under a shared lock. This ensures that the dropped line is
accurately reported and doesn't accidently arrive after a later
event.
2019-08-27 17:14:19 +02:00
Willy Tarreau
9f830d7408 MINOR: sink: implement "show events" to show supported sinks and dump the rings
The new "show events" CLI keyword lists supported event sinks. When
passed a buffer-type sink it completely dumps it.

no drops at all during attachment even at 8 millon evts/s.
still missing the attachment limit though.
2019-08-27 17:14:19 +02:00
Willy Tarreau
4ed23ca0e7 MINOR: sink: add support for ring buffers
This now provides sink_new_buf() which allocates a ring buffer. One such
ring ("buf0") of 1 MB is created already, and may be used by sink_write().
The sink's creation should probably be moved somewhere else later.
2019-08-27 17:14:19 +02:00
Willy Tarreau
072931cdcb MINOR: ring: add a generic CLI io_handler to dump a ring buffer
The three functions (attach, IO handler, and release) are meant to be
called by any CLI command which requires to dump the contents of a ring
buffer. We do not implement anything generic to dump any ring buffer on
the CLI since it's meant to be used by other functionalities above.
However these functions deal with locking and everything so it's trivial
to embed them in other code.
2019-08-27 17:14:19 +02:00
Willy Tarreau
be97853c2f MINOR: ring: add a ring_write() function
This function tries to write to the ring buffer, possibly removing enough
old messages to make room for the new one. It takes two arrays of fragments
on input to ease the insertion of prefixes by the caller. It atomically
writes the message, possibly truncating it if desired, and returns the
operation's status.
2019-08-27 17:14:19 +02:00
Willy Tarreau
172945fbad MINOR: ring: add a new mechanism for retrieving/storing ring data in buffers
Our circular buffers are well suited for being used as ring buffers for
not-so-structured data. The machanism here consists in making room in a
buffer before inserting a new record which is prefixed by its size, and
looking up next record based on the previous one's offset and size. We
can have up to 255 consumers watching for data (dump in progress, tail)
which guarantee that entrees are not recycled while they're being dumped.
The complete representation is described in the header file. For now only
ring_new(), ring_resize() and ring_free() are created.
2019-08-27 17:14:19 +02:00
Willy Tarreau
a1426de5aa MINOR: sink: now call the generic fd write function
Let's not mess up with fd-specific code, locking nor message formating
here, and use the new generic function instead. This substantially
simplifies the sink_write() code and makes it more agnostic to the
output representation and storage.
2019-08-27 17:14:19 +02:00
Willy Tarreau
931d8b79a8 MINOR: fd: add fd_write_frag_line() to send a fragmented line to an fd
Currently both logs and event sinks may use a file descriptor to
atomically emit some output contents. The two may use the same FD though
nothing is done to make sure they use the same lock. Also there is quite
some redundancy between the two. Better make a specific function to send
a fragmented message to a file descriptor which will take care of the
locking via the fd's lock. The function is also able to truncate a
message and to enforce addition of a trailing LF when building the
output message.
2019-08-27 17:14:19 +02:00
Willy Tarreau
4d589e719b MINOR: tools: add a function varint_bytes() to report the size of a varint
It will sometimes be useful to encode varints to know the output size in
advance. Two versions are provided, one inline using a switch/case construct
which will be trivial for use with constants (and will be very fast albeit
huge) and one function iterating on the number which is 5 times smaller,
for use with variables.
2019-08-27 17:14:19 +02:00
Willy Tarreau
799e9ed62b MINOR: sink: set the fd-type sinks to non-blocking
Just like we used to do for the logs, we must disable blocking on FD
output except if it's a terminal.
2019-08-27 17:14:18 +02:00
Nenad Merdanovic
177adc9e57 MINOR: backend: Add srv_queue converter
The converter can be useful to look up a server queue from a dynamic value.

It takes an input value of type string, either a server name or
<backend>/<server> format and returns the number of queued sessions
on that server. Can be used in places where we want to look up
queued sessions from a dynamic name, like a cookie value (e.g.
req.cook(SRVID),srv_queue) and then make a decision to break
persistence or direct a request elsewhere.

Signed-off-by: Nenad Merdanovic <nmerdan@haproxy.com>
2019-08-27 04:32:06 +02:00
Jerome Magnin
2dd26ca9ff BUG/MEDIUM: url32 does not take the path part into account in the returned hash.
The url32 sample fetch does not take the path part of the URL into
account. This is because in smp_fetch_url32() we erroneously modify
path.len and path.ptr before testing their value and building the
path based part of the hash.

This fixes issue #235

This must be backported as far as 1.9, when HTX was introduced.
2019-08-26 13:28:13 +02:00
Willy Tarreau
6ee9f8df3b BUG/MEDIUM: listener/threads: fix an AB/BA locking issue in delete_listener()
The delete_listener() function takes the listener's lock before taking
the proto_lock, which is contrary to what other functions do, possibly
causing an AB/BA deadlock. In practice the two only places where both
are taken are during protocol_enable_all() and delete_listener(), the
former being used during startup and the latter during stop. In practice
during reload floods, it is technically possible for a thread to be
initializing the listeners while another one is stopping. While this
is too hard to trigger on 2.0 and above due to the synchronization of
all threads during startup, it's reasonably easy to do in 1.9 by having
hundreds of listeners, starting 64 threads and flooding them with reloads
like this :

   $ while usleep 50000; do killall -USR2 haproxy; done

Usually in less than a minute, all threads will be deadlocked. The fix
consists in always taking the proto_lock before the listener lock. It
seems to be the only place where these two locks were reversed. This
fix needs to be backported to 2.0, 1.9, and 1.8.
2019-08-26 11:07:09 +02:00
Willy Tarreau
e0d86e2c1c BUG/MINOR: mworker: disable SIGPROF on re-exec
If haproxy is built with profiling enabled with -pg, it is possible to
see the master quit during a reload while it's re-executing itself with
error code 155 (signal 27) saying "Profile timer expired)". This happens
if the SIGPROF signal is delivered during the execve() call while the
handler was already unregistered. The issue itself is not directly inside
haproxy but it's easy to address. This patch disables this signal before
calling execvp() during a master reload. A simple test for this consists
in running this little script with haproxy started in master-worker mode :

     $ while usleep 50000; do killall -USR2 haproxy; done

This fix should be backported to all versions using the master-worker
model.
2019-08-26 10:44:48 +02:00
Willy Tarreau
0bb5a5c4b5 BUG/MEDIUM: mux-h1: do not report errors on transfers ending on buffer full
If a receipt ends with the HTX buffer full and everything is completed except
appending the HTX EOM block, we end up detecting an error because the H1
parser did not switch to H1_MSG_DONE yet while all conditions for an end of
stream and end of buffer are met. This can be detected by retrieving 31532
or 31533 chunk-encoded bytes over H1 and seeing haproxy log "SD--" at the
end of a successful transfer.

Ideally the EOM part should be totally independent on the H1 message state
since the block was really parsed and finished. So we should switch to a
last state requiring to send only EOM. However this needs a few risky
changes. This patch aims for simplicity and backport safety, thus it only
adds a flag to the H1 stream indicating that an EOM is still needed, and
excludes this condition from the ones used to detect end of processing. A
cleaner approach needs to be studied, either by adding a state before DONE
or by setting DONE once the various blocks are parsed and before trying to
send EOM.

This fix must be backported to 2.0. The issue does not seem to affect 1.9
though it is not yet known why, probably that it is related to the different
encoding of trailers which always leaves a bit of room to let EOM be stored.
2019-08-23 09:37:30 +02:00
Willy Tarreau
347f464d4e BUG/MEDIUM: mux-h1: do not truncate trailing 0CRLF on buffer boundary
The H1 message parser calls the various message block parsers with an
offset indicating where in the buffer to start from, and only consumes
the data at the end of the parsing. The headers and trailers parsers
have a condition detecting if a headers or trailers block is too large
to fit into the buffer. This is detected by an incomplete block while
the buffer is full. Unfortunately it doesn't take into account the fact
that the block may be parsed after other blocks that are still present
in the buffer, resulting in aborting some transfers early as reported
in issue #231. This typically happens if a trailers block is incomplete
at the end of a buffer full of data, which typically happens with data
sizes multiple of the buffer size minus less than the trailers block
size. It also happens with the CRLF that follows the 0-sized chunk of
any transfer-encoded contents is itself on such a boundary since this
CRLF is technically part of the trailers block. This can be reproduced
by asking a server to retrieve exactly 31532 or 31533 bytes of static
data using chunked encoding with curl, which reports:

   transfer closed with outstanding read data remaining

This issue was revealed in 2.0 and does not affect 1.9 because in 1.9
the trailers block was processed at once as part of the data block
processing, and would simply give up and wait for the rest of the data
to arrive.

It's interesting to note that the headers block parsing is also affected
by this issue but in practice it has a much more limited impact since a
headers block is normally only parsed at the beginning of a buffer. The
only case where it seems to matter is when dealing with a response buffer
full of 100-continue header blocks followed by a regular header block,
which will then be rejected for the same reason.

This fix must be backported to 2.0 and partially to 1.9 (the headers
block part).
2019-08-23 08:11:36 +02:00
Willy Tarreau
d8b99edeed MINOR: trace: retrieve useful pointers and enforce lock-on
Now we try to find frontend, listener, backend, server, connection,
session, stream, from the presented argument of type connection,
stream or session. Various combinations and bounces allow to
retrieve most of them almost all the time. The extraction is
performed early so that we'll be able to apply filters later.

The lock-on is set if it was not there while the trace is running and
a valid pointer is available. If it was already set and doesn't match,
no trace is produced.
2019-08-22 20:21:00 +02:00
Willy Tarreau
60e4c9f8db MINOR: trace: parse the "lock" argument to trace
When no criterion is provided, it carefully enumerates all available
ones, including those provided by the source itself. Otherwise it sets
the new criterion and resets the lockon pointer.
2019-08-22 20:21:00 +02:00
Willy Tarreau
beadb5c823 MINOR: trace: make sure to always stop the locking when stopping or pausing
When we stop or pause a trace (either on a matching event or by hand),
we must also stop the lock-on feature so that we don't follow any
further activity on this pointer even if it is recycled. For now this
is not exploited.
2019-08-22 20:21:00 +02:00
Willy Tarreau
bfd14fc6eb MINOR: trace: implement a call to a decode function
The trace() call will support an optional decoding callback and 4
arguments that this function is supposed to know how to use to provide
extra information. The output remains unchanged when the function is
NULL. Otherwise, the message is pre-filled into the thread-local
trace_buf, and the function is called with all arguments so that it
completes the buffer in a readable form depending on the expected
level of detail.
2019-08-22 20:21:00 +02:00
Willy Tarreau
5da408818b MINOR: trace: make trace() now also take a level in argument
This new "level" argument will allow the trace sources to label the
traces for different purposes, and filter out some of them if they
are not relevant to the current target. Right now we have 5 different
levels:
  - USER : the least verbose one, only a few functional information
  - PAYLOAD: like user but also displays some payload-related information
  - PROTO: focuses on the protocol's framing
  - STATE: also indicate state internal transitions or non-transitions
  - DEVELOPER: adds extra info about branches taken in the code (break
    points, return points)
2019-08-22 20:21:00 +02:00
Willy Tarreau
419bd49f0b MINOR: trace: add the file name and line number in the prefix
We now pass an extra argument "where" to the trace() call, which
is supposed to be an ist made of the concatenation of the filename
and the line number. We only keep the last 10 chars from this string
since the end of file names is most often easy to recognize. This
gives developers useful information at very low cost.
2019-08-22 20:21:00 +02:00
Willy Tarreau
4c2ae48375 MINOR: trace: implement a very basic trace() function
For now it remains quite basic. It performs a few state checks, calls
the source's sink if defined, and performs the transitions between
RUNNING, STOPPED and WAITING when the configured events match.
2019-08-22 20:21:00 +02:00
Willy Tarreau
85b157570b MINOR: trace/cli: add "show trace" to report trace state and statistics
The new "show trace" CLI command lists available trace sources and
indicates their status, their sink, and number of dropped packets.
When "show trace <source>" is used, the list of known events is
also listed with their status per action (report/start/stop/pause).
2019-08-22 20:21:00 +02:00
Willy Tarreau
aaaf411406 MINOR: trace/cli: parse the "level" argument to configure the trace verbosity
The "level" keyword allows to indicate the expected level of verbosity
in the traces, among "user" (least verbose, just synthetic info) to
"developer" (very detailed, including function entry/leaving). It's only
displayed and set but not used yet.
2019-08-22 20:21:00 +02:00
Willy Tarreau
864e880f6c MINOR: trace/cli: register the "trace" CLI keyword to list the sources
For now it lists the sources if one is not provided, and checks
for the source's existence. It lists the events if not provided,
checks for their existence if provided, and adjusts reported
events/start/stop/pause events, and performs state transitions.
It lists sinks and adjusts them as well. Filters, lock, and
level are not implemented yet.
2019-08-22 20:21:00 +02:00
Willy Tarreau
88ebd4050e MINOR: trace: add allocation of buffer-sized trace buffers
This will be needed so that we can implement protocol decoders which will
have to emit their contents into such a buffer.
2019-08-22 20:21:00 +02:00
Willy Tarreau
4151c753fc MINOR: trace: start to create a new trace subsystem
The principle of this subsystem will be to support taking live traces
at various places in the code with conditional triggers, filters, and
ability to lock on some elements. The traces will support typed events
and will be sent into sinks made of ring buffers, file descriptors or
remote servers.
2019-08-22 20:21:00 +02:00
Willy Tarreau
973e662fe8 MINOR: sink: add a support for file descriptors
This is the most basic type of sink. It pre-registers "stdout" and
"stderr", and is able to use writev() on them. The writev() operation
is locked to avoid mixing outputs. It's likely that the registration
should move somewhere else to take into account the fact that stdout
and stderr are still opened or are closed.
2019-08-22 20:21:00 +02:00
Willy Tarreau
67b5a161b4 MINOR: sink: create definitions a minimal code for event sinks
The principle will be to be able to dispatch events to various destinations
called "sinks". This is already done in part in logs where log servers can
be either a UDP socket or a file descriptor. This will be needed with the
new trace subsystem where we may also want to add ring buffers. And it turns
out that all such destinations make sense at all places. Logs may need to be
sent to a TCP server via a ring buffer, or consulted from the CLI. Trace
events may need to be sent to stdout/stderr as well as to remote log servers.

This patch creates a new structure "sink" aiming at addressing these similar
needs. The goal is to merge together what is common to all of them, such as
the output format, the dropped events count, etc, and also keep separately
the target identification (network address, file descriptor). Provisions
were made to have a "waiter" on the sink. For a TCP log server it will be
the task to wake up after writing to the log buffer. For a ring buffer, it
could be the list of watchers on the CLI running a "tail" operation and
waiting for new events. A lock was also placed in the struct since many
operations will require some locking, including the FD ones. The output
formats covers those in use by logs and two extra ones prepending the ISO
time in front of the message (convenient for stdio/buffer).

For now only the generic infrastructure is present, no type-specific
output is implemented. There's the sink_write() function which prepares
and formats a message to be sent, trying hard to avoid copies and only
using pointer manipulation, where the type-specific code just has to be
added. Dropped messages are already counted (for now 100% drop). The
message is put into an iovec array as it will be trivial to use with
file descriptors and sockets.
2019-08-22 20:21:00 +02:00
Willy Tarreau
9eebd8a978 REORG: trace: rename trace.c to calltrace.c and mention it's not thread-safe
The function call tracing code is a quite old and was never ported to
support threads. It's not even sure whether it still works well, but
at least its presence creates confusion for future work so let's rename
it to calltrace.c and add a comment about its lack of thread-safety.
2019-08-22 20:21:00 +02:00
Olivier Houchard
02bac85bee BUG/MEDIUM: h1: Always try to receive more in h1_rcv_buf().
In h1_rcv_buf(), wake the h1c tasklet as long as we're not done reading the
request/response, and the h1c is not already subscribed for receiving. Now
that we no longer subscribe in h1_recv() if we managed to read data, we
rely on h1_rcv_buf() calling us again, but h1_process_input() may have
returned 0 if we only received part of the request, so we have to wake
the tasklet to be sure to get more data again.
2019-08-22 18:35:42 +02:00
Willy Tarreau
78a7cb648c MEDIUM: debug: make the thread dump code show Lua backtraces
When we dump a thread's state (show thread, panic) we don't know if
anything is happening in Lua, which can be problematic especially when
calling external functions. With this patch, the thread dump code can
now detect if we're running in a global Lua task (hlua_process_task),
or in a TCP or HTTP Lua service (task_run_applet and applet.fct ==
hlua_applet_tcp_fct or http_applet_http_fct), or a fetch/converter
from an analyser (s->hlua != NULL). In such situations, it's able to
append a formatted Lua backtrace of the Lua execution path with
function names, file names and line numbers.

Note that a shorter alternative could be to call "luaL_where(hlua->T,0)"
which only prints the current location, but it's not necessarily sufficient
for complex code.
2019-08-21 14:32:09 +02:00
Willy Tarreau
60409db0b1 MINOR: lua: export applet and task handlers
The current functions are seen outside from the debugging code and are
convenient to export so that we can improve the thread dump output :

  void hlua_applet_tcp_fct(struct appctx *ctx);
  void hlua_applet_http_fct(struct appctx *ctx);
  struct task *hlua_process_task(struct task *task, void *context, unsigned short state);

Of course they are only available when USE_LUA is defined.
2019-08-21 14:32:09 +02:00
Willy Tarreau
a2c9911ace MINOR: tools: add append_prefixed_str()
This is somewhat related to indent_msg() except that this one places a
known prefix at the beginning of each line, allows to replace the EOL
character, and not to insert a prefix on the first line if not desired.
It works with a normal output buffer/chunk so it doesn't need to allocate
anything nor to modify the input string. It is suitable for use in multi-
line backtraces.
2019-08-21 14:32:09 +02:00
Willy Tarreau
a512b02f67 MINOR: debug: indicate the applet name when the task is task_run_applet()
This allows to figure what applet is currently being executed (and likely
hung).
2019-08-21 14:32:09 +02:00
Olivier Houchard
ea32b0fa50 BUG/MEDIUM: mux_pt: Don't call unsubscribe if we did not subscribe.
In mux_pt_attach(), don't inconditionally call unsubscribe, and only do so
if we were subscribed. The idea was that at this point we would always be
subscribed, as for the mux_pt attach would only be called after at least one
request, after which the mux_pt would have subscribed, but this is wrong.
We can also be called if for some reason the connection failed before the
xprt was created. And with no xprt, attempting to call unsubscribe will
probably lead to a crash.

This should be backported to 2.0.
2019-08-16 16:11:56 +02:00
Christopher Faulet
bd9e842866 BUG/MINOR: stats: Wait the body before processing POST requests
The stats applet waits to have a full body to process POST requests. Because
when it is waiting for the end of a request it does not produce anything, the
applet may be blocked. The client side is blocked because the stats applet does
not consume anything and the applet is waiting because all the body is not
received. Registering the analyzer AN_REQ_HTTP_BODY when a POST request is sent
for the stats applet solves the issue.

This patch must be backported to 2.0.
2019-08-15 22:26:50 +02:00
Christopher Faulet
81921b1371 BUG/MEDIUM: lua: Fix test on the direction to set the channel exp timeout
This bug was introduced by the commit bfab2ddd ("MINOR: hlua: Add a flag on the
lua txn to know in which context it can be used"). The wrong test was done. So
the timeout was always set on the response channel. It may lead to an infinite
loop.

This patch must be backported everywhere the commit bfab2ddd is. For now, at
least to 2.0, 1.9 and 1.8.
2019-08-14 23:29:18 +02:00
Lukas Tribus
579e3e3dd5 BUG/MINOR: lua: fix setting netfilter mark
In the REORG of commit 1a18b5414 ("REORG: connection: centralize the
conn_set_{tos,mark,quickack} functions") a bug was introduced by
calling conn_set_tos instead of conn_set_mark.

This was reported in issue #212

This should be backported to 1.9 and 2.0.
2019-08-12 07:41:31 +02:00
Olivier Houchard
59dd06d659 BUG/MEDIUM: proxy: Don't use cs_destroy() when freeing the conn_stream.
When we upgrade the mux from TCP to H2/HTX, don't use cs_destroy() to free
the conn_stream, use cs_free() instead. Using cs_destroy() would call the
mux detach method, and at that point of time the mux would be the H2 mux,
which knows nothing about that conn_stream, so bad things would happen.
This should eventually make upgrade from TCP to H2/HTX work, and fix
the github issue #196.

This should be backported to 2.0.
2019-08-09 18:01:15 +02:00
Olivier Houchard
71b20c26be BUG/MEDIUM: proxy: Don't forget the SF_HTX flag when upgrading TCP=>H1+HTX.
In stream_end_backend(), if we're upgrading from TCP to H1/HTX, as we don't
destroy the stream, we have to add the SF_HTX flag on the stream, or bad
things will happen.
This was broken when attempting to fix github issue #196.

This should be backported to 2.0.
2019-08-09 17:50:05 +02:00
Willy Tarreau
9d00869323 CLEANUP: cli: replace all occurrences of manual handling of return messages
There were 221 places where a status message or an error message were built
to be returned on the CLI. All of them were replaced to use cli_err(),
cli_msg(), cli_dynerr() or cli_dynmsg() depending on what was expected.
This removed a lot of duplicated code because most of the times, 4 lines
are replaced by a single, safer one.
2019-08-09 11:26:10 +02:00
Willy Tarreau
d50c7feaa1 MINOR: cli: add two new states to print messages on the CLI
Right now we used to have extremely inconsistent states to report output,
one is CLI_ST_PRINT which prints constant message cli->msg with the
assigned severity, and CLI_ST_PRINT_FREE which prints dynamically
allocated cli->err with severity LOG_ERR, and nothing in between,
eventhough it's useful to be able to report dynamically allocated
messages as well as constant error messages.

This patch adds two extra states, which are not particularly well named
given the constraints imposed by existing ones. One is CLI_ST_PRINT_ERR
which prints a constant error message. The other one is CLI_ST_PRINT_DYN
which prints a dynamically allocated message. By doing so we maintain
the compatibility with current code.

It is important to keep in mind that we cannot pre-initialize pointers
and automatically detect what message type it is based on the assigned
fields, because the CLI's context is in a union shared with all other
users, thus unused fields contain anything upon return. This is why we
have no choice but using 4 states. Keeping the two fields <msg> and
<err> remains useful because one is const and not the other one, and
this catches may copy-paste mistakes. It's just that <err> is pretty
confusing here, it should be renamed.
2019-08-09 10:11:38 +02:00
Willy Tarreau
e0d0b4089d CLEANUP: buffer: replace b_drop() with b_free()
Since last commit there's no point anymore in having two variants of the
same function, let's switch to b_free() only. __b_drop() was renamed to
__b_free() for obvious consistency reasons.
2019-08-08 08:07:45 +02:00
Emmanuel Hocdet
c9858010c2 MINOR: ssl: ssl_fc_has_early should work for BoringSSL
CO_FL_EARLY_SSL_HS/CO_FL_EARLY_DATA are removed for BoringSSL. Early
data can be checked via BoringSSL API and ssl_fc_has_early can used it.

This should be backported to all versions till 1.8.
2019-08-07 18:44:49 +02:00
Emmanuel Hocdet
f967c31e75 BUG/MINOR: ssl: fix 0-RTT for BoringSSL
Since BoringSSL commit 777a2391 "Hold off flushing NewSessionTicket until write.",
0-RTT doesn't work. It appears that half-RTT data (response from 0-RTT) never
worked before the BoringSSL fix. For HAProxy the regression come from 010941f8
"BUG/MEDIUM: ssl: Use the early_data API the right way.": the problem is link to
the logic of CO_FL_EARLY_SSL_HS used for OpenSSL. With BoringSSL, handshake is
done before reading early data, 0-RTT data and half-RTT data are processed as
normal data: CO_FL_EARLY_SSL_HS/CO_FL_EARLY_DATA is not needed, simply remove
it.

This should be backported to all versions till 1.8.
2019-08-07 18:44:48 +02:00
Baptiste Assmann
1263540fe8 MINOR: cache: allow caching of OPTIONS request
Allow HAProxy to cache responses to OPTIONS HTTP requests.
This is useful in the use case of "Cross-Origin Resource Sharing" (cors)
to cache CORS responses from API servers.

Since HAProxy does not support Vary header for now, this would be only
useful for "access-control-allow-origin: *" use case.
2019-08-07 15:13:38 +02:00
Baptiste Assmann
db92a836f4 MINOR: cache: add method to cache hash
Current HTTP cache hash contains only the Host header and the url path.
That said, request method should also be added to the mix to support
caching other request methods on the same URL. IE GET and OPTIONS.
2019-08-07 15:13:38 +02:00
Willy Tarreau
6386481cbb CLEANUP: mux-h2: move the demuxed frame check code in its own function
The frame check code in the demuxer was moved to its own function to
keep the demux function clean enough. This also simplifies the test
case as we can now simply call this function once in H2_CS_FRAME_P
state.
2019-08-07 14:25:20 +02:00
Frédéric Lécaille
be36793d1d BUG/MEDIUM: stick-table: Wrong stick-table backends parsing.
When parsing references to stick-tables declared as backends, they are added to
a list of proxies (they are proxies!) which refer to this stick-tables.
Before this patch we added them to these list without checking they were already
present, making the silly hypothesis the actions/sample were checked/resolved in the same
order the proxies are parsed.

This patch implement a simple inline function to in_proxies_list() to test
the presence of a proxy in a list of proxies. We use this function when resolving
/checking samples/actions.

This bug was introduced by 015e4d7 commit.

Must be backported to 2.0.
2019-08-07 10:32:31 +02:00
Willy Tarreau
5488a62bfb BUG/MEDIUM: checks: make sure to close nicely when we're the last to speak
In SMTP, MySQL and PgSQL checks, we're supposed to finish with a message
to politely quit the server, otherwise some of them will log some errors.
This is the case with Postfix as reported in GH issue #187. Since commit
fe4abe6 ("BUG/MEDIUM: connections: Don't call shutdown() if we want to
disable linger.") we are a bit more aggressive on outgoing connection
closure and checks were not prepared for this.

This patch makes the 3 checks above disable the linger_risk for these
checks so that we close cleanly, with the side effect that it will leave
some TIME_WAIT connections behind (hence why it should not be generalized
to all checks). It's worth noting that in issue #187 it's mentioned that
this patch doesn't seem to be sufficient for Postfix, however based only
on local network activity this looks OK, so maybe this will need to be
improved later.

Given that the patch above was backported to 2.0 and 1.9, this one should
as well.
2019-08-06 16:35:55 +02:00
Willy Tarreau
30d05f3557 BUG/MINOR: mux-h2: always reset rcvd_s when switching to a new frame
In Patrick's trace it was visible that after a stream had been missed,
the next stream would receive a WINDOW_UPDATE with the first one's
credit added to its own. This makes sense because in case of error
h2c->rcvd_s is not reset. Given that this counter is per frame, better
reset it when starting to parse a new frame, it's easier and safer.

This must be backported as far as 1.8.
2019-08-06 15:49:51 +02:00
Willy Tarreau
e74679a9c6 BUG/MINOR: mux-h2: always send stream window update before connection's
In h2_process_mux() if we have some room and an attempt to send a window
update for the connection was pending, it's done first. But it's not done
for the stream, which will have for effect of postponing this attempt
till next pass into h2_process_demux(), at the risk of seeing the send
buffer full again. Let's always try to send both pending frames as soon
as possible.

This should be backported as far as 1.8.
2019-08-06 15:39:32 +02:00
Willy Tarreau
9fd5aa8ada BUG/MEDIUM: mux-h2: do not recheck a frame type after a state transition
Patrick Hemmer reported a rare case where the H2 mux emits spurious
RST_STREAM(STREAM_CLOSED) that are triggered by the send path and do
not even appear to be associated with a previous incoming frame, while
the send path never emits such a thing.

The problem is particularly complex (hence its rarity). What happens is
that when data are uploaded (POST) we must refill the sending stream's
window by sending a WINDOW_UPDATE message (and we must refill the
connection's too). But in a highly bidirectional traffic, it is possible
that the mux's buffer will be full and that there is no more room to build
this WINDOW_UPDATE frame. In this case the demux parser switches to the
H2_CS_FRAME_A state, noting that an "acknowledgement" is needed for the
current frame, and it doesn't change the current stream nor frame type.
But the stream's state was possibly updated (typically OPEN->HREM when
a DATA frame carried the ES flag).

Later the data can leave the buffer, wake up h2_io_cb(), which calls
h2_send() to send pending data, itself calling h2_process_mux() which
detects that there are unacked data in the connection's window so it
emits a WINDOW_UPDATE for the connection and resets the counter. so it
emits a WINDOW_UPDATE for the connection and resets the counter. Then
h2_process() calls h2_process_demux() which continues the processing
based on the current frame type and the current state H2_CS_FRAME_A.
Unfortunately the protocol compliance checks matching the frame type
against the current state are still present. These tests are designed
for new frames only, not for those in progress, but they are not
limited by frame types. Thus the current DATA frame is checked again
against the current stream state that is now HREM, and fails the test
with a STREAM_CLOSED error.

The quick and backportable solution consists in adding the test for
this ACK and bypass all these checks that were already validated prior
to the state transition. A better long-term solution would consist in
having a new state between H and P indicating the frame is new and
needs to be checked ("N" for new?) and apply the protocol tests only
in this state. In addition everywhere we decide to send a window
update, we should send a stream WU first if there are unacked data
for the current stream. Last, rcvd_s should always be reset when
transitioning to FRAME_H (and a BUGON for this in dev would help).

The bug will be way harder to trigger on 2.0 than on 1.8/1.9 because
we have a ring buffer for the connection so the buffer full situations
are extremely rare.

This fix must be backpored to all versions having H2 (as far as 1.8).

Special thanks to Patrick for providing exploitable traces.
2019-08-06 15:35:20 +02:00
Willy Tarreau
cfba9d6eaa BUG/MINOR: mux-h2: do not send REFUSED_STREAM on aborted uploads
If the server decides to close early, we don't want to send a
REFUSED_STREAM error but a CANCEL, so that the client doesn't want
to retry. The test in h2_do_shutw() was wrong for this as it would
handle the HLOC case like the case where nothing had been sent for
this stream, which is wrong. Now h2_do_shutw() does nothing in this
case and lets h2_do_shutr() decide.

Note that this partially undoes f983d00a1 ("BUG/MINOR: mux-h2: make
the do_shut{r,w} functions more robust against retries").

This must be backported to 2.0. The patch above was not backported to
1.9 for being too risky there, but if it eventually gets to it, this
one will be needed as well.
2019-08-06 10:32:02 +02:00
Willy Tarreau
082c45769b BUG/MINOR: mux-h2: use CANCEL, not STREAM_CLOSED in h2c_frt_handle_data()
There is a test on the existence of the conn_stream when receiving data,
to be sure to have somewhere to deliver it. Right now it responds with
STREAM_CLOSED, which is not correct since from an H2 point of view the
stream is not closed and a peer could be upset to see this. After some
analysis, it is important to keep this test to be sure not to fill the
rxbuf then stall the connection. Another option could be to modiffy
h2_frt_transfer_data() to silently discard any contents but the CANCEL
error code is designed exactly for this and to save the peer from
continuing to stream data that will be discarded, so better switch to
using this.

This must be backported as far as 1.8.
2019-08-06 10:15:49 +02:00
Willy Tarreau
231f616170 BUG/MINOR: mux-h2: don't refrain from sending an RST_STREAM after another one
The test in h2s_send_rst_stream() is excessive, it refrains from sending an
RST_STREAM if *the last frame* was an RST_STREAM, regardless of the stream
ID. In a context where both clients and servers abort a lot, it could happen
that one RST_STREAM is dropped from responses from time to time, causing
delays to the client.

This must be backported to 2.0, 1.9 and 1.8.
2019-08-06 10:04:55 +02:00
Olivier Houchard
a3a8ea2fbf BUG/MEDIUM: pollers: Clear the poll_send bits as well.
In _update_fd(), if we're about to remove the FD from the poller, remove
both the receive and the send bits, instead of removing the receive bits
twice.
2019-08-05 23:56:26 +02:00
Olivier Houchard
c22580c2cc BUG/MEDIUM: fd: Always reset the polled_mask bits in fd_dodelete().
In fd_dodelete(), always reset the polled_mask bits, instead on only doing
it if we're closing the file descriptor. We call the poller clo() method
anyway, and failing to do so means that if fd_remove() is used while the
fd is polled, the poller won't attempt to poll on a fd with the same value
as the old one.
This leads to fd being stuck in the SSL code while using the async engine.

This should be backported to 2.0, 1.9 and 1.8.
2019-08-05 18:55:04 +02:00
Olivier Houchard
4c18f94c11 BUG/MEDIUM: proxy: Make sure to destroy the stream on upgrade from TCP to H2
In stream_set_backend(), if we have a TCP stream, and we want to upgrade it
to H2 instead of attempting ot reuse the stream, just destroy the
conn_stream, make sure we don't log anything about the stream, and pretend
we failed setting the backend, so that the stream will get destroyed.
New streams will then be created by the mux, as if the connection just
happened.
This fixes a crash when upgrading from TCP to H2, as the H2 mux totally
ignored the conn_stream provided by the upgrade, as reported in github
issue #196.

This should be backported to 2.0.
2019-08-02 18:28:58 +02:00
Willy Tarreau
1d4a0f8810 BUG/MEDIUM: mux-h2: split the stream's and connection's window sizes
The SETTINGS frame parser updates all streams' window for each
INITIAL_WINDOW_SIZE setting received on the connection (like h2spec
does in test 6.5.3), which can start to be expensive if repeated when
there are many streams (up to 100 by default). A quick test shows that
it's possible to parse only 35000 settings per second on a 3 GHz core
for 100 streams, which is rather small.

Given that window sizes are relative and may be negative, there's no
point in pre-initializing them for each stream and update them from
the settings. Instead, let's make them relative to the connection's
initial window size so that any change immediately affects all streams.
The only thing that remains needed is to wake up the streams that were
unblocked by the update, which is now done once at the end of
h2_process_demux() instead of once per setting. This now results in
5.7 million settings being processed per second, which is way better.

In order to keep the change small, the h2s' mws field was renamed to
"sws" for "stream window size", and an h2s_mws() function was added
to add it to the connection's initial window setting and determine the
window size to use when muxing. The h2c_update_all_ws() function was
renamed to h2c_unblock_sfctl() since it's now only used to unblock
previously blocked streams.

This needs to be backported to all versions till 1.8.
2019-08-02 13:43:33 +02:00
Willy Tarreau
9bc1c95855 BUG/MEDIUM: mux-h2: unbreak receipt of large DATA frames
Recent optimization in commit 4d7a88482 ("MEDIUM: mux-h2: don't try to
read more than needed") broke the receipt of large DATA frames because
it would unconditionally subscribe if there was some room left, thus
preventing any new rx from being done since subscription may only be
done once the end was reached, as indicated by ret == 0.

However, fixing this uncovered that in HTX mode previous versions might
occasionally be affected as well, when an available frame is the same
size as the maximum data that may fit into an HTX buffer, we may end
up reading that whole frame and still subscribe since it's still allowed
to receive, thus causing issues to read the next frame.

This patch will only work for 2.1-dev but a minor adaptation will be
needed for earlier versions (down to 1.9, where subscribe() was added).
2019-08-02 13:37:55 +02:00
Willy Tarreau
45bcb37f0f BUG/MINOR: stream-int: also update analysers timeouts on activity
Between 1.6 and 1.7, some parts of the stream forwarding process were
moved into lower layers and the stream-interface had to keep the
stream's task up to date regarding the timeouts. The analyser timeouts
were not updated there as it was believed this was not needed during
forwarding, but actually there is a case for this which is "option
contstats" which periodically triggers the analyser timeout, and this
change broke the option in case of sustained traffic (if there is some
I/O activity during the same millisecond as the timeout expires, then
the update will be missed).

This patch simply brings back the analyser expiration updates from
process_stream() to stream_int_notify().

It may be backported as far as 1.7, taking care to adjust the fields
names if needed.
2019-08-01 18:58:21 +02:00
William Lallemand
6e5f2ceead BUG/MEDIUM: ssl: open the right path for multi-cert bundle
Multi-cert bundle was not working anymore because we tried to open the
wrong path.
2019-08-01 14:47:57 +02:00
Willy Tarreau
a64c703374 BUG/MINOR: stream-int: make sure to always release empty buffers after sending
There are some situations, after sending a request or response, upon I/O
completion, or applet execution, where we end up with an empty buffer that
was not released. This results in excessive memory usage (back to 1.5) and
a lower CPU cache efficiency since buffers are not recycled as fast. This
has changed since the places where we send have changed with the new
layering, but not all cases susceptible of leaving an empty buffer were
properly spotted. Doing so reduces the memory pressure on buffers by about
2/3 in high traffic tests.

This should be backported to 2.0 and maybe 1.9.
2019-08-01 14:34:01 +02:00
Richard Russo
458eafb36d BUG/MAJOR: http/sample: use a static buffer for raw -> htx conversion
Multiple calls to smp_fetch_fhdr use the header context to keep track of
header parsing position; however, when using header sampling on a raw
connection, the raw buffer is converted into an HTX structure each time, and
this was done in the trash areas; so the block reference would be invalid on
subsequent calls.

This patch must be backported to 2.0 and 1.9.
2019-08-01 11:35:29 +02:00
Christopher Faulet
0a52c17f81 BUG/MEDIUM: lb-chash: Ensure the tree integrity when server weight is increased
When the server weight is increased in consistant hash, extra nodes have to be
allocated. So a realloc() is performed on the nodes array of the server. the
previous commit 962ea7732 ("BUG/MEDIUM: lb-chash: Remove all server's entries
before realloc() to re-insert them after") have fixed the size used during the
realloc() to avoid segfaults. But another bug remains. After the realloc(), the
memory area allocated for the nodes array may change, invalidating all node
addresses in the chash tree.

So, to fix the bug, we must remove all server's entries from the chash tree
before the realloc to insert all of them after, old nodes and new ones. The
insert will be automatically handled by the loop at the end of the function
chash_queue_dequeue_srv().

Note that if the call to realloc() failed, no new entries will be created for
the server, so the effective server weight will be unchanged.

This issue was reported on Github (#189).

This patch must be backported to all versions since the 1.6.
2019-08-01 11:35:29 +02:00
Emmanuel Hocdet
1503e05362 BUG/MINOR: ssl: fix ressource leaks on error
Commit 36b84637 "MEDIUM: ssl: split the loading of the certificates"
introduce leaks on fd/memory in case of error.
2019-08-01 11:27:24 +02:00
William Lallemand
6dee29d63d BUG/MEDIUM: ssl: don't free the ckch in multi-cert bundle
When using a ckch we should never try to free its content, because it
won't be usable  after and can result in a NULL derefence during
parsing.

The content was previously freed because the ckch wasn't stored in a
tree to be used later, now that we use it multiple time, we need to keep
the data.
2019-08-01 11:27:24 +02:00
Willy Tarreau
a37cb1880c MINOR: wdt: also consider that waiting in the thread dumper is normal
It happens that upon looping threads the watchdog fires, starts a dump,
and other threads expire their budget while waiting for the other threads
to get dumped and trigger a watchdog event again, adding some confusion
to the traces. With this patch the situation becomes clearer as we export
the list of threads being dumped so that the watchdog can check it before
deciding to trigger. This way such threads in queue for being dumped are
not attempted to be reported in turn.

This should be backported to 2.0 as it helps understand stack traces.
2019-07-31 19:35:31 +02:00
Willy Tarreau
c07736209d BUG/MINOR: debug: fix a small race in the thread dumping code
If a thread dump is requested from a signal handler, it may interrupt
a thread already waiting for a dump to complete, and may see the
threads_to_dump variable go to zero while others are waiting, steal
the lock and prevent other threads from ever completing. This tends
to happen when dumping many threads upon a watchdog timeout, to threads
waiting for their turn.

Instead now we proceed in two steps :
  1) the last dumped thread sets all bits again
  2) all threads only wait for their own bit to appear, then clear it
     and quit

This way there's no risk that a bit performs a double flip in the same
loop and threads cannot get stuck here anymore.

This should be backported to 2.0 as it clarifies stack traces.
2019-07-31 19:35:31 +02:00
William Lallemand
a8c73748f8 BUG/MEDIUM: ssl: does not try to free a DH in a ckch
ssl_sock_load_dh_params() should not free the DH * of a ckch, or the
ckch won't be usable during the next call.
2019-07-31 19:35:31 +02:00
William Lallemand
c4ecddf418 BUG/BUILD: ssl: fix build with openssl < 1.0.2
Recent changes use struct cert_key_and_chain to load all certificates in
frontends, this structure was previously used only to load multi-cert
bundle, which is supported only on >= 1.0.2.
2019-07-31 17:05:09 +02:00
Willy Tarreau
4d7a884827 MEDIUM: mux-h2: don't try to read more than needed
The h2_recv() loop was historically built around a loop to deal with
the callback model but this is not needed anymore, as it the upper
layer wants more data, it will simply try to read again. Right now
50% of the recvfrom() calls made over H2 return EAGAIN. With this
change it doesn't happen anymore. Note that the code simply consists
in breaking the loop, and reporting real data receipt instead of
always returning 1.

A test was made not to subscribe if we actually read data but it
doesn't change anything since we might be subscribed very early
already.
2019-07-31 16:18:25 +02:00
Olivier Houchard
53055055c5 MEDIUM: pollers: Remember the state for read and write for each threads.
In the poller code, instead of just remembering if we're currently polling
a fd or not, remember if we're polling it for writing and/or for reading, that
way, we can avoid to modify the polling if it's already polled as needed.
2019-07-31 14:54:41 +02:00
Olivier Houchard
305d5ab469 MAJOR: fd: Get rid of the fd cache.
Now that the architecture was changed so that attempts to receive/send data
always come from the upper layers, instead of them only trying to do so when
the lower layer let them know they could try, we can finally get rid of the
fd cache. We don't really need it anymore, and removing it gives us a small
performance boost.
2019-07-31 14:12:55 +02:00
Emmanuel Hocdet
a7a0f991c9 MINOR: ssl: clean ret variable in ssl_sock_load_ckchn
In ssl_sock_load_ckchn, ret variable is now in a half dead usage.
Remove it to clean compilation warnings.
2019-07-30 17:54:35 +02:00
Emmanuel Hocdet
efa4b95b78 CLEANUP: ssl: ssl_sock_load_crt_file_into_ckch
Fix comments for this function and remove free before alloc call: ckch
call is correctly balanced  (alloc/free).
2019-07-30 17:54:34 +02:00
Emmanuel Hocdet
54227d8add MINOR: ssl: do not look at DHparam with OPENSSL_NO_DH
OPENSSL_NO_DH can be defined to avoid obsolete and heavy DH processing.
With OPENSSL_NO_DH, parse the entire PEM file to look at DHparam is wast
of time.
2019-07-30 17:54:34 +02:00
Emmanuel Hocdet
03e09f3818 MINOR: ssl: check private key consistency in loading
Load a PEM certificate and use it in CTX are now decorrelated.
Checking the certificate and private key consistency can be done
earlier: in loading phase instead CTX set phase.
2019-07-30 15:53:54 +02:00
Emmanuel Hocdet
1c65fdd50e MINOR: ssl: add extra chain compatibility
cert_key_and_chain handling is now outside openssl 1.0.2 #if: the
code must be libssl compatible. SSL_CTX_add1_chain_cert and
SSL_CTX_set1_chain requires openssl >= 1.0.2, replace it by legacy
SSL_CTX_add_extra_chain_cert when SSL_CTX_set1_chain is not provided.
2019-07-30 15:53:54 +02:00
Emmanuel Hocdet
9246f8bc83 MINOR: ssl: use STACK_OF for chain certs
Used native cert chain manipulation with STACK_OF from ssl lib.
2019-07-30 15:53:54 +02:00
Willy Tarreau
5e83d996cf BUG/MAJOR: queue/threads: avoid an AB/BA locking issue in process_srv_queue()
A problem involving server slowstart was reported by @max2k1 in issue #197.
The problem is that pendconn_grab_from_px() takes the proxy lock while
already under the server's lock while process_srv_queue() first takes the
proxy's lock then the server's lock.

While the latter seems more natural, it is fundamentally incompatible with
mayn other operations performed on servers, namely state change propagation,
where the proxy is only known after the server and cannot be locked around
the servers. Howwever reversing the lock in process_srv_queue() is trivial
and only the few functions related to dynamic cookies need to be adjusted
for this so that the proxy's lock is taken for each server operation. This
is possible because the proxy's server list is built once at boot time and
remains stable. So this is what this patch does.

The comments in the proxy and server structs were updated to mention this
rule that the server's lock may not be taken under the proxy's lock but
may enclose it.

Another approach could consist in using a second lock for the proxy's queue
which would be different from the regular proxy's lock, but given that the
operations above are rare and operate on small servers list, there is no
reason for overdesigning a solution.

This fix was successfully tested with 10000 servers in a backend where
adjusting the dyncookies in loops over the CLI didn't have a measurable
impact on the traffic.

The only workaround without the fix is to disable any occurrence of
"slowstart" on server lines, or to disable threads using "nbthread 1".

This must be backported as far as 1.8.
2019-07-30 14:02:06 +02:00
William Lallemand
fa8922285d MEDIUM: ssl: load DH param in struct cert_key_and_chain
Load the DH param at the same time as the certificate, we don't need to
open the file once more and read it again. We store it in the ckch_node.

There is a minor change comparing to the previous way of loading the DH
param in a bundle. With a bundle, the DH param in a certificate file was
never loaded, it only used the global DH or the default DH, now it's
able to use the DH param from a certificate file.
2019-07-29 15:28:46 +02:00
William Lallemand
6af03991da MEDIUM: ssl: lookup and store in a ckch_node tree
Don't read a certificate file again if it was already stored in the
ckchn tree. It allows HAProxy to start more quickly if the same
certificate is used at different places in the configuration.

HAProxy lookup in the ssl_sock_load_cert() function, doing it at this
level allows to skip the reading of the certificate in the filesystem.

If the certificate is not found in the tree, we insert the ckch_node in
the tree once the certificate is read on the filesystem, the filename or
the bundle name is used as the key.
2019-07-29 15:28:46 +02:00
William Lallemand
36b8463777 MEDIUM: ssl: split the loading of the certificates
Split the functions which open the certificates.

Instead of opening directly the certificates and inserting them directly
into a SSL_CTX, we use a struct cert_key_and_chain to store them in
memory and then we associate a SSL_CTX to the certificate stored in that
structure.

Introduce the struct ckch_node for the multi-cert bundles so we can
store multiple cert_key_and_chain in the same structure.

The functions ssl_sock_load_multi_cert() and ssl_sock_load_cert_file()
were modified so they don't open the certicates anymore on the
filesystem. (they still open the sctl and ocsp though).  These functions
were renamed ssl_sock_load_ckchn() and ssl_sock_load_multi_ckchn().

The new function ckchn_load_cert_file() is in charge of loading the
files in the cert_key_and_chain. (TODO: load ocsp and sctl from there
too).

The ultimate goal is to be able to load a certificate from a certificate
tree without doing any filesystem access, so we don't try to open it
again if it was already loaded, and we share its configuration.
2019-07-29 15:28:46 +02:00
William Lallemand
a59191b894 MEDIUM: ssl: use cert_key_and_chain struct in ssl_sock_load_cert_file()
This structure was only used in the case of the multi-cert bundle.

Using these primitives everywhere when we load the file are a first step
in the deduplication of the code.
2019-07-29 15:28:46 +02:00
William Lallemand
c940207d39 MINOR: ssl: merge ssl_sock_load_cert_file() and ssl_sock_load_cert_chain_file()
This commit merges the function ssl_sock_load_cert_file() and
ssl_sock_load_cert_chain_file().

The goal is to refactor the SSL code and use the cert_key_and_chain
struct to load everything.
2019-07-29 15:28:46 +02:00
Christopher Faulet
61ed7797f6 BUG/MINOR: htx: Fix free space addresses calculation during a block expansion
When the payload of a block is shrinked or enlarged, addresses of the free
spaces must be updated. There are many possible cases. One of them is
buggy. When there is only one block in the HTX message and its payload is just
before the tail room and it needs to be moved in the head room to be enlarged,
addresses are not correctly updated. This bug may be hit by the compression
filter.

This patch must be backported to 2.0.
2019-07-29 11:17:52 +02:00
Christopher Faulet
301eff8e21 BUG/MINOR: hlua: Only execute functions of HTTP class if the txn is HTTP ready
The flag HLUA_TXN_HTTP_RDY was added in the previous commit to know when a
function is called for a channel with a valid HTTP message or not. Of course it
also depends on the calling direction. In this commit, we allow the execution of
functions of the HTTP class only if this flag is set.

Nobody seems to use them from an unsupported context (for instance, trying to
set an HTTP header from a tcp-request rule). But it remains a bug leading to
undefined behaviors or crashes.

This patch may be backported to all versions since the 1.6. It depends on the
commits "MINOR: hlua: Add a flag on the lua txn to know in which context it can
be used" and "MINOR: hlua: Don't set request analyzers on response channel for
lua actions".
2019-07-29 11:17:52 +02:00
Christopher Faulet
bfab2dddad MINOR: hlua: Add a flag on the lua txn to know in which context it can be used
When a lua action or a lua sample fetch is called, a lua transaction is
created. It is an entry in the stack containing the class TXN. Thanks to it, we
can know the direction (request or response) of the call. But, for some
functions, it is also necessary to know if the buffer is "HTTP ready" for the
given direction. "HTTP ready" means there is a valid HTTP message in the
channel's buffer. So, when a lua action or a lua sample fetch is called, the
flag HLUA_TXN_HTTP_RDY is set if it is appropriate.
2019-07-29 11:17:52 +02:00
Christopher Faulet
51fa358432 MINOR: hlua: Don't set request analyzers on response channel for lua actions
Setting some requests analyzers on the response channel was an old trick to be
sure to re-evaluate the request's analyers after the response's ones have been
called. It is no more necessary. In fact, this trick was removed in the version
1.8 and backported up to the version 1.6.

This patch must be backported to all versions since 1.6 to ease the backports of
fixes on the lua code.
2019-07-29 11:17:52 +02:00
Christopher Faulet
84a6d5bc21 BUG/MEDIUM: hlua: Check the calling direction in lua functions of the HTTP class
It is invalid to manipulate responses from http-request rules or to manipulate
requests from http-response rules. When http-request rules are evaluated, the
connection to server is not yet established, so there is no response at all. And
when http-response rules are evaluated, the request has already been sent to the
server.

Now, the calling direction is checked. So functions "txn.http:req_*" can now
only be called from http-request rules and the functions "txn.http:res_*" can
only be called from http-response rules.

This issue was reported on Github (#190).

This patch must be backported to all versions since the 1.6.
2019-07-29 11:17:52 +02:00
Christopher Faulet
fe6a71b8e0 BUG/MINOR: hlua/htx: Reset channels analyzers when txn:done() is called
For HTX streams, when txn:done() is called, the work is delegated to the
function http_reply_and_close(). But it is not enough. The channel's analyzers
must also be reset. Otherwise, some analyzers may still be called while
processing should be aborted.

For instance, if the function is called from an http-request rules on the
frontend, request analyzers on the backend side are still called. So we may try
to add an header to the request, while this one was already reset.

This patch must be backported to 2.0 and 1.9.
2019-07-29 11:17:52 +02:00
Olivier Houchard
dedd30610b MEDIUM: h1: Don't wake the H1 tasklet if we got the whole request.
In h1_rcv_buf(), don't wake the H1 tasklet to attempt to receive more data
if we got the whole request. It will lead to a recv and maybe to a subscribe
while it may not be needed.
If the connection is keep alive, the tasklet will be woken up later by
h1_detach(), so that we'll be able to get the next request, or an end of
connection.
2019-07-26 17:13:21 +02:00
Olivier Houchard
cc3fec8ac9 MEDIUM: h1: Don't try to subscribe if we managed to read data.
In h1_recv(), don't subscribe if we managed to receive data. We may not have
to, if we received a complete request, and a new receive will be attempted
later, as the tasklet is woken up either by h1_rcv_buf() or by h1_detach.
2019-07-26 17:13:17 +02:00
Willy Tarreau
9fbcb7e2e9 BUG/MINOR: log: make sure writev() is not interrupted on a file output
Since 1.9 we support sending logs to various non-blocking outputs like
stdou/stderr or flies, by using writev() which guarantees that it only
returns after having written everything or nothing. However the syscall
may be interrupted while doing so, and this is visible when writing to
a tty during debug sessions, as some logs occasionally appear interleaved
if an xterm or SSH connection is not very fast. Performance here is not a
critical concern, log correctness is. Let's simply take the logger's lock
around the writev() call to prevent multiple senders from stepping onto
each other's toes.

This may be backported to 2.0 and 1.9.
2019-07-26 15:46:18 +02:00
Olivier Houchard
7859526fd6 BUG/MEDIUM: streams: Don't switch the SI to SI_ST_DIS if we have data to send.
In sess_established(), don't immediately switch the backend stream_interface
to SI_ST_DIS if we only got a SHUTR. We may still have something to send,
ie if the request is a POST, and we should be switched to SI_ST8DIS later
when the shutw will happen.

This should be backported to 2.0 and 1.9.
2019-07-26 14:56:41 +02:00
Christopher Faulet
366ad86af7 BUG/MEDIUM: lb-chash: Fix the realloc() when the number of nodes is increased
When the number of nodes is increased because the server weight is changed, the
nodes array must be realloc. But its new size is not correctly set. Only the
total number of nodes is used to set the new size. But it must also depends on
the size of a node. It must be the total nomber of nodes times the size of a
node.

This issue was reported on Github (#189).

This patch must be backported to all versions since the 1.6.
2019-07-26 14:12:59 +02:00
Christopher Faulet
98fbe9531a MEDIUM: mux-h1: Add the support of headers adjustment for bogus HTTP/1 apps
There is no standard case for HTTP header names because, as stated in the
RFC7230, they are case-insensitive. So applications must handle them in a
case-insensitive manner. But some bogus applications erroneously rely on the
case used by most browsers. This problem becomes critical with HTTP/2
because all header names must be exchanged in lowercase. And HAProxy uses the
same convention. All header names are sent in lowercase to clients and servers,
regardless of the HTTP version.

This design choice is linked to the HTX implementation. So, for previous
versions (2.0 and 1.9), a workaround is to disable the HTX mode to fall
back to the legacy HTTP mode.

Since the legacy HTTP mode was removed, some users reported interoperability
issues because their application was not able anymore to handle HTTP/1 message
received from HAProxy. So, we've decided to add a way to change the case of some
headers before sending them. It is now possible to define a "mapping" between a
lowercase header name and a version supported by the bogus application. To do
so, you must use the global directives "h1-case-adjust" and
"h1-case-adjust-file". Then options "h1-case-adjust-bogus-client" and
"h1-case-adjust-bogus-server" may be used in proxy sections to enable the
conversion. See the configuration manual for more info.

Of course, our advice is to urgently upgrade these applications for
interoperability concerns and because they may be vulnerable to various types of
content smuggling attacks. But, if your are really forced to use an unmaintained
bogus application, you may use these directive, at your own risks.

If it is relevant, this feature may be backported to 2.0.
2019-07-24 18:32:47 +02:00
Willy Tarreau
3de3cd4d97 BUG/MINOR: proxy: always lock stop_proxy()
There is one unprotected call to stop_proxy() from the manage_proxy()
task, so there is a single caller by definition, but there is also
another such call from the CLI's "shutdown frontend" parser. This
one does it under the proxy's lock but the first one doesn't use it.
Thus it is theorically possible to corrupt the list of listeners in a
proxy by issuing "shutdown frontend" and SIGUSR1 exactly at the same
time. While it sounds particularly contrived or stupid, it could
possibly happen with automated tools that would send actions via
various channels. This could cause the process to loop forever or
to crash and thus stop faster than expected.

This might be backported as far as 1.8.
2019-07-24 17:42:44 +02:00
Willy Tarreau
daacf36645 BUG/MEDIUM: protocols: add a global lock for the init/deinit stuff
Dragan Dosen found that the listeners lock is not sufficient to protect
the listeners list when proxies are stopping because the listeners are
also unlinked from the protocol list, and under certain situations like
bombing with soft-stop signals or shutting down many frontends in parallel
from multiple CLI connections, it could be possible to provoke multiple
instances of delete_listener() to be called in parallel for different
listeners, thus corrupting the protocol lists.

Such operations are pretty rare, they are performed once per proxy upon
startup and once per proxy on shut down. Thus there is no point trying
to optimize anything and we can use a global lock to protect the protocol
lists during these manipulations.

This fix (or a variant) will have to be backported as far as 1.8.
2019-07-24 16:45:02 +02:00
Olivier Houchard
f0f4238977 BUG/CRITICAL: http_ana: Fix parsing of malformed cookies which start by a delimiter
When client-side or server-side cookies are parsed, HAProxy enters in an
infinite loop if a Cookie/Set-Cookie header value starts by a delimiter (a colon
or a semicolon). Depending on the operating system, the service may become
degraded, unresponsive, or may trigger haproxy's watchdog causing a service stop
or automatic restart.

To fix this bug, in the loop parsing the attributes, we must be sure to always
skip delimiters once the first attribute-value pair was parsed, empty or
not. The credit for the fix goes to Olivier.

CVE-2019-14241 was assigned to this bug. This patch fixes the Github issue #181.

This patch must be backported to 2.0 and 1.9. However, the patch will have to be
adapted.
2019-07-23 14:58:32 +02:00
Christopher Faulet
90cc4811be BUG/MINOR: http_htx: Support empty errorfiles
Empty error files may be used to disable the sending of any message for specific
error codes. A common use-case is to use the file "/dev/null". This way the
default error message is overridden and no message is returned to the client. It
was supported in the legacy HTTP mode, but not in HTX. Because of a bug, such
messages triggered an error.

This patch must be backported to 2.0 and 1.9. However, the patch will have to be
adapted.
2019-07-23 14:58:32 +02:00
Christopher Faulet
9f5839cde2 BUG/MINOR: http_ana: Be sure to have an allocated buffer to generate an error
In http_reply_and_close() and http_server_error(), we must be sure to have an
allocated buffer (buf.size > 0) to consider it as a valid HTX message. For now,
there is no way to hit this bug. But a fix to support "empty" error messages in
HTX is pending. Such empty messages, after parsing, will be converted into
unallocated buffer (buf.size == 0).

This patch must be backported to 2.0 and 1.9. owever, the patch will have to be
adapted.
2019-07-23 14:58:23 +02:00
Willy Tarreau
ef91c939f3 BUG/MEDIUM: tcp-checks: do not dereference inexisting conn_stream
Github user @jpulz reported a crash with tcp-checks in issue #184
where cs==NULL. If we enter the function with cs==NULL and check->result
!= CHK_RES_UKNOWN, we'll go directly to out_end_tcpcheck and dereference
cs. We must validate there that cs is valid (and conn at the same time
since it would be NULL as well).

This fix must be backported as far as 1.8.
2019-07-23 14:37:47 +02:00
Christopher Faulet
f1204b8933 BUG/MINOR: mux-h1: Close server connection if input data remains in h1_detach()
With the previous commit 03627245c ("BUG/MEDIUM: mux-h1: Trim excess server data
at the end of a transaction"), we try to avoid to handle junk data coming from a
server as a response. But it only works for data already received. Starting from
the moment a server sends an invalid response, it is safer to close the
connection too, because more data may come after and there is no good reason to
handle them.

So now, when a conn_stream is detached from a server connection, if there are
some unexpected input data, we simply trim them and close the connection
ASAP. We don't close it immediately only if there are still some outgoing data
to deliver to the server.

This patch must be backported to 2.0 and 1.9.
2019-07-19 14:51:08 +02:00
Willy Tarreau
b082186528 MEDIUM: backend: remove impossible cases from connect_server()
Now that we start by releasing any possibly existing connection,
the conditions simplify a little bit and some of the complex cases
can be removed. A few comments were also added for non-obvious cases.
2019-07-19 13:50:09 +02:00
Willy Tarreau
a5797aab11 MEDIUM: backend: always release any existing prior connection in connect_server()
When entering connect_server() we're not supposed to have a connection
already, except when retrying a failed connection, which is pretty rare.
Let's simplify the code by starting to unconditionally release any existing
connection. For now we don't go further, as this change alone will lead to
quite some simplification that'd rather be done as a separate cleanup.
2019-07-19 13:50:09 +02:00
Willy Tarreau
5a0b25d31c MEDIUM: lua: do not allocate the remote connection anymore
Lua cosockets do not need to allocate the remote connection anymore.
However this was trickier than expected because some tests were made
on this remote connection's existence to detect establishment instead
of relying on the stream interface's state (which is how it's now done).
The flag SF_ADDR_SET was set a bit too early (before assigning the
address) so this was moved to the right place. It should not have had
any impact beyond confusing debugging.

The only remaining occurrence of the remote connection knowledge now
is for getsockname() which requires to access the connection to send
the syscall, and it's unlikely that we'll need to change this before
QUIC or so.
2019-07-19 13:50:09 +02:00
Willy Tarreau
02efedac0c MINOR: peers: now remove the remote connection setup code
The connection is not needed anymore, the backend does the job.
2019-07-19 13:50:09 +02:00
Willy Tarreau
1c8d32bb62 MAJOR: stream: store the target address into s->target_addr
When forcing the outgoing address of a connection, till now we used to
allocate this outgoing connection and set the address into it, then set
SF_ADDR_SET. With connection reuse this causes a whole lot of issues and
difficulties in the code.

Thanks to the previous changes, it is now possible to store the target
address into the stream instead, and copy the address from the stream to
the connection when initializing the connection. assign_server_address()
does this and as a result SF_ADDR_SET now reflects the presence of the
target address in the stream, not in the connection. The http_proxy mode,
the peers and the master's CLI now use the same mechanism. For now the
existing connection code was not removed to limit the amount of tricky
changes, but the allocated connection is not used anymore.

This change also revealed a latent issue that we've been having around
option http_proxy : the address was set in the connection but neither the
SF_ADDR_SET nor the SF_ASSIGNED flags were set. It looks like the connection
could establish only due to the fact that it existed with a non-null
destination address.
2019-07-19 13:50:09 +02:00
Willy Tarreau
9042060b0b MINOR: stream: add a new target_addr entry in the stream structure
The purpose will be to store the target address there and not to
allocate a connection just for this anymore. For now it's only placed
in the struct, a few fields were moved to plug some holes, and the
entry is freed on release (never allocated yet for now). This must
have no impact. Note that in order to fit, the store_count which
previously was an int was turned into a short, which is way more
than enough given that the hard-coded limit is 8.
2019-07-19 13:50:09 +02:00
Willy Tarreau
16aa4aff6b MINOR: connection: don't use clear_addr() anymore, just release the address
Now that we have dynamically allocated addresses, there's no need to
clear an address before reusing it, just release it. Note that this
is not equivalent to saying that an address is never zero, as shown in
assign_server_address() where an address 0.0.0.0 can still be assigned
to a connection for the time it takes to modify it.
2019-07-19 13:50:09 +02:00
Willy Tarreau
ca79f59365 MEDIUM: connection: make sure all address producers allocate their address
This commit places calls to sockaddr_alloc() at the places where an address
is needed, and makes sure that the allocation is properly tested. This does
not add too many error paths since connection allocations are already in the
vicinity and share the same error paths. For the two cases where a
clear_addr() was called, instead the address was not allocated.
2019-07-19 13:50:09 +02:00
Willy Tarreau
ff5d57b022 MINOR: connection: create a new pool for struct sockaddr_storage
This pool will be used to allocate storage for source and destination
addresses used in connections. Two functions sockaddr_{alloc,free}()
were added and will have to be used everywhere an address is needed.
These ones are safe for progressive replacement as they check that the
existing pointer is set before replacing it. The pool is not yet used
during allocation nor freeing. Also they operate on pointers to pointers
so they will perform checks and replace values. The free one nulls the
pointer.
2019-07-19 13:50:09 +02:00
Willy Tarreau
c0e16f208d MEDIUM: backend: turn all conn->addr.{from,to} to conn->{src,dst}
All reads were carefully reviewed for only reading already checked
values. Assignments were commented indicating that an allocation will
be needed once they become dynamic. The memset() used to clear the
addresses should then be turned to a free() and a NULL assignment.
2019-07-19 13:50:09 +02:00
Willy Tarreau
9a1efe1e15 MINOR: http: convert conn->addr.from to conn->src in sample fetches
These calls are safe because the address' validity was already checked
prior to reaching that code.
2019-07-19 13:50:09 +02:00
Willy Tarreau
44a7d8ee89 MINOR: frontend: switch from conn->addr.{from,to} to conn->{src,dst}
All these values were already checked, it's safe to use them as-is.
2019-07-19 13:50:09 +02:00
Willy Tarreau
b3c81cbbbf MINOR: checks: replace conn->addr.to with conn->dst
Two places will require a dynamic address allocation since the connection
is created from scratch. For the source address it looks like the
clear_addr() call will simply have to be removed as the pointer will
already be NULL.
2019-07-19 13:50:09 +02:00
Willy Tarreau
6c6365f455 MINOR: log: use conn->{src,dst} instead of conn->addr.{from,to}
This is used to retrieve the addresses to be logged (client, frontend,
backend, server). In all places the validity check was already performed.
2019-07-19 13:50:09 +02:00
Willy Tarreau
3f4fa0964c MINOR: sockpair: use conn->dst for the target address in ->connect()
No extra check is needed since the destination must be set there.
2019-07-19 13:50:09 +02:00
Willy Tarreau
ca9f5a927a MINOR: unix: use conn->dst for the target address in ->connect()
No extra check is needed since the destination must be set there.
2019-07-19 13:50:09 +02:00
Willy Tarreau
7bbc4a511f MINOR: tcp: replace conn->addr.{from,to} with conn->{src,dst}
Most of the locations were already safe, only two places needed to have
one extra check to avoid assuming that cli_conn->src is necessarily set
(it is in practice but let's stay safe).
2019-07-19 13:50:09 +02:00
Willy Tarreau
4d3c60ad8d MINOR: session: use conn->src instead of conn->addr.from
In session_accept_fd() we'll soon have to dynamically allocate the
address, or better, steal it from the caller and define a strict calling
convention regarding who's responsible for the freeing. In the simpler
session_prepare_log_prefix(), just add an attempt to retrieve the address
if not yet set and do not dereference it on failure.
2019-07-19 13:50:09 +02:00
Willy Tarreau
026efc71c8 MINOR: proxy: switch to conn->src in error snapshots
The source address was taken unchecked from a client connection. In
practice we know it's set but better strengthen this now.
2019-07-19 13:50:09 +02:00
Willy Tarreau
71e34c186a MINOR: stream: switch from conn->addr.{from,to} to conn->{src,dst}
No allocation is needed there. Some extra checks were added in the
stream dump code to make sure the source address is effectively valid
(it always is but it doesn't cost much to be certain).
2019-07-19 13:50:09 +02:00
Willy Tarreau
a48f4b3254 MINOR: htx: switch from conn->addr.{from,to} to conn->{src,dst}
One place (transparent proxy) will require an allocation when the
address becomes dynamic. A few dereferences of the family were adjusted
to preliminary check for the address pointer to exist at all. The
remaining operations were already performed under control of a
successful retrieval.
2019-07-19 13:50:09 +02:00
Willy Tarreau
3ca149018d MINOR: peers: use conn->dst for the peer's target address
The target address is duplicated from the peer's configured one. For
now we keep the target address as-is but we'll have to dynamically
allocate it and place it into the stream instead. Maybe a sockaddr_dup()
will help by the way.

The "show peers" part is safe as it's already called after checking
the addresses' validity.
2019-07-19 13:50:09 +02:00
Willy Tarreau
9da9a6fdca MINOR: lua: switch to conn->dst for a connection's target address
This one will soon need a dynamic allocation, though this will be
temporary as ideally the address will be placed on the stream and no
connection will be allocated anymore.
2019-07-19 13:50:09 +02:00
Willy Tarreau
085a1513ad MINOR: ssl-sock: use conn->dst instead of &conn->addr.to
This part can be definitive as the check was already in place.
2019-07-19 13:50:09 +02:00
Willy Tarreau
226572f55f MINOR: connection: use conn->{src,dst} instead of &conn->addr.{from,to}
This is in preparation for the switch to dynamic address allocation,
let's migrate the code using the old fields to the pointers instead.
Note that no extra check was added for now, the purpose is only to
get the code to use the pointers and still work.

In the proxy protocol message handling we make sure the addresses are
properly allocated before declaring them unset.
2019-07-19 13:50:09 +02:00
Willy Tarreau
cd7ca79e6c MINOR: http: check the source address via conn_get_src() in sample fetch functions
In smp_fetch_url32_src() and smp_fetch_base32_src() it's better to
validate that the source address was properly initialized since it
will soon be dynamic, thus let's call conn_get_src().
2019-07-19 13:50:09 +02:00
Willy Tarreau
428d8e32f4 MINOR: lua: use conn_get_{src,dst} to retrieve connection addresses
This replaces the previous conn_get_{from,to}_addr() and reuses the
existing error checks.
2019-07-19 13:50:09 +02:00
Willy Tarreau
83b5890b47 MINOR: http/htx: use conn_get_dst() to retrieve the destination address
When adding the X-Original-To header, let's use conn_get_dst() and make
sure it succeeds, since  previous call to conn_get_to_addr() was unchecked.
2019-07-19 13:50:09 +02:00
Willy Tarreau
8fa9984a17 MINOR: log: use conn_get_{dst,src}() to retrieve the cli/frt/bck/srv/ addresses
This also allows us to check that the operation succeeded without
logging whatever remained in the memory area in case of failure.
2019-07-19 13:50:09 +02:00
Willy Tarreau
8dfffdb060 MINOR: stream/cli: use conn_get_{src,dst} in "show sess" and "show peers" output
The stream outputs requires to retrieve connections sources and
destinations. The previous call involving conn_get_{to,from}_addr()
was missing a status check which has now been integrated with the
new call since these places already handle connection errors there.

The same code parts were reused for "show peers" and were modified
similarly.
2019-07-19 13:50:09 +02:00
Willy Tarreau
7bb447c3dd MINOR: stream-int: use conn_get_{src,dst} in conn_si_send_proxy()
These ones replace the previous conn_get_{from,to}_addr() used to wait
for the connection establishment before sending a LOCAL line. The
error handling was preserved.
2019-07-19 13:50:09 +02:00
Willy Tarreau
dddd2b422f MINOR: tcp: replace various calls to conn_get_{from,to}_addr with conn_get_{src,dst}
These calls include the operation's status. When the check was already
present, it was merged with the call. when it was not present, it was
added.
2019-07-19 13:50:09 +02:00
Willy Tarreau
f5bdb64d35 MINOR: ssl: switch to conn_get_dst() to retrieve the destination address
This replaces conn_get_to_addr() and the subsequent check.
2019-07-19 13:50:09 +02:00
Willy Tarreau
3cc01d84b3 MINOR: backend: switch to conn_get_{src,dst}() for port and address mapping
The backend connect code uses conn_get_{from,to}_addr to forward addresses
in transparent mode and to map server ports, without really checking if the
operation succeeds. In preparation of future changes, let's switch to
conn_get_{src,dst}() and integrate status check for possible failures.
2019-07-19 13:50:09 +02:00
Willy Tarreau
a0a4b09d08 MINOR: frontend: switch to conn_get_{src,dst}() for logging and debugging
The frontend accept code uses conn_get_{from,to}_addr for logging and
debugging, without really checking if the operation succeeds. In
preparation of future changes, let's switch to conn_get_{src,dst}() and
integrate status check for possible failures.
2019-07-19 13:50:09 +02:00
Christopher Faulet
03627245c6 BUG/MEDIUM: mux-h1: Trim excess server data at the end of a transaction
At the end of a transaction, when the conn_stream is detach from the H1
connection, on the server side, we must release the input buffer to trim any
excess data received from the server to be sure to block invalid responses.  A
typical example of such data would be from a buggy server responding to a HEAD
with some data, or sending more than the advertised content-length.

This issue was reported on Gitbub. See issue #176.

This patch must be backported to 2.0 and 1.9.
2019-07-19 11:39:19 +02:00
Christopher Faulet
f89f0991f6 MINOR: config: Warn only if the option http-use-htx is used with "no" prefix
No warning message is emitted anymore if the option is used to enable the
HTX. But it is still diplayed when the "no" prefix is used to disable the HTX
explicitly. So, for existing configs, we display a warning only if there is a
change in the behavior of HAProxy between the 2.1 and the previous versions.
2019-07-19 11:39:19 +02:00
Willy Tarreau
2ab5c38359 BUG/MINOR: checks: do not exit tcp-checks from the middle of the loop
There's a comment above tcpcheck_main() clearly stating that no return
statement should be placed in the middle, still we did have one after
installing the mux. It looks mostly harmless though as it will only
fail to mark the server as being in error in case of allocation failure
or config issue.

This fix should be backported to 2.0 and probably 1.9 as well.
2019-07-19 11:03:54 +02:00
Christopher Faulet
4da05478e3 CLEANUP: mux-h2: Remove unused flags H2_SF_CHNK_*
Since the legacy HTTP code was removed, these flags are unused anymore.
2019-07-19 09:46:23 +02:00
Christopher Faulet
39566d1892 BUG/MINOR: session: Send a default HTTP error if accept fails for a H1 socket
If session_accept_fd() fails for a raw HTTP socket, we try to send an HTTP error
500. But we must not rely on error messages of the proxy or on the array
http_err_chunks because these are HTX messages. And it should be too expensive
to convert an HTX message to a raw message at this place. So instead, we send a
default HTTP error message from the array http_err_msgs.

This patch must be backported to 2.0 and 1.9.
2019-07-19 09:46:23 +02:00
Christopher Faulet
76f4c370f1 BUG/MINOR: session: Emit an HTTP error if accept fails only for H1 connection
If session_accept_fd() fails for a raw HTTP socket, we try to send an HTTP error
500. But, we must also take care it is an HTTP/1 connection. We cannot rely on
the mux at this stage, because the error, if any, happens before or during its
creation. So, instead, we check if the mux_proto is specified or not. Indeed,
the mux h1 cannot be forced on the bind line and there is no ALPN to choose
another mux on a raw socket. So if there is no mux_proto defined for a raw HTTP
socket, we are sure to have an HTTP/1 connection.

This patch must be backported to 2.0 and 1.9.
2019-07-19 09:46:23 +02:00
Christopher Faulet
f734638976 MINOR: http: Don't store raw HTTP errors in chunks anymore
Default HTTP error messages are stored in an array of chunks. And since the HTX
was added, these messages are also converted in HTX and stored in another
array. But now, the first array is not used anymore because the legacy HTTP mode
was removed.

So now, only the array with the HTX messages are kept. The other one was
removed.
2019-07-19 09:46:23 +02:00
Christopher Faulet
41ba36f8b2 MINOR: global: Preset tune.max_http_hdr to its default value
By default, this tune parameter is set to MAX_HTTP_HDR. This assignment is done
after the configuration parsing, when we check the configuration validity. So
during the configuration parsing, its value is 0. Now, it is set to MAX_HTTP_HDR
from the start. So, it is possible to rely on it during the configuration
parsing.
2019-07-19 09:46:23 +02:00
Christopher Faulet
1b6adb4a51 MINOR: proxy/http_ana: Remove unused req_exp/rsp_exp and req_add/rsp_add lists
The keywords req* and rsp* are now unsupported. So the corresponding lists are
now unused. It is safe to remove them from the structure proxy.

As a result, the code dealing with these rules in HTTP analyzers was also
removed.
2019-07-19 09:24:12 +02:00
Christopher Faulet
8c3b63ae1d MINOR: proxy: Remove the unused list of block rules
The keyword "block" is now unsupported. So the list of block rules is now
unused. It can be safely removed from the structure proxy.
2019-07-19 09:24:12 +02:00
Christopher Faulet
a6a56e6483 MEDIUM: config: Remove parsing of req* and rsp* directives
It was announced for the 2.1. Following keywords are now unsupported:

  * reqadd, reqallow, reqiallow, reqdel, reqidel, reqdeny, reqideny, reqpass,
    reqipass, reqrep, reqirep reqtarpit, reqitarpit

  * rspadd, rspdel, rspidel, rspdeny, rspideny, rsprep, rspirep

a fatal error is emitted if one of these keyword is found during the
configuraion parsing.
2019-07-19 09:24:12 +02:00
Christopher Faulet
73e8ede156 MINOR: proxy: Remove support of the option 'http-tunnel'
The option 'http-tunnel' is deprecated and it was only used in the legacy HTTP
mode. So this option is now totally ignored and a warning is emitted during
HAProxy startup if it is found in a configuration file.
2019-07-19 09:24:12 +02:00
Christopher Faulet
fc9cfe4006 REORG: proto_htx: Move HTX analyzers & co to http_ana.{c,h} files
The old module proto_http does not exist anymore. All code dedicated to the HTTP
analysis is now grouped in the file proto_htx.c. So, to finish the polishing
after removing the legacy HTTP code, proto_htx.{c,h} files have been moved in
http_ana.{c,h} files.

In addition, all HTX analyzers and related functions prefixed with "htx_" have
been renamed to start with "http_" instead.
2019-07-19 09:24:12 +02:00
Christopher Faulet
a8a46e2041 CLEANUP: proto_http: Move remaining code from proto_http.c to proto_htx.c 2019-07-19 09:24:12 +02:00
Christopher Faulet
eb2754bef8 CLEANUP: proto_http: Remove unecessary includes and comments 2019-07-19 09:24:12 +02:00
Christopher Faulet
22dc248c2a CLEANUP: channel: Remove the unused flag CF_WAKE_CONNECT
This flag is tested or cleared but never set anymore.
2019-07-19 09:24:12 +02:00
Christopher Faulet
cc76d5b9a1 MINOR: proto_http: Remove the unused flag HTTP_MSGF_WAIT_CONN
This flag is set but never used. So remove it.
2019-07-19 09:24:12 +02:00
Christopher Faulet
c41547b66e MINOR: proto_http: Remove unused http txn flags
Many flags of the HTTP transction (TX_*) are now unused and useless. So the
flags TX_WAIT_CLEANUP, TX_HDR_CONN_*, TX_CON_CLO_SET and TX_CON_KAL_SET were
removed. Most of TX_CON_WANT_* were also removed. Only TX_CON_WANT_TUN has been
kept.
2019-07-19 09:24:12 +02:00
Christopher Faulet
67bb3bb0c2 MINOR: hlua: Remove useless test on TX_CON_WANT_* flags
When an HTTP applet is initialized, it is useless to force server-close mode on
the HTTP transaction because the connection mode is now handled by muxes. In
HTX, during analysis, the flag TX_CON_WANT_CLO is set by default in
htx_wait_for_request(), and TX_CON_WANT_SCL is never tested anywere.
2019-07-19 09:24:12 +02:00
Christopher Faulet
711ed6ae4a MAJOR: http: Remove the HTTP legacy code
First of all, all legacy HTTP analyzers and all functions exclusively used by
them were removed. So the most of the functions in proto_http.{c,h} were
removed. Only functions to deal with the HTTP transaction have been kept. Then,
http_msg and hdr_idx modules were entirely removed. And finally the structure
http_msg was lightened of all its useless information about the legacy HTTP. The
structure hdr_ctx was also removed because unused now, just like unused states
in the enum h1_state. Note that the memory pool "hdr_idx" was removed and
"http_txn" is now smaller.
2019-07-19 09:24:12 +02:00
Christopher Faulet
bcac786b36 MINOR: stream: Remove code relying on the legacy HTTP mode
Dump of streams information was updated to remove useless info. And it is not
necessary anymore to update msg->sov..
2019-07-19 09:18:27 +02:00
Christopher Faulet
3d11969a91 MAJOR: filters: Remove code relying on the legacy HTTP mode
This commit breaks the compatibility with filters still relying on the legacy
HTTP code. The legacy callbacks were removed (http_data, http_chunk_trailers and
http_forward_data).

For now, the filters must still set the flag FLT_CFG_FL_HTX to be used on HTX
streams.
2019-07-19 09:18:27 +02:00
Christopher Faulet
b7f8890b19 MINOR: stats: Remove code relying on the legacy HTTP mode
The part of the applet dealing with raw buffer was removed, for the HTTP part
only. So the old functions stats_send_http_headers() and
stats_send_http_redirect() were removed and replaced by the htx ones. The legacy
applet I/O handler was replaced by the htx one. And the parsing of POST data was
purged of the legacy HTTP code.
2019-07-19 09:18:27 +02:00
Christopher Faulet
386a0cda23 MINOR: flt_trace: Remove code relying on the legacy HTTP mode
The legacy HTTP callbacks were removed (trace_http_data,
trace_http_chunk_trailers and trace_http_forward_data). And the loop on the HTTP
headers was updated to only handle HTX messages.
2019-07-19 09:18:27 +02:00
Christopher Faulet
89f2b16530 MEDIUM: compression: Remove code relying on the legacy HTTP mode
The legacy HTTP callbacks were removed (comp_http_data, comp_http_chunk_trailers
and comp_http_forward_data). Functions emitting compressed chunks of data for
the legacy HTTP mode were also removed. The state for the compression filter was
updated accordingly. The compression context and the algorigttm used to compress
data are the only useful information remaining.
2019-07-19 09:18:27 +02:00
Christopher Faulet
95e7ea3c62 MEDIUM: cache: Remove code relying on the legacy HTTP mode
The applet delivering cached objects based on the legacy HTTP code was removed
as the filter callback cache_store_http_forward_data(). And the action analyzing
the response coming from the server to store it in the cache or not was purged
of the legacy HTTP code.
2019-07-19 09:18:27 +02:00
Christopher Faulet
12c28b6579 MINOR: http_act: Remove code relying on the legacy HTTP mode
Actions updating the request or the response start-line are concerned.
2019-07-19 09:18:27 +02:00
Christopher Faulet
a209796c80 MEDIUM: hlua: Remove code relying on the legacy HTTP mode
HTTP applets are concerned and functions of the HTTP class too.
2019-07-19 09:18:27 +02:00
Christopher Faulet
7d37fbb753 MEDIUM: backend: Remove code relying on the HTTP legacy mode
The L7 loadbalancing algorithms are concerned (uri, url_param and hdr), the
"sni" parameter on the server line and the "source" parameter on the server line
when used with "use_src hdr_ip()".
2019-07-19 09:18:27 +02:00
Christopher Faulet
4cb2828e96 MINOR: proxy: Don't adjust connection mode of HTTP proxies anymore
This was only used for the legacy HTTP mode where the connection mode was
handled by the HTTP analyzers. In HTX, the function http_adjust_conn_mode() does
nothing. The connection mode is handled by the muxes.
2019-07-19 09:18:27 +02:00
Christopher Faulet
28b18c5e21 CLEANUP: proxy: Remove the flag PR_O2_USE_HTX
This flag is now unused. So we can safely remove it.
2019-07-19 09:18:27 +02:00
Christopher Faulet
8f7fe1c9d7 MINOR: cache: Remove tests on the option 'http-use-htx'
All cache filters now store HTX messages. So it is useless to test if a cache is
used at the same time by a legacy HTTP proxy and an HTX one.
2019-07-19 09:18:27 +02:00
Christopher Faulet
280f85b153 MINOR: hlua: Remove tests on the option 'http-use-htx' to reject TCP applets
TCP applets are now forbidden for all HTTP proxies because all of them use the
HTX mode. So we don't rely anymore on the flag PR_O2_USE_HTX to do so.
2019-07-19 09:18:27 +02:00
Christopher Faulet
60d29b37b2 MINOR: proxy: Remove tests on the option 'http-use-htx' during H1 upgrade
To know if an upgrade from TCP to H1 must be performed, we now only need to know
if a non HTX stream is assigned to an HTTP backend. So we don't rely anymore on
the flag PR_O2_USE_HTX to handle such upgrades.
2019-07-19 09:18:27 +02:00
Christopher Faulet
3494c63770 MINOR: stream: Remove tests on the option 'http-use-htx' in stream_new()
All streams created for an HTTP proxy must now use the HTX internal
resprentation. So, it is no more necessary to test the flag PR_O2_USE_HTX. It
means a stream is an HTX stream if the frontend is an HTTP proxy or if the
frontend multiplexer, if any, set the flag MX_FL_HTX.
2019-07-19 09:18:27 +02:00
Christopher Faulet
0d79c67103 MINOR: config: Remove tests on the option 'http-use-htx'
All proxies have now the option PR_O2_USE_HTX set. So it is useless to still
test it when the validity of the configuratio is checked.
2019-07-19 09:18:27 +02:00
Christopher Faulet
6d1dd46917 MEDIUM: http_fetch: Remove code relying on HTTP legacy mode
Since the legacy HTTP mode is disbabled, all HTTP sample fetches work on HTX
streams. So it is safe to remove all code relying on HTTP legacy mode. Among
other things, the function smp_prefetch_http() was removed with the associated
macros CHECK_HTTP_MESSAGE_FIRST() and CHECK_HTTP_MESSAGE_FIRST_PERM().
2019-07-19 09:18:27 +02:00
Christopher Faulet
9a7e8ce4eb MINOR: stream: Rely on HTX analyzers instead of legacy HTTP ones
Since the legacy HTTP mode is disabled, old HTTP analyzers do nothing but call
those of the HTX. So, it is safe to directly call HTX analyzers from
process_stream().
2019-07-19 09:18:27 +02:00
Christopher Faulet
c985f6c5d8 MINOR: connection: Remove the multiplexer protocol PROTO_MODE_HTX
Since the legacy HTTP mode is disabled and no multiplexer relies on it anymore,
there is no reason to have 2 multiplexer protocols for the HTTP. So the protocol
PROTO_MODE_HTX was removed and all HTTP multiplexers use now PROTO_MODE_HTTP.
2019-07-19 09:18:27 +02:00
Christopher Faulet
5ed8353dcf CLEANUP: h2: Remove functions converting h2 requests to raw HTTP/1.1 ones
Because the h2 multiplexer only uses the HTX mode, following H2 functions were
removed :

  * h2_prepare_h1_reqline
  * h2_make_h1_request()
  * h2_make_h1_trailers()
2019-07-19 09:18:27 +02:00
Christopher Faulet
9b79a1025d MEDIUM: mux-h2: Remove support of the legacy HTTP mode
Now the H2 multiplexer only works in HTX. Code relying on the legacy HTTP mode
was removed.
2019-07-19 09:18:27 +02:00
Christopher Faulet
319303739a MAJOR: http: Deprecate and ignore the option "http-use-htx"
From this commit, the legacy HTTP mode is now definitely disabled. It is the
first commit of a long series to remove the legacy HTTP code. Now, all HTTP
processing is done using the HTX internal representation. Since the version 2.0,
It is the default mode. So now, it is no more possible to disable the HTX to
fallback on the legacy HTTP mode. If you still use "[no] option http-use-htx", a
warning will be emitted during HAProxy startup. Note the passthough multiplexer
is now only usable for TCP proxies.
2019-07-19 09:18:27 +02:00
Christopher Faulet
2bf43f0746 MINOR: htx: Use an array of char to store HTX blocks
Instead of using a array of (struct block), it is more natural and intuitive to
use an array of char. Indeed, not only (struct block) are stored in this array,
but also their payload.
2019-07-19 09:18:27 +02:00
Christopher Faulet
192c6a23d4 MINOR: htx: Deduce the number of used blocks from tail and head values
<head> and <tail> fields are now signed 32-bits integers. For an empty HTX
message, these fields are set to -1. So the field <used> is now useless and can
safely be removed. To know if an HTX message is empty or not, we just compare
<head> against -1 (it also works with <tail>). The function htx_nbblks() has
been added to get the number of used blocks.
2019-07-19 09:18:27 +02:00
Christopher Faulet
5a916f7326 CLEANUP: htx: Remove the unsued function htx_add_blk_type_size() 2019-07-19 09:18:27 +02:00
Christopher Faulet
3b21972061 DOC: htx: Update comments in HTX files
This patch may be backported to 2.0 to have accurate comments.
2019-07-19 09:18:27 +02:00
Christopher Faulet
c63231df55 MINOR: proto_htx: Don't stop forwarding when there is a post-connect processing
The TXN flag HTTP_MSGF_WAIT_CONN is now ignored on HTX streams. There is no
reason to not start to forward data in HTX. This is required for the legacy mode
and this was copied from it during the HTX development. But it is simply
useless.
2019-07-19 09:18:27 +02:00
Christopher Faulet
b5f86f116b MINOR: backend/htx: Don't rewind output data to set the sni on a srv connection
Rewind on output data is useless for HTX streams.
2019-07-19 09:18:27 +02:00
Christopher Faulet
304cc40536 MINOR: proto_htx: Add the function htx_return_srv_error()
Instead of using a function from the legacy HTTP, the HTX code now uses its own
one.
2019-07-19 09:18:27 +02:00
Christopher Faulet
00618aadf9 MINOR: proto_htx: Rely on the HTX function to apply a redirect rules
There is no reason to use the legacy HTTP version here, which falls back on the
HTX version in this case.
2019-07-19 09:18:27 +02:00
Christopher Faulet
75b4cd967d MINOR: proto_htx: Directly call htx_check_response_for_cacheability()
Instead of using the HTTP legacy version.
2019-07-19 09:18:27 +02:00
Christopher Faulet
4d0e263079 BUG/MINOR: hlua: Make the function txn:done() HTX aware
The function hlua_txn_done() still relying, for the HTTP, on the legacy HTTP
mode. Now, for HTX streams, it calls the function htx_reply_and_close().

This patch must be backported to 2.0 and 1.9.
2019-07-19 09:18:27 +02:00
Christopher Faulet
5f2c49f5ee BUG/MINOR: cache/htx: Make maxage calculation HTX aware
The function http_calc_maxage() was not updated to be HTX aware. So the header
"Cache-Control" on the response was never parsed to find "max-age" or "s-maxage"
values.

This patch must be backported to 2.0 and 1.9.
2019-07-19 09:18:27 +02:00
Christopher Faulet
7b889cb387 BUG/MINOR: http_htx: Initialize HTX error messages for TCP proxies
Since the HTX is the default mode for all proxies, HTTP and TCP, we must
initialize all HTX error messages for all HTX-aware proxies and not only for
HTTP ones. It is required to support HTTP upgrade for TCP proxies.

This patch must be backported to 2.0.
2019-07-19 09:18:27 +02:00
Christopher Faulet
cd76195061 BUG/MINOR: http_fetch: Fix http_auth/http_auth_group when called from TCP rules
These sample fetches rely on the static fnuction get_http_auth(). For HTX
streams and TCP proxies, this last one gets its HTX message from the request's
channel. When called from an HTTP rule, There is no problem. Bu when called from
TCP rules for a TCP proxy, this buffer is a raw buffer not an HTX message. For
instance, using the following TCP rule leads to a crash :

  tcp-request content accept if { http_auth(Users) }

To fix the bug, we must rely on the HTX message returned by the function
smp_prefetch_htx(). So now, the HTX message is passed as argument to the
function get_http_auth().

This patch must be backported to 2.0 and 1.9.
2019-07-19 09:18:27 +02:00
Christopher Faulet
6d36e1c282 MINOR: mux-h2: Don't adjust anymore the amount of data sent in h2_snd_buf()
Because the infinite forward is HTX aware, it is useless to tinker with the
number of bytes really sent. This was fixed long ago for the H1 and forgotten to
do so for the H2.
2019-07-19 09:18:27 +02:00
Willy Tarreau
09e0203ef4 BUG/MINOR: backend: do not try to install a mux when the connection failed
If si_connect() failed, do not try to install the mux nor to complete
the operations or add the connection to an idle list, and abort quickly
instead. No obvious side effects were identified, but continuing to
allocate some resources after something has already failed seems risky.

This was a result of a prior fix which already wanted to push this code
further : aa089d80b ("BUG/MEDIUM: server: Defer the mux init until after
xprt has been initialized.") but it ought to have pushed it even further
to maintain the error check just after si_connect().

To be backported to 2.0 and 1.9.
2019-07-18 16:49:11 +02:00
Willy Tarreau
69564b1c49 BUG/MEDIUM: http/htx: unbreak option http_proxy
The temporary connection used to hold the target connection's address
was missing a valid target, resulting in a 500 server error being
reported when trying to connect to a remote host. Strangely this
issue was introduced as a side effect of commit 2c52a2b9e ("MEDIUM:
connection: make mux->detach() release the connection") which at
first glance looks unrelated but solidly stops the bisection (note
that by default this part even crashes). It's suspected that the
error only happens when closing and destroys pending data in fact.

Given that this feature was broken very early during 1.8-rc1 development
it doesn't seem to be used often. This must be backported as far as 1.8.
2019-07-18 16:49:11 +02:00
Olivier Houchard
0ba6c85a0b BUG/MEDIUM: checks: Don't attempt to receive data if we already subscribed.
tcpcheck_main() might be called while we already attempted to subscribe, and
failed. There's no point in trying to call rcv_buf() again, and failing
would lead to us trying to subscribe again, which is not allowed.

This should be backported to 2.0 and 1.9.
2019-07-18 16:42:45 +02:00
Willy Tarreau
8280ea97a0 MINOR: applet: make appctx use their own pool
A long time ago, applets were seen as an alternative to connections,
and since their respective sizes were roughly equal it appeared wise
to share the same pool. Nowadays, connections got significantly larger
but applets are not that often used, except for the cache. However
applets are mostly complementary and not alternatives anymore, as
it's very possible not to have a back connection or to share one with
other streams.

The connections will soon lose their addresses and their size will
shrink so much that appctx won't fit anymore. Given that the old
benefits of sharing these pools have long disappeared, let's stop
doing this and have a dedicated pool for appctx.
2019-07-18 10:45:08 +02:00
Willy Tarreau
45726fd458 BUG/MINOR: dns: remove irrelevant dependency on a client connection
The do-resolve action tests for a client connection to the stream and
tries to get the client's address, otherwise it refrains from performing
the resolution. This really makes no sense at all and looks like an
earlier attempt at resolving the client's address to test that the
code was working. Further, it prevents the action from being used
from other places such as an autonomous applet for example, even if
at the moment this use case does not exist.

This patch simply removes the irrelevant test.

This can be backported to 2.0.
2019-07-17 14:11:57 +02:00
Willy Tarreau
7764a57d32 BUG/MEDIUM: threads: cpu-map designating a single thread/process are ignored
Since commit 81492c989 ("MINOR: threads: flatten the per-thread cpu-map"),
we don't keep the proc*thread matrix anymore to represent the full binding
possibilities, but only the proc and thread ones. The problem is that the
per-process binding is not the same for each thread and for the process,
and the proc[] array was assumed to store the per-proc first thread value
when doing this change. Worse, the logic present there tries to deal with
thread ranges and process ranges in a way which automatically exclused the
other possibility (since ranges cannot be used on both) but as such fails
to apply changes if neither the process nor the thread is expressed as a
range.

The real problem comes from the fact that specifying cpu-map 1/1 doesn't
yet reveal if the per-process mask or the per-thread mask needs to be
updated. In practice it's the thread one but then the current storage
doesn't allow to store the binding of the first thread of each other
process in nbproc>1 configurations.

When removing the proc*thread matrix, what ought to have been kept was
both the thread column for process 1 and the process line for threads 1,
but instead only the thread column was kept. This patch reintroduces the
storage of the configuration for the first thread of each process so that
it is again possible to store either the per-thread or per-process
configuration.

As a partial workaround for existing configurations, it is possible to
systematically indicate at least two processes or two threads at once
and map them by pairs or more so that at least two values are present
in the range. E.g :

  # set processes 1-4 to cpus 0-3 :

     cpu-map auto:1-4/1 0 1 2 3
  # or:
     cpu-map 1-2/1 0 1
     cpu-map 2-3/1 2 3

  # set threads 1-4 to cpus 0-3 :

     cpu-map auto:1/1-4 0 1 2 3
  # or :
     cpu-map 1/1-2 0 1
     cpu-map 3/3-4 2 3

This fix must be backported to 2.0.
2019-07-16 15:23:09 +02:00
Andrew Heberle
9723696759 MEDIUM: mworker-prog: Add user/group options to program section
This patch adds "user" and "group" config options to the "program"
section so the configured command can be run as a different user.
2019-07-15 16:43:16 +02:00
Willy Tarreau
7df8ca6296 BUG/MEDIUM: tcp-check: unbreak multiple connect rules again
The last connect rule used to be ignored and that was fixed by commit
248f1173f ("BUG/MEDIUM: tcp-check: single connect rule can't detect
DOWN servers") during 1.9 development. However this patch went a bit
too far by not breaking out of the loop after a pending connect(),
resulting in a series of failed connect() to be quickly skipped and
only the last one to be taken into account.

Technically speaking the series is not exactly skipped, it's just that
TCP checks suffer from a design issue which is that there is no
distinction between a new rule and this rule's completion in the
"connect" rule handling code. As such, when evaluating TCPCHK_ACT_CONNECT
a new connection is created regardless of any previous connection in
progress, and the previous result is ignored. It seems that this issue
is mostly specific to the connect action if we refer to the comments at
the top of the function, so it might be possible to durably address it
by reworking the connect state.

For now this patch does something simpler, it restores the behaviour
before the commit above consisting in breaking out of the loop when
the connection is in progress and after skipping comment rules. This
way we fall back to the default code waiting for completion.

This patch must be backported as far as 1.8 since the commit above
was backported there. Thanks to Jérôme Magnin for reporting and
bisecting this issue.
2019-07-15 11:10:36 +02:00
Willy Tarreau
9cca8dfc0b BUG/MINOR: mux-pt: do not pretend there's more data after a read0
Commit 8706c8131 ("BUG/MEDIUM: mux_pt: Always set CS_FL_RCV_MORE.")
was a bit excessive in setting this flag, it refrained from removing
it after read0 unless it was on an empty call. The problem it causes
is that read0 is thus ignored on the first call :

  $ strace -tts200 -e trace=recvfrom,epoll_wait,sendto  ./haproxy -db -f tcp.cfg
  06:34:23.956897 recvfrom(9, "blah\n", 15360, 0, NULL, NULL) = 5
  06:34:23.956938 recvfrom(9, "", 15355, 0, NULL, NULL) = 0
  06:34:23.956958 recvfrom(9, "", 15355, 0, NULL, NULL) = 0
  06:34:23.957033 sendto(8, "blah\n", 5, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 5
  06:34:23.957229 epoll_wait(3, [{EPOLLIN|EPOLLHUP|EPOLLRDHUP, {u32=8, u64=8}}], 200, 0) = 1
  06:34:23.957297 recvfrom(8, "", 15360, 0, NULL, NULL) = 0

If CO_FL_SOCK_RD_SH is reported by the transport layer, it indicates the
read0 was already seen thus we must not try again and we must immedaitely
report it. The simple fix consists in removing the test on ret==0 :

  $ strace -tts200 -e trace=recvfrom,epoll_wait,sendto  ./haproxy -db -f tcp.cfg
  06:44:21.634835 recvfrom(9, "blah\n", 15360, 0, NULL, NULL) = 5
  06:44:21.635020 recvfrom(9, "", 15355, 0, NULL, NULL) = 0
  06:44:21.635056 sendto(8, "blah\n", 5, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 5
  06:44:21.635269 epoll_wait(3, [{EPOLLIN|EPOLLHUP|EPOLLRDHUP, {u32=8, u64=8}}], 200, 0) = 1
  06:44:21.635330 recvfrom(8, "", 15360, 0, NULL, NULL) = 0

The issue is minor, it only results in extra syscalls and CPU usage.
This fix should be backported to 2.0 and 1.9.
2019-07-15 06:47:54 +02:00
Olivier Houchard
4bd5867627 BUG/MEDIUM: streams: Don't redispatch with L7 retries if redispatch isn't set.
Move the logic to decide if we redispatch to a new server from
sess_update_st_cer() to a new inline function, stream_choose_redispatch(), and
use it in do_l7_retry() instead of just setting the state to SI_ST_REQ.
That way, when using L7 retries, we won't redispatch the request to another
server except if "option redispatch" is used.

This should be backported to 2.0.
2019-07-12 16:17:50 +02:00
Olivier Houchard
29cac3c5f7 BUG/MEDIUM: streams: Don't give up if we couldn't send the request.
In htx_request_forward_body(), don't give up if we failed to send the request,
and we have L7 retries activated. If we do, we will not retry when we should.

This should be backported to 2.0.
2019-07-12 16:17:50 +02:00
Dave Pirotte
234740f65d BUG/MINOR: mux-h1: Correctly report Ti timer when HTX and keepalives are used
When HTTP keepalives are used in conjunction with HTX, the Ti timer
reports the elapsed time since the beginning of the connection instead
of the end of the previous request as stated in the documentation. Th,
Tq and Tt also report incorrectly as a result.

When creating a new h1s, check if it is the first request on the
connection. If not, set the session create times to the current
timestamp rather than the initial session accept timestamp. This makes
the logged timers behave as stated in the documentation.

This fix should be backported to 1.9 and 2.0.
2019-07-12 16:14:12 +02:00
Christopher Faulet
37243bc61f BUG/MEDIUM: mux-h1: Don't release h1 connection if there is still data to send
When the h1 stream (h1s) is detached, If the connection is not really shutdown
yet and if there is still some data to send, the h1 connection (h1c) must not be
released. Otherwise, the remaining data are lost. This bug was introduced by the
commit 3ac0f430 ("BUG/MEDIUM: mux-h1: Always release H1C if a shutdown for
writes was reported").

Here is the conditions to release an h1 connection when the h1 stream is
detached :

  * An error or a shutdown write occurred on the connection
    (CO_FL_ERROR|CO_FL_SOCK_WR_SH)

  * an error, an h2 upgrade or full shutdown occurred on the h1 connection
    (H1C_F_CS_ERROR||H1C_F_UPG_H2C|H1C_F_CS_SHUTDOWN)

  * A shutdown write is pending on the h1 connection and there is no more data
    in the output buffer
    ((h1c->flags & H1C_F_CS_SHUTW_NOW) && !b_data(&h1c->obuf))

If one of these conditions is fulfilled, the h1 connection is
released. Otherwise, the release is delayed. If we are waiting to send remaining
data, a timeout is set.

This patch must be backported to 2.0 and 1.9. It fixes the issue #164.
2019-07-12 10:06:41 +02:00
Willy Tarreau
f2cb169487 BUG/MAJOR: listener: fix thread safety in resume_listener()
resume_listener() can be called from a thread not part of the listener's
mask after a curr_conn has gone lower than a proxy's or the process' limit.
This results in fd_may_recv() being called unlocked if the listener is
bound to only one thread, and quickly locks up.

This patch solves this by creating a per-thread work_list dedicated to
listeners, and modifying resume_listener() so that it bounces the listener
to one of its owning thread's work_list and waking it up. This thread will
then call resume_listener() again and will perform the operation on the
file descriptor itself. It is important to do it this way so that the
listener's state cannot be modified while the listener is being moved,
otherwise multiple threads can take conflicting decisions and the listener
could be put back into the global queue if the listener was used at the
same time.

It seems like a slightly simpler approach would be possible if the locked
list API would provide the ability to return a locked element. In this
case the listener would be immediately requeued in dequeue_all_listeners()
without having to go through resume_listener() with its associated lock.

This fix must be backported to all versions having the lock-less accept
loop, which is as far as 1.8 since deadlock fixes involving this feature
had to be backported there. It is expected that the code should not differ
too much there. However, previous commit "MINOR: task: introduce work lists"
will be needed as well and should not present difficulties either. For 1.8,
the commits introducing thread_mask() and LIST_ADDED() will be needed as
well, either backporting my_flsl() or switching to my_ffsl() will be OK,
and some changes will have to be performed so that the init function is
properly called (and maybe the deinit one can be dropped).

In order to test for the fix, simply set up a multi-threaded frontend with
multiple bind lines each attached to a single thread (reproduced with 16
threads here), set up a very low maxconn value on the frontend, and inject
heavy traffic on all listeners in parallel with slightly more connections
than the configured limit ( typically +20%) so that it flips very
frequently. If the bug is still there, at some point (5-20 seconds) the
traffic will go much lower or even stop, either with spinning threads or
not.
2019-07-12 09:07:48 +02:00
Willy Tarreau
64e6012eb9 MINOR: task: introduce work lists
Sometimes we need to delegate some list processing to a function running
on another thread. In this case the list element will simply be queued
into a dedicated self-locked list and the task responsible for this list
will be woken up, calling the associated function which will run over the
list.

This is what work_list does. Such lists will be dedicated to a limited
type of work but will significantly ease such remote handling. A function
is provided to create these per-thread lists, their tasks and to properly
bind each task to a distinct thread, so that the caller only has to store
the resulting pointer to the start of the structure.

These structures should not be abused though as each head will consume
4 pointers per thread, hence 32 bytes per thread or 2 kB for 64 threads.
2019-07-12 09:07:48 +02:00
Olivier Houchard
4be7190c10 BUG/MEDIUM: servers: Fix a race condition with idle connections.
When we're purging idle connections, there's a race condition, when we're
removing the connection from the idle list, to add it to the list of
connections to free, if the thread owning the connection tries to free it
at the same time.
To fix this, simply add a per-thread lock, that has to be hold before
removing the connection from the idle list, and when, in conn_free(), we're
about to remove the connection from every list. That way, we know for sure
the connection will stay valid while we remove it from the idle list, to add
it to the list of connections to free.
This should happen rarely enough that it shouldn't have any impact on
performances.
This has not been reported yet, but could provoke random segfaults.

This should be backported to 2.0.
2019-07-11 16:16:38 +02:00
Frédéric Lécaille
51596c166b CLEANUP: proto_tcp: Remove useless header inclusions.
I guess "sys/un.h" and "sys/stat.h" were included for debugging purposes when
"proto_tcp.c" was initially created. There are no more useful.
2019-07-11 10:40:20 +02:00
David Carlier
7df4185f3c BUG/MEDIUM: da: cast the chunk to string.
in fetch mode, the output was incorrect, setting the type to string
explicitally.

This should be backported to all stable versions.
2019-07-11 10:20:09 +02:00
Olivier Houchard
bc89ad8d94 BUG/MEDIUM: checks: Don't attempt to read if we destroyed the connection.
In event_srv_chk_io(), only call __event_srv_chk_r() if we did not subscribe
for reading, and if wake_srv_chk() didn't return -1, as it would mean it
just destroyed the connection and the conn_stream, and attempting to use
those to recv data would lead to a crash.

This should be backported to 1.9 and 2.0.
2019-07-10 16:29:12 +02:00
Willy Tarreau
828675421e MINOR: pools: always pre-initialize allocated memory outside of the lock
When calling mmap(), in general the system gives us a page but does not
really allocate it until we first dereference it. And it turns out that
this time is much longer than the time to perform the mmap() syscall.
Unfortunately, when running with memory debugging enabled, we mmap/munmap()
each object resulting in lots of such calls and a high contention on the
allocator. And the first accesses to the page being done under the pool
lock is extremely damaging to other threads.

The simple fact of writing a 0 at the beginning of the page after
allocating it and placing the POOL_LINK pointer outside of the lock is
enough to boost the performance by 8x in debug mode and to save the
watchdog from triggering on lock contention. This is what this patch
does.
2019-07-09 10:40:33 +02:00
Willy Tarreau
3e853ea74d MINOR: pools: release the pool's lock during the malloc/free calls
The malloc and free calls and especially the underlying mmap/munmap()
can occasionally take a huge amount of time and even cause the thread
to sleep. This is visible when haproxy is compiled with DEBUG_UAF which
causes every single pool allocation/free to allocate and release pages.
In this case, when using the locked pools, the watchdog can occasionally
fire under high contention (typically requesting 40000 1M objects in
parallel over 8 threads). Then, "perf top" shows that 50% of the CPU
time is spent in mmap() and munmap(). The reason the watchdog fires is
because some threads spin on the pool lock which is held by other threads
waiting on mmap() or munmap().

This patch modifies this so that the pool lock is released during these
syscalls. Not only this allows other threads to request try to allocate
their data in parallel, but it also considerably reduces the lock
contention.

Note that the locked pools are only used on small architectures where
high thread counts would not make sense, so this will not provide any
benefit in the general case. However it makes the debugging versions
way more stable, which is always appreciated.
2019-07-09 10:40:33 +02:00
Lukas Tribus
4979916134 BUG/MINOR: ssl: revert empty handshake detection in OpenSSL <= 1.0.2
Commit 54832b97 ("BUILD: enable several LibreSSL hacks, including")
changed empty handshake detection in OpenSSL <= 1.0.2 and LibreSSL,
from accessing packet_length directly (not available in LibreSSL) to
calling SSL_state() instead.

However, SSL_state() appears to be fully broken in both OpenSSL and
LibreSSL.

Since there is no possibility in LibreSSL to detect an empty handshake,
let's not try (like BoringSSL) and restore this functionality for
OpenSSL 1.0.2 and older, by reverting to the previous behavior.

Should be backported to 2.0.
2019-07-09 04:47:18 +02:00
Olivier Houchard
a1ab97316f BUG/MEDIUM: servers: Don't forget to set srv_cs to NULL if we can't reuse it.
In connect_server(), if there were already a CS assosciated with the stream,
but we can't reuse it, because the target is different (because we tried a
previous connection, it failed, and we use redispatch so we switched servers),
don't forget to set srv_cs to NULL. Otherwise, if we end up reusing another
connection, we would consider we already have a conn_stream, and we won't
create a new one, so we'd have a new connection but we would not be able to
use it.
This can explain frozen streams and connections stuck in CLOSE_WAIT when
using redispatch.

This should be backported to 1.9 and 2.0.
2019-07-08 16:32:58 +02:00
Christopher Faulet
037b3ebd35 BUG/MEDIUM: stream-int: Don't rely on CF_WRITE_PARTIAL to unblock opposite si
In the function stream_int_notify(), when the opposite stream-interface is
blocked because there is no more room into the input buffer, if the flag
CF_WRITE_PARTIAL is set on this buffer, it is unblocked. It is a way to unblock
the reads on the other side because some data was sent.

But it is a problem during the fast-forwarding because only the stream is able
to remove the flag CF_WRITE_PARTIAL. So it is possible to have this flag because
of a previous send while the input buffer of the opposite stream-interface is
now full. In such case, the opposite stream-interface will be woken up for
nothing because its input buffer is full. If the same happens on the opposite
side, we will have a loop consumming all the CPU.

To fix the bug, the opposite side is now only notify if there is some available
room in its input buffer in the function si_cs_send(), so only if some data was
sent.

This patch must be backported to 2.0 and 1.9.
2019-07-05 14:26:15 +02:00
Christopher Faulet
86162db15c MINOR: stream-int: Factorize processing done after sending data in si_cs_send()
In the function si_cs_send(), what is done when an error occurred on the
connection or the conn_stream or when some successfully data was send via a pipe
or the channel's buffer may be factorized at the function. It slightly simplify
the function.

This patch must be backported to 2.0 and 1.9 because a bugfix depends on it.
2019-07-05 14:26:15 +02:00
Christopher Faulet
0e54d547f1 BUG/MINOR: mux-h1: Don't process input or ouput if an error occurred
It is useless to proceed if an error already occurred. Instead, it is better to
wait it will be catched by the stream or the connection, depending on which is
the first one to detect it.

This patch must be backported to 2.0.
2019-07-05 14:26:15 +02:00
Christopher Faulet
f8db73efbe BUG/MEDIUM: mux-h1: Handle TUNNEL state when outgoing messages are formatted
Since the commit 94b2c7 ("MEDIUM: mux-h1: refactor output processing"), the
formatting of outgoing messages is performed on the message state and no more on
the HTX blocks read. But the TUNNEL state was left out. So, the HTTP tunneling
using the CONNECT method or switching the protocol (for instance,
the WebSocket) does not work.

This issue was reported on Github. See #131. This patch must be backported to
2.0.
2019-07-05 14:26:15 +02:00
Christopher Faulet
16b2be93ad BUG/MEDIUM: lb_fas: Don't test the server's lb_tree from outside the lock
In the function fas_srv_reposition(), the server's lb_tree is tested from
outside the lock. So it is possible to remove it after the test and then call
eb32_insert() in fas_queue_srv() with a NULL root pointer, which is
invalid. Moving the test in the scope of the lock fixes the bug.

This issue was reported on Github, issue #126.

This patch must be backported to 2.0, 1.9 and 1.8.
2019-07-05 14:26:15 +02:00
Christopher Faulet
8f1aa77b42 BUG/MEDIUM: http/applet: Finish request processing when a service is registered
In the analyzers AN_REQ_HTTP_PROCESS_FE/BE, when a service is registered, it is
important to not interrupt remaining processing but just the http-request rules
processing. Otherwise, the part that handles the applets installation is
skipped.

Among the several effects, if the service is registered on a frontend (not a
listen), the forwarding of the request is skipped because all analyzers are not
set on the request channel. If the service does not depends on it, the response
is still produced and forwarded to the client. But the stream is infinitly
blocked because the request is not fully consumed. This issue was reported on
Github, see #151.

So this bug is fixed thanks to the new action return ACT_RET_DONE. Once a
service is registered, the action process_use_service() still returns
ACT_RET_STOP. But now, only rules processing is stopped. As a side effet, the
action http_action_reject() must now return ACT_RET_DONE to really stop all
processing.

This patch must be backported to 2.0. It depends on the commit introducing the
return code ACT_RET_DONE.
2019-07-05 14:26:14 +02:00
Christopher Faulet
2e4843d1d2 MINOR: action: Add the return code ACT_RET_DONE for actions
This code should be now used by action to stop at the same time the rules
processing and the possible following processings. And from its side, the return
code ACT_RET_STOP should be used to only stop rules processing.

So concretely, for TCP rules, there is no changes. ACT_RET_STOP and ACT_RET_DONE
are handled the same way. However, for HTTP rules, ACT_RET_STOP should now be
mapped on HTTP_RULE_RES_STOP and ACT_RET_DONE on HTTP_RULE_RES_DONE. So this
way, a action will have the possibilty to stop all processing or only rules
processing.

Note that changes about the TCP is done in this commit but changes about the
HTTP will be done in another one because it will fix a bug in the same time.

This patch must be backported to 2.0 because a bugfix depends on it.
2019-07-05 14:26:14 +02:00
Frédéric Lécaille
1b9423d214 MINOR: server: Add "no-tfo" option.
Simple patch to add "no-tfo" option to "default-server" and "server" lines
to disable any usage of TCP fast open.

Must be backported to 2.0.
2019-07-04 14:45:52 +02:00
Olivier Houchard
8d82db70a5 BUG/MEDIUM: servers: Authorize tfo in default-server.
There's no reason to forbid using tfo with default-server, so allow it.

This should be backported to 2.0.
2019-07-04 13:34:25 +02:00
Olivier Houchard
2ab3dada01 BUG/MEDIUM: connections: Make sure we're unsubscribe before upgrading the mux.
Just calling conn_force_unsubscribe() from conn_upgrade_mux_fe() is not
enough, as there may be multiple XPRT involved. Instead, require that
any user of conn_upgrade_mux_fe() unsubscribe itself before calling it.
This should fix upgrading a TCP connection to HTX when using SSL.

This should be backported to 2.0.
2019-07-03 13:57:30 +02:00
Christopher Faulet
9060fc02b5 BUG/MINOR: hlua/htx: Respect the reserve when HTX data are sent
The previous commit 7e145b3e2 ("BUG/MINOR: hlua: Don't use
channel_htx_recv_max()") is buggy. The buffer's reserve must be respected.

This patch must be backported to 2.0 and 1.9.
2019-07-03 11:47:20 +02:00
Christopher Faulet
7e145b3e24 BUG/MINOR: hlua: Don't use channel_htx_recv_max()
The function htx_free_data_space() must be used intead. Otherwise, if there are
some output data not already forwarded, the maximum amount of data that may be
inserted into the buffer may be greater than what we can really insert.

This patch must be backported to 2.0 and 1.9.
2019-07-02 21:32:45 +02:00
Olivier Houchard
f494957980 BUG/MEDIUM: checks: Make sure the tasklet won't run if the connection is closed.
wake_srv_chk() can be called from conn_fd_handler(), and may decide to
destroy the conn_stream and the connection, by calling cs_close(). If that
happens, we have to make sure the tasklet isn't scheduled to run, or it will
probably crash trying to access the connection or the conn_stream.
This fixes a crash that can be seen when using tcp checks.

This should be backported to 1.9 and 2.0.
For 1.9, the call should be instead :
task_remove_from_tasklet_list((struct task *)check->wait_list.task);
That function was renamed in 2.0.
2019-07-02 17:45:35 +02:00
Olivier Houchard
6c7e96a3e1 BUG/MEDIUM: connections: Always call shutdown, with no linger.
Revert commit fe4abe62c7.
The goal was to make sure for health-checks, we would not get sockets in
TIME_WAIT. To do so, we would not call shutdown() if linger_risk is set.
However that is wrong, and that means shutw would never be forwarded to
the server, and thus we could get connection that are never properly closed.
Instead, to fix the original problem as described here :
https://www.mail-archive.com/haproxy@formilux.org/msg34080.html
Just make sure the checks code call cs_shutr() before calling cs_shutw().
If shutr has been called, conn_sock_shutw() will make no attempt to call
shutdown(), as it knows close() will be called.
We should really review and revamp the shutr/shutw code, as described in
github issue #142.

This should be backported to 1.9 and 2.0.
2019-07-02 16:40:55 +02:00
Christopher Faulet
b8fc304e8f BUG/MINOR: mux-h1: Don't return the empty chunk on HEAD responses
HEAD responses must not have any body payload. But, because of a bug, for chunk
reponses, the empty chunk was always added.

This patch fixes the Github issue #146. It must be backported to 2.0 and 1.9.
2019-07-01 16:24:01 +02:00
Christopher Faulet
5433a0b021 BUG/MINOR: mux-h1: Skip trailers for non-chunked outgoing messages
Unlike H1, H2 messages may contains trailers while the header "Content-Length"
is set. Indeed, because of the framed structure of HTTP/2, it is no longer
necessary to use the chunked transfer encoding. So Trailing HEADERS frames,
after all DATA frames, may be added on messages with an explicit content length.

But in H1, it is impossible to have trailers on non-chunked messages. So when
outgoing messages are formatted by the H1 multiplexer, if the message is not
chunked, all trailers must be dropped.

This patch must be backported to 2.0 and 1.9. However, the patch will have to be
adapted for the 1.9.
2019-07-01 16:24:01 +02:00
Willy Tarreau
2df8cad0fe BUG/MEDIUM: checks: unblock signals in external checks
As discussed in issue #140, processes are forked with signals blocked
resulting in haproxy's kill being ignored. This happens when the command
takes more time to complete than the configured check timeout or interval.
Just calling "sleep 30" every second makes the problem obvious.

The fix simply consists in unblocking the signals in the child after the
fork. It needs to be backported to all stable branches containing external
checks and where signals are blocked on startup. It's unclear when it
started, but the following config exhibits the issue :

  global
    external-check

  listen www
    bind :8001
    timeout client 5s
    timeout server 5s
    timeout connect 5s
    option external-check
    external-check command "$PWD/sleep10.sh"
    server local 127.0.0.1:80 check inter 200

  $ cat sleep10.sh
  #!/bin/sh
  exec /bin/sleep 10

The "sleep" processes keep accumulating for 10 seconds and stabilize
around 25 when the bug is present. Just issuing "killall sleep" has no
effect on them, and stopping haproxy leaves these processes behind.
2019-07-01 16:03:44 +02:00
William Lallemand
ad03288e6b BUG/MINOR: mworker/cli: don't output a \n before the response
When using a level lower than admin on the master CLI, a \n is output
before the response, this is caused by the response of the "operator" or
"user" that are sent before the actual command.

To fix this problem we introduce the flag APPCTX_CLI_ST1_NOLF which ask
a command response to not be followed by the final \n.
This patch made a special case with the command operator and user
followed by a - so they are not followed by \n.

This patch must be backported to 2.0 and 1.9.
2019-07-01 15:34:11 +02:00
Christopher Faulet
3ac0f43020 BUG/MEDIUM: mux-h1: Always release H1C if a shutdown for writes was reported
We must take care of this when the stream is detached from the
connection. Otherwise, on the server side, the connexion is inserted in the list
of idle connections of the session. But when reused, because the shutdown for
writes was already catched, nothing is sent to the server and the session is
blocked with a freezed connection.

This patch must be backported to 2.0 and 1.9. It is related to the issue #136
reported on Github.
2019-06-28 17:58:15 +02:00
Olivier Houchard
e488ea865a BUG/MEDIUM: ssl: Don't attempt to set alpn if we're not using SSL.
Checks use ssl_sock_set_alpn() to set the ALPN if check-alpn is used, however
check-alpn failed to check if the connection was indeed using SSL, and thus,
would crash if check-alpn was used on a non-SSL connection. Fix this by
making sure the connection uses SSL before attempting to set the ALPN.

This should be backported to 2.0 and 1.9.
2019-06-28 14:12:28 +02:00
Christopher Faulet
d87d3fab25 BUG/MINOR: mux-h1: Make format errors during output formatting fatal
These errors are unexpected at this staged and there is not much more to do than
to close the connection and leave. So now, when it happens, the flag
H1C_F_CS_ERROR is set on the H1 connection and the flag HTX_FL_PARSING_ERROR is
set on the channel's HTX message.

This patch must be backported to 2.0 and 1.9.
2019-06-26 15:23:06 +02:00
Christopher Faulet
e5438b749c BUG/MEDIUM: mux-h1: Use buf_room_for_htx_data() to detect too large messages
During headers parsing, an error is returned if the message is too large and
does not fit in the input buffer. The mux h1 used the function b_full() to do
so. But to allow zero copy transfers, in h1_recv(), the input buffer is
pre-aligned and thus few bytes remains always free.

To fix the bug, as during the trailers parsing, the function
buf_room_for_htx_data() should be used instead.

This patch must be backported to 2.0 and 1.9.
2019-06-26 15:23:06 +02:00
Christopher Faulet
1d5ec0944f BUG/MEDIUM: proto_htx: Don't add EOM on 1xx informational messages
Since the commit b75b5eaf ("MEDIUM: htx: 1xx messages are now part of the final
reponses"), these messages are part of the response and should not contain
EOM. This block is skipped during responses parsing, but analyzers still add it
for "100-Continue" and "103-Eraly-Hints". It can also be added for error files
with 1xx status code.

Now, when HAProxy generate such transitional responses, it does not emit EOM
blocks. And informational messages are now forbidden in error files.

This patch must be backported to 2.0.
2019-06-26 15:23:06 +02:00
Tim Duesterhus
2164800c1b BUG/MINOR: log: Detect missing sampling ranges in config
Consider a config like:

    global
    	log 127.0.0.1:10001 sample :10 local0

No sampling ranges are given here, leading to NULL being passed
as the first argument to qsort.

This configuration does not make sense anyway, a log without ranges
would never log. Thus output an error if no ranges are given.

This bug was introduced in d95ea2897e.
This fix must be backported to HAProxy 2.0.
2019-06-26 11:15:49 +02:00
Christopher Faulet
2f6d3c0d65 BUG/MINOR: memory: Set objects size for pools in the per-thread cache
When a memory pool is created, it may be allocated from a static array. This
happens for "most common" pools, allocated first. Objects of these pools may
also be cached in a pool cache. Of course, to not cache too much entries, we
track the number of cached objects and the total size of the cache.

But the objects size of each pool in the cache (ie, pool_cache[tid][idx].size,
where tid is the thread-id and idx is the index of the pool) was never set. So
the total size of the cache was never limited. Now when a pool is created, if
these objects may be cached, we set the corresponding objects size in the pool
cache.

This patch must be backported to 2.0 and 1.9.
2019-06-26 09:57:49 +02:00
Christopher Faulet
c2518a53ae BUG/MAJOR: mux-h1: Don't crush trash chunk area when outgoing message is formatted
When an outgoing HTX message is formatted before sending it, a trash chunk is
used to do the formatting. Its content is then copied into the output buffer of
the H1 connection. There are some tricks to avoid this last copy. First, if
possible we perform a zero-copy by swapping the area of the HTX buffer with the
one of the output buffer. If zero-copy is not possible, but if the output buffer
is empty, we don't use a trash chunk. To do so, we change the area of the trash
chunk to point on the one of the output buffer. But it is terribly wrong. Trash
chunks are global variables, allocated statically. If the area is changed, the
old one is lost. Worst, the area of the output buffer is dynamically allocated,
so it is released when emptied, leaving the trash chunk with a freed area (in
fact, it is a bit more complicated because buffers are allocated from a memory
pool).

So, honestly, I don't know why we never experienced any problem because this bug
till now. To fix it, we still use a temporary buffer, but we assign it to a
trash chunk only when other solutions were excluded. This way, we never
overwrite the area of a trash chunk.

This patch must be backported to 2.0 and 1.9.
2019-06-26 09:57:49 +02:00
Christopher Faulet
2bce046eea BUG/MINOR: htx: Save hdrs_bytes when the HTX start-line is replaced
The HTX start-line contains the number of bytes held by all headers as seen by
the mux during the parsing. So it must not be updated during analysis. It was
done when the start-line is replaced, so this update was removed at this
place. But we still save it from the old start-line to not loose it. It should
not be used outside the mux, but there is no reason to skip it. It is a bug,
however it should have no impact.

This patch must be backported to 2.0.
2019-06-26 09:57:49 +02:00
William Lallemand
1933801136 BUG/MEDIUM: mworker/cli: command pipelining doesn't work anymore
Since commit 829bd471 ("MEDIUM: stream: rearrange the events to remove
the loop"), the pipelining in the master CLI does not work anymore.

Indeed when doing:

  echo "@1 show info; @2 show info; @3 show info" | socat /tmp/haproxy.master -

the CLI will only show the response of the first command.

When debugging we can observe that the command is sent, but the client
closes the connection before receiving the response.

The problem is that the flag CF_READ_NULL is not cleared when we
reiniate the flags of the response and we rely on this flag to close.

Must be backported in 2.0
2019-06-25 18:15:46 +02:00
Olivier Houchard
0ff28651c1 BUG/MEDIUM: ssl: Don't do anything in ssl_subscribe if we have no ctx.
In ssl_subscribe(), make sure we have a ssl_sock_ctx before doing anything.
When ssl_sock_close() is called, it wakes any subscriber up, and that
subscriber may decide to subscribe again, for some reason. If we no longer
have a context, there's not much we can do.

This should be backported to 2.0.
2019-06-24 19:00:16 +02:00
Olivier Houchard
6c6dc58da0 BUG/MEDIUM: connections: Always add the xprt handshake if needed.
In connect_server(), we used to only call xprt_add_hs() if CO_FL_SEND_PROXY
was set during the function call, we would not do it if the flag was set
before connect_server() was called. The rational at the time was if the flag
was already set, then the XPRT was already present. But now the xprt_handshake
always removes itself, so we have to re-add it each time, or it wouldn't be
done if the first connection attempt failed.
While I'm there, check any non-ssl handshake flag, instead of just
CO_FL_SEND_PROXY, or we'd miss the SOCKS4 flags.

This should be backported to 2.0.
2019-06-24 19:00:16 +02:00
Olivier Houchard
c31e2cbd28 BUG/MEDIUM: stream_interface: Don't add SI_FL_ERR the state is < SI_ST_CON.
Only add SI_FL_ERR if the stream_interface is connected, or is attempting
a connection. We may get there because the stream_interface's tasklet
was woken up, but before it actually runs, process_stream() may be called,
detect that there were an error, and change the state of the stream_interface
to SI_ST_TAR. When the stream_interface's tasklet then run, the connection
may still have CO_FL_ERROR, but that error was already accounted for, so
just ignore it.

This should be backported to 2.0.
2019-06-24 19:00:16 +02:00
William Lallemand
16866670dd BUG/MEDIUM: mworker: don't call the thread and fdtab deinit
Before switching to wait mode, the per thread deinit should not be
called, because we didn't initiate threads and fdtab.

The problem is that the master could crash if we try to reload HAProxy

The commit 944e619 ("MEDIUM: mworker: wait mode use standard init code
path") removed the deinit code by accident, but its fix 7c756a8
("BUG/MEDIUM: mworker: fix FD leak upon reload") was incomplete and did
not took care of the WAIT_MODE.

This fix must be backported in 1.9 and 2.0
2019-06-24 17:54:05 +02:00
Tim Duesterhus
b298613072 BUG/MINOR: spoe: Fix memory leak if failing to allocate memory
Technically harmless, but it annoys clang analyzer.

This bug was introduced in 336d3ef0e7.
This fix should be backported to HAProxy 1.9+.
2019-06-24 14:38:15 +02:00
Tim Duesterhus
2c9e274f45 BUG/MINOR: mworker-prog: Fix segmentation fault during cfgparse
Consider this configuration:

    frontend fe_http
    	mode http
    	bind *:8080

    	default_backend be_http

    backend be_http
    	mode http
    	server example example.com:80

    program foo bar

Running with valgrind results in:

    ==16252== Invalid read of size 8
    ==16252==    at 0x52AE3F: cfg_parse_program (mworker-prog.c:233)
    ==16252==    by 0x4823B3: readcfgfile (cfgparse.c:2180)
    ==16252==    by 0x47BCED: init (haproxy.c:1649)
    ==16252==    by 0x404E22: main (haproxy.c:2714)
    ==16252==  Address 0x48 is not stack'd, malloc'd or (recently) free'd

Check whether `ext_child` is valid before attempting to free it and its
contents.

This bug was introduced in 9a1ee7ac31.
This fix must be backported to HAProxy 2.0.
2019-06-24 10:09:00 +02:00
Willy Tarreau
76a80c710c BUILD: mworker: silence two printf format warnings around getpid()
getpid() is documented as returning a pit pid_t result, not
necessarily an int. This causes a build warning on Solaris 10
because of '%d' or '%u' are used in the format passed to snprintf().
Let's just cast the result as an int (respectively unsigned int).

This can be backported to 2.0 and possibly older versions though
it really has no impact.
2019-06-22 07:57:56 +02:00
Frédéric Lécaille
9417f4534a BUG/MAJOR: sample: Wrong stick-table name parsing in "if/unless" ACL condition.
This bug was introduced by 1b8e68e commit which supposed the stick-table was always
stored in struct arg at parsing time. This is never the case with the usage of
"if/unless" conditions in stick-table declared as backends. In this case, this is
the name of the proxy which must be considered as the stick-table name.

This must be backported to 2.0.
2019-06-21 09:48:28 +02:00
Christopher Faulet
1ae2a88781 BUG/MEDIUM: lb_fwlc: Don't test the server's lb_tree from outside the lock
In the function fwlc_srv_reposition(), the server's lb_tree is tested from
outside the lock. So it is possible to remove it after the test and then call
eb32_insert() in fwlc_queue_srv() with a NULL root pointer, which is
invalid. Moving the test in the scope of the lock fixes the bug.

This issue was reported on Github, issue #126.

This patch must be backported to 2.0, 1.9 and 1.8.
2019-06-19 13:55:57 +02:00
Christopher Faulet
4f09ec812a BUG/MEDIUM: mux-h2: Remove the padding length when a DATA frame size is checked
When a DATA frame is processed for a message with a content-length, we first
take care to not have a frame size that exceeds the remaining to
read. Otherwise, an error is triggered. But we must remove the padding length
from the frame size because the padding is not included in the announced
content-length.

This patch must be backported to 2.0 and 1.9.
2019-06-19 10:06:31 +02:00
Christopher Faulet
dd2a5620d5 BUG/MEDIUM: mux-h2: Reset padlen when several frames are demux
In the function h2_process_demux(), if several frames are parsed, the padding
length must be reset between each frame. Otherwise we may wrongly think a frame
has a padding block because the previous one was padded.

This patch must be backported to 2.0 and 1.9.
2019-06-19 10:06:31 +02:00
Christopher Faulet
3e2638ee04 BUG/MEDIUM: htx: Fully update HTX message when the block value is changed
Everywhere the value length of a block is changed, calling the function
htx_set_blk_value_len(), the HTX message must be updated. But at many places,
because of the recent changes in the HTX structure, this update was only
partially done. tail_addr and head_addr values were not systematically updated.

In fact, the function htx_set_blk_value_len() was designed as an internal
function to the HTX API. And we used it from outside by convenience. But it is
really painfull and error prone to let the caller update the HTX message. So
now, we use the function htx_change_blk_value_len() wherever is possible. It
changes the value length of a block and updates the HTX message accordingly.

This patch must be backported to 2.0.
2019-06-18 10:02:05 +02:00
Tim Duesterhus
721d686bd1 BUG/MEDIUM: compression: Set Vary: Accept-Encoding for compressed responses
Make HAProxy set the `Vary: Accept-Encoding` response header if it compressed
the server response.

Technically the `Vary` header SHOULD also be set for responses that would
normally be compressed based off the current configuration, but are not due
to a missing or invalid `Accept-Encoding` request header or due to the
maximum compression rate being exceeded.

Not setting the header in these cases does no real harm, though: An
uncompressed response might be returned by a Cache, even if a compressed
one could be retrieved from HAProxy. This increases the traffic to the end
user if the cache is unable to compress itself, but it saves another
roundtrip to HAProxy.

see the discussion on the mailing list: https://www.mail-archive.com/haproxy@formilux.org/msg34221.html
Message-ID: 20190617121708.GA2964@1wt.eu

A small issue remains: The User-Agent is not added to the `Vary` header,
despite being relevant to the response. Adding the User-Agent header would
make responses effectively uncacheable and it's unlikely to see a Mozilla/4
in the wild in 2019.

Add a reg-test to ensure the behaviour as described in this commit message.

see issue #121
Should be backported to all branches with compression (i.e. 1.6+).
2019-06-17 18:51:43 +02:00
Christopher Faulet
a110ecbd84 BUG/MINOR: mux-h1: Add the header connection in lower case in outgoing messages
When necessary, this header is directly added in outgoing messages by the H1
multiplexer. Because there is no HTX conversion first, the header name is not
converserted to its lower case version. So, it must be added in lower case by
the multiplexer.

This patch must be backported to 2.0 and 1.9.
2019-06-17 14:15:32 +02:00
Christopher Faulet
ea418748dd BUG/MINOR: lua/htx: Make txn.req_req_* and txn.res_rep_* HTX aware
These bindings were not updated to support HTX streams.

This patch must be backported to 2.0 and 1.9. It fixes the issue #124.
2019-06-17 13:42:45 +02:00
Baptiste Assmann
da29fe2360 MEDIUM: server: server-state global file stored in a tree
Server states can be recovered from either a "global" file (all backends)
or a "local" file (per backend).

The way the algorithm to parse the state file was first implemented was good
enough for a low number of backends and servers per backend.
Basically, for each backend the state file (global or local) is opened,
parsed entirely and for each line we check if it contains data related to
a server from the backend we're currently processing.
We must read the file entirely, just in case some lines for the current
backend are stored at the end of the file.
This does not scale at all!

This patch changes the behavior above for the "global" file only. Now,
the global file is read and parsed once and all lines it contains are
stored in a tree, for faster discovery.
This result in way much less fopen, fgets, and strcmp calls, which make
loading of very big state files very quick now.
2019-06-17 13:40:42 +02:00
Tim Duesterhus
d437630237 MINOR: sample: Add sha2([<bits>]) converter
This adds a converter for the SHA-2 family, supporting SHA-224, SHA-256
SHA-384 and SHA-512.

The converter relies on the OpenSSL implementation, thus only being available
when HAProxy is compiled with USE_OPENSSL.

See GitHub issue #123. The hypothetical `ssl_?_sha256` fetch can then be
simulated using `ssl_?_der,sha2(256)`:

  http-response set-header Server-Cert-FP %[ssl_f_der,sha2(256),hex]
2019-06-17 13:36:42 +02:00
Tim Duesterhus
24915a55da MEDIUM: Remove 'option independant-streams'
It is deprecated with HAProxy 1.5. Time to remove it.
2019-06-17 13:35:54 +02:00
Tim Duesterhus
86e6b6ebf8 MEDIUM: Make '(cli|con|srv)timeout' directive fatal
They were deprecated with HAProxy 1.5. Time to remove them.
2019-06-17 13:35:54 +02:00
Tim Duesterhus
dac168bc15 MEDIUM: Make 'redispatch' directive fatal
It was deprecated with HAProxy 1.5. Time to remove it.
2019-06-17 13:35:54 +02:00
Tim Duesterhus
7b7c47f05c MEDIUM: Make 'block' directive fatal
It was deprecated with HAProxy 1.5. Time to remove it.
2019-06-17 13:35:54 +02:00
Christopher Faulet
0c6de00d7c BUG/MEDIUM: h2/htx: Update data length of the HTX when the cookie list is built
When an H2 request is converted into an HTX message, All cookie headers are
grouped into one, each value separated by a semicolon (;). To do so, we add the
header "cookie" with the first value and then we update the value by appending
other cookies. But during this operation, only the size of the HTX block is
updated. And not the data length of the whole HTX message.

It is an old bug and it seems to work by chance till now. But it may lead to
undefined behaviour by time to time.

This patch must be backported to 2.0 and 1.9
2019-06-17 11:44:51 +02:00
Willy Tarreau
33ccf1cce0 BUILD: pattern: work around an internal compiler bug in gcc-3.4
gcc-3.4 fails to compile pattern.c :

src/pattern.c: In function `pat_match_ip':
src/pattern.c:1092: error: unrecognizable insn:
(insn 186 185 187 9 src/pattern.c:970 (set (reg/f:SI 179)
        (high:SI (const:SI (plus:SI (symbol_ref:SI ("static_pattern") [flags 0x22] <var_decl fe5bae80 static_pattern>)
                    (const_int 8 [0x8]))))) -1 (nil)
    (nil))
src/pattern.c:1092: internal compiler error: in extract_insn, at recog.c:2083

This happens when performing the memcpy() on the union, and in this
case the workaround is trivial (and even cleaner) using a cast instead.
2019-06-16 18:40:33 +02:00
Willy Tarreau
8daa920ae4 BUILD: tools: work around an internal compiler bug in gcc-3.4
gcc-3.4 fails to compile standard.c :

src/standard.c: In function `str2sa_range':
src/standard.c:1034: error: unrecognizable insn:
(insn 582 581 583 37 src/standard.c:949 (set (reg/f:SI 262)
        (high:SI (const:SI (plus:SI (symbol_ref:SI ("*ss.4") [flags 0x22] <var_decl fe782e80 ss>)
                    (const_int 2 [0x2]))))) -1 (nil)
    (nil))
src/standard.c:1034: internal compiler error: in extract_insn, at recog.c:2083

The workaround is explained here :

    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21613

It only requires creating a local variable containing the result of the
cast, which is totally harmless, so let's do it.
2019-06-16 18:16:33 +02:00
Olivier Houchard
965e84e2df BUG/MEDIUM: ssl: Make sure we initiate the handshake after using early data.
When we're done sending/receiving early data, and we add the handshake
flags on the connection, make sure we wake the associated tasklet up, so that
the handshake will be initiated.
2019-06-15 21:00:39 +02:00
Willy Tarreau
b6563f4ac4 BUG/MEDIUM: mux-h2: properly account for the appended data in HTX
When commit 0350b90e3 ("MEDIUM: htx: make htx_add_data() never defragment
the buffer") was introduced, it made htx_add_data() actually be able to
add less data than it was asked for, and the callers must use the returned
value to know how much was added. The H2 code used to rely on the frame
length instead of the return value. A version of the code doing this was
written but is obviously not the one that got merged, resulting in breaking
large uploads or downloads when HTX would have instead defragmented the
buffer because the HTX side sees less contents than what the H2 side sees.

This patch fixes this again. No backport is needed.
2019-06-15 11:42:01 +02:00
Olivier Houchard
8694e5bc99 BUG/MEDIUM: connections: Don't try to send early data if we have no mux.
In connect_server(), if we don't yet have a mux, because we're choosing
one depending on the ALPN, don't attempt to send early data. We can't do
it because those data would depend on the mux, that will only be determined
by the handshake.

This should be backported to 1.9.
2019-06-15 11:35:00 +02:00
Olivier Houchard
b4a8b2c63d BUG/MEDIUM: connections: Don't use ALPN to pick mux when in mode TCP.
In connect_server(), don't wait until we negociate the ALPN to choose the
mux, the only mux we want to use is the mux_pt anyway.

This should be backported to 1.9.
2019-06-15 11:34:55 +02:00
Willy Tarreau
76c83826db BUG/MEDIUM: mux-h2: fix early close with option abortonclose
Olivier found that commit 99ad1b3e8 ("MINOR: mux-h2: stop relying on
CS_FL_REOS") managed to break abortonclose again with H2. What happens
is that while the CS_FL_REOS flag was set on some transitions to the
HREM state, it's not set on all and is in fact only set when the low
level connection is closed. So making the replacement condition match
the HREM and ERROR states is not correct and causes completely correct
requests to send advertise an early close of the connection layer while
only the stream's input is closed.

In order to avoid this, we now properly split the checks for the CLOSED
state and for the closed connection. This way there is no risk to set
the EOS flag too early on the connection.

No backport is needed.
2019-06-15 10:04:09 +02:00
Willy Tarreau
bd20a9dd4e BUG: tasks: fix bug introduced by latest scheduler cleanup
In commit 86eded6c6 ("CLEANUP: tasks: rename task_remove_from_tasklet_list()
to tasklet_remove_*") which consisted in removing the casts between tasks
and tasklet, I was a bit too fast to believe that we only saw tasklets in
this function since process_runnable_tasks() also uses it with tasks under
a cast. So removing the bookkeeping on task_list_size was not appropriate.
Bah, the joy of casts which hide the real thing...

This patch does two things at once to address this mess once for all:
  - it restores the decrement of task_list_size when it's a real task,
    but moves it to process_runnable_task() since it's the only place
    where it's allowed to call it with a task

  - it moves the increment there as well and renames
    task_insert_into_tasklet_list() to tasklet_insert_into_tasklet_list()
    of obvious consistency reasons.

This way the increment/decrement of task_list_size is made at the only
places where the cast is enforced, so it has less risks to be missed.
The comments on top of these functions were updated to reflect that they
are only supposed to be used with tasklets and that the caller is responsible
for keeping task_list_size up to date if it decides to enforce a task there.

Now we don't have to worry anymore about how these functions work outside
of the scheduler, which is better longterm-wise. Thanks to Christopher for
spotting this mistake.

No backport is needed.
2019-06-14 18:16:19 +02:00
Christopher Faulet
cd67bffd26 BUG/MINOR: mux-h1: Wake busy mux for I/O when message is fully sent
If a mux is in busy mode when the outgoing EOM is consummed, it is important to
wake it up for I/O. Because in busy mode, the mux is not subscribed for
receive. Otherwise, it depends on the applicative layer to shutdown the H1
stream. Wake it up allows the mux to catch the read0 as soon as possible.

This patch must be backported to 1.9.
2019-06-14 17:40:10 +02:00
Willy Tarreau
86eded6c69 CLEANUP: tasks: rename task_remove_from_tasklet_list() to tasklet_remove_*
The function really only operates on tasklets, its arguments are always
tasklets cast as tasks to match the function's type, to be cast back to
a struct tasklet. Let's rename it to tasklet_remove_from_tasklet_list(),
take a struct tasklet, and get rid of the undesired task casts.
2019-06-14 14:57:03 +02:00
Willy Tarreau
3c39a7d889 CLEANUP: connection: rename the wait_event.task field to .tasklet
It's really confusing to call it a task because it's a tasklet and used
in places where tasks and tasklets are used together. Let's rename it
to tasklet to remove this confusion.
2019-06-14 14:42:29 +02:00
Baptiste Assmann
95c2c01ced MEDIUM: server: server-state only rely on server name
Since h7da71293e431b5ebb3d6289a55b0102331788ee6as has been added, the
server name (srv->id in the code) is now unique per backend, which
means it can reliabely be used to identify a server recovered from the
server-state file.

This patch cleans up the parsing of server-state file and ensure we use
only the server name as a reliable key.
2019-06-14 14:18:55 +02:00
Christopher Faulet
3b44c54129 MINOR: mux-h2: Forward clients scheme to servers checking start-line flags
By default, the scheme "https" is always used. But when an explicit scheme was
defined and when this scheme is "http", we use it in the request sent to the
server. This is done by checking flags of the start-line. If the flag
HTX_SL_F_HAS_SCHM is set, it means an explicit scheme was defined on the client
side. And if the flag HTX_SL_F_SCHM_HTTP is set, it means the scheme "http" was
used.
2019-06-14 11:13:32 +02:00
Christopher Faulet
42993a86c9 MINOR: mux-h1: Set flags about the request's scheme on the start-line
We first try to figure out if the URI of the start-line is absolute or not. So,
if it does not start by a slash ("/"), it means the URI is an absolute one and
the flag HTX_SL_F_HAS_SCHM is set. Then checks are performed to know if the
scheme is "http" or "https" and the corresponding flag is set,
HTX_SL_F_SCHM_HTTP or HTX_SL_F_SCHM_HTTPS. Other schemes, for instance ftp, are
ignored.
2019-06-14 11:13:32 +02:00
Christopher Faulet
a9a5c04c23 MINOR: h2: Set flags about the request's scheme on the start-line
The flag HTX_SL_F_HAS_SCHM is always set because H2 requests have always an
explicit scheme. Then, the pseudo-header ":scheme" is tested. If it is set to
"http", the flag HTX_SL_F_SCHM_HTTP is set. Otherwise, for all other cases, the
flag HTX_SL_F_SCHM_HTTPS is set. For now, it seems reasonable to have a fallback
on the scheme "https".
2019-06-14 11:13:32 +02:00
Christopher Faulet
d20fdb0454 BUG/MEDIUM: proto_htx: Introduce the state ENDING during forwarding
This state is used in the legacy HTTP when everything was received from an
endpoint but a filter doesn't forward all the data. It is used to not report a
client or a server abort, depending on channels flags.

The same must be done on HTX streams. Otherwise, the message may be
truncated. For instance, it may happen with the filter trace with the random
forwarding enabled on the response channel.

This patch must be backported to 1.9.
2019-06-14 11:13:32 +02:00
Christopher Faulet
421e769783 BUG/MEDIUM: htx: Don't change position of the first block during HTX analysis
In the HTX structure, the field <first> is used to know where to (re)start the
analysis. It may differ from the message's head. It is especially important to
update it to handle 1xx messages, to be sure to restart the analysis on the next
message (another 1xx message or the final one). It is also updated when some
data are forwarded (the headers or part of the body). But this update is an
error and must never be done at the analysis level. It is a bug, because some
sample fetches may be used after the data forwarding (but before the first send
of course). At this stage, if the first block position does not point on the
start-line, most of HTTP sample fetches fail.

So now, when something is forwarding by HTX analyzers, the first block position
is not update anymore.

This issue was reported on Github. See #119. No backport needed.
2019-06-14 11:13:32 +02:00
Christopher Faulet
8c65486081 BUG/MINOR: htx: Detect when tail_addr meet end_addr to maximize free rooms
When a block's payload is moved during an expansion or when the whole block is
removed, the addresses of free spaces are updated accordingly. We must be
careful to reset them when <tail_addr> becomes equal to <end_addr>. In this
situation, we can maximize the free space between the blocks and their payload
and set the other one to 0. It is also important to be sure to never have
<end_addr> greater than <tail_addr>.
2019-06-14 11:13:32 +02:00
Christopher Faulet
e4ab11bb88 BUG/MINOR: http: Use the global value to limit the number of parsed headers
Instead of using the macro MAX_HTTP_HDR to limit the number of headers parsed
before throwing an error, we now use the custom global variable
global.tune.max_http_hdr.

This patch must be backported to 1.9.
2019-06-14 11:13:32 +02:00
Christopher Faulet
647fe1d9e1 BUG/MINOR: fl_trace/htx: Be sure to always forward trailers and EOM
Previous fix about the random forwarding on the message body was not enough to
fix the bug in all cases. Among others, when there is no data but only the EOM,
we must forward everything.

This patch must be backported to 1.9 if the patch 0bdeeaacb ("BUG/MINOR:
flt_trace/htx: Only apply the random forwarding on the message body.") is also
backported.
2019-06-14 11:13:32 +02:00
Olivier Houchard
985234d0cb BUG/MEDIUM: h1: Wait for the connection if the handshake didn't complete.
In h1_init(), also add the H1C_F_CS_WAIT_CONN flag if the handshake didn't
complete, otherwise we may end up letting the upper layer sending data too
soon.
2019-06-13 19:14:45 +02:00
Olivier Houchard
6063003c96 BUG/MEDIUM: h1: Don't wait for handshake if we had an error.
In h1_process(), only wait for the handshake if we had no error on the
connection. If the handshake failed, we have to let the upper layer know.
2019-06-13 19:14:45 +02:00
Ben51Degrees
f4a82fb26b BUILD/MINOR: 51d: Updated build registration output to indicate thatif the library is a dummy one or not.
When built with the dummy 51Degrees library for testing, the output will
include "(dummy library)" to ensure it is clear that this is this is not
the API.
2019-06-13 18:00:54 +02:00
William Lallemand
63329e36ab MINOR: doc: update the manpage and usage message about -S
Add -S in the manpage, and update the usage message.

Should be backported to 1.9.
2019-06-13 17:09:27 +02:00
Tim Duesterhus
dda1155ed7 BUILD: Silence gcc warning about unused return value
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

complains:

> src/debug.c: In function "ha_panic":
> src/debug.c:162:2: warning: ignoring return value of "write", declared with attribute warn_unused_result [-Wunused-result]
>  (void) write(2, trash.area, trash.data);
>    ^
2019-06-13 15:47:41 +02:00
William Lallemand
1dc6963086 MINOR: mworker: add the HAProxy version in "show proc"
Displays the HAProxy version so you can compare the version of old
processes and new ones.
2019-06-12 19:19:57 +02:00
William Lallemand
e8669fc9db MINOR: mworker: change formatting in uptime field of "show proc"
Change the formatting of the uptime field in "show proc" so it's easier
to parse it. Remove the space between the day and the hour and align the
field on 15 characters.
2019-06-12 19:19:57 +02:00
Ben51Degrees
31a51f25d6 BUG/MINOR: 51d/htx: The _51d_fetch method, and the methods it calls are now HTX aware.
The _51d_fetch method, and the two methods it calls to fetch HTTP
headers (_51d_set_device_offsets, and _51d_set_headers), now support
both legacy and HTX operation.

This should be backported to 1.9.
2019-06-12 18:06:59 +02:00
Willy Tarreau
3381022d88 MINOR: http: add a new "http-request replace-uri" action
This action is particularly convenient to replace some deprecated usees
of "reqrep". It takes a match and a format string including back-
references. The reqrep warning was updated to suggest it as well.
2019-06-12 18:06:59 +02:00
Olivier Houchard
690e0f07f5 BUG/MEDIUM: h1: Don't consider we're connected if the handshake isn't done.
In h1_process(), don't consider we're connected if we still have handshakes
pending. It used not to happen, because we would not be called if there
were any ongoing handshakes, but that changed now that the handshakes are
handled by a xprt, and not by conn_fd_handler() directly.
2019-06-11 16:41:36 +02:00
Olivier Houchard
92d093d641 BUG/MEDIUM: h1: Don't try to subscribe if we had a connection error.
If the CO_FL_ERROR flag is set, and we weren't connected yet, don't attempt
to subscribe, as the underlying xprt may already have been destroyed.
2019-06-11 16:41:24 +02:00
Willy Tarreau
b5ba2b0177 MINOR: http: turn default error files to HTTP/1.1
For quite a long time we've been saying that the default error files
should produce HTTP/1.1 responses and since it's of low importance, it
always gets forgotten.

So here it finally comes. Each status code now properly contains a
content-length header so that the output is clean and doesn't force
upstream proxies to switch to chunked encoding or to close the connection
immediately after the response, which is particularly annoying for 401
or 407 for example. It's worth noting that the 3xx codes had already
been turned to HTTP/1.1.

This patch will obviously not change anything for user-provided error files.
2019-06-11 16:37:13 +02:00
Willy Tarreau
5abdc760c9 BUG/MINOR: http-rules: mention "deny_status" for "deny" in the error message
The error message indicating an unknown keyword on an http-request rule
doesn't mention the "deny_status" option which comes with the "deny" rule,
this is particularly confusing.

This can be backported to all versions supporting this option.
2019-06-11 16:37:13 +02:00
Olivier Houchard
45c4437b4a Revert "BUG/MEDIUM: H1: When upgrading, make sure we don't free the buffer too early."
This reverts commit 6c7fe5c370.

This patch was harmless, but not needed, conn_upgrade_mux_fe() already takes
care of setting the buffer to BUF_NULL.
2019-06-11 14:07:53 +02:00
Christopher Faulet
86fcf6d6cd MINOR: htx: Add the function htx_move_blk_before()
The function htx_add_data_before() was removed because it was buggy. The
function htx_move_blk_before() may be used if necessary to do something
equivalent, except it just moves blocks. It doesn't handle the adding.
2019-06-11 14:05:25 +02:00
Christopher Faulet
d7884d3449 MAJOR: htx: Rework how free rooms are tracked in an HTX message
In an HTX message, it may have 2 available rooms to store a new block. The first
one is between the blocks and their payload. Blocks are added starting from the
end of the buffer and their payloads are added starting from the begining. So
the first free room is between these 2 edges. The second one is at the begining
of the buffer, when we start to wrap to add new payloads. Once we start to use
this one, the other one is ignored until the next defragmentation of the HTX
message.

In theory, there is no problem. But in practice, some lacks in the HTX structure
force us to defragment too often HTX messages to always be in a known state. The
second free room is not tracked as it should do and the first one may be easily
corrupted when rewrites happen.

So to fix the problem and avoid unecessary defragmentation, the HTX structure
has been refactored. The front (the block's position of the first payload before
the blocks) is no more stored. Instead we keep the relative addresses of 3 edges:

 * tail_addr : The start address of the free space in front of the the blocks
               table
 * head_addr : The start address of the free space at the beginning
 * end_addr  : The end address of the free space at the beginning

Here is the general view of the HTX message now:

           head_addr     end_addr    tail_addr
               |            |            |
               V            V            V
  +------------+------------+------------+------------+------------------+
  |            |            |            |            |                  |
  |  PAYLOAD   | Free space |  PAYLOAD   | Free space |    Blocks area   |
  |    ==>     |     1      |    ==>     |     2      |        <==       |
  +------------+------------+------------+------------+------------------+

<head_addr> is always lower or equal to <end_addr> and <tail_addr>. <end_addr>
is always lower or equal to <tail_addr>.

In addition;, to simplify everything, the blocks area are now contiguous. It
doesn't wrap anymore. So the head is always the block with the lowest position,
and the tail is always the one with the highest position.
2019-06-11 14:05:25 +02:00
Christopher Faulet
50fe9fba4b MINOR: flt_trace: Don't scrash the original offset during the random forwarding
There is no bug here, but this patch improves the debug message reported during
the random forwarding. The original offset is kept untouched so its value may be
used to format the message. Before, 0 was always reported.
2019-06-11 14:05:25 +02:00
Christopher Faulet
86bc8df955 BUG/MEDIUM: compression/htx: Fix the adding of the last data block
The function htx_add_data_before() is buggy and cannot work. It first add a data
block and then move it before another one, passed in argument. The problem
happens when a defragmentation is done to add the new block. In this case, the
reference is no longer valid, because the blocks are rearranged. So, instead of
moving the new block before the reference, it is moved at the head of the HTX
message.

So this function has been removed. It was only used by the compression filter to
add a last data block before a TLR, EOT or EOM block. Now, the new function
htx_add_last_data() is used. It adds a last data block, after all others and
before any TLR, EOT or EOM block. Then, the next bock is get. It is the first
non-data block after data in the HTX message. The compression loop continues
with it.

This patch must be backported to 1.9.
2019-06-11 14:05:25 +02:00
Christopher Faulet
bda8397fba BUG/MINOR: cache/htx: Fix the counting of data already sent by the cache applet
Since the commit 8f3c256f7 ("MEDIUM: cache/htx: Always store info about HTX
blocks in the cache"), it is possible to read info about a data block without
sending anything. It is possible because we rely on the function htx_add_data(),
which will try to add data without any defragmentation. In such case, info about
the data block are skipped but don't count in data sent.

No need to backport this patch, expect if the commit 8f3c256f7 is backported
too.
2019-06-11 14:05:25 +02:00
Willy Tarreau
34a150ccf5 MEDIUM: init/threads: don't use spinlocks during the init phase
PiBa-NL found some pathological cases where starting threads can hinder
each other and cause a measurable slow down. This problem is reproducible
with the following config (haproxy must be built with -DDEBUG_DEV) :

    global
	stats socket /tmp/sock1 mode 666 level admin
	nbthread 64

    backend stopme
	timeout server  1s
	option tcp-check
	tcp-check send "debug dev exit\n"
	server cli unix@/tmp/sock1 check

This will cause the process to be stopped once the checks are ready to
start. Binding all these to just a few cores magnifies the problem.
Starting them in loops shows a significant time difference among the
commits :

  # before startup serialization
  $ time for i in {1..20}; do taskset -c 0,1,2,3 ./haproxy-e186161 -db -f slow-init.cfg >/dev/null 2>&1; done

  real    0m1.581s
  user    0m0.621s
  sys     0m5.339s

  # after startup serialization
  $ time for i in {1..20}; do taskset -c 0,1,2,3 ./haproxy-e4d7c9dd -db -f slow-init.cfg >/dev/null 2>&1; done

  real    0m2.366s
  user    0m0.894s
  sys     0m8.238s

In order to address this, let's use plain mutexes and cond_wait during
the init phase. With this done, waiting threads now sleep and the problem
completely disappeared :

  $ time for i in {1..20}; do taskset -c 0,1,2,3 ./haproxy -db -f slow-init.cfg >/dev/null 2>&1; done

  real    0m0.161s
  user    0m0.079s
  sys     0m0.149s
2019-06-11 11:30:26 +02:00
Frédéric Lécaille
b5ecf0393c BUG/MINOR: dict: race condition fix when inserting dictionary entries.
When checking the result of an ebis_insert() call in an ebtree with unique keys,
if already present, in place of freeing() the old one and return the new one,
rather the correct way is to free the new one, and return the old one. For
this, the __dict_insert() function was folded into dict_insert() as this
significantly simplifies the test of duplicates.

Thanks to Olivier for having reported this bug which came with this one:
"MINOR: dict: Add dictionary new data structure".
2019-06-11 09:54:12 +02:00
Willy Tarreau
e4d7c9dd65 OPTIM/MINOR: init/threads: only call protocol_enable_all() on first thread
There's no point in calling this on each and every thread since the first
thread passing there will enable the listeners, and the next ones will
simply scan all of them in turn to discover that they are already
initialized. Let's only initilize them on the first thread. This could
slightly speed up start up on very large configurations, eventhough most
of the time is still spent in the main thread binding the sockets.

A few measurements have constantly shown that this decreases the startup
time by ~0.1s for 150k listeners. Starting all of them in parallel doesn't
provide better results and can still expose some undesired races.
2019-06-10 10:53:59 +02:00
Willy Tarreau
7109282577 BUG/MEDIUM: init/threads: prevent initialized threads from starting before others
Since commit 6ec902a ("MINOR: threads: serialize threads initialization")
we now serialize threads initialization. But doing so has emphasized another
race which is that some threads may actually start the loop before others
are done initializing.

As soon as all threads enter the first thread_release() call, their rdv
bit is cleared and they're all waiting for all others' rdv to be cleared
as well, with their harmless bit set. The first one to notice the cleared
mask will progress through thread_isolate(), take rdv again preventing
most others from noticing its short pass to zero, and this first one will
be able to run all the way through the initialization till the last call
to thread_release() which it happily crosses, being the only one with the
rdv bit, leaving the room for one or a few others to do the same. This
results in some threads entering the loop before others are done with
their initialization, which is particularly bad. PiBa-NL reported that
some regtests fail for him due to this (which was impossible to reproduce
here, but races are racy by definition). However placing some printf()
in the initialization code definitely shows this unsychronized startup.

This patch takes a different approach in three steps :
  - first, we don't start with thread_release() anymore and we don't
    set the rdv mask anymore in the main call. This was initially done
    to let all threads start toghether, which we don't want. Instead
    we just start with thread_isolate(). Since all threads are harmful
    by default, they all wait for each other's readiness before starting.

  - second, we don't release with thread_release() but with
    thread_sync_release(), meaning that we don't leave the function until
    other ones have reached the point in the function where they decide
    to leave it as well.

  - third, it makes sure we don't start the listeners using
    protocol_enable_all() before all threads have allocated their local
    FD tables or have initialized their pollers, otherwise startup could
    be racy as well. It's worth noting that it is even possible to limit
    this call to thread #0 as it only needs to be performed once.

This now guarantees that all thread init calls start only after all threads
are ready, and that no thread enters the polling loop before all others have
completed their initialization.

Please check GH issues #111 and #117 for more context.

No backport is needed, though if some new init races are reported in
1.9 (or even 1.8) which do not affect 2.0, then it may make sense to
carefully backport this small series.
2019-06-10 10:53:52 +02:00
Willy Tarreau
9a1f57351d MEDIUM: threads: add thread_sync_release() to synchronize steps
This function provides an alternate way to leave a critical section run
under thread_isolate(). Currently, a thread may remain in thread_release()
without having the time to notice that the rdv mask was released and taken
again by another thread entering thread_isolate() (often the same that just
released it). This is because threads wait in harmless mode in the loop,
which is compatible with the conditions to enter thread_isolate(). It's
not possible to make them wait with the harmless bit off or we cannot know
when the job is finished for the next thread to start in thread_isolate(),
and if we don't clear the rdv bit when going there, we create another
race on the start point of thread_isolate().

This new synchronous variant of thread_release() makes use of an extra
mask to indicate the threads that want to be synchronously released. In
this case, they will be marked harmless before releasing their sync bit,
and will wait for others to release their bit as well, guaranteeing that
thread_isolate() cannot be started by any of them before they all left
thread_sync_release(). This allows to construct synchronized blocks like
this :

     thread_isolate()
     /* optionally do something alone here */
     thread_sync_release()
     /* do something together here */
     thread_isolate()
     /* optionally do something alone here */
     thread_sync_release()

And so on. This is particularly useful during initialization where several
steps have to be respected and no thread must start a step before the
previous one is completed by other threads.

This one must not be placed after any call to thread_release() or it would
risk to block an earlier call to thread_isolate() which the current thread
managed to leave without waiting for others to complete, and end up here
with the thread's harmless bit cleared, blocking others. This might be
improved in the future.
2019-06-10 09:42:43 +02:00
Willy Tarreau
31cba0d3e0 MINOR: threads: avoid clearing harmless twice in thread_release()
thread_release() is to be called after thread_isolate(), i.e. when the
thread already has its harmless bit cleared. No need to clear it twice,
thus avoid calling thread_harmless_end() and directly check the rdv
bits then loop on them.
2019-06-09 08:47:35 +02:00
Olivier Houchard
19a2e2d91e BUG/MEDIUM: stream_interface: Make sure we call si_cs_process() if CS_FL_EOI.
In si_cs_recv(), if we got the CS_FL_EOI flag on the conn_stream, make sure
we return 1, so that si_cs_process() will be called, and wake
process_stream() up, otherwise if we're unlucky the flag will never be
noticed, and the stream won't be woken up.
2019-06-07 19:37:21 +02:00
Olivier Houchard
6c7fe5c370 BUG/MEDIUM: H1: When upgrading, make sure we don't free the buffer too early.
In h1_release(), when we want to upgrade the mux to h2, make sure we set
h1c->ibuf to BUF_NULL before calling conn_upgrade_mux_fe().
If the upgrade is successful, the buffer will be provided to the new mux,
h1_release() will be called recursively, it will so try to free h1c->ibuf,
and freeing the buffer we just provided to the new mux would be unfortunate.
2019-06-07 19:37:21 +02:00
Willy Tarreau
9faebe34cd MEDIUM: tools: improve time format error detection
As reported in GH issue #109 and in discourse issue
https://discourse.haproxy.org/t/haproxy-returns-408-or-504-error-when-timeout-client-value-is-every-25d
the time parser doesn't error on overflows nor underflows. This is a
recurring problem which additionally has the bad taste of taking a long
time before hitting the user.

This patch makes parse_time_err() return special error codes for overflows
and underflows, and adds the control in the call places to report suitable
errors depending on the requested unit. In practice, underflows are almost
never returned as the parsing function takes care of rounding values up,
so this might possibly happen on 64-bit overflows returning exactly zero
after rounding though. It is not really possible to cut the patch into
pieces as it changes the function's API, hence all callers.

Tests were run on about every relevant part (cookie maxlife/maxidle,
server inter, stats timeout, timeout*, cli's set timeout command,
tcp-request/response inspect-delay).
2019-06-07 19:32:02 +02:00
Frédéric Lécaille
b65717fa55 MINOR: peers: Optimization for dictionary cache lookup.
When we look up an dictionary entry in the cache used upon transmission
we store the last result in ->prev_lookup of struct dcache_tx so that
to compare it with the subsequent entries to look up and save performances.
2019-06-07 15:47:54 +02:00
Frédéric Lécaille
fd827937ed MINOR: peers: A bit of optimization when encoding cached server names.
When a server name is cached we only send its cache entry ID which has
an encoded length of 1 (because smaller than PEER_ENC_2BYTES_MIN).
So, in this case we only have to encode 1, the already known encoded length
of this ID before encoding it.

Furthermore we do not have to call strlen() to compute the lengths of server
name strings thanks to this commit: "MINOR: dict: Store the length of the
dictionary entries".
2019-06-07 15:47:54 +02:00
Frédéric Lécaille
99de1d0479 MINOR: dict: Store the length of the dictionary entries.
When allocating new dictionary entries we store the length of the strings.
May be useful so that not to have to call strlen() too much often at runing
time.
2019-06-07 15:47:54 +02:00
Frédéric Lécaille
6c39198b57 MINOR peers: data structure simplifications for server names dictionary cache.
We store pointers to server names dictionary entries in a pre-allocated array of
ebpt_node's (->entries member of struct dcache_tx) to cache those sent to remote
peers. Consequently the ID used to identify the server name dictionary entry is
also used as index for this array. There is no need to implement a lookup by key
for this dictionary cache.
2019-06-07 15:47:54 +02:00
Willy Tarreau
6ec902a659 MINOR: threads: serialize threads initialization
There is no point in initializing threads in parallel when we know that
it's the moment where some global variables are turned to thread-local
ones, and/or that some global variables are updated (like global_now or
trash_size). Some FDs might be created/destroyed/reallocated and could
be tricky to follow as well (think about epoll_fd for example).

Instead of having to be extremely careful about all these, and to trigger
false positives in thread sanitizers, let's simply initialize one thread
at a time. The init step is very fast so nobody should even notice, and
we won't have any more doubts about what might have happened when
analysing a dump.

See GH issues #111 and #117 for some background on this.
2019-06-07 15:37:47 +02:00
Willy Tarreau
e18616168f Revert "MINOR: chunks: Make sure trash_size is only set once."
This reverts commit 1c3b83242d.

It was made only to silence the thread sanitizer but ends up creating a
bug. Indeed, if "tune.bufsize" is in the global section, the trash_size
value is not updated anymore and the trash becomes smaller than a buffer!

Let's stop trying to fix the thread sanitizer reports, they are invalid,
and trying to fix them actually introduces bugs where there were none.

See GH issue #117 for more context. No backport is needed.
2019-06-07 15:37:47 +02:00
Olivier Houchard
1c3b83242d MINOR: chunks: Make sure trash_size is only set once.
The trash_size variable is shared by all threads, and is set by all threads,
when alloc_trash_buffers() is called. To make sure it's set only once,
to silence a harmless data race, use a CAS to set it, and only set it if it
was 0.
2019-06-07 14:45:44 +02:00
Willy Tarreau
1bfd6020ce MINOR: logs: use the new bitmap functions instead of fd_sets for encoding maps
The fd_sets we've been using in the log encoding functions are not portable
and were shown to break at least under Cygwin. This patch gets rid of them
in favor of the new bitmap functions. It was verified with the config below
that the log output was exactly the same before and after the change :

    defaults
        mode http
        option httplog
        log stdout local0
        timeout client 1s
        timeout server 1s
        timeout connect 1s

    frontend foo
        bind :8001
        capture request header chars len 255

    backend bar
        option httpchk "GET" "/" "HTTP/1.0\r\nchars: \x01\x02\x03\x04\x05\x06\x07\x08\x09\x0b\x0c\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff"
        server foo 127.0.0.1:8001 check
2019-06-07 11:13:24 +02:00
Willy Tarreau
7348119fb2 BUG/MEDIUM: mux-h2: make sure the connection timeout is always set
There seems to be a tricky case in the H2 mux related to stream flow
control versus buffer a full situation : is a large response cannot
be entirely sent to the client due to the stream window being too
small, the stream is paused with the SFCTL flag. Then the upper
layer stream might get bored and expire this stream. It will then
shut it down first. But the shutdown operation might fail if the
mux buffer is full, resulting in the h2s being subscribed to the
deferred_shut event with the stream *not* added to the send_list
since it's blocked in SFCTL. In the mean time the upper layer completely
closes, calling h2_detach(). There we have a send_wait (the pending
shutw), the stream is marked with SFCTL so we orphan it.

Then if the client finally reads all the data that were clogging the
buffer, the send_list is run again, but our stream is not there. From
this point, the connection's stream list is not empty, the mux buffer
is empty, so the connection's timeout is not set. If the client
disappears without updating the stream's window, nothing will expire
the connection.

This patch makes sure we always keep the connection timeout updated.
There might be finer solutions, such as checking that there are still
living streams in the connection (i.e. streams not blocked in SFCTL
state), though this is not necessarily trivial nor useful, since the
client timeout is the same for the upper level stream and the connection
anyway.

This patch needs to be backported to 1.9 and 1.8 after some observation.
2019-06-07 08:47:44 +02:00
Olivier Houchard
7b3a79f6c4 BUG/MEDIUM: tcp: Make sure we keep the polling consistent in tcp_probe_connect.
In tcp_probe_connect(), if the connection is still pending, do not disable
want_recv, we don't have any business to do so, but explicitely use
__conn_xprt_want_send(), otherwise the next time we'll reach tcp_probe_connect,
fd_send_ready() would return 0 and we would never flag the connection as
CO_FL_CONNECTED, which can lead to various problems, such as check not
completing because they consider it is not connected yet.
2019-06-06 18:17:32 +02:00
Willy Tarreau
43091ed161 BUG/MINOR: time: make sure only one thread sets global_now at boot
All threads call tv_update_date(-1) at boot to set their own local time
offset. While doing so they also overwrite global_now, which is not that
much of a problem except that it's not done using an atomic write and
that it will be overwritten by every there in parallel. We only need the
first thread to set it anyway, so let's simply set it if not set and do
it using a CAS. This should fix GH issue #111.

This may be backported to 1.9.
2019-06-06 16:50:39 +02:00
Willy Tarreau
237f8aef41 BUILD: peers: fix a build warning about an incorrect intiialization
Just got this one :
src/peers.c:528:13: warning: missing braces around initializer [-Wmissing-braces]
src/peers.c:528:13: warning: (near initialization for 'cde.key') [-Wmissing-braces]

Indeed, this struct contains two structs so scalar zero is not a valid
value for the first field. Let's just leave it as an empty struct since
it was the purpose.
2019-06-06 16:42:14 +02:00
Willy Tarreau
1ec9bb5b62 MEDIUM: stream: don't abusively loop back on changes on CF_SHUT*_NOW
These flags are not used by analysers, only by the shut* functions, and
they were covered by CF_MASK_STATIC only because in the past the shut
functions were in the middle of the analysers. But here they are causing
excess loop backs which provide no value and increase processing cost.
Ideally the CF_MASK_STATIC bitfield should be revisited, but doing this
alone is enough to reduce by 30% the number of calls to si_sync_send().
2019-06-06 16:36:19 +02:00
Willy Tarreau
3c5c066d66 MEDIUM: stream: only loop on flags relevant to the analysers
In process_stream() we detect a number of conditions to decide to loop
back to the analysers. Some of them are excessive in that they perform
a strict comparison instead of filtering on the flags relevant to the
analysers as is done at other places, resulting in excess wakeups. One
of the effect is that after a successful WRITE_PARTIAL, a second send is
not possible, resulting in the loss of WRITE_PARTIAL, causing another
wakeup! Let's apply the same mask and verify the flags correctly.
2019-06-06 16:36:19 +02:00
Willy Tarreau
829bd4710f MEDIUM: stream: rearrange the events to remove the loop
The "goto redo" at the end of process_stream() to make the states converge
is still a big source of problems and mostly stems from the very late call
to the send() functions, whose results need to be considered, while it's
being done in si_update_both() when leaving.

This patch extracts the si_sync_send() calls from si_update_both(), and
places them at the relevant places in process_stream(), which are just
after the amount of data to forward is updated and before the shutw()
calls (which were also moved). The stream-interface resynchronization
needs to go slightly upper to take into account the transition from CON
to RDY that will happen consecutive to some successful send(), and that's
all.

By doing so we can now get rid of this loop and have si_update_both()
called only to update the stream interface and channel when leaving the
function, as it was initially designed to work.

It is worth noting that a number of the remaining conditions to perform
a goto resync_XXX still seem suboptimal and would benefit from being
refined to perform les resynchronization. But what matters at this stage
is that the code remains valid and efficient.
2019-06-06 16:36:19 +02:00
Willy Tarreau
3b285d7fbd MINOR: stream-int: make si_sync_send() from the send code of si_update_both()
Just like we have a synchronous recv() function for the stream interface,
let's have a synchronous send function that we'll be able to call from
different places. For now this only moves the code, nothing more.
2019-06-06 16:36:19 +02:00
Willy Tarreau
236c4298b3 MINOR: stream-int: split si_update() into si_update_rx() and si_update_tx()
We should not update the two directions at once, in fact we should update
the Rx path after recv() and the Tx path after send(). Let's start by
splitting the update function in two for this.
2019-06-06 16:36:19 +02:00
Willy Tarreau
d66ed88a78 MEDIUM: stream: re-arrange the connection setup status reporting
Till now when a wakeup happens after a connection is attempted, we go
through sess_update_st_con_tcp() to deal with the various possible events,
then to sess_update_st_cer() to deal with a possible error detected by the
former, or to sess_establish() to complete the connection validation. There
are multiple issues in the way this is handled, which have accumulated over
time. One of them is that any spurious wakeup during SI_ST_CON would validate
the READ_ATTACHED flag and wake the analysers up. Another one is that nobody
feels responsible for clearing SI_FL_EXP if it happened at the same time as
a success (and it is present in all reports of loops to date). And another
issue is that aborts cannot happen after a clean connection setup with no
data transfer (since CF_WRITE_NULL is part of CF_WRITE_ACTIVITY). Last, the
flags cleanup work was hackish, added here and there to please the next
function (typically what had to be donne in commit 7a3367cca to work around
the url_param+reuse issue by moving READ_ATTACHED to CON).

This patch performs a significant lift up of this setup code. First, it
makes sure that the state handlers are the ones responsible for the cleanup
of the stuff they rely on. Typically sess_sestablish() will clean up the
SI_FL_EXP flag because if we decided to validate the connection it means
that we want to ignore this late timeout. Second, it splits the CON and
RDY state handlers because the former only has to deal with failures,
timeouts and non-events, while the latter has to deal with partial or
total successes. Third, everything related to connection success was
moved to sess_establish() since it's the only safe place to do so, and
this function is also called at a few places to deal with synchronous
connections, which are not seen by intermediary state handlers.

The code was made a bit more robust, for example by making sure we
always set SI_FL_NOLINGER when aborting a connection so that we don't
have any risk to leave a connection in SHUTW state in case it was
validated late. The useless return codes of some of these functions
were dropped so that callers only rely on the stream-int's state now
(which was already partially the case anyway).

The code is now a bit cleaner, could be further improved (and functions
renamed) but given the sensitivity of this part, better limit changes to
strictly necessary. It passes all reg tests.
2019-06-06 16:36:19 +02:00
Willy Tarreau
b27f54a88c MAJOR: stream-int: switch from SI_ST_CON to SI_ST_RDY on I/O
Now whenever an I/O event succeeds during a connection attempt, we
switch the stream-int's state to SI_ST_RDY. This allows si_update()
to update R/W timeouts on the channel and end points to start to
consume outgoing data and to subscribe to lower layers in case of
failure. It also allows chk_rcv() to be performed on the other side
to enable data forwarding and make sure we don't fall into a situation
where no more events happen and nothing moves anymore.
2019-06-06 16:36:19 +02:00
Willy Tarreau
4f283fa604 MEDIUM: stream-int: introduce a new state SI_ST_RDY
The main reason for all the trouble we're facing with stream interface
error or timeout reports during the connection phase is that we currently
can't make the difference between a connection attempt and a validated
connection attempt. It is problematic because we tend to switch early
to SI_ST_EST but can't always do what we want in this state since it's
supposed to be set when we don't need to visit sess_establish() again.

This patch introduces a new state betwen SI_ST_CON and SI_ST_EST, which
is SI_ST_RDY. It indicates that we've verified that the connection is
ready. It's a transient state, like SI_ST_DIS, that cannot persist when
leaving process_stream(). For now it is not set, only verified in various
tests where SI_ST_CON was used or SI_ST_EST depending on the cases.

The stream-int state diagram was minimally updated to reflect the new
state, though it is largely obsolete and would need to be seriously
updated.
2019-06-06 16:36:19 +02:00
Willy Tarreau
7ab22adbf7 MEDIUM: stream-int: remove dangerous interval checks for stream-int states
The stream interface state checks involving ranges were replaced with
checks on a set of states, already revealing some issues. No issue was
fixed, all was replaced in a one-to-one mapping for easier control. Some
checks involving a strict difference were also replaced with fields to
be clearer. At this stage, the result must be strictly equivalent. A few
tests were also turned to their bit-field equivalent for better readability
or in preparation for upcoming changes.

The test performed in the SPOE filter was swapped so that the closed and
error states are evicted first and that the established vs conn state is
tested second.
2019-06-06 16:36:19 +02:00
Willy Tarreau
19ecf71b60 BUG/MINOR: stream: don't emit a send-name-header in conn error or disconnect states
The test for the send-name-header field used to cover all states between
SI_ST_CON and SI_ST_CLO, which include SI_ST_CER and SI_ST_DIS. Trying to
send a header in these states makes no sense at all, so let's fix this.
This should have no visible impact so no backport is needed.
2019-06-06 16:36:19 +02:00
Willy Tarreau
975b155ebb MINOR: server: really increase the pool-purge-delay default to 5 seconds
Commit fb55365f9 ("MINOR: server: increase the default pool-purge-delay
to 5 seconds") did this but the setting placed in new_server() was
overwritten by srv_settings_cpy() from the default-server values preset
in init_default_instance(). Now let's put it at the right place.
2019-06-06 16:25:55 +02:00
Frédéric Lécaille
56aec0ddc6 BUG/MINOR: peers: Wrong server name parsing.
This commit was not complete:
   BUG/MINOR: peers: Wrong "server_name" decoding.
We forgot forgotten to move forward <msg_cur> pointer variable after
having parse the server name string.

Again this bug may happen only if we add stick-table new data type after
the server name which is the current last one. Furthermore this bug is
visible only the first time a peer sends a server name for a stick-table
entry.

Nothing to backport.
2019-06-06 16:06:00 +02:00
Olivier Houchard
81284e6908 BUG/MEDIUM: ssl: Don't forget to initialize ctx->send_recv and ctx->recv_wait.
When creating a new ssl_sock_ctx, don't forget to initialize its send_recv
and recv_wait to NULL, or we may end up dereferencing random values, and
crash.
2019-06-06 13:21:23 +02:00
Olivier Houchard
03abf2d31e MEDIUM: connections: Remove CONN_FL_SOCK*
Now that the various handshakes come with their own XPRT, there's no
need for the CONN_FL_SOCK* flags, and the conn_sock_want|stop functions,
so garbage-collect them.
2019-06-05 18:03:38 +02:00
Olivier Houchard
fe50bfb82c MEDIUM: connections: Introduce a handshake pseudo-XPRT.
Add a new XPRT that is used when using non-SSL handshakes, such as proxy
protocol or Netscaler, instead of taking care of it in conn_fd_handler().
This XPRT is installed when any of those is used, and it removes itself once
the handshake is done.
This should allow us to remove the distinction between CO_FL_SOCK* and
CO_FL_XPRT*.
2019-06-05 18:03:38 +02:00
Olivier Houchard
2e055483ff MINOR: connections: Add a new xprt method, add_xprt().
Add a new method to xprt_ops, add_xprt(), that changes the underlying
xprt to the one provided, and optionally provide the old one.
2019-06-05 18:03:38 +02:00
Olivier Houchard
5149b59851 MINOR: connections: Add a new xprt method, remove_xprt.
Add a new method to xprt_ops, remove_xprt. When called, if the provided
xprt_ctx is the same as the xprt's underlying xprt_ctx, it then uses the
new xprt provided, otherwise it calls the remove_xprt method of the next
xprt.
The goal is to be able to add a temporary xprt, that removes itself from
the chain when it did what it had to do. This will be used to implement
a pseudo-xprt for anything that just requires a handshake (such as the
proxy protocol).
2019-06-05 18:03:38 +02:00
Olivier Houchard
000694cf96 MINOR: ssl: Make ssl_sock_handshake() static.
ssl_sock_handshake is now only used by the ssl code itself, there's no need
to export it anymore, so make it static.
2019-06-05 18:03:38 +02:00
Olivier Houchard
ea8dd949e4 MEDIUM: ssl: Handle subscribe by itself.
As the SSL code may have different needs than the upper layer, ie it may want
to receive when the upper layer wants to right, instead of directly forwarding
the subscribe to the underlying xprt, handle it ourself. The SSL code will
know remember any subscribe call, and wake the tasklet when it is ready
for more I/O.
2019-06-05 18:03:38 +02:00
Olivier Houchard
c3df4507fa MEDIUM: connections: Wake the upper layer even if sending/receiving is disabled.
In conn_fd_handler(), if the fd is ready to send/recv, wake the upper layer
even if we have CO_FL_ERROR, or if CO_FL_XPRT_RD_ENA/CO_FL_XPRT_WR_ENA isn't
set. The only reason we should reach that point is if we had a shutw/shutr,
and the upper layer may want to know about it, and is supposed to handle it
anyway.
2019-06-05 18:03:38 +02:00
Olivier Houchard
49065544d0 MEDIUM: checks: Make sure we unsubscribe before calling cs_destroy().
When we want to destroy the conn_stream for some reason, usually on error,
make sure we unsubscribed before doing so. If we subsscribed, the xprt may
ultimately wake our tasklet on close, aand the check tasklet doesn't expect
it ot happen when we have no longer any conn_stream.
2019-06-05 18:03:38 +02:00
Olivier Houchard
14fcc2ebcc BUG/MEDIUM: servers: Don't attempt to destroy idle connections if disabled.
In connect_server(), when deciding if we should attempt to remove idle
connections, because we have to many file descriptors opened, don't attempt
to do so if idle connection pool is disabled (with pool-max-conn 0), as
if it is, srv->idle_orphan_conns won't even be allocated, and trying to
dereference it will cause a crash.
2019-06-05 13:58:06 +02:00
Frédéric Lécaille
344e94816c BUG/MINOR: peers: Wrong "server_name" decoding.
This patch fixes a bug which does not occur at this time because the "server_name"
stick-table data type is the last one (see STKTABLE_DT_SERVER_NAME). It was introduced
by this commit: "MINOR: peers: Make peers protocol support new "server_name" data type".

Indeed when receiving STD_T_DICT stick-table data type we first decode the length
of these data, then we decode the ID of this dictionary entry. To know if there
is remaining data to parse, we check if we have reached the end of the current data,
relying on <msg_end> variable. But <msg_end> is at the end of the entire message!

So this patch computes the correct end of the current STD_T_DICT before doing
anything else with it.

Nothing to backport.
2019-06-05 13:36:34 +02:00
Christopher Faulet
0bdeeaacbb BUG/MINOR: flt_trace/htx: Only apply the random forwarding on the message body.
In the function trace_http_payload(), when the random forwarding is enabled,
only blocks of type HTX_BLK_DATA must be considered. Because other blocks must
be forwarding in one time.

This patch must be backported to 1.9. But it will have to be adapted. Because
several changes on the HTX in the 2.0 are missing in the 1.9.
2019-06-05 10:12:11 +02:00
Christopher Faulet
c31872fc04 BUG/MINOR: mux-h1: Don't send more data than expected
In h1_snd_buf(), we try to consume as much data as possible in a loop. In this
loop, we first format the raw HTTP message from the HTX message, then we try to
send it. But we must be carefull to never send more data than specified by the
stream-interface.

This patch must be backported to 1.9.
2019-06-05 10:12:11 +02:00
Christopher Faulet
54b5e214b0 MINOR: htx: Don't use end-of-data blocks anymore
This type of blocks is useless because transition between data and trailers is
obvious. And when there is no trailers, the end-of-message is still there to
know when data end for chunked messages.
2019-06-05 10:12:11 +02:00
Christopher Faulet
2d7c5395ed MEDIUM: htx: Add the parsing of trailers of chunked messages
HTTP trailers are now parsed in the same way headers are. It means trailers are
converted to K/V blocks followed by an end-of-trailer marker. For now, to make
things simple, the type for trailer blocks are not the same than for header
blocks. But the aim is to make no difference between headers and trailers by
using the same type. Probably for the end-of marker too.
2019-06-05 10:12:11 +02:00
Christopher Faulet
8f3c256f7e MEDIUM: cache/htx: Always store info about HTX blocks in the cache
It was only done for the headers (including the EOH marker). data were prefixed
by the info field of these blocks. The payload and the trailers of the messages
were stored in raw. The total size of headers and payload were kept in the
cached object state to help output formatting.

Now, info about each HTX block is store in the cache. Only data are allowed to
be splitted. Otherwise, all blocks of an HTX message are handled the same way,
both when storing a message in the cache and when delivering it from the
cache. This will help the cache implementation to be more robust to internal
changes in the HTX. Especially for the upcoming parsing of trailers. There is
also no more need to keep extra info in the cached object state.
2019-06-05 10:12:11 +02:00
Christopher Faulet
4c7ce017fc MINOR: mux-h1: Don't count the EOM in the estimated size of headers
If there is not enough space in the HTX message, the EOM can be delayed when a
bodyless message is added. So, don't count it in the estimated size of headers.
2019-06-05 10:12:11 +02:00
Christopher Faulet
82f0160318 MINOR: mux-h1: Add h1_eval_htx_hdrs_size() to estimate size of the HTX headers
It is just a cosmetic change, to avoid code duplication.
2019-06-05 10:12:11 +02:00
Christopher Faulet
ada34b6a86 MINOR: mux-h1: Add the flag HAVE_O_CONN on h1s
This flag is set on h1s when output messages are formatted to know the
connection mode was already processed. It replace the variable process_conn_mode
in the function h1_process_output().
2019-06-05 10:12:11 +02:00
Christopher Faulet
94b2c76399 MEDIUM: mux-h1: refactor output processing
When we format the H1 output, in the loop on the HTX message, instead of
switching on the block types, we now switch on the message state. It is almost
the same, but it will ease futur changes, on trailers and end-of markers.
2019-06-05 10:12:11 +02:00
Christopher Faulet
a2ea158cf2 BUG/MINOR: mux-h1: errflag must be set on H1S and not H1M during output processing
This bug is in an unexpected clause of the switch..case, inside
h1_process_output(). The wrong structure is used to set the error flag.

This patch must be backported to 1.9.
2019-06-05 10:12:11 +02:00
Patrick Hemmer
65674662b4 MINOR: SSL: add client/server random sample fetches
This adds 4 sample fetches:
- ssl_fc_client_random
- ssl_fc_server_random
- ssl_bc_client_random
- ssl_bc_server_random

These fetches retrieve the client or server random value sent during the
handshake.

Their use is to be able to decrypt traffic sent using ephemeral ciphers. Tools
like wireshark expect a TLS log file with lines in a few known formats
(https://code.wireshark.org/review/gitweb?p=wireshark.git;a=blob;f=epan/dissectors/packet-tls-utils.c;h=28a51fb1fb029eae5cea52d37ff5b67d9b11950f;hb=HEAD#l5209).
Previously the only format supported using data retrievable from HAProxy state
was the one utilizing the Session-ID. However an SSL/TLS session ID is
optional, and thus cannot be relied upon for this purpose.

This change introduces the ability to extract the client random instead which
can be used for one of the other formats. The change also adds the ability to
extract the server random, just in case it might have some other use, as the
code change to support this was trivial.
2019-06-05 10:07:44 +02:00
Emmanuel Hocdet
839af57c85 CLEANUP: ssl: remove unneeded defined(OPENSSL_IS_BORINGSSL)
BoringSSL pretend to be compatible with OpenSSL 1.1.0 and
OPENSSL_VERSION_NUMBER is set accordly: cleanup redundante #ifdef.
2019-06-05 10:01:44 +02:00
Frédéric Lécaille
36fb77e295 MINOR: peers: Replace hard-coded values for peer protocol messaging by macros.
Simple patch to replace hard-coded values in relation with bytes identifiers used
for stick-table messages by macros.
2019-06-05 08:42:36 +02:00
Frédéric Lécaille
32b5573b13 MINOR: peers: Replace hard-coded for peer protocol 64-bits value encoding by macros.
With this patch we define macros for the minimum values which are
encoded for 2 up to 10 bytes. This latter is big enough to encode
UINT64_MAX. We replaced at several places 240 value by PEER_ENC_2BYTES_MIN
which is the minimum value which is encoded with 2 bytes. The peer protocol
encoding consisting in encoding with only one byte a value which is
less than PEER_ENC_2BYTES_MIN and with at least 2 bytes a 64-bits value greater
than PEER_ENC_2BYTES_MIN.
2019-06-05 08:42:36 +02:00
Frédéric Lécaille
62b0b0bc02 MINOR: peers: Add dictionary cache information to "show peers" CLI command.
This patch adds dictionary entries cached and used for the server by name
stickiness feature (exchanged thanks to peers protocol).
2019-06-05 08:42:36 +02:00
Frédéric Lécaille
16b4f54533 MINOR: stick-table: Make the CLI stick-table handler support dictionary entry data type.
Simple patch to dump the values (strings) of dictionary entries stored in stick-table
entries with STD_T_DICT as internal data type.
2019-06-05 08:42:36 +02:00
Frédéric Lécaille
8d78fa7def MINOR: peers: Make peers protocol support new "server_name" data type.
Make usage of the APIs implemented for dictionaries (dict.c) and their LRU caches (struct dcache)
so that to send/receive server names used for the server by name stickiness. These
names are sent over the network as follows:

 - in every case we send the encode length of the data (STD_T_DICT), then
 - if the server names is not present in the cache used upon transmission (struct dcache_tx)
   we cache it and we the ID of this TX cache entry followed the encode length of the
   server name, and finally the sever name itseft (non NULL terminated string).
 - if the server name is present, we repead these operations but we only send the TX cache
   entry ID.

Upon receipt, the couple of (cache IDs, server name) are stored the LRU cache used
only upon receipt (struct dcache_rx). As the peers protocol is symetrical, the fact
that the server name is present in the received data (resp. or not) denotes if
the entry is absent (resp. or not).
2019-06-05 08:42:33 +02:00
Frédéric Lécaille
03cdf55e69 MINOR: stream: Stickiness server lookup by name.
With this patch we modify the stickiness server targets lookup behavior.
First we look for this server targets by their names before looking for them by their
IDs if not found. We also insert a dictionary entry for the name of the server targets
and store the address of this entry in the underlying stick-table.
2019-06-05 08:33:35 +02:00
Frédéric Lécaille
7da71293e4 MINOR: server: Add a dictionary for server names.
This patch only declares and defines a dictionary for the server
names (stored as ->id member field).
2019-06-05 08:33:35 +02:00
Frédéric Lécaille
84d6046a33 MINOR: proxy: Add a "server by name" tree to proxy.
Add a tree to proxy struct to lookup by name for servers attached
to this proxy and populated it at parsing time.
2019-06-05 08:33:35 +02:00
Frédéric Lécaille
db52d9087a MINOR: cfgparse: Space allocation for "server_name" stick-table data type.
When parsing sticking rules, with this patch we reserve some room for the new
"server_name" stick-table data type, as this is already done for "server_id",
setting the offset and used space (in bytes) in the stick-table entry thanks
to stkable_alloc_data_type().
2019-06-05 08:33:35 +02:00
Frédéric Lécaille
5ad57ea85f MINOR: stick-table: Add "server_name" new data type.
This simple patch only adds definitions to create a new stick-table
data type ID and a new standard type to store information in relation
wich dictionary entries (STD_T_DICT).
2019-06-05 08:33:35 +02:00
Frédéric Lécaille
74167b25f7 MINOR: peers: Add a LRU cache implementation for dictionaries.
We want to send some stick-table data fields stored as strings in dictionaries
without consuming too much memory and CPU. To do so we implement with this patch
a cache for send/received dictionaries entries. These dictionary of strings entries are
stored in others real dictionary entries with an identifier as key (unsigned int)
and a pointer to the dictionary of strings entries as values.
2019-06-05 08:33:35 +02:00
Frédéric Lécaille
4a3fef834c MINOR: dict: Add dictionary new data structure.
This patch adds minimalistic definitions to implement dictionary new data structure
which is an ebtree of ebpt_node structs with strings as keys. Note that this has nothing
to see with real dictionary data structure (maps of keys in association with values).
2019-06-05 08:33:35 +02:00
Frédéric Lécaille
0e8db97df4 BUG/MINOR: peers: Wrong stick-table update message building.
When creating this patch "CLEANUP: peers: Replace hard-coded values by macros",
we realized there was a remaining place in peer_prepare_updatemsg() where the maximum
of an encoded length harcoded value could be replaced by PEER_MSG_ENCODED_LENGTH_MAXLEN
macro. But in this case, the 1 harcoded value for the header length is wrong. Should
be 2 or PEER_MSG_HEADER_LEN. So, there is a missing byte to encode the length of
remaining data after the header.

Note that the bug was never encountered because even with a missing byte, we could
encode a maximum length which would be (1<<25) (32MB) according to the following
extract of the peers protocol documentation which were from far a never reached limit
I guess:

  I) Encoded Integer and Bitfield.

         0  <= X < 240        : 1 byte  (7.875 bits)  [ XXXX XXXX ]
        240 <= X < 2288       : 2 bytes (11 bits)     [ 1111 XXXX ] [ 0XXX XXXX ]
       2288 <= X < 264432     : 3 bytes (18 bits)     [ 1111 XXXX ] [ 1XXX XXXX ]   [ 0XXX XXXX ]
     264432 <= X < 33818864   : 4 bytes (25 bits)     [ 1111 XXXX ] [ 1XXX XXXX ]*2 [ 0XXX XXXX ]
   33818864 <= X < 4328786160 : 5 bytes (32 bits)     [ 1111 XXXX ] [ 1XXX XXXX ]*3 [ 0XXX XXXX ]
2019-06-05 08:33:34 +02:00
Frédéric Lécaille
39143340ec CLEANUP: peers: Replace hard-coded values by macros.
All the peer stick-table messages are made of a 2-byte header (PEER_MSG_HEADER_LEN)
followed by the encoded length of the remaining data wich is harcoded as 5 (in bytes)
for the maximum (PEER_MSG_ENCODED_LENGTH_MAXLEN). With such a length we can encode
a maximum length which equals to (1 << 32) - 1, which is from far enough.

This patches replaces both these values by macros where applicable.
2019-06-05 08:33:34 +02:00
Willy Tarreau
5598d171b3 BUILD: task: fix a build warning when threads are disabled
The __decl_hathreads() macro will leave a lone semi-colon making the end
of variables declarations, resulting in a warning if threads are disabled.
Let's simply swap it with the last variable. Thanks to Ilya Shipitsin for
reporting this issue.

No backport is needed.
2019-06-04 17:18:40 +02:00
Willy Tarreau
4b7531f48b BUG/MEDIUM: vars: make the tcp/http unset-var() action support conditions
Patrick Hemmer reported that http-request unset-var(foo) if ... fails to
parse. The reason is that it reuses the same parser as "set-var(foo)" which
makes a special case of the arguments, supposed to be a sample expression
for set-var, but which must not exist for unset-var. Unfortunately the
parser finds "if" or "unless" and believes it's an expression. Let's simply
drop the test so that the outer rule parser deals with potential extraneous
keywords.

This should be backported to all versions supporting unset-var().
2019-06-04 16:48:15 +02:00
Willy Tarreau
f37b140b06 BUG/MEDIUM: vars: make sure the scope is always valid when accessing vars
Patrick Hemmer reported that a simple tcp rule involving a variable like
this is enough to crash haproxy :

    frontend foo
        bind :8001
        tcp-request session set-var(txn.foo) src

The tests on the variables scopes is not strict enough, it needs to always
verify if the stream is valid when accessing a req/res/txn variable. This
patch does this by adding a new get_vars() function which does the job
instead of open-coding all the lookups everywhere.

It must be backported to all versions supporting set-var and
"tcp-request session" so at least 1.9 and 1.8.
2019-06-04 16:27:36 +02:00
Willy Tarreau
42a6621d30 BUILD: tools: do not use the weak attribute for trace() on obsolete linkers
The default dummy trace() function is marked weak in order to be easily
replaced at link time. Some linkers are having issues with the weak
attribute, so let's not mark it on these linkers. They will simply not
be able to build with TRACE=1, which is no big deal since it's only used
by developers.
2019-06-04 16:02:26 +02:00
Willy Tarreau
fb55365f9e MINOR: server: increase the default pool-purge-delay to 5 seconds
The default used to be a very aggressive delay of 1 second before starting
to purge idle connections, but tests show that with bursty traffic it's a
bit short. Let's increase this to 5 seconds.
2019-06-04 14:06:31 +02:00
Willy Tarreau
a689c3d8d4 MEDIUM: stream: make a full process_stream() loop when completing I/O on exit
During 1.9 development cycle a shortcut was made in process_stream() to
update the analysers immediately after an I/O even detected on the send()
path while leaving the function. In order to prevent this from being abused
by a single stream stealing all the CPU, the loop didn't cover the initial
recv() call, so that events ultimately converge.

This has caused a number of issues over time because the conditions to
decide to loop are a bit tricky. For example the CF_READ_PARTIAL flag is
not immediately removed from rqf_last and may appear for a long time at
this point, sometimes causing some loops to last long.

Another unexpected side effect is that all analysers are called again with
no data to process, just because CF_WRITE_PARTIAL is present. We cannot get
rid of this event even if of very rare use, because some analysers might
wait for some data to leave a buffer before proceeding. With a full loop,
this event would have been merged with a subsequent recv() allowing analysers
to do something more useful than just ack an event they don't care about.

While during early 1.9-dev it was very important to be kind with the
scheduler, nowadays it's lock-free for local tasks so this optimization
is much less interesting to use it for I/Os, especially if we factor in
the trouble it causes.

This patch thus removes the use of the loop for regular I/Os and instead
performs a task_wakeup() with an I/O event so that the task will be
scheduled after all other ones and will have a chance to perform another
recv() and possibly to gather more I/O events to be processed at once.
Synchronous errors and transitions to SI_ST_DIS however are still handled
by the loop.

Doing so significantly reduces the average number of calls to analysers
(those are typically halved when compression is enabled in legacy mode),
and as a side benefit, has increased the H1 performance by about 1%.
2019-06-03 17:55:23 +02:00
Willy Tarreau
7bb39d7cd6 CLEANUP: connection: remove the now unused CS_FL_REOS flag
Let's remove it before it gets uesd again. It was mostly replaced with
CS_FL_EOI and by mux-specific states or flags.
2019-06-03 14:23:33 +02:00
Willy Tarreau
c493c9cb08 MEDIUM: mux-h1: don't use CS_FL_REOS anymore
This flag was already removed from other muxes and from the upper layers,
because it was misused. It indicates to the mux that the end of a stream
was already seen and is pending after existing data, but this should not
be on the conn_stream but internal to the mux.

This patch creates a new H1S flag H1S_F_REOS to replace it and uses it to
replace the last uses of CS_FL_REOS.
2019-06-03 14:18:22 +02:00
Willy Tarreau
fbdf90a6f9 BUG/MEDIUM: mux-h1: only check input data for the current stream, not next one
The mux-h1 doesn't properly propagate end of streams to the application
layer when requests are pipelined. This is visible by launching h2load
in h1 mode with -m greater than 1 : issuing Ctrl-C has no effect until
the client timeout expires.

The reason is that among the checks conditionning the reporting of the
end of stream status and waking up the streams, is a test on the presence
of remaining input data in the demux. But with pipelining, these data may
be present for another stream and should not prevent the end of stream
condition from being reported.

This patch addresses this issue by introducing a new function
"h1s_data_pending" which returns a boolean indicating if there are in
the demux buffer any data for the current stream. That is, if the stream
is in H1_MSG_DONE state, there are never any data for it. And if it's in
a different state, then the demux buffer is checked. This replaces the
tests on b_data(&h1c->ibuf) and correctly allows end of streams to be
reported at the end of requests.

It's worth noting that 1.9 doesn't suffer from this issue but it possibly
isn't completely immune either given that the same tests are present.
2019-06-03 14:13:23 +02:00
Willy Tarreau
d58f27fead MINOR: mux-h1: don't try to recv() before the connection is ready
Just as we already do in h1_send(), if the connection is not yet ready,
do not proceed and instead subscribe. This avoids a needless recvfrom()
and subscription to polling for a case which will never work since the
request was not even sent.
2019-06-03 10:17:12 +02:00
Willy Tarreau
694fcd0ee4 MINOR: connection: also stop receiving after a SOCKS4 response
Just as is done in previous patch for all handshake handlers,
also stop receiving after a SOCKS4 response was received. This
one escaped the previous cleanup but must be done to keep the
code safe.
2019-06-03 10:16:35 +02:00
Willy Tarreau
6499b9d996 BUG/MEDIUM: connection: fix multiple handshake polling issues
Connection handshakes were rarely stacked on top of each other, but the
recent experiments consisting in sending PROXY over SOCKS4 revealed a
number of issues in these lower layers. First, each handler waiting for
data MUST subscribe to recv events with __conn_sock_want_recv() and MUST
unsubscribe from send events using __conn_sock_stop_send() to avoid any
wake-up loop in case a previous sender has set this. Second, each handler
waiting for sending MUST subscribe to send events with __conn_sock_want_send()
and MUST unsubscribe from recv events using __conn_sock_stop_recv() to
avoid any wake-up loop in case some data are available on the connection.

Till now this was done at various random places, and in particular the
cases where the FD was not ready for recv forgot to re-enable reading.

Second, while senders can happily use conn_sock_send() which automatically
handles EINTR, loops, and marks the FD as not ready with fd_cant_send(),
there is no equivalent for recv so receivers facing EAGAIN MUST call
fd_cant_send() to enable polling. It could be argued that implementing
an equivalent conn_sock_recv() function could be useful and more
long-term proof than the current situation.

Third, both types of handlers MUST unsubscribe from their respective
events once they managed to do their job, and none may even play with
__conn_xprt_*(). Here again this was lacking, and one surprizing call
to __conn_xprt_stop_recv() was present in the proxy protocol parser
for TCP6 messages!

Thanks to Alexander Liu for his help on this issue.

This patch must be backported to 1.9 and possibly some older versions,
though the SOCKS parts should be dropped.
2019-06-03 08:31:22 +02:00
Willy Tarreau
7067b3a92e BUG/MINOR: deinit/threads: make hard-stop-after perform a clean exit
As reported in GH issue #99, when hard-stop-after triggers and threads
are in use, the chance that any thread releases the resources in use by
the other ones is non-null. Thus no thread should be allowed to deinit()
nor exit by itself.

Here we take a different approach. We simply use a 3rd possible value
for the "killed" variable so that all threads know they must break out
of the run-poll-loop and immediately stop.

This patch was tested by commenting the stream_shutdown() calls in
hard_stop() to increase the chances to see a stream use released
resources. With this fix applied, it never crashes anymore.

This fix should be backported to 1.9 and 1.8.
2019-06-02 11:30:07 +02:00
Alexander Liu
2a54bb74cd MEDIUM: connection: Upstream SOCKS4 proxy support
Have "socks4" and "check-via-socks4" server keyword added.
Implement handshake with SOCKS4 proxy server for tcp stream connection.
See issue #82.

I have the "SOCKS: A protocol for TCP proxy across firewalls" doc found
at "https://www.openssh.com/txt/socks4.protocol". Please reference to it.

[wt: for now connecting to the SOCKS4 proxy over unix sockets is not
 supported, and mixing IPv4/IPv6 is discouraged; indeed, the control
 layer is unique for a connection and will be used both for connecting
 and for target address manipulation. As such it may for example report
 incorrect destination addresses in logs if the proxy is reached over
 IPv6]
2019-05-31 17:24:06 +02:00
Olivier Houchard
cfbb3e6560 MEDIUM: tasks: Get rid of active_tasks_mask.
Remove the active_tasks_mask variable, we can deduce if we've work to do
by other means, and it is costly to maintain. Instead, introduce a new
function, thread_has_tasks(), that returns non-zero if there's tasks
scheduled for the thread, zero otherwise.
2019-05-29 21:53:37 +02:00
Olivier Houchard
661167d136 BUG/MEDIUM: connection: Use the session to get the origin address if needed.
In conn_si_send_proxy(), if we don't have a conn_stream yet, because the mux
won't be created until the SSL handshake is done, retrieve the opposite's
connection from the session. At this point, we know the session associated
with the connection is the one that initiated it, and we can thus just use
the session's origin.

This should be backported to 1.9.
2019-05-29 17:56:59 +02:00
Willy Tarreau
201840abf1 BUG/MEDIUM: mux-h2: don't refrain from offering oneself a used buffer
Usually when calling offer_buffer(), we don't expect to offer it to
ourselves. But with h2 we have the same buffer_wait for the two directions
so we can unblock the recv path when completing a send(), or we can unblock
part of the mux buffer after sending the first few buffers that we managed
to collect. Thus it is important to always accept to wake up any requester.

A few parts of this patch could possibly be backported but earlier versions
already have other issues related to low-buffer condition so it's not sure
it's worth taking the risk to make things worse.
2019-05-29 17:54:35 +02:00
Willy Tarreau
7f1265a238 BUG/MEDIUM: mux-h2: fix the conditions to end the h2_send() loop
The test for the mux alloc failure in h2_send() right after an attempt
at h2_process_mux() used to make sense as it tried to detect that this
latter failed to produce data. But now that we have a list of buffers,
it is a perfectly valid situation where there can still be data in the
buffer(s).

So now when we see this flag we only declare it's the last run on the
loop. In addition we need to make sure we break out of the loop on
snd_buf failure, or we'll loop indefinitely, for example when the buf
is full and we can't send.

No backport is needed.
2019-05-29 17:54:35 +02:00
Olivier Houchard
58d87f31f7 BUG/MEDIUM: h2: Don't forget to set h2s->cs to NULL after having free'd cs.
In h2c_frt_stream_new, if we failed to create the stream for some reason,
don't forget to set h2s->cs to NULL before calling h2s_destroy(), otherwise
h2s_destroy() will call h2s_close(), which will attempt to access
h2s->cs->flags if it's non-NULL.

This should be backported to 1.9.
2019-05-29 16:45:13 +02:00
Olivier Houchard
250031e444 MEDIUM: sessions: Introduce session flags.
Add session flags, and add a new flag, SESS_FL_PREFER_LAST, to be set when
we use NTLM authentication, and we should reuse the last connection. This
should fix using NTLM with HTX. This totally replaces TX_PREFER_LAST.

This should be backported to 1.9.
2019-05-29 15:41:47 +02:00
Christopher Faulet
1146f975a9 BUG/MEDIUM: mux-h1: Don't skip the TCP splicing when there is no more data to read
When there is no more data to read (h1m->curr_len == 0 in the state
H1_MSG_DATA), we still call xprt->rcv_pipe() callback. It is important to update
connection's flags. Especially to remove the flag CO_FL_WAIT_ROOM. Otherwise,
the pipe remains marked as full, preventing the stream-interface to fallback on
rcv_buf(). So the connection may be freezed because no more data is received and
the mux H1 remains blocked in the state H1_MSG_DATA.

This patch must be backported to 1.9.
2019-05-29 15:32:14 +02:00
Willy Tarreau
1e928c074b MEDIUM: task: don't grab the WR lock just to check the WQ
When profiling locks, it appears that the WQ's lock has become the most
contended one, despite the WQ being split by thread. The reason is that
each thread takes the WQ lock before checking if it it does have something
to do. In practice the WQ almost only contains health checks and rare tasks
that can be scheduled anywhere, so this is a real waste of resources.

This patch proceeds differently. Now that the WQ's lock was turned to RW
lock, we proceed in 3 phases :
  1) locklessly check for the queue's emptiness

  2) take an R lock to retrieve the first element and check if it is
     expired. This way most visits are performed with an R lock to find
     and return the next expiration date.

  3) if one expiration is found, we perform the WR-locked lookup as
     usual.

As a result, on a one-minute test involving 8 threads and 64 streams at
1.3 million ctxsw/s, before this patch the lock profiler reported this :

    Stats about Lock TASK_WQ:
         # write lock  : 1125496
         # write unlock: 1125496 (0)
         # wait time for write     : 263.143 msec
         # wait time for write/lock: 233.802 nsec
         # read lock   : 0
         # read unlock : 0 (0)
         # wait time for read      : 0.000 msec
         # wait time for read/lock : 0.000 nsec

And after :

    Stats about Lock TASK_WQ:
         # write lock  : 173
         # write unlock: 173 (0)
         # wait time for write     : 0.018 msec
         # wait time for write/lock: 103.988 nsec
         # read lock   : 1072706
         # read unlock : 1072706 (0)
         # wait time for read      : 60.702 msec
         # wait time for read/lock : 56.588 nsec

Thus the contention was divided by 4.3.
2019-05-28 19:15:44 +02:00
Willy Tarreau
ef28dc11e3 MINOR: task: turn the WQ lock to an RW_LOCK
For now it's exclusively used as a write lock though, thus it remains
100% equivalent to the spinlock it replaces.
2019-05-28 19:15:44 +02:00
Willy Tarreau
186e96ece0 MEDIUM: buffers: relax the buffer lock a little bit
In lock profiles it's visible that there is a huge contention on the
buffer lock. The reason is that when offer_buffers() is called, it
systematically takes the lock before verifying if there is any
waiter. However doing so doesn't protect against races since a
waiter can happen just after we release the lock as well. Similarly
in h2 we take the lock every time an h2c is going to be released,
even without checking that the h2c belongs to a wait list. These
two have now been addressed by verifying non-emptiness of the list
prior to taking the lock.
2019-05-28 17:25:21 +02:00
Willy Tarreau
a8b2ce02b8 MINOR: activity: report the number of failed pool/buffer allocations
Haproxy is designed to be able to continue to run even under very low
memory conditions. However this can sometimes have a serious impact on
performance that it hard to diagnose. Let's report counters of failed
pool and buffer allocations per thread in show activity.
2019-05-28 17:25:21 +02:00
Willy Tarreau
2ae84e445d MEDIUM: poller: separate the wait time from the wake events
We have been abusing the do_poll()'s timeout for a while, making it zero
whenever there is some known activity. The problem this poses is that it
complicates activity diagnostic by incrementing the poll_exp field for
each known activity. It also requires extra computations that could be
avoided.

This change passes a "wake" argument to say that the poller must not
sleep. This simplifies the operations and allows one to differenciate
expirations from activity.
2019-05-28 17:25:21 +02:00
Willy Tarreau
d78d08f95b MINOR: activity: report totals and average separately
Some fields need to be averaged instead of summed (e.g. avg_poll_us)
when reported on the CLI. Let's have a distinct macro for this.
2019-05-28 17:25:21 +02:00
Willy Tarreau
a0211b864c MINOR: activity: write totals on the "show activity" output
Most of the time we find ourselves adding per-thread fields to observe
activity, so let's compute these on the fly and display them. Now the
output shows "field: total [ thr0 thr1 ... thrn ]".
2019-05-28 15:16:09 +02:00
Willy Tarreau
0350b90e31 MEDIUM: htx: make htx_add_data() never defragment the buffer
Now instead of trying to fit 100% of the input data into the output
buffer at the risk of defragmenting it, we put what fits into it only
and return the amount of bytes transferred. In a test, compared to the
previous commit, it increases the cached data rate from 44 Gbps to
55 Gbps and saves a lot in case of large buffers : with a 1 MB buffer,
uncached transfers jumped from 700 Mbps to 30 Gbps.
2019-05-28 14:48:59 +02:00
Willy Tarreau
0a7ef02074 MINOR: htx: make htx_add_data() return the transmitted byte count
In order to later allow htx_add_data() to transmit partial blocks and
avoid defragmenting the buffer, we'll need to return the number of bytes
consumed. This first modification makes the function do this and its
callers take this into account. At the moment the function still works
atomically so it returns either the block size or zero. However all
call places have been adapted to consider any value between zero and
the block size.
2019-05-28 14:48:59 +02:00
Willy Tarreau
d4908fa465 MINOR: htx: rename htx_append_blk_value() to htx_add_data_atonce()
This function is now dedicated to data blocks, and we'll soon need to
access it from outside in a rare few cases. Let's rename it and export
it.
2019-05-28 14:48:59 +02:00
Olivier Houchard
692c1d07f9 MINOR: ssl: Don't forget to call the close method of the underlying xprt.
In ssl_sock_close(), don't forget to call the underlying xprt's close method
if it exists. For now it's harmless not to do so, because the only available
layer is the raw socket, which doesn't have a close method, but that will
change when we implement QUIC.
2019-05-28 10:08:39 +02:00
Olivier Houchard
19afb274ad MINOR: ssl: Make sure the underlying xprt's init method doesn't fail.
In ssl_sock_init(), when initting the underlying xprt, check the return value,
and give up if it fails.
2019-05-28 10:08:28 +02:00
Willy Tarreau
11c90fbd92 BUG/MEDIUM: http: fix "http-request reject" when not final
When "http-request reject" was introduced in 1.8 with commit 53275e8b0
("MINOR: http: implement the "http-request reject" rule"), it was already
broken. The code mentions "it always returns ACT_RET_STOP" and obviously
a gross copy-paste made it ACT_RET_CONT. If the rule is the last one it
properly blocks, but if not the last one it gets ignored, as can be seen
with this simple configuration :

    frontend f1
        bind :8011
        mode http
        http-request reject
        http-request redirect location /

This trivial fix must be backported to 1.9 and 1.8. It is tracked by
github issue #107.
2019-05-28 08:26:17 +02:00
Christopher Faulet
39744f792d MINOR: htx: Remove support of pseudo headers because it is unused
The code to handle pseudo headers is unused and with no real value. So remove
it.
2019-05-28 07:42:33 +02:00
Christopher Faulet
ced39006a2 MINOR: htx: don't rely on htx_find_blk() anymore in the function htx_truncate()
the function htx_find_blk() is used by only one function, htx_truncate(). So
because this function does nothing very smart, we don't use it anymore. It will
be removed by another commit.
2019-05-28 07:42:33 +02:00
Christopher Faulet
0f6d6a9ab6 MINOR: htx: Optimize htx_drain() when all data are drained
Instead of looping on the HTX message to drain all data, the message is now
reset..
2019-05-28 07:42:33 +02:00
Christopher Faulet
ee847d45d0 MEDIUM: filters/htx: Filter body relatively to the first block
The filters filtering HTX body, in the callback http_payload, must now loop on
an HTX message starting from the first block position. The offset passed as
parameter is relative to this position and not the head one. It is mandatory
because once filtered, data are now forwarded using the function
channel_htx_fwd_payload(). So the first block position is always updated.
2019-05-28 07:42:33 +02:00
Christopher Faulet
16af60e540 MINOR: proto-htx: Use channel_htx_fwd_all() when unfiltered body are forwarded
So the first block position of the HTX message will always be updated
accordingly.
2019-05-28 07:42:33 +02:00
Christopher Faulet
8fa60e4613 MINOR: stats/htx: don't use the first block position but the head one
Applets must never rely on the first block position to consume an HTX
message. The head position must be used instead. For the request it is always
the start-line. At this stage, it is not a bug, because the first position of
the request is never changed by HTX analysers.
2019-05-28 07:42:33 +02:00
Christopher Faulet
29f1758285 MEDIUM: htx: Store the first block position instead of the start-line one
We don't store the start-line position anymore in the HTX message. Instead we
store the first block position to analyze. For now, it is almost the same. But
once all changes will be made on this part, this position will have to be used
by HTX analyzers, and only in the analysis context, to know where the analyse
should start.

When new blocks are added in an HTX message, if the first block position is not
defined, it is set. When the block pointed by it is removed, it is set to the
block following it. -1 remains the value to unset the position. the first block
position is unset when the HTX message is empty. It may also be unset on a
non-empty message, meaning every blocks were already analyzed.

From HTX analyzers point of view, this position is always set during headers
analysis. When they are waiting for a request or a response, if it is unset, it
means the analysis should wait. But once the analysis is started, and as long as
headers are not forwarded, it points to the message start-line.

As mentionned, outside the HTX analysis, no code must rely on the first block
position. So multiplexers and applets must always use the head position to start
a loop on an HTX message.
2019-05-28 07:42:33 +02:00
Christopher Faulet
ee1bd4b4f7 MINOR: proto-htx: Use channel_htx_fwd_headers() to forward 1xx responses
Instead of doing it by hand, we now call the dedicated function to do so.
2019-05-28 07:42:33 +02:00
Christopher Faulet
17fd8a261f MINOR: filters/htx: Use channel_htx_fwd_headers() after headers filtering
Instead of doing it by hand in the function flt_analyze_http_headers(), we now
call the dedicated function to do so.
2019-05-28 07:42:33 +02:00
Christopher Faulet
b75b5eaf26 MEDIUM: htx: 1xx messages are now part of the final reponses
1xx informational messages (all except 101) are now part of the HTTP reponse,
semantically speaking. These messages are not followed by an EOM anymore,
because a final reponse is always expected. All these parts can also be
transferred to the channel in same time, if possible. The HTX response analyzer
has been update to forward them in loop, as the legacy one.
2019-05-28 07:42:30 +02:00
Christopher Faulet
a61e97bcae MINOR: htx: Be sure to xfer all headers in one time in htx_xfer_blks()
In the function htx_xfer_blks(), we take care to transfer all headers in one
time. When the current block is a start-line, we check if there is enough space
to transfer all headers too. If not, and if the destination is empty, a parsing
error is reported on the source.

The H2 multiplexer is the only one to use this function. When a parsing error is
reported during the transfer, the flag CS_FL_EOI is also set on the conn_stream.
2019-05-28 07:42:12 +02:00
Christopher Faulet
a39d8ad086 MINOR: mux-h1: Set hdrs_bytes on the SL when an HTX message is produced 2019-05-28 07:42:12 +02:00
Christopher Faulet
33543e73a2 MINOR: h2/htx: Set hdrs_bytes on the SL when an HTX message is produced 2019-05-28 07:42:12 +02:00
Christopher Faulet
05c083ca8d MINOR: htx: Add a field to set the memory used by headers in the HTX start-line
The field hdrs_bytes has been added in the structure htx_sl. It should be used
to set how many bytes are help by all headers, from the start-line to the
corresponding EOH block. it must be set to -1 if it is unknown.
2019-05-28 07:42:12 +02:00
Christopher Faulet
2f6edc84a8 MINOR: mux-h2/htx: Support zero-copy when possible in h2_rcv_buf()
If the channel's buffer is empty and the message is small enough, we can swap
the H2S buffer with the channel one.
2019-05-28 07:42:12 +02:00
Christopher Faulet
9cdd5036f3 MINOR: stream-int: Don't use the flag CO_RFL_KEEP_RSV anymore in si_cs_recv()
Because the channel_recv_max() always return the right value, for HTX and legacy
streams, we don't need to set this flag. The multiplexer don't use it anymore.
2019-05-28 07:42:12 +02:00
Christopher Faulet
8a9ad4c0e8 MINOR: mux-h2: Use the count value received from the SI in h2_rcv_buf()
Now, the SI calls h2_rcv_buf() with the right count value. So we can rely on
it. Unlike the H1 multiplexer, it is fairly easier for the H2 multiplexer
because the HTX message already exists, we only transfer blocks from the H2S to
the channel. And this part is handled by htx_xfer_blks().
2019-05-28 07:42:12 +02:00
Christopher Faulet
30db3d737b MEDIUM: mux-h1: Use the count value received from the SI in h1_rcv_buf()
Now, the SI calls h1_rcv_buf() with the right count value. So we can rely on
it. During the parsing, we now really respect this value to be sure to never
exceed it. To do so, once headers are parsed, we should estimate the size of the
HTX message before copying data.
2019-05-28 07:42:12 +02:00
Christopher Faulet
156852b613 BUG/MINOR: htx: Change htx_xfer_blk() to also count metadata
This patch makes the function more accurate. Thanks to the function
htx_get_max_blksz(), the transfer of data has been simplified. Note that now the
total number of bytes copied (metadata + payload) is returned. This slighly
change how the function is used in the H2 multiplexer.
2019-05-28 07:42:12 +02:00
Christopher Faulet
a3f1550dfa MEDIUM: http/htx: Perform analysis relatively to the first block
The first block is the start-line, if defined. Otherwise it the head of the HTX
message. So now, during HTTP analysis, lookup are all done using the first block
instead of the head. Concretely, for now, it is the same because only one HTTP
message is stored at a time in an HTX message. 1xx informational messages are
handled separatly from the final reponse and from each other. But it will make
sense when the 1xx informational messages and the associated final reponse will
be stored in the same HTX message.
2019-05-28 07:42:12 +02:00
Christopher Faulet
7b7d507a5b MINOR: http/htx: Use sl_pos directly to replace the start-line
Since the HTX start-line is now referenced by position instead of by its payload
address, it is fairly easier to replace it. No need to search the rigth block to
find the start-line comparing the payloads address. It just enough to get the
block at the position sl_pos.
2019-05-28 07:42:12 +02:00
Christopher Faulet
297fbb45fe MINOR: htx: Replace the function http_find_stline() by http_get_stline()
Now, we only return the start-line. If not found, NULL is returned. No lookup is
performed and the HTX message is no more updated. It is now the caller
responsibility to update the position of the start-line to the right value. So
when it is not found, i.e sl_pos is set to -1, it means the last start-line has
been already processed and the next one has not been inserted yet.

It is mandatory to rely on this kind of warranty to store 1xx informational
responses and final reponse in the same HTX message.
2019-05-28 07:42:12 +02:00
Christopher Faulet
b77a1d26a4 MINOR: mux-h2/htx: Get the start-line from the head when HEADERS frame is built
in the H2 multiplexer, when a HEADERS frame is built before sending it, we have
the warranty the start-line is the head of the HTX message. It is safer to rely
on this fact than on the sl_pos value. For now, it's safe to use sl_pos in muxes
because HTTP 1xx messages are considered as full messages in HTX and only one
HTTP message can be stored at a time in HTX. But we are trying to handle 1xx
messages as a part of the reponse message. In this way, an HTTP reponse will be
the sum of all 1xx informational messages followed by the final response. So it
will be possible to have several start-line in the same HTX message. And the
sl_pos will point to the first unprocessed start-line from the analyzers point
of view.
2019-05-28 07:42:12 +02:00
Christopher Faulet
9c66b980fa MINOR: htx: Store start-line block's position instead of address of its payload
Nothing much to say. This change is just mandatory to consider 1xx informational
messages as part of a response.
2019-05-28 07:42:12 +02:00
Christopher Faulet
28f29c7eea MINOR: htx: Store the head position instead of the wrap one
The head of an HTX message is heavily used whereas the wrap position is only
used when a block is added or removed. So it is more logical to store the head
position in the HTX message instead of the wrap one. The wrap position can be
easily deduced. To get it, the new function htx_get_wrap() may be used.
2019-05-28 07:42:12 +02:00
Christopher Faulet
429b91d308 MINOR: htx: Remove the macro IS_HTX_SMP() and always use IS_HTX_STRM() instead
The macro IS_HTX_SMP() is only used at a place, in a context where the stream
always exists. So, we can remove it to use IS_HTX_STRM() instead.
2019-05-28 07:42:12 +02:00
Willy Tarreau
b01302f9ac MEDIUM: config: now alert when two servers have the same name
We've been emitting warnings for over 5 years (since 1.5-dev22) about
configs accidently carrying multiple servers with the same name in the
same backend, and this starts to cause some real trouble in dynamic
environments since it's still very difficult to accurately process
a state-file and we still can't transport a server's name over the
peers protocol because of this.

It's about time to force users to fix their configs if they still
hadn't given that there is zero technical justification for doing this,
beyond the "yyp" (or copy-paste accident) when editing the config.

The message remains as clear as before, indicating the file and lines
of the conflict so that the user can easily fix it.
2019-05-27 19:31:06 +02:00
Willy Tarreau
c3b5958255 BUG/MEDIUM: threads: fix double-word CAS on non-optimized 32-bit platforms
On armv7 haproxy doesn't work because of the fixes on the double-word
CAS. There are two issues. The first one is that the last argument in
case of dwcas is a pointer to the set of value and not a value ; the
second is that it's not enough to cast the data as (void*) since it will
be a single word. Let's fix this by using the pointers as an array of
long. This was tested on i386, armv7, x86_64 and aarch64 and it is now
fine. An alternate approach using a struct was attempted as well but it
used to produce less optimal code.

This fix must be backported to 1.9. This fixes github issue #105.

Cc: Olivier Houchard <ohouchard@haproxy.com>
2019-05-27 17:40:59 +02:00
Willy Tarreau
bff005ae58 BUG/MEDIUM: queue: fix the tree walk in pendconn_redistribute.
In pendconn_redistribute() we scan the queue using eb32_next() on the
node we've just deleted, which is wrong since the node is not in the
tree anymore, and it could dereference one node that has already been
released by another thread. Note that we cannot use eb32_first() in the
loop here instead because we need to skip pendconns having SF_FORCE_PRST.
Instead, let's keep a copy of the next node before deleting it.

In addition, the pendconn retrieved there is wrong, it uses &node as
the pointer instead of node, resulting in very quick crashes when the
server list is scanned.

Fortunately this only happens when "option redispatch" is used in
conjunction with "maxconn" on server lines, "cookie" for the stickiness,
and when a server goes down with entries in its queue.

This bug was introduced by commit 0355dabd7 ("MINOR: queue: replace
the linked list with a tree") so the fix must be backported to 1.9.
2019-05-27 10:29:59 +02:00
Willy Tarreau
b6195ef2a6 BUG/MAJOR: lb/threads: make sure the avoided server is not full on second pass
In fwrr_get_next_server(), we optionally pass a server to avoid. It
usually points to the current server during a redispatch operation. If
this server is usable, an "avoided" pointer is set and we continue to
look for another server. If in the end no other server is found, then
we fall back to this avoided one, which is still better than nothing.

The problem that may arise with threads is that in the mean time, this
avoided server might have received extra connections and might not be
usable anymore. This causes it to be queued a second time in the "full"
list and the loop to search for a server again, ending up on this one
again and so on.

This patch makes sure that we break out of the loop when we have to
pick the avoided server. It's probably what the code intended to do
as the current break statement causes fwrr_update_position() and
fwrr_dequeue_srv() to be called again on the avoided server.

It must be backported to 1.9 and 1.8, and seems appropriate for older
versions though it's unclear what the impact of this bug might be
there since the race doesn't exist and we're left with the double
update of the server's position.
2019-05-27 10:29:59 +02:00
Willy Tarreau
d6a7850200 MINOR: cli/activity: add 3 general purpose counters in development mode
The unused fd_del and fd_skip were being abused during debugging sessions
as general purpose event counters. With their removal, let's officially
have dedicated counters for such use cases. These counters are called
"ctr0".."ctr2" and are listed at the end when DEBUG_DEV is set.
2019-05-27 07:03:38 +02:00
Willy Tarreau
394c9b4215 MINOR: cli/activity: remove "fd_del" and "fd_skip" from show activity
These variables are never set anymore and were always reported as zero.
2019-05-27 06:59:14 +02:00
Ilya Shipitsin
0590f44254 BUILD: ssl: fix latest LibreSSL reg-test error
starting with OpenSSL 1.0.0 recommended way to disable compression is
using SSL_OP_NO_COMPRESSION when creating context.

manipulations with SSL_COMP_get_compression_methods, sk_SSL_COMP_num
are only required for OpenSSL < 1.0.0
2019-05-26 21:26:02 +02:00
Willy Tarreau
08e2b41e81 BUILD: connections: shut up gcc about impossible out-of-bounds warning
Since commit 88698d9 ("MEDIUM: connections: Add a way to control the
number of idling connections.") when building without threads, gcc
complains that the operations made on the idle_orphan_conns[] list is
out of bounds, which is always false since 1) <i> can only equal zero,
and 2) given it's equal to <tid> we never even enter the loop. But as
usual it thinks it knows better, so let's mask the origin of this <i>
value to shut it up. Another solution consists in making <i> unsigned
and adding an explicit range check.
2019-05-26 11:54:20 +02:00
Willy Tarreau
9c218e7521 MAJOR: mux-h2: switch to next mux buffer on buffer full condition.
Now when we fail to send because the mux buffer is full, before giving
up and marking MFULL, we try to allocate another buffer in the mux's
ring to try again. Thanks to this (and provided there are enough buffers
allocated to the mux's ring), a single stream picked in the send_list
cannot steal all the mux's room at once. For this, we expand the ring
size to 31 buffers as it seems to be optimal on benchmarks since it
divides the number of context switches by 3. It will inflate each H2
conn's memory by 1 kB.

The bandwidth is now much more stable. Prior to this, it a test on
h2->h1 with very large objects (1 GB), a few tens of connections and
a few tens of streams per connection would show a varying performance
between 34 and 95 Gbps on 2 cores/4 threads, with h2_snd_buf() stopped
on a buffer full condition between 300000 and 600000 times per second.
Now the performance is constantly between 88 and 96 Gbps. Measures show
that buffer full conditions are met around only 159 times per second
in this case, or rougly 2000 to 4000 times less often.
2019-05-26 11:33:19 +02:00
Willy Tarreau
60f62682b1 MINOR: mux-h2: report the mbuf's head and tail in "show fd"
It's useful to know how the mbuf spans over the whole area and to have
access to the first and last ones, so let's dump just this.
2019-05-26 11:33:18 +02:00
Willy Tarreau
bcc4595e57 CLEANUP: mux-h2: consistently use a local variable for the mbuf
This makes the code more readable and reduces the calls to br_tail().
In addition, all calls to h2_get_buf() are now made via this local
variable, which should significantly help for retries.
2019-05-26 10:52:47 +02:00
Willy Tarreau
41c4d6a2c5 MEDIUM: mux-h2: make the send() function iterate over all mux buffers
Now send() uses a loop to iterate over all buffers to be sent. These
buffers are released and deleted from the vector once completely sent.
If any buffer gets released, offer_buffers() is called to wake up some
waiters.
2019-05-26 10:52:25 +02:00
Willy Tarreau
2e3c000c1c MINOR: mux-h2: introduce h2_release_mbuf() to release all buffers in the mbuf ring
This function iterates over all buffers in the mbuf ring to release all
of them from the head to the tail.
2019-05-26 10:51:25 +02:00
Willy Tarreau
662fafc02b MEDIUM: mux-h2: make the conditions to send based on mbuf, not just its tail
This is in preparation for iterating over lists. First we need to always
check the buffer's head and not its tail.
2019-05-26 10:50:50 +02:00
Willy Tarreau
5133096df2 MEDIUM: mux-h2: replace all occurrences of mbuf with a buffer ring
For now it's only one buffer long so the head and tails are always the
same, thus it doesn't change what used to work. In short, br_tail(h2c->mbuf)
was inserted everywhere we used to have h2c->mbuf.
2019-05-26 10:50:18 +02:00
Willy Tarreau
455d5681b6 MEDIUM: mux-h2: avoid doing expensive buffer realigns when not absolutely needed
Transferring large objects over H2 sometimes shows unexplained performance
variations. A long analysis resulted in the following discovery. Often the
mux buffer looks like this :

    [ empty_head |    data     | empty_tail ]

Typical numbers are (very common) :
  - empty_head = 31
  - empty_tail = 16  (total free=47)
  - data = 16337
  - size = 16384
  - data to copy: 43

The reason for these holes are the blocking factors that are not always
the same in and out (due to keeping 9 bytes for the frame size, or the
56 bytes corresponding to the HTX header). This can easily happen 10000
times a second if the network bandwidth permits it!

In this case, while copying a DATA frame we find that the buffer has its
free space wrapped so we decide to realign it to optimize the copy. It's
possible that this practice stems from the code used to emit headers,
which do not support fragmentation and which had no other option left.
But it comes with two problems :
  - we don't check if the data fits, which results in a memcpy for nothing
  - we can move huge amounts of data to just copy a small block.

This patch addresses this two ways :
  - first, by not forcing a data realignment if what we have to copy does
    not fit, as this is totally pointless ;

  - second, by refusing to move too large data blocks. The threshold was
    set to 1 kB, because it may make sense to move 1 kB of data to copy
    a 15 kB one at once, which will leave as a single 16 kB block, but
    it doesn't make sense to mvoe 15 kB to copy just 1 kB. In all cases
    the data would fit and would just be split into two blocks, which is
    not very expensive, hence the low limit to 1 kB

With such changes, realignments are very rare, they show up around once
every 15 seconds at 60 Gbps, and look like this, resulting in a much more
stable bit rate :

  buf=0x7fe6ec0c3510,h=16333,d=35,s=16384 room=16349 in=16337

This patch should be safe for backporting to 1.9 if some performance
issues are reported there.
2019-05-25 20:31:53 +02:00
Ilya Shipitsin
e242f3dfb8 BUG/MINOR: ssl_sock: Fix memory leak when disabling compression
according to manpage:

       sk_TYPE_zero() sets the number of elements in sk to zero. It
       does not free sk so after this call sk is still valid.

so we need to free all elements

[wt: seems like it has been there forever and should be backported
 to all stable branches]
2019-05-25 07:45:55 +02:00
Christopher Faulet
b8fd4c031c BUG/MINOR: htx: Remove a forgotten while loop in htx_defrag()
Fortunately, this loop does nothing. Otherwise it would have led to an infinite
loop. It was probably forgotten during a refactoring, in the early stage of the
HTX.

This patch must be backported to 1.9.
2019-05-24 09:11:10 +02:00
Christopher Faulet
f90c24d14c BUG/MEDIUM: proto-htx: Not forward too much data when 1xx reponses are handled
When an 1xx reponse is processed, we forward it immediatly. But another message
may already be in the channel's buffer, waiting to be processed. This may be
another 1xx reponse or the final one. So instead of forwarding everything, we
must take care to only forward the processed 1xx response.

This patch must be backported to 1.9.
2019-05-24 09:11:07 +02:00
Christopher Faulet
8e9e3ef15c BUG/MINOR: mux-h1: Report EOI instead EOS on parsing error or H2 upgrade
When a parsing error occurrs in the H1 multiplexer, we stop to copy HTX
blocks. So the error may be reported with an emtpy HTX message. For instance, if
the headers parsing failed. When it happens, the flag CS_FL_EOS is also set on
the conn_stream. But it is an error. Most of time, it is set on established
connections, so it is not really an issue. But if it happens when the server
connection is not fully established, the connection is shut down immediatly and
the stream-interface is switched from SI_ST_CON to SI_ST_DIS/CLO. So HTX
analyzers have no chance to catch the error.

Instead of setting CS_FL_EOS, it is fairly better to set CS_FL_EOI, which is the
right flag to use. The same is also done on H2 upgrade. As a side effet of this
fix, in the stream-interface code, we must now set the flag CF_READ_PARTIAL on
the channel when the flag CF_EOI is set. It is a warranty to wakeup the stream
when EOI is reported to the channel while no data are received.

This patch must be backported to 1.9.
2019-05-24 09:11:01 +02:00
Christopher Faulet
316934d3c9 BUG/MINOR: mux-h2: Count EOM in bytes sent when a HEADERS frame is formatted
In HTX, when a HEADERS frame is formatted before sending it to the client or the
server, If an EOM is found because there is no body, we must count it in the
number bytes sent.

This patch must be backported to 1.9.
2019-05-24 09:10:46 +02:00
Christopher Faulet
256b69a82d BUG/MINOR: lua: Set right direction and flags on new HTTP objects
When a LUA HTTP object is created using the current TXN object, it is important
to also set the right direction and flags, using ones from the TXN object.

This patch may be backported to all supported branches with the lua
support. But, it seems to have no impact for now.
2019-05-24 09:07:57 +02:00
Christopher Faulet
55ae8a64e4 BUG/MEDIUM: spoe: Don't use the SPOE applet after releasing it
In spoe_release_appctx(), the SPOE applet may be used after it was released to
get its exit status code. Of course, HAProxy crashes when this happens.

This patch must be backported to 1.9 and 1.8.
2019-05-24 09:07:30 +02:00
Christopher Faulet
08e6646460 BUG/MINOR: proto-htx: Try to keep connections alive on redirect
As fat as possible, we try to keep the connections alive on redirect. It's
possible when the request has no body or when the request parsing is finished.

No backport is needed.
2019-05-24 09:06:59 +02:00
Willy Tarreau
1713c03825 MINOR: stats: report the global output bit rate in human readable form
The stats page now reports the per-process output bit rate and applies
the usual conversions needed to turn the TCP payload rate to an Ethernet
bit rate in order to give a reasonably accurate estimate of how far from
interface saturation we are.
2019-05-23 12:31:51 +02:00
Willy Tarreau
7cf0e4517d MINOR: raw_sock: report global traffic statistics
Many times we've been missing per-process traffic statistics. While it
didn't make sense in multi-process mode, with threads it does. Thus we
now have a counter of bytes emitted by raw_sock, and a freq counter for
these as well. However, freq_ctr are limited to 32 bits, and given that
loads of 300 Gbps have already been reached over a loopback using
splicing, we need to downscale this a bit. Here we're storing 1/32 of
the byte rate, which gives a theorical limit of 128 GB/s or ~1 Tbps,
which is more than enough. Let's have fun re-reading this sentence in
2029 :-)  The values can be read in "show info" output on the CLI.
2019-05-23 11:45:38 +02:00
Willy Tarreau
bc1b820606 BUILD: watchdog: condition it to USE_RT
It's needed on Linux to have access to timerfd_*, and on FreeBSD this
lib is needed as well, though not enabled in our default build. We can
see later if it's OK to enable it, for now let's fix the build issues.
2019-05-23 10:20:55 +02:00
Willy Tarreau
02255b24df BUILD: watchdog: use si_value.sival_int, not si_int for the timer's value
Bah, the linux manpage suggests to use si_int but it's a fake, it's only
a define on sigval.sival_int where sigval is defined as si_value. Let's
use si_value.sival_int, at least it builds on both Linux and FreeBSD. It's
likely that this code will have to be limited to a small subset of OSes
if it causes difficulties like this.
2019-05-23 08:36:29 +02:00
Willy Tarreau
96d5195862 MEDIUM: config: deprecate the antique req* and rsp* commands
These commands don't follow the same flow as the rest of the commands,
each of them iterates over all header lines before switching to the
next directive. In addition they make no distinction between start
line and headers and can lead to unparsable rewrites which are very
difficult to deal with internally.

Most of them are still occasionally found in configurations, mainly
because of the usual "we've always done this way". By marking them
deprecated and emitting a warning and recommendation on first use of
each of them, we will raise users' awareness of users regarding the
cleaner, faster and more reliable alternatives.

Some use cases of "reqrep" still appear from time to time for URL
rewriting that is not so convenient with other rules. But at least
users facing this requirement will explain their use case so that we
can best serve them. Some discussion started on this subject in a
thread linked to from github issue #100.

The goal is to remove them in 2.1 since they require to reparse the
result before indexing it and we don't want this hack to live long.
The following directives were marked deprecated :

  -reqadd
  -reqallow
  -reqdel
  -reqdeny
  -reqiallow
  -reqidel
  -reqideny
  -reqipass
  -reqirep
  -reqitarpit
  -reqpass
  -reqrep
  -reqtarpit
  -rspadd
  -rspdel
  -rspdeny
  -rspidel
  -rspideny
  -rspirep
  -rsprep
2019-05-22 20:43:45 +02:00
Willy Tarreau
3844747536 CLEANUP: raw_sock: remove support for very old linux splice bug workaround
We've been dealing with a workaround for a bug in splice that used to
affect version 2.6.25 to 2.6.27.12 and which was fixed 10 years ago
in kernel versions which are not supported anymore. Given that people
who would use a kernel in such a range would face much more serious
stability and security issues, it's about time to get rid of this
workaround and of the ASSUME_SPLICE_WORKS build option used to disable
it.
2019-05-22 20:02:15 +02:00
Willy Tarreau
e5733234f6 CLEANUP: build: rename some build macros to use the USE_* ones
We still have quite a number of build macros which are mapped 1:1 to a
USE_something setting in the makefile but which have a different name.
This patch cleans this up by renaming them to use the USE_something
one, allowing to clean up the makefile and make it more obvious when
reading the code what build option needs to be added.

The following renames were done :

 ENABLE_POLL -> USE_POLL
 ENABLE_EPOLL -> USE_EPOLL
 ENABLE_KQUEUE -> USE_KQUEUE
 ENABLE_EVPORTS -> USE_EVPORTS
 TPROXY -> USE_TPROXY
 NETFILTER -> USE_NETFILTER
 NEED_CRYPT_H -> USE_CRYPT_H
 CONFIG_HAP_CRYPT -> USE_LIBCRYPT
 CONFIG_HAP_NS -> DUSE_NS
 CONFIG_HAP_LINUX_SPLICE -> USE_LINUX_SPLICE
 CONFIG_HAP_LINUX_TPROXY -> USE_LINUX_TPROXY
 CONFIG_HAP_LINUX_VSYSCALL -> USE_LINUX_VSYSCALL
2019-05-22 19:47:57 +02:00
Willy Tarreau
823bda0eb7 BUILD: time: remove the test on _POSIX_C_SOURCE
It seems it's not defined on FreeBSD while it's mentioned on Linux that
clock_gettime() can be detected using this. Given that we also have the
test for _POSIX_TIMERS>0 that should cover it well enough. If it breaks
on other systems, we'll see.

Report was here :
    https://github.com/haproxy/haproxy/runs/133866993
2019-05-22 19:14:59 +02:00
Willy Tarreau
082b62828d BUG/MEDIUM: init/threads: provide per-thread alloc/free function callbacks
We currently have the ability to register functions to be called early
on thread creation and at thread deinitialization. It turns out this is
not sufficient because certain such functions may use resources that are
being allocated by the other ones, thus creating a race condition depending
only on the linking order. For example the mworker needs to register a
file descriptor while the pollers will reallocate the fd_updt[] array.
Similarly logs and trashes may be used by some init functions while it's
unclear whether they have been deduplicated.

The same issue happens on deinit, if the fd_updt[] or trash is released
before some functions finish to use them, we'll get into trouble.

This patch creates a couple of early and late callbacks for per-thread
allocation/freeing of resources. A few init functions were moved there,
and the fd init code was split between the two (since it used to both
allocate and initialize at once). This way the init/deinit sequence is
expected to be safe now.

This patch should be backported to 1.9 as at least the trash/log issue
seems to be present. The run_thread_poll_loop() code is a bit different
there as the mworker is not a callback, but it will have no effect and
it's enough to drop the mworker changes.

This bug was reported by Ilya Shipitsin in github issue #104.
2019-05-22 14:59:08 +02:00
Willy Tarreau
aabbe6a3bb MINOR: WURFL: do not emit warnings when not configured
At the moment the WURFL module emits 3 lines of warnings upon startup
when it is not referenced in the configuration file, which is quite
confusing. Let's make sure to keep it silent when not configured, as
detected by the absence of the wurfl-data-file statement.
2019-05-22 14:01:22 +02:00
mbellomi
ae4fcf1e67 MINOR: WURFL: module version bump to 2.0
Make it version 2.0.
2019-05-22 12:06:42 +02:00
mbellomi
2c07700098 MEDIUM: WURFL: HTX awareness.
Now wurfl fetch process is fully  HTX aware.
2019-05-22 12:06:38 +02:00
mbellomi
9896981675 MINOR: WURFL: wurfl_get() and wurfl_get_all() now return an empty string if device detection fails 2019-05-22 12:06:38 +02:00
mbellomi
e9fedf560a MINOR: WURFL: removes heading wurfl-information-separator from wurfl-get-all() and wurfl-get() results 2019-05-22 12:06:38 +02:00
mbellomi
4304e30af1 MINOR: WURFL: shows log messages during module initialization
Now some useful startup information is logged to stderr. Previously they
were lost because logs were not yet enabled.
2019-05-22 12:06:34 +02:00
mbellomi
f9ea1e2fd4 MINOR: WURFL: fixed Engine load failed error when wurfl-information-list contains wurfl_root_id 2019-05-22 12:06:07 +02:00
mbellomi
d173e93aa7 BUG/MEDIUM: WURFL: segfault in wurfl-get() with missing info.
A segfault may happen in ha_wurfl_get() when dereferencing information not
present in wurfl-information-list. Check the node retrieved from the tree,
not its container.

This fix must be backported to 1.9.
2019-05-22 12:06:02 +02:00
Willy Tarreau
0a7a4fbbc8 CLEANUP: mux-h1: use "H1" and not "h1" as the mux's name
The mux's name is the only one reported in lower case in "show sess"
or "haproxy -vv" while the other ones are upper case, so it loses and
the other ones win :-)
2019-05-22 11:50:48 +02:00
Willy Tarreau
b106ce1c3d MINOR: stream: remove the cpu time detection from process_stream()
It was not as efficient as the watchdog in that it would only trigger
after the problem resolved by itself, and still required a huge margin
to make sure we didn't trigger for an invalid reason. This used to leave
little indication about the cause. Better use the watchdog now and
improve it if needed.

The detector of unkillable tasks remains active though.
2019-05-22 11:50:48 +02:00
Willy Tarreau
2bfefdbaef MAJOR: watchdog: implement a thread lockup detection mechanism
Since threads were introduced, we've naturally had a number of bugs
related to locking issues. In addition we've also got some issues
with corrupted lists in certain rare cases not necessarily involving
threads. Not only these events cause a lot of trouble to the production
as it is very hard to detect that the process is stuck in a loop and
doesn't deliver the service anymore, but it's often difficult (or too
late) to collect more debugging information.

The patch presented here implements a lockup detection mechanism, also
known as "watchdog". The principle is that (on systems supporting it),
each thread will have its own CPU timer which progresses as the thread
consumes CPU cycles, and when a deadline is met, a signal is delivered
(SIGALRM here since it doesn't interrupt gdb by default).

The thread handling this signal (which is not necessarily the one which
triggered the timer) figures the thread ID from the signal arguments and
checks if it's really stuck by looking at the time spent since last exit
from poll() and by checking that the thread's scheduler is still alive
(so that even when dealing with configuration issues resulting in insane
amount of tasks being called in turn, it is not possible to accidently
trigger it). Checking the scheduler's activity will usually result in a
second chance, thus doubling the detecting time.

In order not to incorrectly flag a thread as being the cause of the
lockup, the thread_harmless_mask is checked : a thread could very well
be spinning on itself waiting for all other threads to join (typically
what happens when issuing "show sess"). In this case, once all threads
but one (or two) have joined, all the innocent ones are marked harmless
and will not trigger the timer. Only the ones not reacting will.

The deadline is set to one second, which already appears impossible to
reach, especially since it's 1 second of CPU usage, not elapsed time
with the CPU being preempted by other threads/processes/hypervisor. In
practice due to the scheduler's health verification it takes up to two
seconds to decide to panic.

Once all conditions are met, the goal is to crash from the offending
thread. So if it's the current one, we call ha_panic() otherwise the
signal is bounced to the offending thread which deals with it. This
will result in all threads being woken up in turn to dump their context,
the whole state is emitted on stderr in hope that it can be logged, and
the process aborts, leaving a chance for a core to be dumped and for a
service manager to restart it.

An alternative mechanism could be implemented for systems unable to
wake up a thread once its CPU clock reaches a deadline (e.g. FreeBSD).
Instead of waking the timer each and every deadline, it is possible to
use a standard timer which is reset each time we leave poll(). Since
the signal handler rechecks the CPU consumption this will also work.
However a totally idle process may trigger it from time to time which
may or may not confuse some debugging sessions. The same is true for
alarm() which could be another option for systems not having such a
broad choice of timers (but it seems that in this case they will not
have per-thread CPU measurements available either).

The feature is currently implemented only when threads are enabled in
order to keep the code clean, since the main purpose is to detect and
address inter-thread deadlocks. But if it proves useful for other
situations this condition might be relaxed.
2019-05-22 11:50:48 +02:00
Willy Tarreau
e6a02fa65a MINOR: threads: add a "stuck" flag to the thread_info struct
This flag is constantly cleared by the scheduler and will be set by the
watchdog timer to detect stuck threads. It is also set by the "show
threads" command so that it is easy to spot if the situation has evolved
between two subsequent calls : if the first "show threads" shows no stuck
thread and the second one shows such a stuck thread, it indicates that
this thread didn't manage to make any forward progress since the previous
call, which is extremely suspicious.
2019-05-22 11:50:48 +02:00
Willy Tarreau
578ea8be55 MINOR: debug: dump streams when an applet, iocb or stream is known
Whenever we can retrieve a valid stream pointer, we now call stream_dump()
to get a detailed dump of the stream currently running on the processor.
This is used by "show threads" and by ha_panic().
2019-05-22 11:50:48 +02:00
Willy Tarreau
5484d58a17 MINOR: stream: introduce a stream_dump() function and use it in stream_dump_and_crash()
This function dumps a lot of information about a stream into the provided
buffer. It is now used by stream_dump_and_crash() and will be used by the
debugger as well.
2019-05-22 11:50:48 +02:00
Willy Tarreau
fade80d162 CLEANUP: debug: make use of ha_tkill() and remove ifdefs
This way we always signal the threads the same way.
2019-05-22 11:50:48 +02:00
Willy Tarreau
2beaaf7d46 MINOR: threads: implement ha_tkill() and ha_tkillall()
These functions are used respectively to signal one thread or all threads.
When multithreading is disabled, it's always the current thread which is
signaled.
2019-05-22 11:50:48 +02:00
Willy Tarreau
8b35ba54bc CLEANUP: debug: always report harmless/want_rdv even without threads
This way we have a more consistent output and we can remove annoying
ifdefs.
2019-05-22 11:50:48 +02:00
Willy Tarreau
05ed14cfc4 CLEANUP: threads: really move thread_info to hathreads.c
Commit 5a6e2245f ("REORG: threads: move the struct thread_info from
global.h to hathreads.h") didn't hold its promise well, as the thread_info
struct was still declared and initialized in haproxy.c in addition to being
in hathreads.c. Let's move it for real now.
2019-05-22 11:50:48 +02:00
Willy Tarreau
ddd8533f1b MINOR: debug: switch to SIGURG for thread dumps
The current choice of SIGPWR has the adverse effect of stopping gdb each
time it is triggered using "show threads" or example, which is not really
convenient. Let's switch to SIGURG instead, which we don't use either.
2019-05-22 11:50:48 +02:00
Tim Duesterhus
9b7a976cd6 BUG/MINOR: mworker: Fix memory leak of mworker_proc members
The struct mworker_proc is not uniformly freed everywhere, sometimes leading
to leaks of the `id` string (and possibly the other strings).

Introduce a mworker_free_child function instead of duplicating the freeing
logic everywhere to prevent this kind of issues.

This leak was reported in issue #96.

It looks like the leaks have been introduced in commit 9a1ee7ac31,
which is specific to 2.0-dev. Backporting `mworker_free_child` might be
helpful to ease backporting other fixes, though.
2019-05-22 11:29:18 +02:00
Willy Tarreau
f61782418c CLEANUP: time: refine the test on _POSIX_TIMERS
The clock_gettime() man page says we must check that _POSIX_TIMERS is
defined to a value greater than zero, not just that it's simply defined
so let's fix this right now.
2019-05-21 20:03:03 +02:00
Olivier Houchard
aacc405c1f BUG/MEDIUM: streams: Don't switch from SI_ST_CON to SI_ST_DIS on read0.
When we receive a read0, and we're still in SI_ST_CON state (so on an
outgoing conneciton), don't immediately switch to SI_ST_DIS, or, we would
never call sess_establish(), and so the analysers will never run.
Instead, let sess_establish() handle that case, and switch to SI_ST_DIS if
we already have CF_SHUTR on the channel.

This should be backported to 1.9.
2019-05-21 19:05:09 +02:00
Emmanuel Hocdet
0ba4f483d2 MAJOR: polling: add event ports support (Solaris)
Event ports are kqueue/epoll polling class for Solaris. Code is based
on https://github.com/joyent/haproxy-1.8/tree/joyent/dev-v1.8.8.
Event ports are available only on SunOS systems derived from
Solaris 10 and later (including illumos systems).
2019-05-21 15:16:45 +02:00
Willy Tarreau
663fda4c90 BUILD: threads: only assign the clock_id when supported
I took extreme care to always check for _POSIX_THREAD_CPUTIME before
manipulating clock_id, except at one place (run_thread_poll_loop) as
found by Manu, breaking Solaris. Now fixed, no backport needed.
2019-05-21 15:14:08 +02:00
Willy Tarreau
9c8800af3b MINOR: debug: report each thread's cpu usage in "show thread"
Now we can report each thread's CPU time, both at wake up (poll) and
retrieved while dumping (now), then the difference, which directly
indicates how long the thread has been running uninterrupted. A very
high value for the diff could indicate a deadlock, especially if it
happens between two threads. Note that it may occasionally happen
that a wrong value is displayed since nothing guarantees that the
date is read atomically.
2019-05-20 21:14:14 +02:00
Willy Tarreau
81036f2738 MINOR: time: move the cpu, mono, and idle time to thread_info
These ones are useful across all threads and would be better placed
in struct thread_info than thread-local. There are very few users.
2019-05-20 21:14:14 +02:00
Willy Tarreau
8323a375bc MINOR: threads: add a thread-local thread_info pointer "ti"
Since we're likely to access this thread_info struct more frequently in
the future, let's reserve the thread-local symbol to access it directly
and avoid always having to combine thread_info and tid. This pointer is
set when tid is set.
2019-05-20 21:14:12 +02:00
Willy Tarreau
624dcbf41e MINOR: threads: always place the clockid in the struct thread_info
It will be easier to deal with the internal API to always have it.
2019-05-20 21:13:01 +02:00
Willy Tarreau
5a6e2245fa REORG: threads: move the struct thread_info from global.h to hathreads.h
It doesn't make sense to keep this struct thread_info in global.h, it
causes difficulties to access its contents from hathreads.h, let's move
it to the threads where it ought to have been created.
2019-05-20 20:00:25 +02:00
Willy Tarreau
a9f9fc9e5b MINOR: debug: make ha_panic() report threads starting at 1
Internally they start at zero but everywhere (config, dumps) we show
them starting at 1, so let's fix the confusion.
2019-05-20 17:46:14 +02:00
Willy Tarreau
3710105945 MINOR: tools: provide a may_access() function and make dump_hex() use it
It's a bit too easy to crash by accident when using dump_hex() on any
area. Let's have a function to check if the memory may safely be read
first. This one abuses the stat() syscall checking if it returns EFAULT
or not, in which case it means we're not allowed to read from there. In
other situations it may return other codes or even a success if the
area pointed to by the file exists. It's important not to abuse it
though and as such it's tested only once per output line.
2019-05-20 16:59:37 +02:00
Willy Tarreau
6bdf3e9b11 MINOR: debug/cli: add some debugging commands for developers
When haproxy is built with DEBUG_DEV, the following commands are added
to the CLI :

  debug dev close <fd>        : close this file descriptor
  debug dev delay [ms]        : sleep this long
  debug dev exec  [cmd] ...   : show this command's output
  debug dev exit  [code]      : immediately exit the process
  debug dev hex   <addr> [len]: dump a memory area
  debug dev log   [msg] ...   : send this msg to global logs
  debug dev loop  [ms]        : loop this long
  debug dev panic             : immediately trigger a panic
  debug dev tkill [thr] [sig] : send signal to thread

These are essentially aimed at helping developers trigger certain
conditions and are expected to be complemented over time.
2019-05-20 16:59:30 +02:00
Willy Tarreau
56131ca58e MINOR: debug: implement ha_panic()
This function dumps all existing threads using the thread dump mechanism
then aborts. This will be used by the lockup detection and by debugging
tools.
2019-05-20 16:51:30 +02:00
Willy Tarreau
9fc5dcbd71 MINOR: tools: add dump_hex()
This is used to dump a memory area into a buffer for debugging purposes.
2019-05-20 16:51:30 +02:00
Willy Tarreau
da5a63f8f1 CLEANUP: stream: remove an obsolete debugging test
The test consisted in checking that there was always a timeout on a
stream's task and was only enabled when built in development mode,
but 1) it is never tested and 2) if it had been tested it would have
been noticed that it triggers a bit too easily on the CLI. Let's get
rid of this old one.
2019-05-20 16:19:40 +02:00
Willy Tarreau
91e6df01fa MINOR: threads: add each thread's clockid into the global thread_info
This is the per-thread CPU runtime clock, it will be used to measure
the CPU usage of each thread and by the lockup detection mechanism. It
must only be retrieved at the beginning of run_thread_poll_loop() since
the thread must already have been started for this. But it must be done
before performing any per-thread initcall so that all thread init
functions have access to the clock ID.

Note that it could make sense to always have this clockid available even
in non-threaded situations and place the process' clock there instead.
But it would add portability issues which are currently easy to deal
with by disabling threads so it may not be worth it for now.
2019-05-20 11:42:25 +02:00
Willy Tarreau
522cfbc1ea MINOR: init/threads: make the global threads an array of structs
This way we'll be able to store more per-thread information than just
the pthread pointer. The storage became an array of struct instead of
an allocated array since it's very small (typically 512 bytes) and not
worth the hassle of dealing with memory allocation on this. The array
was also renamed thread_info to make its intended usage more explicit.
2019-05-20 11:37:57 +02:00
Willy Tarreau
64a47b943c CLEANUP: memory: make the fault injection code use the OTHER_LOCK label
The mem_should_fail() function sets a lock while it's building its
messages, and when this was done there was no relevant label available
hence the confusing use of START_LOCK. Now OTHER_LOCK is available for
such use cases, so let's switch to this one instead as START_LOCK is
going to disappear.
2019-05-20 11:26:12 +02:00
Willy Tarreau
619a95f5ad MEDIUM: init/mworker: make the pipe register function a regular initcall
Now that we have the guarantee that init calls happen before any other
thread starts, we don't need anymore the workaround installed by commit
1605c7ae6 ("BUG/MEDIUM: threads/mworker: fix a race on startup") and we
can instead rely on a regular per-thread initcall for this function. It
will only be performed on worker thread #0, the other ones and the master
have nothing to do, just like in the original code that was only moved
to the function.
2019-05-20 11:26:12 +02:00
Willy Tarreau
3078e9f8e2 MINOR: threads/init: synchronize the threads startup
It's a bit dangerous to let threads initialize at different speeds on
startup. Some are still in their init functions while others area already
running. It was even subject to some race condition bugs like the one
fixed by commit 1605c7ae6 ("BUG/MEDIUM: threads/mworker: fix a race on
startup").

Here in order to secure all this, we take a very simplistic approach
consisting in using half of the rendez-vous point, which is made
exactly for this purpose : we first initialize the mask of the threads
requesting a rendez-vous to the mask of all threads, and we simply call
thread_release() once the init is complete. This guarantees that no
thread will go further than the initialization code during this time.

This could even safely be backported if any other issue related to an
init race was discovered in a stable release.
2019-05-20 11:26:12 +02:00
William Lallemand
7b302d8dd5 MINOR: init: setenv HAPROXY_CFGFILES
Set the HAPROXY_CFGFILES environment variable which contains the list of
configuration files used to start haproxy, separated by semicolon.
2019-05-20 11:21:00 +02:00
Willy Tarreau
c7091d89ae MEDIUM: debug/threads: implement an advanced thread dump system
The current "show threads" command was too limited as it was not possible
to dump other threads' detailed states (e.g. their tasks). This patch
goes further by using thread signals so that each thread can dump its
own state in turn into a shared buffer provided by the caller. Threads
are synchronized using a mechanism very similar to the rendez-vous point
and using this method, each thread can safely dump any of its contents
and the caller can finally report the aggregated ones from the buffer.

It is important to keep in mind that the list of signal-safe functions
is limited, so we take care of only using chunk_printf() to write to a
pre-allocated buffer.

This mechanism is enabled by USE_THREAD_DUMP and is enabled by default
on Linux 2.6.28+. On other platforms it falls back to the previous
solution using the loop and the less precise dump.
2019-05-17 17:16:20 +02:00
Willy Tarreau
0ad46fa6f5 MINOR: stream: detach the stream from its own task on stream_free()
This makes sure that the stream is not visible from its own task just
before starting to free some of its components. This way we have the
guarantee that a stream found in a task list is totally valid and can
safely be dereferenced.
2019-05-17 17:16:20 +02:00
Willy Tarreau
01f3489752 MINOR: task: put barriers after each write to curr_task
This one may be watched by signal handlers, we don't want the compiler
to optimize its assignment away at the end of the loop and leave some
wandering pointers there.
2019-05-17 17:16:20 +02:00
Willy Tarreau
38171daf21 MINOR: thread: implement ha_thread_relax()
At some places we're using a painful ifdef to decide whether to use
sched_yield() or pl_cpu_relax() to relax in loops, this is hardly
exportable. Let's move this to ha_thread_relax() instead and une
this one only.
2019-05-17 17:16:20 +02:00
Willy Tarreau
20db9115dc BUG/MINOR: debug: don't check the call date on tasklets
tasklets don't have a call date, so when a tasklet is cast into a task
and is present at the end of a page we run a risk of dereferencing
unmapped memory when dumping them in ha_task_dump(). This commit
simplifies the test and uses to distinct calls for tasklets and tasks.
No backport is needed.
2019-05-17 17:16:20 +02:00
Willy Tarreau
5cf64dd1bd MINOR: debug: make ha_thread_dump() and ha_task_dump() take a buffer
Instead of having them dump into the trash and initialize it, let's have
the caller initialize a buffer and pass it. This will be convenient to
dump multiple threads at once into a single buffer.
2019-05-17 17:16:20 +02:00
Willy Tarreau
14a1ab75d0 BUG/MINOR: debug: make ha_task_dump() actually dump the requested task
It used to only dump the current task, which isn't different for now
but the purpose clearly is to dump the requested task. No backport is
needed.
2019-05-17 17:16:20 +02:00
Willy Tarreau
231ec395c1 BUG/MINOR: debug: make ha_task_dump() always check the task before dumping it
For now it cannot happen since we're calling it from a task but it will
break with signals. No backport is needed.
2019-05-17 17:16:20 +02:00
Olivier Houchard
6db1699f77 BUG/MEDIUM: streams: Try to L7 retry before aborting the connection.
In htx_wait_for_response, in case of error, attempt a L7 retry before
aborting the connection if the TX_NOT_FIRST flag is set.
If we don't do that, then we wouldn't attempt L7 retries after the first
request, or if we use HTTP/2, as with HTTP/2 that flag is always set.
2019-05-17 15:49:21 +02:00
Olivier Houchard
ce1a0292bf BUG/MEDIUM: streams: Don't use CF_EOI to decide if the request is complete.
In si_cs_send(), don't check CF_EOI on the request channel to decide if the
request is complete and if we should save the buffer to eventually attempt
L7 retries. The flag may not be set yet, and it may too be set to early,
before we're done modifying the buffer. Instead, get the msg, and make sure
its state is HTTP_MSG_DONE.
That way we will store the request buffer when sending it even in H2.
2019-05-17 15:49:21 +02:00
Willy Tarreau
4e2b646d60 MINOR: cli/debug: add a thread dump function
The new function ha_thread_dump() will dump debugging info about all known
threads. The current thread will contain a bit more info. The long-term goal
is to make it possible to use it in signal handlers to improve the accuracy
of some dumps.

The function dumps its output into the trash so as it was trivial to add,
a new "show threads" command appeared on the CLI.
2019-05-16 18:06:45 +02:00
Willy Tarreau
58d9621fc8 MINOR: cli/activity: show the dumping thread ID starting at 1
Both the config and gdb report thread IDs starting at 1, so better do the
same in "show activity" to limit confusion. We also display the full
permitted range.

This could be backported to 1.9 since it was present there.
2019-05-16 18:02:03 +02:00
Tim Duesterhus
3506dae342 MEDIUM: Make 'resolution_pool_size' directive fatal
This directive never appeared in a stable release and instead was
introduced and deprecated within 1.8-dev. While it technically could
be outright removed we detect it and error out for good measure.
2019-05-16 18:02:03 +02:00
Tim Duesterhus
10c6c16cde MEDIUM: Make 'option forceclose' actually warn
It is deprecated since 315b39c391 (1.9-dev),
but only was deprecated in the docs.

Make it warn when being used and remove it from the docs.
2019-05-16 18:02:03 +02:00
Christopher Faulet
c1f40dd492 BUG/MINOR: http_fetch: Rely on the smp direction for "cookie()" and "hdr()"
A regression was introduced in the commit 89dc49935 ("BUG/MAJOR: http_fetch: Get
the channel depending on the keyword used") on the samples "cookie()" and
"hdr()". Unlike other samples manipulating the HTTP headers, these ones depend
on the sample direction. To fix the bug, these samples use now their own
functions. Depending on the sample direction, they call smp_fetch_cookie() and
smp_fetch_hdr() with the appropriate keyword.

Thanks to Yves Lafon to report this issue.

This patch must be backported wherever the commit 89dc49935 was backported. For
now, 1.9 and 1.8.
2019-05-16 11:31:28 +02:00
Olivier Houchard
35d116885d MINOR: connections: Use BUG_ON() to enforce rules in subscribe/unsubscribe.
It is not legal to subscribe if we're already subscribed, or to unsubscribe
if we did not subscribe, so instead of trying to handle those cases, just
assert that it's ok using the new BUG_ON() macro.
2019-05-14 18:18:25 +02:00
Olivier Houchard
00b8f7c60b MINOR: h1: Use BUG_ON() to enforce rules in subscribe/unsubscribe.
It is not legal to subscribe if we're already subscribed, or to unsubscribe
if we did not subscribe, so instead of trying to handle those cases, just
assert that it's ok using the new BUG_ON() macro.
2019-05-14 18:18:25 +02:00
Olivier Houchard
f8338151a3 MINOR: h2: Use BUG_ON() to enforce rules in subscribe/unsubscribe.
It is not legal to subscribe if we're already subscribed, or to unsubscribe
if we did not subscribe, so instead of trying to handle those cases, just
assert that it's ok using the new BUG_ON() macro.
2019-05-14 18:18:25 +02:00
Christopher Faulet
fa922f03a3 BUG/MEDIUM: mux-h2: Set EOI on the conn_stream during h2_rcv_buf()
Just like CS_FL_REOS previously, the CS_FL_EOI flag is abused as a proxy
for H2_SF_ES_RCVD. The problem is that this flag is consumed by the
application layer and is set immediately when an end of stream was met,
which is too early since the application must retrieve the rxbuf's
contents first. The effect is that some transfers are truncated (mostly
the first one of a connection in most tests).

The problem of mixing CS flags and H2S flags in the H2 mux is not new
(and is currently being addressed) but this specific one was emphasized
in commit 63768a63d ("MEDIUM: mux-h2: Don't mix the end of the message
with the end of stream") which was backported to 1.9. Note that other
flags, particularly CS_FL_REOS still need to be asynchronously reported,
though their impact seems more limited for now.

This patch makes sure that all internal uses of CS_FL_EOI are replaced
with a test on H2_SF_ES_RCVD (as there is a 1-to-1 equivalence) and that
CS_FL_EOI is only reported once the rxbuf is empty.

This should ideally be backported to 1.9 unless it causes too much
trouble due to the recent changes in this area, as 1.9 *seems* not
to be directly affected by this bug.
2019-05-14 15:47:57 +02:00
Willy Tarreau
99ad1b3e8c MINOR: mux-h2: stop relying on CS_FL_REOS
This flag was introduced early in 1.9 development (a3f7efe00) to report
the fact that the rxbuf that was present on the conn_stream was followed
by a shutr. Since then the rxbuf moved from the conn_stream to the h2s
(638b799b0) but the flag remained on the conn_stream. It is problematic
because some state transitions inside the mux depend on it, thus depend
on the CS, and as such have to test for its existence before proceeding.

This patch replaces the test on CS_FL_REOS with a test on the only
states that set this flag (H2_SS_CLOSED, H2_SS_HREM, H2_SS_ERROR).
The few places where the flag was set were removed (the flag is not
used by the data layer).
2019-05-14 15:47:57 +02:00
Willy Tarreau
4c688eb8d1 MINOR: mux-h2: add macros to check multiple stream states at once
At many places we need to test for several stream states at once, let's
have macros to make a bit mask from a state to ease this.
2019-05-14 15:47:57 +02:00
Willy Tarreau
f8fe3d63f0 CLEANUP: mux-h2: don't test for impossible CS_FL_REOS conditions
This flag is currently set when an incoming close was received, which
results in the stream being in either H2_SS_HREM, H2_SS_CLOSED, or
H2_SS_ERROR states, so let's remove the test for the OPEN and HLOC
cases.
2019-05-14 15:47:57 +02:00
Willy Tarreau
3cf69fe6b2 BUG/MINOR: mux-h2: make sure to honor KILL_CONN in do_shut{r,w}
If the stream closes and quits while there's no room in the mux buffer
to send an RST frame, next time it is attempted it will not lead to
the connection being closed because the conn_stream will have been
released and the KILL_CONN flag with it as well.

This patch reserves a new H2_SF_KILL_CONN flag that is copied from
the CS when calling shut{r,w} so that the stream remains autonomous
on this even when the conn_stream leaves.

This should ideally be backported to 1.9 though it depends on several
previous patches that may or may not be suitable for backporting. The
severity is very low so there's no need to insist in case of trouble.
2019-05-14 15:47:57 +02:00
Willy Tarreau
aebbe5ef72 MINOR: mux-h2: make h2s_wake_one_stream() not depend on temporary CS flags
In h2s_wake_one_stream() we used to rely on the temporary flags used to
adjust the CS to determine the new h2s state. This really is not convenient
and creates far too many dependencies. This commit just moves the same
condition to the places where the temporary flags were set so that we
don't have to rely on these anymore. Whether these are relevant or not
was not the subject of the operation, what matters was to make sure the
conditions to adjust the stream's state and the CS's flags remain the
same. Later it could be studied if these conditions are correct or not.
2019-05-14 15:47:57 +02:00
Willy Tarreau
13b6c2e8b3 MINOR: mux-h2: make h2s_wake_one_stream() the only function to deal with CS
h2s_wake_one_stream() has access to all the required elements to update
the connstream's flags and figure the necessary state transitions, so
let's move the conditions there from h2_wake_some_streams().
2019-05-14 15:47:57 +02:00
Willy Tarreau
234829111f MINOR: mux-h2: make h2_wake_some_streams() not depend on the CS flags
It's problematic to have to pass some CS flags to this function because
that forces some h2s state transistions to update them just in time
while some of them are supposed to only be updated during I/O operations.

As a first step this patch transfers the decision to pass CS_FL_ERR_PENDING
from the caller to the leaf function h2s_wake_one_stream(). It is easy
since this is the only flag passed there and it depends on the position of
the stream relative to the last_sid if it was set.
2019-05-14 15:47:57 +02:00
Willy Tarreau
c3b1183f57 MINOR: mux-h2: remove useless test on stream ID vs last in wake function
h2_wake_some_streams() first looks up streams whose IDs are greater than
or equal to last+1, then checks if the id is lower than or equal to last,
which by definition will never match. Let's remove this confusing leftover
from ancient code.
2019-05-14 15:47:57 +02:00
William Lallemand
920fc8bbe4 BUG/MINOR: mworker: use after free when the PID not assigned
Commit 4528611 ("MEDIUM: mworker: store the leaving state of a process")
introduced a bug in the mworker_env_to_proc_list() function.

This is very unlikely to occur since the PID should always be assigned.
It can probably happen if the environment variable is corrupted.

No backport needed.
2019-05-14 11:28:16 +02:00
Willy Tarreau
f983d00a1c BUG/MINOR: mux-h2: make the do_shut{r,w} functions more robust against retries
These functions may fail to emit an RST or an empty DATA frame because
the mux is full or busy. Then they subscribe the h2s and try again.
However when doing so, they will already have marked the error state on
the stream and will not pass anymore through the sequence resulting in
the failed frame to be attempted to be sent again nor to the close to
be done, instead they will return a success.

It is important to only leave when the stream is already closed, but
to go through the whole sequence otherwise.

This patch should ideally be backported to 1.9 though it's possible that
the lack of the WANT_SHUT* flags makes this difficult or dangerous. The
severity is low enough to avoid this in case of trouble.
2019-05-14 11:13:06 +02:00
Frédéric Lécaille
90a10aeb65 BUG/MINOR: log: Wrong log format initialization.
This patch fixes an issue introduced by 0bad840b commit
"MINOR: log: Extract some code to send syslog messages" which leaded
to wrong log format variable initializations at least for "short" and "raw" format.
This commit skipped the cases where even if passed to __do_send_log(), the
syslog tag and syslog pid string must not be used to format the log message
with "short" and "raw". This is done iniatilizing "tag_max" and "pid_max"
variables (the lengths of the tag and pid strings) to 0, then updating to them to
the length of the tag and pid strings passed as variables to __do_send_log()
depending on the log format and in every cases using this length for the iovec
variable used to send() the log.

This bug is specific to 2.0.
2019-05-14 11:12:00 +02:00
Willy Tarreau
8bdb5c9bb4 CLEANUP: connection: remove the handle field from the wait_event struct
It was only set and not consumed after the previous change. The reason
is that the task's context always contains the relevant information,
so there is no need for a second pointer.
2019-05-13 19:14:52 +02:00
Willy Tarreau
88bdba31fa CLEANUP: mux-h2: simply use h2s->flags instead of ret in h2_deferred_shut()
This one used to rely on the combined return statuses of the shutr/w
functions but now that we have the H2_SF_WANT_SHUT{R,W} flags we don't
need this anymore if we properly remove these flags after their operations
succeed. This is what this patch does.
2019-05-13 19:14:52 +02:00
Willy Tarreau
2c249ebc75 MINOR: mux-h2: add two H2S flags to report the need for shutr/shutw
Currently when a shutr/shutw fails due to lack of buffer space, we abuse
the wait_event's handle pointer to place up to two bits there in addition
to the original pointer. This pointer is not used for anything but this
and overall the intent becomes clearer with h2s flags than with these
two alien bits in the pointer, so let's use clean flags now.
2019-05-13 19:14:52 +02:00
Willy Tarreau
c234ae38f8 CLEANUP: mux-h2: use LIST_ADDED() instead of LIST_ISEMPTY() where relevant
Lots of places were using LIST_ISEMPTY() to detect if a stream belongs
to one of the send lists or to detect if a connection was already
waiting for a buffer or attached to an idle list. Since these ones are
not list heads but list elements, let's use LIST_ADDED() instead.
2019-05-13 19:14:52 +02:00
William Lallemand
7e1770b151 BUG/MAJOR: ssl: segfault upon an heartbeat request
7b5fd1e ("MEDIUM: connections: Move some fields from struct connection
to ssl_sock_ctx.") introduced a bug in the heartbleed mitigation code.

Indeed the code used conn->ctx instead of conn->xprt_ctx for the ssl
context, resulting in a null dereference.
2019-05-13 16:03:44 +02:00
Tim Duesterhus
a6cc7e872a BUG/MINOR: vars: Fix memory leak in vars_check_arg
vars_check_arg previously leaked the string containing the variable
name:

Consider this config:

    frontend fe1
        mode http
        bind :8080
        http-request set-header X %[var(txn.host)]

Starting HAProxy and immediately stopping it by sending a SIGINT makes
Valgrind report this leak:

    ==7795== 9 bytes in 1 blocks are definitely lost in loss record 15 of 71
    ==7795==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==7795==    by 0x4AA2AD: my_strndup (standard.c:2227)
    ==7795==    by 0x51FCC5: make_arg_list (arg.c:146)
    ==7795==    by 0x4CF095: sample_parse_expr (sample.c:897)
    ==7795==    by 0x4BA7D7: add_sample_to_logformat_list (log.c:495)
    ==7795==    by 0x4BBB62: parse_logformat_string (log.c:688)
    ==7795==    by 0x4E70A9: parse_http_req_cond (http_rules.c:239)
    ==7795==    by 0x41CD7B: cfg_parse_listen (cfgparse-listen.c:1466)
    ==7795==    by 0x480383: readcfgfile (cfgparse.c:2089)
    ==7795==    by 0x47A081: init (haproxy.c:1581)
    ==7795==    by 0x4049F2: main (haproxy.c:2591)

This leak can be detected even in HAProxy 1.6, this patch thus should
be backported to all supported branches

[Cf: This fix was reverted because the chunk's area was inconditionnaly
     released, making haproxy to crash when spoe was enabled. Now the chunk is
     released by calling chunk_destroy(). This function takes care of the
     chunk's size to release it or not. It is the responsibility of callers to
     set or not the chunk's size.]
2019-05-13 11:09:12 +02:00
Christopher Faulet
bf9bcb0a00 MINOR: spoe: Set the argument chunk size to 0 when SPOE variables are checked
When SPOE variables are registered during HAProxy startup, the argument used to
call the function vars_check_arg() uses the trash area. To be sure it is never
released by the callee function, the size of the internal chunk (arg.data.str)
is set to 0. It is important to do so because, to fix a memory leak, this buffer
must be released by the function vars_check_arg().

This patch must be backported to 1.9.
2019-05-13 11:07:00 +02:00
Willy Tarreau
ce9bbf523c BUG/MINOR: htx: make sure to always initialize the HTTP method when parsing a buffer
smp_prefetch_htx() is used when trying to access the contents of an HTTP
buffer from the TCP rulesets. The method was not properly set in this
case, which will cause the sample fetch methods relying on the method
to randomly fail in this case.

Thanks to Tim Düsterhus for reporting this issue (#97).

This fix must be backported to 1.9.
2019-05-13 10:10:44 +02:00
Tim Duesterhus
04bcaa1f9f BUG/MINOR: peers: Fix memory leak in cfg_parse_peers
cfg_parse_peers previously leaked the contents of the `kws` string,
as it was unconditionally filled using bind_dump_kws, but only used
(and freed) within the error case.

Move the dumping into the error case to:
1. Ensure that the registered keywords are actually printed as least once.
2. The contents of kws are not leaked.

This move allows to narrow the scope of `kws`, so this is done as well.

This bug was found using valgrind:

    ==28217== 590 bytes in 1 blocks are definitely lost in loss record 51 of 71
    ==28217==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==28217==    by 0x4AD4C7: indent_msg (standard.c:3676)
    ==28217==    by 0x47E962: cfg_parse_peers (cfgparse.c:700)
    ==28217==    by 0x480273: readcfgfile (cfgparse.c:2147)
    ==28217==    by 0x479D51: init (haproxy.c:1585)
    ==28217==    by 0x404A02: main (haproxy.c:2585)

with this super simple configuration:

    peers peers
    	bind :8081
    	server A

This bug exists since the introduction of cfg_parse_peers in commit
355b2033ec (which was introduced for HAProxy
2.0, but marked as backportable). It should be backported to all branches
containing that commit.
2019-05-13 10:10:01 +02:00
Willy Tarreau
f7b0523425 Revert "BUG/MINOR: vars: Fix memory leak in vars_check_arg"
This reverts commit 6ea00195c4.

As found by Christopher, this fix is not correct due to the way args
are built at various places. For example some config or runtime parsers
will place a substring pointer there, and calling free() on it will
immediately crash the program. A quick audit of the code shows that
there are not that many users, but the way it's done requires to
properly set the string as a regular chunk (size=0 if free not desired,
then call chunk_destroy() at release time), and given that the size is
currently set to len+1 in all parsers, a deeper audit needs to be done
to figure the impacts of not setting it anymore.

Thus for now better leave this harmless leak which impacts only the
config parsing time.

This fix must be backported to all branches containing the fix above.
2019-05-13 10:10:01 +02:00
Willy Tarreau
4087346dab BUG/MAJOR: mux-h2: do not add a stream twice to the send list
In this long thread, Maciej Zdeb reported that the H2 mux was still
going through endless loops from time to time :

  https://www.mail-archive.com/haproxy@formilux.org/msg33709.html

What happens is the following :
- in h2s_frt_make_resp_data() we can set H2_SF_BLK_SFCTL and remove the
  stream from the send_list
- then in h2_shutr() and h2_shutw(), we check if the list is empty before
  subscribing the element, which is true after the case above
- then in h2c_update_all_ws() we still have H2_SF_BLK_SFCTL with the item
  in the send_list, thus LIST_ADDQ() adds it a second time.

This patch adds a check of list emptiness before performing the LIST_ADDQ()
when the flow control window opens. Maciej reported that it reliably fixed
the problem for him.

As later discussed with Olivier, this fixes the consequence of the issue
rather than its cause. The root cause is that a stream should never be in
the send_list with a blocking flag set and the various places that can lead
to this situation must be revisited. Thus another fix is expected soon for
this issue, which will require some observation. In the mean time this one
is easy enough to validate and to backport.

Many thanks to Maciej for testing several versions of the patch, each
time providing detailed traces which allowed to nail the problem down.

This patch must be backported to 1.9.
2019-05-13 08:15:10 +02:00
Willy Tarreau
6a38b3297c BUILD: threads: fix again the __ha_cas_dw() definition
This low-level asm implementation of a double CAS was implemented only
for certain architectures (x86_64, armv7, armv8). When threads are not
used, they were not defined, but since they were called directly from
a few locations, they were causing build issues on certain platforms
with threads disabled. This was addressed in commit f4436e1 ("BUILD:
threads: Add __ha_cas_dw fallback for single threaded builds") by
making it fall back to HA_ATOMIC_CAS() when threads are not defined,
but this actually made the situation worse by breaking other cases.

This patch fixes this by creating a high-level macro HA_ATOMIC_DWCAS()
which is similar to HA_ATOMIC_CAS() except that it's intended to work
on a double word, and which rely on the asm implementations when threads
are in use, and uses its own open-coded implementation when threads are
not used. The 3 call places relying on __ha_cas_dw() were updated to
use HA_ATOMIC_DWCAS() instead.

This change was tested on i586, x86_64, armv7, armv8 with and without
threads with gcc 4.7, armv8 with gcc 5.4 with and without threads, as
well as i586 with gcc-3.4 without threads. It will need to be backported
to 1.9 along with the fix above to fix build on armv7 with threads
disabled.
2019-05-11 18:13:29 +02:00
Willy Tarreau
295d614de1 CLEANUP: ssl: move all BIO_* definitions to openssl-compat
The following macros are now defined for openssl < 1.1 so that we
can remove the code performing direct access to the structures :

  BIO_get_data(), BIO_set_data(), BIO_set_init(), BIO_meth_free(),
  BIO_meth_new(), BIO_meth_set_gets(), BIO_meth_set_puts(),
  BIO_meth_set_read(), BIO_meth_set_write(), BIO_meth_set_create(),
  BIO_meth_set_ctrl(), BIO_meth_set_destroy()
2019-05-11 17:39:08 +02:00
Willy Tarreau
11b167167e CLEANUP: ssl: remove ifdef around SSL_CTX_get_extra_chain_certs()
Instead define this one in openssl-compat.h when
SSL_CTRL_GET_EXTRA_CHAIN_CERTS is not defined (which was the current
condition used in the ifdef).
2019-05-11 17:38:21 +02:00
Willy Tarreau
366a6987a7 CLEANUP: ssl: move the SSL_OP_* and SSL_MODE_* definitions to openssl-compat
These ones were defined in the middle of ssl_sock.c, better move them
to the include file to find them.
2019-05-11 17:37:44 +02:00
Tim Duesterhus
6ea00195c4 BUG/MINOR: vars: Fix memory leak in vars_check_arg
vars_check_arg previously leaked the string containing the variable
name:

Consider this config:

    frontend fe1
        mode http
        bind :8080
        http-request set-header X %[var(txn.host)]

Starting HAProxy and immediately stopping it by sending a SIGINT makes
Valgrind report this leak:

    ==7795== 9 bytes in 1 blocks are definitely lost in loss record 15 of 71
    ==7795==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==7795==    by 0x4AA2AD: my_strndup (standard.c:2227)
    ==7795==    by 0x51FCC5: make_arg_list (arg.c:146)
    ==7795==    by 0x4CF095: sample_parse_expr (sample.c:897)
    ==7795==    by 0x4BA7D7: add_sample_to_logformat_list (log.c:495)
    ==7795==    by 0x4BBB62: parse_logformat_string (log.c:688)
    ==7795==    by 0x4E70A9: parse_http_req_cond (http_rules.c:239)
    ==7795==    by 0x41CD7B: cfg_parse_listen (cfgparse-listen.c:1466)
    ==7795==    by 0x480383: readcfgfile (cfgparse.c:2089)
    ==7795==    by 0x47A081: init (haproxy.c:1581)
    ==7795==    by 0x4049F2: main (haproxy.c:2591)

This leak can be detected even in HAProxy 1.6, this patch thus should
be backported to all supported branches.
2019-05-11 06:00:50 +02:00
Olivier Houchard
ddf0e03585 MINOR: streams: Introduce a new retry-on keyword, all-retryable-errors.
Add a new retry-on keyword, "all-retryable-errors", that activates retry
for all errors that are considered retryable.
This currently activates retry for "conn-failure", "empty-response",
"junk-respones", "response-timeout", "0rtt-rejected", "500", "502", "503" and
"504".
2019-05-10 18:05:35 +02:00
Olivier Houchard
602bf7d2ea MEDIUM: streams: Add a new http action, disable-l7-retry.
Add a new action for http-request, disable-l7-retry, that can be used to
disable any attempt at retry requests (see retry-on) if it fails for any
reason other than a connection failure.
This is useful for example to make sure POST requests aren't retried.
2019-05-10 17:49:09 +02:00
Olivier Houchard
ad26d8d820 BUG/MEDIUM: streams: Make sur SI_FL_L7_RETRY is set before attempting a retry.
In a few cases, we'd just check if the backend is configured to do retries,
and not if it's still allowed on the stream_interface.
The SI_FL_L7_RETRY flag could have been removed because we failed to allocate
a buffer, or because the request was too big to fit in a single buffer,
so make sure it's there before attempting a retry.
2019-05-10 17:48:59 +02:00
Olivier Houchard
bfe2a83c24 BUG/MEDIUM: h2: Don't check send_wait to know if we're in the send_list.
When we have to stop sending due to the stream flow control, don't check
if send_wait is NULL to know if we're in the send_list, because at this
point it'll always be NULL, while we're probably in the list.
Use LIST_ISEMPTY(&h2s->list) instead.
Failing to do so mean we might be added in the send_list when flow control
allows us to emit again, while we're already in it.
While I'm here, replace LIST_DEL + LIST_INIT by LIST_DEL_INIT.

This should be backported to 1.9.
2019-05-10 15:06:54 +02:00
Christopher Faulet
132f7b496c BUG/MEDIUM: http: Use pointer to the begining of input to parse message headers
In the legacy HTTP, when the message headers are parsed, in http_msg_analyzer(),
we must use the begining of input and not the head of the buffer. Most of time,
it will be the same pointers because there is no outgoing data when a new
message is received. But when a 1xx informational response is parsed, it is
forwarded and the parsing restarts immediatly. In this case, we have outgoing
data when the next response is parsed.

This patch must be backported to 1.9.
2019-05-10 11:47:00 +02:00
Christopher Faulet
7a3367cca0 BUG/MINOR: stream: Attach the read side on the response as soon as possible
A backend stream-interface attached to a reused connection remains in the state
SI_ST_CONN until some data are sent to validate the connection. But when the
url_param algorithm is used to balance connections, no data are sent while the
connection is not established. So it is a chicken and egg situation.

To solve the problem, if no error is detected and when the request channel is
waiting for the connect(), we mark the read side as attached on the response
channel as soon as possible and we wake the request channel up once. This
happens in 2 places. The first one is right after the connect(), when the
stream-interface is still in state SI_ST_CON, in the function
sess_update_st_con_tcp(). The second one is when an applet is used instead of a
real connection to a server, in the function sess_prepare_conn_req(). In fact,
it is done when the backend stream-interface is set to the state SI_ST_EST.

This patch must be backported to 1.9.
2019-05-10 11:47:00 +02:00
Willy Tarreau
c125cef6da CLEANUP: ssl: make inclusion of openssl headers safe
It's always a pain to have to stuff lots of #ifdef USE_OPENSSL around
ssl headers, it even results in some of them appearing in a random order
and multiple times just to benefit form an existing ifdef block. Let's
make these headers safe for inclusion when USE_OPENSSL is not defined,
they now perform the test themselves and do nothing if USE_OPENSSL is
not defined. This allows to remove no less than 8 such ifdef blocks
and make include blocks more readable.
2019-05-10 09:58:43 +02:00
Willy Tarreau
8d164dc568 CLEANUP: ssl: never include openssl/*.h outside of openssl-compat.h anymore
Since we're providing a compatibility layer for multiple OpenSSL
implementations and their derivatives, it is important that no C file
directly includes openssl headers but only passes via openssl-compat
instead. As a bonus this also gets rid of redundant complex rules for
inclusion of certain files (engines etc).
2019-05-10 09:36:42 +02:00
Willy Tarreau
9356dacd22 REORG: ssl: move some OpenSSL defines from ssl_sock to openssl-compat
Some defines like OPENSSL_VERSION or X509_getm_notBefore() have nothing
to do in ssl_sock and must move to openssl-compat.h so that they are
consistently shared by the whole code. A warning in the code was added
against wild additions of macros there.
2019-05-10 09:31:06 +02:00
Willy Tarreau
5599456ee2 REORG: ssl: move openssl-compat from proto to common
This way we can include it much earlier to cover types/ as well.
2019-05-10 09:19:50 +02:00
Willy Tarreau
df17e0e1a7 BUILD: ssl: fix libressl build again after aes-gcm-enc
Enabling aes-gcm-enc in last commit (MINOR: ssl: enable aes_gcm_dec
on LibreSSL) uncovered a wrong condition on the define of the
EVP_CTRL_AEAD_SET_IVLEN macro which I forgot to add when making the
commit, resulting in breaking libressl build again. In case libressl
later defines this macro, the test will have to change for a version
range instead.
2019-05-10 09:19:07 +02:00
Willy Tarreau
86a394e44d MINOR: ssl: enable aes_gcm_dec on LibreSSL
This one requires OpenSSL 1.0.1 and above, and libressl was forked from
1.0.1g and is compatible (build-tested). No need to exclude it anymore
from using this converter.
2019-05-09 14:26:40 +02:00
Willy Tarreau
5db847ab65 CLEANUP: ssl: remove 57 occurrences of useless tests on LIBRESSL_VERSION_NUMBER
They were all check to comply with the advertised openssl version. Now
that libressl doesn't pretend to be a more recent openssl anymore, we
can simply rely on the regular openssl version tests without having to
deal with exceptions for libressl.
2019-05-09 14:26:39 +02:00
Willy Tarreau
1d158ab12d BUILD: ssl: make libressl use its own version numbers
LibreSSL causes lots of build issues by pretending to be OpenSSL 2.0.0,
and it requires lots of care for each #if added to cover any specific
OpenSSL features.

This commit addresses the problem by making LibreSSL only advertise the
version it forked from (1.0.1g) and by starting to use tests based on
its real version to enable features instead of working by exclusion.
2019-05-09 14:25:47 +02:00
Willy Tarreau
9a1ab08160 CLEANUP: ssl-sock: use HA_OPENSSL_VERSION_NUMBER instead of OPENSSL_VERSION_NUMBER
Most tests on OPENSSL_VERSION_NUMBER have become complex and break all
the time because this number is fake for some derivatives like LibreSSL.
This patch creates a new macro, HA_OPENSSL_VERSION_NUMBER, which will
carry the real openssl version defining the compatibility level, and
this version will be adjusted depending on the variants.
2019-05-09 14:25:43 +02:00
Willy Tarreau
affd1b980a BUILD: ssl: fix again a libressl build failure after the openssl FD leak fix
As with every single OpenSSL fix, LibreSSL build broke again, this time
after commit 56996dabe ("BUG/MINOR: mworker/ssl: close OpenSSL FDs on
reload"). A definitive solution will have to be found quickly. For now,
let's exclude libressl from the version test.

This patch must be backported to 1.9 since the fix above was already
backported there.
2019-05-09 13:55:33 +02:00
Olivier Houchard
d9986ed51e BUG/MEDIUM: h2: Make sure we set send_list to NULL in h2_detach().
In h2_detach(), if we still have a send_wait pointer, because we woke the
tasklet up, but it hasn't ran yet, explicitely set send_wait to NULL after
we removed the tasklet from the task list.
Failure to do so may lead to crashes if the h2s isn't immediately destroyed,
because we considered there were still something to send.

This should be backported to 1.9.
2019-05-09 13:26:48 +02:00
Christopher Faulet
6f3cb1801b MINOR: htx: Remove support for unused OOB HTX blocks
This type of block was introduced in the early design of the HTX and it is not
used anymore. So, just remove it.

This patch may be backported to 1.9.
2019-05-07 22:16:41 +02:00
Christopher Faulet
6177509eb7 MINOR: htx: Don't try to append a trailer block with the previous one
In H1 and H2, one and only one trailer block is emitted during the HTTP
parsing. So it is useless to try to append this block with the previous one,
like for data block.

This patch may be backported to 1.9.
2019-05-07 22:16:41 +02:00
Christopher Faulet
bc5770b91e MINOR: htx: Split on DATA blocks only when blocks are moved to an HTX message
When htx_xfer_blks() is called to move blocks from an HTX message to another
one, most of blocks must be transferred atomically. But some may be splitted if
there is not enough space to move all the block. This was true for DATA and TLR
blocks. But it is a bad idea to split trailers. During HTTP parsing, only one
TLR block is emitted. It simplifies the processing of trailers to keep the block
untouched.

This patch must be backported to 1.9 because some fixes may depend on it.
2019-05-07 22:16:41 +02:00
Christopher Faulet
cc5060217e BUG/MINOR: htx: Never transfer more than expected in htx_xfer_blks()
When the maximum free space available for data in the HTX message is compared to
the number of bytes to transfer, we must take into account the amount of data
already transferred. Otherwise we may move more data than expected.

This patch must be backported to 1.9.
2019-05-07 22:16:41 +02:00
Christopher Faulet
39593e6ae3 BUG/MINOR: mux-h1: Fix the parsing of trailers
Unlike other H1 parsing functions, the 3rd parameter of the function
h1_measure_trailers() is the maximum number of bytes to read. For others
functions, it is the relative offset where to stop the parsing.

This patch must be backported to 1.9.
2019-05-07 22:16:41 +02:00
Christopher Faulet
3b1d004d41 BUG/MEDIUM: spoe: Be sure the sample is found before setting its context
When a sample fetch is encoded, we use its context to set info about the
fragmentation. But if the sample is not found, the function sample_process()
returns NULL. So we me be sure the sample exists before setting its context.

This patch must be backported to 1.9 and 1.8.
2019-05-07 22:16:41 +02:00
Willy Tarreau
201fe40653 BUG/MINOR: mux-h2: fix the condition to close a cs-less h2s on the backend
A typo was introduced in the following commit : 927b88ba0 ("BUG/MAJOR:
mux-h2: fix race condition between close on both ends") making the test
on h2s->cs never being done and h2c->cs being dereferenced without being
tested. This also confirms that this condition does not happen on this
side but better fix it right now to be safe.

This must be backported to 1.9.
2019-05-07 19:17:50 +02:00
William Lallemand
27edc4b915 MINOR: mworker: support a configurable maximum number of reloads
This patch implements a new global parameter for the master-worker mode.
When setting the mworker-max-reloads value, a worker receive a SIGTERM
if its number of reloads is greater than this value.
2019-05-07 19:09:01 +02:00
Willy Tarreau
f656279347 CLEANUP: task: remove unneeded tests before task_destroy()
Since previous commit it's not needed anymore to test a task pointer
before calling task_destory() so let's just remove these tests from
the various callers before they become confusing. The function's
arguments were also documented. The same should probably be done
with tasklet_free() which involves a test in roughly half of the
call places.
2019-05-07 19:08:16 +02:00
Dragan Dosen
7d61a33921 BUG/MEDIUM: stick-table: fix regression caused by a change in proxy struct
In commit 1b8e68e ("MEDIUM: stick-table: Stop handling stick-tables as
proxies."), the ->table member of proxy struct was replaced by a pointer
that is not always checked and in some situations can cause a segfault,
eg. during reload or while using "show table" on CLI socket.

No backport is needed.
2019-05-07 14:56:59 +02:00
Rob Allen
56996dabe6 BUG/MINOR: mworker/ssl: close OpenSSL FDs on reload
From OpenSSL 1.1.1, the default behaviour is to maintain open FDs to any
random devices that get used by the random number library. As a result,
those FDs leak when the master re-execs on reload; since those FDs are
not marked FD_CLOEXEC or O_CLOEXEC, they also get inherited by children.
Eventually both master and children run out of FDs.

OpenSSL 1.1.1 introduces a new function to control whether the random
devices are kept open. When clearing the keep-open flag, it also closes
any currently open FDs, so it can be used to clean-up open FDs too.
Therefore, a call to this function is made in mworker_reload prior to
re-exec.

The call is guarded by whether SSL is in use, because it will cause
initialisation of the OpenSSL random number library if that has not
already been done.

This should be backported to 1.9 and 1.8.
2019-05-07 14:11:55 +02:00
Willy Tarreau
2135f91d18 BUG/MEDIUM: h2/htx: never leave a trailers block alone with no EOM block
If when receiving an H2 response we fail to add an EOM block after too
large a trailers block, we must not leave the trailers block alone as it
violates the internal assumptions by not being followed by an EOM, even
when an error is reported. We must then make sure the error will safely
be reported to upper layers and that no attempt will be made to forward
partial blocks.

This must be backported to 1.9.
2019-05-07 11:17:32 +02:00
Willy Tarreau
fb07b3f825 BUG/MEDIUM: mux-h2/htx: never wait for EOM when processing trailers
In message https://www.mail-archive.com/haproxy@formilux.org/msg33541.html
Patrick Hemmer reported an interesting bug affecting H2 and trailers.

The problem is that in order to close the stream we have to see the EOM
block, but nothing guarantees it will atomically be delivered with the
trailers block(s). So the code currently waits for it by returning zero
when it was not found, resulting in the caller (h2_snd_buf()) to loop
forever calling it again.

The current internal connection/connstream API doesn't allow a send
actor to notify its caller that it cannot process the data until it
gets more, so even returning zero will only lead to calls in loops
without any guarantee that any progress will be made.

Some late amendments to HTX already guaranteed the atomicity of the
trailers block during snd_buf(), which is currently ensured by the
fact that producers create exactly one such trailers block for all
trailers. So in practice we can only loop between trailers and EOM.

This patch changes the behaviour by making h2s_htx_make_trailers()
become atomic by not consuming the EOM block. This way either it finds
the end of trailers marker (empty line) or it fails. Once it sends the
trailers block, ES is set so the stream turns HLOC or CLOSED. Thanks
to previous patch "MEDIUM: mux-h2: discard contents that are to be sent
after a shutdown" is is now safe to interrupt outgoing data processing,
and the late EOM block will silently be discarded when the caller
finally sends it.

This is a bit tricky but should remain solid by design, and seems like
the only option we have that is compatible with 1.9, where it must be
backported along with the aforementioned patch.
2019-05-07 11:08:02 +02:00
Willy Tarreau
2b77848418 MEDIUM: mux-h2: discard contents that are to be sent after a shutdown
In h2_snd_buf() we discard any possible buffer contents requested to be
sent after a close or an error. But in practice we can extend this to
any case where the stream is locally half-closed since it means we will
never be able to send these data anymore.

For now it must not change anything, but it will be used by subsequent
patches to discard lone a HTX EOM block arriving after the trailers
block.
2019-05-07 11:08:02 +02:00
Willy Tarreau
aab1a60977 BUG/MEDIUM: h2/htx: always fail on too large trailers
In case a header frame carrying trailers just fits into the HTX buffer
but leaves no room for the EOM block, we used to return the same code
as the one indicating we're missing data. This could would result in
such frames causing timeouts instead of immediate clean aborts. Now
they are properly reported as stream errors (since the frame was
decoded and the compression context is still synchronized).

This must be backported to 1.9.
2019-05-07 11:08:02 +02:00
Willy Tarreau
5121e5d750 BUG/MINOR: mux-h2: rely on trailers output not input to turn them to empty data
When sending trailers, we may face an empty HTX trailers block or even
have to discard some of the headers there and be left with nothing to
send. RFC7540 forbids sending of empty HEADERS frames, so in this case
we turn to DATA frames (which is possible since after other DATA).

The code used to only check the input frame's contents to decide whether
or not to switch to a DATA frame, it didn't consider the possibility that
the frame only used to contain headers discarded later, thus it could still
emit an empty HEADERS frame in such a case. This patch makes sure that the
output frame size is checked instead to take the decision.

This patch must be backported to 1.9. In practice this situation is never
encountered since the discarded headers have really nothing to do in a
trailers block.
2019-05-07 11:07:59 +02:00
Dragan Dosen
2674303912 MEDIUM: regex: modify regex_comp() to atomically allocate/free the my_regex struct
Now we atomically allocate the my_regex struct within function
regex_comp() and compile the regex or free both in case of failure. The
pointer to the allocated my_regex struct is returned directly. The
my_regex* argument to regex_comp() is removed.

Function regex_free() was modified so that it systematically frees the
my_regex entry. The function does nothing when called with a NULL as
argument (like free()). It will avoid existing risk of not properly
freeing the initialized area.

Other structures are also updated in order to be compatible (the ones
related to Lua and action rules).
2019-05-07 06:58:15 +02:00
Frédéric Lécaille
7fcc24d4ef MINOR: peers: Do not emit global stick-table names.
This commit "MINOR: stick-table: Add prefixes to stick-table names"
prepended the "peers" section name to stick-table names declared in such "peers"
sections followed by a '/' character.  This is not this name which must be sent
over the network to avoid collisions with stick-table name declared as backends.
As the '/' character is forbidden as first character of a backend name, we prefix
the stick-table names declared in peers sections only with a '/' character.
With such declarations:

    peers mypeers
       table t1

	backend t1
	   stick-table ... peers mypeers

at peer protocol level, "t1" declared as stick-table in "mypeers" section is different
of "t1" stick-table declared as backend.

In src/peers.c, only two modifications were required: use ->nid stktable struct
member in place of ->id in peer_prepare_switchmsg() to prepare the stick-table
definition messages. Same thing in peer_treat_definemsg() to treat a stick-table
definition messages.
2019-05-07 06:54:07 +02:00
Frédéric Lécaille
c02766a267 MINOR: stick-table: Add prefixes to stick-table names.
With this patch we add a prefix to stick-table names declared in "peers" sections
concatenating the "peers" section name followed by a '/' character with
the stick-table name. Consequently, "peers" sections have their own
namespace for their stick-tables. Obviously, these stick-table names are not the
ones which should be sent over the network. So these configurations must be
compatible and should make A and B peers communicate with peers protocol:

    # haproxy A config, old way stick-table declerations
    peers mypeers
        peer A ...
        peer B ...

    backend t1
        stick-table type string size 10m store gpc0 peers mypeers

    # haproxy B config, new way stick-table declerations
    peers mypeers
        peer A ...
        peer B ...
        table t1 type string size store gpc0 10m

This "network" name is stored in ->nid new field of stktable struct. The "local"
stktable-name is still stored in ->id.
2019-05-07 06:54:07 +02:00
Frédéric Lécaille
015e4d7d93 MINOR: stick-tables: Add peers process binding computing.
Add a list of proxies for all the stick-tables (->proxies_list struct stktable
member) so that to be able to compute the process bindings of the peers after having
parsed the configuration file.
The proxies are added to the stick-tables they reference when parsing
stick-tables lines in proxy sections, when checking the actions in
check_trk_action() and when resolving samples args for stick-tables
without checking is they are duplicates. We check only there is no loop.
Then, after having parsed everything, we add the proxy bindings to the
peers frontend bindings with stick-tables they reference.
2019-05-07 06:54:07 +02:00
Frédéric Lécaille
1b8e68e89a MEDIUM: stick-table: Stop handling stick-tables as proxies.
This patch adds the support for the "table" line parsing in "peers" sections
to declare stick-table in such sections. This also prevents the user from having
to declare dummy backends sections with a unique stick-table inside.
Even if still supported, this usage will become deprecated.

To do so, the ->table member of proxy struct which is a stktable struct is replaced
by a pointer to a stktable struct allocated at parsing time in src/cfgparse-listen.c
for the dummy stick-table backends and in src/cfgparse.c for "peers" sections.
This has an impact on the code for stick-table sample converters and on the stickiness
rules parsers which first store the name of the dummy before resolving the rules.
This patch replaces proxy_tbl_by_name() calls by stktable_find_by_name() calls
to lookup for stick-tables stored in "stktable_by_name" ebtree at parsing time.
There is only one remaining place where proxy_tbl_by_name() is used: src/hlua.c.

At several places in the code we relied on the fact that ->size member of stick-table
was equal to zero to consider the stick-table was present by not configured,
this do not make sense anymore as ->table member of struct proxyis fow now on a pointer.
These tests are replaced by a test on ->table value itself.

In "peers" section we do not have to temporary store the name of the section the
stick-table are attached to because this name is obviously already known just after
having entered this "peers" section.

About the CLI stick-table I/O handler, the pointer to proxy struct is replaced by
a pointer to a stktable struct.
2019-05-07 06:54:06 +02:00
Frédéric Lécaille
d456aa4ac2 MINOR: config: Extract the code of "stick-table" line parsing.
With this patch we move the code responsible of parsing "stick-table"
lines to implement parse_stick_table() function in src/stick-tabble.c
so that to be able to parse "stick-table" elsewhere than in proxy sections.
We have have also added a conf struct to stktable struct to store the filename
and the line in the file the stick-table has been parsed to help in
diagnosing and displaying any configuration issue.
2019-05-07 06:54:06 +02:00
Willy Tarreau
034c88cf03 MEDIUM: tcp: add the "tfo" option to support TCP fastopen on the server
This implements support for the new API which relies on a call to
setsockopt().
On systems that support it (currently, only Linux >= 4.11), this enables
using TCP fast open when connecting to server.
Please note that you should use the retry-on "conn-failure", "empty-response"
and "response-timeout" keywords, or the request won't be able to be retried
on failure.

Co-authored-by: Olivier Houchard <ohouchard@haproxy.com>
2019-05-06 22:29:39 +02:00
Olivier Houchard
fdcb007ad8 MEDIUM: proto: Change the prototype of the connect() method.
The connect() method had 2 arguments, "data", that tells if there's pending
data to be sent, and "delack" that tells if we have to use a delayed ack
inconditionally, or if the backend is configured with tcp-smart-connect.
Turn that into one argument, "flags".
That way it'll be easier to provide more informations to connect() without
adding extra arguments.
2019-05-06 22:12:57 +02:00
Olivier Houchard
4cd2af4e5d BUG/MEDIUM: ssl: Don't attempt to use early data with libressl.
Libressl doesn't yet provide early data, so don't put the CO_FL_EARLY_SSL_HS
on the connection if we're building with libressl, or the handshake will
never be done.
2019-05-06 15:20:42 +02:00
Ilya Shipitsin
54832b97c6 BUILD: enable several LibreSSL hacks, including
SSL_SESSION_get0_id_context is introduced in LibreSSL-2.7.0
async operations are not supported by LibreSSL
early data is not supported by LibreSSL
packet_length is removed from SSL struct in LibreSSL
2019-05-06 07:26:24 +02:00
Tim Duesterhus
473c283d95 CLEANUP: Remove appsession documentation
I was about to partly revert 294d0f08b3,
because there were no 'X' for 'appsession' in the keyword matrix until
I checked the blame, realizing that the feature does not exist any more.

Clearly the documentation is confusing here, the removal note is only
listed *below* the old documentation and the supported sections still
show 'backend' and 'listen'.

It's been 3.5 years and 4 releases (1.6, 1.7, 1.8 and 1.9), I guess
this can be removed from the documentation of future versions.
2019-05-06 07:15:08 +02:00
Willy Tarreau
55e2f5ad14 BUG/MINOR: logs/threads: properly split the log area upon startup
If logs were emitted before creating the threads, then the dataptr pointer
keeps a copy of the end of the log header. Then after the threads are
created, the headers are reallocated for each thread. However the end
pointer was not reset until the end of the first second, which may result
in logs emitted by multiple threads during the first second to be mangled,
or possibly in some cases to use a memory area that was reused for something
else. The fix simply consists in reinitializing the end pointers immediately
when the threads are created.

This fix must be backported to 1.9 and 1.8.
2019-05-05 10:16:13 +02:00
Willy Tarreau
4fc49a9aab BUG/MEDIUM: checks: make sure the warmup task takes the server lock
The server warmup task is used when a server uses the "slowstart"
parameter. This task affects the server's weight and maxconn, and may
dequeue pending connections from the queue. This must be done under
the server's lock, which was not the case.

This must be backported to 1.9 and 1.8.
2019-05-05 06:54:22 +02:00
Willy Tarreau
223995e8ca BUG/MINOR: stream: also increment the retry stats counter on L7 retries
It happens that the retries stats use their own counter and are not
derived from the stream interface, so we need to update it as well
when performing an L7 retry.

No backport is needed.
2019-05-04 10:40:00 +02:00
Olivier Houchard
e3249a98e2 MEDIUM: streams: Add a new keyword for retry-on, "junk-response"
Add a way to retry requests if we got a junk response from the server, ie
an incomplete response, or something that is not valid HTTP.
To do so, one can use the new "junk-response" keyword for retry-on.
2019-05-04 10:20:24 +02:00
Olivier Houchard
865d8392bb MEDIUM: streams: Add a way to replay failed 0rtt requests.
Add a new keyword for retry-on, 0rtt-rejected. If set, we will try to
replay requests for which we sent early data that got rejected by the
server.
If that option is set, we will attempt to use 0rtt if "allow-0rtt" is set
on the server line even if the client didn't send early data.
2019-05-04 10:20:24 +02:00
Olivier Houchard
a254a37ad7 MEDIUM: streams: Add the ability to retry a request on L7 failure.
When running in HTX mode, if we sent the request, but failed to get the
answer, either because the server just closed its socket, we hit a server
timeout, or we get a 404, 408, 425, 500, 501, 502, 503 or 504 error,
attempt to retry the request, exactly as if we just failed to connect to
the server.

To do so, add a new backend keyword, "retry-on".

It accepts a list of keywords, which can be "none" (never retry),
"conn-failure" (we failed to connect, or to do the SSL handshake),
"empty-response" (the server closed the connection without answering),
"response-timeout" (we timed out while waiting for the server response),
or "404", "408", "425", "500", "501", "502", "503" and "504".

The default is "conn-failure".
2019-05-04 10:19:56 +02:00
Olivier Houchard
f4bda993dd BUG/MEDIUM: streams: Don't add CF_WRITE_ERROR if early data were rejected.
In sess_update_st_con_tcp(), if we have an error on the stream_interface
because we tried to send early_data but failed, don't flag the request
channel as CF_WRITE_ERROR, or we will never reach the analyser that sends
back the 425 response.

This should be backported to 1.9.
2019-05-03 22:23:41 +02:00
Olivier Houchard
010941f876 BUG/MEDIUM: ssl: Use the early_data API the right way.
We can only read early data if we're a server, and write if we're a client,
so don't attempt to mix both.

This should be backported to 1.8 and 1.9.
2019-05-03 21:00:10 +02:00
Willy Tarreau
c40efc1919 MINOR: init/threads: make the threads array global
Currently the thread array is a local variable inside a function block
and there is no access to it from outside, which often complicates
debugging. Let's make it global and export it. Also the allocation
return is now checked.
2019-05-03 10:16:30 +02:00
Willy Tarreau
b4f7cc3839 MINOR: init/threads: remove the useless tids[] array
It's still obscure how we managed to initialize an array of integers
with values always equal to the index, just to retrieve the value
from an opaque pointer to the index instead of directly using it! I
suspect it's a leftover from the very early threading experiments.

This commit gets rid of this and simply passes the thread ID as the
argument to run_thread_poll_loop(), thus significantly simplifying the
few call places and removing the need to allocate then free an array
of identity.
2019-05-03 09:59:15 +02:00
Willy Tarreau
81492c989c MINOR: threads: flatten the per-thread cpu-map
When we initially experimented with threads and processes support, we
needed to implement arrays of threads per process for cpu-map, but this
is not needed anymore since we support either threads or processes.
Let's simply make the thread-based cpu-map per thread and not per
thread and per process since that's not used anymore. Doing so reduces
the global struct from 33kB to 1.5kB.
2019-05-03 09:46:45 +02:00
Olivier Houchard
a48237fd07 BUG/MEDIUM: connections: Make sure we remove CO_FL_SESS_IDLE on disown.
When for some reason the session is not the owner of the connection anymore,
make sure we remove CO_FL_SESS_IDLE, even if we're about to call
conn->mux->destroy(), as the destroy may not destroy the connection
immediately if it's still in use.
This should be backported to 1.9.
u
2019-05-02 12:08:39 +02:00
Dragan Dosen
e99af978c8 BUG/MEDIUM: pattern: fix memory leak in regex pattern functions
The allocated regex is not freed properly and can cause a memory leak,
eg. when patterns are updated via CLI socket.

This patch should be backported to all supported versions.
2019-05-02 10:05:11 +02:00
Dragan Dosen
026ef570e1 BUG/MINOR: checks: free memory allocated for tasklets
The check->wait_list.task and agent->wait_list.task were not
freed properly on deinit().

This patch should be backported to 1.9.
2019-05-02 10:05:09 +02:00
Dragan Dosen
61302da0e7 BUG/MINOR: log: properly free memory on logformat parse error and deinit()
This patch may be backported to all supported versions.
2019-05-02 10:05:07 +02:00
Dragan Dosen
2a7c20f602 BUG/MINOR: haproxy: fix rule->file memory leak
When using the "use_backend" configuration directive, the configuration
file name stored as rule->file was not freed in some situations. This
was introduced in commit 4ed1c95 ("MINOR: http/conf: store the
use_backend configuration file and line for logs").

This patch should be backported to 1.9, 1.8 and 1.7.
2019-05-02 10:05:06 +02:00
Olivier Houchard
b51937ebaa BUG/MEDIUM: ssl: Don't pretend we can retry a recv/send if we got a shutr/w.
In ha_ssl_write() and ha_ssl_read(), don't pretend we can retry a read/write
if we got a shutr/shutw, or we will never properly shutdown the connection.
2019-05-01 17:37:33 +02:00
Ilya Shipitsin
0c50b1ecbb BUG/MEDIUM: servers: fix typo "src" instead of "srv"
When copying the settings for all servers when using server templates,
fix a typo, or we would never copy the length of the ALPN to be used for
checks.

This should be backported to 1.9.
2019-04-30 23:04:47 +02:00
Christopher Faulet
02f3cf19ed CLEANUP: config: Don't alter listener->maxaccept when nbproc is set to 1
This patch only removes a useless calculation on listener->maxaccept when nbproc
is set to 1. Indeed, the following formula has no effet in such case:

  listener->maxaccept = (listener->maxaccept + nbproc - 1) / nbproc;

This patch may be backported as far as 1.5.
2019-04-30 15:28:29 +02:00
Christopher Faulet
6b02ab8734 MINOR: config: Test validity of tune.maxaccept during the config parsing
Only -1 and positive integers from 0 to INT_MAX are accepted. An error is
triggered during the config parsing for any other values.

This patch may be backported to all supported versions.
2019-04-30 15:28:29 +02:00
Christopher Faulet
102854cbba BUG/MEDIUM: listener: Fix how unlimited number of consecutive accepts is handled
There is a bug when global.tune.maxaccept is set to -1 (no limit). It is pretty
visible with one process (nbproc sets to 1). The functions listener_accept() and
accept_queue_process() don't expect to handle negative maxaccept values. So
instead of accepting incoming connections without any limit, none are never
accepted and HAProxy loop infinitly in the scheduler.

When there are 2 or more processes, the bug is a bit more subtile. The limit for
a listener is set to 1. So only one connection is accepted at a time by a given
listener. This happens because the listener's maxaccept value is an unsigned
integer. In check_config_validity(), it is first set to UINT_MAX (-1 casted in
an unsigned integer), and then some calculations on it leads to an integer
overflow.

To fix the bug, the listener's maxaccept value is now a signed integer. So, if a
negative value is set for global.tune.maxaccept, we keep it untouched for the
listener and no calculation is made on it. Then, in the listener code, this
signed value is casted to a unsigned one. It simplifies all tests instead of
dealing with negative values. So, it limits the number of connections accepted
at a time to UINT_MAX at most. But, honestly, it not an issue.

This patch must be backported to 1.9 and 1.8.
2019-04-30 15:28:29 +02:00
Willy Tarreau
bc13bec548 MINOR: activity: report context switch counts instead of rates
It's not logical to report context switch rates per thread in show activity
because everything else is a counter and it's not even possible to compare
values. Let's only report counts. Further, this simplifies the scheduler's
code.
2019-04-30 14:55:18 +02:00
Willy Tarreau
49ee3b2f9a BUG/MAJOR: map/acl: real fix segfault during show map/acl on CLI
A previous commit 8d85aa44d ("BUG/MAJOR: map: fix segfault during
'show map/acl' on cli.") was provided to address a concurrency issue
between "show acl" and "clear acl" on the CLI. Sadly the code placed
there was copy-pasted without changing the element type (which was
struct stream in the original code) and not tested since the crash
is still present.

The reproducer is simple : load a large ACL file (e.g. geolocation
addresses), issue "show acl #0" in loops in one window and issue a
"clear acl #0" in the other one, haproxy crashes.

This fix was also tested with threads enabled and looks good since
the locking seems to work correctly in these areas though. It will
have to be backported as far as 1.6 since the commit above went
that far as well...
2019-04-30 11:50:59 +02:00
Frédéric Lécaille
d803e475e5 MINOR: log: Enable the log sampling and load-balancing feature.
This patch implements the sampling and load-balancing of log servers configured
with "sample" new keyword implemented by this commit:
    'MINOR: log: Add "sample" new keyword to "log" lines'.
As the list of ranges used to sample the log to balance is ordered, we only
have to maintain ->curr_idx member of smp_info struct which is the index of
the sample and check if it belongs or not to the current range to decide if we
must send it to the log server or not.
2019-04-30 09:25:09 +02:00
Frédéric Lécaille
d95ea2897e MINOR: log: Add "sample" new keyword to "log" lines.
This patch implements the parsing of "sample" new optional keyword for "log" lines
to be able to sample and balance the load of log messages between serveral log
destinations declared by "log" lines. This keyword must be followed by a list of
comma seperated ranges of indexes numbered from 1 to define the samples to be used
to balance the load of logs to send. This "sample" keyword must be used on "log" lines
obviously before the remaining optional ones without keyword. The list of ranges
must be followed by a colon character to separate it from the log sampling size.

With such following configuration declarations:

   log stderr local0
   log 127.0.0.1:10001 sample 2-3,8-11:11 local0
   log 127.0.0.2:10002 sample 5:5 local0

in addition to being sent to stderr, about the second "log" line, every 11 logs
the logs #2 up to #3 would be sent to 127.0.0.1:10001, then #8 up tp #11 four
logs would be sent to the same log server and so on periodically. Logs would be
sent to 127.0.0.2:100002 every 5 logs.

It is also possible to define the size of the sample with a value different of
the maximum of the high limits of the ranges, for instance as follows:

   log 127.0.0.1:10001 sample 2-3,8-11:15 local0

as before the two logs #2 and #3 would be sent to 127.0.0.1:10001, then #8
up tp #11 logs, but in this case here, this would be done periodically every 15
messages.

Also note that the ranges must not overlap each others. This is to ease the
way the logs are periodically sent.
2019-04-30 09:25:09 +02:00
Christopher Faulet
85db3212b8 MINOR: spoe: Use the sample context to pass frag_ctx info during encoding
This simplifies the API and hide the details in the sample. This way, only
string and binary are aware of these info, because other types cannot be
partially encoded.

This patch may be backported to 1.9 and 1.8.
2019-04-29 16:02:05 +02:00
Kevin Zhu
f7f54280c8 BUG/MEDIUM: spoe: arg len encoded in previous frag frame but len changed
Fragmented arg will do fetch at every encode time, each fetch may get
different result if SMP_F_MAY_CHANGE, for example res.payload, but
the length already encoded in first fragment of the frame, that will
cause SPOA decode failed and waste resources.

This patch must be backported to 1.9 and 1.8.
2019-04-29 16:02:05 +02:00
Christopher Faulet
1907ccc2f7 BUG/MINOR: http: Call stream_inc_be_http_req_ctr() only one time per request
The function stream_inc_be_http_req_ctr() is called at the beginning of the
analysers AN_REQ_HTTP_PROCESS_FE/BE. It as an effect only on the backend. But we
must be careful to call it only once. If the processing of HTTP rules is
interrupted in the middle, when the analyser is resumed, we must not call it
again. Otherwise, the tracked counters of the backend are incremented several
times.

This bug was reported in github. See issue #74.

This fix should be backported as far as 1.6.
2019-04-29 16:01:47 +02:00
Willy Tarreau
97215ca284 BUG/MEDIUM: mux-h2: properly deal with too large headers frames
In h2c_decode_headers(), now that we support CONTINUATION frames, we
try to defragment all pending frames at once before processing them.
However if the first is exactly full and the second cannot be parsed,
we don't detect the problem and we wait for the next part forever due
to an incorrect check on exit; we must abort the processing as soon as
the current frame remains full after defragmentation as in this case
there is no way to make forward progress.

Thanks to Yves Lafon for providing traces exhibiting the problem.

This must be backported to 1.9.
2019-04-29 10:20:21 +02:00
David CARLIER
4de0eba848 MEDIUM: da: HTX mode support.
The DeviceAtlas module now can support both the legacy
mode and the new HTX's with the known set of support headers
for the latter.
2019-04-26 17:06:32 +02:00
David Carlier
0470d704a7 BUILD/MEDIUM: contrib: Dummy DeviceAtlas API.
Creating a "mocked" version mainly for testing purposes.
2019-04-26 17:06:32 +02:00
Willy Tarreau
4ad574fbe2 MEDIUM: streams: measure processing time and abort when detecting bugs
On some occasions we've had loops happening when processing actions
(e.g. a yield not being well understood) resulting in analysers being
called in loops until the analysis timeout without incrementing the
stream's call count, thus this type of bug cannot be caught by the
current protection system.

What this patch proposes is to start to measure the time spent in analysers
when profiling is enabled on the thread, in order to detect if a stream is
really misbehaving. In this case we measured the consumed CPU time, not the
wall clock time, so as not to be affected by possible noisy neighbours
sharing the same CPU. When more than 100ms are spent in an analyser, we
trigger the stream_dump_and_crash() function to report the anomaly.

The choice of 100ms comes from the fact that regular calls only take around
1 microsecond and it seems reasonable to accept a degradation factor of
100000, which covers very slow machines such as home gateways running on
sub-ghz processors, with extremely heavy configurations. Some complete
tests show that even this common bogus map_regm() entry supposedly designed
to extract a port from an IP:port entry does not trigger the timeout (25 ms
evaluation time for a 4kB header, exercise left to the reader to spot the
mistake) :

   ([0-9]{0,3}).([0-9]{0,3}).([0-9]{0,3}).([0-9]{0,3}):([0-9]{0,5}) \5

However this one purposely designed to kill haproxy definitely dies as it
manages to completely freeze the whole process for more than one second
on a 4 GHz CPU for only 120 bytes in :

   (.{0,20})(.{0,20})(.{0,20})(.{0,20})(.{0,20})b \1

This protection will definitely help during the code stabilization period
and may possibly be left enabled later depending on reported issues or not.

If you've noticed that your workload is affected by this patch, please
report it as you have very likely found a bug. And in the mean time you
can turn profiling off to disable it.
2019-04-26 14:30:59 +02:00
Willy Tarreau
3d07a16f14 MEDIUM: stream/debug: force a crash if a stream spins over itself forever
If a stream is caught spinning over itself at more than 100000 loops per
second and for more than one second, the process will be aborted and the
offender reported on the console and logs. Typical figures usually are just
a few tens to hundreds per second over a very short time so there is a huge
margin here. Using even higher values could also work but there is the risk
of not being able to catch offenders if multiple ones start to bug at the
same time and share the load. This code should ideally be disabled for
stable releases, though in theory nothing should ever trigger it.
2019-04-26 13:16:14 +02:00
Willy Tarreau
dcb0e1d37d MEDIUM: appctx/debug: force a crash if an appctx spins over itself forever
If an appctx is caught spinning over itself at more than 100000 loops per
second and for more than one second, the process will be aborted and the
offender reported on the console and logs. Typical figures usually are just
a few tens to hundreds per second over a very short time so there is a huge
margin here. Using even higher values could also work but there is the risk
of not being able to catch offenders if multiple ones start to bug at the
same time and share the load. This code should ideally be disabled for
stable releases, though in theory nothing should ever trigger it.
2019-04-26 13:15:56 +02:00
Willy Tarreau
71c07ac65a MINOR: stream/debug: make a stream dump and crash function
During 1.9 development (and even a bit after) we've started to face a
significant number of situations where streams were abusively spinning
due to an uncaught error flag or complex conditions that couldn't be
correctly identified. Sometimes streams wake appctx up and conversely
as well. More importantly when this happens the only fix is to restart.

This patch adds a new function to report a serious error, some relevant
info and to crash the process using abort() so that a core dump is
available. The purpose will be for this function to be called in various
situations where the process is unfixable. It will help detect these
issues much earlier during development and may even help fixing test
platforms which are able to automatically restart when such a condition
happens, though this is not the primary purpose.

This patch only provides the function and doesn't use it yet.
2019-04-26 13:15:56 +02:00
Willy Tarreau
5e370daa52 BUG/MINOR: proto_http: properly reset the stream's call rate on keep-alive
The stream's call rate measurement was added by commit 2e9c1d296 ("MINOR:
stream: measure and report a stream's call rate in "show sess"") but it
forgot to reset it in case of HTTP keep-alive (legacy mode), resulting
in incorrect measurements.

No backport is needed, unless the patch above is backported.
2019-04-25 18:33:37 +02:00
Willy Tarreau
d5ec4bfe85 CLEANUP: standard: use proper const to addr_to_str() and port_to_str()
The input parameter was not marked const, making it painful for some calls.
2019-04-25 17:48:16 +02:00
Willy Tarreau
d2d3348acb MINOR: activity: enable automatic profiling turn on/off
Instead of having to manually turn task profiling on/off in the
configuration, by default it will work in "auto" mode, which
automatically turns on on any thread experiencing sustained loop
latencies over one millisecond averaged over the last 1024 samples.

This may happen with configs using lots of regex (thing map_reg for
example, which is the lazy way to convert Apache's rewrite rules but
must not be abused), and such high latencies affect all the process
and the problem is most often intermittent (e.g. hitting a map which
is only used for certain host names).

Thus now by default, with profiling set to "auto", it remains off all
the time until something bad happens. This also helps better focus on
the issues when looking at the logs as well as in "show sess" output.
It automatically turns off when the average loop latency over the last
1024 calls goes below 990 microseconds (which typically takes a while
when in idle).

This patch could be backported to stable versions after a bit more
exposure, as it definitely improves observability and the ability to
quickly spot the culprit. In this case, previous patch ("MINOR:
activity: make the profiling status per thread and not global") must
also be taken.
2019-04-25 17:26:46 +02:00
Willy Tarreau
d9add3acc8 MINOR: activity: make the profiling status per thread and not global
In order to later support automatic profiling turn on/off, we need to
have it per-thread. We're keeping the global option to know whether to
turn it or on off, but the profiling status is now set per thread. We're
updating the status in activity_count_runtime() which is called before
entering poll(). The reason is that we'll extend this with run time
measurement when deciding to automatically turn it on or off.
2019-04-25 17:26:19 +02:00
Willy Tarreau
d636675137 BUG/MINOR: activity: always initialize the profiling variable
It happens it was only set if present in the configuration. It's
harmless anyway but can still cause doubts when comparing logs and
configurations so better correctly initialize it.

This should be backported to 1.9.
2019-04-25 17:26:19 +02:00
Willy Tarreau
22d63a24d9 MINOR: applet: measure and report an appctx's call rate in "show sess"
Very similarly to previous commit doing the same for streams, we now
measure and report an appctx's call rate. This will help catch applets
which do not consume all their data and/or which do not properly report
that they're waiting for something else. Some of them like peers might
theorically be able to exhibit some occasional peeks when teaching a
full table to a nearby peer (e.g. the new replacement process), but
nothing close to what a bogus service can do so there is no risk of
confusion.
2019-04-24 16:04:23 +02:00
Willy Tarreau
2e9c1d2960 MINOR: stream: measure and report a stream's call rate in "show sess"
Quite a few times some bugs have made a stream task incorrectly
handle a complex combination of events, which was often reported as
"100% CPU", and was usually caused by the event not being properly
identified and flushed, and the stream's handler called in loops.

This patch adds a call rate counter to the stream struct. It's not
huge, it's really inexpensive (especially compared to the rest of the
processing function) and will easily help spot such tasks in "show sess"
output, possibly even allowing to kill them.

A future patch should probably consist in alerting when they're above a
certain threshold, possibly sending a dump and killing them. Some options
could also consist in aborting in order to get an analyzable core dump
and let a service manager restart a fresh new process.
2019-04-24 16:04:23 +02:00
Willy Tarreau
0212fadd65 MINOR: tasks/activity: report the context switch and task wakeup rates
It's particularly useful to spot runaway tasks to see this. The context
switch rate covers all tasklet calls (tasks and I/O handlers) while the
task wakeups only covers tasks picked from the run queue to be executed.
High values there will indicate either an intense traffic or a bug that
mades a task go wild.
2019-04-24 16:04:23 +02:00
Willy Tarreau
69b5a7f1a3 CLEANUP: task: report calls as unsigned in show sess
The "show sess" output used signed ints to report the number of calls,
which is confusing for runaway tasks where the call count can turn
negative.
2019-04-24 16:04:23 +02:00
Christopher Faulet
4904058661 BUG/MINOR: htx: Exclude TCP proxies when the HTX mode is handled during startup
When tests are performed on the HTX mode during HAProxy startup, only HTTP
proxies are considered. It is important because, since the commit 1d2b586cd
("MAJOR: htx: Enable the HTX mode by default for all proxies"), the HTX is
enabled on all proxies by default. But for TCP proxies, it is "deactivated".

This patch must be backported to 1.9.
2019-04-24 15:40:02 +02:00
Willy Tarreau
274ba67862 BUG/MAJOR: lb/threads: fix AB/BA locking issue in round-robin LB
An occasional divide by zero in the round-robin scheduler was addressed
in commit 9df86f997 ("BUG/MAJOR: lb/threads: fix insufficient locking on
round-robin LB") by grabing the server's lock in fwrr_get_server_from_group().

But it happens that this is not the correct approach as it introduces a
case of AB/BA deadlock reported by Maksim Kupriianov. This happens when
a server weight changes from/to zero while another thread extracts this
server from the tree. The reason is that the functions used to manipulate
the state work under the server's lock and grab the LB lock while the ones
used in LB work under the LB lock and grab the server's lock when needed.

This commit mostly reverts the changes above and instead further completes
the locking analysis performed on this code to identify areas that really
need to be protected by the server's lock, since this is the only algorithm
which happens to have this requirement. This audit showed that in fact all
locations which require the server's lock are already protected by the LB
lock. This was not noticed the first time due to the server's lock being
taken instead and due to some functions misleadingly using atomic ops to
modify server fields which are under the LB lock protection (these ones
were now removed).

The change consists in not taking the server's lock anymore here, and
instead making sure that the aforementioned function which used to
suffer from the server's weight becoming zero only uses a copy of the
weight which was preliminary verified to be non-null (when the weight
is null, the server will be removed from the tree anyway so there is
no need to recalculate its position).

With this change, the code survived an injection at 200k req/s split
on two servers with weights changing 50 times a second.

This commit must be backported to 1.9 only.
2019-04-24 14:23:40 +02:00
Olivier Houchard
a28454ee21 BUG/MEDIUM: ssl: Return -1 on recv/send if we got EAGAIN.
In ha_ssl_read()/ha_ssl_write(), if we couldn't send/receive data because
we got EAGAIN, return -1 and not 0, as older SSL versions expect that.
This should fix the problems with OpenSSL < 1.1.0.
2019-04-24 12:06:08 +02:00
Christopher Faulet
371723b0c2 BUG/MINOR: spoe: Don't systematically wakeup SPOE stream in the applet handler
This can lead to wakeups in loop between the SPOE stream and the SPOE applets
waiting to receive agent messages (mainly AGENT-HELLO and AGENT-DISCONNECT).

This patch must be backported to 1.9 and 1.8.
2019-04-23 21:20:47 +02:00
Christopher Faulet
5e1a9d715e BUG/MEDIUM: stream: Fix the way early aborts on the client side are handled
A regression was introduced with the commit c9aecc8ff ("BUG/MEDIUM: stream:
Don't request a server connection if a shutw was scheduled"). Among other this,
it breaks the CLI when the shutr on the client side is handled with the client
data. To depend on the flag CF_SHUTW_NOW to not establish the server connection
when an error on the client side is detected is the right way to fix the bug,
because this flag may be set without any error on the client side.

So instead, we abort the request where the error is handled and only when the
backend stream-interface is in the state SI_ST_INI. This way, there is no
ambiguity on the reason why the abort accurred. The stream-interface is also
switched to the state SI_ST_CLO.

This patch must be backported to 1.9. If the commit c9aecc8ff is backported to
previous versions, this one MUST also be backported. Otherwise, it MAY be
backported to older versions that 1.9 with caution.
2019-04-23 21:20:47 +02:00
Frédéric Lécaille
bed883abe8 BUG/MAJOR: stream: Missing DNS context initializations.
Fix some missing initializations wich came with 333939c commit (MINOR: action:
new '(http-request|tcp-request content) do-resolve' action). The DNS contexts of
streams which were allocated were not initialized by stream_new(). This leaded to
accesses to non-allocated memory when freeing these contexts with stream_free().
2019-04-23 20:24:11 +02:00
Frédéric Lécaille
0bad840b4d MINOR: log: Extract some code to send syslog messages.
This patch extracts the code of __send_log() responsible of sending a syslog
message to a syslog destination represented as a logsrv struct to define
__do_send_log() function. __send_log() calls __do_send_log() for each syslog
destination of a proxy after having prepared some of its parameters.
2019-04-23 14:16:51 +02:00
Baptiste Assmann
333939c2ee MINOR: action: new '(http-request|tcp-request content) do-resolve' action
The 'do-resolve' action is an http-request or tcp-request content action
which allows to run DNS resolution at run time in HAProxy.
The name to be resolved can be picked up in the request sent by the
client and the result of the resolution is stored in a variable.
The time the resolution is being performed, the request is on pause.
If the resolution can't provide a suitable result, then the variable
will be empty. It's up to the admin to take decisions based on this
statement (return 503 to prevent loops).

Read carefully the documentation concerning this feature, to ensure your
setup is secure and safe to be used in production.

This patch creates a global counter to track various errors reported by
the action 'do-resolve'.
2019-04-23 11:41:52 +02:00
Baptiste Assmann
db4c8521ca MINOR: dns: move callback affection in dns_link_resolution()
In dns.c, dns_link_resolution(), each type of dns requester is managed
separately, that said, the callback function is affected globaly (and
points to server type callbacks only).
This design prevents the addition of new dns requester type and this
patch aims at fixing this limitation: now, the callback setting is done
directly into the portion of code dedicated to each requester type.
2019-04-23 11:34:11 +02:00
Baptiste Assmann
dfd35fd71a MINOR: dns: dns_requester structures are now in a memory pool
dns_requester structure can be allocated at run time when servers get
associated to DNS resolution (this happens when SRV records are used in
conjunction with service discovery).
Well, this memory allocation is safer if managed in an HAProxy pool,
furthermore with upcoming HTTP action which can perform DNS resolution
at runtime.

This patch moves the memory management of the dns_requester structure
into its own pool.
2019-04-23 11:33:48 +02:00
paulborile
7714b12604 MINOR: wurfl: enabled multithreading mode
Initially excluded multithreaded mode is completely supported (libwurfl is fully MT safe).
Internal tests now are run also with multithreading enabled.
2019-04-23 11:00:23 +02:00
paulborile
bad132c384 CLEANUP: wurfl: removed deprecated methods
last 2 major releases of libwurfl included a complete review of engine options with
the result of deprecating many features. The patch removes unecessary code and fixes
the documentation.
Can be backported on any version of haproxy.

[wt: must not be backported since it removes config keywords and would
 thus break existing configurations]
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-04-23 11:00:23 +02:00
paulborile
59d50145dc BUILD: wurfl: build fix for 1.9/2.0 code base
This applies the required changes for the new buffer API that came in 1.9.
This patch must be backported to 1.9.
2019-04-23 11:00:23 +02:00
Willy Tarreau
b518823f1b MINOR: wurfl: indicate in haproxy -vv the wurfl version in use
It also explicitly mentions that the library is the dummy one when it
is detected.

We have this output now :

$ ./haproxy  -vv |grep -i wurfl
Built with WURFL support (dummy library version 1.11.2.100)
2019-04-23 11:00:23 +02:00
Willy Tarreau
b3cc9f2887 Revert "CLEANUP: wurfl: remove dead, broken and unmaintained code"
This reverts commit 8e5e1e7bf0.

The following patches will fix this code and may be backported.
2019-04-23 10:34:43 +02:00
Emeric Brun
d0e095c2aa MINOR: ssl/cli: async fd io-handlers printable on show fd
This patch exports the async fd iohandlers and make them printable
doing a 'show fd' on cli.
2019-04-19 17:27:01 +02:00
Christopher Faulet
46451d6e04 MINOR: gcc: Fix a silly gcc warning in connect_server()
Don't know why it happens now, but gcc seems to think srv_conn may be NULL when
a reused connection is removed from the orphan list. It happens when HAProxy is
compiled with -O2 with my gcc (8.3.1) on fedora 29... Changing a little how
reuse parameter is tested removes the warnings. So...

This patch may be backported to 1.9.
2019-04-19 15:53:23 +02:00
Christopher Faulet
f48552f2c1 BUG/MINOR: da: Get the request channel to call CHECK_HTTP_MESSAGE_FIRST()
Since the commit 89dc49935 ("BUG/MAJOR: http_fetch: Get the channel depending on
the keyword used"), the right channel must be passed as argument when the macro
CHECK_HTTP_MESSAGE_FIRST is called.

This patch must be backported to 1.9.
2019-04-19 15:53:23 +02:00
Christopher Faulet
2db9dac4c8 BUG/MINOR: 51d: Get the request channel to call CHECK_HTTP_MESSAGE_FIRST()
Since the commit 89dc49935 ("BUG/MAJOR: http_fetch: Get the channel depending on
the keyword used"), the right channel must be passed as argument when the macro
CHECK_HTTP_MESSAGE_FIRST is called.

This patch must be backported to 1.9.
2019-04-19 15:53:23 +02:00
Christopher Faulet
c54e4b053d BUG/MEDIUM: stream: Don't request a server connection if a shutw was scheduled
If a shutdown for writes was performed on the client side (CF_SHUTW is set on
the request channel) while the server connection is still unestablished (the
stream-int is in the state SI_ST_INI), then it is aborted. It must also be
aborted when the shudown for write is pending (only CF_SHUTW_NOW is
set). Otherwise, some errors on the request channel can be ignored, leaving the
stream in an undefined state.

This patch must be backported to 1.9. It may probably be backported to all
suported versions, but it is unclear if the bug is visbile for older versions
than 1.9. So it is probably safer to wait bug reports on these versions to
backport this patch.
2019-04-19 15:53:23 +02:00
Christopher Faulet
e84289e585 BUG/MEDIUM: thread/http: Add missing locks in set-map and add-acl HTTP rules
Locks are missing in the rules "http-request set-map" and "http-response
add-acl" when an acl or map update is performed. Pattern elements must be
locked.

This patch must be backported to 1.9 and 1.8. For the 1.8, the HTX part must be
ignored.
2019-04-19 15:53:23 +02:00
Baptiste Assmann
e1afd4fec6 MINOR: proto_tcp: tcp-request content: enable set-dst and set-dst-var
The set-dst and set dst-var are available at both 'tcp-request
connection' and 'http-request' but not at the layer in the middle.
This patch fixes this miss and enables both set-dst and set-dst-var at
'tcp-request content' layer.
2019-04-19 15:50:06 +02:00
Willy Tarreau
78c5eec949 BUG/MINOR: acl: properly detect pattern type SMP_T_ADDR
Since 1.6-dev4 with commit b2f8f087f ("MINOR: map: The map can return
IPv4 and IPv6"), maps can return both IPv4 and IPv6 addresses, which
is represented as SMP_T_ADDR at the output of the map converter. But
the ACL parser only checks for either SMP_T_IPV4 or SMP_T_IPV6 and
requires to see an explicit matching method specified. Given that it
uses the same pattern parser for both address families, it implicitly
is also compatible with SMP_T_ADDR, which ought to have been added
there.

This fix should be backported as far as 1.6.
2019-04-19 11:45:20 +02:00
Willy Tarreau
aa5801bcaa BUG/MEDIUM: maps: only try to parse the default value when it's present
Maps returning an IP address (e.g. map_str_ip) support an optional
default value which must be parsed. Unfortunately the parsing code does
not check for this argument's existence and uncondtionally tries to
resolve the argument whenever the output is of type address, resulting
in segfaults at parsing time when no such argument is provided. This
patch adds the appropriate check.

This fix may be backported as far as 1.6.
2019-04-19 11:35:22 +02:00
Olivier Houchard
88698d966d MEDIUM: connections: Add a way to control the number of idling connections.
As by default we add all keepalive connections to the idle pool, if we run
into a pathological case, where all client don't do keepalive, but the server
does, and haproxy is configured to only reuse "safe" connections, we will
soon find ourself having lots of idling, unusable for new sessions, connections,
while we won't have any file descriptors available to create new connections.

To fix this, add 2 new global settings, "pool_low_ratio" and "pool_high_ratio".
pool-low-fd-ratio  is the % of fds we're allowed to use (against the maximum
number of fds available to haproxy) before we stop adding connections to the
idle pool, and destroy them instead. The default is 20. pool-high-fd-ratio is
the % of fds we're allowed to use (against the maximum number of fds available
to haproxy) before we start killing idling connection in the event we have to
create a new outgoing connection, and no reuse is possible. The default is 25.
2019-04-18 19:52:03 +02:00
Olivier Houchard
7c49d2e213 MINOR: fd: Add a counter of used fds.
Add a new counter, ha_used_fds, that let us know how many file descriptors
we're currently using.
2019-04-18 19:19:59 +02:00
Emeric Brun
0bbec0fa34 MINOR: peers: adds counters on show peers about tasks calls.
This patch adds a counter of calls on the orchestator peers task
and a counter on the tasks linked to applet i/o handler for
each peer.

Those two counters are useful to detect if a peer sync is active
or frozen.

This patch is related to the commit:
  "MINOR: peers: Add a new command to the CLI for peers."
and should be backported with it.
2019-04-18 18:24:25 +02:00
Olivier Houchard
66a7b3302a BUILD/medium: ssl: Fix build with OpenSSL < 1.1.0
Make sure it builds with OpenSSL < 1.1.0, a lot of the BIO_get/set methods
were introduced with OpenSSL 1.1.0, so fallback with the old way of doing
things if needed.
2019-04-18 15:58:58 +02:00
Olivier Houchard
a8955d57ed MEDIUM: ssl: provide our own BIO.
Instead of letting the OpenSSL code handle the file descriptor directly,
provide a custom BIO, that will use the underlying XPRT to send/recv data.
This will let us implement QUIC later, and probably clean the upper layer,
if/when the SSL code provide its own subscribe code, so that the upper layers
won't have to care if we're still waiting for the handshake to complete or not.
2019-04-18 14:56:24 +02:00
Olivier Houchard
e179d0e88f MEDIUM: connections: Provide a xprt_ctx for each xprt method.
For most of the xprt methods, provide a xprt_ctx.  This will be useful later
when we'll want to be able to stack xprts.
The init() method now has to create and provide the said xprt_ctx if needed.
2019-04-18 14:56:24 +02:00
Olivier Houchard
df35784600 MEDIUM: ssl: provide its own subscribe/unsubscribe function.
In order to prepare for the possibility of using different kinds of xprt
with ssl, make the ssl code provide its own subscribe and unsubscribe
functions, right now it just calls conn_subscribe and conn_unsubsribe.
2019-04-18 14:56:24 +02:00
Olivier Houchard
7b5fd1ec26 MEDIUM: connections: Move some fields from struct connection to ssl_sock_ctx.
Move xprt_st, tmp_early_data and sent_early_data from struct connection to
struct ssl_sock_ctx, as they are only used in the SSL code.
2019-04-18 14:56:24 +02:00
Olivier Houchard
66ab498f26 MEDIUM: ssl: Give ssl_sock its own context.
Instead of using directly a SSL * as xprt_ctx, give ssl_sock its own context.
It's useless for now, but will be useful later when we'll want to be able to
stack xprts.
2019-04-18 14:56:24 +02:00
Olivier Houchard
ed1a6a0d8a MEDIUM: tasks: Use __ha_barrier_store after modifying global_tasks_mask.
Now that we no longer use atomic operations to update global_tasks_mask,
as it's always modified while holding the TASK_RQ_LOCK, we have to use
__ha_barrier_store() instead of __ha_barrier_atomic_store() to ensure
any modification of global_tasks_mask is seen before modifying
active_tasks_mask.

This should be backported to 1.9.
2019-04-18 14:14:10 +02:00
Willy Tarreau
d83b6c1ab3 BUG/MINOR: mworker: disable busy polling in the master process
When enabling busy polling, we don't want the master to use it, or it
wastes a dedicated processor to this!

Must be backported to 1.9.
2019-04-18 11:34:41 +02:00
Olivier Houchard
1cfac37b65 MEDIUM: tasks: Don't account a destroyed task as a runned task.
In process_runnable_tasks(), if the task we're about to run has been
destroyed, and should be free, don't account for it in the number of task
we ran. We're only allowed a maximum number of tasks to run per call to
process_runnable_tasks(), and freeing one shouldn't take the slot of a
valid task.
2019-04-18 10:11:13 +02:00
Olivier Houchard
3f795f76e8 MEDIUM: tasks: Merge task_delete() and task_free() into task_destroy().
task_delete() was never used without calling task_free() just after, and
task_free() was only used on error pathes to destroy a just-created task,
so merge them into task_destroy(), that will remove the task from the
wait queue, and make sure the task is either destroyed immediately if it's
not in the run queue, or destroyed when it's supposed to run.
2019-04-18 10:10:04 +02:00
Willy Tarreau
03dd029a5b CLEANUP: task: remain consistent when using the task's handler
A pointer "process" is assigned the task's handler in
process_runnable_tasks(), we have no reason to use t->process
right after it is assigned.
2019-04-17 22:32:27 +02:00
Olivier Houchard
51205a1958 BUG/MEDIUM: applets: Don't use task_in_rq().
When deciding if we want to wake the task of an applet up, don't give up
if task_in_rq returns 1, as there's a race condition and another thread
may run it. Instead, always attempt to task_wakeup(), at worst the task
is already in the run queue, and nothing will happen.
2019-04-17 19:30:23 +02:00
Olivier Houchard
0c7a4b6371 MINOR: tasks: Don't set the TASK_RUNNING flag when adding in the tasklet list.
Now that TASK_QUEUED is enforced, there's no need to set TASK_RUNNING when
removing the task from the runqueue to add it to the tasklet list. The flag
will only be set right before we run the task.
2019-04-17 19:28:01 +02:00
Olivier Houchard
de82aeaa26 BUG/MEDIUM: tasks: Make sure we modify global_tasks_mask with the rq_lock.
When modifying global_tasks_mask, make sure we hold the rq_lock, or we might
remove the bit while it has been re-set by somebody else, and we make not
be waked when needed.
2019-04-17 19:28:01 +02:00
Willy Tarreau
b038007ae8 BUG/MEDIUM: tasks: Make sure we set TASK_QUEUED before adding a task to the rq.
Make sure we set TASK_QUEUED in every case before adding the task to the
run queue. task_wakeup() now checks if either TASK_QUEUED or TASK_RUNNING
is set, and if neither is set, add TASK_QUEUED and effectively add the task
to the runqueue.
No longer use __task_wakeup() anywhere except in task_wakeup(), always use
task_wakeup() instead.
With the old code, process_runnable_task() may re-add a task in the runqueue
without setting the TASK_QUEUED flag, and there were race conditions that could
lead to a task having the TASK_QUEUED flag but not in the runqueue, thus
being unschedulable.

This should be backported to 1.9.
2019-04-17 19:28:01 +02:00
Christopher Faulet
46575cd392 BUG/MINOR: http_fetch/htx: Use HTX versions if the proxy enables the HTX mode
Because the HTX is now the default mode for all proxies (HTTP and TCP), it is
better to match on the proxy options to know if the HTX is enabled or not. This
way, if a TCP proxy explicitly disables the HTX mode, the legacy version of HTTP
fetches will be used.

No backport needed except if the patch activating the HTX by default for all
proxies is backported.
2019-04-17 15:12:27 +02:00
Christopher Faulet
5ec8bcb021 BUG/MINOR: http_fetch/htx: Allow permissive sample prefetch for the HTX
As for smp_prefetch_http(), there is now a way to successfully perform a
prefetch in HTX, even if the message forwarding already begun. It is used for
the sample fetches "req.proto_http" and "method".

This patch must be backported to 1.9.
2019-04-17 15:12:27 +02:00
Christopher Faulet
89dc499359 BUG/MAJOR: http_fetch: Get the channel depending on the keyword used
All HTTP samples are buggy because the channel tested in the prefetch functions
(HTX and legacy HTTP) is chosen depending on the sample direction and not the
keyword really used. It means the request channel is used if the sample is
called during the request analysis and the response channel is used if it is
called during the response analysis, regardless the sample really called. For
instance, if you use the sample "req.ver" in an http-response rule, the response
channel will be prefeched because it is called during the response analysis,
while the request channel should have been used instead. So some assumptions on
the validity of the sample may be made on the wrong channel. It is the first
bug.

Then the same error is done in some samples themselves. So fetches are performed
on the wrong channel. For instance, the header extraction (req.fhdr, res.fhdr,
req.hdr, res.hdr...). If the sample "req.hdr" is used in an http-response rule,
then the matching is done on the response headers and not the request ones. It
is the second bug.

Finally, the last one but not the least, in some samples, the right channel is
used. But because the prefetch was done on the wrong one, this channel may be in
a undefined state. For instance, using the sample "req.ver" in an http-response
rule leads to a matching on a posibility released buffer.

To fix all these bugs, the right channel is now chosen in sample fetches, before
the prefetch. If the same function is used to fetch requests and responses
elements, then the keyword is used to choose the right one. This channel is then
used by the functions smp_prefetch_htx() and smp_prefetch_http(). Of course, it
is also used by the samples themselves to extract information.

This patch must be backported to all supported versions. For version 1.8 and
priors, it must be totally refactored. First because there is no HTX into these
versions. Then the buffers API has changed in HAProxy 1.9. The files
http_fetch.{ch} doesn't exist on old versions.
2019-04-17 15:12:27 +02:00
Christopher Faulet
038ad8123b MINOR: mux-h1: Handle read0 during TCP splicing
It avoids a roundtrip with underlying I/O callbacks to do so. If a read0 is
handled at the end of h1_rcv_pipe(), the flag CS_FL_REOS is set on the
conn_stream. And if there is no data in the pipe, the flag CS_FL_EOS is also
set.

This path may be backported to 1.9.
2019-04-17 14:52:31 +02:00
Christopher Faulet
e18777b79d BUG/MEDIUM: mux-h1: Enable TCP splicing to exchange data only
Use the TCP splicing only when the input parser is in the state H1_MSG_DATA or
H1_MSG_TUNNEL and don't transfer more than then known expected length for these
data (unlimited for the tunnel mode). In other states or when all data are
transferred, the TCP splicing is disabled.

This patch must be backported to 1.9.
2019-04-17 14:52:31 +02:00
Christopher Faulet
f7d5ff37e0 BUG/MEDIUM: mux-h1: Notify the stream waiting for TCP splicing if ibuf is empty
When a stream-interface want to use the TCP splicing to forward its data, it
notifies the mux h1. We will then flush the input buffer and don't read more
data. So the stream-interface will not be notified for read anymore, except if
an error or a read0 is detected. It is a problem everytime the receive I/O
callback is called again. It happens when the pipe is full or when no data are
received on the pipe. It also happens when the input buffer is freshly
flushed. Because the TCP splicing is enabled, nothing is done in h1_recv() and
the stream-interface is never woken up. So, now, in h1_recv(), if the TCP
splicing is used and the input buffer is empty, the stream-interface is notified
for read.

This patch must be backported to 1.9.
2019-04-17 14:52:31 +02:00
Christopher Faulet
2f320ee59c BUG/MINOR: mux-h1: Don't switch the parser in busy mode if other side has done
There is no reaon to switch the input parser in busy mode if all the output has
been processed.

This patch must be backported to 1.9.
2019-04-17 14:52:31 +02:00
Christopher Faulet
91f77d5999 BUG/MINOR: mux-h1: Process input even if the input buffer is empty
It is required, at least, to add the EOM block and finish the message when the
TCP splicing was used to send all data. Otherwise, there is no way to finish the
parsing.

This patch must be backported to 1.9.
2019-04-17 14:52:31 +02:00
William Lallemand
74f0ec3894 BUG/MINOR: mworker: ensure that we still quits with SIGINT
Since the fix "BUG/MINOR: mworker: don't exit with an ambiguous value"
we are leaving with a EXIT_SUCCESS upon a SIGINT.

We still need to quit with a SIGINT when a worker leaves with a SIGINT.

This is done this way because vtest expect a 130 during the process
stop, haproxy without mworker returns a 130, so it should be the same in
mworker mode.

This should be backported in 1.9, with the previous patch ("BUG/MINOR:
mworker: don't exit with an ambiguous value").

Code has moved, mworker_catch_sigchld() is in haproxy.c.
2019-04-16 18:14:29 +02:00
William Lallemand
4cf4b33744 BUG/MINOR: mworker: don't exit with an ambiguous value
When the sigchld handler is called and waitpid() returns -1,
the behavior of waitpid() with the status variable is undefined.
It is not a good idea to exit with the value contained in it.

Since this exit path does not use the exitcode variable, it means that
this is an expected and successful exit.

This should be backported in 1.9, code has moved,
mworker_catch_sigchld() is in haproxy.c.
2019-04-16 18:14:29 +02:00
William Lallemand
32b6901550 BUG/MINOR: mworker: mworker_kill should apply on every children
Commit 3f12887 ("MINOR: mworker: don't use children variable anymore")
introduced a regression.

The previous behavior was to send a signal to every children, whether or
not they are former children. Instead of this, we only send a signal to
the current children, so we don't try to kill -INT or -TERM all
processes during a reload.

No backport needed.
2019-04-16 18:14:29 +02:00
Willy Tarreau
85d0424b20 BUG/MINOR: listener/mq: correctly scan all bound threads under low load
When iterating on the CLI using "show activity" and no other load, it
was visible that the last thread was always skipped. This was caused by
the way the thread bits were walking : t1 was updated after t2 to make
sure it never equals t2 (thus it skips t2), and in case of a tie we
choose t1. This results in the chosen thread never to equal t2 unless
the other ones already have one connection. In addition to this, t2 was
recalulated upon each pass due to the fact that only the 31th bit was
looked at instead of looking at the t2'th bit.

This patch fixes this by updating t2 after t1 so that t1 is free to
walk over all positions under equal load. No measurable performance
gains are expected from this though, but it at least removes one
strange indicator which could lead to some suspicion.

No backport is needed.
2019-04-16 18:09:13 +02:00
Willy Tarreau
636848aa86 MINOR: init: add a "set-dumpable" global directive to enable core dumps
It's always a pain to get a core dump when enabling user/group setting
(which disables the dumpable flag on Linux), when using a chroot and/or
when haproxy is started by a service management tool which requires
complex operations to just raise the core dump limit.

This patch introduces a new "set-dumpable" global directive to work
around these troubles by doing the following :

  - remove file size limits     (equivalent of ulimit -f unlimited)
  - remove core size limits     (equivalent of ulimit -c unlimited)
  - mark the process dumpable again (equivalent of suid_dumpable=1)

Some of these will depend on the operating system. This way it becomes
much easier to retrieve a core file. Temporarily moving the chroot to
a user-writable place generally enough.
2019-04-16 14:31:23 +02:00
William Lallemand
482f9a9a2f MINOR: mworker: export HAPROXY_MWORKER=1 when running in mworker mode
Export HAPROXY_MWORKER=1 in an environment variable when running in
mworker mode.
2019-04-16 13:26:43 +02:00
William Lallemand
620072bc0d MINOR: cli: don't add a semicolon at the end of HAPROXY_CLI
Only add the semicolon when there is several CLI in HAPROXY_CLI and
HAPROXY_MASTER_CLI.
2019-04-16 13:26:43 +02:00
William Lallemand
9a37fd0f19 MEDIUM: mworker/cli: export the HAPROXY_MASTER_CLI variable
It works the same way as the HAPROXY_CLI variable, it exports the
listeners addresses separated by semicolons.
2019-04-16 13:26:43 +02:00
William Lallemand
8f7069a389 CLEANUP: mworker: remove the type field in mworker_proc
Since the introduction of the options field, we can use it to store the
type of process.

type = 'm' is replaced by PROC_O_TYPE_MASTER
type = 'w' is replaced by PROC_O_TYPE_WORKER
type = 'e' is replaced by PROC_O_TYPE_PROG

The old values are still used in the HAPROXY_PROCESSES environment
variable to pass the information during a reload.
2019-04-16 13:26:43 +02:00
William Lallemand
bd3de3efb7 MEDIUM: mworker-prog: implements 'option start-on-reload'
This option is already the default, but its opposite 'no option
start-on-reload' allows the master to keep a previous instance of a
program and don't start a new one upon a reload.

The old program will then appear as a current one in "show proc" and
could also trigger an exit-on-failure upon a segfault.
2019-04-16 13:26:43 +02:00
William Lallemand
4528611ed6 MEDIUM: mworker: store the leaving state of a process
Previously we were assuming than a process was in a leaving state when
its number of reload was greater than 0. With mworker programs it's not
the case anymore so we need to store a leaving state.
2019-04-16 13:26:43 +02:00
Willy Tarreau
9df86f997e BUG/MAJOR: lb/threads: fix insufficient locking on round-robin LB
Maksim Kupriianov reported very strange crashes in fwrr_update_position()
which didn't make sense because of an apparent divide overflow except that
the value was not null in the core.

It happens that while the locking is correct in all the functions' call
graph, the uppermost one (fwrr_get_next_server()) incorrectly expected
that its target server was already locked when called. This stupid
assumption causd the server lock not to be held when calling the other
ones, explaining how it was possible to change the server's eweight by
calling srv_lb_commit_status() under the server lock yet collide with
its unprotected usage.

This commit makes sure that fwrr_get_server_from_group() retrieves a
locked server and that fwrr_get_next_server() is responsible for
unlocking the server before returning it. There is one subtlety in
this function which is that it builds a list of avoided servers that
were full while scanning the tree, and all of them are queued in a
full state so they must be unlocked upon return.

Many thanks to Maksim for providing detailed info allowing to narrow
down this bug.

This fix must be backported to 1.9. In 1.8 the lock seems much wider
and changes to the server's state are performed under the rendez-vous
point so this it doesn't seem possible that it happens there.
2019-04-16 11:21:14 +02:00
Frédéric Lécaille
95679dc096 MINOR: peers: Add a new command to the CLI for peers.
Implements "show peers [peers section]" new CLI command to dump information
about the peers and their stick-tables to be synchronized and others internal.

May be backported as far as 1.5.
2019-04-16 09:58:40 +02:00
Willy Tarreau
6f7a02a381 BUILD: htx: fix a used uninitialized warning on is_cookie2
gcc-3.4 reports this which actually looks like a valid warning when
looking at the code, it's unsure why others didn't notice it :

src/proto_htx.c: In function `htx_manage_server_side_cookies':
src/proto_htx.c:4266: warning: 'is_cookie2' might be used uninitialized in this function
2019-04-15 21:55:48 +02:00
Willy Tarreau
8de1df92a3 BUILD: do not specify "const" on functions returning structs or scalars
Older compilers (like gcc-3.4) warn about the use of "const" on functions
returning a struct, which makes sense since the return may only be copied :

  include/common/htx.h:233: warning: type qualifiers ignored on function return type

Let's simply drop "const" here.
2019-04-15 21:55:48 +02:00
Willy Tarreau
0e492e2ad0 BUILD: address a few cases of "static <type> inline foo()"
Older compilers don't like to see "inline" placed after the type in a
function declaration, it must be "static inline <type>" only. This
patch touches various areas. The warnings were seen with gcc-3.4.
2019-04-15 21:55:48 +02:00
Olivier Houchard
998410a41b BUG/MEDIUM: h2: Revamp the way send subscriptions works.
Instead of abusing the SUB_CALL_UNSUBSCRIBE flag, revamp the H2 code a bit
so that it just checks if h2s->sending_list is empty to know if the tasklet
of the stream_interface has been waken up or not.
send_wait is now set to NULL in h2_snd_buf() (ideally we'd set it to NULL
as soon as we're waking the tasklet, but it can't be done, because we still
need it in case we have to remove the tasklet from the task list).
2019-04-15 19:27:57 +02:00
Olivier Houchard
9a0f559676 BUG/MEDIUM: h2: Make sure we're not already in the send_list in h2_subscribe().
In h2_subscribe(), don't add ourself to the send_list if we're already in it.
That may happen if we try to send and fail twice, as we're only removed
from the send_list if we managed to send data, to promote fairness.
Failing to do so can lead to either an infinite loop, or some random crashes,
as we'd get the same h2s in the send_list twice.

This should be backported to 1.9.
2019-04-15 19:27:57 +02:00
Olivier Houchard
0e0793715c BUG/MEDIUM: muxes: Make sure we unsubcribed when destroying mux ctx.
In the h1 and h2 muxes, make sure we unsubscribed before destroying the
mux context.
Failing to do so will lead in a segfault later, as the connection will
attempt to dereference its conn->send_wait or conn->recv_wait, which pointed
to the now-free'd mux context.

This was introduced by commit 39a96ee16e, so
should only be backported if that commit gets backported.
2019-04-15 19:27:57 +02:00
Willy Tarreau
e61828449c BUILD: cli/threads: fix build in single-threaded mode
Commit a8f57d51a ("MINOR: cli/activity: report the accept queue sizes
in "show activity"") broke the single-threaded build because the
accept-rings are not implemented there. Let's ifdef this out. Ideally
we should start to think about always having such elements initialized
even without threads to improve the test coverage.
2019-04-15 18:55:31 +02:00
Willy Tarreau
3466e3cdcb BUILD: task/thread: fix single-threaded build of task.c
As expected, commit cde7902ac ("MEDIUM: tasks: improve fairness between
the local and global queues") broke the build with threads disabled,
and I forgot to rerun this test before committing. No backport is
needed.
2019-04-15 18:52:40 +02:00
Nenad Merdanovic
646b7741bc BUG/MEDIUM: map: Fix memory leak in the map converter
The allocated trash chunk is not freed properly and causes a memory leak
exhibited as the growth in the trash pool allocations. Bug was introduced
in commit 271022 (BUG/MINOR: map: fix map_regm with backref).

This should be backported to all branches where the above commit was
backported.
2019-04-15 09:53:46 +02:00
Willy Tarreau
c8da044b41 MINOR: tasks: restore the lower latency scheduling when niced tasks are present
In the past we used to reduce the number of tasks consulted at once when
some niced tasks were present in the run queue. This was dropped in 1.8
when the scheduler started to take batches. With the recent fixes it now
becomes possible to restore this behaviour which guarantees a better
latency between tasks when niced tasks are present. Thanks to this, with
the default number of 200 for tune.runqueue-depth, with a parasitic load
of 14000 requests per second, nice 0 gives 14000 rps, nice 1024 gives
12000 rps and nice -1024 gives 16000 rps. The amplitude widens if the
runqueue depth is lowered.
2019-04-15 09:50:56 +02:00
Willy Tarreau
2d1fd0a0d2 MEDIUM: tasks: only base the nice offset on the run queue depth
The offset calculated for the nice value used to be wrong for a long
time and got even worse when the improved multi-thread sheduler was
implemented because it continued to rely on the run queue size, which
become irrelevant given that we extract tasks in batches, so the run
queue size moves following a sawtooth form.

However the offsets much better reflects insertion positions in the
queue, so it's worth dropping this rq_size component of the equation.
Last point, due to the batches made of runqueue-depth entries at once,
the higher the depth, the lower the effect of the nice setting since
values are picked together in batches and placed into a list. An
intuitive approach consists in multiplying the nice value with the
batch size to allow tasks to participate to a different batch. And
experimentation shows that this works pretty well.

With a runqueue-depth of 16 and a parasitic load of 16000 requests
per second on 100 streams, a default nice of 0 shows 16000 requests
per second for nice 0, 22000 for nice -1024 and 10000 for nice 1024.

The difference is even bigger with a runqueue depth of 5. At 200
however it's much smoother (16000-22000).
2019-04-15 09:50:56 +02:00
Willy Tarreau
cde7902ac9 MEDIUM: tasks: improve fairness between the local and global queues
Tasks allowed to run on multiple threads, as well as those scheduled by
one thread to run on another one pass through the global queue. The
local queues only see tasks scheduled by one thread to run on itself.
The tasks extracted from the global queue are transferred to the local
queue when they're picked by one thread. This causes a priority issue
because the global tasks experience a priority contest twice while the
local ones experience it only once. Thus if a tasks returns still
running, it's immediately reinserted into the local run queue and runs
much faster than the ones coming from the global queue.

Till 1.9 the tasks going through the global queue were mostly :
  - health checks initialization
  - queue management
  - listener dequeue/requeue

These ones are moderately sensitive to unfairness so it was not that
big an issue.

Since 2.0-dev2 with the multi-queue accept, tasks are scheduled to
remote threads on most accept() and it becomes fairly visible under
load that the accept slows down, even for the CLI.

This patch remedies this by consulting both the local and the global
run queues in parallel and by always picking the task whose deadline
is the earliest. This guarantees to maintain an excellent fairness
between the two queues and removes the cascade effect experienced
by the global tasks.

Now the CLI always continues to respond quickly even in presence of
expensive tasks running for a long time.

This patch may possibly be backported to 1.9 if some scheduling issues
are reported but at this time it doesn't seem necessary.
2019-04-15 09:50:56 +02:00
Willy Tarreau
24f382f555 CLEANUP: task: do not export rq_next anymore
This one hasn't been used anymore since the scheduler changes after 1.8
but it kept being exported and maintained up to date while it's always
reset when scanning the trees. Let's stop exporting it and updating it.
2019-04-15 09:50:56 +02:00
Christopher Faulet
61840e715f BUG/MEDIUM: muxes: Don't dereference mux context if null in release functions
When a mux context is released, we must be sure it exists before dereferencing
it. The bug was introduced in the commit 39a96ee16 ("MEDIUM: muxes: Be prepared
to don't own connection during the release").

No need to backport this patch, expect if the commit 39a96ee16 is backported
too.
2019-04-15 09:47:10 +02:00
Christopher Faulet
1d2b586cdd MAJOR: htx: Enable the HTX mode by default for all proxies
The legacy HTTP mode is no more the default one. So now, by default, without any
option in your configuration, all proxies will use the HTX mode. The line
"option http-use-htx" in proxy sections are now useless, except to cancel the
legacy HTTP mode. To fallback on legacy HTTP mode, you should use the line "no
option http-use-htx" explicitly.

Note that the reg-tests still work by default on legacy HTTP mode. The HTX will
be enabled by default in a futur commit.
2019-04-12 22:06:53 +02:00
Christopher Faulet
0ef372a390 MAJOR: muxes/htx: Handle inplicit upgrades from h1 to h2
The upgrade is performed when an H2 preface is detected when the first request
on a connection is parsed. The CS is destroyed by setting EOS flag on it. A
special flag is added on the HTX message to warn the HTX analyzers the stream
will be closed because of an upgrade. This way, no error and no log are
emitted. When the mux h1 is released, we create a mux h2, without any CS and
passing the buffer with the unparsed H2 preface.
2019-04-12 22:06:53 +02:00
Christopher Faulet
bbe685452f MAJOR: proxy/htx: Handle mux upgrades from TCP to HTTP in HTX mode
It is now possible to upgrade TCP streams to HTX when an HTTP backend is set for
a TCP frontend (both with the HTX enabled). So concretely, in such case, an
upgrade is performed from the mux pt to the mux h1. The current CS and the
channel's buffer are used to initialize the mux h1.
2019-04-12 22:06:53 +02:00
Christopher Faulet
eb7098035c MEDIUM: htx: Allow the option http-use-htx to be used on TCP proxies too
This will be mandatory to allow upgrades from TCP to HTTP in HTX. Of course, raw
buffers will still be used by default on TCP proxies, this option sets or
not. But if you want to handle mux upgrades from a TCP proxy, you must enable
the HTX on it and on all its backends.

There is only a small change in the lua code. Because TCP proxies can be HTX
aware, to exclude TCP services only for HTTP proxies, we must also check the
mode (TCP/HTTP) now.
2019-04-12 22:06:53 +02:00
Christopher Faulet
39a96ee16e MEDIUM: muxes: Be prepared to don't own connection during the release
This happens during mux upgrades. In such case, when the destroy() callback is
called, the connection points to a different mux's context than the one passed
to the callback. It means the connection is owned by another mux. The old mux is
then released but the connection is not closed.
2019-04-12 22:06:53 +02:00
Christopher Faulet
73c1207c71 MINOR: muxes: Pass the context of the mux to destroy() instead of the connection
It is mandatory to handle mux upgrades, because during a mux upgrade, the
connection will be reassigned to another multiplexer. So when the old one is
destroyed, it does not own the connection anymore. Or in other words, conn->ctx
does not point to the old mux's context when its destroy() callback is
called. So we now rely on the multiplexer context do destroy it instead of the
connection.

In addition, h1_release() and h2_release() have also been updated in the same
way.
2019-04-12 22:06:53 +02:00
Christopher Faulet
51f73eb11a MEDIUM: muxes: Add an optional input buffer during mux initialization
The mux's callback init() now take a pointer to a buffer as extra argument. It
must be used by the multiplexer as its input buffer. This buffer is always NULL
when a multiplexer is initialized with a fresh connection. But if a mux upgrade
is performed, it may be filled with existing data. Note that, for now, mux
upgrades are not supported. But this commit is mandatory to do so.
2019-04-12 22:06:53 +02:00
Christopher Faulet
e9b7072e9e MINOR: muxes: Rely on conn_is_back() during init to handle front/back conn
Instead of using the connection context to make the difference between a
frontend connection and a backend connection, we now rely on the function
conn_is_back().
2019-04-12 22:06:53 +02:00
Christopher Faulet
0f17a9b510 MINOR: filters/htx: Use stream flags instead of px mode to instanciate a filter
In the function flt_stream_add_filter(), if the HTX is enabled, before attaching
a filter to a stream, we test if the filter can handle it or not. If not, the
filter is ignored. Before the proxy mode was tested. Now we test if the stream
is an HTX stream or not.
2019-04-12 22:06:53 +02:00
Christopher Faulet
eca8854555 MINOR: http_fetch/htx: Use stream flags instead of px mode in smp_prefetch_htx
In the function smp_prefetch_htx(), we must know if data in the channel's buffer
are structured or not. Before the proxy mode was tested. Now we test if the
stream is an HTX stream or not. If yes, we know the HTX is used to structure
data in the channel's buffer.
2019-04-12 22:06:53 +02:00
Christopher Faulet
0e160ff5bb MINOR: stream: Set a flag when the stream uses the HTX
The flag SF_HTX has been added to know when a stream uses the HTX or not. It is
set when an HTX stream is created. There are 2 conditions to set it. The first
one is when the HTTP frontend enables the HTX. The second one is when the attached
conn_stream uses an HTX multiplexer.
2019-04-12 22:06:53 +02:00
Christopher Faulet
9f38f5aa80 MINOR: muxes: Add a flag to specify a multiplexer uses the HTX
A multiplexer must now set the flag MX_FL_HTX when it uses the HTX to structured
the data exchanged with channels. the muxes h1 and h2 set this flag. Of course,
for the mux h2, it is set on h2_htx_ops only.
2019-04-12 22:06:53 +02:00
Christopher Faulet
9b579106fe MINOR: mux-h2: Add a mux_ops dedicated to the HTX mode
Instead of using the same mux_ops structure for the legacy HTTP mode and the HTX
mode, a dedicated mux_ops is now used for the HTX mode. Same callbacks are used
for both. But the flags may be different depending on the mode used.
2019-04-12 22:06:53 +02:00
Christopher Faulet
7f36636c21 BUG/MINOR: mux-h1: Handle the flag CS_FL_KILL_CONN during a shutdown read/write
This flag is used to explicitly kill the connection when the CS is closed. It
may be set by tcp rules. It must be respect by the mux-h1.

This patch must be backported to 1.9.
2019-04-12 22:06:53 +02:00
Christopher Faulet
14c91cfdf8 MINOR: mux-h1: Don't release the conn_stream anymore when h1s is destroyed
An H1 stream is destroyed when the conn_stream is detached or when the H1
connection is destroyed. In the first case, the CS is released by the caller. In
the second one, because the connection is closed, no CS is attached anymore. In
both, there is no reason to release the conn_stream in h1s_destroy().
2019-04-12 22:06:53 +02:00
Christopher Faulet
b992af00b6 MEDIUM: mux-h1: Simplify the connection mode management by sanitizing headers
Connection headers are now sanitized during the parsing and the formatting. This
means "close" and "keep-alive" values are always removed but right flags are
set. This way, client side and server side are independent of each other. On the
input side, after the parsing, neither "close" nor "keep-alive" values
remain. So on the output side, if we found one of these values in a connection
headers, it means it was explicitly added by HAProxy. So it overwrites the other
rules, if applicable. Always sanitizing the output is also a way to simplifiy
conditions to update the connection header. Concretly, only additions of "close"
or "keep-alive" values remain, depending the case.

No need to backport this patch.
2019-04-12 22:06:53 +02:00
Christopher Faulet
a51ebb7f56 MEDIUM: h1: Add an option to sanitize connection headers during parsing
The flag H1_MF_CLEAN_CONN_HDR has been added to let the H1 parser sanitize
connection headers. It means it will remove all "close" and "keep-alive" values
during the parsing. One noticeable effect is that connection headers may be
unfolded. In practice, this is not a problem because it is not frequent to have
multiple values for the connection headers.

If this flag is set, during the parsing The function
h1_parse_next_connection_header() is called in a loop instead of
h1_parse_conection_header().

No need to backport this patch
2019-04-12 22:06:53 +02:00
Christopher Faulet
b829f4c726 MINOR: stats/htx: Don't add "Connection: close" header anymore in stats responses
On the client side, as far as possible, we will try to keep connection
alive. So, in most of cases, this header will be removed. So it is better to not
add it at all. If finally the connection must be closed, the header will be
added by the mux h1.

No need to backport this patch.
2019-04-12 22:06:53 +02:00
Christopher Faulet
cdc90e9175 MINOR: mux-h1: Simplify handling of 1xx responses
Because of previous changes on http tunneling, the synchronization of the
transaction can be simplified. Only the check on intermediate messages remains
and it only concerns the response path.

This patch must be backported to 1.9. It is not strictly speaking required but
it will ease futur backports.
2019-04-12 22:06:53 +02:00
Christopher Faulet
c62c2b9d92 BUG/MEDIUM: htx: Fix the process of HTTP CONNECT with h2 connections
In HTX, the HTTP tunneling does not work if h1 and h2 are mixed (an h1 client
sending requests to an h2 server or this opposite) because the h1 multiplexer
always adds an EOM before switching it to tunnel mode. The h2 multiplexer
interprets it as an end of stream, closing the stream as for any other
transaction.

To make it works again, we need to swith to the tunnel mode without emitting any
EOM blocks. Because of that, HTX analyzers have been updated to switch the
transaction to tunnel mode before end of the message (because there is no end of
message...).

To be consistent, the protocol switching is also handled the same way even
though the 101 responses are not supported in h2.

This patch must be backported to 1.9.
2019-04-12 22:06:53 +02:00
Christopher Faulet
03b9d8ba4a MINOR: proto_htx: Don't adjust transaction mode anymore in HTX analyzers
Because the option http-tunnel is now ignored in HTX, there is no longer any
need to adjust the transaction mode in HTX analyzers. A channel can still be
switch to the tunnel mode for legitimate cases (HTTP CONNECT or switching
protocols). So the function htx_adjust_conn_mode() is now useless.

This patch must be backported to 1.9. It is not strictly speaking required but
it will ease futur backports.
2019-04-12 22:06:53 +02:00
Christopher Faulet
6c9bbb2265 MEDIUM: htx: Deprecate the option 'http-tunnel' and ignore it in HTX
The option http-tunnel disables any HTTP processing past the first
transaction. In HTX, it works for full h1 transactions. As for the legacy HTTP,
it is a workaround, but it works. But it is impossible to make it works with an
h2 connection. In such case, it has no effect, the stream is closed at the end
of the transaction. So to avoid any inconsistancies between h1 and h2
connections, this option is now always ignored when the HTX is enabled. It is
also a good opportinity to deprecate an old and ugly option. A warning is
emitted during HAProxy startup to encourage users to remove this option.

Note that in legacy HTTP, this option only works with full h1 transactions
too. If an h2 connection is established on a frontend with this option enabled,
it will have no effect at all. But we keep it for the legacy HTTP for
compatibility purpose. It will be removed with the legacy HTTP.

So to be short, if you have to really (REALLY) use it, it will only work for
legacy HTTP frontends with H1 clients.

The documentation has been updated accordingly.

This patch must be backported to 1.9. It is not strictly speaking required but
it will ease futur backports.
2019-04-12 22:06:53 +02:00
Christopher Faulet
f1449b785e BUG/MEDIUM: htx: Don't crush blocks payload when append is done on a data block
If there is a data block when a header block is added in a HTX message, its
payload will be inserted after the data block payload. But its index will be
moved before the EOH block. So at this stage, if a new data block is added, we
will try to append its payload to the last data block (because it is also the
tail). Thus the payload of the further header block will be crushed.

This cannot happens if the payloads wrap thanks to the previous fix. But it
happens when the tail is not the front too. So now, in this case, we add a new
block instead of appending.

This patch must be backported in 1.9.
2019-04-12 22:06:45 +02:00
Christopher Faulet
05aab64b06 BUG/MEDIUM: htx: Defrag if blocks position is changed and the payloads wrap
When a header is added or when a data block is added before another one, the
blocks position may be changed (but not their payloads position). For instance,
when a header is added, we move the block just before the EOH, if any. When the
payloads wraps, it is pretty annoying because we loose the last inserted
block. It is neither the tail nor the head. And it is not the front either.

It is a design problem. Waiting for fixing this problem, we force a
defragmentation in such case. Anyway, it should be pretty rare, so it's not
really critical.

This patch must be backported to 1.9.
2019-04-12 21:34:30 +02:00
Christopher Faulet
63263e50ed BUG/MINOR: spoe: Be sure to set tv_request when each message fragment is encoded
When a message or a fragment is encoded, the date the frame processing starts
must be set if it is undefined. The test on tv_request field was wrong.

This patch must be backported to 1.9.
2019-04-12 21:33:52 +02:00
Christopher Faulet
a715ea82ea BUG/MEDIUM: spoe: Return an error if nothing is encoded for fragmented messages
If the maximum frame size is very small with a large message or argument name,
it is possible to be unable to encode anything. In such case, it is important to
stop processing returning an error otherwise we will retry in loop to encode the
message, failing each time because of the too small frame size.

This patch must be backported to 1.9 and 1.8.
2019-04-12 16:38:54 +02:00
Christopher Faulet
3e86cec05e BUG/MEDIUM: spoe: Queue message only if no SPOE applet is attached to the stream
If a SPOE applet is already attached to a stream to handle its messages, we must
not queue them. Otherwise it could be handled by another applet leading to
errors. This happens with fragmented messages only. When the first framgnent is
sent, the SPOE applet sending it is attached to the stream. It should be used to
send all other fragments.

This patch must be backported to 1.9 and 1.8.
2019-04-12 16:38:54 +02:00
Willy Tarreau
a8f57d51a0 MINOR: cli/activity: report the accept queue sizes in "show activity"
Seeing the size of each ring helps understand which threads are
overloaded and why some of them are less often elected than others
by the multi-queue load balancer.
2019-04-12 15:54:15 +02:00
Willy Tarreau
64a9c05f37 MINOR: cli/listener: report the number of accepts on "show activity"
The "show activity" command reports the number of incoming connections
dispatched per thread but doesn't report the number of connections
received by each thread. It is important to be able to monitor this
value as it can show that for whatever reason a smaller set of threads
is receiving the connections and dispatching them to all other ones.
2019-04-12 15:54:15 +02:00
Willy Tarreau
0d858446b6 BUG/MINOR: listener: renice the accept ring processing task
It is not acceptable that the accept queues are handled with a normal
priority since they are supposed to quickly dispatch the incoming
traffic, resulting in tasks which will have their respective nice
values and place in the queue. Let's renice the accept ring tasks
to -1024.

No backport is needed, this is strictly 2.0.
2019-04-12 15:54:03 +02:00
Willy Tarreau
587a8130b1 BUG/MINOR: tasks: make sure the first task to be queued keeps its nice value
The run queue offset computed from the nice value depends on the run
queue size, but for the first task to enter the run queue, this size
is zero and the task gets queued just as if its nice value was zero as
well. This is problematic for example for the CLI socket if another
higher priority task gets queued immediately after as it can steal its
place.

This patch simply adds one to the rq_size value to make sure the nice
is never multiplied by zero. The way the offset is calculated is
questionable anyway these days, since with the newer scheduler it seems
that just using the nice value as an offset should work (possibly damped
by the task's number of calls).

This fix must be backported to 1.9. It may possibly be backported to
older versions if it proves to make the CLI more interactive.
2019-04-12 15:54:02 +02:00
Willy Tarreau
f8bce3125e BUG/MEDIUM: task/threads: address a fairness issue between local and global tasks
It is possible to hit a fairness issue in the scheduler when a local
task runs for a long time (i.e. process_stream() returns running), and
a global task wants to run on the same thread and remains in the global
queue. What happens in this case is that the condition to extract tasks
from the global queue will rarely be satisfied for very low task counts
since whatever non-null queue size multiplied by a thread count >1 is
always greater than the small remaining number of tasks in the queue.
In theory another thread should pick the task but we do have some mono
threaded tasks in the global queue as well during inter-thread wakeups.

Note that this can only happen with task counts lower than the thread
counts, typically one task in each queue for more than two threads.

This patch works around the problem by allowing a very small unfairness,
making sure that we can always pick at least one task from the global
queue even if there is already one in the local queue.

A better approach will consist in scanning the two trees in parallel
and always pick the best task. This will be more complex and will
constitute a separate patch.

This fix must be backported to 1.9.
2019-04-12 15:53:43 +02:00
Olivier Houchard
b2fc04ebef BUG/MEDIUM: stream_interface: Don't bother doing chk_rcv/snd if not connected.
If the interface is not in state SI_ST_CON or SI_ST_EST, don't bother
trying to send/recv data, we can't do it anyway, and if we're in SI_ST_TAR,
that may lead to adding the SI_FL_ERR flag back on the stream_interface,
while we don't want it.

This should be backported to 1.9.
2019-04-12 13:14:55 +02:00
Olivier Houchard
56897e20a3 BUG/MEDIUM: streams: Only re-run process_stream if we're in a connected state.
In process_stream(), only try again when there's the SI_FL_ERR flag and we're
in a connected state, otherwise we can loop forever.
It used to work because si_update_both() bogusly removed the SI_FL_ERR flag,
and it would never be set at this point. Now it does, so take that into
account.
Many, many thanks to Maciej Zdeb for reporting the problem, and helping
investigating it.

This should be backported to 1.9.
2019-04-12 13:14:48 +02:00
Emmanuel Hocdet
2b4edfb0bd MINOR: ssl: Activate aes_gcm_dec converter for BoringSSL
BoringSSL can support it, no need to disable.
2019-04-11 15:00:13 +02:00
Robin H. Johnson
543d4507ca MINOR: skip get_gmtime where tm is unused
For LOG_FMT_TS (%Ts), the tm variable is not used, so save some cycles
on the call to get_gmtime.

Backport: 1.9 1.8
Signed-off-by: Robin H. Johnson <rjohnson@digitalocean.com>
2019-04-11 14:58:32 +02:00
Willy Tarreau
0f93672dfe BUG/MEDIUM: pattern: assign pattern IDs after checking the config validity
Pavlos Parissis reported an interesting case where some map identifiers
were not assigned (appearing as -1 in show map). It turns out that it
only happens for log-format expressions parsed in check_config_validity()
that involve maps (log-format, use_backend, unique-id-header), as in the
sample configuration below :

    frontend foo
        bind :8001
        unique-id-format %[src,map(addr.lst)]
        log-format %[src,map(addr.lst)]
        use_backend %[src,map(addr.lst)]

The reason stems from the initial introduction of unique IDs in 1.5 via
commit af5a29d5f ("MINOR: pattern: Each pattern is identified by unique
id.") : the unique_id assignment was done before calling
check_config_validity() so all maps loaded after this call are not
properly configured. From what the function does, it seems they will not
be able to use a cache, will not have a unique_id assigned and will not
be updatable from the CLI.

This fix must be backported to all supported versions.
2019-04-11 14:52:25 +02:00
Olivier Houchard
46453d3f7d MINOR: threads: Implement thread_cpus_enabled() for FreeBSD.
Use cpuset_getaffinity() to implement thread_cpus_enabled() on FreeBSD, so
that we can know the number of CPUs available, and automatically launch as
much threads if nbthread isn't specified.
2019-04-11 00:09:22 +02:00
Olivier Houchard
86dcad6c62 BUG/MEDIUM: stream: Don't clear the stream_interface flags in si_update_both.
In commit d7704b534, we introduced and expiration flag on the stream interface,
which is used for the connect, the queue and the turn around. Because the
turn around state isn't an error, the flag was reset in process_stream(), and
later in commit cff6411f9 when introducing the SI_FL_ERR flag, the cleanup
of the flag at this place was erroneously generalized.
To fix this, the SI_FL_EXP flag is only cleared at the end of the turn around
state, and nobody should clear the stream interface flags anymore.

This should be backported to 1.9, it has no known impact on older versions.
2019-04-09 19:31:22 +02:00
Olivier Houchard
120f64a8c4 BUG/MEDIUM: streams: Store prev_state before calling si_update_both().
As si_update_both() sets prev_state to state for each stream_interface, if
we want to check it changed, copy it before calling si_update_both().

This should be backported to 1.9.
2019-04-09 19:31:22 +02:00
Olivier Houchard
39cc020af1 BUG/MEDIUM: streams: Don't remove the SI_FL_ERR flag in si_update_both().
Don't inconditionally remove the SI_FL_ERR code in si_update_both(), which
is called at the end of process_stream(). Doing so was a bug that was there
since the flag was introduced, because we were always setting si->flags to
SI_FL_NONE, however we don't want to lose that one, except if we will retry
connecting, so only remove it in sess_update_st_cer().

This should be backported to 1.9.
2019-04-09 19:31:22 +02:00
Willy Tarreau
90caa07935 BUG/MEDIUM: htx: fix random premature abort of data transfers
It can happen in some cases that the last block of an H2 transfer over
HTX is truncated. This was tracked down to a leftover of an earlier
implementation of htx_xfer_blks() causing the computed size of a block
to be incorrectly calculated if a data block doesn't completely fit into
the target buffer. In practice it causes the EOM block to be attempted to
be emitted with a wrong size and the message to be truncated. One way to
reproduce this is to chain two haproxy instances in h1->h2->h1 with
httpterm as the server and h2load as the client, making many requests
between 8 and 10kB over a single connection. Usually one of the very
first requests will fail.

This fix must be backported to 1.9.
2019-04-09 16:30:20 +02:00
Olivier Houchard
3ca18bf0bd BUG/MEDIUM: h2: Don't attempt to recv from h2_process_demux if we subscribed.
Modify h2c_restart_reading() to add a new parameter, to let it know if it
should consider if the buffer isn't empty when retrying to read or not, and
call h2c_restart_reading() using 0 as a parameter from h2_process_demux().
If we're leaving h2_process_demux() with a non-empty buffer, it means the
frame is incomplete, and we're waiting for more data, and if we already
subscribed, we'll be waken when more data are available.
Failing to do so means we'll be waken up in a loop until more data are
available.

This should be backported to 1.9.
2019-04-05 16:03:54 +02:00
Emeric Brun
9ef2ad7844 BUG/MEDIUM: peers: fix a case where peer session is not cleanly reset on release.
The deinit took place in only peer_session_release, but in the a case of a
previous call to peer_session_forceshutdown, the session cursors
won't be reset, resulting in a bad state for new session of the same
peer. For instance, a table definition message could be dropped and
so all update messages will be dropped by the remote peer.

This patch move the deinit processing directly in the force shutdown
funtion. Killed session remains in "ST_END" state but ref on peer was
reset to NULL and deinit will be skipped on session release function.

The session release continue to assure the deinit for "active" sessions.

This patch should be backported on all stable version since proto
peers v2.
2019-04-03 14:42:10 +02:00
Christopher Faulet
aed68d4390 BUG/MINOR: proto_htx: Reset to_forward value when a message is set to DONE
Because we try to forward infinitly message body, when its state is set to DONE,
we must be sure to reset to_foward value of the corresponding
channel. Otherwise, some errors can be errornously triggered.

No need to backport this patch.
2019-04-01 15:43:40 +02:00
William Lallemand
33d29e2a11 MINOR: cli: export HAPROXY_CLI environment variable
Export the HAPROXY_CLI environment variable which contains the list of
all stats sockets (including the sockpair@) separated by semicolons.
2019-04-01 14:45:37 +02:00
William Lallemand
e58915f07f MINOR: cli: start addresses by a prefix in 'show cli sockets'
Displays a prefix for every addresses in 'show cli sockets'.
It could be 'unix@', 'ipv4@', 'ipv6@', 'abns@' or 'sockpair@'.

Could be backported in 1.9 and 1.8.
2019-04-01 14:45:37 +02:00
William Lallemand
75812a7a3c BUG/MINOR: cli: correctly handle abns in 'show cli sockets'
The 'show cli sockets' was not handling the abns sockets. This is a
problem since it uses the AF_UNIX family, it displays nothing
in the path column because the path starts by \0.

Should be backported to 1.9 and 1.8.
2019-04-01 14:45:37 +02:00
William Lallemand
ad53d6dd75 MINOR: mworker/cli: show programs in 'show proc'
Show the programs in 'show proc'

Example:

	# programs
	2285            dataplane-api   -               0               0d 00h00m12s
	# old programs
	2261            dataplane-api   -               1               0d 00h00m53s
2019-04-01 14:45:37 +02:00
William Lallemand
9a1ee7ac31 MEDIUM: mworker-prog: implement program for master-worker
This patch implements the external binary support in the master worker.

To configure an external process, you need to use the program section,
for example:

	program dataplane-api
		command ./dataplane_api

Those processes are launched at the same time as the workers.

During a reload of HAProxy, those processes are dealing with the same
sequence as a worker:

  - the master is re-executed
  - the master sends a USR1 signal to the program
  - the master launches a new instance of the program

During a stop, or restart, a SIGTERM is sent to the program.
2019-04-01 14:45:37 +02:00
William Lallemand
88dc7c5de9 REORG: mworker/cli: move CLI functions to mworker.c
Move the CLI functions of the master worker to mworker.c
2019-04-01 14:45:37 +02:00
William Lallemand
3f12887ffa MINOR: mworker: don't use children variable anymore
The children variable is still used in haproxy, it is not required
anymore since we have the information about the current workers in the
mworker_proc linked list.

The oldpids array is also replaced by this linked list when we
generated the arguments for the master reexec.
2019-04-01 14:45:37 +02:00
William Lallemand
f3a86831ae MINOR: mworker: calloc mworker_proc structures
Initialize mworker_proc structures to 0 with calloc instead of just
doing a malloc.
2019-04-01 14:45:37 +02:00
William Lallemand
9001ce8c2f REORG: mworker: move mworker_cleanlisteners to mworker.c 2019-04-01 14:45:37 +02:00
William Lallemand
e25473c846 REORG: mworker: move signal handlers and related functions
Move the following functions to mworker.c:

void mworker_catch_sighup(struct sig_handler *sh);
void mworker_catch_sigterm(struct sig_handler *sh);
void mworker_catch_sigchld(struct sig_handler *sh);

static void mworker_kill(int sig);
int current_child(int pid);
2019-04-01 14:45:37 +02:00
William Lallemand
3fa724db87 REORG: mworker: move IPC functions to mworker.c
Move the following functions to mworker.c:

void mworker_accept_wrapper(int fd);
void mworker_pipe_register();
2019-04-01 14:45:37 +02:00
William Lallemand
3cd95d2f1b REORG: mworker: move signals functions to mworker.c
Move the following functions to mworker.c:

void mworker_block_signals();
void mworker_unblock_signals();
2019-04-01 14:45:37 +02:00
William Lallemand
48dfbbdea9 REORG: mworker: move serializing functions to mworker.c
Move the 2 following functions to mworker.c:

void mworker_proc_list_to_env()
void mworker_env_to_proc_list()
2019-04-01 14:45:37 +02:00
Nenad Merdanovic
c31499d747 MINOR: ssl: Add aes_gcm_dec converter
The converter can be used to decrypt the raw byte input using the
AES-GCM algorithm, using provided nonce, key and AEAD tag. This can
be useful to decrypt encrypted cookies for example and make decisions
based on the content.
2019-04-01 13:33:31 +02:00
Willy Tarreau
0ca24aa028 BUILD: connection: fix naming of ip_v field
AIX defines ip_v as ip_ff.ip_fv in netinet/ip.h using a macro, and
unfortunately we do have a local variable with such a name and which
uses the same header file. Let's rename the variable to ip_ver to fix
this.
2019-04-01 07:44:56 +02:00
Willy Tarreau
a1bd1faeeb BUILD: use inttypes.h instead of stdint.h
I found on an (old) AIX 5.1 machine that stdint.h didn't exist while
inttypes.h which is expected to include it does exist and provides the
desired functionalities.

As explained here, stdint being just a subset of inttypes for use in
freestanding environments, it's probably always OK to switch to inttypes
instead:

  https://pubs.opengroup.org/onlinepubs/009696799/basedefs/stdint.h.html

Also it's even clearer here in the autoconf doc :

  https://www.gnu.org/software/autoconf/manual/autoconf-2.61/html_node/Header-Portability.html

  "The C99 standard says that inttypes.h includes stdint.h, so there's
   no need to include stdint.h separately in a standard environment.
   Some implementations have inttypes.h but not stdint.h (e.g., Solaris
   7), but we don't know of any implementation that has stdint.h but not
   inttypes.h"
2019-04-01 07:44:56 +02:00
Willy Tarreau
7b5654f54a BUILD: re-implement an initcall variant without using executable sections
The current initcall implementation relies on dedicated sections (one
section per init stage) to store the initcall descriptors. Then upon
startup, these sections are scanned from beginning to end and all items
found there are called in sequence.

On platforms like AIX or Cygwin it seems difficult to figure the
beginning and end of sections as the linker doesn't seem to provide
the corresponding symbols. In order to replace this, this patch
simply implements an array of single linked (one per init stage)
which are fed using constructors for each register call. These
constructors are declared static, with a name depending on their
line number in the file, in order to avoid name clashes. The final
effect is the same, except that the method is slightly more expensive
in that it explicitly produces code to register these initcalls :

$ size  haproxy.sections haproxy.constructor
   text    data     bss     dec     hex filename
4060312  249176 1457652 5767140  57ffe4 haproxy.sections
4062862  260408 1457652 5780922  5835ba haproxy.constructor

This mechanism is enabled as an alternative to the default one when
build option USE_OBSOLETE_LINKER is set. This option is currently
enabled by default only on AIX and Cygwin, and may be attempted for
any target which fails to build complaining about missing symbols
__start_init_* and/or __stop_init_*.

Once confirmed as a reliable fix, this will likely have to be backported
to 1.9 where AIX and Cygwin do not build anymore.
2019-04-01 07:43:07 +02:00
Willy Tarreau
9d22e56178 MINOR: tools: add an unsetenv() implementation
Older Solaris and AIX versions do not have unsetenv(). This adds a
fairly simple implementation which scans the environment, for use
with those systems. It will simply require to pass the define in
the "DEFINE" macro at build time like this :

      DEFINE="-Dunsetenv=my_unsetenv"
2019-03-29 21:05:37 +01:00
Willy Tarreau
e0609f5f49 MINOR: tools: make memvprintf() never pass a NULL target to vsnprintf()
Most modern platforms don't touch the output buffer when the size
argument is null, but there exist a few old ones (like AIX 5 and
possibly Tru64) where the output will be dereferenced anyway, probably
to write the trailing null, crashing the process. memprintf() uses this
to measure the desired length.

There is a very simple workaround to this consisting in passing a pointer
to a character instead of a NULL pointer. It was confirmed to fix the issue
on AIX 5.1.
2019-03-29 21:03:39 +01:00
Willy Tarreau
2231b63887 BUILD: cache: avoid a build warning with some compilers/linkers
The struct http_cache_applet was fully declared at the beginning
instead of just doing a forward declaration using an extern modifier.
Some linkers report warnings about a redefined symbol since these
really are two complete declarations.

The proper way to do this is to use extern on the first one and to
have a full declaration later. However it's not permitted to have
both static and extern so the change done in commit 0f2229943
("CLEANUP: cache: don't export http_cache_applet anymore") has to
be partially undone.

This should be backported to 1.9 for sanity but has no effet on
most platforms. However on 1.9 the extern keyword must also be
added to include/types/cache.h.
2019-03-29 21:03:24 +01:00
Ricardo Nabinger Sanchez
4bccea9891 BUG/MAJOR: checks: segfault during tcpcheck_main
When using TCP health checks (tcp-check connect), it is possible to
crash with a segfault when, for reasons yet to be understood, the
protocol family is unknown.

In the function tcpcheck_main(), proto is dereferenced without a prior
test in case it is NULL, leading to the segfault during proto->connect
dereference.

The line has been unmodified since it was introduced, in commit
69e273f3fc.  This was the only use of
proto (or more specifically, the return of  protocol_by_family()) that
was unprotected; all other callsites perform the test for a NULL
pointer.

This patch should be backported to 1.9, 1.8, 1.7, and 1.6.
2019-03-29 11:12:35 +01:00
Olivier Houchard
06f6811d9f BUG/MEDIUM: checks: Don't bother subscribing if we have a connection error.
In __event_srv_chk_r() and __event_srv_chk_w(), don't bother subscribing
if we're waiting for a handshake, but we had a connection error. We will
never be able to send/receive anything on that connection anyway, and
the conn_stream is probably about to be destroyed, and we will crash if
the tasklet is waken up.
I'm not convinced we need to subscribe here at all anyway, but I'd rather
modify the check code as little as possible.

This should be backported to 1.9.
2019-03-28 17:32:42 +01:00
William Lallemand
f94afebb94 BUG/MEDIUM: mworker: don't free the wrong child when not found
A bug occurs when the sigchld handler is called and a child which is
not in the process list just left, or with an empty process list.

The child variable won't be set and left as an uninitialized variable or
set to the wrong child entry, which can lead to a free of this
uninitialized variable or of the wrong child.

This can lead to a crash of the master during a stop or a reload.

It is not supposed to happen with a worker which was created by the
master. A cause could be a fork made by a dependency. (openssl, lua ?)

This patch strengthens the case of the missing child by doing the free
only if the child was found.

This patch must be backported to 1.9.
2019-03-28 11:36:18 +01:00
Christopher Faulet
5220ef25e3 BUG/MINOR: mux-h1: Only skip invalid C-L headers on output
When an HTTP request with an empty body is received, the flag HTX_SL_F_BODYLESS
is set on the HTX start-line block. It is true if the header content-length is
explicitly set to 0 or if it is omitted for a non chunked request.

On the server side, when the request is reformatted, because HTX_SL_F_BODYLESS
is set, the flag H1_MF_CLEN is added on the request parser. It is done to not
add an header transfer-encoding on bodyless requests. But if an header
content-length is explicitly set to 0, when it is parsed, because H1_MF_CLEN is
set, the function h1_parse_cont_len_header() returns 0, meaning the header can
be dropped. So in such case, a request without any header content-length is sent
to the server.

Some servers seems to reject empty POST requests with an error 411 when there is
no header content-length. So to fix this issue, on the output side, only headers
with an invalid content length are skipped, ie only when the function
h1_parse_cont_len_header() returns a negative value.

This patch must be backported to 1.9.
2019-03-28 10:00:36 +01:00
David Carlier
5671662f08 BUILD/MINOR: listener: Silent a few signedness warnings.
Silenting couple of warnings related to signedness, due to a mismatch of
signed and unsigned ints with l->nbconn, actconn and p->feconn.
2019-03-27 17:37:44 +01:00
Frédéric Lécaille
b7405c1c50 BUG/MINOR: peers: Missing initializations after peer session shutdown.
This patch fixes a bug introduced by 045e0d4 commit where it was really a bad
idea to reset the peer applet context before shutting down the underlying
session. This had as side effect to cancel the re-initializations done by
peer_session_release(), especially prevented this function from re-initializing
the current table pointer which is there to force annoucement of stick-table
definitions on when reconnecting. Consequently the peers could send stick-table
update messages without a first stick-table definition message. As this is
forbidden, this leaded the remote peers to close the sessions.
2019-03-27 15:16:25 +01:00
Willy Tarreau
7728ed3565 BUILD: report the whole feature set with their status in haproxy -vv
It's not convenient not to know the status of default options, and
requires the user to know what option is enabled by default in each
target. With this patch, a new "Features list" line is added to the
output of "haproxy -vv" to report the whole list of known features
with their respective status. They're prefixed with a "+" when enabled
or a "-" when disabled. The "USE_" prefix is removed for clarity.
2019-03-27 14:32:58 +01:00
Frédéric Lécaille
54bff83f43 CLEANUP: peers: replace timeout constants by macros.
This adds two macros PEER_RESYNC_TIMEOUT and PEER_RECONNECT_TIMEOUT
both set to 5 seconds in order to remove magic timeouts which appear
in the code.
2019-03-26 10:54:06 +01:00
Frédéric Lécaille
aba44a2abc CLEANUP: peers: remove useless annoying tabulations.
There were tabs in between macro names and their values in their
definition, forcing everyone to do the same, and causing some
mangling in patches. Let's fix all this.
2019-03-26 10:53:09 +01:00
Frédéric Lécaille
045e0d4b3b BUG/MINOR: peers: Really close the sessions with no heartbeat.
645635d commit was not sufficient to implement the heartbeat feature.
When no heartbeat was received before its timeout has expired the session was not
closed due to the fact that process_peer_sync() which is the task responsible of
handling the heartbeat and session expirations only checked the heartbeat timeout,
and sent a heartbeat message if it has expired. This has as side
effect to leave the session opened. On the remote side, a peer which receives a
heartbeat message, even if not supported, does not close the session.
Furthermore it not sufficient to update ->reconnect peer member field to schedule
a peer session release.

With this patch, a peer is flagged as alive as soon as it received peer protocol
messages (and not only heartbeat messages). When no updates must be sent,
we first check the reconnection timeout (->reconnect peer member field). If expired,
we really shutdown the session if the peer is not alive, but if the peer seen as alive,
we reset this flag and update the ->reconnect for the next period.
If the reconnection timeout has not expired, then we check the heartbeat timeout
which is there only to emit heartbeat messages upon expirations. If expired, as before this
patch we increment the heartbeat timeout by 3s to schedule the next heartbeat message
then we emit a heartbeat message waking up the peer I/O handler.
In every cases we update the task expiration to the earlier time between the
reconnection time and the heartbeat timeout time so that to be sure to check
again these two ->reconnect and ->heartbeat timers.
2019-03-26 10:51:12 +01:00
Willy Tarreau
65e04eb2bb MINOR: channel: don't unset CF_SHUTR_NOW after shutting down.
This flag is set by the stream layer to request an abort, and results in
CF_SHUTR being set once the abort is performed. However by analogy with
the send side, the flag was removed once the CF_SHUTR flag was set, thus
we lose the information about the cause of the shutr. This is what creates
the confusion that sometimes arises between client and server aborts.

This patch makes sure we don't remove this flag anymore in this case.
All call places only use it to perform the shutr and already check it
against CF_SHUTR. So no condition needs to be updated to take this into
account.

Some later, more careful changes may consist in refining the conditions
where we report a client reset or a server reset to ignore SHUTR when
SHUTR_NOW is set so that we don't report such misleading information
anymore.
2019-03-25 18:35:05 +01:00
Willy Tarreau
a27db38f12 BUG/MEDIUM: mux-h2: make sure to always notify streams of EOS condition
Recent commit 63768a63d ("MEDIUM: mux-h2: Don't mix the end of the message
with the end of stream") introduced a race which may manifest itself with
small connection counts on large objects and large server timeouts in
legacy mode. Sometimes h2s_close() is called while the data layer is
subscribed to read events but nothing in the chain can cause this wake-up
to happen and some streams stall for a while at the end of a transfer
until the server timeout strikes and ends the stream completes.

We need to wake the stream up if it's subscribed to rx events there,
which is what this patch does. When the patch above is backported to
1.9, this patch will also have to be backported.
2019-03-25 18:13:16 +01:00
Willy Tarreau
e73256fd2a BUG/MEDIUM: task/h2: add an idempotent task removal fucntion
Previous commit 3ea351368 ("BUG/MEDIUM: h2: Remove the tasklet from the
task list if unsubscribing.") uncovered an issue which needs to be
addressed in the scheduler's API. The function task_remove_from_task_list()
was initially designed to remove a task from the running tasklet list from
within the scheduler, and had to be used in h2 to abort pending I/O events.
However this function was not designed to be idempotent, occasionally
causing a double removal from the tasklet list, with the second doing
nothing but affecting the apparent tasks count and making haproxy use
100% CPU on some tests consisting in stopping the client during some
transfers. The h2_unsubscribe() function can sometimes be called upon
stream exit after an error where the tasklet was possibly already
removed, so it.

This patch does 2 things :
  - it renames task_remove_from_task_list() to
    __task_remove_from_tasklet_list() to discourage users from calling
    it. Also note the fix in the naming since it's a tasklet list and
    not a task list. This function is still uesd from the scheduler.
  - it adds a new, idempotent, task_remove_from_tasklet_list() function
    which does nothing if the task is already not in the tasklet list.

This patch will need to be backported where the commit above is backported.
2019-03-25 18:02:54 +01:00
Olivier Houchard
3ea3513689 BUG/MEDIUM: h2: Remove the tasklet from the task list if unsubscribing.
In h2_unsubscribe(), if we unsubscribe on SUB_CALL_UNSUBSCRIBE, then remove
ourself from the sending_list, and remove the tasklet from the task list.
We're probably about to destroy the stream anyway, so we don't want the
tasklet to run, or to stay in the sending_list, or it could lead to a crash.

This should be backpored to 1.9.
2019-03-25 14:34:26 +01:00
Olivier Houchard
afc7cb85c4 BUG/MEDIUM: h2: Follow the same logic in h2_deferred_shut than in h2_snd_buf.
In h2_deferred_shut(), don't just set h2s->send_wait to NULL, instead, use
the same logic as in h2_snd_buf() and only do so if we successfully sent data
(or if we don't want to send them anymore). Setting it to NULL can lead to
crashes.

This should be backported to 1.9.
2019-03-25 14:34:26 +01:00
Olivier Houchard
fd1e96d2fb BUG/MEDIUM: h2: Use the new sending_list in h2s_notify_send().
In h2s_notify_send(), use the new sending_list instead of using the old
way of setting hs->send_wait to NULL, failing to do so may lead to crashes.

This should be backported to 1.9.
2019-03-25 14:34:26 +01:00
Olivier Houchard
01d4cb5339 BUG/MEDIUM: h2: only destroy the h2s if h2s->cs is NULL.
In h2_deferred_shut(), only attempt to destroy the h2s if h2s->cs is NULL.
h2s->cs being non-NULL means it's still referenced by the stream interface,
so it may try to use it later, and that could lead to a crash.

This should be backported to 1.9.
2019-03-25 13:35:02 +01:00
Christopher Faulet
66af0b2b99 MEDIUM: proto_htx: Reintroduce the infinite forwarding on data
This commit was reverted because of bugs. Now it should be ok. Difference with
the commit f52170d2f ("MEDIUM: proto_htx: Switch to infinite forwarding if there
is no data filte") is that when the infinite forwarding is enabled, the message
is switched to the state HTTP_MSG_DONE if the flag CF_EOI is set.
2019-03-25 06:55:23 +01:00
Christopher Faulet
87a8f353f1 CLEANUP: muxes/stream-int: Remove flags CS_FL_READ_NULL and SI_FL_READ_NULL
Since the flag CF_SHUTR is no more set to mark the end of the message, these
flags become useless.

This patch should be backported to 1.9.
2019-03-25 06:55:23 +01:00
Christopher Faulet
769d0e98b8 BUG/MEDIUM: http/htx: Fix handling of the option abortonclose
Because the flag CF_SHUTR is no more set to mark the end of the message by the
H2 multiplexer, we can rely on it again to detect aborts. there is no more need
to make a check on the flag SI_FL_CLEAN_ABRT when the option abortonclose is
enabled. So, this option should work as before for h2 clients.

This patch must be backported to 1.9 with the previous EOI patches.
2019-03-25 06:55:13 +01:00
Christopher Faulet
dbe2cb4ee5 MINOR: mux-h1: Set CS_FL_EOI the end of the message is reached
As for the H2 multiplexer, When the end of a message is detected, the flag
CS_FL_EOI is set on the conn_stream.

This patch should be backported to 1.9.
2019-03-25 06:33:53 +01:00
Christopher Faulet
63768a63d7 MEDIUM: mux-h2: Don't mix the end of the message with the end of stream
The H2 multiplexer now sets CS_FL_EOI when it receives a frame with the ES
flag. And when the H2 streams is closed, it set the flag CS_FL_REOS.

This patch should be backported to 1.9.
2019-03-25 06:26:30 +01:00
Christopher Faulet
297d3e2e0f MINOR: channel: Report EOI on the input channel if it was reached in the mux
The flag CF_EOI is now set on the input channel when the flag CS_FL_EOI is set
on the corresponding conn_stream. In addition, if a read activity is reported
when this flag is set, the stream is woken up.

This patch should be backported to 1.9.
2019-03-25 06:24:43 +01:00
Christopher Faulet
3ab07c35b4 MINOR: mux-h2: Remove useless test on ES flag in h2_frt_transfer_data()
Same test is already performed in the caller function, h2c_frt_handle_data().

This patch should be backported to 1.9.
2019-03-22 18:06:17 +01:00
Christopher Faulet
2f5c784864 BUG/MINOR: proto-http: Don't forward request body anymore on error
In the commit 93e02d8b7 ("MINOR: proto-http/proto-htx: Make error handling
clearer during data forwarding"), a return clause was removed by error in the
function http_request_forward_body(). This bug seems not having any visible
impact.

This patch must be backported to 1.9.
2019-03-22 18:05:50 +01:00
Olivier Houchard
d360ac60f4 BUG/MEDIUM: h2: Try to be fair when sending data.
On the send path, try to be fair, and make sure the first to attempt to
send data will actually be the first to send data when it's possible (ie
when the mux' buffer is not full anymore).
To do so, use a separate list element for the sending_list, and only remove
the h2s from the send_list/fctl_list if we successfully sent data. If we did
not, we'll keep our place in the list, and will be able to try again next time.

This should be backported to 1.9.
2019-03-22 18:05:03 +01:00
Radek Zajic
594c456d14 BUG/MINOR: log: properly format IPv6 address when LOG_OPT_HEXA modifier is used.
In lf_ip(), when LOG_OPT_HEXA modifier is used, there is a code to format the
IP address as a hexadecimal string. This code does not properly handle cases
when the IP address is IPv6. In such case, the code only prints `00000000`.

This patch adds support for IPv6. For legacy IPv4, the format remains
unchanged. If IPv6 socket is used to accept IPv6 connection, the full IPv6
address is returned. For example, IPv6 localhost, ::1, is printed as
00000000000000000000000000000001.

If IPv6 socket accepts IPv4 connection, the IPv4 address is mapped by the
kernel into the IPv4-mapped-IPv6 address space (RFC4291, section 2.5.5.2)
and is formatted as such. For example, 127.0.0.1 becomes ::ffff:127.0.0.1,
which is printed as 00000000000000000000FFFF7F000001.

This should be backported to 1.9.
2019-03-22 17:31:18 +01:00
Pierre Cheynier
bc34cd1de2 BUG/MEDIUM: ssl: ability to set TLS 1.3 ciphers using ssl-default-server-ciphersuites
Any attempt to put TLS 1.3 ciphers on servers failed with output 'unable
to set TLS 1.3 cipher suites'.

This was due to usage of SSL_CTX_set_cipher_list instead of
SSL_CTX_set_ciphersuites in the TLS 1.3 block (protected by
OPENSSL_VERSION_NUMBER >= 0x10101000L & so).

This should be backported to 1.9 and 1.8.

Signed-off-by: Pierre Cheynier <p.cheynier@criteo.com>
Reported-by: Damien Claisse <d.claisse@criteo.com>
Cc: Emeric Brun <ebrun@haproxy.com>
2019-03-22 17:24:14 +01:00
Willy Tarreau
749f5cab83 CLEANUP: mux-h2: add some comments to help understand the code
Some functions' roles and usage are far from being obvious, and diving
into this part each time requires deep concentration before starting to
understand who does what. Let's add a few comments which help figure
some of the useful pieces.
2019-03-21 19:19:36 +01:00
Willy Tarreau
8ab128c06a MINOR: mux-h2: copy small data blocks more often and reduce the number of pauses
We tend to refrain from sending data a bit too much in the H2 mux :
whenever there are pending data in the buffer and we try to copy
something larger than 1/4 of the buffer we prefer to pause. This
is suboptimal for medium-sized objects which have to send their
headers and later their data.

This patch slightly changes this by allowing a copy of a large block
if it fits at once and if the realign cost is small, i.e. the pending
data are small or the block fits in the contiguous area. Depending on
the object size this measurably improves the download performance by
between 1 and 10%, and possibly lowers the transfer latency for medium
objects.
2019-03-21 18:28:31 +01:00
Olivier Houchard
fd8bd4521a BUG/MEDIUM: mux-h2: Use the right list in h2_stop_senders().
In h2_stop_senders(), when we're about to move the h2s about to send back
to the send_list, because we know the mux is full, instead of putting them
all in the send_list, put them back either in the fctl_list or the send_list
depending on if they are waiting for the flow control or not. This also makes
sure they're inserted in their arrival order and not reversed.

This should be backported to 1.9.
2019-03-21 18:28:31 +01:00
Olivier Houchard
16ff261633 BUG/MEDIUM: mux-h2: Don't bother keeping the h2s if detaching and nothing to send.
In h2_detach(), don't bother keeping the h2s even if it was waiting for
flow control if we no longer are subscribed for receiving or sending, as
nobody will do anything once we can write in the mux, anyway. Failing to do
so may lead to h2s being kept opened forever.

This should be backported to 1.9.
2019-03-21 18:28:31 +01:00
Olivier Houchard
7a977431ca BUG/MEDIUM: mux-h2: Make sure we destroyed the h2s once shutr/shutw is done.
If we're waiting until we can send a shutr and/or a shutw, once we're done
and not considering sending anything, destroy the h2s, and eventually the
h2c if we're done with the whole connection, or it will never be done.

This should be backported to 1.9.
2019-03-21 18:28:31 +01:00
Willy Tarreau
6e8d6a9163 Revert "MEDIUM: proto_htx: Switch to infinite forwarding if there is no data filter"
This reverts commit f52170d2f4.

This commit was merged too early, some areas are not ready and
transfers from H1 to H2 often stall. Christopher suggested to wait
for the other parts to be ready before reintroducing it.
2019-03-21 18:28:31 +01:00
Christopher Faulet
18c2e8dc0f MINOR: lua: Don't handle the header Expect in lua HTTP applets anymore
This header is now handled in HTTP analyzers the same way for all HTTP applets.
2019-03-19 09:58:35 +01:00
Willy Tarreau
0f22299435 CLEANUP: cache: don't export http_cache_applet anymore
This one can become static since it's not used by http/htx anymore.
2019-03-19 09:58:35 +01:00
Christopher Faulet
2571bc6410 MINOR: http/applets: Handle all applets intercepting HTTP requests the same way
In addition to stats and cache applets, there are also HTTP applet services
declared in an http-request rule. All these applets are now handled the same
way. Among other things, the header Expect is handled at the same place for all
these applets.
2019-03-19 09:54:20 +01:00
Christopher Faulet
bcf242a1d5 MINOR: stats/cache: Handle the header Expect when applets are registered
First of all, it is a way to handle 100-Continue for the cache without
duplicating code. Then, for the stats, it is no longer necessary to wait for the
request body.
2019-03-19 09:53:14 +01:00
Christopher Faulet
4a28a536a3 MINOR: proto_htx: Add function to handle the header "Expect: 100-continue"
The function htx_handle_expect_hdr() is now responsible to search the header
"Expect" and send the corresponding response if necessary.
2019-03-19 09:51:38 +01:00
Christopher Faulet
87451fd0bf MINOR: proto_http: Add function to handle the header "Expect: 100-continue"
The function http_handle_expect_hdr() is now responsible to search the header
"Expect" and send the corresponding response if necessary.
2019-03-19 09:50:54 +01:00
Christopher Faulet
56a3d6e1f1 BUG/MEDIUM: lua: Fully consume large requests when an HTTP applet ends
In Lua, when an HTTP applet ends (in HTX and legacy HTTP), we must flush
remaining outgoing data on the request. But only outgoing data at time the
applet is called are consumed. If a request with a huge body is sent, an error
is triggerred because a SHUTW is catched for an unfinisehd request.

Now, we consume request data until the end. In fact, we don't try to shutdown
the request's channel for write anymore.

This patch must be backported to 1.9 after some observation period. It should
probably be backported in prior versions too. But honnestly, with refactoring
on the connection layer and the stream interface in 1.9, it is probably safer
to not do so.
2019-03-19 09:49:50 +01:00
Christopher Faulet
3a78aa6e95 BUG/MINOR: stats: Fully consume large requests in the stats applet
In the stats applet (in HTX and legacy HTTP), after a response is fully sent to
a client, the request is consumed. It is done at the end, after all the response
was copied into the channel's buffer. But only outgoing data at time the applet
is called are consumed. Then the applet is closed. If a request with a huge body
is sent, an error is triggerred because a SHUTW is catched for an unfinisehd
request.

Now, we consume request data until the end. In fact, we don't try to shutdown
the request's channel for write anymore.

This patch must be backported to 1.9 after some observation period. It should
probably be backported in prior versions too. But honnestly, with refactoring
on the connection layer and the stream interface in 1.9, it is probably safer
to not do so.
2019-03-19 09:49:29 +01:00
Christopher Faulet
adb363135c BUG/MINOR: cache: Fully consume large requests in the cache applet
In the cache applet (in HTX and legacy HTTP), when an cached object is sent to a
client, the request must be consumed. It is done at the end, after all the
response was copied into the channel's buffer. But only outgoing data at time
the applet is called are consumed. Then the applet is closed. If a request with
a huge body is sent, an error is triggerred because a SHUTW is catched on an
unfinished request.

Now, we consume request data as soon as possible and we do it until the end. In
fact, we don't try to shutdown the request's channel for write anymore.

This patch must be backported to 1.9 after some observation period.
2019-03-19 09:49:08 +01:00
Christopher Faulet
f52170d2f4 MEDIUM: proto_htx: Switch to infinite forwarding if there is no data filter
Because in HTX the parsing is done by the multiplexers, there is no reason to
limit the amount of data fast-forwarded. Of course, it is only true when there
is no data filter registered on the corresponding channel. So now, we enable the
infinite forwarding when possible. However, the HTTP message state remains
HTTP_MSG_DATA. Then, when infinite forwarding is enabled, if the flag CF_SHUTR
is set, the state is switched to HTTP_MSG_DONE.
2019-03-19 09:48:05 +01:00
Willy Tarreau
679bba13f7 MINOR: init: report the list of optionally available services
It's never easy to guess what services are built in. We currently have
the prometheus exporter in contrib/ which is the only extension for now.
Let's enumerate all available ones just like we do for filterr and pollers.
2019-03-19 08:08:10 +01:00
Willy Tarreau
9b6be3bbeb BUILD: tools: fix a build warning on some 32-bit archs
Some recent versions of gcc apparently can detect that x >> 32 will not
work on a 32-bit architecture, but are failing to see that the code will
not be built since it's enclosed in "if (sizeof(LONG) > 4)" or equivalent.
Just shift right twice by 16 bits in this case, the compiler correctly
replaces it by a single 32-bit shift.

No backport is needed.
2019-03-18 16:33:15 +01:00
Christopher Faulet
93e02d8b73 MINOR: proto-http/proto-htx: Make error handling clearer during data forwarding
It is just a cleanup. Error handling is grouped at the end HTTP data analysers.

This patch must be backported to 1.9 because it is used by another patch to fix
a bug.
2019-03-18 15:50:23 +01:00
Christopher Faulet
203b2b0a5a MINOR: muxes: Report the Last read with a dedicated flag
For conveniance, in HTTP muxes (h1 and h2), the end of the stream and the end of
the message are reported the same way to the stream, by setting the flag
CS_FL_EOS. In the stream-interface, when CS_FL_EOS is detected, a shutdown for
read is reported on the channel side. This is historical. With the legacy HTTP
layer, because the parsing is done by the stream in HTTP analyzers, the EOS
really means a shutdown for read.

Most of time, for muxes h1 and h2, it works pretty well, especially because the
keep-alive is handled by the muxes. The stream is only used for one
transaction. So mixing EOS and EOM is good enough. But not everytime. For now,
client aborts are only reported if it happens before the end of the request. It
is an error and it is properly handled. But because the EOS was already
reported, client aborts after the end of the request are silently
ignored. Eventually an error can be reported when the response is sent to the
client, if the sending fails. Otherwise, if the server does not reply fast
enough, an error is reported when the server timeout is reached. It is the
expected behaviour, excpect when the option abortonclose is set. In this case,
we must report an error when the client aborts. But as said before, this event
can be ignored. So to be short, for now, the abortonclose is broken.

In fact, it is a design problem and we have to rethink all channel's flags and
probably the conn-stream ones too. It is important to split EOS and EOM to not
loose information anymore. But it is not a small job and the refactoring will be
far from straightforward.

So for now, temporary flags are introduced. When the last read is received, the
flag CS_FL_READ_NULL is set on the conn-stream. This way, we can set the flag
SI_FL_READ_NULL on the stream interface. Both flags are persistant. And to be
sure to wake the stream, the event CF_READ_NULL is reported. So the stream will
always have the chance to handle the last read.

This patch must be backported to 1.9 because it will be used by another patch to
fix the option abortonclose.
2019-03-18 15:50:23 +01:00
Christopher Faulet
35757d38ce MINOR: mux-h2: Set REFUSED_STREAM error to reset a stream if no data was never sent
According to the H2 spec (see #8.1.4), setting the REFUSED_STREAM error code
is a way to indicate that the stream is being closed prior to any processing
having occurred, such as when a server-side H1 keepalive connection is closed
without sending anything (which differs from the regular error case since
haproxy doesn't even generate an error message). Any request that was sent on
the reset stream can be safely retried. So, when a stream is closed, if no
data was ever sent back (ie. the flag H2_SF_HEADERS_SENT is not set), we can
set the REFUSED_STREAM error code on the RST_STREAM frame.

This patch may be backported to 1.9.
2019-03-18 15:50:23 +01:00
Christopher Faulet
f02ca00a36 BUG/MEDIUM: mux-h2: Always wakeup streams with no id to avoid frozen streams
This only happens for server streams because their id is assigned when the first
message is sent. If these streams are not woken up, some events can be lost
leading to frozen streams. For instance, it happens when a server closes its
connection before sending its preface.

This patch must be backported to 1.9.
2019-03-18 15:50:23 +01:00
Willy Tarreau
d1fd6f5f64 BUG/MINOR: http/counters: fix missing increment of fe->srv_aborts
When a server aborts a transfer, we used to increment the backend's
counter but not the frontend's during the forwarding phase. This fixes
it. It might be backported to all supported versions (possibly removing
the htx part) though it is of very low importance.
2019-03-18 15:50:23 +01:00
Christopher Faulet
2f9a41d52b BUG/MAJOR: stats: Fix how huge POST data are read from the channel
When the body length is greater than a chunk size (so if length of POST data
exceeds the buffer size), the requests is rejected with the status code
STAT_STATUS_EXCD. Otherwise the stats applet will wait to have all the data to
copy and parse them. But there is a problem when the total request size
(including the headers) is just lower than the buffer size but greater the
buffer size less the reserve. In such case, the body length is considered as
enough small to be processed but not entierly received. So the stats applet
waits for more data. But because outgoing data are still there, the channel's
buffer is considered as full and nothing more can be read, leading to a freeze
of the session.

Note this bug is pretty easy to reproduce with the legacy HTTP. It is harder
with the HTX but still possible. To fix the bug, in the stats applet, when the
request is not fully received, we check if at least the reserve remains
available the channel's buffer.

This patch must be backported as far as 1.5. But because the HTX does not exist
in 1.8 and lower, it will have to be adapted for these versions.
2019-03-18 15:50:23 +01:00
Christopher Faulet
fe261551b9 BUG/MAJOR: spoe: Fix initialization of thread-dependent fields
A bug was introduced in the commit b0769b ("BUG/MEDIUM: spoe: initialization
depending on nbthread must be done last"). The code depending on global.nbthread
was moved from cfg_parse_spoe_agent() to spoe_check() but the pointer on the
agent configuration was not updated to use the filter's one. The variable
curagent is a global variable only valid during the configuration parsing. In
spoe_check(), conf->agent must be used instead.

This patch must be backported to 1.9 and 1.8.
2019-03-18 14:07:38 +01:00
Willy Tarreau
57cb506df8 BUILD: listener: shut up a build warning when threads are disabled
We get this with __decl_hathreads due to the lone semi-colon, let's move
it at the end of the innermost declaration :

  src/listener.c: In function 'listener_accept':
  src/listener.c:601:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
2019-03-15 17:17:33 +01:00
Christopher Faulet
5d45e381b4 BUG/MINOR: stats: Be more strict on what is a valid request to the stats applet
First of all, only GET, HEAD and POST methods are now allowed. Others will be
rejected with the status code STAT_STATUS_IVAL (invalid request). Then, for the
legacy HTTP, only POST requests with a content-length are allowed. Now, chunked
encoded requests are also considered as invalid because the chunk formatting
will interfere with the parsing of POST parameters. In HTX, It is not a problem
because data are unchunked.

This patch must be backported to 1.9. For prior versions too, but HTX part must
be removed. The patch introducing the status code STAT_STATUS_IVAL must also be
backported.
2019-03-15 14:35:11 +01:00
Christopher Faulet
2b9b6784b9 MINOR: stats: Move stuff about the stats status codes in stats files
The status codes definition (STAT_STATUS_*) and their string representation
stat_status_codes) have been moved in stats files. There is no reason to keep
them in proto_http files.
2019-03-15 14:34:59 +01:00
Christopher Faulet
3c2ecf75c8 MINOR: stats: Add the status code STAT_STATUS_IVAL to handle invalid requests
This patch must be backported to 1.9 because a bug fix depends on it.
2019-03-15 14:34:52 +01:00
Christopher Faulet
0ae79d0b0e BUG/MINOR: lua/htx: Don't forget to call htx_to_buf() when appropriate
When htx_from_buf() is used to get an HTX message from a buffer, htx_to_buf()
must always be called when finish. Some calls to htx_to_buf() were missing.

This patch must be backported to 1.9.
2019-03-15 14:34:36 +01:00
Christopher Faulet
f6cce3f0ef BUG/MINOR: lua/htx: Use channel_add_input() when response data are added
This patch must be backported to 1.9.
2019-03-15 14:33:50 +01:00
Christopher Faulet
1e2d636413 BUG/MINOR: stats/htx: Call channel_add_input() when response headers are sent
This function will only increment the total amount of bytes read by a channel
because at this stage there is no fast forwarding. So the bug is pretty limited.

This patch must be backported to 1.9.
2019-03-15 14:33:38 +01:00
Christopher Faulet
269223886d BUG/MINOR: mux-h1: Don't report an error on EOS if no message was received
An error is reported if the EOS is detected before the end of the message. But
we must be carefull to not report an error if there is no message at all.

This patch must be backported to 1.9.
2019-03-15 14:33:02 +01:00
Olivier Houchard
1b32790324 BUG/MEDIUM: tasks: Make sure we wake sleeping threads if needed.
When waking a task on a remote thread, we currently check 1) if this
thread was sleeping, and 2) if it was already marked as active before
writing to its pipe. Unfortunately this doesn't always work as desired
because only one thread from the mask is woken up, while the
active_tasks_mask indicates all eligible threads for this task. As a
result, if one multi-thread task (e.g. a health check) wakes up to run
on any thread, then an accept() dispatches an incoming connection on
thread 2, this thread will already have its bit set in active_tasks_mask
because of the previous wakeup and will not be woken up.

This is easily noticeable on 2.0-dev by injecting on a multi-threaded
listener with a single connection at a time while health checks are
running quickly in the background : the injection runs slowly with
random response times (the poll timeouts). In 1.9 it affects the
dequeing of server connections, which occasionally experience pauses
if multiple threads share the same queue.

The correct solution consists in adjusting the sleeping_thread_mask
when waking another thread up. This mask reflects threads that are
sleeping, hence that need to be signaled to wake up. Threads with a
bit in active_tasks_mask already don't have their sleeping_thread_mask
bit set before polling so the principle remains consistent. And by
doing so we can remove the old_active_mask field.

This should be backported to 1.9.
2019-03-15 14:09:39 +01:00
Willy Tarreau
3f20085617 BUG/MEDIUM: init/threads: consider epoll_fd/pipes for automatic maxconn calculation
This is the equivalent of the previous patch for the automatic maxconn
calculation. This doesn't need any backport.
2019-03-14 20:02:37 +01:00
Willy Tarreau
2c58b41c96 BUG/MEDIUM: threads/fd: do not forget to take into account epoll_fd/pipes
Each thread uses one epoll_fd or kqueue_fd, and a pipe (thus two FDs).
These ones have to be accounted for in the maxsock calculation, otherwise
we can reach maxsock before maxconn. This is difficult to observe but it
in fact happens when a server connects back to the frontend and has checks
enabled : the check uses its FD and serves to fill the loop. In this case
all FDs planed for the datapath are used for this.

This needs to be backported to 1.9 and 1.8.
2019-03-14 20:02:37 +01:00
Willy Tarreau
897e2c58e6 BUG/MEDIUM: listener: make sure we don't pick stopped threads
Dragan Dosen reported that after the multi-queue changes, appending
"process 1/even" on a bind line can make the process immediately crash
when delivering a first connection. This is due to the fact that I
believed that thread_mask(mask) applied the all_threads_mask value,
but it doesn't. And in case of even/odd the bits cover more than the
available threads, resulting in too high a thread number being selected
and a non-existing task to be woken up.

No backport is needed.
2019-03-13 15:03:53 +01:00
Willy Tarreau
df23c0ce45 MINOR: config: continue to rely on DEFAULT_MAXCONN to set the minimum maxconn
Some packages used to rely on DEFAULT_MAXCONN to set the default global
maxconn value to use regardless of the initial ulimit. The recent changes
made the lowest bound set to 100 so that it is compatible with almost any
environment. Now that DEFAULT_MAXCONN is not needed for anything else, we
can use it for the lowest bound set when maxconn is not configured. This
way it retains its original purpose of setting the default maxconn value
eventhough most of the time the effective value will be higher thanks to
the automatic computation based on "ulimit -n".
2019-03-13 10:10:49 +01:00
Willy Tarreau
ca783d4ee6 MINOR: config: remove obsolete use of DEFAULT_MAXCONN at various places
This entry was still set to 2000 but never used anymore. The only places
where it appeared was as an alias to SYSTEM_MAXCONN which forces it, so
let's turn these ones to SYSTEM_MAXCONN and remove the default value for
DEFAULT_MAXCONN. SYSTEM_MAXCONN still defines the upper bound however.
2019-03-13 10:10:25 +01:00
Olivier Houchard
25ad13f9a0 MEDIUM: vars: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
cab0f0b418 MEDIUM: time: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
64dbb2df23 MEDIUM: tcp_rules: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
dc6111e864 MEDIUM: stream: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
2be5a4c627 MEDIUM: ssl: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
d5b3d30b60 MEDIUM: sessions: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
b4df492d01 MEDIUM: queues: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
4051410fef MEDIUM: proto_tcp: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
ed87989ab5 MEDIUM: peers: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
20872763dd MEDIUM: memory: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
d2ee3e7227 MEDIUM: logs: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
64213e910d MEDIUM: listeners: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
36a8e6f970 MEDIUM: lb/threads: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
a798bf56e2 MEDIUM: http: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
b23a61f78a MEDIUM: threads: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
9e7ae28a16 MEDIUM: spoe: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
43da3430f1 MEDIUM: compression: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
cb6c9274ae MEDIUM: pollers: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
7059c55463 MEDIUM: checks: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
aa090d46fe MEDIUM: cache: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
237f781f2d MEDIUM: backend: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Olivier Houchard
0823ca8b96 MEDIUM: activity: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Olivier Houchard
4c28328572 MEDIUM: task: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Olivier Houchard
d360879fb5 MEDIUM: fd: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Olivier Houchard
d2b5d16187 MEDIUM: various: Use __ha_barrier_atomic* when relevant.
When protecting data modified by atomic operations, use __ha_barrier_atomic*
to avoid unneeded barriers on x86.
2019-03-11 17:02:37 +01:00
Olivier Houchard
a51885621d BUG/MEDIUM: listeners: Don't call fd_stop_recv() if fd_updt is NULL.
In do_unbind_listener, don't bother calling fd_stop_recv() if fd_updt is
NULL. It means it has already been free'd, and it would crash.
2019-03-08 16:05:31 +01:00
Dragan Dosen
bc6218e1b0 BUG/MEDIUM: 51d: fix possible segfault on deinit_51degrees()
When haproxy is built with 51Degrees support, but not configured to use
51Degrees database, a segfault can occur when deinit_51degrees()
function is called, eg. during soft-stop on SIGUSR1 signal.

Only builds that use Pattern algorithm are affected.

This fix must be backported to all stable branches where 51Degrees
support is available. Additional adjustments are required for some
branches due to API and naming changes.
2019-03-07 17:16:27 +01:00
Frédéric Lécaille
2365fb0c97 BUG/MAJOR: config: Wrong maxconn adjustment.
Before c8d5b95 the "maxconn" of the backend of dynamic "use_backend"
rules was not modified (this does not make sense and this is correct).
When implementing proxy_adjust_all_maxconn(), c8d5b95 commit missed this case.
With this patch we adjust the "maxconn" of the backend of such rules only if
they are not dynamic.

Without this patch reg-tests/http-rules/h00003.vtc could make haproxy crash.
2019-03-07 17:07:23 +01:00
Olivier Houchard
7c49711d60 BUG/MEDIUM: logs: Only attempt to free startup_logs once.
deinit_log_buffers() can be called once per thread, however startup_logs
is common to all threads. So only attempt to free it once.

This should be backported to 1.9 and 1.8.
2019-03-07 14:59:34 +01:00
Willy Tarreau
0cf33176bd MINOR: listener: move thr_idx from the bind_conf to the listener
Tests show that it's slightly faster to have this field in the listener.
The cache walk patterns are under heavy stress and having only this field
written to in the bind_conf was wasting a cache line that was heavily
read. Let's move this close to the other entries already written to in
the listener. Warning, the position does have an impact on peak performance.
2019-03-07 14:08:26 +01:00
Willy Tarreau
9f1d4e7f7f CLEANUP: listener: remove old thread bit mapping
Now that the P2C algorithm for the accept queue is removed, we don't
need to map a number to a thread bit anymore, so let's remove all
these fields which are taking quite some space for no reason.
2019-03-07 13:59:04 +01:00
Willy Tarreau
0fe703bd50 MEDIUM: listener: change the LB algorithm again to use two round robins instead
At this point, the random used in the hybrid queue distribution algorithm
provides little benefit over a periodic scan, can even have a slightly
worse worst case, and it requires to establish a mapping between a
discrete number and a thread ID among a mask.

This patch introduces a different approach using two indexes. One scans
the thread mask from the left, the other one from the right. The related
threads' loads are compared, and the least loaded one receives the new
connection. Then one index is adjusted depending on the load resulting
from this election, so that we start the next election from two known
lightly loaded threads.

This approach provides an extra 1% peak performance boost over the previous
one, which likely corresponds to the removal of the extra work on the
random and the previously required two mappings of index to thread.

A test was attempted with two indexes going in the same direction but it
was much less interesting because the same thread pairs were compared most
of the time with the load climbing in a ladder-like model. With the reverse
directions this cannot happen.
2019-03-07 13:57:33 +01:00
Willy Tarreau
fc630bd373 MINOR: listener: improve incoming traffic distribution
By picking two randoms following the P2C algorithm, we seldom observe
asymmetric loads on bursts of small session counts. This is typically
what makes h2load take a bit of time to complete the last 100% because
if a thread gets two connections while the other ones only have one,
it takes twice the time to complete its work.

This patch proposes a modification of the p2c algorithm which seems
more suitable to this case : it mixes a rotating index with a random.
This way, we're certain that all threads are consulted in turn and at
the same time we're not forced to use the ones we're giving a chance.

This significantly increases the traffic rate. Now h2load shows faster
completion and the average request rates on H2 and the TLS resume rate
increases by a bit more than 5% compared to pure p2c.

The index was placed into the struct bind_conf because 1) it's faster
there and it's the best place to optimally distribute traffic among a
group of listeners. It's the only runtime-modified element there and
it will be quite cache-hot.
2019-03-07 13:48:04 +01:00
Frédéric Lécaille
bfe6138150 MINOR: sample: Add a protocol buffers specific converter.
This patch adds "protobuf" protocol buffers specific converter wich
may used in combination with "ungrpc" as first converter to extract
a protocol buffers field value. It is simply implemented reusing
protobuf_field_lookup() which is the protocol buffers specific parser already
used by "ungrpc" converter which only parse a gRPC header in addition of
parsing protocol buffers message.

Update the documentation for this new "protobuf" converter.
2019-03-06 15:36:02 +01:00
Frédéric Lécaille
5f33f85ce8 MINOR: sample: Extract some protocol buffers specific code.
We move the code responsible of parsing protocol buffers messages
inside gRPC messages from sample.c to include/proto/protocol_buffers.h
so that to reuse it to cascade "ungrpc" converter.
2019-03-06 15:36:02 +01:00
Lukas Tribus
1aabc93978 BUG/MINOR: ssl: fix warning about ssl-min/max-ver support
In 84e417d8 ("MINOR: ssl: support Openssl 1.1.1 early callback for
switchctx") the code was extended to also support OpenSSL 1.1.1
(code already supported BoringSSL). A configuration check warning
was updated but with the wrong logic, the #ifdef needs a && instead
of an ||.

Reported in #54.

Should be backported to 1.8.
2019-03-05 23:56:58 +01:00
Willy Tarreau
5799e9cd37 MINOR: config: relax the range checks on cpu-map
Emeric reports that when MAX_THREADS and/or MAX_PROCS are set to lower
values, referencing thread or process numbers higher than these limits
in cpu-map returns errors. This is annoying because these typically are
silent settings that are expected to be used only when set. Let's switch
back to LONGBITS for this limit.
2019-03-05 18:14:03 +01:00
Willy Tarreau
8e5e1e7bf0 CLEANUP: wurfl: remove dead, broken and unmaintained code
Since the "wurfl" device detection engine was merged slightly more than
two years ago (2016-11-04), it never received a single fix nor update.
For almost two years it didn't receive even the minimal review or changes
needed to be compatible with threads, and it's remained build-broken for
about the last 9 months, consecutive to the last buffer API changes,
without anyone ever noticing! When asked on the list, nobody confirmed
using it :

   https://www.mail-archive.com/haproxy@formilux.org/msg32516.html

And obviously nobody even cared to verify that it did still build. So we
are left with this broken code with no user and no maintainer. It might
even suffer from remotely exploitable vulnerabilities without anyone
being able to check if it presents any risk. It's a pain to update each
time there is an API change because it doesn't build as it depends on
external libraries that are not publicly accessible, leading to careful
blind changes. It slows down the whole project. This situation is not
acceptable at all.

It's time to cure the problem where it is. This patch removes all this
dead, non-buildable, non-working code. If anyone ever decides to use it,
which I seriously doubt based on history, it could be reintegrated, but
this time the following guarantees will be required :
  - someone has to step up as a maintainer and have his name listed in
    the MAINTAINERS file (I should have been more careful last time).
    This person will take the sole blame for all issues and will be
    responsible for fixing the bugs and incompatibilities affecting
    this code, and for making it evolve to follow regular internal API
    updates.

  - support building on a standard distro with automated tools (i.e. no
    more "click on this site, register your e-mail and download an
    archive then figure how to place this into your build system").
    Dummy libs are OK though as long as they allow the mainline code to
    build and start.

  - multi-threaded support must be fixed. I mean seriously, not worked
    around with a check saying "please disable threads, we've been busy
    fishing for the last two years".

This may be backported to 1.9 given that the code has never worked there
either, thus at least we're certain nobody will miss it.
2019-03-05 13:46:12 +01:00
Frédéric Lécaille
756d97f205 MINOR: sample: Rework gRPC converter code.
For now on, "ungrpc" may take a second optional argument to provide
the protocol buffers types used to encode the field value to be extracted.
When absent the field value is extracted as a binary sample which may then
followed by others converters like "hex" which takes binary as input sample.
When this second argument is a type which does not match the one found by "ungrpc",
this field is considered as not found even if present.

With this patch we also remove the useless "varint" and "svarint" converters.

Update the documentation about "ungrpc" converters.
2019-03-05 11:04:23 +01:00
Frédéric Lécaille
7c93e88d0c MINOR: sample: Code factorization "ungrpc" converter.
Parsing protocol buffer fields always consists in skip the field
if the field is not found or store the field value if found.
So, with this patch we factorize a little bit the code for "ungrpc" converter.
2019-03-05 11:03:53 +01:00
Willy Tarreau
9255e7e971 BUG/MEDIUM: h2/htx: verify that :path doesn't contain invalid chars
While the legacy code converts h2 to h1 and provides some control over
what is passed, in htx mode there is no such control and it is possible
to pass control chars and linear white spaces in the path, which are
possibly reencoded differently once passed to the H1 side.

HTX supports parse error reporting using a special flag. Let's check
the correctness of the :path pseudo header and report any anomaly in
the HTX flag.

Thanks to Jérôme Magnin for reporting this bug with a working reproducer.

This fix must be backported to 1.9 along with the two previous patches
("MINOR: htx: unconditionally handle parsing errors in requests or
responses" and "MINOR: mux-h2: always pass HTX_FL_PARSING_ERROR
between h2s and buf on RX").
2019-03-05 10:58:28 +01:00
Willy Tarreau
7196dd6071 MINOR: mux-h2: always pass HTX_FL_PARSING_ERROR between h2s and buf on RX
In order to allow the H2 parser to report parsing errors, we must make
sure to always pass the HTX_FL_PARSING_ERROR flag from the h2s htx to
the conn_stream's htx.
2019-03-05 10:56:34 +01:00
Willy Tarreau
4236f035fe MINOR: htx: unconditionally handle parsing errors in requests or responses
The htx request and response processing functions currently only check
for HTX_FL_PARSING_ERROR on incomplete messages because that's how mux_h1
delivers these. However with H2 we have to detect some parsing errors in
the format of certain pseudo-headers (e.g. :path), so we do have a complete
message but we want to report an error.

Let's move the parse error check earlier so that it always triggers when
the flag is present. It was also moved for htx_wait_for_request_body()
since we definitely want to be able to abort processing such an invalid
request even if it appears complete, but it was not changed in the forward
functions so as not to truncate contents before the position of the first
error.
2019-03-05 10:56:34 +01:00
Frédéric Lécaille
50290fbb42 MINOR: sample: Replace "req.ungrpc" smp fetch by a "ungrpc" converter.
This patch simply extracts the code of smp_fetch_req_ungrpc() for "req.ungrpc"
from http_fetch.c to move it to sample.c with very few modifications.
Furthermore smp_fetch_body_buf() used to fetch the body contents is no more needed.

Update the documentation for gRPC.
2019-03-04 08:28:42 +01:00
Willy Tarreau
927b88ba00 BUG/MAJOR: mux-h2: fix race condition between close on both ends
A crash in H2 was reported in issue #52. It turns out that there is a
small but existing race by which a conn_stream could detach itself
using h2_detach(), not being able to destroy the h2s due to pending
output data blocked by flow control, then upon next h2s activity
(transfer_data or trailers parsing), an ES flag may need to be turned
into a CS_FL_REOS bit, causing a dereference of a NULL stream. This is
a side effect of the fact that we still have a few places which
incorrectly depend on the CS flags, while these flags should only be
set by h2_rcv_buf() and h2_snd_buf().

All candidate locations along this path have been secured against this
risk, but the code should really evolve to stop depending on CS anymore.

This fix must be backported to 1.9 and possibly partially to 1.8.
2019-03-04 08:17:12 +01:00
Willy Tarreau
ac35093a19 MEDIUM: init: make the global maxconn default to what rlim_fd_cur permits
The global maxconn value is often a pain to configure :
  - in development the user never has the permissions to increase the
    rlim_cur value too high and gets warnings all the time ;

  - in some production environments, users may have limited actions on
    it or may only be able to act on rlim_fd_cur using ulimit -n. This
    is sometimes particularly true in containers or whatever environment
    where the user has no privilege to upgrade the limits.

  - keeping config homogenous between machines is even less easy.

We already had the ability to automatically compute maxconn from the
memory limits when they were set. This patch goes a bit further by also
computing the limit permitted by the configured limit on the number of
FDs. For this it simply reverses the rlim_fd_cur calculation to determine
maxconn based on the number of reserved sockets for listeners & checks,
the number of SSL engines and the number of pipes (absolute or relative).

This way it becomes possible to make maxconn always be the highest possible
value resulting in maxsock matching what was set using "ulimit -n", without
ever setting it. Note that we adjust to the soft limit, not the hard one,
since it's what is configured with ulimit -n. This allows users to also
limit to low values if needed.

Just like before, the calculated value is reported in verbose mode.
2019-03-01 15:54:16 +01:00
Willy Tarreau
8d687d8464 MINOR: init: move some maxsock updates earlier
We'll need to know the global maxsock before the maxconn calculation.
Actually only two components were calculated too late, the peers FD
and the stats FD. Let's move them a few lines upward.
2019-03-01 15:53:14 +01:00
Willy Tarreau
5a023f0d7a MINOR: init: make the maxpipe computation more accurate
The default number of pipes is adjusted based on the sum of frontends
and backends maxconn/fullconn settings. Now that it is possible to have
a null maxconn on a frontend to indicate "unlimited" with commit
c8d5b95e6 ("MEDIUM: config: don't enforce a low frontend maxconn value
anymore"), the sum of maxconn may remain low and limited to the only
frontends/backends where this limit is set.

This patch considers this new unlimited case when doing the check, and
automatically switches to the default value which is maxconn/4 in this
case. All the calculation was moved to a distinct function for ease of
use. This function also supports returning unlimited (-1) when the
value depends on global.maxconn and this latter is not yet set.
2019-03-01 15:53:14 +01:00
Willy Tarreau
8dca19549a BUG/MINOR: mworker: be careful to restore the original rlim_fd_cur/max on reload
When the master re-execs itself on reload, it doesn't restore the initial
rlim_fd_cur/rlim_fd_max values, which have been modified by the ulimit-n
or global maxconn directives. This is a problem, because if these values
were set really low it could prevent the process from restarting, and if
they were set very high, this could have some implications on the restart
time, or later on the computed maxconn.

Let's simply reset these values to the ones we had at boot to maintain
the system in a consistent state.

A backport could be performed to 1.9 and maybe 1.8. This patch depends on
the two previous ones.
2019-03-01 11:26:08 +01:00
Willy Tarreau
9f6dc72477 BUG/MINOR: checks: make external-checks restore the original rlim_fd_cur/max
It's not normal that external processes are run with high FD limits,
as quite often such processes (especially shell scripts) will iterate
over all FDs to close them. Ideally we should even provide a tunable
with the external-check directive to adjust this value, but at least
we need to restore it to the value that was active when starting
haproxy (before it was adjusted for maxconn). Additionally with very
low maxconn values causing rlim_fd_cur to be low, some heavy checks
could possibly fail. This was also mentioned in issue #45.

Currently the following config and scripts report this :

  $ cat rlim.cfg
  global
      maxconn 500000
      external-check

  listen www
      bind :8001
      timeout client 5s
      timeout server 5s
      timeout connect 5s
      option external-check
      external-check command "$PWD/sleep1.sh"
      server local 127.0.0.1:80 check inter 1s

  $ cat sleep1.sh
  #!/bin/sh
  /bin/sleep 0.1
  echo -n "soft: ";ulimit -S -n
  echo -n "hard: ";ulimit -H -n

  # ./haproxy -db -f rlim.cfg
  soft: 1000012
  hard: 1000012
  soft: 1000012
  hard: 1000012

Now with the fix :
  # ./haproxy -db -f rlim.cfg
  soft: 1024
  hard: 4096
  soft: 1024
  hard: 4096

This fix should be backported to stable versions but it depends on
"MINOR: global: keep a copy of the initial rlim_fd_cur and rlim_fd_max
values" and "BUG/MINOR: init: never lower rlim_fd_max".
2019-03-01 11:23:45 +01:00
Willy Tarreau
e5cfdacb83 BUG/MINOR: init: never lower rlim_fd_max
If a ulimit-n value is set, we must not lower the rlim_max value if the
new value is lower, we must only adjust the rlim_cur one. The effect is
that on very low values, this could prevent a master-worker reload, or
make an external check fail by lack of FDs.

This may be backported to 1.9 and earlier, but it depends on this patch
"MINOR: global: keep a copy of the initial rlim_fd_cur and rlim_fd_max
values".
2019-03-01 10:40:30 +01:00
Willy Tarreau
bf6964007a MINOR: global: keep a copy of the initial rlim_fd_cur and rlim_fd_max values
Let's keep a copy of these initial values. They will be useful to
compute automatic maxconn, as well as to restore proper limits when
doing an execve() on external checks.
2019-03-01 10:40:30 +01:00
Frédéric Lécaille
645635da84 MINOR: peers: Add a message for heartbeat.
This patch implements peer heartbeat feature to prevent any haproxy peer
from reconnecting too often, consuming sockets for nothing.

To do so, we add PEER_MSG_CTRL_HEARTBEAT new message to PEER_MSG_CLASS_CONTROL peers
control class of messages. A ->heartbeat field is added to peer structs
to store the heatbeat timeout value which is handled by the same function as for ->reconnect
to control the session timeouts. A 2-bytes heartbeat message is sent every 3s when
no updates have to be sent. This way, the peer which receives such a message is sure
the remote peer is still alive. So, it resets the ->reconnect peer session
timeout to its initial value (5s). This prevents any reconnection to an
already connected alive peer.
2019-03-01 09:33:26 +01:00
Willy Tarreau
c8d5b95e6d MEDIUM: config: don't enforce a low frontend maxconn value anymore
Historically the default frontend's maxconn used to be quite low (2000),
which was sufficient two decades ago but often proved to be a problem
when users had purposely set the global maxconn value but forgot to set
the frontend's.

There is no point in keeping this arbitrary limit for frontends : when
the global maxconn is lower, it's already too high and when the global
maxconn is much higher, it becomes a limiting factor which causes trouble
in production.

This commit allows the value to be set to zero, which becomes the new
default value, to mean it's not directly limited, or in fact it's set
to the global maxconn. Since this operation used to be performed before
computing a possibly automatic global maxconn based on memory limits,
the calculation of the maxconn value and its propagation to the backends'
fullconn has now moved to a dedicated function, proxy_adjust_all_maxconn(),
which is called once the global maxconn is stabilized.

This comes with two benefits :
  1) a configuration missing "maxconn" in the defaults section will not
     limit itself to a magically hardcoded value but will scale up to the
     global maxconn ;

  2) when the global maxconn is not set and memory limits are used instead,
     the frontends' maxconn automatically adapts, and the backends' fullconn
     as well.
2019-02-28 17:05:32 +01:00
Willy Tarreau
d89cc8bfc0 MINOR: proxy: do not change the listeners' maxconn when updating the frontend's
It is possible to update a frontend's maxconn from the CLI. Unfortunately
when doing this it scratches all listeners' maxconn values and sets them
all to the new frontend's value. This can be problematic when mixing
different traffic classes (bind to interface or private networks, etc).

Now that the listener's maxconn is allowed to remain unset, let's not
change these values when setting the frontend's maxconn. This way the
overall frontend's limit can be raised but if certain specific listeners
had their own value forced in the config, they will be preserved. This
makes more sense and is more in line with the principle of defaults
propagation.
2019-02-28 17:05:32 +01:00
Willy Tarreau
a8cf66bcab MINOR: listener: do not needlessly set l->maxconn
It's pointless to always set and maintain l->maxconn because the accept
loop already enforces the frontend's limit anyway. Thus let's stop setting
this value by default and keep it to zero meaning "no limit". This way the
frontend's maxconn will be used by default. Of course if a value is set,
it will be enforced.
2019-02-28 17:05:32 +01:00
Willy Tarreau
e2711c7bd6 MINOR: listener: introduce listener_backlog() to report the backlog value
In an attempt to try to provide automatic maxconn settings, we need to
decorrelate a listner's backlog and maxconn so that these values can be
independent. This introduces a listener_backlog() function which retrieves
the backlog value from the listener's backlog, the frontend's, the
listener's maxconn, the frontend's or falls back to 1024. This
corresponds to what was done in cfgparse.c to force a value there except
the last fallback which was not set since the frontend's maxconn is always
known.
2019-02-28 17:05:29 +01:00
Willy Tarreau
82c9789ac4 BUG/MEDIUM: listener: make sure the listener never accepts too many conns
We were not checking p->feconn nor the global actconn soon enough. In
older versions this could result in a frontend accepting more connections
than allowed by its maxconn or the global maxconn, exactly N-1 extra
connections where N is the number of threads, provided each of these
threads were running a different listener. But with the lock removal,
it became worse, the excess could be the listener's maxconn multiplied
by the number of threads. Among the nasty side effect was that LI_FULL
could be removed while the limit was still over and in some cases the
polling on the socket was no re-enabled.

This commit takes care of updating and checking p->feconn and the global
actconn *before* processing the connection, so that the listener can be
turned off before accepting the socket if needed. This requires to move
some of the bookkeeping operations form session to listen, which totally
makes sense in this context.

Now the limits are properly respected, even if a listener's maxconn is
over a frontend's. This only applies on top of the listener lock removal
series and doesn't have to be backported.
2019-02-28 16:08:54 +01:00
Willy Tarreau
01abd02508 BUG/MEDIUM: listener: use a self-locked list for the dequeue lists
There is a very difficult to reproduce race in the listener's accept
code, which is much easier to reproduce once connection limits are
properly enforced. It's an ABBA lock issue :

  - the following functions take l->lock then lq_lock :
      disable_listener, pause_listener, listener_full, limit_listener,
      do_unbind_listener

  - the following ones take lq_lock then l->lock :
      resume_listener, dequeue_all_listener

This is because __resume_listener() only takes the listener's lock
and expects to be called with lq_lock held. The problem can easily
happen when listener_full() and limit_listener() are called a lot
while in parallel another thread releases sessions for the same
listener using listener_release() which in turn calls resume_listener().

This scenario is more prevalent in 2.0-dev since the removal of the
accept lock in listener_accept(). However in 1.9 and before, a different
but extremely unlikely scenario can happen :

      thread1                                  thread2
         ............................  enter listener_accept()
  limit_listener()
         ............................  long pause before taking the lock
  session_free()
    dequeue_all_listeners()
      lock(lq_lock) [1]
         ............................  try_lock(l->lock) [2]
      __resume_listener()
        spin_lock(l->lock) =>WAIT[2]
         ............................  accept()
                                       l->accept()
                                       nbconn==maxconn =>
                                         listener_full()
                                           state==LI_LIMITED =>
                                             lock(lq_lock) =>DEADLOCK[1]!

In practice it is almost impossible to trigger it because it requires
to limit both on the listener's maxconn and the frontend's rate limit,
at the same time, and to release the listener when the connection rate
goes below the limit between poll() returns the FD and the lock is
taken (a few nanoseconds). But maybe with threads competing on the
same core it has more chances to appear.

This patch removes the lq_lock and replaces it with a lockless queue
for the listener's wait queue (well, technically speaking a self-locked
queue) brought by commit a8434ec14 ("MINOR: lists: Implement locked
variations.") and its few subsequent fixes. This relieves us from the
need of the lq_lock and removes the deadlock. It also gets rid of the
distinction between __resume_listener() and resume_listener() since the
only difference was the lq_lock. All listener removals from the list
are now unconditional to avoid races on the state. It's worth noting
that the list used to never be initialized and that it used to work
only thanks to the state tests, so the initialization has now been
added.

This patch must carefully be backported to 1.9 and very likely 1.8.
It is mandatory to be careful about replacing all manipulations of
l->wait_queue, global.listener_queue and p->listener_queue.
2019-02-28 16:08:54 +01:00
Willy Tarreau
c912f94b57 MINOR: server: remove a few unneeded LIST_INIT calls after LIST_DEL_LOCKED
Since LIST_DEL_LOCKED() and LIST_POP_LOCKED() now automatically reinitialize
the removed element, there's no need for keeping this LIST_INIT() call in the
idle connection code.
2019-02-28 16:08:54 +01:00
Willy Tarreau
18215cba6a BUG/MINOR: config: don't over-count the global maxsock value
global.maxsock used to be augmented by the frontend's maxconn value
for each frontend listener, which is absurd when there are many
listeners in a frontend because the frontend's maxconn fixes an
upper limit to how many connections will be accepted on all of its
listeners anyway. What is needed instead is to add one to count the
listening socket.

In addition, the CLI's and peers' value was incremented twice, the
first time when creating the listener and the second time in the
main init code.

Let's now make sure we only increment global.maxsock by the required
amount of sockets. This means not adding maxconn for each listener,
and relying on the global values when they are correct.
2019-02-27 19:35:37 +01:00
Willy Tarreau
149ab779cc MAJOR: threads: enable one thread per CPU by default
Threads have long matured by now, still for most users their usage is
not trivial. It's about time to enable them by default on platforms
where we know the number of CPUs bound. This patch does this, it counts
the number of CPUs the process is bound to upon startup, and enables as
many threads by default. Of course, "nbthread" still overrides this, but
if it's not set the default behaviour is to start one thread per CPU.

The default number of threads is reported in "haproxy -vv". Simply using
"taskset -c" is now enough to adjust this number of threads so that there
is no more need for playing with cpu-map. And thanks to the previous
patches on the listener, the vast majority of configurations will not
need to duplicate "bind" lines with the "process x/y" statement anymore
either, so a simple config will automatically adapt to the number of
processors available.
2019-02-27 14:51:50 +01:00
Willy Tarreau
7ac908bf8c MINOR: config: add global tune.listener.multi-queue setting
tune.listener.multi-queue { on | off }
  Enables ('on') or disables ('off') the listener's multi-queue accept which
  spreads the incoming traffic to all threads a "bind" line is allowed to run
  on instead of taking them for itself. This provides a smoother traffic
  distribution and scales much better, especially in environments where threads
  may be unevenly loaded due to external activity (network interrupts colliding
  with one thread for example). This option is enabled by default, but it may
  be forcefully disabled for troubleshooting or for situations where it is
  estimated that the operating system already provides a good enough
  distribution and connections are extremely short-lived.
2019-02-27 14:27:07 +01:00
Willy Tarreau
8a03408d81 MINOR: activity: add accept queue counters for pushed and overflows
It's important to monitor the accept queues to know if some incoming
connections had to be handled by their originating thread due to an
overflow. It's also important to be able to confirm thread fairness.
This patch adds "accq_pushed" to activity reporting, which reports
the number of connections that were successfully pushed into each
thread's queue, and "accq_full", which indicates the number of
connections that couldn't be pushed because the thread's queue was
full.
2019-02-27 14:27:07 +01:00
Willy Tarreau
e0e9c48ab2 MAJOR: listener: use the multi-queue for multi-thread listeners
The idea is to redistribute an incoming connection to one of the
threads a bind_conf is bound to when there is more than one. We do this
using a random improved by the p2c algorithm : a random() call returns
two different thread numbers. We then compare their respective connection
count and the length of their accept queues, and pick the least loaded
one. We even use this deferred accept mechanism if the target thread
ends up being the local thread, because this maintains fairness between
all connections and tests show that it's about 1% faster this way,
likely due to cache locality. If the target thread's accept queue is
full, the connection is accepted synchronously by the current thread.
2019-02-27 14:27:07 +01:00
Willy Tarreau
1efafce61f MINOR: listener: implement multi-queue accept for threads
There is one point where we can migrate a connection to another thread
without taking risk, it's when we accept it : the new FD is not yet in
the fd cache and no task was created yet. It's still possible to assign
it a different thread than the one which accepted the connection. The
only requirement for this is to have one accept queue per thread and
their respective processing tasks that have to be woken up each time
an entry is added to the queue.

This is a multiple-producer, single-consumer model. Entries are added
at the queue's tail and the processing task is woken up. The consumer
picks entries at the head and processes them in order. The accept queue
contains the fd, the source address, and the listener. Each entry of
the accept queue was rounded up to 64 bytes (one cache line) to avoid
cache aliasing because tests have shown that otherwise performance
suffers a lot (5%). A test has shown that it's important to have at
least 256 entries for the rings, as at 128 it's still possible to fill
them often at high loads on small thread counts.

The processing task does almost nothing except calling the listener's
accept() function and updating the global session and SSL rate counters
just like listener_accept() does on synchronous calls.

At this point the accept queue is implemented but not used.
2019-02-27 14:27:07 +01:00
Willy Tarreau
b2b50a7784 MINOR: listener: pre-compute some thread counts per bind_conf
In order to quickly pick a thread ID when accepting a connection, we'll
need to know certain pre-computed values derived from the thread mask,
which are counts of bits per position multiples of 1, 2, 4, 8, 16 and
32. In practice it is sufficient to compute only the 4 first ones and
store them in the bind_conf. We update the count every time the
bind_thread value is adjusted.

The fields in the bind_conf struct have been moved around a little bit
to make it easier to group all thread bit values into the same cache
line.

The function used to return a thread number is bind_map_thread_id(),
and it maps a number between 0 and 31/63 to a thread ID between 0 and
31/63, starting from the left.
2019-02-27 14:27:07 +01:00
Willy Tarreau
f3241115e7 MINOR: tools: implement functions to look up the nth bit set in a mask
Function mask_find_rank_bit() returns the bit position in mask <m> of
the nth bit set of rank <r>, between 0 and LONGBITS-1 included, starting
from the left. For example ranks 0,1,2,3 for mask 0x55 will be 6, 4, 2
and 0 respectively. This algorithm is based on a popcount variant and
is described here : https://graphics.stanford.edu/~seander/bithacks.html.
2019-02-27 14:27:07 +01:00
Willy Tarreau
9e85318417 MINOR: listener: maintain a per-thread count of the number of connections on a listener
Having this information will help us improve thread-level distribution
of incoming traffic.
2019-02-27 14:27:07 +01:00
Willy Tarreau
3f0d02bbc2 MAJOR: listener: do not hold the listener lock in listener_accept()
This function used to hold the listener's lock as a way to stay safe
against concurrent manipulations, but it turns out this is wrong. First,
the lock is held during l->accept(), which itself might indirectly call
listener_release(), which, if the listener is marked full, could result
in __resume_listener() to be called and the lock being taken twice. In
practice it doesn't happen right now because the listener's FULL state
cannot change while we're doing this.

Second, all the code does is now protected against concurrent accesses.
It used not to be the case in the early days of threads : the frequency
counters are thread-safe. The rate limiting doesn't require extreme
precision. Only the nbconn check is not thread safe.

Third, the parts called here will have to be called from different
threads without holding this lock, and this becomes a bigger issue
if we need to keep this one.

This patch does 3 things which need to be addressed at once :
  1) it moves the lock to the only 2 functions that were not protected
     since called form listener_accept() :
     - limit_listener()
     - listener_full()

  2) it makes sure delete_listener() properly checks its state within
     the lock.

  3) it updates the l->nbconn tracking to make sure that it is always
     properly reported and accounted for. There is a point of particular
     care around the situation where the listener's maxconn is reached
     because the listener has to be marked full before accepting the
     connection, then resumed if the connection finally gets dropped.
     It is not possible to perform this change without removing the
     lock due to the deadlock issue explained above.

This patch almost doubles the accept rate in multi-thread on a shared
port between 8 threads, and multiplies by 4 the connection rate on a
tcp-request connection reject rule.
2019-02-27 14:27:07 +01:00
Willy Tarreau
a36b324777 MEDIUM: listener: keep a single thread-mask and warn on "process" misuse
Now that nbproc and nbthread are exclusive, we can still provide more
detailed explanations about what we've found in the config when a bind
line appears on multiple threads and processes at the same time, then
ignore the setting.

This patch reduces the listener's thread mask to a single mask instead
of an array of masks per process. Now we have only one thread mask and
one process mask per bind-conf. This removes ~504 bytes of RAM per
bind-conf and will simplify handling of thread masks.

If a "bind" line only refers to process numbers not found by its parent
frontend or not covered by the global nbproc directive, or to a thread
not covered by the global nbthread directive, a warning is emitted saying
what will be used instead.
2019-02-27 14:27:07 +01:00
Willy Tarreau
26f6ae12c0 MAJOR: config: disable support for nbproc and nbthread in parallel
When 1.8 was released, we wanted to support both nbthread and nbproc to
observe how things would go. Since then it appeared obvious that the two
are never used together because of the pain to configure affinity in this
case, and instead of bringing benefits, it brings the limitations of both
models, and causes multiple threads to compete for the same CPU. In
addition, it costs a lot to support both in parallel, so let's get rid
of this once for all.
2019-02-27 14:27:04 +01:00
Willy Tarreau
741b4d6b7a BUG/MINOR: listener: keep accept rate counters accurate under saturation
The test on l->nbconn forces to exit the loop before updating the freq
counters, so the last session which reaches a listener's limit will not
be accounted for in the session rate measurement.

Let's move the test at the beginning of the loop and mark the listener
as saturated on exit.

This may be backported to 1.9 and 1.8.
2019-02-27 08:03:41 +01:00
Frédéric Lécaille
12a718488a BUG/MEDIUM: standard: Wrong reallocation size.
The number of bytes to use with "my_realloc2()" in parse_dotted_nums()
was wrong: missing multiplication by the size of an element of an array
when reallocating it.
2019-02-26 19:07:44 +01:00
Olivier Houchard
dd1c8f1f72 MINOR: cfgparse: Add a cast to make gcc happier.
When calling calloc(), cast global.nbthread to unsigned int, so that gcc
doesn't freak out, as it has no way of knowing global.nbthread can't be
negative.
2019-02-26 18:47:59 +01:00
Olivier Houchard
9ea5d361ae MEDIUM: servers: Reorganize the way idle connections are cleaned.
Instead of having one task per thread and per server that does clean the
idling connections, have only one global task for every servers.
That tasks parses all the servers that currently have idling connections,
and remove half of them, to put them in a per-thread list of connections
to kill. For each thread that does have connections to kill, wake a task
to do so, so that the cleaning will be done in the context of said thread.
2019-02-26 18:17:32 +01:00
Olivier Houchard
7f1bc31fee MEDIUM: servers: Used a locked list for idle_orphan_conns.
Use the locked macros when manipulating idle_orphan_conns, so that other
threads can remove elements from it.
It will be useful later to avoid having a task per server and per thread to
cleanup the orphan list.
2019-02-26 18:17:32 +01:00
Tim Duesterhus
36839dc39f CLEANUP: stream: Remove bogus loop in conn_si_send_proxy
The if-statement was converted into a while-loop in
7fe45698f5 to handle EINTR.

This special handling was later replaced in
0a03c0f022 by conn_sock_send.

The while-loop was not changed back and is not unconditionally
exited after one iteration, with no `continue` inside the body.

Replace by an if-statement.
2019-02-26 17:27:04 +01:00
Tim Duesterhus
c7f880ee3b CLEANUP: http: Remove unreachable code in parse_http_req_capture
`len` has already been checked to be strictly positive a few lines above.

This unreachable code was introduced in 82bf70dff4.
2019-02-26 17:27:04 +01:00
Willy Tarreau
6c1b667e57 [RELEASE] Released version 2.0-dev1
Released version 2.0-dev1 with the following main changes :
    - MINOR: mux-h2: only increase the connection window with the first update
    - REGTESTS: remove the expected window updates from H2 handshakes
    - BUG/MINOR: mux-h2: make empty HEADERS frame return a connection error
    - BUG/MEDIUM: mux-h2: mark that we have too many CS once we have more than the max
    - MEDIUM: mux-h2: remove padlen during headers phase
    - MINOR: h2: add a bit-based frame type representation
    - MINOR: mux-h2: remove useless check for empty frame length in h2s_decode_headers()
    - MEDIUM: mux-h2: decode HEADERS frames before allocating the stream
    - MINOR: mux-h2: make h2c_send_rst_stream() use the dummy stream's error code
    - MINOR: mux-h2: add a new dummy stream for the REFUSED_STREAM error code
    - MINOR: mux-h2: fail stream creation more cleanly using RST_STREAM
    - MINOR: buffers: add a new b_move() function
    - MINOR: mux-h2: make h2_peek_frame_hdr() support an offset
    - MEDIUM: mux-h2: handle decoding of CONTINUATION frames
    - CLEANUP: mux-h2: remove misleading comments about CONTINUATION
    - BUG/MEDIUM: servers: Don't try to reuse connection if we switched server.
    - BUG/MEDIUM: tasks: Decrement tasks_run_queue in tasklet_free().
    - BUG/MINOR: htx: send the proper authenticate header when using http-request auth
    - BUG/MEDIUM: mux_h2: Don't add to the idle list if we're full.
    - BUG/MEDIUM: servers: Fail if we fail to allocate a conn_stream.
    - BUG/MAJOR: servers: Use the list api correctly to avoid crashes.
    - BUG/MAJOR: servers: Correctly use LIST_ELEM().
    - BUG/MAJOR: sessions: Use an unlimited number of servers for the conn list.
    - BUG/MEDIUM: servers: Flag the stream_interface on handshake error.
    - MEDIUM: servers: Be smarter when switching connections.
    - MEDIUM: sessions: Keep track of which connections are idle.
    - MINOR: payload: add sample fetch for TLS ALPN
    - BUG/MEDIUM: log: don't mark log FDs as non-blocking on terminals
    - MINOR: channel: Add the function channel_add_input
    - MINOR: stats/htx: Call channel_add_input instead of updating channel state by hand
    - BUG/MEDIUM: cache: Be sure to end the forwarding when XFER length is unknown
    - BUG/MAJOR: htx: Return the good block address after a defrag
    - MINOR: lb: allow redispatch when using consistent hash
    - CLEANUP: mux-h2: fix end-of-stream flag name when processing headers
    - BUG/MEDIUM: mux-h2: always restart reading if data are available
    - BUG/MINOR: mux-h2: set the stream-full flag when leaving h2c_decode_headers()
    - BUG/MINOR: mux-h2: don't check the CS count in h2c_bck_handle_headers()
    - BUG/MINOR: mux-h2: mark end-of-stream after processing response HEADERS, not before
    - BUG/MINOR: mux-h2: only update rxbuf's length for H1 headers
    - BUG/MEDIUM: mux-h1: use per-direction flags to indicate transitions
    - BUG/MEDIUM: mux-h1: make HTX chunking consistent with H2
    - BUG/MAJOR: stream-int: Update the stream expiration date in stream_int_notify()
    - BUG/MEDIUM: proto-htx: Set SI_FL_NOHALF on server side when request is done
    - BUG/MEDIUM: mux-h1: Add a task to handle connection timeouts
    - MINOR: mux-h2: make h2c_decode_headers() return a status, not a count
    - MINOR: mux-h2: add a new dummy stream : h2_error_stream
    - MEDIUM: mux-h2: make h2c_decode_headers() support recoverable errors
    - BUG/MINOR: mux-h2: detect when the HTX EOM block cannot be added after headers
    - MINOR: mux-h2: remove a misleading and impossible test
    - CLEANUP: mux-h2: clean the stream error path on HEADERS frame processing
    - MINOR: mux-h2: check for too many streams only for idle streams
    - MINOR: mux-h2: set H2_SF_HEADERS_RCVD when a HEADERS frame was decoded
    - BUG/MEDIUM: mux-h2: decode trailers in HEADERS frames
    - MINOR: h2: add h2_make_h1_trailers to turn H2 headers to H1 trailers
    - MEDIUM: mux-h2: pass trailers to H1 (legacy mode)
    - MINOR: htx: add a new function to add a block without filling it
    - MINOR: h2: add h2_make_htx_trailers to turn H2 headers to HTX trailers
    - MEDIUM: mux-h2: pass trailers to HTX
    - MINOR: mux-h1: parse the content-length header on output and set H1_MF_CLEN
    - BUG/MEDIUM: mux-h1: don't enforce chunked encoding on requests
    - MINOR: mux-h2: make HTX_BLK_EOM processing idempotent
    - MINOR: h1: make the H1 headers block parser able to parse headers only
    - MEDIUM: mux-h2: emit HEADERS frames when facing HTX trailers blocks
    - MINOR: stream/htx: Add info about the HTX structs in "show sess all" command
    - MINOR: stream: Add the subscription events of SIs in "show sess all" command
    - MINOR: mux-h1: Add the subscription events in "show fd" command
    - BUG/MEDIUM: h1: Get the h1m state when restarting the headers parsing
    - BUG/MINOR: cache/htx: Be sure to count partial trailers
    - BUG/MEDIUM: h1: In h1_init(), wake the tasklet instead of calling h1_recv().
    - BUG/MEDIUM: server: Defer the mux init until after xprt has been initialized.
    - MINOR: connections: Remove a stall comment.
    - BUG/MEDIUM: cli: make "show sess" really thread-safe
    - BUILD: add a new file "version.c" to carry version updates
    - MINOR: stream/htx: add the HTX flags output in "show sess all"
    - MINOR: stream/cli: fix the location of the waiting flag in "show sess all"
    - MINOR: stream/cli: report more info about the HTTP messages on "show sess all"
    - BUG/MINOR: lua: bad args are returned for Lua actions
    - BUG/MEDIUM: lua: dead lock when Lua tasks are trigerred
    - MINOR: htx: Add an helper function to get the max space usable for a block
    - MINOR: channel/htx: Add HTX version for some helper functions
    - BUG/MEDIUM: cache/htx: Respect the reserve when cached objects are served
    - BUG/MINOR: stats/htx: Respect the reserve when the stats page is dumped
    - DOC: regtest: make it clearer what the purpose of the "broken" series is
    - REGTEST: mailers: add new test for 'mailers' section
    - REGTEST: Add a reg test for health-checks over SSL/TLS.
    - BUG/MINOR: mux-h1: Close connection on shutr only when shutw was really done
    - MEDIUM: mux-h1: Clarify how shutr/shutw are handled
    - BUG/MINOR: compression: Disable it if another one is already in progress
    - BUG/MINOR: filters: Detect cache+compression config on legacy HTTP streams
    - BUG/MINOR: cache: Disable the cache if any compression filter precedes it
    - REGTEST: Add some informatoin to test results.
    - MINOR: htx: Add a function to truncate all blocks after a specific offset
    - MINOR: channel/htx: Add the HTX version of channel_truncate/erase
    - BUG/MINOR: proto_htx: Use HTX versions to truncate or erase a buffer
    - BUG/CRITICAL: mux-h2: re-check the frame length when PRIORITY is used
    - DOC: Fix typo in req.ssl_alpn example (commit 4afdd138424ab...)
    - DOC: http-request cache-use / http-response cache-store expects cache name
    - REGTEST: "capture (request|response)" regtest.
    - BUG/MINOR: lua/htx: Respect the reserve when data are send from an HTX applet
    - REGTEST: filters: add compression test
    - BUG/MEDIUM: init: Initialize idle_orphan_conns for first server in server-template
    - BUG/MEDIUM: ssl: Disable anti-replay protection and set max data with 0RTT.
    - DOC: Be a bit more explicit about allow-0rtt security implications.
    - MINOR: mux-h1: make the mux_h1_ops struct static
    - BUILD: makefile: add an EXTRA_OBJS variable to help build optional code
    - BUG/MEDIUM: connection: properly unregister the mux on failed initialization
    - BUG/MAJOR: cache: fix confusion between zero and uninitialized cache key
    - REGTESTS: test case for map_regm commit 271022150d
    - REGTESTS: Basic tests for concat,strcmp,word,field,ipmask converters
    - REGTESTS: Basic tests for using maps to redirect requests / select backend
    - DOC: REGTESTS README varnishtest -Dno-htx= define.
    - MINOR: spoe: Make the SPOE filter compatible with HTX proxies
    - MINOR: checks: Store the proxy in checks.
    - BUG/MEDIUM: checks: Avoid having an associated server for email checks.
    - REGTEST: Switch to vtest.
    - REGTEST: Adapt reg test doc files to vtest.
    - BUG/MEDIUM: h1: Make sure we destroy an inactive connectin that did shutw.
    - BUG/MINOR: base64: dec func ignores padding for output size checking
    - BUG/MEDIUM: ssl: missing allocation failure checks loading tls key file
    - MINOR: ssl: add support of aes256 bits ticket keys on file and cli.
    - BUG/MINOR: backend: don't use url_param_name as a hint for BE_LB_ALGO_PH
    - BUG/MINOR: backend: balance uri specific options were lost across defaults
    - BUG/MINOR: backend: BE_LB_LKUP_CHTREE is a value, not a bit
    - MINOR: backend: move url_param_name/len to lbprm.arg_str/len
    - MINOR: backend: make headers and RDP cookie also use arg_str/len
    - MINOR: backend: add new fields in lbprm to store more LB options
    - MINOR: backend: make the header hash use arg_opt1 for use_domain_only
    - MINOR: backend: remap the balance uri settings to lbprm.arg_opt{1,2,3}
    - MINOR: backend: move hash_balance_factor out of chash
    - MEDIUM: backend: move all LB algo parameters into an union
    - MINOR: backend: make the random algorithm support a number of draws
    - BUILD/MEDIUM: da: Necessary code changes for new buffer API.
    - BUG/MINOR: stick_table: Prevent conn_cur from underflowing
    - BUG: 51d: Changes to the buffer API in 1.9 were not applied to the 51Degrees code.
    - BUG/MEDIUM: stats: Get the right scope pointer depending on HTX is used or not
    - DOC: add a missing space in the documentation for bc_http_major
    - REGTEST: checks basic stats webpage functionality
    - BUG/MEDIUM: servers: Make assign_tproxy_address work when ALPN is set.
    - BUG/MEDIUM: connections: Add the CO_FL_CONNECTED flag if a send succeeded.
    - DOC: add github issue templates
    - MINOR: cfgparse: Extract some code to be re-used.
    - CLEANUP: cfgparse: Return asap from cfg_parse_peers().
    - CLEANUP: cfgparse: Code reindentation.
    - MINOR: cfgparse: Useless frontend initialization in "peers" sections.
    - MINOR: cfgparse: Rework peers frontend init.
    - MINOR: cfgparse: Simplication.
    - MINOR: cfgparse: Make "peer" lines be parsed as "server" lines.
    - MINOR: peers: Make outgoing connection to SSL/TLS peers work.
    - MINOR: cfgparse: SSL/TLS binding in "peers" sections.
    - DOC: peers: SSL/TLS documentation for "peers"
    - BUG/MINOR: startup: certain goto paths in init_pollers fail to free
    - BUG/MEDIUM: checks: fix recent regression on agent-check making it crash
    - BUG/MINOR: server: don't always trust srv_check_health when loading a server state
    - BUG/MINOR: check: Wake the check task if the check is finished in wake_srv_chk()
    - BUG/MEDIUM: ssl: Fix handling of TLS 1.3 KeyUpdate messages
    - DOC: mention the effect of nf_conntrack_tcp_loose on src/dst
    - BUG/MINOR: proto-htx: Return an error if all headers cannot be received at once
    - BUG/MEDIUM: mux-h2/htx: Respect the channel's reserve
    - BUG/MINOR: mux-h1: Apply the reserve on the channel's buffer only
    - BUG/MINOR: mux-h1: avoid copying output over itself in zero-copy
    - BUG/MAJOR: mux-h2: don't destroy the stream on failed allocation in h2_snd_buf()
    - BUG/MEDIUM: backend: also remove from idle list muxes that have no more room
    - BUG/MEDIUM: mux-h2: properly abort on trailers decoding errors
    - MINOR: h2: declare new sets of frame types
    - BUG/MINOR: mux-h2: CONTINUATION in closed state must always return GOAWAY
    - BUG/MINOR: mux-h2: headers-type frames in HREM are always a connection error
    - BUG/MINOR: mux-h2: make it possible to set the error code on an already closed stream
    - BUG/MINOR: hpack: return a compression error on invalid table size updates
    - MINOR: server: make sure pool-max-conn is >= -1
    - BUG/MINOR: stream: take care of synchronous errors when trying to send
    - CLEANUP: server: fix indentation mess on idle connections
    - BUG/MINOR: mux-h2: always check the stream ID limit in h2_avail_streams()
    - BUG/MINOR: mux-h2: refuse to allocate a stream with too high an ID
    - BUG/MEDIUM: backend: never try to attach to a mux having no more stream available
    - MINOR: server: add a max-reuse parameter
    - MINOR: mux-h2: always consider a server's max-reuse parameter
    - MEDIUM: stream-int: always mark pending outgoing SI_ST_CON
    - MINOR: stream: don't wait before retrying after a failed connection reuse
    - MEDIUM: h2: always parse and deduplicate the content-length header
    - BUG/MINOR: mux-h2: always compare content-length to the sum of DATA frames
    - CLEANUP: h2: Remove debug printf in mux_h2.c
    - MINOR: cfgparse: make the process/thread parser support a maximum value
    - MINOR: threads: make MAX_THREADS configurable at build time
    - DOC: nbthread is no longer experimental.
    - BUG/MINOR: listener: always fill the source address for accepted socketpairs
    - BUG/MINOR: mux-h2: do not report available outgoing streams after GOAWAY
    - BUG/MINOR: spoe: corrected fragmentation string size
    - BUG/MINOR: task: fix possibly missed event in inter-thread wakeups
    - BUG/MEDIUM: servers: Attempt to reuse an unfinished connection on retry.
    - BUG/MEDIUM: backend: always call si_detach_endpoint() on async connection failure
    - SCRIPTS: add the issue tracker URL to the announce script
    - MINOR: peers: Extract some code to be reused.
    - CLEANUP: peers: Indentation fixes.
    - MINOR: peers: send code factorization.
    - MINOR: peers: Add new functions to send code and reduce the I/O handler.
    - MEDIUM: peers: synchronizaiton code factorization to reduce the size of the I/O handler.
    - MINOR: peers: Move update receive code to reduce the size of the I/O handler.
    - MINOR: peers: Move ack, switch and definition receive code to reduce the size of the I/O handler.
    - MINOR: peers: Move high level receive code to reduce the size of I/O handler.
    - CLEANUP: peers: Be more generic.
    - MINOR: peers: move error handling to reduce the size of the I/O handler.
    - MINOR: peers: move messages treatment code to reduce the size of the I/O handler.
    - MINOR: peers: move send code to reduce the size of the I/O handler.
    - CLEANUP: peers: Remove useless statements.
    - MINOR: peers: move "hello" message treatment code to reduce the size of the I/O handler.
    - MINOR: peers: move peer initializations code to reduce the size of the I/O handler.
    - CLEANUP: peers: factor the error handling code in peer_treet_updatemsg()
    - CLEANUP: peers: factor error handling in peer_treat_definedmsg()
    - BUILD/MINOR: peers: shut up a build warning introduced during last cleanup
    - BUG/MEDIUM: mux-h2: only close connection on request frames on closed streams
    - CLEANUP: mux-h2: remove two useless but misleading assignments
    - BUG/MEDIUM: checks: Check that conn_install_mux succeeded.
    - BUG/MEDIUM: servers: Only destroy a conn_stream we just allocated.
    - BUG/MEDIUM: servers: Don't add an incomplete conn to the server idle list.
    - BUG/MEDIUM: checks: Don't try to set ALPN if connection failed.
    - BUG/MEDIUM: h2: In h2_send(), stop the loop if we failed to alloc a buf.
    - BUG/MEDIUM: peers: Handle mux creation failure.
    - BUG/MEDIUM: servers: Close the connection if we failed to install the mux.
    - BUG/MEDIUM: compression: Rewrite strong ETags
    - BUG/MINOR: deinit: tcp_rep.inspect_rules not deinit, add to deinit
    - CLEANUP: mux-h2: remove misleading leftover test on h2s' nullity
    - BUG/MEDIUM: mux-h2: wake up flow-controlled streams on initial window update
    - BUG/MEDIUM: mux-h2: fix two half-closed to closed transitions
    - BUG/MEDIUM: mux-h2: make sure never to send GOAWAY on too old streams
    - BUG/MEDIUM: mux-h2: do not abort HEADERS frame before decoding them
    - BUG/MINOR: mux-h2: make sure response HEADERS are not received in other states than OPEN and HLOC
    - MINOR: h2: add a generic frame checker
    - MEDIUM: mux-h2: check the frame validity before considering the stream state
    - CLEANUP: mux-h2: remove stream ID and frame length checks from the frame parsers
    - BUG/MINOR: mux-h2: make sure request trailers on aborted streams don't break the connection
    - DOC: compression: Update the reasons for disabled compression
    - BUG/MEDIUM: buffer: Make sure b_is_null handles buffers waiting for allocation.
    - DOC: htx: make it clear that htxbuf() and htx_from_buf() always return valid pointers
    - MINOR: htx: never check for null htx pointer in htx_is_{,not_}empty()
    - MINOR: mux-h2: consistently rely on the htx variable to detect the mode
    - BUG/MEDIUM: peers: Peer addresses parsing broken.
    - BUG/MEDIUM: mux-h1: Don't add "transfer-encoding" if message-body is forbidden
    - BUG/MEDIUM: connections: Don't forget to remove CO_FL_SESS_IDLE.
    - BUG/MINOR: stream: don't close the front connection when facing a backend error
    - BUG/MEDIUM: mux-h2: wait for the mux buffer to be empty before closing the connection
    - MINOR: stream-int: add a new flag to mention that we want the connection to be killed
    - MINOR: connstream: have a new flag CS_FL_KILL_CONN to kill a connection
    - BUG/MEDIUM: mux-h2: do not close the connection on aborted streams
    - BUG/MINOR: server: fix logic flaw in idle connection list management
    - MINOR: mux-h2: max-concurrent-streams should be unsigned
    - MINOR: mux-h2: make sure to only check concurrency limit on the frontend
    - MINOR: mux-h2: learn and store the peer's advertised MAX_CONCURRENT_STREAMS setting
    - BUG/MEDIUM: mux-h2: properly consider the peer's advertised max-concurrent-streams
    - MINOR: xref: Add missing barriers.
    - MINOR: muxes: Don't bother to LIST_DEL(&conn->list) before calling conn_free().
    - MINOR: debug: Add an option that causes random allocation failures.
    - BUG/MEDIUM: backend: always release the previous connection into its own target srv_list
    - BUG/MEDIUM: htx: check the HTX compatibility in dynamic use-backend rules
    - BUG/MINOR: tune.fail-alloc: Don't forget to initialize ret.
    - BUG/MINOR: backend: check srv_conn before dereferencing it
    - BUG/MEDIUM: mux-h2: always omit :scheme and :path for the CONNECT method
    - BUG/MEDIUM: mux-h2: always set :authority on request output
    - BUG/MEDIUM: stream: Don't forget to free s->unique_id in stream_free().
    - BUG/MINOR: threads: fix the process range of thread masks
    - BUG/MINOR: config: fix bind line thread mask validation
    - CLEANUP: threads: fix misleading comment about all_threads_mask
    - CLEANUP: threads: use nbits to calculate the thread mask
    - OPTIM: listener: optimize cache-line packing for struct listener
    - MINOR: tools: improve the popcount() operation
    - MINOR: config: keep an all_proc_mask like we have all_threads_mask
    - MINOR: global: add proc_mask() and thread_mask()
    - MINOR: config: simplify bind_proc processing using proc_mask()
    - MINOR: threads: make use of thread_mask() to simplify some thread calculations
    - BUG/MINOR: compression: properly report compression stats in HTX mode
    - BUG/MINOR: task: close a tiny race in the inter-thread wakeup
    - BUG/MAJOR: config: verify that targets of track-sc and stick rules are present
    - BUG/MAJOR: spoe: verify that backends used by SPOE cover all their callers' processes
    - BUG/MAJOR: htx/backend: Make all tests on HTTP messages compatible with HTX
    - BUG/MINOR: config: make sure to count the error on incorrect track-sc/stick rules
    - DOC: ssl: Clarify when pre TLSv1.3 cipher can be used
    - DOC: ssl: Stop documenting ciphers example to use
    - BUG/MINOR: spoe: do not assume agent->rt is valid on exit
    - BUG/MINOR: lua: initialize the correct idle conn lists for the SSL sockets
    - BUG/MEDIUM: spoe: initialization depending on nbthread must be done last
    - BUG/MEDIUM: server: initialize the idle conns list after parsing the config
    - BUG/MEDIUM: server: initialize the orphaned conns lists and tasks at the end
    - MINOR: config: make MAX_PROCS configurable at build time
    - BUG/MAJOR: spoe: Don't try to get agent config during SPOP healthcheck
    - BUG/MINOR: config: Reinforce validity check when a process number is parsed
    - BUG/MEDIUM: peers: check that p->srv actually exists before using p->srv->use_ssl
    - CONTRIB: contrib/prometheus-exporter: Add a Prometheus exporter for HAProxy
    - BUG/MINOR: mux-h1: verify the request's version before dropping connection: keep-alive
    - BUG: 51d: In Hash Trie, multi header matching was affected by the header names stored globaly.
    - MEDIUM: 51d: Enabled multi threaded operation in the 51Degrees module.
    - BUG/MAJOR: stream: avoid double free on unique_id
    - BUILD/MINOR: stream: avoid a build warning with threads disabled
    - BUILD/MINOR: tools: fix build warning in the date conversion functions
    - BUILD/MINOR: peers: remove an impossible null test in intencode()
    - BUILD/MINOR: htx: fix some potential null-deref warnings with http_find_stline
    - BUG/MEDIUM: peers: Missing peer initializations.
    - BUG/MEDIUM: http_fetch: fix the "base" and "base32" fetch methods in HTX mode
    - BUG/MEDIUM: proto_htx: Fix data size update if end of the cookie is removed
    - BUG/MEDIUM: http_fetch: fix "req.body_len" and "req.body_size" fetch methods in HTX mode
    - BUILD/MEDIUM: initcall: Fix build on MacOS.
    - BUG/MEDIUM: mux-h2/htx: Always set CS flags before exiting h2_rcv_buf()
    - MINOR: h2/htx: Set the flag HTX_SL_F_BODYLESS for messages without body
    - BUG/MINOR: mux-h1: Add "transfer-encoding" header on outgoing requests if needed
    - BUG/MINOR: mux-h2: Don't add ":status" pseudo-header on trailers
    - BUG/MINOR: proto-htx: Consider a XFER_LEN message as chunked by default
    - BUG/MEDIUM: h2/htx: Correctly handle interim responses when HTX is enabled
    - MINOR: mux-h2: Set HTX extra value when possible
    - BUG/MEDIUM: htx: count the amount of copied data towards the final count
    - MINOR: mux-h2: make the H2 MAX_FRAME_SIZE setting configurable
    - BUG/MEDIUM: mux-h2/htx: send an empty DATA frame on empty HTX trailers
    - BUG/MEDIUM: servers: Use atomic operations when handling curr_idle_conns.
    - BUG/MEDIUM: servers: Add a per-thread counter of idle connections.
    - MINOR: fd: add a new my_closefrom() function to close all FDs
    - MINOR: checks: use my_closefrom() to close all FDs
    - MINOR: fd: implement an optimised my_closefrom() function
    - BUG/MINOR: fd: make sure my_closefrom() doesn't miss some FDs
    - BUG/MAJOR: fd/threads, task/threads: ensure all spin locks are unlocked
    - BUG/MAJOR: listener: Make sure the listener exist before using it.
    - MINOR: fd: Use closefrom() as my_closefrom() if supported.
    - BUG/MEDIUM: mux-h1: Report the right amount of data xferred in h1_rcv_buf()
    - BUG/MINOR: channel: Set CF_WROTE_DATA when outgoing data are skipped
    - MINOR: htx: Add function to drain data from an HTX message
    - MINOR: channel/htx: Add function to skips output bytes from an HTX channel
    - BUG/MAJOR: cache/htx: Set the start-line offset when a cached object is served
    - BUG/MEDIUM: cache: Get objects from the cache only for GET and HEAD requests
    - BUG/MINOR: cache/htx: Return only the headers of cached objects to HEAD requests
    - BUG/MINOR: mux-h1: Always initilize h1m variable in h1_process_input()
    - BUG/MEDIUM: proto_htx: Fix functions applying regex filters on HTX messages
    - BUG/MEDIUM: h2: advertise to servers that we don't support push
    - MINOR: standard: Add a function to parse uints (dotted notation).
    - MINOR: arg: Add support for ARGT_PBUF_FNUM arg type.
    - MINOR: http_fetch: add "req.ungrpc" sample fetch for gRPC.
    - MINOR: sample: Add two sample converters for protocol buffers.
    - DOC: sample: Add gRPC related documentation.
2019-02-26 16:43:49 +01:00
Frédéric Lécaille
fd95c62f1b MINOR: sample: Add two sample converters for protocol buffers.
Add "varint" to convert all the protocol buffers binary varints excepted the signed
ones ("sint32" and "sint64") to an integer. The binary signed varints may be
converted to an integer with "svarint" converter implemented by this patch.
These two new converters do not take any argument.
2019-02-26 16:27:05 +01:00
Frédéric Lécaille
1fceee8316 MINOR: http_fetch: add "req.ungrpc" sample fetch for gRPC.
This patch implements "req.ungrpc" sample fetch method to decode and
parse a gRPC request. It takes only one argument: a protocol buffers
field number to identify the protocol buffers message number to be looked up.
This argument is a sort of path in dotted notation to the terminal field number
to be retrieved.

  ex:
    req.ungrpc(1.2.3.4)

This sample fetch catch the data in raw mode, without interpreting them.
Some protocol buffers specific converters may be used to convert the data
to the correct type.
2019-02-26 16:27:05 +01:00
Frédéric Lécaille
3a463c92cf MINOR: arg: Add support for ARGT_PBUF_FNUM arg type.
This new argument type is used to parse Protocol Buffers field number
with dotted notation (e.g: 1.2.3.4).
2019-02-26 16:27:05 +01:00
Frédéric Lécaille
3b71716685 MINOR: standard: Add a function to parse uints (dotted notation).
This function is useful to parse strings made of unsigned integers
and to allocate a C array of unsigned integers from there.
For instance this function allocates this array { 1, 2, 3, 4, } from
this string: "1.2.3.4".
2019-02-26 16:27:05 +01:00
Willy Tarreau
0bbad6bb06 BUG/MEDIUM: h2: advertise to servers that we don't support push
The h2c_send_settings() function was initially made to serve on the
frontend. Here we don't need to advertise that we don't support PUSH
since we don't do that ourselves. But on the backend side it's
different because PUSH is enabled by default so we must announce that
we don't want the server to use it.

This must be backported to 1.9.
2019-02-26 16:07:27 +01:00
Christopher Faulet
02e771a9e0 BUG/MEDIUM: proto_htx: Fix functions applying regex filters on HTX messages
The HTX functions htx_apply_filter_to_req_headers() and
htx_apply_filter_to_resp_headers() contain 2 bugs. The first one is about the
matching on each header. The chunk 'hdr' used to format a full header line was
never reset. The second bug appears when we try to replace or remove a
header. The variable ctx was not fully initialized, leading to sefaults.

This patch must be backported to 1.9.
2019-02-26 15:45:02 +01:00
Christopher Faulet
7402776c52 BUG/MINOR: mux-h1: Always initilize h1m variable in h1_process_input()
It is used at the end of the function to know if the end of the message was
reached. So we must be sure to always initialize it.

This patch must be backported to 1.9.
2019-02-26 14:51:17 +01:00
Christopher Faulet
f0dd037456 BUG/MINOR: cache/htx: Return only the headers of cached objects to HEAD requests
The body of a cached object must not be sent in response to a HEAD request. This
works for the legacy HTTP because the parsing is performed by HTTP analyzers
_AND_ because the connection is closed at the end of the transaction. So the
body is ignored. But the applet send it. For the HTX, the applet must skip the
body explicitly.

This patch must be backported to 1.9.
2019-02-26 14:04:23 +01:00
Christopher Faulet
b3d4bca415 BUG/MEDIUM: cache: Get objects from the cache only for GET and HEAD requests
Only responses for GET requests are stored in the cache. But there is no check
on the method during the lookup. So it is possible to retrieve an object from
the cache independently of the method, from the time the key of the object
matches. Now, lookups are performed only for GET and HEAD requests.

This patch must be backportedi in 1.9.
2019-02-26 14:04:23 +01:00
Christopher Faulet
a0df957471 BUG/MAJOR: cache/htx: Set the start-line offset when a cached object is served
When the function htx_add_stline() is used, this offset is automatically set
when necessary. But the HTX cache applet adds all header blocks of the responses
manually, including the start-line. So its offset must be explicitly set by the
applet.

When everything goes well, the HTTP analyzer http_wait_for_response() looks for
the start-line in the HTX messages, calling http_find_stline(). If necessary,
the start-line offet will also be automatically set during this stage. So the
bug of the HTX cache applet does not hurt most of the time. But, when an error
occurred, HTTP responses analyzers can be bypassed. In such caese, the
start-line offset of cached responses remains unset.

Some part of the code relies on the start-line offset to process the HTX
messages. Among others, when H2 responses are sent to clients, the H2
multiplexer read the start-line without any check, because it _MUST_ always be
there. if its offset is not set, a NULL pointer is dereferenced leading to a
segfault.

The patch must be backported to 1.9.
2019-02-26 14:04:23 +01:00
Christopher Faulet
549822f0a1 MINOR: htx: Add function to drain data from an HTX message
The function htx_drain() can now be used to drain data from an HTX message.

It will be used by other commits to fix bugs, so it must be backported to 1.9.
2019-02-26 14:04:23 +01:00
Christopher Faulet
b8d2ee0406 BUG/MEDIUM: mux-h1: Report the right amount of data xferred in h1_rcv_buf()
h1_rcv_buf() must return the amount of data copied in the channel's buffer and
not the number of bytes parsed. Because this value is used during the fast
forwarding to decrement to_forward value, returning the wrong value leads to
undefined behaviours.

This patch must be backported to 1.9.
2019-02-26 14:04:23 +01:00
Olivier Houchard
2292edf67c MINOR: fd: Use closefrom() as my_closefrom() if supported.
Add a new option, USE_CLOSEFROM. If set, it is assumed the system provides
a closefrom() function, so use it.
It is only implicitely used on FreeBSD for now, it should work on
OpenBSD/NetBSD/DragonflyBSD/Solaris too, but as I have no such system to
test it, I'd rather leave it disabled by default. Users can add USE_CLOSEFROM
explicitely on their make command line to activate it.
2019-02-25 16:51:03 +01:00
Olivier Houchard
d16a9dfed8 BUG/MAJOR: listener: Make sure the listener exist before using it.
In listener_accept(), make sure we have a listener before attempting to
use it.
An another thread may have closed the FD meanwhile, and set fdtab[fd].owner
to NULL.
As the listener is not free'd, it is ok to attempt to accept() a new
connection even if the listener was closed. At worst the fd has been
reassigned to another connection, and accept() will fail anyway.

Many thanks to Richard Russo for reporting the problem, and suggesting the
fix.

This should be backported to 1.9 and 1.8.
2019-02-25 16:30:13 +01:00
Richard Russo
bc9d9844d5 BUG/MAJOR: fd/threads, task/threads: ensure all spin locks are unlocked
Calculate if the fd or task should be locked once, before locking, and
reuse the calculation when determing when to unlock.

Fixes a race condition added in 87d54a9a for fds, and b20aa9ee for tasks,
released in 1.9-dev4. When one thread modifies thread_mask to be a single
thread for a task or fd while a second thread has locked or is waiting on a
lock for that task or fd, the second thread will not unlock it.  For FDs,
this is observable when a listener is polled by multiple threads, and is
closed while those threads have events pending.  For tasks, this seems
possible, where task_set_affinity is called, but I did not observe it.

This must be backported to 1.9.
2019-02-25 16:16:36 +01:00
Willy Tarreau
b8e602cb1b BUG/MINOR: fd: make sure my_closefrom() doesn't miss some FDs
The optimized my_closefrom() implementation introduced with previous commit
9188ac60e ("MINOR: fd: implement an optimised my_closefrom() function")
has a small bug causing it to miss some FDs at the end of each batch.
The reason is that poll() returns the number of non-zero events, so
it contains the size of the batch minus the FDs to close. Thus if the
FDs to close are at the beginning they'll be seen but if they're at the
end after all other closed ones, the returned count will not cover them.

No backport is needed.
2019-02-22 09:07:42 +01:00
Willy Tarreau
9188ac60eb MINOR: fd: implement an optimised my_closefrom() function
The idea is that poll() can set the POLLNVAL flag for each invalid
FD in a pollfd list. Thus this function makes use of poll() when
compiled in, and builds lists of up to 1024 FDs at once, checks the
output and only closes those which do not have this flag set. Tests
show that this is about twice as fast as blindly calling close() for
each closed fd.
2019-02-21 23:07:24 +01:00
Willy Tarreau
2555ccf4d0 MINOR: checks: use my_closefrom() to close all FDs
Instead of looping on all FDs, let's use my_closefrom() which does it
respecting the current process' limits and possibly doing it more
efficiently.
2019-02-21 22:22:06 +01:00
Willy Tarreau
2d7f81b809 MINOR: fd: add a new my_closefrom() function to close all FDs
This is a naive implementation of closefrom() which closes all FDs
starting from the one passed in argument. closefrom() is not provided
on all operating systems, and other versions will follow.
2019-02-21 22:19:17 +01:00
Olivier Houchard
f131481a0a BUG/MEDIUM: servers: Add a per-thread counter of idle connections.
Add a per-thread counter of idling connections, and use it to determine
how many connections we should kill after the timeout, instead of using
the global counter, or we're likely to just kill most of the connections.

This should be backported to 1.9.
2019-02-21 19:07:45 +01:00
Olivier Houchard
e737103173 BUG/MEDIUM: servers: Use atomic operations when handling curr_idle_conns.
Use atomic operations when dealing with srv->curr_idle_conns, as it's shared
between threads, otherwise we could get inconsistencies.

This should be backported to 1.9.
2019-02-21 19:07:19 +01:00
Willy Tarreau
67b8caefc9 BUG/MEDIUM: mux-h2/htx: send an empty DATA frame on empty HTX trailers
When chunked-encoding is used in HTX mode, a trailers HTX block is always
made due to the way trailers are currently implemented (verbatim copy of
the H1 representation). Because of this it's not possible to know when
processing data that we've reached the end of the stream, and it's up
to the function encoding the trailers (h2s_htx_make_trailers) to put the
end of stream. But when there are no trailers and only an empty HTX block,
this one cannot produce a HEADERS frame, thus it cannot send the END_STREAM
flag either, leaving the other end with an incomplete message, waiting for
either more data or some trailers. This is particularly visible with POST
requests where the server continues to wait.

What this patch does is transform the HEADERS frame into an empty DATA
frame when meeting an empty trailers block. It is possible to do this
because we've not sent any trailers so the other end is still waiting
for DATA frames. The check is made after attempting to encode the list
of headers, so as to minimize the specific code paths.

Thanks to Dragan Dosen for reporting the issue with a reproducer.

This fix must be backported to 1.9.
2019-02-21 18:22:26 +01:00
Willy Tarreau
a24b35ca18 MINOR: mux-h2: make the H2 MAX_FRAME_SIZE setting configurable
This creates a new tunable "tune.h2.max-frame-size" to adjust the
advertised max frame size. When not set it still defaults to the buffer
size. It is convenient to advertise sizes lower than the buffer size,
for example when using very large buffers.
2019-02-21 17:30:59 +01:00
Willy Tarreau
2bf0c13261 BUG/MEDIUM: htx: count the amount of copied data towards the final count
Currently htx_xfer_blks() respects the <count> limit for each block
instead of for the sum of the transfered blocks. This causes it to
return slightly more than requested when both headers and data are
present in the source buffer, which happens early in the transfer
when the reserve is still active. Thus with large enough headers,
the reserve will not be respected.

Note that this function is only called from h2_rcv_buf() thus this only
affects data entering over H2 (H2 requests or H2 responses).

This fix must be backported to 1.9.
2019-02-21 17:13:07 +01:00
Christopher Faulet
eaf0d2a936 MINOR: mux-h2: Set HTX extra value when possible
For now, this can be only done when a content-length is specified. In fact, it
is the same value than h2s->body_len, the remaining body length according to
content-length. Setting this field allows the fast forwarding at the channel
layer, improving significantly data transfer for big objects.

This patch may be backported to 1.9.
2019-02-19 16:26:14 +01:00
Christopher Faulet
0b46548a68 BUG/MEDIUM: h2/htx: Correctly handle interim responses when HTX is enabled
1xx responses does not work in HTTP2 when the HTX is enabled. First of all, when
a response is parsed, only one HEADERS frame is expected. So when an interim
response is received, the flag H2_SF_HEADERS_RCVD is set and the next HEADERS
frame (for another interim repsonse or the final one) is parsed as a trailers
one. Then when the response is sent, because an EOM block is found at the end of
the interim HTX response, the ES flag is added on the frame, closing too early
the stream. Here, it is a design problem of the HTX. Iterim responses are
considered as full messages, leading to some ambiguities when HTX messages are
processed. This will not be fixed now, but we need to keep it in mind for future
improvements.

To fix the parsing bug, the flag H2_MSGF_RSP_1XX is added when the response
headers are decoded. When this flag is set, an EOM block is added into the HTX
message, despite the fact that there is no ES flag on the frame. And we don't
set the flag H2_SF_HEADERS_RCVD on the corresponding H2S. So the next HEADERS
frame will not be parsed as a trailers one.

To fix the sending bug, the ES flag is not set on the frame when an interim
response is processed and the flag H2_SF_HEADERS_SENT is not set on the
corresponding H2S.

This patch must be backported to 1.9.
2019-02-19 16:26:14 +01:00
Christopher Faulet
834eee7928 BUG/MINOR: proto-htx: Consider a XFER_LEN message as chunked by default
An HTX message with a known body length is now considered by default as
chunked. It means the header "content-length" must be found to consider it as a
non-chunked message. Before, it was the reverse, the message was considered with
a content length by default. But it is a bug for HTTP/2 messages. There is no
chunked transfer encoding in HTTP/2 but internally messages without content
length are considered as chunked. It eases HTTP/1 <-> HTTP/2 conversions.

This patch must be backported to 1.9.
2019-02-18 16:25:06 +01:00
Christopher Faulet
fd74267264 BUG/MINOR: mux-h2: Don't add ":status" pseudo-header on trailers
It is a cut-paste bug. Pseudo-header fields MUST NOT appear in trailers.

This patch must be backported to 1.9.
2019-02-18 16:25:06 +01:00
Christopher Faulet
1f890ddbe2 BUG/MINOR: mux-h1: Add "transfer-encoding" header on outgoing requests if needed
As for outgoing response, if an HTTP/1.1 or above request is sent to a server
with neither the headers "content-length" nor "transfer-encoding", it is
considered as a chunked request and the header "transfer-encoding: chunked" is
automatically added. Of course, it is only true for requests with a
body. Concretely, it only happens for incoming HTTP/2 requests sent to an
HTTP/1.1 server.

This patch must be backported to 1.9.
2019-02-18 16:25:06 +01:00
Christopher Faulet
44af3cfca3 MINOR: h2/htx: Set the flag HTX_SL_F_BODYLESS for messages without body
This information is usefull to know if a body is expected or not, regardless the
presence or not of the header "Content-Length" and its value. Once the ES flag
is set on the header frame or when the content length is 0, we can safely add
the flag HTX_SL_F_BODYLESS on the HTX start-line.

Among other things, it will help the mux-h1 to know if it should add TE header
or not. It will also help the HTTP compression filter.

This patch must be backported to 1.9 because a bug fix depends on it.
2019-02-18 16:25:06 +01:00
Christopher Faulet
37070b2b15 BUG/MEDIUM: mux-h2/htx: Always set CS flags before exiting h2_rcv_buf()
It is especially important when some data are blocked in the RX buf and the
channel buffer is already full. In such case, instead of exiting the function
directly, we need to set right flags on the conn_stream. CS_FL_RCV_MORE and
CS_FL_WANT_ROOM must be set, otherwise, the stream-interface will subscribe to
receive events, thinking it is not blocked.

This bug leads to connection freeze when everything was received with some data
blocked in the RX buf and a channel full.

This patch must be backported to 1.9.
2019-02-18 16:25:06 +01:00
Dragan Dosen
5a606685f1 BUG/MEDIUM: http_fetch: fix "req.body_len" and "req.body_size" fetch methods in HTX mode
When in HTX mode, in functions smp_fetch_body_len() and
smp_fetch_body_size() we were subtracting the size of each header block
from the total size htx->data to calculate the size of body, and that
could result in wrong calculated value.

To avoid this, we now loop on blocks to sum up the size of only those
that are of type HTX_BLK_DATA.

This patch must be backported to 1.9.
2019-02-14 15:41:17 +01:00
Christopher Faulet
6cdaf2ad9a BUG/MEDIUM: proto_htx: Fix data size update if end of the cookie is removed
When client-side or server-side cookies are parsed, if the end of the cookie
line is removed, the HTX message must be updated. The length of the HTX block is
decreased and the data size of the HTX message is modified accordingly. The
update of the HTX block was ok but the update of the HTX message was wrong,
leading to undefined behaviours during the data forwarding. One of possible
effect was a freeze of the connection and no data forward.

This patch must be backported in 1.9.
2019-02-13 09:56:54 +01:00
Dragan Dosen
8861e1c082 BUG/MEDIUM: http_fetch: fix the "base" and "base32" fetch methods in HTX mode
The resulting value produced in functions smp_fetch_base() and
smp_fetch_base32() was wrong when in HTX mode.

This patch also adds the semicolon at the end of the for-loop line, used
in function smp_fetch_path(), since it's actually with no body.

This must be backported to 1.9.
2019-02-12 20:43:03 +01:00
Frédéric Lécaille
76d2cef0c2 BUG/MEDIUM: peers: Missing peer initializations.
Initialize ->srv peer field for all the peers, the local peer included.
Indeed, a haproxy process needs to connect to the local peer of a remote
process. Furthermore, when a "peer" or "server" line is parsed by parse_server()
the address must be copied to ->addr field of the peer object only if this address
has been also parsed by parse_server(). This is not the case if this address belongs
to the local peer and is provided on a "server" line.

After having parsed the "peer" or "server" lines of a peer
sections, the ->srv part of all the peer must be initialized for SSL, if
enabled. Same thing for the binding part.

Revert 1417f0b commit which is no more required.

No backport is needed, this is purely 2.0.
2019-02-12 19:49:22 +01:00
Willy Tarreau
cdce54c2b7 BUILD/MINOR: htx: fix some potential null-deref warnings with http_find_stline
http_find_stline() carefully verifies that it finds a start line otherwise
returns NULL when not found. But a few calling functions ignore this NULL
in return and dereference this pointer without checking. Let's add the
test where needed in the callers. If it turns out that over the long term
a start line is mandatory, then the test will be removed and the faulty
function will have to be simplified.

This must be backported to 1.9.
2019-02-12 12:02:27 +01:00
Willy Tarreau
9bdd7bc63d BUILD/MINOR: peers: remove an impossible null test in intencode()
intencode() tests for the nullity of the target pointer passed in
argument, but the code calling intencode() never does so and happily
dereferences it. gcc at -O3 detects this as a potential null deref.
Let's remove this incorrect and misleading test. If this pointer was
null, the code would already crash in the calling functions.

This must be backported to stable versions.
2019-02-12 11:59:35 +01:00
Willy Tarreau
4eee38aa57 BUILD/MINOR: tools: fix build warning in the date conversion functions
Some gcc versions emit potential null deref warnings at -O3 in
date2str_log(), gmt2str_log() and localdate2str_log() after utoa_pad()
because this function may return NULL if its size argument is too small
for the integer value. And it's true that we can't guarantee that the
input number is always valid.

This must be backported to all stable versions.
2019-02-12 11:30:04 +01:00
Willy Tarreau
1ef724e216 BUILD/MINOR: stream: avoid a build warning with threads disabled
gcc 6+ complains about a possible null-deref here due to the test in
objt_server() :

       if (objt_server(s->target))
           HA_ATOMIC_ADD(&objt_server(s->target)->counters.retries, 1);

Let's simply change it to __objt_server(). This can be backported to
1.9 and 1.8.
2019-02-12 10:59:32 +01:00
Willy Tarreau
09c4bab411 BUG/MAJOR: stream: avoid double free on unique_id
Commit 32211a1 ("BUG/MEDIUM: stream: Don't forget to free
s->unique_id in stream_free().") addressed a memory leak but in
exchange may cause double-free due to the fact that after freeing
s->unique_id it doesn't null it and then calls http_end_txn()
which frees it again. Thus the process quickly crashes at runtime.

This fix must be backported to all stable branches where the
aforementioned patch was backported.
2019-02-10 18:49:37 +01:00
Ben51Degrees
4ddf59d070 MEDIUM: 51d: Enabled multi threaded operation in the 51Degrees module.
The existing threading flag in the 51Degrees API
(FIFTYONEDEGREES_NO_THREADING) has now been mapped to the HAProxy
threading flag (USE_THREAD), and the 51Degrees module code has been made
thread safe.
In Pattern, the cache is now locked with a spin lock from hathreads.h
using a new lable 'OTHER_LOCK'. The workset pool is now created with the
same size as the number of threads to avoid any time waiting on a
worket.
In Hash Trie, the global device offsets structure is only used in single
threaded operation. Multi threaded operation creates a new offsets
structure in each thread.
2019-02-08 21:29:23 +01:00
Ben51Degrees
e0f6a2a2aa BUG: 51d: In Hash Trie, multi header matching was affected by the header names stored globaly.
Some logic around mutli header matching in Hash Trie has been improved
where only the name of the most important header was stored in the
global heade_names structure. Now all headers are stored, so are used in
the mutli header matching correctly.
2019-02-08 21:29:08 +01:00
Willy Tarreau
7701cad444 BUG/MINOR: mux-h1: verify the request's version before dropping connection: keep-alive
The mux h1 properly avoid to set "connection: keep-alive" when the response
is in HTTP/1.1 but it forgot to check the request's version. Thus when the
client requests using HTTP/1.0 and connection: keep-alive (like ab does),
the response is in 1.1 with no keep-alive and ab waits for the close without
checking for the content-length. Response headers actually depend on the
recipient, thus on both request and response's version.

This patch must be backported to 1.9.
2019-02-08 15:38:22 +01:00
Christopher Faulet
18cca781f5 BUG/MINOR: config: Reinforce validity check when a process number is parsed
Now, in the function parse_process_number(), when a process number or a set of
processes is parsed, an error is triggered if an invalid character is found. It
means following syntaxes are not forbidden and will emit an alert during the
HAProxy startup:

  1a
  1/2
  1-2-3

This bug was reported on Github. See issue #36.

This patch may be backported to 1.9 and 1.8.
2019-02-07 21:23:58 +01:00
Christopher Faulet
11389018bc BUG/MAJOR: spoe: Don't try to get agent config during SPOP healthcheck
During SPOP healthchecks, a dummy appctx is used to create the HAPROXY-HELLO
frame and then to parse the AGENT-HELLO frame. No agent are attached to it. So
it is important to not rely on an agent during these stages. When HAPROXY-HELLO
frame is created, there is no problem, all accesses to an agent are
guarded. This is not true during the parsing of the AGENT-HELLO frame. Thus, it
is possible to crash HAProxy with a SPOA declaring the async or the pipelining
capability during a healthcheck.

This patch must be backported to 1.9 and 1.8.
2019-02-07 21:23:58 +01:00
Willy Tarreau
ff9c9140f4 MINOR: config: make MAX_PROCS configurable at build time
For some embedded systems, it's pointless to have 32- or even 64- large
arrays of processes when it's known that much fewer processes will be
used in the worst case. Let's introduce this MAX_PROCS define which
contains the highest number of processes allowed to run at once. It
still defaults to LONGBITS but may be lowered.
2019-02-07 15:10:19 +01:00
Willy Tarreau
980855bd95 BUG/MEDIUM: server: initialize the orphaned conns lists and tasks at the end
This also depends on the nbthread count, so it must only be performed after
parsing the whole config file. As a side effect, this removes some code
duplication between servers and server-templates.

This must be backported to 1.9.
2019-02-07 15:08:13 +01:00
Willy Tarreau
835daa119e BUG/MEDIUM: server: initialize the idle conns list after parsing the config
The idle conns lists are sized according to the number of threads. As such
they cannot be initialized during the parsing since nbthread can be set
later, as revealed by this simple config which randomly crashes when used.
Let's do this at the end instead.

    listen proxy
        bind :4445
        mode http
        timeout client 10s
        timeout server 10s
        timeout connect 10s
        http-reuse always
        server s1 127.0.0.1:8000

    global
        nbthread 8

This fix must be backported to 1.9 and 1.8.
2019-02-07 15:08:13 +01:00
Willy Tarreau
b0769b2ed6 BUG/MEDIUM: spoe: initialization depending on nbthread must be done last
The agent used to be configured depending on global.nbthread while nbthread
may be set after the agent is parsed. Let's move this part to the spoe_check()
function to make sure nbthread is always correct and arrays are appropriately
sized.

This fix must be backported to 1.9 and 1.8.
2019-02-07 15:08:13 +01:00
Willy Tarreau
b784b35ce8 BUG/MINOR: lua: initialize the correct idle conn lists for the SSL sockets
Commit 40a007cf2 ("MEDIUM: threads/server: Make connection list
(priv/idle/safe) thread-safe") made a copy-paste error when initializing
the Lua sockets, as the TCP one was initialized twice. Fortunately it has
no impact because the pointers are set to NULL after a memset(0) and are
not changed in between.

This must be backported to 1.9 and 1.8.
2019-02-07 15:08:13 +01:00
Willy Tarreau
3ddcf7643c BUG/MINOR: spoe: do not assume agent->rt is valid on exit
As reported by Christopher, we may call spoe_release_agent() when leaving
after an allocation failure or a config parse error. We must not assume
agent->rt is valid there as the allocation could have failed.

This should be backported to 1.9 and 1.8.
2019-02-07 15:08:13 +01:00
Willy Tarreau
1a0fe3becd BUG/MINOR: config: make sure to count the error on incorrect track-sc/stick rules
When commit 151e1ca98 ("BUG/MAJOR: config: verify that targets of track-sc
and stick rules are present") added a check for some process inconsistencies
between rules and their stick tables, some errors resulted in a "return 0"
statement, which is taken as "no error" in some cases. Let's fix this.

This must be backported to all versions using the above commit.
2019-02-06 10:25:07 +01:00
Christopher Faulet
f7679ad4db BUG/MAJOR: htx/backend: Make all tests on HTTP messages compatible with HTX
A piece of code about the HTX was lost this summer, after the "big merge"
(htx/http2/connection layer refactoring). I forgot to keep HTX changes in the
functions connect_server() and assign_server(). So, this patch fixes "uri",
"url_param" and "hdr" LB algorithms when the HTX is enabled. It also fixes
evaluation of the "sni" expression on server lines.

This issue was reported on github. See issue #32.

This patch must be backported in 1.9.
2019-02-06 10:20:01 +01:00
Willy Tarreau
2bdcfde426 BUG/MAJOR: spoe: verify that backends used by SPOE cover all their callers' processes
When a filter is installed on a proxy and references spoe, we must be
absolutely certain that the whole chain is valid on a given process
when running in multi-process mode. The problem here is that if a proxy
1 runs on process 1, referencing an SPOE agent itself based on a backend
running on process 2, this last one will be completely deinited on
process 1, and will thus cause random crashes when it gets messages
from this proess.

This patch makes sure that the whole chain is valid on all of the caller's
processes.

This fix must be backported to all spoe-enabled maintained versions. It
may potentially disrupt configurations which already randomly crash.
There hardly is any intermediary solution though, such configurations
need to be fixed.
2019-02-05 13:37:19 +01:00
Willy Tarreau
151e1ca989 BUG/MAJOR: config: verify that targets of track-sc and stick rules are present
Stick and track-sc rules may optionally designate a table in a different
proxy. In this case, a number of verifications are made such as validating
that this proxy actually exists. However, in multi-process mode, the target
table might indeed exist but not be bound to the set of processes the rules
will execute on. This will definitely result in a random behaviour especially
if these tables do require peer synchronization, because some tasks will be
started to try to synchronize form uninitialized areas.

The typical issue looks like this :

    peers my-peers
         peer foo ...

    listen proxy
         bind-process 1
         stick on src table ip
         ...

    backend ip
         bind-process 2
         stick-table type ip size 1k peers my-peers

While it appears obvious that the example above will not work, there are
less obvious situations, such as having bind-process in a defaults section
and having a larger set of processes for the referencing proxy than the
referenced one.

The present patch adds checks for such situations by verifying that all
processes from the referencing proxy are present on the other one in all
track-sc* and stick-* rules, and in sample fetch / converters referencing
another table so that sc_inc_gpc0() and similar are safe as well.

This fix must be backported to all maintained versions. It may potentially
disrupt configurations which already randomly crash. There hardly is any
intermediary solution though, such configurations need to be fixed.
2019-02-05 11:54:49 +01:00
Willy Tarreau
155acffc13 BUG/MINOR: task: close a tiny race in the inter-thread wakeup
__task_wakeup() takes care of a small race that exists between threads,
but it uses a store barrier that is not sufficient since apparently the
state read after clearing the leaf_p pointer sometimes is incorrect. This
results in missed wakeups between threads competing at a high rate. Let's
use a full barrier instead to serialize the operations.

This may be backported to 1.9 though it's extremely unlikely that this
bug will ever manifest itself there.
2019-02-04 14:21:35 +01:00
Willy Tarreau
ef6fd85623 BUG/MINOR: compression: properly report compression stats in HTX mode
When HTX support was added to HTTP compression, a set of counters was missed,
namely comp_in and comp_byp, resulting in no stats being available for compression.

This must be backported to 1.9.
2019-02-04 11:48:03 +01:00
Willy Tarreau
3d95717b58 MINOR: threads: make use of thread_mask() to simplify some thread calculations
By doing so it's visible that some fd_insert() calls were relying on
MAX_THREADS while all_threads_mask should have been more suitable.
2019-02-04 05:09:16 +01:00
Willy Tarreau
6daac19b3f MINOR: config: simplify bind_proc processing using proc_mask()
At a number of places we used to have null tests on bind_proc for
listeners and proxies. Let's simplify all these tests by always
having the proper bits reported via proc_mask().
2019-02-04 05:09:16 +01:00
Willy Tarreau
a38a7175b1 MINOR: config: keep an all_proc_mask like we have all_threads_mask
This simplifies some mask comparisons at various places where
nbits(global.nbproc) was used.
2019-02-04 05:09:15 +01:00
Willy Tarreau
fc647360e0 CLEANUP: threads: use nbits to calculate the thread mask
It's pointless to do arithmetics by hand, we have a function for this.
2019-02-02 17:48:39 +01:00
Willy Tarreau
6b4a39adc4 BUG/MINOR: config: fix bind line thread mask validation
When no nbproc is specified, a computation leads to reading bind_thread[-1]
before checking if the thread mask is valid for a bind conf. It may either
report a false warning and compute a wrong mask, or miss some incorrect
configs.

This must be backported to 1.9 and possibly 1.8.
2019-02-02 17:46:24 +01:00
Willy Tarreau
bbcf2b9e0d BUG/MINOR: threads: fix the process range of thread masks
Commit 421f02e ("MINOR: threads: add a MAX_THREADS define instead of
LONGBITS") used a MAX_THREADS macros to fix threads limits. However,
one change was wrong as it affected the upper bound of the process
loop when setting threads masks. No backport is needed.
2019-02-02 13:18:01 +01:00
Olivier Houchard
32211a17eb BUG/MEDIUM: stream: Don't forget to free s->unique_id in stream_free().
In stream_free(), free s->unique_id. We may still have one, because it's
allocated in log.c::strm_log() no matter what, even if it's a TCP connection
and thus it won't get free'd by http_end_txn().
Failure to do so leads to a memory leak.

This should probably be backported to all maintained branches.
2019-02-01 18:17:36 +01:00
Willy Tarreau
053c15750b BUG/MEDIUM: mux-h2: always set :authority on request output
PiBa-NL reported that some servers don't fall back to the Host header when
:authority is absent. After studying all the combinations of Host and
:authority, it appears that we always have to send the latter, hence we
never need the former. In case of CONNECT method, the authority is retrieved
from the URI part, otherwise it's extracted from the Host field.

The tricky part is that we have to scan all headers for the Host header
before dumping other headers. This is due to the fact that we must emit
pseudo headers before other ones. One improvement could possibly be made
later in the request parser to search and emit the Host header immediately
if authority was not found. This would cost nothing on the vast marjority
of requests and make the lookup faster on output since Host would appear
first.

This fix must be backported to 1.9.
2019-02-01 16:47:46 +01:00
Willy Tarreau
5be92ff23f BUG/MEDIUM: mux-h2: always omit :scheme and :path for the CONNECT method
This is mandated by RFC7540 #8.3, these pseudo-headers must not be emitted
with the CONNECT method.

This must be backported to 1.9.
2019-02-01 16:47:46 +01:00
Willy Tarreau
1da41ecf5b BUG/MINOR: backend: check srv_conn before dereferencing it
Commit 3c4e19f42 ("BUG/MEDIUM: backend: always release the previous
connection into its own target srv_list") introduced a valid warning
about a null-deref risk since we didn't check conn_new()'s return value.

This patch must be backported to 1.9 with the patch above.
2019-02-01 16:47:46 +01:00
Olivier Houchard
9c4f08ae39 BUG/MINOR: tune.fail-alloc: Don't forget to initialize ret.
In mem_should_fail(), if we don't want to fail the allocation, either
because mem_fail_rate is 0, or because we're still initializing, don't
forget to initialize ret, or we may return a non-zero value, and fail
an allocation we didn't want to fail.
This should only affect users that compile with DEBUG_FAIL_ALLOC.
2019-02-01 16:30:08 +01:00
Willy Tarreau
3e451842dc BUG/MEDIUM: htx: check the HTX compatibility in dynamic use-backend rules
I would have sworn it was done, probably we lost it during the refactoring.
If a frontend is in HTX and the backend not (and conersely), this is
normally detected at config parsing time unless the rule is dynamic. In
this case we must abort with an error 500. The logs will report "RR"
(resource issue while processing request) with the frontend and the
backend assigned, so that it's possible to figure what was attempted.

This must be backported to 1.9.
2019-02-01 15:09:54 +01:00
Willy Tarreau
3c4e19f429 BUG/MEDIUM: backend: always release the previous connection into its own target srv_list
There was a bug reported in issue #19 regarding the fact that haproxy
could mis-route requests to the wrong server. It turns out that when
switching to another server, the old connection was put back into the
srv_list corresponding to the stream's target instead of this connection's
target. Thus if this connection was later picked, it was pointing to the
wrong server.

The patch fixes this and also clarifies the assignment to srv_conn->target
so that it's clear we don't change it when picking it from the srv_list.

This must be backported to 1.9 only.
2019-02-01 11:58:33 +01:00
Olivier Houchard
dc21ff778b MINOR: debug: Add an option that causes random allocation failures.
When compiling with DEBUG_FAIL_ALLOC, add a new option, tune.fail-alloc,
that gives the percentage of chances an allocation fails.
This is useful to check that allocation failures are always handled
gracefully.
2019-01-31 19:38:25 +01:00
Olivier Houchard
9c9da5ee89 MINOR: muxes: Don't bother to LIST_DEL(&conn->list) before calling conn_free().
conn_free() already removes the connection from any idle list, so there's no
need to do it in the mux code, just before calling conn_free().
2019-01-31 19:38:25 +01:00
Willy Tarreau
8694978892 BUG/MEDIUM: mux-h2: properly consider the peer's advertised max-concurrent-streams
Till now we used to only rely on tune.h2.max-concurrent-streams but if
a peer advertises a lower limit this can cause streams to be reset or
even the conection to be killed. Let's respect the peer's value for
outgoing streams.

This patch should be backported to 1.9, though it depends on the following
ones :
    BUG/MINOR: server: fix logic flaw in idle connection list management
    MINOR: mux-h2: max-concurrent-streams should be unsigned
    MINOR: mux-h2: make sure to only check concurrency limit on the frontend
    MINOR: mux-h2: learn and store the peer's advertised MAX_CONCURRENT_STREAMS setting
2019-01-31 19:38:25 +01:00
Willy Tarreau
2e2083ae5b MINOR: mux-h2: learn and store the peer's advertised MAX_CONCURRENT_STREAMS setting
We used not to take it into account because we only used the configured
parameter everywhere. This patch makes sure we can actually learn the
value advertised by the peer. We still enforce our own limit on top of
it however, to make sure we can actually limit resources or stream
concurrency in case of suboptimal server settings.
2019-01-31 19:38:25 +01:00
Willy Tarreau
fa1d357f05 MINOR: mux-h2: make sure to only check concurrency limit on the frontend
h2_has_too_many_cs() was renamed to h2_frt_has_too_many_cs() to make it
clear it's only used to throttle the frontend connection, and the call
places were adjusted to only call this code on a front connection. In
practice it was already the case since the H2_CF_DEM_TOOMANY flag is
only set there. But now the ambiguity is removed.
2019-01-31 19:38:25 +01:00
Willy Tarreau
5a490b669e MINOR: mux-h2: max-concurrent-streams should be unsigned
We compare it to other unsigned values, let's make it unsigned as well.
2019-01-31 19:38:25 +01:00
Willy Tarreau
00f18a36b6 BUG/MINOR: server: fix logic flaw in idle connection list management
With variable connection limits, it's not possible to accurately determine
whether the mux is still in use by comparing usage and max to be equal due
to the fact that one determines the capacity and the other one takes care
of the context. This can cause some connections to be dropped before they
reach their stream ID limit.

It seems it could also cause some connections to be terminated with
streams still alive if the limit was reduced to match the newly computed
avail_streams() value, though this cannot yet happen with existing muxes.

Instead let's switch to usage reports and simply check whether connections
are both unused and available before adding them to the idle list.

This should be backported to 1.9.
2019-01-31 19:38:25 +01:00
Willy Tarreau
180590409f BUG/MEDIUM: mux-h2: do not close the connection on aborted streams
We used to rely on a hint that a shutw() or shutr() without data is an
indication that the upper layer had performed a tcp-request content reject
and really wanted to kill the connection, but sadly there is another
situation where this happens, which is failed keep-alive request to a
server. In this case the upper layer stream silently closes to let the
client retry. In our case this had the side effect of killing all the
connection.

Instead of relying on such hints, let's address the problem differently
and rely on information passed by the upper layers about the intent to
kill the connection. During shutr/shutw, this is detected because the
flag CS_FL_KILL_CONN is set on the connstream. Then only in this case
we send a GOAWAY(ENHANCE_YOUR_CALM), otherwise we only send the reset.

This makes sure that failed backend requests only fail frontend requests
and not the whole connections anymore.

This fix relies on the two previous patches adding SI_FL_KILL_CONN and
CS_FL_KILL_CONN as well as the fix for the connection close, and it must
be backported to 1.9 and 1.8, though the code in 1.8 could slightly differ
(cs is always valid) :

  BUG/MEDIUM: mux-h2: wait for the mux buffer to be empty before closing the connection
  MINOR: stream-int: add a new flag to mention that we want the connection to be killed
  MINOR: connstream: have a new flag CS_FL_KILL_CONN to kill a connection
2019-01-31 19:38:25 +01:00
Willy Tarreau
51d0a7e54c MINOR: connstream: have a new flag CS_FL_KILL_CONN to kill a connection
This is the equivalent of SI_FL_KILL_CONN but for the connstreams. It
will be set by the stream-interface during the various shutdown
operations.
2019-01-31 19:38:25 +01:00
Willy Tarreau
0f9cd7b196 MINOR: stream-int: add a new flag to mention that we want the connection to be killed
The new flag SI_FL_KILL_CONN is now set by the rare actions which
deliberately want the whole connection (and not just the stream) to be
killed. This is only used for "tcp-request content reject",
"tcp-response content reject", "tcp-response content close" and
"http-request reject". The purpose is to desambiguate the close from
a regular shutdown. This will be used by the next patches.
2019-01-31 19:38:25 +01:00
Willy Tarreau
4dbda620f2 BUG/MEDIUM: mux-h2: wait for the mux buffer to be empty before closing the connection
When finishing to respond on a stream, a shutw() is called (resulting
in either an end of stream or RST), then h2_detach() is called, and may
decide to kill the connection is a number of conditions are satisfied.
Actually one of these conditions is that a GOAWAY frame was already sent
or attempted to be sent. This one is wrong, because it can happen in at
least these two situations :
  - a shutw() sends a GOAWAY to obey tcp-request content reject
  - a graceful shutdown is pending

In both cases, the connection will be aborted with the mux buffer holding
some data. In case of a strong abort the client will not see the GOAWAY or
RST and might want to try again, which is counter-productive. In case of
the graceful shutdown, it could result in truncated data. It looks like a
valid candidate for the issue reported here :

    https://www.mail-archive.com/haproxy@formilux.org/msg32433.html

A backport to 1.9 and 1.8 is necessary.
2019-01-31 19:38:25 +01:00
Willy Tarreau
28e581b21c BUG/MINOR: stream: don't close the front connection when facing a backend error
In 1.5-dev13, a bug was introduced by commit e3224e870 ("BUG/MINOR:
session: ensure that we don't retry connection if some data were sent").
If a connection error is reported after some data were sent (and lost),
we used to accidently mark the front connection as being in error instead
of only the back one because the two direction flags were applied to the
same channel. This case is extremely rare with raw connections but can
happen a bit more often with multiplexed streams. This will result in
the error not being correctly reported to the client.

This patch can be backported to all supported versions.
2019-01-31 19:38:25 +01:00
Olivier Houchard
8788b4111c BUG/MEDIUM: connections: Don't forget to remove CO_FL_SESS_IDLE.
If we're adding a connection to the server orphan idle list, don't forget
to remove the CO_FL_SESS_IDLE flag, or we will assume later it's still
attached to a session.

This should be backported to 1.9.
2019-01-31 19:38:25 +01:00
Christopher Faulet
3949c9d90d BUG/MEDIUM: mux-h1: Don't add "transfer-encoding" if message-body is forbidden
When a HTTP/1.1 or above response is emitted to a client, if the flag
H1_MF_XFER_LEN is set whereas H1_MF_CLEN and H1_MF_CHNK are not, the header
"transfer-encoding" is added. It is a way to make HTX chunking consistent with
H2. But we must exclude all cases where the message-body is explicitly forbidden
by the RFC:

 * for all 1XX, 204 and 304 responses
 * for any responses to HEAD requests
 * for 200 responses to CONNECT requests

For these 3 cases, the flag H1_MF_XFER_LEN is set but H1_MF_CLEN and H1_MF_CHNK
not. And the header "transfer-encoding" must not be added.

See issue #27 on github for details about the bug.

This patch must be backported in 1.9.
2019-01-31 11:07:17 +01:00
Frédéric Lécaille
04636b7bac BUG/MEDIUM: peers: Peer addresses parsing broken.
This bug was introduced by 355b203 commit which prevented the peer
addresses to be parsed for the local peer of a "peers" section.
When adding "parse_addr" boolean parameter to parse_server(), this commit
missed the case where the syntax with "peer" keyword should still be
supported in addition to the new syntax with "server"+"bind" keyword.

May be backported as fas as 1.5.
2019-01-31 09:56:39 +01:00
Willy Tarreau
a9b7796862 MINOR: mux-h2: consistently rely on the htx variable to detect the mode
In h2_frt_transfer_data(), we support both HTX and legacy modes. The
HTX mode is detected from the proxy option and sets a valid pointer
into the htx variable. Better rely on this variable in all the function
rather than testing the option again. This way the code is clearer and
even the compiler knows this pointer is valid when it's dereferenced.

This should be backported to 1.9 if the b_is_null() patch is backported.
2019-01-31 08:07:17 +01:00
Willy Tarreau
1f035507af BUG/MINOR: mux-h2: make sure request trailers on aborted streams don't break the connection
We used to respond a connection error in case we received a trailers
frame on a closed stream, but it's a problem to do this if the error
was caused by a reset because the sender has not yet received it and
is just a victim of the timing. Thus we must not close the connection
in this case.

This patch may be backported to 1.9 but then it requires the following
previous ones :
   MINOR: h2: add a generic frame checker
   MEDIUM: mux-h2: check the frame validity before considering the stream state
   CLEANUP: mux-h2: remove stream ID and frame length checks from the frame parsers
2019-01-30 19:37:20 +01:00
Willy Tarreau
b860c73756 CLEANUP: mux-h2: remove stream ID and frame length checks from the frame parsers
It's not convenient to have such structural checks mixed with the ones
related to the stream state. Let's remove all these basic tests that are
already covered once for all when reading the frame header.
2019-01-30 19:37:20 +01:00
Willy Tarreau
54f46e53dd MEDIUM: mux-h2: check the frame validity before considering the stream state
There are some uneasy situation where it's difficult to validate a frame's
format without being in an appropriate state. This patch makes sure that
each frame passes through h2_frame_check() before being checked in the
context of the stream's state. This makes sure we can always return a GOAWAY
for protocol violations even if we can't process the frame.
2019-01-30 19:37:20 +01:00
Willy Tarreau
9c84d8299a MINOR: h2: add a generic frame checker
The new function h2_frame_check() checks the protocol limits for the
received frame (length, ID, direction) and returns a verdict made of
a connection error code. The purpose is to be able to validate any
frame regardless of the state and the ability to call the frame handler,
and to emit a GOAWAY early in this case.
2019-01-30 19:37:20 +01:00
Willy Tarreau
08bb1d6109 BUG/MINOR: mux-h2: make sure response HEADERS are not received in other states than OPEN and HLOC
RFC7540#5.1 states that these are the only states allowing any frame
type. For response HEADERS frames, we cannot accept that they are
delivered on idle streams of course, so we're left with these two
states only. It is important to test this so that we can remove the
generic CLOSE_STREAM test for such frames in the main loop.

This must be backported to 1.9 (1.8 doesn't have response HEADERS).
2019-01-30 19:37:14 +01:00
Willy Tarreau
8d9ac3ed8b BUG/MEDIUM: mux-h2: do not abort HEADERS frame before decoding them
If a response HEADERS frame arrives on a closed connection (due to a
client abort sending an RST_STREAM), it's currently immediately rejected
with an RST_STREAM, like any other frame. This is incorrect, as HEADERS
frames must first be decoded to keep the HPACK decoder synchronized,
possibly breaking subsequent responses.

This patch excludes HEADERS/CONTINUATION/PUSH_PROMISE frames from the
central closed state test and leaves to the respective frame parsers
the responsibility to decode the frame then send RST_STREAM.

This fix must be backported to 1.9. 1.8 is not directly impacted since
it doesn't have response HEADERS nor trailers thus cannot recover from
such situations anyway.
2019-01-30 19:36:21 +01:00
Willy Tarreau
24ff1f8341 BUG/MEDIUM: mux-h2: make sure never to send GOAWAY on too old streams
The H2 spec requires to send GOAWAY when the client sends a frame after
it has already closed using END_STREAM. Here the corresponding case was
the fallback of a series of tests on the stream state, but it unfortunately
also catches old closed streams which we don't know anymore. Thus any late
packet after we've sent an RST_STREAM will trigger this GOAWAY and break
other streams on the connection.

This can happen when launching two tabs in a browser targetting the same
slow page through an H2-to-H2 proxy, and pressing Escape to stop one of
them. The other one gets an error when the page finally responds (and it
generally retries), and the logs in the middle indicate SD-- flags since
the late response was cancelled.

This patch takes care to only send GOAWAY on streams we still know.

It must be backported to 1.9 and 1.8.
2019-01-30 19:35:42 +01:00
Willy Tarreau
fc10f599cc BUG/MEDIUM: mux-h2: fix two half-closed to closed transitions
When receiving a HEADERS or DATA frame with END_STREAM set, we would
inconditionally switch to half-closed(remote). This is wrong because we
could already have been in half-closed(local) and need to switch to closed.
This happens in the following situations :
    - receipt of the end of a client upload after we've already responded
      (e.g. redirects to POST requests)
    - receipt of a response on the backend side after we've already finished
      sending the request (most common case).

This may possibly have caused some streams to stay longer than needed
at the end of a transfer, though this is not apparent in tests.

This must be backported to 1.9 and 1.8.
2019-01-30 19:34:40 +01:00
Willy Tarreau
b1c9edc579 BUG/MEDIUM: mux-h2: wake up flow-controlled streams on initial window update
When a settings frame updates the initial window, all affected streams's
window is updated as well. However the streams are not put back into the
send list if they were already blocked on flow control. The effect is that
such a stream will only be woken up by a WINDOW_UPDATE message but not by
a SETTINGS changing the initial window size. This can be verified with
h2spec's test http2/6.9.2/1 which occasionally fails without this patch.

It is unclear whether this situation is really met in field, but the
fix is trivial, it consists in adding each unblocked streams to the
wait list as is done for the window updates.

This fix must be backported to 1.9. For 1.8 the patch needs quite
a few adaptations. It's better to copy-paste the code block from
h2c_handle_window_update() adding the stream to the send_list when
its mws is > 0.
2019-01-30 16:21:39 +01:00
Willy Tarreau
6432dc8783 CLEANUP: mux-h2: remove misleading leftover test on h2s' nullity
The WINDOW_UPDATE and DATA frame handlers used to still have a check on
h2s to return either h2s_error() or h2c_error(). This is a leftover from
the early code. The h2s cannot be null there anymore as it has already
been dereferenced before reaching these locations.
2019-01-30 15:45:02 +01:00
Kevin Zhu
13ebef7ecb BUG/MINOR: deinit: tcp_rep.inspect_rules not deinit, add to deinit
It seems like this can be backported as far as 1.5.
2019-01-30 10:22:34 +01:00
Tim Duesterhus
b229f018ee BUG/MEDIUM: compression: Rewrite strong ETags
RFC 7232 section 2.3.3 states:

> Note: Content codings are a property of the representation data,
> so a strong entity-tag for a content-encoded representation has to
> be distinct from the entity tag of an unencoded representation to
> prevent potential conflicts during cache updates and range
> requests.  In contrast, transfer codings (Section 4 of [RFC7230])
> apply only during message transfer and do not result in distinct
> entity-tags.

Thus a strong ETag must be changed when compressing. Usually this is done
by converting it into a weak ETag, which represents a semantically, but not
byte-by-byte identical response. A conversion to a weak ETag still allows
If-None-Match to work.

This should be backported to 1.9 and might be backported to every supported
branch with compression.
2019-01-29 20:26:06 +01:00
Olivier Houchard
7493114729 BUG/MEDIUM: servers: Close the connection if we failed to install the mux.
If we failed to install the mux, just close the connection, or
conn_fd_handler() will be called for the FD, and crash as soon as it attempts
to access the mux' wake method.

This should be backported to 1.9.
2019-01-29 19:53:09 +01:00
Olivier Houchard
ef60ff38fb BUG/MEDIUM: peers: Handle mux creation failure.
If the mux fails to properly be created by conn_install_mux, fail, instead
of silently ignoring it.

This should be backported to 1.9.
2019-01-29 19:47:20 +01:00
Olivier Houchard
2b09443e04 BUG/MEDIUM: h2: In h2_send(), stop the loop if we failed to alloc a buf.
In h2_send(), make sure we break the loop if we failed to alloc a buffer,
or we'd end up looping endlessly.

This should be backported to 1.9.
2019-01-29 19:47:20 +01:00
Olivier Houchard
a48437bb5e BUG/MEDIUM: checks: Don't try to set ALPN if connection failed.
If we failed to connect, don't attempt to set the ALPN, as we don't have
a SSL context, anyway.

This should be backported to 1.9.
2019-01-29 19:47:20 +01:00
Olivier Houchard
26da323cb9 BUG/MEDIUM: servers: Don't add an incomplete conn to the server idle list.
If we failed to add the connection to the session, only try to add it back
to the server idle list if it has a mux, otherwise the connection is
incomplete and unusable.

This should be backported to 1.9.
2019-01-29 19:47:20 +01:00
Olivier Houchard
4dc85538ba BUG/MEDIUM: servers: Only destroy a conn_stream we just allocated.
In connect_server(), if we failed to add the connection to the session,
only destroy the conn_stream if we just allocated it, otherwise it may
have been allocated outside connect_server(), along with a connection which
has its destination address set.
Also use si_release_endpoint() instead of cs_destroy(), to make sure the
stream_interface doesn't reference it anymore.

This should be backported to 1.9.
2019-01-29 19:47:20 +01:00
Olivier Houchard
f67be93ae0 BUG/MEDIUM: checks: Check that conn_install_mux succeeded.
If conn_install_mux failed, then the connection has no mux and won't be
usable, so just give up is on failure instead of ignoring it.

This should be backported to 1.9.
2019-01-29 19:47:20 +01:00
Willy Tarreau
f1e6fa35de CLEANUP: mux-h2: remove two useless but misleading assignments
h2c->st0 was assigned to H2_CS_ERROR right after returning from
h2c_error(), which had already done it. It's useless and confusing,
let's remove this.
2019-01-29 18:51:41 +01:00
Willy Tarreau
3ad5d31bdf BUG/MEDIUM: mux-h2: only close connection on request frames on closed streams
A subtle bug was introduced with H2 on the backend. RFC7540 states that
an attempt to create a stream on an ID not higher than the max known is
a connection error. This was translated into rejecting HEADERS frames
for closed streams. But with H2 on the backend, if the client aborts
and causes an RST_STREAM to be emitted, the stream is effectively closed,
and if/once the server responds, it starts by emitting a HEADERS frame
with this ID thus it is interpreted as a connection error.

This test must of course consider the side the mux is installed on and
not take this for a connection error on responses.

The effect is that an aborted stream on an outgoing H2 connection, for
example due to a client stopping a transfer with option abortonclose
set, would lead to an abort of all other streams. In the logs, this
appears as one or several CD-- line(s) followed by one or several SD--
lines which are victims.

Thanks to Luke Seelenbinder for reporting this problem and providing
enough elements to help understanding how to reproduce it.

This fix must be backported to 1.9.
2019-01-29 18:49:27 +01:00
Willy Tarreau
6254a9257e BUILD/MINOR: peers: shut up a build warning introduced during last cleanup
A new warning appears when building at -O0 since commit 3f0fb9df6 ("MINOR:
peers: move "hello" message treatment code to reduce the size of the I/O
handler."), it is related to the fact that proto_len is initialized from
strlen() which is not a constant. Let's replace it with sizeof-1 instead
and also mark the variable as static since it's useless outside of the file.
2019-01-29 17:45:23 +01:00
Willy Tarreau
6f731f33ac CLEANUP: peers: factor error handling in peer_treat_definedmsg()
This is a trivial code refactoring of similar parsing error code
under a single label.
2019-01-29 11:11:23 +01:00
Willy Tarreau
1e82a14c34 CLEANUP: peers: factor the error handling code in peer_treet_updatemsg()
The error handling code was extremely repetitive and error-prone due
to the numerous copy-pastes, some involving unlocks or free. Let's
factor this out. The code could be further simplified, but 12 locations
were already cleaned without taking risks.
2019-01-29 11:08:06 +01:00
Frédéric Lécaille
4b2fd9bf71 MINOR: peers: move peer initializations code to reduce the size of the I/O handler.
Implements two new functions to init peer flags and other stuff after
having accepted or connected them with the peer I/O handler so that
to reduce its size.

May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Frédéric Lécaille
3f0fb9df6c MINOR: peers: move "hello" message treatment code to reduce the size of the I/O handler.
This patch implements three functions to read and parse the three
line of a "hello" peer protocol message so that to call them from the
peer I/O handler and reduce its size.

May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Frédéric Lécaille
be825e5c05 CLEANUP: peers: Remove useless statements.
When implementing peer_recv_msg() we added the statements reached with
a "goto imcomplete" at the end of this function. This statements
are executed only when co_getblk() returns something <0. So they
are useless for now on, and may be safely removed. The following
section wich was responsible of sending any peer protocol messages
were reached only when co_getblk() returned 0 (no more message to
read). In this case we replace the "goto impcomplete" statement by
a "goto send_msgs" to reach this only when peer_recv_msg() returns 0.

May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Frédéric Lécaille
25e1d5e435 MINOR: peers: move send code to reduce the size of the I/O handler.
This patch extracts the code responsible of sending peer protocol
messages from the peer I/O handler to create a new function and to
reduce the size of this handler.

May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Frédéric Lécaille
444243c62c MINOR: peers: move messages treatment code to reduce the size of the I/O handler.
Extract the code of the peer I/O handler responsible of treating
any peer protocol message to create peer_treat_awaited_msg() function.
Also rename peer_recv_updatemsg() to peer_treat_updatemsg() as this
function only parse a stick-table update message already received
by peer_recv_msg().

May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Frédéric Lécaille
7d0ceeec80 MINOR: peers: move error handling to reduce the size of the I/O handler.
Implement new functions to send error and control class stick-table
messages.

May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Frédéric Lécaille
d5fe14bb96 CLEANUP: peers: Be more generic.
Make usage of a C union to pass parameters to all the peer_prepare_*()
functions (more readable).

May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Frédéric Lécaille
95203f2185 MINOR: peers: Move high level receive code to reduce the size of I/O handler.
Implement a new function to read incoming stick-table messages.

May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Frédéric Lécaille
d27b09400c MINOR: peers: Move ack, switch and definition receive code to reduce the size of the I/O handler.
Implement three new functions to treat peer acks, switch and
definition messages extracting the code from the big swich-case
of the peer I/O handler to give more chances to this latter to be
readable.

May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Frédéric Lécaille
168a34b45f MINOR: peers: Move update receive code to reduce the size of the I/O handler.
This patch implements a new function to treat the stick-table
update messages so that to reduce the size of the peer I/O handler
by ~200 lines.

May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Frédéric Lécaille
6a8303d49e MEDIUM: peers: synchronizaiton code factorization to reduce the size of the I/O handler.
Factorize the code responsible of synchronizing the peers upon startup.

May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Frédéric Lécaille
87f554c9fb MINOR: peers: Add new functions to send code and reduce the I/O handler.
This patch reduces the size of the peer I/O handler implementing
a new function named peer_send_updatemsg() which uses the already
implement peer_prepare_updatemsg(), then ci_putblk().
Reuse the code used to implement peer_send_(ack|swith)msg() function
especially the more generic function peer_send_msg().

May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Frédéric Lécaille
ec44ea8692 MINOR: peers: send code factorization.
Implements peer_send_*msg() functions for switch and ack messages which call the
already defined peer_prepare_*msg() before calling ci_putblk().
These two new functions are used at three places in the peer_io_handler().

May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Frédéric Lécaille
a8725ec372 CLEANUP: peers: Indentation fixes.
May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Frédéric Lécaille
ce02557aad MINOR: peers: Extract some code to be reused.
May be backported as far as 1.5.
2019-01-29 10:29:54 +01:00
Willy Tarreau
d822013f45 BUG/MEDIUM: backend: always call si_detach_endpoint() on async connection failure
In case an asynchronous connection (ALPN) succeeds but the mux fails to
attach, we must release the stream interface's endpoint, otherwise we
leave the stream interface with an endpoint pointing to a freed connection
with si_ops == si_conn_ops, and sess_update_st_cer() calls si_shutw() on
it, causing it to crash.

This must be backported to 1.9 only.
2019-01-28 16:33:35 +01:00
Olivier Houchard
9ef5155ba6 BUG/MEDIUM: servers: Attempt to reuse an unfinished connection on retry.
In connect_server(), if the previous connection failed, but had an alpn, no
mux was created, and thus the stream_interface's endpoint would be the
connection. In this case, instead of forgetting about it, and overriding
the stream_interface's endpoint later, try to reuse the connection, or the
connection will still be in the session's connection list, and will reference
to a stream that was probably destroyed.

This should be backported to 1.9.
2019-01-28 16:33:31 +01:00
Miroslav Zagorac
6b3690bc6a BUG/MINOR: spoe: corrected fragmentation string size
This patch must be backported to 1.9 and 1.8.
2019-01-28 13:45:09 +01:00
Willy Tarreau
6afec46ba3 BUG/MINOR: mux-h2: do not report available outgoing streams after GOAWAY
The calculation of available outgoing H2 streams was improved by commit
d64a3ebe6 ("BUG/MINOR: mux-h2: always check the stream ID limit in
h2_avail_streams()"), but it still is incorrect because RFC7540#6.8
specifically forbids the creation of new streams after a GOAWAY frame
was received. Thus we must not mark the connection as available anymore
in order to be able to handle a graceful shutdown.

This needs to be backported to 1.9.
2019-01-28 06:44:53 +01:00
Willy Tarreau
888d5678f7 BUG/MINOR: listener: always fill the source address for accepted socketpairs
The source address was not set but passed down the chain to the upper
layer's accept() calls. Let's initialize it like other UNIX sockets in
this case. At the moment it should not have any impact since socketpairs
are only usable for the master CLI.

This should be backported to 1.9.
2019-01-27 21:48:29 +01:00
Willy Tarreau
f5809cde7a MINOR: threads: make MAX_THREADS configurable at build time
There's some value in being able to limit MAX_THREADS, either to save
precious resources in embedded environments, or to protect certain
deployments against accidently incorrect settings.

With this patch, if MAX_THREADS is defined at build time, it will be
used. However, given that LONGBITS is not a macro but is defined
according to sizeof(long), we can't check the value range at build
time and instead we need to perform the check at early boot time.
However, the compiler is able to optimize away the constant comparisons
and doesn't even emit the check code when values are correct.

The output message regarding threading support was improved to report
the number of threads.
2019-01-26 13:37:48 +01:00
Willy Tarreau
c9a82e48bf MINOR: cfgparse: make the process/thread parser support a maximum value
It was hard-wired to LONGBITS, let's make it configurable depending on the
context (threads, processes).
2019-01-26 13:25:14 +01:00
Tim Duesterhus
4707033932 CLEANUP: h2: Remove debug printf in mux_h2.c
It was introduced by 1915ca2738
and should be backported to 1.9.
2019-01-25 05:22:07 +01:00
Willy Tarreau
1915ca2738 BUG/MINOR: mux-h2: always compare content-length to the sum of DATA frames
This is mandated by RFC7541#8.1.2.6. Till now we didn't have a copy of
the content-length header field. But now that it's already parsed, it's
easy to add the check.

The reg-test was updated to match the new behaviour as the previous one
expected unadvertised data to be silently discarded.

This should be backported to 1.9 along with previous patch (MEDIUM: h2:
always parse and deduplicate the content-length header) after it has got
a bit more exposure.
2019-01-24 19:45:27 +01:00
Willy Tarreau
4790f7c907 MEDIUM: h2: always parse and deduplicate the content-length header
The header used to be parsed only in HTX but not in legacy. And even in
HTX mode, the value was dropped. Let's always parse it and report the
parsed value back so that we'll be able to store it in the streams.
2019-01-24 19:07:26 +01:00
Willy Tarreau
f7a259d46f MINOR: stream: don't wait before retrying after a failed connection reuse
When a connection reuse fails, we must not wait before retrying, as most
likely the issue is related to the reused connection and not to the server
itself.

This should be backported to 1.9, though it depends on previous patches
dealing with SI_ST_CON for connection reuse.
2019-01-24 19:06:43 +01:00
Willy Tarreau
bf66bd1b8b MEDIUM: stream-int: always mark pending outgoing SI_ST_CON
Before the first send() attempt, we should be in SI_ST_CON, not
SI_ST_EST, since we have not yet attempted to send and we are
allowed to retry. This is particularly important with complex
outgoing muxes which can fail during the first send attempt (e.g.
failed stream ID allocation).

It only requires that sess_update_st_con_tcp() knows about this
possibility, as we must not forcefully close a reused connection
when facing an error in this case, this will be handled later.

This may be backported to 1.9 with care after some observation period.
2019-01-24 19:06:43 +01:00
Willy Tarreau
e9634bdc22 MINOR: mux-h2: always consider a server's max-reuse parameter
This parameter allows to limit the number of successive requests sent
on a connection. Let's compare it to the number of streams already sent
on the connection to decide if the connection may still appear in the
idle list or not. This may be used to help certain servers work around
resource leaks, and also helps dealing with the issue of the GOAWAY in
flight which requires to set a usage limit on the client to be reliable.

This must be backported to 1.9.
2019-01-24 19:06:43 +01:00
Willy Tarreau
9c538e01c2 MINOR: server: add a max-reuse parameter
Some servers may wish to limit the total number of requests they execute
over a connection because some of their components might leak resources.
In HTTP/1 it was easy, they just had to emit a "connection: close" header
field with the last response. In HTTP/2, it's less easy because the info
is not always shared with the component dealing with the H2 protocol and
it could be harder to advertise a GOAWAY with a stream limit.

This patch provides a solution to this by adding a new "max-reuse" parameter
to the server keyword. This parameter indicates how many times an idle
connection may be reused for new requests. The information is made available
and the underlying muxes will be able to use it at will.

This patch should be backported to 1.9.
2019-01-24 19:06:43 +01:00
Willy Tarreau
2c7deddc06 BUG/MEDIUM: backend: never try to attach to a mux having no more stream available
The code dealing with idle connections used to check the number of streams
available on the connection only to unlink the connection from the idle
list. But this still resulted in too many streams reusing the same connection
when they were already attached to it.

We must detect that there is no more room and refrain from using this
connection at all, and instead fall back to the no-reuse case. Ideally
we should try to search among other idle connections, but for a backport
let's stay safe.

This must be backported to 1.9.
2019-01-24 19:06:43 +01:00
Willy Tarreau
a80dca8535 BUG/MINOR: mux-h2: refuse to allocate a stream with too high an ID
One of the reasons for the excessive number of aborted requests when a
server sets a limit on the highest stream ID is that we don't check
this limit while allocating a new stream.

This patch does this at two locations :
  - when a backend stream is allocated, we verify that there are still
    IDs left ;
  - when the ID is assigned, we verify that it's not higher than the
    advertised limit.

This should be backported to 1.9.
2019-01-24 19:06:43 +01:00
Willy Tarreau
d64a3ebe64 BUG/MINOR: mux-h2: always check the stream ID limit in h2_avail_streams()
This function is used to decide whether to put an idle connection back
into the idle pool. While it considers the limit in number of concurrent
requests, it does not consider the limit in number of streams, so if a
server announces a low limit in a GOAWAY frame, it will be ignored.

However there is a caveat : since we assign the stream IDs when sending
them, we have a number of allocated streams which max_id doesn't take
care of. This can be addressed by adding a new nb_reserved count on each
connection to keep track of the ID-less streams.

This patch makes sure we take care of the remaining number of streams
if such a limit was announced, or of the number of streams before the
highest ID. Now it is possible to accurately know how many streams
can be allocated, and the number of failed outgoing streams has dropped
in half.

This must be backported to 1.9.
2019-01-24 19:06:43 +01:00
Willy Tarreau
15c120d251 CLEANUP: server: fix indentation mess on idle connections
Apparently some code was moved around leaving the inner block incorrectly
indented and with the closing brace in the middle of nowhere.
2019-01-24 19:06:43 +01:00
Willy Tarreau
64f6945fec BUG/MINOR: stream: take care of synchronous errors when trying to send
We currently detect a number of situations where we have to immediately
deal with a state change, but we failed to consider the case of the
synchronous error reported on the stream-interface. We definitely do not
want to have to wait for a timeout to handle this one, especially at the
beginning of the connection when it can lead to an immediate retry.

This should be backported to 1.9.
2019-01-24 19:06:43 +01:00
Willy Tarreau
cb923d5001 MINOR: server: make sure pool-max-conn is >= -1
The keyword parser doesn't check the value range, but supported values are
-1 and positive values, thus we should check it.

This can be backported to 1.9.
2019-01-24 16:31:56 +01:00
Willy Tarreau
1e7d444eec BUG/MINOR: hpack: return a compression error on invalid table size updates
RFC7541#6.3 mandates that an error is reported when a dynamic table size
update announces a size larger than the one configured with settings. This
is tested by h2spec using test "hpack/6.3/1".

This must be backported to 1.9 and possibly 1.8 as well.
2019-01-24 15:27:06 +01:00
Willy Tarreau
175cebb38a BUG/MINOR: mux-h2: make it possible to set the error code on an already closed stream
When sending RST_STREAM in response to a frame delivered on an already
closed stream, we used not to be able to update the error code and
deliver an RST_STREAM with a wrong code (e.g. H2_ERR_CANCEL). Let's
always allow to update the code so that RST_STREAM is always sent
with the appropriate error code (most often H2_ERR_STREAM_CLOSED).

This should be backported to 1.9 and possibly to 1.8.
2019-01-24 15:27:06 +01:00
Willy Tarreau
5b4eae33de BUG/MINOR: mux-h2: headers-type frames in HREM are always a connection error
There are incompatible MUST statements in the HTTP/2 specification. Some
require a stream error and others a connection error for the same situation.
As discussed in the thread below, let's always apply the connection error
when relevant (headers-like frame in half-closed(remote)) :

  https://mailarchive.ietf.org/arch/msg/httpbisa/pOIWRBRBdQrw5TDHODZXp8iblcE

This must be backported to 1.9, possibly to 1.8 as well.
2019-01-24 15:27:06 +01:00
Willy Tarreau
113c7a2794 BUG/MINOR: mux-h2: CONTINUATION in closed state must always return GOAWAY
Since we now support CONTINUATION frames, we must take care of properly
aborting the connection when they are sent on a closed stream. By default
we'd get a stream error which is not sufficient since the compression
context is modified and unrecoverable.

More info in this discussion :

   https://mailarchive.ietf.org/arch/msg/httpbisa/azZ1jiOkvM3xrpH4jX-Q72KoH00

This needs to be backported to 1.9 and possibly to 1.8 (less important there).
2019-01-24 15:27:06 +01:00
Willy Tarreau
31e846a071 BUG/MEDIUM: mux-h2: properly abort on trailers decoding errors
There was an incomplete test in h2c_frt_handle_headers() resulting
in negative return values from h2c_decode_headers() not being taken
as errors. The effect is that the stream is then aborted on timeout
only.

This fix must be backported to 1.9.
2019-01-24 15:27:06 +01:00
Willy Tarreau
5ce6337254 BUG/MEDIUM: backend: also remove from idle list muxes that have no more room
The current test consists in removing muxes which report that they're going
to assign their last available stream, but a mux may already be saturated
without having passed in this situation at all. This is what happens with
mux_h2 when receiving a GOAWAY frame informing the mux about the ID of the
last stream the other end is willing to process. The limit suddenly changes
from near infinite to 0. Currently what happens is that such a mux remains
in the idle list for a long time and refuses all new streams. Now at least
it will only fail a single stream in a retryable way. A future improvement
should consist in trying to pick another connection from the idle list.

This fix must be backported to 1.9.
2019-01-24 13:53:06 +01:00
Willy Tarreau
759ca1eacc BUG/MAJOR: mux-h2: don't destroy the stream on failed allocation in h2_snd_buf()
In case we cannot allocate a stream ID for an outgoing stream, the stream
will be aborted. The problem is that we also release it and it will be
destroyed again by the application detecting the error, leading to a NULL
dereference in h2_shutr() and h2_shutw(). Let's only mark the error on the
CS and let the rest of the code handle the close.

This should be backported to 1.9.
2019-01-24 13:52:10 +01:00
Willy Tarreau
b57af617c0 BUG/MINOR: mux-h1: avoid copying output over itself in zero-copy
It's almost funny but one side effect of the latest zero-copy changes made
to mux-h1 resulted in the temporary buffer being copied over itself at the
exact same location. This has no impact except slowing down operations and
irritating valgrind. The cause is an incorrect pointer check after the
alignment optimizations were made.

This needs to be backported to 1.9.

Reported-by: Tim Duesterhus <tim@bastelstu.be>
2019-01-23 20:43:53 +01:00
Christopher Faulet
afe57846bf BUG/MINOR: mux-h1: Apply the reserve on the channel's buffer only
There is no reason to use the reserve to limit the amount of data of the input
buffer that we can parse at a time. The only important thing is to keep the
reserve free in the channel's buffer.

This patch should be backported to 1.9.
2019-01-23 11:27:34 +01:00
Christopher Faulet
a413e958fd BUG/MEDIUM: mux-h2/htx: Respect the channel's reserve
When data are pushed in the channel's buffer, in h2_rcv_buf(), the mux-h2 must
respect the reserve if the flag CO_RFL_KEEP_RSV is set. In HTX, because the
stream-interface always sees the buffer as full, there is no other way to know
the reserve must be respected.

This patch must be backported to 1.9.
2019-01-23 11:27:34 +01:00
Christopher Faulet
dcd8c5eed4 BUG/MINOR: proto-htx: Return an error if all headers cannot be received at once
When an HTX stream is waiting for a request or a response, it reports an error
(400 for the request or 502 for the response) if a parsing error is reported by
the mux (HTX_FL_PARSING_ERROR). The mux-h1 uses this error, among other things,
when the headers are too big to be analyzed at once. But the mux-h2 doesn't. So
the stream must also report an error if the multiplexer is unable to emit all
headers at once. The multiplexers must always emit all the headers at once
otherwise it is an error.

There are 2 ways to detect this error:

  * The end-of-headers marker was not received yet _AND_ the HTX message is not
    empty.

  * The end-of-headers marker was not received yet _AND_ the multiplexer have
    some data to emit but it is waiting for more space in the channel's buffer.

Note the mux-h2 is buggy for now when HTX is enabled. It does not respect the
reserve. So there is no way to hit this bug.

This patch must be backported to 1.9.
2019-01-23 11:27:34 +01:00
Dirkjan Bussink
526894ff39 BUG/MEDIUM: ssl: Fix handling of TLS 1.3 KeyUpdate messages
In OpenSSL 1.1.1 TLS 1.3 KeyUpdate messages will trigger the callback
that is used to verify renegotiation is disabled. This means that these
KeyUpdate messages fail. In OpenSSL 1.1.1 a better mechanism is
available with the SSL_OP_NO_RENEGOTIATION flag that disables any TLS
1.2 and earlier negotiation.

So if this SSL_OP_NO_RENEGOTIATION flag is available, instead of having
a manual check, trust OpenSSL and disable the check. This means that TLS
1.3 KeyUpdate messages will work properly.

Reported-By: Adam Langley <agl@imperialviolet.org>
2019-01-23 09:51:22 +01:00
Christopher Faulet
774c486cec BUG/MINOR: check: Wake the check task if the check is finished in wake_srv_chk()
With tcp-check, the result of the check is set by the function tcpcheck_main()
from the I/O layer. So it is important to wake up the check task to handle the
result and finish the check. Otherwise, we will wait the task timeout to handle
the result of a tcp-check, delaying the next check by as much.

This patch also fixes a problem about email alerts reported by PiBa-NL (Pieter)
on the ML [1] on all versions since the 1.6. So this patch must be backported
from 1.9 to 1.6.

[1] https://www.mail-archive.com/haproxy@formilux.org/msg32190.html
2019-01-22 07:01:15 +01:00
Jérôme Magnin
f57afa453a BUG/MINOR: server: don't always trust srv_check_health when loading a server state
When we load health values from a server state file, make sure what we assign
to srv->check.health actually matches the state we restore.

This should be backported as far as 1.6.
2019-01-21 11:09:03 +01:00
Willy Tarreau
1ba32032ef BUG/MEDIUM: checks: fix recent regression on agent-check making it crash
In order to address the mailers issues, we needed to store the proxy
into the checks struct, which was done by commit c98aa1f18 ("MINOR:
checks: Store the proxy in checks."). However this one did it only for
the health checks and not for the agent checks, resulting in an immediate
crash when the agent is enabled on a random config like this one :

  listen agent
      bind :8000
      server s1 255.255.255.255:1 agent-check agent-port 1

Thanks to Seri Kim for reporting it and providing a reproducer in
issue #20. This fix must be backported to 1.9.
2019-01-21 07:48:26 +01:00
Uman Shahzad
da7eeedf38 BUG/MINOR: startup: certain goto paths in init_pollers fail to free
If we fail to initialize pollers due to fdtab/fdinfo/polled_mask
not getting allocated, we free any of those that were allocated
and exit. However the ordering was incorrect, and there was an old
unused and unreachable "fail_cache" path as well, which needs to
be taken when no poller works.

This was introduced with this commit during 1.9-dev :
   cb92f5c ("MINOR: pollers: move polled_mask outside of struct fdtab.")

It needs to be backported to 1.9 only.
2019-01-21 04:48:48 +01:00
Frédéric Lécaille
355b2033ec MINOR: cfgparse: SSL/TLS binding in "peers" sections.
Make "bind" keywork be supported in "peers" sections.
All "bind" settings are supported on this line.
Add "default-bind" option to parse the binding options excepted the bind address.
Do not parse anymore the bind address for local peers on "server" lines.
Do not use anymore list_for_each_entry() to set the "peers" section
listener parameters because there is only one listener by "peers" section.

May be backported to 1.5 and newer.
2019-01-18 14:26:21 +01:00
Frédéric Lécaille
1055e687a2 MINOR: peers: Make outgoing connection to SSL/TLS peers work.
This patch adds pointer to a struct server to peer structure which
is initialized after having parsed a remote "peer" line.

After having parsed all peers section we run ->prepare_srv to initialize
all SSL/TLS stuff of remote perr (or server).

Remaining thing to do to completely support peer protocol over SSL/TLS:
make "bind" keyword be supported in "peers" sections to make SSL/TLS
incoming connections to local peers work.

May be backported to 1.5 and newer.
2019-01-18 14:26:21 +01:00
Frédéric Lécaille
c06b5d4f74 MINOR: cfgparse: Make "peer" lines be parsed as "server" lines.
With this patch "default-server" lines are supported in "peers" sections
to setup the default settings of peers which are from now setup
when parsing both "peer" and "server" lines.

May be backported to 1.5 and newer.
2019-01-18 14:26:21 +01:00
Frédéric Lécaille
9492c4ecdb MINOR: cfgparse: Simplication.
Make init_peers_frontend() be callable without having to check if
there is something to do or not.

May be backported to 1.5 and newer.
2019-01-18 14:26:21 +01:00
Frédéric Lécaille
91694d51f7 MINOR: cfgparse: Rework peers frontend init.
Even if not already the case, we suppose that the frontend "peers" section
may have been already initialized outside of "peer" line, we seperate
their initializations from their binding initializations.

May be backported to 1.5 and newer.
2019-01-18 14:26:21 +01:00
Frédéric Lécaille
4ba5198899 MINOR: cfgparse: Useless frontend initialization in "peers" sections.
Use ->local "peers" struct member to flag a "peers" section frontend
has being initialized. This is to be able to initialize the frontend
of "peers" sections on lines different from "peer" lines.

May be backported to 1.5 and newer.
2019-01-18 14:26:21 +01:00
Frédéric Lécaille
16e491004b CLEANUP: cfgparse: Code reindentation.
May help the series of patches to be reviewed.

May be backported to 1.5 and newer.
2019-01-18 14:26:21 +01:00
Frédéric Lécaille
6617e769bf CLEANUP: cfgparse: Return asap from cfg_parse_peers().
Avoid useless code indentation.

May be backported to 1.5 and newer.
2019-01-18 14:26:21 +01:00
Frédéric Lécaille
1825103fbe MINOR: cfgparse: Extract some code to be re-used.
Create init_peers_frontend() function to allocate and initialize
the frontend of "peers" sections (->peers_fe) so that to reuse it later.

May be backported to 1.5 and newer.
2019-01-18 14:26:21 +01:00
Olivier Houchard
f24502ba46 BUG/MEDIUM: connections: Add the CO_FL_CONNECTED flag if a send succeeded.
If a send succeeded, add the CO_FL_CONNECTED flag, the send may have been
called by the upper layers before we even realized we were connected, and we
may even read the response before we get the information, and si_cs_recv()
has to know if we were connected or not.

This should be backported to 1.9.
2019-01-17 19:18:20 +01:00
Olivier Houchard
09a0f03994 BUG/MEDIUM: servers: Make assign_tproxy_address work when ALPN is set.
If an ALPN is set on the server line, then when we reach assign_tproxy_address,
the stream_interface's endpoint will be a connection, not a conn_stream,
so make sure assign_tproxy_address() handles both cases.

This should be backported to 1.9.
2019-01-17 19:18:20 +01:00
Christopher Faulet
ed7a066b45 BUG/MEDIUM: stats: Get the right scope pointer depending on HTX is used or not
For HTX streams, the scope pointer is relative to the URI in the start-line. But
for streams using the legacy HTTP representation, the scope pointer is relative
to the beginning of output data in the channel's buffer. So we must be careful
to use the right one depending on the HTX is used or not.

Because the start-line is used to get de scope pointer, it is important to keep
it after the parsing of post paramters. So now, instead of removing blocks when
read in the function stats_process_http_post(), we just move on next, leaving it
in the HTX message.

Thanks to Pieter (PiBa-NL) to report this bug.

This patch must be backported to 1.9.
2019-01-16 17:27:49 +01:00
Ben51Degrees
daa356bd7d BUG: 51d: Changes to the buffer API in 1.9 were not applied to the 51Degrees code.
The code using the deprecated 'buf->p' has been updated to use
'ci_head(buf)' as described in section 5 of
'doc/internals/buffer-api.txt'.
A compile time switch on 'BUF_NULL' has also been added to enable the
same source code to be used with pre and post API change HAProxy.

This should be backported to 1.9, and is compatible with previous
versions.
2019-01-16 17:26:14 +01:00
David Carlier
f8f8ddf3af BUILD/MEDIUM: da: Necessary code changes for new buffer API.
The most significant change from 1.8 to >=1.9 is the buffer
data structure, using the new field and fixing along side
a little hidden compilation warning.

This must be backported to 1.9.
2019-01-15 15:07:30 +01:00
Willy Tarreau
21c741a665 MINOR: backend: make the random algorithm support a number of draws
When an argument <draws> is present, it must be an integer value one
or greater, indicating the number of draws before selecting the least
loaded of these servers. It was indeed demonstrated that picking the
least loaded of two servers is enough to significantly improve the
fairness of the algorithm, by always avoiding to pick the most loaded
server within a farm and getting rid of any bias that could be induced
by the unfair distribution of the consistent list. Higher values N will
take away N-1 of the highest loaded servers at the expense of performance.
With very high values, the algorithm will converge towards the leastconn's
result but much slower. The default value is 2, which generally shows very
good distribution and performance. This algorithm is also known as the
Power of Two Random Choices and is described here :

http://www.eecs.harvard.edu/~michaelm/postscripts/handbook2001.pdf
2019-01-14 19:33:17 +01:00
Willy Tarreau
0cac26cd88 MEDIUM: backend: move all LB algo parameters into an union
Since all of them are exclusive, let's move them to an union instead
of eating memory with the sum of all of them. We're using a transparent
union to limit the code changes.

Doing so reduces the struct lbprm from 392 bytes to 372, and thanks
to these changes, the struct proxy is now down to 6480 bytes vs 6624
before the changes (144 bytes saved per proxy).
2019-01-14 19:33:17 +01:00
Willy Tarreau
76e84f5091 MINOR: backend: move hash_balance_factor out of chash
This one is a proxy option which can be inherited from defaults even
if the LB algo changes. Move it out of the lb_chash struct so that we
don't need to keep anything separate between these structs. This will
allow us to merge them into an union later. It even takes less room
now as it fills a hole and removes another one.
2019-01-14 19:33:17 +01:00
Willy Tarreau
a9a7249966 MINOR: backend: remap the balance uri settings to lbprm.arg_opt{1,2,3}
The algo-specific settings move from the proxy to the LB algo this way :
  - uri_whole => arg_opt1
  - uri_len_limit => arg_opt2
  - uri_dirs_depth1 => arg_opt3
2019-01-14 19:33:17 +01:00
Willy Tarreau
9fed8586b5 MINOR: backend: make the header hash use arg_opt1 for use_domain_only
This is only a boolean extra arg. Let's map it to arg_opt1 and remove
hh_match_domain from struct proxy.
2019-01-14 19:33:17 +01:00
Willy Tarreau
20e68378f1 MINOR: backend: add new fields in lbprm to store more LB options
Some algorithms require a few extra options (up to 3). Let's provide
some room in lbprm to store them, and make sure they're passed from
defaults to backends.
2019-01-14 19:33:17 +01:00
Willy Tarreau
484ff07691 MINOR: backend: make headers and RDP cookie also use arg_str/len
These ones used to rely on separate variables called hh_name/hh_len
but they are exclusive with the former. Let's use the same variable
which becomes a generic argument name and length for the LB algorithm.
2019-01-14 19:33:17 +01:00
Willy Tarreau
4c03d1c9b6 MINOR: backend: move url_param_name/len to lbprm.arg_str/len
This one is exclusively used by LB parameters, when using URL param
hashing. Let's move it to the lbprm struct under a more generic name.
2019-01-14 19:33:17 +01:00
Willy Tarreau
6c30be52da BUG/MINOR: backend: BE_LB_LKUP_CHTREE is a value, not a bit
There are a few instances where the lookup algo is tested against
BE_LB_LKUP_CHTREE using a binary "AND" operation while this macro
is a value among a set, and not a bit. The test happens to work
because the value is exactly 4 and no bit overlaps with the other
possible values but this is a latent bug waiting for a new LB algo
to appear to strike. At the moment the only other algo sharing a bit
with it is the "first" algo which is never supported in the same code
places.

This fix should be backported to maintained versions for safety if it
passes easily, otherwise it's not important as it will not fix any
visible issue.
2019-01-14 19:33:17 +01:00
Willy Tarreau
602a499da5 BUG/MINOR: backend: balance uri specific options were lost across defaults
The "balance uri" options "whole", "len" and "depth" were not properly
inherited from the defaults sections. In addition, "whole" and "len"
were not even reset when parsing "uri", meaning that 2 subsequent
"balance uri" statements would not have the expected effect as the
options from the first one would remain for the second one.

This may be backported to all maintained versions.
2019-01-14 19:33:17 +01:00
Willy Tarreau
089eaa0ba7 BUG/MINOR: backend: don't use url_param_name as a hint for BE_LB_ALGO_PH
At a few places in the code we used to rely on this variable to guess
what LB algo was in place. This is wrong because if the defaults section
presets "balance url_param foo" and a backend uses "balance roundrobin",
these locations will still see this url_param_name set and consider it.
The harm is limited, as this only causes the beginning of the request
body to be buffered. And in general this is a bad practice which prevents
us from cleaning the lbprm stuff. Let's explicitly check the LB algo
instead.

This may be backported to all currently maintained versions.
2019-01-14 19:33:17 +01:00
Emeric Brun
9e7547740c MINOR: ssl: add support of aes256 bits ticket keys on file and cli.
Openssl switched from aes128 to aes256 since may 2016  to compute
tls ticket secrets used by default. But Haproxy still handled only
128 bits keys for both tls key file and CLI.

This patch permit the user to set aes256 keys throught CLI or
the key file (80 bytes encoded in base64) in the same way that
aes128 keys were handled (48 bytes encoded in base64):
- first 16 bytes for the key name
- next 16/32 bytes for aes 128/256 key bits key
- last 16/32 bytes for hmac 128/256 bits

Both sizes are now supported (but keys from same file must be
of the same size and can but updated via CLI only using a key of
the same size).

Note: This feature need the fix "dec func ignores padding for output
size checking."
2019-01-14 19:32:58 +01:00
Emeric Brun
09852f70e0 BUG/MEDIUM: ssl: missing allocation failure checks loading tls key file
This patch fixes missing allocation checks loading tls key file
and avoid memory leak in some error cases.

This patch should be backport on branches 1.9 and 1.8
2019-01-14 19:32:45 +01:00
Emeric Brun
ed697e4856 BUG/MINOR: base64: dec func ignores padding for output size checking
Decode function returns an error even if the ouptut buffer is
large enought because the padding was not considered. This
case was never met with current code base.
2019-01-14 19:32:15 +01:00
Olivier Houchard
32d75ed300 BUG/MEDIUM: h1: Make sure we destroy an inactive connectin that did shutw.
In h1_process(), if we have no associated stream, and the connection got a
shutw, then destroy it, it is unusable and it may be our last chance to do
so.

This should be backported to 1.9.
2019-01-14 18:14:52 +01:00
Olivier Houchard
0923fa4200 BUG/MEDIUM: checks: Avoid having an associated server for email checks.
When using a check to send email, avoid having an associated server, so that
we don't modify the server state if we fail to send an email.
Also revert back to initialize the check status to HCHK_STATUS_INI, now that
set_server_check_status() stops early if there's no server, we shouldn't
get in a mail loop anymore.

This should be backported to 1.9.
2019-01-14 11:15:11 +01:00
Olivier Houchard
c98aa1f182 MINOR: checks: Store the proxy in checks.
Instead of assuming we have a server, store the proxy directly in struct
check, and use it instead of s->server.
This should be a no-op for now, but will be useful later when we change
mail checks to avoid having a server.

This should be backported to 1.9.
2019-01-14 11:15:11 +01:00
Christopher Faulet
00292353a1 MINOR: spoe: Make the SPOE filter compatible with HTX proxies
There is any specific HTTP processing in the SPOE. So there is no reason to not
use it on HTX proxies.

This patch may be backported to 1.9.
2019-01-14 10:52:28 +01:00
Willy Tarreau
c9036c0004 BUG/MAJOR: cache: fix confusion between zero and uninitialized cache key
The cache uses the first 32 bits of the uri's hash as the key to reference
the object in the cache. It makes a special case of the value zero to mean
that the object is not in the cache anymore. The problem is that when an
object hashes as zero, it's still inserted but the eb32_delete() call is
skipped, resulting in the object still being chained in the memory area
while the block has been reclaimed and used for something else. Then when
objects which were chained below it (techically any object since zero is
at the root) are deleted, the walk through the upper object may encounter
corrupted values where valid pointers were expected.

But while this should only happen statically once on 4 billion, the problem
gets worse when the cache-use conditions don't match the cache-store ones,
because cache-store runs with an uninitialized key, which can create objects
that will never be found by the lookup code, or worse, entries with a zero
key preventing eviction of the tree node and resulting in a crash. It's easy
to accidently end up on such a config because the request rules generally
can't be used to decide on the response :

  http-request  cache-use cache   if { path_beg /images }
  http-response cache-store cache

In this test, mixing traffic with /images/$RANDOM and /foo/$RANDOM will
result in random keys being inserted, some of them possibly being zero,
and crashes will quickly happen.

The fix consists in 1) always initializing the transaction's cache_hash
to zero, and 2) never storing a response for which the hash has not been
calculated, as indicated by the value zero.

It is worth noting that objects hashing as value zero will never be cached,
but given that there's only one chance among 4 billion that this happens,
this is totally harmless.

This fix must be backported to 1.9 and 1.8.
2019-01-14 10:31:31 +01:00
Willy Tarreau
f77a158c87 MINOR: mux-h1: make the mux_h1_ops struct static
It was needlessly exported while it's only used inside the mux.
2019-01-10 10:00:08 +01:00
Olivier Houchard
51088ce68f BUG/MEDIUM: ssl: Disable anti-replay protection and set max data with 0RTT.
When using early data, disable the OpenSSL anti-replay protection, and set
the max amount of early data we're ready to accept, based on the size of
buffers, or early data won't work with the released OpenSSL 1.1.1.

This should be backported to 1.8.
2019-01-09 16:26:28 +01:00
Daniel Corbett
43bb842a08 BUG/MEDIUM: init: Initialize idle_orphan_conns for first server in server-template
When initializing server-template all of the servers after the first
have srv->idle_orphan_conns initialized within server_template_init()
The first server does not have this initialized and when http-reuse
is active this causes a segmentation fault when accessed from
srv_add_to_idle_list().  This patch removes the check for
srv->tmpl_info.prefix within server_finalize_init() and allows
the first server within a server-template to have srv->idle_orphan_conns
properly initialized.

This should be backported to 1.9.
2019-01-09 14:45:21 +01:00
Christopher Faulet
4b0e9b2870 BUG/MINOR: lua/htx: Respect the reserve when data are send from an HTX applet
In the function hlua_applet_htx_send_yield(), there already was a test to
respect the reserve but the wrong function was used to get the available space
for data in the HTX buffer. Instead of calling htx_free_space(), the function
htx_free_data_space() must be used. But in fact, there is no reason to bother
with that anymore because the function channel_htx_recv_max() has been added for
this purpose.

The result of this bug is that the call to htx_add_data() failed unexpectedly
while the amount of written data was incremented, leading the applet to think
all data was sent. To prevent any futher bugs, a test has been added to yield if
we are not able to write data into the channel buffer.

This patch must be backported to 1.9.
2019-01-09 14:36:22 +01:00
Willy Tarreau
a01f45e3ce BUG/CRITICAL: mux-h2: re-check the frame length when PRIORITY is used
Tim Düsterhus reported a possible crash in the H2 HEADERS frame decoder
when the PRIORITY flag is present. A check is missing to ensure the 5
extra bytes needed with this flag are actually part of the frame. As per
RFC7540#4.2, let's return a connection error with code FRAME_SIZE_ERROR.

Many thanks to Tim for responsibly reporting this issue with a working
config and reproducer. This issue was assigned CVE-2018-20615.

This fix must be backported to 1.9 and 1.8.
2019-01-08 13:20:59 +01:00
Christopher Faulet
202c6ce1a2 BUG/MINOR: proto_htx: Use HTX versions to truncate or erase a buffer
channel_truncate() is not aware of the underlying format of the messages. So if
there are some outgoing data in the channel when called, it does some unexpected
operations on the channel's buffer. So the HTX version, channel_htx_truncate(),
must be used. The same is true for channel_erase(). It resets the buffer but not
the HTX message. So channel_htx_erase() must be used instead. This patch is
flagged as a bug, but as far as we know, it was never hitted.

This patch should be backported to 1.9. If so, following patch must be
backported too:

  * MINOR: channel/htx: Add the HTX version of channel_truncate/erase
2019-01-08 12:06:55 +01:00
Christopher Faulet
00cf697215 MINOR: htx: Add a function to truncate all blocks after a specific offset
This function will be used to truncate all incoming data in a channel, keeping
outgoing ones.

This may be backported to 1.9.
2019-01-08 12:06:55 +01:00
Christopher Faulet
839791af0d BUG/MINOR: cache: Disable the cache if any compression filter precedes it
We need to check if any compression filter precedes the cache filter. This is
only possible when the compression is configured in the frontend while the cache
filter is configured on the backend (via a cache-store action or
explicitly). This case cannot be detected during HAProxy startup. So in such
cases, the cache is disabled.

The patch must be backported to 1.9.
2019-01-08 11:32:23 +01:00
Christopher Faulet
ff17b183fe BUG/MINOR: filters: Detect cache+compression config on legacy HTTP streams
On legacy HTTP streams, it is forbidden to use the compression with the
cache. When the compression filter is explicitly specified, the detection works
as expected and such configuration are rejected at startup. But it does not work
when the compression filter is implicitly defined. To fix the bug, the implicit
declaration of the compression filter is checked first, before calling .check()
callback of each filters.

This patch should be backported to 1.9.
2019-01-08 11:32:23 +01:00
Christopher Faulet
1d3613a031 BUG/MINOR: compression: Disable it if another one is already in progress
Since the commit 9666720c8 ("BUG/MEDIUM: compression: Use the right buffer
pointers to compress input data"), the compression can be done twice. The first
time on the frontend and the second time on the backend. This may happen by
configuring the compression in a default section.

To fix the bug, when the response is checked to know if it should be compressed
or not, if the flag HTTP_MSGF_COMPRESSING is set, the compression is not
performed. It means it is already handled by a previous compression filter.

Thanks to Pieter (PiBa-NL) to report this bug.

This patch must be backported to 1.9.
2019-01-08 11:31:56 +01:00
Christopher Faulet
666a0c4d82 MEDIUM: mux-h1: Clarify how shutr/shutw are handled
Now, h1_shutr() only do a shutdown read and try to set the flag
H1C_F_CS_SHUTDOWN if shutdown write was already performed. On its side,
h1_shutw(), if all conditions are met, do the same for the shutdown write. The
real connection close is done when the mux h1 is released, in h1_release().

The flag H1C_F_CS_SHUTW was renamed to H1C_F_CS_SHUTDOWN to be less ambiguous.

This patch may be backported to 1.9.
2019-01-08 11:31:16 +01:00
Christopher Faulet
f3eb2b1c24 BUG/MINOR: mux-h1: Close connection on shutr only when shutw was really done
In h1_shutr(), to fully close the connection, we must be sure the shutdown write
was already performed on the connection. So we know rely on connection flags
instead of conn_stream flags. If CO_FL_SOCK_WR_SH is already set when h1_shutr()
is called, we can do a full connection close. Otherwise, we just do the shutdown
read.

Without this patch, it is possible to close the connection too early with some
outgoing data in the output buf.

This patch must be backported to 1.9.
2019-01-08 11:31:16 +01:00
Christopher Faulet
69fc88c605 BUG/MINOR: stats/htx: Respect the reserve when the stats page is dumped
As for the cache applet, this one must respect the reserve on HTX streams. This
patch is tagged as MINOR because it is unlikely to fully fill the channel's
buffer. Some tests are already done to not process almost full buffer.

This patch must be backported to 1.9.
2019-01-07 16:32:10 +01:00
Christopher Faulet
cc156623b2 BUG/MEDIUM: cache/htx: Respect the reserve when cached objects are served
It is only true for HTX streams. The legacy code relies on ci_putblk() which is
already aware of the reserve. It is mandatory to not fill the reserve to let
other filters analysing data. It is especially true for the compression
filter. It needs at least 20 bytes of free space, plus at most 5 bytes per 32kB
block. So if the cache fully fills the channel's buffer, the compression will
not have enough space to do its job and it will block the data forwarding,
waiting for more free space. But if the buffer fully filled with input data (ie
no outgoing data), the stream will be frozen infinitely.

This patch must be backported to 1.9. It depends on the following patches:

  * BUG/MEDIUM: cache/htx: Respect the reserve when cached objects are served
    from the cache
  * MINOR: channel/htx: Add HTX version for some helper functions
2019-01-07 16:32:07 +01:00
Thierry FOURNIER
bf90ce12aa BUG/MEDIUM: lua: dead lock when Lua tasks are trigerred
When a task is created from Lua context out of initialisation,
the hlua_ctx_init() function can be called from safe environement,
so we must not initialise it. While the support of threads appear,
the safe environment set a lock to ensure only one Lua execution
at a time. If we initialize safe environment in another safe
environmenet, we have a dead lock.

this patch adds the support of the idicator "already_safe" whoch
indicates if the context is initialized form safe Lua fonction.

thank to Flakebi for the report

This patch must be backported to haproxy-1.9 and haproxy-1.8
2019-01-07 10:54:19 +01:00
Thierry FOURNIER
1725c2e395 BUG/MINOR: lua: bad args are returned for Lua actions
In tcp actions case, the argument n - 1 is returned. For example:

  http-request lua.script stuff

display "stuff" as first arg

  tcp-request content lua.script stuff

display "lua.script" as first arg

The action parser doesn't use the *cur_arg value.

Thanks to Andy Franks for the bug report.

This patch mist be backported in haproxy-1.8 and haproxy-1.9
2019-01-07 10:52:46 +01:00
Willy Tarreau
7778b59be1 MINOR: stream/cli: report more info about the HTTP messages on "show sess all"
The "show sess all" command didn't allow to detect whether compression
is in use for a given stream, which is sometimes annoying. Let's add a
few more info about the HTTP messages, namely the flags, body len, chunk
len and the "next" pointer.
2019-01-07 10:38:10 +01:00
Willy Tarreau
adf7a15bd1 MINOR: stream/cli: fix the location of the waiting flag in "show sess all"
The "waiting" flag indicates if the stream is waiting for some memory,
and was placed on the same output line as the txn for ease of reading.
But since 1.6 the txn is not part of the stream anymore so this output
was placed under a condition, resulting in "waiting" to appear only
when a txn is present. Let's move it upper, closer to the stream's
flags to fix this.

This may safely be backported though it has little value for older
versions.
2019-01-07 10:10:07 +01:00
Willy Tarreau
b84e67fee9 MINOR: stream/htx: add the HTX flags output in "show sess all"
Commit b9af88151 ("MINOR: stream/htx: Add info about the HTX structs in
"show sess all" command") accidently forgot the flags on the request
path, it was only on the response path.

It makes sense to backport this to 1.9 so that both outputs are the same.
2019-01-07 10:01:34 +01:00
Willy Tarreau
909b9d852b BUILD: add a new file "version.c" to carry version updates
While testing fixes, it's sometimes confusing to rebuild only one C file
(e.g. a mux) and not to have the correct commit ID reported in "haproxy -v"
nor on the stats page.

This patch adds a new "version.c" file which is always rebuilt. It's
very small and contains only 3 variables derived from the various
version strings. These variables are used instead of the macros at the
few places showing the version. This way the output version of the
running code is always correct for the parts that were rebuilt.
2019-01-04 18:20:32 +01:00
Willy Tarreau
e6e52366c1 BUG/MEDIUM: cli: make "show sess" really thread-safe
This one used to rely on a few spin locks around lists manipulations
only but 1) there were still a few races (e.g. when aborting, or
between STAT_ST_INIT and STAT_ST_LIST), and 2) after last commit
which dumps htx info it became obvious that dereferencing the buffer
contents is not safe at all.

This patch uses the thread isolation from the rendez-vous point
instead, to guarantee that nothing moves during the dump. It may
make the dump a bit slower but it will be 100% safe.

This fix must be backported to 1.9, and possibly to 1.8 which likely
suffers from the short races above, eventhough they're extremely
hard to trigger.
2019-01-04 18:06:49 +01:00
Olivier Houchard
5cd6217185 BUG/MEDIUM: server: Defer the mux init until after xprt has been initialized.
In connect_server(), if we're using a new connection, and we have to
initialize the mux right away, only do it so after si_connect() has been
called. si_connect() is responsible for initializing the xprt, and the
mux initialization may depend on the xprt being usable, as it may try to
receive data. Otherwise, the connection will be flagged as having an error,
and we will have to try to connect a second time.

This should be backported to 1.9.
2019-01-04 17:08:47 +01:00
Olivier Houchard
9b960a860c BUG/MEDIUM: h1: In h1_init(), wake the tasklet instead of calling h1_recv().
In h1_init(), instead of calling h1_recv() directly, just wake the tasklet,
so that the receive will be done later.
h1_init() might be called from connect_server(), which is itself called
indirectly from process_stream(), and if the receive fails, we may call
si_cs_process(), which may destroy the channel buffers while process_stream()
still expects them to exist.

This should be backported to 1.9.
2019-01-04 17:08:45 +01:00