6804 Commits

Author SHA1 Message Date
PiBa-NL
4763ffdf04 BUG/MINOR: mworker: fix validity check for the pipe FDs
Check if master-worker pipe getenv succeeded, also allow pipe fd 0 as
valid. On FreeBSD in quiet mode the stdin/stdout/stderr are closed
which lets the mworker_pipe to use fd 0 and fd 1. Additionally exit()
upon failure to create or get the master-worker pipe.

This needs to be backported to 1.8.
2017-12-02 13:24:47 +01:00
Willy Tarreau
721d8e0286 MINOR: config: report when "monitor fail" rules are misplaced
"monitor-uri" may rely on "monitor fail" rules, which are processed
very early, immediately after the HTTP request is parsed and before
any http rulesets. It's not reported by the config parser when this
ruleset is misplaces, causing some configurations not to work like
users would expect. Let's just add the warning for a misplaced rule.
2017-12-01 18:25:08 +01:00
David Carlier
7e351eefe5 BUILD/MINOR: haproxy: compiling config cpu parsing handling when needed
parse_cpu_set is only relevant where there is cpu affinity,
avoiding in the process compilation warning as well.
2017-12-01 16:05:35 +01:00
Emeric Brun
088c9b73ca BUG/MAJOR: thread/peers: fix deadlock on peers sync.
Table lock was not released on an error path (if there is no
enough room to write table switch message).

[wt: needs to be backported to 1.8]
2017-12-01 15:06:43 +01:00
Emeric Brun
0fed0b0a38 BUG/MEDIUM: peers: fix some track counter rules dont register entries for sync.
This BUG was introduced with:
'MEDIUM: threads/stick-tables: handle multithreads on stick tables'

The API was reviewed to handle stick table entry updates
asynchronously and the caller must now call a 'stkable_touch_*'
function each time the content of an entry is modified to
register the entry to be synced.

There was missing call to stktable_touch_* resulting in
not propagated entries to remote peers (or local one during reload)
2017-11-29 19:16:22 +01:00
Willy Tarreau
872855998b BUG/MEDIUM: h2: don't report an error after parsing a 100-continue response
Yves Lafon reported a breakage with 100-continue. In fact the problem
is caused when an 1xx is the last response in the buffer (which commonly
is the case). We loop back immediately into the parser with what remains
of the input buffer (ie: nothing), while it is not expected to be called
with an empty response, so it fails.

Let's simply get back to the caller to decide whether or not more data
are expected to be sent.

This fix needs to be backported to 1.8.
2017-11-29 15:41:32 +01:00
Willy Tarreau
cea8537efd BUG/MEDIUM: threads/peers: decrement, not increment jobs on quitting
Commit 8d8aa0d ("MEDIUM: threads/listeners: Make listeners thread-safe")
mistakenly placed HA_ATOMIC_ADD(job, 1) to replace a job--, so it maintains
the job count too high preventing the process from cleanly exiting on
reload.

This needs to be backported to 1.8.
2017-11-29 14:51:20 +01:00
Emmanuel Hocdet
cebd7962e2 BUG/MINOR: ssl: CO_FL_EARLY_DATA removal is managed by stream
Manage BoringSSL early_data as it is with openssl 1.1.1.
2017-11-29 14:34:47 +01:00
David Carlier
6d5c841d24 BUILD/MINOR: haproxy : FreeBSD/cpu affinity needs pthread_np header
for pthread_*_np calls, pthread_np.h is needed under FreeBSD.
2017-11-29 14:30:38 +01:00
Willy Tarreau
5bcfd56519 BUG/MEDIUM: stream: fix session leak on applet-initiated connections
Commit 3e13cba ("MEDIUM: session: make use of the connection's destroy
callback") ensured that connections could be autonomous to destroy the
session they initiated, but it didn't take care of doing the same for
applets. Such applets are used for peers, Lua and SPOE outgoing
connections. In this case, once the stream ends, it closes everything
and nothing takes care of releasing the session. The problem is not
immediately obvious since the only visible effect is that older
processes will not quit on reload after having leaked one such session.

For now we check in stream_free() if the session's origin is the applet
we're releasing, and then free the session as well. Something more
uniform should probably be done once we manage to unify applets and
connections a bit more.

This fix needs to be backported to 1.8. Thanks to Emmanuel Hocdet for
reporting the problem.
2017-11-29 14:26:11 +01:00
William Lallemand
bcd9101a66 BUG/MEDIUM: cache: bad computation of the remaining size
The cache was not setting the hdrs_len to zero when we are called
in the http_forward_data with headers + body.

The consequence is to always try to store a size - the size of headers,
during the calls to http_forward_data even when it has already forwarded
the headers.

Thanks to Cyril Bonté for reporting this bug.

Must be backported to 1.8.
2017-11-28 12:06:06 +01:00
William Lallemand
c3cd35f96c BUG/MEDIUM: ssl: don't allocate shctx several time
The shctx_init() function does not check anymore if the pointer is not
NULL, this check must be done is the caller.

The consequence was to allocate one shctx per ssl bind.

Bug introduced by 4f45bb9 ("MEDIUM: shctx: separate ssl and shctx")

Thanks to Maciej Zdeb for reporting this bug.

Must be backported to 1.8.
2017-11-28 12:04:16 +01:00
Christopher Faulet
b61028549e BUG/MEDIUM: tcp-check: Don't lock the server in tcpcheck_main
There was a deadlock in tcpcheck_main function. The server's lock was already
acquired by the caller (process_chk_conn or wake_srv_chk).

This patch must be backported in 1.8.
2017-11-28 10:43:23 +01:00
David Carlier
e78915a47a BUILD/MINOR: deviceatlas: enable thread support
DeviceAtlas detection being multi-thread safe, we enable the
new thread feature support.
Needs to be backported to 1.8 branch.
2017-11-27 14:22:21 +01:00
Olivier Houchard
ba8e8c3518 BUG/MEDIUM: kqueue: Don't bother closing the kqueue after fork.
kqueue fd's are not shared with children after fork(), so the children
don't have to close them, and it may in fact be dangerous, because we may
end up closing a totally unrelated fd.

[wt: to be backported to 1.8 where master-worker broke on this, and
 likely to older versions for completeness]
2017-11-26 20:10:58 +01:00
Willy Tarreau
103e5663c8 BUG/MAJOR: threads/queue: avoid recursive locking in pendconn_get_next_strm()
pendconn_get_next_strm() is called from process_srv_queue() under the
server lock, and calls stream_add_srv_conn() with this lock held, while
the latter tries to take it again. This results in a deadlock when
a server's maxconn is reached and haproxy is built with thread support.
2017-11-26 18:50:30 +01:00
Willy Tarreau
fd5efb5936 CLEANUP: cache: more efficiently pack the struct cache
By having the cache id on 33 bytes as the first member, it was
creating a hole and forcing the "hot" remaining part to be split
across two cache lines. Let's move the id at the end as it's used
only during config parsing.
2017-11-26 11:10:53 +01:00
Willy Tarreau
b6a2f58993 MINOR: buffers: cache-align buffer_wq_lock
This lock is highly stressed, avoid cache-line sharing to limit stress.
2017-11-26 11:10:51 +01:00
Willy Tarreau
a24d1d0be4 MINOR: task: align the rq and wq locks
We really don't want them to share the same cache line as they are
expected to be used in parallel. Adding a 64-byte alignment here shows
a performance increase of about 4.5% on task-intensive workloads with
2 to 4 threads.
2017-11-26 11:10:51 +01:00
Willy Tarreau
6d1222ce73 MINOR: task: keep a pointer to the currently running task
Very often when debugging, the current task's pointer isn't easy to
recover (eg: from a core file). Let's keep a copy of it, it will
likely help, especially with threads.
2017-11-26 11:10:50 +01:00
William Lallemand
4cfede87a3 MAJOR: mworker: exits the master on failure
This patch changes the behavior of the master during the exit of a
worker.

When a worker exits with an error code, for example in the case of a
segfault, all workers are now killed and the master leaves.

If you don't want this behavior you can use the option
"master-worker no-exit-on-failure".
2017-11-24 22:48:27 +01:00
William Lallemand
49b4453b58 MEDIUM: cache: max-age configuration keyword
Add a configuration keyword to change the max-age.
The default one is still 60s.
2017-11-24 19:31:01 +01:00
William Lallemand
a71cd1d407 MINOR: cache: replace a fprint() by an abort()
In the applet I/O handler we can never get an object bigger than a
buffer, so we should never reach this case.
2017-11-24 19:00:07 +01:00
Willy Tarreau
bafbe01028 CLEANUP: pools: rename all pool functions and pointers to remove this "2"
During the migration to the second version of the pools, the new
functions and pool pointers were all called "pool_something2()" and
"pool2_something". Now there's no more pool v1 code and it's a real
pain to still have to deal with this. Let's clean this up now by
removing the "2" everywhere, and by renaming the pool heads
"pool_head_something".
2017-11-24 17:49:53 +01:00
Olivier Houchard
fbc74e8556 MINOR/CLEANUP: proxy: rename "proxy" to "proxies_list"
Rename the global variable "proxy" to "proxies_list".
There's been multiple proxies in haproxy for quite some time, and "proxy"
is a potential source of bugs, a number of functions have a "proxy" argument,
and some code used "proxy" when it really meant "px" or "curproxy". It worked
by pure luck, because it usually happened while parsing the config, and thus
"proxy" pointed to the currently parsed proxy, but we should probably not
rely on this.

[wt: some of these are definitely fixes that are worth backporting]
2017-11-24 17:21:27 +01:00
Christopher Faulet
767a84bcc0 CLEANUP: log: Rename Alert/Warning in ha_alert/ha_warning 2017-11-24 17:19:12 +01:00
Christopher Faulet
56803b1c98 CLEANUP: debug: Use DPRINTF instead of fprintf into #ifdef DEBUG_FULL/#endif 2017-11-24 17:19:03 +01:00
Christopher Faulet
165f07e7b4 MEDIUM: listener: Bind listeners on a thread subset if specified
If a "process" option with a thread set is used on the bind line, we use the
corresponding bitmask when the listener's FD is created.
2017-11-24 15:38:50 +01:00
Christopher Faulet
c644fa9bf5 MINOR: config: Add threads support for "process" option on "bind" lines
It is now possible on a "bind" line (or a "stats socket" line) to specify the
thread set allowed to process listener's connections. For instance:

    # HTTPS connections will be processed by all threads but the first and HTTP
    # connection will be processed on the first thread.
    bind *:80 process 1/1
    bind *:443 ssl crt mycert.pem process 1/2-
2017-11-24 15:38:50 +01:00
Christopher Faulet
cb6a94510d MINOR: config: Add the threads support in cpu-map directive
Now, it is possible to bind CPU at the thread level instead of the process level
by defining a thread set in "cpu-map" directives. Thus, its format is now:

  cpu-map [auto:]<process-set>[/<thread-set>] <cpu-set>...

where <process-set> and <thread-set> must follow the format:

  all | odd | even | number[-[number]]

Having a process range and a thread range in same time with the "auto:" prefix
is not supported. Only one range is supported, the other one must be a fixed
number. But it is allowed when there is no "auto:" prefix.

Because it is possible to define a mapping for a process and another for a
thread on this process, threads will be bound on the intersection of their
mapping and the one of the process on which they are attached. If the
intersection is null, no specific binding will be set for the threads.
2017-11-24 15:38:50 +01:00
Christopher Faulet
11da456e77 MINOR:: config: Remove thread-map directive
It was a temporary directive used for development purpose. Now, CPU mapping for
at the thread level should be done using the cpu-map directive. This feature
will be added in a next commit.
2017-11-24 15:38:50 +01:00
Christopher Faulet
ff4121f741 MINOR: config: Support partial ranges in cpu-map directive
Now, processa and CPU ranges can be partially defined. The higher bound can be
omitted. In such case, it is replaced by the corresponding maximum value, 32 or
64 depending on the machine's word size.

By extension, It is also true for the "bind-process" directive and "process"
parameter on a "bind" or a "stats socket" line.
2017-11-24 15:38:50 +01:00
Christopher Faulet
26028f6209 MINOR: config: Add auto-increment feature for cpu-map
The prefix "auto:" can be added before the process set to let HAProxy
automatically bind a process to a CPU by incrementing process and CPU sets. To
be valid, both sets must have the same size. No matter the declaration order of
the CPU sets, it will be bound from the lower to the higher bound.

  Examples:
      # all these lines bind the process 1 to the cpu 0, the process 2 to cpu 1
      #  and so on.
      cpu-map auto:1-4   0-3
      cpu-map auto:1-4   0-1 2-3
      cpu-map auto:1-4   3 2 1 0

      # bind each process to exaclty one CPU using all/odd/even keyword
      cpu-map auto:all   0-63
      cpu-map auto:even  0-31
      cpu-map auto:odd   32-63

      # invalid cpu-map because process and CPU sets have different sizes.
      cpu-map auto:1-4   0    # invalid
      cpu-map auto:1     0-3  # invalid
2017-11-24 15:38:49 +01:00
Christopher Faulet
f1f0c5f591 MINOR: config: Export parse_process_number and use it wherever it's applicable
This function is used when "bind-process" directive is parsed and when "process"
parameter on a "bind" or a "stats socket" line is parsed.
2017-11-24 15:38:49 +01:00
Christopher Faulet
5ab51775e7 MINOR: config: Slightly change how parse_process_number works
Now, this function returns a status code to indicate a success (0) or a failure
(1) and the error message in set in <err> parameter. And the result of the parsing
is set in <proc> parameter.
2017-11-24 15:38:49 +01:00
Christopher Faulet
1dcb9cb81c MINOR: config: Support a range to specify processes in "cpu-map" parameter
Now, you can define processes concerned by a cpu-map line using a range. For
instance, the following line binds the first 32 processes on CPUs 0 to 3:

  cpu-map 1-32 0-3
2017-11-24 15:38:49 +01:00
Christopher Faulet
15eb3a9a08 BUG/MINOR: listener: Allow multiple "process" options on "bind" lines
The documentation specifies that you can have several "process" options to
define several ranges on "bind" lines (or "stats socket" lines). It is uncommon,
but it should be possible. So the bind_proc bitmask in bind_conf structure must
not be overwritten at each new "process" option parsed.

This bug also exists in 1.7, 1.6 and 1.5. So it may be backported. But no one
seems to have noticed it, so it was probably never hitted.
2017-11-24 15:38:49 +01:00
William Lallemand
ecb73b12c1 MINOR: cache: move the refcount decrease in the applet release
Move the refcount decrease of the cache in the release callback of the
applet. We don't need to decrease it in the applet code.
2017-11-24 15:04:36 +01:00
William Lallemand
49dc048c25 BUG/MEDIUM: cache: free ressources in chn_end_analyze
Upon an aborted HTTP connection, or an error, the filter cache does not
decrement the refcount and does not free the allocated ressources.
2017-11-24 15:04:36 +01:00
Willy Tarreau
0542c8b39a BUG/MEDIUM: stream: always release the stream-interface on abort
The cache exhibited a but in process_stream() where upon abort it is
possible to switch the stream-int's state to SI_ST_CLO without calling
si_release_endpoint(), resulting in a possibly missing ->release() for
the applet.

It should affect all other applets as well (eg: lua, spoe, peers) and
should carefully be backported to stable branches after some observation
period.
2017-11-24 15:04:36 +01:00
Emmanuel Hocdet
ca6a957c5d MINOR: ssl: Handle early data with BoringSSL
BoringSSL early data differ from OpenSSL 1.1.1 implementation. When early
handshake is done, SSL_in_early_data report if SSL_read will be done on early
data. CO_FL_EARLY_SSL_HS and CO_FL_EARLY_DATA can be adjust accordingly.
2017-11-24 13:50:02 +01:00
Willy Tarreau
45a66ccc55 MEDIUM: config: ensure that tune.bufsize is at least 16384 when using HTTP/2
HTTP/2 mandates the support of 16384 bytes frames by default, so we need
a large enough buffer to process them. Till now if tune.bufsize was too
small, H2 connections were simply rejected during their establishment,
making it quite hard to troubleshoot the issue.

Now we detect when HTTP/2 is enabled on an HTTP frontend and emit an
error if tune.bufsize is not large enough, with the appropriate
recommendation.
2017-11-24 11:28:00 +01:00
Willy Tarreau
599391a7c2 MINOR: h2: make use of client-fin timeout after GOAWAY
At the moment, the "client" timeout is used on an HTTP/2 connection once
it's idle with no active stream. With this patch, this timeout is replaced
by client-fin once a GOAWAY frame is sent. This closely matches what is
done on HTTP/1 since the principle is the same, as it indicates a willing
ness to quickly close a connection on which we don't expect to see anything
anymore.
2017-11-24 10:16:00 +01:00
Willy Tarreau
a76e4c2183 MEDIUM: h2: don't gracefully close the connection anymore on Connection: close
As reported by Lukas, it causes more harm than good, for example on
prompt for authentication. Now we have an "http-request reject" rule
to use instead of "http-request deny" if we absolutely want to close
the connection.
2017-11-24 08:17:28 +01:00
Willy Tarreau
90c3232e54 MINOR: h2: send RST_STREAM before GOAWAY on reject
Apparently the h2c client has trouble reading the RST_STREAM frame after
a GOAWAY was sent, so it's likely that other clients may face the same
difficulty. Curl and Firefox don't care about this ordering, so let's
send it first.
2017-11-24 08:00:30 +01:00
Willy Tarreau
53275e8b02 MINOR: http: implement the "http-request reject" rule
This one acts similarly to its tcp-request counterpart. It immediately
closes the request without emitting any response. It can be suitable in
certain DoS conditions, as well as to close an HTTP/2 connection.
2017-11-24 07:52:01 +01:00
William Lallemand
f528fff46b MEDIUM: cache: store sha1 for hashing the cache key
The cache was relying on the txn->uri for creating its key, which was a
big problem when there was no log activated.

This patch does a sha1 of the host + uri, and stores it in the txn.
When a object is stored, the eb32node uses the first 32 bits of the hash
as a key, and the whole hash is stored in the cache entry.

During a lookup, the truncated hash is used, and when it matches an
entry we check the real sha1.
2017-11-23 20:20:04 +01:00
Olivier Houchard
7fc96d5a01 MINOR: mux: Make sure every string is woken up after the handshake.
In case any stream was waiting for the handshake after receiving early data,
we have to wake all of them. Do so by making the mux responsible for
removing the CO_FL_EARLY_DATA flag after all of them are woken up, instead
of doing it in si_cs_wake_cb(), which would then only work for the first one.
This makes wait_for_handshake work with HTTP/2.
2017-11-23 19:35:42 +01:00
Olivier Houchard
90084a133d MINOR: ssl: Handle reading early data after writing better.
It can happen that we want to read early data, write some, and then continue
reading them.
To do so, we can't reuse tmp_early_data to store the amount of data sent,
so introduce a new member.
If we read early data, then ssl_sock_to_buf() is now the only responsible
for getting back to the handshake, to make sure we don't miss any early data.
2017-11-23 19:35:28 +01:00
Willy Tarreau
51753458c4 BUG/MAJOR: threads/task: dequeue expired tasks under the WQ lock
There is a small unprotected window for a task between the wait queue
and the run queue where a task could be woken up and destroyed at the
same time. What typically happens is that a timeout is reached at the
same time an I/O completes and wakes it up, and the I/O terminates the
task, causing a use after free in wake_expired_tasks() possibly causing
a crash and/or memory corruption :

       thread 1                             thread 2
  (wake_expired_tasks)                (stream_int_notify)

 HA_SPIN_UNLOCK(TASK_WQ_LOCK, &wq_lock);
                              task_wakeup(task, TASK_WOKEN_IO);
                              ...
                              process_stream()
                                stream_free()
                                   task_free()
                                      pool_free(task)
 task_wakeup(task, TASK_WOKEN_TIMER);

This case is reasonably easy to reproduce with a config using very short
server timeouts (100ms) and client timeouts (10ms), while injecting on
httpterm requesting medium sized objects (5kB) over SSL. All this is
easier done with more threads than allocated CPUs so that pauses can
happen anywhere and last long enough for process_stream() to kill the
task.

This patch inverts the lock and the wakeup(), but requires some changes
in process_runnable_tasks() to ensure we never try to grab the WQ lock
while having the RQ lock held. This means we have to release the RQ lock
before calling task_queue(), so we can't hold the RQ lock during the
loop and must take and drop it.

It seems that a different approach with the scope-aware trees could be
easier, but it would possibly not cover situations where a task is
allowed to run on multiple threads. The current solution covers it and
doesn't seem to have any measurable performance impact.
2017-11-23 18:47:04 +01:00