7942 Commits

Author SHA1 Message Date
Frédéric Lécaille
4c8aa117f9 BUG/MINOR: ssl: Wrong usage of shctx_init().
With this patch we check that shctx_init() does not return 0.

Must be backported to 1.8.
2018-10-26 04:54:40 +02:00
Frédéric Lécaille
bc584494e6 BUG/MINOR: cache: Wrong usage of shctx_init().
With this patch we check that shctx_init() does not returns 0.
This is possible if the maxblocks argument, which is passed as an
int, is negative due to an implicit conversion.

Must be backported to 1.8.
2018-10-26 04:54:40 +02:00
Frédéric Lécaille
b9b8b6b6be BUG/MINOR: cache: Crashes with "total-max-size" > 2047(MB).
With this patch we support cache size larger than 2047 (MB) and prevent haproxy from crashing when "total-max-size" is parsed as negative values by atoi().

The limit at parsing time is 4095 MB (UINT_MAX >> 20).

May be backported to 1.8.
2018-10-26 04:54:40 +02:00
Frédéric Lécaille
a2219f5e3b MINOR: cache: Add "max-object-size" option.
This patch adds "max-object-size" option to the cache to limit
the size in bytes of the HTTP objects to be cached. When not provided,
the maximum size of an HTTP object is a 256th of the cache size.
2018-10-24 04:40:03 +02:00
Frédéric Lécaille
b7838afe6f MINOR: shctx: Add a maximum object size parameter.
This patch adds a new parameter to shctx_init() function to be used to
limit the size of each shared object, -1 value meaning "no limit".
2018-10-24 04:39:44 +02:00
Frédéric Lécaille
8df65ae5e2 MINOR: cache: Larger HTTP objects caching.
This patch makes the capable of storing HTTP objects larger than a buffer.
It makes usage of the "block by block shared object allocation" new shctx API.

A new pointer to struct shared_block has been added to the cache applet
context to memorize the next block to be used by the HTTP cache I/O handler
http_cache_io_handler() to emit the data. Another member, named "sent" memorize
the number of bytes already sent by this handler. So, to send an object from cache,
http_cache_io_handler() must be called until "sent" counter reaches the size
of this object.
2018-10-24 04:37:12 +02:00
Frédéric Lécaille
0bec807e08 MINOR: shctx: Shared objects block by block allocation.
This patch makes shctx capable of storing objects in several parts,
each parts being made of several blocks. There is no more need to
walk through until reaching the end of a row to append new blocks.

A new pointer to a struct shared_block member, named last_reserved,
has been added to struct shared_block so that to memorize the last block which was
reserved by shctx_row_reserve_hot(). Same thing about "last_append" pointer which
is used to memorize the last block used by shctx_row_data_append() to store the data.
2018-10-24 04:35:53 +02:00
Willy Tarreau
30f931ead2 BUG/MEDIUM: pools: fix the minimum allocation size
Fred reported a random crash related to the pools. This was introduced
by commit e18db9e98 ("MEDIUM: pools: implement a thread-local cache for
pool entries") because the minimum pool item size should have been
increased to 32 bytes to accommodate the 2 double-linked lists.

No backport is needed.
2018-10-23 14:40:23 +02:00
Willy Tarreau
68ad3a42f7 MINOR: proxy: add a new option "http-use-htx"
This option makes a proxy use only HTX-compatible muxes instead of the
HTTP-compatible ones for HTTP modes. It must be set on both ends, this
is checked at parsing time.
2018-10-23 10:22:36 +02:00
Christopher Faulet
955188d37d BUG/MEDIUM: stream-int: don't set SI_FL_WAIT_ROOM on CF_READ_DONTWAIT
With the previous connection model, when we purposely decided to stop
receiving in order to avoid polling after a complete request was received
for example, it was needed to set SI_FL_WAIT_ROOM to prevent receive
polling from being re-armed. Now with the new subscription-based model
there is no such thing anymore and there is noone to remove this flag
either. Thus if a request takes more than one packet to come in or
spans over too many packets, this flag will cause it to wait forever.
Let's simply remove this flag now.

This patch should not be backported since older versions still need
that this flag is set here to stop receiving.
2018-10-23 10:22:36 +02:00
Christopher Faulet
66943a4903 CLEANUP: http: Remove the unused function http_find_header 2018-10-23 10:22:36 +02:00
Olivier Houchard
31f04e4416 MINOR: stream_interface: Avoid calling si_cs_send/recv if not needed.
Don't bother calling si_cs_send and si_cs_recv if we're either already
subscribe, or if the output buffer is empty for si_cs_send.
2018-10-22 16:05:08 +02:00
Olivier Houchard
d846c267d5 MINOR: h2: Don't run tasks that are waiting to send if mux in full.
We wake up all the streams waiting to send data when we have space available
in the mux buffer. Doing so means we probably wake way too many streams,
because after a few the buffer will probably be full instead. So keep a
list of all the streams that are about to send data, and if we detect that
the buffer is full, unschedule the tasks and put the streams back to the
send_list.
2018-10-21 06:00:13 +02:00
Olivier Houchard
d7bd3e3c4c MINOR: streams: Call tasklet_free() after si_release_endpoint().
Make sure we call tasklet_free() only after si_release_endpoint(), when the
unsubscribe() method has been called, so that we're sure the mux won't
attempt to access the taslet.
2018-10-21 05:59:55 +02:00
Olivier Houchard
53216e7db9 MEDIUM: connections: Don't directly mess with the polling from the upper layers.
Avoid using conn_xprt_want_send/recv, and totally nuke cs_want_send/recv,
from the upper layers. The polling is now directly handled by the connection
layer, it is activated on subscribe(), and unactivated once we got the event
and we woke the related task.
2018-10-21 05:58:40 +02:00
Olivier Houchard
81a15af6bc MINOR: h2: Make sure to return 1 in h2_recv() when needed.
In h2_recv(), return 1 if we have data available, or if h2_recv_allowed()
failed, to be sure h2_process() is called.
Also don't subscribe if our buffer is full.
2018-10-21 05:58:33 +02:00
Olivier Houchard
85b73e9427 BUG/MEDIUM: stream: Make sure polling is right on retry.
When retrying to connect to a server, because the previous connection failed,
make sure if we subscribed to the previous connection, the polling flags will
be true for the new fd.

No backport is needed.
2018-10-21 05:55:32 +02:00
Olivier Houchard
52b946686c BUG/MEDIUM: h2: Close connection if no stream is left an GOAWAY was sent.
When we're closing a stream, is there's no stream left and a goaway was sent,
close the connection, there's no reason to keep it open.

[wt: it's likely that this is needed in 1.8 as well, though it's unclear
 how to trigger this issue, some tests are needed]
2018-10-21 05:53:09 +02:00
Olivier Houchard
8b2c8a7894 BUILD: memory: fix free_list pointer declaration again for atomic CAS
Similary to what's been done in 7a6ad88b02d8b74c2488003afb1a7063043ddd2d,
take into account that free_list that free_list is a void **, and so use
a void ** too when attempting to do a CAS.
2018-10-21 05:44:38 +02:00
Willy Tarreau
4e7cc3381b BUILD: compiler: rename __unreachable() to my_unreachable()
Olivier reported that on FreeBSD __unreachable is already defined
and causes build warnings. Let's rename it then.
2018-10-20 17:45:48 +02:00
Willy Tarreau
ed72d82827 MEDIUM: time: measure the time stolen by other threads
The purpose is to detect if threads or processes are competing for the
same CPU. This can happen when threads are incorrectly bound, or after a
reload if the previous process still has an important activity. With
threads this situation is problematic because a preempted thread holding
a lock will block other ones waiting for this lock to be released.

A first attempt consisted in measuring the cumulated lost time more
precisely but the system's scheduler is smart enough to try to limit the
thread preemption rate by mostly context switching during poll()'s blank
periods, so most of the time lost is not seen. In essence this is good
because it means a thread is not preempted with a lock held, and even
regarding the rendez-vous point it cannot prevent the other ones from
making progress. But still it happens tens to hundreds of times per
second that a thread might be preempted, so it's still possible to detect
that the situation is happening, thus it's interesting to measure and
report its frequency.

Each time we enter the poller, we check the CPU time spent working and
see if we've lost time doing something else. To limit false positives,
we're only interested in losses of 500 microseconds or more (i.e. half
a clock tick on a 1 kHz system). If so, it indicates that some time was
stolen by another thread or process. Note that we purposely store some
sub-millisecond counters so that under heavy traffic with a 1 kHz clock,
it's still possible to measure something without being subject to the
risk of rounding errors (i.e. if exactly 1 ms is stolen it's possible
that the time difference could often be slightly lower).

This counter of lost CPU time slots time is reported in "show activity"
in numbers of milliseconds of CPU lost per second, per 15s, and total
over the process' life. By definition, the per-second counter cannot
report values larger than 1000 per thread per second and the 15s one
will be limited to 15000/s in the worst case, but it's possible that
peak values exceed such thresholds after long pauses.
2018-10-19 08:51:59 +02:00
Willy Tarreau
ac6c8805be BUILD: memory: fix pointer declaration for atomic CAS
The calls to HA_ATOMIC_CAS() on the lockfree version of the pool allocator
were mistakenly done on (void*) for the old value instead of (void **).
While this has no impact on "recent" gcc, it does have one for gcc < 4.7
since the CAS was open coded and it's not possible to assign a temporary
variable of type "void".

No backport is needed, this only affects 1.9.
2018-10-18 16:12:28 +02:00
Willy Tarreau
7e9c4ae4de MINOR: poller: move time and date computation out of the pollers
By placing this code into time.h (tv_entering_poll() and tv_leaving_poll())
we can remove the logic from the pollers and prepare for extending this to
offer more accurate time measurements.
2018-10-17 19:59:43 +02:00
Willy Tarreau
f37ba94768 MINOR: fd: centralize poll timeout computation in compute_poll_timeout()
The 4 pollers all contain the same code used to compute the poll timeout.
This is pointless, let's centralize this into fd.h. This also gets rid of
the useless SCHEDULER_RESOLUTION macro which used to work arond a very old
linux 2.2 bug causing select() to wake up slightly before the timeout.
2018-10-17 19:59:43 +02:00
Olivier Houchard
33992267aa MINOR: peers: use defines instead of enums to appease clang.
Clang (rightfully) warns that we're trying to set chars to values >= 128.
Use defines with hex values instead of an enum to address this.
2018-10-16 19:31:15 +02:00
Olivier Houchard
3332090a2d MINOR: cfgparse: Write 130 as 128 as 0x82 and 0x80.
Write 130 and 128 as 8x82 and 0x80, to avoid warnings about casting from
int to size. "check_req" should probably be unsigned, but it's hard to do so.
2018-10-16 19:28:35 +02:00
Willy Tarreau
5dfb6c4cc9 CLEANUP: state-file: make the path concatenation code a bit more consistent
There are as many ways to build the globalfilepathlen variable as branches
in the if/then/else, creating lots of confusion. Address the most obvious
parts, but some polishing definitely is still needed.
2018-10-16 19:26:12 +02:00
Olivier Houchard
17f8b90736 MINOR: server: Use memcpy() instead of strncpy().
Use memcpy instead of strncpy, strncpy buys us nothing, and gcc is being
annoying.
2018-10-16 19:22:20 +02:00
Willy Tarreau
b059b894cd BUILD: lua: silence some compiler warnings after WILL_LJMP
These ones are on error paths that are properly handled by luaL_error()
which does a longjmp() but the compiler cannot know it. By adding an
__unreachable() statement in WILL_LJMP(), there is no ambiguity anymore.

This may be backported to 1.8 but these previous patches are needed first :
  - BUILD: compiler: add a new statement "__unreachable()"
  - MINOR: lua: all functions calling lua_yieldk() may return
  - BUILD: lua: silence some compiler warnings about potential null derefs (#2)
2018-10-16 17:57:36 +02:00
Willy Tarreau
9635e03c41 MINOR: lua: all functions calling lua_yieldk() may return
There was a mistake when tagging functions which always use longjmp and
those which may use it in that all those supposed to call lua_yieldk()
may return without calling longjmp. Thus they must not use WILL_LJMP()
but MAY_LJMP(). It has zero impact on the code emitted as such, but
prevents other fixes from being properly implemented : this was the
cause of the previous failure with the __unreachable() calls.

This may be backported to older versions. It may or may not apply
well depending on the context, though the change simply consists in
replacing "WILL_LJMP(hlua_yieldk" with "MAY_LJMP(hlua_yieldk", and
same with the single call to lua_yieldk() in hlua_yieldk().
2018-10-16 17:56:20 +02:00
Willy Tarreau
e09101e8d9 BUILD: lua: silence some compiler warnings about potential null derefs (#2)
Here we make sure that appctx is always taken from the unchecked value
since we know it's an appctx, which explains why it's immediately
dereferenced. A missing test was added to ensure that task_new() does
not return a NULL.

This may be backported to 1.8.
2018-10-16 17:39:05 +02:00
Willy Tarreau
526aed219f Revert "BUILD: lua: silence some compiler warnings about potential null derefs"
This reverts commit f1ffb39b614b0d9654c9450ac6e8c88cfc942784.

It breaks Lua causing some timeouts. Removing the __unreachable() statement
from WILL_LJMP() fixes it. It's very strange and unclear whether it's an
issue with WILL_LJMP() not fullfilling its promise of not returning, if
the code emitted with __unreachable() gets broken, or anything else. Let's
revert this for now.
2018-10-16 17:32:55 +02:00
Willy Tarreau
a9c0252b2e BUG/MEDIUM: threads: fix thread_release() at the end of the rendez-vous point
There is a bug in this function used to release other threads. It leaves
the current thread marked as harmless. If after this another thread does
a thread_isolate(), but before the first one reaches poll(), the second
thread will believe it's alone while it's not.

This must be backported to 1.8 since the rendez-vous point was merged
into 1.8.14.
2018-10-16 17:03:16 +02:00
Willy Tarreau
e18db9e984 MEDIUM: pools: implement a thread-local cache for pool entries
Each thread now keeps the last ~512 kB of freed objects into a local
cache. There are some heuristics involved so that a specific pool cannot
use more than 1/8 of the total cache in number of objects. Tests have
shown that 512 kB is an optimal size on a 24-thread test running on a
dual-socket machine, resulting in an overall 7.5% performance increase
and a cache miss ratio reducing from 19.2 to 17.7%. Anyway it seems
pointless to keep more than an L2 cache, which probably explains why
sizes between 256 and 512 kB are optimal.

Cached objects appear in two lists, one per pool and one LRU to help
with fair eviction. Currently there is no way to check each thread's
cache state nor to flush it. This cache cannot be disabled and is
enabled as soon as the lockless pools are enabled (i.e.: threads are
enabled, no pool debugging is in use and the CPU supports a double word
CAS).
2018-10-16 13:46:08 +02:00
Willy Tarreau
0a93b6413f MINOR: pools: allocate most memory pools from an array
For caching it will be convenient to have indexes associated with pools,
without having to dereference the pool itself. One solution could consist
in replacing all pool pointers with integers but this would limit the
number of allocatable pools. Instead here we allocate the 32 first pools
from a pre-allocated array whose base address is known so that it's trivial
to convert a pool to an index in this array. Pools that cannot fit there
will be allocated normally.
2018-10-16 10:29:26 +02:00
Willy Tarreau
8d8747abe0 OPTIM: tasks: group all tree roots per cache line
Currently we have per-thread arrays of trees and counts, but these
ones unfortunately share cache lines and are accessed very often. This
patch moves the task-specific stuff into a structure taking a multiple
of a cache line, and has one such per thread. Just doing this has
reduced the cache miss ratio from 19.2% to 18.7% and increased the
12-thread test performance by 3%.

It starts to become visible that we really need a process-wide per-thread
storage area that would cover more than just these parts of the tasks.
The code was arranged so that it's easy to move the pieces elsewhere if
needed.
2018-10-15 19:06:13 +02:00
Willy Tarreau
b20aa9eef3 MAJOR: tasks: create per-thread wait queues
Now we still have a main contention point with the timers in the main
wait queue, but the vast majority of the tasks are pinned to a single
thread. This patch creates a per-thread wait queue and queues a task
to the local wait queue without any locking if the task is bound to a
single thread (the current one) otherwise to the shared queue using
locking. This significantly reduces contention on the wait queue. A
test with 12 threads showed 11 ms spent in the WQ lock compared to
4.7 seconds in the same test without this change. The cache miss ratio
decreased from 19.7% to 19.2% on the 12-thread test, and its performance
increased by 1.5%.

Another indirect benefit is that the average queue size is divided
by the number of threads, which roughly removes log(nbthreads) levels
in the tree and further speeds up lookups.
2018-10-15 19:04:40 +02:00
Willy Tarreau
87d54a9a6d MEDIUM: fd/threads: only grab the fd's lock if the FD has more than one thread
The vast majority of FDs are only seen by one thread. Currently the lock
on FDs costs a lot because it's touched often, though there should be very
little contention. This patch ensures that the lock is only grabbed if the
FD is shared by more than one thread, since otherwise the situation is safe.
Doing so resulted in a 15% performance boost on a 12-threads test.
2018-10-15 13:25:06 +02:00
Willy Tarreau
9504dd64c6 MINOR: config: use atleast2() instead of my_popcountl() where relevant
Quite often we used my_popcountl() just to check for > 1 bit set. Now
we have an easier solution, let's use it.
2018-10-15 13:25:06 +02:00
Willy Tarreau
d944344f01 BUILD: peers: check allocation error during peers_init_sync()
peers_init_sync() doesn't check task_new()'s return value and doesn't
return any result to indicate success or failure. Let's make it return
an int and check it from the caller.

This can be backported as far as 1.6.
2018-10-15 13:24:43 +02:00
Willy Tarreau
848522f05d BUILD: stick-table: make sure not to fail on task_new() during initialization
Gcc reports a potential null-deref error in the stick-table init code.
While not critical there, it's trivial to fix. This check has been
missing since 1.4 so this fix can be backported to all supported versions.
2018-10-15 13:24:43 +02:00
Willy Tarreau
a8825520b7 BUILD: ssl: fix another null-deref warning in ssl_sock_switchctx_cbk()
This null-deref cannot happen either as there necesarily is a listener
where this function is called. Let's use __objt_listener() to address
this.

This may be backported to 1.8.
2018-10-15 13:24:43 +02:00
Willy Tarreau
b729077710 BUILD: ssl: fix null-deref warning in ssl_fc_cipherlist_str sample fetch
Gcc 6.4 detects a potential null-deref warning in smp_fetch_ssl_fc_cl_str().
This one is not real since already addressed a few lines above. Let's use
__objt_conn() instead of objt_conn() to avoid the extra test that confuses
it.

This could be backported to 1.8.
2018-10-15 13:24:43 +02:00
Willy Tarreau
f1ffb39b61 BUILD: lua: silence some compiler warnings about potential null derefs
These ones are on error paths that are properly handled by luaL_error()
which does a longjmp() but the compiler cannot know it. By adding an
__unreachable() statement in WILL_LJMP(), there is no ambiguity anymore.

This may be backported to 1.8 but the previous patch (BUILD: compiler:
add a new statement "__unreachable()") is needed for this.
2018-10-15 13:24:43 +02:00
Willy Tarreau
e5f229e639 BUG/MEDIUM: stream: don't crash on out-of-memory
In case pool_alloc() fails in stream_new(), we try to detach the stream
from the list before it has been added, dereferencing a NULL. In order
to fix it, simply move the LIST_DEL call upwards.

This must be backported to 1.8.
2018-10-15 13:24:43 +02:00
William Lallemand
dd319a5b1d BUG/MEDIUM: mworker: don't poll on LI_O_INHERITED listeners
The listeners with the LI_O_INHERITED flag were deleted but not unbound
which is a problem since we have a polling in the master.

This patch unbind every listeners which are not require for the master,
but does not close the FD of those that have a LI_O_INHERITED flag.
2018-10-12 19:30:18 +02:00
Willy Tarreau
b3fb56db10 MINOR: h2: add a new flag to quickly distinguish front vs back connection
We will need to know if a mux was created for a front or a back
connection and once it's established it's much harder, so let's
introduce H2_CF_IS_BACK for this.
2018-10-12 16:58:41 +02:00
Willy Tarreau
a8e4954856 MINOR: h2: split h2c_stream_new() into h2s_new() + h2c_frt_stream_new()
For backend connections we'll have to initialize streams but not allocate
conn_streams since they'll already be there. Thus this patch splits the
h2c_stream_new() function into one dedicated to allocation of a new stream
and another one supposed to attach this stream to an existing frontend
connection.
2018-10-12 16:58:01 +02:00
Willy Tarreau
0b37d658e6 MINOR: h2: retrieve the front proxy from the caller instead of the session
Till now in order to figure the timeouts, we used to retrieve the proxy
from the session's owner, but the new API provides it so it's better to
simply take it from the caller at init time. We take this opportunity to
store the pointer to the proxy into the h2 connection so that we can
reuse it later when needed.
2018-10-12 16:58:01 +02:00
Willy Tarreau
7dc24e49cc MINOR: h2: unify the mux init function
The init function was split into the mux init and the front init, but it
appears that most of the code will be common between the two sides when
implementing the backend init. Thus let's simply make this a unique
h2_init() function.
2018-10-12 16:58:01 +02:00