Commit Graph

2947 Commits

Author SHA1 Message Date
Olivier Houchard
1256836ebf MEDIUM: fd/threads: Make sure we don't miss a fd cache entry.
An fd cache entry might be removed and added at the end of the list, while
another thread is parsing it, if that happens, we may miss fd cache entries,
to avoid that, add a new field in the struct fdtab, "added_mask", which
contains a mask for potentially affected threads, if it is set, the
corresponding thread will set its bit in fd_cache_mask, to avoid waiting in
poll while it may have more work to do.
2018-02-05 16:02:22 +01:00
Olivier Houchard
4815c8cbfe MAJOR: fd/threads: Make the fdcache mostly lockless.
Create a local, per-thread, fdcache, for file descriptors that only belongs
to one thread, and make the global fd cache mostly lockless, as we can get
a lot of contention on the fd cache lock.
2018-02-05 16:02:22 +01:00
Olivier Houchard
cf975d46bc MINOR: pools/threads: Implement lockless memory pools.
On CPUs that support a double-width compare-and-swap, implement lockless
pools.
2018-02-05 16:02:22 +01:00
Willy Tarreau
5266b3e12d MINOR: threads: add test and set/reset operations
This just adds a set of naive bts/btr operations based on OR/AND. Later
it could rely on pl_bts/btr to use arch-specific versions if needed.
2018-02-05 14:24:50 +01:00
Olivier Houchard
f61f0cb95f MINOR: threads: Introduce double-width CAS on x86_64 and arm.
Introduce double-width compare-and-swap on arches that support it, right now
x86_64, arm, and aarch64.
Also introduce functions to do memory barriers.
2018-02-05 14:24:50 +01:00
Olivier Houchard
928fbfa8b7 MINOR: compiler: introduce offsetoff().
Add a offsetof() macro, if it is no there already.
2018-02-05 14:24:50 +01:00
Olivier Houchard
6fa63d9852 MINOR: early data: Don't rely on CO_FL_EARLY_DATA to wake up streams.
Instead of looking for CO_FL_EARLY_DATA to know if we have to try to wake
up a stream, because it is waiting for a SSL handshake, instead add a new
conn_stream flag, CS_FL_WAIT_FOR_HS. This way we don't have to rely on
CO_FL_EARLY_DATA, and we will only wake streams that are actually waiting.
2018-02-05 14:24:50 +01:00
Christopher Faulet
b077cdc012 MEDIUM: spoe: Use an ebtree to manage idle applets
Instead of using a list of applets with idle ones in front, we now use an
ebtree. Aapplets in the tree are idle by definition. And the key is the applet's
weight. When a new frame is queued, the first idle applet (with the lowest
weight) is woken up and its weight is increased by one. And when an applet sends
a frame to a SPOA, its weight is decremented by one.

This is empirical, but it should avoid to overuse a very few number of applets
and increase the balancing between idle applets.
2018-02-02 16:00:32 +01:00
Christopher Faulet
8f82b203d5 MINOR: spoe: Count the number of frames waiting for an ack for each applet
So it is easier to respect the max_fpa value. This is no more the maximum frames
processed by an applet at each loop but the maximum frames waiting for an ack
for a specific applet.

The function spoe_handle_processing_appctx has been rewritten accordingly.
2018-02-02 16:00:32 +01:00
Christopher Faulet
6f9ea4f87b MINOR: spoe: Replace sending_rate by a frequency counter
sending_rate was a counter used to evaluate the SPOE capacity to process
frames. Because it was not really accurrate, it has been replaced by a frequency
counter representing the number of frames handled by the SPOE per second. We
just check this counter is higher than the number of streams waiting for a
reply. If not, a new applet is created.
2018-02-02 16:00:32 +01:00
Christopher Faulet
fce747bbaa MINOR: spoe: Always link a SPOE context with the applet processing it
This was already done for fragmented frames. Now, this is true for all
frames.
2018-02-02 16:00:32 +01:00
Christopher Faulet
420977903b MINOR: spoe: Remove check on min_applets number when a SPOE context is queued
The calculation of a minimal number of active applets was really empirical and
finally useless. On heavy load, there are always many active applets (most of
time, more than the minimal required) and when the load is low, there is no
reason to keep unused applets opened.

Because of this change, the flag SPOE_APPCTX_FL_PERSIST is now unused. So it has
been removed.
2018-02-02 16:00:32 +01:00
Frédéric Lécaille
6778b27542 MINOR: stick-tables: Adds support for new "gpc1" and "gpc1_rate" counters.
Implement exactly the same code as this has been done for "gpc0" and "gpc0_rate"
counters.
2018-01-31 09:40:05 +01:00
Christopher Faulet
f51bac2ba8 BUG/MINOR: threads: Update labels array because of changes in lock_label enum
Recent changes to the enum were not synchronized with the lock debugging
code. Now we use a switch/case instead of an array so that the compiler
throws a warning if there is any inconsistency.

To be backported to 1.8 (at least to add the START entry).
2018-01-30 14:35:24 +01:00
Willy Tarreau
a9786b6f04 MINOR: fd: pass the iocb and owner to fd_insert()
fd_insert() is currently called just after setting the owner and iocb,
but proceeding like this prevents the operation from being atomic and
requires a lock to protect the maxfd computation in another thread from
meeting an incompletely initialized FD and computing a wrong maxfd.
Fortunately for now all fdtab[].owner are set before calling fd_insert(),
and the first lock in fd_insert() enforces a memory barrier so the code
is safe.

This patch moves the initialization of the owner and iocb to fd_insert()
so that the function will be able to properly arrange its operations and
remain safe even when modified to become lockless. There's no other change
beyond the internal API.
2018-01-29 16:07:25 +01:00
Willy Tarreau
82b37d74d2 MEDIUM: fd: use atomic ops for hap_fd_{clr,set} and remove poll_lock
Now that we can use atomic ops to set/clear an fd occurrence in an
fd_set, we don't need the poll_lock anymore. Let's remove it.
2018-01-29 16:03:15 +01:00
Willy Tarreau
322e6c7e73 MINOR: fd: move the hap_fd_{clr,set,isset} functions to fd.h
These functions were created for poll() in 1.5-dev18 (commit 80da05a4) to
replace the previous FD_{CLR,SET,ISSET} that were shared with select()
because some libcs enforce a limit on FD_SET. But FD_SET doesn't seem
to be universally MT-safe, requiring locks in the select() code that
are not needed in the poll code. So let's move back to the initial
situation where we used to only use bit fields, since that has been in
use since day one without a problem, and let's use these hap_fd_*
functions instead of FD_*.

This patch only moves the functions to fd.h and revives hap_fd_isset()
that was recently removed to kill an "unused" warning.
2018-01-29 16:03:15 +01:00
Willy Tarreau
745c60eac6 CLEANUP: fd: remove the unused "new" field
This field has been unused since 1.6, it's only updated and never
tested. Let's remove it.
2018-01-29 16:02:59 +01:00
Willy Tarreau
f2b5c99b4c CLEANUP: fd/threads: remove the now unused fdtab_lock
It was only used to protect maxfd computation and is not needed
anymore.
2018-01-29 15:25:35 +01:00
Willy Tarreau
173d9951e2 MEDIUM: polling: start to move maxfd computation to the pollers
Since only select() and poll() still make use of maxfd, let's move
its computation right there in the pollers themselves, and only
during each fd update pass. The computation doesn't need a lock
anymore, only a few atomic ops. It will be accurate, be done much
less often and will not be required anymore in the FD's fast patch.

This provides a small performance increase of about 1% in connection
rate when using epoll since we get rid of this computation which was
performed under a lock.
2018-01-29 15:22:57 +01:00
Frédéric Lécaille
a41d531e4e MINOR: config: Enable tracking of up to MAX_SESS_STKCTR stick counters.
This patch really adds support for up to MAX_SESS_STKCTR stick counters.
2018-01-29 13:53:56 +01:00
Tim Duesterhus
471851713a MINOR: standard: Add str2mask6 function
This new function mirrors the str2mask() function for IPv4 addresses.

This commit is in preparation to support ARGT_MSK6.
2018-01-25 22:25:40 +01:00
Tim Duesterhus
92bb034209 CLEANUP: Fix typo in ARGT_MSK6 comment
The incorrect comment was introduced in commit:
2ac5718dbd

v1.5-dev9 is the first tag containing this comment, the fix
should be backported to haproxy 1.5 and newer.
2018-01-25 22:25:40 +01:00
Willy Tarreau
1605c7ae61 BUG/MEDIUM: threads/mworker: fix a race on startup
Marc Fournier reported an interesting case when using threads with the
master-worker mode : sometimes, a listener would have its FD closed
during startup. Sometimes it could even be health checks seeing this.

What happens is that after the threads are created, and the pollers
enabled on each threads, the master-worker pipe is registered, and at
the same time a close() is performed on the write side of this pipe
since the children must not use it.

But since this is replicated in every thread, what happens is that the
first thread closes the pipe, thus releases the FD, and the next thread
starting a listener in parallel gets this FD reassigned. Then another
thread closes the FD again, which this time corresponds to the listener.
It can also happen with the health check sockets if they're started
early enough.

This patch splits the mworker_pipe_register() function in two, so that
the close() of the write side of the FD is performed very early after the
fork() and long before threads are created (we don't need to delay it
anyway). Only the pipe registration is done in the threaded code since
it is important that the pollers are properly allocated for this.
The mworker_pipe_register() function now takes care of registering the
pipe only once, and this is guaranteed by a new surrounding lock.

The call to protocol_enable_all() looks fragile in theory since it
scans the list of proxies and their listeners, though in practice
all threads scan the same list and take the same locks for each
listener so it's not possible that any of them escapes the process
and finishes before all listeners are started. And the operation is
idempotent.

This fix must be backported to 1.8. Thanks to Marc for providing very
detailed traces clearly showing the problem.
2018-01-23 19:18:57 +01:00
Willy Tarreau
c9c8378c2b MINOR: fd: add a bitmask to indicate that an FD is known by the poller
Some pollers like epoll() need to know if the fd is already known or
not in order to compute the operation to perform (add, mod, del). For
now this is performed based on the difference between the previous FD
state and the new state but this will not be usable anymore once threads
become responsible for their own polling.

Here we come with a different approach : a bitmask is stored with the
fd to indicate which pollers already know it, and the pollers will be
able to simply perform the add/mod/del operations based on this bit
combined with the new state.

This patch only adds the bitmask declaration and initialization, it
is it not yet used. It will be needed by the next two fixes and will
need to be backported to 1.8.
2018-01-23 15:42:57 +01:00
Willy Tarreau
ebc78d78a2 BUG/MEDIUM: fd: maintain a per-thread update mask
Since the fd update tables are per-thread, we need to have a bit per
thread to indicate whether an update exists, otherwise this can lead
to lost update events every time multiple threads want to update the
same FD. In practice *for now*, it only happens at start time when
listeners are enabled and ask for polling after facing their first
EAGAIN. But since the pollers are still shared, a lost event is still
recovered by a neighbor thread. This will not reliably work anymore
with per-thread pollers, where it has been observed a few times on
startup that a single-threaded listener would not always accept
incoming connections upon startup.

It's worth noting that during this code review it appeared that the
"new" flag in the fdtab isn't used anymore.

This fix should be backported to 1.8.
2018-01-23 15:41:19 +01:00
Christopher Faulet
69553fe62c MINOR: threads/fd: Use a bitfield to know if there are FDs for a thread in the FD cache
A bitfield has been added to know if there are some FDs processable by a
specific thread in the FD cache. When a FD is inserted in the FD cache, the bits
corresponding to its thread_mask are set. On each thread, the bitfield is
updated when the FD cache is processed. If there is no FD processed, the thread
is removed from the bitfield by unsetting its tid_bit.

Note that this bitfield is updated but not checked in
fd_process_cached_events. So, when this function is called, the FDs cache is
always processed.

[wt: should be backported to 1.8 as it will help fix a design limitation]
2018-01-23 15:39:10 +01:00
Willy Tarreau
d80cb4ee13 MINOR: global: add some global activity counters to help debugging
A number of counters have been added at special places helping better
understanding certain bug reports. These counters are maintained per
thread and are shown using "show activity" on the CLI. The "clear
counters" commands also reset these counters. The output is sent as a
single write(), which currently produces up to about 7 kB of data for
64 threads. If more counters are added, it may be necessary to write
into multiple buffers, or to reset the counters.

To backport to 1.8 to help collect more detailed bug reports.
2018-01-23 15:38:33 +01:00
Willy Tarreau
421f02e738 MINOR: threads: add a MAX_THREADS define instead of LONGBITS
This one allows not to inflate some structures when threads are
disabled. Now struct global is 1.4 kB instead of 33 kB.

Should be backported to 1.8 for ease of backporting of upcoming
patches.
2018-01-23 15:28:20 +01:00
Willy Tarreau
f4571a027f MINOR: global/threads: move cpu_map at the end of the global struct
The "thread" part is 32kB long, better move it at the end of the
structure since it's only used during initialization, to keep the
rest grouped together.

Should be backported to 1.8 to ease backporting of upcoming patches,
no functional impact.
2018-01-23 15:27:52 +01:00
Christopher Faulet
336d3ef0e7 MINOR: spoe: add register-var-names directive in spoe-agent configuration
In addition to "option force-set-var", recently added, this directive can be
used to selectivelly register unknown variable names, without totally relaxing
their registration during the runtime, like "option force-set-var" does.

So there is no way for a malicious agent to exhaust memory by defining a too
high number of variable names. In other hand, you need to enumerate all
variable names. This could be painfull in some circumstances.

Remember, this directive is only usefull when the variable names are not
referenced anywhere in the HAProxy configuration or the SPOE one.

Thanks to Etienne Carrière for his help on this part.
2018-01-15 13:47:27 +01:00
David Carlier
ec5e84552a BUILD/MINOR: ancient gcc versions atomic fix
Commit 1a69af6d38 introduced code
for atomic prior to 4.7. Unfortunately clang uses as well those
constants which is misleading.
2018-01-11 15:31:07 +01:00
Willy Tarreau
1a69af6d38 MINOR: hathreads: add support for gcc < 4.7
Till now the use of __atomic_* gcc builtins required gcc >= 4.7. Since
some supported and quite common operating systems like CentOS 6 still
come with older versions (4.4) and the mapping to the older builtins
is reasonably simple, let's implement it.

This code is only used for gcc < 4.7. It has been quickly tested on a
machine using gcc 4.4.4 and provided expected results.

This patch should be backported to 1.8.
2018-01-10 07:51:56 +01:00
Olivier Houchard
2ec2db9725 MINOR: dns: Handle SRV record weight correctly.
A SRV record weight can range from 0 to 65535, while haproxy weight goes
from 0 to 256, so we have to divide it by 256 before handing it to haproxy.
Also, a SRV record with a weight of 0 doesn't mean the server shouldn't be
used, so use a minimum weight of 1.

This should probably be backported to 1.8.
2018-01-09 15:43:11 +01:00
Olivier Houchard
e2a34967a9 CLEANUP: rbtree: remove
Remove the rbtree implementation. It's not used, it's not even connected to
the build, and we probably have no use for it .
2018-01-05 10:56:32 +01:00
Willy Tarreau
3083276187 MINOR: h2: add a function to report pseudo-header names
For debugging we need to be able to dump pseudo headers when we know
their name, let's put this there as we already have the other way
around.
2017-12-30 17:17:07 +01:00
Willy Tarreau
a48c141f44 BUG/MAJOR: connection: refine the situations where we don't send shutw()
Since commit f9ce57e ("MEDIUM: connection: make conn_sock_shutw() aware
of lingering"), we refrain from performing the shutw() on the socket if
there is no lingering risk. But there is a problem with this in tunnel
and in TCP modes where a client is explicitly allowed to send a shutw
to the server, eventhough it it risky.

Not doing it creates this situation reported by Ricardo Fraile and
diagnosed by Christopher : a typical HTTP client (eg: curl) connecting
via the config below to an HTTP server would receive its response,
immediately close while the server remains in keep-alive mode. The
shutr() received by haproxy from the client is "propagated" to the
server side but not acted upon because fdtab[fd].linger_risk is set,
so we expect that the next close will immediately complete this
operation.

  listen proxy-tcp
    bind 127.0.0.1:8888
    mode tcp
    timeout connect 5s
    timeout server  10s
    timeout client  10s
    server server1 127.0.0.1:8000

But since the whole stream will not end until the server closes in
turn, the server doesn't close and haproxy expires on server timeout.
This problem has already struck by waking up an older bug and was
partially fixed with commit 8059351 ("BUG/MEDIUM: http: don't disable
lingering on requests with tunnelled responses") though it was not
enough.

The problem is that linger_risk is not suited here. In fact we need to
know whether or not it is desired to close normally or silently, and
whether or not a shutr() has already been received on this connection.

This is the approach this patch takes, and it solves the problem for
the various difficult modes (tcp, http-server-close, pretend-keepalive).

This fix needs to be backported to 1.8. Many thanks to Ricardo for
providing very detailed traces and configurations.
2017-12-22 18:54:05 +01:00
Willy Tarreau
0ad8e0dfea MINOR: http: add a function to check request's cache-control header field
The new function check_request_for_cacheability() is used to check if
a request may be served from the cache, and/or allows the response to
be stored into the cache. For this it checks the cache-control and
pragma header fields, and adjusts the existing TX_CACHEABLE and a new
TX_CACHE_IGNORE flags.

For now, just like its response side counterpart, it only checks the
first value of the header field. These functions should be reworked to
improve their parsers and validate all elements.
2017-12-22 17:56:17 +01:00
Willy Tarreau
984fca9363 MINOR: stream-int: set flag SI_FL_CLEAN_ABRT when mux supports clean aborts
By copying the info in the stream interface that the mux cleanly reports
aborts, we'll have the ability to check this flag wherever needed regardless
of the presence of a mux or not.
2017-12-20 16:56:32 +01:00
Willy Tarreau
28f1cb9da2 MINOR: mux: add flags to describe a mux's capabilities
This new field will be used to describe certain properties of some
muxes. For now we only add MX_FL_CLEAN_ABRT to indicate that a mux
is able to unambiguously report aborts using CS_FL_ERROR contrary
to others who may only report it via a read0. This will be used to
improve handling of the abortonclose option with H2. Other flags
may come later to report multiplexing capabilities or not, support
of client/server sides etc.
2017-12-20 16:31:30 +01:00
Etienne Carriere
aec8989e53 MINOR: spoe: add force-set-var option in spoe-agent configuration
For security reasons, the spoe filter was only able to change values of
existing variables. In specific cases (ex : with LUA code), the name of
variables are unknown at the configuration parsing phase.
The force-set-var option can be enabled to register all variables.
2017-12-20 08:55:18 +01:00
Willy Tarreau
3c8294b607 MINOR: conn_stream: add new flag CS_FL_RCV_MORE to indicate pending data
Due to the nature of multiplexed protocols, it will often happen that
some operations are only performed on full frames, preventing any partial
operation from being performed. HTTP/2 is one such example. The current
MUX API causes a problem here because the rcv_buf() function has no way
to let the stream layer know that some data could not be read due to a
lack of room in the buffer, but that data are definitely present. The
problem with this is that the stream layer might not know it needs to
call the function again after it has made some room. And if the frame
in the buffer is not followed by any other, nothing will move anymore.

This patch introduces a new conn_stream flag CS_FL_RCV_MORE whose purpose
is to indicate on the stream that more data than what was received are
already available for reading as soon as more room will be available in
the buffer.

This patch doesn't make use of this flag yet, it only declares it. It is
expected that other similar flags may come in the future, such as reports
of pending end of stream, errors or any such event that might save the
caller from having to poll, or simply let it know that it can take some
actions after having processed data.
2017-12-10 21:13:25 +01:00
Thierry FOURNIER
cb14688496 BUG/MEDIUM: lua/notification: memory leak
The thread patches adds refcount for notifications. The notifications are
used with the Lua cosocket. These refcount free the notifications when
the session is cleared. In the Lua task case, it not have sessions, so
the nofications are never cleraed.

This patch adds a garbage collector for signals. The garbage collector
just clean the notifications for which the end point is disconnected.

This patch should be backported in 1.8
2017-12-10 19:38:58 +01:00
Thierry FOURNIER
d5b79835f8 DOC: notifications: add precisions about thread usage
Precise the terms of use the notification functions.
2017-12-10 19:38:55 +01:00
Emeric Brun
ece0c334bd BUG/MEDIUM: ssl engines: Fix async engines fds were not considered to fix fd limit automatically.
The number of async fd is computed considering the maxconn, the number
of sides using ssl and the number of engines using async mode.

This patch should be backported on haproxy 1.8
2017-12-06 14:17:41 +01:00
Willy Tarreau
6c71e4696b BUG/MAJOR: hpack: don't pretend large headers fit in empty table
In hpack_dht_make_room(), we try to fulfill this rule form RFC7541#4.4 :

 "It is not an error to attempt to add an entry that is larger than the
  maximum size; an attempt to add an entry larger than the maximum size
  causes the table to be emptied of all existing entries and results in
  an empty table."

Unfortunately it is not consistent with the way it's used in
hpack_dht_insert() as this last one will consider a success as a
confirmation it can copy the header into the table, and a failure as
an indexing error. This results in the two following issues :
  - if a client sends too large a header into an empty table, this
    header may overflow the table. Fortunately, most clients send
    small headers like :authority first, and never mark headers that
    don't fit into the table as indexable since it is counter-productive ;

  - if a client sends too large a header into a populated table, the
    operation fails after the table is totally flushed and the request
    is not processed.

This patch fixes the two issues at once :
  - a header not fitting into an empty table is always a sign that it
    will never fit ;
  - not fitting into the table is not an error

Thanks to Yves Lafon for reporting detailed traces demonstrating this
issue. This fix must be backported to 1.8.
2017-12-04 18:06:51 +01:00
Willy Tarreau
d85ba4e092 BUG/MINOR: hpack: reject invalid header index
If the hpack decoder sees an invalid header index, it emits value
"### ERR ###" that was used during debugging instead of rejecting the
block. This is harmless, and was detected by h2spec.

To backport to 1.8.
2017-12-03 21:08:39 +01:00
Emeric Brun
0fed0b0a38 BUG/MEDIUM: peers: fix some track counter rules dont register entries for sync.
This BUG was introduced with:
'MEDIUM: threads/stick-tables: handle multithreads on stick tables'

The API was reviewed to handle stick table entry updates
asynchronously and the caller must now call a 'stkable_touch_*'
function each time the content of an entry is modified to
register the entry to be synced.

There was missing call to stktable_touch_* resulting in
not propagated entries to remote peers (or local one during reload)
2017-11-29 19:16:22 +01:00
Willy Tarreau
ec7464726f BUILD: checks: don't include server.h
server.h needs checks.h since it references the struct check, but depending
on the include order it will fail if check.h is included first due to this
one including server.h in turn while it doesn't need it.
2017-11-29 10:54:05 +01:00
Willy Tarreau
b306650c2a [RELEASE] Released version 1.9-dev0
Released version 1.9-dev0 with the following main changes :
    - BUG/MEDIUM: stream: don't automatically forward connect nor close
    - BUG/MAJOR: stream: ensure analysers are always called upon close
    - BUG/MINOR: stream-int: don't try to read again when CF_READ_DONTWAIT is set
    - MEDIUM: mworker: Add systemd `Type=notify` support
    - BUG/MEDIUM: cache: free callback to remove from tree
    - CLEANUP: cache: remove unused struct
    - MEDIUM: cache: enable the HTTP analysers
    - CLEANUP: cache: remove wrong comment
    - MINOR: threads/atomic: rename local variables in macros to avoid conflicts
    - MINOR: threads/plock: rename local variables in macros to avoid conflicts
    - MINOR: threads/atomic: implement pl_mb() in asm on x86
    - MINOR: threads/atomic: implement pl_bts() on non-x86
    - MINOR: threads/build: atomic: replace the few inlines with macros
    - BUILD: threads/plock: fix a build issue on Clang without optimization
    - BUILD: ebtree: don't redefine types u32/s32 in scope-aware trees
    - BUILD: compiler: add a new type modifier __maybe_unused
    - BUILD: h2: mark some inlined functions "unused"
    - BUILD: server: check->desc always exists
    - BUG/MEDIUM: h2: properly report connection errors in headers and data handlers
    - MEDIUM: h2: add a function to emit an HTTP/1 request from a headers list
    - MEDIUM: h2: change hpack_decode_headers() to only provide a list of headers
    - BUG/MEDIUM: h2: always reassemble the Cookie request header field
    - BUG/MINOR: systemd: ignore daemon mode
    - CONTRIB: spoa_example: allow to compile outside HAProxy.
    - CONTRIB: spoa_example: remove bref, wordlist, cond_wordlist
    - CONTRIB: spoa_example: remove last dependencies on type "sample"
    - CONTRIB: spoa_example: remove SPOE enums that are useless for clients
    - CLEANUP: cache: reorder includes
    - MEDIUM: shctx: use unsigned int for len and block_count
    - MEDIUM: cache: "show cache" on the cli
    - BUG/MEDIUM: cache: use key=0 as a condition for freeing
    - BUG/MEDIUM: cache: refcount forbids to free the objects
    - BUG/MEDIUM: cache fix cli_kws structure
    - BUG/MEDIUM: deinit: correctly deinitialize the proxy and global listener tasks
    - BUG/MINOR: ssl: Always start the handshake if we can't send early data.
    - MINOR: ssl: Don't disable early data handling if we could not write.
    - MINOR: pools: prepare functions to override malloc/free in pools
    - MINOR: pools: implement DEBUG_UAF to detect use after free
    - BUG/MEDIUM: threads/time: fix time drift correction
    - BUG/MEDIUM: threads/time: maintain a common time reference between all threads
    - MINOR: sample: Add "thread" sample fetch
    - BUG/MINOR: Use crt_base instead of ca_base when crt is parsed on a server line
    - BUG/MINOR: stream: fix tv_request calculation for applets
    - BUG/MAJOR: h2: always remove a stream from the send list before freeing it
    - BUG/MAJOR: threads/task: dequeue expired tasks under the WQ lock
    - MINOR: ssl: Handle reading early data after writing better.
    - MINOR: mux: Make sure every string is woken up after the handshake.
    - MEDIUM: cache: store sha1 for hashing the cache key
    - MINOR: http: implement the "http-request reject" rule
    - MINOR: h2: send RST_STREAM before GOAWAY on reject
    - MEDIUM: h2: don't gracefully close the connection anymore on Connection: close
    - MINOR: h2: make use of client-fin timeout after GOAWAY
    - MEDIUM: config: ensure that tune.bufsize is at least 16384 when using HTTP/2
    - MINOR: ssl: Handle early data with BoringSSL
    - BUG/MEDIUM: stream: always release the stream-interface on abort
    - BUG/MEDIUM: cache: free ressources in chn_end_analyze
    - MINOR: cache: move the refcount decrease in the applet release
    - BUG/MINOR: listener: Allow multiple "process" options on "bind" lines
    - MINOR: config: Support a range to specify processes in "cpu-map" parameter
    - MINOR: config: Slightly change how parse_process_number works
    - MINOR: config: Export parse_process_number and use it wherever it's applicable
    - MINOR: standard: Add my_ffsl function to get the position of the bit set to one
    - MINOR: config: Add auto-increment feature for cpu-map
    - MINOR: config: Support partial ranges in cpu-map directive
    - MINOR:: config: Remove thread-map directive
    - MINOR: config: Add the threads support in cpu-map directive
    - MINOR: config: Add threads support for "process" option on "bind" lines
    - MEDIUM: listener: Bind listeners on a thread subset if specified
    - CLEANUP: debug: Use DPRINTF instead of fprintf into #ifdef DEBUG_FULL/#endif
    - CLEANUP: log: Rename Alert/Warning in ha_alert/ha_warning
    - MINOR/CLEANUP: proxy: rename "proxy" to "proxies_list"
    - CLEANUP: pools: rename all pool functions and pointers to remove this "2"
    - DOC: update the roadmap file with the latest changes merged in 1.8
    - DOC: fix mangled version in peers protocol documentation
    - DOC: add initial peers protovol v2.0 documentation.
    - DOC: mention William as maintainer of the cache and master-worker
    - DOC: add Christopher and Emeric as maintainers of the threads
    - MINOR: cache: replace a fprint() by an abort()
    - MEDIUM: cache: max-age configuration keyword
    - DOC: explain HTTP2 timeout behavior
    - DOC: cache: configuration and management
    - MAJOR: mworker: exits the master on failure
    - BUG/MINOR: threads: don't drop "extern" on the lock in include files
    - MINOR: task: keep a pointer to the currently running task
    - MINOR: task: align the rq and wq locks
    - MINOR: fd: cache-align fdtab and fdcache locks
    - MINOR: buffers: cache-align buffer_wq_lock
    - CLEANUP: server: reorder some fields in struct server to save 40 bytes
    - CLEANUP: proxy: slightly reorder the struct proxy to reduce holes
    - CLEANUP: checks: remove 16 bytes of holes in struct check
    - CLEANUP: cache: more efficiently pack the struct cache
    - CLEANUP: fd: place the lock at the beginning of struct fdtab
    - CLEANUP: pools: align pools on a cache line
    - DOC: config: add a few bits about how to configure HTTP/2
    - BUG/MAJOR: threads/queue: avoid recursive locking in pendconn_get_next_strm()
    - BUILD: Makefile: reorder object files by size
2017-11-26 19:50:17 +01:00
Willy Tarreau
103e5663c8 BUG/MAJOR: threads/queue: avoid recursive locking in pendconn_get_next_strm()
pendconn_get_next_strm() is called from process_srv_queue() under the
server lock, and calls stream_add_srv_conn() with this lock held, while
the latter tries to take it again. This results in a deadlock when
a server's maxconn is reached and haproxy is built with thread support.
2017-11-26 18:50:30 +01:00
Willy Tarreau
1ca1b70cf9 CLEANUP: pools: align pools on a cache line
There are just a few pools, and they're stressed a lot, so it makes
sense to dedicate them a cache line to avoid contention and to place
the lock at the beginning.
2017-11-26 11:10:53 +01:00
Willy Tarreau
5809052ae1 CLEANUP: fd: place the lock at the beginning of struct fdtab
The struct is not cache line aligned but at least, every time the lock
will appear in the same cache line as the fd it will benefit from being
accessed first. This improves the performance by about 2% on fd-intensive
workloads with 4 threads.
2017-11-26 11:10:53 +01:00
Willy Tarreau
08eaa78739 CLEANUP: checks: remove 16 bytes of holes in struct check
These ones were easily recovered by swapping two members.
2017-11-26 11:10:52 +01:00
Willy Tarreau
a51108443e CLEANUP: proxy: slightly reorder the struct proxy to reduce holes
16 bytes were recovered from the struct doing minimal reordering.
2017-11-26 11:10:52 +01:00
Willy Tarreau
d7e33bbe2f CLEANUP: server: reorder some fields in struct server to save 40 bytes
In 1.8 many holes were introduced in struct server, so let's slightly
reorder a few fields to plug most of them. This saves 40 bytes in the
struct.
2017-11-26 11:10:52 +01:00
Willy Tarreau
8b94969054 MINOR: fd: cache-align fdtab and fdcache locks
These locks are highly contended, let's not make them share cache lines.
2017-11-26 11:10:51 +01:00
Willy Tarreau
53bae85b8e BUG/MINOR: threads: don't drop "extern" on the lock in include files
Commit 9dcf9b6 ("MINOR: threads: Use __decl_hathreads to declare locks")
accidently lost a few "extern" in certain lock declarations, possibly
causing certain entries to be declared at multiple places. Apparently
it hasn't caused any harm though.

The offending ones were :
  - fdtab_lock
  - fdcache_lock
  - poll_lock
  - buffer_wq_lock
2017-11-26 11:10:50 +01:00
William Lallemand
4cfede87a3 MAJOR: mworker: exits the master on failure
This patch changes the behavior of the master during the exit of a
worker.

When a worker exits with an error code, for example in the case of a
segfault, all workers are now killed and the master leaves.

If you don't want this behavior you can use the option
"master-worker no-exit-on-failure".
2017-11-24 22:48:27 +01:00
Willy Tarreau
bafbe01028 CLEANUP: pools: rename all pool functions and pointers to remove this "2"
During the migration to the second version of the pools, the new
functions and pool pointers were all called "pool_something2()" and
"pool2_something". Now there's no more pool v1 code and it's a real
pain to still have to deal with this. Let's clean this up now by
removing the "2" everywhere, and by renaming the pool heads
"pool_head_something".
2017-11-24 17:49:53 +01:00
Olivier Houchard
fbc74e8556 MINOR/CLEANUP: proxy: rename "proxy" to "proxies_list"
Rename the global variable "proxy" to "proxies_list".
There's been multiple proxies in haproxy for quite some time, and "proxy"
is a potential source of bugs, a number of functions have a "proxy" argument,
and some code used "proxy" when it really meant "px" or "curproxy". It worked
by pure luck, because it usually happened while parsing the config, and thus
"proxy" pointed to the currently parsed proxy, but we should probably not
rely on this.

[wt: some of these are definitely fixes that are worth backporting]
2017-11-24 17:21:27 +01:00
Christopher Faulet
767a84bcc0 CLEANUP: log: Rename Alert/Warning in ha_alert/ha_warning 2017-11-24 17:19:12 +01:00
Christopher Faulet
c644fa9bf5 MINOR: config: Add threads support for "process" option on "bind" lines
It is now possible on a "bind" line (or a "stats socket" line) to specify the
thread set allowed to process listener's connections. For instance:

    # HTTPS connections will be processed by all threads but the first and HTTP
    # connection will be processed on the first thread.
    bind *:80 process 1/1
    bind *:443 ssl crt mycert.pem process 1/2-
2017-11-24 15:38:50 +01:00
Christopher Faulet
cb6a94510d MINOR: config: Add the threads support in cpu-map directive
Now, it is possible to bind CPU at the thread level instead of the process level
by defining a thread set in "cpu-map" directives. Thus, its format is now:

  cpu-map [auto:]<process-set>[/<thread-set>] <cpu-set>...

where <process-set> and <thread-set> must follow the format:

  all | odd | even | number[-[number]]

Having a process range and a thread range in same time with the "auto:" prefix
is not supported. Only one range is supported, the other one must be a fixed
number. But it is allowed when there is no "auto:" prefix.

Because it is possible to define a mapping for a process and another for a
thread on this process, threads will be bound on the intersection of their
mapping and the one of the process on which they are attached. If the
intersection is null, no specific binding will be set for the threads.
2017-11-24 15:38:50 +01:00
Christopher Faulet
26028f6209 MINOR: config: Add auto-increment feature for cpu-map
The prefix "auto:" can be added before the process set to let HAProxy
automatically bind a process to a CPU by incrementing process and CPU sets. To
be valid, both sets must have the same size. No matter the declaration order of
the CPU sets, it will be bound from the lower to the higher bound.

  Examples:
      # all these lines bind the process 1 to the cpu 0, the process 2 to cpu 1
      #  and so on.
      cpu-map auto:1-4   0-3
      cpu-map auto:1-4   0-1 2-3
      cpu-map auto:1-4   3 2 1 0

      # bind each process to exaclty one CPU using all/odd/even keyword
      cpu-map auto:all   0-63
      cpu-map auto:even  0-31
      cpu-map auto:odd   32-63

      # invalid cpu-map because process and CPU sets have different sizes.
      cpu-map auto:1-4   0    # invalid
      cpu-map auto:1     0-3  # invalid
2017-11-24 15:38:49 +01:00
Christopher Faulet
ff8131861f MINOR: standard: Add my_ffsl function to get the position of the bit set to one 2017-11-24 15:38:49 +01:00
Christopher Faulet
f1f0c5f591 MINOR: config: Export parse_process_number and use it wherever it's applicable
This function is used when "bind-process" directive is parsed and when "process"
parameter on a "bind" or a "stats socket" line is parsed.
2017-11-24 15:38:49 +01:00
William Lallemand
f528fff46b MEDIUM: cache: store sha1 for hashing the cache key
The cache was relying on the txn->uri for creating its key, which was a
big problem when there was no log activated.

This patch does a sha1 of the host + uri, and stores it in the txn.
When a object is stored, the eb32node uses the first 32 bits of the hash
as a key, and the whole hash is stored in the cache entry.

During a lookup, the truncated hash is used, and when it matches an
entry we check the real sha1.
2017-11-23 20:20:04 +01:00
Olivier Houchard
90084a133d MINOR: ssl: Handle reading early data after writing better.
It can happen that we want to read early data, write some, and then continue
reading them.
To do so, we can't reuse tmp_early_data to store the amount of data sent,
so introduce a new member.
If we read early data, then ssl_sock_to_buf() is now the only responsible
for getting back to the handshake, to make sure we don't miss any early data.
2017-11-23 19:35:28 +01:00
Willy Tarreau
158fa75811 MINOR: pools: implement DEBUG_UAF to detect use after free
This code has been used successfully a few times in the past to detect
that a pool was used after being freed. Its main goal is to allocate a
full page for each object so that they are always released individually
and unmapped from memory. This way if any part of the code reference the
object after is was freed and before it is reallocated, a segv occurs at
the exact offending location. It does a few extra things such as writing
to the memory area before freeing to detect double-frees and free of
read-only areas, and placing the data at the end of the page instead of
the beginning so that out of bounds accesses are easier to spot. The
amount of memory used with this is huge (about 10 times the regular
usage) but it can be useful sometimes.
2017-11-22 19:43:57 +01:00
Willy Tarreau
f13322ede1 MINOR: pools: prepare functions to override malloc/free in pools
This will be useful to add some debugging capabilities. For now it
changes nothing.
2017-11-22 19:27:44 +01:00
William Lallemand
111bfef33c MEDIUM: shctx: use unsigned int for len and block_count
Allows bigger objects to be cached in the shctx, the first
implementation was only storing small ssl session, but we want to store
bigger HTTP response.
2017-11-21 21:35:04 +01:00
Willy Tarreau
59a10fb53d MEDIUM: h2: change hpack_decode_headers() to only provide a list of headers
The current H2 to H1 protocol conversion presents some issues which will
require to perform some processing on certain headers before writing them
so it's not possible to convert HPACK to H1 on the fly.

This commit modifies the headers decoding so that it now works in two
phases : hpack_decode_headers() only decodes the HPACK stream in the
HEADERS frame and puts the result into a list. Headers which require
storage (huffman-compressed or from the dynamic table) are stored in
a chunk allocated by the H2 demuxer. Then once the headers are properly
decoded into this list, h2_make_h1_request() is called with this list
to produce the HTTP/1.1 request into the destination buffer. The list
necessarily enforces a limit. Here we use 2*MAX_HTTP_HDR, which means
that we can have as many individual cookies as we have regular headers
if a client decides to break their cookies into multiple values. This
seams reasonable and will allow the H1 parser to decide whether it's
too much or not.

Thus the output stream is not produced on the fly anymore and this will
permit to deal with certain corner cases like reparing the Cookie header
(which for now is not done).

In order to limit header duplication and parsing, the known pseudo headers
continue to be passed by their index : the name element in the list then
has a NULL pointer and the value is the pseudo header's index. Given that
these ones represent about half of the incoming requests and need to be
found quickly, it maintains an acceptable level of performance.

The code was significantly reduced by doing this because the orignal code
had to deal with HPACK and H1 combinations (eg: index vs not indexed, etc)
and now the HPACK decoding is totally focused on the decompression, and
the H1 encoding doesn't have to deal with the issue of wrapping input for
example.

One bug was addressed here (though it couldn't happen at the moment). The
H2 demuxer used to detect a failure to write the request into the H1 buffer
and would then detect if the output buffer wraps, realign it and try again.
The problem by doing so was that the HPACK context was already modified and
not rewindable. Thus the size check is now performed first and a failure is
reported if it doesn't fit.
2017-11-21 21:13:36 +01:00
Willy Tarreau
f24ea8e45e MEDIUM: h2: add a function to emit an HTTP/1 request from a headers list
The current H2 to H1 protocol conversion presents some issues which will
require to perform some processing on certain headers before writing them
so it's not possible to convert HPACK to H1 on the fly.

Here we introduce a function which performs half of what hpack_decode_header()
used to do, which is to take a list of headers on input and emit the
corresponding request in HTTP/1.1 format. The code is the same and functions
were renamed to be prefixed with "h2" instead of "hpack", though it ends
up being simpler as the various HPACK-specific cases could be fused into
a single one (ie: add header).

Moving this part here makes a lot of sense as now this code is specific to
what is documented in HTTP/2 RFC 7540 and will be able to deal with special
cases related to H2 to H1 conversion enumerated in section 8.1.

Various error codes which were previously assigned to HPACK were never
used (aside being negative) and were all replaced by -1 with a comment
indicating what error was detected. The code could be further factored
thanks to this but this commit focuses on compatibility first.

This code is not yet used but builds fine.
2017-11-21 21:13:33 +01:00
Willy Tarreau
dbd25fc75a BUILD: compiler: add a new type modifier __maybe_unused
While gcc only emits warnings about unused static functions, Clang also
emits such a warning when the functions are inlined. This is a bit
annoying at certain places where functions are provided to manipulate
multiple data types and are not yet used. Let's have a type modifier
"__maybe_unused" which sets the "unused" attribute like the Linux kernel
does. It's elegant as it allows the code author to indicate that it knows
that this element might be unused. It works on variables as well, which
is convenient to remove ifdefs around local variables in certain functions,
but doesn't work on labels.
2017-11-20 21:27:27 +01:00
Willy Tarreau
2532bd2f81 BUILD: threads/plock: fix a build issue on Clang without optimization
[ plock commit 4c53fd3a0b2b1892817cebd0db012a52f4087850 ]

Pieter Baauw reported a build issue affecting haproxy after plock was
included. It happens that expressions of the form :

     if ((const) ? (expr1) : (expr2))
       do_something()

always produce code for both expr1 and expr2 on Clang when building
without optimization. The resulting asm code is even funny, basically
doing :

     mov reg, 1
     cmp reg, 1
     ...

This causes our sizeof() tests to fail to build because we purposely
dereference a fake function that reports the location and nature of the
inconsistency, but this fake function appears in the object code despite
all conditions being there to avoid it.

However the compiler is still smart enough to optimize away code doing

    if (const)
       do_something()

So we simply repeat the condition before do_something(), and the dummy
function is not referenced anymore unless really required.
2017-11-20 21:06:35 +01:00
Willy Tarreau
b5f271555e MINOR: threads/build: atomic: replace the few inlines with macros
[ plock commit 61e255286ae32e83e1a3174dd7c49eda99880a8b]

There are a few inlines such as pl_barrier() and pl_cpu_relax() which
are used a lot. Unfortunately, while building test code at -O0, inlining
is disabled and these ones are called a lot and show up a lot in any
profile, are traced into when single-stepping with a debugger, etc, thus
they are polluting the landscape. Since they're single-asm statements,
there is no reason for not turning them into macros.

The result becomes fairly visible here at -O0 :

  $ size latency.inline latency.macro
     text    data     bss     dec     hex filename
    11431     692     656   12779    31eb treelock.inline
    10967     692     656   12315    301b treelock.macro

And it was verified that regularly optimized code remains strictly identical.
2017-11-20 21:06:35 +01:00
Willy Tarreau
d0d8ba59d3 MINOR: threads/atomic: implement pl_bts() on non-x86
[ plock commit da17ba320aad3a8faf08e36fca604de9cad21fdd ]

This one was missing, it can be done using sync_fetch_and_or().
2017-11-20 21:06:03 +01:00
Willy Tarreau
01b8398b9e MINOR: threads/atomic: implement pl_mb() in asm on x86
[ plock commit 44081ea493dd78dab48076980e881748e9b33db5 ]

Older compilers (eg: gcc 3.4) don't provide __sync_synchronize() so let's
do it by hand on this platform.
2017-11-20 20:45:47 +01:00
Willy Tarreau
f7ba77eb80 MINOR: threads/plock: rename local variables in macros to avoid conflicts
[ plock commit b155d5c762fb9a9793911881f80e61faa6b0e889 ]

Local variables "l", "i" and "ret" were renamed "__pl_l", "__pl_i" and
"__pl_r" respectively, to limit the risk of conflicts with existing
variables in application code.
2017-11-20 20:45:43 +01:00
Willy Tarreau
98409e34ca MINOR: threads/atomic: rename local variables in macros to avoid conflicts
[ plock commit bfac5887ebabb8ef753b0351f162265767eb219b ]

Local variable "t" was renamed "__pl_t" to limit the risk of conflicts
with existing variables in application code.
2017-11-20 20:45:38 +01:00
William Lallemand
71bd11a1f3 MEDIUM: cache: enable the HTTP analysers
Enable the same analysers as the stats applet.
Allows keepalive and termination flags to work.
2017-11-20 19:22:27 +01:00
William Lallemand
44e259c0b7 CLEANUP: cache: remove unused struct
Remove unused structure which remain from old dev.
2017-11-20 19:22:27 +01:00
Tim Duesterhus
d6942c8297 MEDIUM: mworker: Add systemd Type=notify support
This patch adds support for `Type=notify` to the systemd unit.

Supporting `Type=notify` improves both starting as well as reloading
of the unit, because systemd will be let known when the action completed.

See this quote from `systemd.service(5)`:
> Note however that reloading a daemon by sending a signal (as with the
> example line above) is usually not a good choice, because this is an
> asynchronous operation and hence not suitable to order reloads of
> multiple services against each other. It is strongly recommended to
> set ExecReload= to a command that not only triggers a configuration
> reload of the daemon, but also synchronously waits for it to complete.

By making systemd aware of a reload in progress it is able to wait until
the reload actually succeeded.

This patch introduces both a new `USE_SYSTEMD` build option which controls
including the sd-daemon library as well as a `-Ws` runtime option which
runs haproxy in master-worker mode with systemd support.

When haproxy is running in master-worker mode with systemd support it will
send status messages to systemd using `sd_notify(3)` in the following cases:

- The master process forked off the worker processes (READY=1)
- The master process entered the `mworker_reload()` function (RELOADING=1)
- The master process received the SIGUSR1 or SIGTERM signal (STOPPING=1)

Change the unit file to specify `Type=notify` and replace master-worker
mode (`-W`) with master-worker mode with systemd support (`-Ws`).

Future evolutions of this feature could include making use of the `STATUS`
feature of `sd_notify()` to send information about the number of active
connections to systemd. This would require bidirectional communication
between the master and the workers and thus is left for future work.
2017-11-20 18:39:41 +01:00
Olivier Houchard
e6060c5d87 MINOR: SSL: Store the ASN1 representation of client sessions.
Instead of storing the SSL_SESSION pointer directly in the struct server,
store the ASN1 representation, otherwise, session resumption is broken with
TLS 1.3, when multiple outgoing connections want to use the same session.
2017-11-16 19:03:32 +01:00
Christopher Faulet
595d7b72a6 MINOR: applets: Use a bitfield to track applets activity per-thread
a bitfield has been added to know if there are runnable applets for a
thread. When an applet is woken up, the bits corresponding to its thread_mask
are set. When all active applets for a thread is get to be processed, the thread
is removed from active ones by unsetting its tid_bit from the bitfield.
2017-11-16 11:19:46 +01:00
Christopher Faulet
3911ee85df MINOR: tasks: Use a bitfield to track tasks activity per-thread
a bitfield has been added to know if there are runnable tasks for a thread. When
a task is woken up, the bits corresponding to its thread_mask are set. When all
tasks for a thread have been evaluated without any wakeup, the thread is removed
from active ones by unsetting its tid_bit from the bitfield.
2017-11-16 11:19:46 +01:00
William Lallemand
75ea0a06b0 BUG/MEDIUM: mworker: does not close inherited FD
At the end of the master initialisation, a call to protocol_unbind_all()
was made, in order to close all the FDs.

Unfortunately, this function closes the inherited FDs (fd@), upon reload
the master wasn't able to reload a configuration with those FDs.

The create_listeners() function now store a flag to specify if the fd
was inherited or not.

Replace the protocol_unbind_all() by  mworker_cleanlisteners() +
deinit_pollers()
2017-11-15 19:53:33 +01:00
Willy Tarreau
9c1e15d8cd MINOR: tools: emphasize the node being worked on in the tree dump
Now we can show in dotted red the node being removed or surrounded in red
a node having been inserted, and add a description on the graph related to
the operation in progress for example.
2017-11-15 19:43:05 +01:00
Willy Tarreau
ed3cda02ae MINOR: tools: add a function to dump a scope-aware tree to a file
It emits a dump in DOT format for graphing purposes during debugging
sessions. It's convenient to dump the run queue.
2017-11-15 16:07:15 +01:00
Christopher Faulet
99bca65f53 BUG/MEDIUM: standard: itao_str/idx and quote_str/idx must be thread-local
This bug has an impact on the stats applet and easily leads to a crash of HAProxy.

This is specific to threads, no backport is needed.
2017-11-14 18:11:57 +01:00
Christopher Faulet
e9a896e09e BUG/MINOR: threads: tid_bit must be a unsigned long
This is specific to threads, no backport is needed.
2017-11-14 18:11:28 +01:00
Christopher Faulet
fa5c812a6b BUG/MINOR: buffers: Fix b_alloc_margin to be "fonctionnaly" thread-safe
b_alloc_margin is, strickly speeking, thread-safe. It will not crash
HAproxy. But its contract is not respected anymore in a multithreaded
environment. In this function, we need to be sure to have <margin> buffers
available in the pool after the allocation. So to have this guarantee, we must
lock the memory pool during all the operation. This also means, we must call
internal and lockless memory functions (prefixed with '__').

For the record, this patch fixes a pernicious bug happens after a soft reload
where some streams can be blocked infinitly, waiting for a buffer in the
buffer_wq list. This happens because, during a soft reload, pool_gc2 is called,
making some calls to b_alloc_fast fail.

This is specific to threads, no backport is needed.
2017-11-13 11:42:48 +01:00
Christopher Faulet
9dcf9b6f03 MINOR: threads: Use __decl_hathreads to declare locks
This macro should be used to declare variables or struct members depending on
the USE_THREAD compile option. It avoids the encapsulation of such declarations
between #ifdef/#endif. It is used to declare all lock variables.
2017-11-13 11:38:17 +01:00
Willy Tarreau
387bd4f69f CLEANUP: global: introduce variable pid_bit to avoid shifts with relative_pid
At a number of places, bitmasks are used for process affinity and to map
listeners to processes. Every time 1UL<<(relative_pid-1) is used. Let's
create a "pid_bit" variable corresponding to this value to clean this up.
2017-11-10 19:08:14 +01:00
Willy Tarreau
28b55c6fed CLEANUP: mux: remove the unused "release()" function
In commit 53a4766 ("MEDIUM: connection: start to introduce a mux layer
between xprt and data") we introduced a release() function which ends
up never being used. Let's get rid of it now.
2017-11-10 16:43:05 +01:00
Willy Tarreau
aa39860aef MINOR: tools: don't use unlikely() in hex2i()
This small inline function causes some pain to the compiler when used
inside other functions due to its use of the unlikely() hint for non-digits.
It causes the letters to be processed far away in the calling function and
makes the code less efficient. Removing these unlikely() hints has increased
the chunk size parsing by around 5%.
2017-11-10 11:19:54 +01:00
Willy Tarreau
b15e3fefc9 BUG/MEDIUM: h1: ensure the chunk size parser can deal with full buffers
The HTTP/1 code always has the reserve left available so the buffer is
never full there. But with HTTP/2 we have to deal with full buffers,
and it happens that the chunk size parser cannot tell the difference
between a full buffer and an empty one since it compares the start and
the stop pointer.

Let's change this to instead deal with the number of bytes left to process.

As a side effect, this code ends up being about 10% faster than the previous
one, even on HTTP/1.
2017-11-10 11:17:08 +01:00
Christopher Faulet
c5a9d5bf23 BUG/MEDIUM: stream-int: Don't loss write's notifs when a stream is woken up
When a write activity is reported on a channel, it is important to keep this
information for the stream because it take part on the analyzers' triggering.
When some data are written, the flag CF_WRITE_PARTIAL is set. It participates to
the task's timeout updates and to the stream's waking. It is also used in
CF_MASK_ANALYSER mask to trigger channels anaylzers. In the past, it was cleared
by process_stream. Because of a bug (fixed in commit 95fad5ba4 ["BUG/MAJOR:
stream-int: don't re-arm recv if send fails"]), It is now cleared before each
send and in stream_int_notify. So it is possible to loss this information when
process_stream is called, preventing analyzers to be called, and possibly
leading to a stalled stream.

Today, this happens in HTTP2 when you call the stat page or when you use the
cache filter. In fact, this happens when the response is sent by an applet. In
HTTP1, everything seems to work as expected.

To fix the problem, we need to make the difference between the write activity
reported to lower layers and the one reported to the stream. So the flag
CF_WRITE_EVENT has been added to notify the stream of the write activity on a
channel. It is set when a send succedded and reset by process_stream. It is also
used in CF_MASK_ANALYSER. finally, it is checked in stream_int_notify to wake up
a stream and in channel_check_timeouts.

This bug is probably present in 1.7 but it seems to have no effect. So for now,
no needs to backport it.
2017-11-09 15:16:05 +01:00
Willy Tarreau
1b4cf9b754 BUG/MINOR: h1: the HTTP/1 make status code parser check for digits
The H1 parser used by the H2 gateway was a bit lax and could validate
non-numbers in the status code. Since it computes the code on the fly
it's problematic, as "30:" is read as status code 310. Let's properly
check that it's a number now. No backport needed.
2017-11-09 11:15:45 +01:00
Olivier Houchard
522eea7110 MINOR: ssl: Handle sending early data to server.
This adds a new keyword on the "server" line, "allow-0rtt", if set, we'll try
to send early data to the server, as long as the client sent early data, as
in case the server rejects the early data, we no longer have them, and can't
resend them, so the only option we have is to send back a 425, and we need
to be sure the client knows how to interpret it correctly.
2017-11-08 14:11:10 +01:00
Emeric Brun
d8b3b65faa BUG/MEDIUM: splice/threads: pipe reuse list was not protected.
The list is now protected using a global spinlock.
2017-11-07 14:47:28 +01:00
Christopher Faulet
2a944ee16b BUILD: threads: Rename SPIN/RWLOCK macros using HA_ prefix
This remove any name conflicts, especially on Solaris.
2017-11-07 11:10:24 +01:00
Olivier Houchard
55dcdf4c39 BUG/MINOR: dns: Don't try to get the server lock if it's already held.
dns_link_resolution() can be called with the server lock already held, so
don't attempt to lock it again in that case.
2017-11-06 18:34:24 +01:00
Willy Tarreau
88ac59be4d MINOR: threads: use faster locks for the spin locks
The spin locks used to rely on W locks, which involve a loop waiting
for readers to leave, and this doesn't happen here. It's more efficient
to use S locks instead, which are also mutually exclusive and do not
have this loop. This saves one test per spinlock and a few tens of
bytes allowing certain functions to be inlined.
2017-11-06 11:20:11 +01:00
Willy Tarreau
8d38805d3d MAJOR: task: make use of the scope-aware ebtree functions
Currently the task scheduler suffers from an O(n) lookup when
skipping tasks that are not for the current thread. The reason
is that eb32_lookup_ge() has no information about the current
thread so it always revisits many tasks for other threads before
finding its own tasks.

This is particularly visible with HTTP/2 since the number of
concurrent streams created at once causes long series of tasks
for the same stream in the scheduler. With only 10 connections
and 100 streams each, by running on two threads, the performance
drops from 640kreq/s to 11.2kreq/s! Lookup metrics show that for
only 200000 task lookups, 430 million skips had to be performed,
which means that on average, each lookup leads to 2150 nodes to
be visited.

This commit backports the principle of scope lookups for ebtrees
from the ebtree_v7 development tree. The idea is that each node
contains a mask indicating the union of the scopes for the nodes
below it, which is fed during insertion, and used during lookups.

Then during lookups, branches that do not contain any leaf matching
the requested scope are simply ignored. This perfectly matches a
thread mask, allowing a thread to only extract the tasks it cares
about from the run queue, and to always find them in O(log(n))
instead of O(n). Thus the scheduler uses tid_bit and
task->thread_mask as the ebtree scope here.

Doing this has recovered most of the performance, as can be seen on
the test below with two threads, 10 connections, 100 streams each,
and 1 million requests total :

                              Before     After    Gain
              test duration : 89.6s      4.73s     x19
    HTTP requests/s (DEBUG) : 11200     211300     x19
     HTTP requests/s (PROD) : 15900     447000     x28
             spin_lock time : 85.2s      0.46s    /185
            time per lookup : 13us       40ns     /325

Even when going to 6 threads (on 3 hyperthreaded CPU cores), the
performance stays around 284000 req/s, showing that the contention
is much lower.

A test showed that there's no benefit in using this for the wait queue
though.
2017-11-06 11:20:11 +01:00
Willy Tarreau
62a124977b MINOR: applets: no need to check for runqueue's emptiness in appctx_res_wakeup()
The __appctx_wakeup() function already does it. It matters with threads
enabled because it simplifies the code in appctx_res_wakeup() to get rid
of this test.
2017-11-05 12:01:11 +01:00
Willy Tarreau
bbd09b9306 BUG/MAJOR: thread/listeners: enable_listener must not call unbind_listener()
unbind_listener() takes the listener lock, which is already held by
enable_listener(). This situation happens when starting with nbproc > 1
with some bind lines limited to a certain process, because in this case
enable_listener() tries to stop unneeded listeners.

This commit introduces __do_unbind_listeners() which must be called with
the lock held, and makes enable_listener() use this one. Given that the
only return code has never been used and that it starts to make the code
more complicated to propagate it before throwing it to the trash, the
function's return type was changed to void.
2017-11-05 11:38:44 +01:00
David Carlier
5222d8eb25 BUG/MINOR: stdarg.h inclusion
Needed for the memvprintf part, the va_list type.
Spotted during OpenBSD build.
2017-11-03 15:04:09 +01:00
Willy Tarreau
4b75fffa2b BUG/MAJOR: buffers: fix get_buffer_nc() for data at end of buffer
This function incorrectly dealt with the case where data doesn't
wrap but lies at the end of the buffer, resulting in Lukas' reported
data corruption with HTTP/2. No backport is needed, it was introduced
for HTTP/2 in 1.8-dev.
2017-11-02 17:16:07 +01:00
Willy Tarreau
7c2a2ad65c BUG/MINOR: thread: fix a typo in the debug code
__spin_unlock() used to call RWLOCK_WRUNLOCK() to unlock in the
debug code. It's harmless as they happen to be identical.
2017-11-02 16:26:02 +01:00
William Lallemand
77c1197bfb MEDIUM: cache: deliver objects from cache
Lookup objects in the cache and deliver them using the http-request
action "cache-use".
2017-10-31 21:17:19 +01:00
William Lallemand
41db46035e MEDIUM: cache: configuration parsing and initialization
Parse a configuration section "cache" and a http-{response,request}
actions.

Example:

    listen frt
        mode http
        http-response cache-store foobar
        http-request cache-use foobar

    cache foobar
        total-max-size 4   # size in megabytes
2017-10-31 21:17:19 +01:00
Willy Tarreau
ffca736401 MINOR: h2: centralize all HTTP/2 protocol elements and constants
These constants from RFC7540 will be centralized into common/h2.h for
use by the future h2 mux and other places.
2017-10-31 18:03:24 +01:00
Willy Tarreau
1be4f3d8af MEDIUM: hpack: implement basic hpack encoding
For now it only supports literals and a bit of static header table
references for the 9 most common header field names (date, server,
content-type, content-length, last-modified, accept-ranges, etag,
cache-control, location).

A previous incarnation of this commit used to strip the forbidden H2
header names (connection, proxy-connection, upgrade, transfer-encoding,
keep-alive) but this is no longer the case as this filtering is irrelevant
to HPACK encoding and is specific to H2, so this will have to be done by
the caller.

It's quite not optimal but works fine enough to prepare some valid and
partially compressed responses during development.
2017-10-31 18:03:24 +01:00
Willy Tarreau
679790baae MINOR: hpack: implement the decoder
The decoder is now fully functional. It makes use of the dynamic header
table. Dynamic header table size updates are currently ignored, as our
initially advertised value is the highest we support. Strictly speaking,
the impact is that a client referencing a header field after such an
update wouldn't observe an error instead of the connection being dropped
if it was implemented.

Decoded header fields are copied into a target buffer in HTTP/1 format
using HTTP/1.1 as the version. The Host header field is automatically
appended if a ":authority" header field is present.

All decoded header fields can be displayed if the file is compiled with
DEBUG_HPACK.
2017-10-31 18:03:24 +01:00
Willy Tarreau
ce04094c4a MINOR: hpack: implement the header tables management
This code deals with header insertion, retrieval and eviction, as well
as with dynamic header table defragmentation. It is functional for use
as a decoder and was heavily tested in this context. There's still some
room for optimization (eg: the defragmentation code currently does it
in place using a memcpy).

Also for now the dynamic header table is allocated using malloc() while
a pool needs to be created instead.

This code was mostly imported from https://github.com/wtarreau/http2-exp
with "hpack_" prepended in front of most names to avoid risks of conflicts.
Some small cleanups and renamings were applied during the import. This
version must be considered more recent.

Some HPACK error codes were placed here (HPACK_ERR_*), not exactly because
they're needed by the decoder but they'll be needed by all callers. Maybe
a different location should be found.
2017-10-31 18:03:24 +01:00
Willy Tarreau
a004ade512 MINOR: hpack: implement the HPACK Huffman table decoder
The code was borrowed from the HPACK experimental implementations
available here :

    https://github.com/wtarreau/http2-exp

It contains the Huffman table as specified in RFC7541 Appendix B, and a
set of reverse tables used to decode a Huffman byte stream, and produced
by contrib/h2/gen-rht. The encoder is not finalized, it doesn't emit the
byte stream but this is not needed for now.
2017-10-31 18:03:24 +01:00
Willy Tarreau
436d333124 MEDIUM: connection: add a destroy callback
This callback will be used to release upper layers when a mux is in
use. Given that the mux can be asynchronously deleted, we need a way
to release the extra information such as the session.

This callback will be called directly by the mux upon releasing
everything and before the connection itself is released, so that
the callee can find its information inside the connection if needed.

The way it currently works is not perfect, and most likely this should
instead become a mux release callback, but for now we have no easy way
to add mux-specific stuff, and since there's one mux per connection,
it works fine this way.
2017-10-31 18:03:24 +01:00
Willy Tarreau
2c52a2b9ee MEDIUM: connection: make mux->detach() release the connection
For H2, only the mux's timeout or other conditions might cause a
release of the mux and the connection, no stream should be allowed
to kill such a shared connection. So a stream will only detach using
cs_destroy() which will call mux->detach() then free the cs.

For now it's only handled by mux_pt. The goal is that the data layer
never has to care about the connection, which will have to be released
depending on the mux's mood.
2017-10-31 18:03:24 +01:00
Willy Tarreau
6978db35e9 MINOR: connection: add cs_close() to close a conn_stream
This basically calls cs_shutw() followed by cs_shutr(). Both of them
are called in the most conservative mode so that any previous call is
still respected. The CS flags are cleared so that it can be reused
(this is important for connection retries when conn and CS are reused
without being reallocated).
2017-10-31 18:03:24 +01:00
Willy Tarreau
ecdb3fe9f4 MINOR: conn_stream: modify cs_shut{r,w} API to pass the desired mode
Now we can specify how we want to shutdown (drain vs reset, and normal
vs silent), and this propagates to the mux then the transport layer.
2017-10-31 18:03:23 +01:00
Willy Tarreau
79dadb5335 MINOR: conn_stream: new shutr/w status flags
In order to support all shutdown modes on the CS, we introduce the
following flags :
  CS_FL_SHRD : shut read, drain extra data
  CS_FL_SHRR : shut read, reset extra data
  CS_FL_SHWN : shut write, normal notification
  CS_FL_SHWS : shut write, silent mode (no notification)

And the following modes for shutr/shutw :

  CS_SHR_DRAIN, CS_SHR_RESET, CS_SHW_NORMAL, CS_SHW_SILENT.

Note: it's possible that we won't need to distinguish the two shutw
above as they're only an action.

For now they are not used.
2017-10-31 18:03:23 +01:00
Olivier Houchard
9aaf778129 MAJOR: connection : Split struct connection into struct connection and struct conn_stream.
All the references to connections in the data path from streams and
stream_interfaces were changed to use conn_streams. Most functions named
"something_conn" were renamed to "something_cs" for this. Sometimes the
connection still is what matters (eg during a connection establishment)
and were not always renamed. The change is significant and minimal at the
same time, and was quite thoroughly tested now. As of this patch, all
accesses to the connection from upper layers go through the pass-through
mux.
2017-10-31 18:03:23 +01:00
Willy Tarreau
63dd75d934 MINOR: connection: introduce the conn_stream manipulation functions
Most of the functions dealing with conn_streams are here. They act at
the data layer and interact with the mux. For now they are not used yet
but everything builds.
2017-10-31 18:03:23 +01:00
Olivier Houchard
8e6147292e MINOR: mux: add more methods to mux_ops
We'll need to support reading/writing from both sides, with buffers and
pipes, as well as retrieving/updating flags.
2017-10-31 18:03:23 +01:00
Olivier Houchard
e2b40b9eab MINOR: connection: introduce conn_stream
This patch introduces a new struct conn_stream. It's the stream-side of
a multiplexed connection. A pool is created and destroyed on exit. For
now the conn_streams are not used at all.
2017-10-31 18:03:23 +01:00
Willy Tarreau
2e0b2b5f83 MEDIUM: session: use the ALPN token and proxy mode to select the mux
When an incoming connection is made on an HTTP mode frontend, the
session now looks up the mux to use based on the ALPN token and the
proxy mode. This will allow easier mux registration, and we don't
need to hard-code the mux_pt_ops anymore.
2017-10-31 18:03:23 +01:00
Willy Tarreau
2386be64ba MINOR: connection: implement alpn registration of muxes
Selecting a mux based on ALPN and the proxy mode will quickly become a
pain. This commit provides new functions to register/lookup a mux based
on the ALPN string and the proxy mode to make this easier. Given that
we're not supposed to support a wide range of muxes, the lookup should
not have any measurable performance impact.
2017-10-31 18:03:23 +01:00
Willy Tarreau
53a4766e40 MEDIUM: connection: start to introduce a mux layer between xprt and data
For HTTP/2 and QUIC, we'll need to deal with multiplexed streams inside
a connection. After quite a long brainstorming, it appears that the
connection interface to the existing streams is appropriate just like
the connection interface to the lower layers. In fact we need to have
the mux layer in the middle of the connection, between the transport
and the data layer.

A mux can exist on two directions/sides. On the inbound direction, it
instanciates new streams from incoming connections, while on the outbound
direction it muxes streams into outgoing connections. The difference is
visible on the mux->init() call : in one case, an upper context is already
known (outgoing connection), and in the other case, the upper context is
not yet known (incoming connection) and will have to be allocated by the
mux. The session doesn't have to create the new streams anymore, as this
is performed by the mux itself.

This patch introduces this and creates a pass-through mux called
"mux_pt" which is used for all new connections and which only
calls the data layer's recv,send,wake() calls. One incoming stream
is immediately created when init() is called on the inbound direction.
There should not be any visible impact.

Note that the connection's mux is purposely not set until the session
is completed so that we don't accidently run with the wrong mux. This
must not cause any issue as the xprt_done_cb function is always called
prior to using mux's recv/send functions.
2017-10-31 18:03:23 +01:00
Willy Tarreau
b29dc95a97 MINOR: threads: add a portable barrier for threads and non-threads
HA_BARRIER() is just a simple memory barrier to prevent the compiler
from reordering our code.
2017-10-31 18:01:18 +01:00
Willy Tarreau
2510f702f9 MINOR: h1: add a function to measure the trailers length
This is needed in the H2->H1 gateway so that we know how long the trailers
block is in chunked encoding. It returns the number of bytes, or 0 if some
are missing, or -1 in case of parse error.
2017-10-31 17:18:10 +01:00
Willy Tarreau
f65610a83d CLEANUP: threads: rename process_mask to thread_mask
It was a leftover from the last cleaning session; this mask applies
to threads and calling it process_mask is a bit confusing. It's the
same in fd, task and applets.
2017-10-31 16:06:06 +01:00
Olivier Houchard
d16bfe6c01 BUG/MINOR: dns: Fix SRV records with the new thread code.
srv_set_fqdn() may be called with the DNS lock already held, but tries to
lock it anyway. So, add a new parameter to let it know if it was already
locked or not;
2017-10-31 15:47:55 +01:00
Willy Tarreau
a5e0590b80 BUILD: stick-tables: silence an uninitialized variable warning
Commit 819fc6f ("MEDIUM: threads/stick-tables: handle multithreads on
stick tables") introduced a valid warning about an uninitialized return
value in stksess_kill_if_expired(). It just happens that this result is
never used, so let's turn the function back to void as previously.
2017-10-31 15:45:42 +01:00
Emeric Brun
6e0128630b BUG/MAJOR: threads/freq_ctr: fix lock on freq counters.
The wrong bit was set to keep the lock on freq counter update. And the read
functions were re-worked to use volatile.

Moreover, when a freq counter is updated, it is now rotated only if the current
counter is in the past (now.tv_sec > ctr->curr_sec). It is important with
threads because the current time (now) is thread-local. So, rounded to the
second, the time may vary by more or less 1 second. So a freq counter rotated by
one thread may be see 1 second in the future. In this case, it is updated but
not rotated.
2017-10-31 13:58:33 +01:00
Christopher Faulet
cd7879adc2 BUG/MEDIUM: threads: Run the poll loop on the main thread too
There was a flaw in the way the threads was created. the main one was just used
to create all the others and just wait to exit. Now, it is used to run a poll
loop. So we only create nbthread-1 threads.

This also fixes a bug about the compression filter when there is only 1 thread
(nbthread == 1 or no threads support). The bug was in the way thread-local
resources was initialized. per-thread init/deinit callbacks were never called
for the main process. So, with nthread set to 1, some buffers remained
uninitialized.
2017-10-31 13:58:33 +01:00
Emeric Brun
9f0b458525 MEDIUM: threads/server: Use the server lock to protect health check and cli concurrency 2017-10-31 13:58:33 +01:00
Christopher Faulet
c2a89a6aed MINOR: threads/mailers: Add a lock to protect queues of email alerts 2017-10-31 13:58:33 +01:00
Christopher Faulet
cfda847643 MINOR: threads/checks: Add a lock to protect the pid list used by external checks 2017-10-31 13:58:33 +01:00
Christopher Faulet
6251902e67 MINOR: threads: Add thread-map config parameter in the global section
By default, no affinity is set for threads. To bind threads on CPU, you must
define a "thread-map" in the global section. The format is the same than the
"cpu-map" parameter, with a small difference. The process number must be
defined, with the same format than cpu-map ("all", "even", "odd" or a number
between 1 and 31/63).

A thread will be bound on the intersection of its mapping and the one of the
process on which it is attached. If the intersection is null, no specific bind
will be set for the thread.
2017-10-31 13:58:33 +01:00
Christopher Faulet
b2812a6240 MEDIUM: thread/dns: Make DNS thread-safe 2017-10-31 13:58:33 +01:00
Christopher Faulet
24289f2e07 MEDIUM: thread/spoe: Make the SPOE thread-safe
Because there is not migration mechanism yet, all runtime information about an
SPOE agent are thread-local and async exchanges with agents are disabled when we
have serveral threads. Howerver, pipelining is still available. So for now, the
thread part of the SPOE is pretty simple.
2017-10-31 13:58:33 +01:00
Thierry FOURNIER
738a6d76f6 MEDIUM: threads/tasks: Add lock around notifications
This patch add lock around some notification calls
2017-10-31 13:58:32 +01:00
Thierry FOURNIER
952939d294 MEDIUM: threads/xref: Convert xref function to a thread safe model
Ensure that the unlink is done safely between thread and that
the peer struct will not destroy between the usage of the peer.
2017-10-31 13:58:32 +01:00
Thierry FOURNIER
94a6bfce9b MEDIUM: threads/lua: Cannot acces to the socket if we try to access from another thread.
We have two y for nsuring that the data is not concurently manipulated:
 - locks
 - running task on the same thread.
locks are expensives, it is better to avoid it.

This patch cecks that the Lua task run on the same thread that
the stream associated to the coprocess.

TODO: in a next version, the error should be replaced by a yield
and thread migration request.
2017-10-31 13:58:32 +01:00
Thierry FOURNIER
61ba0e2b6d MEDIUM: threads/lua: Add locks around the Lua execution parts.
Note that the Lua processing is not really thread safe. It provides
heavy system which consists to add our own lock function in the Lua
code and recompile the library. This system will probably not accepted
by maintainers of various distribs.

Our main excution point of the Lua is the function lua_resume(). A
quick looking on the Lua sources displays a lua_lock() a the start
of function and a lua_unlock() at the end of the function. So I
conclude that the Lua thread safe mode just perform a mutex around
all execution. So I prefer to do this in the HAProxy code, it will be
easier for distro maintainers.

Note that the HAProxy lua functions rounded by the macro SET_SAFE_LJMP
and RESET_SAFE_LJMP manipulates the Lua stack, so it will be careful
to set mutex around these functions.
2017-10-31 13:58:32 +01:00
Christopher Faulet
8ca3b4bc46 MEDIUM: threads/compression: Make HTTP compression thread-safe 2017-10-31 13:58:32 +01:00
Christopher Faulet
71a6a8efaa MEDIUM: threads/filters: Add init/deinit callback per thread
Now, it is possible to define init_per_thread and deinit_per_thread callbacks to
deal with ressources allocation for each thread.

This is the filter responsibility to deal with concurrency. This is also the
filter responsibility to know if HAProxy is started with some threads. A good
way to do so is to check "global.nbthread" value. If it is greater than 1, then
_per_thread callbacks will be called.
2017-10-31 13:58:32 +01:00
Christopher Faulet
e95f2c3ef5 MEDIUM: thread/vars: Make vars thread-safe
A RW lock has been added to the vars structure to protect each list of
variables. And a global RW lock is used to protect registered names.

When a varibable is fetched, we duplicate sample data because the variable could
be modified by another thread.
2017-10-31 13:58:32 +01:00
Christopher Faulet
94b712337d MEDIUM: threads/freq_ctr: Make the frequency counters thread-safe
When a frequency counter must be updated, we use the curr_sec/curr_tick fields
as a lock, by setting the MSB to 1 in a compare-and-swap to lock and by reseting
it to unlock. And when we need to read it, we loop until the counter is
unlocked. This way, the frequency counters are thread-safe without any external
lock. It is important to avoid increasing the size of many structures (global,
proxy, server, stick_table).
2017-10-31 13:58:32 +01:00
Emeric Brun
b5997f740b MAJOR: threads/map: Make acls/maps thread safe
locks have been added in pat_ref and pattern_expr structures to protect all
accesses to an instance of on of them. Moreover, a global lock has been added to
protect the LRU cache used for pattern matching.

Patterns are now duplicated after a successfull matching, to avoid modification
by other threads when the result is used.

Finally, the function reloading a pattern list has been modified to be
thread-safe.
2017-10-31 13:58:32 +01:00
Emeric Brun
821bb9beaa MAJOR: threads/ssl: Make SSL part thread-safe
First, OpenSSL is now initialized to be thread-safe. This is done by setting 2
callbacks. The first one is ssl_locking_function. It handles the locks and
unlocks. The second one is ssl_id_function. It returns the current thread
id. During the init step, we create as much as R/W locks as needed, ie the
number returned by CRYPTO_num_locks function.

Next, The reusable SSL session in the server context is now thread-local.

Shctx is now also initialized if HAProxy is started with several threads.

And finally, a global lock has been added to protect the LRU cache used to store
generated certificates. The function ssl_sock_get_generated_cert is now
deprecated because the retrieved certificate can be removed by another threads
in same time. Instead, a new function has been added,
ssl_sock_assign_generated_cert. It must be used to search a certificate in the
cache and set it immediatly if found.
2017-10-31 13:58:32 +01:00
Emeric Brun
6b35e9bfbf MEDIUM: threads/stream: Make streams list thread safe
Adds a global lock to protect the full streams list used to dump
sessions on stats socket.
2017-10-31 13:58:32 +01:00
Emeric Brun
a1dd243adb MAJOR: threads/buffer: Make buffer wait queue thread safe
Adds a global lock to protect the buffer wait queue.
2017-10-31 13:58:31 +01:00
Emeric Brun
80527f5bb6 MAJOR: threads/peers: Make peers thread safe
A lock is used to protect accesses to a peer structure.

A the lock is taken in the applet handler when the peer is identified
and released living the applet handler.

In the scheduling task for peers section, the lock is taken for every
listed peer and released at the end of the process task function.

The peer 'force shutdown' function was also re-worked.
2017-10-31 13:58:31 +01:00
Emeric Brun
1138fd0c57 MAJOR: threads/applet: Handle multithreading for applets
A global lock has been added to protect accesses to the list of active
applets. A process mask has also been added on each applet. Like for FDs and
tasks, it is used to know which threads are allowed to process an
applet. Because applets are, most of time, linked to a session, it should be
sticky on the same thread. But in all cases, it is the responsibility of the
applet handler to lock what have to be protected in the applet context.
2017-10-31 13:58:31 +01:00
Emeric Brun
272e252e61 MINOR: threads/regex: Change Regex trash buffer into a thread local variable 2017-10-31 13:58:31 +01:00
Emeric Brun
8c1aaa201a MEDIUM: threads/http: Make http_capture_bad_message thread-safe
This is done by passing the right stream's proxy (the frontend or the backend,
depending on the context) to lock the error snapshot used to store the error
info.
2017-10-31 13:58:31 +01:00
Emeric Brun
819fc6f563 MEDIUM: threads/stick-tables: handle multithreads on stick tables
The stick table API was slightly reworked:

A global spin lock on stick table was added to perform lookup and
insert in a thread safe way. The handling of refcount on entries
is now handled directly by stick tables functions under protection
of this lock and was removed from the code of callers.

The "stktable_store" function is no more externalized and users should
now use "stktable_set_entry" in any case of insertion. This last one performs
a lookup followed by a store if not found. So the code using "stktable_store"
was re-worked.

Lookup, and set_entry functions automatically increase the refcount
of the returned/stored entry.

The function "sticktable_touch" was renamed "sticktable_touch_local"
and is now able to decrease the refcount if last arg is set to true. It
is allowing to release the entry without taking the lock twice.

A new function "sticktable_touch_remote" is now used to insert
entries coming from remote peers at the right place in the update tree.
The code of peer update was re-worked to use this new function.
This function is also able to decrease the refcount if wanted.

The function "stksess_kill" also handle a parameter to decrease
the refcount on the entry.

A read/write lock is added on each entry to protect the data content
updates of the entry.
2017-10-31 13:58:31 +01:00
Christopher Faulet
5b51755aef MEDIUM: threads/lb: Make LB algorithms (lb_*.c) thread-safe
A lock for LB parameters has been added inside the proxy structure and atomic
operations have been used to update server variables releated to lb.

The only significant change is about lb_map. Because the servers status are
updated in the sync-point, we can call recalc_server_map function synchronously
in map_set_server_status_up/down function.
2017-10-31 13:58:31 +01:00
Christopher Faulet
5d42e099c5 MINOR: threads/server: Add a lock to deal with insert in updates_servers list
This list is used to save changes on the servers state. So when serveral threads
are used, it must be locked. The changes are then applied in the sync-point. To
do so, servers_update_status has be moved in the sync-point. So this is useless
to lock it at this step because the sync-point is a protected area by iteself.
2017-10-31 13:58:31 +01:00
Christopher Faulet
29f77e846b MEDIUM: threads/server: Add a lock per server and atomically update server vars
The server's lock is use, among other things, to lock acces to the active
connection list of a server.
2017-10-31 13:58:31 +01:00
Christopher Faulet
40a007cf2a MEDIUM: threads/server: Make connection list (priv/idle/safe) thread-safe
For now, we have a list of each type per thread. So there is no need to lock
them. This is the easiest solution for now, but not the best one because there
is no sharing between threads. An idle connection on a thread will not be able
be used by a stream on another thread. So it could be a good idea to rework this
patch later.
2017-10-31 13:58:30 +01:00
Christopher Faulet
ff8abcd31d MEDIUM: threads/proxy: Add a lock per proxy and atomically update proxy vars
Now, each proxy contains a lock that must be used when necessary to protect
it. Moreover, all proxy's counters are now updated using atomic operations.
2017-10-31 13:58:30 +01:00
Christopher Faulet
8d8aa0d681 MEDIUM: threads/listeners: Make listeners thread-safe
First, we use atomic operations to update jobs/totalconn/actconn variables,
listener's nbconn variable and listener's counters. Then we add a lock on
listeners to protect access to their information. And finally, listener queues
(global and per proxy) are also protected by a lock. Here, because access to
these queues are unusal, we use the same lock for all queues instead of a global
one for the global queue and a lock per proxy for others.
2017-10-31 13:58:30 +01:00
Christopher Faulet
b79a94c9f3 MEDIUM: threads/signal: Add a lock to make signals thread-safe
A global lock has been added to protect the signal processing. So when a signal
it triggered, only one thread will catch it.
2017-10-31 13:58:30 +01:00
Emeric Brun
c60def8368 MAJOR: threads/task: handle multithread on task scheduler
2 global locks have been added to protect, respectively, the run queue and the
wait queue. And a process mask has been added on each task. Like for FDs, this
mask is used to know which threads are allowed to process a task.

For many tasks, all threads are granted. And this must be your first intension
when you create a new task, else you have a good reason to make a task sticky on
some threads. This is then the responsibility to the process callback to lock
what have to be locked in the task context.

Nevertheless, all tasks linked to a session must be sticky on the thread
creating the session. It is important that I/O handlers processing session FDs
and these tasks run on the same thread to avoid conflicts.
2017-10-31 13:58:30 +01:00
Christopher Faulet
36716a7fec MEDIUM: threads/fd: Initialize the process mask during the call to fd_insert
Listeners will allow any threads to process the corresponding fd. But for other
FDs, we limit the processing to the current thread.
2017-10-31 13:58:30 +01:00
Christopher Faulet
a7c5d43085 MINOR: threads/fd: Add a mask of threads allowed to process on each fd in fdtab array 2017-10-31 13:58:30 +01:00
Christopher Faulet
d4604adeaa MAJOR: threads/fd: Make fd stuffs thread-safe
Many changes have been made to do so. First, the fd_updt array, where all
pending FDs for polling are stored, is now a thread-local array. Then 3 locks
have been added to protect, respectively, the fdtab array, the fd_cache array
and poll information. In addition, a lock for each entry in the fdtab array has
been added to protect all accesses to a specific FD or its information.

For pollers, according to the poller, the way to manage the concurrency is
different. There is a poller loop on each thread. So the set of monitored FDs
may need to be protected. epoll and kqueue are thread-safe per-se, so there few
things to do to protect these pollers. This is not possible with select and
poll, so there is no sharing between the threads. The poller on each thread is
independant from others.

Finally, per-thread init/deinit functions are used for each pollers and for FD
part for manage thread-local ressources.

Now, you must be carefull when a FD is created during the HAProxy startup. All
update on the FD state must be made in the threads context and never before
their creation. This is mandatory because fd_updt array is thread-local and
initialized only for threads. Because there is no pollers for the main one, this
array remains uninitialized in this context. For this reason, listeners are now
enabled in run_thread_poll_loop function, just like the worker pipe.
2017-10-31 13:58:30 +01:00
Christopher Faulet
b349e48ede MEDIUM: threads/pool: Make pool thread-safe by locking all access to a pool
A lock has been added for each memory pool. It is used to protect the pool
during allocations and releases. It is also used when pool info are dumped.
2017-10-31 13:58:30 +01:00
Christopher Faulet
f8188c69fa MEDIUM: threads/logs: Make logs thread-safe
log buffers and static variables used in log functions are now thread-local. So
there is no need to lock anything to log messages. Moreover, per-thread
init/deinit functions are now used to initialize these buffers.
2017-10-31 13:58:30 +01:00
Christopher Faulet
9a65571781 MEDIUM: threads/time: Many global variables from time.h are now thread-local 2017-10-31 13:58:30 +01:00
Christopher Faulet
6adad11283 MEDIUM: threads/chunks: Transform trash chunks in thread-local variables
So, per-thread init/deinit functions are registered to allocate/release them.
2017-10-31 13:58:30 +01:00
Christopher Faulet
339fff8a18 MEDIUM: threads: Adds a set of functions to handle sync-point
A sync-point is a protected area where you have the warranty that no concurrency
access is possible. It is implementated as a thread barrier to enter in the
sync-point and another one to exit from it. Inside the sync-point, all threads
that must do some syncrhonous processing will be called one after the other
while all other threads will wait. All threads will then exit from the
sync-point at the same time.

A sync-point will be evaluated only when necessary because it is a costly
operation. To limit the waiting time of each threads, we must have a mechanism
to wakeup all threads. This is done with a pipe shared by all threads. By
writting in this pipe, we will interrupt all threads blocked on a poller. The
pipe is then flushed before exiting from the sync-point.
2017-10-31 13:58:29 +01:00
Christopher Faulet
be0faa2e47 MINOR: threads: Add nbthread parameter
It is only parsed and initialized for now. It will be used later. This parameter
is only available when support for threads was built in.
2017-10-31 13:58:29 +01:00
Christopher Faulet
415f611ff4 MINOR: threads: Add mechanism to register per-thread init/deinit functions
hap_register_per_thread_init and hap_register_per_thread_deinit functions has
been added to register functions to do, for each thread, respectively, some
initialization and deinitialization. These functions are added in the global
lists per_thread_init_list and per_thread_deinit_list.

These functions are called only when HAProxy is started with more than 1 thread
(global.nbthread > 1).
2017-10-31 13:58:29 +01:00
Christopher Faulet
1a2b56ea8e MEDIUM: threads: Add hathreads header file
This file contains all functions and macros used to deal with concurrency in
HAProxy. It contains all high-level function to do atomic operation
(HA_ATOMIC_*). Note, for now, we rely on "__atomic" GCC builtins to do atomic
operation. So HAProxy can be compiled with the thread support iff these builtins
are available.

It also contains wrappers around plocks to use spin or read/write locks. These
wrappers are used to abstract the internal representation of the locking system
and to add information to help debugging, when compiled with suitable
options.

To add extra info on locks, you need to add DEBUG=-DDEBUG_THREAD or
DEBUG=-DDEBUG_FULL compilation option. In addition to timing info on locks, we
keep info on where a lock was acquired the last time (function name, file and
line). There are also the thread id and a flag to know if it is still locked or
not. This will be useful to debug deadlocks.
2017-10-31 13:58:23 +01:00
Emeric Brun
7122ab31b1 MINOR: threads: Add atomic-ops and plock includes in import dir
atomic-ops header contains some low-level functions to do atomic
operations. These operations are used by the progressive locks (plock).
2017-10-31 11:36:13 +01:00
Christopher Faulet
e9bd686b68 MINOR: threads: Add THREAD_LOCAL macro
When compiled with threads support, this marco is set to __thread. Else it is
empty.
2017-10-31 11:36:13 +01:00
Christopher Faulet
93a518f02a MINOR: standard: Add memvprintf function
Now memprintf relies on memvprintf. This new function does exactly what
memprintf did before, but it must be called with a va_list instead of a variable
number of arguments. So there is no change for every functions using
memprintf. But it is now also possible to have same functionnality from any
function with variadic arguments.
2017-10-31 11:36:12 +01:00
Christopher Faulet
0108bb3e40 MEDIUM: mailers: Init alerts during conf parsing and refactor their processing
Email alerts relies on checks to send emails. The link between a mailers section
and a proxy was resolved during the configuration parsing, But initialization was
done when the first alert is triggered. This implied memory allocations and
tasks creations. With this patch, everything is now initialized during the
configuration parsing. So when an alert is triggered, only the memory required
by this alert is dynamically allocated.

Moreover, alerts processing had a flaw. The task handler used to process alerts
to be sent to the same mailer, process_email_alert, was designed to give back
the control to the scheduler when an alert was sent. So there was a delay
between the sending of 2 consecutives alerts (the min of
"proxy->timeout.connect" and "mailer->timeout.mail"). To fix this problem, now,
we try to process as much queued alerts as possible when the task is woken up.
2017-10-31 11:36:12 +01:00
Christopher Faulet
67957bd59e MAJOR: dns: Refactor the DNS code
This is a huge patch with many changes, all about the DNS. Initially, the idea
was to update the DNS part to ease the threads support integration. But quickly,
I started to refactor some parts. And after several iterations, it was
impossible for me to commit the different parts atomically. So, instead of
adding tens of patches, often reworking the same parts, it was easier to merge
all my changes in a uniq patch. Here are all changes made on the DNS.

First, the DNS initialization has been refactored. The DNS configuration parsing
remains untouched, in cfgparse.c. But all checks have been moved in a post-check
callback. In the function dns_finalize_config, for each resolvers, the
nameservers configuration is tested and the task used to manage DNS resolutions
is created. The links between the backend's servers and the resolvers are also
created at this step. Here no connection are kept alive. So there is no needs
anymore to reopen them after HAProxy fork. Connections used to send DNS queries
will be opened on demand.

Then, the way DNS requesters are linked to a DNS resolution has been
reworked. The resolution used by a requester is now referenced into the
dns_requester structure and the resolution pointers in server and dns_srvrq
structures have been removed. wait and curr list of requesters, for a DNS
resolution, have been replaced by a uniq list. And Finally, the way a requester
is removed from a DNS resolution has been simplified. Now everything is done in
dns_unlink_resolution.

srv_set_fqdn function has been simplified. Now, there is only 1 way to set the
server's FQDN, independently it is done by the CLI or when a SRV record is
resolved.

The static DNS resolutions pool has been replaced by a dynamoc pool. The part
has been modified by Baptiste Assmann.

The way the DNS resolutions are triggered by the task or by a health-check has
been totally refactored. Now, all timeouts are respected. Especially
hold.valid. The default frequency to wake up a resolvers is now configurable
using "timeout resolve" parameter.

Now, as documented, as long as invalid repsonses are received, we really wait
all name servers responses before retrying.

As far as possible, resources allocated during DNS configuration parsing are
releases when HAProxy is shutdown.

Beside all these changes, the code has been cleaned to ease code review and the
doc has been updated.
2017-10-31 11:36:12 +01:00
Christopher Faulet
344c4ab6a9 MEDIUM: spoe/rules: Process "send-spoe-group" action
The messages processing is done using existing functions. So here, the main task
is to find the SPOE engine to use. To do so, we loop on all filter instances
attached to the stream. For each, we check if it is a SPOE filter and, if yes,
if its name is the one used to declare the "send-spoe-group" action.

We also take care to return an error if the action processing is interrupted by
HAProxy (because of a timeout or an error at the HAProxy level). This is done by
checking if the flag ACT_FLAG_FINAL is set.

The function spoe_send_group is the action_ptr callback ot
2017-10-31 11:36:12 +01:00
Christopher Faulet
c718b82dfe MINOR: spoe: Add a type to qualify the message list during encoding
Because we can have messages chained by event or by group, we need to have a way
to know which kind of list we manipulate during the encoding. So 2 types of list
has been added, SPOE_MSGS_BY_EVENT and SPOE_MSGS_BY_GROUP. And the right type is
passed when spoe_encode_messages is called.
2017-10-31 11:36:12 +01:00
Christopher Faulet
76c09ef8de MEDIUM: spoe/rules: Add "send-spoe-group" action for tcp/http rules
This action is used to trigger sending of a group of SPOE messages. To do so,
the SPOE engine used to send messages must be defined, as well as the SPOE group
to send. Of course, the SPOE engine must refer to an existing SPOE filter. If
not engine name is provided on the SPOE filter line, the SPOE agent name must be
used. For example:

   http-request send-spoe-group my-engine some-group

This action is available for "tcp-request content", "tcp-response content",
"http-request" and "http-response" rulesets. It cannot be used for tcp
connection/session rulesets because actions for these rulesets cannot yield.

For now, the action keyword is parsed and checked. But it does nothing. Its
processing will be added in another patch.
2017-10-31 11:36:12 +01:00
Christopher Faulet
11610f3b5a MEDIUM: spoe: Parse new "spoe-group" section in SPOE config file
For now, this section is only parsed. It should have the following format:

    spoe-group <grp-name>
      messages <msg-name> ...

And then SPOE groups must be referenced in spoe-agent section:

    spoe-agnt <name>
        ...
	groups <grp-name> ...

The purpose of these groups is to trigger messages sending from TCP or HTTP
rules, directly from HAProxy configuration, and not on specific event. This part
will be added in another patch.

It is important to note that a message belongs at most to a group.
2017-10-31 11:36:12 +01:00
Christopher Faulet
7ee8667c99 MINOR: spoe: Check uniqness of SPOE engine names during config parsing
The engine name is now kept in "spoe_config" struture. Because a SPOE filter can
be declared without engine name, we use the SPOE agent name by default. Then,
its uniqness is checked against all others SPOE engines configured for the same
proxy.

  * TODO: Add documentation
2017-10-31 11:36:12 +01:00
Christopher Faulet
57583e474e MEDIUM: spoe: Add support of ACLS to enable or disable sending of SPOE messages
Now, it is possible to conditionnaly send a SPOE message by adding an ACL-based
condition on the "event" line, in a "spoe-message" section. Here is the example
coming for the SPOE documentation:

    spoe-message get-ip-reputation
        args ip=src
        event on-client-session if ! { src -f /etc/haproxy/whitelist.lst }

To avoid mixin with proxy's ACLs, each SPOE message has its private ACL list. It
possible to declare named ACLs in "spoe-message" section, using the same syntax
than for proxies. So we can rewrite the previous example to use a named ACL:

    spoe-message get-ip-reputation
        args ip=src
	acl ip-whitelisted src -f /etc/haproxy/whitelist.lst
        event on-client-session if ! ip-whitelisted

ACL-based conditions are executed in the context of the stream that handle the
client and the server connections.
2017-10-31 11:36:12 +01:00
Christopher Faulet
1b421eab87 MINOR: acl: Pass the ACLs as an explicit parameter of build_acl_cond
So it is possible to use anothers ACLs to build ACL conditions than those of
proxies.
2017-10-31 11:36:12 +01:00
Christopher Faulet
78880fb196 MINOR: action: Add function to check rules using an action ACT_ACTION_TRK_*
The function "check_trk_action" has been added to find and check the target
table for rules using an action ACT_ACTION_TRK_*.
2017-10-31 11:36:12 +01:00
Christopher Faulet
6d950b92cd MINOR: action: Add a function pointer in act_rule struct to check its validity
It is possible to define the field "act_rule.check_ptr" if you want to check the
validity of a tcp/http rule.
2017-10-31 11:36:12 +01:00
Christopher Faulet
4fce0d8447 MINOR: action: Use trk_idx instead of tcp/http_trk_idx
So tcp_trk_idx and http_trk_idx have been removed.
2017-10-31 11:36:12 +01:00
Christopher Faulet
7421b14c22 MINOR: action: Add trk_idx inline function
It returns tracking index corresponding to an action ACT_ACTION_TRK_SC*. It will
replace http_trk_idx and tcp_trk_idx.
2017-10-31 11:36:12 +01:00
Willy Tarreau
d22e83abd9 MINOR: h1: store the status code in the H1 message
It was painful not to have the status code available, especially when
it was computed. Let's store it and ensure we don't claim content-length
anymore on 1xx, only 0 body bytes.
2017-10-31 08:43:29 +01:00
William Lallemand
a3c77cfdd7 MINOR: shctx: rename lock functions
Rename lock functions to shctx_lock() and shctx_unlock() to be coherent
with the new API.
2017-10-31 03:49:44 +01:00
William Lallemand
4f45bb9c46 MEDIUM: shctx: separate ssl and shctx
This patch reorganize the shctx API in a generic storage API, separating
the shared SSL session handling from its core.

The shctx API only handles the generic data part, it does not know what
kind of data you use with it.

A shared_context is a storage structure allocated in a shared memory,
allowing its usage in a multithread or a multiprocess context.

The structure use 2 linked list, one containing the available blocks,
and another for the hot locked blocks. At initialization the available
list is filled with <maxblocks> blocks of size <blocksize>. An <extra>
space is initialized outside the list in case you need some specific
storage.

+-----------------------+--------+--------+--------+--------+----
| struct shared_context | extra  | block1 | block2 | block3 | ...
+-----------------------+--------+--------+--------+--------+----
                                 <--------  maxblocks  --------->
                                            * blocksize

The API allows to store content on several linked blocks. For example,
if you allocated blocks of 16 bytes, and you want to store an object of
60 bytes, the object will be allocated in a row of 4 blocks.

The API was made for LRU usage, each time you get an object, it pushes
the object at the end of the list. When it needs more space, it discards

The functions name have been renamed in a more logical way, the part
regarding shctx have been prefixed by shctx_ and the functions for the
shared ssl session cache have been prefixed by sh_ssl_sess_.
2017-10-31 03:49:40 +01:00
William Lallemand
ed0b5ad1aa REORG: shctx: move ssl functions to ssl_sock.c
Move the ssl callback functions of the ssl shared session cache to
ssl_sock.c. The shctx functions still needs to be separated of the ssl
tree and data.
2017-10-31 03:48:39 +01:00
William Lallemand
3f85c9aec8 MEDIUM: shctx: allow the use of multiple shctx
Add an shctx argument which permits to create new independent shctx
area.
2017-10-31 03:44:11 +01:00
William Lallemand
24a7a75be6 REORG: shctx: move lock functions and struct
Move locks functions to proto/shctx.h, and structures to types/shctx.h
in order to simplify the split ssl/shctx.
2017-10-31 03:44:11 +01:00
William Lallemand
83215a44b8 MEDIUM: lists: list_for_each_entry{_safe}_from functions
Add list_for_each_entry_from and list_for_each_entry_safe_from which
allows to iterate in a list starting from a specific item.
2017-10-31 03:44:11 +01:00
Emmanuel Hocdet
01da571e21 MINOR: merge ssl_sock_get calls for log and ppv2
Merge ssl_sock_get_version and ssl_sock_get_proto_version.
Change ssl_sock_get_cipher to be used in ppv2.
2017-10-27 19:32:36 +02:00
Emmanuel Hocdet
58118b43b1 MINOR: update proxy-protocol-v2 #define
Report #define from doc/proxy-protocol.txt.
2017-10-27 19:32:36 +02:00
Olivier Houchard
9679ac997a MINOR: ssl: Don't abuse ssl_options.
A bind_conf does contain a ssl_bind_conf, which already has a flag to know
if early data are activated, so use that, instead of adding a new flag in
the ssl_options field.
2017-10-27 19:26:52 +02:00
Olivier Houchard
c2aae74f01 MEDIUM: ssl: Handle early data with OpenSSL 1.1.1
When compiled with Openssl >= 1.1.1, before attempting to do the handshake,
try to read any early data. If any early data is present, then we'll create
the session, read the data, and handle the request before we're doing the
handshake.

For this, we add a new connection flag, CO_FL_EARLY_SSL_HS, which is not
part of the CO_FL_HANDSHAKE set, allowing to proceed with a session even
before an SSL handshake is completed.

As early data do have security implication, we let the origin server know
the request comes from early data by adding the "Early-Data" header, as
specified in this draft from the HTTP working group :

    https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-replay
2017-10-27 10:54:05 +02:00
Olivier Houchard
51a76d84e4 MINOR: http: Mark the 425 code as "Too Early".
This adds a new status code for use with the "http-request deny" ruleset.
The use case for this code is currently handled by this draft dedicated
to 0-RTT processing :

   https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-replay
2017-10-27 10:53:32 +02:00
Thierry FOURNIER
31904278dc MINOR: hlua: Add regex class
This patch simply brings HAProxy internal regex system to the Lua API.
Lua doesn't embed regexes, now it inherits from the regexes compiled
with haproxy.
2017-10-27 10:30:44 +02:00
William Lallemand
48b4bb4b09 MEDIUM: cfgparse: post parsing registration
Allow to register a function which will be called after the
configuration file parsing, at the end of the check_config_validity().

It's useful fo checking dependencies between sections or for resolving
keywords, pointers or values.
2017-10-27 10:15:56 +02:00
William Lallemand
d2ff56d2a3 MEDIUM: cfgparse: post section callback
This commit implements a post section callback. This callback will be
used at the end of a section parsing.

Every call to cfg_register_section must be modified to use the new
prototype:

    int cfg_register_section(char *section_name,
                             int (*section_parser)(const char *, int, char **, int),
                             int (*post_section_parser)());
2017-10-27 10:14:51 +02:00
Willy Tarreau
145746c2d5 MINOR: buffer: add the buffer input manipulation functions
We used to have bo_{get,put}_{chr,blk,str} to retrieve/send data to
the output area of a buffer, but not the equivalent ones for the input
area. This will be needed to copy uploaded data frames in HTTP/2.
2017-10-27 10:00:17 +02:00
Willy Tarreau
7b271b214f MEDIUM: connection: make use of CO_FL_WILL_UPDATE in conn_sock_shutw()
This one may be called by upper layers (eg: si_shutw()) or lower layers
(si_shutw() as well during stream_int_notify()) so we want it to take
care of updating the connection's flags if it's not going to be done
by the caller.
2017-10-25 15:52:41 +02:00
Willy Tarreau
916e12dcfb MINOR: connection: add flag CO_FL_WILL_UPDATE to indicate when updates are granted
In transport-layer functions (snd_buf/rcv_buf), it's very problematic
never to know if polling changes made to the connection will be propagated
or not. This has led to some conn_cond_update_polling() calls being placed
at a few places to cover both the cases where the function is called from
the upper layer and when it's called from the lower layer. With the arrival
of the MUX, this becomes even more complicated, as the upper layer will not
have to manipulate anything from the connection layer directly and will not
have to push such updates directly either. But the snd_buf functions will
need to see their updates committed when called from upper layers.

The solution here is to introduce a connection flag set by the connection
handler (and possibly any other similar place) indicating that the caller
is committed to applying such changes on return. This way, the called
functions will be able to apply such changes by themselves before leaving
when the flag is not set, and the upper layer will not have to care about
that anymore.
2017-10-25 15:52:41 +02:00
Willy Tarreau
bc97cc4fd1 MINOR: connection: move the cleanup of flag CO_FL_WAIT_ROOM
This flag is only used when reading using splicing for now, and is only
set when a pipe full condition is met, so we can simplify its reset
condition in conn_refresh_polling_flags so that it's cleared at the
same time as the other ones, only when the control layer is ready.

This flag could be used more, to mark that a buffer full condition was
met with any receive method in order to simplify polling management.
This should probably be revisited after 1.8.
2017-10-25 15:52:41 +02:00
Dragan Dosen
7389dd086c IMPORT: sha1: import SHA1 functions
This is based on the git SHA1 implementation and optimized to do word
accesses rather than byte accesses, and to avoid unnecessary copies into
the context array.
2017-10-25 04:45:48 +02:00
Emmanuel Hocdet
019f9b10ef MINOR: ssl: build with recent BoringSSL library
BoringSSL switch OPENSSL_VERSION_NUMBER to 1.1.0 for compatibility.
Fix BoringSSL call and openssl-compat.h/#define occordingly.
This will not break openssl/libressl compat.
2017-10-24 19:57:16 +02:00
Willy Tarreau
1296382d0b CONTRIB: trace: add the possibility to place trace calls in the code
Now any call to trace() in the code will automatically appear interleaved
with the call sequence and timestamped in the trace file. They appear with
a '#' on the 3rd argument (caller's pointer) in order to make them easy to
spot. If the trace functionality is not used, a dmumy weak function is used
instead so that it doesn't require to recompile every time traces are
enabled/disabled.

The trace decoder knows how to deal with these messages, detects them and
indents them similarly to the currently traced function. This can be used
to print function arguments for example.

Note that we systematically flush the log when calling trace() to ensure we
never miss important events, so this may impact performance.

The trace() function uses the same format as printf() so it should be easy
to setup during debugging sessions.
2017-10-24 19:54:25 +02:00
Willy Tarreau
cbc6524a19 MINOR: connection: remove conn_force_close()
Now only conn_full_close() will be used. It will become more obvious
when the tracking is in place or not and will make it easier to
convert remaining call places to conn_streams.
2017-10-22 09:54:19 +02:00
Willy Tarreau
3b737c9894 MINOR: stream-int: use conn_full_close() instead of conn_force_close()
We simply disable tracking before calling it.
2017-10-22 09:54:18 +02:00
Willy Tarreau
dc42acddb6 MINOR: connection: add conn_stop_tracking() to disable tracking
This will be used before conn_full_close() instead of using
conn_force_close(), resulting in a clearer exit path in various
situations.
2017-10-22 09:54:16 +02:00
Willy Tarreau
6a0a80adaf MINOR: connection: ensure conn_ctrl_close() also resets the fd
The connection's fd was reset to DEAD_FD_MAGIC on conn_force_close()
but not on conn_full_close(), which is a bit strange. Let's do it on
both.
2017-10-22 09:54:16 +02:00
Willy Tarreau
f9ce57e86c MEDIUM: connection: make conn_sock_shutw() aware of lingering
Instead of having to manually handle lingering outside, let's make
conn_sock_shutw() check for it before calling shutdown(). We simply
don't want to emit the FIN if we're going to reset the connection
due to lingering. It's particularly important for silent-drop where
it's absolutely mandatory that no packet leaves the machine.
2017-10-22 09:54:16 +02:00
Olivier Houchard
1a0545f3d7 REORG: connection: rename CO_FL_DATA_* -> CO_FL_XPRT_*
These flags are not exactly for the data layer, they instead indicate
what is expected from the transport layer. Since we're going to split
the connection between the transport and the data layers to insert a
mux layer, it's important to have a clear idea of what each layer does.

All function conn_data_* used to manipulate these flags were renamed to
conn_xprt_*.
2017-10-22 09:54:15 +02:00
Willy Tarreau
794f9af894 MEDIUM: h1: reimplement the http/1 response parser for the gateway
The HTTP/2->HTTP/1 gateway will need to process HTTP/1 responses. We
cannot sanely rely on the HTTP/1 txn to parse a response because :

  1) responses generated by haproxy such as error messages, redirects,
     stats or Lua are neither parsed nor indexed ; this could be
     addressed over the long term but will take time.

  2) the http txn is useless to parse the body : the states present there
     are only meaningful to received bytes (ie next bytes to parse) and
     not at all to sent bytes. Thus chunks cannot be followed at all.
     Even when implementing this later, it's unsure whether it will be
     possible when dealing with compression.

So using the HTTP txn is now out of the equation and the only remaining
solution is to call an HTTP/1 message parser. We already have one, it was
slightly modified to avoid keeping states by benefitting from the fact
that the response was produced by haproxy and this is entirely available.
It assumes the following rules are true, or that incuring an extra cost
to work around them is acceptable :
  - the response buffer is read-write and supports modifications in place

  - headers sent through / by haproxy are not folded. Folding is still
    implemented by replacing CR/LF/tabs/spaces with spaces if encountered

  - HTTP/0.9 responses are never sent by haproxy and have never been
    supported at all

  - haproxy will not send partial responses, the whole headers block will
    be sent at once ; this means that we don't need to keep expensive
    states and can afford to restart the parsing from the beginning when
    facing a partial response ;

  - response is contiguous (does not wrap). This was already the case
    with the original parser and ensures we can safely dereference all
    fields with (ptr,len)

The parser replaces all of the http_msg fields that were necessary with
local variables. The parser is not called on an http_msg but on a string
with a start and an end. The HTTP/1 states were reused for ease of use,
though the request-specific ones have not been implemented for now. The
error position and error state are supported and optional ; these ones
may be used later for bug hunting.

The parser issues the list of all the headers into a caller-allocated
array of struct ist.

The content-length/transfer-encoding header are checked and the relevant
info fed the h1 message state (flags + body_len).
2017-10-22 09:54:15 +02:00
Willy Tarreau
306924ecb8 MINOR: http: add very simple header management based on double strings
This will be used initially by the hpack table and hopefully later by a
new native http processor. These headers are made of name and value, both
an immediate string (ie: pointer and length).
2017-10-22 09:54:14 +02:00
Willy Tarreau
4093a4dc01 MINOR: h1: add struct h1m for basic HTTP/1 messages
This one is much simpler than http_msg and will be used in the HTTP
parsers involved in the H2 to H1 gateway.
2017-10-22 09:54:14 +02:00
Willy Tarreau
b28925675d MEDIUM: http: make the chunk crlf parser only depend on the buffer
The chunk crlf parser used to depend on the channel and on the HTTP
message, eventhough it's not really needed. Let's remove this dependency
so that it can be used within the H2 to H1 gateway.

As part of this small API change, it was renamed to h1_skip_chunk_crlf()
to mention that it doesn't depend on http_msg anymore.
2017-10-22 09:54:14 +02:00
Willy Tarreau
e56cdd3629 MEDIUM: http: make the chunk size parser only depend on the buffer
The chunk parser used to depend on the channel and on the HTTP message
but it's not really needed as they're only used to retrieve the buffer
as well as to return the number of bytes parsed and the chunk size.

Here instead we pass the (few) relevant information in arguments so that
the function may be reused without a channel nor an HTTP message (ie
from the H2 to H1 gateway).

As part of this API change, it was renamed to h1_parse_chunk_size() to
mention that it doesn't depend on http_msg anymore.
2017-10-22 09:54:14 +02:00
Willy Tarreau
8740c8b1b2 REORG: http: move the HTTP/1 header block parser to h1.c
Since it still depends on http_msg, it was not renamed yet.
2017-10-22 09:54:13 +02:00
Willy Tarreau
db4893d6a4 REORG: http: move the HTTP/1 chunk parser to h1.{c,h}
Functions http_parse_chunk_size(), http_skip_chunk_crlf() and
http_forward_trailers() were moved to h1.h and h1.c respectively so
that they can be called from outside. The parts that were inline
remained inline as it's critical for performance (+41% perf
difference reported in an earlier test). For now the "http_" prefix
remains in their name since they still depend on the http_msg type.
2017-10-22 09:54:13 +02:00
Willy Tarreau
0da5b3bddc REORG: http: move some very http1-specific parts to h1.{c,h}
Certain types and enums are very specific to the HTTP/1 parser, and we'll
need to share them with the HTTP/2 to HTTP/1 translation code. Let's move
them to h1.c/h1.h. Those with very few occurrences or only used locally
were renamed to explicitly mention the relevant HTTP version :

  enum ht_state      -> h1_state.
  http_msg_state_str -> h1_msg_state_str
  HTTP_FLG_*         -> H1_FLG_*
  http_char_classes  -> h1_char_classes

Others like HTTP_IS_*, HTTP_MSG_* are left to be done later.
2017-10-22 09:54:13 +02:00
Willy Tarreau
0621da5f5b MINOR: buffer: make bo_getblk_nc() not return 2 for a full buffer
Thus function returns the number of blocks. When a buffer is full and
properly aligned, buf->p loops back the beginning, and the test in the
code doesn't cover that specific case, so it returns two chunks, a full
one and an empty one. It's harmless but can sometimes have a small impact
on performance and definitely makes the code hard to debug.
2017-10-22 09:54:12 +02:00
Emeric Brun
5a1335110c BUG/MEDIUM: log: check result details truncated.
Fix regression introduced by commit:
'MAJOR: servers: propagate server status changes asynchronously.'

The building of the log line was re-worked to be done at the
postponed point without lack of data.

[wt: this only affects 1.8-dev, no backport needed]
2017-10-19 18:51:32 +02:00
Willy Tarreau
e67c4e5744 MINOR: ist: add ist0() to add a trailing zero to a string.
This function modifies the string to add a zero after the end, and returns
the start pointer. The purpose is to use it on strings extracted by parsers
from larger strings cut with delimiters that are not important and can be
destroyed. It allows any such string to be used with regular string
functions. It's also convenient to use with printf() to show data extracted
from writable areas.
2017-10-19 15:01:08 +02:00
Willy Tarreau
41ab86898e MINOR: channel: make the channel be a const in all {ci,co}_get* functions
There's no point having the channel marked writable as these functions
only extract data from the channel. The code was retrieved from their
ci/co ancestors.
2017-10-19 15:01:08 +02:00
Willy Tarreau
e0e734ccc5 MINOR: buffer: add bo_getblk() and bo_getblk_nc()
These functions respectively extract a block from an output buffer by
copying it or by just passing pointers and lengths for zero copy operation.
2017-10-19 15:01:08 +02:00
Willy Tarreau
06d80a9a9c REORG: channel: finally rename the last bi_* / bo_* functions
For HTTP/2 we'll need some buffer-only equivalent functions to some of
the ones applying to channels and still squatting the bi_* / bo_*
namespace. Since these names have kept being misleading for quite some
time now and are really getting annoying, it's time to rename them. This
commit will use "ci/co" as the prefix (for "channel in", "channel out")
instead of "bi/bo". The following ones were renamed :

  bi_getblk_nc, bi_getline_nc, bi_putblk, bi_putchr,
  bo_getblk, bo_getblk_nc, bo_getline, bo_getline_nc, bo_inject,
  bi_putchk, bi_putstr, bo_getchr, bo_skip, bi_swpbuf
2017-10-19 15:01:08 +02:00
Willy Tarreau
5b9834f12a MINOR: buffer: add buffer_space_wraps()
This function returns true if the available buffer space wraps. This
will be used to detect if it's worth realigning a buffer when it lacks
contigous space.
2017-10-19 15:01:08 +02:00
Willy Tarreau
e5676e7103 MINOR: buffer: add two functions to inject data into buffers
bi_istput() injects the ist string into the input region of the buffer,
it will be used to feed small data chunks into the conn_stream. bo_istput()
does the same into the output region of the buffer, it will be used to send
data via the transport layer and assumes there's no input data.
2017-10-19 15:01:08 +02:00
Willy Tarreau
6634b63c78 MINOR: buffer: add a function to match against string patterns
In order to match known patterns in wrapping buffer, we'll introduce new
string manipulation functions for buffers. The new function b_isteq()
relies on an ist string for the pattern and compares it against any
location in the buffer relative to <p>. The second function bi_eat()
is specially designed to match input contents.
2017-10-19 15:01:07 +02:00
Willy Tarreau
7f564d2b60 MINOR: buffer: add bo_del() to delete a number of characters from output
This simply reduces the amount of output data from the buffer after
they have been transferred, in a way that is more natural than by
fiddling with buf->o. b_del() was renamed to bi_del() to avoid any
ambiguity (it's not yet used).
2017-10-19 15:01:07 +02:00
Willy Tarreau
dea7c5c03d BUG/MINOR: tools: fix my_htonll() on x86_64
Commit 36eb3a3 ("MINOR: tools: make my_htonll() more efficient on x86_64")
brought an incorrect asm statement missing the input constraints, causing
the input value not necessarily to be placed into the same register as the
output one, resulting in random output. It happens to work when building at
-O0 but not above. This was only detected in the HTTP/2 parser, but in
mainline it could only affect the integer to binary sample cast.

No backport is needed since this bug was only introduced in the development
branch.
2017-10-18 11:46:17 +02:00
Olivier Houchard
9130a9605d MINOR: checks: Add a new keyword to specify a SNI when doing SSL checks.
Add a new keyword, "check-sni", to be able to specify the SNI to be used when
doing health checks over SSL.
2017-10-17 18:10:24 +02:00
Emeric Brun
64cc49cf7e MAJOR: servers: propagate server status changes asynchronously.
In order to prepare multi-thread development, code was re-worked
to propagate changes asynchronoulsy.

Servers with pending status changes are registered in a list
and this one is processed and emptied only once 'run poll' loop.

Operational status changes are performed before administrative
status changes.

In a case of multiple operational status change or admin status
change in the same 'run poll' loop iteration, those changes are
merged to reach only the targeted status.
2017-10-13 12:00:27 +02:00
Willy Tarreau
bf08beb2a3 MINOR: session: remove the list of streams from struct session
Commit bcb86ab ("MINOR: session: add a streams field to the session
struct") added this list of streams that is not needed anymore. Let's
get rid of it now.
2017-10-08 22:32:05 +02:00
Willy Tarreau
c939835f77 MINOR: compiler: restore the likely() wrapper for gcc 5.x
After some tests, gcc 5.x produces better code with likely()
than without, contrary to gcc 4.x where it was better to disable
it. Let's re-enable it for 5 and above.
2017-10-08 22:32:05 +02:00
Willy Tarreau
2ba672726c MINOR: ist: add a macro to ease const array initialization
It's not possible to use strlen() in const arrays even with const
strings, but we can use sizeof-1 via a macro. Let's provide this in
the IST() macro, as it saves the developer from having to count the
characters.
2017-09-21 15:32:31 +02:00
Willy Tarreau
82967bf9b3 MINOR: connection: adjust CO_FL_NOTIFY_DATA after removal of flags
After the removal of CO_FL_DATA_RD_SH and CO_FL_DATA_WR_SH, the
aggregate mask CO_FL_NOTIFY_DATA was not updated. It happens that
now CO_FL_NOTIFY_DATA and CO_FL_NOTIFY_DONE are similar, which may
reveal some overlap between the ->wake and ->xprt_done callbacks.
We'll see after the mux changes if both are still required.
2017-09-21 06:28:52 +02:00
Willy Tarreau
5531d5732d MINOR: net_helper: add 64-bit read/write functions
These ones are the same as the previous ones but for 64 bit values.
We're using my_ntohll() and my_htonll() from standard.h for the byte
order conversion.
2017-09-21 06:27:08 +02:00
Willy Tarreau
2888c08346 MINOR: net_helper: add write functions
These ones are the equivalent of the read_* functions. They support
writing unaligned words, possibly wrapping, in host and network order.
The write_i*() functions were not implemented since the caller can
already use the unsigned version.
2017-09-21 06:25:10 +02:00
Willy Tarreau
d5370e1d6c MINOR: net_helper: add functions to read from vectors
This patch adds the ability to read from a wrapping memory area (ie:
buffers). The new functions are called "readv_<type>". The original
ones were renamed to start with "read_" to make the difference more
obvious between the read method and the returned type.

It's worth noting that the memory barrier in readv_bytes() is critical,
as otherwise gcc decides that it doesn't need the resulting data, but
even worse, removes the length checks in readv_u64() and happily
performs an out-of-bounds unaligned read using read_u64()! Such
"optimizations" are a bit borderline, especially when they impact
security like this...
2017-09-20 11:27:31 +02:00
Willy Tarreau
26488ad358 MINOR: buffer: add b_end() and b_to_end()
These ones return respectively the pointer to the end of the buffer and
the distance between b->p and the end. These will simplify a bit some
new code needed to parse directly from a wrapping buffer.
2017-09-20 11:27:31 +02:00
Willy Tarreau
4a6425d373 MINOR: buffer: add b_del() to delete a number of characters
This will be used by code which directly parses buffers with no channel
in the middle (eg: h2, might be used by checks as well).
2017-09-20 11:27:31 +02:00
Willy Tarreau
36eb3a3ac8 MINOR: tools: make my_htonll() more efficient on x86_64
The current construct was made when developing on a 32-bit machine.
Having a simple bswap operation replaced with 2 bswap, 2 shift and
2 or is quite of a waste of precious cycles... Let's provide a trivial
asm-based implementation for x86_64.
2017-09-20 11:27:31 +02:00
Willy Tarreau
05f5047d40 MINOR: listener: new function listener_release
Instead of duplicating some sensitive listener-specific code in the
session and in the stream code, let's call listener_release() when
releasing a connection attached to a listener.
2017-09-15 11:49:52 +02:00
Willy Tarreau
2cc5bae0b8 MINOR: listeners: make listeners count consistent with reality
Some places call delete_listener() then decrement the number of
listeners and jobs. At least one other place calls delete_listener()
without doing so, but since it's in deinit(), it's harmless and cannot
risk to cause zombie processes to survive. Given that the number of
listeners and jobs is incremented when creating the listeners, it's
much more logical to symmetrically decrement them when deleting such
listeners.
2017-09-15 11:49:52 +02:00
Willy Tarreau
0de59fd53a MINOR: listeners: new function create_listeners
This function is used to create a series of listeners for a specific
address and a port range. It automatically calls the matching protocol
handlers to add them to the relevant lists. This way cfgparse doesn't
need to manipulate listeners anymore. As an added bonus, the memory
allocation is checked.
2017-09-15 11:49:52 +02:00
Willy Tarreau
31794892af MINOR: unix: remove the now unused proto_uxst.h file
Since everything is self contained in proto_uxst.c there's no need to
export anything. The same should be done for proto_tcp.c but the file
contains other stuff that's not related to the TCP protocol itself
and which should first be moved somewhere else.
2017-09-15 11:49:52 +02:00
Willy Tarreau
9d5be5c823 MINOR: protocols: register the ->add function and stop calling them directly
cfgparse has no business directly calling each individual protocol's 'add'
function to create a listener. Now that they're all registered, better
perform a protocol lookup on the family and have a standard ->add method
for all of them.
2017-09-15 11:49:52 +02:00
Willy Tarreau
3228238c73 MINOR: protocols: always pass a "port" argument to the listener creation
It's a shame that cfgparse() has to make special cases of each protocol
just to cast the port to the target address family. Let's pass the port
in argument to the function. The unix listener simply ignores it.
2017-09-15 11:49:52 +02:00
Andjelko Iharos
c4df59e914 MINOR: cli: add socket commands and config to prepend informational messages with severity
Adds cli commands to change at runtime whether informational messages
are prepended with severity level or not, with support for numeric and
worded severity in line with syslog severity level.

Adds stats socket config keyword severity-output to set default behavior
per socket on startup.
2017-09-13 13:37:59 +02:00
Olivier Houchard
ed0d96cac4 MINOR: net_helper: Inline functions meant to be inlined. 2017-09-13 13:35:35 +02:00
Thierry FOURNIER
d697596c6c MINOR: tasks: Move Lua notification from Lua to tasks
These notification management function and structs are generic and
it will be better to move in common parts.

The notification management functions and structs have names
containing some "lua" references because it was written for
the Lua. This patch removes also these references.
2017-09-11 18:59:40 +02:00
Thierry FOURNIER
2da788e755 MEDIUM: xref/lua: Use xref for referencing cosocket relation between stream and lua
This relation will ensure that each was informed about death of another one.
2017-09-11 18:59:40 +02:00
Thierry FOURNIER
3c65b7a916 MINOR: xref: Add a new xref system
xref is used to create a relation between two elements.
Once an element is released, it breaks the relation. If the
relation is already broken, it frees the xref struct.
The pointer between two elements is a sort of refcount with
max value 1. The relation is only between two elements.
The pointer and the type of element a and b are conventional.

Note that xref is initialised from Lua files because Lua is
the only one user.
2017-09-11 18:59:40 +02:00
Emmanuel Hocdet
ddcde195eb MINOR: ssl: rework smp_fetch_ssl_fc_cl_str without internal ssl use
smp_fetch_ssl_fc_cl_str as very limited usage (only work with openssl == 1.0.2
compiled with the option enable-ssl-trace). It use internal cipher.algorithm_ssl
attribut and SSL_CIPHER_standard_name (available with ssl-trace).
This patch implement this (debug) function in a standard way. It used common
SSL_CIPHER_get_name to display cipher name. It work with openssl >= 1.0.2
and boringssl.
2017-09-09 08:36:22 +02:00
Christopher Faulet
21e9267ac3 MINOR: fd: Add fd_update_events function
This function should be called by the poller to set FD_POLL_* flags on an FD and
update its state if needed. This function has been added to ease threads support
integration.
2017-09-05 15:43:09 +02:00
Emeric Brun
52a91d3d48 MEDIUM: check: server states and weight propagation re-work
The server state and weight was reworked to handle
"pending" values updated by checks/CLI/LUA/agent.
These values are commited to be propagated to the
LB stack.

In further dev related to multi-thread, the commit
will be handled into a sync point.

Pending values are named using the prefix 'next_'
Current values used by the LB stack are named 'cur_'
2017-09-05 15:23:16 +02:00
Christopher Faulet
de2075fd21 MINOR: freq_ctr: Return the new value after an update
This will ease threads support integration.
2017-09-05 11:55:07 +02:00
Christopher Faulet
d82b180d6b MINOR: fd: Use inlined functions to check fd state in fd_*_send/recv functions
It these functions, the test is inverted and we rely on fd_recv/send_* function
to check the fd state. This will ease threads support integration.
2017-09-05 10:47:32 +02:00
Christopher Faulet
8db2fdfaba MINOR: fd: Add fd_active function
This inlined function is used to check if a fd is active for receive or send. It
will ease threads support integration.
2017-09-05 10:39:46 +02:00
Christopher Faulet
6988f678cd MINOR: http: Use a trash chunk to store decoded string of the HTTP auth header
This string is used in sample fetches so it is safe to use a preallocated trash
chunk instead of a buffer dynamically allocated during HAProxy startup.
2017-09-05 10:36:28 +02:00
Christopher Faulet
ca20d02ea8 MINOR: stick-tables: Make static_table_key a struct variable instead of a pointer
First, this variable does not need to be publicly exposed because it is only
used by stick_table functions. So we declare it as a global static in
stick_table.c file. Then, it is useless to use a pointer. Using a plain struct
variable avoids any dynamic allocation.
2017-09-05 10:35:07 +02:00
Christopher Faulet
ad405f1714 MINOR: buffers: Move swap_buffer into buffer.c and add deinit_buffer function
swap_buffer is a global variable only used by buffer_slow_realign. So it has
been moved from global.h to buffer.c and it is allocated by init_buffer
function. deinit_buffer function has been added to release it. It is also used
to destroy the buffers' pool.
2017-09-05 10:34:30 +02:00
Christopher Faulet
0132d06f68 MINOR: logs: Use dedicated function to init/deinit log buffers
Now, we use init_log_buffers and deinit_log_buffers to, respectively, initialize
and deinitialize log buffers used for syslog messages.

These functions have been introduced to be used by threads, to deal with
thread-local log buffers.
2017-09-05 10:29:31 +02:00
Christopher Faulet
748919a4c7 MINOR: chunks: Use dedicated function to init/deinit trash buffers
Now, we use init_trash_buffers and deinit_trash_buffers to, respectively,
initialize and deinitialize trash buffers (trash, trash_buf1 and trash_buf2).

These functions have been introduced to be used by threads, to deal with
thread-local trash buffers.
2017-09-05 10:22:20 +02:00
Christopher Faulet
576c5aa25c MINOR: fd: Set owner and iocb field before inserting a new fd in the fdtab
This will be needed for concurrent accesses.
2017-09-05 10:17:10 +02:00
Christopher Faulet
d531f88622 MINOR: fd: Don't forget to reset fdtab[fd].update when a fd is added/removed
It used to be guaranteed by the polling functions on a later call but
with concurrent accesses it cannot be granted anymore.
2017-09-05 10:16:42 +02:00
Christopher Faulet
f5b8adc5c0 MINOR: listeners: Change enable_listener and disable_listener into private functions
These functions are only used in listener.c.
2017-09-05 10:14:16 +02:00
Christopher Faulet
5580ba2e11 MINOR: listeners: Change listener_full and limit_listener into private functions
These functions are only used in listener_accept. So there is no need to export
them.
2017-09-05 10:13:55 +02:00
Christopher Faulet
ae459fd206 CLEANUP: memory: Remove unused function pool_destroy
This one was never used.
2017-09-05 10:13:20 +02:00
Emmanuel Hocdet
4366476852 MINOR: ssl: remove duplicate ssl_methods in struct bind_conf
Patch "MINOR: ssl: support ssl-min-ver and ssl-max-ver with crt-list"
introduce ssl_methods in struct ssl_bind_conf. struct bind_conf have now
ssl_methods and ssl_conf.ssl_methods (unused). It's error-prone. This patch
remove the duplicate structure to avoid any confusion.
2017-09-05 09:42:30 +02:00
Willy Tarreau
bbae3f0170 MEDIUM: connection: remove useless flag CO_FL_DATA_WR_SH
After careful inspection, this flag is set at exactly two places :
  - once in the health-check receive callback after receipt of a
    response
  - once in the stream interface's shutw() code where CF_SHUTW is
    always set on chn->flags

The flag was checked in the checks before deciding to send data, but
when it is set, the wake() callback immediately closes the connection
so the CO_FL_SOCK_WR_SH flag is also set.

The flag was also checked in si_conn_send(), but checking the channel's
flag instead is enough and even reveals that one check involving it
could never match.

So it's time to remove this flag and replace its check with a check of
CF_SHUTW in the stream interface. This way each layer is responsible
for its shutdown, this will ease insertion of the mux layer.
2017-08-30 10:05:49 +02:00
Willy Tarreau
cde5651c4d CLEANUP: connection: remove the unused conn_sock_shutw_pending()
This has never been used anywhere.
2017-08-30 08:18:53 +02:00
Willy Tarreau
54e917cfa1 MEDIUM: connection: remove useless flag CO_FL_DATA_RD_SH
This flag is both confusing and wrong. It is supposed to report the
fact that the data layer has received a shutdown, but in fact this is
reported by CO_FL_SOCK_RD_SH which is set by the transport layer after
this condition is detected. The only case where the flag above is set
is in the stream interface where CF_SHUTR is also set on the receiving
channel.

In addition, it was checked in the health checks code (while never set)
and was always test jointly with CO_FL_SOCK_RD_SH everywhere, except in
conn_data_read0_pending() which incorrectly doesn't match the second
time it's called and is fortunately protected by an extra check on
(ic->flags & CF_SHUTR).

This patch gets rid of the flag completely. Now conn_data_read0_pending()
accurately reports the fact that the transport layer has detected the end
of the stream, regardless of the fact that this state was already consumed,
and the stream interface watches ic->flags&CF_SHUTR to know if the channel
was already closed by the upper layer (which it already used to do).

The now unused conn_data_read0() function was removed.
2017-08-30 08:18:50 +02:00
Willy Tarreau
5790eb0a76 MINOR: stream: provide a new stream creation function for connections
The purpose will be to create new streams for a given connection so
that we can later abstract this from a mux.
2017-08-30 07:06:39 +02:00
Willy Tarreau
0b74eae1f1 MEDIUM: session: add a pointer to a struct task in the session
The session may need to enforce a timeout when waiting for a handshake.
Till now we used a trick to avoid allocating a pointer, we used to set
the connection's owner to the task and set the task's context to the
session, so that it was possible to circle between all of them. The
problem is that we'll really need to pass the pointer to the session
to the upper layers during initialization and that the only place to
store it is conn->owner, which is squatted for this trick.

So this patch moves the struct task* into the session where it should
always have been and ensures conn->owner points to the session until
the data layer is properly initialized.
2017-08-30 07:05:49 +02:00
Willy Tarreau
ca3610251b CLEANUP: listener: remove the unused handler field
Historically listeners used to have a handler depending on the upper
layer. But now it's exclusively process_stream() and nothing uses it
anymore so it can safely be removed.
2017-08-30 07:05:08 +02:00
Willy Tarreau
87787acf72 MEDIUM: stream: make stream_new() allocate its own task
Currently a task is allocated in session_new() and serves two purposes :
  - either the handshake is complete and it is offered to the stream via
    the second arg of stream_new()

  - or the handshake is not complete and it's diverted to be used as a
    timeout handler for the embryonic session and repurposed once we land
    into conn_complete_session()

Furthermore, the task's process() function was taken from the listener's
handler in conn_complete_session() prior to being replaced by a call to
stream_new(). This will become a serious mess with the mux.

Since it's impossible to have a stream without a task, this patch removes
the second arg from stream_new() and make this function allocate its own
task. In session_accept_fd(), we now only allocate the task if needed for
the embryonic session and delete it later.
2017-08-30 07:05:04 +02:00
Willy Tarreau
8e3c6ce75a MEDIUM: connection: get rid of data->init() which was not for data
The ->init() callback of the connection's data layer was only used to
complete the session's initialisation since sessions and streams were
split apart in 1.6. The problem is that it creates a big confusion in
the layers' roles as the session has to register a dummy data layer
when waiting for a handshake to complete, then hand it off to the
stream which will replace it.

The real need is to notify that the transport has finished initializing.
This should enable a better splitting between these layers.

This patch thus introduces a connection-specific callback called
xprt_done_cb() which informs about handshake successes or failures. With
this, data->init() can disappear, CO_FL_INIT_DATA as well, and we don't
need to register a dummy data->wake() callback to be notified of errors.
2017-08-30 07:04:04 +02:00
Willy Tarreau
585744bf2e REORG/MEDIUM: connection: introduce the notion of connection handle
Till now connections used to rely exclusively on file descriptors. It
was planned in the past that alternative solutions would be implemented,
leading to member "union t" presenting sock.fd only for now.

With QUIC, the connection will need to continue to exist but will not
rely on a file descriptor but a connection ID.

So this patch introduces a "connection handle" which is either a file
descriptor or a connection ID, to replace the existing "union t". We've
now removed the intermediate "struct sock" which was never used. There
is no functional change at all, though the struct connection was inflated
by 32 bits on 64-bit platforms due to alignment.
2017-08-24 19:30:04 +02:00
Willy Tarreau
0c219be3df BUG/MEDIUM: dns: fix accepted_payload_size parser to avoid integer overflow
Since commit 9d8dbbc ("MINOR: dns: Maximum DNS udp payload set to 8192") it's
possible to specify a packet size, but passing too large a size or a negative
size is not detected and results in memset() being performed over a 2GB+ area
upon receipt of the first DNS response, causing runtime crashes.

We now check that the size is not smaller than the smallest packet which is
the DNS header size (12 bytes).

No backport is needed.
2017-08-22 12:03:46 +02:00
Baptiste Assmann
9d8dbbc56b MINOR: dns: Maximum DNS udp payload set to 8192
Following up DNS extension introduction, this patch aims at making the
computation of the maximum number of records in DNS response dynamic.
This computation is based on the announced payload size accepted by
HAProxy.
2017-08-22 11:39:57 +02:00
Baptiste Assmann
747359eeca BUG/MINOR: dns: server set by SRV records stay in "no resolution" status
This patch fixes a bug where some servers managed by SRV record query
types never ever recover from a "no resolution" status.
The problem is due to a wrong function called when breaking the
server/resolution (A/AAAA) relationship: this is performed when a server's SRV
record disappear from the SRV response.
2017-08-22 11:34:49 +02:00
Frédéric Lécaille
6ca71a9297 BUG/MINOR: Wrong type used as argument for spoe_decode_buffer().
Contrary to 64-bits libCs where size_t type size is 8, on systems with 32-bits
size of size_t is 4 (the size of a long) which does not equal to size of uint64_t type.
This was revealed by such GCC warnings on 32bits systems:

src/flt_spoe.c:2259:40: warning: passing argument 4 of spoe_decode_buffer from
incompatible pointer type
  if (spoe_decode_buffer(&p, end, &str, &sz) == -1)
                                         ^
As the already existing code using spoe_decode_buffer() already use such pointers to
uint64_t, in place of pointer to size_t ;), most of this code is in contrib directory,
this simple patch modifies the prototype of spoe_decode_buffer() so that to use a
pointer to uint64_t in place of a pointer to size_t, uint64_t type being the type
finally required for decode_varint().
2017-08-22 11:27:20 +02:00
Willy Tarreau
a5480694bf MINOR: http: export some of the HTTP parser macros
The two macros EXPECT_LF_HERE and EAT_AND_JUMP_OR_RETURN were exported
for use outside the HTTP parser. They now take extra arguments to avoid
implicit pointers and jump labels. These will be used to reimplement a
minimalist HTTP/1 parser in the H1->H2 gateway.
2017-08-18 13:38:47 +02:00
Willy Tarreau
e11f727c95 MINOR: ist: implement very simple indirect strings
For HPACK we'll need to perform a lot of string manipulation between the
dynamic headers table and the output stream, and we need an efficient way
to deal with that, considering that the zero character is not an end of
string marker here. It turns out that gcc supports returning structs from
functions and is able to place up to two words directly in registers when
-freg-struct is used, which is the case by default on x86 and armv8. On
other architectures the caller reserves some stack space where the callee
can write, which is equivalent to passing a pointer to the return value.

So let's implement a few functions to deal with this as the resulting code
will be optimized on certain architectures where retrieving the length of
a string will simply consist in reading one of the two returned registers.

Extreme care was taken to ensure that the compiler gets maximum opportunities
to optimize out every bit of unused code. This is also the reason why no
call to regular string functions (such as strlen(), memcmp(), memcpy() etc)
were used. The code involving them is often larger than when they are open
coded. Given that strings are usually very small, especially when manipulating
headers, the time spent calling a function optimized for large vectors often
ends up being higher than the few cycles needed to count a few bytes.

An issue was met with __builtin_strlen() which can automatically convert
a constant string to its constant length. It doesn't accept NULLs and there
is no way to hide them using expressions as the check is made before the
optimizer is called. On gcc 4 and above, using an intermediary variable
is enough to hide it. On older versions, calls to ist() with an explicit
NULL argument will issue a warning. There is normally no reason to do this
but taking care of it the best possible still seems important.
2017-08-18 13:38:47 +02:00
Willy Tarreau
2bfd35885e MINOR: stream: link the stream to its session
Now each stream is added to the session's list of streams, so that it
will be possible to know all the streams belonging to a session, and
to know if any stream is still attached to a sessoin.
2017-08-18 13:26:35 +02:00
Willy Tarreau
bcb86abaca MINOR: session: add a streams field to the session struct
This will be used to hold the list of streams belonging to a given session.
2017-08-18 13:26:35 +02:00
Willy Tarreau
82032f1223 MINOR: chunks: add chunk_memcpy() and chunk_memcat()
These two functions respectively copy a memory area onto the chunk, and
append the contents of a memory area over a chunk. They are convenient
to prepare binary output data to be sent and will be used for HTTP/2.
2017-08-18 13:26:20 +02:00