Commit Graph

773 Commits

Author SHA1 Message Date
Willy Tarreau
b306650c2a [RELEASE] Released version 1.9-dev0
Released version 1.9-dev0 with the following main changes :
    - BUG/MEDIUM: stream: don't automatically forward connect nor close
    - BUG/MAJOR: stream: ensure analysers are always called upon close
    - BUG/MINOR: stream-int: don't try to read again when CF_READ_DONTWAIT is set
    - MEDIUM: mworker: Add systemd `Type=notify` support
    - BUG/MEDIUM: cache: free callback to remove from tree
    - CLEANUP: cache: remove unused struct
    - MEDIUM: cache: enable the HTTP analysers
    - CLEANUP: cache: remove wrong comment
    - MINOR: threads/atomic: rename local variables in macros to avoid conflicts
    - MINOR: threads/plock: rename local variables in macros to avoid conflicts
    - MINOR: threads/atomic: implement pl_mb() in asm on x86
    - MINOR: threads/atomic: implement pl_bts() on non-x86
    - MINOR: threads/build: atomic: replace the few inlines with macros
    - BUILD: threads/plock: fix a build issue on Clang without optimization
    - BUILD: ebtree: don't redefine types u32/s32 in scope-aware trees
    - BUILD: compiler: add a new type modifier __maybe_unused
    - BUILD: h2: mark some inlined functions "unused"
    - BUILD: server: check->desc always exists
    - BUG/MEDIUM: h2: properly report connection errors in headers and data handlers
    - MEDIUM: h2: add a function to emit an HTTP/1 request from a headers list
    - MEDIUM: h2: change hpack_decode_headers() to only provide a list of headers
    - BUG/MEDIUM: h2: always reassemble the Cookie request header field
    - BUG/MINOR: systemd: ignore daemon mode
    - CONTRIB: spoa_example: allow to compile outside HAProxy.
    - CONTRIB: spoa_example: remove bref, wordlist, cond_wordlist
    - CONTRIB: spoa_example: remove last dependencies on type "sample"
    - CONTRIB: spoa_example: remove SPOE enums that are useless for clients
    - CLEANUP: cache: reorder includes
    - MEDIUM: shctx: use unsigned int for len and block_count
    - MEDIUM: cache: "show cache" on the cli
    - BUG/MEDIUM: cache: use key=0 as a condition for freeing
    - BUG/MEDIUM: cache: refcount forbids to free the objects
    - BUG/MEDIUM: cache fix cli_kws structure
    - BUG/MEDIUM: deinit: correctly deinitialize the proxy and global listener tasks
    - BUG/MINOR: ssl: Always start the handshake if we can't send early data.
    - MINOR: ssl: Don't disable early data handling if we could not write.
    - MINOR: pools: prepare functions to override malloc/free in pools
    - MINOR: pools: implement DEBUG_UAF to detect use after free
    - BUG/MEDIUM: threads/time: fix time drift correction
    - BUG/MEDIUM: threads/time: maintain a common time reference between all threads
    - MINOR: sample: Add "thread" sample fetch
    - BUG/MINOR: Use crt_base instead of ca_base when crt is parsed on a server line
    - BUG/MINOR: stream: fix tv_request calculation for applets
    - BUG/MAJOR: h2: always remove a stream from the send list before freeing it
    - BUG/MAJOR: threads/task: dequeue expired tasks under the WQ lock
    - MINOR: ssl: Handle reading early data after writing better.
    - MINOR: mux: Make sure every string is woken up after the handshake.
    - MEDIUM: cache: store sha1 for hashing the cache key
    - MINOR: http: implement the "http-request reject" rule
    - MINOR: h2: send RST_STREAM before GOAWAY on reject
    - MEDIUM: h2: don't gracefully close the connection anymore on Connection: close
    - MINOR: h2: make use of client-fin timeout after GOAWAY
    - MEDIUM: config: ensure that tune.bufsize is at least 16384 when using HTTP/2
    - MINOR: ssl: Handle early data with BoringSSL
    - BUG/MEDIUM: stream: always release the stream-interface on abort
    - BUG/MEDIUM: cache: free ressources in chn_end_analyze
    - MINOR: cache: move the refcount decrease in the applet release
    - BUG/MINOR: listener: Allow multiple "process" options on "bind" lines
    - MINOR: config: Support a range to specify processes in "cpu-map" parameter
    - MINOR: config: Slightly change how parse_process_number works
    - MINOR: config: Export parse_process_number and use it wherever it's applicable
    - MINOR: standard: Add my_ffsl function to get the position of the bit set to one
    - MINOR: config: Add auto-increment feature for cpu-map
    - MINOR: config: Support partial ranges in cpu-map directive
    - MINOR:: config: Remove thread-map directive
    - MINOR: config: Add the threads support in cpu-map directive
    - MINOR: config: Add threads support for "process" option on "bind" lines
    - MEDIUM: listener: Bind listeners on a thread subset if specified
    - CLEANUP: debug: Use DPRINTF instead of fprintf into #ifdef DEBUG_FULL/#endif
    - CLEANUP: log: Rename Alert/Warning in ha_alert/ha_warning
    - MINOR/CLEANUP: proxy: rename "proxy" to "proxies_list"
    - CLEANUP: pools: rename all pool functions and pointers to remove this "2"
    - DOC: update the roadmap file with the latest changes merged in 1.8
    - DOC: fix mangled version in peers protocol documentation
    - DOC: add initial peers protovol v2.0 documentation.
    - DOC: mention William as maintainer of the cache and master-worker
    - DOC: add Christopher and Emeric as maintainers of the threads
    - MINOR: cache: replace a fprint() by an abort()
    - MEDIUM: cache: max-age configuration keyword
    - DOC: explain HTTP2 timeout behavior
    - DOC: cache: configuration and management
    - MAJOR: mworker: exits the master on failure
    - BUG/MINOR: threads: don't drop "extern" on the lock in include files
    - MINOR: task: keep a pointer to the currently running task
    - MINOR: task: align the rq and wq locks
    - MINOR: fd: cache-align fdtab and fdcache locks
    - MINOR: buffers: cache-align buffer_wq_lock
    - CLEANUP: server: reorder some fields in struct server to save 40 bytes
    - CLEANUP: proxy: slightly reorder the struct proxy to reduce holes
    - CLEANUP: checks: remove 16 bytes of holes in struct check
    - CLEANUP: cache: more efficiently pack the struct cache
    - CLEANUP: fd: place the lock at the beginning of struct fdtab
    - CLEANUP: pools: align pools on a cache line
    - DOC: config: add a few bits about how to configure HTTP/2
    - BUG/MAJOR: threads/queue: avoid recursive locking in pendconn_get_next_strm()
    - BUILD: Makefile: reorder object files by size
2017-11-26 19:50:17 +01:00
Willy Tarreau
1ca1b70cf9 CLEANUP: pools: align pools on a cache line
There are just a few pools, and they're stressed a lot, so it makes
sense to dedicate them a cache line to avoid contention and to place
the lock at the beginning.
2017-11-26 11:10:53 +01:00
Willy Tarreau
53bae85b8e BUG/MINOR: threads: don't drop "extern" on the lock in include files
Commit 9dcf9b6 ("MINOR: threads: Use __decl_hathreads to declare locks")
accidently lost a few "extern" in certain lock declarations, possibly
causing certain entries to be declared at multiple places. Apparently
it hasn't caused any harm though.

The offending ones were :
  - fdtab_lock
  - fdcache_lock
  - poll_lock
  - buffer_wq_lock
2017-11-26 11:10:50 +01:00
Willy Tarreau
bafbe01028 CLEANUP: pools: rename all pool functions and pointers to remove this "2"
During the migration to the second version of the pools, the new
functions and pool pointers were all called "pool_something2()" and
"pool2_something". Now there's no more pool v1 code and it's a real
pain to still have to deal with this. Let's clean this up now by
removing the "2" everywhere, and by renaming the pool heads
"pool_head_something".
2017-11-24 17:49:53 +01:00
Christopher Faulet
767a84bcc0 CLEANUP: log: Rename Alert/Warning in ha_alert/ha_warning 2017-11-24 17:19:12 +01:00
Christopher Faulet
26028f6209 MINOR: config: Add auto-increment feature for cpu-map
The prefix "auto:" can be added before the process set to let HAProxy
automatically bind a process to a CPU by incrementing process and CPU sets. To
be valid, both sets must have the same size. No matter the declaration order of
the CPU sets, it will be bound from the lower to the higher bound.

  Examples:
      # all these lines bind the process 1 to the cpu 0, the process 2 to cpu 1
      #  and so on.
      cpu-map auto:1-4   0-3
      cpu-map auto:1-4   0-1 2-3
      cpu-map auto:1-4   3 2 1 0

      # bind each process to exaclty one CPU using all/odd/even keyword
      cpu-map auto:all   0-63
      cpu-map auto:even  0-31
      cpu-map auto:odd   32-63

      # invalid cpu-map because process and CPU sets have different sizes.
      cpu-map auto:1-4   0    # invalid
      cpu-map auto:1     0-3  # invalid
2017-11-24 15:38:49 +01:00
Christopher Faulet
ff8131861f MINOR: standard: Add my_ffsl function to get the position of the bit set to one 2017-11-24 15:38:49 +01:00
Christopher Faulet
f1f0c5f591 MINOR: config: Export parse_process_number and use it wherever it's applicable
This function is used when "bind-process" directive is parsed and when "process"
parameter on a "bind" or a "stats socket" line is parsed.
2017-11-24 15:38:49 +01:00
Willy Tarreau
158fa75811 MINOR: pools: implement DEBUG_UAF to detect use after free
This code has been used successfully a few times in the past to detect
that a pool was used after being freed. Its main goal is to allocate a
full page for each object so that they are always released individually
and unmapped from memory. This way if any part of the code reference the
object after is was freed and before it is reallocated, a segv occurs at
the exact offending location. It does a few extra things such as writing
to the memory area before freeing to detect double-frees and free of
read-only areas, and placing the data at the end of the page instead of
the beginning so that out of bounds accesses are easier to spot. The
amount of memory used with this is huge (about 10 times the regular
usage) but it can be useful sometimes.
2017-11-22 19:43:57 +01:00
Willy Tarreau
f13322ede1 MINOR: pools: prepare functions to override malloc/free in pools
This will be useful to add some debugging capabilities. For now it
changes nothing.
2017-11-22 19:27:44 +01:00
Willy Tarreau
59a10fb53d MEDIUM: h2: change hpack_decode_headers() to only provide a list of headers
The current H2 to H1 protocol conversion presents some issues which will
require to perform some processing on certain headers before writing them
so it's not possible to convert HPACK to H1 on the fly.

This commit modifies the headers decoding so that it now works in two
phases : hpack_decode_headers() only decodes the HPACK stream in the
HEADERS frame and puts the result into a list. Headers which require
storage (huffman-compressed or from the dynamic table) are stored in
a chunk allocated by the H2 demuxer. Then once the headers are properly
decoded into this list, h2_make_h1_request() is called with this list
to produce the HTTP/1.1 request into the destination buffer. The list
necessarily enforces a limit. Here we use 2*MAX_HTTP_HDR, which means
that we can have as many individual cookies as we have regular headers
if a client decides to break their cookies into multiple values. This
seams reasonable and will allow the H1 parser to decide whether it's
too much or not.

Thus the output stream is not produced on the fly anymore and this will
permit to deal with certain corner cases like reparing the Cookie header
(which for now is not done).

In order to limit header duplication and parsing, the known pseudo headers
continue to be passed by their index : the name element in the list then
has a NULL pointer and the value is the pseudo header's index. Given that
these ones represent about half of the incoming requests and need to be
found quickly, it maintains an acceptable level of performance.

The code was significantly reduced by doing this because the orignal code
had to deal with HPACK and H1 combinations (eg: index vs not indexed, etc)
and now the HPACK decoding is totally focused on the decompression, and
the H1 encoding doesn't have to deal with the issue of wrapping input for
example.

One bug was addressed here (though it couldn't happen at the moment). The
H2 demuxer used to detect a failure to write the request into the H1 buffer
and would then detect if the output buffer wraps, realign it and try again.
The problem by doing so was that the HPACK context was already modified and
not rewindable. Thus the size check is now performed first and a failure is
reported if it doesn't fit.
2017-11-21 21:13:36 +01:00
Willy Tarreau
f24ea8e45e MEDIUM: h2: add a function to emit an HTTP/1 request from a headers list
The current H2 to H1 protocol conversion presents some issues which will
require to perform some processing on certain headers before writing them
so it's not possible to convert HPACK to H1 on the fly.

Here we introduce a function which performs half of what hpack_decode_header()
used to do, which is to take a list of headers on input and emit the
corresponding request in HTTP/1.1 format. The code is the same and functions
were renamed to be prefixed with "h2" instead of "hpack", though it ends
up being simpler as the various HPACK-specific cases could be fused into
a single one (ie: add header).

Moving this part here makes a lot of sense as now this code is specific to
what is documented in HTTP/2 RFC 7540 and will be able to deal with special
cases related to H2 to H1 conversion enumerated in section 8.1.

Various error codes which were previously assigned to HPACK were never
used (aside being negative) and were all replaced by -1 with a comment
indicating what error was detected. The code could be further factored
thanks to this but this commit focuses on compatibility first.

This code is not yet used but builds fine.
2017-11-21 21:13:33 +01:00
Willy Tarreau
dbd25fc75a BUILD: compiler: add a new type modifier __maybe_unused
While gcc only emits warnings about unused static functions, Clang also
emits such a warning when the functions are inlined. This is a bit
annoying at certain places where functions are provided to manipulate
multiple data types and are not yet used. Let's have a type modifier
"__maybe_unused" which sets the "unused" attribute like the Linux kernel
does. It's elegant as it allows the code author to indicate that it knows
that this element might be unused. It works on variables as well, which
is convenient to remove ifdefs around local variables in certain functions,
but doesn't work on labels.
2017-11-20 21:27:27 +01:00
Willy Tarreau
9c1e15d8cd MINOR: tools: emphasize the node being worked on in the tree dump
Now we can show in dotted red the node being removed or surrounded in red
a node having been inserted, and add a description on the graph related to
the operation in progress for example.
2017-11-15 19:43:05 +01:00
Willy Tarreau
ed3cda02ae MINOR: tools: add a function to dump a scope-aware tree to a file
It emits a dump in DOT format for graphing purposes during debugging
sessions. It's convenient to dump the run queue.
2017-11-15 16:07:15 +01:00
Christopher Faulet
99bca65f53 BUG/MEDIUM: standard: itao_str/idx and quote_str/idx must be thread-local
This bug has an impact on the stats applet and easily leads to a crash of HAProxy.

This is specific to threads, no backport is needed.
2017-11-14 18:11:57 +01:00
Christopher Faulet
e9a896e09e BUG/MINOR: threads: tid_bit must be a unsigned long
This is specific to threads, no backport is needed.
2017-11-14 18:11:28 +01:00
Christopher Faulet
fa5c812a6b BUG/MINOR: buffers: Fix b_alloc_margin to be "fonctionnaly" thread-safe
b_alloc_margin is, strickly speeking, thread-safe. It will not crash
HAproxy. But its contract is not respected anymore in a multithreaded
environment. In this function, we need to be sure to have <margin> buffers
available in the pool after the allocation. So to have this guarantee, we must
lock the memory pool during all the operation. This also means, we must call
internal and lockless memory functions (prefixed with '__').

For the record, this patch fixes a pernicious bug happens after a soft reload
where some streams can be blocked infinitly, waiting for a buffer in the
buffer_wq list. This happens because, during a soft reload, pool_gc2 is called,
making some calls to b_alloc_fast fail.

This is specific to threads, no backport is needed.
2017-11-13 11:42:48 +01:00
Christopher Faulet
9dcf9b6f03 MINOR: threads: Use __decl_hathreads to declare locks
This macro should be used to declare variables or struct members depending on
the USE_THREAD compile option. It avoids the encapsulation of such declarations
between #ifdef/#endif. It is used to declare all lock variables.
2017-11-13 11:38:17 +01:00
Willy Tarreau
aa39860aef MINOR: tools: don't use unlikely() in hex2i()
This small inline function causes some pain to the compiler when used
inside other functions due to its use of the unlikely() hint for non-digits.
It causes the letters to be processed far away in the calling function and
makes the code less efficient. Removing these unlikely() hints has increased
the chunk size parsing by around 5%.
2017-11-10 11:19:54 +01:00
Emeric Brun
d8b3b65faa BUG/MEDIUM: splice/threads: pipe reuse list was not protected.
The list is now protected using a global spinlock.
2017-11-07 14:47:28 +01:00
Christopher Faulet
2a944ee16b BUILD: threads: Rename SPIN/RWLOCK macros using HA_ prefix
This remove any name conflicts, especially on Solaris.
2017-11-07 11:10:24 +01:00
Willy Tarreau
88ac59be4d MINOR: threads: use faster locks for the spin locks
The spin locks used to rely on W locks, which involve a loop waiting
for readers to leave, and this doesn't happen here. It's more efficient
to use S locks instead, which are also mutually exclusive and do not
have this loop. This saves one test per spinlock and a few tens of
bytes allowing certain functions to be inlined.
2017-11-06 11:20:11 +01:00
David Carlier
5222d8eb25 BUG/MINOR: stdarg.h inclusion
Needed for the memvprintf part, the va_list type.
Spotted during OpenBSD build.
2017-11-03 15:04:09 +01:00
Willy Tarreau
4b75fffa2b BUG/MAJOR: buffers: fix get_buffer_nc() for data at end of buffer
This function incorrectly dealt with the case where data doesn't
wrap but lies at the end of the buffer, resulting in Lukas' reported
data corruption with HTTP/2. No backport is needed, it was introduced
for HTTP/2 in 1.8-dev.
2017-11-02 17:16:07 +01:00
Willy Tarreau
7c2a2ad65c BUG/MINOR: thread: fix a typo in the debug code
__spin_unlock() used to call RWLOCK_WRUNLOCK() to unlock in the
debug code. It's harmless as they happen to be identical.
2017-11-02 16:26:02 +01:00
Willy Tarreau
ffca736401 MINOR: h2: centralize all HTTP/2 protocol elements and constants
These constants from RFC7540 will be centralized into common/h2.h for
use by the future h2 mux and other places.
2017-10-31 18:03:24 +01:00
Willy Tarreau
1be4f3d8af MEDIUM: hpack: implement basic hpack encoding
For now it only supports literals and a bit of static header table
references for the 9 most common header field names (date, server,
content-type, content-length, last-modified, accept-ranges, etag,
cache-control, location).

A previous incarnation of this commit used to strip the forbidden H2
header names (connection, proxy-connection, upgrade, transfer-encoding,
keep-alive) but this is no longer the case as this filtering is irrelevant
to HPACK encoding and is specific to H2, so this will have to be done by
the caller.

It's quite not optimal but works fine enough to prepare some valid and
partially compressed responses during development.
2017-10-31 18:03:24 +01:00
Willy Tarreau
679790baae MINOR: hpack: implement the decoder
The decoder is now fully functional. It makes use of the dynamic header
table. Dynamic header table size updates are currently ignored, as our
initially advertised value is the highest we support. Strictly speaking,
the impact is that a client referencing a header field after such an
update wouldn't observe an error instead of the connection being dropped
if it was implemented.

Decoded header fields are copied into a target buffer in HTTP/1 format
using HTTP/1.1 as the version. The Host header field is automatically
appended if a ":authority" header field is present.

All decoded header fields can be displayed if the file is compiled with
DEBUG_HPACK.
2017-10-31 18:03:24 +01:00
Willy Tarreau
ce04094c4a MINOR: hpack: implement the header tables management
This code deals with header insertion, retrieval and eviction, as well
as with dynamic header table defragmentation. It is functional for use
as a decoder and was heavily tested in this context. There's still some
room for optimization (eg: the defragmentation code currently does it
in place using a memcpy).

Also for now the dynamic header table is allocated using malloc() while
a pool needs to be created instead.

This code was mostly imported from https://github.com/wtarreau/http2-exp
with "hpack_" prepended in front of most names to avoid risks of conflicts.
Some small cleanups and renamings were applied during the import. This
version must be considered more recent.

Some HPACK error codes were placed here (HPACK_ERR_*), not exactly because
they're needed by the decoder but they'll be needed by all callers. Maybe
a different location should be found.
2017-10-31 18:03:24 +01:00
Willy Tarreau
a004ade512 MINOR: hpack: implement the HPACK Huffman table decoder
The code was borrowed from the HPACK experimental implementations
available here :

    https://github.com/wtarreau/http2-exp

It contains the Huffman table as specified in RFC7541 Appendix B, and a
set of reverse tables used to decode a Huffman byte stream, and produced
by contrib/h2/gen-rht. The encoder is not finalized, it doesn't emit the
byte stream but this is not needed for now.
2017-10-31 18:03:24 +01:00
Willy Tarreau
b29dc95a97 MINOR: threads: add a portable barrier for threads and non-threads
HA_BARRIER() is just a simple memory barrier to prevent the compiler
from reordering our code.
2017-10-31 18:01:18 +01:00
Christopher Faulet
cd7879adc2 BUG/MEDIUM: threads: Run the poll loop on the main thread too
There was a flaw in the way the threads was created. the main one was just used
to create all the others and just wait to exit. Now, it is used to run a poll
loop. So we only create nbthread-1 threads.

This also fixes a bug about the compression filter when there is only 1 thread
(nbthread == 1 or no threads support). The bug was in the way thread-local
resources was initialized. per-thread init/deinit callbacks were never called
for the main process. So, with nthread set to 1, some buffers remained
uninitialized.
2017-10-31 13:58:33 +01:00
Christopher Faulet
c2a89a6aed MINOR: threads/mailers: Add a lock to protect queues of email alerts 2017-10-31 13:58:33 +01:00
Christopher Faulet
cfda847643 MINOR: threads/checks: Add a lock to protect the pid list used by external checks 2017-10-31 13:58:33 +01:00
Christopher Faulet
b2812a6240 MEDIUM: thread/dns: Make DNS thread-safe 2017-10-31 13:58:33 +01:00
Christopher Faulet
24289f2e07 MEDIUM: thread/spoe: Make the SPOE thread-safe
Because there is not migration mechanism yet, all runtime information about an
SPOE agent are thread-local and async exchanges with agents are disabled when we
have serveral threads. Howerver, pipelining is still available. So for now, the
thread part of the SPOE is pretty simple.
2017-10-31 13:58:33 +01:00
Thierry FOURNIER
738a6d76f6 MEDIUM: threads/tasks: Add lock around notifications
This patch add lock around some notification calls
2017-10-31 13:58:32 +01:00
Thierry FOURNIER
952939d294 MEDIUM: threads/xref: Convert xref function to a thread safe model
Ensure that the unlink is done safely between thread and that
the peer struct will not destroy between the usage of the peer.
2017-10-31 13:58:32 +01:00
Thierry FOURNIER
61ba0e2b6d MEDIUM: threads/lua: Add locks around the Lua execution parts.
Note that the Lua processing is not really thread safe. It provides
heavy system which consists to add our own lock function in the Lua
code and recompile the library. This system will probably not accepted
by maintainers of various distribs.

Our main excution point of the Lua is the function lua_resume(). A
quick looking on the Lua sources displays a lua_lock() a the start
of function and a lua_unlock() at the end of the function. So I
conclude that the Lua thread safe mode just perform a mutex around
all execution. So I prefer to do this in the HAProxy code, it will be
easier for distro maintainers.

Note that the HAProxy lua functions rounded by the macro SET_SAFE_LJMP
and RESET_SAFE_LJMP manipulates the Lua stack, so it will be careful
to set mutex around these functions.
2017-10-31 13:58:32 +01:00
Christopher Faulet
8ca3b4bc46 MEDIUM: threads/compression: Make HTTP compression thread-safe 2017-10-31 13:58:32 +01:00
Christopher Faulet
e95f2c3ef5 MEDIUM: thread/vars: Make vars thread-safe
A RW lock has been added to the vars structure to protect each list of
variables. And a global RW lock is used to protect registered names.

When a varibable is fetched, we duplicate sample data because the variable could
be modified by another thread.
2017-10-31 13:58:32 +01:00
Emeric Brun
b5997f740b MAJOR: threads/map: Make acls/maps thread safe
locks have been added in pat_ref and pattern_expr structures to protect all
accesses to an instance of on of them. Moreover, a global lock has been added to
protect the LRU cache used for pattern matching.

Patterns are now duplicated after a successfull matching, to avoid modification
by other threads when the result is used.

Finally, the function reloading a pattern list has been modified to be
thread-safe.
2017-10-31 13:58:32 +01:00
Emeric Brun
821bb9beaa MAJOR: threads/ssl: Make SSL part thread-safe
First, OpenSSL is now initialized to be thread-safe. This is done by setting 2
callbacks. The first one is ssl_locking_function. It handles the locks and
unlocks. The second one is ssl_id_function. It returns the current thread
id. During the init step, we create as much as R/W locks as needed, ie the
number returned by CRYPTO_num_locks function.

Next, The reusable SSL session in the server context is now thread-local.

Shctx is now also initialized if HAProxy is started with several threads.

And finally, a global lock has been added to protect the LRU cache used to store
generated certificates. The function ssl_sock_get_generated_cert is now
deprecated because the retrieved certificate can be removed by another threads
in same time. Instead, a new function has been added,
ssl_sock_assign_generated_cert. It must be used to search a certificate in the
cache and set it immediatly if found.
2017-10-31 13:58:32 +01:00
Emeric Brun
6b35e9bfbf MEDIUM: threads/stream: Make streams list thread safe
Adds a global lock to protect the full streams list used to dump
sessions on stats socket.
2017-10-31 13:58:32 +01:00
Emeric Brun
a1dd243adb MAJOR: threads/buffer: Make buffer wait queue thread safe
Adds a global lock to protect the buffer wait queue.
2017-10-31 13:58:31 +01:00
Emeric Brun
80527f5bb6 MAJOR: threads/peers: Make peers thread safe
A lock is used to protect accesses to a peer structure.

A the lock is taken in the applet handler when the peer is identified
and released living the applet handler.

In the scheduling task for peers section, the lock is taken for every
listed peer and released at the end of the process task function.

The peer 'force shutdown' function was also re-worked.
2017-10-31 13:58:31 +01:00
Emeric Brun
1138fd0c57 MAJOR: threads/applet: Handle multithreading for applets
A global lock has been added to protect accesses to the list of active
applets. A process mask has also been added on each applet. Like for FDs and
tasks, it is used to know which threads are allowed to process an
applet. Because applets are, most of time, linked to a session, it should be
sticky on the same thread. But in all cases, it is the responsibility of the
applet handler to lock what have to be protected in the applet context.
2017-10-31 13:58:31 +01:00
Emeric Brun
272e252e61 MINOR: threads/regex: Change Regex trash buffer into a thread local variable 2017-10-31 13:58:31 +01:00
Emeric Brun
819fc6f563 MEDIUM: threads/stick-tables: handle multithreads on stick tables
The stick table API was slightly reworked:

A global spin lock on stick table was added to perform lookup and
insert in a thread safe way. The handling of refcount on entries
is now handled directly by stick tables functions under protection
of this lock and was removed from the code of callers.

The "stktable_store" function is no more externalized and users should
now use "stktable_set_entry" in any case of insertion. This last one performs
a lookup followed by a store if not found. So the code using "stktable_store"
was re-worked.

Lookup, and set_entry functions automatically increase the refcount
of the returned/stored entry.

The function "sticktable_touch" was renamed "sticktable_touch_local"
and is now able to decrease the refcount if last arg is set to true. It
is allowing to release the entry without taking the lock twice.

A new function "sticktable_touch_remote" is now used to insert
entries coming from remote peers at the right place in the update tree.
The code of peer update was re-worked to use this new function.
This function is also able to decrease the refcount if wanted.

The function "stksess_kill" also handle a parameter to decrease
the refcount on the entry.

A read/write lock is added on each entry to protect the data content
updates of the entry.
2017-10-31 13:58:31 +01:00
Christopher Faulet
5b51755aef MEDIUM: threads/lb: Make LB algorithms (lb_*.c) thread-safe
A lock for LB parameters has been added inside the proxy structure and atomic
operations have been used to update server variables releated to lb.

The only significant change is about lb_map. Because the servers status are
updated in the sync-point, we can call recalc_server_map function synchronously
in map_set_server_status_up/down function.
2017-10-31 13:58:31 +01:00
Christopher Faulet
5d42e099c5 MINOR: threads/server: Add a lock to deal with insert in updates_servers list
This list is used to save changes on the servers state. So when serveral threads
are used, it must be locked. The changes are then applied in the sync-point. To
do so, servers_update_status has be moved in the sync-point. So this is useless
to lock it at this step because the sync-point is a protected area by iteself.
2017-10-31 13:58:31 +01:00
Christopher Faulet
29f77e846b MEDIUM: threads/server: Add a lock per server and atomically update server vars
The server's lock is use, among other things, to lock acces to the active
connection list of a server.
2017-10-31 13:58:31 +01:00
Christopher Faulet
ff8abcd31d MEDIUM: threads/proxy: Add a lock per proxy and atomically update proxy vars
Now, each proxy contains a lock that must be used when necessary to protect
it. Moreover, all proxy's counters are now updated using atomic operations.
2017-10-31 13:58:30 +01:00
Christopher Faulet
8d8aa0d681 MEDIUM: threads/listeners: Make listeners thread-safe
First, we use atomic operations to update jobs/totalconn/actconn variables,
listener's nbconn variable and listener's counters. Then we add a lock on
listeners to protect access to their information. And finally, listener queues
(global and per proxy) are also protected by a lock. Here, because access to
these queues are unusal, we use the same lock for all queues instead of a global
one for the global queue and a lock per proxy for others.
2017-10-31 13:58:30 +01:00
Christopher Faulet
b79a94c9f3 MEDIUM: threads/signal: Add a lock to make signals thread-safe
A global lock has been added to protect the signal processing. So when a signal
it triggered, only one thread will catch it.
2017-10-31 13:58:30 +01:00
Emeric Brun
c60def8368 MAJOR: threads/task: handle multithread on task scheduler
2 global locks have been added to protect, respectively, the run queue and the
wait queue. And a process mask has been added on each task. Like for FDs, this
mask is used to know which threads are allowed to process a task.

For many tasks, all threads are granted. And this must be your first intension
when you create a new task, else you have a good reason to make a task sticky on
some threads. This is then the responsibility to the process callback to lock
what have to be locked in the task context.

Nevertheless, all tasks linked to a session must be sticky on the thread
creating the session. It is important that I/O handlers processing session FDs
and these tasks run on the same thread to avoid conflicts.
2017-10-31 13:58:30 +01:00
Christopher Faulet
d4604adeaa MAJOR: threads/fd: Make fd stuffs thread-safe
Many changes have been made to do so. First, the fd_updt array, where all
pending FDs for polling are stored, is now a thread-local array. Then 3 locks
have been added to protect, respectively, the fdtab array, the fd_cache array
and poll information. In addition, a lock for each entry in the fdtab array has
been added to protect all accesses to a specific FD or its information.

For pollers, according to the poller, the way to manage the concurrency is
different. There is a poller loop on each thread. So the set of monitored FDs
may need to be protected. epoll and kqueue are thread-safe per-se, so there few
things to do to protect these pollers. This is not possible with select and
poll, so there is no sharing between the threads. The poller on each thread is
independant from others.

Finally, per-thread init/deinit functions are used for each pollers and for FD
part for manage thread-local ressources.

Now, you must be carefull when a FD is created during the HAProxy startup. All
update on the FD state must be made in the threads context and never before
their creation. This is mandatory because fd_updt array is thread-local and
initialized only for threads. Because there is no pollers for the main one, this
array remains uninitialized in this context. For this reason, listeners are now
enabled in run_thread_poll_loop function, just like the worker pipe.
2017-10-31 13:58:30 +01:00
Christopher Faulet
b349e48ede MEDIUM: threads/pool: Make pool thread-safe by locking all access to a pool
A lock has been added for each memory pool. It is used to protect the pool
during allocations and releases. It is also used when pool info are dumped.
2017-10-31 13:58:30 +01:00
Christopher Faulet
9a65571781 MEDIUM: threads/time: Many global variables from time.h are now thread-local 2017-10-31 13:58:30 +01:00
Christopher Faulet
339fff8a18 MEDIUM: threads: Adds a set of functions to handle sync-point
A sync-point is a protected area where you have the warranty that no concurrency
access is possible. It is implementated as a thread barrier to enter in the
sync-point and another one to exit from it. Inside the sync-point, all threads
that must do some syncrhonous processing will be called one after the other
while all other threads will wait. All threads will then exit from the
sync-point at the same time.

A sync-point will be evaluated only when necessary because it is a costly
operation. To limit the waiting time of each threads, we must have a mechanism
to wakeup all threads. This is done with a pipe shared by all threads. By
writting in this pipe, we will interrupt all threads blocked on a poller. The
pipe is then flushed before exiting from the sync-point.
2017-10-31 13:58:29 +01:00
Christopher Faulet
1a2b56ea8e MEDIUM: threads: Add hathreads header file
This file contains all functions and macros used to deal with concurrency in
HAProxy. It contains all high-level function to do atomic operation
(HA_ATOMIC_*). Note, for now, we rely on "__atomic" GCC builtins to do atomic
operation. So HAProxy can be compiled with the thread support iff these builtins
are available.

It also contains wrappers around plocks to use spin or read/write locks. These
wrappers are used to abstract the internal representation of the locking system
and to add information to help debugging, when compiled with suitable
options.

To add extra info on locks, you need to add DEBUG=-DDEBUG_THREAD or
DEBUG=-DDEBUG_FULL compilation option. In addition to timing info on locks, we
keep info on where a lock was acquired the last time (function name, file and
line). There are also the thread id and a flag to know if it is still locked or
not. This will be useful to debug deadlocks.
2017-10-31 13:58:23 +01:00
Christopher Faulet
e9bd686b68 MINOR: threads: Add THREAD_LOCAL macro
When compiled with threads support, this marco is set to __thread. Else it is
empty.
2017-10-31 11:36:13 +01:00
Christopher Faulet
93a518f02a MINOR: standard: Add memvprintf function
Now memprintf relies on memvprintf. This new function does exactly what
memprintf did before, but it must be called with a va_list instead of a variable
number of arguments. So there is no change for every functions using
memprintf. But it is now also possible to have same functionnality from any
function with variadic arguments.
2017-10-31 11:36:12 +01:00
William Lallemand
83215a44b8 MEDIUM: lists: list_for_each_entry{_safe}_from functions
Add list_for_each_entry_from and list_for_each_entry_safe_from which
allows to iterate in a list starting from a specific item.
2017-10-31 03:44:11 +01:00
William Lallemand
48b4bb4b09 MEDIUM: cfgparse: post parsing registration
Allow to register a function which will be called after the
configuration file parsing, at the end of the check_config_validity().

It's useful fo checking dependencies between sections or for resolving
keywords, pointers or values.
2017-10-27 10:15:56 +02:00
William Lallemand
d2ff56d2a3 MEDIUM: cfgparse: post section callback
This commit implements a post section callback. This callback will be
used at the end of a section parsing.

Every call to cfg_register_section must be modified to use the new
prototype:

    int cfg_register_section(char *section_name,
                             int (*section_parser)(const char *, int, char **, int),
                             int (*post_section_parser)());
2017-10-27 10:14:51 +02:00
Willy Tarreau
145746c2d5 MINOR: buffer: add the buffer input manipulation functions
We used to have bo_{get,put}_{chr,blk,str} to retrieve/send data to
the output area of a buffer, but not the equivalent ones for the input
area. This will be needed to copy uploaded data frames in HTTP/2.
2017-10-27 10:00:17 +02:00
Willy Tarreau
1296382d0b CONTRIB: trace: add the possibility to place trace calls in the code
Now any call to trace() in the code will automatically appear interleaved
with the call sequence and timestamped in the trace file. They appear with
a '#' on the 3rd argument (caller's pointer) in order to make them easy to
spot. If the trace functionality is not used, a dmumy weak function is used
instead so that it doesn't require to recompile every time traces are
enabled/disabled.

The trace decoder knows how to deal with these messages, detects them and
indents them similarly to the currently traced function. This can be used
to print function arguments for example.

Note that we systematically flush the log when calling trace() to ensure we
never miss important events, so this may impact performance.

The trace() function uses the same format as printf() so it should be easy
to setup during debugging sessions.
2017-10-24 19:54:25 +02:00
Willy Tarreau
306924ecb8 MINOR: http: add very simple header management based on double strings
This will be used initially by the hpack table and hopefully later by a
new native http processor. These headers are made of name and value, both
an immediate string (ie: pointer and length).
2017-10-22 09:54:14 +02:00
Willy Tarreau
0621da5f5b MINOR: buffer: make bo_getblk_nc() not return 2 for a full buffer
Thus function returns the number of blocks. When a buffer is full and
properly aligned, buf->p loops back the beginning, and the test in the
code doesn't cover that specific case, so it returns two chunks, a full
one and an empty one. It's harmless but can sometimes have a small impact
on performance and definitely makes the code hard to debug.
2017-10-22 09:54:12 +02:00
Willy Tarreau
e67c4e5744 MINOR: ist: add ist0() to add a trailing zero to a string.
This function modifies the string to add a zero after the end, and returns
the start pointer. The purpose is to use it on strings extracted by parsers
from larger strings cut with delimiters that are not important and can be
destroyed. It allows any such string to be used with regular string
functions. It's also convenient to use with printf() to show data extracted
from writable areas.
2017-10-19 15:01:08 +02:00
Willy Tarreau
e0e734ccc5 MINOR: buffer: add bo_getblk() and bo_getblk_nc()
These functions respectively extract a block from an output buffer by
copying it or by just passing pointers and lengths for zero copy operation.
2017-10-19 15:01:08 +02:00
Willy Tarreau
5b9834f12a MINOR: buffer: add buffer_space_wraps()
This function returns true if the available buffer space wraps. This
will be used to detect if it's worth realigning a buffer when it lacks
contigous space.
2017-10-19 15:01:08 +02:00
Willy Tarreau
e5676e7103 MINOR: buffer: add two functions to inject data into buffers
bi_istput() injects the ist string into the input region of the buffer,
it will be used to feed small data chunks into the conn_stream. bo_istput()
does the same into the output region of the buffer, it will be used to send
data via the transport layer and assumes there's no input data.
2017-10-19 15:01:08 +02:00
Willy Tarreau
6634b63c78 MINOR: buffer: add a function to match against string patterns
In order to match known patterns in wrapping buffer, we'll introduce new
string manipulation functions for buffers. The new function b_isteq()
relies on an ist string for the pattern and compares it against any
location in the buffer relative to <p>. The second function bi_eat()
is specially designed to match input contents.
2017-10-19 15:01:07 +02:00
Willy Tarreau
7f564d2b60 MINOR: buffer: add bo_del() to delete a number of characters from output
This simply reduces the amount of output data from the buffer after
they have been transferred, in a way that is more natural than by
fiddling with buf->o. b_del() was renamed to bi_del() to avoid any
ambiguity (it's not yet used).
2017-10-19 15:01:07 +02:00
Willy Tarreau
dea7c5c03d BUG/MINOR: tools: fix my_htonll() on x86_64
Commit 36eb3a3 ("MINOR: tools: make my_htonll() more efficient on x86_64")
brought an incorrect asm statement missing the input constraints, causing
the input value not necessarily to be placed into the same register as the
output one, resulting in random output. It happens to work when building at
-O0 but not above. This was only detected in the HTTP/2 parser, but in
mainline it could only affect the integer to binary sample cast.

No backport is needed since this bug was only introduced in the development
branch.
2017-10-18 11:46:17 +02:00
Willy Tarreau
c939835f77 MINOR: compiler: restore the likely() wrapper for gcc 5.x
After some tests, gcc 5.x produces better code with likely()
than without, contrary to gcc 4.x where it was better to disable
it. Let's re-enable it for 5 and above.
2017-10-08 22:32:05 +02:00
Willy Tarreau
2ba672726c MINOR: ist: add a macro to ease const array initialization
It's not possible to use strlen() in const arrays even with const
strings, but we can use sizeof-1 via a macro. Let's provide this in
the IST() macro, as it saves the developer from having to count the
characters.
2017-09-21 15:32:31 +02:00
Willy Tarreau
5531d5732d MINOR: net_helper: add 64-bit read/write functions
These ones are the same as the previous ones but for 64 bit values.
We're using my_ntohll() and my_htonll() from standard.h for the byte
order conversion.
2017-09-21 06:27:08 +02:00
Willy Tarreau
2888c08346 MINOR: net_helper: add write functions
These ones are the equivalent of the read_* functions. They support
writing unaligned words, possibly wrapping, in host and network order.
The write_i*() functions were not implemented since the caller can
already use the unsigned version.
2017-09-21 06:25:10 +02:00
Willy Tarreau
d5370e1d6c MINOR: net_helper: add functions to read from vectors
This patch adds the ability to read from a wrapping memory area (ie:
buffers). The new functions are called "readv_<type>". The original
ones were renamed to start with "read_" to make the difference more
obvious between the read method and the returned type.

It's worth noting that the memory barrier in readv_bytes() is critical,
as otherwise gcc decides that it doesn't need the resulting data, but
even worse, removes the length checks in readv_u64() and happily
performs an out-of-bounds unaligned read using read_u64()! Such
"optimizations" are a bit borderline, especially when they impact
security like this...
2017-09-20 11:27:31 +02:00
Willy Tarreau
26488ad358 MINOR: buffer: add b_end() and b_to_end()
These ones return respectively the pointer to the end of the buffer and
the distance between b->p and the end. These will simplify a bit some
new code needed to parse directly from a wrapping buffer.
2017-09-20 11:27:31 +02:00
Willy Tarreau
4a6425d373 MINOR: buffer: add b_del() to delete a number of characters
This will be used by code which directly parses buffers with no channel
in the middle (eg: h2, might be used by checks as well).
2017-09-20 11:27:31 +02:00
Willy Tarreau
36eb3a3ac8 MINOR: tools: make my_htonll() more efficient on x86_64
The current construct was made when developing on a 32-bit machine.
Having a simple bswap operation replaced with 2 bswap, 2 shift and
2 or is quite of a waste of precious cycles... Let's provide a trivial
asm-based implementation for x86_64.
2017-09-20 11:27:31 +02:00
Olivier Houchard
ed0d96cac4 MINOR: net_helper: Inline functions meant to be inlined. 2017-09-13 13:35:35 +02:00
Thierry FOURNIER
3c65b7a916 MINOR: xref: Add a new xref system
xref is used to create a relation between two elements.
Once an element is released, it breaks the relation. If the
relation is already broken, it frees the xref struct.
The pointer between two elements is a sort of refcount with
max value 1. The relation is only between two elements.
The pointer and the type of element a and b are conventional.

Note that xref is initialised from Lua files because Lua is
the only one user.
2017-09-11 18:59:40 +02:00
Christopher Faulet
ad405f1714 MINOR: buffers: Move swap_buffer into buffer.c and add deinit_buffer function
swap_buffer is a global variable only used by buffer_slow_realign. So it has
been moved from global.h to buffer.c and it is allocated by init_buffer
function. deinit_buffer function has been added to release it. It is also used
to destroy the buffers' pool.
2017-09-05 10:34:30 +02:00
Christopher Faulet
748919a4c7 MINOR: chunks: Use dedicated function to init/deinit trash buffers
Now, we use init_trash_buffers and deinit_trash_buffers to, respectively,
initialize and deinitialize trash buffers (trash, trash_buf1 and trash_buf2).

These functions have been introduced to be used by threads, to deal with
thread-local trash buffers.
2017-09-05 10:22:20 +02:00
Christopher Faulet
ae459fd206 CLEANUP: memory: Remove unused function pool_destroy
This one was never used.
2017-09-05 10:13:20 +02:00
Willy Tarreau
e11f727c95 MINOR: ist: implement very simple indirect strings
For HPACK we'll need to perform a lot of string manipulation between the
dynamic headers table and the output stream, and we need an efficient way
to deal with that, considering that the zero character is not an end of
string marker here. It turns out that gcc supports returning structs from
functions and is able to place up to two words directly in registers when
-freg-struct is used, which is the case by default on x86 and armv8. On
other architectures the caller reserves some stack space where the callee
can write, which is equivalent to passing a pointer to the return value.

So let's implement a few functions to deal with this as the resulting code
will be optimized on certain architectures where retrieving the length of
a string will simply consist in reading one of the two returned registers.

Extreme care was taken to ensure that the compiler gets maximum opportunities
to optimize out every bit of unused code. This is also the reason why no
call to regular string functions (such as strlen(), memcmp(), memcpy() etc)
were used. The code involving them is often larger than when they are open
coded. Given that strings are usually very small, especially when manipulating
headers, the time spent calling a function optimized for large vectors often
ends up being higher than the few cycles needed to count a few bytes.

An issue was met with __builtin_strlen() which can automatically convert
a constant string to its constant length. It doesn't accept NULLs and there
is no way to hide them using expressions as the check is made before the
optimizer is called. On gcc 4 and above, using an intermediary variable
is enough to hide it. On older versions, calls to ist() with an explicit
NULL argument will issue a warning. There is normally no reason to do this
but taking care of it the best possible still seems important.
2017-08-18 13:38:47 +02:00
Willy Tarreau
82032f1223 MINOR: chunks: add chunk_memcpy() and chunk_memcat()
These two functions respectively copy a memory area onto the chunk, and
append the contents of a memory area over a chunk. They are convenient
to prepare binary output data to be sent and will be used for HTTP/2.
2017-08-18 13:26:20 +02:00
Olivier Houchard
e962fd880d Add a few functions to do unaligned access.
Add a few functions to read 16bits and 32bits integers that may be
unaligned, both in host and network order.
2017-08-09 16:32:49 +02:00
David Carlier
b781dbede3 MINOR: memory: remove macros
We finally get rid of the macros and use usual memory management
functions directly.
2017-07-21 09:54:03 +02:00
Willy Tarreau
cb1949b8b3 MINOR: tools: add a portable timegm() alternative
timegm() is not provided everywhere and the documentation on how to
replace it is bogus as it proposes an inefficient and non-thread safe
alternative.

Here we reimplement everything needed to compute the number of seconds
since Epoch based on the broken down fields in struct tm. It is only
guaranteed to return correct values for correct inputs. It was successfully
tested with all possible 32-bit values of time_t converted to struct tm
using gmtime() and back to time_t using the legacy timegm() and this
function, and both functions always produced the same result.

Thanks to Benot Garnier for an instructive discussion and detailed
explanations of the various time functions, leading to this solution.
2017-07-19 19:15:06 +02:00
Christopher Faulet
a36b311b9f BUG/MINOR: buffers: Fix bi/bo_contig_space to handle full buffers
These functions was added in commit 637f8f2c ("BUG/MEDIUM: buffers: Fix how
input/output data are injected into buffers").

This patch fixes hidden bugs. When a buffer is full (buf->i + buf->o ==
buf->size), instead of returning 0, these functions can return buf->size. Today,
this never happens because callers already check if the buffer is full before
calling bi/bo_contig_space. But to avoid possible bugs if calling conditions
changed, we slightly refactored these functions.
2017-06-14 16:20:20 +02:00
Willy Tarreau
ed936c5d37 MINOR: tools: make debug_hexdump() take a string prefix
When dumping data at various places in the code, it's hard to figure
what is present where. To make this easier, this patch slightly modifies
debug_hexdump() to take a prefix string which is prepended in front of
each output line.
2017-06-02 15:49:31 +02:00
Willy Tarreau
9faef1e391 MINOR: tools: make debug_hexdump() use a const char for the string
There's no reason the string to be dumped should be a char *, it's
a const.
2017-06-02 15:49:31 +02:00
Jarno Huuskonen
577d5ac8ae CLEANUP: str2mask return code comment: non-zero -> zero. 2017-06-02 15:43:46 +02:00
Stphane Cottin
23e9e93128 MINOR: log: Add logurilen tunable.
The default len of request uri in log messages is 1024. In some use
cases, you need to keep the long trail of GET parameters. The only
way to increase this len is to recompile with DEFINE=-DREQURI_LEN=2048.

This commit introduces a tune.http.logurilen configuration directive,
allowing to tune this at runtime.
2017-06-02 11:06:36 +02:00
Lukas Tribus
23953686da DOC: update RFC references
A few doc and code comment updates bumping RFC references to the new
ones.
2017-04-28 18:58:11 +02:00
Thierry FOURNIER
6ab2bae084 REORG: spoe: move spoe_encode_varint / spoe_decode_varint from spoe to common
These encoding functions does general stuff and can be used in
other context than spoe. This patch moves the function spoe_encode_varint
and spoe_decode_varint from spoe to common. It also remove the prefix spoe.

These functions will be used for encoding values in new binary sample fetch.
2017-04-27 11:50:41 +02:00
Frdric Lcaille
b82f742b78 MINOR: server: Add 'server-template' new keyword supported in backend sections.
This patch makes backend sections support 'server-template' new keyword.
Such 'server-template' objects are parsed similarly to a 'server' object
by parse_server() function, but its first arguments are as follows:
    server-template <ID prefix> <nb | range> <ip | fqdn>:<port> ...

The remaining arguments are the same as for 'server' lines.

With such server template declarations, servers may be allocated with IDs
built from <ID prefix> and <nb | range> arguments.

For instance declaring:
    server-template foo 1-5 google.com:80 ...
or
    server-template foo 5 google.com:80 ...

would be equivalent to declare:
    server foo1 google.com:80 ...
    server foo2 google.com:80 ...
    server foo3 google.com:80 ...
    server foo4 google.com:80 ...
    server foo5 google.com:80 ...
2017-04-21 15:42:10 +02:00
Willy Tarreau
7b677265fd [RELEASE] Released version 1.8-dev1
Released version 1.8-dev1 with the following main changes :
    - BUG/MEDIUM: proxy: return "none" and "unknown" for unknown LB algos
    - BUG/MINOR: stats: make field_str() return an empty string on NULL
    - DOC: Spelling fixes
    - BUG/MEDIUM: http: Fix tunnel mode when the CONNECT method is used
    - BUG/MINOR: http: Keep the same behavior between 1.6 and 1.7 for tunneled txn
    - BUG/MINOR: filters: Protect args in macros HAS_DATA_FILTERS and IS_DATA_FILTER
    - BUG/MINOR: filters: Invert evaluation order of HTTP_XFER_BODY and XFER_DATA analyzers
    - BUG/MINOR: http: Call XFER_DATA analyzer when HTTP txn is switched in tunnel mode
    - BUG/MAJOR: stream: fix session abort on resource shortage
    - OPTIM: stream-int: don't disable polling anymore on DONT_READ
    - BUG/MINOR: cli: allow the backslash to be escaped on the CLI
    - BUG/MEDIUM: cli: fix "show stat resolvers" and "show tls-keys"
    - DOC: Fix map table's format
    - DOC: Added 51Degrees conv and fetch functions to documentation.
    - BUG/MINOR: http: don't send an extra CRLF after a Set-Cookie in a redirect
    - DOC: mention that req_tot is for both frontends and backends
    - BUG/MEDIUM: variables: some variable name can hide another ones
    - MINOR: lua: Allow argument for actions
    - BUILD: rearrange target files by build time
    - CLEANUP: hlua: just indent functions
    - MINOR: lua: give HAProxy variable access to the applets
    - BUG/MINOR: stats: fix be/sessions/max output in html stats
    - MINOR: proxy: Add fe_name/be_name fetchers next to existing fe_id/be_id
    - DOC: lua: Documentation about some entry missing
    - DOC: lua: Add documentation about variable manipulation from applet
    - MINOR: Do not forward the header "Expect: 100-continue" when the option http-buffer-request is set
    - DOC: Add undocumented argument of the trace filter
    - DOC: Fix some typo in SPOE documentation
    - MINOR: cli: Remove useless call to bi_putchk
    - BUG/MINOR: cli: be sure to always warn the cli applet when input buffer is full
    - MINOR: applet: Count number of (active) applets
    - MINOR: task: Rename run_queue and run_queue_cur counters
    - BUG/MEDIUM: stream: Save unprocessed events for a stream
    - BUG/MAJOR: Fix how the list of entities waiting for a buffer is handled
    - BUILD/MEDIUM: Fixing the build using LibreSSL
    - BUG/MEDIUM: lua: In some case, the return of sample-fetches is ignored (2)
    - SCRIPTS: git-show-backports: fix a harmless typo
    - SCRIPTS: git-show-backports: add -H to use the hash of the commit message
    - BUG/MINOR: stream-int: automatically release SI_FL_WAIT_DATA on SHUTW_NOW
    - CLEANUP: applet/lua: create a dedicated ->fcn entry in hlua_cli context
    - CLEANUP: applet/table: add an "action" entry in ->table context
    - CLEANUP: applet: remove the now unused appctx->private field
    - DOC: lua: documentation about time parser functions
    - DOC: lua: improve links
    - DOC: lua: section declared twice
    - MEDIUM: cli: 'show cli sockets' list the CLI sockets
    - BUG/MINOR: cli: "show cli sockets" wouldn't list all processes
    - BUG/MINOR: cli: "show cli sockets" would always report process 64
    - CLEANUP: lua: rename one of the lua appctx union
    - BUG/MINOR: lua/cli: bad error message
    - MEDIUM: lua: use memory pool for hlua struct in applets
    - MINOR: lua/signals: Remove Lua part from signals.
    - DOC: cli: show cli sockets
    - MINOR: cli: automatically enable a CLI I/O handler when there's no parser
    - CLEANUP: memory: remove the now unused cli_parse_show_pools() function
    - CLEANUP: applet: group all CLI contexts together
    - CLEANUP: stats: move a misplaced stats context initialization
    - MINOR: cli: add two general purpose pointers and integers in the CLI struct
    - MINOR: appctx/cli: remove the cli_socket entry from the appctx union
    - MINOR: appctx/cli: remove the env entry from the appctx union
    - MINOR: appctx/cli: remove the "be" entry from the appctx union
    - MINOR: appctx/cli: remove the "dns" entry from the appctx union
    - MINOR: appctx/cli: remove the "server_state" entry from the appctx union
    - MINOR: appctx/cli: remove the "tlskeys" entry from the appctx union
    - CONTRIB: tcploop: add limits.h to fix build issue with some compilers
    - MINOR/DOC: lua: just precise one thing
    - DOC: fix small typo in fe_id (backend instead of frontend)
    - BUG/MINOR: Fix the sending function in Lua's cosocket
    - BUG/MINOR: lua: memory leak executing tasks
    - BUG/MINOR: lua: bad return code
    - BUG/MINOR: lua: memleak when Lua/cli fails
    - MEDIUM: lua: remove Lua struct from session, and allocate it with memory pools
    - CLEANUP: haproxy: statify unexported functions
    - MINOR: haproxy: add a registration for build options
    - CLEANUP: wurfl: use the build options list to report it
    - CLEANUP: 51d: use the build options list to report it
    - CLEANUP: da: use the build options list to report it
    - CLEANUP: namespaces: use the build options list to report it
    - CLEANUP: tcp: use the build options list to report transparent modes
    - CLEANUP: lua: use the build options list to report it
    - CLEANUP: regex: use the build options list to report the regex type
    - CLEANUP: ssl: use the build options list to report the SSL details
    - CLEANUP: compression: use the build options list to report the algos
    - CLEANUP: auth: use the build options list to report its support
    - MINOR: haproxy: add a registration for post-check functions
    - CLEANUP: checks: make use of the post-init registration to start checks
    - CLEANUP: filters: use the function registration to initialize all proxies
    - CLEANUP: wurfl: make use of the late init registration
    - CLEANUP: 51d: make use of the late init registration
    - CLEANUP: da: make use of the late init registration code
    - MINOR: haproxy: add a registration for post-deinit functions
    - CLEANUP: wurfl: register the deinit function via the dedicated list
    - CLEANUP: 51d: register the deinitialization function
    - CLEANUP: da: register the deinitialization function
    - CLEANUP: wurfl: move global settings out of the global section
    - CLEANUP: 51d: move global settings out of the global section
    - CLEANUP: da: move global settings out of the global section
    - MINOR: cfgparse: add two new functions to check arguments count
    - MINOR: cfgparse: move parsing of "ca-base" and "crt-base" to ssl_sock
    - MEDIUM: cfgparse: move all tune.ssl.* keywords to ssl_sock
    - MEDIUM: cfgparse: move maxsslconn parsing to ssl_sock
    - MINOR: cfgparse: move parsing of ssl-default-{bind,server}-ciphers to ssl_sock
    - MEDIUM: cfgparse: move ssl-dh-param-file parsing to ssl_sock
    - MEDIUM: compression: move the zlib-specific stuff from global.h to compression.c
    - BUG/MEDIUM: ssl: properly reset the reused_sess during a forced handshake
    - BUG/MEDIUM: ssl: avoid double free when releasing bind_confs
    - BUG/MINOR: stats: fix be/sessions/current out in typed stats
    - MINOR: tcp-rules: check that the listener exists before updating its counters
    - MEDIUM: spoe: don't create a dummy listener for outgoing connections
    - MINOR: listener: move the transport layer pointer to the bind_conf
    - MEDIUM: move listener->frontend to bind_conf->frontend
    - MEDIUM: ssl: remote the proxy argument from most functions
    - MINOR: connection: add a new prepare_bind_conf() entry to xprt_ops
    - MEDIUM: ssl_sock: implement ssl_sock_prepare_bind_conf()
    - MINOR: connection: add a new destroy_bind_conf() entry to xprt_ops
    - MINOR: ssl_sock: implement ssl_sock_destroy_bind_conf()
    - MINOR: server: move the use_ssl field out of the ifdef USE_OPENSSL
    - MINOR: connection: add a minimal transport layer registration system
    - CLEANUP: connection: remove all direct references to raw_sock and ssl_sock
    - CLEANUP: connection: unexport raw_sock and ssl_sock
    - MINOR: connection: add new prepare_srv()/destroy_srv() entries to xprt_ops
    - MINOR: ssl_sock: implement and use prepare_srv()/destroy_srv()
    - CLEANUP: ssl: move tlskeys_finalize_config() to a post_check callback
    - CLEANUP: ssl: move most ssl-specific global settings to ssl_sock.c
    - BUG/MINOR: backend: nbsrv() should return 0 if backend is disabled
    - BUG/MEDIUM: ssl: for a handshake when server-side SNI changes
    - BUG/MINOR: systemd: potential zombie processes
    - DOC: Add timings events schemas
    - BUILD: lua: build failed on FreeBSD.
    - MINOR: samples: add xx-hash functions
    - MEDIUM: regex: pcre2 support
    - BUG/MINOR: option prefer-last-server must be ignored in some case
    - MINOR: stats: Support "select all" for backend actions
    - BUG/MINOR: sample-fetches/stick-tables: bad type for the sample fetches sc*_get_gpt0
    - BUG/MAJOR: channel: Fix the definition order of channel analyzers
    - BUG/MINOR: http: report real parser state in error captures
    - BUILD: scripts: automatically update the branch in version.h when releasing
    - MINOR: tools: add a generic hexdump function for debugging
    - BUG/MAJOR: http: fix risk of getting invalid reports of bad requests
    - MINOR: http: custom status reason.
    - MINOR: connection: add sample fetch "fc_rcvd_proxy"
    - BUG/MINOR: config: emit a warning if http-reuse is enabled with incompatible options
    - BUG/MINOR: tools: fix off-by-one in port size check
    - BUG/MEDIUM: server: consider AF_UNSPEC as a valid address family
    - MEDIUM: server: split the address and the port into two different fields
    - MINOR: tools: make str2sa_range() return the port in a separate argument
    - MINOR: server: take the destination port from the port field, not the addr
    - MEDIUM: server: disable protocol validations when the server doesn't resolve
    - BUG/MEDIUM: tools: do not force an unresolved address to AF_INET:0.0.0.0
    - BUG/MINOR: ssl: EVP_PKEY must be freed after X509_get_pubkey usage
    - BUG/MINOR: ssl: assert on SSL_set_shutdown with BoringSSL
    - MINOR: Use "500 Internal Server Error" for 500 error/status code message.
    - MINOR: proto_http.c 502 error txt typo.
    - DOC: add deprecation notice to "block"
    - MINOR: compression: fix -vv output without zlib/slz
    - BUG/MINOR: Reset errno variable before calling strtol(3)
    - MINOR: ssl: don't show prefer-server-ciphers output
    - OPTIM/MINOR: config: Optimize fullconn automatic computation loading configuration
    - BUG/MINOR: stream: Fix how backend-specific analyzers are set on a stream
    - MAJOR: ssl: bind configuration per certificat
    - MINOR: ssl: add curve suite for ECDHE negotiation
    - MINOR: checks: Add agent-addr config directive
    - MINOR: cli: Add possiblity to change agent config via CLI/socket
    - MINOR: doc: Add docs for agent-addr configuration variable
    - MINOR: doc: Add docs for agent-addr and agent-send CLI commands
    - BUILD: ssl: fix to build (again) with boringssl
    - BUILD: ssl: fix build on OpenSSL 1.0.0
    - BUILD: ssl: silence a warning reported for ERR_remove_state()
    - BUILD: ssl: eliminate warning with OpenSSL 1.1.0 regarding RAND_pseudo_bytes()
    - BUILD: ssl: kill a build warning introduced by BoringSSL compatibility
    - BUG/MEDIUM: tcp: don't poll for write when connect() succeeds
    - BUG/MINOR: unix: fix connect's polling in case no data are scheduled
    - MINOR: server: extend the flags to 32 bits
    - BUG/MINOR: lua: Map.end are not reliable because "end" is a reserved keyword
    - MINOR: dns: give ability to dns_init_resolvers() to close a socket when requested
    - BUG/MAJOR: dns: restart sockets after fork()
    - MINOR: chunks: implement a simple dynamic allocator for trash buffers
    - BUG/MEDIUM: http: prevent redirect from overwriting a buffer
    - BUG/MEDIUM: filters: Do not truncate HTTP response when body length is undefined
    - BUG/MEDIUM: http: Prevent replace-header from overwriting a buffer
    - BUG/MINOR: http: Return an error when a replace-header rule failed on the response
    - BUG/MINOR: sendmail: The return of vsnprintf is not cleanly tested
    - BUG/MAJOR: ssl: fix a regression in ssl_sock_shutw()
    - BUG/MAJOR: lua segmentation fault when the request is like 'GET ?arg=val HTTP/1.1'
    - BUG/MEDIUM: config: reject anything but "if" or "unless" after a use-backend rule
    - MINOR: http: don't close when redirect location doesn't start with "/"
    - MEDIUM: boringssl: support native multi-cert selection without bundling
    - BUG/MEDIUM: ssl: fix verify/ca-file per certificate
    - BUG/MEDIUM: ssl: switchctx should not return SSL_TLSEXT_ERR_ALERT_WARNING
    - MINOR: ssl: removes SSL_CTX_set_ssl_version call and cleanup CTX creation.
    - BUILD: ssl: fix build with -DOPENSSL_NO_DH
    - MEDIUM: ssl: add new sample-fetch which captures the cipherlist
    - MEDIUM: ssl: remove ssl-options from crt-list
    - BUG/MEDIUM: ssl: in bind line, ssl-options after 'crt' are ignored.
    - BUG/MINOR: ssl: fix cipherlist captures with sustainable SSL calls
    - MINOR: ssl: improved cipherlist captures
    - BUG/MINOR: spoe: Fix soft stop handler using a specific id for spoe filters
    - BUG/MINOR: spoe: Fix parsing of arguments in spoe-message section
    - MAJOR: spoe: Add support of pipelined and asynchronous exchanges with agents
    - MINOR: spoe: Add support for pipelining/async capabilities in the SPOA example
    - MINOR: spoe: Remove SPOE details from the appctx structure
    - MINOR: spoe: Add status code in error variable instead of hardcoded value
    - MINOR: spoe: Send a log message when an error occurred during event processing
    - MINOR: spoe: Check the scope of sample fetches used in SPOE messages
    - MEDIUM: spoe: Be sure to wakeup the good entity waiting for a buffer
    - MINOR: spoe: Use the min of all known max_frame_size to encode messages
    - MAJOR: spoe: Add support of payload fragmentation in NOTIFY frames
    - MINOR: spoe: Add support for fragmentation capability in the SPOA example
    - MAJOR: spoe: refactor the filter to clean up the code
    - MINOR: spoe: Handle NOTIFY frames cancellation using ABORT bit in ACK frames
    - REORG: spoe: Move struct and enum definitions in dedicated header file
    - REORG: spoe: Move low-level encoding/decoding functions in dedicated header file
    - MINOR: spoe: Improve implementation of the payload fragmentation
    - MINOR: spoe: Add support of negation for options in SPOE configuration file
    - MINOR: spoe: Add "pipelining" and "async" options in spoe-agent section
    - MINOR: spoe: Rely on alertif_too_many_arg during configuration parsing
    - MINOR: spoe: Add "send-frag-payload" option in spoe-agent section
    - MINOR: spoe: Add "max-frame-size" statement in spoe-agent section
    - DOC: spoe: Update SPOE documentation to reflect recent changes
    - MINOR: config: warn when some HTTP rules are used in a TCP proxy
    - BUG/MEDIUM: ssl: Clear OpenSSL error stack after trying to parse OCSP file
    - BUG/MEDIUM: cli: Prevent double free in CLI ACL lookup
    - BUG/MINOR: Fix "get map <map> <value>" CLI command
    - MINOR: Add nbsrv sample converter
    - CLEANUP: Replace repeated code to count usable servers with be_usable_srv()
    - MINOR: Add hostname sample fetch
    - CLEANUP: Remove comment that's no longer valid
    - MEDIUM: http_error_message: txn->status / http_get_status_idx.
    - MINOR: http-request tarpit deny_status.
    - CLEANUP: http: make http_server_error() not set the status anymore
    - MEDIUM: stats: Add JSON output option to show (info|stat)
    - MEDIUM: stats: Add show json schema
    - BUG/MAJOR: connection: update CO_FL_CONNECTED before calling the data layer
    - MINOR: server: Add dynamic session cookies.
    - MINOR: cli: Let configure the dynamic cookies from the cli.
    - BUG/MINOR: checks: attempt clean shutw for SSL check
    - CONTRIB: tcploop: make it build on FreeBSD
    - CONTRIB: tcploop: fix time format to silence build warnings
    - CONTRIB: tcploop: report action 'K' (kill) in usage message
    - CONTRIB: tcploop: fix connect's address length
    - CONTRIB: tcploop: use the trash instead of NULL for recv()
    - BUG/MEDIUM: listener: do not try to rebind another process' socket
    - BUG/MEDIUM server: Fix crash when dynamic is defined, but not key is provided.
    - CLEANUP: config: Typo in comment.
    - BUG/MEDIUM: filters: Fix channels synchronization in flt_end_analyze
    - TESTS: add a test configuration to stress handshake combinations
    - BUG/MAJOR: stream-int: do not depend on connection flags to detect connection
    - BUG/MEDIUM: connection: ensure to always report the end of handshakes
    - MEDIUM: connection: don't test for CO_FL_WAKE_DATA
    - CLEANUP: connection: completely remove CO_FL_WAKE_DATA
    - BUG: payload: fix payload not retrieving arbitrary lengths
    - BUILD: ssl: simplify SSL_CTX_set_ecdh_auto compatibility
    - BUILD: ssl: fix OPENSSL_NO_SSL_TRACE for boringssl and libressl
    - BUG/MAJOR: http: fix typo in http_apply_redirect_rule
    - MINOR: doc: 2.4. Examples should be 2.5. Examples
    - BUG/MEDIUM: stream: fix client-fin/server-fin handling
    - MINOR: fd: add a new flag HAP_POLL_F_RDHUP to struct poller
    - BUG/MINOR: raw_sock: always perfom the last recv if RDHUP is not available
    - OPTIM: poll: enable support for POLLRDHUP
    - MINOR: kqueue: exclusively rely on the kqueue returned status
    - MEDIUM: kqueue: take care of EV_EOF to improve polling status accuracy
    - MEDIUM: kqueue: only set FD_POLL_IN when there are pending data
    - DOC/MINOR: Fix typos in proxy protocol doc
    - DOC: Protocol doc: add checksum, TLV type ranges
    - DOC: Protocol doc: add SSL TLVs, rename CHECKSUM
    - DOC: Protocol doc: add noop TLV
    - MEDIUM: global: add a 'hard-stop-after' option to cap the soft-stop time
    - MINOR: dns: improve DNS response parsing to use as many available records as possible
    - BUG/MINOR: cfgparse: loop in tracked servers lists not detected by check_config_validity().
    - MINOR: server: irrelevant error message with 'default-server' config file keyword.
    - MINOR: server: Make 'default-server' support 'backup' keyword.
    - MINOR: server: Make 'default-server' support 'check-send-proxy' keyword.
    - CLEANUP: server: code alignement.
    - MINOR: server: Make 'default-server' support 'non-stick' keyword.
    - MINOR: server: Make 'default-server' support 'send-proxy' and 'send-proxy-v2 keywords.
    - MINOR: server: Make 'default-server' support 'check-ssl' keyword.
    - MINOR: server: Make 'default-server' support 'force-sslv3' and 'force-tlsv1[0-2]' keywords.
    - CLEANUP: server: code alignement.
    - MINOR: server: Make 'default-server' support 'no-ssl*' and 'no-tlsv*' keywords.
    - MINOR: server: Make 'default-server' support 'ssl' keyword.
    - MINOR: server: Make 'default-server' support 'send-proxy-v2-ssl*' keywords.
    - CLEANUP: server: code alignement.
    - MINOR: server: Make 'default-server' support 'verify' keyword.
    - MINOR: server: Make 'default-server' support 'verifyhost' setting.
    - MINOR: server: Make 'default-server' support 'check' keyword.
    - MINOR: server: Make 'default-server' support 'track' setting.
    - MINOR: server: Make 'default-server' support 'ca-file', 'crl-file' and 'crt' settings.
    - MINOR: server: Make 'default-server' support 'redir' keyword.
    - MINOR: server: Make 'default-server' support 'observe' keyword.
    - MINOR: server: Make 'default-server' support 'cookie' keyword.
    - MINOR: server: Make 'default-server' support 'ciphers' keyword.
    - MINOR: server: Make 'default-server' support 'tcp-ut' keyword.
    - MINOR: server: Make 'default-server' support 'namespace' keyword.
    - MINOR: server: Make 'default-server' support 'source' keyword.
    - MINOR: server: Make 'default-server' support 'sni' keyword.
    - MINOR: server: Make 'default-server' support 'addr' keyword.
    - MINOR: server: Make 'default-server' support 'disabled' keyword.
    - MINOR: server: Add 'no-agent-check' server keyword.
    - DOC: server: Add docs for "server" and "default-server" new "no-*" and other settings.
    - MINOR: doc: fix use-server example (imap vs mail)
    - BUG/MEDIUM: tcp: don't require privileges to bind to device
    - BUILD: make the release script use shortlog for the final changelog
    - BUILD: scripts: fix typo in announce-release error message
    - CLEANUP: time: curr_sec_ms doesn't need to be exported
    - BUG/MEDIUM: server: Wrong server default CRT filenames initialization.
    - BUG/MEDIUM: peers: fix buffer overflow control in intdecode.
    - BUG/MEDIUM: buffers: Fix how input/output data are injected into buffers
    - BUG/MINOR: http: Fix conditions to clean up a txn and to handle the next request
    - CLEANUP: http: Remove channel_congested function
    - CLEANUP: buffers: Remove buffer_bounce_realign function
    - CLEANUP: buffers: Remove buffer_contig_area and buffer_work_area functions
    - MINOR: http: remove useless check on HTTP_MSGF_XFER_LEN for the request
    - MINOR: http: Add debug messages when HTTP body analyzers are called
    - BUG/MEDIUM: http: Fix blocked HTTP/1.0 responses when compression is enabled
    - BUG/MINOR: filters: Don't force the stream's wakeup when we wait in flt_end_analyze
    - DOC: fix parenthesis and add missing "Example" tags
    - DOC: update the contributing file
    - DOC: log-format/tcplog/httplog update
    - MINOR: config parsing: add warning when log-format/tcplog/httplog is overriden in "defaults" sections
2017-04-03 09:27:49 +02:00
Christopher Faulet
a545569f1e CLEANUP: buffers: Remove buffer_contig_area and buffer_work_area functions
Not used anymore since last commit.
2017-03-31 14:38:30 +02:00
Christopher Faulet
aaf4a325ca CLEANUP: buffers: Remove buffer_bounce_realign function
Not used anymore since last commit.
2017-03-31 14:38:22 +02:00
Christopher Faulet
637f8f2ca7 BUG/MEDIUM: buffers: Fix how input/output data are injected into buffers
The function buffer_contig_space is buggy and could lead to pernicious bugs
(never hitted until now, AFAIK). This function should return the number of bytes
that can be written into the buffer at once (without wrapping).

First, this function is used to inject input data (bi_putblk) and to inject
output data (bo_putblk and bo_inject). But there is no context. So it cannot
decide where contiguous space should placed. For input data, it should be after
bi_end(buf) (ie, buf->p + buf->i modulo wrapping calculation). For output data,
it should be after bo_end(buf) (ie, buf->p) and input data are assumed to not
exist (else there is no space at all).

Then, considering we need to inject input data, this function does not always
returns the right value. And when we need to inject output data, we must be sure
to have no input data at all (buf->i == 0), else the result can also be wrong
(but this is the caller responsibility, so everything should be fine here).

The buffer can be in 3 different states:

 1) no wrapping

              <---- o ----><----- i ----->
 +------------+------------+-------------+------------+
 |            |oooooooooooo|iiiiiiiiiiiii|xxxxxxxxxxxx|
 +------------+------------+-------------+------------+
                           ^             <contig_space>
                           p             ^            ^
			                 l            r

 2) input wrapping

 ...--->            <---- o ----><-------- i -------...
 +-----+------------+------------+--------------------+
 |iiiii|xxxxxxxxxxxx|oooooooooooo|iiiiiiiiiiiiiiiiiiii|
 +-----+------------+------------+--------------------+
       <contig_space>            ^
       ^            ^            p
       l            r

 3) output wrapping

 ...------ o ------><----- i ----->            <----...
 +------------------+-------------+------------+------+
 |oooooooooooooooooo|iiiiiiiiiiiii|xxxxxxxxxxxx|oooooo|
 +------------------+-------------+------------+------+
                    ^             <contig_space>
                    p             ^            ^
		                  l            r

buffer_contig_space returns (l - r). The cases 1 and 3 are correctly
handled. But for the second case, r is wrong. It points on the buffer's end
(buf->data + buf->size). It should be bo_end(buf) (ie, buf->p - buf->o).

To fix the bug, the function has been splitted. Now, bi_contig_space and
bo_contig_space should be used to know the contiguous space available to insert,
respectively, input data and output data. For bo_contig_space, input data are
assumed to not exist. And the right version is used, depending what we want to
do.

In addition, to clarify the buffer's API, buffer_realign does not return value
anymore. So it has the same API than buffer_slow_realign.

This patch can be backported in 1.7, 1.6 and 1.5.
2017-03-31 14:36:04 +02:00
Willy Tarreau
b686afd568 MINOR: chunks: implement a simple dynamic allocator for trash buffers
The trash buffers are becoming increasingly complex to deal with due to
the code's modularity allowing some functions to be chained and causing
the same chunk buffers to be used multiple times along the chain, possibly
corrupting each other. In fact the trash were designed from scratch for
explicitly not surviving a function call but string manipulation makes
this impossible most of the time while not fullfilling the need for
reliable temporary chunks.

Here we introduce the ability to allocate a temporary trash chunk which
is reserved, so that it will not conflict with the trash chunks other
functions use, and will even support reentrant calls (eg: build_logline).

For this, we create a new pool which is exactly the size of a usual chunk
buffer plus the size of the chunk struct so that these chunks when allocated
are exactly the same size as the ones returned by get_trash_buffer(). These
chunks may fail so the caller must check them, and the caller is also
responsible for freeing them.

The code focuses on minimal changes and ease of reliable backporting
because it will be needed in stable versions in order to support next
patch.
2017-02-08 11:16:29 +01:00
Emmanuel Hocdet
98263291cc MAJOR: ssl: bind configuration per certificat
crt-list is extend to support ssl configuration. You can now have
such line in crt-list <file>:
mycert.pem [npn h2,http/1.1]

Support include "npn", "alpn", "verify", "ca_file", "crl_file",
"ecdhe", "ciphers" configuration and ssl options.

"crt-base" is also supported to fetch certificates.
2017-01-13 11:40:34 +01:00
Willy Tarreau
48ef4c95b6 MINOR: tools: make str2sa_range() return the port in a separate argument
This will be needed so that we're don't have to extract it from the
returned address where it will not always be anymore (eg: for unresolved
servers).
2017-01-06 19:29:34 +01:00
Willy Tarreau
0ebb511b3e MINOR: tools: add a generic hexdump function for debugging
debug_hexdump() prints to the requested output stream (typically stdout
or stderr) an hex dump of the blob passed in argument. This is useful
to help debug binary protocols.
2017-01-05 20:12:20 +01:00
David Carlier
f2592b29f1 MEDIUM: regex: pcre2 support
this adds a support of the newest pcre2 library,
more secure than its older sibling in a cost of a
more complex API.
It works pretty similarly to pcre's part to keep
the overall change smooth,  except :

- we define the string class supported at compile time.
- after matching the ovec data is properly sized, althought
we do not take advantage of it here.
- the lack of jit support is treated less 'dramatically'
as pcre2_jit_compile in this case is 'no-op'.
2016-12-28 12:51:51 +01:00
Willy Tarreau
ece9b07c71 MINOR: cfgparse: add two new functions to check arguments count
We already had alertif_too_many_args{,_idx}(), but these ones are
specifically designed for use in cfgparse. Outside of it we're
trying to avoid calling Alert() all the time so we need an
equivalent using a pointer to an error message.

These new functions called too_many_args{,_idx)() do exactly this.
They don't take the file name nor the line number which they have
no use for but instead they take an optional pointer to an error
message and the pointer to the error code is optional as well.
With (NULL, NULL) they'll simply check the validity and return a
verdict. They are quite convenient for use in isolated keyword
parsers.

These two new functions as well as the previous ones have all been
exported.
2016-12-21 23:39:26 +01:00
Christopher Faulet
a73e59b690 BUG/MAJOR: Fix how the list of entities waiting for a buffer is handled
When an entity tries to get a buffer, if it cannot be allocted, for example
because the number of buffers which may be allocated per process is limited,
this entity is added in a list (called <buffer_wq>) and wait for an available
buffer.

Historically, the <buffer_wq> list was logically attached to streams because it
were the only entities likely to be added in it. Now, applets can also be
waiting for a free buffer. And with filters, we could imagine to have more other
entities waiting for a buffer. So it make sense to have a generic list.

Anyway, with the current design there is a bug. When an applet failed to get a
buffer, it will wait. But we add the stream attached to the applet in
<buffer_wq>, instead of the applet itself. So when a buffer is available, we
wake up the stream and not the waiting applet. So, it is possible to have
waiting applets and never awakened.

So, now, <buffer_wq> is independant from streams. And we really add the waiting
entity in <buffer_wq>. To be generic, the entity is responsible to define the
callback used to awaken it.

In addition, applets will still request an input buffer when they become
active. But they will not be sleeped anymore if no buffer are available. So this
is the responsibility to the applet I/O handler to check if this buffer is
allocated or not. This way, an applet can decide if this buffer is required or
not and can do additional processing if not.

[wt: backport to 1.7 and 1.6]
2016-12-12 19:11:04 +01:00
Willy Tarreau
97c2ae13bc REORG: cli: move dump_text(), dump_text_line(), and dump_binary() to standard.c
These are general purpose functions, move them away.
2016-11-24 16:59:27 +01:00
Christopher Faulet
79bdef3cad MINOR: cfgparse: Parse scope lines and save the last one parsed
A scope is a section name between square bracket, alone on its line, ie:

  [scope-name]
  ...

The spaces at the beginning and at the end of the line are skipped. Comments at
the end of the line are also skipped.

When a scope is parsed, its name is saved in the global variable
cfg_scope. Initially, cfg_scope is NULL and it remains NULL until a valid scope
line is parsed.

This feature remains unused in the HAProxy configuration file and
undocumented. However, it will be used during SPOE configuration parsing.
2016-11-09 22:56:59 +01:00
Christopher Faulet
7110b40d06 MINOR: cfgparse: Add functions to backup and restore registered sections
This feature will be used by the stream processing offload engine (SPOE) to
parse dedicated configuration files without mixing HAProxy sections with SPOE
sections.

So, here we can back up all sections known by HAProxy, unregister all of them
and add new ones, dedicted to the SPOE. Once the SPOE configuration file parsed,
we can roll back all changes by restoring HAProxy sections.
2016-11-09 22:56:59 +01:00
Christopher Faulet
898566e7e6 CLEANUP: remove last references to 'ruleset' section 2016-11-09 22:50:54 +01:00
Willy Tarreau
620408f406 MEDIUM: tcp: add registration and processing of TCP L5 rules
This commit introduces "tcp-request session" rules. These are very
much like "tcp-request connection" rules except that they're processed
after the handshake, so it is possible to consider SSL information and
addresses rewritten by the proxy protocol header in actions. This is
particularly useful to track proxied sources as this was not possible
before, given that tcp-request content rules are processed after each
HTTP request. Similarly it is possible to assign the proxied source
address or the client's cert to a variable.
2016-10-21 18:19:24 +02:00
Lukas Tribus
dcbc5c5ecf MINOR: show Built with PCRE version
Inspired by PCRE's pcre_version.c and improved with Willy's
suggestions. Reusable parts have been added to
include/common/standard.h.
2016-09-13 07:55:51 +02:00
Baptiste Assmann
7819c125c2 MINOR: chunk: new strncat function
Purpose of this function is to append data to the end of a chunk when
we know only the pointer to the beginning of the string and the string
length.
2016-09-12 19:51:59 +02:00
Baptiste Assmann
08396c87d0 MINOR: standard.c: ipcpy() function to copy an IP address from a struct sockaddr_storage into an other one
The function ipcpy() simply duplicates the IP address found in one
struct sockaddr_storage into an other struct sockaddr_storage.
It also update the family on the destination structure.

Memory of destination structure must be allocated and cleared by the
caller.
2016-08-14 12:16:43 +02:00
Baptiste Assmann
08b24cfdb2 MINOR: standard.c: ipcmp() function to compare 2 IP addresses stored in 2 struct sockaddr_storage
new ipcmp() function to compare 2 IP addresses stored in struct
sockaddr_storage.
Returns 0 if both addresses doesn't match and 1 if they do.
2016-08-14 12:16:27 +02:00
Willy Tarreau
9d87ca0685 BUILD: tcp: define SOL_TCP when only IPPROTO_TCP exists
FreeBSD prefers to use IPPROTO_TCP over SOL_TCP, just like it does
with their *_IP counterparts. It's worth noting that there are a few
inconsistencies between SOL_TCP and IPPROTO_TCP in the code, eg on
TCP_QUICKACK. The two values are the same but it's worth applying
what implementations recommend.

No backport is needed, this was uncovered by the recent tcp_info stuff.
2016-08-10 21:11:38 +02:00
Willy Tarreau
16e015635c MINOR: tcp: add dst_is_local and src_is_local
It is sometimes needed in application server environments to easily tell
if a source is local to the machine or a remote one, without necessarily
knowing all the local addresses (dhcp, vrrp, etc). Similarly in transparent
proxy configurations it is sometimes desired to tell the difference between
local and remote destination addresses.

This patch adds two new sample fetch functions for this :

dst_is_local : boolean
  Returns true if the destination address of the incoming connection is local
  to the system, or false if the address doesn't exist on the system, meaning
  that it was intercepted in transparent mode. It can be useful to apply
  certain rules by default to forwarded traffic and other rules to the traffic
  targetting the real address of the machine. For example the stats page could
  be delivered only on this address, or SSH access could be locally redirected.
  Please note that the check involves a few system calls, so it's better to do
  it only once per connection.

src_is_local : boolean
  Returns true if the source address of the incoming connection is local to the
  system, or false if the address doesn't exist on the system, meaning that it
  comes from a remote machine. Note that UNIX addresses are considered local.
  It can be useful to apply certain access restrictions based on where the
  client comes from (eg: require auth or https for remote machines). Please
  note that the check involves a few system calls, so it's better to do it only
  once per connection.
2016-08-09 16:50:08 +02:00
Dragan Dosen
1a5d06032b MINOR: standard: add function "escape_string"
Similar to "escape_chunk", this function tries to prefix all characters
tagged in the <map> with the <escape> character. The specified <string>
contains the input to be escaped.
2016-07-26 15:25:32 +02:00
Willy Tarreau
eec1d3869d BUG/MEDIUM: dns: fix alignment issues in the DNS response parser
Alexander Lebedev reported that the DNS parser crashes in 1.6 with a bus
error on Sparc when it receives a response. This is obviously caused by
some alignment issues. The issue can also be reproduced on ARMv5 when
setting /proc/cpu/alignment to 4 (which helps debugging).

Two places cause this crash in turn, the first one is when the IP address
from the packet is compared to the current one, and the second place is
when the address is assigned because an unaligned address is passed to
update_server_addr().

This patch modifies these places to properly use memcpy() and memcmp()
to manipulate the unaligned data.

Nenad Merdanovic found another set of places specific to 1.7 in functions
in_net_ipv4() and in_net_ipv6(), which are used to compare networks. 1.6
has the functions but does not use them. There we perform a temporary copy
to a local variable to fix the problem. The type of the function's argument
is wrong since it's not necessarily aligned, so we change it for a const
void * instead.

This fix must be backported to 1.6. Note that in 1.6 the code is slightly
different, there's no rec[] array, the pointer is used directly from the
buffer.
2016-07-13 12:13:24 +02:00
Hubert Verstraete
2eae3a0497 MINOR: new function my_realloc2 = realloc + free upon failure
When realloc fails to allocate memory, the original pointer is not
freed. Sometime people override the original pointer with the pointer
returned by realloc which is NULL in case of failure. This results
in a memory leak because the memory pointed by the original pointer
cannot be freed.
2016-06-29 10:45:15 +02:00
Emmanuel Hocdet
5e0e6e409b MINOR: ssl: crt-list parsing factor
LINESIZE and MAX_LINE_ARGS are too low for parsing crt-list.
2016-06-20 17:29:56 +02:00
Willy Tarreau
5f6e9054b9 BUILD: fix build on Solaris 11
htonll()/ntohll() already exist on Solaris 11 with a different declaration,
causing a build error as reported by Jonathan Fisher. They used to exist on
OSX with a #define which allowed us to detect them. It was a bad idea to give
these functions a name subject to conflicts like this. Simply rename them
my_htonll()/my_ntohll() to definitely get rid of the conflict.

This patch must be backported to 1.6.
2016-05-26 07:15:57 +02:00
Maxime de Roucy
dc88785f9c MINOR: add list_append_word function
int list_append_word(struct list *li, const char *str, char **err)

Append a copy of string <str> (inside a wordlist) at the end of
the list <li>.
The caller is responsible for freeing the <err> and <str> copy memory
area using free().

On failure : return 0 and <err> filled with an error message.
2016-05-14 00:00:54 +02:00
David Carlier
8ab1043c6b CLEANUP: chunk: adding NULL check to chunk_dup allocation.
Avoiding harmful memcpy call if the allocation failed.
Resetting the size which avoids further harmful freeing
invalid pointer. Closer to the comment behavior description.
2016-03-24 10:18:44 +01:00
Benoit GARNIER
e2e5bde3f2 BUG/MINOR: log: Don't use strftime() which can clobber timezone if chrooted
The strftime() function can call tzset() internally on some platforms.
When haproxy is chrooted, the /etc/localtime file is not found, and some
implementations will clobber the content of the current timezone.

The GMT offset is computed by diffing the times returned by gmtime_r() and
localtime_r(). These variants are guaranteed to not call tzset() and were
already used in haproxy while chrooted, so they should be safe.

This patch must be backported to 1.6 and 1.5.
2016-03-17 05:30:03 +01:00
Benoit GARNIER
b413c2a759 BUG/MINOR: log: GMT offset not updated when entering/leaving DST
GMT offset used in local time formats was computed at startup, but was not updated when DST status changed while running.

For example these two RFC5424 syslog traces where emitted 5 seconds apart, just before and after DST changed:
  <14>1 2016-03-27T01:59:58+01:00 bunch-VirtualBox haproxy 2098 - - Connect ...
  <14>1 2016-03-27T03:00:03+01:00 bunch-VirtualBox haproxy 2098 - - Connect ...

It looked like they were emitted more than 1 hour apart, unlike with the fix:
  <14>1 2016-03-27T01:59:58+01:00 bunch-VirtualBox haproxy 3381 - - Connect ...
  <14>1 2016-03-27T03:00:03+02:00 bunch-VirtualBox haproxy 3381 - - Connect ...

This patch should be backported to 1.6 and partially to 1.5 (no fix needed in log.c).
2016-03-13 23:48:05 +01:00
Willy Tarreau
508a63fb96 MINOR: stats: add ST_SHOWADMIN to pass the admin info in the regular flags
It's easier to have a new flag in <flags> to indicate whether or not we
want to display the admin column in HTML dumps. We already have similar
flags to show the version or the legends.
2016-03-11 17:08:05 +01:00
Willy Tarreau
320ec2a745 BUG/MEDIUM: chunks: always reject negative-length chunks
The recent addition of "show env" on the CLI has revealed an interesting
design bug. Chunks are supposed to support a negative length to indicate
that they carry no data. chunk_printf() sets this size to -1 if the string
is too large for the buffer. At a few places in the http engine we may end
up with trash.len = -1. But bi_putchk(), chunk_appendf() and a few other
chunks consumers don't consider this case as possible and will use such a
chunk, possibly restoring an invalid string or trying to copy -1 bytes.

This fix takes care of clarifying the situation in a backportable way
where such sizes are used, so that a negative length indicating an error
remains present until the chunk is reinitialized or overwritten. But a
cleaner design adjustment needs to be done so that there's a clear contract
on how to use these chunks. At first glance it doesn't seem *that* useful
to support negative sizes, so probably this is what should change.

This fix must be backported to 1.6 and 1.5.
2016-02-25 16:24:14 +01:00
Thierry Fournier
70473a5f8c MINOR: common: mask conversion
Add function which converts network mask from bit length form
to struct in*_addr form.
2016-02-19 14:37:41 +01:00
Pieter Baauw
46af170e41 MINOR: mailers: increase default timeout to 10 seconds
This allows the tcp connection to send multiple SYN packets, so 1 lost
packet does not cause the mail to be lost. It changes the socket timeout
from 2 to 10 seconds, this allows for 3 syn packets to be send and
waiting a little for their reply.

This patch should be backported to 1.6.

Acked-by: Simon Horman <horms@verge.net.au>
2016-02-17 10:19:08 +01:00
Dragan Dosen
0edd10925d MINOR: standard: add function "escape_chunk"
This function tries to prefix all characters tagged in the <map> with the
<escape> character. The specified <chunk> contains the input to be
escaped.
2016-02-12 13:36:47 +01:00
Thierry Fournier
9312794ed7 MINOR: standard: add RFC HTTP date parser
This parser takes a string containing an HTTP date. It returns
a broken-down time struct. We must considers considers this
time as GMT. Maybe later the timezone will be taken in account.
2016-02-12 11:08:53 +01:00
Willy Tarreau
581bf81d34 MEDIUM: pools: add a new flag to avoid rounding pool size up
Usually it's desirable to merge similarly sized pools, which is the
reason why their size is rounded up to the next multiple of 16. But
for the buffers this is problematic because we add the size of
struct buffer to the user-requested size, and the rounding results
in 8 extra bytes that are usable in the end. So the user gets more
bytes than asked for, and in case of SSL it results in short writes
for the extra bytes that are sent above multiples of 16 kB.

So we add a new flag MEM_F_EXACT to request that the size is not
rounded up when creating the entry. Thus it doesn't disable merging.
2016-01-25 02:31:18 +01:00
Willy Tarreau
898529b4a8 MEDIUM: tools: add csv_enc_append() to preserve the original chunk
We have csv_enc() but there's no way to append some CSV-encoded data
to an existing chunk, so here we modify the existing function for this
and create an inlined version of csv_enc() which first resets the output
chunk. It will be handy to append data to an existing chunk without
having to use an extra temporary chunk, or to encode multiple strings
into a single chunk with chunk_newstr().

The patch is quite small, in fact most changes are typo fixes in the
comments.
2016-01-06 20:58:55 +01:00
Willy Tarreau
70af633ebe MINOR: chunk: make chunk_initstr() take a const string
chunk_initstr() prepares a read-only chunk from a string of
fixed length. Thus it must be prepared to accept a read-only
string on the input, otherwise the caller has to force-cast
some const char* and that's not a good idea.
2016-01-06 20:58:55 +01:00
Willy Tarreau
601360b41d MINOR: chunks: add chunk_strcat() and chunk_newstr()
These two new functions will make it easier to manipulate small strings
from within functions, because at many places, multiple short strings
are needed which do not deserve a malloc() nor a free(), and alloca()
is often discouraged. Since we already have trash chunks, it's convenient
to be able to allocate substrings from a chunk and use them later since
our functions already perform all the length checks. chunk_newstr() adds
a trailing zero at the end of a chunk and returns the pointer to the next
character, which can be used as an independant string. chunk_strcat()
does what it says.
2016-01-06 13:53:37 +01:00
Willy Tarreau
0b6044fa24 MINOR: chunks: ensure that chunk_strcpy() adds a trailing zero
Since thus function bears the name of a well-known string function, it
must at least promise compatible semantics. Here it means always adding
the trailing zero so that anyone willing to use chunk->str as a regular
string can do it. Of course the zero is not counted in the chunk's length.
2016-01-06 13:53:37 +01:00
Willy Tarreau
f9476a5a30 BUG/MINOR: chunk: make chunk_dup() always check and set dst->size
chunk_dup() was affected by two bugs at once related to dst->size :
  - first, it didn't check dst->size to know if it could free(dst->str),
    so using it on a statically allocated chunk would cause a free(constant)
    and crash the process ;

  - second, it didn't properly set dst->size, possibly causing smaller
    strings not to be properly reported in a chunk that was previously
    used for something else.

Fortunately, neither of these situations ever happened since the function
is rarely used.

In the process of doing this, we even allocate one more byte for a
trailing zero if the input chunk was not full, so that the copied
string can safely be reused by standard string functions.

The bug was introduced in 1.3.4 nine years ago with this commit :

  0f77253 ("[MINOR] store HTTP error messages into a chunk array")

It's better to backport this fix in case a future fix relies on it.
2016-01-04 20:47:27 +01:00
Thierry FOURNIER
ec9a58c709 BUILD/MINOR: regex: missing header
When HAProxy is compiled with pcre, strlen() is used, but <string.h>
is not included.

This patch must be backported in 1.6
2015-12-22 13:36:01 +01:00
Thierry FOURNIER
1db96672c4 BUILD: freebsd: double declaration
On freebsd, the macro LIST_PREV already exists in the header file
<sys/queue.h>, and this makes a build error.

This patch removes the macros before declaring it. This ensure
that the error doesn't occurs.
2015-11-06 01:15:02 +01:00
Willy Tarreau
58102cf30b MEDIUM: memory: add accounting for failed allocations
We now keep a per-pool counter of failed memory allocations and
we report that, as well as the amount of memory allocated and used
on the CLI.
2015-10-28 16:24:21 +01:00
Willy Tarreau
de30a684ca DEBUG/MEDIUM: memory: add optional control pool memory operations
When DEBUG_MEMORY_POOLS is used, we now use the link pointer at the end
of the pool to store a pointer to the pool, and to control it during
pool_free2() in order to serve four purposes :
  - at any instant we can know what pool an object was allocated from
    when examining memory, hence how we should possibly decode it ;

  - it serves to detect double free when they happen, as the pointer
    cannot be valid after the element is linked into the pool ;

  - it serves to detect if an element is released in the wrong pool ;

  - it serves as a canary, to detect if some buffers experienced an
    overflow before being release.

All these elements will definitely help better troubleshoot strange
situations, or at least confirm that certain conditions did not happen.
2015-10-28 15:28:05 +01:00
Willy Tarreau
ac421118db DEBUG/MEDIUM: memory: optionally protect free data in pools
When debugging a core file, it's sometimes convenient to be able to
visit the released entries in the pools (typically last released
session). Unfortunately the first bytes of these entries are destroyed
by the link elements of the pool. And of course, most structures have
their most accessed elements at the beginning of the structure (typically
flags). Let's add a build-time option DEBUG_MEMORY_POOLS which allocates
an extra pointer in each pool to put the link at the end of each pool
item instead of the beginning.
2015-10-28 15:27:59 +01:00
Willy Tarreau
a84dcb8440 DEBUG/MINOR: memory: add a build option to disable memory pools sharing
Sometimes analysing a core file isn't easy due to shared memory pools.
Let's add a build option to disable this. It's not enabled by default,
it could be backported to older versions.
2015-10-28 15:27:55 +01:00
Neale Ferguson
5e98e3e998 BUILD: enable build on Linux/s390x
I would like to contribute the following fix to enable the Linux s390x
platform. The fix was built against today's git master. I've attached the
patch for review. Depending on your buildbot/jenkins/? requirements I can
set up a virtual machine for automated building/testing of the package in
this environment.
2015-10-12 20:58:51 +02:00
Joseph Lynch
ffaf30b689 BUILD: Fix the build on OSX (htonll/ntohll)
htonll and ntohll were defined in 5b4dd683cb but on osx they are already
defined in sys/_endian.h. So, we check if they are defined before
declaring them.

[wt: no backport needed]
2015-10-09 10:11:59 +02:00
Willy Tarreau
067ac9f4b6 MINOR: debug: enable memory poisonning to use byte 0
When debugging an issue, sometimes it can be useful to be able to use
byte 0 to poison memory areas, resulting in the same effect as a calloc().
This patch changes the default mem_poison_byte to -1 to disable it so that
all positive values are usable.
2015-10-08 14:12:13 +02:00
Willy Tarreau
ae459f3b9f BUILD: tcp: use IPPROTO_IP when SOL_IP is not available
Dmitry Sivachenko reported a build failure on FreeBSD due to SOL_IP not
being defined. IPPROTO_IP must be used there instead.
2015-09-29 18:19:32 +02:00
David Carlier
60deeba090 MINOR: chunk: New function free_trash_buffers()
This new function is meant to be called in the general deinit phase,
to free those two internal chunks.
2015-09-28 14:00:00 +02:00
David Carlier
845efb53c7 MINOR: cfgparse: New function cfg_unregister_sections()
A new function introduced meant to be called during general deinit phase.
During the configuration parsing, the section entries are all allocated.
This new function free them.
2015-09-28 14:00:00 +02:00
Willy Tarreau
270978492c MEDIUM: config: set tune.maxrewrite to 1024 by default
The tune.maxrewrite parameter used to be pre-initialized to half of
the buffer size since the very early days when buffers were very small.
It has grown to absurdly large values over the years to reach 8kB for a
16kB buffer. This prevents large requests from being accepted, which is
the opposite of the initial goal.

Many users fix it to 1024 which is already quite large for header
addition.

So let's change the default setting policy :
  - pre-initialize it to 1024
  - let the user tweak it
  - in any case, limit it to tune.bufsize / 2

This results in 15kB usable to buffer HTTP messages instead of 8kB, and
doesn't affect existing configurations which already force it.
2015-09-28 13:59:41 +02:00
Thierry FOURNIER
7fe3be7281 MINOR: standard: avoid DNS resolution from the function str2sa_range()
This patch blocks the DNS resolution in the function str2sa_range(),
this is useful if the function is used during the HAProxy runtime.
2015-09-27 15:04:32 +02:00
Willy Tarreau
1895428ef4 DEBUG: add p_malloc() to return a poisonned memory area
This one is useful to detect improperly initialized memory areas
when some suspicious malloc() are involved in random behaviours.
2015-09-26 01:28:43 +02:00
Willy Tarreau
72b8c1f0aa MEDIUM: tools: make str2sa_range() optionally return the FQDN
The function does a bunch of things among which resolving environment
variables, skipping address family specifiers and trimming port ranges.
It is the only one which sees the complete host name before trying to
resolve it. The DNS resolving code needs to know the original hostname,
so we modify this function to optionally provide it to the caller.

Note that the function itself doesn't know if the host part was a host
or an address, but str2ip() knows that and can be asked not to try to
resolve. So we first try to parse the address without resolving and
try again with resolving enabled. This way we know if the address is
explicit or needs some kind of resolution.
2015-09-08 15:50:19 +02:00
Willy Tarreau
de39c9b10f CLEANUP: appsession: remove the last include files
These ones were include/common/appsession.h and include/common/sessionhash.h.
2015-08-10 19:42:30 +02:00
Willy Tarreau
5b4dd683cb MINOR: standard: provide htonll() and ntohll()
These are the 64-bit equivalent of htonl() and ntohl(). They're a bit
tricky in order to avoid expensive operations.

The principle consists in letting the compiler detect we're playing
with a union and simplify most or all operations. The asm-optimized
htonl() version involving bswap (x86) / rev (arm) / other is a single
operation on little endian, or a NOP on big-endian. In both cases,
this lets the compiler "see" that we're rebuilding a 64-bit word from
two 32-bit quantities that fit into a 32-bit register. In big endian,
the whole code is optimized out. In little endian, with a decent compiler,
a few bswap and 2 shifts are left, which is the minimum acceptable.
2015-07-21 23:50:06 +02:00
Thierry FOURNIER
763a5d85f7 MINOR: standard: add 64 bits conversion functions
This patch adds 3 functions for 64 bit integer conversion.

 * lltoa_r : converts signed 64 bit integer to string
 * read_uint64 : converts from string to signed 64 bits integer with capping
 * read_int64 : converts from string to unsigned 64 bits integer with capping
2015-07-21 23:27:10 +02:00
David Carlier
e6c3941668 BUILD/MINOR: tools: rename popcount to my_popcountl
This is in order to avoid conflicting with NetBSD popcount* functions
since 6.x release, the final l to mentions the argument is a long like
NetBSD does.

This patch could be backported to 1.5 to fix the build issue there as well.
2015-07-02 11:32:25 +02:00
Thierry FOURNIER
1480bd8dd2 MINOR: standard: add function that converts signed int to a string
This function is the same as "ultoa_r", but it takes a signed value
as input.
2015-06-13 22:59:14 +02:00
Joris Dedieu
9dd44ba5d6 BUG/MEDIUM: compat: fix segfault on FreeBSD
Since commit 65d805fd witch removes standard.h from compat.h some
values were not properly set on FreeBSD. This caused a segfault
at startup when smp_resolve_args is called.

As FreeBSD have IP_BINDANY, CONFIG_HAP_TRANSPARENT is define. This
cause struct conn_src to be extended with some fields. The size of
this structure was incorrect. Including netinet/in.h fix this issue.

While diving in code preprocessing, I found that limits.h was require
to properly set MAX_HOSTNAME_LEN, ULONG_MAX, USHRT_MAX and others
system limits on FreeBSD.
2015-06-13 08:25:36 +02:00
Christopher Faulet
31af49d62b MEDIUM: ssl: Add options to forge SSL certificates
With this patch, it is possible to configure HAProxy to forge the SSL
certificate sent to a client using the SNI servername. We do it in the SNI
callback.

To enable this feature, you must pass following BIND options:

 * ca-sign-file <FILE> : This is the PEM file containing the CA certitifacte and
   the CA private key to create and sign server's certificates.

 * (optionally) ca-sign-pass <PASS>: This is the CA private key passphrase, if
   any.

 * generate-certificates: Enable the dynamic generation of certificates for a
   listener.

Because generating certificates is expensive, there is a LRU cache to store
them. Its size can be customized by setting the global parameter
'tune.ssl.ssl-ctx-cache-size'.
2015-06-12 18:06:59 +02:00
Thierry FOURNIER
ddea626de4 MINOR: common: escape CSV strings
This function checks a string for using it in a CSV output format. If
the string contains one of the following four char <">, <,>, CR or LF,
the string is encapsulated between <"> and the <"> are escaped by a <"">
sequence.

The rounding by <"> is optionnal. It can be canceled, forced or the
function choose automatically the right way.
2015-05-28 17:47:19 +02:00
Willy Tarreau
98d0485a90 MAJOR: config: remove the deprecated reqsetbe / reqisetbe actions
These ones were already obsoleted in 1.4, marked for removal in 1.5,
and not documented anymore. They used to emit warnings, and do still
require quite some code to stay in place. Let's remove them now.
2015-05-26 12:18:29 +02:00
Willy Tarreau
f3045d2a06 MAJOR: pattern: add LRU-based cache on pattern matching
The principle of this cache is to have a global cache for all pattern
matching operations which rely on lists (reg, sub, dir, dom, ...). The
input data, the expression and a random seed are used as a hashing key.
The cached entries contains a pointer to the expression and a revision
number for that expression so that we don't accidently used obsolete
data after a pattern update or a very unlikely hash collision.

Regarding the risk of collisions, 10k entries at 10k req/s mean 1% risk
of a collision after 60 years, that's already much less than the memory's
reliability in most machines and more durable than most admin's life
expectancy. A collision will result in a valid result to be returned
for a different entry from the same list. If this is not acceptable,
the cache can be disabled using tune.pattern.cache-size.

A test on a file containing 10k small regex showed that the regex
matching was limited to 6k/s instead of 70k with regular strings.
When enabling the LRU cache, the performance was back to 70k/s.
2015-04-29 19:15:24 +02:00
Willy Tarreau
e6e49cfa93 MINOR: tools: provide an rdtsc() function for time comparisons
This one returns a timestamp, either the one from the CPU or from
gettimeofday() in 64-bit format. The purpose is to be able to compare
timestamps on various entities to make it easier to detect updates.
It can also be used for benchmarking in certain situations during
development.
2015-04-29 19:14:03 +02:00
Willy Tarreau
1b90511eeb CLEANUP: namespaces: fix protection against multiple inclusions
The include file did not protect correctly against multiple inclusions,
as it didn't define the file name after checking for it. That's currently
harmless as the file is only included from .c but that could change.
2015-04-08 17:31:40 +02:00
Willy Tarreau
87b09668be REORG/MAJOR: session: rename the "session" entity to "stream"
With HTTP/2, we'll have to support multiplexed streams. A stream is in
fact the largest part of what we currently call a session, it has buffers,
logs, etc.

In order to catch any error, this commit removes any reference to the
struct session and tries to rename most "session" occurrences in function
names to "stream" and "sess" to "strm" when that's related to a session.

The files stream.{c,h} were added and session.{c,h} removed.

The session will be reintroduced later and a few parts of the stream
will progressively be moved overthere. It will more or less contain
only what we need in an embryonic session.

Sample fetch functions and converters will have to change a bit so
that they'll use an L5 (session) instead of what's currently called
"L4" which is in fact L6 for now.

Once all changes are completed, we should see approximately this :

   L7 - http_txn
   L6 - stream
   L5 - session
   L4 - connection | applet

There will be at most one http_txn per stream, and a same session will
possibly be referenced by multiple streams. A connection will point to
a session and to a stream. The session will hold all the information
we need to keep even when we don't yet have a stream.

Some more cleanup is needed because some code was already far from
being clean. The server queue management still refers to sessions at
many places while comments talk about connections. This will have to
be cleaned up once we have a server-side connection pool manager.
Stream flags "SN_*" still need to be renamed, it doesn't seem like
any of them will need to move to the session.
2015-04-06 11:23:56 +02:00
Thierry FOURNIER
d2b597aa10 BUG/MEDIUM: lua: segfault with buffer_replace2
The function buffer_contig_space() returns the contiguous space avalaible
to add data (at the end of the input side) while the function
hlua_channel_send_yield() needs to insert data starting at p. Here we
introduce a new function bi_space_for_replace() which returns the amount
of space that can be inserted at the head of the input side with one of
the buffer_replace* functions.

This patch proposes a function that returns the space avalaible after buf->p.
2015-03-09 18:12:59 +01:00
Thierry FOURNIER
549aac8d0b MEDIUM: buffer: make bo_putblk/bo_putstr/bo_putchk return the number of bytes copied.
This is not used yet. Planned for LUA.
2015-02-28 23:12:32 +01:00
Thierry FOURNIER
58639a0ef3 MINOR: global: export function and permits to not resolve DNS names
exports the commonly used function str2ip. The function str2ip2 is
created and permits to not resolve DNS names.
2015-02-28 23:12:32 +01:00
Nenad Merdanovic
05552d4b98 MEDIUM: Add support for configurable TLS ticket keys
Until now, the TLS ticket keys couldn't have been configured and
shared between multiple instances or multiple servers running HAproxy.
The result was that if a request got a TLS ticket from one instance/server
and it hits another one afterwards, it will have to go through the full
SSL handshake and negotation.

This patch enables adding a ticket file to the bind line, which will be
used for all SSL contexts created from that bind line. We can use the
same file on all instances or servers to mitigate this issue and have
consistent TLS tickets assigned. Clients will no longer have to negotiate
every time they change the handling process.

Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>
2015-02-28 23:10:22 +01:00
Willy Tarreau
15a53a4384 MEDIUM: regex: add support for passing regex flags to regex_exec_match()
This function (and its sister regex_exec_match2()) abstract the regex
execution but make it impossible to pass flags to the regex engine.
Currently we don't use them but we'll need to support REG_NOTBOL soon
(to indicate that we're not at the beginning of a line). So let's add
support for this flag and update the API accordingly.
2015-01-22 14:24:53 +01:00
Willy Tarreau
c829ee48c7 MINOR: hash: add new function hash_crc32
This function will be used to perform CRC32 computations. This one wa
loosely inspired from crc32b found here, and focuses on size and speed
at the same time :

    http://www.hackersdelight.org/hdcodetxt/crc.c.txt

Much faster table-based versions exist but are pointless for our usage
here, this hash already sustains gigabit speed which is far faster than
what we'd ever need. Better preserve the CPU's cache instead.
2015-01-20 19:48:05 +01:00
Willy Tarreau
d025648f7c MAJOR: init: automatically set maxconn and/or maxsslconn when possible
If a memory size limit is enforced using "-n" on the command line and
one or both of maxconn / maxsslconn are not set, instead of using the
build-time values, haproxy now computes the number of sessions that can
be allocated depending on a number of parameters among which :

  - global.maxconn (if set)
  - global.maxsslconn (if set)
  - maxzlibmem
  - tune.ssl.cachesize
  - presence of SSL in at least one frontend (bind lines)
  - presence of SSL in at least one backend (server lines)
  - tune.bufsize
  - tune.cookie_len

The purpose is to ensure that not haproxy will not run out of memory
when maxing out all parameters. If neither maxconn nor maxsslconn are
used, it will consider that 100% of the sessions involve SSL on sides
where it's supported. That means that it will typically optimize maxconn
for SSL offloading or SSL bridging on all connections. This generally
means that the simple act of enabling SSL in a frontend or in a backend
will significantly reduce the global maxconn but in exchange of that, it
will guarantee that it will not fail.

All metrics may be enforced using #defines to accomodate variations in
SSL libraries or various allocation sizes.
2015-01-15 21:45:22 +01:00
Willy Tarreau
d92aa5c44a MINOR: global: report information about the cost of SSL connections
An SSL connection takes some memory when it exists and during handshakes.
We measured up to 16kB for an established endpoint, and up to 76 extra kB
during a handshake. The SSL layer stores these values into the global
struct during initialization. If other SSL libs are used, it's easy to
change these values. Anyway they'll only be used as gross estimates in
order to guess the max number of SSL conns that can be established when
memory is constrained and the limit is not set.
2015-01-15 21:34:39 +01:00
Willy Tarreau
3ca1a883f9 MINOR: tools: add new round_2dig() function to round integers
This function rounds down an integer to the closest value having only
2 significant digits.
2015-01-15 19:02:27 +01:00
Willy Tarreau
3889fffe92 MINOR: channel: rename channel_full() to !channel_may_recv()
This function's name was poorly chosen and is confusing to the point of
being suspiciously used at some places. The operations it does always
consider the ability to forward pending input data before receiving new
data. This is not obvious at all, especially at some places where it was
used when consuming outgoing data to know if the buffer has any chance
to ever get the missing data. The code needs to be re-audited with that
in mind. Care must be taken with existing code since the polarity of the
function was switched with the renaming.
2015-01-14 18:41:33 +01:00
Willy Tarreau
75abcb3106 MINOR: config: extend the default max hostname length to 64 and beyond
Some users reported that the default max hostname length of 32 is too
short in some environments. This patch does two things :

  - it relies on the system's max hostname length as found in MAXHOSTNAMELEN
    if it is set. This is the most logical thing to do as the system libs
    generally present the appropriate value supported by the system. This
    value is 64 on Linux and 256 on Solaris, to give a few examples.

  - otherwise it defaults to 64

It is still possible to override this value by defining MAX_HOSTNAME_LEN at
build time. After some observation time, this patch may be backported to
1.5 if it does not cause any build issue, as it is harmless and may help
some users.
2015-01-14 11:52:34 +01:00
Willy Tarreau
a24adf0795 MAJOR: session: only wake up as many sessions as available buffers permit
We've already experimented with three wake up algorithms when releasing
buffers : the first naive one used to wake up far too many sessions,
causing many of them not to get any buffer. The second approach which
was still in use prior to this patch consisted in waking up either 1
or 2 sessions depending on the number of FDs we had released. And this
was still inaccurate. The third one tried to cover the accuracy issues
of the second and took into consideration the number of FDs the sessions
would be willing to use, but most of the time we ended up waking up too
many of them for nothing, or deadlocking by lack of buffers.

This patch completely removes the need to allocate two buffers at once.
Instead it splits allocations into critical and non-critical ones and
implements a reserve in the pool for this. The deadlock situation happens
when all buffers are be allocated for requests pending in a maxconn-limited
server queue, because then there's no more way to allocate buffers for
responses, and these responses are critical to release the servers's
connection in order to release the pending requests. In fact maxconn on
a server creates a dependence between sessions and particularly between
oldest session's responses and latest session's requests. Thus, it is
mandatory to get a free buffer for a response in order to release a
server connection which will permit to release a request buffer.

Since we definitely have non-symmetrical buffers, we need to implement
this logic in the buffer allocation mechanism. What this commit does is
implement a reserve of buffers which can only be allocated for responses
and that will never be allocated for requests. This is made possible by
the requester indicating how much margin it wants to leave after the
allocation succeeds. Thus it is a cooperative allocation mechanism : the
requester (process_session() in general) prefers not to get a buffer in
order to respect other's need for response buffers. The session management
code always knows if a buffer will be used for requests or responses, so
that is not difficult :

  - either there's an applet on the initiator side and we really need
    the request buffer (since currently the applet is called in the
    context of the session)

  - or we have a connection and we really need the response buffer (in
    order to support building and sending an error message back)

This reserve ensures that we don't take all allocatable buffers for
requests waiting in a queue. The downside is that all the extra buffers
are really allocated to ensure they can be allocated. But with small
values it is not an issue.

With this change, we don't observe any more deadlocks even when running
with maxconn 1 on a server under severely constrained memory conditions.

The code becomes a bit tricky, it relies on the scheduler's run queue to
estimate how many sessions are already expected to run so that it doesn't
wake up everyone with too few resources. A better solution would probably
consist in having two queues, one for urgent requests and one for normal
requests. A failed allocation for a session dealing with an error, a
connection event, or the need for a response (or request when there's an
applet on the left) would go to the urgent request queue, while other
requests would go to the other queue. Urgent requests would be served
from 1 entry in the pool, while the regular ones would be served only
according to the reserve. Despite not yet having this, it works
remarkably well.

This mechanism is quite efficient, we don't perform too many wake up calls
anymore. For 1 million sessions elapsed during massive memory contention,
we observe about 4.5M calls to process_session() compared to 4.0M without
memory constraints. Previously we used to observe up to 16M calls, which
rougly means 12M failures.

During a test run under high memory constraints (limit enforced to 27 MB
instead of the 58 MB normally needed), performance used to drop by 53% prior
to this patch. Now with this patch instead it *increases* by about 1.5%.

The best effect of this change is that by limiting the memory usage to about
2/3 to 3/4 of what is needed by default, it's possible to increase performance
by up to about 18% mainly due to the fact that pools are reused more often
and remain hot in the CPU cache (observed on regular HTTP traffic with 20k
objects, buffers.limit = maxconn/10, buffers.reserve = limit/2).

Below is an example of scenario which used to cause a deadlock previously :
  - connection is received
  - two buffers are allocated in process_session() then released
  - one is allocated when receiving an HTTP request
  - the second buffer is allocated then released in process_session()
    for request parsing then connection establishment.
  - poll() says we can send, so the request buffer is sent and released
  - process session gets notified that the connection is now established
    and allocates two buffers then releases them
  - all other sessions do the same till one cannot get the request buffer
    without hitting the margin
  - and now the server responds. stream_interface allocates the response
    buffer and manages to get it since it's higher priority being for a
    response.
  - but process_session() cannot allocate the request buffer anymore

  => We could end up with all buffers used by responses so that none may
     be allocated for a request in process_session().

When the applet processing leaves the session context, the test will have
to be changed so that we always allocate a response buffer regardless of
the left side (eg: H2->H1 gateway). A final improvement would consists in
being able to only retry the failed I/O operation without waking up a
task, but to date all experiments to achieve this have proven not to be
reliable enough.
2014-12-24 23:47:33 +01:00
Willy Tarreau
f4718e8ec0 MEDIUM: buffer: implement b_alloc_margin()
This function is used to allocate a buffer and ensure that we leave
some margin after it in the pool. The function is not obvious. While
we allocate only one buffer, we want to ensure that at least two remain
available after our allocation. The purpose is to ensure we'll never
enter a deadlock where all sessions allocate exactly one buffer, and
none of them will be able to allocate the second buffer needed to build
a response in order to release the first one.

We also take care of remaining fast in the the fast path by first
checking whether or not there is enough margin, in which case we only
rely on b_alloc_fast() which is guaranteed to succeed. Otherwise we
take the slow path using pool_refill_alloc().
2014-12-24 23:47:32 +01:00
Willy Tarreau
620bd6c88e MINOR: buffer: implement b_alloc_fast()
This function allocates a buffer and replaces *buf with this buffer. If
no memory is available, &buf_wanted is used instead. No control is made
to check if *buf already pointed to another buffer. The allocated buffer
is returned, or NULL in case no memory is available. The difference with
b_alloc() is that this function only picks from the pool and never calls
malloc(), so it can fail even if some memory is available. It is the
caller's job to refill the buffer pool if needed.
2014-12-24 23:47:32 +01:00
Willy Tarreau
4428a29e52 MEDIUM: channel: do not report full when buf_empty is present on a channel
Till now we'd consider a buffer full even if it had size==0 due to pointing
to buf.size. Now we change this : if buf_wanted is present, it means that we
have already tried to allocate a buffer but failed. Thus the buffer must be
considered full so that we stop trying to poll for reads on it. Otherwise if
it's empty, it's buf_empty and we report !full since we may allocate it on
the fly.
2014-12-24 23:47:32 +01:00
Willy Tarreau
f2f7d6b27b MEDIUM: buffer: add a new buf_wanted dummy buffer to report failed allocations
Doing so ensures that even when no memory is available, we leave the
channel in a sane condition. There's a special case in proto_http.c
regarding the compression, we simply pre-allocate the tmpbuf to point
to the dummy buffer. Not reusing &buf_empty for this allows the rest
of the code to differenciate an empty buffer that's not used from an
empty buffer that results from a failed allocation which has the same
semantics as a buffer full.
2014-12-24 23:47:32 +01:00
Willy Tarreau
2a4b54359b MEDIUM: buffer: always assign a dummy empty buffer to channels
Channels are now created with a valid pointer to a buffer before the
buffer is allocated. This buffer is a global one called "buf_empty" and
of size zero. Thus it prevents any activity from being performed on
the buffer and still ensures that chn->buf may always be dereferenced.
b_free() also resets the buffer to &buf_empty, and was split into
b_drop() which does not reset the buffer.
2014-12-24 23:47:32 +01:00
Willy Tarreau
7dfca9daec MINOR: buffer: only use b_free to release buffers
We don't call pool_free2(pool2_buffers) anymore, we only call b_free()
to do the job. This ensures that we can start to centralize the releasing
of buffers.
2014-12-24 23:47:32 +01:00
Willy Tarreau
e583ea583a MEDIUM: buffer: use b_alloc() to allocate and initialize a buffer
b_alloc() now allocates a buffer and initializes it to the size specified
in the pool minus the size of the struct buffer itself. This ensures that
callers do not need to care about buffer details anymore. Also this never
applies memory poisonning, which is slow and useless on buffers.
2014-12-24 23:47:32 +01:00
Willy Tarreau
474cf54a97 MINOR: buffer: reset a buffer in b_reset() and not channel_init()
We'll soon need to be able to switch buffers without touching the
channel, so let's move buffer initialization out of channel_init().
We had the same in compressoin.c.
2014-12-24 23:47:31 +01:00
Willy Tarreau
a885f6dc65 MEDIUM: memory: improve pool_refill_alloc() to pass a refill count
Till now this function would only allocate one entry at a time. But with
dynamic buffers we'll like to allocate the number of missing entries to
properly refill the pool.

Let's modify it to take a minimum amount of available entries. This means
that when we know we need at least a number of available entries, we can
ask to allocate all of them at once. It also ensures that we don't move
the pointers back and forth between the caller and the pool, and that we
don't call pool_gc2() for each failed malloc. Instead, it's called only
once and the malloc is only allowed to fail once.
2014-12-24 23:47:31 +01:00
Willy Tarreau
0262241e26 MINOR: memory: cut pool allocator in 3 layers
pool_alloc2() used to pick the entry from the pool, fall back to
pool_refill_alloc(), and to perform the poisonning itself, which
pool_refill_alloc() was also doing. While this led to optimal
code size, it imposes memory poisonning on the buffers as well,
which is extremely slow on large buffers.

This patch cuts the allocator in 3 layers :
  - a layer to pick the first entry from the pool without falling back to
    pool_refill_alloc() : pool_get_first()
  - a layer to allocate a dirty area by falling back to pool_refill_alloc()
    but never performing the poisonning : pool_alloc_dirty()
  - pool_alloc2() which calls the latter and optionally poisons the area

No functional changes were made.
2014-12-24 23:47:31 +01:00
Willy Tarreau
e430e77dfd CLEANUP: memory: replace macros pool_alloc2/pool_free2 with functions
Using inline functions here makes the code more readable and reduces its
size by about 2 kB.
2014-12-24 23:47:31 +01:00
Willy Tarreau
62405a2155 CLEANUP: memory: remove dead code
The very old pool managment code has not been used for the last 7 years
and is still polluting the file. Get rid of it now.
2014-12-24 23:47:31 +01:00
Willy Tarreau
3dd717cd5d CLEANUP: lists: remove dead code
Remove the code dealing with the old dual-linked lists imported from
librt that has remained unused for the last 8 years. Now everything
uses the linux-like circular lists instead.
2014-12-24 23:47:31 +01:00
Willy Tarreau
23a5c396ec DEBUG: pools: apply poisonning on every allocated pool
Till now, when memory poisonning was enabled, it used to be done only
after a calloc(). But sometimes it's not enough to detect unexpected
sharing, so let's ensure that we now poison every allocation once it's
in place. Note that enabling poisonning significantly hurts performance
(it can typically half the overall performance).
2014-11-25 13:48:43 +01:00
KOVACS Krisztian
b3e54fe387 MAJOR: namespace: add Linux network namespace support
This patch makes it possible to create binds and servers in separate
namespaces.  This can be used to proxy between multiple completely independent
virtual networks (with possibly overlapping IP addresses) and a
non-namespace-aware proxy implementation that supports the proxy protocol (v2).

The setup is something like this:

net1 on VLAN 1 (namespace 1) -\
net2 on VLAN 2 (namespace 2) -- haproxy ==== proxy (namespace 0)
net3 on VLAN 3 (namespace 3) -/

The proxy is configured to make server connections through haproxy and sending
the expected source/target addresses to haproxy using the proxy protocol.

The network namespace setup on the haproxy node is something like this:

= 8< =
$ cat setup.sh
ip netns add 1
ip link add link eth1 type vlan id 1
ip link set eth1.1 netns 1
ip netns exec 1 ip addr add 192.168.91.2/24 dev eth1.1
ip netns exec 1 ip link set eth1.$id up
...
= 8< =

= 8< =
$ cat haproxy.cfg
frontend clients
  bind 127.0.0.1:50022 namespace 1 transparent
  default_backend scb

backend server
  mode tcp
  server server1 192.168.122.4:2222 namespace 2 send-proxy-v2
= 8< =

A bind line creates the listener in the specified namespace, and connections
originating from that listener also have their network namespace set to
that of the listener.

A server line either forces the connection to be made in a specified
namespace or may use the namespace from the client-side connection if that
was set.

For more documentation please read the documentation included in the patch
itself.

Signed-off-by: KOVACS Tamas <ktamas@balabit.com>
Signed-off-by: Sarkozi Laszlo <laszlo.sarkozi@balabit.com>
Signed-off-by: KOVACS Krisztian <hidden@balabit.com>
2014-11-21 07:51:57 +01:00
Christian Ruppert
de898712a0 MEDIUM: regex: Use pcre_study always when PCRE is used, regardless of JIT
pcre_study() has been around long before JIT has been added. It also seems to
affect the performance in some cases (positive).

Below I've attached some test restults. The test is based on
http://sljit.sourceforge.net/regex_perf.html (see bottom). It has been modified
to just test pcre_study vs. no pcre_study. Note: This test does not try to
match specific header it's instead run over a larger text with more and less
complex patterns to make the differences more clear.

% ./runtest
'mark.txt' loaded. (Length: 19665221 bytes)
-----------------
Regex: 'Twain'
[pcre-nostudy] time:    14 ms (2388 matches)
[pcre-study] time:    21 ms (2388 matches)
-----------------
Regex: '^Twain'
[pcre-nostudy] time:   109 ms (100 matches)
[pcre-study] time:   109 ms (100 matches)
-----------------
Regex: 'Twain$'
[pcre-nostudy] time:    14 ms (127 matches)
[pcre-study] time:    16 ms (127 matches)
-----------------
Regex: 'Huck[a-zA-Z]+|Finn[a-zA-Z]+'
[pcre-nostudy] time:   695 ms (83 matches)
[pcre-study] time:    26 ms (83 matches)
-----------------
Regex: 'a[^x]{20}b'
[pcre-nostudy] time:    90 ms (12495 matches)
[pcre-study] time:    91 ms (12495 matches)
-----------------
Regex: 'Tom|Sawyer|Huckleberry|Finn'
[pcre-nostudy] time:  1236 ms (3015 matches)
[pcre-study] time:    34 ms (3015 matches)
-----------------
Regex: '.{0,3}(Tom|Sawyer|Huckleberry|Finn)'
[pcre-nostudy] time:  5696 ms (3015 matches)
[pcre-study] time:  5655 ms (3015 matches)
-----------------
Regex: '[a-zA-Z]+ing'
[pcre-nostudy] time:  1290 ms (95863 matches)
[pcre-study] time:  1167 ms (95863 matches)
-----------------
Regex: '^[a-zA-Z]{0,4}ing[^a-zA-Z]'
[pcre-nostudy] time:   136 ms (4507 matches)
[pcre-study] time:   134 ms (4507 matches)
-----------------
Regex: '[a-zA-Z]+ing$'
[pcre-nostudy] time:  1334 ms (5360 matches)
[pcre-study] time:  1214 ms (5360 matches)
-----------------
Regex: '^[a-zA-Z ]{5,}$'
[pcre-nostudy] time:   198 ms (26236 matches)
[pcre-study] time:   197 ms (26236 matches)
-----------------
Regex: '^.{16,20}$'
[pcre-nostudy] time:   173 ms (4902 matches)
[pcre-study] time:   175 ms (4902 matches)
-----------------
Regex: '([a-f](.[d-m].){0,2}[h-n]){2}'
[pcre-nostudy] time:  1242 ms (68621 matches)
[pcre-study] time:   690 ms (68621 matches)
-----------------
Regex: '([A-Za-z]awyer|[A-Za-z]inn)[^a-zA-Z]'
[pcre-nostudy] time:  1215 ms (675 matches)
[pcre-study] time:   952 ms (675 matches)
-----------------
Regex: '"[^"]{0,30}[?!\.]"'
[pcre-nostudy] time:    27 ms (5972 matches)
[pcre-study] time:    28 ms (5972 matches)
-----------------
Regex: 'Tom.{10,25}river|river.{10,25}Tom'
[pcre-nostudy] time:   705 ms (2 matches)
[pcre-study] time:    68 ms (2 matches)

In some cases it's more or less the same but when it's faster than by a huge margin.
It always depends on the pattern, the string(s) to match against etc.

Signed-off-by: Christian Ruppert <c.ruppert@babiel.com>
2014-11-18 13:26:18 +01:00
Thierry FOURNIER
317e1c4f1e MINOR: sample: add "json" converter
This converter escapes string to use it as json/ascii escaped string.
It can read UTF-8 with differents behavior on errors and encode it in
json/ascii.

json([<input-code>])
  Escapes the input string and produces an ASCII ouput string ready to use as a
  JSON string. The converter tries to decode the input string according to the
  <input-code> parameter. It can be "ascii", "utf8", "utf8s", "utf8"" or
  "utf8ps". The "ascii" decoder never fails. The "utf8" decoder detects 3 types
  of errors:
   - bad UTF-8 sequence (lone continuation byte, bad number of continuation
     bytes, ...)
   - invalid range (the decoded value is within a UTF-8 prohibited range),
   - code overlong (the value is encoded with more bytes than necessary).

  The UTF-8 JSON encoding can produce a "too long value" error when the UTF-8
  character is greater than 0xffff because the JSON string escape specification
  only authorizes 4 hex digits for the value encoding. The UTF-8 decoder exists
  in 4 variants designated by a combination of two suffix letters : "p" for
  "permissive" and "s" for "silently ignore". The behaviors of the decoders
  are :
   - "ascii"  : never fails ;
   - "utf8"   : fails on any detected errors ;
   - "utf8s"  : never fails, but removes characters corresponding to errors ;
   - "utf8p"  : accepts and fixes the overlong errors, but fails on any other
                error ;
   - "utf8ps" : never fails, accepts and fixes the overlong errors, but removes
                characters corresponding to the other errors.

  This converter is particularly useful for building properly escaped JSON for
  logging to servers which consume JSON-formated traffic logs.

  Example:
     capture request header user-agent len 150
     capture request header Host len 15
     log-format {"ip":"%[src]","user-agent":"%[capture.req.hdr(1),json]"}

  Input request from client 127.0.0.1:
     GET / HTTP/1.0
     User-Agent: Very "Ugly" UA 1/2

  Output log:
     {"ip":"127.0.0.1","user-agent":"Very \"Ugly\" UA 1\/2"}
2014-10-26 06:41:12 +01:00
Willy Tarreau
3986b9c140 MEDIUM: config: report it when tcp-request rules are misplaced
A config where a tcp-request rule appears after an http-request rule
might seem valid but it is not. So let's report a warning about this
since this case is hard to detect by the naked eye.
2014-09-16 15:43:24 +02:00
Willy Tarreau
edee1d60b7 MEDIUM: stick-table: make it easier to register extra data types
Some users want to add their own data types to stick tables. We don't
want to use a linked list here for performance reasons, so we need to
continue to use an indexed array. This patch allows one to reserve a
compile-time-defined number of extra data types by setting the new
macro STKTABLE_EXTRA_DATA_TYPES to anything greater than zero, keeping
in mind that anything larger will slightly inflate the memory consumed
by stick tables (not per entry though).

Then calling stktable_register_data_store() with the new keyword will
either register a new keyword or fail if the desired entry was already
taken or the keyword already registered.

Note that this patch does not dictate how the data will be used, it only
offers the possibility to create new keywords and have an index to
reference them in the config and in the tables. The caller will not be
able to use stktable_data_cast() and will have to explicitly cast the
stable pointers to the expected types. It can be used for experimentation
as well.
2014-07-15 19:14:52 +02:00
Willy Tarreau
65d805fdfc BUILD: fix dependencies between config and compat.h
compat.h only depends on the system, and config needs compat, not the
opposite. global.h was fixed to explicitly include standard.h for LONGBITS.
2014-07-15 19:09:36 +02:00
Dan Dubovik
bd57a9f977 BUG/MEDIUM: backend: Update hash to use unsigned int throughout
When we were generating a hash, it was done using an unsigned long.  When the hash was used
to select a backend, it was sent as an unsigned int.  This made it difficult to predict which
backend would be selected.

This patch updates get_hash, and the hash methods to use an unsigned int, to remain consistent
throughout the codebase.

This fix should be backported to 1.5 and probably in part to 1.4.
2014-07-08 22:00:21 +02:00
Willy Tarreau
4e957907aa MINOR: log: make MAX_SYSLOG_LEN overridable at build time
This value was set in log.h without any #ifndef around, so when one
wanted to change it, a patch was needed. Let's move it to defaults.h
with the usual #ifndef so that it's easier to change it.
2014-06-27 18:13:53 +02:00
Simon Horman
98637e5bff MEDIUM: Add external check
Add an external check which makes use of an external process to
check the status of a server.
2014-06-20 07:10:07 +02:00
Emeric Brun
c8b27b6c68 MEDIUM: ssl: add 300s supported time skew on OCSP response update.
OCSP_MAX_RESPONSE_TIME_SKEW can be set to a different value at
compilation (default is 300 seconds).
2014-06-19 14:37:30 +02:00
Emeric Brun
4147b2ef10 MEDIUM: ssl: basic OCSP stapling support.
The support is all based on static responses. This doesn't add any
request / response logic to HAProxy, but allows a way to update
information through the socket interface.

Currently certificates specified using "crt" or "crt-list" on "bind" lines
are loaded as PEM files.
For each PEM file, haproxy checks for the presence of file at the same path
suffixed by ".ocsp". If such file is found, support for the TLS Certificate
Status Request extension (also known as "OCSP stapling") is automatically
enabled. The content of this file is optional. If not empty, it must contain
a valid OCSP Response in DER format. In order to be valid an OCSP Response
must comply with the following rules: it has to indicate a good status,
it has to be a single response for the certificate of the PEM file, and it
has to be valid at the moment of addition. If these rules are not respected
the OCSP Response is ignored and a warning is emitted. In order to  identify
which certificate an OCSP Response applies to, the issuer's certificate is
necessary. If the issuer's certificate is not found in the PEM file, it will
be loaded from a file at the same path as the PEM file suffixed by ".issuer"
if it exists otherwise it will fail with an error.

It is possible to update an OCSP Response from the unix socket using:

  set ssl ocsp-response <response>

This command is used to update an OCSP Response for a certificate (see "crt"
on "bind" lines). Same controls are performed as during the initial loading of
the response. The <response> must be passed as a base64 encoded string of the
DER encoded response from the OCSP server.

Example:
  openssl ocsp -issuer issuer.pem -cert server.pem \
               -host ocsp.issuer.com:80 -respout resp.der
  echo "set ssl ocsp-response $(base64 -w 10000 resp.der)" | \
               socat stdio /var/run/haproxy.stat

This feature is automatically enabled on openssl 0.9.8h and above.

This work was performed jointly by Dirkjan Bussink of GitHub and
Emeric Brun of HAProxy Technologies.
2014-06-18 18:28:56 +02:00
Thierry FOURNIER
26202760a4 MINOR: regex: Use native PCRE API.
The pcreposix layer (in the pcre projetc) execute strlen to find
thlength of the string. When we are using the function "regex_exex*2",
the length is used to add a final \0, when pcreposix is executed a
strlen is executed to compute the length.

If we are using a native PCRE api, the length is provided as an
argument, and these operations disappear.

This is useful because PCRE regex are more used than POSIC regex.
2014-06-18 15:14:00 +02:00
Thierry FOURNIER
09af0d6d43 MEDIUM: regex: replace all standard regex function by own functions
This patch remove all references of standard regex in haproxy. The last
remaining references are only in the regex.[ch] files.

In the file src/checks.c, the original function uses a "pmatch" array.
In fact this array is unused. This patch remove it.
2014-06-18 15:07:57 +02:00
Thierry FOURNIER
b8f980cc19 MINOR: regex: Create JIT compatible function that return match strings
This patchs rename the "regex_exec" to "regex_exec2". It add a new
"regex_exec", "regex_exec_match" and "regex_exec_match2" function. This
function can match regex and return array containing matching parts.
Otherwise, this function use the compiled method (JIT or PCRE or POSIX).

JIT require a subject with length. PCREPOSIX and native POSIX regex
require a null terminted subject. The regex_exec* function are splited
in two version. The first version take a null terminated string, but it
execute strlen() on the subject if it is compiled with JIT. The second
version (terminated by "2") take the subject and the length. This
version adds a null character in the subject if it is compiled with
PCREPOSIX or native POSIX functions.

The documentation of posix regex and pcreposix says that the function
returns 0 if the string matche otherwise it returns REG_NOMATCH. The
REG_NOMATCH macro take the value 1 with posix regex and the value 17
with the pcreposix. The documentaion of the native pcre API (used with
JIT) returns a negative number if no match, otherwise, it returns 0 or a
positive number.

This patch fix also the return codes of the regex_exec* functions. Now,
these function returns true if the string match, otherwise it returns
false.
2014-06-18 15:07:50 +02:00
Willy Tarreau
4bfc580dd3 MEDIUM: session: maintain per-backend and per-server time statistics
Using the last rate counters, we now compute the queue, connect, response
and total times per server and per backend with a 95% accuracy over the last
1024 samples. The operation is cheap so we don't need to condition it.
2014-06-17 17:15:56 +02:00
Willy Tarreau
588297f2f9 MINOR: tools: add new functions to quote-encode strings
qstr() and cstr() will be used to quote-encode strings. The first one
does it unconditionally. The second one is aimed at CSV files where the
quote-encoding is only needed when the field contains a quote or a comma.
2014-06-16 18:20:14 +02:00
Simon Horman
75ab8bdb83 MEDIUM: Add port_to_str helper
This helper is similar to addr_to_str but
tries to convert the port rather than the address
of a struct sockaddr_storage.

This is in preparation for supporting
an external agent check.

Signed-off-by: Simon Horman <horms@verge.net.au>
2014-06-16 10:10:33 +02:00
Remi Gacogne
f46cd6e4ec MEDIUM: ssl: Add the option to use standardized DH parameters >= 1024 bits
When no static DH parameters are specified, this patch makes haproxy
use standardized (rfc 2409 / rfc 3526) DH parameters with prime lenghts
of 1024, 2048, 4096 or 8192 bits for DHE key exchange. The size of the
temporary/ephemeral DH key is computed as the minimum of the RSA/DSA server
key size and the value of a new option named tune.ssl.default-dh-param.
2014-06-12 16:12:23 +02:00
Willy Tarreau
c874653bb4 BUILD: don't use type "uint" which is not portable
Dmitry Sivachenko reported that "uint" doesn't build on FreeBSD 10.
On Linux it's defined in sys/types.h and indicated as "old". Just
get rid of the very few occurrences.
2014-05-28 23:05:07 +02:00
Sasha Pachev
c600204ddf BUG/MEDIUM: regex: fix risk of buffer overrun in exp_replace()
Currently exp_replace() (which is used in reqrep/reqirep) is
vulnerable to a buffer overrun. I have been able to reproduce it using
the attached configuration file and issuing the following command:

  wget -O - -S -q http://localhost:8000/`perl -e 'print "a"x4000'`/cookie.php

Str was being checked only in in while (str) and it was possible to
read past that when more than one character was being accessed in the
loop.

WT:
   Note that this bug is only marked MEDIUM because configurations
   capable of triggering this bug are very unlikely to exist at all due
   to the fact that most rewrites consist in static string additions
   that largely fit into the reserved area (8kB by default).

   This fix should also be backported to 1.4 and possibly even 1.3 since
   it seems to have been present since 1.1 or so.

Config:
-------

  global
        maxconn         500
        stats socket /tmp/haproxy.sock mode 600

  defaults
        timeout client      1000
        timeout connect      5000
        timeout server      5000
        retries         1
        option redispatch

  listen stats
        bind :8080
        mode http
        stats enable
        stats uri /stats
        stats show-legends

  listen  tcp_1
        bind :8000
        mode            http
        maxconn 400
        balance roundrobin
        reqrep ^([^\ :]*)\ /(.*)/(.*)\.php(.*) \1\ /\3.php?arg=\2\2\2\2\2\2\2\2\2\2\2\2\2\4
        server  srv1 127.0.0.1:9000 check port 9000 inter 1000 fall 1
        server  srv2 127.0.0.1:9001 check port 9001 inter 1000 fall 1
2014-05-27 14:36:06 +02:00
Willy Tarreau
6346f0a534 DOC: stop referencing the slow git repository in the README
git.1wt.eu is painfully slow and some people experience issues with
it. Better hide it and only advertise git.haproxy.org which is mirrored
on a faster server.

Also replace haproxy.1wt.eu with www.haproxy.org in the download URL
which appears in the stats page.
2014-05-10 11:04:39 +02:00
Willy Tarreau
18ca2d48bf MINOR: tools: split is_addr() and is_inet_addr()
The is_addr() function indicates if an address is set and is an IPv4
or IPv6 address. Let's rename it is_inet_addr() and make is_addr()
also accept AF_UNIX addresses.
2014-05-10 01:26:37 +02:00
Willy Tarreau
a9db57ec5c MEDIUM: config: limit nbproc to the machine's word size
Some consistency checks cannot be performed between frontends, backends
and peers at the moment because there is no way to check for intersection
between processes bound to some processes when the number of processes is
higher than the number of bits in a word.

So first, let's limit the number of processes to the machine's word size.
This means nbproc will be limited to 32 on 32-bit machines and 64 on 64-bit
machines. This is far more than enough considering that configs rarely go
above 16 processes due to scalability and management issues, so 32 or 64
should be fine.

This way we'll ensure we can always build a mask of all the processes a
section is bound to.
2014-05-09 19:16:26 +02:00
Willy Tarreau
cefad67689 BUILD: syscalls: remove improper inline statement in front of syscalls
Trying to build with an old gcc and glibc revealed that we must not
state "inline" in our _syscall* definitions since it's already present
in the declaration making use of the _syscall* macros.
2014-05-08 22:38:02 +02:00
Willy Tarreau
5a8ba60fe1 CLEANUP: buffers: remove unused function buffer_contig_space_with_res()
This function is now unused and was dangerous. Its cousin
buffer_contig_space_res() was removed as well since it was the only
one to use it.
2014-04-24 17:19:22 +02:00
Willy Tarreau
272adea423 REORG: cfgparse: move server keyword parsing to server.c
The cfgparse.c file becomes huge, and a large part of it comes from the
server keyword parser. Since the configuration is a bit more modular now,
move this parser to server.c.

This patch also moves the check of the "server" keyword earlier in the
supported keywords list, resulting in a slightly faster config parsing
for configs with large numbers of servers (about 10%).

No functional change was made, only the code was moved.
2014-03-31 10:42:03 +02:00
Thierry FOURNIER
fa45f1d06c MEDIUM: config: Dynamic sections.
This patch permit to register new sections in the haproxy's
configuration file. This run like all the "keyword" registration, it is
used during the haproxy initialization, typically with the
"__attribute__((constructor))" functions.
2014-03-31 09:56:40 +02:00
Thierry FOURNIER
9f95e4084c MINOR: standard: Add ipv6 support in the function url2sa().
The function url2sa() converts faster url like http://<ip>:<port> in a
struct sockaddr_storage. This patch add:
 - the https support
 - permit to return the length parsed
 - support IPv6
 - support DNS synchronous resolution only during start of haproxy.

The faster IPv4 convertion way is keeped. IPv6 is slower, because I use
the standard IPv6 parser function.
2014-03-31 09:54:44 +02:00
Thierry FOURNIER
fc7ac7b89c MINOR: standard: Disable ip resolution during the runtime
The function str2net runs DNS resolution if valid ip cannot be parsed.
The DNS function used is the standard function of the libc and it
performs asynchronous request.

The asynchronous request is not compatible with the haproxy
archictecture.

str2net() is used during the runtime throught the "socket".

This patch remove the DNS resolution during the runtime.
2014-03-17 18:06:08 +01:00
Thierry FOURNIER
0b6d15fdc8 MINOR: regex: The pointer regstr in the struc regex is no longer used.
The pointer <regstr> is only used to compare and identify the original
regex string with the patterns. Now the patterns have a reference map
containing this original string. It is useless to store this value two
times.
2014-03-17 18:06:08 +01:00
Thierry FOURNIER
b050463375 MINOR: standard: Add function for converting cidr to network mask. 2014-03-17 18:06:07 +01:00
Thierry FOURNIER
511e9475f2 MEDIUM: acl/pattern: standardisation "of pat_parse_int()" and "pat_parse_dotted_ver()"
The goal of these patch is to simplify the prototype of
"pat_pattern_*()" functions. I want to replace the argument "char
**args" by a simple "char *arg" and remove the "opaque" argument.

"pat_parse_int()" and "pat_parse_dotted_ver()" are the unique pattern
parser using the "opaque" argument and using more than one string
argument of the char **args. These specificities are only used with ACL.
Other systems using this pattern parser (MAP and CLI) just use one
string for describing a range.

This two functions can read a range, but the min and the max must y
specified. This patch extends the syntax to describe a range with
implicit min and max. This is used for operators like "lt", "le", "gt",
and "ge". the syntax is the following:

   ":x" -> no min to "x"
   "x:" -> "x" to no max

This patch moves the parsing of the comparison operator from the
functions "pat_parse_int()" and "pat_parse_dotted_ver()" to the acl
parser. The acl parser read the operator and the values and build a
volatile string readable by the functions "pat_parse_int()" and
"pat_parse_dotted_ver()". The transformation is done with these rules:

If the parser is "pat_parse_int()":

   "eq x" -> "x"
   "le x" -> ":x"
   "lt x" -> ":y" (with y = x - 1)
   "ge x" -> "x:"
   "gt x" -> "y:" (with y = x + 1)

If the parser is "pat_parse_dotted_ver()":

   "eq x.y" -> "x.y"
   "le x.y" -> ":x.y"
   "lt x.y" -> ":w.z" (with w.z = x.y - 1)
   "ge x.y" -> "x.y:"
   "gt x.y" -> "w.z:" (with w.z = x.y + 1)

Note that, if "y" is not present, assume that is "0".

Now "pat_parse_int()" and "pat_parse_dotted_ver()" accept only one
pattern and the variable "opaque" is no longer used. The prototype of
the pattern parsers can be changed.
2014-03-17 18:06:06 +01:00
Thierry FOURNIER
e059ec9393 MINOR: standard: add function "encode_chunk"
This function has the same behavior as encode_string(), except it
takes a "struct chunk" instead of a "char *" on input.
2014-03-17 16:38:56 +01:00
Willy Tarreau
cc08d2c9ff MEDIUM: counters: stop relying on session flags at all
Till now, we had one flag per stick counter to indicate if it was
tracked in a backend or in a frontend. We just had to add another
flag per stick-counter to indicate if it relies on contents or just
connection. These flags are quite painful to maintain and tend to
easily conflict with other flags if their number is changed.

The correct solution consists in moving the flags to the stkctr struct
itself, but currently this struct is made of 2 pointers, so adding a
new entry there to store only two bits will cause at least 16 more bytes
to be eaten per counter due to alignment issues, and we definitely don't
want to waste tens to hundreds of bytes per session just for things that
most users don't use.

Since we only need to store two bits per counter, an intermediate
solution consists in replacing the entry pointer with a composite
value made of the original entry pointer and the two flags in the
2 unused lower bits. If later a need for other flags arises, we'll
have to store them in the struct.

A few inline functions have been added to abstract the retrieval
and assignment of the pointers and flags, resulting in very few
changes. That way there is no more dependence on the number of
stick-counters and their position in the session flags.
2014-01-28 23:34:45 +01:00
Willy Tarreau
bb519c7cd1 MINOR: tools: add very basic support for composite pointers
Very often we want to associate one or two flags to a pointer, to
put a type on it or whatever. This patch provides this in standard.h
in the form of a few inline functions which combine a void * pointer
with an int and return an unsigned long called a composite address.
The functions allow to individuall set, retrieve both the pointer and
the flags. This is very similar to what is used in ebtree in fact.
2014-01-28 23:34:45 +01:00
Willy Tarreau
f3338349ec BUG/MEDIUM: counters: flush content counters after each request
One year ago, commit 5d5b5d8 ("MEDIUM: proto_tcp: add support for tracking
L7 information") brought support for tracking L7 information in tcp-request
content rules. Two years earlier, commit 0a4838c ("[MEDIUM] session-counters:
correctly unbind the counters tracked by the backend") used to flush the
backend counters after processing a request.

While that earliest patch was correct at the time, it became wrong after
the second patch was merged. The code does what it says, but the concept
is flawed. "TCP request content" rules are evaluated for each HTTP request
over a single connection. So if such a rule in the frontend decides to
track any L7 information or to track L4 information when an L7 condition
matches, then it is applied to all requests over the same connection even
if they don't match. This means that a rule such as :

     tcp-request content track-sc0 src if { path /index.html }

will count one request for index.html, and another one for each of the
objects present on this page that are fetched over the same connection
which sent the initial matching request.

Worse, it is possible to make the code do stupid things by using multiple
counters:

     tcp-request content track-sc0 src if { path /foo }
     tcp-request content track-sc1 src if { path /bar }

Just sending two requests first, one with /foo, one with /bar, shows
twice the number of requests for all subsequent requests. Just because
both of them persist after the end of the request.

So the decision to flush backend-tracked counters was not the correct
one. In practice, what is important is to flush countent-based rules
since they are the ones evaluated for each request.

Doing so requires new flags in the session however, to keep track of
which stick-counter was tracked by what ruleset. A later change might
make this easier to maintain over time.

This bug is 1.5-specific, no backport to stable is needed.
2014-01-28 21:40:28 +01:00
Willy Tarreau
12833bbca5 MINOR: cli: add the new "show pools" command
show pools
  Dump the status of internal memory pools. This is useful to track memory
  usage when suspecting a memory leak for example. It does exactly the same
  as the SIGQUIT when running in foreground except that it does not flush
  the pools.
2014-01-28 16:50:35 +01:00
Willy Tarreau
2317976daa BUILD: listener: fix recent accept4() again
Recent commit 4448925 ("BUILD/MINOR: listener: remove a glibc warning on accept4()")
broke accept4() on some systems because the glibc's version may now conflict with
the local one.
2014-01-15 16:45:17 +01:00
Willy Tarreau
89efaed6b6 BUILD: definitely silence some stupid GCC warnings
It's becoming increasingly difficult to ignore unwanted function returns in
debug code with gcc. Now even when you try to work around it, it suggests a
way to write your code differently. For example :

    src/frontend.c:187:65: warning: if statement has empty body [-Wempty-body]
                if (write(1, trash.str, trash.len) < 0) /* shut gcc warning */;
                                                                              ^
    src/frontend.c:187:65: note: put the semicolon on a separate line to silence this warning
    1 warning generated.

This is totally unacceptable, this code already had to be written this way
to shut it up in earlier versions. And now it comments the form ? What's the
purpose of the C language if you can't write anymore the code that does what
you want ?

Emeric proposed to just keep a global variable to drain such useless results
so that gcc stops complaining all the time it believes people who write code
are monkeys. The solution is acceptable because the useless assignment is done
only in debug code so it will not impact performance. This patch implements
this, until gcc becomes even "smarter" to detect that we tried to cheat.
2013-12-13 15:21:36 +01:00
Willy Tarreau
5f3f15f618 BUILD: time: adapt the type of TV_ETERNITY to the local system
Some systems use different types for tv_sec/tv_usec, some are
signed others not. From time to time new warnings are reported
about implicit casts being done.

This patch ensures that TV_ETERNITY is cast to the appropriate
type in assignments and conversions.
2013-12-13 09:22:23 +01:00
Thierry FOURNIER
39e258fcee MINOR: regex: Copy the original regex expression into string.
This is useful for the debug or for search regex in maps.
2013-12-12 15:43:34 +01:00
Thierry FOURNIER
799c042daa MINOR: regex: Change the struct containing regex
This change permits to remove the typedef. The original regex structs
are set in haproxy's struct.
2013-12-12 15:42:58 +01:00
Baptiste Assmann
bb77c8e26d MINOR: tools: function my_memmem() to lookup binary contents
This function simply looks for a memory block inside another one.

Signed-off-by: Baptiste Assmann <bedis9@gmail.com>
2013-12-06 11:50:47 +01:00
Willy Tarreau
126d40691a MINOR: tools: add a generic binary hex string parser
We currently use such an hex parser in pat_parse_bin() to parse hex
string patterns. We'll need another generic one so let's move it to
standard.c and have pat_parse_bin() make use of it.
2013-12-06 11:50:47 +01:00
Thierry FOURNIER
d559dd8390 MINOR: tools: Add a function to convert buffer to an ipv6 address
The inet_pton function needs an input string with a final \0. This
function copies the input string to a temporary buffer, adds the final
\0 and converts to address.
2013-12-02 23:31:32 +01:00
Simon Horman
58c32978b2 MEDIUM: Set rise and fall of agent checks to 1
This is achieved by moving rise and fall from struct server to struct check.

After this move the behaviour of the primary check, server->check is
unchanged. However, the secondary agent check, server->agent now has
independent rise and fall values each of which are set to 1.

The result is that receiving "fail", "stopped" or "down" just once from the
agent will mark the server as down. And receiving a weight just once will
allow the server to be marked up if its primary check is in good health.

This opens up the scope to allow the rise and fall values of the agent
check to be configurable, however this has not been implemented at this
stage.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-25 07:31:16 +01:00
Willy Tarreau
a0f4271497 MEDIUM: backend: add support for the wt6 hash
This function was designed for haproxy while testing other functions
in the past. Initially it was not planned to be used given the not
very interesting numbers it showed on real URL data : it is not as
smooth as the other ones. But later tests showed that the other ones
are extremely sensible to the server count and the type of input data,
especially DJB2 which must not be used on numeric input. So in fact
this function is still a generally average performer and it can make
sense to merge it in the end, as it can provide an alternative to
sdbm+avalanche or djb2+avalanche for consistent hashing or when hashing
on numeric data such as a source IP address or a visitor identifier in
a URL parameter.
2013-11-14 16:37:50 +01:00
Bhaskar
98634f0c7b MEDIUM: backend: Enhance hash-type directive with an algorithm options
Summary:
In testing at tumblr, we found that using djb2 hashing instead of the
default sdbm hashing resulted is better workload distribution to our backends.

This commit implements a change, that allows the user to specify the hash
function they want to use. It does not limit itself to consistent hashing
scenarios.

The supported hash functions are sdbm (default), and djb2.

For a discussion of the feature and analysis, see mailing list thread
"Consistent hashing alternative to sdbm" :

      http://marc.info/?l=haproxy&m=138213693909219

Note: This change does NOT make changes to new features, for instance,
applying an avalance hashing always being performed before applying
consistent hashing.
2013-11-14 16:37:50 +01:00
Thierry FOURNIER
ef37a66628 CLEANUP: The function "regex_exec" needs the string length but in many case they expect null terminated char.
If haproxy is compiled with the USE_PCRE_JIT option, the length of the
string is used. If it is compiled without this option the function doesn't
use the length and expects a null terminated string.

The prototype of the function is ambiguous, and depends on the
compilation option. The developer can think that the length is always
used, and many bugs can be created.

This patch makes sure that the length is used. The regex_exec function
adds the final '\0' if it is needed.
2013-10-23 12:19:51 +02:00
Thierry FOURNIER
ed5a4aefae CLEANUP: regex: Create regex_comp function that compiles regex using compilation options
The current file "regex.h" define an abstraction for the regex. It
provides the same struct name and the same "regexec" function for the
3 regex types supported: standard libc, basic pcre and jit pcre.

The regex compilation function is not provided by this file. If the
developper wants to use regex, he must write regex compilation code
containing "#define *JIT*".

This patch provides a unique regex compilation function according to
the compilation options.

In addition, the "regex.h" file checks the presence of the "#define
PCRE_CONFIG_JIT" when "USE_PCRE_JIT" is enabled. If this flag is not
present, the pcre lib doesn't support JIT and "#error" is emitted.
2013-10-14 14:42:50 +02:00
Thierry FOURNIER
e28f1ecf2b BUILD/MINOR: missing header file
In the header file "common/regex.h", the C keyword NULL is used. This
keyword is referenced into the header file "stdlib.h", but this is not
included.
2013-10-10 11:38:35 +02:00
Willy Tarreau
47060b6ae0 MINOR: cli: make it possible to enter multiple values at once with "set table"
The "set table" statement allows to create new entries with their respective
values. Till now it was limited to a single data type per line, requiring as
many "set table" statements as the desired data types to be set. Since this
is only a parser limitation, this patch gets rid of it. It also allows the
creation of a key with no data types (all reset to their default values).
2013-08-01 21:17:19 +02:00
Willy Tarreau
b4c8493a9f MINOR: session: make the number of stick counter entries more configurable
In preparation of more flexibility in the stick counters, make their
number configurable. It still defaults to 3 which is the minimum
accepted value. Changing the value alone is not sufficient to get
more counters, some bitfields still need to be updated and the TCP
actions need to be updated as well, but this update tries to be
easier, which is nice for experimentation purposes.
2013-08-01 21:17:14 +02:00
Lukas Tribus
67db8df12b MEDIUM: http: add IPv6 support for "set-tos"
As per RFC3260 #4 and BCP37 #4.2 and #5.2, the IPv6 counterpart of TOS
is "traffic class".

Add support for IPv6 traffic class in "set-tos" by moving the "set-tos"
related code to the new inline function inet_set_tos(), handling IPv4
(IP_TOS), IPv6 (IPV6_TCLASS) and IPv4-mapped sockets (IP_TOS, like
::ffff:127.0.0.1).

Also define - if missing - the IN6_IS_ADDR_V4MAPPED() macro in
include/common/compat.h for compatibility.
2013-06-23 18:01:38 +02:00
Willy Tarreau
dc13c11c1e BUG/MEDIUM: prevent gcc from moving empty keywords lists into BSS
Benoit Dolez reported a failure to start haproxy 1.5-dev19. The
process would immediately report an internal error with missing
fetches from some crap instead of ACL names.

The cause is that some versions of gcc seem to trim static structs
containing a variable array when moving them to BSS, and only keep
the fixed size, which is just a list head for all ACL and sample
fetch keywords. This was confirmed at least with gcc 3.4.6. And we
can't move these structs to const because they contain a list element
which is needed to link all of them together during the parsing.

The bug indeed appeared with 1.5-dev19 because it's the first one
to have some empty ACL keyword lists.

One solution is to impose -fno-zero-initialized-in-bss to everyone
but this is not really nice. Another solution consists in ensuring
the struct is never empty so that it does not move there. The easy
solution consists in having a non-null list head since it's not yet
initialized.

A new "ILH" list head type was thus created for this purpose : create
an Initialized List Head so that gcc cannot move the struct to BSS.
This fixes the issue for this version of gcc and does not create any
burden for the declarations.
2013-06-21 23:29:02 +02:00
Willy Tarreau
2b57cb8f30 MEDIUM: protocol: implement a "drain" function in protocol layers
Since commit cfd97c6f was merged into 1.5-dev14 (BUG/MEDIUM: checks:
prevent TIME_WAITs from appearing also on timeouts), some valid health
checks sometimes used to show some TCP resets. For example, this HTTP
health check sent to a local server :

  19:55:15.742818 IP 127.0.0.1.16568 > 127.0.0.1.8000: S 3355859679:3355859679(0) win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
  19:55:15.742841 IP 127.0.0.1.8000 > 127.0.0.1.16568: S 1060952566:1060952566(0) ack 3355859680 win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
  19:55:15.742863 IP 127.0.0.1.16568 > 127.0.0.1.8000: . ack 1 win 257
  19:55:15.745402 IP 127.0.0.1.16568 > 127.0.0.1.8000: P 1:23(22) ack 1 win 257
  19:55:15.745488 IP 127.0.0.1.8000 > 127.0.0.1.16568: FP 1:146(145) ack 23 win 257
  19:55:15.747109 IP 127.0.0.1.16568 > 127.0.0.1.8000: R 23:23(0) ack 147 win 257

After some discussion with Chris Huang-Leaver, it appeared clear that
what we want is to only send the RST when we have no other choice, which
means when the server has not closed. So we still keep SYN/SYN-ACK/RST
for pure TCP checks, but don't want to see an RST emitted as above when
the server has already sent the FIN.

The solution against this consists in implementing a "drain" function at
the protocol layer, which, when defined, causes as much as possible of
the input socket buffer to be flushed to make recv() return zero so that
we know that the server's FIN was received and ACKed. On Linux, we can make
use of MSG_TRUNC on TCP sockets, which has the benefit of draining everything
at once without even copying data. On other platforms, we read up to one
buffer of data before the close. If recv() manages to get the final zero,
we don't disable lingering. Same for hard errors. Otherwise we do.

In practice, on HTTP health checks we generally find that the close was
pending and is returned upon first recv() call. The network trace becomes
cleaner :

  19:55:23.650621 IP 127.0.0.1.16561 > 127.0.0.1.8000: S 3982804816:3982804816(0) win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
  19:55:23.650644 IP 127.0.0.1.8000 > 127.0.0.1.16561: S 4082139313:4082139313(0) ack 3982804817 win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
  19:55:23.650666 IP 127.0.0.1.16561 > 127.0.0.1.8000: . ack 1 win 257
  19:55:23.651615 IP 127.0.0.1.16561 > 127.0.0.1.8000: P 1:23(22) ack 1 win 257
  19:55:23.651696 IP 127.0.0.1.8000 > 127.0.0.1.16561: FP 1:146(145) ack 23 win 257
  19:55:23.652628 IP 127.0.0.1.16561 > 127.0.0.1.8000: F 23:23(0) ack 147 win 257
  19:55:23.652655 IP 127.0.0.1.8000 > 127.0.0.1.16561: . ack 24 win 257

This change should be backported to 1.4 which is where Chris encountered
this issue. The code is different, so probably the tcp_drain() function
will have to be put in the checks only.
2013-06-10 20:33:23 +02:00
Willy Tarreau
bf43f6eb32 MINOR: defaults: allow REQURI_LEN and CAPTURE_LEN to be redefined
Some users want to change these default settings but it was not
convenient.
2013-06-10 10:30:09 +02:00
Pieter Baauw
1eb7592bba MINOR: tproxy: add support for OpenBSD
OpenBSD uses (SOL_SOCKET, SO_BINDANY) to enable transparent
proxy on a socket.

This patch adds support for the relevant setsockopt() calls.
2013-05-11 08:03:50 +02:00
Pieter Baauw
ff30b6667b MINOR: tproxy: add support for FreeBSD
FreeBSD uses (IPPROTO_IP, IP_BINDANY) and (IPPROTO_IPV6, IPV6_BINDANY)
to enable transparent proxy on a socket.

This patch adds support for the relevant setsockopt() calls.
2013-05-11 08:03:43 +02:00
Pieter Baauw
d551fb5a8d REORG: tproxy: prepare the transparent proxy defines for accepting other OSes
This patch does not change the logic of the code, it only changes the
way OS-specific defines are tested.

At the moment the transparent proxy code heavily depends on Linux-specific
defines. This first patch introduces a new define "CONFIG_HAP_TRANSPARENT"
which is set every time the defines used by transparent proxy are present.
This also means that with an up-to-date libc, it should not be necessary
anymore to force CONFIG_HAP_LINUX_TPROXY during the build, as the flags
will automatically be detected.

The CTTPROXY flags still remain separate because this older API doesn't
work the same way.

A new line has been added in the version output for haproxy -vv to indicate
what transparent proxy support is available.
2013-05-11 08:03:37 +02:00
Willy Tarreau
dd11293e84 BUG/MINOR: acl: fix a double free during exit when using PCRE_JIT
When freeing ACL regex, we don't want to perform the free() in regex_free()
as it's already performed in free_pattern(). The double free only happens
when using PCRE_JIT when freeing everything during exit so it's harmless
but exhibits libc errors during a reload/restart.

Bug reported by Seri.
2013-05-08 18:14:18 +02:00
de Lafond Guillaume
88c278fadf MEDIUM: stats: add proxy name filtering on the statistic page
This patch adds a "scope" box in the statistics page in order to
display only proxies with a name that contains the requested value.
The scope filter is preserved across all clicks on the page.
2013-04-15 22:50:33 +02:00
Lukas Tribus
0999f7662c BUILD: add explicit support for TFO with USE_TFO
TCP Fast Open is supported in server mode since Linux 3.7, but current
libc's don't define TCP_FASTOPEN=23. Introduce the new USE flag USE_TFO
to define it manually in compat.h. Also note this in the TFO related
documentation.
2013-04-02 17:40:43 +02:00
Willy Tarreau
0161d62d23 OPTIM: http: improve branching in chunk size parser
By tweaking a bit some conditions in http_parse_chunk_size(), we could
improve the overall performance in the worst case by 15%.
2013-04-02 02:00:57 +02:00
Willy Tarreau
bf43927cd7 OPTIM: buffer: remove one jump in buffer_count()
We can help gcc build an expression not involving a jump. This function
is used a lot when parsing chunks.
2013-04-02 01:25:57 +02:00
Hiroaki Nakamura
7035132349 MEDIUM: regex: Use PCRE JIT in acl
This is a patch for using PCRE JIT in acl.

I notice regex are used in other places, but they are more complicated
to modify to use PCRE APIs. So I focused to acl in the first try.

BTW, I made a simple benchmark program for PCRE JIT beforehand.
https://github.com/hnakamur/pcre-jit-benchmark

I read the manual for PCRE JIT
http://www.manpagez.com/man/3/pcrejit/

and wrote my benchmark program.
https://github.com/hnakamur/pcre-jit-benchmark/blob/master/test-pcre.c
2013-04-02 00:02:54 +02:00
Willy Tarreau
dad36a3ee3 MAJOR: tools: support environment variables in addresses
Now that all addresses are parsed using str2sa_range(), it becomes easy
to add support for environment variables and use them everywhere an address
is needed. Environment variables are used as $VAR or ${VAR} as in shell.
Any number of variables may compose an address, allowing various fantasies
such as "fd@${FD_HTTP}" or "${LAN_DC1}.1:80".

These ones are usable in logs, bind, servers, peers, stats socket, source,
dispatch, and check address.
2013-03-11 01:30:02 +01:00
Willy Tarreau
24709286fe MEDIUM: tools: support specifying explicit address families in str2sa_range()
This change allows one to force the address family in any address parsed
by str2sa_range() by specifying it as a prefix followed by '@' then the
address. Currently supported address prefixes are 'ipv4@', 'ipv6@', 'unix@'.
This also helps forcing resolving for host names (when getaddrinfo is used),
and force the family of the empty address (eg: 'ipv4@' = 0.0.0.0 while
'ipv6@' = ::).

The main benefits is that unix sockets can now get a local name without
being forced to begin with a slash. This is useful during development as
it is no longer necessary to have stats socket sent to /tmp.
2013-03-10 22:46:55 +01:00
Willy Tarreau
c120c8d347 CLEANUP: minor cleanup in str2sa_range() and str2ip()
Don't use a statically allocated address both for str2ip and str2sa_range,
use the same. The inet and unix code paths have been splitted a little
better to improve readability.
2013-03-10 21:36:31 +01:00
Willy Tarreau
add0ab1975 CLEANUP: tools: remove str2sun() which is not used anymore. 2013-03-08 14:04:54 +01:00
Willy Tarreau
d393a628bb MINOR: tools: prepare str2sa_range() to accept a prefix
We'll need str2sa_range() to support a prefix for unix sockets. Since
we don't always want to use it (eg: stats socket), let's not take it
unconditionally from global but let the caller pass it.
2013-03-08 14:04:54 +01:00
Willy Tarreau
df350f1f48 MINOR: tools: prepare str2sa_range() to return an error message
We'll need str2sa_range() to return address parsing errors if we want to
extend its functionalities. Let's do that now eventhough it's not used
yet.
2013-03-08 14:04:53 +01:00
Emeric Brun
6924ef8b12 BUG/MEDIUM: ssl: ECDHE ciphers not usable without named curve configured.
Fix consists to use prime256v1 as default named curve to init ECDHE ciphers if none configured.
2013-03-06 19:08:26 +01:00
Willy Tarreau
4f4b18b2ec BUILD/MINOR: syscall: add definition of NR_accept4 for ARM
This platform was not covered and older libc do not provide accept4().
2013-03-04 07:38:08 +01:00
Willy Tarreau
b26cc86b1c BUG/MINOR: syscall: fix NR_accept4 system call on sparc/linux
An invalid copy-paste called it NR_splice instead of NR_accept4.
This does not lead to real issues because if this define is used,
then the code cannot compile since NR_accept4 is still missing.
2013-03-04 07:31:08 +01:00
Willy Tarreau
d4448bc836 MEDIUM: tools: make str2sa_range support all address syntaxes
Right now we have multiple methods for parsing IP addresses in the
configuration. This is quite painful. This patch aims at adapting
str2sa_range() to make it support all formats, so that the callers
perform the appropriate tests on the return values. str2sa() was
changed to simply return str2sa_range().

The output values are now the following ones (taken from the comment
on top of the function).

  Converts <str> to a locally allocated struct sockaddr_storage *, and a port
  range or offset consisting in two integers that the caller will have to
  check to find the relevant input format. The following format are supported :

    String format           | address |  port  |  low   |  high
     addr                   | <addr>  |   0    |   0    |   0
     addr:                  | <addr>  |   0    |   0    |   0
     addr:port              | <addr>  | <port> | <port> | <port>
     addr:pl-ph             | <addr>  |  <pl>  |  <pl>  |  <ph>
     addr:+port             | <addr>  | <port> |   0    | <port>
     addr:-port             | <addr>  |-<port> | <port> |   0

  The detection of a port range or increment by the caller is made by
  comparing <low> and <high>. If both are equal, then port 0 means no port
  was specified. The caller may pass NULL for <low> and <high> if it is not
  interested in retrieving port ranges.

  Note that <addr> above may also be :
    - empty ("")  => family will be AF_INET and address will be INADDR_ANY
    - "*"         => family will be AF_INET and address will be INADDR_ANY
    - "::"        => family will be AF_INET6 and address will be IN6ADDR_ANY
    - a host name => family and address will depend on host name resolving.
2013-02-20 17:29:30 +01:00
Simon Horman
5269cfb458 BUG/MINOR: Correct logic in cut_crlf()
This corrects what appears to be logic errors in cut_crlf().
I assume that the intention of this function is to truncate a
string at the first cr or lf. However, currently lf are ignored.

Also use '\0' instead of 0 as the null character, a cosmetic change.

Cc: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Simon Horman <horms@verge.net.au>

[WT: this fix may be backported to 1.4 too]
2013-02-13 10:52:40 +01:00
Willy Tarreau
eab777c32e BUG/MINOR: time: frequency counters are not totally accurate
When a frontend is rate-limited to 1000 connections per second, the
effective rate measured from the client is 999/s, and connections
experience an average response time of 99.5 ms with a standard
deviation of 2 ms.

The reason for this inaccuracy is that when computing frequency
counters, we use one part of the previous value proportional to the
number of milliseconds remaining in the current second. But even the
last millisecond still uses a part of the past value, which is wrong :
since we have a 1ms resolution, the last millisecond must be dedicated
only to filling the current second.

So we slightly adjust the algorithm to use 999/1000 of the past value
during the first millisecond, and 0/1000 of the past value during the
last millisecond.  We also slightly improve the computation by computing
the remaining time instead of the current time in tv_update_date(), so
that we don't have to negate the value in each frequency counter.

Now with the fix, the connection rate measured by both the client and
haproxy is a steady 1000/s, the average response time measured is 99.2ms
and more importantly, the standard deviation has been divided by 3 to
0.6 millisecond.

This fix should also be backported to 1.4 which has the same issue.
2012-12-29 21:50:07 +01:00
Willy Tarreau
56adcf2cc9 MINOR: tools: simplify the use of the int to ascii macros
These macros (U2H, U2A, LIM2A, ...) have been used with an explicit
index for the local storage variable, making it difficult to change
log formats and causing a few issues from time to time. Let's have
a single macro with a rotating index so that up to 10 conversions
may be used in a single call.
2012-12-23 21:46:30 +01:00
Willy Tarreau
47ca54505c MINOR: chunks: centralize the trash chunk allocation
At the moment, we need trash chunks almost everywhere and the only
correctly implemented one is in the sample code. Let's move this to
the chunks so that all other places can use this allocator.

Additionally, the get_trash_chunk() function now really returns two
different chunks. Previously it used to always overwrite the same
chunk and point it to a different buffer, which was a bit tricky
because it's not obvious that two consecutive results do alias each
other.
2012-12-23 21:46:07 +01:00
Willy Tarreau
37994f034c MINOR: standard: add a simple popcount function
This function returns the number of ones in a word.
2012-11-19 12:12:24 +01:00
Emeric Brun
4663577e24 MINOR: build: allow packagers to specify the ssl cache size
This is done by passing the default value to SSLCACHESIZE in sessions.
User can use tune.sslcachesize to change this value.
By default, it is set to 20000 sessions as openssl internal cache size.
Currently, a session entry size is between 592 and 616 bytes depending on the arch.
2012-11-15 10:52:19 +01:00
Willy Tarreau
e9f49e78fe MAJOR: polling: replace epoll with sepoll and remove sepoll
Now that all pollers make use of speculative I/O, there is no point
having two epoll implementations, so replace epoll with the sepoll code
and remove sepoll which has just become the standard epoll method.
2012-11-11 20:53:30 +01:00
Willy Tarreau
ed7f836f07 BUG/MINOR: stream_interface: don't loop over ->snd_buf()
It is stupid to loop over ->snd_buf() because the snd_buf() itself already
loops and stops when system buffers are full. But looping again onto it,
we lose the information of the full buffers and perform one useless syscall.

Furthermore, this causes issues when dealing with large uploads while waiting
for a connection to establish, as it can report a server reject of some data
as a connection abort, which is wrong.

1.4 does not have this issue as it loops maximum twice (once for each buffer
half) and exists as soon as system buffers are full. So no backport is needed.
2012-10-29 23:30:33 +01:00
Willy Tarreau
7780473c3b CLEANUP: replace chunk_printf() with chunk_appendf()
This function's naming was misleading as it is used to append data
at the end of a string, causing some surprizes when used for the
first time!

Add a chunk_printf() function which does what its name suggests.
2012-10-29 16:14:26 +01:00
Willy Tarreau
c26ac9deea MINOR: chunk: add a function to reset a chunk
This is a first step in avoiding to constantly reinitialize chunks.
It replaces the old chunk_reset() which was not properly named as it
used to drop everything and was only used by chunk_destroy(). It has
been renamed chunk_drop().
2012-10-29 13:33:42 +01:00
Yuxans Yao
4e25b015a7 MINOR: log: add '%Tl' to log-format
The '%Tl' is similar to '%T', but using local timezone.
2012-10-29 11:55:26 +01:00
Willy Tarreau
422a0a5161 MINOR: tools: add a clear_addr() function to unset an address
This will be used to unset a from address.
2012-10-26 20:04:26 +02:00
Willy Tarreau
3dd0c4e20e OPTIM: tools: inline hex2i()
This tiny function was not inlined because initially not much used.
However it's been used un the chunk parser for a while and it became
one of the most CPU-cycle eater there. By inlining it, the chunk parser
speed was increased by 74 %. We're almost 3 times faster than original
with just the last 4 commits.
2012-10-26 01:13:24 +02:00
Willy Tarreau
ad8f8e8ffb MINOR: chunk: provide string compare functions
It's sometimes needed to be able to compare a zero-terminated string with a
chunk, so we now have two functions to do that, one strcmp() equivalent and
one strcasecmp() equivalent.
2012-10-19 15:18:06 +02:00
Willy Tarreau
9b28e03b66 MAJOR: channel: replace the struct buffer with a pointer to a buffer
With this commit, we now separate the channel from the buffer. This will
allow us to replace buffers on the fly without touching the channel. Since
nobody is supposed to keep a reference to a buffer anymore, doing so is not
a problem and will also permit some copy-less data manipulation.

Interestingly, these changes have shown a 2% performance increase on some
workloads, probably due to a better cache placement of data.
2012-10-13 09:07:52 +02:00
Willy Tarreau
7fca87fd9d BUILD: accept4: move the socketcall declaration outside of accept4()
Gcc 4.2.4 breaks on the syscall declared inside the function, move it
outside and declare it static inline.
2012-10-10 17:42:39 +02:00
Willy Tarreau
1bc4aab290 MEDIUM: listener: add support for linux's accept4() syscall
On Linux, accept4() does the same as accept() except that it allows
the caller to specify some flags to set on the resulting socket. We
use this to set the O_NONBLOCK flag and thus to save one fcntl()
call in each connection. The effect is a small performance gain of
around 1%.

The option is automatically enabled when target linux2628 is set, or
when the USE_ACCEPT4 Makefile variable is set. If the libc is too old
to provide the equivalent function, this is automatically detected and
our own function is used instead. In any case it is possible to force
the use of our implementation with USE_MY_ACCEPT4.
2012-10-08 20:11:03 +02:00
Emeric Brun
76d8895c49 MINOR: ssl: add defines LISTEN_DEFAULT_CIPHERS and CONNECT_DEFAULT_CIPHERS.
These ones are used to set the default ciphers suite on "bind" lines and
"server" lines respectively, instead of using OpenSSL's defaults. These
are probably mainly useful for distro packagers.
2012-10-05 22:11:15 +02:00
Willy Tarreau
8c89c2059f MINOR: buffers: add a few functions to write chars, strings and blocks
bo_put{chr,blk,str,chk} are used to write data on the output of a buffer.
Output is truncated if the buffer is not large enough.
2012-10-04 22:26:09 +02:00
Willy Tarreau
4fbb2285e2 MINOR: config: make str2listener() use memprintf() to report errors.
This will make it possible to use the function for other listening
sockets.
2012-09-24 10:53:16 +02:00
Willy Tarreau
eb6cead1de MINOR: standard: make memprintf() support a NULL destination
Doing so removes many checks that were systematically made because
the callees don't know if the caller passed a valid pointer.
2012-09-24 10:53:16 +02:00
Willy Tarreau
ce39bfb7c4 BUG: backend: balance hdr was broken since 1.5-dev11
Alex Markham reported and diagnosed a bug appearing on 1.5-dev11,
causing a crash on x86_64 when header hashing is used. The cause is
a missing (int) cast causing a negative offset to appear positive
and the resulting pointer to go out of bounds.

The crash is not possible anymore since 1.5-dev12 because a second
bug caused the negative sign to disappear so the pointer is always
within range but always wrong, so balance hdr() never works anymore.

This fix restores the correct behaviour and ensures the sign is
correct.
2012-09-22 18:36:29 +02:00