Commit Graph

1396 Commits

Author SHA1 Message Date
Willy Tarreau
37994f034c MINOR: standard: add a simple popcount function
This function returns the number of ones in a word.
2012-11-19 12:12:24 +01:00
Emeric Brun
4f65bff1a5 MINOR: ssl: Add tune.ssl.lifetime statement in global.
Sets the ssl session <lifetime> in seconds. Openssl default is 300 seconds.
2012-11-16 16:47:20 +01:00
Willy Tarreau
fc6c032d8d MEDIUM: global: add support for CPU binding on Linux ("cpu-map")
The new "cpu-map" directive allows one to assign the CPU sets that
a process is allowed to bind to. This is useful in combination with
the "nbproc" and "bind-process" directives.

The support is implicit on Linux 2.6.28 and above.
2012-11-16 16:16:53 +01:00
William Lallemand
ec3e3890f0 BUG/MINOR: compression: deinit zlib only when required
The zlib stream was deinitialized even when the init failed.
2012-11-15 15:42:17 +01:00
Emeric Brun
4663577e24 MINOR: build: allow packagers to specify the ssl cache size
This is done by passing the default value to SSLCACHESIZE in sessions.
User can use tune.sslcachesize to change this value.
By default, it is set to 20000 sessions as openssl internal cache size.
Currently, a session entry size is between 592 and 616 bytes depending on the arch.
2012-11-15 10:52:19 +01:00
Willy Tarreau
3fdb366885 MAJOR: connection: replace struct target with a pointer to an enum
Instead of storing a couple of (int, ptr) in the struct connection
and the struct session, we use a different method : we only store a
pointer to an integer which is stored inside the target object and
which contains a unique type identifier. That way, the pointer allows
us to retrieve the object type (by dereferencing it) and the object's
address (by computing the displacement in the target structure). The
NULL pointer always corresponds to OBJ_TYPE_NONE.

This reduces the size of the connection and session structs. It also
simplifies target assignment and compare.

In order to improve the generated code, we try to put the obj_type
element at the beginning of all the structs (listener, server, proxy,
si_applet), so that the original and target pointers are always equal.

A lot of code was touched by massive replaces, but the changes are not
that important.
2012-11-12 00:42:33 +01:00
Willy Tarreau
128b03c9ab CLEANUP: stream_interface: remove the external task type target
Before connections were introduced, it was possible to connect an
external task to a stream interface. However it was left as an
exercise for the brave implementer to find how that ought to be
done.

The feature was broken since the introduction of connections and
was never fixed since due to lack of users. Better remove this dead
code now.
2012-11-11 23:14:16 +01:00
Willy Tarreau
b31c971bef CLEANUP: channel: remove any reference of the hijackers
Hijackers were functions designed to inject data into channels in the
distant past. They became unused around 1.3.16, and since there has
not been any user of this mechanism to date, it's uncertain whether
the mechanism still works (and it's not really useful anymore). So
better remove it as well as the pointer it uses in the channel struct.
2012-11-11 23:05:39 +01:00
Willy Tarreau
50fc7777c6 MEDIUM: http: refrain from sending "Connection: close" when Upgrade is present
Some servers are not totally HTTP-compliant when it comes to parsing the
Connection header. This is particularly true with WebSocket where it happens
from time to time that a server doesn't support having a "close" token along
with the "Upgrade" token in the Connection header. This broken behaviour has
also been noticed on some clients though the problem is less frequent on the
response path.

Sometimes the workaround consists in enabling "option http-pretend-keepalive"
to leave the request Connection header untouched, but this is not always the
most convenient solution. This patch introduces a new solution : haproxy now
also looks for the "Upgrade" token in the Connection header and if it finds
it, then it refrains from adding any other token to the Connection header
(though "keep-alive" and "close" may still be removed if found). The same is
done for the response headers.

This way, WebSocket much with less changes even when facing non-compliant
clients or servers. At least it fixes the DISCONNECT issue that was seen
on the websocket.org test.

Note that haproxy does not change its internal mode, it just refrains from
adding new tokens to the connection header.
2012-11-11 22:40:00 +01:00
Willy Tarreau
70c6fd82c3 MAJOR: polling: remove unused callbacks from the poller struct
Since no poller uses poller->{set,clr,wai,is_set,rem} anymore, let's
remove them and remove the associated pointer tests in proto/fd.h.
2012-11-11 21:02:34 +01:00
Willy Tarreau
e9f49e78fe MAJOR: polling: replace epoll with sepoll and remove sepoll
Now that all pollers make use of speculative I/O, there is no point
having two epoll implementations, so replace epoll with the sepoll code
and remove sepoll which has just become the standard epoll method.
2012-11-11 20:53:30 +01:00
Willy Tarreau
7f7ad91056 BUILD: stream_interface: remove si_fd() and its references
si_fd() is not used a lot, and breaks builds on OpenBSD 5.2 which
defines this name for its own purpose. It's easy enough to remove
this one-liner function, so let's do it.
2012-11-11 20:53:29 +01:00
Willy Tarreau
09f24569d4 REORG: fd: centralize the processing of speculative events
Speculative events are independant on the poller, so they can be
centralized in fd.c.
2012-11-11 17:45:39 +01:00
Willy Tarreau
6ea20b1acb REORG: fd: move the fd state management from ev_sepoll
ev_sepoll already provides everything needed to manage FD events
by only manipulating the speculative I/O list. Nothing there is
sepoll-specific so move all this to fd.
2012-11-11 17:45:39 +01:00
Willy Tarreau
7be79a41e1 REORG: fd: move the speculative I/O management from ev_sepoll
The speculative I/O will need to be ported to all pollers, so move
this to fd.c.
2012-11-11 17:45:39 +01:00
William Lallemand
d85f917daf MINOR: compression: maximum compression rate limit
This patch adds input and output rate calcutation on the HTTP compresion
feature.

Compression can be limited with a maximum rate value in kilobytes per
second. The rate is set with the global 'maxcomprate' option. You can
change this value dynamicaly with 'set rate-limit http-compression
global' on the UNIX socket.
2012-11-10 17:47:27 +01:00
William Lallemand
f3747837e5 MINOR: compression: tune.comp.maxlevel
This option allows you to set the maximum compression level usable by
the compression algorithm. It affects CPU usage.
2012-11-10 17:47:07 +01:00
Willy Tarreau
037d2c1f8f MAJOR: sepoll: make the poller totally event-driven
At the moment sepoll is not 100% event-driven, because a call to fd_set()
on an event which is already being polled will not change its state.

This causes issues with OpenSSL because if some I/O processing is interrupted
after clearing the I/O event (eg: read all data from a socket, can't put it
all into the buffer), then there is no way to call the SSL_read() again once
the buffer releases some space.

The only real solution is to go 100% event-driven. The principle is to use
the spec list as an event cache and that each time an I/O event is reported
by epoll_wait(), this event is automatically scheduled for addition to the
spec list for future calls until the consumer explicitly asks for polling
or stopping.

Doing this is a bit tricky because sepoll used to provide a substantial
number of optimizations such as event merging. These optimizations have
been maintained : a dedicated update list is affected when events change,
but not the event list, so that updates may cancel themselves without any
side effect such as displacing events. A specific case was considered for
handling newly created FDs as soon as they are detected from within the
poll loop. This ensures that their read or write operation will always be
attempted as soon as possible, thus reducing the number of poll loops and
process_session wakeups. This is especially true for newly accepted fds
which immediately perform their first recv() call.

Two new flags were added to the fdtab[] struct to tag the fact that a file
descriptor already exists in the update list. One flag indicates that a
file descriptor is new and has just been created (fdtab[].new) and the other
one indicates that a file descriptor is already referenced by the update list
(fdtab[].updated). Even if the FD state changes during operations or if the
fd is closed and replaced, it's not an issue because the update flag remains
and is easily spotted during list walks. The flag must absolutely reflect the
presence of the fd in the update list in order to avoid overflowing the update
list with more events than there are distinct fds.

Note that this change also recovers the small performance loss introduced
by its connection counter-part and goes even beyond.
2012-11-10 00:17:27 +01:00
Willy Tarreau
c8dd77fddf MAJOR: connection: remove the CO_FL_CURR_*_POL flag
This is the first step of a series of changes aiming at making the
polling totally event-driven. This first change consists in only
remembering at the connection level whether an FD was enabled or not,
regardless of the fact it was being polled or cached. From now on, an
EAGAIN will always be considered as a change so that the pollers are
able to manage a cache and to flush it based on such events. One of
the noticeable effect is that conn_fd_handler() is called once more
per session (6 instead of 5 min) but other update functions are less
called.

Note that the performance loss caused by this change at the moment is
quite significant, around 2.5%, but the change is needed to have SSL
working correctly in all situations, even when data were read from the
socket and stored in the invisible cache, waiting for some room in the
channel's buffer.
2012-11-09 22:09:33 +01:00
William Lallemand
9d5f5480fd MEDIUM: compression: limit RAM usage
With the global maxzlibmem option, you are able ton control the maximum
amount of RAM usable for HTTP compression.

A test is done before each zlib allocation, if the there isn't available
memory, the test fail and so the zlib initialization, so data won't be
compressed.
2012-11-08 15:23:30 +01:00
William Lallemand
2b50247695 MEDIUM: use pool for zlib
Don't use the zlib allocator anymore, 5 pools are used for the zlib
compression. Their sizes depends of the window size and the memLevel in
deflateInit2.
2012-11-08 15:23:29 +01:00
William Lallemand
a509e4c332 MINOR: compression: memlevel and windowsize
The window size and the memlevel of the zlib are now configurable using
global options tune.zlib.memlevel and tune.zlib.windowsize.

It affects the memory consumption of the zlib.
2012-11-08 15:23:29 +01:00
William Lallemand
08289f12f9 BUILD: remove dependency to zlib.h
The build was dependent of the zlib.h header, regardless of the USE_ZLIB
option. The fix consists of several #ifdef in the source code.

It removes the overhead of the zstream structure in the session when you
don't use the option.
2012-11-05 10:23:16 +01:00
William Lallemand
1c2d622d82 CLEANUP: use struct comp_ctx instead of union
Replace union comp_ctx by struct comp_ctx.

Use struct comp_ctx * in the init/add_data/flush/reset/end prototypes of
compression.h functions.
2012-11-05 10:23:16 +01:00
Willy Tarreau
ed7f836f07 BUG/MINOR: stream_interface: don't loop over ->snd_buf()
It is stupid to loop over ->snd_buf() because the snd_buf() itself already
loops and stops when system buffers are full. But looping again onto it,
we lose the information of the full buffers and perform one useless syscall.

Furthermore, this causes issues when dealing with large uploads while waiting
for a connection to establish, as it can report a server reject of some data
as a connection abort, which is wrong.

1.4 does not have this issue as it loops maximum twice (once for each buffer
half) and exists as soon as system buffers are full. So no backport is needed.
2012-10-29 23:30:33 +01:00
Willy Tarreau
07115412d3 MEDIUM: stick-table: allocate the table key of size buffer size
Keys are copied from samples to stick_table_key. If a key is larger
than the stick_table_key, we have an overflow. In pratice it does not
happen because it requires :
   1) a configuration with tune.bufsize larger than BUFSIZE (common)
   2) a stick-table configured with keys strictly larger than buffers
   3) extraction of data larger than BUFSIZE (eg: using payload())

Points 2 and 3 don't make any sense for a real world configuration. That
said the issue needs be fixed. The solution consists in allocating it the
same size as the global buffer size, just like the samples. This fixes the
issue.
2012-10-29 21:56:59 +01:00
Willy Tarreau
7e2c647ee7 MEDIUM: remove remains of BUFSIZE in HTTP auth and sample conversions
Sample conversions rely on two alternative buffers which were previously
allocated as static bufs of size BUFSIZE. Now they're initialized to the
global buffer size. It was the same for HTTP authentication. Note that it
seems that none of them was prone to any mistake when dealing with the
buffer size, but better stay on the safe side by maintaining the old
assumption that a trash buffer is always "large enough".
2012-10-29 20:44:36 +01:00
Willy Tarreau
19d14ef104 MEDIUM: make the trash be a chunk instead of a char *
The trash is used everywhere to store the results of temporary strings
built out of s(n)printf, or as a storage for a chunk when chunks are
needed.

Using global.tune.bufsize is not the most convenient thing either.

So let's replace trash with a chunk and directly use it as such. We can
then use trash.size as the natural way to get its size, and get rid of
many intermediary chunks that were previously used.

The patch is huge because it touches many areas but it makes the code
a lot more clear and even outlines places where trash was used without
being that obvious.
2012-10-29 16:57:30 +01:00
Willy Tarreau
7780473c3b CLEANUP: replace chunk_printf() with chunk_appendf()
This function's naming was misleading as it is used to append data
at the end of a string, causing some surprizes when used for the
first time!

Add a chunk_printf() function which does what its name suggests.
2012-10-29 16:14:26 +01:00
Willy Tarreau
c26ac9deea MINOR: chunk: add a function to reset a chunk
This is a first step in avoiding to constantly reinitialize chunks.
It replaces the old chunk_reset() which was not properly named as it
used to drop everything and was only used by chunk_destroy(). It has
been renamed chunk_drop().
2012-10-29 13:33:42 +01:00
Yuxans Yao
4e25b015a7 MINOR: log: add '%Tl' to log-format
The '%Tl' is similar to '%T', but using local timezone.
2012-10-29 11:55:26 +01:00
Willy Tarreau
70737d142f MINOR: compression: add an offload option to remove the Accept-Encoding header
This is used when it is desired that backend servers don't compress
(eg: because of buggy implementations).
2012-10-27 01:13:24 +02:00
Willy Tarreau
f2943dccd0 MAJOR: session: detach the connections from the stream interfaces
We will need to be able to switch server connections on a session and
to keep idle connections. In order to achieve this, the preliminary
requirement is that the connections can survive the session and be
detached from them.

Right now they're still allocated at exactly the same place, so when
there is a session, there are always 2 connections. We could soon
improve on this by allocating the outgoing connection only during a
connect().

This current patch touches a lot of code and intentionally does not
change any functionnality. Performance tests show no regression (even
a very minor improvement). The doc has not yet been updated.
2012-10-26 20:15:20 +02:00
Willy Tarreau
c919dc66a3 CLEANUP: remove trashlen
trashlen is a copy of global.tune.bufsize, so let's stop using it as
a duplicate, fall back to the original bufsize, it's less confusing
this way.
2012-10-26 20:04:27 +02:00
Willy Tarreau
422a0a5161 MINOR: tools: add a clear_addr() function to unset an address
This will be used to unset a from address.
2012-10-26 20:04:26 +02:00
Emeric Brun
a7aa309c44 MINOR: ssl: add 'crt' statement on server.
crt: client certificate to send
2012-10-26 15:10:10 +02:00
William Lallemand
82fe75c1a7 MEDIUM: HTTP compression (zlib library support)
This commit introduces HTTP compression using the zlib library.

http_response_forward_body has been modified to call the compression
functions.

This feature includes 3 algorithms: identity, gzip and deflate:

  * identity: this is mostly for debugging, and it was useful for
  developping the compression feature. With Content-Length in input, it
  is making each chunk with the data available in the current buffer.
  With chunks in input, it is rechunking, the output chunks will be
  bigger or smaller depending of the size of the input chunk and the
  size of the buffer. Identity does not apply any change on data.

  * gzip: same as identity, but applying a gzip compression. The data
  are deflated using the Z_NO_FLUSH flag in zlib. When there is no more
  data in the input buffer, it flushes the data in the output buffer
  (Z_SYNC_FLUSH). At the end of data, when it receives the last chunk in
  input, or when there is no more data to read, it writes the end of
  data with Z_FINISH and the ending chunk.

  * deflate: same as gzip, but with deflate algorithm and zlib format.
  Note that this algorithm has ambiguous support on many browsers and
  no support at all from recent ones. It is strongly recommended not
  to use it for anything else than experimentation.

You can't choose the compression ratio at the moment, it will be set to
Z_BEST_SPEED (1), as tests have shown very little benefit in terms of
compression ration when going above for HTML contents, at the cost of
a massive CPU impact.

Compression will be activated depending of the Accept-Encoding request
header. With identity, it does not take care of that header.

To build HAProxy with zlib support, use USE_ZLIB=1 in the make
parameters.

This work was initially started by David Du Colombier at Exceliance.
2012-10-26 02:30:48 +02:00
Willy Tarreau
54d23dfc07 CLEANUP: http: rename HTTP_MSG_DATA_CRLF state
This state's name is confusing as it is only used with chunked encoding
and makes newcomers think it's also related to the content-length. Let's
call it CHUNK_CRLF to clear any doubt on this.
2012-10-26 01:13:52 +02:00
Willy Tarreau
3dd0c4e20e OPTIM: tools: inline hex2i()
This tiny function was not inlined because initially not much used.
However it's been used un the chunk parser for a while and it became
one of the most CPU-cycle eater there. By inlining it, the chunk parser
speed was increased by 74 %. We're almost 3 times faster than original
with just the last 4 commits.
2012-10-26 01:13:24 +02:00
Willy Tarreau
55a6906125 OPTIM: channel: inline channel_forward's fast path
Most calls to channel_forward() are performed with short byte counts and
are already optimized in channel_forward() taking just a few instructions.
Thus it's a waste of CPU cycles to call a function for this, let's just
inline the short byte count case and fall back to the common one for
remaining situations.

Doing so has increased the chunked encoding parser's performance by 12% !
2012-10-26 01:08:01 +02:00
Emeric Brun
a068a2951d MINOR: sample: export 'sample_get_trash_chunk(void)'
This will be used on external fetch modules.
2012-10-22 18:54:24 +02:00
Emeric Brun
07ca496ea9 MINOR: acl: add parse and match primitives to use binary type on ACLs
Binary ACL match patterns can now be entered as hex digit strings.
2012-10-22 18:54:24 +02:00
Willy Tarreau
2e845be249 MEDIUM: sample: pass an empty list instead of a null for fetch args
ACL and sample fetches use args list and it is really not convenient to
check for null args everywhere. Now for empty args we pass a constant
list of end of lists. It will allow us to remove many useless checks.
2012-10-19 19:49:09 +02:00
Willy Tarreau
ad8f8e8ffb MINOR: chunk: provide string compare functions
It's sometimes needed to be able to compare a zero-terminated string with a
chunk, so we now have two functions to do that, one strcmp() equivalent and
one strcasecmp() equivalent.
2012-10-19 15:18:06 +02:00
Willy Tarreau
6c9a3d5585 MEDIUM: ssl: add support for the "npn" bind keyword
The ssl_npn match could not work by itself because clients do not use
the NPN extension unless the server advertises the protocols it supports.
Thanks to Simone Bordet for the explanations on how to get it right.
2012-10-18 19:03:00 +02:00
Willy Tarreau
378e041797 OPTIM: connection: pack the struct target
The struct target contains one int and one pointer, causing it to be
64-bit aligned on 64-bit platforms. By marking it "packed", we can
save 8 bytes in struct connection and as many in struct session on
such platforms.
2012-10-13 14:33:58 +02:00
Willy Tarreau
109e95a1b4 OPTIM: session: reorder struct session fields
A reorering of the struct session fields has increased overall performance
by almost 1% due to better cache usage.
2012-10-13 11:22:24 +02:00
Willy Tarreau
c93f7959e5 CLEANUP: session: remove term_trace which is not used anymore
This field was used to trace precisely where a session was terminated
but it did not survive code rearchitecture and was not used at all
anymore. Let's get rid of it.
2012-10-13 11:10:30 +02:00
Willy Tarreau
0a8535fec8 OPTIM: channel: reorganize struct members to improve cache efficiency
Now that the buffer is moved out of the channel, it is possible to move
the pointer earlier in the struct and reorder some fields. This new
ordering improves overall performance by 2%, mainly saved in the HTTP
parsers and data transfers.
2012-10-13 10:55:22 +02:00
Willy Tarreau
9b28e03b66 MAJOR: channel: replace the struct buffer with a pointer to a buffer
With this commit, we now separate the channel from the buffer. This will
allow us to replace buffers on the fly without touching the channel. Since
nobody is supposed to keep a reference to a buffer anymore, doing so is not
a problem and will also permit some copy-less data manipulation.

Interestingly, these changes have shown a 2% performance increase on some
workloads, probably due to a better cache placement of data.
2012-10-13 09:07:52 +02:00
Willy Tarreau
974ced6305 CLEANUP: channel: use 'chn' instead of 'buf' as local variable names
It's too confusing to see buf->buf everywhere where the first buf is
a channel. Let's fix this now.
2012-10-12 23:11:02 +02:00
Willy Tarreau
394db379eb REORG: http: rename msg->buf to msg->chn since it's a channel
It's extremely confusing to have all those msg->buf->buf everywhere after
the extraction of the buffer from the channel. Let's clean this up.
2012-10-12 22:40:39 +02:00
Willy Tarreau
ffc3fcd6da MEDIUM: log: report SSL ciphers and version in logs using logformat %sslc/%sslv
These two new log-format tags report the SSL protocol version (%sslv) and the
SSL ciphers (%sslc) used for the connection with the client. For instance, to
append these information just after the client's IP/port address information
on an HTTP log line, use the following configuration :

    log-format %Ci:%Cp\ %sslv:%sslc\ [%t]\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\ %st\ %B\ %cc\ \ %cs\ %tsc\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %hr\ %hs\ %{+Q}r

It will report a line such as the following one :

    Oct 12 20:47:30 haproxy[9643]: 127.0.0.1:43602 TLSv1:AES-SHA [12/Oct/2012:20:47:30.303] stick2~ stick2/s1 7/0/12/0/19 200 145 - - ---- 0/0/0/0/0 0/0 "GET /?t=0 HTTP/1.0"
2012-10-12 20:48:51 +02:00
Willy Tarreau
4f65356a22 MINOR: log: make lf_text use a const char *
lf_text() should use a const char * otherwise it makes it more complex
to use data coming from const strings.
2012-10-12 20:30:51 +02:00
Willy Tarreau
93dbc2bc0e MEDIUM: log: add a new LW_XPRT flag to pin the transport layer
This flag will have to be set on log tags which require transport layer
information. They will prevent the conn_xprt_close() call from releasing
the transport layer too early.
2012-10-12 20:30:51 +02:00
Willy Tarreau
1e954913de MEDIUM: connection: add a flag to hold the transport layer
When we start logging SSL information, we need the SSL struct to be
present even past the conn_xprt_close() call. In order to achieve this,
we should use refcounting on the connection and the transport layer. At
the moment it's not worth using plain refcounting as only the logs require
this, so instead of real refcounting we just use a flag which will be set
by the log subsystem when SSL data need to be logged.

What happens then is that the xprt->close() call is ignored and the
transport layer is closed again during session_free(), after the log
line is emitted.
2012-10-12 20:30:50 +02:00
Willy Tarreau
6c03a64978 MEDIUM: connection: always unset the transport layer upon close
When calling conn_xprt_close(), we always clear the transport pointer
so that all transport layers leave the connection in the same state after
a close. This will also make it safer and cheaper to call conn_xprt_close()
multiple times if needed.
2012-10-12 17:03:04 +02:00
Willy Tarreau
773d65f413 MEDIUM: log: suffix the frontend's name with '~' when using SSL
Until now it was not possible to know from the logs whether the incoming
connection was made over SSL or not. In order to address this in the existing
log formats, a new log format %ft was introduced, to log the frontend's name
suffixed with its transport layer. The only transport layer in use right now
is '~' for SSL, so that existing log formats for non-SSL traffic are not
affected at all, and SSL log formats have the frontend's name suffixed with
'~'.

The TCP, HTTP and CLF log format now use %ft instead of %f. This does not
affect existing log formats which still make use of %f however.
2012-10-12 14:56:11 +02:00
Emeric Brun
ef42d9219d MINOR: ssl: add statements 'verify', 'ca-file' and 'crl-file' on servers.
It now becomes possible to verify the server's certificate using the "verify"
directive. This one only supports "none" and "required", as it does not make
much sense to also support "optional" here.
2012-10-12 12:05:15 +02:00
Emeric Brun
f9c5c4701c MINOR: ssl: add statement 'no-tls-tickets' on server side. 2012-10-12 11:48:55 +02:00
Emeric Brun
94324a4c87 MINOR: ssl: move ssl context init for servers from cfgparse.c to ssl_sock.c 2012-10-12 11:37:36 +02:00
Emeric Brun
992adc9210 BUG/MINOR: ssl: Fix issue on server statements 'no-tls*' and 'no-sslv3'
bit field collision with 'force-tlsv*'.
2012-10-12 11:27:57 +02:00
Willy Tarreau
21faa91be6 MINOR: server: add minimal infrastructure to parse keywords
Just like with the "bind" lines, we'll switch the "server" line
parsing to keyword registration. The code is essentially the same
as for bind keywords, with minor changes such as support for the
default-server keywords and support for variable argument count.
2012-10-10 17:42:39 +02:00
Willy Tarreau
7fca87fd9d BUILD: accept4: move the socketcall declaration outside of accept4()
Gcc 4.2.4 breaks on the syscall declared inside the function, move it
outside and declare it static inline.
2012-10-10 17:42:39 +02:00
Willy Tarreau
1bc4aab290 MEDIUM: listener: add support for linux's accept4() syscall
On Linux, accept4() does the same as accept() except that it allows
the caller to specify some flags to set on the resulting socket. We
use this to set the O_NONBLOCK flag and thus to save one fcntl()
call in each connection. The effect is a small performance gain of
around 1%.

The option is automatically enabled when target linux2628 is set, or
when the USE_ACCEPT4 Makefile variable is set. If the libc is too old
to provide the equivalent function, this is automatically detected and
our own function is used instead. In any case it is possible to force
the use of our implementation with USE_MY_ACCEPT4.
2012-10-08 20:11:03 +02:00
Willy Tarreau
1b6c00cb99 BUG/MAJOR: ensure that hdr_idx is always reserved when L7 fetches are used
Baptiste Assmann reported a bug causing a crash on recent versions when
sticking rules were set on layer 7 in a TCP proxy. The bug is easier to
reproduce with the "defer-accept" option on the "bind" line in order to
have some contents to parse when the connection is accepted. The issue
is that the acl_prefetch_http() function called from HTTP fetches relies
on hdr_idx to be preinitialized, which is not the case if there is no L7
ACL.

The solution consists in adding a new SMP_CAP_L7 flag to fetches to indicate
that they are expected to work on L7 data, so that the proxy knows that the
hdr_idx has to be initialized. This is already how ACL and HTTP mode are
handled.

The bug was present since 1.5-dev9.
2012-10-05 22:46:09 +02:00
Emeric Brun
76d8895c49 MINOR: ssl: add defines LISTEN_DEFAULT_CIPHERS and CONNECT_DEFAULT_CIPHERS.
These ones are used to set the default ciphers suite on "bind" lines and
"server" lines respectively, instead of using OpenSSL's defaults. These
are probably mainly useful for distro packagers.
2012-10-05 22:11:15 +02:00
Emeric Brun
8694b9a682 MINOR: ssl: add 'force-sslv3' and 'force-tlsvXX' statements on server
These options force the SSL lib to use the specified protocol when
connecting to a server. They are complentary to no-tlsv*/no-sslv3.
2012-10-05 22:05:04 +02:00
Emeric Brun
2cb7ae5302 MINOR: ssl: add 'force-sslv3' and 'force-tlsvXX' statements on bind.
These options force the SSL lib to use the specified protocol. They
are complentary to no-tlsv*/no-sslv3.
2012-10-05 22:02:42 +02:00
Emeric Brun
8967549d52 MINOR: ssl: use bit fields to store ssl options instead of one int each
Too many SSL options already and some still to come, use a bit field
and get rid of all the integers. No functional change here.
2012-10-05 21:53:59 +02:00
Emeric Brun
fb510ea2b9 MEDIUM: conf: rename 'cafile' and 'crlfile' statements 'ca-file' and 'crl-file'
These names were not really handy.
2012-10-05 21:50:43 +02:00
Emeric Brun
9b3009b440 MEDIUM: conf: rename 'nosslv3' and 'notlsvXX' statements 'no-sslv3' and 'no-tlsvXX'.
These ones were really not easy to read nor write, and become confusing
with the next ones to be added.
2012-10-05 21:47:42 +02:00
Emeric Brun
c8e8d12257 MINOR: ssl: add 'crt-base' and 'ca-base' global statements.
'crt-base' sets root directory used for relative certificates paths.
'ca-base' sets root directory used for relative CAs and CRLs paths.
2012-10-05 21:46:52 +02:00
Emeric Brun
3c4bc6e10a MINOR: ssl: remove prefer-server-ciphers statement and set it as the default on ssl listeners. 2012-10-05 20:02:06 +02:00
Willy Tarreau
1c862c5920 MEDIUM: tcp: enable TCP Fast Open on systems which support it
If TCP_FASTOPEN is defined, then the "tfo" option is supported on
"bind" lines to enable TCP Fast Open (linux >= 3.6).
2012-10-05 16:22:35 +02:00
Willy Tarreau
6c16adc661 MEDIUM: checks: enable the PROXY protocol with health checks
When health checks are configured on a server which has the send-proxy
directive and no "port" nor "addr" settings, the health check connections
will automatically use the PROXY protocol. If "port" or "addr" are set,
the "check-send-proxy" directive may be used to force the protocol.
2012-10-05 00:33:14 +02:00
Willy Tarreau
f4288ee4ba MEDIUM: check: add the ctrl and transport layers in the server check structure
Since it's possible for the checks to use a different protocol or transport layer
than the prod traffic, we need to have them referenced in the server. The
SSL checks are not enabled yet, but the transport layers are completely used.
2012-10-05 00:33:14 +02:00
Willy Tarreau
1ae1b7b53c MEDIUM: checks: use real buffers to store requests and responses
Till now the request was made in the trash and sent to the network at
once, and the response was read into a preallocated char[]. Now we
allocate a full buffer for both the request and the response, and make
use of it.

Some of the operations will probably be replaced later with buffer macros
but the point was to ensure we could migrate to use the data layers soon.

One nice improvement caused by this change is that requests are now formed
at the beginning of the check and may safely be sent in multiple chunks if
needed.
2012-10-05 00:33:14 +02:00
Willy Tarreau
5b3a202f78 REORG: server: move the check-specific parts into a check subsection
The health checks in the servers are becoming a real mess, move them
into their own subsection. We'll soon need to have a struct buffer to
replace the char * as well as check-specific protocol and transport
layers.
2012-10-05 00:33:14 +02:00
Willy Tarreau
5f1504f524 MEDIUM: connection: add a new local send-proxy transport callback
This callback sends a PROXY protocol line on the outgoing connection,
with the local and remote endpoint information. This is used for local
connections (eg: health checks) where the other end needs to have a
valid address and no connection is relayed.
2012-10-05 00:32:35 +02:00
Willy Tarreau
e1e4a61e7a REORG: connection: move the PROXY protocol management to connection.c
It was previously in frontend.c but there is no reason for this anymore
considering that all the information involved is in the connection itself
only. Theorically this should be in the socket layer but we don't have
this yet.
2012-10-05 00:32:33 +02:00
Willy Tarreau
0ffde2cc3f MEDIUM: connection: automatically disable polling on error
We absolutely want to disable FD polling after an error is detected,
otherwise the data layer has to do it and it's far from being obvious
at these layers.

The way we did it was a bit tricky in conn_update_*_polling and
conn_*_polling_changes. However it has almost no impact on performance
and code size both for the fast and slow path.

We'll now be able to remove some flag updates in the stream interface.
2012-10-04 22:26:11 +02:00
Willy Tarreau
2396c1c4a2 MEDIUM: connection: make it possible for data->wake to return an error
Just like ->init(), ->wake() may now be used to return an error and
abort the connection. Currently this is not used but will be with
embryonic sessions.
2012-10-04 22:26:10 +02:00
Willy Tarreau
9e272bf95d MEDIUM: connection: only call the data->wake callback on activity
We now check the connection flags for changes in order not to call the
data->wake callback when there is no activity. Activity means a change
on any of the CO_FL_*_SH, CO_FL_ERROR, CO_FL_CONNECTED, CO_FL_WAIT_CONN*
flags, as well as a call to data->recv or data->send.
2012-10-04 22:26:10 +02:00
Willy Tarreau
f3a6d7e115 MEDIUM: connection: reorganize connection flags
The connection flags have progressively been added one after the other
and were not very well organized. Some of them are often used together
and a number of operations are performed on the DATA/SOCK ENA/POL flags.
Thus, they have been reorganized so that flags that work together are
close to each other (allows immediate operands on ARM) and that polling
changes can be detected with fewer operations using a simple shift and
xor. The handshakes are now the last ones so that it will be easier to
add new ones after without risking a collision. All activity-related
flags are also grouped together.
2012-10-04 22:26:10 +02:00
Willy Tarreau
071e137ec2 MEDIUM: connection: use a generic data-layer init() callback
The generic data-layer init callback is now used after the transport
layer is complete and before calling the data layer recv/send callbacks.

This allows the session to switch from the embryonic session data layer
to the complete stream interface data layer, by making conn_session_complete()
the data layer's init callback.

It sill looks awkwards that the init() callback must be used opon error,
but except by adding yet another one, it does not seem to be mergeable
into another function (eg: it should probably not be merged with ->wake
to avoid unneeded calls during the handshake, though semantically that
would make sense).
2012-10-04 22:26:10 +02:00
Willy Tarreau
f4e114fe54 MINOR: connection: add an init callback to the data_cb struct
This callback is used to initialize the data layer.
2012-10-04 22:26:10 +02:00
Willy Tarreau
bd99aab91f MINOR: connection: split conn_prepare() in two functions
We'll also need a function to takeover an existing connection without
reinitializing it. The same will be needed at the stream interface level.
2012-10-04 22:26:10 +02:00
Willy Tarreau
4aa3683b2d MINOR: connection: provide a generic data layer wakeup callback
Instead of calling conn_notify_si() from the connection handler, we
now call data->wake(), which will allow us to use a different callback
with health checks.

Note that we still rely on a flag in order to decide whether or not
to call this function. The reason is that with embryonic sessions,
the callback is already initialized to si_conn_cb without the flag,
and we can't call the SI notify function in the leave path before
the stream interface is initialized.

This issue should be addressed by involving a different data_cb for
embryonic sessions and for stream interfaces, that would be changed
during session_complete() for the final data_cb.
2012-10-04 22:26:10 +02:00
Willy Tarreau
74beec32a5 REORG: connection: rename app_cb "data"
Now conn->data will designate the data layer which is the client for
the transport layer. In practice it's the stream interface and will
soon also be the health checks.
2012-10-04 22:26:10 +02:00
Willy Tarreau
f7bc57ca6e REORG: connection: rename the data layer the "transport layer"
While working on the changes required to make the health checks use the
new connections, it started to become obvious that some naming was not
logical at all in the connections. Specifically, it is not logical to
call the "data layer" the layer which is in charge for all the handshake
and which does not yet provide a data layer once established until a
session has allocated all the required buffers.

In fact, it's more a transport layer, which makes much more sense. The
transport layer offers a medium on which data can transit, and it offers
the functions to move these data when the upper layer requests this. And
it is the upper layer which iterates over the transport layer's functions
to move data which should be called the data layer.

The use case where it's obvious is with embryonic sessions : an incoming
SSL connection is accepted. Only the connection is allocated, not the
buffers nor stream interface, etc... The connection handles the SSL
handshake by itself. Once this handshake is complete, we can't use the
data functions because the buffers and stream interface are not there
yet. Hence we have to first call a specific function to complete the
session initialization, after which we'll be able to use the data
functions. This clearly proves that SSL here is only a transport layer
and that the stream interface constitutes the data layer.

A similar change will be performed to rename app_cb => data, but the
two could not be in the same commit for obvious reasons.
2012-10-04 22:26:09 +02:00
Willy Tarreau
8c89c2059f MINOR: buffers: add a few functions to write chars, strings and blocks
bo_put{chr,blk,str,chk} are used to write data on the output of a buffer.
Output is truncated if the buffer is not large enough.
2012-10-04 22:26:09 +02:00
Willy Tarreau
8113a5d78f BUG/MINOR: config: use a copy of the file name in proxy configurations
Each proxy contains a reference to the original config file and line
number where it was declared. The pointer used is just a reference to
the one passed to the function instead of being duplicated. The effect
is that it is not valid anymore at the end of the parsing and that all
proxies will be enumerated as coming from the same file on some late
configuration errors. This may happen for exmaple when reporting SSL
certificate issues.

By copying using strdup(), we avoid this issue.

1.4 has the same issue, though no report of the proxy file name is done
out of the config section. Anyway a backport is recommended to ease
post-mortem analysis.
2012-10-04 08:13:32 +02:00
Willy Tarreau
d1a33e35fb BUG/MEDIUM: proxy: must not try to stop disabled proxies upon reload
Herv Commowick reported an issue : haproxy dies in a segfault during a
soft restart if it tries to pause a disabled proxy. This is because disabled
proxies have no management task so we must not wake the task up. This could
easily remain unnoticed since the old process was expected to go away, so
having it go away faster was not really troubling. However, with sync peers,
it is obvious that there is no peer sync during this reload.

This issue has been introduced in 1.5-dev7 with the removal of the
maintain_proxies() function. No backport is needed.
2012-10-04 00:20:55 +02:00
Emeric Brun
2d0c482682 MINOR: ssl: add statement 'no-tls-tickets' on bind to disable stateless session resumption
Disables the stateless session resumption (RFC 5077 TLS Ticket extension)
and force to use stateful session resumption.
Stateless session resumption is more expensive in CPU usage.
2012-10-02 16:05:33 +02:00
Emeric Brun
c0ff4924c0 MINOR: ssl : add statements 'notlsv11' and 'notlsv12' and rename 'notlsv1' to 'notlsv10'.
This is because "notlsv1" used to disable TLSv1.0 only and had no effect
on v1.1/v1.2. so better have an option for each version. This applies both
to "bind" and "server" statements.
2012-10-02 08:34:38 +02:00
Emeric Brun
9faf071acb MINOR: ssl: add build param USE_PRIVATE_CACHE to build cache without shared memory
It removes dependencies with futex or mutex but ssl performances decrease
using nbproc > 1 because switching process force session renegotiation.

This can be useful on small systems which never intend to run in multi-process
mode.
2012-10-02 08:34:38 +02:00
Emeric Brun
4b3091e54e MINOR: ssl: disable shared memory and locks on session cache if nbproc == 1
We don't needa to lock the memory when there is a single process. This can
make a difference on small systems where locking is much more expensive than
just a test.
2012-10-02 08:34:38 +02:00
Emeric Brun
81c00f0a7a MINOR: ssl: add ignore verify errors options
Allow to ignore some verify errors and to let them pass the handshake.

Add option 'crt-ignore-err <list>'
Ignore verify errors at depth == 0 (client certificate)
<list> is string 'all' or a comma separated list of verify error IDs
(see http://www.openssl.org/docs/apps/verify.html)

Add option 'ca-ignore-err <list>'
Same as 'crt-ignore-err' for all depths > 0 (CA chain certs)

Ex ignore all errors on CA and expired or not-yet-valid errors
on client certificate:

bind 0.0.0.0:443 ssl crt crt.pem verify required
 cafile ca.pem ca-ignore-err all crt-ignore-err 10,9
2012-10-02 08:32:50 +02:00
Emeric Brun
d94b3fe98f MEDIUM: ssl: add client certificate authentication support
Add keyword 'verify' on bind:
'verify none': authentication disabled (default)
'verify optional': accept connection without certificate
                   and process a verify if the client sent a certificate
'verify required': reject connection without certificate
                   and process a verify if the client send a certificate

Add keyword 'cafile' on bind:
'cafile <path>' path to a client CA file used to verify.
'crlfile <path>' path to a client CRL file used to verify.
2012-10-02 08:04:49 +02:00
Emeric Brun
2b58d040b6 MINOR: ssl: add elliptic curve Diffie-Hellman support for ssl key generation
Add 'ecdhe' on 'bind' statement: to set named curve used to generate ECDHE keys
(ex: ecdhe secp521r1)
2012-10-02 08:03:21 +02:00
Willy Tarreau
cd379950a7 MINOR: connection: add a pointer to the connection owner
This will be needed to find the stream interface from the connection
once they're detached, but in the more immediate term, we'll need this
for health checks since they don't use a stream interface.
2012-09-28 00:01:22 +02:00
Willy Tarreau
dda5e7c986 CLEANUP: connection: offer conn_prepare() to set up a connection
This will be used by checks as well as stream interfaces.
2012-09-24 22:49:06 +02:00
Willy Tarreau
c53d42256d MEDIUM: stats: remove the stats_sock struct from the global struct
Now the stats socket is allocated when the 'stats socket' line is parsed,
and assigned using the standard str2listener(). This has two effects :
  - more than one stats socket can now be declared
  - stats socket now support protocols other than UNIX

The next step is to remove the duplicate bind config parsing.
2012-09-24 10:53:16 +02:00
Willy Tarreau
4fbb2285e2 MINOR: config: make str2listener() use memprintf() to report errors.
This will make it possible to use the function for other listening
sockets.
2012-09-24 10:53:16 +02:00
Willy Tarreau
eb6cead1de MINOR: standard: make memprintf() support a NULL destination
Doing so removes many checks that were systematically made because
the callees don't know if the caller passed a valid pointer.
2012-09-24 10:53:16 +02:00
Willy Tarreau
ce39bfb7c4 BUG: backend: balance hdr was broken since 1.5-dev11
Alex Markham reported and diagnosed a bug appearing on 1.5-dev11,
causing a crash on x86_64 when header hashing is used. The cause is
a missing (int) cast causing a negative offset to appear positive
and the resulting pointer to go out of bounds.

The crash is not possible anymore since 1.5-dev12 because a second
bug caused the negative sign to disappear so the pointer is always
within range but always wrong, so balance hdr() never works anymore.

This fix restores the correct behaviour and ensures the sign is
correct.
2012-09-22 18:36:29 +02:00
Willy Tarreau
290e63aa87 REORG: listener: move unix perms from the listener to the bind_conf
Unix permissions are per-bind configuration line and not per listener,
so let's concretize this in the way the config is stored. This avoids
some unneeded loops to set permissions on all listeners.

The access level is not part of the unix perms so it has been moved
away. Once we can use str2listener() to set all listener addresses,
we'll have a bind keyword parser for this one.
2012-09-20 18:07:14 +02:00
Willy Tarreau
4348fad1c1 MAJOR: listeners: use dual-linked lists to chain listeners with frontends
Navigating through listeners was very inconvenient and error-prone. Not to
mention that listeners were linked in reverse order and reverted afterwards.
In order to definitely get rid of these issues, we now do the following :
  - frontends have a dual-linked list of bind_conf
  - frontends have a dual-linked list of listeners
  - bind_conf have a dual-linked list of listeners
  - listeners have a pointer to their bind_conf

This way we can now navigate from anywhere to anywhere and always find the
proper bind_conf for a given listener, as well as find the list of listeners
for a current bind_conf.
2012-09-20 16:48:07 +02:00
Willy Tarreau
28a47d6408 MINOR: config: pass the file and line to config keyword parsers
This will be needed when we need to create bind config settings.
2012-09-18 20:02:48 +02:00
Willy Tarreau
51fb7651c4 MINOR: listener: add a scope field in the bind keyword lists
This scope is used to report what the keywords are used for (eg: TCP,
UNIX, ...). It is now reported by bind_dump_kws().
2012-09-18 18:27:14 +02:00
Willy Tarreau
8638f4850f MEDIUM: config: enumerate full list of registered "bind" keywords upon error
When an unknown "bind" keyword is detected, dump the list of all
registered keywords. Unsupported default alternatives are also reported
as "not supported".
2012-09-18 18:27:14 +02:00
Willy Tarreau
79eeafacb4 MEDIUM: move bind SSL parsing to ssl_sock
Registering new SSL bind keywords was not particularly handy as it required
many #ifdef in cfgparse.c. Now the code has moved to ssl_sock.c which calls
a register function for all the keywords.

Error reporting was also improved by this move, because the called functions
build an error message using memprintf(), which can span multiple lines if
needed, and each of these errors will be displayed indented in the context of
the bind line being processed. This is important when dealing with certificate
directories which can report multiple errors.
2012-09-18 16:20:01 +02:00
Willy Tarreau
269826659d MEDIUM: listener: add a minimal framework to register "bind" keyword options
With the arrival of SSL, the "bind" keyword has received even more options,
all of which are processed in cfgparse in a cumbersome way. So it's time to
let modules register their own bind options. This is done very similarly to
the ACLs with a small difference in that we make the difference between an
unknown option and a known, unimplemented option.
2012-09-15 22:33:08 +02:00
Willy Tarreau
88500de69e CLEANUP: listener: remove unused conf->file and conf->line
These ones are already in bind_conf.
2012-09-15 22:29:33 +02:00
Willy Tarreau
2a65ff014e MEDIUM: config: replace ssl_conf by bind_conf
Some settings need to be merged per-bind config line and are not necessarily
SSL-specific. It becomes quite inconvenient to have this ssl_conf SSL-specific,
so let's replace it with something more generic.
2012-09-15 22:29:33 +02:00
Willy Tarreau
d1d5454180 REORG: split "protocols" files into protocol and listener
It was becoming confusing to have protocols and listeners in the same
files, split them.
2012-09-15 22:29:32 +02:00
Willy Tarreau
21c705b0f8 MINOR: config: add a function to indent error messages
Bind parsers may return multiple errors, so let's make use of a new function
to re-indent multi-line error messages so that they're all reported in their
context.
2012-09-15 22:29:27 +02:00
Willy Tarreau
2e1dca8f52 MEDIUM: http: add "redirect scheme" to ease HTTP to HTTPS redirection
For instance :

   redirect scheme https if !{ is_ssl }
2012-09-12 08:43:15 +02:00
Emeric Brun
fc0421fde9 MEDIUM: ssl: add support for SNI and wildcard certificates
A side effect of this change is that the "ssl" keyword on "bind" lines is now
just a boolean and that "crt" is needed to designate certificate files or
directories.

Note that much refcounting was needed to have the free() work correctly due to
the number of cert aliases which can make a context be shared by multiple names.
2012-09-10 09:27:02 +02:00
Willy Tarreau
f5ae8f7637 MEDIUM: config: centralize handling of SSL config per bind line
SSL config holds many parameters which are per bind line and not per
listener. Let's use a per-bind line config instead of having it
replicated for each listener.

At the moment we only do this for the SSL part but this should probably
evolved to handle more of the configuration and maybe even the state per
bind line.
2012-09-08 08:31:50 +02:00
Willy Tarreau
403edff4b8 MEDIUM: config: implement maxsslconn in the global section
SSL connections take a huge amount of memory, and unfortunately openssl
does not check malloc() returns and easily segfaults when too many
connections are used.

The only solution against this is to provide a global maxsslconn setting
to reject SSL connections above the limit in order to avoid reaching
unsafe limits.
2012-09-06 12:10:43 +02:00
David BERARD
e566ecbea8 MEDIUM: ssl: add support for prefer-server-ciphers option
I wrote a small path to add the SSL_OP_CIPHER_SERVER_PREFERENCE OpenSSL option
to frontend, if the 'prefer-server-ciphers' keyword is set.

Example :
	bind 10.11.12.13 ssl /etc/haproxy/ssl/cert.pem ciphers RC4:HIGH:!aNULL:!MD5 prefer-server-ciphers

This option mitigate the effect of the BEAST Attack (as I understand), and it
equivalent to :
	- Apache HTTPd SSLHonorCipherOrder option.
	- Nginx ssl_prefer_server_ciphers option.

[WT: added a test for the support of the option]
2012-09-04 15:35:32 +02:00
Willy Tarreau
ff9f7698fc BUILD: fix build error without SSL (ssl_cert)
One last-minute optimization broke the build without SSL support.
Move ssl_cert out of the #ifdef/#endif and it's OK.
2012-09-04 15:13:20 +02:00
Willy Tarreau
d50265aa0e BUILD: include sys/socket.h to fix build failure on FreeBSD
Joris Dedieu reported that include/common/standard.h needs this.
2012-09-04 14:18:33 +02:00
Willy Tarreau
783f25800c BUILD: http: rename error_message http_error_message to fix conflicts on RHEL
Duncan Hall reported a build issue on CentOS where error_message conflicts
with another system declaration when SSL is enabled. Rename the function.
2012-09-04 12:19:04 +02:00
Willy Tarreau
c230b8bfb6 MEDIUM: config: add "nosslv3" and "notlsv1" on bind and server lines
This is aimed at disabling SSLv3 and TLSv1 respectively. SSLv2 is always
disabled. This can be used in some situations where one version looks more
suitable than the other.
2012-09-03 23:55:16 +02:00
Willy Tarreau
d7aacbffcb MEDIUM: config: add a "ciphers" keyword to set SSL cipher suites
This is supported for both servers and listeners. The cipher suite
simply follows the "ciphers" keyword.
2012-09-03 23:43:25 +02:00
Emeric Brun
fc32acafcd MINOR: ssl add global setting tune.sslcachesize to set SSL session cache size.
This new global setting allows the user to change the SSL cache size in
number of sessions. It defaults to 20000.
2012-09-03 22:36:33 +02:00
Emeric Brun
3e541d1c03 MEDIUM: ssl: add shared memory session cache implementation.
This SSL session cache was developped at Exceliance and is the same that
was proposed for stunnel and stud. It makes use of a shared memory area
between the processes so that sessions can be handled by any process. It
is only useful when haproxy runs with nbproc > 1, but it does not hurt
performance at all with nbproc = 1. The aim is to totally replace OpenSSL's
internal cache.

The cache is optimized for Linux >= 2.6 and specifically for x86 platforms.
On Linux/x86, it makes use of futexes for inter-process locking, with some
x86 assembly for the locked instructions. On other architectures, GCC
builtins are used instead, which are available starting from gcc 4.1.

On other operating systems, the locks fall back to pthread mutexes so
libpthread is automatically linked. It is not recommended since pthreads
are much slower than futexes. The lib is only linked if SSL is enabled.
2012-09-03 22:36:33 +02:00
Emeric Brun
e1f38dbb44 MEDIUM: ssl: protect against client-initiated renegociation
CVE-2009-3555 suggests that client-initiated renegociation should be
prevented in the middle of data. The workaround here consists in having
the SSL layer notify our callback about a handshake occurring, which in
turn causes the connection to be marked in the error state if it was
already considered established (which means if a previous handshake was
completed). The result is that the connection with the client is immediately
aborted and any pending data are dropped.
2012-09-03 22:03:17 +02:00
Emeric Brun
01f8e2f61b MEDIUM: config: add support for the 'ssl' option on 'server' lines
This option currently takes no option and simply turns SSL on for all
connections going to the server. It is likely that more options will
be needed in the future.
2012-09-03 22:02:21 +02:00
Emeric Brun
6e159299f1 MEDIUM: config: add the 'ssl' keyword on 'bind' lines
"bind" now supports "ssl" followed by a PEM cert+key file name.
2012-09-03 20:49:14 +02:00
Emeric Brun
4659195e31 MEDIUM: ssl: add new files ssl_sock.[ch] to provide the SSL data layer
This data layer supports socket-to-buffer and buffer-to-socket operations.
No sock-to-pipe nor pipe-to-sock functions are provided, since splicing does
not provide any benefit with data transformation. At best it could save a
memcpy() and avoid keeping a buffer allocated but that does not seem very
useful.

An init function and a close function are provided because the SSL context
needs to be allocated/freed.

A data-layer shutw() function is also provided because upon successful
shutdown, we want to store the SSL context in the cache in order to reuse
it for future connections and avoid a new key generation.

The handshake function is directly called from the connection handler.
At this point it is not certain whether this will remain this way or
if a new ->handshake callback will be added to the data layer so that
the connection handler doesn't care about SSL.

The sock-to-buf and buf-to-sock functions are all capable of enabling
the SSL handshake at any time. This also implies polling in the opposite
direction to what was expected. The upper layers must take that into
account (it is OK right now with the stream interface).
2012-09-03 20:49:14 +02:00
Emeric Brun
7dd0e505ca MEDIUM: connection: add a new handshake flag for SSL (CO_FL_SSL_WAIT_HS).
This flag is part of the CO_FL_HANDSHAKE family since the SSL handshake
may appear at any time.
2012-09-03 20:49:14 +02:00
Emeric Brun
c6545acee0 MINOR: server: add SSL context to servers if USE_OPENSSL is defined
This will be needed to accept outgoing SSL connections.
2012-09-03 20:49:14 +02:00
Emeric Brun
0b8d4d9372 MINOR: protocol: add SSL context to listeners if USE_OPENSSL is defined
This will be needed to accept incoming SSL connections.
2012-09-03 20:49:14 +02:00
Willy Tarreau
dd2f85eb3b CLEANUP: includes: fix includes for a number of users of fd.h
It appears that fd.h includes a number of unneeded files and was
included from standard.h, and as such served as an intermediary
to provide almost everything to everyone.

By removing its useless includes, a long dependency chain broke
but could easily be fixed.
2012-09-03 20:49:14 +02:00
Willy Tarreau
45dab73788 CLEANUP: fdtab: flatten the struct and merge the spec struct with the rest
The "spec" sub-struct was using 8 bytes for only 5 needed. There is no
reason to keep it as a struct, it doesn't bring any value. By flattening
it, we can merge the single byte with the next single byte, resulting in
an immediate saving of 4 bytes (20%). Interestingly, tests have shown a
steady performance gain of 0.6% after this change, which can possibly be
attributed to a more cache-line friendly struct.
2012-09-03 20:49:14 +02:00
Willy Tarreau
40ff59d820 CLEANUP: fd: remove fdtab->flags
These flags were added for TCP_CORK. They were only set at various places
but never checked by any user since TCP_CORK was replaced with MSG_MORE.
Simply get rid of this now.
2012-09-03 20:49:14 +02:00
Willy Tarreau
56a77e5933 MEDIUM: connection: complete the polling cleanups
I/O handlers now all use __conn_{sock,data}_{stop,poll,want}_* instead
of returning dummy flags. The code has become slightly simpler because
some tricks such as the MIN_RET_FOR_READ_LOOP are not needed anymore,
and the data handlers which switch to a handshake handler do not need
to disable themselves anymore.
2012-09-03 20:47:35 +02:00
Willy Tarreau
e9dfa79a75 MAJOR: connection: rearrange the polling flags.
Polling flags were set for data and sock layer, but while this does make
sense for the ENA flag, it does not for the POL flag which translates the
detection of an EAGAIN condition. So now we remove the {DATA,SOCK}_POL*
flags and instead introduce two new layer-independant flags (WANT_RD and
WANT_WR). These flags are only set when an EAGAIN is encountered so that
polling can be enabled.

In order for these flags to have any meaning they are not persistent and
have to be cleared by the connection handler before calling the I/O and
data callbacks. For this reason, changes detection has been slightly
improved. Instead of comparing the WANT_* flags with CURR_*_POL, we only
check if the ENA status changes, or if the polling appears, since we don't
want to detect the useless poll to ena transition. Tests show that this
has eliminated one useless call to __fd_clr().

Finally the conn_set_polling() function which was becoming complex and
required complex operations from the caller was split in two and replaced
its two only callers (conn_update_data_polling and conn_update_sock_polling).
The two functions are now much smaller due to the less complex conditions.
Note that it would be possible to re-merge them and only pass a mask but
this does not appear much interesting.
2012-09-03 20:47:35 +02:00
Willy Tarreau
74172ff9c3 CLEANUP: frontend: remove the old proxy protocol decoder
This one used to rely on a stream analyser which was inappropriate.
It's not used anymore.
2012-09-03 20:47:35 +02:00
Willy Tarreau
22cda21ad5 MAJOR: connection: make the PROXY decoder a handshake handler
The PROXY protocol is now decoded in the connection before other
handshakes. This means that it may be extracted from a TCP stream
before SSL is decoded from this stream.
2012-09-03 20:47:35 +02:00
Willy Tarreau
2542b53b19 MAJOR: session: introduce embryonic sessions
When an incoming connection request is accepted, a connection
structure is needed to store its state. However we don't want to
fully initialize a session until the data layer is about to be
ready.

As long as the connection is physically stored into the session,
it's not easy to split both allocations.

As such, we only initialize the minimum requirements of a session,
which results in what we call an embryonic session. Then once the
data layer is ready, we can complete the function's initialization.

Doing so avoids buffers allocation and ensures that a session only
sees ready connections.

The frontend's client timeout is used as the handshake timeout. It
is likely that another timeout will be used in the future.
2012-09-03 20:47:35 +02:00
Willy Tarreau
15678efc45 MEDIUM: connection: add an ->init function to data layer
SSL need to initialize the data layer before proceeding with data. At
the moment, this data layer is automatically initialized from itself,
which will not be possible once we extract connection from sessions
since we'll only create the data layer once the handshake is finished.

So let's have the application layer initialize the data layer before
using it.
2012-09-03 20:47:34 +02:00
Willy Tarreau
64ee491309 MINOR: tcp: replace tcp_src_to_stktable_key with addr_to_stktable_key
Make it more obvious that this function does not depend on any knowledge
of the session. This is important to plan for TCP rules that can run on
connection without any initialized session yet.
2012-09-03 20:47:34 +02:00
Willy Tarreau
14f8e86da5 MEDIUM: proto_tcp: remove any dependence on stream_interface
The last uses of the stream interfaces were in tcp_connect_server() and
could easily and more appropriately be moved to its callers, si_connect()
and connect_server(), making a lot more sense.

Now the function should theorically be usable for health checks.

It also appears more obvious that the file is split into two distinct
parts :
  - the protocol layer used at the connection level
  - the tcp analysers executing tcp-* rules and their samples/acls.
2012-09-03 20:47:34 +02:00
Willy Tarreau
93b0f4f6c6 MEDIUM: stream_interface: remove CAP_SPLTCP/CAP_SPLICE flags
These ones are implicitly handled by the connection's data layer, no need
to rely on them anymore and reaching them maintains undesired dependences
on stream-interface.
2012-09-03 20:47:34 +02:00
Willy Tarreau
986a9d2d12 MAJOR: connection: move the addr field from the stream_interface
We need to have the source and destination addresses in the connection.
They were lying in the stream interface so let's move them. The flags
SI_FL_FROM_SET and SI_FL_TO_SET have been moved as well.

It's worth noting that tcp_connect_server() almost does not use the
stream interface anymore except for a few flags.

It has been identified that once we detach the connection from the SI,
it will probably be needed to keep a copy of the server-side addresses
in the SI just for logging purposes. This has not been implemented right
now though.
2012-09-03 20:47:34 +02:00
Willy Tarreau
3cefd521fa REORG: connection: move the target pointer from si to connection
The target is per connection and is directly used by the connection, so
we need it there. It's not needed anymore in the SI however.
2012-09-03 20:47:34 +02:00
Willy Tarreau
8263d2b259 CLEANUP: channel: use "channel" instead of "buffer" in function names
This is a massive rename of most functions which should make use of the
word "channel" instead of the word "buffer" in their names.

In concerns the following ones (new names) :

unsigned long long channel_forward(struct channel *buf, unsigned long long bytes);
static inline void channel_init(struct channel *buf)
static inline int channel_input_closed(struct channel *buf)
static inline int channel_output_closed(struct channel *buf)
static inline void channel_check_timeouts(struct channel *b)
static inline void channel_erase(struct channel *buf)
static inline void channel_shutr_now(struct channel *buf)
static inline void channel_shutw_now(struct channel *buf)
static inline void channel_abort(struct channel *buf)
static inline void channel_stop_hijacker(struct channel *buf)
static inline void channel_auto_connect(struct channel *buf)
static inline void channel_dont_connect(struct channel *buf)
static inline void channel_auto_close(struct channel *buf)
static inline void channel_dont_close(struct channel *buf)
static inline void channel_auto_read(struct channel *buf)
static inline void channel_dont_read(struct channel *buf)
unsigned long long channel_forward(struct channel *buf, unsigned long long bytes)

Some functions provided by channel.[ch] have kept their "buffer" name because
they are really designed to act on the buffer according to some information
gathered from the channel. They have been moved together to the same place in
the file for better readability but they were not changed at all.

The "buffer" memory pool was also renamed "channel".
2012-09-03 20:47:33 +02:00
Willy Tarreau
03cdb7c678 CLEANUP: channel: usr CF_/CHN_ prefixes instead of BF_/BUF_
Get rid of these confusing BF_* flags. Now channel naming should clearly
be used everywhere appropriate.

No code was changed, only a renaming was performed. The comments about
channel operations was updated.
2012-09-03 20:47:33 +02:00
Willy Tarreau
af81935b82 REORG: channel: move buffer_{replace,insert_line}* to buffer.{c,h}
These functions do not depend on the channel flags anymore thus they're
much better suited to be used on plain buffers. Move them from channel
to buffer.
2012-09-03 20:47:33 +02:00
Willy Tarreau
f941cf2ef2 MAJOR: channel: remove the BF_FULL flag
This is similar to the recent removal of BF_OUT_EMPTY. This flag was very
problematic because it relies on permanently changing information such as the
to_forward value, so it had to be updated upon every change to the buffers.
Previous patch already got rid of its users.

One part of the change is sensible : the flag was also part of BF_MASK_STATIC,
which is used by process_session() to rescan all analysers in case the flag's
status changes. At first glance, none of the analysers seems to change its
mind base on this flag when it is subject to change, so it seems fine not to
add variation checks here. Otherwise it's possible that checking the buffer's
input and output is more reliable than checking the flag's replacement.
2012-09-03 20:47:33 +02:00
Willy Tarreau
42d06661a2 MINOR: buffer: provide a new buffer_full() function
This one only focuses on the input part of the buffer and is dedicated
to analysers.
2012-09-03 20:47:33 +02:00
Willy Tarreau
ad1cc3df9c MINOR: channel: rename bi_full to channel_full as it checks the whole channel
Since the function takes care of the forward count and involves more than
buffer knowledge, rename it.
2012-09-03 20:47:32 +02:00
Willy Tarreau
a75bcef867 REORG: buffer: move buffer_flush, b_adv and b_rew to buffer.h
These one now operate over real buffers, not channels anymore.
2012-09-03 20:47:32 +02:00
Willy Tarreau
8e21bb9e52 MAJOR: channel: remove the BF_OUT_EMPTY flag
This flag was very problematic because it was composite in that both changes
to the pipe or to the buffer had to cause this flag to be updated, which is
not always simple (eg: there may not even be a channel attached to a buffer
at all).

There were not that many users of this flags, mostly setters. So the flag got
replaced with a macro which reports whether the channel is empty or not, by
checking both the pipe and the buffer.

One part of the change is sensible : the flag was also part of BF_MASK_STATIC,
which is used by process_session() to rescan all analysers in case the flag's
status changes. At first glance, none of the analysers seems to change its
mind base on this flag when it is subject to change, so it seems fine not to
add variation checks here. Otherwise it's possible that checking the buffer's
output size is more useful than checking the flag's replacement.
2012-09-03 20:47:32 +02:00
Willy Tarreau
c7e4238df0 REORG: buffers: split buffers into chunk,buffer,channel
Many parts of the channel definition still make use of the "buffer" word.
2012-09-03 20:47:32 +02:00
Willy Tarreau
c578891112 CLEANUP: connection: split sock_ops into data_ops, app_cp and si_ops
Some parts of the sock_ops structure were only used by the stream
interface and have been moved into si_ops. Some of them were callbacks
to the stream interface from the connection and have been moved into
app_cp as they're the application seen from the connection (later,
health-checks will need to use them). The rest has moved to data_ops.

Normally at this point the connection could live without knowing about
stream interfaces at all.
2012-09-03 20:47:31 +02:00
Willy Tarreau
96199b1016 MAJOR: stream-interface: restore splicing mechanism
The splicing is now provided by the data-layer rcv_pipe/snd_pipe functions
which in turn are called by the stream interface's recv and send callbacks.

The presence of the rcv_pipe/snd_pipe functions is used to attest support
for splicing at the data layer. It looks like the stream-interface's
SI_FL_CAP_SPLICE flag does not make sense anymore as it's used as a proxy
for the pointers above.

It also appears that we call chk_snd() from the recv callback and then
try to call it again in update_conn(). It is very likely that this last
function will progressively slip into the recv/send callbacks in order
to avoid duplicate check code.

The code works right now with and without splicing. Only raw_sock provides
support for it and it is automatically selected when the various splice
options are set. However it looks like splice-auto doesn't enable it, which
possibly means that the streamer detection code does not work anymore, or
that it's only called at a time where it's too late to enable splicing (in
process_session).
2012-09-03 20:47:31 +02:00
Willy Tarreau
5368d80ede MAJOR: connection: split the send call into connection and stream interface
Similar to what was done on the receive path, the data layer now provides
only an snd_buf() callback that is iterated over by the stream interface's
si_conn_send_loop() function.

The data layer now has no knowledge about channels nor stream interfaces.

The splice() code still need to be ported as it currently is disabled.
2012-09-03 20:47:31 +02:00
Willy Tarreau
ce323dea14 REORG: stream-interface: move sock_raw_read() to si_conn_recv_cb()
The recv function is now generic and is usable to iterate any connection-to-buf
reading function from a stream interface. So let's move it to stream-interface.
2012-09-03 20:47:30 +02:00
Willy Tarreau
1fe6bc335a MINOR: stream-interface: add an rcv_buf callback to sock_ops
This one is to be used by the read I/O handlers.
2012-09-03 20:47:30 +02:00
Willy Tarreau
2ba4465086 MAJOR: raw_sock: extract raw_sock_to_buf() from raw_sock_read()
This is the start of the stream connection iterator which calls the
data-layer reader. This still looks a bit tricky but is OK. Splicing
is not handled at all at the moment.
2012-09-03 20:47:30 +02:00
Willy Tarreau
75bf2c925f REORG: sock_raw: rename the files raw_sock*
The "raw_sock" prefix will be more convenient for naming functions as
it will be prefixed with the data layer and suffixed with the data
direction. So let's rename the files now to avoid any further confusion.

The #include directive was also removed from a number of files which do
not need it anymore.
2012-09-02 21:54:56 +02:00
Willy Tarreau
3af56a9359 MINOR: connection: provide conn_{data|sock}_{read0|shutw} functions
These functions are used to report unidirectional shutdown and to disable
polling in the related direction.
2012-09-02 21:54:56 +02:00
Willy Tarreau
572bf9095d REORG/MAJOR: extract "struct buffer" from "struct channel"
At the moment, the struct is still embedded into the struct channel, but
all the functions have been updated to use struct buffer only when possible,
otherwise struct channel. Some functions would likely need to be splitted
between a buffer-layer primitive and a channel-layer function.

Later the buffer should become a pointer in the struct buffer, but doing so
requires a few changes to the buffer allocation calls.
2012-09-02 21:54:56 +02:00
Willy Tarreau
7421efb85f REORG/MAJOR: use "struct channel" instead of "struct buffer"
This is a massive rename. We'll then split channel and buffer.

This change needs a lot of cleanups. At many locations, the parameter
or variable is still called "buf" which will become ambiguous. Also,
the "struct channel" is still defined in buffers.h.
2012-09-02 21:54:55 +02:00
Willy Tarreau
9bf9c14c12 MEDIUM: stream-interface: provide a generic stream_sock_read0() function
This function is used by the data layer when a zero has been read over a
connection. At the moment it only handles sockets and nothing else. Once
the complete split is done between buffers and stream interfaces, it should
become possible to work regardless on the connection type.
2012-09-02 21:54:55 +02:00
Willy Tarreau
eecf6ca68a MEDIUM: stream-interface: provide a generic si_conn_send_cb callback
The connection send() callback is supposed to be generic for a
stream-interface, and consists in calling the lower layer snd_buf
function. Move this function to the stream interface and remove
the sock-raw and sock-ssl clones.
2012-09-02 21:54:55 +02:00
Willy Tarreau
de5722c302 MEDIUM: stream-interface: provide a generic stream_int_chk_snd_conn() function
This one can be used by both sock_raw and sock_ssl instead of each having their own.
2012-09-02 21:54:55 +02:00
Willy Tarreau
fae4499e36 MEDIUM: stream-interface: add a snd_buf() callback to sock_ops
This callback is used to send data from the buffer to the socket. It is
the old write_loop() call of the data layer which is used both by the
->write() callback and the ->chk_snd() function. The reason for having
it as a pointer is that it's the only remaining part which causes the
write and chk_snd() functions to be different between raw and ssl.
2012-09-02 21:54:18 +02:00
Willy Tarreau
46a8d925c2 MEDIUM: stream-interface: offer a generic chk_rcv function for connections
sock_raw and sock_ssl use a pretty generic chk_rcv function, so let's move
this function to the stream_interface and remove specific functions. Later
we might have a single chk_rcv function.
2012-09-02 21:54:18 +02:00
Willy Tarreau
100c467120 MEDIUM: stream_interface: offer a generic function for connection updates
We need to have a generic function to be called by upper layers when buffer
flags have been updated (the si->update function). At the moment, both sock_raw
and sock_ssl had their own which basically was a copy-paste. Since these
functions are only used to update stream interface flags, it is logical to
have them handled by the stream interface code.

This allowed us to remove the stream_interface-specific update function from
sock_raw and sock_ssl which now use the generic code.

The stream_sock_update_conn callback has also been more appropriately renamed
conn_notify_si() since it's meant to be called by lower layers to notify the
SI and possibly upper layers about incoming changes.
2012-09-02 21:54:18 +02:00
Willy Tarreau
26f44d1e91 MINOR: fd: get rid of FD_WAIT_*
These flags were used to ease a transition which has been completed,
so they're not needed anymore. Get rid of them.
2012-09-02 21:53:12 +02:00
Willy Tarreau
afad0e0f80 MAJOR: make use of conn_{data|sock}_{poll|stop|want}* in connection handlers
This is a second attempt at getting rid of FD_WAIT_*. Now the situation is
much better since native I/O handlers can directly manipulate the FD using
fd_{poll|want|stop}_* and the connection handlers manipulate connection-level
flags using the conn_{data|sock}_* equivalent.

Proceeding this way ensures that the connection flags always reflect the
reality even after data<->handshake switches.
2012-09-02 21:53:12 +02:00
Willy Tarreau
f9dabecd03 MEDIUM: connection: make use of the new polling functions
Now the connection handler, the handshake callbacks and the I/O callbacks
make use of the connection-layer polling functions to enable or disable
polling on a file descriptor.

Some changes still need to be done to avoid using the FD_WAIT_* constants.
2012-09-02 21:53:11 +02:00
Willy Tarreau
b5e2cbdcc8 MEDIUM: connection: add definitions for dual polling mechanisms
The conflicts we're facing with polling is that handshake handlers have
precedence over data handlers and may change the polling requirements
regardless of what is expected by the data layer. This causes issues
such as missed events.

The real need is to have three polling levels :
  - the "current" one, which is effective at any moment
  - the data one, which reflects what the data layer asks for
  - the sock one, which reflects what the socket layer asks for

Depending on whether a handshake is in progress or not, either one of the
last two will replace the current one, and the change will be propagated
to the lower layers.

At the moment, the shutdown status is not considered, and only handshakes
are used to decide which layer to chose. This will probably change.
2012-09-02 21:53:11 +02:00
Willy Tarreau
babd05a6c6 MEDIUM: fd: add fd_poll_{recv,send} for use when explicit polling is required
The old EV_FD_SET() macro was confusing, as it would enable receipt but there
was no way to indicate that EAGAIN was received, hence the recently added
FD_WAIT_* flags. They're not enough as we're still facing a conflict between
EV_FD_* and FD_WAIT_*. So let's offer I/O functions what they need to explicitly
request polling.
2012-09-02 21:53:11 +02:00
Willy Tarreau
49b046dddf MAJOR: fd: replace all EV_FD_* macros with new fd_*_* inline calls
These functions have a more explicity meaning and will offer provisions
for explicit polling.

EV_FD_ISSET() has been left for now as it is still in use in checks.
2012-09-02 21:53:11 +02:00
Willy Tarreau
4a36b56909 MAJOR: stream_int: use a common stream_int_shut*() functions regardless of the data layer
Up to now, we had to use a shutr/shutw interface per data layer, which
basically means 3 distinct functions when we include SSL :
  - generic stream_interface
  - sock_raw
  - sock_ssl

With this change, the code located in the stream_interface manages all the
stream_interface and buffer updates, and calls the data layer hooks when
needed.

At the moment, the socket layer hook had been implicitly considered as
being a regular socket, so the si_shut*() functions call the normal
shutdown() and EV_FD_CLR() functions on the fd if a socket layer is
defined. This may change in the future. The stream_int_shut*()
functions don't call EV_FD_CLR() so that they can later be embedded
in lower layers.

Thus, the si->data->shutr() is not called anymore and si->data->shutw()
is called to close the data layer only (eg: only for SSL).

Proceeding like this is very important because it's the only way to be
able not to rely on these functions when called from the connection
handlers, and call the data layers' instead.
2012-09-02 21:53:10 +02:00
Willy Tarreau
8b117082bc REORG: connection: replace si_data_close() with conn_data_close()
This close function only applies to connection-specific parts and
the stream-interface entry may soon disappear. Move this to the
connection instead.
2012-09-02 21:53:10 +02:00
Willy Tarreau
3788e4c874 MEDIUM: fd: remove the EV_FD_COND_* primitives
These primitives were initially introduced so that callers were able to
conditionally set/disable polling on a file descriptor and check in return
what the state was. It's been long since we last had an "if" on this, and
all pollers' functions were the same for cond_* and their systematic
counter parts, except that this required a check and a specific return
value that are not always necessary.

So let's simplify the FD API by removing this now unused distinction and
by making all specific functions return void.
2012-09-02 21:53:10 +02:00
Willy Tarreau
c76ae33bfc MAJOR: connection: call data layer handshakes from the handler
Handshakes is not called anymore from the data handlers, they're only
called from the connection handler when their flag is set.

Also, this move has uncovered an issue with the stream interface notifier :
it doesn't consider the FD_WAIT_* flags possibly set by the handshake
handlers. This will result in a stuck handshake when no data is in the
output buffer. In order to cover this, for now we'll perform the EV_FD_SET
in the SSL handshake function, but this needs to be addressed separately
from the stream interface operations.
2012-09-02 21:53:09 +02:00
Willy Tarreau
8f8c92fe93 MAJOR: connection: add a new CO_FL_CONNECTED flag
This new flag is used to indicate that the connection was already
connected. It can be used by I/O handlers to know that a connection
has just completed. It is used by stream_sock_update_conn(), allowing
the sock_opt handlers not to manipulate the SI timeout nor the
BF_WRITE_NULL flag anymore.
2012-09-02 21:53:09 +02:00
Willy Tarreau
239d7189fc MEDIUM: stream_interface: pass connection instead of fd in sock_ops
The sock_ops I/O callbacks made use of an FD till now. This has become
inappropriate and the struct connection is much more useful. It also
fixes the race condition introduced by previous change.
2012-09-02 21:53:08 +02:00
Willy Tarreau
fd31e53139 MAJOR: remove the stream interface and task management code from sock_*
The socket data layer code must only focus on moving data between a
socket and a buffer. We need a special stream interface handler to
update the stream interface and the file descriptor status.

At the moment the code works but suffers from a race condition caused
by its API : the read/write callbacks still make use of the fd instead
of using the connection. And when a double shutdown is performed, a call
to ->write() after ->read() processed an error results in dereferencing
a NULL fdtab[]->owner. This is only a temporary issue which doesn't need
to be fixed now since this will automatically go away when the functions
change to use the connection instead.
2012-09-02 21:53:08 +02:00
Willy Tarreau
076be25ab8 CLEANUP: remove the now unused fdtab direct I/O callbacks
They were all left to NULL since last commit so we can safely remove them
all now and remove the temporary dual polling logic in pollers.
2012-09-02 21:51:29 +02:00
Willy Tarreau
2da156fe5e MAJOR: tcp: remove the specific I/O callbacks for TCP connection probes
Use a single tcp_connect_probe() instead of tcp_connect_write() and
tcp_connect_read(). We call this one only when no data layer function
have been processed, so this is a fallback to test for completion of
a connection attempt.

With this done, we don't have the need for any direct I/O callback
anymore.

The function still relies on ->write() to wake the stream interface up,
so it's not finished.
2012-09-02 21:51:29 +02:00
Willy Tarreau
2c6be84b3a MEDIUM: connection: extract the send_proxy callback from proto_tcp
This handshake handler must be independant, so move it away from
proto_tcp. It has a dedicated connection flag. It is tested before
I/O handlers and automatically removes the CO_FL_WAIT_L4_CONN flag
upon success.

It also sets the BF_WRITE_NULL flag on the stream interface and
stops the SI timeout. However it does not perform the task_wakeup(),
and relies on the data handler to do so for now. The SI wakeup will
have to be moved elsewhere anyway.
2012-09-02 21:51:28 +02:00
Willy Tarreau
8018471f44 MINOR: fd: make fdtab->owner a connection and not a stream_interface anymore
It is more convenient with a connection here and will abstract stream_interface
more easily.
2012-09-02 21:51:28 +02:00
Willy Tarreau
59f98393bb MINOR: connection: add a handler for fd-based connections
This connection handler will be used as an I/O handler for events
detected on a file descriptor. It is not used yet.
2012-09-02 21:51:28 +02:00
Willy Tarreau
4e6049e553 MINOR: fd: add a new I/O handler to fdtab
This one will eventually replace both cb[] handlers. At the moment it
is not used yet.
2012-09-02 21:51:27 +02:00
Willy Tarreau
505e34a36d MAJOR: get rid of fdtab[].state and use connection->flags instead
fdtab[].state was only used to know whether a connection was in progress
or an error was encountered. Instead we now use connection->flags to store
a flag for both. This way, connection management will be able to update the
connection status on I/O.
2012-09-02 21:51:26 +02:00
Willy Tarreau
900bc93e24 MINOR: connection: add flags to the connection struct
We're doing this to take over fdtab[].state.
2012-09-02 21:51:26 +02:00
Willy Tarreau
da92e2fb61 REORG/MINOR: checks: put a struct connection into the server
This will be used to handle the connection state once it goes away from fdtab.
There is no functional change at the moment.
2012-09-02 21:51:26 +02:00
Willy Tarreau
56e9c5e963 REORG/MINOR: connection: move declaration to its own include file
This way we don't depend on stream_interface anymore.
2012-09-02 21:51:26 +02:00
Willy Tarreau
ed8f614078 REORG/MEDIUM: fd: get rid of FD_STLISTEN
This state was only used so that ev_sepoll did not match FD_STERROR, which
changed in previous patch. We can now safely remove this state.
2012-09-02 21:51:25 +02:00
Willy Tarreau
db3b32610f REORG/MEDIUM: fd: remove FD_STCLOSE from struct fdtab
In an attempt to get rid of fdtab[].state, and to move the relevant
parts to the connection struct, we remove the FD_STCLOSE state which
can easily be deduced from the <owner> pointer as there is a 1:1 match.
2012-09-02 21:51:25 +02:00
Jamie Gloudon
801a0a353a DOC: fix name for "option independant-streams"
The correct spelling is "independent", not "independant". This patch
fixes the doc and the configuration parser to accept the correct form.
The config parser still allows the old naming for backwards compatibility.
2012-09-02 21:51:07 +02:00
Willy Tarreau
654694e189 MEDIUM: stats/cli: add support for "set table key" to enter values
This is used to enter values for stick tables. The most likely usage
is to set gpc0 for a specific IP address in order to block traffic
for abusers without having to reload. Since all data types are
supported, other usages are possible (eg: replace a users's assigned
server).
2012-09-02 21:51:07 +02:00
Willy Tarreau
c3a08a136b BUG: stktable: tcp_src_to_stktable_key() must return NULL on invalid families
Source addresses of non-TCP families were not correctly handled by
tcp_src_to_stktable_key() as it forgot to return NULL and instead left
the previous value in the stick-table buffer.

This bug is 1.5-specific and was introduced by commit 4f92d320 in 1.5-dev6
so it does not need any backport.
2012-08-31 11:03:30 +02:00
David du Colombier
65c1796c4a MINOR: IPv6 support for transparent proxy
Set socket option IPV6_TRANSPARENT on binding
to enable transparent proxy on IPv6.
This option is available from Linux 2.6.37.
2012-07-31 07:53:42 +02:00
Willy Tarreau
96596aeead MEDIUM: fd/si: move peeraddr from struct fdinfo to struct connection
The destination address is purely a connection thing and not an fd thing.
It's also likely that later the address will be stored into the connection
and linked to by the SI.

struct fdinfo only keeps the pointer to the port range and the local port
for now. All of this also needs to move to the connection but before this
the release of the port range must move from fd_delete() to a new function
dedicated to the connection.
2012-06-08 22:59:52 +02:00
Willy Tarreau
4f8a83cb6e MEDIUM: stats: add the ability to kill sessions from the admin interface
It was not possible to kill remaining sessions from the admin interface,
which is annoying especially when switching to maintenance mode. Now it's
possible.
2012-06-04 00:26:23 +02:00
Willy Tarreau
d72822442d MEDIUM: stats: add support for soft stop/soft start in the admin interface
One important missing feature on the web interface is the ability to perform
a soft stop/soft start. This is now possible.
2012-06-04 00:22:44 +02:00
Justin Karneges
eb2c24ae2a MINOR: checks: add on-marked-up option
This implements the feature discussed in the earlier thread of killing
connections on backup servers when a non-backup server comes back up. For
example, you can use this to route to a mysql master & slave and ensure
clients don't stay on the slave after the master goes from down->up. I've done
some minimal testing and it seems to work.

[WT: added session flag & doc, moved the killing after logging the server UP,
 and ensured that the new server is really usable]
2012-06-03 23:48:42 +02:00
Willy Tarreau
496aa0111e BUG/MEDIUM: ensure that unresolved arguments are freed exactly once
When passing arguments to ACLs and samples, some types are stored as
strings then resolved later after config parsing is done. Upon exit,
the arguments need to be freed only if the string was not resolved
yet. At the moment we can encounter double free during deinit()
because some arguments (eg: userlists) are freed once as their own
type and once as a string.

The solution consists in adding an "unresolved" flag to the args to
say whether the value is still held in the <str> part or is final.

This could be debugged thanks to a useful bug report from Sander Klein.
2012-06-01 10:40:52 +02:00
Willy Tarreau
4992dd2d30 MINOR: http: add support for "httponly" and "secure" cookie attributes
httponly  This option tells haproxy to add an "HttpOnly" cookie attribute
             when a cookie is inserted. This attribute is used so that a
             user agent doesn't share the cookie with non-HTTP components.
             Please check RFC6265 for more information on this attribute.

   secure    This option tells haproxy to add a "Secure" cookie attribute when
             a cookie is inserted. This attribute is used so that a user agent
             never emits this cookie over non-secure channels, which means
             that a cookie learned with this flag will be presented only over
             SSL/TLS connections. Please check RFC6265 for more information on
             this attribute.
2012-05-31 21:02:17 +02:00
Willy Tarreau
b5ba17e3a9 BUG/MINOR: config: do not report twice the incompatibility between cookie and non-http
This one was already taken care of in proxy_cfg_ensure_no_http(), so if a
cookie is presented in a TCP backend, we got two warnings.

This can be backported to 1.4 since it's been this way for 2 years (although not dramatic).
2012-05-31 20:47:00 +02:00
Willy Tarreau
674021329c REORG/MINOR: use dedicated proxy flags for the cookie handling
Cookies were mixed with many other options while they're not used as options.
Move them to a dedicated bitmask (ck_opts). This has released 7 flags in the
proxy options and leaves some room for new proxy flags.
2012-05-31 20:40:20 +02:00
Willy Tarreau
196729eff8 BUG/MINOR: fix option httplog validation with TCP frontends
Option httplog needs to be checked only once the proxy has been validated,
so that its final mode (tcp/http) can be used. Also we need to check for
httplog before checking the log format, so that we can report a warning
about this specific option and not about the format it implies.
2012-05-31 19:30:26 +02:00
Willy Tarreau
ab152a7eda BUG/MAJOR: b_rew() must pass a signed offset to b_ptr()
Commit 13e66da introduced b_rew() but passes -adv which is an unsigned
quantity on 64-bit platforms, causing the buffer to advance in the wrong
direction.

No backport is needed.
2012-05-31 11:33:42 +02:00
Oskar Stolc
8dc4184c57 MINOR: balance uri: added 'whole' parameter to include query string in hash calculation
This patch brings a new "whole" parameter to "balance uri" which makes
the hash work over the whole uri, not just the part before the query
string. Len and depth parameter are still honnored.

The reason for this new feature is explained below.

I have 3 backend servers, each accepting different form of HTTP queries:

http://backend1.server.tld/service1.php?q=...
http://backend1.server.tld/service2.php?q=...

http://backend2.server.tld/index.php?query=...&subquery=...

http://backend3.server.tld/image/49b8c0d9ff

Each backend server returns a different response based on either:
- the URI path (the left part of the URI before the question mark)
- the query string (the right part of the URI after the question mark)
- or the combination of both

I wanted to set up a common caching cluster (using 6 Squid servers, each
configured as reverse proxy for those 3 backends) and have HAProxy balance
the queries among the Squid servers based on URL. I also wanted to achieve
hight cache hit ration on each Squid server and send the same queries to
the same Squid servers. Initially I was considering using the 'balance uri'
algorithm, but that would not work as in case of backend2 all queries would
go to only one Squid server. The 'balance url_param' would not work either
as it would send the backend3 queries to only one Squid server.

So I thought the simplest solution would be to use 'balance uri', but to
calculate the hash based on the whole URI (URI path + query string),
instead of just the URI path.
2012-05-22 07:56:54 +02:00
Emeric Brun
d88fd824b7 MEDIUM: protocol: add a pointer to struct sock_ops to the listener struct
The listener struct is now aware of the socket layer to use upon accept().
At the moment, only sock_raw is supported so this patch should not change
anything.
2012-05-21 22:22:39 +02:00
Emeric Brun
21adb02d19 MINOR: stream_interface: add a pointer to the listener for TARG_TYPE_CLIENT
When the target is a client, it will be convenient to have a pointer to the
original listener so that we can retrieve some configuration information at
the stream interface level.
2012-05-21 22:22:39 +02:00
Willy Tarreau
24208275d5 MINOR: stream_interface: add a data channel close function
This function will be called later when splitting the shutdown in two
steps. It will be needed by SSL and for remote socket operations to
release unused contexts.
2012-05-21 17:59:53 +02:00
Willy Tarreau
949811319b REORG/MEDIUM: stream_interface: move applet->state and private to connection
The state and the private pointer are not specific to the applets, since SSL
will require exactly both of them. Move them to the connection layer now and
rename them. We also now ensure that both are NULL on first call.
2012-05-21 17:09:48 +02:00
Willy Tarreau
fb7508aefb REORG/MINOR: stream_interface: move si->fd to struct connection
The socket fd is used only when in socket mode and with a connection.
2012-05-21 16:47:54 +02:00
Willy Tarreau
73b013b070 MINOR: stream_interface: introduce a new "struct connection" type
We start to move everything needed to manage a connection to a special
entity "struct connection". We have the data layer operations and the
control operations there. We'll also have more info in the future such
as file descriptors and applet contexts, so that in the end it becomes
detachable from the stream interface, which will allow connections to
be reused between sessions.

For now on, we start with minimal changes.
2012-05-21 16:31:45 +02:00
Willy Tarreau
9580d16e40 BUG/MAJOR: checks: don't call set_server_status_* when no LB algo is set
David Touzeau reported that haproxy dies when a server is checked and is
used in a farm with only "option transparent" and no LB algo. This is
because the LB params are NULL, the functions should be checked before
being called.

The same bug is present in 1.4 so this patch must be backported.
2012-05-19 19:09:46 +02:00
Willy Tarreau
2692736aa3 MEDIUM: http: get rid of msg->som which is not used anymore
msg->som was zero before the body and was used to carry the beginning
of a chunk size for chunked-encoded messages, at a moment when msg->sol
is always zero.

Remove msg->som and replace it with msg->sol where needed.
2012-05-18 23:50:43 +02:00
Willy Tarreau
09d1e254c9 MAJOR: http: stop using msg->sol outside the parsers
This is a left-over from the buffer changes. Msg->sol is always null at the
end of the parsing, so we must not use it anymore to read headers or find
the beginning of a message. As a side effect, the dump of the request in
debug mode is working again because it was relying on msg->sol not being
null.

Maybe it will even be mergeable with another of the message pointers.
2012-05-18 22:43:55 +02:00
Willy Tarreau
13e66dad26 MINOR: buffers: add a rewind function
b_rew() will be used to rewind a buffer for certain specific operations
such as header inspection on data already in the output queue.
2012-05-18 22:11:27 +02:00
Willy Tarreau
be0688c64d MEDIUM: stream_interface: remove the si->init
Calling the init() function in sess_establish was a bad idea, it is
too late to allow it to fail on lack of resource and does not help at
all. Remove it for now before it's used.
2012-05-18 15:15:26 +02:00
David du Colombier
7af4605ef7 BUG/MAJOR: trash must always be the size of a buffer
Before it was possible to resize the buffers using global.tune.bufsize,
the trash has always been the size of a buffer by design. Unfortunately,
the recent buffer sizing at runtime forgot to adjust the trash, resulting
in it being too short for content rewriting if buffers were enlarged from
the default value.

The bug was encountered in 1.4 so the fix must be backported there.
2012-05-16 14:21:55 +02:00
Willy Tarreau
7bb68abb9f OPTIM/MEDIUM: stream_interface: add a new SI_FL_NOHALF flag
This flag indicates that we're not interested in keeping half-open
connections on a stream interface. It has the benefit of allowing
the socket layer to cause an immediate write close when detecting
an incoming read close. This releases resources much faster and
saves one syscall (either a shutdown or setsockopt).

This flag is only set by HTTP on the interface going to the server
since we don't want to continue pushing data there when it has
closed.

Another benefit is that it responds with a FIN to a server's FIN
instead of responding with an RST as it used to, which is much
cleaner.

Performance gains of 7.5% have been measured on HTTP connection
rate on empty objects.
2012-05-13 14:52:22 +02:00
Willy Tarreau
b147a8382a CLEANUP: fd: remove unused cb->b pointers in the struct fdtab
These pointers were used to hold pointers to buffers in the past, but
since we introduced the stream interface, they're no longer used but
they were still sometimes set.

Removing them shrink the struct fdtab from 32 to 24 bytes on 32-bit machines,
and from 52 to 36 bytes on 64-bit machines, which is a significant saving. A
quick tests shows a steady 0.5% performance gain, probably due to the better
cache efficiency.
2012-05-13 00:35:44 +02:00
Willy Tarreau
3d2f16f3c3 MINOR: standard: add a new debug macro : fddebug()
This macro is usable like printf but sends messages to fd #-1, which has no
visible effect but is easy to spot in strace. This is very useful to put
tracers at many points during debugging sessions.
2012-05-13 00:21:17 +02:00
Willy Tarreau
ce887fd3b2 MEDIUM: session: add support for tunnel timeouts
Tunnel timeouts are used when TCP connections are forwarded, or
when forwarding upgraded HTTP connections (WebSocket) as well as
CONNECT requests to proxies.

This timeout allows long-lived sessions to be supported without
having to set large timeouts to normal requests.
2012-05-12 12:50:00 +02:00
Willy Tarreau
d02394b5a1 MEDIUM: stream_interface: derive the socket operations from the target
Instead of hard-coding sock_raw in connect_server(), we set this socket
operation at config parsing time. Right now, only servers and peers have
it. Proxies are still hard-coded as sock_raw. This will be needed for
future work on SSL which requires a different socket layer.
2012-05-11 18:52:14 +02:00
Willy Tarreau
64798bd720 MINOR: stream_interface: add an init callback to sock_ops
This will be needed for some socket layers such as SSL. It's not used
at the moment.
2012-05-11 18:39:26 +02:00
Willy Tarreau
f873d754f8 CLEANUP: stream_interface: stop exporting socket layer functions
Similarly to the previous patch, we don't need the socket-layer functions
outside of stream_interface. They could even move to a file dedicated to
applets, though that does not seem particularly useful at the moment.
2012-05-11 17:47:17 +02:00
Willy Tarreau
b277d6e568 CLEANUP: sock_raw: remove last references to stream_sock
We also stop exporting all functions since they're not needed anymore outside
of sock_raw.c.
2012-05-11 17:03:42 +02:00
Willy Tarreau
59b9479667 BUG/MEDIUM: stream_interface: restore get_src/get_dst
Commit e164e7a removed get_src/get_dst setting in the stream interfaces but
forgot to set it in proto_tcp. Get the feature back because we need it for
logging, transparent mode, ACLs etc... We now rely on the stream interface
direction to know what syscall to use.

One benefit of doing it this way is that we don't use getsockopt() anymore
on outgoing stream interfaces nor on UNIX sockets.
2012-05-11 16:48:10 +02:00
Willy Tarreau
1539a01645 MINOR: stream_interface: add a client target : TARG_TYPE_CLIENT
This one will be used to identify the direction the SI is being used. All
incoming connections have a target of type TARG_TYPE_CLIENT.
2012-05-11 14:47:34 +02:00
Willy Tarreau
c63190d429 REORG: use the name sock_raw instead of stream_sock
We'll soon have an SSL socket layer, and in order to ease the difference
between the two, we use the name "sock_raw" to designate the one which
directly talks to the sockets without any conversion.
2012-05-11 14:23:52 +02:00
Willy Tarreau
a7fe8e527c MINOR: http: replace http_message_realign() with buffer_slow_realign()
There is no more reason for the realign function being HTTP specific,
it only operates on a buffer now. Let's move it to buffers.c instead.

It's likely that buffer_bounce_realign is broken (not used), this will
have to be inspected. The function is worth rewriting as it can be
cheaper than buffer_slow_realign() to realign large wrapping buffers.
2012-05-08 21:28:17 +02:00
Willy Tarreau
0a3dd74c9c MEDIUM: cfgparse: use the new error reporting framework for remaining cfg_keywords
All keywords registered using a cfg_kw_list now make use of the new error reporting
framework. This allows easier and more precise error reporting without having to
deal with complex buffer allocation issues.
2012-05-08 21:28:17 +02:00
Willy Tarreau
a93c74be5c MEDIUM: cfgparse: make backend_parse_balance() use memprintf to report errors
Using the new error reporting framework makes it easier to report complex
errors.
2012-05-08 21:28:17 +02:00
Willy Tarreau
6e0644339f MEDIUM: memory: add the ability to poison memory at run time
From time to time, some bugs are discovered that are caused by non-initialized
memory areas. It happens that most platforms return a zero-filled area upon
first malloc() thus hiding potential bugs. This patch also replaces malloc()
in pools with calloc() to ensure that all platforms exhibit the same behaviour
upon startup. In order to catch these bugs more easily, add a -dM command line
flag to enable memory poisonning. Optionally, passing -dM<byte> forces the
poisonning byte to <byte>.
2012-05-08 21:28:16 +02:00
Willy Tarreau
d04b1bce69 MEDIUM: http: improve error capture reports
A number of important information were missing from the error captures, so
let's improve them. Now we also log source port, session flags, transaction
flags, message flags, pending output bytes, expected buffer wrapping position,
total bytes transferred, message chunk length, and message body length.

As such, the output format has slightly evolved and the source address moved
to the third line :

[08/May/2012:11:14:36.341] frontend echo (#1): invalid request
  backend echo (#1), server <NONE> (#-1), event #1
  src 127.0.0.1:40616, session #4, session flags 0x00000000
  HTTP msg state 26, msg flags 0x00000000, tx flags 0x00000000
  HTTP chunk len 0 bytes, HTTP body len 0 bytes
  buffer flags 0x00909002, out 0 bytes, total 28 bytes
  pending 28 bytes, wrapping at 8030, error at position 7:

  00000  GET / /?t=20000 HTTP/1.1\r\n
  00026  \r\n

[08/May/2012:11:13:13.426] backend echo (#1) : invalid response
  frontend echo (#1), server local (#1), event #0
  src 127.0.0.1:40615, session #1, session flags 0x0000044e
  HTTP msg state 32, msg flags 0x0000000e, tx flags 0x08200000
  HTTP chunk len 0 bytes, HTTP body len 20 bytes
  buffer flags 0x00008002, out 81 bytes, total 92 bytes
  pending 11 bytes, wrapping at 7949, error at position 9:

  00000  Foo: bar\r\r\n
2012-05-08 21:28:16 +02:00
Willy Tarreau
bbebbbff83 REORG/MEDIUM: move the default accept function from sockstream to protocols.c
The previous sockstream_accept() function uses nothing from sockstream, and
is totally irrelevant to stream interfaces. Move this to the protocols.c
file which handles listeners and protocols, and call it listener_accept().

It now makes much more sense that the code dealing with listen() also handles
accept() and passes it to upper layers.
2012-05-08 21:28:15 +02:00
Willy Tarreau
26d8c59f0b REORG/MEDIUM: replace stream interface protocol functions by a proto pointer
The stream interface now makes use of the socket protocol pointer instead
of the direct functions.
2012-05-08 21:28:15 +02:00
Willy Tarreau
5c979a9c71 REORG/MEDIUM: stream_interface: initialize socket ops from descriptors 2012-05-08 21:28:14 +02:00
Willy Tarreau
1b79bdee26 REORG/MEDIUM: move protocol->{read,write} to sock_ops
The protocol must not set the read and write callbacks, they're specific
to the socket layer. Move them to sock_ops instead.
2012-05-08 21:28:14 +02:00
Willy Tarreau
060781fb4a REORG: stream_interface: create a struct sock_ops to hold socket operations
These operators are used regardless of the socket protocol family. Move
them to a "sock_ops" struct. ->read and ->write have been moved there too
as they have no reason to remain at the protocol level.
2012-05-08 21:28:14 +02:00
Willy Tarreau
ceb4ac9c34 MEDIUM: acl: support IPv6 address matching
Make use of the new IPv6 pattern type so that acl_match_ip() knows how to
compare pattern and sample.

IPv6 may be entered in their usual form, with or without a netmask appended.
Only bit counts are accepted for IPv6 netmasks. In order to avoid any risk of
trouble with randomly resolved IP addresses, host names are never allowed in
IPv6 patterns.

HAProxy is also able to match IPv4 addresses with IPv6 addresses in the
following situations :
  - tested address is IPv4, pattern address is IPv4, the match applies
    in IPv4 using the supplied mask if any.
  - tested address is IPv6, pattern address is IPv6, the match applies
    in IPv6 using the supplied mask if any.
  - tested address is IPv6, pattern address is IPv4, the match applies in IPv4
    using the pattern's mask if the IPv6 address matches with 2002:IPV4::,
    ::IPV4 or ::ffff:IPV4, otherwise it fails.
  - tested address is IPv4, pattern address is IPv6, the IPv4 address is first
    converted to IPv6 by prefixing ::ffff: in front of it, then the match is
    applied in IPv6 using the supplied IPv6 mask.
2012-05-08 21:28:14 +02:00
Willy Tarreau
6d20e28556 MINOR: standard: add an IPv6 parsing function (str62net)
str62net returns an address and a netmask in number of bits.
2012-05-08 20:57:21 +02:00
Willy Tarreau
c92ddbc37d MINOR: acl: add types to ACL patterns
We cannot currently match IPv6 addresses in ACL simply because we don't
support types on the patterns. Let's introduce this notion. For now, we
rely on the SMP_TYPES though it doesn't seem like it will last forever
given that some types are not present there (eg: regex, meth). Still it
should be enough to support mixed matchings for most types.

We use the special impossible value SMP_TYPES for types that don't exist
in the SMP_T_* space.
2012-05-08 20:57:21 +02:00
Willy Tarreau
cd3b094618 REORG: rename "pattern" files
They're now called "sample" everywhere to match their description.
2012-05-08 20:57:21 +02:00
Willy Tarreau
1278578487 REORG: use the name "sample" instead of "pattern" to designate extracted data
This is mainly a massive renaming in the code to get it in line with the
calling convention. Next patch will rename a few files to complete this
operation.
2012-05-08 20:57:20 +02:00
Willy Tarreau
7dcb6480db MEDIUM: acl: extend the pattern parsers to report meaningful errors
By passing the error pointer to all ACL parsers, we can make them report
useful errors and not simply fail.
2012-05-08 20:57:20 +02:00
Willy Tarreau
b7451bb660 MEDIUM: acl: report parsing errors to the caller
All parsing errors were known but impossible to return. Now by making use
of memprintf(), we're able to build meaningful error messages that the
caller can display.
2012-05-08 20:57:20 +02:00
Willy Tarreau
185b5c4a7b MEDIUM: http: merge acl and pattern header fetch functions
HTTP header fetch is now done using smp_fetch_hdr() for both ACLs and
patterns. This one also supports an occurrence number, making it possible
to specify explicit occurrences for ACLs and patterns.
2012-05-08 20:57:19 +02:00
Willy Tarreau
ae52f06da3 MINOR: acl: add a val_args field to keywords
This will make it possible to delegate argument validating to functions
shared with smp_fetch_*.
2012-05-08 20:57:19 +02:00
Willy Tarreau
7a777edbdf MINOR: acl: set SMP_OPT_ITERATE on fetch functions
This way, fetch functions will be able to tell if they're called for a single
request or as part of a loop. This is important for instance when we use
hdr(foo), because in an ACL this means that all hdr(foo) occurrences must
be checked while in a pattern it means only one of them (eg: last one).
2012-05-08 20:57:18 +02:00
Willy Tarreau
32a6f2e572 MEDIUM: acl/pattern: use the same direction scheme
Patterns were using a bitmask to indicate if request or response was desired
in fetch functions and keywords. ACLs were using a bitmask in fetch keywords
and a single bit in fetch functions. ACLs were also using an ACL_PARTIAL bit
in fetch functions indicating that a non-final fetch was performed, which was
an abuse of the existing direction flag.

The change now consists in using :
  - a capabilities field for fetch keywords => SMP_CAP_REQ/RES to indicate
    if a keyword supports requests, responses, both, etc...
  - an option field for fetch functions to indicate what the caller expects
    (request/response, final/non-final)

The ACL_PARTIAL bit was reversed to get SMP_OPT_FINAL as it's more explicit
to know we're working on a final buffer than on a non-final one.

ACL_DIR_* were removed, as well as PATTERN_FETCH_*. L4 fetches were improved
to support being called on responses too since they're still available.

The <dir> field of all fetch functions was changed to <opt> which is now
unsigned.

The patch is large but mostly made of cosmetic changes to accomodate this, as
almost no logic change happened.
2012-05-08 20:57:17 +02:00
Willy Tarreau
24e32d8c6b MEDIUM: acl: replace acl_expr with args in acl fetch_* functions
Having the args everywhere will make it easier to share fetch functions
between patterns and ACLs. The only place where we could have needed
the expr was in the http_prefetch function which can do well without.
2012-05-08 20:57:16 +02:00
Willy Tarreau
32389b7d04 MEDIUM: acl/pattern: switch rdp_cookie functions stack up-down
Previously, both pattern, backend and persist_rdp_cookie would build fake
ACL expressions to fetch an RDP cookie by calling acl_fetch_rdp_cookie().

Now we switch roles. The RDP cookie fetch function is provided as a sample
fetch function that all others rely on, including ACL. The code is exactly
the same, only the args handling moved from expr->args to args. The code
was moved to proto_tcp.c, but probably that a dedicated file would be more
suited to content handling.
2012-05-08 20:57:16 +02:00
Willy Tarreau
342acb4775 MEDIUM: pattern: integrate pattern_data into sample and use sample everywhere
Now there is no more reference to union pattern_data. All pattern fetch and
conversion functions now make use of the common sample type. Note: none of
them adjust the type right now so it's important to do it next otherwise
we would risk sharing such functions with ACLs and seeing them fail.
2012-05-08 20:57:15 +02:00
Willy Tarreau
b4a88f0672 MINOR: pattern: replace struct pattern with struct sample
This change is pretty minor. Struct pattern is only used for
pattern_process() now so changing it to use the common type is
quite obvious. It's worth noting that the last argument of
pattern_process() is never used so the function is self-sufficient.

Note that pattern_process() does not initialize the pattern at all
before calling fetch->process(), and that minimal initialization
will be required when we later change the argument for the sample.
2012-05-08 20:57:15 +02:00
Willy Tarreau
21e5b0e3cb MEDIUM: get rid of SMP_F_READ_ONLY and SMP_F_MUST_FREE
These ones were either unused or improperly used. Some integers were marked
read-only, which does not make much sense. Buffers are not read-only, they're
"constant" in that they must be kept intact after any possible change.
2012-05-08 20:57:15 +02:00
Willy Tarreau
197e10aaae MEDIUM: acl: get rid of the SET_RES flags
We now simply rely on a boolean result from a fetch to declare a match.
Booleans are not compared against patterns, they fix the result.
2012-05-08 20:57:15 +02:00
Willy Tarreau
f853c46bc3 MEDIUM: pattern/acl: get rid of temp_pattern in ACLs
This one is not needed anymore as we can return the data and its type in the
sample provided by the caller. ACLs now always return the proper type. BOOL
is already returned when the result is expected to be processed as a boolean.

temp_pattern has been unexported now.
2012-05-08 20:57:14 +02:00
Willy Tarreau
3740635b88 MAJOR: acl: make use of the new sample struct and get rid of acl_test
This change is invasive in lines of code but not much in terms of
functionalities as it's mainly a replacement of struct acl_test
with struct sample.
2012-05-08 20:57:14 +02:00
Willy Tarreau
422aa0792d MEDIUM: pattern: add new sample types to replace pattern types
The new sample types are necessary for the acl-pattern convergence.
These types are boolean and signed int. Some types were renamed for
less ambiguity (ip->ipv4, integer->uint).
2012-05-08 20:57:14 +02:00
Willy Tarreau
16c31b00dc MINOR: pattern: add a new 'sample' type to store fetched data
The pattern type is ambiguous because a pattern is only a type and a data
part, and is normally used to match against samples. Currently, patterns
cannot hold information related to the life of the data which was extracted.
We don't want to overload patterns either, so let's add a new "sample" type
which will progressively supersede the acl_test and maybe the pattern at most
places. The sample shares similar information with patterns and also has flags
describing the data volatility and protection.
2012-05-08 20:57:13 +02:00
Willy Tarreau
8f7406e9b4 MEDIUM: acl: remove the ACL_TEST_F_NULL_MATCH flag
This flag was used to force a boolean match even if there was no pattern
to match. It was used only by http_auth() and designed only for this one.
It's easier and cleaner to make the fetch function perform the test and
report the boolean result as a few other functions already do. It simplifies
the acl_exec_cond() logic and will help merging ACLs and patterns.
2012-05-08 20:57:13 +02:00
Willy Tarreau
21d68a6895 MEDIUM: pattern: add an argument validation callback to pattern descriptors
This is used to validate that arguments are coherent. For instance,
payload_lv expects that the last arg (if any) is not more negative
than the sum of the first two. The error is reported if any.
2012-05-08 20:57:13 +02:00
Willy Tarreau
9fcb984b17 MEDIUM: pattern: use the standard arg parser
We don't need the pattern-specific args parsers anymore, make use of the
common parser instead. We still need to improve this by adding a validation
function to report abnormal argument values or combinations. We don't report
precise parsing errors yet but this was not previously done either.
2012-05-08 20:57:13 +02:00
Willy Tarreau
f995410355 MEDIUM: pattern: get rid of arg_i in all functions making use of arguments
arg_i was almost unused, and since we migrated to use struct arg everywhere,
the rare cases where arg_i was needed could be replaced by switching to
arg->type = ARGT_STOP.
2012-05-08 20:57:12 +02:00
Willy Tarreau
ecfb8e8ff9 MEDIUM: pattern: replace type pattern_arg with type arg
arg is more complete than pattern_arg since it also covers ACL args,
so let's use this one instead.
2012-05-08 20:57:12 +02:00
Willy Tarreau
61612d49a7 MAJOR: acl: store the ACL argument types in the ACL keyword declaration
The types and minimal number of ACL keyword arguments are now stored in
their declaration. This will allow many more fantasies if some ACL use
several arguments or types.

Doing so required to rework all ACL keyword declarations to add two
parameters. So this was a good opportunity for a general cleanup and
to sort all entries in alphabetical order.

We still have two pending issues :
  - parse_acl_expr() checks for errors but has no way to report them to
    the user ;
  - the types of some arguments are still not resolved and kept as strings
    (eg: ARGT_FE/BE/TAB) for compatibility reasons, which must be resolved
    in acl_find_targets()
2012-05-08 20:57:11 +02:00
Willy Tarreau
34db108423 MAJOR: acl: make use of the new argument parsing framework
The ACL parser now uses the argument parser to build a typed argument list.
Right now arguments are all strings and only one argument is supported since
this is what ACLs currently support.
2012-05-08 20:57:11 +02:00
Willy Tarreau
2ac5718dbd MEDIUM: add a new typed argument list parsing framework
make_arg_list() builds an array of typed arguments with their values,
that the caller describes how to parse. This will be used to support
multiple arguments for ACLs and patterns, which is currently problematic
and prevents ACLs and patterns from being merged. Up to 7 arguments types
may be enumerated in a single 32-bit word, including their number of
mandatory parts.

At the moment, these files are not used yet, they're only built. Note that
the 4-bit encoding for the type has left only one unused type!
2012-05-08 20:57:10 +02:00
Willy Tarreau
9dab5fc4d4 MEDIUM: buffers: rename a number of buffer management functions
The following renaming took place :
1) buffer input functions
  buffer_put_block => bi_putblk
  buffer_put_char => bi_putchr
  buffer_put_string => bi_putstr
  buffer_put_chunk => bi_putchk
  buffer_feed => bi_putstr
  buffer_feed_chunk => bi_putchk
  buffer_cut_tail => bi_erase
  buffer_ignore => bi_fast_delete

2) buffer output functions
  buffer_get_char => bo_getchr
  buffer_get_line => bo_getline
  buffer_get_block => bo_getblk
  buffer_skip => bo_skip
  buffer_write => bo_inject

3) buffer input avail/full functions were introduced :
  bi_avail
  bi_full
2012-05-08 20:56:56 +02:00
Willy Tarreau
328582c3f9 MEDIUM: buffers: implement b_adv() to advance a buffer's pointer
This is more convenient and efficient than buf->p = b_ptr(buf, n);
It simply advances the buffer's pointer by <n> and trasfers that
amount of bytes from <in> to <out>. The BF_OUT_EMPTY flag is updated
accordingly.

A few occurrences of such computations in buffers.c and stream_sock.c
were updated to use b_adv(), which resulted in a small code shrink.
2012-05-08 12:28:14 +02:00
Willy Tarreau
cc5cfcbcce MEDIUM: buffers: add new pointer wrappers and get rid of almost all buffer_wrap_add calls
buffer_wrap_add was convenient for the migration but is not handy at all.
Let's have new wrappers that report input begin/end and output begin/end
instead.

It looks like we'll also need a b_adv(ofs) to advance a buffer's pointer.
2012-05-08 12:28:14 +02:00
Willy Tarreau
ec1bc82a1d MEDIUM: buffers: fix unsafe use of buffer_ignore at some places
buffer_ignore may only be used when the output of a buffer is empty,
but it's not granted it is always the case when sending HTTP error
responses. Better use buffer_cut_tail() instead, and use buffer_ignore
only on non-wrapping data.
2012-05-08 12:28:14 +02:00
Willy Tarreau
8a0cef2dad MEDIUM: http: remove buffer arg in http_capture_bad_message
The buffer pointer is now taken from the http_msg.
2012-05-08 12:28:13 +02:00
Willy Tarreau
45c0d98769 MEDIUM: http: http_send_name_header: remove references to msg and buffer
They can be deduced from txn.
2012-05-08 12:28:12 +02:00
Willy Tarreau
3a215bedba MAJOR: http: make http_msg->sol relative to buffer's origin
msg->sol is now a relative pointer just like all other ones. There is no
more absolute references to the buffer outside the struct buffer itself.

Next two cleanups should include removing buffer references to functions
which already have an msg, and removal of wrapping detection in request
and response parsing which cannot wrap by definition.
2012-05-08 12:28:12 +02:00
Willy Tarreau
62f791ea6f MEDIUM: http: add a pointer to the buffer in http_msg
ACLs and patterns only rely on a struct http_msg and don't know the pointer
to the actual data. struct http_msg will soon only hold relative references
so that's not possible. We need http_msg to hold a reference to the struct
buffer before having relative pointers everywhere.

It is likely that doing so will also result in opportunities to simplify
a number of functions arguments. The following functions are already
candidate :

	http_buffer_heavy_realign
	http_capture_bad_message
	http_change_connection_header
	http_forward_trailers
	http_header_add_tail
	http_header_add_tail2
	http_msg_analyzer
	http_parse_chunk_size
	http_parse_connection_header
	http_remove_header2
	http_send_name_header
	http_skip_chunk_crlf
	http_upgrade_v09_to_v10
2012-05-08 12:28:12 +02:00
Willy Tarreau
12e48b36dd MAJOR: http: turn http_msg->eol to a buffer-relative offset
It was an absolute pointer to the buffer's data, now it's a pointer relative
to the buffer's origin.
2012-05-08 12:28:12 +02:00
Willy Tarreau
fa4a03ca08 CLEANUP: http: remove unused http_msg->col
The <col> element of the struct http_msg has not been used for a long
time now, remove it.
2012-05-08 12:28:11 +02:00
Willy Tarreau
ea1175a687 MAJOR: http: change msg->{som,col,sov,eoh} to be relative to buffer origin
These offsets were relative to the buffer itself. Now they're relative to
the buffer's origin (buf->p) which normally corresponds to the start of
current message.

This saves a big dependency between the HTTP message struct and the buffers.
It appeared during this change that ->col is not used anymore (it will have
to be removed). Next step is to turn ->eol and ->sol from absolute to relative.
2012-05-08 12:28:11 +02:00
Willy Tarreau
a458b67965 MAJOR: http: move buffer->lr to http_msg->next
The buffer's pointer <lr> was only used by HTTP parsers which also use a
struct http_msg to keep track of the parser's state. We've reached a point
where it makes no sense to keep ->lr in the buffer, as the split between
buffer and msg is only arbitrary for historical reasons.

This change ensures that touching buffers will not impact HTTP messages
anymore, making the buffers more content-agnostic. However, it becomes
very important not to forget to update msg->next when some data get
forwarded or moved (and in general each time buf->p is updated).

The new pointer in http_msg becomes relative to buffer->p so that
parsing multiple messages becomes easier. It is possible that at one
point ->som and ->next will be merged.

Note: http_parse_reqline() and http_parse_stsline() have been temporarily
modified to know the message starting point in the buffer (->p).
2012-05-08 12:28:11 +02:00
Willy Tarreau
363a5bb152 MAJOR: buffers: replace buf->r with buf->p + buf->i
This change gets rid of buf->r which is always equal to buf->p + buf->i.
It removed some wrapping detection at a number of places, but required addition
of new relative offset computations at other locations. A large number of places
can be simplified now with extreme care, since most of the time, either the
pointer has to be computed once or we need a difference between the old ->w and
old ->r to compute free space. The cleanup will probably happen with the rewrite
of the buffer_input_* and buffer_output_* functions anyway.

buf->lr still has to move to the struct http_msg and be relative to buf->p
for the rework to be complete.
2012-05-08 12:28:11 +02:00
Willy Tarreau
89fa706d39 MAJOR: buffers: replace buf->w with buf->p - buf->o
This change introduces the buffer's base pointer, which is the limit between
incoming and outgoing data. It's the point where the parsing should start
from. A number of computations have already been greatly simplified, but
more simplifications are expected to come from the removal of buf->r.

The changes appear good and have revealed occasional improper use of some
pointers. It is possible that this patch has introduced bugs or revealed
some, although preliminary testings tend to indicate that everything still
works as it should.
2012-05-08 12:28:10 +02:00
Willy Tarreau
3f7ff1406c MINOR: buffers: remove unused function buffer_contig_data()
This one was never used and is buggy. It will be easier to rewrite
it when the buffer rework is complete.
2012-05-08 12:28:10 +02:00
Willy Tarreau
7fd758bbcf MINOR: buffers: provide simple pointer normalization functions
Add buffer_wrap_sub() and buffer_wrap_add() to normalize buffer pointers
after an addition or subtract.
2012-05-08 12:28:10 +02:00
Willy Tarreau
02d6cfc1d7 MAJOR: buffer: replace buf->l with buf->{o+i}
We don't have buf->l anymore. We have buf->i for pending data and
the total length is retrieved by adding buf->o. Some computation
already become simpler.

Despite extreme care, bugs are not excluded.

It's worth noting that msg->err_pos as set by HTTP request/response
analysers becomes relative to pending data and not to the beginning
of the buffer. This has not been completed yet so differences might
occur when outgoing data are left in the buffer.
2012-05-08 12:28:10 +02:00
Willy Tarreau
2e046c6017 MAJOR: buffer rework: replace ->send_max with ->o
This is the first minor step of the buffer rework. It's only renaming,
it should have no impact.
2012-04-30 11:57:00 +02:00
Willy Tarreau
a36fc4d7ed MEDIUM: move message-related flags from transaction to message
Too many flags are stored in the transaction structure. Some flags are
clearly message-specific and exist in two versions (request and response).
Move them to a new "flags" field in the http_message struct instead.
2012-04-30 11:57:00 +02:00
Willy Tarreau
9a7bea52b1 MINOR: standard: add a memprintf() function to build formatted error messages
memprintf() is just like snprintf() except that it always returns a properly
sized allocated string that the caller is responsible for freeing. NULL is
returned on serious errors. It also supports stackable calls over the same
pointer since it offers support for automatically freeing a previous one :

     memprintf(&err, "invalid argument: '%s'", arg);
     ...
     memprintf(&err, "keyword parser said: <%s>", *err);
     ...
     memprintf(&err, "line parser said: %s\n", *err);
     ...
     free(*err);
2012-04-30 11:55:35 +02:00
Willy Tarreau
3fb818c014 BUILD: http: make extract_cookie_value() return an int not size_t
It's very annoying that we have to deal with the crappy size_t and with ints
at some places because these ones don't mix well. Patch 6f61b2 changed the
chunk len to int but its size remains size_t and some functions are having
trouble being used by several callers depending on the type of their arguments.

Let's turn extract_cookie_value() to int for now on, and plan a massive cleanup
later to remove all size_t.
2012-04-30 00:19:28 +02:00
Willy Tarreau
9b061e3320 MEDIUM: stream_sock: add a get_src and get_dst callback and remove SN_FRT_ADDR_SET
These callbacks are used to retrieve the source and destination address
of a socket. The address flags are not hold on the stream interface and
not on the session anymore. The addresses are collected when needed.

This still needs to be improved to store the IP and port separately so
that it is not needed to perform a getsockname() when only the IP address
is desired for outgoing traffic.
2012-04-07 18:03:52 +02:00