5905 Commits

Author SHA1 Message Date
Willy Tarreau
239d7189fc MEDIUM: stream_interface: pass connection instead of fd in sock_ops
The sock_ops I/O callbacks made use of an FD till now. This has become
inappropriate and the struct connection is much more useful. It also
fixes the race condition introduced by previous change.
2012-09-02 21:53:08 +02:00
Willy Tarreau
fd31e53139 MAJOR: remove the stream interface and task management code from sock_*
The socket data layer code must only focus on moving data between a
socket and a buffer. We need a special stream interface handler to
update the stream interface and the file descriptor status.

At the moment the code works but suffers from a race condition caused
by its API : the read/write callbacks still make use of the fd instead
of using the connection. And when a double shutdown is performed, a call
to ->write() after ->read() processed an error results in dereferencing
a NULL fdtab[]->owner. This is only a temporary issue which doesn't need
to be fixed now since this will automatically go away when the functions
change to use the connection instead.
2012-09-02 21:53:08 +02:00
Willy Tarreau
076be25ab8 CLEANUP: remove the now unused fdtab direct I/O callbacks
They were all left to NULL since last commit so we can safely remove them
all now and remove the temporary dual polling logic in pollers.
2012-09-02 21:51:29 +02:00
Willy Tarreau
2da156fe5e MAJOR: tcp: remove the specific I/O callbacks for TCP connection probes
Use a single tcp_connect_probe() instead of tcp_connect_write() and
tcp_connect_read(). We call this one only when no data layer function
have been processed, so this is a fallback to test for completion of
a connection attempt.

With this done, we don't have the need for any direct I/O callback
anymore.

The function still relies on ->write() to wake the stream interface up,
so it's not finished.
2012-09-02 21:51:29 +02:00
Willy Tarreau
2c6be84b3a MEDIUM: connection: extract the send_proxy callback from proto_tcp
This handshake handler must be independant, so move it away from
proto_tcp. It has a dedicated connection flag. It is tested before
I/O handlers and automatically removes the CO_FL_WAIT_L4_CONN flag
upon success.

It also sets the BF_WRITE_NULL flag on the stream interface and
stops the SI timeout. However it does not perform the task_wakeup(),
and relies on the data handler to do so for now. The SI wakeup will
have to be moved elsewhere anyway.
2012-09-02 21:51:28 +02:00
Willy Tarreau
61ace1b2ca MEDIUM: connection: remove the FD_POLL_* flags only once
It's inappropriate to remove FD_POLL_IN and FD_POLL_OUT in the IO callback
handlers, first because they shouldn't care about this, and second because
it will make it harder to chain multiple callers.

So let's flush these flags only once for all in the connection handler.
Right now, the HUP and ERR flags are still flushed in each IO handler to
avoid multiple calls. This will probably have to be fixed later.
2012-09-02 21:51:28 +02:00
Willy Tarreau
8018471f44 MINOR: fd: make fdtab->owner a connection and not a stream_interface anymore
It is more convenient with a connection here and will abstract stream_interface
more easily.
2012-09-02 21:51:28 +02:00
Willy Tarreau
d2274c6536 MAJOR: connection: replace direct I/O callbacks with the connection callback
Almost all direct I/O callbacks have been changed to use the connection
callback instead. Only the TCP connection validation remains.
2012-09-02 21:51:28 +02:00
Willy Tarreau
59f98393bb MINOR: connection: add a handler for fd-based connections
This connection handler will be used as an I/O handler for events
detected on a file descriptor. It is not used yet.
2012-09-02 21:51:28 +02:00
Willy Tarreau
aece46a44d MEDIUM: protocols: use the generic I/O callback for accept callbacks
This one is used only on read events, and it was easy to convert to
use the new I/O callback.
2012-09-02 21:51:27 +02:00
Willy Tarreau
20bea42a95 MEDIUM: checks: make use of fdtab->iocb instead of cb[]
Use the single I/O callback to handle the checks. This should soon be
replaced by the common connection handler.
2012-09-02 21:51:27 +02:00
Willy Tarreau
9845e75d23 MEDIUM: polling: prepare to call the iocb() function when defined.
We will need this to centralize I/O callbacks. Nobody sets it right
now so the code should have no impact.
2012-09-02 21:51:27 +02:00
Willy Tarreau
4e6049e553 MINOR: fd: add a new I/O handler to fdtab
This one will eventually replace both cb[] handlers. At the moment it
is not used yet.
2012-09-02 21:51:27 +02:00
Willy Tarreau
505e34a36d MAJOR: get rid of fdtab[].state and use connection->flags instead
fdtab[].state was only used to know whether a connection was in progress
or an error was encountered. Instead we now use connection->flags to store
a flag for both. This way, connection management will be able to update the
connection status on I/O.
2012-09-02 21:51:26 +02:00
Willy Tarreau
da92e2fb61 REORG/MINOR: checks: put a struct connection into the server
This will be used to handle the connection state once it goes away from fdtab.
There is no functional change at the moment.
2012-09-02 21:51:26 +02:00
Willy Tarreau
ed8f614078 REORG/MEDIUM: fd: get rid of FD_STLISTEN
This state was only used so that ev_sepoll did not match FD_STERROR, which
changed in previous patch. We can now safely remove this state.
2012-09-02 21:51:25 +02:00
Willy Tarreau
5d526b7215 REORG/MEDIUM: fd: remove checks for FD_STERROR in ev_sepoll
This test is present only in this poller as an optimization, but this
optimization adds some complexity to remove fdtab[].state. Let's get
rid of it for now.
2012-09-02 21:51:25 +02:00
Willy Tarreau
db3b32610f REORG/MEDIUM: fd: remove FD_STCLOSE from struct fdtab
In an attempt to get rid of fdtab[].state, and to move the relevant
parts to the connection struct, we remove the FD_STCLOSE state which
can easily be deduced from the <owner> pointer as there is a 1:1 match.
2012-09-02 21:51:25 +02:00
Jamie Gloudon
801a0a353a DOC: fix name for "option independant-streams"
The correct spelling is "independent", not "independant". This patch
fixes the doc and the configuration parser to accept the correct form.
The config parser still allows the old naming for backwards compatibility.
2012-09-02 21:51:07 +02:00
Willy Tarreau
654694e189 MEDIUM: stats/cli: add support for "set table key" to enter values
This is used to enter values for stick tables. The most likely usage
is to set gpc0 for a specific IP address in order to block traffic
for abusers without having to reload. Since all data types are
supported, other usages are possible (eg: replace a users's assigned
server).
2012-09-02 21:51:07 +02:00
Willy Tarreau
dec9814e74 MINOR: stats/cli: add plans to support more stick-table actions
Right now we only support show/clear on a table. In order to introduce
the "set" keyword we need to get rid of the "show" boolean arg. There
is no functional change up to this commit.
2012-09-02 21:51:06 +02:00
William Lallemand
1dc00efedc BUG/MINOR: to_log erased with unique-id-format
curproxy->to_log was reset to LW_INIT when using unique-id-format,
so logs looked like option logasap
2012-08-09 19:18:22 +02:00
Willy Tarreau
a9fddca778 MINOR: http: add the urlp_val ACL match
It's derived from other urlp_* matches, but there was no way to check for
an integer value and it seems like it's significantly used.
2012-07-31 07:55:32 +02:00
Willy Tarreau
491c498d97 BUG/MINOR: polling: some events were not set in various pollers
fdtab[].ev was only set in ev_sepoll. Unfortunately, some I/O handling
functions now rely on this, so depending on the polling mechanism, some
useless operations might have been performed, such as performing a useless
recv() when a HUP was reported.

This is a very old issue, the flags were only added to the fdtab and not
propagated into any poller. Then they were used in ev_sepoll which needed
them for the cache. It is unsure whether a backport to 1.4 is appropriate
or not.
2012-07-31 07:55:31 +02:00
Willy Tarreau
dae2a8a5a5 BUG/MINOR: tarpit: fix condition to return the HTTP 500 message
Commit fa7e1025 (1.3.16-rc1) introduced a minor bug by comparing req->flags
with BF_READ_ERROR instead of checking for the bit. The result is that the
error message is always returned even in case of client error. This has no
real impact but this must be fixed.

It may be backported to 1.4 and 1.3.
2012-07-31 07:55:31 +02:00
David du Colombier
65c1796c4a MINOR: IPv6 support for transparent proxy
Set socket option IPV6_TRANSPARENT on binding
to enable transparent proxy on IPv6.
This option is available from Linux 2.6.37.
2012-07-31 07:53:42 +02:00
Willy Tarreau
5b88da269c OPTIM: i386: make use of kernel-mode-linux when available
If haproxy is built with support for USE_VSYSCALL_DLSYM, it's very
easy to check for KML availability. So let's enable it. Tests show
a small overall performance improvement around 1%. Other tests show
that the syscall overhead is divided by 4 on a Geode LX using this
method.
2012-07-31 07:53:42 +02:00
Willy Tarreau
a7ad50cdb1 MEDIUM: pattern: add the "base" sample fetch method
This one returns the concatenation of the first Host header entry with
the path. It can make content-switching rules easier, help with fighting
DDoS on certain URLs and improve shared caches efficiency.
2012-07-26 19:08:38 +02:00
Willy Tarreau
6812bcfc94 MINOR: replace acl_fetch_{path,url}* with smp_fetch_*
Doing so allows us to support sticking on URL, URL's IP, URL's port and
path.

Both fetch functions should be improved to support an optional depth
allowing to stick to a server depending on just a few directory
components. This would help with portals, some prefetch-capable
caches and with outgoing connections using multiple internet links.
2012-07-26 19:06:40 +02:00
Willy Tarreau
e3a461118c BUG/MINOR: ACL implicit arguments must be created with unresolved flag
Commit 496aa0 fixed a design issue by adding an "unresolved" flag to the
ACL arguments. Unfortunately this unresolved flag was not set when building
the fake argument some ACL need when using an implicit argument pointing to
the local proxy.

Special thanks to Michael Kearey who reported the issue with a reproducer
and the commit introducing the bug.
2012-06-15 08:02:34 +02:00
Willy Tarreau
96596aeead MEDIUM: fd/si: move peeraddr from struct fdinfo to struct connection
The destination address is purely a connection thing and not an fd thing.
It's also likely that later the address will be stored into the connection
and linked to by the SI.

struct fdinfo only keeps the pointer to the port range and the local port
for now. All of this also needs to move to the connection but before this
the release of the port range must move from fd_delete() to a new function
dedicated to the connection.
2012-06-08 22:59:52 +02:00
Willy Tarreau
a05903174f BUG/MAJOR: cookie prefix doesn't support cookie-less servers
Commit 827aee91 merged in 1.5-dev5 introduced a regression causing
the srv pointer to be tested twice instead of srv then srv->cookie.
The result is that if a server has no cookie in prefix mode, haproxy
will crash when trying to modify it.

Such a config is very unlikely to happen, except maybe with a backup
server, which would cause haproxy to die with the last server in the
farm.

No backport is needed, only 1.5-dev was affected.
2012-06-06 16:07:00 +02:00
Willy Tarreau
4f8a83cb6e MEDIUM: stats: add the ability to kill sessions from the admin interface
It was not possible to kill remaining sessions from the admin interface,
which is annoying especially when switching to maintenance mode. Now it's
possible.
2012-06-04 00:26:23 +02:00
Willy Tarreau
d72822442d MEDIUM: stats: add support for soft stop/soft start in the admin interface
One important missing feature on the web interface is the ability to perform
a soft stop/soft start. This is now possible.
2012-06-04 00:22:44 +02:00
Justin Karneges
eb2c24ae2a MINOR: checks: add on-marked-up option
This implements the feature discussed in the earlier thread of killing
connections on backup servers when a non-backup server comes back up. For
example, you can use this to route to a mysql master & slave and ensure
clients don't stay on the slave after the master goes from down->up. I've done
some minimal testing and it seems to work.

[WT: added session flag & doc, moved the killing after logging the server UP,
 and ensured that the new server is really usable]
2012-06-03 23:48:42 +02:00
Willy Tarreau
39b0665bc7 BUG/MINOR: commit 196729ef used wrong condition resulting in freeing constants
Recent commit 196729ef had inverted condition to free format strings. No
backport is needed, it was never released.
2012-06-01 10:58:06 +02:00
Willy Tarreau
496aa0111e BUG/MEDIUM: ensure that unresolved arguments are freed exactly once
When passing arguments to ACLs and samples, some types are stored as
strings then resolved later after config parsing is done. Upon exit,
the arguments need to be freed only if the string was not resolved
yet. At the moment we can encounter double free during deinit()
because some arguments (eg: userlists) are freed once as their own
type and once as a string.

The solution consists in adding an "unresolved" flag to the args to
say whether the value is still held in the <str> part or is final.

This could be debugged thanks to a useful bug report from Sander Klein.
2012-06-01 10:40:52 +02:00
Willy Tarreau
4992dd2d30 MINOR: http: add support for "httponly" and "secure" cookie attributes
httponly  This option tells haproxy to add an "HttpOnly" cookie attribute
             when a cookie is inserted. This attribute is used so that a
             user agent doesn't share the cookie with non-HTTP components.
             Please check RFC6265 for more information on this attribute.

   secure    This option tells haproxy to add a "Secure" cookie attribute when
             a cookie is inserted. This attribute is used so that a user agent
             never emits this cookie over non-secure channels, which means
             that a cookie learned with this flag will be presented only over
             SSL/TLS connections. Please check RFC6265 for more information on
             this attribute.
2012-05-31 21:02:17 +02:00
Willy Tarreau
b5ba17e3a9 BUG/MINOR: config: do not report twice the incompatibility between cookie and non-http
This one was already taken care of in proxy_cfg_ensure_no_http(), so if a
cookie is presented in a TCP backend, we got two warnings.

This can be backported to 1.4 since it's been this way for 2 years (although not dramatic).
2012-05-31 20:47:00 +02:00
Willy Tarreau
674021329c REORG/MINOR: use dedicated proxy flags for the cookie handling
Cookies were mixed with many other options while they're not used as options.
Move them to a dedicated bitmask (ck_opts). This has released 7 flags in the
proxy options and leaves some room for new proxy flags.
2012-05-31 20:40:20 +02:00
Willy Tarreau
99a7ca2fa6 BUG/MINOR: log: don't report logformat errors in backends
Logs have always been ignored by backends, do not report useless warnings there.
2012-05-31 19:39:23 +02:00
Willy Tarreau
196729eff8 BUG/MINOR: fix option httplog validation with TCP frontends
Option httplog needs to be checked only once the proxy has been validated,
so that its final mode (tcp/http) can be used. Also we need to check for
httplog before checking the log format, so that we can report a warning
about this specific option and not about the format it implies.
2012-05-31 19:30:26 +02:00
Willy Tarreau
743a2d3e14 BUG/MEDIUM: buffers: fix bi_putchr() to correctly advance the pointer
bi_putchr() failed to move the buffer pointer forward. The only user
was the peer handler which was broken, it failed to sync. Thanks to
Hervé Commowick for reporting the issue.
2012-05-31 16:40:11 +02:00
Willy Tarreau
fa6bac6ec3 BUG/MEDIUM: register peer sync handler in the proper order
Hervé Commowick reported a failure to resync upon restart caused by a
segfault on the old process. This is due to the data_ctx of the connection
being initialized after the stream interface.
2012-05-31 14:16:59 +02:00
Willy Tarreau
cde18fc1ba BUG/MINOR: perform_http_redirect also needs to rewind the buffer
Commit d1de8af362905d43bcd96e7522fcee62a93a53bf was incomplete, because
perform_http_redirect() also needs to rewind the buffer since it's called
after data are scheduled for forwarding.

No backport needed.
2012-05-30 08:00:56 +02:00
Cyril Bonté
a32d275ab0 BUG/MEDIUM: option forwardfor if-none doesn't work with some configurations
When "option forwardfor" is enabled in a frontend that uses backends,
"if-none" ignores the header name provided in the frontend.
This prevents haproxy to add the X-Forwarded-For header if the option is not
used in the backend.

This may introduce security issues for servers/applications that rely on the
header provided by haproxy.

A minimal configuration which can reproduce the bug:
defaults
	mode http

listen OK
	bind :9000

	option forwardfor if-none
	server s1 127.0.0.1:80

listen BUG-frontend
	bind :9001

	option forwardfor if-none

	default_backend BUG-backend

backend BUG-backend
	server s1 127.0.0.1:80
2012-05-30 06:43:24 +02:00
Willy Tarreau
7de211c88b MINOR: add a new function call tracer for debugging purposes
This feature relies on GCC's ability to call helpers at function entry/exit
points. We define these helpers to quickly dump the minimum info into a trace
file that can be converted to a human readable format using a script in the
contrib/trace directory. This has only been implemented in the GNU makefile
for now on as it is unsure whether it's supported on all OSes.

The feature is enabled by building with "TRACE=1". The performance impact is
huge, so this feature should only be used when debugging. To limit the loss
of performance, fprintf() has been disabled and the output is hand-crafted
and emitted using fwrite(), resulting in doubling the performance. Using the
TSC instead of gettimeofday() also doubles the performance. Around 1200 conns/s
may be achieved on a Pentium-M 1.7 GHz which leads to around 50 MB/s of traces.

The entry and exits of all functions will be dumped into a file designated
by the HAPROXY_TRACE environment variable, or by default "trace.out". If the
trace file name is empty or "/dev/null", then traces are disabled. If
opening the trace file fails, then stderr is used. If HAPROXY_TRACE_FAST is
used, then the time is taken from the global <now> variable. Last, if
HAPROXY_TRACE_TSC is used, then the machine's TSC is used instead of the
real time (almost twice as fast).

The output format is :

  <sec.usec> <level> <caller_ptr> <dir> <callee_ptr>
or :
  <tsc> <level> <caller_ptr> <dir> <callee_ptr>

where <dir> is '>' when entering a function and '<' when leaving.

The awk script in contrib/trace provides a nicer indented output :

6f74989e6f8 ->->->   run_poll_loop > signal_process_queue [src/haproxy.c:1097:0x804bd69] > [include/proto/signal.h:32:0x8049cd0]
6f74989eb00          run_poll_loop < signal_process_queue [src/haproxy.c:1097:0x804bd69] < [include/proto/signal.h:32:0x8049cd0]
6f74989ef44 ->->->   run_poll_loop > wake_expired_tasks [src/haproxy.c:1100:0x804bd72] > [src/task.c:123:0x8055060]
6f74989f3a6 ->->->->   wake_expired_tasks > eb32_lookup_ge [src/task.c:128:0x8055091] > [ebtree/eb32tree.c:138:0x80a8c70]
6f74989f7e9            wake_expired_tasks < eb32_lookup_ge [src/task.c:128:0x8055091] < [ebtree/eb32tree.c:138:0x80a8c70]
6f74989fc0d ->->->->   wake_expired_tasks > eb32_first [src/task.c:134:0x80550d5] > [ebtree/eb32tree.h:55:0x8054ad0]
6f7498a003d ->->->->->   eb32_first > eb_first [ebtree/eb32tree.h:56:0x8054af1] > [ebtree/ebtree.h:520:0x8054a10]
6f7498a0436 ->->->->->->   eb_first > eb_walk_down [ebtree/ebtree.h:521:0x8054a33] > [ebtree/ebtree.h:442:0x80549a0]
6f7498a0843 ->->->->->->->   eb_walk_down > eb_gettag [ebtree/ebtree.h:445:0x80549d6] > [ebtree/ebtree.h:418:0x80548e0]
6f7498a0c2b                  eb_walk_down < eb_gettag [ebtree/ebtree.h:445:0x80549d6] < [ebtree/ebtree.h:418:0x80548e0]
6f7498a1042 ->->->->->->->   eb_walk_down > eb_untag [ebtree/ebtree.h:447:0x80549e2] > [ebtree/ebtree.h:412:0x80548a0]
6f7498a1498                  eb_walk_down < eb_untag [ebtree/ebtree.h:447:0x80549e2] < [ebtree/ebtree.h:412:0x80548a0]
6f7498a18c6 ->->->->->->->   eb_walk_down > eb_root_to_node [ebtree/ebtree.h:448:0x80549e7] > [ebtree/ebtree.h:432:0x8054960]
6f7498a1cd4                  eb_walk_down < eb_root_to_node [ebtree/ebtree.h:448:0x80549e7] < [ebtree/ebtree.h:432:0x8054960]
6f7498a20c4                eb_first < eb_walk_down [ebtree/ebtree.h:521:0x8054a33] < [ebtree/ebtree.h:442:0x80549a0]
6f7498a24b4              eb32_first < eb_first [ebtree/eb32tree.h:56:0x8054af1] < [ebtree/ebtree.h:520:0x8054a10]
6f7498a289c            wake_expired_tasks < eb32_first [src/task.c:134:0x80550d5] < [ebtree/eb32tree.h:55:0x8054ad0]
6f7498a2c8c          run_poll_loop < wake_expired_tasks [src/haproxy.c:1100:0x804bd72] < [src/task.c:123:0x8055060]
6f7498a3095 ->->->   run_poll_loop > process_runnable_tasks [src/haproxy.c:1103:0x804bd7a] > [src/task.c:190:0x8055150]

A nice improvement would possibly consist in trying to get the function's
arguments in the stack and to dump a few more infor for some well-known
functions (eg: the session's status for process_session).
2012-05-26 00:12:37 +02:00
Willy Tarreau
1e44a49c89 BUG/MINOR: checks: expire on timeout.check if smaller than timeout.connect
It happens that haproxy doesn't displace the task in the wait queue when
validating a connection, so if the check timeout is set to a smaller value
than timeout.connect, it will not strike before timeout.connect.

The bug is present at least in 1.4.15..1.4.21, so the fix must be backported.
2012-05-25 07:42:37 +02:00
Oskar Stolc
8dc4184c57 MINOR: balance uri: added 'whole' parameter to include query string in hash calculation
This patch brings a new "whole" parameter to "balance uri" which makes
the hash work over the whole uri, not just the part before the query
string. Len and depth parameter are still honnored.

The reason for this new feature is explained below.

I have 3 backend servers, each accepting different form of HTTP queries:

http://backend1.server.tld/service1.php?q=...
http://backend1.server.tld/service2.php?q=...

http://backend2.server.tld/index.php?query=...&subquery=...

http://backend3.server.tld/image/49b8c0d9ff

Each backend server returns a different response based on either:
- the URI path (the left part of the URI before the question mark)
- the query string (the right part of the URI after the question mark)
- or the combination of both

I wanted to set up a common caching cluster (using 6 Squid servers, each
configured as reverse proxy for those 3 backends) and have HAProxy balance
the queries among the Squid servers based on URL. I also wanted to achieve
hight cache hit ration on each Squid server and send the same queries to
the same Squid servers. Initially I was considering using the 'balance uri'
algorithm, but that would not work as in case of backend2 all queries would
go to only one Squid server. The 'balance url_param' would not work either
as it would send the backend3 queries to only one Squid server.

So I thought the simplest solution would be to use 'balance uri', but to
calculate the hash based on the whole URI (URI path + query string),
instead of just the URI path.
2012-05-22 07:56:54 +02:00
Emeric Brun
d88fd824b7 MEDIUM: protocol: add a pointer to struct sock_ops to the listener struct
The listener struct is now aware of the socket layer to use upon accept().
At the moment, only sock_raw is supported so this patch should not change
anything.
2012-05-21 22:22:39 +02:00