haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-12 10:06:58 +02:00

Author	SHA1	Message	Date
Willy Tarreau	fd3828e263	[BUG] fix random memory corruption using "show sess" Commit `8a5c626e73` introduced the sessions dump on the unix socket. This implementation is buggy because it may try to link to the sessions list's head after the last session is removed with a backref. Also, for the LIST_ISEMPTY test to succeed, we have to proceed with LIST_INIT after LIST_DEL.	2009-02-22 15:17:24 +01:00
Willy Tarreau	0abebcc0fb	[MEDIUM] i/o: rework ->to_forward and ->send_max The way the buffers and stream interfaces handled ->to_forward was really not handy for multiple reasons. Now we've moved its control to the receive-side of the buffer, which is also responsible for keeping send_max up to date. This makes more sense as it now becomes possible to send some pre-formatted data followed by forwarded data. The following explanation has also been added to buffer.h to clarify the situation. Right now, tests show that the I/O is behaving extremely well. Some work will have to be done to adapt existing splice code though. /* Note about the buffer structure The buffer contains two length indicators, one to_forward counter and one send_max limit. First, it must be understood that the buffer is in fact split in two parts : - the visible data (->data, for ->l bytes) - the invisible data, typically in kernel buffers forwarded directly from the source stream sock to the destination stream sock (->splice_len bytes). Those are used only during forward. In order not to mix data streams, the producer may only feed the invisible data with data to forward, and only when the visible buffer is empty. The consumer may not always be able to feed the invisible buffer due to platform limitations (lack of kernel support). Conversely, the consumer must always take data from the invisible data first before ever considering visible data. There is no limit to the size of data to consume from the invisible buffer, as platform-specific implementations will rarely leave enough control on this. So any byte fed into the invisible buffer is expected to reach the destination file descriptor, by any means. However, it's the consumer's responsibility to ensure that the invisible data has been entirely consumed before consuming visible data. This must be reflected by ->splice_len. This is very important as this and only this can ensure strict ordering of data between buffers. The producer is responsible for decreasing ->to_forward and increasing ->send_max. The ->to_forward parameter indicates how many bytes may be fed into either data buffer without waking the parent up. The ->send_max parameter says how many bytes may be read from the visible buffer. Thus it may never exceed ->l. This parameter is updated by any buffer_write() as well as any data forwarded through the visible buffer. The consumer is responsible for decreasing ->send_max when it sends data from the visible buffer, and ->splice_len when it sends data from the invisible buffer. A real-world example consists in part in an HTTP response waiting in a buffer to be forwarded. We know the header length (300) and the amount of data to forward (content-length=9000). The buffer already contains 1000 bytes of data after the 300 bytes of headers. Thus the caller will set ->send_max to 300 indicating that it explicitly wants to send those data, and set ->to_forward to 9000 (content-length). This value must be normalised immediately after updating ->to_forward : since there are already 1300 bytes in the buffer, 300 of which are already counted in ->send_max, and that size is smaller than ->to_forward, we must update ->send_max to 1300 to flush the whole buffer, and reduce ->to_forward to 8000. After that, the producer may try to feed the additional data through the invisible buffer using a platform-specific method such as splice(). */	2009-01-09 10:15:03 +01:00
Willy Tarreau	6b66f3e4f6	[MAJOR] implement autonomous inter-socket forwarding If an analyser sets buf->to_forward to a given value, that many data will be forwarded between the two stream interfaces attached to a buffer without waking the task up. The same applies once all analysers have been released. This saves a large amount of calls to process_session() and a number of task_dequeue/queue.	2009-01-09 10:15:02 +01:00
Willy Tarreau	3ffeba1f67	[MEDIUM] enable inter-stream_interface wakeup calls By letting the producer tell the consumer there is data to check, and the consumer tell the producer there is some space left again, we can cut in half the number of session wakeups. This is also an important starting point for future splicing support.	2008-12-28 11:09:02 +01:00
Willy Tarreau	b0ef735c71	[MINOR] add flags to indicate when a stream interface is waiting for space/data It will soon be required to know when a stream interface is waiting for buffer data or buffer room. Let's add two flags for that.	2008-12-28 11:08:03 +01:00
Willy Tarreau	86491c3164	[MEDIUM] indicate when we don't care about read timeout Sometimes we don't care about a read timeout, for instance, from the client when waiting for the server, but we still want the client to be able to read. Till now it was done by articially forcing the read timeout to ETERNITY. But this will cause trouble when we want the low level stream sock to communicate without waking the session up. So we add a BF_READ_NOEXP flag to indicate that when the read timeout is to be set, it might have to be set to ETERNITY. Since BF_READ_ENA was not used, we replaced this flag.	2008-12-28 11:06:40 +01:00
Willy Tarreau	f890dc9003	[MEDIUM] add a send limit to a buffer For keep-alive, line-mode protocols and splicing, we will need to limit the sender to process a certain amount of bytes. The limit is automatically set to the buffer size when analysers are detached from the buffer.	2008-12-28 10:58:52 +01:00
Willy Tarreau	3dfe6cd095	[MEDIUM] add support for "show sess" in unix stats socket It is now possible to list all known sessions by issuing "show sess" on the unix stats socket. The format is not much evolved but it is very useful for debugging. The doc has been updated to reflect the new keyword.	2008-12-07 22:41:17 +01:00
Willy Tarreau	62e4f1dedd	[MINOR] add back-references to sessions for later use by a dumper. This is the first step in implementing a session dump tool. A session dump will need restart points. It will be necessary for it to get references to sessions which can be moved when the session dies. The principle is not that complex : when a session ends, it looks for any potential back-references. If it finds any, then it moves them to the next session in the list. The dump function will of course have to restart from that new point.	2008-12-07 21:57:02 +01:00
Willy Tarreau	01bf8675ed	[MEDIUM] reference the current hijack function in the buffer itself Instead of calling a hard-coded function to produce data, let's reference this function into the buffer and call it from there when BF_HIJACK is set. This goes in the direction of more generic session management code.	2008-12-07 18:03:29 +01:00
Willy Tarreau	b5654f6ff4	[MINOR] move the listener reference from fd to session The listener referenced in the fd was only used to check the listener state upon session termination. There was no guarantee that the FD had not been reassigned by the moment it was processed, so this was a bit racy. Having it in the session is more robust.	2008-12-07 16:45:10 +01:00
Willy Tarreau	7e5067d459	[MEDIUM] remove cli_fd, srv_fd, cli_state and srv_state from the session Those were previously used by the unix sockets only, and could be removed.	2008-12-07 16:27:56 +01:00
Willy Tarreau	b1356cf4e4	[MAJOR] make unix sockets work again with stats The unix protocol handler had not been updated during the last stream_sock changes. This has been done now. There is still a lot of duplicated code between session.c and proto_uxst.c due to the way the session is handled. Session.c relies on the existence of a frontend while it does not exist here. It is easier to see the difference between the stats part (placed in dumpstats.c) and the unix-stream part (in proto_uxst.c). The hijacking function still needs to be dynamically set into the response buffer, and some cleanup is still required, then all those changes should be forward-ported to the HTTP part. Adding support for new keywords should not cause trouble now.	2008-12-07 16:06:43 +01:00
Willy Tarreau	a11e976163	[MEDIUM] first pass of lifting to proto_uxst.c:uxst_event_accept() The accept function must be adapted to the new framework. It is still broken, and calling it will still result in a segfault. But this cleanup is needed anyway.	2008-12-01 01:44:25 +01:00
Willy Tarreau	dded32defa	[MINOR] replace client_retnclose() with stream_int_retnclose() This makes more sense to return a message to a stream interface than to a session. senddata.{c,h} have been removed.	2008-11-30 19:48:07 +01:00
Willy Tarreau	f54f8bdd8d	[MINOR] maintain a global session list in order to ease debugging Now the global variable 'sessions' will be a dual-linked list of all known sessions. The list element is set at the beginning of the session so that it's easier to follow them all with gdb.	2008-11-23 19:53:55 +01:00
Willy Tarreau	eabf313df2	[MINOR] change type of fdtab[]->owner to void* The owner of an fd was initially a task but this was sometimes casted to a (struct listener ). We'll soon need more types, so void is more appropriate.	2008-11-02 10:19:08 +01:00
Willy Tarreau	fdccded0e8	[MEDIUM] indicate a reason for a task wakeup It's very frequent to require some information about the reason why a task is running. Some flags have been added so that a task now knows if it got woken up due to I/O completion, timeout, etc...	2008-11-02 10:19:08 +01:00
Willy Tarreau	3da77c5abd	[MINOR] re-arrange buffer flags and rename some of them The buffer flags became a big bazaar. Re-arrange them so that their names are more explicit and so that they are more easily readable in hex form. Some aggregates have also been adjusted.	2008-11-02 10:19:07 +01:00
Willy Tarreau	fa7e10251d	[MAJOR] rework of the server FSM srv_state has been removed from HTTP state machines, and states have been split in either TCP states or analyzers. For instance, the TARPIT state has just become a simple analyzer. New flags have been added to the struct buffer to compensate this. The high-level stream processors sometimes need to force a disconnection without touching a file-descriptor (eg: report an error). But if they touched BF_SHUTW or BF_SHUTR, the file descriptor would not be closed. Thus, the two SHUT?_NOW flags have been added so that an application can request a forced close which the stream interface will be forced to obey. During this change, a new BF_HIJACK flag was added. It will be used for data generation, eg during a stats dump. It prevents the producer on a buffer from sending data into it. BF_SHUTR_NOW /* the producer must shut down for reads ASAP / BF_SHUTW_NOW / the consumer must shut down for writes ASAP / BF_HIJACK / the producer is temporarily replaced / BF_SHUTW_NOW has precedence over BF_HIJACK. BF_HIJACK has precedence over BF_MAY_FORWARD (so that it does not need it). New functions buffer_shutr_now(), buffer_shutw_now(), buffer_abort() are provided to manipulate BF_SHUT flags. A new type "stream_interface" has been added to describe both sides of a buffer. A stream interface has states and error reporting. The session now has two stream interfaces (one per side). Each buffer has stream_interface pointers to both consumer and producer sides. The server-side file descriptor has moved to its stream interface, so that even the buffer has access to it. process_srv() has been split into three parts : - tcp_get_connection() obtains a connection to the server - tcp_connection_failed() tests if a previously attempted connection has succeeded or not. - process_srv_data() only manages the data phase, and in this sense should be roughly equivalent to process_cli. Little code has been removed, and a lot of old code has been left in comments for now.	2008-11-02 10:19:04 +01:00
Willy Tarreau	ffab5b4ab0	[MEDIUM] merge inspect_exp and txn->exp into request buffer Since we may have several analysers on a buffer, it's more convenient to have the analyser timeout attached to the buffer itself.	2008-08-17 18:03:28 +02:00
Willy Tarreau	2df28e8110	[MEDIUM] session: move the analysis bit field to the buffer It makes more sense to store the list of analysers in the buffer than in the session since they are precisely plugged onto one buffer.	2008-08-17 15:20:19 +02:00
Willy Tarreau	26ed74dadc	[MEDIUM] use buffer->wex instead of buffer->cex for connect timeout It's a shame not to use buffer->wex for connection timeouts since by definition it cannot be used till the connection is not established. Using it instead of ->cex also makes the buffer processing more symmetric.	2008-08-17 12:11:14 +02:00
Willy Tarreau	e393fe224b	[MEDIUM] buffers: add BF_EMPTY and BF_FULL to remove dependency on req/rep->l It is not always convenient to run checks on req->l in functions to check if a buffer is empty or full. Now the stream_sock functions set flags BF_EMPTY and BF_FULL according to the buffer contents. Of course, functions which touch the buffer contents adjust the flags too.	2008-08-16 22:18:07 +02:00
Willy Tarreau	ba392cecf9	[CLEANUP] get rid of BF_SHUT_PENDING BF_SHUTR_PENDING and BF_SHUTW_PENDING were poor ideas because BF_SHUTR is the pending of BF_SHUTW_DONE and BF_SHUTW is the pending of BF_SHUTR_DONE. Remove those two useless and confusing "pending" versions and rename buffer_shut{r,w}_ functions.	2008-08-16 21:13:23 +02:00
Willy Tarreau	f853320b44	[MINOR] term_trace: add better instrumentations to trace the code A new member has been added to the struct session. It keeps a trace of what block of code performs a close or a shutdown on a socket, and in what sequence. This is extremely convenient for post-mortem analysis where flag combinations and states seem impossible. A new ABORT_NOW() macro has also been added to make the code immediately segfault where called.	2008-08-16 14:55:08 +02:00
Willy Tarreau	adfb8569f7	[MAJOR] get rid of SV_STANALYZE (step 2) The SV_STANALYZE state was installed on the server side but was really meant to be processed with the rest of the request on the client side. It suffered from several issues, mostly related to the way timeouts were handled while waiting for data. All known issues related to timeouts during a request - and specifically a request involving body processing - have been raised and fixed. At this point, the code is a bit dirty but works fine, so next steps might be cleanups with an ability to come back to the current state in case of trouble.	2008-08-14 00:18:38 +02:00
Willy Tarreau	67f0eead22	[MAJOR] kill CL_STINSPECT and CL_STHEADERS (step 1) This is a first attempt at separating data processing from the TCP state machine. Those two states have been replaced with flags in the session indicating what needs to be analyzed. The corresponding code is still called before and in lieu of TCP states. Next change should get rid of the specific SV_STANALYZE which is in fact a client state. Then next change should consist in making it possible to analyze TCP contents while being in CL_STDATA (or CL_STSHUT*).	2008-08-14 00:18:38 +02:00
Willy Tarreau	89edf5e629	[MEDIUM] buffers: ensure buffer_shut* are properly called upon shutdowns It is important that buffer states reflect the state of both sides so that we can remove client and server state inter-dependencies.	2008-08-03 20:48:50 +02:00
Willy Tarreau	ec6c5df018	[CLEANUP] remove many #include <types/xxx> from C files It should be stated as a rule that a C file should never include types/xxx.h when proto/xxx.h exists, as it gives less exposure to declaration conflicts (one of which was caught and fixed here) and it complicates the file headers for nothing. Only types/global.h, types/capture.h and types/polling.h have been found to be valid includes from C files.	2008-07-16 10:30:42 +02:00
Willy Tarreau	284648e079	[CLEANUP] remove unused include/types/client.h This file is not used anymore.	2008-07-16 10:30:40 +02:00
Willy Tarreau	0c303eec87	[MAJOR] convert all expiration timers from timeval to ticks This is the first attempt at moving all internal parts from using struct timeval to integer ticks. Those provides simpler and faster code due to simplified operations, and this change also saved about 64 bytes per session. A new header file has been added : include/common/ticks.h. It is possible that some functions should finally not be inlined because they're used quite a lot (eg: tick_first, tick_add_ifset and tick_is_expired). More measurements are required in order to decide whether this is interesting or not. Some function and variable names are still subject to change for a better overall logics.	2008-07-07 00:09:58 +02:00
Willy Tarreau	91e99931b7	[MEDIUM] introduce task->nice and boot access to statistics The run queue scheduler now considers task->nice to queue a task and to pick a task out of the queue. This makes it possible to boost the access to statistics (both via HTTP and UNIX socket). The UNIX socket receives twice as much a boost as the HTTP socket because it is more sensible.	2008-06-30 07:51:00 +02:00
Willy Tarreau	9789f7bd68	[MAJOR] replace ultree with ebtree in wait-queues The ultree code has been removed in favor of a simpler and cleaner ebtree implementation. The eternity queue does not need to exist anymore, and the pool_tree64 has been removed. The ebtree node is stored in the task itself. The qlist list header is still used by the run-queue, but will be able to disappear once the run-queue uses ebtree too.	2008-06-24 08:17:16 +02:00
Willy Tarreau	7c669d7e0f	[BUG] fix the dequeuing logic to ensure that all requests get served The dequeuing logic was completely wrong. First, a task was assigned to all servers to process the queue, but this task was never scheduled and was only woken up on session free. Second, there was no reservation of server entries when a task was assigned a server. This means that as long as the task was not connected to the server, its presence was not accounted for. This was causing trouble when detecting whether or not a server had reached maxconn. Third, during a redispatch, a session could lose its place at the server's and get blocked because another session at the same moment would have stolen the entry. Fourth, the redispatch option did not work when maxqueue was reached for a server, and it was not possible to do so without indefinitely hanging a session. The root cause of all those problems was the lack of pre-reservation of connections at the server's, and the lack of tracking of servers during a redispatch. Everything relied on combinations of flags which could appear similarly in quite distinct situations. This patch is a major rework but there was no other solution, as the internal logic was deeply flawed. The resulting code is cleaner, more understandable, uses less magics and is overall more robust. As an added bonus, "option redispatch" now works when maxqueue has been reached on a server.	2008-06-20 15:08:06 +02:00
Willy Tarreau	39f7e6d516	[MEDIUM] fix stats socket limitation to 16 kB Due to the way the stats socket work, it was not possible to maintain the information related to the command entered, so after filling a whole buffer, the request was lost and it was considered that there was nothing to write anymore. The major reason was that some flags were passed directly during the first call to stats_dump_raw() instead of being stored persistently in the session. To definitely fix this problem, flags were added to the stats member of the session structure. A second problem appeared. When the stats were produced, a first call to client_retnclose() was performed, then one or multiple subsequent calls to buffer_write_chunks() were done. But once the stats buffer was full and a reschedule operated, the buffer was flushed, the write flag cleared from the buffer and nothing was done to re-arm it. For this reason, a check was added in the proto_uxst_stats() function in order to re-call the client FSM when data were added by stats_dump_raw(). Finally, the whole unix stats dump FSM was rewritten to avoid all the magics it depended on. It is now simpler and looks more like the HTTP one.	2008-03-17 22:08:01 +01:00
Krzysztof Piotr Oledzki	2c6962c3c0	[MAJOR] proto_uxst rework -> SNMP support Currently there is a ~16KB limit for a data size passed via unix socket. It is caused by a trivial bug ttat is going to fixed soon, however in most cases there is no need to dump a full stats. This patch makes possible to select a scope of dumped data by extending current "show stat" to "show stat [<iid> <type> <sid>]": - iid is a proxy id, -1 to dump all proxies - type selects type of dumpable objects: 1 for frontend, 2 for backend, 4 for server, -1 for all types. Values can be ORed, for example: 1+2=3 -> frontend+backend. 1+2+4=7 -> frontend+backend+server. - sid is a service id, -1 to dump everything from the selected proxy. To do this I implemented a new session flag (SN_STAT_BOUND), added three variables in data_ctx.stats (iid, type, sid), modified dumpstats.c and completely revorked the process_uxst_stats: now it waits for a "\n" terminated string, splits args and uses them. BTW: It should be quite easy to add new commands, for example to enable/disable servers, the only problem I can see is a not very lucky config name (stats socket). :\| During the work I also fixed two bug: - s->flags were not initialized for proto_uxst - missing comma if throttling not enabled (caused by a stupid change in "Implement persistent id for proxies and servers") Other changes: - No more magic type valuse, use STATS_TYPE_FE/STATS_TYPE_BE/STATS_TYPE_SV - Don't memset full s->data_ctx (it was clearing s->data_ctx.stats.{iid/type/sid}, instead initialize stats.sv & stats.sv_st (stats.px and stats.px_st were already initialized) With all that changes it was extremely easy to write a short perl plugin for a perl-enabled net-snmp (also included in this patch). 29385 is my PEN (Private Enterprise Number) and I'm willing to donate the SNMPv2-SMI::enterprises.29385.106.* OIDs for HAProxy if there is nothing assigned already.	2008-03-04 06:32:16 +01:00
Willy Tarreau	d6f087ea1c	[BUG] fix truncated responses with sepoll Due to the way Linux delivers EPOLLIN and EPOLLHUP, a closed connection received after some server data sometimes results in truncated responses if the client disconnects before server starts to respond. The reason is that the EPOLLHUP flag is processed as an indication of end of transfer while some data may remain in the system's socket buffers. This problem could only be triggered with sepoll, although nothing should prevent it from happening with normal epoll. In fact, the work factoring performed by sepoll increases the risk that this bug appears. The fix consists in making FD_POLL_HUP and FD_POLL_ERR sticky and that they are only checked if FD_POLL_IN is not set, meaning that we have read all pending data. That way, the problem is definitely fixed and sepoll still remains about 17% faster than epoll since it can take into account all information returned by the kernel.	2008-01-18 17:20:13 +01:00
Willy Tarreau	a8efd362b2	[STATS] add support for "show info" on the unix socket It is sometimes required to know some informations such as the process uptime when consulting statistics. This patch adds the "show info" command to query those informations on the UNIX socket.	2008-01-03 10:19:15 +01:00
Krzysztof Piotr Oledzki	583bc96606	[MEDIUM] continous statistics By default, counters used for statistics calculation are incremented only when a session finishes. It works quite well when serving small objects, but with big ones (for example large images or archives) or with A/V streaming, a graph generated from haproxy counters looks like a hedgehog. This patch implements a contstats (continous statistics) option. When set counters get incremented continuously, during a whole session. Recounting touches a hotpath directly so it is not enabled by default, as it has small performance impact (~0.5%).	2007-11-26 20:21:47 +01:00
Willy Tarreau	1a64d16720	[MINOR] add a generic delete_listener() primitive Most protocols will be able to share a single delete_listener() primitive. Provide it in protocols.c, and remove the specific version from proto_uxst.	2007-11-04 22:42:49 +01:00
Willy Tarreau	8eebe5ea40	[MEDIUM] unbind_listener() must use fd_delete() and not close() It is important that unbind_listener() calls fd_delete() to remove a file descriptor, because only this one can update the fdtab and the maxfd entries.	2007-11-04 22:42:48 +01:00
Willy Tarreau	dabf2e2647	[MAJOR] added a new state to listeners There was a missing state for listeners, when they are not listening but still attached to the protocol. The LI_ASSIGNED state was added for this purpose. This permitted to clean up the assignment/release workflow quite a bit. Generic enable/enable_all/disable/disable_all primitives were added, and a disable_all entry was added to the struct protocol.	2007-11-04 22:42:48 +01:00
Willy Tarreau	106bf274c4	[MINOR] add socket address length to the protocols The protocol struct can be more useful if it also provides address lengths. Add sock_addrlen, as used by bind(), as well as l3_addrlen for hashes.	2007-10-28 12:09:45 +01:00
Willy Tarreau	d740babd0e	[MINOR] move error codes to common/errors.h It's useful to be able to share error codes between C files, so move the codes currently only used in protocols to a generic file.	2007-10-28 11:14:07 +01:00
Willy Tarreau	10ae548052	[BUG] fix off-by-one in path length in destroy_uxst_socket() An off-by-one error was left in the computation of the unix socket path.	2007-10-18 16:15:52 +02:00
Willy Tarreau	3e76e728ce	[MEDIUM] implement the statistics output on a unix socket A unix socket can now access the statistics. It currently only recognizes the "show stat\n" command at the beginning of the input, then returns the statistics in CSV format.	2007-10-18 14:13:13 +02:00
Willy Tarreau	e6ad2b165e	[MINOR] make it possible to set unix socket permissions Under most systems, it is possible to set permissions on unix sockets. This has been added to the listeners and to unix sockets.	2007-10-18 14:11:55 +02:00
Willy Tarreau	92fb9836ee	[MAJOR] implemented client-side support for PF_UNIX sockets A new file, proto_uxst.c, implements support of PF_UNIX sockets of type SOCK_STREAM. It relies on generic stream_sock_read/write and uses its own accept primitive which also tries to be generic. Right now it only implements an echo service in sight of a general support for start dumping via unix socket. The echo code is more of a proof of concept than useful code.	2007-10-18 14:11:15 +02:00

1 2

99 Commits