haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-13 02:26:56 +02:00

Author	SHA1	Message	Date
Willy Tarreau	4f0c64cad7	MINOR: session: release the listener with the session, not the stream Since multiple streams can share one session attached to one listener, the listener_release() call must be done in session_free() and not in stream_free(), otherwise we end up with a negative count in H2.	2017-10-31 18:03:24 +01:00
Willy Tarreau	436d333124	MEDIUM: connection: add a destroy callback This callback will be used to release upper layers when a mux is in use. Given that the mux can be asynchronously deleted, we need a way to release the extra information such as the session. This callback will be called directly by the mux upon releasing everything and before the connection itself is released, so that the callee can find its information inside the connection if needed. The way it currently works is not perfect, and most likely this should instead become a mux release callback, but for now we have no easy way to add mux-specific stuff, and since there's one mux per connection, it works fine this way.	2017-10-31 18:03:24 +01:00
Willy Tarreau	2e0b2b5f83	MEDIUM: session: use the ALPN token and proxy mode to select the mux When an incoming connection is made on an HTTP mode frontend, the session now looks up the mux to use based on the ALPN token and the proxy mode. This will allow easier mux registration, and we don't need to hard-code the mux_pt_ops anymore.	2017-10-31 18:03:23 +01:00
Willy Tarreau	53a4766e40	MEDIUM: connection: start to introduce a mux layer between xprt and data For HTTP/2 and QUIC, we'll need to deal with multiplexed streams inside a connection. After quite a long brainstorming, it appears that the connection interface to the existing streams is appropriate just like the connection interface to the lower layers. In fact we need to have the mux layer in the middle of the connection, between the transport and the data layer. A mux can exist on two directions/sides. On the inbound direction, it instantiates new streams from incoming connections, while on the outbound direction it muxes streams into outgoing connections. The difference is visible on the mux->init() call : in one case, an upper context is already known (outgoing connection), and in the other case, the upper context is not yet known (incoming connection) and will have to be allocated by the mux. The session doesn't have to create the new streams anymore, as this is performed by the mux itself. This patch introduces this and creates a pass-through mux called "mux_pt" which is used for all new connections and which only calls the data layer's recv,send,wake() calls. One incoming stream is immediately created when init() is called on the inbound direction. There should not be any visible impact. Note that the connection's mux is purposely not set until the session is completed so that we don't accidentally run with the wrong mux. This must not cause any issue as the xprt_done_cb function is always called prior to using mux's recv/send functions.	2017-10-31 18:03:23 +01:00
Christopher Faulet	ff8abcd31d	MEDIUM: threads/proxy: Add a lock per proxy and atomically update proxy vars Now, each proxy contains a lock that must be used when necessary to protect it. Moreover, all proxy's counters are now updated using atomic operations.	2017-10-31 13:58:30 +01:00
Christopher Faulet	8d8aa0d681	MEDIUM: threads/listeners: Make listeners thread-safe First, we use atomic operations to update jobs/totalconn/actconn variables, listener's nbconn variable and listener's counters. Then we add a lock on listeners to protect access to their information. And finally, listener queues (global and per proxy) are also protected by a lock. Here, because access to these queues is unusual, we use the same lock for all queues instead of a global one for the global queue and a lock per proxy for others.	2017-10-31 13:58:30 +01:00
Emeric Brun	c60def8368	MAJOR: threads/task: handle multithread on task scheduler 2 global locks have been added to protect, respectively, the run queue and the wait queue. And a process mask has been added on each task. Like for FDs, this mask is used to know which threads are allowed to process a task. For many tasks, all threads are granted. And this must be your first intention when you create a new task, unless you have a good reason to make a task sticky on some threads. It is then the responsibility of the process callback to lock what has to be locked in the task context. Nevertheless, all tasks linked to a session must be sticky on the thread creating the session. It is important that I/O handlers processing session FDs and these tasks run on the same thread to avoid conflicts.	2017-10-31 13:58:30 +01:00
Olivier Houchard	c2aae74f01	MEDIUM: ssl: Handle early data with OpenSSL 1.1.1 When compiled with OpenSSL >= 1.1.1, before attempting to do the handshake, try to read any early data. If any early data is present, then we'll create the session, read the data, and handle the request before we're doing the handshake. For this, we add a new connection flag, CO_FL_EARLY_SSL_HS, which is not part of the CO_FL_HANDSHAKE set, allowing to proceed with a session even before an SSL handshake is completed. As early data do have security implications, we let the origin server know the request comes from early data by adding the "Early-Data" header, as specified in this draft from the HTTP working group : https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-replay	2017-10-27 10:54:05 +02:00
Willy Tarreau	5b78a9dd04	MINOR: session: use conn_full_close() instead of conn_force_close() We simply disable tracking before calling it.	2017-10-22 09:54:17 +02:00
Olivier Houchard	1a0545f3d7	REORG: connection: rename CO_FL_DATA_* -> CO_FL_XPRT_* These flags are not exactly for the data layer, they instead indicate what is expected from the transport layer. Since we're going to split the connection between the transport and the data layers to insert a mux layer, it's important to have a clear idea of what each layer does. All function conn_data_* used to manipulate these flags were renamed to conn_xprt_*.	2017-10-22 09:54:15 +02:00
Willy Tarreau	bf08beb2a3	MINOR: session: remove the list of streams from struct session Commit `bcb86ab` ("MINOR: session: add a streams field to the session struct") added this list of streams that is not needed anymore. Let's get rid of it now.	2017-10-08 22:32:05 +02:00
Willy Tarreau	0bf6fa5e40	MEDIUM: session: count the frontend's connections at a single place There are several places where we see feconn++, feconn--, totalconn++ and an increment on the frontend's number of connections and connection rate. This is done exactly once per session in each direction, so better take care of this counter in the session and simplify the callers. At least it ensures a better symmetry. It also ensures consistency as till now the lua/spoe/peers frontend didn't have these counters properly set, which can be useful at least for troubleshooting.	2017-09-15 11:49:52 +02:00
Willy Tarreau	0c4ed35225	MEDIUM: session: factor out duplicated code for conn_complete_session session_accept_fd() may either successfully complete a session creation, or defer it to conn_complete_session() depending on whether a handshake remains to be performed or not. The problem is that all the code after the handshake was duplicated between the two functions. This patch makes session_accept_fd() synchronously call conn_complete_session() to finish the session creation. It is only needed to check if the session's task has to be released or not at the end, which is fairly minimal. This way there is now a single place where the sessions are created.	2017-09-15 11:49:52 +02:00
Willy Tarreau	eaa7e44ad7	MINOR: session: small cleanup of conn_complete_session() Commit `8e3c6ce` ("MEDIUM: connection: get rid of data->init() which was not for data") simplified conn_complete_session() but introduced a confusing check which cannot happen on CO_FL_HANDSHAKE. Make it clear that this call is final and will either succeed and complete the session or fail.	2017-09-15 11:49:52 +02:00
Willy Tarreau	05f5047d40	MINOR: listener: new function listener_release Instead of duplicating some sensitive listener-specific code in the session and in the stream code, let's call listener_release() when releasing a connection attached to a listener.	2017-09-15 11:49:52 +02:00
Willy Tarreau	6f5e4b98df	MEDIUM: session: take care of incrementing/decrementing jobs Each user of a session increments/decrements the jobs variable at its own place, resulting in a real mess and inconsistencies between them. Let's have session_new() increment jobs and session_free() decrement it.	2017-09-15 11:49:52 +02:00
Willy Tarreau	5790eb0a76	MINOR: stream: provide a new stream creation function for connections The purpose will be to create new streams for a given connection so that we can later abstract this from a mux.	2017-08-30 07:06:39 +02:00
Willy Tarreau	0b74eae1f1	MEDIUM: session: add a pointer to a struct task in the session The session may need to enforce a timeout when waiting for a handshake. Till now we used a trick to avoid allocating a pointer, we used to set the connection's owner to the task and set the task's context to the session, so that it was possible to circle between all of them. The problem is that we'll really need to pass the pointer to the session to the upper layers during initialization and that the only place to store it is conn->owner, which is squatted for this trick. So this patch moves the struct task* into the session where it should always have been and ensures conn->owner points to the session until the data layer is properly initialized.	2017-08-30 07:05:49 +02:00
Willy Tarreau	87787acf72	MEDIUM: stream: make stream_new() allocate its own task Currently a task is allocated in session_new() and serves two purposes : - either the handshake is complete and it is offered to the stream via the second arg of stream_new() - or the handshake is not complete and it's diverted to be used as a timeout handler for the embryonic session and repurposed once we land into conn_complete_session() Furthermore, the task's process() function was taken from the listener's handler in conn_complete_session() prior to being replaced by a call to stream_new(). This will become a serious mess with the mux. Since it's impossible to have a stream without a task, this patch removes the second arg from stream_new() and makes this function allocate its own task. In session_accept_fd(), we now only allocate the task if needed for the embryonic session and delete it later.	2017-08-30 07:05:04 +02:00
Willy Tarreau	8e3c6ce75a	MEDIUM: connection: get rid of data->init() which was not for data The ->init() callback of the connection's data layer was only used to complete the session's initialisation since sessions and streams were split apart in 1.6. The problem is that it creates a big confusion in the layers' roles as the session has to register a dummy data layer when waiting for a handshake to complete, then hand it off to the stream which will replace it. The real need is to notify that the transport has finished initializing. This should enable a better splitting between these layers. This patch thus introduces a connection-specific callback called xprt_done_cb() which informs about handshake successes or failures. With this, data->init() can disappear, CO_FL_INIT_DATA as well, and we don't need to register a dummy data->wake() callback to be notified of errors.	2017-08-30 07:04:04 +02:00
Willy Tarreau	585744bf2e	REORG/MEDIUM: connection: introduce the notion of connection handle Till now connections used to rely exclusively on file descriptors. It was planned in the past that alternative solutions would be implemented, leading to member "union t" presenting sock.fd only for now. With QUIC, the connection will need to continue to exist but will not rely on a file descriptor but a connection ID. So this patch introduces a "connection handle" which is either a file descriptor or a connection ID, to replace the existing "union t". We've now removed the intermediate "struct sock" which was never used. There is no functional change at all, though the struct connection was inflated by 32 bits on 64-bit platforms due to alignment.	2017-08-24 19:30:04 +02:00
Willy Tarreau	f92a73d2fc	MEDIUM: session: do not free a session until no stream references it We now refrain from clearing a session's variables, counters, and from releasing it as long as at least one stream references it. For now it never happens but with H2 this will be mandatory to avoid double frees.	2017-08-18 13:26:35 +02:00
Willy Tarreau	bcb86abaca	MINOR: session: add a streams field to the session struct This will be used to hold the list of streams belonging to a given session.	2017-08-18 13:26:35 +02:00
Willy Tarreau	9b82d941c5	MEDIUM: stream: make stream_new() always set the target and analysers It doesn't make sense that stream_new() doesn't set the target nor analysers and that the caller has to do it even if it doesn't know about streams (eg: in session_accept_fd()). This causes trouble for H2 where the applet handling the protocol cannot properly change this information during its init phase. Let's ensure it's always set and that the callers don't set it anymore. Note: peers and lua don't use analysers and that's properly handled.	2017-06-27 14:38:02 +02:00
Emeric Brun	5f77fef34e	MINOR: task/stream: tasks related to a stream must be init by the caller. The task_wakeup was called on stream_new, but the task/stream wasn't fully initialized yet. The task_wakeup must be called explicitly by the caller once the task/stream is initialized.	2017-06-27 14:38:02 +02:00
Willy Tarreau	de40d798de	CLEANUP: connection: completely remove CO_FL_WAKE_DATA Since it's only set and never tested anymore, let's remove it.	2017-03-19 12:18:27 +01:00
Willy Tarreau	a261e9b094	CLEANUP: connection: remove all direct references to raw_sock and ssl_sock Now we exclusively use xprt_get(XPRT_RAW) instead of &raw_sock or xprt_get(XPRT_SSL) for &ssl_sock. This removes a bunch of #ifdef and include spread over a number of location including backend, cfgparse, checks, cli, hlua, log, server and session.	2016-12-22 23:26:38 +01:00
Willy Tarreau	c95bad5013	MEDIUM: move listener->frontend to bind_conf->frontend Historically, all listeners have a pointer to the frontend. But since the introduction of SSL, we now have an intermediary layer called bind_conf corresponding to a "bind" line. It makes no sense to have the frontend on each listener given that it's the same for all listeners belonging to a same bind_conf. Also certain parts like SSL can only operate on bind_conf and need the frontend. This patch fixes this by moving the frontend pointer from the listener to the bind_conf. The extra indirection is quite cheap and the places where this is used are very scarce.	2016-12-22 23:26:38 +01:00
Willy Tarreau	71a8c7c49e	MINOR: listener: move the transport layer pointer to the bind_conf A mistake was made when the socket layer was cut into proto and transport, the transport was attached to the listener while all listeners in a single "bind" line always have exactly the same transport. It doesn't seem obvious but this is the reason why there are so many #ifdefs USE_OPENSSL in cfgparse : a lot of operations have to be open-coded because cfgparse only manipulates bind_conf and we don't have the information of the transport layer here. Very little code makes use of the transport layer, mainly session setup and log. These places can afford an extra pointer indirection (the listener points to the bind_conf). This change is thus very small, it saves a little bit of memory (8B per listener) and makes the code more flexible.	2016-12-22 23:26:37 +01:00
Willy Tarreau	92b10c954d	BUG/MAJOR: stream: fix session abort on resource shortage In 1.6-dev2, commit `32990b5` ("MEDIUM: session: remove the task pointer from the session") introduced a bug which can sometimes crash the process on resource shortage. When stream_complete() returns -1, it has already reattached the connection to the stream, then kill_mini_session() is called and still expects to find the task in conn->owner. Note that since this commit, the code has moved a bit and is now in stream_new() but the problem remains the same. Given that we already know the task around these places, let's simply pass the task to kill_mini_session(). The conditions currently at risk are : - failure to initialize filters for the new stream (lack of memory or any filter returning < 0 on attach()) - failure to attach filters (any filter returning < 0 on stream_start()) - frontend's accept() returning < 0 (allocation failure) This fix is needed in 1.7 and 1.6.	2016-12-04 20:16:52 +01:00
Willy Tarreau	397131093f	REORG: tcp-rules: move tcp rules processing to their own file There's no more reason to keep tcp rules processing inside proto_tcp.c given that there is nothing in common there except these 3 letters : tcp. The tcp rules are in fact connection, session and content processing rules. Let's move them to "tcp-rules" and let them live their life there.	2016-11-25 15:57:38 +01:00
Willy Tarreau	8e0bb0ae16	MINOR: connection: add names for transport and data layers This makes debugging easier and avoids having to put ugly checks against certain well-known internal struct pointers.	2016-11-24 16:58:12 +01:00
Willy Tarreau	620408f406	MEDIUM: tcp: add registration and processing of TCP L5 rules This commit introduces "tcp-request session" rules. These are very much like "tcp-request connection" rules except that they're processed after the handshake, so it is possible to consider SSL information and addresses rewritten by the proxy protocol header in actions. This is particularly useful to track proxied sources as this was not possible before, given that tcp-request content rules are processed after each HTTP request. Similarly it is possible to assign the proxied source address or the client's cert to a variable.	2016-10-21 18:19:24 +02:00
Willy Tarreau	7d9736fb5d	CLEANUP: tcp rules: mention everywhere that tcp-conn rules are L4 This is in order to make integration of tcp-request-session cleaner : - tcp_exec_req_rules() was renamed tcp_exec_l4_rules() - LI_O_TCP_RULES was renamed LI_O_TCP_L4_RULES (LI_O_*'s horrible indent was also fixed and a provision was left for L5 rules).	2016-10-21 18:19:24 +02:00
Bertrand Jacquin	93b227db95	MINOR: listener: add the "accept-netscaler-cip" option to the "bind" keyword When a NetScaler application switch is used as an L3+ switch, information regarding the original IP and TCP headers is lost as a new TCP connection is created between the NetScaler and the backend server. NetScaler provides a feature to insert in the TCP data the original data that can then be consumed by the backend server. Specifications and documentation from NetScaler: https://support.citrix.com/article/CTX205670 https://www.citrix.com/blogs/2016/04/25/how-to-enable-client-ip-in-tcpip-option-of-netscaler/ When CIP is enabled on the NetScaler, then a TCP packet is inserted just after the TCP handshake. This is composed as: - CIP magic number : 4 bytes Both sender and receiver have to agree on a magic number so that they both handle the incoming data as a NetScaler Client IP insertion packet. - Header length : 4 bytes Defines the length of the remaining data. - IP header : >= 20 bytes if IPv4, 40 bytes if IPv6 Contains the header of the last IP packet sent by the client during TCP handshake. - TCP header : >= 20 bytes Contains the header of the last TCP packet sent by the client during TCP handshake.	2016-06-20 23:02:47 +02:00
Christopher Faulet	d7c9196ae5	MAJOR: filters: Add filters support This patch adds the support of filters in HAProxy. The main idea is to have a way to "easily" extend HAProxy by adding some "modules", called filters, that will be able to change HAProxy behavior in a programmatic way. To do so, many entry points have been added in code to let filters hook into different steps of the processing. A filter must define a flt_ops structure (see include/types/filters.h for details). This structure contains all available callbacks that a filter can define: struct flt_ops { /* Callbacks to manage the filter lifecycle */ int (*init) (struct proxy *p); void (*deinit)(struct proxy *p); int (*check) (struct proxy *p); /* Stream callbacks */ void (*stream_start) (struct stream *s); void (*stream_accept) (struct stream *s); void (*session_establish)(struct stream *s); void (*stream_stop) (struct stream *s); /* HTTP callbacks */ int (*http_start) (struct stream *s, struct http_msg *msg); int (*http_start_body) (struct stream *s, struct http_msg *msg); int (*http_start_chunk) (struct stream *s, struct http_msg *msg); int (*http_data) (struct stream *s, struct http_msg *msg); int (*http_last_chunk) (struct stream *s, struct http_msg *msg); int (*http_end_chunk) (struct stream *s, struct http_msg *msg); int (*http_chunk_trailers)(struct stream *s, struct http_msg *msg); int (*http_end_body) (struct stream *s, struct http_msg *msg); void (*http_end) (struct stream *s, struct http_msg *msg); void (*http_reset) (struct stream *s, struct http_msg *msg); int (*http_pre_process) (struct stream *s, struct http_msg *msg); int (*http_post_process) (struct stream *s, struct http_msg *msg); void (*http_reply) (struct stream *s, short status, const struct chunk *msg); }; To declare and use a filter, in the configuration, the "filter" keyword must be used in a listener/frontend section: frontend test ... filter <FILTER-NAME> [OPTIONS...] 
The filter referenced by the <FILTER-NAME> must declare a configuration parser on its own name to fill flt_ops and filter_conf field in the proxy's structure. An example will be provided later to make it perfectly clear. For now, filters cannot be used in backend section. But this is only a matter of time. Documentation will also be added later. This is the first commit of a long list about filters. It is possible to have several filters on the same listener/frontend. These filters are stored in an array of at most MAX_FILTERS elements (defined in include/types/filters.h). Again, this will be replaced later by a list of filters. The filter API has been highly refactored. Main changes are: * Now, HAProxy supports an infinite number of filters per proxy. To do so, filters are stored in a list. * Because filters are stored in a list, filters state has been moved from the channel structure to the filter structure. This is cleaner because there is no more info about filters in channel structure. * It is possible to define filters on backends only. For such filters, stream_start/stream_stop callbacks are not called. Of course, it is possible to mix frontend and backend filters. * Now, TCP streams are also filtered. All callbacks without the 'http_' prefix are called for all kinds of streams. In addition, 2 new callbacks were added to filter data exchanged through a TCP stream: - tcp_data: it is called when new data are available or when old unprocessed data are still waiting. - tcp_forward_data: it is called when some data can be consumed. * New callbacks attached to channel were added: - channel_start_analyze: it is called when a filter is ready to process data exchanged through a channel. 2 new analyzers (a frontend and a backend) are attached to channels to call this callback. For a frontend filter, it is called before any other analyzer. For a backend filter, it is called when a backend is attached to a stream. So some processing cannot be filtered in that case. 
- channel_analyze: it is called before each analyzer attached to a channel, except analyzers responsible for data sending. - channel_end_analyze: it is called when all other analyzers have finished their processing. A new analyzer is attached to channels to call this callback. For a TCP stream, this is always the last one called. For a HTTP one, the callback is called when a request/response ends, so it is called one time for each request/response. * 'session_established' callback has been removed. Everything that is done in this callback can be handled by 'channel_start_analyze' on the response channel. * 'http_pre_process' and 'http_post_process' callbacks have been replaced by 'channel_analyze'. * 'http_start' callback has been replaced by 'http_headers'. This new one is called just before headers sending and parsing of the body. * 'http_end' callback has been replaced by 'channel_end_analyze'. * It is possible to set a forwarder for TCP channels. It was already possible to do it for HTTP ones. * Forwarders can partially consume forwardable data. For this reason a new HTTP message state was added before HTTP_MSG_DONE : HTTP_MSG_ENDING. Now all filters can define corresponding callbacks (http_forward_data and tcp_forward_data). Each filter owns 2 offsets relative to buf->p, next and forward, to track, respectively, input data already parsed but not forwarded yet by the filter and parsed data considered as forwarded by the filter. At any time, we have the guarantee that a filter cannot parse or forward more input than previous ones. And, of course, it cannot forward more input than it has parsed. 2 macros have been added to retrieve these offsets: FLT_NXT and FLT_FWD. In addition, 2 functions have been added to change the 'next size' and the 'forward size' of a filter. When a filter parses input data, it can alter these data, so the size of these data can vary. This action has an effect on all previous filters that must be handled. 
To do so, the function 'filter_change_next_size' must be called, passing the size variation. In the same spirit, if a filter alters forwarded data, it must call the function 'filter_change_forward_size'. 'filter_change_next_size' can be called in 'http_data' and 'tcp_data' callbacks and only these ones. And 'filter_change_forward_size' can be called in 'http_forward_data' and 'tcp_forward_data' callbacks and only these ones. The data changes are the filter's responsibility, but with some limitations. It must not change already parsed/forwarded data or data that previous filters have not parsed/forwarded yet. Because filters can be used on backends, when the backend is set for a stream, we add filters defined for this backend in the filter list of the stream. But we must only do that when the backend and the frontend of the stream are not the same. Else same filters are added a second time leading to undefined behavior. The HTTP compression code had to be moved. So it simplifies http_response_forward_body function. To do so, the way the data are forwarded has changed. Now, a filter (and only one) can forward data. In a commit to come, this limitation will be removed to let all filters take part in data forwarding. There are 2 new functions that filters should use to deal with this feature: * flt_set_http_data_forwarder: This function sets the filter (using its id) that will forward data for the specified HTTP message. It is possible if it was not already set by another filter _AND_ if no data was yet forwarded (msg->msg_state <= HTTP_MSG_BODY). It returns -1 if an error occurs. * flt_http_data_forwarder: This function returns the filter id that will forward data for the specified HTTP message. If there is no forwarder set, it returns -1. When an HTTP data forwarder is set for the response, the HTTP compression is disabled. Of course, this is not definitive.	2016-02-09 14:53:15 +01:00
Willy Tarreau	ebcd4844e8	MEDIUM: vars: move the session variables to the session, not the stream It's important that the session-wide variables are in the session and not in the stream.	2015-06-19 11:59:02 +02:00
Willy Tarreau	73b65acd46	MINOR: stream: pass the pointer to the origin explicitly to stream_new() We don't pass sess->origin anymore but the pointer to the previous step. Now it should be much easier to chain elements together once applets are moved out of streams. Indeed, the session is only used for configuration and not for the dynamic chaining anymore.	2015-04-08 18:26:29 +02:00
Willy Tarreau	678be62981	MEDIUM: session: adjust the connection flags before stream_new() It's not the stream's job to manipulate the connection's flags, it's more related to the session that accepted the new connection. And the only case where we have to do it conditionally is based on the frontend which is known from the session, thus it makes sense to do it there.	2015-04-08 18:18:15 +02:00
Willy Tarreau	042cd75bc2	MINOR: session: maintain the session count stats in the session, not the stream This has nothing to do in the stream, as we'll face absurdities when chaining multiple streams. The session is where it must be accounted for.	2015-04-08 18:10:49 +02:00
Willy Tarreau	d1769b8b9a	MEDIUM: stream: don't rely on the session's listener anymore in stream_new() When the stream is instantiated from an applet, it doesn't necessarily have a listener. The listener was sparsely used there, just to retrieve the task function, update the listeners' stats, and set the analysers and default target, both of which are often zero from applets. Thus these elements are now initialized with default values that the caller is free to change if desired.	2015-04-06 11:37:35 +02:00
Willy Tarreau	f9d1bc6d9a	MEDIUM: frontend: move the fd-specific settings to session_accept_fd() The frontend is generic and does not depend on a file descriptor, so applying some socket options to the incoming fd is not its role. Let's move the setsockopt() calls earlier in session_accept_fd() where others are done as well.	2015-04-06 11:37:35 +02:00
Willy Tarreau	02d863866d	MEDIUM: stream: return the stream upon accept() The function was called stream_accept_session(), let's rename it stream_new() and make it return the newly allocated pointer. It's more convenient for some callers who need it.	2015-04-06 11:37:34 +02:00
Willy Tarreau	18b95a4b27	MINOR: session: set the CO_FL_CONNECTED flag on the connection once ready If we know there's no handshake, we must set the flag on the connection, it's not the job of the stream initializer to do it.	2015-04-06 11:37:33 +02:00
Willy Tarreau	64beab202c	MINOR: session: make use of session_new() when creating a new session It's better than open-coding it.	2015-04-06 11:37:33 +02:00
Willy Tarreau	c38f71cfcd	MINOR: session: introduce session_new() This one creates a new session and does the minimum initialization.	2015-04-06 11:37:33 +02:00
Willy Tarreau	9903f0e1a2	REORG: session: move the session parts out of stream.c This concerns everything related to accepting a new session and expiring the embryonic session. There's still a hard-coded call to stream_accept_session() which could be set somewhere in the frontend, but for now it's not a problem.	2015-04-06 11:37:32 +02:00
Willy Tarreau	bb2ef12a60	MEDIUM: session: update the session's stick counters upon session_free() Whenever session_free() is called, any possible stick counter stored in the session will be synchronized.	2015-04-06 11:37:31 +02:00
Willy Tarreau	11c3624c32	MINOR: session: implement session_free() and use it everywhere We want to call this one everywhere we have to kill a session so that future parts we move to the session can be released from there.	2015-04-06 11:37:30 +02:00
Willy Tarreau	b1ec8c4a59	MINOR: session: start to reintroduce struct session There is now a pointer to the session in the stream, which is NULL for now. The session pool is created as well. Some parts will move from the stream to the session now.	2015-04-06 11:23:57 +02:00
Willy Tarreau	87b09668be	REORG/MAJOR: session: rename the "session" entity to "stream" With HTTP/2, we'll have to support multiplexed streams. A stream is in fact the largest part of what we currently call a session, it has buffers, logs, etc. In order to catch any error, this commit removes any reference to the struct session and tries to rename most "session" occurrences in function names to "stream" and "sess" to "strm" when that's related to a session. The files stream.{c,h} were added and session.{c,h} removed. The session will be reintroduced later and a few parts of the stream will progressively be moved over there. It will more or less contain only what we need in an embryonic session. Sample fetch functions and converters will have to change a bit so that they'll use an L5 (session) instead of what's currently called "L4" which is in fact L6 for now. Once all changes are completed, we should see approximately this : L7 - http_txn L6 - stream L5 - session L4 - connection \| applet There will be at most one http_txn per stream, and the same session will possibly be referenced by multiple streams. A connection will point to a session and to a stream. The session will hold all the information we need to keep even when we don't yet have a stream. Some more cleanup is needed because some code was already far from being clean. The server queue management still refers to sessions at many places while comments talk about connections. This will have to be cleaned up once we have a server-side connection pool manager. Stream flags "SN_*" still need to be renamed, it doesn't seem like any of them will need to move to the session.	2015-04-06 11:23:56 +02:00
Willy Tarreau	10b688f2b4	MEDIUM: listener: store the default target per listener This will be useful later to state that some listeners have to use certain decoders (typically an HTTP/2 decoder) regardless of the regular processing applied to other listeners. For now it simply defaults to the frontend's default target, and it is used by the session.	2015-03-13 16:45:37 +01:00
Willy Tarreau	f87ab94e3b	MINOR: proxy: store the default target into the frontend's configuration Some services such as peers and CLI pre-set the target applet immediately during accept(), and for this reason they're forced to have a dedicated accept() function which does not even properly follow everything the regular one does (eg: sndbuf/rcvbuf/linger/nodelay are not set, etc). Let's store the default target when known into the frontend's config so that it's session_accept() which automatically sets it.	2015-03-13 16:23:00 +01:00
Willy Tarreau	78955f4c8b	MEDIUM: session: simplify receive buffer allocator to only use the channel Now that we can get the session from the channel, let's simplify the prototype of session_alloc_recv_buffer() to only require the channel. Both the caller and the function are now simplified.	2015-03-11 20:41:47 +01:00
Willy Tarreau	103197d597	CLEANUP: session: don't use si_{ic,oc} when we know the session. During the connection establishment, we needlessly rely on pointer dereferences.	2015-03-11 20:41:47 +01:00
Willy Tarreau	7b8c4f9661	CLEANUP: session: don't needlessly pass a pointer to the stream-int All functions dealing with connection establishment currently use a pointer to the stream interface. Now we know it cannot change and is always s->si[1].	2015-03-11 20:41:47 +01:00
Willy Tarreau	8f128b41ec	CLEANUP: session: use local variables to access channels / stream ints In process_session, we had around 300 accesses to channels and stream-ints from the session. Not only this inflates the code due to the large offsets from the original pointer, but readability can be improved. Let's have 4 local variables for the channels and stream-ints.	2015-03-11 20:41:47 +01:00
Willy Tarreau	350f487300	CLEANUP: session: simplify references to chn_{prod,cons}(&s->{req,res}) These 4 combinations are needlessly complicated since the session already has direct access to the associated stream interfaces without having to check an indirect pointer.	2015-03-11 20:41:47 +01:00
Willy Tarreau	81cd90069a	MEDIUM: channel: remove now unused ->prod and ->cons pointers Nothing uses them anymore.	2015-03-11 20:41:47 +01:00
Willy Tarreau	ef573c0f22	MEDIUM: channel: add a new flag "CF_ISRESP" for the response channel This flag designates the response channel. This will be used to know what channel we're seeing and finding our way back to the session.	2015-03-11 20:41:47 +01:00
Willy Tarreau	73796535a9	REORG/MEDIUM: channel: only use chn_prod / chn_cons to find stream-interfaces The purpose of these two macros will be to pass via the session to find the relevant stream interfaces so that we don't need to store the ->cons nor ->prod pointers anymore. Currently they're only defined so that all references could be removed. Note that many places need a second pass of clean up so that we don't have any chn_prod(&s->req) anymore and only &s->si[0] instead, and conversely for the 3 other cases.	2015-03-11 20:41:47 +01:00
Willy Tarreau	819d332dfd	MEDIUM: stream-int: remove any reference to the owner si->owner is not used anymore now, so let's remove any reference to it.	2015-03-11 20:41:46 +01:00
Willy Tarreau	07373b8660	MEDIUM: stream-int: use si_task() to retrieve the task from the stream int We go back to the session to get the owner. Here again it's very easy and is just a matter of relative offsets. Since the owner always exists and always points to the session's task, we can remove some unneeded tests.	2015-03-11 20:41:46 +01:00
Willy Tarreau	a2df3fa251	MEDIUM: stream-interface: remove now unused pointers to channels Everyone must now use si_ic() / si_oc() to find the relevant channels, the pointers have been totally removed.	2015-03-11 20:41:46 +01:00
Willy Tarreau	a5f5d8dc69	MEDIUM: stream-int: add a flag indicating which side the SI is on This new flag "SI_FL_ISBACK" is set only on the back SI and is cleared on the front SI. That way it's possible only by looking at the SI to know what side it is.	2015-03-11 20:41:46 +01:00
Willy Tarreau	2bb4a96f8f	REORG/MEDIUM: stream-int: introduce si_ic/si_oc to access channels We'll soon remove direct references to the channels from the stream interface since everything belongs to the same session, so let's first not dereference si->ib / si->ob anymore and use macros instead.	2015-03-11 20:41:46 +01:00
Willy Tarreau	a27dc19eda	CLEANUP: remove now unused channel pool The channels are now part of the struct session. Their pool is not needed anymore.	2015-03-11 20:41:46 +01:00
Willy Tarreau	22ec1eadd0	REORG/MAJOR: move session's req and resp channels back into the session The channels were pointers to outside structs and this is not needed anymore since the buffers have moved, but this complicates operations. Move them back into the session so that both channels and stream interfaces are always allocated for a session. Some places (some early sample fetch functions) used to validate that a channel was NULL prior to dereferencing it. Now instead we check if chn->buf is NULL and we force it to remain NULL until the channel is initialized.	2015-03-11 20:41:46 +01:00
Thierry FOURNIER	a718b29b6d	MINOR: lua: remove some #define The #define compilation directives are centralized in the hlua include files. This allows removing some #ifdef from the haproxy main code.	2015-03-04 17:58:52 +01:00
Thierry FOURNIER	05ac42455f	MEDIUM: lua: Lua initialisation "on demand" Currently, the Lua context is always initialized in each session, even if the session doesn't use Lua. This behavior causes a 5% performance loss. This patch initializes Lua only if it is used by the session. The initialization is now on demand.	2015-02-28 23:12:37 +01:00
Thierry FOURNIER	65f34c6367	MINOR: lua: txn: create class TXN associated with the transaction. This class of functions gives access to all the functions associated with the transaction, such as HTTP headers, HAProxy internal fetches, etc. This patch adds the skeleton of this class. The class will be enhanced later.	2015-02-28 23:12:34 +01:00
Thierry FOURNIER	bc4c1ac6ad	MEDIUM: http/tcp: permit to resume http and tcp custom actions Later, the processing of some actions will need to be interrupted and resumed. This patch makes it possible to resume the actions. The actions that need to run in resume mode are not yet available; they will come soon with the Lua patches. So the code added by this patch is untestable for the moment. The list of "tcp_exec_req_rules" cannot resume because it is called by the unresumable function "accept_session".	2015-02-28 23:12:33 +01:00
Thierry FOURNIER	f41a809dc9	MINOR: sample: add private argument to the struct sample_fetch This private argument is added to prepare the integration of the Lua fetches.	2015-02-28 23:12:31 +01:00
Thierry FOURNIER	b83862dd74	MEDIUM: channel: wake up any request analyzer on response activity This behavior already exists for the "WAIT_HTTP" analyzer, this patch just extends the system to any analyzer that would be woken up on response activity.	2015-02-28 23:12:31 +01:00
Thierry FOURNIER	2e05a8c742	MEDIUM: task: call session analyzers if the task is woken by a message. When a task used to receive a message from another one, its analysers were not called if there was no I/O activity.	2015-02-28 23:12:30 +01:00
Willy Tarreau	a24adf0795	MAJOR: session: only wake up as many sessions as available buffers permit We've already experimented with three wake up algorithms when releasing buffers : the first naive one used to wake up far too many sessions, causing many of them not to get any buffer. The second approach which was still in use prior to this patch consisted in waking up either 1 or 2 sessions depending on the number of FDs we had released. And this was still inaccurate. The third one tried to cover the accuracy issues of the second and took into consideration the number of FDs the sessions would be willing to use, but most of the time we ended up waking up too many of them for nothing, or deadlocking by lack of buffers. This patch completely removes the need to allocate two buffers at once. Instead it splits allocations into critical and non-critical ones and implements a reserve in the pool for this. The deadlock situation happens when all buffers are allocated for requests pending in a maxconn-limited server queue, because then there's no more way to allocate buffers for responses, and these responses are critical to release the server's connection in order to release the pending requests. In fact maxconn on a server creates a dependence between sessions and particularly between oldest session's responses and latest session's requests. Thus, it is mandatory to get a free buffer for a response in order to release a server connection which will permit to release a request buffer. Since we definitely have non-symmetrical buffers, we need to implement this logic in the buffer allocation mechanism. What this commit does is implement a reserve of buffers which can only be allocated for responses and that will never be allocated for requests. This is made possible by the requester indicating how much margin it wants to leave after the allocation succeeds.
Thus it is a cooperative allocation mechanism : the requester (process_session() in general) prefers not to get a buffer in order to respect other's need for response buffers. The session management code always knows if a buffer will be used for requests or responses, so that is not difficult : - either there's an applet on the initiator side and we really need the request buffer (since currently the applet is called in the context of the session) - or we have a connection and we really need the response buffer (in order to support building and sending an error message back) This reserve ensures that we don't take all allocatable buffers for requests waiting in a queue. The downside is that all the extra buffers are really allocated to ensure they can be allocated. But with small values it is not an issue. With this change, we don't observe any more deadlocks even when running with maxconn 1 on a server under severely constrained memory conditions. The code becomes a bit tricky, it relies on the scheduler's run queue to estimate how many sessions are already expected to run so that it doesn't wake up everyone with too few resources. A better solution would probably consist in having two queues, one for urgent requests and one for normal requests. A failed allocation for a session dealing with an error, a connection event, or the need for a response (or request when there's an applet on the left) would go to the urgent request queue, while other requests would go to the other queue. Urgent requests would be served from 1 entry in the pool, while the regular ones would be served only according to the reserve. Despite not yet having this, it works remarkably well. This mechanism is quite efficient, we don't perform too many wake up calls anymore. For 1 million sessions elapsed during massive memory contention, we observe about 4.5M calls to process_session() compared to 4.0M without memory constraints. 
Previously we used to observe up to 16M calls, which roughly means 12M failures. During a test run under high memory constraints (limit enforced to 27 MB instead of the 58 MB normally needed), performance used to drop by 53% prior to this patch. Now with this patch instead it increases by about 1.5%. The best effect of this change is that by limiting the memory usage to about 2/3 to 3/4 of what is needed by default, it's possible to increase performance by up to about 18% mainly due to the fact that pools are reused more often and remain hot in the CPU cache (observed on regular HTTP traffic with 20k objects, buffers.limit = maxconn/10, buffers.reserve = limit/2). Below is an example of a scenario which used to cause a deadlock previously : - connection is received - two buffers are allocated in process_session() then released - one is allocated when receiving an HTTP request - the second buffer is allocated then released in process_session() for request parsing then connection establishment. - poll() says we can send, so the request buffer is sent and released - process session gets notified that the connection is now established and allocates two buffers then releases them - all other sessions do the same till one cannot get the request buffer without hitting the margin - and now the server responds. stream_interface allocates the response buffer and manages to get it since it's higher priority being for a response. - but process_session() cannot allocate the request buffer anymore => We could end up with all buffers used by responses so that none may be allocated for a request in process_session(). When the applet processing leaves the session context, the test will have to be changed so that we always allocate a response buffer regardless of the left side (eg: H2->H1 gateway).
A final improvement would consist in being able to only retry the failed I/O operation without waking up a task, but to date all experiments to achieve this have proven not to be reliable enough.	2014-12-24 23:47:33 +01:00
Willy Tarreau	10fc09e872	MAJOR: session: only allocate buffers when needed A session doesn't need buffers all the time, especially when they're empty. With this patch, we don't allocate buffers anymore when the session is initialized, we only allocate them in two cases : - during process_session() - during I/O operations During process_session(), we try hard to allocate both buffers at once so that we know for sure that a started operation can complete. Indeed, a previous version of this patch used to allocate one buffer at a time, but it can result in a deadlock when all buffers are allocated for requests for example, and there's no buffer left to emit error responses. Here, if any of the buffers cannot be allocated, the whole operation is cancelled and the session is added at the tail of the buffer wait queue. At the end of process_session(), a call to session_release_buffers() is done so that we can offer unused buffers to other sessions waiting for them. For I/O operations, we only need to allocate a buffer on the Rx path. For this, we only allocate a single buffer but ensure that at least two are available to avoid the deadlock situation. In case buffers are not available, SI_FL_WAIT_ROOM is set on the stream interface and the session is queued. Unused buffers resulting either from a successful send() or from an unused read buffer are offered to pending sessions during the ->wake() callback.	2014-12-24 23:47:33 +01:00
Willy Tarreau	bf883e0aa7	MAJOR: session: implement a wait-queue for sessions who need a buffer When a session_alloc_buffers() fails to allocate one or two buffers, it subscribes the session to buffer_wq, and waits for another session to release buffers. It's then removed from the queue and woken up with TASK_WAKE_RES, and can attempt its allocation again. We decide to try to wake as many waiters as we release buffers so that if we release 2 and two waiters need only one, they both have their chance. We must never come to the situation where we don't wake enough tasks up. It's common to release buffers after the completion of an I/O callback, which can happen even if the I/O could not be performed due to half a failure on memory allocation. In this situation, we don't want to move out of the wait queue the session that was just added, otherwise it will never get any buffer. Thus, we only force ourselves out of the queue when freeing the session. Note: at the moment, since session_alloc_buffers() is not used, no task is subscribed to the wait queue.	2014-12-24 23:47:33 +01:00
Willy Tarreau	656859d478	MEDIUM: session: implement a basic atomic buffer allocator This patch introduces session_alloc_recv_buffer(), session_alloc_buffers() and session_release_buffers() whose purpose will be to allocate missing buffers and release unneeded ones around the process_session() and during I/O operations. I/O callbacks only need a single buffer for recv operations and none for send. However we still want to ensure that we don't pick the last buffer. That's what session_alloc_recv_buffer() is for. This allocator is atomic in that it always ensures we can get 2 buffers or fails. Here, if any of the buffers is not ready and cannot be allocated, the operation is cancelled. The purpose is to guarantee that we don't enter into the deadlock where all buffers are allocated by the same side of all sessions. A queue will have to be implemented for failed allocations. For now they're just reported as failures.	2014-12-24 23:47:32 +01:00
Willy Tarreau	909e267be0	MINOR: session: group buffer allocations together We'll soon want to release buffers together upon failure so we need to allocate them after the channels. Let's change this now. There's no impact on the behaviour, only the error path is unrolled slightly differently. The same was done in peers.	2014-12-24 23:47:32 +01:00
Willy Tarreau	7dfca9daec	MINOR: buffer: only use b_free to release buffers We don't call pool_free2(pool2_buffers) anymore, we only call b_free() to do the job. This ensures that we can start to centralize the releasing of buffers.	2014-12-24 23:47:32 +01:00
Willy Tarreau	696a2910a0	MINOR: buffer: move buffer initialization after channel initialization It's not clean to initialize the buffer before the channel since it dereferences one pointer in the channel. Also we'll want to let the channel pre-initialize the buffer, so let's ensure that the channel is always initialized prior to the buffers.	2014-12-24 23:47:32 +01:00
Willy Tarreau	e583ea583a	MEDIUM: buffer: use b_alloc() to allocate and initialize a buffer b_alloc() now allocates a buffer and initializes it to the size specified in the pool minus the size of the struct buffer itself. This ensures that callers do not need to care about buffer details anymore. Also this never applies memory poisoning, which is slow and useless on buffers.	2014-12-24 23:47:32 +01:00
Willy Tarreau	474cf54a97	MINOR: buffer: reset a buffer in b_reset() and not channel_init() We'll soon need to be able to switch buffers without touching the channel, so let's move buffer initialization out of channel_init(). We had the same in compression.c.	2014-12-24 23:47:31 +01:00
Willy Tarreau	3b24641745	BUG/MAJOR: sessions: unlink session from list on out of memory Since embryonic sessions were introduced in 1.5-dev12 with commit `2542b53` ("MAJOR: session: introduce embryonic sessions"), a major bug remained present. If haproxy cannot allocate memory during session_complete() (for example, no more buffers), it will not unlink the new session from the sessions list. This will cause memory corruptions if the memory area from the session is reused for anything else, and may also cause bogus output on "show sess" on the CLI. This fix must be backported to 1.5.	2014-11-25 22:09:05 +01:00
KOVACS Krisztian	b3e54fe387	MAJOR: namespace: add Linux network namespace support This patch makes it possible to create binds and servers in separate namespaces. This can be used to proxy between multiple completely independent virtual networks (with possibly overlapping IP addresses) and a non-namespace-aware proxy implementation that supports the proxy protocol (v2). The setup is something like this: net1 on VLAN 1 (namespace 1) -\ net2 on VLAN 2 (namespace 2) -- haproxy ==== proxy (namespace 0) net3 on VLAN 3 (namespace 3) -/ The proxy is configured to make server connections through haproxy and sending the expected source/target addresses to haproxy using the proxy protocol. The network namespace setup on the haproxy node is something like this: = 8< = $ cat setup.sh ip netns add 1 ip link add link eth1 type vlan id 1 ip link set eth1.1 netns 1 ip netns exec 1 ip addr add 192.168.91.2/24 dev eth1.1 ip netns exec 1 ip link set eth1.$id up ... = 8< = = 8< = $ cat haproxy.cfg frontend clients bind 127.0.0.1:50022 namespace 1 transparent default_backend scb backend server mode tcp server server1 192.168.122.4:2222 namespace 2 send-proxy-v2 = 8< = A bind line creates the listener in the specified namespace, and connections originating from that listener also have their network namespace set to that of the listener. A server line either forces the connection to be made in a specified namespace or may use the namespace from the client-side connection if that was set. For more documentation please read the documentation included in the patch itself. Signed-off-by: KOVACS Tamas <ktamas@balabit.com> Signed-off-by: Sarkozi Laszlo <laszlo.sarkozi@balabit.com> Signed-off-by: KOVACS Krisztian <hidden@balabit.com>	2014-11-21 07:51:57 +01:00
Willy Tarreau	3a5e060bf6	MINOR: session: release a few other pools when stopping We currently release all pools when a proxy is stopped, except the connection, pendconn, and pipe pools. Doing so can further reduce memory usage of old processes, even though the connection struct is quite small, but there are a lot and they can contribute to memory fragmentation. The pipe pool is very small and limited, and not exported so it's not done here.	2014-11-13 16:56:12 +01:00
Willy Tarreau	e12704bfc7	MINOR: session: export the function 'smp_fetch_sc_stkctr' This one is sometimes useful outside of this file.	2014-07-15 19:09:56 +02:00
Willy Tarreau	b5975defba	MINOR: stick-table: make stktable_fetch_key() indicate why it failed stktable_fetch_key() does not indicate whether it returns NULL because the input sample was not found or because it's unstable. It causes trouble with track-sc* rules. Just like with sample_fetch_string(), we want it to be able to give more information to the caller about what it found. Thus, now we use the pointer to a sample passed by the caller, and fill it with the information we have about the sample. That way, even if we return NULL, the caller has the ability to check whether a sample was found and if it is still changing or not.	2014-06-25 17:17:53 +02:00
Willy Tarreau	6f0a7bac28	BUG/MAJOR: session: revert all the crappy client-side timeout changes This is the 3rd regression caused by the changes below. The latest to date was reported by Finn Arne Gangstad. If a server responds with no content-length and the client's FIN is never received, either we leak the client-side FD or we spin at 100% CPU if timeout client-fin is set. Enough is enough. The amount of tricks needed to cover these side-effects starts to look like used toilet paper stacked over a chocolate cake. I don't want to eat that cake anymore! All this to avoid reporting a server-side timeout when a client stops uploading data and haproxy expires faster than the server... A lot of "ifs" resulting in a technically valid log that doesn't always please users, and whose alternative causes that many issues for all other users. So let's revert this crap merged since 1.5-dev25 : Revert "CLEANUP: http: don't clear CF_READ_NOEXP twice" This reverts commit `1592d1e72a`. Revert "BUG/MEDIUM: http: clear CF_READ_NOEXP when preparing a new transaction" This reverts commit `77d29029af`. Revert "BUG/MEDIUM: session: don't clear CF_READ_NOEXP if analysers are not called" This reverts commit `0943757a21`. Revert "BUG/MEDIUM: http: disable server-side expiration until client has sent the body" This reverts commit `3bed5e9337`. Revert "BUG/MEDIUM: http: correctly report request body timeouts" This reverts commit `b9edf8fbec`. Revert "BUG/MEDIUM: http/session: disable client-side expiration only after body" This reverts commit `b1982e27aa`. If a cleaner AND SAFER way to do something equivalent is found in 1.6-dev, we might consider backporting it to 1.5, but given the vicious bugs that have surfaced since, I doubt it will happen any time soon. Fortunately, that crap never made it into 1.4 so no backport is needed.	2014-06-23 15:47:00 +02:00
Willy Tarreau	4bfc580dd3	MEDIUM: session: maintain per-backend and per-server time statistics Using the last rate counters, we now compute the queue, connect, response and total times per server and per backend with a 95% accuracy over the last 1024 samples. The operation is cheap so we don't need to condition it.	2014-06-17 17:15:56 +02:00
Willy Tarreau	33a14e515b	MEDIUM: session: redispatch earlier when possible As discussed with Dmitry Sivachenko, if a server farm has more than one active server, uses a guaranteed non-deterministic algorithm (round robin), and a connection was initiated from a non-persistent connection, there's no point insisting to reconnect to the same server after a connect failure, better redispatch upon the very first retry instead of insisting on the same server multiple times.	2014-06-13 17:53:55 +02:00
Willy Tarreau	db6d012270	MEDIUM: session: don't apply the retry delay when redispatching The retry delay is only useful when sticking to a same server. During a redispatch, it's useless and counter-productive if we're sure to switch to another server, which is almost guaranteed when there's more than one server and the balancing algorithm is round robin, so better not pass via the turn-around state in this case. It could be done as well for leastconn, but there's a risk of always killing the delay after the recovery of a server in a farm where it's almost guaranteed to take most incoming traffic. So better only kill the delay when using round robin.	2014-06-13 17:48:45 +02:00
Willy Tarreau	b02906659b	MEDIUM: session: allow shorter retry delay if timeout connect is small As discussed with Dmitry Sivachenko, the default 1-second connect retry delay can be large for situations where the connect timeout is much smaller, because it means that an active connection reject will take more time to be retried than a silent drop, and that does not make sense. This patch changes this so that the retry delay is the minimum of 1 second and the connect timeout. That way people running with sub-second connect timeout will benefit from the shorter reconnect.	2014-06-13 17:04:44 +02:00
Willy Tarreau	892337c8e1	MAJOR: server: use states instead of flags to store the server state Servers used to have 3 flags to store a state, now they have 4 states instead. This avoids lots of confusion for the 4 remaining undefined states. The encoding from the previous to the new states can be represented this way : SRV_STF_RUNNING \| SRV_STF_GOINGDOWN \| \| SRV_STF_WARMINGUP \| \| \| 0 x x SRV_ST_STOPPED 1 0 0 SRV_ST_RUNNING 1 0 1 SRV_ST_STARTING 1 1 x SRV_ST_STOPPING Note that the case where all bits were set used to exist and was randomly dealt with. For example, the task was not stopped, the throttle value was still updated and reported in the stats and in the http_server_state header. It was the same if the server was stopped by the agent or for maintenance. It's worth noting that the internal function names are still quite confusing.	2014-05-22 11:27:00 +02:00
Willy Tarreau	c93cd16b6c	REORG/MEDIUM: server: split server state and flags in two different variables Till now, the server's state and flags were all saved as a single bit field. It causes some difficulties because we'd like to have an enum for the state and separate flags. This commit starts by splitting them in two distinct fields. The first one is srv->state (with its counter-part srv->prev_state) which are now enums, but which still contain bits (SRV_STF_*). The flags now lie in their own field (srv->flags). The function srv_is_usable() was updated to use the enum as input, since it already used to deal only with the state. Note that currently, the maintenance mode is still in the state for simplicity, but it must move as well.	2014-05-22 11:27:00 +02:00
Willy Tarreau	0943757a21	BUG/MEDIUM: session: don't clear CF_READ_NOEXP if analysers are not called As more or less suspected, commit `b1982e2` ("BUG/MEDIUM: http/session: disable client-side expiration only after body") was hazardous. It introduced a regression causing client side timeout to expire during connection retries if it's lower than the time needed to cover the amount of retries, so clients get a 408 when the connection to the server fails to establish fast enough. The reason is that the CF_READ_NOEXP flag is set after the MSG_DONE state is reached, which protects the timeout from being re-armed, then during the retries, process_session() clears the flag without calling the analyser (since there's no activity for it), so the timeouts are rearmed. Ideally, these one-shot flags should be per-analyser, and the analyser which sets them would be responsible for clearing them, or they would automatically be cleared when switching to another analyser. Unfortunately this is not really possible currently. What can be done however is to only clear them in the following situations : - we're going to call analysers - analysers have all been unsubscribed This method seems reliable enough and approaches the ideal case well enough. No backport is needed, this bug was introduced in 1.5-dev25.	2014-05-21 16:58:17 +02:00
Willy Tarreau	05cdd9655d	MEDIUM: session: implement half-closed timeouts (client-fin and server-fin) Long-lived sessions are often subject to half-closed sessions resulting in a lot of sessions appearing in FIN_WAIT state in the system tables, and no way for haproxy to get rid of them. This typically happens because clients suddenly disconnect without sending any packet (eg: FIN or RST was lost in the path), and while the server detects this using an applicative heart beat, haproxy does not close the connection. This patch adds two new timeouts : "timeout client-fin" and "timeout server-fin". The former allows one to override the client-facing timeout when a FIN has been received or sent. The latter does the same for server-facing connections, which is less useful.	2014-05-10 15:14:05 +02:00
Willy Tarreau	b4f98098aa	BUG/MAJOR: session: recover the correct connection pointer in half-initialized sessions John-Paul Bader reported a nasty segv which happens after a few hours when SSL is enabled under a high load. Fortunately he could catch a stack trace, systematically looking like this one : (gdb) bt full level = 6 conn = (struct connection ) 0x0 err_msg = <value optimized out> s = (struct session ) 0x80337f800 conn = <value optimized out> flags = 41997063 new_updt = <value optimized out> old_updt = 1 e = <value optimized out> status = 0 fd = 53999616 nbfd = 279 wait_time = <value optimized out> updt_idx = <value optimized out> en = <value optimized out> eo = <value optimized out> count = 78 sr = <value optimized out> sw = <value optimized out> rn = <value optimized out> wn = <value optimized out> The variable "flags" in conn_fd_handler() holds a copy of connection->flags when entering the function. These flags indicate 41997063 = 0x0280d307 : - {SOCK,DATA,CURR}_RD_ENA=1 => it's a handshake, waiting for reading - {SOCK,DATA,CURR}_WR_ENA=0 => no need for writing - CTRL_READY=1 => FD is still allocated - XPRT_READY=1 => transport layer is initialized - ADDR_FROM_SET=1, ADDR_TO_SET=0 => clearly it's a frontend connection - INIT_DATA=1, WAKE_DATA=1 => processing a handshake (ssl I guess) - {DATA,SOCK}_{RD,WR}_SH=0 => no shutdown - ERROR=0, CONNECTED=0 => handshake not completed yet - WAIT_L4_CONN=0 => normal - WAIT_L6_CONN=1 => waiting for an L6 handshake to complete - SSL_WAIT_HS=1 => the pending handshake is an SSL handshake So a handshake is in progress. And the only way to reach line 88 is for the handshake to complete without error. So we know for sure that ssl_sock_handshake() was called and completed the handshake then removed the CO_FL_SSL_WAIT_HS flag from the connection. With these flags, ssl_sock_handshake() does only call SSL_do_handshake() and returns. So that means that the problem is necessarily in data->init().
The fd is wrong as reported but is simply mis-decoded as it's the lower half of the last function pointer. What happens in practice is that there's an issue with the way we deal with embryonic sessions during their conversion to regular sessions. Since they have no stream interface at the beginning, the pointer to the connection is temporarily stored into s->target. Then during their conversion, the first stream interface is properly initialized and the connection is attached to it, then s->target is set to NULL. The problem is that if anything fails in session_complete(), the session is left in this intermediate state where s->target is NULL, and kill_mini_session() is called afterwards to perform the cleanup. It needs the connection, that it finds in s->target which is NULL, dereferences it and dies. The only reasons for dying here are a problem on the TCP connection when doing the setsockopt(TCP_NODELAY) or a memory allocation issue. This patch implements a solution consisting in restoring s->target in session_complete() on the error path. That way embryonic sessions that were valid before calling it are still valid after. The bug was introduced in 1.5-dev20 by commit `f8a49ea` ("MEDIUM: session: attach incoming connection to target on embryonic sessions"). No backport is needed. Special thanks to John for his numerous tests and traces.	2014-05-08 22:46:32 +02:00
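The fix described in the commit above can be sketched as follows. This is a simplified illustration with hypothetical types and function names (`session_complete_sketch`, `kill_mini_session_sketch`, `si_conn`), not HAProxy's actual code. The `fail` parameter stands in for a failing setsockopt(TCP_NODELAY) call or memory allocation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified types; not HAProxy's real definitions. */
struct connection {
    int fd;
};

struct session {
    void *target;               /* holds the connection while the session is embryonic */
    struct connection *si_conn; /* stands in for the first stream interface's connection */
};

/* Convert an embryonic session into a regular one.
 * Returns 0 on success, -1 on failure.
 */
static int session_complete_sketch(struct session *s, int fail)
{
    struct connection *conn = s->target;

    assert(conn != NULL);

    /* Attach the connection to the stream interface and clear the target. */
    s->si_conn = conn;
    s->target = NULL;

    if (fail)
        goto out_restore;

    return 0;

out_restore:
    /* Error path: put the session back into its embryonic state so that
     * the cleanup code still finds the connection in s->target.
     */
    s->si_conn = NULL;
    s->target = conn;
    return -1;
}

/* Cleanup for embryonic sessions: relies on s->target being the connection. */
static int kill_mini_session_sketch(struct session *s)
{
    struct connection *conn = s->target;

    return conn->fd; /* would dereference NULL without the restore above */
}
```

Without the `out_restore` block, a failure would leave `s->target` set to NULL and the cleanup function would dereference a null pointer, which matches the crash reported in the commit message.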
Willy Tarreau	b1982e27aa	BUG/MEDIUM: http/session: disable client-side expiration only after body For a very long time, back in the v1.3 days, we used to rely on a trick to avoid expiring the client side while transferring a payload to the server. The problem was that if a client was able to quickly fill the buffers, and these buffers took some time to reach the server, the client should not expire while not sending anything. In order to cover this situation, the client-side timeout was disabled once the connection to the server was OK, since it implied that we would at least expire on the server if required. But there is a drawback to this : if a client stops uploading data before the end, its timeout is not enforced and we only expire on the server's timeout, so the logs report a 504. Since 1.4, we have message body analysers which ensure that we know whether all the expected data was received or not (HTTP_MSG_DATA or HTTP_MSG_DONE). So we can fix this problem by disabling the client-side or server-side timeout at the end of the transfer for the respective side instead of having it unconditionally in session.c during all the transfer. With this, the logs now report the correct side for the timeout. Note that this patch is not enough, because another issue remains : the HTTP body forwarders do not abort upon timeout, they simply rely on the generic handling from session.c. So for now, the session is still aborted when reaching the server timeout, but the culprit is properly reported. A subsequent patch will address this specific point. This bug was tagged MEDIUM because of the changes performed. The issue it fixes is minor however. After some cooling down, it may be backported to 1.4. It was reported by and discussed with Rachel Chavez and Patrick Hemmer on the mailing list.	2014-05-07 14:21:47 +02:00

567 Commits