haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-12 18:16:58 +02:00

Author	SHA1	Message	Date
Olivier Houchard	000694cf96	MINOR: ssl: Make ssl_sock_handshake() static. ssl_sock_handshake is now only used by the ssl code itself, there's no need to export it anymore, so make it static.	2019-06-05 18:03:38 +02:00
Olivier Houchard	ea8dd949e4	MEDIUM: ssl: Handle subscribe by itself. As the SSL code may have different needs than the upper layer, i.e. it may want to receive when the upper layer wants to write, instead of directly forwarding the subscribe to the underlying xprt, handle it ourselves. The SSL code will now remember any subscribe call, and wake the tasklet when it is ready for more I/O.	2019-06-05 18:03:38 +02:00
Olivier Houchard	c3df4507fa	MEDIUM: connections: Wake the upper layer even if sending/receiving is disabled. In conn_fd_handler(), if the fd is ready to send/recv, wake the upper layer even if we have CO_FL_ERROR, or if CO_FL_XPRT_RD_ENA/CO_FL_XPRT_WR_ENA isn't set. The only reason we should reach that point is if we had a shutw/shutr, and the upper layer may want to know about it, and is supposed to handle it anyway.	2019-06-05 18:03:38 +02:00
Olivier Houchard	49065544d0	MEDIUM: checks: Make sure we unsubscribe before calling cs_destroy(). When we want to destroy the conn_stream for some reason, usually on error, make sure we unsubscribed before doing so. If we subscribed, the xprt may ultimately wake our tasklet on close, and the check tasklet doesn't expect it to happen when we no longer have any conn_stream.	2019-06-05 18:03:38 +02:00
Olivier Houchard	14fcc2ebcc	BUG/MEDIUM: servers: Don't attempt to destroy idle connections if disabled. In connect_server(), when deciding if we should attempt to remove idle connections because we have too many file descriptors opened, don't attempt to do so if the idle connection pool is disabled (with pool-max-conn 0): in that case srv->idle_orphan_conns won't even be allocated, and trying to dereference it will cause a crash.	2019-06-05 13:58:06 +02:00
Frédéric Lécaille	344e94816c	BUG/MINOR: peers: Wrong "server_name" decoding. This patch fixes a bug which does not occur at this time because the "server_name" stick-table data type is the last one (see STKTABLE_DT_SERVER_NAME). It was introduced by this commit: "MINOR: peers: Make peers protocol support new "server_name" data type". Indeed when receiving STD_T_DICT stick-table data type we first decode the length of these data, then we decode the ID of this dictionary entry. To know if there is remaining data to parse, we check if we have reached the end of the current data, relying on <msg_end> variable. But <msg_end> is at the end of the entire message! So this patch computes the correct end of the current STD_T_DICT before doing anything else with it. Nothing to backport.	2019-06-05 13:36:34 +02:00
Christopher Faulet	0bdeeaacbb	BUG/MINOR: flt_trace/htx: Only apply the random forwarding on the message body. In the function trace_http_payload(), when the random forwarding is enabled, only blocks of type HTX_BLK_DATA must be considered, because other blocks must be forwarded in one go. This patch must be backported to 1.9, but it will have to be adapted because several HTX changes made in 2.0 are missing in 1.9.	2019-06-05 10:12:11 +02:00
Christopher Faulet	c31872fc04	BUG/MINOR: mux-h1: Don't send more data than expected In h1_snd_buf(), we try to consume as much data as possible in a loop. In this loop, we first format the raw HTTP message from the HTX message, then we try to send it. But we must be careful to never send more data than specified by the stream-interface. This patch must be backported to 1.9.	2019-06-05 10:12:11 +02:00
Christopher Faulet	54b5e214b0	MINOR: htx: Don't use end-of-data blocks anymore This type of block is useless because the transition between data and trailers is obvious. And when there are no trailers, the end-of-message is still there to know when data end for chunked messages.	2019-06-05 10:12:11 +02:00
Christopher Faulet	2d7c5395ed	MEDIUM: htx: Add the parsing of trailers of chunked messages HTTP trailers are now parsed in the same way headers are. It means trailers are converted to K/V blocks followed by an end-of-trailer marker. For now, to make things simple, the type for trailer blocks is not the same as for header blocks. But the aim is to make no difference between headers and trailers by using the same type. Probably for the end-of marker too.	2019-06-05 10:12:11 +02:00
Christopher Faulet	8f3c256f7e	MEDIUM: cache/htx: Always store info about HTX blocks in the cache It was only done for the headers (including the EOH marker). Data were prefixed by the info field of these blocks. The payload and the trailers of the messages were stored raw. The total size of headers and payload were kept in the cached object state to help output formatting. Now, info about each HTX block is stored in the cache. Only data are allowed to be split. Otherwise, all blocks of an HTX message are handled the same way, both when storing a message in the cache and when delivering it from the cache. This will help the cache implementation to be more robust to internal changes in the HTX, especially for the upcoming parsing of trailers. There is also no more need to keep extra info in the cached object state.	2019-06-05 10:12:11 +02:00
Christopher Faulet	4c7ce017fc	MINOR: mux-h1: Don't count the EOM in the estimated size of headers If there is not enough space in the HTX message, the EOM can be delayed when a bodyless message is added. So, don't count it in the estimated size of headers.	2019-06-05 10:12:11 +02:00
Christopher Faulet	82f0160318	MINOR: mux-h1: Add h1_eval_htx_hdrs_size() to estimate size of the HTX headers It is just a cosmetic change, to avoid code duplication.	2019-06-05 10:12:11 +02:00
Christopher Faulet	ada34b6a86	MINOR: mux-h1: Add the flag HAVE_O_CONN on h1s This flag is set on h1s when output messages are formatted to know the connection mode was already processed. It replaces the variable process_conn_mode in the function h1_process_output().	2019-06-05 10:12:11 +02:00
Christopher Faulet	94b2c76399	MEDIUM: mux-h1: refactor output processing When we format the H1 output, in the loop on the HTX message, instead of switching on the block types, we now switch on the message state. It is almost the same, but it will ease future changes on trailers and end-of markers.	2019-06-05 10:12:11 +02:00
Christopher Faulet	a2ea158cf2	BUG/MINOR: mux-h1: errflag must be set on H1S and not H1M during output processing This bug is in an unexpected clause of the switch..case, inside h1_process_output(). The wrong structure is used to set the error flag. This patch must be backported to 1.9.	2019-06-05 10:12:11 +02:00
Patrick Hemmer	65674662b4	MINOR: SSL: add client/server random sample fetches This adds 4 sample fetches: - ssl_fc_client_random - ssl_fc_server_random - ssl_bc_client_random - ssl_bc_server_random These fetches retrieve the client or server random value sent during the handshake. Their use is to be able to decrypt traffic sent using ephemeral ciphers. Tools like wireshark expect a TLS log file with lines in a few known formats (https://code.wireshark.org/review/gitweb?p=wireshark.git;a=blob;f=epan/dissectors/packet-tls-utils.c;h=28a51fb1fb029eae5cea52d37ff5b67d9b11950f;hb=HEAD#l5209). Previously the only format supported using data retrievable from HAProxy state was the one utilizing the Session-ID. However an SSL/TLS session ID is optional, and thus cannot be relied upon for this purpose. This change introduces the ability to extract the client random instead which can be used for one of the other formats. The change also adds the ability to extract the server random, just in case it might have some other use, as the code change to support this was trivial.	2019-06-05 10:07:44 +02:00
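As a rough illustration of why the client random matters here: one of the key log line formats that Wireshark accepts (the NSS key log format, for TLS 1.2 and earlier) is `CLIENT_RANDOM <hex client random> <hex master secret>`. The sketch below only shows how such a line can be assembled from raw bytes. The helper names `to_hex()` and `keylog_line()` are hypothetical and are not part of HAProxy, and how the master secret is obtained is outside the scope of this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hex-encode <len> bytes from <src> into <dst>, which must hold 2*len+1 bytes.
 * Hypothetical helper, for illustration only.
 */
static void to_hex(const unsigned char *src, size_t len, char *dst)
{
	static const char digits[] = "0123456789abcdef";
	size_t i;

	assert(src != NULL && dst != NULL);
	for (i = 0; i < len; i++) {
		dst[2 * i]     = digits[src[i] >> 4];
		dst[2 * i + 1] = digits[src[i] & 0x0f];
	}
	dst[2 * len] = '\0';
}

/* Build "CLIENT_RANDOM <hex random> <hex master secret>" into <out>.
 * Returns the line length (without the trailing zero), or -1 if <out>
 * is too small. The 32-byte random and 48-byte master secret sizes are
 * those used by TLS 1.2 and earlier.
 */
static int keylog_line(char *out, size_t outsz,
                       const unsigned char client_random[32],
                       const unsigned char master_secret[48])
{
	char rnd_hex[65];
	char key_hex[97];
	int n;

	to_hex(client_random, 32, rnd_hex);
	to_hex(master_secret, 48, key_hex);
	n = snprintf(out, outsz, "CLIENT_RANDOM %s %s", rnd_hex, key_hex);
	return (n < 0 || (size_t)n >= outsz) ? -1 : n;
}
```

A line built this way identifies the session by its client random rather than by the optional session ID, which is the limitation the commit message describes.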
Emmanuel Hocdet	839af57c85	CLEANUP: ssl: remove unneeded defined(OPENSSL_IS_BORINGSSL) BoringSSL pretends to be compatible with OpenSSL 1.1.0 and OPENSSL_VERSION_NUMBER is set accordingly: clean up the redundant #ifdef.	2019-06-05 10:01:44 +02:00
Frédéric Lécaille	36fb77e295	MINOR: peers: Replace hard-coded values for peer protocol messaging by macros. Simple patch to replace the hard-coded values of the byte identifiers used for stick-table messages by macros.	2019-06-05 08:42:36 +02:00
Frédéric Lécaille	32b5573b13	MINOR: peers: Replace hard-coded for peer protocol 64-bits value encoding by macros. With this patch we define macros for the minimum values that are encoded on 2 up to 10 bytes. The latter is big enough to encode UINT64_MAX. In several places we replaced the value 240 with PEER_ENC_2BYTES_MIN, which is the minimum value encoded on 2 bytes. The peer protocol encodes a value lower than PEER_ENC_2BYTES_MIN on a single byte, and a 64-bit value greater than or equal to PEER_ENC_2BYTES_MIN on at least 2 bytes.	2019-06-05 08:42:36 +02:00
Frédéric Lécaille	62b0b0bc02	MINOR: peers: Add dictionary cache information to "show peers" CLI command. This patch adds to the output the cached dictionary entries used by the server-by-name stickiness feature (exchanged through the peers protocol).	2019-06-05 08:42:36 +02:00
Frédéric Lécaille	16b4f54533	MINOR: stick-table: Make the CLI stick-table handler support dictionary entry data type. Simple patch to dump the values (strings) of dictionary entries stored in stick-table entries with STD_T_DICT as internal data type.	2019-06-05 08:42:36 +02:00
Frédéric Lécaille	8d78fa7def	MINOR: peers: Make peers protocol support new "server_name" data type. Make use of the APIs implemented for dictionaries (dict.c) and their LRU caches (struct dcache) to send/receive the server names used for server-by-name stickiness. These names are sent over the network as follows: - in every case we send the encoded length of the data (STD_T_DICT), then - if the server name is not present in the cache used upon transmission (struct dcache_tx), we cache it and we send the ID of this TX cache entry followed by the encoded length of the server name, and finally the server name itself (non NULL-terminated string). - if the server name is present, we repeat these operations but we only send the TX cache entry ID. Upon receipt, the pairs of (cache ID, server name) are stored in the LRU cache used only upon receipt (struct dcache_rx). As the peers protocol is symmetrical, the presence (resp. absence) of the server name in the received data denotes that the entry is absent from (resp. present in) the cache.	2019-06-05 08:42:33 +02:00
Frédéric Lécaille	03cdf55e69	MINOR: stream: Stickiness server lookup by name. With this patch we modify the lookup behavior for stickiness server targets. We first look up these server targets by name, then by ID if they are not found. We also insert a dictionary entry for the name of the server target and store the address of this entry in the underlying stick-table.	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	7da71293e4	MINOR: server: Add a dictionary for server names. This patch only declares and defines a dictionary for the server names (stored as ->id member field).	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	84d6046a33	MINOR: proxy: Add a "server by name" tree to proxy. Add a tree to the proxy struct to look up by name the servers attached to this proxy, and populate it at parsing time.	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	db52d9087a	MINOR: cfgparse: Space allocation for "server_name" stick-table data type. When parsing sticking rules, with this patch we reserve some room for the new "server_name" stick-table data type, as this is already done for "server_id", setting the offset and used space (in bytes) in the stick-table entry thanks to stktable_alloc_data_type().	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	5ad57ea85f	MINOR: stick-table: Add "server_name" new data type. This simple patch only adds definitions to create a new stick-table data type ID and a new standard type to store information related to dictionary entries (STD_T_DICT).	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	74167b25f7	MINOR: peers: Add a LRU cache implementation for dictionaries. We want to send some stick-table data fields stored as strings in dictionaries without consuming too much memory and CPU. To do so, this patch implements a cache for sent/received dictionary entries. These string dictionary entries are stored in other real dictionary entries, with an identifier (unsigned int) as key and a pointer to the string dictionary entry as value.	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	4a3fef834c	MINOR: dict: Add dictionary new data structure. This patch adds minimalistic definitions to implement a new dictionary data structure, which is an ebtree of ebpt_node structs with strings as keys. Note that this has nothing to do with a real dictionary data structure (a map of keys associated with values).	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	0e8db97df4	BUG/MINOR: peers: Wrong stick-table update message building. When creating the patch "CLEANUP: peers: Replace hard-coded values by macros", we realized there was a remaining place in peer_prepare_updatemsg() where the hard-coded maximum of an encoded length could be replaced by the PEER_MSG_ENCODED_LENGTH_MAXLEN macro. But in this case, the hard-coded value of 1 for the header length is wrong: it should be 2, or PEER_MSG_HEADER_LEN. So one byte is missing to encode the length of the data remaining after the header. Note that the bug was never encountered because even with a missing byte, we could encode a maximum length of (1<<25) (32MB), a limit that was probably never reached, according to the following extract of the peers protocol documentation: I) Encoded Integer and Bitfield. 0 <= X < 240 : 1 byte (7.875 bits) [ XXXX XXXX ] 240 <= X < 2288 : 2 bytes (11 bits) [ 1111 XXXX ] [ 0XXX XXXX ] 2288 <= X < 264432 : 3 bytes (18 bits) [ 1111 XXXX ] [ 1XXX XXXX ] [ 0XXX XXXX ] 264432 <= X < 33818864 : 4 bytes (25 bits) [ 1111 XXXX ] [ 1XXX XXXX ] x2 [ 0XXX XXXX ] 33818864 <= X < 4328786160 : 5 bytes (32 bits) [ 1111 XXXX ] [ 1XXX XXXX ] x3 [ 0XXX XXXX ]	2019-06-05 08:33:34 +02:00
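The encoding table quoted in the commit above can be sketched in C as follows. This is a minimal illustration written to be consistent with the byte ranges listed there, not a copy of the HAProxy source. The names `varint_encode()`, `varint_decode()` and `ENC_2BYTES_MIN` are assumptions made for this example, the last one mirroring the PEER_ENC_2BYTES_MIN threshold of 240.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENC_2BYTES_MIN 240 /* smallest value needing 2 bytes (assumed name) */

/* Encode <v> into <buf>, which must hold at least 10 bytes.
 * Returns the number of bytes written.
 */
static size_t varint_encode(uint64_t v, unsigned char *buf)
{
	size_t n = 0;

	assert(buf != NULL);
	if (v < ENC_2BYTES_MIN) {
		buf[n++] = (unsigned char)v;            /* [ XXXX XXXX ] */
		return n;
	}
	buf[n++] = (unsigned char)(v | 0xF0);           /* [ 1111 XXXX ] */
	v = (v - ENC_2BYTES_MIN) >> 4;
	while (v >= 128) {
		buf[n++] = (unsigned char)(v | 0x80);   /* [ 1XXX XXXX ] */
		v = (v - 128) >> 7;
	}
	buf[n++] = (unsigned char)v;                    /* [ 0XXX XXXX ] */
	return n;
}

/* Decode one value from <buf> (<len> bytes available) into <out>.
 * Returns the number of bytes consumed, or 0 on truncated or
 * oversized input.
 */
static size_t varint_decode(const unsigned char *buf, size_t len, uint64_t *out)
{
	size_t n = 0;
	uint64_t v;
	int shift = 4;
	unsigned char b;

	if (len == 0)
		return 0;
	v = buf[n++];
	if (v >= ENC_2BYTES_MIN) {
		do {
			if (n >= len || shift > 63)
				return 0;
			b = buf[n++];
			v += (uint64_t)b << shift;
			shift += 7;
		} while (b >= 128);
	}
	*out = v;
	return n;
}
```

With this scheme a 5-byte length field covers values up to 4328786160 (exclusive), while a 4-byte one stops at 33818864, which is the 32MB figure mentioned in the commit message.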
Frédéric Lécaille	39143340ec	CLEANUP: peers: Replace hard-coded values by macros. All the peer stick-table messages are made of a 2-byte header (PEER_MSG_HEADER_LEN) followed by the encoded length of the remaining data, whose maximum size is hard-coded as 5 bytes (PEER_MSG_ENCODED_LENGTH_MAXLEN). With such a size we can encode a maximum length equal to (1 << 32) - 1, which is by far enough. This patch replaces both these values by macros where applicable.	2019-06-05 08:33:34 +02:00
Willy Tarreau	5598d171b3	BUILD: task: fix a build warning when threads are disabled The __decl_hathreads() macro will leave a lone semi-colon making the end of variables declarations, resulting in a warning if threads are disabled. Let's simply swap it with the last variable. Thanks to Ilya Shipitsin for reporting this issue. No backport is needed.	2019-06-04 17:18:40 +02:00
Willy Tarreau	4b7531f48b	BUG/MEDIUM: vars: make the tcp/http unset-var() action support conditions Patrick Hemmer reported that http-request unset-var(foo) if ... fails to parse. The reason is that it reuses the same parser as "set-var(foo)" which makes a special case of the arguments, supposed to be a sample expression for set-var, but which must not exist for unset-var. Unfortunately the parser finds "if" or "unless" and believes it's an expression. Let's simply drop the test so that the outer rule parser deals with potential extraneous keywords. This should be backported to all versions supporting unset-var().	2019-06-04 16:48:15 +02:00
Willy Tarreau	f37b140b06	BUG/MEDIUM: vars: make sure the scope is always valid when accessing vars Patrick Hemmer reported that a simple tcp rule involving a variable like this is enough to crash haproxy : frontend foo bind :8001 tcp-request session set-var(txn.foo) src The tests on the variables scopes is not strict enough, it needs to always verify if the stream is valid when accessing a req/res/txn variable. This patch does this by adding a new get_vars() function which does the job instead of open-coding all the lookups everywhere. It must be backported to all versions supporting set-var and "tcp-request session" so at least 1.9 and 1.8.	2019-06-04 16:27:36 +02:00
Willy Tarreau	42a6621d30	BUILD: tools: do not use the weak attribute for trace() on obsolete linkers The default dummy trace() function is marked weak in order to be easily replaced at link time. Some linkers are having issues with the weak attribute, so let's not mark it on these linkers. They will simply not be able to build with TRACE=1, which is no big deal since it's only used by developers.	2019-06-04 16:02:26 +02:00
Willy Tarreau	fb55365f9e	MINOR: server: increase the default pool-purge-delay to 5 seconds The default used to be a very aggressive delay of 1 second before starting to purge idle connections, but tests show that with bursty traffic it's a bit short. Let's increase this to 5 seconds.	2019-06-04 14:06:31 +02:00
Willy Tarreau	a689c3d8d4	MEDIUM: stream: make a full process_stream() loop when completing I/O on exit During 1.9 development cycle a shortcut was made in process_stream() to update the analysers immediately after an I/O event detected on the send() path while leaving the function. In order to prevent this from being abused by a single stream stealing all the CPU, the loop didn't cover the initial recv() call, so that events ultimately converge. This has caused a number of issues over time because the conditions to decide to loop are a bit tricky. For example the CF_READ_PARTIAL flag is not immediately removed from rqf_last and may appear for a long time at this point, sometimes causing some loops to last long. Another unexpected side effect is that all analysers are called again with no data to process, just because CF_WRITE_PARTIAL is present. We cannot get rid of this event even if of very rare use, because some analysers might wait for some data to leave a buffer before proceeding. With a full loop, this event would have been merged with a subsequent recv() allowing analysers to do something more useful than just ack an event they don't care about. While during early 1.9-dev it was very important to be kind with the scheduler, nowadays it's lock-free for local tasks so this optimization is much less interesting to use it for I/Os, especially if we factor in the trouble it causes. This patch thus removes the use of the loop for regular I/Os and instead performs a task_wakeup() with an I/O event so that the task will be scheduled after all other ones and will have a chance to perform another recv() and possibly to gather more I/O events to be processed at once. Synchronous errors and transitions to SI_ST_DIS however are still handled by the loop. Doing so significantly reduces the average number of calls to analysers (those are typically halved when compression is enabled in legacy mode), and as a side benefit, has increased the H1 performance by about 1%.	2019-06-03 17:55:23 +02:00
Willy Tarreau	7bb39d7cd6	CLEANUP: connection: remove the now unused CS_FL_REOS flag Let's remove it before it gets used again. It was mostly replaced with CS_FL_EOI and by mux-specific states or flags.	2019-06-03 14:23:33 +02:00
Willy Tarreau	c493c9cb08	MEDIUM: mux-h1: don't use CS_FL_REOS anymore This flag was already removed from other muxes and from the upper layers, because it was misused. It indicates to the mux that the end of a stream was already seen and is pending after existing data, but this should not be on the conn_stream but internal to the mux. This patch creates a new H1S flag H1S_F_REOS to replace it and uses it to replace the last uses of CS_FL_REOS.	2019-06-03 14:18:22 +02:00
Willy Tarreau	fbdf90a6f9	BUG/MEDIUM: mux-h1: only check input data for the current stream, not next one The mux-h1 doesn't properly propagate end of streams to the application layer when requests are pipelined. This is visible by launching h2load in h1 mode with -m greater than 1 : issuing Ctrl-C has no effect until the client timeout expires. The reason is that among the checks conditioning the reporting of the end of stream status and waking up the streams, is a test on the presence of remaining input data in the demux. But with pipelining, these data may be present for another stream and should not prevent the end of stream condition from being reported. This patch addresses this issue by introducing a new function "h1s_data_pending" which returns a boolean indicating if there are in the demux buffer any data for the current stream. That is, if the stream is in H1_MSG_DONE state, there are never any data for it. And if it's in a different state, then the demux buffer is checked. This replaces the tests on b_data(&h1c->ibuf) and correctly allows end of streams to be reported at the end of requests. It's worth noting that 1.9 doesn't suffer from this issue but it possibly isn't completely immune either given that the same tests are present.	2019-06-03 14:13:23 +02:00
Willy Tarreau	d58f27fead	MINOR: mux-h1: don't try to recv() before the connection is ready Just as we already do in h1_send(), if the connection is not yet ready, do not proceed and instead subscribe. This avoids a needless recvfrom() and subscription to polling for a case which will never work since the request was not even sent.	2019-06-03 10:17:12 +02:00
Willy Tarreau	694fcd0ee4	MINOR: connection: also stop receiving after a SOCKS4 response Just as is done in previous patch for all handshake handlers, also stop receiving after a SOCKS4 response was received. This one escaped the previous cleanup but must be done to keep the code safe.	2019-06-03 10:16:35 +02:00
Willy Tarreau	6499b9d996	BUG/MEDIUM: connection: fix multiple handshake polling issues Connection handshakes were rarely stacked on top of each other, but the recent experiments consisting in sending PROXY over SOCKS4 revealed a number of issues in these lower layers. First, each handler waiting for data MUST subscribe to recv events with __conn_sock_want_recv() and MUST unsubscribe from send events using __conn_sock_stop_send() to avoid any wake-up loop in case a previous sender has set this. Second, each handler waiting for sending MUST subscribe to send events with __conn_sock_want_send() and MUST unsubscribe from recv events using __conn_sock_stop_recv() to avoid any wake-up loop in case some data are available on the connection. Till now this was done at various random places, and in particular the cases where the FD was not ready for recv forgot to re-enable reading. Third, while senders can happily use conn_sock_send() which automatically handles EINTR, loops, and marks the FD as not ready with fd_cant_send(), there is no equivalent for recv so receivers facing EAGAIN MUST call fd_cant_recv() to enable polling. It could be argued that implementing an equivalent conn_sock_recv() function could be useful and more long-term proof than the current situation. Fourth, both types of handlers MUST unsubscribe from their respective events once they managed to do their job, and none may even play with __conn_xprt_*(). Here again this was lacking, and one surprising call to __conn_xprt_stop_recv() was present in the proxy protocol parser for TCP6 messages! Thanks to Alexander Liu for his help on this issue. This patch must be backported to 1.9 and possibly some older versions, though the SOCKS parts should be dropped.	2019-06-03 08:31:22 +02:00
Willy Tarreau	7067b3a92e	BUG/MINOR: deinit/threads: make hard-stop-after perform a clean exit As reported in GH issue #99, when hard-stop-after triggers and threads are in use, the chance that any thread releases the resources in use by the other ones is non-null. Thus no thread should be allowed to deinit() nor exit by itself. Here we take a different approach. We simply use a 3rd possible value for the "killed" variable so that all threads know they must break out of the run-poll-loop and immediately stop. This patch was tested by commenting the stream_shutdown() calls in hard_stop() to increase the chances to see a stream use released resources. With this fix applied, it never crashes anymore. This fix should be backported to 1.9 and 1.8.	2019-06-02 11:30:07 +02:00
Alexander Liu	2a54bb74cd	MEDIUM: connection: Upstream SOCKS4 proxy support Add the "socks4" and "check-via-socks4" server keywords. Implement the handshake with a SOCKS4 proxy server for TCP stream connections. See issue #82. The "SOCKS: A protocol for TCP proxy across firewalls" document can be found at "https://www.openssh.com/txt/socks4.protocol". Please refer to it. [wt: for now connecting to the SOCKS4 proxy over unix sockets is not supported, and mixing IPv4/IPv6 is discouraged; indeed, the control layer is unique for a connection and will be used both for connecting and for target address manipulation. As such it may for example report incorrect destination addresses in logs if the proxy is reached over IPv6]	2019-05-31 17:24:06 +02:00
Olivier Houchard	cfbb3e6560	MEDIUM: tasks: Get rid of active_tasks_mask. Remove the active_tasks_mask variable, we can deduce if we've work to do by other means, and it is costly to maintain. Instead, introduce a new function, thread_has_tasks(), that returns non-zero if there's tasks scheduled for the thread, zero otherwise.	2019-05-29 21:53:37 +02:00
Olivier Houchard	661167d136	BUG/MEDIUM: connection: Use the session to get the origin address if needed. In conn_si_send_proxy(), if we don't have a conn_stream yet, because the mux won't be created until the SSL handshake is done, retrieve the opposite's connection from the session. At this point, we know the session associated with the connection is the one that initiated it, and we can thus just use the session's origin. This should be backported to 1.9.	2019-05-29 17:56:59 +02:00
Willy Tarreau	201840abf1	BUG/MEDIUM: mux-h2: don't refrain from offering oneself a used buffer Usually when calling offer_buffer(), we don't expect to offer it to ourselves. But with h2 we have the same buffer_wait for the two directions so we can unblock the recv path when completing a send(), or we can unblock part of the mux buffer after sending the first few buffers that we managed to collect. Thus it is important to always accept to wake up any requester. A few parts of this patch could possibly be backported but earlier versions already have other issues related to low-buffer condition so it's not sure it's worth taking the risk to make things worse.	2019-05-29 17:54:35 +02:00
Willy Tarreau	7f1265a238	BUG/MEDIUM: mux-h2: fix the conditions to end the h2_send() loop The test for the mux alloc failure in h2_send() right after an attempt at h2_process_mux() used to make sense as it tried to detect that this latter failed to produce data. But now that we have a list of buffers, it is a perfectly valid situation where there can still be data in the buffer(s). So now when we see this flag we only declare it's the last run on the loop. In addition we need to make sure we break out of the loop on snd_buf failure, or we'll loop indefinitely, for example when the buf is full and we can't send. No backport is needed.	2019-05-29 17:54:35 +02:00
Olivier Houchard	58d87f31f7	BUG/MEDIUM: h2: Don't forget to set h2s->cs to NULL after having free'd cs. In h2c_frt_stream_new, if we failed to create the stream for some reason, don't forget to set h2s->cs to NULL before calling h2s_destroy(), otherwise h2s_destroy() will call h2s_close(), which will attempt to access h2s->cs->flags if it's non-NULL. This should be backported to 1.9.	2019-05-29 16:45:13 +02:00
Olivier Houchard	250031e444	MEDIUM: sessions: Introduce session flags. Add session flags, and add a new flag, SESS_FL_PREFER_LAST, to be set when we use NTLM authentication, and we should reuse the last connection. This should fix using NTLM with HTX. This totally replaces TX_PREFER_LAST. This should be backported to 1.9.	2019-05-29 15:41:47 +02:00
Christopher Faulet	1146f975a9	BUG/MEDIUM: mux-h1: Don't skip the TCP splicing when there is no more data to read When there is no more data to read (h1m->curr_len == 0 in the state H1_MSG_DATA), we still call the xprt->rcv_pipe() callback. It is important to update the connection's flags, especially to remove the flag CO_FL_WAIT_ROOM. Otherwise, the pipe remains marked as full, preventing the stream-interface from falling back to rcv_buf(). So the connection may be frozen because no more data is received and the mux H1 remains blocked in the state H1_MSG_DATA. This patch must be backported to 1.9.	2019-05-29 15:32:14 +02:00
Willy Tarreau	1e928c074b	MEDIUM: task: don't grab the WR lock just to check the WQ When profiling locks, it appears that the WQ's lock has become the most contended one, despite the WQ being split by thread. The reason is that each thread takes the WQ lock before checking if it does have something to do. In practice the WQ almost only contains health checks and rare tasks that can be scheduled anywhere, so this is a real waste of resources. This patch proceeds differently. Now that the WQ's lock was turned to RW lock, we proceed in 3 phases : 1) locklessly check for the queue's emptiness 2) take an R lock to retrieve the first element and check if it is expired. This way most visits are performed with an R lock to find and return the next expiration date. 3) if one expiration is found, we perform the WR-locked lookup as usual. As a result, on a one-minute test involving 8 threads and 64 streams at 1.3 million ctxsw/s, before this patch the lock profiler reported this : Stats about Lock TASK_WQ: # write lock : 1125496 # write unlock: 1125496 (0) # wait time for write : 263.143 msec # wait time for write/lock: 233.802 nsec # read lock : 0 # read unlock : 0 (0) # wait time for read : 0.000 msec # wait time for read/lock : 0.000 nsec And after : Stats about Lock TASK_WQ: # write lock : 173 # write unlock: 173 (0) # wait time for write : 0.018 msec # wait time for write/lock: 103.988 nsec # read lock : 1072706 # read unlock : 1072706 (0) # wait time for read : 60.702 msec # wait time for read/lock : 56.588 nsec Thus the contention was divided by 4.3.	2019-05-28 19:15:44 +02:00
Willy Tarreau	ef28dc11e3	MINOR: task: turn the WQ lock to an RW_LOCK For now it's exclusively used as a write lock though, thus it remains 100% equivalent to the spinlock it replaces.	2019-05-28 19:15:44 +02:00
Willy Tarreau	186e96ece0	MEDIUM: buffers: relax the buffer lock a little bit In lock profiles it's visible that there is a huge contention on the buffer lock. The reason is that when offer_buffers() is called, it systematically takes the lock before verifying if there is any waiter. However doing so doesn't protect against races since a waiter can happen just after we release the lock as well. Similarly in h2 we take the lock every time an h2c is going to be released, even without checking that the h2c belongs to a wait list. These two have now been addressed by verifying non-emptiness of the list prior to taking the lock.	2019-05-28 17:25:21 +02:00
Willy Tarreau	a8b2ce02b8	MINOR: activity: report the number of failed pool/buffer allocations Haproxy is designed to be able to continue to run even under very low memory conditions. However this can sometimes have a serious impact on performance that is hard to diagnose. Let's report counters of failed pool and buffer allocations per thread in show activity.	2019-05-28 17:25:21 +02:00
Willy Tarreau	2ae84e445d	MEDIUM: poller: separate the wait time from the wake events We have been abusing the do_poll()'s timeout for a while, making it zero whenever there is some known activity. The problem this poses is that it complicates activity diagnostic by incrementing the poll_exp field for each known activity. It also requires extra computations that could be avoided. This change passes a "wake" argument to say that the poller must not sleep. This simplifies the operations and allows one to differentiate expirations from activity.	2019-05-28 17:25:21 +02:00
Willy Tarreau	d78d08f95b	MINOR: activity: report totals and average separately Some fields need to be averaged instead of summed (e.g. avg_poll_us) when reported on the CLI. Let's have a distinct macro for this.	2019-05-28 17:25:21 +02:00
Willy Tarreau	a0211b864c	MINOR: activity: write totals on the "show activity" output Most of the time we find ourselves adding per-thread fields to observe activity, so let's compute these on the fly and display them. Now the output shows "field: total [ thr0 thr1 ... thrn ]".	2019-05-28 15:16:09 +02:00
Willy Tarreau	0350b90e31	MEDIUM: htx: make htx_add_data() never defragment the buffer Now instead of trying to fit 100% of the input data into the output buffer at the risk of defragmenting it, we put what fits into it only and return the amount of bytes transferred. In a test, compared to the previous commit, it increases the cached data rate from 44 Gbps to 55 Gbps and saves a lot in case of large buffers : with a 1 MB buffer, uncached transfers jumped from 700 Mbps to 30 Gbps.	2019-05-28 14:48:59 +02:00
Willy Tarreau	0a7ef02074	MINOR: htx: make htx_add_data() return the transmitted byte count In order to later allow htx_add_data() to transmit partial blocks and avoid defragmenting the buffer, we'll need to return the number of bytes consumed. This first modification makes the function do this and its callers take this into account. At the moment the function still works atomically so it returns either the block size or zero. However all call places have been adapted to consider any value between zero and the block size.	2019-05-28 14:48:59 +02:00
Willy Tarreau	d4908fa465	MINOR: htx: rename htx_append_blk_value() to htx_add_data_atonce() This function is now dedicated to data blocks, and we'll soon need to access it from outside in a rare few cases. Let's rename it and export it.	2019-05-28 14:48:59 +02:00
Olivier Houchard	692c1d07f9	MINOR: ssl: Don't forget to call the close method of the underlying xprt. In ssl_sock_close(), don't forget to call the underlying xprt's close method if it exists. For now it's harmless not to do so, because the only available layer is the raw socket, which doesn't have a close method, but that will change when we implement QUIC.	2019-05-28 10:08:39 +02:00
Olivier Houchard	19afb274ad	MINOR: ssl: Make sure the underlying xprt's init method doesn't fail. In ssl_sock_init(), when initting the underlying xprt, check the return value, and give up if it fails.	2019-05-28 10:08:28 +02:00
Willy Tarreau	11c90fbd92	BUG/MEDIUM: http: fix "http-request reject" when not final When "http-request reject" was introduced in 1.8 with commit `53275e8b0` ("MINOR: http: implement the "http-request reject" rule"), it was already broken. The code mentions "it always returns ACT_RET_STOP" and obviously a gross copy-paste made it ACT_RET_CONT. If the rule is the last one it properly blocks, but if not the last one it gets ignored, as can be seen with this simple configuration : frontend f1 bind :8011 mode http http-request reject http-request redirect location / This trivial fix must be backported to 1.9 and 1.8. It is tracked by github issue #107.	2019-05-28 08:26:17 +02:00
Christopher Faulet	39744f792d	MINOR: htx: Remove support of pseudo headers because it is unused The code to handle pseudo headers is unused and with no real value. So remove it.	2019-05-28 07:42:33 +02:00
Christopher Faulet	ced39006a2	MINOR: htx: don't rely on htx_find_blk() anymore in the function htx_truncate() The function htx_find_blk() is used by only one function, htx_truncate(). So because this function does nothing very smart, we don't use it anymore. It will be removed by another commit.	2019-05-28 07:42:33 +02:00
Christopher Faulet	0f6d6a9ab6	MINOR: htx: Optimize htx_drain() when all data are drained Instead of looping on the HTX message to drain all data, the message is now reset.	2019-05-28 07:42:33 +02:00
Christopher Faulet	ee847d45d0	MEDIUM: filters/htx: Filter body relatively to the first block The filters filtering HTX body, in the callback http_payload, must now loop on an HTX message starting from the first block position. The offset passed as parameter is relative to this position and not the head one. It is mandatory because once filtered, data are now forwarded using the function channel_htx_fwd_payload(). So the first block position is always updated.	2019-05-28 07:42:33 +02:00
Christopher Faulet	16af60e540	MINOR: proto-htx: Use channel_htx_fwd_all() when unfiltered body are forwarded So the first block position of the HTX message will always be updated accordingly.	2019-05-28 07:42:33 +02:00
Christopher Faulet	8fa60e4613	MINOR: stats/htx: don't use the first block position but the head one Applets must never rely on the first block position to consume an HTX message. The head position must be used instead. For the request it is always the start-line. At this stage, it is not a bug, because the first position of the request is never changed by HTX analysers.	2019-05-28 07:42:33 +02:00
Christopher Faulet	29f1758285	MEDIUM: htx: Store the first block position instead of the start-line one We don't store the start-line position anymore in the HTX message. Instead we store the first block position to analyze. For now, it is almost the same. But once all changes will be made on this part, this position will have to be used by HTX analyzers, and only in the analysis context, to know where the analysis should start. When new blocks are added in an HTX message, if the first block position is not defined, it is set. When the block pointed by it is removed, it is set to the block following it. -1 remains the value to unset the position. The first block position is unset when the HTX message is empty. It may also be unset on a non-empty message, meaning all blocks were already analyzed. From HTX analyzers point of view, this position is always set during headers analysis. When they are waiting for a request or a response, if it is unset, it means the analysis should wait. But once the analysis is started, and as long as headers are not forwarded, it points to the message start-line. As mentioned, outside the HTX analysis, no code must rely on the first block position. So multiplexers and applets must always use the head position to start a loop on an HTX message.	2019-05-28 07:42:33 +02:00
Christopher Faulet	ee1bd4b4f7	MINOR: proto-htx: Use channel_htx_fwd_headers() to forward 1xx responses Instead of doing it by hand, we now call the dedicated function to do so.	2019-05-28 07:42:33 +02:00
Christopher Faulet	17fd8a261f	MINOR: filters/htx: Use channel_htx_fwd_headers() after headers filtering Instead of doing it by hand in the function flt_analyze_http_headers(), we now call the dedicated function to do so.	2019-05-28 07:42:33 +02:00
Christopher Faulet	b75b5eaf26	MEDIUM: htx: 1xx messages are now part of the final responses 1xx informational messages (all except 101) are now part of the HTTP response, semantically speaking. These messages are not followed by an EOM anymore, because a final response is always expected. All these parts can also be transferred to the channel at the same time, if possible. The HTX response analyzer has been updated to forward them in loop, as the legacy one.	2019-05-28 07:42:30 +02:00
Christopher Faulet	a61e97bcae	MINOR: htx: Be sure to xfer all headers in one time in htx_xfer_blks() In the function htx_xfer_blks(), we take care to transfer all headers in one time. When the current block is a start-line, we check if there is enough space to transfer all headers too. If not, and if the destination is empty, a parsing error is reported on the source. The H2 multiplexer is the only one to use this function. When a parsing error is reported during the transfer, the flag CS_FL_EOI is also set on the conn_stream.	2019-05-28 07:42:12 +02:00
Christopher Faulet	a39d8ad086	MINOR: mux-h1: Set hdrs_bytes on the SL when an HTX message is produced	2019-05-28 07:42:12 +02:00
Christopher Faulet	33543e73a2	MINOR: h2/htx: Set hdrs_bytes on the SL when an HTX message is produced	2019-05-28 07:42:12 +02:00
Christopher Faulet	05c083ca8d	MINOR: htx: Add a field to set the memory used by headers in the HTX start-line The field hdrs_bytes has been added in the structure htx_sl. It should be used to set how many bytes are held by all headers, from the start-line to the corresponding EOH block. It must be set to -1 if it is unknown.	2019-05-28 07:42:12 +02:00
Christopher Faulet	2f6edc84a8	MINOR: mux-h2/htx: Support zero-copy when possible in h2_rcv_buf() If the channel's buffer is empty and the message is small enough, we can swap the H2S buffer with the channel one.	2019-05-28 07:42:12 +02:00
Christopher Faulet	9cdd5036f3	MINOR: stream-int: Don't use the flag CO_RFL_KEEP_RSV anymore in si_cs_recv() Because channel_recv_max() always returns the right value, for HTX and legacy streams, we don't need to set this flag. The multiplexer doesn't use it anymore.	2019-05-28 07:42:12 +02:00
Christopher Faulet	8a9ad4c0e8	MINOR: mux-h2: Use the count value received from the SI in h2_rcv_buf() Now, the SI calls h2_rcv_buf() with the right count value. So we can rely on it. Unlike the H1 multiplexer, it is fairly easier for the H2 multiplexer because the HTX message already exists, we only transfer blocks from the H2S to the channel. And this part is handled by htx_xfer_blks().	2019-05-28 07:42:12 +02:00
Christopher Faulet	30db3d737b	MEDIUM: mux-h1: Use the count value received from the SI in h1_rcv_buf() Now, the SI calls h1_rcv_buf() with the right count value. So we can rely on it. During the parsing, we now really respect this value to be sure to never exceed it. To do so, once headers are parsed, we should estimate the size of the HTX message before copying data.	2019-05-28 07:42:12 +02:00
Christopher Faulet	156852b613	BUG/MINOR: htx: Change htx_xfer_blk() to also count metadata This patch makes the function more accurate. Thanks to the function htx_get_max_blksz(), the transfer of data has been simplified. Note that now the total number of bytes copied (metadata + payload) is returned. This slightly changes how the function is used in the H2 multiplexer.	2019-05-28 07:42:12 +02:00
Christopher Faulet	a3f1550dfa	MEDIUM: http/htx: Perform analysis relatively to the first block The first block is the start-line, if defined. Otherwise it is the head of the HTX message. So now, during HTTP analysis, lookups are all done using the first block instead of the head. Concretely, for now, it is the same because only one HTTP message is stored at a time in an HTX message. 1xx informational messages are handled separately from the final response and from each other. But it will make sense when the 1xx informational messages and the associated final response will be stored in the same HTX message.	2019-05-28 07:42:12 +02:00
Christopher Faulet	7b7d507a5b	MINOR: http/htx: Use sl_pos directly to replace the start-line Since the HTX start-line is now referenced by position instead of by its payload address, it is fairly easier to replace it. No need to search for the right block to find the start-line by comparing the payload addresses. It is just enough to get the block at the position sl_pos.	2019-05-28 07:42:12 +02:00
Christopher Faulet	297fbb45fe	MINOR: htx: Replace the function http_find_stline() by http_get_stline() Now, we only return the start-line. If not found, NULL is returned. No lookup is performed and the HTX message is no more updated. It is now the caller responsibility to update the position of the start-line to the right value. So when it is not found, i.e. sl_pos is set to -1, it means the last start-line has been already processed and the next one has not been inserted yet. It is mandatory to rely on this kind of warranty to store 1xx informational responses and final response in the same HTX message.	2019-05-28 07:42:12 +02:00
Christopher Faulet	b77a1d26a4	MINOR: mux-h2/htx: Get the start-line from the head when HEADERS frame is built In the H2 multiplexer, when a HEADERS frame is built before sending it, we have the warranty the start-line is the head of the HTX message. It is safer to rely on this fact than on the sl_pos value. For now, it's safe to use sl_pos in muxes because HTTP 1xx messages are considered as full messages in HTX and only one HTTP message can be stored at a time in HTX. But we are trying to handle 1xx messages as a part of the response message. In this way, an HTTP response will be the sum of all 1xx informational messages followed by the final response. So it will be possible to have several start-lines in the same HTX message. And the sl_pos will point to the first unprocessed start-line from the analyzers point of view.	2019-05-28 07:42:12 +02:00
Christopher Faulet	9c66b980fa	MINOR: htx: Store start-line block's position instead of address of its payload Nothing much to say. This change is just mandatory to consider 1xx informational messages as part of a response.	2019-05-28 07:42:12 +02:00
Christopher Faulet	28f29c7eea	MINOR: htx: Store the head position instead of the wrap one The head of an HTX message is heavily used whereas the wrap position is only used when a block is added or removed. So it is more logical to store the head position in the HTX message instead of the wrap one. The wrap position can be easily deduced. To get it, the new function htx_get_wrap() may be used.	2019-05-28 07:42:12 +02:00
Christopher Faulet	429b91d308	MINOR: htx: Remove the macro IS_HTX_SMP() and always use IS_HTX_STRM() instead The macro IS_HTX_SMP() is only used at a place, in a context where the stream always exists. So, we can remove it to use IS_HTX_STRM() instead.	2019-05-28 07:42:12 +02:00
Willy Tarreau	b01302f9ac	MEDIUM: config: now alert when two servers have the same name We've been emitting warnings for over 5 years (since 1.5-dev22) about configs accidentally carrying multiple servers with the same name in the same backend, and this starts to cause some real trouble in dynamic environments since it's still very difficult to accurately process a state-file and we still can't transport a server's name over the peers protocol because of this. It's about time to force users to fix their configs if they still hadn't given that there is zero technical justification for doing this, beyond the "yyp" (or copy-paste accident) when editing the config. The message remains as clear as before, indicating the file and lines of the conflict so that the user can easily fix it.	2019-05-27 19:31:06 +02:00
Willy Tarreau	c3b5958255	BUG/MEDIUM: threads: fix double-word CAS on non-optimized 32-bit platforms On armv7 haproxy doesn't work because of the fixes on the double-word CAS. There are two issues. The first one is that the last argument in case of dwcas is a pointer to the set of values and not a value ; the second is that it's not enough to cast the data as (void*) since it will be a single word. Let's fix this by using the pointers as an array of long. This was tested on i386, armv7, x86_64 and aarch64 and it is now fine. An alternate approach using a struct was attempted as well but it used to produce less optimal code. This fix must be backported to 1.9. This fixes github issue #105. Cc: Olivier Houchard <ohouchard@haproxy.com>	2019-05-27 17:40:59 +02:00
Willy Tarreau	bff005ae58	BUG/MEDIUM: queue: fix the tree walk in pendconn_redistribute. In pendconn_redistribute() we scan the queue using eb32_next() on the node we've just deleted, which is wrong since the node is not in the tree anymore, and it could dereference one node that has already been released by another thread. Note that we cannot use eb32_first() in the loop here instead because we need to skip pendconns having SF_FORCE_PRST. Instead, let's keep a copy of the next node before deleting it. In addition, the pendconn retrieved there is wrong, it uses &node as the pointer instead of node, resulting in very quick crashes when the server list is scanned. Fortunately this only happens when "option redispatch" is used in conjunction with "maxconn" on server lines, "cookie" for the stickiness, and when a server goes down with entries in its queue. This bug was introduced by commit `0355dabd7` ("MINOR: queue: replace the linked list with a tree") so the fix must be backported to 1.9.	2019-05-27 10:29:59 +02:00
Willy Tarreau	b6195ef2a6	BUG/MAJOR: lb/threads: make sure the avoided server is not full on second pass In fwrr_get_next_server(), we optionally pass a server to avoid. It usually points to the current server during a redispatch operation. If this server is usable, an "avoided" pointer is set and we continue to look for another server. If in the end no other server is found, then we fall back to this avoided one, which is still better than nothing. The problem that may arise with threads is that in the mean time, this avoided server might have received extra connections and might not be usable anymore. This causes it to be queued a second time in the "full" list and the loop to search for a server again, ending up on this one again and so on. This patch makes sure that we break out of the loop when we have to pick the avoided server. It's probably what the code intended to do as the current break statement causes fwrr_update_position() and fwrr_dequeue_srv() to be called again on the avoided server. It must be backported to 1.9 and 1.8, and seems appropriate for older versions though it's unclear what the impact of this bug might be there since the race doesn't exist and we're left with the double update of the server's position.	2019-05-27 10:29:59 +02:00
Willy Tarreau	d6a7850200	MINOR: cli/activity: add 3 general purpose counters in development mode The unused fd_del and fd_skip were being abused during debugging sessions as general purpose event counters. With their removal, let's officially have dedicated counters for such use cases. These counters are called "ctr0".."ctr2" and are listed at the end when DEBUG_DEV is set.	2019-05-27 07:03:38 +02:00
Willy Tarreau	394c9b4215	MINOR: cli/activity: remove "fd_del" and "fd_skip" from show activity These variables are never set anymore and were always reported as zero.	2019-05-27 06:59:14 +02:00
Ilya Shipitsin	0590f44254	BUILD: ssl: fix latest LibreSSL reg-test error starting with OpenSSL 1.0.0 recommended way to disable compression is using SSL_OP_NO_COMPRESSION when creating context. manipulations with SSL_COMP_get_compression_methods, sk_SSL_COMP_num are only required for OpenSSL < 1.0.0	2019-05-26 21:26:02 +02:00
Willy Tarreau	08e2b41e81	BUILD: connections: shut up gcc about impossible out-of-bounds warning Since commit `88698d9` ("MEDIUM: connections: Add a way to control the number of idling connections.") when building without threads, gcc complains that the operations made on the idle_orphan_conns[] list is out of bounds, which is always false since 1) <i> can only equal zero, and 2) given it's equal to <tid> we never even enter the loop. But as usual it thinks it knows better, so let's mask the origin of this <i> value to shut it up. Another solution consists in making <i> unsigned and adding an explicit range check.	2019-05-26 11:54:20 +02:00

7942 Commits