haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-09 16:47:18 +02:00

Author	SHA1	Message	Date
Christopher Faulet	78fbb9f991	MEDIUM: fcgi-app: Add FCGI application and filter The FCGI application handles all the configuration parameters used to format requests sent to an application. The configuration of an application is grouped in a dedicated section (fcgi-app <name>) and referenced in a backend to be used (use-fcgi-app <name>). To be valid, a FCGI application must at least define a document root. But it is also possible to set the default index, a regex to split the script name and the path-info from the request URI, parameters to set or unset... In addition, this patch also adds a FCGI filter, responsible for all processing on a stream.	2019-09-17 10:18:54 +02:00
Christopher Faulet	63bbf284a1	MINOR: fcgi: Add code related to FCGI protocol This code is independent and is only responsible for encoding and decoding part of the FCGI protocol.	2019-09-17 10:18:54 +02:00
Christopher Faulet	4f0f88a9d0	MEDIUM: mux-h1/h1-htx: move HTX conversion of H1 messages in dedicated file To avoid code duplication in the future mux FCGI, functions parsing H1 messages and converting them into HTX have been moved in the file h1_htx.c. Some specific parts remain in the mux H1. But most of the parsing is now generic.	2019-09-17 10:18:54 +02:00
Christopher Faulet	341fac1eb2	MINOR: http: Add function to parse value of the header Status It will be used by the mux FCGI to get the status of a response.	2019-09-17 10:18:54 +02:00
Christopher Faulet	5c6fefc8eb	MINOR: log: Provide a function to emit a log for an application Application is a generic term here. It is a module which handles its own log server list, with no dependency on a proxy. Such applications can now call the function app_log() to log messages, passing a log server list and a tag as parameters. Internally, the function __send_log() has been adapted accordingly.	2019-09-17 10:18:54 +02:00
Christopher Faulet	130cf21709	MINOR: istbuf: Add the function b_isteqi() This function compares a part of a buffer to an indirect string (ist), ignoring the case of the characters.	2019-09-17 10:18:54 +02:00
Christopher Faulet	c16929658f	MINOR: config: Support per-proxy and per-server post-check functions callbacks Most of the time, when a keyword is added in proxy section or on the server line, we need to have a post-parser callback to check the config validity for the proxy or the server which uses this keyword. It is possible to register a global post-parser callback. But all these callbacks need to loop on the proxies and servers to do their job. It is neither handy nor efficient. Instead, it is now possible to register per-proxy and per-server post-check callbacks.	2019-09-17 10:18:54 +02:00
Christopher Faulet	3ea5cbe6a4	MINOR: config: Support per-proxy and per-server deinit functions callbacks Most of the time, when any allocation is done during configuration parsing because of a new keyword in proxy section or on the server line, we must add a call in the deinit() function to release allocated resources. It is now possible to register a post-deinit callback because, at this stage, the proxies and the servers are already released. Now, it is possible to register deinit callbacks per-proxy or per-server. These callbacks will be called for each proxy and server before releasing them.	2019-09-17 10:18:54 +02:00
Christopher Faulet	e3d2a877fb	MINOR: http-ana: Remove err_state field from http_msg This field is not used anymore. In addition, the state HTTP_MSG_ERROR is now only used when an error occurred during the body forward.	2019-09-17 10:18:54 +02:00
Christopher Faulet	505adfca51	MINOR: htx: Add a flag on HTX message to report processing errors This new flag may be used to report unexpected error because of not well formatted HTX messages (not related to a parsing error) or our incapacity to handle the processing because we reach a limit (resource exhaustion, too big headers...). It should result in an error 500 returned to the client when applicable.	2019-09-17 10:18:54 +02:00
Christopher Faulet	6338a08c34	MINOR: stats: Add JSON export from the stats page It is now possible to export stats using the JSON format from the HTTP stats page. Like for the CSV export, to export stats in JSON, you must add the option ";json" on the stats URL. It is also possible to dump the JSON schema with the option ";json-schema". Corresponding Links have been added on the HTML page. This patch fixes the issue #263.	2019-09-10 10:29:54 +02:00
Willy Tarreau	f21d17bbe8	MINOR: stats: report the number of idle connections for each server This adds two extra fields to the stats, one for the current number of idle connections and one for the configured limit. A tooltip link now appears on the HTML page to show these values in front of the active connection values. This should be backported to 2.0 and 1.9 as it's the only way to monitor the idle connections behaviour.	2019-09-08 09:30:50 +02:00
Willy Tarreau	4cae3bf631	BUG/MEDIUM: connection: don't keep more idle connections than ever needed When using "http-reuse safe", which is the default, a new incoming connection does not automatically reuse an existing connection for the first request, as we don't want to risk to lose the contents if we know the client will not be able to replay the request. A side effect to this is that when dealing with mostly http-close traffic, the reuse rate is extremely low and we keep accumulating server-side connections that may even never be reused. At some point we're limited to a ratio of file descriptors, but when the system is configured with very high FD limits, we can still reach the limit of outgoing source ports and make the system significantly slow down trying to find an available port for outgoing connections. A simple test on my laptop with ulimit 100000 and with the following config results in the load immediately dropping after a few seconds : listen l1 bind :4445 mode http server s1 127.0.0.1:8000 As can be seen, the load falls from 38k cps to 400 cps during the first 200ms (in fact when the source port table is full and connect() takes ages to find a spare port for a new connection): $ injectl464 -p 4 -o 1 -u 10 -G 127.0.0.1:4445/ -F -c -w 100 hits ^hits hits/s ^h/s bytes kB/s last errs tout htime sdht ptime 2439 2439 39338 39338 356094 5743 5743 0 0 0.4 0.5 0.4 7637 5198 38185 37666 1115002 5575 5499 0 0 0.7 0.5 0.7 7719 82 25730 820 1127002 3756 120 0 0 21.8 18.8 21.8 7797 78 19492 780 1138446 2846 114 0 0 61.4 2.5 61.4 7877 80 15754 800 1150182 2300 117 0 0 58.6 0.5 58.6 7920 43 13200 430 1156488 1927 63 0 0 58.9 0.3 58.9 At this point, lots of connections are indeed in use, for only 10 connections on the frontend side: $ ss -ant state established | wc -l 39022 This patch makes sure we never keep more idle connections than we've ever had outstanding requests on a server.
This way the total number of idle connections will never exceed the sum of maximum connections. Thus highly loaded servers will be able to get many connections and slightly loaded servers will keep less. Ideally we should apply similar limits per process and the per backend, but in practice this already addresses the issues pretty well: $ injectl464 -p 4 -o 1 -u 10 -G 127.0.0.1:4445/ -F -c -w 100 hits ^hits hits/s ^h/s bytes kB/s last errs tout htime sdht ptime 4423 4423 40209 40209 645758 5870 5870 0 0 0.2 0.4 0.2 8020 3597 40100 39966 1170920 5854 5835 0 0 0.2 0.4 0.2 12037 4017 40123 40170 1757402 5858 5864 0 0 0.2 0.4 0.2 16069 4032 40172 40320 2346074 5865 5886 0 0 0.2 0.4 0.2 20047 3978 40013 39386 2926862 5842 5750 0 0 0.3 0.4 0.3 24005 3958 40008 39979 3504730 5841 5837 0 0 0.2 0.4 0.2 $ ss -ant state established | wc -l 234 This patch must be backported to 2.0. It could be useful in 1.9 as well even though pools and reuse are not enabled by default there.	2019-09-08 09:30:50 +02:00
Willy Tarreau	6b3089856f	MEDIUM: fd: do not use the FD_POLL_* flags in the pollers anymore As mentioned in previous commit, these flags do not map well to modern poller capabilities. Let's use the FD_EV_*_{R,W} flags instead. This first patch only performs a 1-to-1 mapping making sure that the previously reported flags are still reported identically while using the closest possible semantics in the pollers. It's worth noting that kqueue will now support improvements such as returning distinctions between shut and errors on each direction, though this is not exploited for now.	2019-09-06 19:09:56 +02:00
Willy Tarreau	77abb43ed1	MINOR: fd: add two flags ERR and SHUT to describe FD states There's currently a big ambiguity on our use of POLLHUP because we currently map POLLHUP and POLLRDHUP to FD_POLL_HUP. The first one indicates a close in both directions while the second one indicates a unidirectional close. Since we don't know from the resulting flag we always have to read when reported. Furthermore kqueue only reports unidirectional responses which are mapped to FD_POLL_HUP as well, and their write closes are mapped to a general error. We could add a new FD_POLL_RDHUP flag to improve the mapping, or switch only to the POLL* flags, but that further complicates the portability for operating systems like FreeBSD which do not have POLLRDHUP but have its semantics. Let's instead directly use the per-direction flag values we already have, and it will be a first step in the direction of finer states. Thus we introduce an ERR and a SHUT status for each direction, that the pollers will be able to compute and pass to fd_update_events(). It's worth noting that FD_EV_STATUS already sees the two new flags, but they are harmless since used only by fd_{recv,send}_state() which are never called. Thus in its current state this patch must be totally transparent.	2019-09-06 18:33:07 +02:00
Willy Tarreau	8f2825f3ab	MINOR: fd: add two new calls fd_cond_{recv,send}() These two functions are used to enable recv/send but only if the FD is not marked as active yet. The purpose is to conditionally mark them as tentatively usable without interfering with the polling if polling was already enabled, when it's supposed to be likely true.	2019-09-06 17:50:36 +02:00
Willy Tarreau	4ac9d064d2	MEDIUM: fd: mark the FD as ready when it's inserted Given that all our I/Os are now directed from top to bottom and not the opposite way around, and the FD cache was removed, it doesn't make sense anymore to create FDs that are marked not ready since this would prevent the first accesses unless the caller explicitly does an fd_may_recv() which is not expected to be its job (which conn_ctrl_init() has to do by the way). Let's move this into fd_insert() instead, and have a single atomic operation for both directions via fd_may_both().	2019-09-06 17:50:36 +02:00
Willy Tarreau	dbe3060e81	MINOR: fd: make updt_fd_polling() a normal function It's called from many places, better use a real function than an inline.	2019-09-05 09:31:18 +02:00
Willy Tarreau	f8ecc7f667	MEDIUM: fd: simplify the fd_*_{recv,send} functions using BTS/BTR Now that we don't have to update FD_EV_POLLED_* at the same time as FD_EV_ACTIVE_*, we don't need to use a CAS anymore, a bit-test-and-set operation is enough. Doing so reduces the code size by a bit more than 1 kB. One function was special, fd_done_recv(), whose comments and doc were inaccurate for the part related to the lack of polling.	2019-09-05 09:31:18 +02:00
Willy Tarreau	5bee3e2f47	MEDIUM: fd: remove the FD_EV_POLLED status bit Since commit `7ac0e35f2` in 1.9-dev1 ("MAJOR: fd: compute the new fd polling state out of the fd lock") we've started to update the FD POLLED bit a bit more aggressively. Lately with the removal of the FD cache, this bit is always equal to the ACTIVE bit. There's no point continuing to watch it and update it anymore, all it does is create confusion and complicate the code. One interesting side effect is that it now becomes visible that all fd_*_{send,recv}() operations systematically call updt_fd_polling(), except fd_cant_recv()/fd_cant_send() which never saw it change.	2019-09-05 09:31:18 +02:00
Willy Tarreau	c046d167e4	MEDIUM: log: add support for logging to a ring buffer Now by prefixing a log server with "ring@<name>" it's possible to send the logs to a ring buffer. One nice thing is that it allows multiple sessions to consult the logs in real time in parallel over the CLI, and without requiring file system access. At the moment, ring0 is created as a default sink for tracing purposes and is available. No option is provided to create new rings though this is trivial to add to the global section.	2019-08-30 15:24:59 +02:00
Willy Tarreau	f3dc30f6de	MINOR: log: add a target type instead of hacking the address family Instead of detecting an AF_UNSPEC address family for a log server and to deduce a file descriptor, let's create a target type field and explicitly mention that the socket is of type FD.	2019-08-30 15:07:25 +02:00
Willy Tarreau	d660990cee	MINOR: fd: add a new "initialized" bit in the fdtab struct The purpose is to be able to remember that initialization was already done for a file descriptor. This will allow to get rid of some dirty hacks performed in the logs or fd sinks where the init state of the fd has to be guessed.	2019-08-30 15:07:25 +02:00
Willy Tarreau	76913d3ef4	CLEANUP: fd: remove leftovers of the fdcache The "cache" entry was still present in the fdtab struct and it was reported in "show sess". Removing it broke the cache-line alignment on 64-bit machines which is important for threads, so it was fixed by adding an attribute(aligned()) when threads are in use. Doing it only in this case allows 32-bit thread-less platforms to see the struct fit into 32 bytes.	2019-08-30 15:07:25 +02:00
Willy Tarreau	1d181e489c	MEDIUM: ring: implement a wait mode for watchers Now it is possible for a reader to subscribe and wait for new events sent to a ring buffer. When new events are written to a ring buffer, the applets that are subscribed are woken up to display new events. For now we only support this with the CLI applet called by "show events" since the I/O handler is indeed a CLI I/O handler. But it's not complicated to add other mechanisms to consume events and forward them to external log servers for example. The wait mode is enabled by adding "-w" after "show events <sink>". An extra "-n" was added to directly seek to new events only.	2019-08-30 11:58:58 +02:00
Willy Tarreau	300decc8d9	MINOR: cli: extend the CLI context with a list and two offsets Some CLI parsers are currently abusing the CLI context types such as pointers to stuff longs into them by lack of room. But the context is 80 bytes while cli is only 48, thus there's some room left. This patch adds a list element and two size_t usable as various offsets. The list element is initialized.	2019-08-30 11:58:58 +02:00
Willy Tarreau	370a694879	MINOR: trace: change the detail_level to per-source verbosity The detail level initially based on syslog levels is not used, while something related is missing, trace verbosity, to indicate whether or not we want to call the decoding callback and what level of decoding we want (raw captures etc). Let's change the field to "verbosity" for this. A verbosity of zero means that the decoding callback is not called, and all other levels are handled by this callback and are source-specific. The source is now prompted to list the levels that are proposed to the user. When the source doesn't define anything, "quiet" and "default" are available.	2019-08-29 17:11:25 +02:00
Willy Tarreau	09fb0df6fd	MINOR: trace: prepend the function name for developer level traces Working on adding traces to mux-h2 revealed that the function names are manually copied a lot in developer traces. The reason is that they are not preprocessor macros and as such cannot be concatenated. Let's slightly adjust the trace() function call to take a function name just after the file:line argument. This argument is only added for the TRACE_DEVEL and 3 new TRACE_ENTER, TRACE_LEAVE, and TRACE_POINT macros and left NULL for others. This way the function name is only reported for traces aimed at the developers. The pretty-print callback was also extended to benefit from this. This will also significantly shrink the data segment as the "entering" and "leaving" strings will now be merged. One technical point worth mentioning is that the function name is not passed as an ist to the inline function because it's not considered as a builtin constant by the compiler, and would lead to strlen() being run on it from all call places before calling the inline function. Thus instead we pass the const char * (that the compiler knows where to find) and it's the __trace() function that converts it to an ist for internal consumption and for the pretty-print callback. Doing this avoids losing 5-10% peak performance.	2019-08-29 17:09:13 +02:00
Willy Tarreau	2ea549bc43	MINOR: trace: change the "payload" level to "data" and move it The "payload" trace level was ambiguous because its initial purpose was to be able to dump received data. But it doesn't make sense to force to report data transfers just to be able to report state changes. For example, all snd_buf()/rcv_buf() operations coming from the application layer should be tagged at this level. So here we move this payload level above the state transitions and rename it to avoid the ambiguity making one think it's only about request/response payload. Now it clearly is about any data transfer and is thus just below the developer level. The help messages on the CLI and the doc were slightly reworded to help remove this ambiguity.	2019-08-29 10:46:11 +02:00
Willy Tarreau	be5a288424	MINOR: trace: replace struct trace_lockon_args with struct name_desc No need for a specific struct anymore, name_desc suits us.	2019-08-29 09:34:53 +02:00
Willy Tarreau	fb4ba91ac1	MINOR: tools: add a generic struct "name_desc" for name-description pairs In prompts on the CLI we now commonly need to propose a keyword name and a description and it doesn't make sense to define a new struct for each such pairs. Let's simply have a generic "name_desc" for this.	2019-08-29 09:34:53 +02:00
Geoff Simmons	7185b789f9	MINOR: connection: add the fc_pp_authority fetch -- authority TLV, from PROXYv2 Save the authority TLV in a PROXYv2 header from the client connection, if present, and make it available as fc_pp_authority. The fetch can be used, for example, to set the SNI for a backend TLS connection.	2019-08-28 17:16:20 +02:00
Willy Tarreau	c326ecc9b1	MINOR: trace: change the TRACE() calling convention to put the args and cb last Previously the callback was almost mandatory so it made sense to have it before the message. Now that it can default to the one declared in the trace source, most TRACE() calls contain series of empty args and callbacks, which make them suitable for being at the end and being totally omitted. This patch thus reverses the TRACE arguments so that the message appears first, then the mask, then arg1..arg4, then the callback. In practice we'll mostly see 1 arg, or 2 args and nothing else, and it will not be needed anymore to pass long series of commas in the middle of the arguments. However if a source is enforced, the empty commas will still be needed for all omitted arguments.	2019-08-28 10:39:43 +02:00
Willy Tarreau	3da0026d25	MINOR: trace: support a default callback for the source It becomes apparent that most traces will use a single trace pretty print callback, so let's allow the trace source to declare a default one so that it can be omitted from trace calls, and will be used if no other one is specified.	2019-08-28 07:06:23 +02:00
Willy Tarreau	8f24023ba0	MINOR: sink: now report the number of dropped events on output The principle is that when emitting a message, if some dropped events were logged, we first attempt to report this counter before going further. This is done under an exclusive lock while all logs are produced under a shared lock. This ensures that the dropped line is accurately reported and doesn't accidentally arrive after a later event.	2019-08-27 17:14:19 +02:00
Willy Tarreau	4ed23ca0e7	MINOR: sink: add support for ring buffers This now provides sink_new_buf() which allocates a ring buffer. One such ring ("buf0") of 1 MB is created already, and may be used by sink_write(). The sink's creation should probably be moved somewhere else later.	2019-08-27 17:14:19 +02:00
Willy Tarreau	072931cdcb	MINOR: ring: add a generic CLI io_handler to dump a ring buffer The three functions (attach, IO handler, and release) are meant to be called by any CLI command which requires to dump the contents of a ring buffer. We do not implement anything generic to dump any ring buffer on the CLI since it's meant to be used by other functionalities above. However these functions deal with locking and everything so it's trivial to embed them in other code.	2019-08-27 17:14:19 +02:00
Willy Tarreau	be97853c2f	MINOR: ring: add a ring_write() function This function tries to write to the ring buffer, possibly removing enough old messages to make room for the new one. It takes two arrays of fragments on input to ease the insertion of prefixes by the caller. It atomically writes the message, possibly truncating it if desired, and returns the operation's status.	2019-08-27 17:14:19 +02:00
Willy Tarreau	172945fbad	MINOR: ring: add a new mechanism for retrieving/storing ring data in buffers Our circular buffers are well suited for being used as ring buffers for not-so-structured data. The mechanism here consists in making room in a buffer before inserting a new record which is prefixed by its size, and looking up next record based on the previous one's offset and size. We can have up to 255 consumers watching for data (dump in progress, tail) which guarantee that entries are not recycled while they're being dumped. The complete representation is described in the header file. For now only ring_new(), ring_resize() and ring_free() are created.	2019-08-27 17:14:19 +02:00
Willy Tarreau	931d8b79a8	MINOR: fd: add fd_write_frag_line() to send a fragmented line to an fd Currently both logs and event sinks may use a file descriptor to atomically emit some output contents. The two may use the same FD though nothing is done to make sure they use the same lock. Also there is quite some redundancy between the two. Better make a specific function to send a fragmented message to a file descriptor which will take care of the locking via the fd's lock. The function is also able to truncate a message and to enforce addition of a trailing LF when building the output message.	2019-08-27 17:14:19 +02:00
Willy Tarreau	b88d231773	MINOR: buffer: add functions to read/write varints from/to buffers The new functions are : __b_put_varint() : inserts a varint when it's known that it fits b_put_varint() : tries to insert a varint at the tail b_get_varint() : tries to get a varint from the head b_peek_varint() : tries to peek a varint at a specific offset Wrapping is supported so that they are expected to be safe to use to manipulate varints with buffers anywhere.	2019-08-27 17:14:19 +02:00
Willy Tarreau	4d589e719b	MINOR: tools: add a function varint_bytes() to report the size of a varint It will sometimes be useful to encode varints to know the output size in advance. Two versions are provided, one inline using a switch/case construct which will be trivial for use with constants (and will be very fast albeit huge) and one function iterating on the number which is 5 times smaller, for use with variables.	2019-08-27 17:14:19 +02:00
Willy Tarreau	e40f274878	BUILD: trace: make the lockon_ptr const to silence a warning without threads I forgot to fix this one before pushing, despite my tests. lockon_ptr is only used to compare pointers, it doesn't need to point to a writable location. Without threads the atomic store is turned into an assignment and rightfully complains.	2019-08-22 20:26:28 +02:00
Willy Tarreau	c14eea49e6	MINOR: trace: add the possibility to lock on some arguments Given that we can pass typed arguments to the trace() function, let's add provisions for tracking them. They are source-specific so we need to let the source fill their name and description. Only those with a non-null name will be proposed.	2019-08-22 20:21:00 +02:00
Willy Tarreau	17a51c64b5	MINOR: trace: add a definition of typed arguments to trace() With a few macros it's possible for a trace source to commit to only using a certain type for a given argument (or set of). This will be particularly useful to let the trace subsystem retrieve some precious information such as a connection, session, listener, source address or so, and enable/disable filtering and/or locking.	2019-08-22 20:21:00 +02:00
Willy Tarreau	4ab242136d	MINOR: trace: add per-level macros to produce traces The new TRACE_<level>() macros take a mask, 4 args, a callback and a static message. From this they also inherit the TRACE_SOURCE macro from the caller, which contains the pointer to the trace source (so that it's not required to paste it everywhere), and an ist string is also made by the concatenation of the file name and the line number. This uses string concatenation by the preprocessor, and turns it into an ist by the compiler so that there is no operation at all to perform to adjust the data length as the compiler knows where to cut during the optimization phase. Last, the message is also automatically turned into an ist so that it's trivial to put it into an iovec without having to run strlen() on it. All arguments and the callback may be empty and will then automatically be replaced with a NULL pointer. This makes the TRACE calls slightly lighter especially since arguments are not always used. Several other options were considered to use variadic macros but there's no outstanding rule that justifies to place an argument before another one, and it still looks convenient to have the message be the last one to encourage copy- pasting of the trace statements. A generic TRACE() macro takes TRACE_LEVEL in from the source file as the trace level instead of taking it from its name. This may slightly simplify the production of traces that always run at the same level (internal core parts may probably only be called at developer level).	2019-08-22 20:21:00 +02:00
Willy Tarreau	bfd14fc6eb	MINOR: trace: implement a call to a decode function The trace() call will support an optional decoding callback and 4 arguments that this function is supposed to know how to use to provide extra information. The output remains unchanged when the function is NULL. Otherwise, the message is pre-filled into the thread-local trace_buf, and the function is called with all arguments so that it completes the buffer in a readable form depending on the expected level of detail.	2019-08-22 20:21:00 +02:00
Willy Tarreau	5da408818b	MINOR: trace: make trace() now also take a level in argument This new "level" argument will allow the trace sources to label the traces for different purposes, and filter out some of them if they are not relevant to the current target. Right now we have 5 different levels: - USER : the least verbose one, only a few functional information - PAYLOAD: like user but also displays some payload-related information - PROTO: focuses on the protocol's framing - STATE: also indicate state internal transitions or non-transitions - DEVELOPER: adds extra info about branches taken in the code (break points, return points)	2019-08-22 20:21:00 +02:00
Willy Tarreau	419bd49f0b	MINOR: trace: add the file name and line number in the prefix We now pass an extra argument "where" to the trace() call, which is supposed to be an ist made of the concatenation of the filename and the line number. We only keep the last 10 chars from this string since the end of file names is most often easy to recognize. This gives developers useful information at very low cost.	2019-08-22 20:21:00 +02:00
Willy Tarreau	4c2ae48375	MINOR: trace: implement a very basic trace() function For now it remains quite basic. It performs a few state checks, calls the source's sink if defined, and performs the transitions between RUNNING, STOPPED and WAITING when the configured events match.	2019-08-22 20:21:00 +02:00
Willy Tarreau	864e880f6c	MINOR: trace/cli: register the "trace" CLI keyword to list the sources For now it lists the sources if one is not provided, and checks for the source's existence. It lists the events if not provided, checks for their existence if provided, and adjusts reported events/start/stop/pause events, and performs state transitions. It lists sinks and adjusts them as well. Filters, lock, and level are not implemented yet.	2019-08-22 20:21:00 +02:00
Willy Tarreau	88ebd4050e	MINOR: trace: add allocation of buffer-sized trace buffers This will be needed so that we can implement protocol decoders which will have to emit their contents into such a buffer.	2019-08-22 20:21:00 +02:00
Willy Tarreau	4151c753fc	MINOR: trace: start to create a new trace subsystem The principle of this subsystem will be to support taking live traces at various places in the code with conditional triggers, filters, and ability to lock on some elements. The traces will support typed events and will be sent into sinks made of ring buffers, file descriptors or remote servers.	2019-08-22 20:21:00 +02:00
Willy Tarreau	973e662fe8	MINOR: sink: add a support for file descriptors This is the most basic type of sink. It pre-registers "stdout" and "stderr", and is able to use writev() on them. The writev() operation is locked to avoid mixing outputs. It's likely that the registration should move somewhere else to take into account the fact that stdout and stderr are still opened or are closed.	2019-08-22 20:21:00 +02:00
Willy Tarreau	67b5a161b4	MINOR: sink: create definitions and minimal code for event sinks The principle will be to be able to dispatch events to various destinations called "sinks". This is already done in part in logs where log servers can be either a UDP socket or a file descriptor. This will be needed with the new trace subsystem where we may also want to add ring buffers. And it turns out that all such destinations make sense at all places. Logs may need to be sent to a TCP server via a ring buffer, or consulted from the CLI. Trace events may need to be sent to stdout/stderr as well as to remote log servers. This patch creates a new structure "sink" aiming at addressing these similar needs. The goal is to merge together what is common to all of them, such as the output format, the dropped events count, etc, and also keep separately the target identification (network address, file descriptor). Provisions were made to have a "waiter" on the sink. For a TCP log server it will be the task to wake up after writing to the log buffer. For a ring buffer, it could be the list of watchers on the CLI running a "tail" operation and waiting for new events. A lock was also placed in the struct since many operations will require some locking, including the FD ones. The output formats cover those in use by logs and two extra ones prepending the ISO time in front of the message (convenient for stdio/buffer). For now only the generic infrastructure is present, no type-specific output is implemented. There's the sink_write() function which prepares and formats a message to be sent, trying hard to avoid copies and only using pointer manipulation, where the type-specific code just has to be added. Dropped messages are already counted (for now 100% drop). The message is put into an iovec array as it will be trivial to use with file descriptors and sockets.	2019-08-22 20:21:00 +02:00
Willy Tarreau	9eebd8a978	REORG: trace: rename trace.c to calltrace.c and mention it's not thread-safe The function call tracing code is quite old and was never ported to support threads. It's not even sure whether it still works well, but at least its presence creates confusion for future work so let's rename it to calltrace.c and add a comment about its lack of thread-safety.	2019-08-22 20:21:00 +02:00
Willy Tarreau	32c24552e4	MINOR: tools: add a DEFNULL() macro to use NULL for empty args It's sometimes convenient for debugging macros not to be forced to explicitly pass NULL in an unused argument. This macro does this, it replaces a missing arg with NULL.	2019-08-22 20:21:00 +02:00
Willy Tarreau	9bead8c7f5	MINOR: list: add LIST_SPLICE() to merge one list into another This will move the contents of list <old> at the beginning of list <new>.	2019-08-22 20:21:00 +02:00
Willy Tarreau	60409db0b1	MINOR: lua: export applet and task handlers The current functions are seen outside from the debugging code and are convenient to export so that we can improve the thread dump output : void hlua_applet_tcp_fct(struct appctx *ctx); void hlua_applet_http_fct(struct appctx *ctx); struct task *hlua_process_task(struct task *task, void *context, unsigned short state); Of course they are only available when USE_LUA is defined.	2019-08-21 14:32:09 +02:00
Willy Tarreau	a2c9911ace	MINOR: tools: add append_prefixed_str() This is somewhat related to indent_msg() except that this one places a known prefix at the beginning of each line, allows to replace the EOL character, and not to insert a prefix on the first line if not desired. It works with a normal output buffer/chunk so it doesn't need to allocate anything nor to modify the input string. It is suitable for use in multi- line backtraces.	2019-08-21 14:32:09 +02:00
Willy Tarreau	f5cab82025	MINOR: fd: make sure to mark the thread as not stuck in fd_update_events() When I/O events are being processed, we want to make sure to mark the thread as not stuck. The reason is that some pollers (like poll()) which do not limit the number of FDs they report could possibly report a huge amount of FD all having to perform moderately expensive operations in the I/O callback (e.g. via mux-pt which forwards to the upper layers), making the watchdog think the thread is stuck since it does not schedule. Of course this must never happen but if it ever does we must be liberal about it. This should be backported to 2.0, where the situation may happen more easily due to the FD cache which can start to collect a large amount of events. It may be related to the report in issue #201 though nothing is certain about it.	2019-08-16 16:06:14 +02:00
Willy Tarreau	edb91ad647	MINOR: cli: add cli_msg(), cli_err(), cli_dynmsg(), cli_dynerr() These functions perform all the boring filling of the appctx's cli struct needed by CLI parsers to return a message or an error, and they return 1 so that they can be used as a single-line return statement. They may be used for const messages or dynamic messages.	2019-08-09 10:11:38 +02:00
Willy Tarreau	d50c7feaa1	MINOR: cli: add two new states to print messages on the CLI Until now we had extremely inconsistent states to report output, one is CLI_ST_PRINT which prints constant message cli->msg with the assigned severity, and CLI_ST_PRINT_FREE which prints dynamically allocated cli->err with severity LOG_ERR, and nothing in between, even though it's useful to be able to report dynamically allocated messages as well as constant error messages. This patch adds two extra states, which are not particularly well named given the constraints imposed by existing ones. One is CLI_ST_PRINT_ERR which prints a constant error message. The other one is CLI_ST_PRINT_DYN which prints a dynamically allocated message. By doing so we maintain the compatibility with current code. It is important to keep in mind that we cannot pre-initialize pointers and automatically detect what message type it is based on the assigned fields, because the CLI's context is in a union shared with all other users, thus unused fields contain anything upon return. This is why we have no choice but using 4 states. Keeping the two fields <msg> and <err> remains useful because one is const and not the other one, and this catches many copy-paste mistakes. It's just that <err> is pretty confusing here, it should be renamed.	2019-08-09 10:11:38 +02:00
Willy Tarreau	247a8b1d81	CLEANUP: task: move the cpu_time field to the task-only part The CPU time accounting field called "cpu_time" is used only by tasks and not tasklets, yet it used to be stored into the TASK_COMMON part, which doesn't make sense and wastes tasklet memory. In addition, moving it to tasks also helps better group the various parts in cache lines.	2019-08-08 10:11:05 +02:00
Willy Tarreau	e0d0b4089d	CLEANUP: buffer: replace b_drop() with b_free() Since last commit there's no point anymore in having two variants of the same function, let's switch to b_free() only. __b_drop() was renamed to __b_free() for obvious consistency reasons.	2019-08-08 08:07:45 +02:00
Willy Tarreau	3b091f80aa	BUG/MINOR: buffers/threads: always clear a buffer's head before releasing it A small race exists in buffers with "show sess all". This one wants to show some information grabbed from the buffer (especially in HTX mode). But the thread owning this buffer might just be releasing its area, right after a free() or munmap() call, resulting in a head that is not seen as empty yet though the area was released. It may then be dereferenced by "show sess all" causing a crash. Note that in practice it only happens in debug mode with UAF enabled, but it's tricky enough to fix it right now. This should be backported to stable versions which support threads and a store barrier. It's worth noting that by performing the clearing first, b_free() and b_drop() now become two exact equivalents.	2019-08-08 08:07:45 +02:00
Willy Tarreau	229e739c21	BUG/MINOR: pools: don't mark the thread harmless if already isolated Commit `85b2cae63` ("MINOR: pools: make the thread harmless during the mmap/munmap syscalls") was used to relax the pressure experienced by other threads when running in debug mode with UAF enabled. It places a pair of thread_harmless_now()/thread_harmless_end() around the call to mmap(), assuming callers are not sensitive to parallel activity. But there are a few cases like "show sess all" where this happens in isolated threads, and marking the thread as harmless there is a very bad idea, even worse when arriving to thread_harmless_end() which loops forever. Let's only do that when the thread is not isolated. No backport is needed as the patch above was only in 2.1-dev.	2019-08-08 07:41:52 +02:00
Frédéric Lécaille	be36793d1d	BUG/MEDIUM: stick-table: Wrong stick-table backends parsing. When parsing references to stick-tables declared as backends, they are added to a list of proxies (they are proxies!) which refer to these stick-tables. Before this patch we added them to this list without checking whether they were already present, making the silly hypothesis that the actions/samples were checked/resolved in the same order the proxies are parsed. This patch implements a simple inline function, in_proxies_list(), to test the presence of a proxy in a list of proxies. We use this function when resolving/checking samples/actions. This bug was introduced by `015e4d7` commit. Must be backported to 2.0.	2019-08-07 10:32:31 +02:00
Olivier Houchard	4c18f94c11	BUG/MEDIUM: proxy: Make sure to destroy the stream on upgrade from TCP to H2 In stream_set_backend(), if we have a TCP stream, and we want to upgrade it to H2 instead of attempting to reuse the stream, just destroy the conn_stream, make sure we don't log anything about the stream, and pretend we failed setting the backend, so that the stream will get destroyed. New streams will then be created by the mux, as if the connection just happened. This fixes a crash when upgrading from TCP to H2, as the H2 mux totally ignored the conn_stream provided by the upgrade, as reported in github issue #196. This should be backported to 2.0.	2019-08-02 18:28:58 +02:00
Emmanuel Hocdet	f580d0f391	BUILD: ssl: BoringSSL add EVP_PKEY_base_id Remove EVP_PKEY_base_id compatibility, it is now included in BoringSSL.	2019-08-01 11:21:42 +02:00
Willy Tarreau	a37cb1880c	MINOR: wdt: also consider that waiting in the thread dumper is normal It happens that upon looping threads the watchdog fires, starts a dump, and other threads expire their budget while waiting for the other threads to get dumped and trigger a watchdog event again, adding some confusion to the traces. With this patch the situation becomes clearer as we export the list of threads being dumped so that the watchdog can check it before deciding to trigger. This way such threads in queue for being dumped are not attempted to be reported in turn. This should be backported to 2.0 as it helps understand stack traces.	2019-07-31 19:35:31 +02:00
Olivier Houchard	53055055c5	MEDIUM: pollers: Remember the state for read and write for each threads. In the poller code, instead of just remembering if we're currently polling a fd or not, remember if we're polling it for writing and/or for reading, that way, we can avoid to modify the polling if it's already polled as needed.	2019-07-31 14:54:41 +02:00
Olivier Houchard	305d5ab469	MAJOR: fd: Get rid of the fd cache. Now that the architecture was changed so that attempts to receive/send data always come from the upper layers, instead of them only trying to do so when the lower layer let them know they could try, we can finally get rid of the fd cache. We don't really need it anymore, and removing it gives us a small performance boost.	2019-07-31 14:12:55 +02:00
Willy Tarreau	5e83d996cf	BUG/MAJOR: queue/threads: avoid an AB/BA locking issue in process_srv_queue() A problem involving server slowstart was reported by @max2k1 in issue #197. The problem is that pendconn_grab_from_px() takes the proxy lock while already under the server's lock while process_srv_queue() first takes the proxy's lock then the server's lock. While the latter seems more natural, it is fundamentally incompatible with many other operations performed on servers, namely state change propagation, where the proxy is only known after the server and cannot be locked around the servers. However reversing the lock in process_srv_queue() is trivial and only the few functions related to dynamic cookies need to be adjusted for this so that the proxy's lock is taken for each server operation. This is possible because the proxy's server list is built once at boot time and remains stable. So this is what this patch does. The comments in the proxy and server structs were updated to mention this rule that the server's lock may not be taken under the proxy's lock but may enclose it. Another approach could consist in using a second lock for the proxy's queue which would be different from the regular proxy's lock, but given that the operations above are rare and operate on small servers list, there is no reason for overdesigning a solution. This fix was successfully tested with 10000 servers in a backend where adjusting the dyncookies in loops over the CLI didn't have a measurable impact on the traffic. The only workaround without the fix is to disable any occurrence of "slowstart" on server lines, or to disable threads using "nbthread 1". This must be backported as far as 1.8.	2019-07-30 14:02:06 +02:00
Christopher Faulet	bfab2dddad	MINOR: hlua: Add a flag on the lua txn to know in which context it can be used When a lua action or a lua sample fetch is called, a lua transaction is created. It is an entry in the stack containing the class TXN. Thanks to it, we can know the direction (request or response) of the call. But, for some functions, it is also necessary to know if the buffer is "HTTP ready" for the given direction. "HTTP ready" means there is a valid HTTP message in the channel's buffer. So, when a lua action or a lua sample fetch is called, the flag HLUA_TXN_HTTP_RDY is set if it is appropriate.	2019-07-29 11:17:52 +02:00
Willy Tarreau	d6e0c03384	BUILD: threads: add the definition of PROTO_LOCK This one was added by commit `daacf3664` ("BUG/MEDIUM: protocols: add a global lock for the init/deinit stuff") but I forgot to add it to the include file, breaking DEBUG_THREAD.	2019-07-25 07:53:56 +02:00
Christopher Faulet	98fbe9531a	MEDIUM: mux-h1: Add the support of headers adjustment for bogus HTTP/1 apps There is no standard case for HTTP header names because, as stated in the RFC7230, they are case-insensitive. So applications must handle them in a case-insensitive manner. But some bogus applications erroneously rely on the case used by most browsers. This problem becomes critical with HTTP/2 because all header names must be exchanged in lowercase. And HAProxy uses the same convention. All header names are sent in lowercase to clients and servers, regardless of the HTTP version. This design choice is linked to the HTX implementation. So, for previous versions (2.0 and 1.9), a workaround is to disable the HTX mode to fall back to the legacy HTTP mode. Since the legacy HTTP mode was removed, some users reported interoperability issues because their application was not able anymore to handle HTTP/1 messages received from HAProxy. So, we've decided to add a way to change the case of some headers before sending them. It is now possible to define a "mapping" between a lowercase header name and a version supported by the bogus application. To do so, you must use the global directives "h1-case-adjust" and "h1-case-adjust-file". Then options "h1-case-adjust-bogus-client" and "h1-case-adjust-bogus-server" may be used in proxy sections to enable the conversion. See the configuration manual for more info. Of course, our advice is to urgently upgrade these applications for interoperability concerns and because they may be vulnerable to various types of content smuggling attacks. But, if you are really forced to use an unmaintained bogus application, you may use these directives, at your own risk. If it is relevant, this feature may be backported to 2.0.	2019-07-24 18:32:47 +02:00
Willy Tarreau	daacf36645	BUG/MEDIUM: protocols: add a global lock for the init/deinit stuff Dragan Dosen found that the listeners lock is not sufficient to protect the listeners list when proxies are stopping because the listeners are also unlinked from the protocol list, and under certain situations like bombing with soft-stop signals or shutting down many frontends in parallel from multiple CLI connections, it could be possible to provoke multiple instances of delete_listener() to be called in parallel for different listeners, thus corrupting the protocol lists. Such operations are pretty rare, they are performed once per proxy upon startup and once per proxy on shut down. Thus there is no point trying to optimize anything and we can use a global lock to protect the protocol lists during these manipulations. This fix (or a variant) will have to be backported as far as 1.8.	2019-07-24 16:45:02 +02:00
Christopher Faulet	90cc4811be	BUG/MINOR: http_htx: Support empty errorfiles Empty error files may be used to disable the sending of any message for specific error codes. A common use-case is to use the file "/dev/null". This way the default error message is overridden and no message is returned to the client. It was supported in the legacy HTTP mode, but not in HTX. Because of a bug, such messages triggered an error. This patch must be backported to 2.0 and 1.9. However, the patch will have to be adapted.	2019-07-23 14:58:32 +02:00
Willy Tarreau	1c8d32bb62	MAJOR: stream: store the target address into s->target_addr When forcing the outgoing address of a connection, till now we used to allocate this outgoing connection and set the address into it, then set SF_ADDR_SET. With connection reuse this causes a whole lot of issues and difficulties in the code. Thanks to the previous changes, it is now possible to store the target address into the stream instead, and copy the address from the stream to the connection when initializing the connection. assign_server_address() does this and as a result SF_ADDR_SET now reflects the presence of the target address in the stream, not in the connection. The http_proxy mode, the peers and the master's CLI now use the same mechanism. For now the existing connection code was not removed to limit the amount of tricky changes, but the allocated connection is not used anymore. This change also revealed a latent issue that we've been having around option http_proxy : the address was set in the connection but neither the SF_ADDR_SET nor the SF_ASSIGNED flags were set. It looks like the connection could establish only due to the fact that it existed with a non-null destination address.	2019-07-19 13:50:09 +02:00
Willy Tarreau	9042060b0b	MINOR: stream: add a new target_addr entry in the stream structure The purpose will be to store the target address there and not to allocate a connection just for this anymore. For now it's only placed in the struct, a few fields were moved to plug some holes, and the entry is freed on release (never allocated yet for now). This must have no impact. Note that in order to fit, the store_count which previously was an int was turned into a short, which is way more than enough given that the hard-coded limit is 8.	2019-07-19 13:50:09 +02:00
Willy Tarreau	e71fca81dd	MAJOR: connection: remove the addr field Now addresses are dynamically allocated when needed. Each connection is created with src=dst=NULL, these entries are allocated on the fly, and released when the connection is released.	2019-07-19 13:50:09 +02:00
Willy Tarreau	ca79f59365	MEDIUM: connection: make sure all address producers allocate their address This commit places calls to sockaddr_alloc() at the places where an address is needed, and makes sure that the allocation is properly tested. This does not add too many error paths since connection allocations are already in the vicinity and share the same error paths. For the two cases where a clear_addr() was called, instead the address was not allocated.	2019-07-19 13:50:09 +02:00
Willy Tarreau	ff5d57b022	MINOR: connection: create a new pool for struct sockaddr_storage This pool will be used to allocate storage for source and destination addresses used in connections. Two functions sockaddr_{alloc,free}() were added and will have to be used everywhere an address is needed. These ones are safe for progressive replacement as they check that the existing pointer is set before replacing it. The pool is not yet used during allocation nor freeing. Also they operate on pointers to pointers so they will perform checks and replace values. The free one nulls the pointer.	2019-07-19 13:50:09 +02:00
Willy Tarreau	226572f55f	MINOR: connection: use conn->{src,dst} instead of &conn->addr.{from,to} This is in preparation for the switch to dynamic address allocation, let's migrate the code using the old fields to the pointers instead. Note that no extra check was added for now, the purpose is only to get the code to use the pointers and still work. In the proxy protocol message handling we make sure the addresses are properly allocated before declaring them unset.	2019-07-19 13:50:09 +02:00
Willy Tarreau	1ef4cbc693	MINOR: connection: add new src and dst fields At the moment we're facing difficulties with connection reuse based on the fact that connections may be allocated very early only to set a target address in transparent mode. With the imminent removal of the legacy mode, the connection reuse by a same stream will not exist anymore and all this awful complexity is not justified anymore. However we still need to be able to assign addresses somewhere. Thus instead of allocating a connection, we'll only place addresses where needed in the stream during operations. But this takes quite some room (typically 128 bytes). This is a nice opportunity for cleaning all this up and dynamically allocating the address fields, which will result in actually saving memory from connection structs since most of the time the client's "to" address is not used and the server's "from" is not used either, thus saving ~256 bytes per end-to-end connection. For now these new "src" and "dst" pointers point to addr.from and addr.to. This will allow us to smoothly update the whole code to use these pointers prior to going further and switching them to pools.	2019-07-19 13:50:09 +02:00
Willy Tarreau	cc4df3b3de	CLEANUP: connection: remove the now unused conn_get_{from,to}_addr() These functions are not used anymore. They didn't report failures and as such were often misused. conn_get_src() and conn_get_dst() now replaced them everywhere.	2019-07-19 13:50:09 +02:00
Willy Tarreau	3cc01d84b3	MINOR: backend: switch to conn_get_{src,dst}() for port and address mapping The backend connect code uses conn_get_{from,to}_addr to forward addresses in transparent mode and to map server ports, without really checking if the operation succeeds. In preparation of future changes, let's switch to conn_get_{src,dst}() and integrate status check for possible failures.	2019-07-19 13:50:09 +02:00
Willy Tarreau	2e34c11458	MINOR: connection: add conn_get_src() and conn_get_dst() These functions currently are the same as conn_get_from_addr() and conn_get_to_addr() respectively except that they return a status for the operation that the caller can test.	2019-07-19 13:50:09 +02:00
Christopher Faulet	f734638976	MINOR: http: Don't store raw HTTP errors in chunks anymore Default HTTP error messages are stored in an array of chunks. And since the HTX was added, these messages are also converted in HTX and stored in another array. But now, the first array is not used anymore because the legacy HTTP mode was removed. So now, only the array with the HTX messages is kept. The other one was removed.	2019-07-19 09:46:23 +02:00
Christopher Faulet	1b6adb4a51	MINOR: proxy/http_ana: Remove unused req_exp/rsp_exp and req_add/rsp_add lists The keywords req* and rsp* are now unsupported. So the corresponding lists are now unused. It is safe to remove them from the structure proxy. As a result, the code dealing with these rules in HTTP analyzers was also removed.	2019-07-19 09:24:12 +02:00
Christopher Faulet	8c3b63ae1d	MINOR: proxy: Remove the unused list of block rules The keyword "block" is now unsupported. So the list of block rules is now unused. It can be safely removed from the structure proxy.	2019-07-19 09:24:12 +02:00
Christopher Faulet	a6a56e6483	MEDIUM: config: Remove parsing of req* and rsp* directives It was announced for the 2.1. Following keywords are now unsupported: * reqadd, reqallow, reqiallow, reqdel, reqidel, reqdeny, reqideny, reqpass, reqipass, reqrep, reqirep, reqtarpit, reqitarpit * rspadd, rspdel, rspidel, rspdeny, rspideny, rsprep, rspirep A fatal error is emitted if one of these keywords is found during the configuration parsing.	2019-07-19 09:24:12 +02:00
Christopher Faulet	73e8ede156	MINOR: proxy: Remove support of the option 'http-tunnel' The option 'http-tunnel' is deprecated and it was only used in the legacy HTTP mode. So this option is now totally ignored and a warning is emitted during HAProxy startup if it is found in a configuration file.	2019-07-19 09:24:12 +02:00
Christopher Faulet	fc9cfe4006	REORG: proto_htx: Move HTX analyzers & co to http_ana.{c,h} files The old module proto_http does not exist anymore. All code dedicated to the HTTP analysis is now grouped in the file proto_htx.c. So, to finish the polishing after removing the legacy HTTP code, proto_htx.{c,h} files have been moved in http_ana.{c,h} files. In addition, all HTX analyzers and related functions prefixed with "htx_" have been renamed to start with "http_" instead.	2019-07-19 09:24:12 +02:00
Christopher Faulet	eb2754bef8	CLEANUP: proto_http: Remove unnecessary includes and comments	2019-07-19 09:24:12 +02:00
Christopher Faulet	22dc248c2a	CLEANUP: channel: Remove the unused flag CF_WAKE_CONNECT This flag is tested or cleared but never set anymore.	2019-07-19 09:24:12 +02:00
Christopher Faulet	3716ebc50f	CLEANUP: proto_http: Group remaining flags of the HTTP transaction	2019-07-19 09:24:12 +02:00
Christopher Faulet	cc76d5b9a1	MINOR: proto_http: Remove the unused flag HTTP_MSGF_WAIT_CONN This flag is set but never used. So remove it.	2019-07-19 09:24:12 +02:00
Christopher Faulet	c41547b66e	MINOR: proto_http: Remove unused http txn flags Many flags of the HTTP transaction (TX_*) are now unused and useless. So the flags TX_WAIT_CLEANUP, TX_HDR_CONN_*, TX_CON_CLO_SET and TX_CON_KAL_SET were removed. Most of TX_CON_WANT_* were also removed. Only TX_CON_WANT_TUN has been kept.	2019-07-19 09:24:12 +02:00
Christopher Faulet	711ed6ae4a	MAJOR: http: Remove the HTTP legacy code First of all, all legacy HTTP analyzers and all functions exclusively used by them were removed. So most of the functions in proto_http.{c,h} were removed. Only functions to deal with the HTTP transaction have been kept. Then, http_msg and hdr_idx modules were entirely removed. And finally the structure http_msg was lightened of all its useless information about the legacy HTTP. The structure hdr_ctx was also removed because unused now, just like unused states in the enum h1_state. Note that the memory pool "hdr_idx" was removed and "http_txn" is now smaller.	2019-07-19 09:24:12 +02:00
Christopher Faulet	3d11969a91	MAJOR: filters: Remove code relying on the legacy HTTP mode This commit breaks the compatibility with filters still relying on the legacy HTTP code. The legacy callbacks were removed (http_data, http_chunk_trailers and http_forward_data). For now, the filters must still set the flag FLT_CFG_FL_HTX to be used on HTX streams.	2019-07-19 09:18:27 +02:00
Christopher Faulet	28b18c5e21	CLEANUP: proxy: Remove the flag PR_O2_USE_HTX This flag is now unused. So we can safely remove it.	2019-07-19 09:18:27 +02:00
Christopher Faulet	6d1dd46917	MEDIUM: http_fetch: Remove code relying on HTTP legacy mode Since the legacy HTTP mode is disabled, all HTTP sample fetches work on HTX streams. So it is safe to remove all code relying on HTTP legacy mode. Among other things, the function smp_prefetch_http() was removed with the associated macros CHECK_HTTP_MESSAGE_FIRST() and CHECK_HTTP_MESSAGE_FIRST_PERM().	2019-07-19 09:18:27 +02:00
Christopher Faulet	c985f6c5d8	MINOR: connection: Remove the multiplexer protocol PROTO_MODE_HTX Since the legacy HTTP mode is disabled and no multiplexer relies on it anymore, there is no reason to have 2 multiplexer protocols for the HTTP. So the protocol PROTO_MODE_HTX was removed and all HTTP multiplexers use now PROTO_MODE_HTTP.	2019-07-19 09:18:27 +02:00
Christopher Faulet	5ed8353dcf	CLEANUP: h2: Remove functions converting h2 requests to raw HTTP/1.1 ones Because the h2 multiplexer only uses the HTX mode, following H2 functions were removed : * h2_prepare_h1_reqline * h2_make_h1_request() * h2_make_h1_trailers()	2019-07-19 09:18:27 +02:00
Christopher Faulet	24e116bfe0	MINOR: htx: Slightly update htx_dump() to report better messages Sign of <tail_addr>, <head_addr> and <end_addr> is respected to not convert -1 into its unsigned representation.	2019-07-19 09:18:27 +02:00
Christopher Faulet	2bf43f0746	MINOR: htx: Use an array of char to store HTX blocks Instead of using an array of (struct block), it is more natural and intuitive to use an array of char. Indeed, not only (struct block) are stored in this array, but also their payload.	2019-07-19 09:18:27 +02:00
Christopher Faulet	192c6a23d4	MINOR: htx: Deduce the number of used blocks from tail and head values <head> and <tail> fields are now signed 32-bits integers. For an empty HTX message, these fields are set to -1. So the field <used> is now useless and can safely be removed. To know if an HTX message is empty or not, we just compare <head> against -1 (it also works with <tail>). The function htx_nbblks() has been added to get the number of used blocks.	2019-07-19 09:18:27 +02:00
Christopher Faulet	5a916f7326	CLEANUP: htx: Remove the unused function htx_add_blk_type_size()	2019-07-19 09:18:27 +02:00
Christopher Faulet	3b21972061	DOC: htx: Update comments in HTX files This patch may be backported to 2.0 to have accurate comments.	2019-07-19 09:18:27 +02:00
Christopher Faulet	304cc40536	MINOR: proto_htx: Add the function htx_return_srv_error() Instead of using a function from the legacy HTTP, the HTX code now uses its own one.	2019-07-19 09:18:27 +02:00
Willy Tarreau	8280ea97a0	MINOR: applet: make appctx use their own pool A long time ago, applets were seen as an alternative to connections, and since their respective sizes were roughly equal it appeared wise to share the same pool. Nowadays, connections got significantly larger but applets are not that often used, except for the cache. However applets are mostly complementary and not alternatives anymore, as it's very possible not to have a back connection or to share one with other streams. The connections will soon lose their addresses and their size will shrink so much that appctx won't fit anymore. Given that the old benefits of sharing these pools have long disappeared, let's stop doing this and have a dedicated pool for appctx.	2019-07-18 10:45:08 +02:00
Willy Tarreau	7764a57d32	BUG/MEDIUM: threads: cpu-map designating a single thread/process are ignored Since commit `81492c989` ("MINOR: threads: flatten the per-thread cpu-map"), we don't keep the procthread matrix anymore to represent the full binding possibilities, but only the proc and thread ones. The problem is that the per-process binding is not the same for each thread and for the process, and the proc[] array was assumed to store the per-proc first thread value when doing this change. Worse, the logic present there tries to deal with thread ranges and process ranges in a way which automatically excluded the other possibility (since ranges cannot be used on both) but as such fails to apply changes if neither the process nor the thread is expressed as a range. The real problem comes from the fact that specifying cpu-map 1/1 doesn't yet reveal if the per-process mask or the per-thread mask needs to be updated. In practice it's the thread one but then the current storage doesn't allow to store the binding of the first thread of each other process in nbproc>1 configurations. When removing the procthread matrix, what ought to have been kept was both the thread column for process 1 and the process line for threads 1, but instead only the thread column was kept. This patch reintroduces the storage of the configuration for the first thread of each process so that it is again possible to store either the per-thread or per-process configuration. As a partial workaround for existing configurations, it is possible to systematically indicate at least two processes or two threads at once and map them by pairs or more so that at least two values are present in the range. E.g : # set processes 1-4 to cpus 0-3 : cpu-map auto:1-4/1 0 1 2 3 # or: cpu-map 1-2/1 0 1 cpu-map 2-3/1 2 3 # set threads 1-4 to cpus 0-3 : cpu-map auto:1/1-4 0 1 2 3 # or : cpu-map 1/1-2 0 1 cpu-map 3/3-4 2 3 This fix must be backported to 2.0.	2019-07-16 15:23:09 +02:00
Andrew Heberle	9723696759	MEDIUM: mworker-prog: Add user/group options to program section This patch adds "user" and "group" config options to the "program" section so the configured command can be run as a different user.	2019-07-15 16:43:16 +02:00
Olivier Houchard	4bd5867627	BUG/MEDIUM: streams: Don't redispatch with L7 retries if redispatch isn't set. Move the logic to decide if we redispatch to a new server from sess_update_st_cer() to a new inline function, stream_choose_redispatch(), and use it in do_l7_retry() instead of just setting the state to SI_ST_REQ. That way, when using L7 retries, we won't redispatch the request to another server except if "option redispatch" is used. This should be backported to 2.0.	2019-07-12 16:17:50 +02:00
Willy Tarreau	64e6012eb9	MINOR: task: introduce work lists Sometimes we need to delegate some list processing to a function running on another thread. In this case the list element will simply be queued into a dedicated self-locked list and the task responsible for this list will be woken up, calling the associated function which will run over the list. This is what work_list does. Such lists will be dedicated to a limited type of work but will significantly ease such remote handling. A function is provided to create these per-thread lists, their tasks and to properly bind each task to a distinct thread, so that the caller only has to store the resulting pointer to the start of the structure. These structures should not be abused though as each head will consume 4 pointers per thread, hence 32 bytes per thread or 2 kB for 64 threads.	2019-07-12 09:07:48 +02:00
Olivier Houchard	4be7190c10	BUG/MEDIUM: servers: Fix a race condition with idle connections. When purging idle connections, there is a race condition: while a connection is being removed from the idle list to be added to the list of connections to free, the thread owning the connection may try to free it at the same time. To fix this, simply add a per-thread lock that has to be held before removing the connection from the idle list, and when, in conn_free(), we're about to remove the connection from every list. That way, we know for sure the connection will stay valid while we remove it from the idle list to add it to the list of connections to free. This should happen rarely enough that it shouldn't have any impact on performance. This has not been reported yet, but could provoke random segfaults. This should be backported to 2.0.	2019-07-11 16:16:38 +02:00
Christopher Faulet	34ce7d075a	BUG/MINOR: server: Be really able to keep "pool-max-conn" idle connections The maximum number of idle connections for a server can be configured by setting the server option "pool-max-conn". But when we try to add a connection in its idle list, because of a wrong comparison, it may be rejected because there are already "pool-max-conn - 1" idle connections. This patch must be backported to 2.0 and 1.9.	2019-07-10 14:20:52 +02:00
Willy Tarreau	1dad3843dc	BUG/MEDIUM: fd/threads: fix excessive CPU usage on multi-thread accept While experimenting with potentially improved fairness and latency using ticket locks on a Ryzen 16-thread/8-core, a very strange situation happened a lot for some levels of traffic. Around 300k connections per second, no more connections would be accepted on the multi-threaded listener but all others would continue to work fine. All attempts to trace showed that the threads were all in the trylock in the fd cache, or in the spinlock of fd_update_events(), or in the one of fd_may_recv(). But as indicated this was not a deadlock since the process continues to work fine. After quite some investigation it appeared that the issue is caused by a lack of fairness between the fdcache's trylock and these functions' spin locks above. In fact, regardless of the success or failure of the fdcache's attempt at grabbing the lock, the poller was calling fd_update_events() which locks the FD once for something that can be done with a CAS, and then calls fd_may_recv() with another lock for something that most often didn't change. The high contention on these spinlocks leaves no chance to any other thread to grab the lock using trylock(), and once this happens, there is no thread left to process incoming connection events nor to stop polling on the FD, leaving all threads at 100% CPU but partially operational. This patch addresses the issue by using bit-test-and-set instead of the OR in fd_may_recv() / fd_may_send() so that nothing is done if the FD was already configured as expected. It does the same in fd_update_events() using a CAS to check if the FD's events need to be changed at all or not. With this patch applied, it became impossible to reproduce the issue, and now there's no way to saturate all 16 CPUs with the load used for testing, as no more than 1350-1400 were noticed at 300+kcps vs 1600. 
Ideally this patch should go further and try to remove the remaining incarnations of the fdlock as this seems possible, but it's difficult enough to be done in a distinct patch that will not have to be backported. It is possible that workloads involving a high connection rate may slightly benefit from this patch and observe a slightly lower CPU usage even when the service doesn't misbehave. This patch must be backported to 2.0 and 1.9.	2019-07-09 10:41:24 +02:00
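The idea of skipping the write when the bit is already set can be sketched with C11 atomics. This is only an illustration under stated assumptions: the helper name `set_flag_if_clear()` is hypothetical, and it uses `<stdatomic.h>` rather than HAProxy's own atomic macros or its actual fd code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper (not HAProxy's API): set <flag> in <word> only if
 * it is not already set. Returns true only when this call changed the
 * word, so the caller can skip any locked slow path when nothing changed.
 */
static bool set_flag_if_clear(_Atomic unsigned int *word, unsigned int flag)
{
    unsigned int old = atomic_load(word);

    /* Cheap read first: no write, hence no cache line bouncing, when the
     * flag is already present.
     */
    while (!(old & flag)) {
        /* On failure, <old> is refreshed with the current value and the
         * loop condition re-checks whether someone else set the flag.
         */
        if (atomic_compare_exchange_weak(word, &old, old | flag))
            return true;
    }
    return false;
}
```

A caller would only take a lock or update the poller when the function returns true, which is the case where the state really changed.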
Willy Tarreau	85b2cae63c	MINOR: pools: make the thread harmless during the mmap/munmap syscalls These calls can take quite some time and leave the thread harmless so it's better to mark it as such. This makes "show sess" respond way faster during high loads running on processes built with DEBUG_UAF since these calls are stressed a lot.	2019-07-09 10:40:33 +02:00
Willy Tarreau	828675421e	MINOR: pools: always pre-initialize allocated memory outside of the lock When calling mmap(), in general the system gives us a page but does not really allocate it until we first dereference it. And it turns out that this time is much longer than the time to perform the mmap() syscall. Unfortunately, when running with memory debugging enabled, we mmap/munmap() each object resulting in lots of such calls and a high contention on the allocator. And the first accesses to the page being done under the pool lock is extremely damaging to other threads. The simple fact of writing a 0 at the beginning of the page after allocating it and placing the POOL_LINK pointer outside of the lock is enough to boost the performance by 8x in debug mode and to save the watchdog from triggering on lock contention. This is what this patch does.	2019-07-09 10:40:33 +02:00
Willy Tarreau	3e853ea74d	MINOR: pools: release the pool's lock during the malloc/free calls The malloc and free calls and especially the underlying mmap/munmap() can occasionally take a huge amount of time and even cause the thread to sleep. This is visible when haproxy is compiled with DEBUG_UAF which causes every single pool allocation/free to allocate and release pages. In this case, when using the locked pools, the watchdog can occasionally fire under high contention (typically requesting 40000 1M objects in parallel over 8 threads). Then, "perf top" shows that 50% of the CPU time is spent in mmap() and munmap(). The reason the watchdog fires is that some threads spin on the pool lock which is held by other threads waiting on mmap() or munmap(). This patch modifies this so that the pool lock is released during these syscalls. Not only does this allow other threads to try to allocate their data in parallel, but it also considerably reduces the lock contention. Note that the locked pools are only used on small architectures where high thread counts would not make sense, so this will not provide any benefit in the general case. However it makes the debugging versions way more stable, which is always appreciated.	2019-07-09 10:40:33 +02:00
Christopher Faulet	037b3ebd35	BUG/MEDIUM: stream-int: Don't rely on CF_WRITE_PARTIAL to unblock opposite si In the function stream_int_notify(), when the opposite stream-interface is blocked because there is no more room in the input buffer, it is unblocked if the flag CF_WRITE_PARTIAL is set on this buffer. It is a way to unblock the reads on the other side because some data was sent. But it is a problem during the fast-forwarding because only the stream is able to remove the flag CF_WRITE_PARTIAL. So it is possible to have this flag because of a previous send while the input buffer of the opposite stream-interface is now full. In such a case, the opposite stream-interface will be woken up for nothing because its input buffer is full. If the same happens on the opposite side, we will have a loop consuming all the CPU. To fix the bug, the opposite side is now only notified if there is some available room in its input buffer in the function si_cs_send(), so only if some data was sent. This patch must be backported to 2.0 and 1.9.	2019-07-05 14:26:15 +02:00
Christopher Faulet	2e4843d1d2	MINOR: action: Add the return code ACT_RET_DONE for actions This code should now be used by actions to stop both the rules processing and any following processing at the same time. The return code ACT_RET_STOP, for its part, should be used to only stop rules processing. So concretely, for TCP rules, there are no changes: ACT_RET_STOP and ACT_RET_DONE are handled the same way. However, for HTTP rules, ACT_RET_STOP should now be mapped on HTTP_RULE_RES_STOP and ACT_RET_DONE on HTTP_RULE_RES_DONE. This way, an action will have the possibility to stop all processing or only rules processing. Note that the changes for TCP are done in this commit but the changes for HTTP will be done in another one because it will fix a bug at the same time. This patch must be backported to 2.0 because a bugfix depends on it.	2019-07-05 14:26:14 +02:00
Olivier Houchard	cee0389088	BUG/MEDIUM: sessions: Don't keep an extra idle connection in sessions. When deciding if we keep an idle connection in the session, check if the number of connections currently in the session is >= the max allowed, not >, or we'll keep an extra connection. This should be backported to 1.9 and 2.0.	2019-07-04 14:28:18 +02:00
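The off-by-one described in this commit can be illustrated with a minimal sketch. Both function names and the counter parameters are hypothetical and are not taken from the HAProxy sources; only the `>` versus `>=` distinction comes from the commit message.

```c
/* Hypothetical illustration: <cur> is the number of idle connections
 * already kept in the session, <max> the configured limit.
 */

/* Flawed check: only refuses when cur > max, so with cur == max one
 * extra connection is still accepted.
 */
static int may_keep_idle_flawed(unsigned int cur, unsigned int max)
{
    return !(cur > max);
}

/* Corrected check: refuses as soon as cur >= max. */
static int may_keep_idle_fixed(unsigned int cur, unsigned int max)
{
    return !(cur >= max);
}
```

With a limit of 2 and 2 connections already kept, the flawed variant accepts a third connection while the corrected one refuses it.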
Olivier Houchard	2ab3dada01	BUG/MEDIUM: connections: Make sure we're unsubscribe before upgrading the mux. Just calling conn_force_unsubscribe() from conn_upgrade_mux_fe() is not enough, as there may be multiple XPRT involved. Instead, require that any user of conn_upgrade_mux_fe() unsubscribe itself before calling it. This should fix upgrading a TCP connection to HTX when using SSL. This should be backported to 2.0.	2019-07-03 13:57:30 +02:00
Christopher Faulet	621da6bafa	BUG/MEDIUM: channel/htx: Use the total HTX size in channel_htx_recv_limit() The receive limit of an HTX channel must be calculated against the total size of the HTX message. Otherwise, the buffer may never be seen as full whereas the receive limit is 0. Indeed, the function channel_htx_full() already takes care to add a block size to the buffer's reserve (8 bytes). So if the function channel_htx_recv_limit() also keeps a block size free in addition to the buffer's reserve, it means that at least 2 block sizes will be kept free but only one will be taken into account, freezing the stream if the option http-buffer-request is enabled. This patch fixes the Github issue #136. It should be backported to 2.0 and 1.9. Thanks jaroslawr (Jarosław Rzeszótko) for his help.	2019-07-02 21:32:45 +02:00
Olivier Houchard	6c7e96a3e1	BUG/MEDIUM: connections: Always call shutdown, with no linger. Revert commit `fe4abe62c7`. The goal was to make sure that, for health-checks, we would not get sockets in TIME_WAIT. To do so, we would not call shutdown() if linger_risk is set. However that is wrong, and that means shutw would never be forwarded to the server, and thus we could get connections that are never properly closed. Instead, to fix the original problem as described here : https://www.mail-archive.com/haproxy@formilux.org/msg34080.html just make sure the checks code calls cs_shutr() before calling cs_shutw(). If shutr has been called, conn_sock_shutw() will make no attempt to call shutdown(), as it knows close() will be called. We should really review and revamp the shutr/shutw code, as described in github issue #142. This should be backported to 1.9 and 2.0.	2019-07-02 16:40:55 +02:00
William Lallemand	ad03288e6b	BUG/MINOR: mworker/cli: don't output a \n before the response When using a level lower than admin on the master CLI, a \n is output before the response. This is caused by the response to the "operator" or "user" command that is sent before the actual command. To fix this problem we introduce the flag APPCTX_CLI_ST1_NOLF which asks a command response to not be followed by the final \n. This patch makes a special case of the commands operator and user followed by a - so they are not followed by \n. This patch must be backported to 2.0 and 1.9.	2019-07-01 15:34:11 +02:00
Christopher Faulet	bb0efcdd29	MINOR: htx: Add the function htx_change_blk_value_len() As its name suggests, this function changes the value length of a block. But it also updates the HTX message accordingly. It simplifies the HTX API. The function htx_set_blk_value_len() is still available and must be used with caution because it does not update the HTX message. It just updates the HTX block. It should be considered as an internal function. When possible, htx_change_blk_value_len() should be used instead. This function is used to fix a bug affecting the 2.0. So, this patch must be backported to 2.0.	2019-06-18 10:01:55 +02:00
Baptiste Assmann	da29fe2360	MEDIUM: server: server-state global file stored in a tree Server states can be recovered from either a "global" file (all backends) or a "local" file (per backend). The way the algorithm to parse the state file was first implemented was good enough for a low number of backends and servers per backend. Basically, for each backend the state file (global or local) is opened, parsed entirely and for each line we check if it contains data related to a server from the backend we're currently processing. We must read the file entirely, just in case some lines for the current backend are stored at the end of the file. This does not scale at all! This patch changes the behavior above for the "global" file only. Now, the global file is read and parsed once and all lines it contains are stored in a tree, for faster discovery. This results in far fewer fopen, fgets, and strcmp calls, which makes loading very big state files very quick now.	2019-06-17 13:40:42 +02:00
Tim Duesterhus	86e6b6ebf8	MEDIUM: Make '(cli|con|srv)timeout' directive fatal They were deprecated with HAProxy 1.5. Time to remove them.	2019-06-17 13:35:54 +02:00
Tim Duesterhus	dac168bc15	MEDIUM: Make 'redispatch' directive fatal It was deprecated with HAProxy 1.5. Time to remove it.	2019-06-17 13:35:54 +02:00
Tim Duesterhus	7b7c47f05c	MEDIUM: Make 'block' directive fatal It was deprecated with HAProxy 1.5. Time to remove it.	2019-06-17 13:35:54 +02:00
Willy Tarreau	9dc6b97429	[RELEASE] Released version 2.1-dev0 Released version 2.1-dev0 with the following main changes : - exact copy of 2.0.0	2019-06-16 21:49:47 +02:00
Willy Tarreau	bd20a9dd4e	BUG: tasks: fix bug introduced by latest scheduler cleanup In commit `86eded6c6` ("CLEANUP: tasks: rename task_remove_from_tasklet_list() to tasklet_remove_*") which consisted in removing the casts between tasks and tasklet, I was a bit too fast to believe that we only saw tasklets in this function since process_runnable_tasks() also uses it with tasks under a cast. So removing the bookkeeping on task_list_size was not appropriate. Bah, the joy of casts which hide the real thing... This patch does two things at once to address this mess once and for all: - it restores the decrement of task_list_size when it's a real task, but moves it to process_runnable_task() since it's the only place where it's allowed to call it with a task - it moves the increment there as well and renames task_insert_into_tasklet_list() to tasklet_insert_into_tasklet_list() for obvious consistency reasons. This way the increment/decrement of task_list_size is made at the only places where the cast is enforced, so it has less risk of being missed. The comments on top of these functions were updated to reflect that they are only supposed to be used with tasklets and that the caller is responsible for keeping task_list_size up to date if it decides to enforce a task there. Now we don't have to worry anymore about how these functions work outside of the scheduler, which is better longterm-wise. Thanks to Christopher for spotting this mistake. No backport is needed.	2019-06-14 18:16:19 +02:00
Olivier Houchard	fe4abe62c7	BUG/MEDIUM: connections: Don't call shutdown() if we want to disable linger. In conn_sock_shutw(), avoid calling shutdown() if linger_risk is set. Not doing so will result in getting sockets in TIME_WAIT for some time. This is particularly observable with health checks. This should be backported to 1.9.	2019-06-14 15:33:41 +02:00
Willy Tarreau	86eded6c69	CLEANUP: tasks: rename task_remove_from_tasklet_list() to tasklet_remove_* The function really only operates on tasklets, its arguments are always tasklets cast as tasks to match the function's type, to be cast back to a struct tasklet. Let's rename it to tasklet_remove_from_tasklet_list(), take a struct tasklet, and get rid of the undesired task casts.	2019-06-14 14:57:03 +02:00
Willy Tarreau	3c39a7d889	CLEANUP: connection: rename the wait_event.task field to .tasklet It's really confusing to call it a task because it's a tasklet and used in places where tasks and tasklets are used together. Let's rename it to tasklet to remove this confusion.	2019-06-14 14:42:29 +02:00
Christopher Faulet	e21c01637a	MINOR: htx: Add 3 flags on the start-line to deal with the request schemes The first one, HTX_SL_F_HAS_SCHM, will be used to know the request has an explicit scheme. So, in H2, it is always true because the pseudo-header ":scheme" is mandatory. In H1, it is only true when an absolute URI is found on the start-line. The other flags, HTX_SL_F_SCHM_HTTP and HTX_SL_F_SCHM_HTTPS, will be used to know which scheme the request has. For now, other protocols are not handled. The aim of these flags is to pass this information to the backend side in general, and to the H2 mux in particular. So the multiplexer will have a chance to use this information to send the right scheme to the server.	2019-06-14 11:13:32 +02:00
Christopher Faulet	36a7702b03	CLEANUP: channel: Remove channel_htx_fwd_payload() and channel_htx_fwd_all() These functions are unused now. No backport needed.	2019-06-14 11:13:32 +02:00
Christopher Faulet	421e769783	BUG/MEDIUM: htx: Don't change position of the first block during HTX analysis In the HTX structure, the field <first> is used to know where to (re)start the analysis. It may differ from the message's head. It is especially important to update it to handle 1xx messages, to be sure to restart the analysis on the next message (another 1xx message or the final one). It is also updated when some data are forwarded (the headers or part of the body). But this update is an error and must never be done at the analysis level. It is a bug, because some sample fetches may be used after the data forwarding (but before the first send of course). At this stage, if the first block position does not point on the start-line, most of HTTP sample fetches fail. So now, when something is forwarded by HTX analyzers, the first block position is no longer updated. This issue was reported on Github. See #119. No backport needed.	2019-06-14 11:13:32 +02:00
Christopher Faulet	87ebe944d6	BUG/MINOR: channel/htx: Call channel_htx_full() from channel_full() When channel_full() is called for an HTX stream, we fall back on the HTX version. This function is called, among other, from tcp_inspect_request(). With this patch, the inspect delay is respected again. This patch must be backported to 1.9.	2019-06-14 11:13:32 +02:00
Willy Tarreau	3cec0f94f3	BUG/MINOR: task: prevent schedulable tasks from starving under high I/O activity With both I/O and tasks in the same tasklet list, we now have a very smooth and responsive scheduler, providing a good fairness between I/O activities. With the lower layers relying on tasklet a lot (I/O wakeup, subscribe, etc), there may often be a large number of totally autonomous tasklets doing their business such as forwarding data between two muxes. But the task scheduler historically refrained from picking tasks from the priority-ordered run queue to put them into the tasklet list until the latter had less than max_runqueue_depth entries. This was to make sure that low-latency, high-priority tasks would have an opportunity to be dequeued before others even if they arrive late. But the counter used for this is still the tasklet list size, which contains countless I/O events. This causes an unfairness between unbounded I/Os and bounded tasks, resulting for example in the CLI responding slower when forwarding 40 Gbps of HTTP traffic spread over a thousand connections. A good solution consists in sticking to the initial intent of max_runqueue_depth which is to limit the number of tasks in the list (to maintain fairness between them) and not to limit the number of these tasks among tasklets. It just turns out that the task_list_size initially was this task counter and changed over time to be a tasklet list size. Let's simply refrain from updating it for pure tasklets so that it takes back its original role of counting real tasks as its name implies. With this change the CLI becomes instantly responsive under load again. This patch may possibly be backported to 1.9 though it requires some careful checks.	2019-06-14 09:16:51 +02:00
William Lallemand	1dc6963086	MINOR: mworker: add the HAProxy version in "show proc" Displays the HAProxy version so you can compare the version of old processes and new ones.	2019-06-12 19:19:57 +02:00
Olivier Houchard	a0fdce3950	MINOR: fd: Don't use atomic operations when it's not needed. In updt_fd_polling(), when updating fd_nbupdt, there's no need to use an atomic operation, as it's a TLS variable.	2019-06-12 14:36:24 +02:00
Christopher Faulet	86fcf6d6cd	MINOR: htx: Add the function htx_move_blk_before() The function htx_add_data_before() was removed because it was buggy. The function htx_move_blk_before() may be used if necessary to do something equivalent, except it just moves blocks. It doesn't handle the adding.	2019-06-11 14:05:25 +02:00
Christopher Faulet	d7884d3449	MAJOR: htx: Rework how free rooms are tracked in an HTX message In an HTX message, there may be 2 available rooms to store a new block. The first one is between the blocks and their payload. Blocks are added starting from the end of the buffer and their payloads are added starting from the beginning. So the first free room is between these 2 edges. The second one is at the beginning of the buffer, when we start to wrap to add new payloads. Once we start to use this one, the other one is ignored until the next defragmentation of the HTX message. In theory, there is no problem. But in practice, some shortcomings in the HTX structure force us to defragment HTX messages too often to always be in a known state. The second free room is not tracked as it should be and the first one may easily be corrupted when rewrites happen. So to fix the problem and avoid unnecessary defragmentation, the HTX structure has been refactored. The front (the block's position of the first payload before the blocks) is no longer stored. Instead we keep the relative addresses of 3 edges:
    * tail_addr : The start address of the free space in front of the blocks table
    * head_addr : The start address of the free space at the beginning
    * end_addr  : The end address of the free space at the beginning
Here is the general view of the HTX message now:
             head_addr     end_addr    tail_addr
                 |            |            |
                 V            V            V
    +------------+------------+------------+------------+------------------+
    |            |            |            |            |                  |
    |  PAYLOAD   | Free space |  PAYLOAD   | Free space |    Blocks area   |
    |    ==>     |     1      |    ==>     |     2      |      <==         |
    +------------+------------+------------+------------+------------------+
<head_addr> is always lower or equal to <end_addr> and <tail_addr>. <end_addr> is always lower or equal to <tail_addr>. In addition, to simplify everything, the blocks area is now contiguous. It doesn't wrap anymore.
So the head is always the block with the lowest position, and the tail is always the one with the highest position.	2019-06-11 14:05:25 +02:00
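The three edges described above are enough to compute both free rooms. The sketch below is a simplified model, not the real `struct htx`: the structure name `htx_sketch`, the fields `size` and `blk_area`, and the helper names are assumptions made for illustration; only `head_addr`, `end_addr` and `tail_addr` and their ordering come from the commit message.

```c
#include <stdint.h>

/* Simplified, hypothetical model of an HTX message's data area. */
struct htx_sketch {
    uint32_t size;       /* assumed: total size of the area, in bytes */
    uint32_t blk_area;   /* assumed: bytes used by the blocks table at the end */
    uint32_t head_addr;  /* start of the free space at the beginning */
    uint32_t end_addr;   /* end of the free space at the beginning */
    uint32_t tail_addr;  /* start of the free space before the blocks table */
};

/* Check the ordering stated in the commit message:
 * head_addr <= end_addr <= tail_addr, with the tail before the blocks.
 */
static int htx_sketch_is_consistent(const struct htx_sketch *h)
{
    return h->blk_area <= h->size &&
           h->head_addr <= h->end_addr &&
           h->end_addr <= h->tail_addr &&
           h->tail_addr <= h->size - h->blk_area;
}

/* "Free space 1" in the diagram: between head_addr and end_addr. */
static uint32_t htx_sketch_free_front(const struct htx_sketch *h)
{
    return h->end_addr - h->head_addr;
}

/* "Free space 2" in the diagram: between tail_addr and the blocks area. */
static uint32_t htx_sketch_free_tail(const struct htx_sketch *h)
{
    return (h->size - h->blk_area) - h->tail_addr;
}
```

Because both rooms are derived from explicit edges, neither has to be rediscovered by defragmenting the message, which is the motivation given in the commit message.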
Christopher Faulet	86bc8df955	BUG/MEDIUM: compression/htx: Fix the adding of the last data block The function htx_add_data_before() is buggy and cannot work. It first adds a data block and then moves it before another one, passed as an argument. The problem happens when a defragmentation is done to add the new block. In this case, the reference is no longer valid, because the blocks are rearranged. So, instead of moving the new block before the reference, it is moved at the head of the HTX message. So this function has been removed. It was only used by the compression filter to add a last data block before a TLR, EOT or EOM block. Now, the new function htx_add_last_data() is used. It adds a last data block, after all others and before any TLR, EOT or EOM block. Then, the next block is retrieved. It is the first non-data block after data in the HTX message. The compression loop continues with it. This patch must be backported to 1.9.	2019-06-11 14:05:25 +02:00
Willy Tarreau	9a1f57351d	MEDIUM: threads: add thread_sync_release() to synchronize steps This function provides an alternate way to leave a critical section run under thread_isolate(). Currently, a thread may remain in thread_release() without having the time to notice that the rdv mask was released and taken again by another thread entering thread_isolate() (often the same that just released it). This is because threads wait in harmless mode in the loop, which is compatible with the conditions to enter thread_isolate(). It's not possible to make them wait with the harmless bit off or we cannot know when the job is finished for the next thread to start in thread_isolate(), and if we don't clear the rdv bit when going there, we create another race on the start point of thread_isolate(). This new synchronous variant of thread_release() makes use of an extra mask to indicate the threads that want to be synchronously released. In this case, they will be marked harmless before releasing their sync bit, and will wait for others to release their bit as well, guaranteeing that thread_isolate() cannot be started by any of them before they all left thread_sync_release(). This allows to construct synchronized blocks like this :
    thread_isolate()
    /* optionally do something alone here */
    thread_sync_release()
    /* do something together here */
    thread_isolate()
    /* optionally do something alone here */
    thread_sync_release()
And so on. This is particularly useful during initialization where several steps have to be respected and no thread must start a step before the previous one is completed by other threads. This one must not be placed after any call to thread_release() or it would risk to block an earlier call to thread_isolate() which the current thread managed to leave without waiting for others to complete, and end up here with the thread's harmless bit cleared, blocking others. This might be improved in the future.	2019-06-10 09:42:43 +02:00
Willy Tarreau	9faebe34cd	MEDIUM: tools: improve time format error detection As reported in GH issue #109 and in discourse issue https://discourse.haproxy.org/t/haproxy-returns-408-or-504-error-when-timeout-client-value-is-every-25d the time parser doesn't error on overflows nor underflows. This is a recurring problem which additionally has the bad taste of taking a long time before hitting the user. This patch makes parse_time_err() return special error codes for overflows and underflows, and adds the control in the call places to report suitable errors depending on the requested unit. In practice, underflows are almost never returned as the parsing function takes care of rounding values up, so this might possibly happen on 64-bit overflows returning exactly zero after rounding though. It is not really possible to cut the patch into pieces as it changes the function's API, hence all callers. Tests were run on about every relevant part (cookie maxlife/maxidle, server inter, stats timeout, timeout*, cli's set timeout command, tcp-request/response inspect-delay).	2019-06-07 19:32:02 +02:00
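The kind of overflow detection described here can be sketched as follows. This is not the real parse_time_err(): the function `parse_time_ms()`, the `enum time_err` codes and the reduced unit list are hypothetical, and the sketch assumes the result must fit in a signed 32-bit count of milliseconds, which is why a value around 25 days no longer fits.

```c
#include <stdint.h>

/* Hypothetical error codes for this sketch. */
enum time_err { TIME_OK = 0, TIME_INVALID, TIME_OVERFLOW };

/* Parse "<digits>[ms|s|m|h|d]" into milliseconds (default unit: ms).
 * Assumption: the result must fit in a signed 32-bit integer.
 */
static enum time_err parse_time_ms(const char *text, int32_t *out)
{
    uint64_t value = 0;
    uint64_t mult = 1;

    if (*text < '0' || *text > '9')
        return TIME_INVALID;

    while (*text >= '0' && *text <= '9') {
        value = value * 10 + (uint64_t)(*text - '0');
        if (value > INT32_MAX)      /* too large whatever the unit */
            return TIME_OVERFLOW;
        text++;
    }

    if (text[0] == 'm' && text[1] == 's') {
        text += 2;                  /* explicit milliseconds */
    } else if (*text != '\0') {
        switch (*text) {
        case 's': mult = 1000; break;
        case 'm': mult = 60000; break;
        case 'h': mult = 3600000; break;
        case 'd': mult = 86400000; break;
        default:  return TIME_INVALID;
        }
        text++;
    }

    if (*text != '\0')              /* trailing garbage after the unit */
        return TIME_INVALID;

    /* value < 2^31 and mult < 2^27, so the 64-bit product cannot wrap */
    if (value * mult > INT32_MAX)
        return TIME_OVERFLOW;

    *out = (int32_t)(value * mult);
    return TIME_OK;
}
```

Under this assumption "24d" is accepted while "25d" is reported as an overflow instead of silently wrapping, so the caller can emit an error at configuration time.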
Frédéric Lécaille	b65717fa55	MINOR: peers: Optimization for dictionary cache lookup. When we look up a dictionary entry in the cache used upon transmission, we store the last result in ->prev_lookup of struct dcache_tx in order to compare it with the subsequent entries to look up and improve performance.	2019-06-07 15:47:54 +02:00
Frédéric Lécaille	99de1d0479	MINOR: dict: Store the length of the dictionary entries. When allocating new dictionary entries we store the length of the strings. This may be useful to avoid calling strlen() too often at run time.	2019-06-07 15:47:54 +02:00
Frédéric Lécaille	6c39198b57	MINOR peers: data structure simplifications for server names dictionary cache. We store pointers to server names dictionary entries in a pre-allocated array of ebpt_node's (->entries member of struct dcache_tx) to cache those sent to remote peers. Consequently the ID used to identify the server name dictionary entry is also used as index for this array. There is no need to implement a lookup by key for this dictionary cache.	2019-06-07 15:47:54 +02:00
Willy Tarreau	1bfd6020ce	MINOR: logs: use the new bitmap functions instead of fd_sets for encoding maps The fd_sets we've been using in the log encoding functions are not portable and were shown to break at least under Cygwin. This patch gets rid of them in favor of the new bitmap functions. It was verified with the config below that the log output was exactly the same before and after the change : defaults mode http option httplog log stdout local0 timeout client 1s timeout server 1s timeout connect 1s frontend foo bind :8001 capture request header chars len 255 backend bar option httpchk "GET" "/" "HTTP/1.0\r\nchars: \x01\x02\x03\x04\x05\x06\x07\x08\x09\x0b\x0c\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff" server foo 127.0.0.1:8001 check	2019-06-07 11:13:24 +02:00
Willy Tarreau	7355b040d1	MINOR: tools: add new bitmap manipulation functions We now have ha_bit_{set,clr,flip,test} to manipulate bitfields made of arrays of longs. The goal is to get rid of the remaining non-portable FD_{SET,CLR,ISSET} that still exist at a few places.	2019-06-07 10:44:49 +02:00
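A bitmap made of an array of longs can be manipulated as in the sketch below. The helper names (`bit_set()`, `bit_clr()`, `bit_flip()`, `bit_test()`) and their signatures are assumptions made for illustration; the real ha_bit_* functions may differ in both.

```c
#include <limits.h>

/* Number of bits stored in each array element. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Set bit number <bit> in the bitmap <map>. */
static inline void bit_set(unsigned long bit, unsigned long *map)
{
    map[bit / SKETCH_BITS_PER_LONG] |= 1UL << (bit % SKETCH_BITS_PER_LONG);
}

/* Clear bit number <bit> in the bitmap <map>. */
static inline void bit_clr(unsigned long bit, unsigned long *map)
{
    map[bit / SKETCH_BITS_PER_LONG] &= ~(1UL << (bit % SKETCH_BITS_PER_LONG));
}

/* Invert bit number <bit> in the bitmap <map>. */
static inline void bit_flip(unsigned long bit, unsigned long *map)
{
    map[bit / SKETCH_BITS_PER_LONG] ^= 1UL << (bit % SKETCH_BITS_PER_LONG);
}

/* Return non-zero if bit number <bit> is set in the bitmap <map>. */
static inline int bit_test(unsigned long bit, const unsigned long *map)
{
    return !!(map[bit / SKETCH_BITS_PER_LONG] &
              (1UL << (bit % SKETCH_BITS_PER_LONG)));
}
```

Unlike fd_set, such a map has a size chosen by the caller, for example 256 bits to hold one flag per byte value in the log encoding maps.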
Willy Tarreau	ad660e3f84	BUILD: stream-int: avoid a build warning in dev mode in si_state_bit() The BUG_ON() test emits a warning about an always-true comparison regarding <state> which cannot be lower than zero. Let's get rid of it.	2019-06-06 16:42:08 +02:00
Willy Tarreau	3b285d7fbd	MINOR: stream-int: make si_sync_send() from the send code of si_update_both() Just like we have a synchronous recv() function for the stream interface, let's have a synchronous send function that we'll be able to call from different places. For now this only moves the code, nothing more.	2019-06-06 16:36:19 +02:00
Willy Tarreau	236c4298b3	MINOR: stream-int: split si_update() into si_update_rx() and si_update_tx() We should not update the two directions at once, in fact we should update the Rx path after recv() and the Tx path after send(). Let's start by splitting the update function in two for this.	2019-06-06 16:36:19 +02:00
Willy Tarreau	8c603ded39	MEDIUM: stream-int: make idle-conns switch to ST_RDY The purpose of making idle-conns switch to SI_ST_CON was to make the transition detectable and the operation retryable in case of connection error. Now we have the RDY state for this which is much more suitable since it indicates a validated connection on which we didn't necessarily send anything yet. This will still lead to a transition to EST while not requiring unnatural write polling nor connect timeouts.	2019-06-06 16:36:19 +02:00
Willy Tarreau	4f283fa604	MEDIUM: stream-int: introduce a new state SI_ST_RDY The main reason for all the trouble we're facing with stream interface error or timeout reports during the connection phase is that we currently can't make the difference between a connection attempt and a validated connection attempt. It is problematic because we tend to switch early to SI_ST_EST but can't always do what we want in this state since it's supposed to be set when we don't need to visit sess_establish() again. This patch introduces a new state between SI_ST_CON and SI_ST_EST, which is SI_ST_RDY. It indicates that we've verified that the connection is ready. It's a transient state, like SI_ST_DIS, that cannot persist when leaving process_stream(). For now it is not set, only verified in various tests where SI_ST_CON was used or SI_ST_EST depending on the cases. The stream-int state diagram was minimally updated to reflect the new state, though it is largely obsolete and would need to be seriously updated.	2019-06-06 16:36:19 +02:00
Willy Tarreau	7ab22adbf7	MEDIUM: stream-int: remove dangerous interval checks for stream-int states The stream interface state checks involving ranges were replaced with checks on a set of states, already revealing some issues. No issue was fixed, all was replaced in a one-to-one mapping for easier control. Some checks involving a strict difference were also replaced with fields to be clearer. At this stage, the result must be strictly equivalent. A few tests were also turned to their bit-field equivalent for better readability or in preparation for upcoming changes. The test performed in the SPOE filter was swapped so that the closed and error states are evicted first and that the established vs conn state is tested second.	2019-06-06 16:36:19 +02:00
Willy Tarreau	bedcd698b3	MINOR: stream-int: use bit fields to match multiple stream-int states at once At some places we do check for ranges of stream-int states but those are confusing as states ordering is not well known (e.g. it's not obvious that CER is between CON and EST). Let's create a bit field from states so that we can match multiple states at once instead. The new enum si_state_bit contains SI_SB_* which are state bits instead of state values. The function si_state_in() indicates if the state in argument is one of those represented by the bit mask in second argument.	2019-06-06 16:36:19 +02:00
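The bit-field matching described in bedcd698b3 can be sketched as follows. This is a minimal illustration and not HAProxy's source: the state list is shortened, its ordering is an assumption, and the mask is passed as a plain unsigned int for simplicity. Only the name si_state_in() and the SI_SB_* prefix come from the commit message.

```c
/* Illustrative subset of stream-interface states. The ordering is an
 * assumption made for this example only.
 */
enum si_state {
	SI_ST_INI = 0,
	SI_ST_CON,
	SI_ST_CER,
	SI_ST_EST,
	SI_ST_DIS,
	SI_ST_CLO,
};

/* One bit per state value, so that several states fit in a single mask. */
enum si_state_bit {
	SI_SB_NONE = 0,
	SI_SB_INI  = 1U << SI_ST_INI,
	SI_SB_CON  = 1U << SI_ST_CON,
	SI_SB_CER  = 1U << SI_ST_CER,
	SI_SB_EST  = 1U << SI_ST_EST,
	SI_SB_DIS  = 1U << SI_ST_DIS,
	SI_SB_CLO  = 1U << SI_ST_CLO,
};

/* Returns 1 if <state> is one of the states whose bits are set in <mask>,
 * otherwise 0.
 */
static inline int si_state_in(enum si_state state, unsigned int mask)
{
	return !!((1U << state) & mask);
}
```

A range test such as "state >= SI_ST_CON && state <= SI_ST_EST" then becomes si_state_in(state, SI_SB_CON | SI_SB_CER | SI_SB_EST), which names every matched state explicitly instead of relying on the enum ordering.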
Olivier Houchard	03abf2d31e	MEDIUM: connections: Remove CONN_FL_SOCK* Now that the various handshakes come with their own XPRT, there's no need for the CONN_FL_SOCK* flags, and the conn_sock_want|stop functions, so garbage-collect them.	2019-06-05 18:03:38 +02:00
Olivier Houchard	fe50bfb82c	MEDIUM: connections: Introduce a handshake pseudo-XPRT. Add a new XPRT that is used when using non-SSL handshakes, such as proxy protocol or Netscaler, instead of taking care of it in conn_fd_handler(). This XPRT is installed when any of those is used, and it removes itself once the handshake is done. This should allow us to remove the distinction between CO_FL_SOCK* and CO_FL_XPRT*.	2019-06-05 18:03:38 +02:00
Olivier Houchard	2e055483ff	MINOR: connections: Add a new xprt method, add_xprt(). Add a new method to xprt_ops, add_xprt(), that changes the underlying xprt to the one provided, and optionally provide the old one.	2019-06-05 18:03:38 +02:00
Olivier Houchard	5149b59851	MINOR: connections: Add a new xprt method, remove_xprt. Add a new method to xprt_ops, remove_xprt. When called, if the provided xprt_ctx is the same as the xprt's underlying xprt_ctx, it then uses the new xprt provided, otherwise it calls the remove_xprt method of the next xprt. The goal is to be able to add a temporary xprt, that removes itself from the chain when it did what it had to do. This will be used to implement a pseudo-xprt for anything that just requires a handshake (such as the proxy protocol).	2019-06-05 18:03:38 +02:00
Olivier Houchard	000694cf96	MINOR: ssl: Make ssl_sock_handshake() static. ssl_sock_handshake is now only used by the ssl code itself, there's no need to export it anymore, so make it static.	2019-06-05 18:03:38 +02:00
Olivier Houchard	ea8dd949e4	MEDIUM: ssl: Handle subscribe by itself. As the SSL code may have different needs than the upper layer, i.e. it may want to receive when the upper layer wants to write, instead of directly forwarding the subscribe to the underlying xprt, handle it ourselves. The SSL code will now remember any subscribe call, and wake the tasklet when it is ready for more I/O.	2019-06-05 18:03:38 +02:00
Christopher Faulet	54b5e214b0	MINOR: htx: Don't use end-of-data blocks anymore This type of block is useless because the transition between data and trailers is obvious. And when there are no trailers, the end-of-message is still there to know when data end for chunked messages.	2019-06-05 10:12:11 +02:00
Christopher Faulet	2d7c5395ed	MEDIUM: htx: Add the parsing of trailers of chunked messages HTTP trailers are now parsed in the same way headers are. It means trailers are converted to K/V blocks followed by an end-of-trailer marker. For now, to make things simple, the type for trailer blocks is not the same as for header blocks. But the aim is to make no difference between headers and trailers by using the same type. Probably for the end-of marker too.	2019-06-05 10:12:11 +02:00
Christopher Faulet	8f3c256f7e	MEDIUM: cache/htx: Always store info about HTX blocks in the cache It was only done for the headers (including the EOH marker). Data were prefixed by the info field of these blocks. The payload and the trailers of the messages were stored in raw. The total size of headers and payload were kept in the cached object state to help output formatting. Now, info about each HTX block is stored in the cache. Only data are allowed to be split. Otherwise, all blocks of an HTX message are handled the same way, both when storing a message in the cache and when delivering it from the cache. This will help the cache implementation to be more robust to internal changes in the HTX. Especially for the upcoming parsing of trailers. There is also no more need to keep extra info in the cached object state.	2019-06-05 10:12:11 +02:00
Christopher Faulet	a4f9dd4a56	BUG/MINOR: channel/htx: Don't alter channel during forward for empty HTX message In channel_htx_forward() and channel_htx_forward_forever(), if the HTX message is empty, the underlying buffer may be really empty too. And we have no guarantee the caller will call htx_to_buf() later. And in practice, it is almost never done. So the channel's buffer must not be altered. Otherwise, the buffer may be considered as full (data == size) for an empty HTX message and no outgoing data. This patch must be backported to 1.9.	2019-06-05 10:12:11 +02:00
Frédéric Lécaille	8d78fa7def	MINOR: peers: Make peers protocol support new "server_name" data type. Use the APIs implemented for dictionaries (dict.c) and their LRU caches (struct dcache) to send/receive the server names used for server-by-name stickiness. These names are sent over the network as follows: - in every case we send the encoded length of the data (STD_T_DICT), then - if the server name is not present in the cache used upon transmission (struct dcache_tx) we cache it and we send the ID of this TX cache entry followed by the encoded length of the server name, and finally the server name itself (non NULL-terminated string). - if the server name is present, we repeat these operations but we only send the TX cache entry ID. Upon receipt, the couples of (cache ID, server name) are stored in the LRU cache used only upon receipt (struct dcache_rx). As the peers protocol is symmetrical, the fact that the server name is present in the received data (resp. or not) denotes if the entry is absent (resp. or not).	2019-06-05 08:42:33 +02:00
Frédéric Lécaille	7da71293e4	MINOR: server: Add a dictionary for server names. This patch only declares and defines a dictionary for the server names (stored as ->id member field).	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	84d6046a33	MINOR: proxy: Add a "server by name" tree to proxy. Add a tree to the proxy struct to look up by name the servers attached to this proxy, and populate it at parsing time.	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	5ad57ea85f	MINOR: stick-table: Add "server_name" new data type. This simple patch only adds definitions to create a new stick-table data type ID and a new standard type to store information in relation with dictionary entries (STD_T_DICT).	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	74167b25f7	MINOR: peers: Add a LRU cache implementation for dictionaries. We want to send some stick-table data fields stored as strings in dictionaries without consuming too much memory and CPU. To do so we implement with this patch a cache for sent/received dictionary entries. These dictionary of strings entries are stored in other real dictionary entries with an identifier as key (unsigned int) and a pointer to the dictionary of strings entries as values.	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	4a3fef834c	MINOR: dict: Add dictionary new data structure. This patch adds minimalistic definitions to implement a new dictionary data structure which is an ebtree of ebpt_node structs with strings as keys. Note that this has nothing to do with a real dictionary data structure (maps of keys in association with values).	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	1673bbdf98	CLEANUP: peers: Remove tabs characters. This patch only replaces very annoying tabulation characters by spaces so that we do not have to use tabulations again where they should not be used.	2019-06-05 08:33:34 +02:00
Willy Tarreau	7bb39d7cd6	CLEANUP: connection: remove the now unused CS_FL_REOS flag Let's remove it before it gets used again. It was mostly replaced with CS_FL_EOI and by mux-specific states or flags.	2019-06-03 14:23:33 +02:00
Willy Tarreau	7067b3a92e	BUG/MINOR: deinit/threads: make hard-stop-after perform a clean exit As reported in GH issue #99, when hard-stop-after triggers and threads are in use, the chance that any thread releases the resources in use by the other ones is non-null. Thus no thread should be allowed to deinit() nor exit by itself. Here we take a different approach. We simply use a 3rd possible value for the "killed" variable so that all threads know they must break out of the run-poll-loop and immediately stop. This patch was tested by commenting the stream_shutdown() calls in hard_stop() to increase the chances to see a stream use released resources. With this fix applied, it never crashes anymore. This fix should be backported to 1.9 and 1.8.	2019-06-02 11:30:07 +02:00
Alexander Liu	2a54bb74cd	MEDIUM: connection: Upstream SOCKS4 proxy support Have "socks4" and "check-via-socks4" server keyword added. Implement handshake with SOCKS4 proxy server for tcp stream connection. See issue #82. I have the "SOCKS: A protocol for TCP proxy across firewalls" doc found at "https://www.openssh.com/txt/socks4.protocol". Please refer to it. [wt: for now connecting to the SOCKS4 proxy over unix sockets is not supported, and mixing IPv4/IPv6 is discouraged; indeed, the control layer is unique for a connection and will be used both for connecting and for target address manipulation. As such it may for example report incorrect destination addresses in logs if the proxy is reached over IPv6]	2019-05-31 17:24:06 +02:00
Olivier Houchard	cfbb3e6560	MEDIUM: tasks: Get rid of active_tasks_mask. Remove the active_tasks_mask variable, we can deduce if we've work to do by other means, and it is costly to maintain. Instead, introduce a new function, thread_has_tasks(), that returns non-zero if there are tasks scheduled for the thread, zero otherwise.	2019-05-29 21:53:37 +02:00
Olivier Houchard	250031e444	MEDIUM: sessions: Introduce session flags. Add session flags, and add a new flag, SESS_FL_PREFER_LAST, to be set when we use NTLM authentication, and we should reuse the last connection. This should fix using NTLM with HTX. This totally replaces TX_PREFER_LAST. This should be backported to 1.9.	2019-05-29 15:41:47 +02:00
Willy Tarreau	ef28dc11e3	MINOR: task: turn the WQ lock to an RW_LOCK For now it's exclusively used as a write lock though, thus it remains 100% equivalent to the spinlock it replaces.	2019-05-28 19:15:44 +02:00
Willy Tarreau	186e96ece0	MEDIUM: buffers: relax the buffer lock a little bit In lock profiles it's visible that there is a huge contention on the buffer lock. The reason is that when offer_buffers() is called, it systematically takes the lock before verifying if there is any waiter. However doing so doesn't protect against races since a waiter can happen just after we release the lock as well. Similarly in h2 we take the lock every time an h2c is going to be released, even without checking that the h2c belongs to a wait list. These two have now been addressed by verifying non-emptiness of the list prior to taking the lock.	2019-05-28 17:25:21 +02:00
Willy Tarreau	a8b2ce02b8	MINOR: activity: report the number of failed pool/buffer allocations Haproxy is designed to be able to continue to run even under very low memory conditions. However this can sometimes have a serious impact on performance that is hard to diagnose. Let's report counters of failed pool and buffer allocations per thread in show activity.	2019-05-28 17:25:21 +02:00
Willy Tarreau	2ae84e445d	MEDIUM: poller: separate the wait time from the wake events We have been abusing the do_poll()'s timeout for a while, making it zero whenever there is some known activity. The problem this poses is that it complicates activity diagnostic by incrementing the poll_exp field for each known activity. It also requires extra computations that could be avoided. This change passes a "wake" argument to say that the poller must not sleep. This simplifies the operations and allows one to differentiate expirations from activity.	2019-05-28 17:25:21 +02:00
Willy Tarreau	0a7ef02074	MINOR: htx: make htx_add_data() return the transmitted byte count In order to later allow htx_add_data() to transmit partial blocks and avoid defragmenting the buffer, we'll need to return the number of bytes consumed. This first modification makes the function do this and its callers take this into account. At the moment the function still works atomically so it returns either the block size or zero. However all call places have been adapted to consider any value between zero and the block size.	2019-05-28 14:48:59 +02:00
Willy Tarreau	d4908fa465	MINOR: htx: rename htx_append_blk_value() to htx_add_data_atonce() This function is now dedicated to data blocks, and we'll soon need to access it from outside in a rare few cases. Let's rename it and export it.	2019-05-28 14:48:59 +02:00
Christopher Faulet	39744f792d	MINOR: htx: Remove support of pseudo headers because it is unused The code to handle pseudo headers is unused and with no real value. So remove it.	2019-05-28 07:42:33 +02:00
Christopher Faulet	613346b60e	MINOR: htx: remove the unused function htx_find_blk()	2019-05-28 07:42:33 +02:00
Christopher Faulet	dab5ab551d	MINOR: channel/htx: Add functions to forward a part or all HTX payload The functions channel_htx_fwd_payload() and channel_htx_fwd_all() should now be used to forward, respectively, a part of the HTX payload or all of it. These functions forward data and update the first block position.	2019-05-28 07:42:33 +02:00
Christopher Faulet	29f1758285	MEDIUM: htx: Store the first block position instead of the start-line one We don't store the start-line position anymore in the HTX message. Instead we store the first block position to analyze. For now, it is almost the same. But once all changes are made on this part, this position will have to be used by HTX analyzers, and only in the analysis context, to know where the analysis should start. When new blocks are added in an HTX message, if the first block position is not defined, it is set. When the block pointed by it is removed, it is set to the block following it. -1 remains the value to unset the position. The first block position is unset when the HTX message is empty. It may also be unset on a non-empty message, meaning all blocks were already analyzed. From HTX analyzers point of view, this position is always set during headers analysis. When they are waiting for a request or a response, if it is unset, it means the analysis should wait. But once the analysis is started, and as long as headers are not forwarded, it points to the message start-line. As mentioned, outside the HTX analysis, no code must rely on the first block position. So multiplexers and applets must always use the head position to start a loop on an HTX message.	2019-05-28 07:42:33 +02:00
Christopher Faulet	b2f4e83a28	MINOR: channel/htx: Add function to forward headers of an HTX message The function channel_htx_fwd_headers() should now be used by HTX analyzers to forward all headers of an HTX message, from the start-line to the corresponding EOH. It takes care to update the start-line position.	2019-05-28 07:42:33 +02:00
Christopher Faulet	05c083ca8d	MINOR: htx: Add a field to set the memory used by headers in the HTX start-line The field hdrs_bytes has been added in the structure htx_sl. It should be used to set how many bytes are held by all headers, from the start-line to the corresponding EOH block. It must be set to -1 if it is unknown.	2019-05-28 07:42:12 +02:00
Christopher Faulet	9b04d22945	MINOR: connection: Remove the unused flag CO_RFL_KEEP_RSV	2019-05-28 07:42:12 +02:00
Christopher Faulet	2ae35045e2	MINOR: htx: Add function htx_get_max_blksz() This function should be used to get the maximum size for a block, not exceeding the max amount of bytes passed in argument. Thus max may be set to -1 to have no limit.	2019-05-28 07:42:12 +02:00
Christopher Faulet	aad458587d	MINOR: channel/htx: Call channel_htx_recv_max() from channel_recv_max() When channel_recv_max() is called for an HTX stream, we fall back on the HTX version. This function is called from si_cs_recv(). This will let us pass the max amount of bytes to read to HTX multiplexers.	2019-05-28 07:42:12 +02:00
Christopher Faulet	dd2ad8518f	CLEANUP: htx: Remove unused function htx_get_stline()	2019-05-28 07:42:12 +02:00
Christopher Faulet	297fbb45fe	MINOR: htx: Replace the function http_find_stline() by http_get_stline() Now, we only return the start-line. If not found, NULL is returned. No lookup is performed and the HTX message is no more updated. It is now the caller's responsibility to update the position of the start-line to the right value. So when it is not found, i.e. sl_pos is set to -1, it means the last start-line has been already processed and the next one has not been inserted yet. It is mandatory to rely on this kind of guarantee to store 1xx informational responses and the final response in the same HTX message.	2019-05-28 07:42:12 +02:00
Christopher Faulet	a3ad6b1b8f	MINOR: htx: Add functions to get the first block of an HTX message It is the first block relatively to the start-line. So it is the start-line if its position is set (sl_pos != -1), otherwise it is the head. The functions htx_get_first() and htx_get_first_blk() can be used to get it. This change is mandatory to consider 1xx informational messages as part of a response.	2019-05-28 07:42:12 +02:00
Christopher Faulet	9c66b980fa	MINOR: htx: Store start-line block's position instead of address of its payload Nothing much to say. This change is just mandatory to consider 1xx informational messages as part of a response.	2019-05-28 07:42:12 +02:00
Christopher Faulet	28f29c7eea	MINOR: htx: Store the head position instead of the wrap one The head of an HTX message is heavily used whereas the wrap position is only used when a block is added or removed. So it is more logical to store the head position in the HTX message instead of the wrap one. The wrap position can be easily deduced. To get it, the new function htx_get_wrap() may be used.	2019-05-28 07:42:12 +02:00
Christopher Faulet	c8b246f108	MINOR: htx: Move the macro IS_HTX_STRM() in proto/stream.h The macro IS_HTX_STRM() only relies on stream flags. So move it in proto/stream.h.	2019-05-28 07:42:12 +02:00
Christopher Faulet	429b91d308	MINOR: htx: Remove the macro IS_HTX_SMP() and always use IS_HTX_STRM() instead The macro IS_HTX_SMP() is only used in one place, in a context where the stream always exists. So, we can remove it to use IS_HTX_STRM() instead.	2019-05-28 07:42:12 +02:00
Willy Tarreau	c3b5958255	BUG/MEDIUM: threads: fix double-word CAS on non-optimized 32-bit platforms On armv7 haproxy doesn't work because of the fixes on the double-word CAS. There are two issues. The first one is that the last argument in case of dwcas is a pointer to the set of values and not a value ; the second is that it's not enough to cast the data as (void*) since it will be a single word. Let's fix this by using the pointers as an array of long. This was tested on i386, armv7, x86_64 and aarch64 and it is now fine. An alternate approach using a struct was attempted as well but it used to produce less optimal code. This fix must be backported to 1.9. This fixes github issue #105. Cc: Olivier Houchard <ohouchard@haproxy.com>	2019-05-27 17:40:59 +02:00
Willy Tarreau	d6a7850200	MINOR: cli/activity: add 3 general purpose counters in development mode The unused fd_del and fd_skip were being abused during debugging sessions as general purpose event counters. With their removal, let's officially have dedicated counters for such use cases. These counters are called "ctr0".."ctr2" and are listed at the end when DEBUG_DEV is set.	2019-05-27 07:03:38 +02:00
Willy Tarreau	394c9b4215	MINOR: cli/activity: remove "fd_del" and "fd_skip" from show activity These variables are never set anymore and were always reported as zero.	2019-05-27 06:59:14 +02:00
Willy Tarreau	c4943d5170	MINOR: buffer: add a new buffer ring API to manipulate rings of buffers The purpose is to manipulate rings made of series of buffers so that it is possible to continue to work on a next buffer once one is full. This will be used by muxes to deal with contention between multiple streams and a single output buffer. No data is expected to span over multiple buffers, all of them will be used like a regular buffer. This will significantly limit the amount of changes and the code complexity while still supporting larger output buffering. The ring is made of a head and a tail indexes both of which point to a buffer descriptor. At least one descriptor is always valid, so it could be seen as a form of pagination always presenting one buffer. The root of the ring is itself stored into a buffer descriptor so that the user only has to declare a buffer array and to call br_init() on it in order to use it.	2019-05-26 09:26:59 +02:00
Willy Tarreau	e39b58f045	MINOR: buffer: introduce b_make() to make a buffer from its parameters This is convenient to assign a buffer from parts of another one.	2019-05-26 09:26:59 +02:00
Willy Tarreau	7562a7291d	CLEANUP: debug: remove the TRACE() macro It has not been used for many years, is unlikely to be reused and conflicts with the similarly named macro in flt_trace, causing warnings at build time when including debug.h in low-level files. Let's simply remove it.	2019-05-26 09:25:59 +02:00
Willy Tarreau	0d6c75d749	OPTIM: freq-ctr: don't take the date lock for most updates It's amazing that the value was still incremented under the date lock, let's first use an atomic increment for the counter and move it out of the date lock to reduce contention. These are just counters, we don't need to take locks if we're not rotating, atomic ops are enough. This patch does this, and leaves the lock for when the period is over. It's important to note that some values might be added just before or just after a rotation but this is not a problem since we don't care if a value is counted in the previous or next period when it's exactly on the edge. Great care was taken to ensure that the current counter is always atomically updated. Other minor cleanups were performed, such as avoiding to reload the value from memory after a CAS, or using &~1 instead of two shifts to remove the lowest bit.	2019-05-25 20:31:53 +02:00
Willy Tarreau	7cf0e4517d	MINOR: raw_sock: report global traffic statistics Many times we've been missing per-process traffic statistics. While it didn't make sense in multi-process mode, with threads it does. Thus we now have a counter of bytes emitted by raw_sock, and a freq counter for these as well. However, freq_ctr are limited to 32 bits, and given that loads of 300 Gbps have already been reached over a loopback using splicing, we need to downscale this a bit. Here we're storing 1/32 of the byte rate, which gives a theoretical limit of 128 GB/s or ~1 Tbps, which is more than enough. Let's have fun re-reading this sentence in 2029 :-) The values can be read in "show info" output on the CLI.	2019-05-23 11:45:38 +02:00
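The 1/32 scaling mentioned in 7cf0e4517d can be checked with a little arithmetic. The sketch below is only an illustration of the numbers in the commit message; the helper names are hypothetical and do not come from HAProxy.

```c
#include <stdint.h>

/* Hypothetical helper: scale a byte count down by 32 before it is added
 * to a 32-bit frequency counter.
 */
static uint32_t demo_bytes_to_ctr(uint64_t bytes)
{
	return (uint32_t)(bytes / 32);
}

/* Hypothetical helper: scale a 32-bit counter value back to a byte rate. */
static uint64_t demo_ctr_to_bytes(uint32_t ctr)
{
	return (uint64_t)ctr * 32;
}
```

The largest 32-bit counter value, 4294967295, maps back to 137438953440 bytes per second, i.e. just under 2^37 B/s = 128 GiB/s, or roughly 1.1 Tbps once multiplied by 8 bits per byte. This matches the "128 GB/s or ~1 Tbps" limit quoted in the message.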
Willy Tarreau	f4c1e56b5e	BUILD: signals: FreeBSD has SI_LWP instead of SI_TKILL SI_TKILL is for Linux. We're again in the non-portable area. Both OSes use macros to define these values so we can #ifdef them. Let's make SI_TKILL defined based on SI_LWP when only the latter is defined.	2019-05-23 08:40:50 +02:00
Willy Tarreau	96d5195862	MEDIUM: config: deprecate the antique req* and rsp* commands These commands don't follow the same flow as the rest of the commands, each of them iterates over all header lines before switching to the next directive. In addition they make no distinction between start line and headers and can lead to unparsable rewrites which are very difficult to deal with internally. Most of them are still occasionally found in configurations, mainly because of the usual "we've always done this way". By marking them deprecated and emitting a warning and recommendation on first use of each of them, we will raise users' awareness regarding the cleaner, faster and more reliable alternatives. Some use cases of "reqrep" still appear from time to time for URL rewriting that is not so convenient with other rules. But at least users facing this requirement will explain their use case so that we can best serve them. Some discussion started on this subject in a thread linked to from github issue #100. The goal is to remove them in 2.1 since they require to reparse the result before indexing it and we don't want this hack to live long. The following directives were marked deprecated : -reqadd -reqallow -reqdel -reqdeny -reqiallow -reqidel -reqideny -reqipass -reqirep -reqitarpit -reqpass -reqrep -reqtarpit -rspadd -rspdel -rspdeny -rspidel -rspideny -rspirep -rsprep	2019-05-22 20:43:45 +02:00
Willy Tarreau	d1f56c9a01	BUG/MEDIUM: dns: make the port numbers unsigned Mustafa Yildirim reported in Discourse that ports >32767 advertised in SRV records are wrong. Given the high value they definitely correspond to a sign extension of a negative number. The cause was indeed that the port is declared as a signed int in the dns_answer_item structure, and Lukas confirmed in github issue #103 that turning it to unsigned addresses the issue. It is worth noting that there are other such fields in this structure that don't look right (ttl, priority, class, type) and that someone should audit this part to be certain they are properly typed. This fix must be backported to 1.9 and likely to 1.8 as well.	2019-05-22 20:07:45 +02:00
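The sign-extension symptom reported in d1f56c9a01 can be reproduced in a few lines. This sketch assumes a 16-bit signed field, which the message does not state explicitly; the function names are hypothetical.

```c
#include <stdint.h>

/* Stores the port in a signed 16-bit variable, then widens it to int.
 * The conversion of values above 32767 is implementation-defined, and on
 * usual two's complement platforms it yields a negative number.
 */
static int demo_port_signed(uint16_t wire_port)
{
	int16_t port = (int16_t)wire_port;
	return port;
}

/* Stores the port in an unsigned 16-bit variable: the value is preserved. */
static int demo_port_unsigned(uint16_t wire_port)
{
	uint16_t port = wire_port;
	return port;
}
```

With this sketch a port such as 8080 survives both paths, while 40000 only survives the unsigned one, which is consistent with the report that only ports above 32767 were affected.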
Willy Tarreau	e5733234f6	CLEANUP: build: rename some build macros to use the USE_* ones We still have quite a number of build macros which are mapped 1:1 to a USE_something setting in the makefile but which have a different name. This patch cleans this up by renaming them to use the USE_something one, allowing to clean up the makefile and make it more obvious when reading the code what build option needs to be added. The following renames were done : ENABLE_POLL -> USE_POLL ENABLE_EPOLL -> USE_EPOLL ENABLE_KQUEUE -> USE_KQUEUE ENABLE_EVPORTS -> USE_EVPORTS TPROXY -> USE_TPROXY NETFILTER -> USE_NETFILTER NEED_CRYPT_H -> USE_CRYPT_H CONFIG_HAP_CRYPT -> USE_LIBCRYPT CONFIG_HAP_NS -> USE_NS CONFIG_HAP_LINUX_SPLICE -> USE_LINUX_SPLICE CONFIG_HAP_LINUX_TPROXY -> USE_LINUX_TPROXY CONFIG_HAP_LINUX_VSYSCALL -> USE_LINUX_VSYSCALL	2019-05-22 19:47:57 +02:00
Willy Tarreau	823bda0eb7	BUILD: time: remove the test on _POSIX_C_SOURCE It seems it's not defined on FreeBSD while it's mentioned on Linux that clock_gettime() can be detected using this. Given that we also have the test for _POSIX_TIMERS>0 that should cover it well enough. If it breaks on other systems, we'll see. Report was here : https://github.com/haproxy/haproxy/runs/133866993	2019-05-22 19:14:59 +02:00
Willy Tarreau	082b62828d	BUG/MEDIUM: init/threads: provide per-thread alloc/free function callbacks We currently have the ability to register functions to be called early on thread creation and at thread deinitialization. It turns out this is not sufficient because certain such functions may use resources that are being allocated by the other ones, thus creating a race condition depending only on the linking order. For example the mworker needs to register a file descriptor while the pollers will reallocate the fd_updt[] array. Similarly logs and trashes may be used by some init functions while it's unclear whether they have been deduplicated. The same issue happens on deinit, if the fd_updt[] or trash is released before some functions finish to use them, we'll get into trouble. This patch creates a couple of early and late callbacks for per-thread allocation/freeing of resources. A few init functions were moved there, and the fd init code was split between the two (since it used to both allocate and initialize at once). This way the init/deinit sequence is expected to be safe now. This patch should be backported to 1.9 as at least the trash/log issue seems to be present. The run_thread_poll_loop() code is a bit different there as the mworker is not a callback, but it will have no effect and it's enough to drop the mworker changes. This bug was reported by Ilya Shipitsin in github issue #104.	2019-05-22 14:59:08 +02:00
Willy Tarreau	ca2a3cc8d5	MINOR: connection: report the mux names in "haproxy -vv" Since the mux names appear at a few places (dumps etc), let's list them in front of supported mux protocols in "haproxy -vv".	2019-05-22 11:50:48 +02:00
Willy Tarreau	430f590b5b	MINOR: threads: add a timer_t per thread in thread_info This will be used by the watchdog to detect that a thread locked up. It's only defined on platforms supporting it. This patch only reserves the room for the timer in the struct. A special value was reserved for the uninitialized timer. The problem is that the POSIX API was horribly designed, defining no invalid value, thus for each timer it is required to keep a second variable to indicate whether it's valid. A quick check shows that defining a 32-bit invalid value is not something uncommon across other implementations, with ~0 being common. Let's try with this and if it causes issues we can revisit this decision.	2019-05-22 11:50:48 +02:00
Willy Tarreau	e6a02fa65a	MINOR: threads: add a "stuck" flag to the thread_info struct This flag is constantly cleared by the scheduler and will be set by the watchdog timer to detect stuck threads. It is also set by the "show threads" command so that it is easy to spot if the situation has evolved between two subsequent calls : if the first "show threads" shows no stuck thread and the second one shows such a stuck thread, it indicates that this thread didn't manage to make any forward progress since the previous call, which is extremely suspicious.	2019-05-22 11:50:48 +02:00
Willy Tarreau	5484d58a17	MINOR: stream: introduce a stream_dump() function and use it in stream_dump_and_crash() This function dumps a lot of information about a stream into the provided buffer. It is now used by stream_dump_and_crash() and will be used by the debugger as well.	2019-05-22 11:50:48 +02:00
Willy Tarreau	2beaaf7d46	MINOR: threads: implement ha_tkill() and ha_tkillall() These functions are used respectively to signal one thread or all threads. When multithreading is disabled, it's always the current thread which is signaled.	2019-05-22 11:50:48 +02:00
Willy Tarreau	441259c561	MINOR: threads: make threads_{harmless|want_rdv}_mask constant 0 without threads Some code starts to add ifdefs everywhere to work around the lack of threads_harmless_mask when threads are not compiled in. This one is often used to indicate a thread having joined the rendez-vous point or a thread sleeping in the poller. By setting it to zero we translate what usually is required in debugging code (i.e. the only thread is currently working) and for signal handlers we can use a combination of threads_harmless_mask and sleeping_threads_mask to detect the polling cases as well. Similarly do the same with threads_want_rdv_mask which is less often used though.	2019-05-22 11:50:48 +02:00
Willy Tarreau	6ea63c301d	CLEANUP: objtype: make obj_type() and obj_type_name() take consts There is no reason for them to require a writable area.	2019-05-22 11:50:48 +02:00
Tim Duesterhus	9b7a976cd6	BUG/MINOR: mworker: Fix memory leak of mworker_proc members The struct mworker_proc is not uniformly freed everywhere, sometimes leading to leaks of the `id` string (and possibly the other strings). Introduce a mworker_free_child function instead of duplicating the freeing logic everywhere to prevent this kind of issues. This leak was reported in issue #96. It looks like the leaks have been introduced in commit `9a1ee7ac31`, which is specific to 2.0-dev. Backporting `mworker_free_child` might be helpful to ease backporting other fixes, though.	2019-05-22 11:29:18 +02:00
Willy Tarreau	80daaa1e9d	CLEANUP: time: switch clockid_t to empty_t when not available This is cleaner than using an int. We also get rid of the constants that we don't need nor use.	2019-05-21 20:03:03 +02:00
Willy Tarreau	9a85a1700b	MINOR: compat: define a new empty type empty_t for non-implemented fields Some structures have optional fields which depend on availability of certain features on certain platforms, and having to stuff lots of ifdefs in these structs makes them unreadable. Using real values like ints requires some initialization and adds even more confusion. Here we take a different approach : we create an empty type called empty_t to use as a substitute for the real type that is not implemented and which doesn't contain any value (it's an empty struct). Thus it has a size of zero but an address, thus a pointer may point to it. It will not have to be initialized though. Some initialization code might even continue to work and do nothing like initializing it using memset with its sizeof which is zero.	2019-05-21 20:03:03 +02:00
Willy Tarreau	f61782418c	CLEANUP: time: refine the test on _POSIX_TIMERS The clock_gettime() man page says we must check that _POSIX_TIMERS is defined to a value greater than zero, not just that it's simply defined so let's fix this right now.	2019-05-21 20:03:03 +02:00
Emmanuel Hocdet	0ba4f483d2	MAJOR: polling: add event ports support (Solaris) Event ports are a kqueue/epoll-like polling mechanism for Solaris. Code is based on https://github.com/joyent/haproxy-1.8/tree/joyent/dev-v1.8.8. Event ports are available only on SunOS systems derived from Solaris 10 and later (including illumos systems).	2019-05-21 15:16:45 +02:00
Willy Tarreau	219b829b62	MINOR: time: add a function to retrieve another thread's cputime now_cpu_time_thread() does the same as now_cpu_time() but for another thread based on its clockid.	2019-05-20 21:14:14 +02:00
Willy Tarreau	81036f2738	MINOR: time: move the cpu, mono, and idle time to thread_info These ones are useful across all threads and would be better placed in struct thread_info than thread-local. There are very few users.	2019-05-20 21:14:14 +02:00
Willy Tarreau	8323a375bc	MINOR: threads: add a thread-local thread_info pointer "ti" Since we're likely to access this thread_info struct more frequently in the future, let's reserve the thread-local symbol to access it directly and avoid always having to combine thread_info and tid. This pointer is set when tid is set.	2019-05-20 21:14:12 +02:00
Willy Tarreau	624dcbf41e	MINOR: threads: always place the clockid in the struct thread_info It will be easier to deal with the internal API to always have it.	2019-05-20 21:13:01 +02:00
Willy Tarreau	b81939cef0	MINOR: compat: make sure to always define clockid_t In order to ease the internal time API, we'll have the threads time always present even when threads are disabled. Let's make sure clockid_t, and the minimum clock times are defined even on older or non-compatible systems.	2019-05-20 20:24:10 +02:00
Willy Tarreau	5a6e2245fa	REORG: threads: move the struct thread_info from global.h to hathreads.h It doesn't make sense to keep this struct thread_info in global.h, it causes difficulties to access its contents from hathreads.h, let's move it to the threads where it ought to have been created.	2019-05-20 20:00:25 +02:00
Willy Tarreau	e3e2b7283f	REORG: compat: move some integer limit definitions from standard.h to compat.h Historically standard.h was the location where we used to (re-)define the standard set of macros and functions, and to complement the ones missing on the target OS. Over time it has become a toolbox in itself relying on many other things, and its definition of LONGBITS is used everywhere else (e.g. for MAX_THREADS), resulting in painful circular dependencies. Let's move these few defines (integer sizes) to compat.h where other similar definitions normally are.	2019-05-20 19:59:34 +02:00
Willy Tarreau	3710105945	MINOR: tools: provide a may_access() function and make dump_hex() use it It's a bit too easy to crash by accident when using dump_hex() on any area. Let's have a function to check if the memory may safely be read first. This one abuses the stat() syscall checking if it returns EFAULT or not, in which case it means we're not allowed to read from there. In other situations it may return other codes or even a success if the area pointed to by the file exists. It's important not to abuse it though and as such it's tested only once per output line.	2019-05-20 16:59:37 +02:00
Willy Tarreau	56131ca58e	MINOR: debug: implement ha_panic() This function dumps all existing threads using the thread dump mechanism then aborts. This will be used by the lockup detection and by debugging tools.	2019-05-20 16:51:30 +02:00
Willy Tarreau	9fc5dcbd71	MINOR: tools: add dump_hex() This is used to dump a memory area into a buffer for debugging purposes.	2019-05-20 16:51:30 +02:00
Willy Tarreau	91e6df01fa	MINOR: threads: add each thread's clockid into the global thread_info This is the per-thread CPU runtime clock, it will be used to measure the CPU usage of each thread and by the lockup detection mechanism. It must only be retrieved at the beginning of run_thread_poll_loop() since the thread must already have been started for this. But it must be done before performing any per-thread initcall so that all thread init functions have access to the clock ID. Note that it could make sense to always have this clockid available even in non-threaded situations and place the process' clock there instead. But it would add portability issues which are currently easy to deal with by disabling threads so it may not be worth it for now.	2019-05-20 11:42:25 +02:00
Willy Tarreau	522cfbc1ea	MINOR: init/threads: make the global threads an array of structs This way we'll be able to store more per-thread information than just the pthread pointer. The storage became an array of struct instead of an allocated array since it's very small (typically 512 bytes) and not worth the hassle of dealing with memory allocation on this. The array was also renamed thread_info to make its intended usage more explicit.	2019-05-20 11:37:57 +02:00
Willy Tarreau	b49a58dda2	CLEANUP: threads: remove the now unused START_LOCK label The last two users are now gone.	2019-05-20 11:26:12 +02:00
Willy Tarreau	619a95f5ad	MEDIUM: init/mworker: make the pipe register function a regular initcall Now that we have the guarantee that init calls happen before any other thread starts, we don't need anymore the workaround installed by commit `1605c7ae6` ("BUG/MEDIUM: threads/mworker: fix a race on startup") and we can instead rely on a regular per-thread initcall for this function. It will only be performed on worker thread #0, the other ones and the master have nothing to do, just like in the original code that was only moved to the function.	2019-05-20 11:26:12 +02:00
Willy Tarreau	c7091d89ae	MEDIUM: debug/threads: implement an advanced thread dump system The current "show threads" command was too limited as it was not possible to dump other threads' detailed states (e.g. their tasks). This patch goes further by using thread signals so that each thread can dump its own state in turn into a shared buffer provided by the caller. Threads are synchronized using a mechanism very similar to the rendez-vous point and using this method, each thread can safely dump any of its contents and the caller can finally report the aggregated ones from the buffer. It is important to keep in mind that the list of signal-safe functions is limited, so we take care of only using chunk_printf() to write to a pre-allocated buffer. This mechanism is enabled by USE_THREAD_DUMP and is enabled by default on Linux 2.6.28+. On other platforms it falls back to the previous solution using the loop and the less precise dump.	2019-05-17 17:16:20 +02:00
Willy Tarreau	29bf96d73d	MINOR: task: always reset curr_task when freeing a task or tasklet With the thread debugger it becomes visible that we can leave some wandering pointers for a while in curr_task, which is inappropriate. This patch addresses this by resetting curr_task to NULL before really freeing the area. This way it becomes safe even regarding signals.	2019-05-17 17:16:20 +02:00
Willy Tarreau	38171daf21	MINOR: thread: implement ha_thread_relax() At some places we're using a painful ifdef to decide whether to use sched_yield() or pl_cpu_relax() to relax in loops, this is hardly exportable. Let's move this to ha_thread_relax() instead and use this one only.	2019-05-17 17:16:20 +02:00
Willy Tarreau	5cf64dd1bd	MINOR: debug: make ha_thread_dump() and ha_task_dump() take a buffer Instead of having them dump into the trash and initialize it, let's have the caller initialize a buffer and pass it. This will be convenient to dump multiple threads at once into a single buffer.	2019-05-17 17:16:20 +02:00
Willy Tarreau	4e2b646d60	MINOR: cli/debug: add a thread dump function The new function ha_thread_dump() will dump debugging info about all known threads. The current thread will contain a bit more info. The long-term goal is to make it possible to use it in signal handlers to improve the accuracy of some dumps. The function dumps its output into the trash so as it was trivial to add, a new "show threads" command appeared on the CLI.	2019-05-16 18:06:45 +02:00
Willy Tarreau	aa1e1be88f	MINOR: task: export global_task_mask It will be used in debugging functions and must be exported.	2019-05-16 18:02:03 +02:00
Tim Duesterhus	10c6c16cde	MEDIUM: Make 'option forceclose' actually warn It is deprecated since `315b39c391` (1.9-dev), but was only deprecated in the docs. Make it warn when being used and remove it from the docs.	2019-05-16 18:02:03 +02:00
Willy Tarreau	0f35c593f6	BUILD: ist: turn the lower/upper case tables to literal on obsolete linkers Gil Bahat reported build issues on Cygwin starting with 1.9 due to a difference in the way the linker handles the weak symbols there, causing multiple declarations of ist_lc[] and ist_uc[]. It's likely that this issue could also happen on any older or non-ELF linker. This patch addresses this by using literals instead on such platforms, leaving it to the compiler to merge the constants when it can. On other platforms the resulting executable is slightly larger due to strings that could not be merged but this is a minor detail compared to not being able to build at all. If this change alone is confirmed to fix these issues, it's safe to backport to 1.9.	2019-05-15 16:14:04 +02:00
Willy Tarreau	469fa2c9d9	MINOR: debug: add a new BUG_ON macro We do have some code paths testing for impossible errors that tend to be quite confusing, first for maintenance (what to do on such errors, and how far to guess the bug), second for developers as it tends to hide the main purpose and expectations of these call places. Also most of the time impossible errors are ignored by the callers so the tests are not even usable during debugging. Let's instead implement a BUG_ON macro which takes a condition, which if true, will cause a message to be emitted and optionally to crash the process. Additionally, these calls inserted at various places serve as hints and documentation for developers to know that such conditions must absolutely not happen. This is only enabled when DEBUG_STRICT or DEBUG_STRICT_NOCRASH are set. As its name implies, DEBUG_STRICT_NOCRASH only performs the test but does not crash, which can be useful to track some checkpoints. At the moment nothing uses this code.	2019-05-14 17:34:49 +02:00
Willy Tarreau	a5e33a9b66	BUILD: debug: make gcc not complain on the ABORT_NOW() macro On recent gcc versions with the null-deref checks, ABORT_NOW() rightfully emits such a warning. But here it's on purpose. Simply changing the memory address to 1 makes gcc happy.	2019-05-14 17:22:28 +02:00
Willy Tarreau	8bdb5c9bb4	CLEANUP: connection: remove the handle field from the wait_event struct It was only set and not consumed after the previous change. The reason is that the task's context always contains the relevant information, so there is no need for a second pointer.	2019-05-13 19:14:52 +02:00
Willy Tarreau	42ccb5ac45	MINOR: lists: add LIST_ADDED() to check if an element belongs to a list Some code parts use LIST_ISEMPTY() a lot on list elements to detect if they were reset consecutive to their removal from a list, but this test is always confusing as this was initially designed for list heads. Instead let's have a new macro, LIST_ADDED(), which returns true when the element is in a list (i.e. it's not "empty").	2019-05-13 19:14:52 +02:00
Olivier Houchard	478281f55d	BUG/MEDIUM: connections: Don't forget to set xprt_ctx to NULL on close. In conn_xprt_close(), after calling xprt->close(), don't forget to set conn->xprt_ctx to NULL, or we may attempt to reuse the now-free'd conn->xprt_ctx if the connection failed and we're retrying it.	2019-05-13 19:11:38 +02:00
Willy Tarreau	6a38b3297c	BUILD: threads: fix again the __ha_cas_dw() definition This low-level asm implementation of a double CAS was implemented only for certain architectures (x86_64, armv7, armv8). When threads are not used, they were not defined, but since they were called directly from a few locations, they were causing build issues on certain platforms with threads disabled. This was addressed in commit `f4436e1` ("BUILD: threads: Add __ha_cas_dw fallback for single threaded builds") by making it fall back to HA_ATOMIC_CAS() when threads are not defined, but this actually made the situation worse by breaking other cases. This patch fixes this by creating a high-level macro HA_ATOMIC_DWCAS() which is similar to HA_ATOMIC_CAS() except that it's intended to work on a double word, and which rely on the asm implementations when threads are in use, and uses its own open-coded implementation when threads are not used. The 3 call places relying on __ha_cas_dw() were updated to use HA_ATOMIC_DWCAS() instead. This change was tested on i586, x86_64, armv7, armv8 with and without threads with gcc 4.7, armv8 with gcc 5.4 with and without threads, as well as i586 with gcc-3.4 without threads. It will need to be backported to 1.9 along with the fix above to fix build on armv7 with threads disabled.	2019-05-11 18:13:29 +02:00
Willy Tarreau	295d614de1	CLEANUP: ssl: move all BIO_* definitions to openssl-compat The following macros are now defined for openssl < 1.1 so that we can remove the code performing direct access to the structures : BIO_get_data(), BIO_set_data(), BIO_set_init(), BIO_meth_free(), BIO_meth_new(), BIO_meth_set_gets(), BIO_meth_set_puts(), BIO_meth_set_read(), BIO_meth_set_write(), BIO_meth_set_create(), BIO_meth_set_ctrl(), BIO_meth_set_destroy()	2019-05-11 17:39:08 +02:00
Willy Tarreau	11b167167e	CLEANUP: ssl: remove ifdef around SSL_CTX_get_extra_chain_certs() Instead define this one in openssl-compat.h when SSL_CTRL_GET_EXTRA_CHAIN_CERTS is not defined (which was the current condition used in the ifdef).	2019-05-11 17:38:21 +02:00
Willy Tarreau	366a6987a7	CLEANUP: ssl: move the SSL_OP_* and SSL_MODE_* definitions to openssl-compat These ones were defined in the middle of ssl_sock.c, better move them to the include file to find them.	2019-05-11 17:37:44 +02:00
Olivier Houchard	602bf7d2ea	MEDIUM: streams: Add a new http action, disable-l7-retry. Add a new action for http-request, disable-l7-retry, that can be used to disable any attempt at retry requests (see retry-on) if it fails for any reason other than a connection failure. This is useful for example to make sure POST requests aren't retried.	2019-05-10 17:49:09 +02:00
Chris Packham	f4436e145b	BUILD: threads: Add __ha_cas_dw fallback for single threaded builds __ha_cas_dw() is used in fd_rm_from_fd_list() and when built without USE_THREADS=1 the linker fails to find __ha_cas_dw(). Add a definition of __ha_cas_dw() for the #ifndef USE_THREADS case. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>	2019-05-10 10:55:31 +02:00
Willy Tarreau	c125cef6da	CLEANUP: ssl: make inclusion of openssl headers safe It's always a pain to have to stuff lots of #ifdef USE_OPENSSL around ssl headers, it even results in some of them appearing in a random order and multiple times just to benefit from an existing ifdef block. Let's make these headers safe for inclusion when USE_OPENSSL is not defined, they now perform the test themselves and do nothing if USE_OPENSSL is not defined. This allows to remove no less than 8 such ifdef blocks and make include blocks more readable.	2019-05-10 09:58:43 +02:00
Willy Tarreau	8d164dc568	CLEANUP: ssl: never include openssl/*.h outside of openssl-compat.h anymore Since we're providing a compatibility layer for multiple OpenSSL implementations and their derivatives, it is important that no C file directly includes openssl headers but only passes via openssl-compat instead. As a bonus this also gets rid of redundant complex rules for inclusion of certain files (engines etc).	2019-05-10 09:36:42 +02:00
Willy Tarreau	9356dacd22	REORG: ssl: move some OpenSSL defines from ssl_sock to openssl-compat Some defines like OPENSSL_VERSION or X509_getm_notBefore() have nothing to do in ssl_sock and must move to openssl-compat.h so that they are consistently shared by the whole code. A warning in the code was added against wild additions of macros there.	2019-05-10 09:31:06 +02:00
Willy Tarreau	5599456ee2	REORG: ssl: move openssl-compat from proto to common This way we can include it much earlier to cover types/ as well.	2019-05-10 09:19:50 +02:00
Willy Tarreau	1d158ab12d	BUILD: ssl: make libressl use its own version numbers LibreSSL causes lots of build issues by pretending to be OpenSSL 2.0.0, and it requires lots of care for each #if added to cover any specific OpenSSL features. This commit addresses the problem by making LibreSSL only advertise the version it forked from (1.0.1g) and by starting to use tests based on its real version to enable features instead of working by exclusion.	2019-05-09 14:25:47 +02:00
Willy Tarreau	9a1ab08160	CLEANUP: ssl-sock: use HA_OPENSSL_VERSION_NUMBER instead of OPENSSL_VERSION_NUMBER Most tests on OPENSSL_VERSION_NUMBER have become complex and break all the time because this number is fake for some derivatives like LibreSSL. This patch creates a new macro, HA_OPENSSL_VERSION_NUMBER, which will carry the real openssl version defining the compatibility level, and this version will be adjusted depending on the variants.	2019-05-09 14:25:43 +02:00
Olivier Houchard	92e38e254b	BUG/MEDIUM: servers: Don't use the same srv flag for cookie-set and TFO. The tfo code was based on an old patch, and the value of the SRV_F_FASTOPEN flag it used was since reused for SRV_F_COOKIESET. So give SRV_F_FASTOPEN its own value.	2019-05-08 19:48:32 +02:00
Christopher Faulet	6f3cb1801b	MINOR: htx: Remove support for unused OOB HTX blocks This type of block was introduced in the early design of the HTX and it is not used anymore. So, just remove it. This patch may be backported to 1.9.	2019-05-07 22:16:41 +02:00
Christopher Faulet	3b1d004d41	BUG/MEDIUM: spoe: Be sure the sample is found before setting its context When a sample fetch is encoded, we use its context to set info about the fragmentation. But if the sample is not found, the function sample_process() returns NULL. So we must be sure the sample exists before setting its context. This patch must be backported to 1.9 and 1.8.	2019-05-07 22:16:41 +02:00
William Lallemand	27edc4b915	MINOR: mworker: support a configurable maximum number of reloads This patch implements a new global parameter for the master-worker mode. When setting the mworker-max-reloads value, a worker receives a SIGTERM if its number of reloads is greater than this value.	2019-05-07 19:09:01 +02:00
Willy Tarreau	f656279347	CLEANUP: task: remove unneeded tests before task_destroy() Since the previous commit it's not needed anymore to test a task pointer before calling task_destroy() so let's just remove these tests from the various callers before they become confusing. The function's arguments were also documented. The same should probably be done with tasklet_free() which involves a test in roughly half of the call places.	2019-05-07 19:08:16 +02:00
Dragan Dosen	75bc6d3e59	BUG/MEDIUM: tasks: fix possible segfault on task_destroy() Commit `3f795f7` ("MEDIUM: tasks: Merge task_delete() and task_free() into task_destroy().") replaced task_delete() and task_free() with a single function named task_destroy(). This patch adds a check for struct task* argument in function task_destroy() to prevent a possible segfault on NULL and also to make the function safer for use in other cases.	2019-05-07 18:58:52 +02:00
Dragan Dosen	2674303912	MEDIUM: regex: modify regex_comp() to atomically allocate/free the my_regex struct Now we atomically allocate the my_regex struct within function regex_comp() and compile the regex or free both in case of failure. The pointer to the allocated my_regex struct is returned directly. The my_regex* argument to regex_comp() is removed. Function regex_free() was modified so that it systematically frees the my_regex entry. The function does nothing when called with a NULL as argument (like free()). It will avoid existing risk of not properly freeing the initialized area. Other structures are also updated in order to be compatible (the ones related to Lua and action rules).	2019-05-07 06:58:15 +02:00
Frédéric Lécaille	c02766a267	MINOR: stick-table: Add prefixes to stick-table names. With this patch we add a prefix to stick-table names declared in "peers" sections concatenating the "peers" section name followed by a '/' character with the stick-table name. Consequently, "peers" sections have their own namespace for their stick-tables. Obviously, these stick-table names are not the ones which should be sent over the network. So these configurations must be compatible and should make A and B peers communicate with peers protocol: # haproxy A config, old way stick-table declarations peers mypeers peer A ... peer B ... backend t1 stick-table type string size 10m store gpc0 peers mypeers # haproxy B config, new way stick-table declarations peers mypeers peer A ... peer B ... table t1 type string size store gpc0 10m This "network" name is stored in ->nid new field of stktable struct. The "local" stktable-name is still stored in ->id.	2019-05-07 06:54:07 +02:00
Frédéric Lécaille	015e4d7d93	MINOR: stick-tables: Add peers process binding computing. Add a list of proxies for all the stick-tables (->proxies_list struct stktable member) so as to be able to compute the process bindings of the peers after having parsed the configuration file. The proxies are added to the stick-tables they reference when parsing stick-tables lines in proxy sections, when checking the actions in check_trk_action() and when resolving samples args for stick-tables without checking if they are duplicates. We only check that there is no loop. Then, after having parsed everything, we add the proxy bindings to the peers frontend bindings with stick-tables they reference.	2019-05-07 06:54:07 +02:00
Frédéric Lécaille	1b8e68e89a	MEDIUM: stick-table: Stop handling stick-tables as proxies. This patch adds the support for the "table" line parsing in "peers" sections to declare stick-table in such sections. This also prevents the user from having to declare dummy backends sections with a unique stick-table inside. Even if still supported, this usage will become deprecated. To do so, the ->table member of proxy struct which is a stktable struct is replaced by a pointer to a stktable struct allocated at parsing time in src/cfgparse-listen.c for the dummy stick-table backends and in src/cfgparse.c for "peers" sections. This has an impact on the code for stick-table sample converters and on the stickiness rules parsers which first store the name of the dummy before resolving the rules. This patch replaces proxy_tbl_by_name() calls by stktable_find_by_name() calls to look up stick-tables stored in "stktable_by_name" ebtree at parsing time. There is only one remaining place where proxy_tbl_by_name() is used: src/hlua.c. At several places in the code we relied on the fact that ->size member of stick-table was equal to zero to consider the stick-table was present but not configured, this does not make sense anymore as ->table member of struct proxy is from now on a pointer. These tests are replaced by a test on ->table value itself. In "peers" section we do not have to temporarily store the name of the section the stick-table are attached to because this name is obviously already known just after having entered this "peers" section. About the CLI stick-table I/O handler, the pointer to proxy struct is replaced by a pointer to a stktable struct.	2019-05-07 06:54:06 +02:00
Frédéric Lécaille	f92da38222	BUILD/MINOR: stick-table: Compilation fix. Missing header to dereference struct peers pointer from struct table.	2019-05-07 06:54:06 +02:00
Frédéric Lécaille	d456aa4ac2	MINOR: config: Extract the code of "stick-table" line parsing. With this patch we move the code responsible for parsing "stick-table" lines to implement parse_stick_table() function in src/stick_table.c so as to be able to parse "stick-table" elsewhere than in proxy sections. We have also added a conf struct to stktable struct to store the filename and the line in the file the stick-table has been parsed to help in diagnosing and displaying any configuration issue.	2019-05-07 06:54:06 +02:00
Willy Tarreau	034c88cf03	MEDIUM: tcp: add the "tfo" option to support TCP fastopen on the server This implements support for the new API which relies on a call to setsockopt(). On systems that support it (currently, only Linux >= 4.11), this enables using TCP fast open when connecting to server. Please note that you should use the retry-on "conn-failure", "empty-response" and "response-timeout" keywords, or the request won't be able to be retried on failure. Co-authored-by: Olivier Houchard <ohouchard@haproxy.com>	2019-05-06 22:29:39 +02:00
Olivier Houchard	fdcb007ad8	MEDIUM: proto: Change the prototype of the connect() method. The connect() method had 2 arguments, "data", that tells if there's pending data to be sent, and "delack" that tells if we have to use a delayed ack unconditionally, or if the backend is configured with tcp-smart-connect. Turn that into one argument, "flags". That way it'll be easier to provide more information to connect() without adding extra arguments.	2019-05-06 22:12:57 +02:00
Ilya Shipitsin	54832b97c6	BUILD: enable several LibreSSL hacks, including: SSL_SESSION_get0_id_context is introduced in LibreSSL-2.7.0; async operations are not supported by LibreSSL; early data is not supported by LibreSSL; packet_length is removed from SSL struct in LibreSSL	2019-05-06 07:26:24 +02:00
Olivier Houchard	e3249a98e2	MEDIUM: streams: Add a new keyword for retry-on, "junk-response" Add a way to retry requests if we got a junk response from the server, ie an incomplete response, or something that is not valid HTTP. To do so, one can use the new "junk-response" keyword for retry-on.	2019-05-04 10:20:24 +02:00
Olivier Houchard	865d8392bb	MEDIUM: streams: Add a way to replay failed 0rtt requests. Add a new keyword for retry-on, 0rtt-rejected. If set, we will try to replay requests for which we sent early data that got rejected by the server. If that option is set, we will attempt to use 0rtt if "allow-0rtt" is set on the server line even if the client didn't send early data.	2019-05-04 10:20:24 +02:00
Olivier Houchard	a254a37ad7	MEDIUM: streams: Add the ability to retry a request on L7 failure. When running in HTX mode, if we sent the request, but failed to get the answer, either because the server just closed its socket, we hit a server timeout, or we get a 404, 408, 425, 500, 501, 502, 503 or 504 error, attempt to retry the request, exactly as if we just failed to connect to the server. To do so, add a new backend keyword, "retry-on". It accepts a list of keywords, which can be "none" (never retry), "conn-failure" (we failed to connect, or to do the SSL handshake), "empty-response" (the server closed the connection without answering), "response-timeout" (we timed out while waiting for the server response), or "404", "408", "425", "500", "501", "502", "503" and "504". The default is "conn-failure".	2019-05-04 10:19:56 +02:00
Willy Tarreau	c40efc1919	MINOR: init/threads: make the threads array global Currently the thread array is a local variable inside a function block and there is no access to it from outside, which often complicates debugging. Let's make it global and export it. Also the allocation return is now checked.	2019-05-03 10:16:30 +02:00
Willy Tarreau	81492c989c	MINOR: threads: flatten the per-thread cpu-map When we initially experimented with threads and processes support, we needed to implement arrays of threads per process for cpu-map, but this is not needed anymore since we support either threads or processes. Let's simply make the thread-based cpu-map per thread and not per thread and per process since that's not used anymore. Doing so reduces the global struct from 33kB to 1.5kB.	2019-05-03 09:46:45 +02:00
Olivier Houchard	a48237fd07	BUG/MEDIUM: connections: Make sure we remove CO_FL_SESS_IDLE on disown. When for some reason the session is not the owner of the connection anymore, make sure we remove CO_FL_SESS_IDLE, even if we're about to call conn->mux->destroy(), as the destroy may not destroy the connection immediately if it's still in use. This should be backported to 1.9.	2019-05-02 12:08:39 +02:00
Olivier Houchard	55071d30ca	BUG/MEDIUM: channels: Don't forget to reset output in channel_erase(). In channel_erase(), don't forget to set output to 0, otherwise the channel won't seem empty, when it really is, and that could lead to stream never closing properly. This should be backported to 1.9.	2019-05-02 10:40:59 +02:00
Christopher Faulet	102854cbba	BUG/MEDIUM: listener: Fix how unlimited number of consecutive accepts is handled There is a bug when global.tune.maxaccept is set to -1 (no limit). It is pretty visible with one process (nbproc set to 1). The functions listener_accept() and accept_queue_process() don't expect to handle negative maxaccept values. So instead of accepting incoming connections without any limit, none are ever accepted and HAProxy loops infinitely in the scheduler. When there are 2 or more processes, the bug is a bit more subtle. The limit for a listener is set to 1. So only one connection is accepted at a time by a given listener. This happens because the listener's maxaccept value is an unsigned integer. In check_config_validity(), it is first set to UINT_MAX (-1 cast to an unsigned integer), and then some calculations on it lead to an integer overflow. To fix the bug, the listener's maxaccept value is now a signed integer. So, if a negative value is set for global.tune.maxaccept, we keep it untouched for the listener and no calculation is made on it. Then, in the listener code, this signed value is cast to an unsigned one. It simplifies all tests instead of dealing with negative values. So, it limits the number of connections accepted at a time to UINT_MAX at most. But, honestly, it is not an issue. This patch must be backported to 1.9 and 1.8.	2019-04-30 15:28:29 +02:00
Olivier Houchard	07425de717	BUG/MEDIUM: port_range: Make the ring buffer lock-free. Port range uses a ring buffer, and unfortunately, when making haproxy multithreaded, it's been overlooked, and the ring buffer is not thread-safe. When specifying a source range, 2 or more threads could pick the same port, and of course only one of them could use the port, the others would always fail the connection. To fix this, make it a lock-free ring buffer. This is easier than usual because we know the ring buffer can never be full. This should be backported to 1.8 and 1.9.	2019-04-30 15:10:17 +02:00
Olivier Houchard	9ce62b5498	MINOR: threads: Implement HA_ATOMIC_LOAD(). The same way we have HA_ATOMIC_STORE(), implement HA_ATOMIC_LOAD(). This should be backported to 1.8 and 1.9, as we need it for a bug fix in port ranges.	2019-04-30 15:10:08 +02:00
Willy Tarreau	bc13bec548	MINOR: activity: report context switch counts instead of rates It's not logical to report context switch rates per thread in show activity because everything else is a counter and it's not even possible to compare values. Let's only report counts. Further, this simplifies the scheduler's code.	2019-04-30 14:55:18 +02:00
Willy Tarreau	9634e86dc7	CLEANUP: task: move the task_per_thread definition to task.h It's the second time I look for it and can't find it because it's not in the right file.	2019-04-30 14:36:47 +02:00
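
The maxaccept fix listed above (102854cbba) turns on unsigned wraparound, which a few lines of C can illustrate. This is only a sketch under stated assumptions, not the code from the patch: the helper names `halve_unsigned()`, `halve_signed()` and `effective_limit()` are hypothetical, and the "halve, rounding up" formula merely stands in for the "some calculations" the commit message mentions.

```c
#include <limits.h>

/* Hypothetical helper: halve a limit, rounding up, on an unsigned value.
 * When the limit is UINT_MAX ("no limit" stored as -1), the + 1 wraps
 * around to 0 and the limit silently collapses.
 */
static unsigned int halve_unsigned(unsigned int maxaccept)
{
	return (maxaccept + 1) / 2;
}

/* Sketch of the approach described in the commit message: keep the value
 * signed and leave negative ("no limit") values untouched.
 */
static int halve_signed(int maxaccept)
{
	if (maxaccept < 0)
		return maxaccept;
	/* round up without risking overflow near INT_MAX */
	return maxaccept / 2 + maxaccept % 2;
}

/* At the point of use, converting the signed value to unsigned turns -1
 * into UINT_MAX, so no special case for negative values is needed there.
 */
static unsigned int effective_limit(int maxaccept)
{
	return (unsigned int)maxaccept;
}
```

With these helpers, `halve_unsigned((unsigned int)-1)` wraps to 0, whereas `effective_limit(halve_signed(-1))` stays at `UINT_MAX`, which matches the "UINT_MAX at most" behavior the commit describes.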

3969 Commits