The fallback macros for when stdio is not there didn't have the "..."
and were causing build issues on platforms with stricter dependencies
between includes.
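As a rough illustration of the problem and fix (macro and function names here are hypothetical, not the exact ones from the tree), the empty fallback must be variadic as well so that any call still compiles when stdio is absent:
    #ifdef EOF  /* provided by stdio.h */
    #define SHOW_FLAGS(buf, len, ...) snprintf(buf, len, __VA_ARGS__)
    #else
    /* the "..." is required here too, otherwise calls passing extra
     * arguments fail to expand when stdio was not included first */
    #define SHOW_FLAGS(buf, len, ...) do { } while (0)
    #endif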
This patch is a prerequisite for #1626.
Adding PAUSED state to the list of available proxy states.
The flag is set when the proxy is paused at runtime (pause_listener()).
It is cleared when the proxy is resumed (resume_listener()).
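A minimal sketch of the intent (the exact flag name below is an assumption, not taken from the patch):
    /* in pause_listener(): mark the proxy as paused at runtime */
    px->flags |= PR_FL_PAUSED;

    /* in resume_listener(): clear it once the proxy is resumed */
    px->flags &= ~PR_FL_PAUSED;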
It should be backported to 2.6, 2.5 and 2.4
A minor API change was performed in listener(.c/.h) to restore consistency
between stop_listener() and (resume/pause)_listener() functions.
LISTENER_LOCK was never locked prior to calling stop_listener(): the lli
variable hint is thus not useful anymore.
Added PROXY_LOCK locking in the (resume/pause)_listener() functions
with the related lpx variable hint (prerequisite for #1626).
It should be backported to 2.6, 2.5 and 2.4
The new function is strm_et_show_flags(). Only the error type is
handled at the moment, as a bit more complex logic is needed to
mix the values and enums present in some fields.
The two new functions are se_show_flags() and sc_show_flags().
Maybe something could be done for SC_ST_* values but as it's a
small enum, a simple switch/case should work fine.
The two new functions are chn_show_analysers() and chn_show_flags().
They work on an existing buffer so one was declared in flags.c for this
purpose. File flags.c does not have to know about channel flags anymore.
Some of our flags have enums inside a mask. The new macro __APPEND_ENUM
is able to deal with that by comparing the flag's value against an exact
one under the mask. One needs to take care of eliminating the zero value
though, otherwise delimiters will not always be properly placed (e.g. if
some flags were dumped before and what remains is exactly zero). The
bits of the mask are cleared only upon exact matches.
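Conceptually the core test looks like the sketch below (a simplified illustration, not the exact macro): the name is appended only on an exact match of the value under the mask, and only then are the mask bits cleared so that unknown values remain visible in the epilogue.
    if ((flg & msk) == val && val != 0) {
            flg &= ~msk;
            /* append the delimiter (if needed) and the enum's name to buf */
    }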
The "flags" utility is useful but painful to maintain up to date. This
commit aims at providing a low-maintenance solution to keep flags up to
date, by proposing some macros that build a string from a set of flags
in a way that requires the least possible verbosity.
The idea will be to add an inline function dedicated to this just after
the flags declaration, and enumerate the flags one is interested in, and
that function will fill a string based on them.
Placing this inside the type files allows both haproxy and external tools
like "flags" to use it, but comes with a few constraints. First, the
files will be slightly less readable if these functions are huge, so they
need to stay as compact as possible. Second, the function will need
snprintf() and we don't want to include stdio.h in type files as it
proved to be particularly heavy and to cause definition headaches in
the past.
As such the file here only contains a macro enclosed in #ifdef EOF (that
is defined in stdio), and provides an alternate empty one when no stdio
is defined. This way it's the caller that has to include stdio first or
it won't get anything back, and in practice the locations relying on
this always have it.
The macro has to be used in 3 steps:
  - prologue: dumps 0 and exits if the value is zero
  - flags: the macro can be recursively called and it will push the
    flag from bottom to top so that they appear in the same order as
    today without requiring to be declared the other way around
  - epilogue: dump remaining flags that were not identified
The macro was arranged so that a single character can be used with no
other argument to declare all flags at once. Example:
    #define _(n, ...) __APPEND_FLAG(buf, len, del, flg, n, #n, __VA_ARGS__)
    _(0);
    _(X_FLAG1, _(X_FLAG2, _(X_FLAG3, _(X_FLAG4))));
    _(~0);
    #undef _
Existing files will have to be updated to rely on it, and more files
could come soon.
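For illustration, a dedicated inline function built on this pattern could look like the hypothetical sketch below (the function name, flag names and exact signature are made up); stdio.h must already be included by the caller for the macro to expand to something useful:
    static inline const char *x_show_flags(char *buf, size_t len,
                                           const char *del, unsigned int flg)
    {
    #define _(n, ...) __APPEND_FLAG(buf, len, del, flg, n, #n, __VA_ARGS__)
            /* prologue */
            _(0);
            /* flags */
            _(X_FLAG1, _(X_FLAG2, _(X_FLAG3, _(X_FLAG4))));
            /* epilogue */
            _(~0);
    #undef _
            return buf;
    }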
This is the ->finalize application callback which prepares the unidirectional STREAM
frames for the h3 settings and wakes up the mux I/O handler to send them. As haproxy
is at the same time always waiting for the client request, this makes haproxy
call sendto() to send only about 20 bytes of stream data. Furthermore, in case
of heavy loss, this gives short h3 requests fewer chances to succeed.
Drawback: as the mux currently sends its streams in ascending ID order, stream 0
is always embedded before the unidirectional stream 3 which carries the h3 settings.
Nevertheless, as these settings may be lost and received after other h3 request
streams, this is permitted by the RFC.
Perhaps there is a better way to do this. This will have to be checked with Amaury.
Must be backported to 2.6.
It is possible to speed up the handshake completion, but only once per
connection, as mentioned in RFC 9002 "6.2.3. Speeding up Handshake Completion".
Add a flag to prevent this process from being run several times
(see https://www.rfc-editor.org/rfc/rfc9002#name-speeding-up-handshake-compl).
Must be backported to 2.6.
This removes all the hard-coded 8-bit and 256 entries to use a pair of
macros instead so that we can more easily experiment with larger table
sizes if needed.
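In spirit, the change replaces the literal constants with something along these lines (macro names here are only illustrative):
    #define HASH_BUCKETS_BITS 8                         /* was hard-coded "8" */
    #define HASH_BUCKETS      (1U << HASH_BUCKETS_BITS) /* was hard-coded "256" */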
There used to be one tid for tasklets and a thread_mask for tasks. Since
2.7, both tasks and tasklets now use a tid (albeit with a very slight
semantic difference for the negative value), so in order to limit code
duplication and to ease debugging it makes sense to move tid into the
common part. One limitation is that it will leave a hole in the structure,
but we now have the wake_date that is always present and can move there as
well to plug the hole.
This results in something overall pretty clean (and cleaner than before),
with the low-level stuff (state,tid,process,context) appearing first, then
the caller stuff (caller,wake_date,calls,debug) next, and finally the
type-specific stuff (rq/wq/expire/nice).
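The resulting ordering can be pictured roughly as follows (a simplified sketch with approximate types, not the actual declaration):
    struct task /* simplified sketch */ {
            /* low-level part, common with tasklets */
            unsigned int state;
            int tid;
            struct task *(*process)(struct task *t, void *ctx, unsigned int state);
            void *context;
            /* caller part */
            const void *caller;
            uint32_t wake_date;
            uint32_t calls;
            /* debug section when enabled, then the type-specific part:
             * rq/wq/expire/nice for tasks */
    };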
Instead of storing an index that's swapped at every call, let's use the
two pointers as a shifting history. Now we have a permanent "caller"
field that records the last caller, and an optional prev_caller in the
debug section enabled by DEBUG_TASK that keeps a copy of the previous
caller. This way, not only is it much easier to follow what's
happening during debugging, but it saves 8 bytes in the struct task in
debug mode and still keeps it under 2 cache lines in nominal mode, and
this will finally be usable everywhere and later in profiling.
The caller_idx was also used as a hint that the entry was freed, in order
to detect wakeup-after-free. This was changed by setting caller to -1
instead and preserving its value in caller[1].
Finally, the operations were made atomic. That's not critical but since
it's used for debugging and race conditions represent a significant part
of the issues in multi-threaded mode, it seems wise to at least eliminate
some possible factors of faulty analysis.
appctx_wakeup() relies on task_wakeup(), but since it calls it from a
function, the calling place is always appctx_wakeup() itself, which is
not very useful.
Let's turn it to a macro so that we can log the location of the caller
instead. As an example, the cli_io_handler() which used to be seen as
this:
(gdb) p *appctx->t.debug.caller[0]
$10 = {
  func = 0x9ffb78 <__func__.37996> "appctx_wakeup",
  file = 0x9b336a "include/haproxy/applet.h",
  line = 110,
  what = 1 '\001',
  arg8 = 0 '\000',
  arg32 = 0
}
Now shows the more useful:
(gdb) p *appctx->t.debug.caller[0]
$6 = {
  func = 0x9ffe80 <__func__.38641> "sc_app_chk_snd_applet",
  file = 0xa00320 "src/stconn.c",
  line = 996,
  what = 6 '\006',
  arg8 = 0 '\000',
  arg32 = 0
}
This reduces the task struct by 8 bytes, reduces the code size a little
bit by simplifying the calling convention (one argument dropped), and
as a bonus provides the function name in the caller.
The memstats code currently defines its own file/function/line number,
type and extra pointer. We don't need to keep them separate and we can
easily replace them all with just a struct ha_caller. Note that the
extra pointer could be converted to a pool ID stored into arg8 or
arg32 and be dropped as well, but this would first require to define
IDs for pools (which we currently do not have).
The purpose of this structure is to assemble all constant parts of a
generic calling point for a specific event. These ones are created by
the compiler as a static const element outside of the code path, so
they cost nothing in terms of CPU, and a pointer to that descriptor
can be passed to the place that needs it. This is very similar to what
is being done for the mem_stat stuff.
This will be useful to simplify and improve DEBUG_TASK.
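Based on the fields visible in the gdb dumps above, such a descriptor could look roughly like this (exact field types are an assumption):
    struct ha_caller {
            const char *func;   /* function where the event happens */
            const char *file;   /* source file */
            uint16_t line;      /* source line */
            uint8_t what;       /* type of event (wakeup, queue, ...) */
            uint8_t arg8;       /* optional 8-bit argument */
            uint32_t arg32;     /* optional 32-bit argument */
    };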
There are a few places where it's convenient to hash a pointer to compute
a statistics bucket. Here we're basically reusing the hash that was used
by memory profiling, with a minor update: the multiplier was corrected to
be prime and to stand by its promise of having an equal number of ones and
zeros, and 32-bit platforms won't lose range anymore.
A two-pointer variant was also added.
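A sketch of the idea (the constant and names below are illustrative, not the exact ones from the tree): multiply the pointer by a large odd constant with a balanced bit pattern, then keep the top bits as the bucket index.
    static inline unsigned int ptr_hash_sketch(const void *p, unsigned int bits)
    {
            /* bits is expected to be between 1 and 32; a two-pointer variant
             * would simply xor both pointers before the multiplication */
            unsigned long long x = (unsigned long)p;

            x *= 0x9e3779b97f4a7c15ULL; /* hypothetical mixing constant */
            return x >> (64 - bits);
    }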
It was a mistake to put these two fields in the struct task. This
was added in 1.9 via commit 9efd7456e ("MEDIUM: tasks: collect per-task
CPU time and latency"). These fields are used solely by streams in
order to report the measurements via the lat_ns* and cpu_ns* sample
fetch functions when task profiling is enabled. For the rest of the
tasks, this is pure CPU waste when profiling is enabled, and memory
waste 100% of the time, as the point where these latencies and usages
are measured is in the profiling array.
Let's move the fields to the stream instead, and have process_stream()
retrieve the relevant info from the thread's context.
The struct task is now back to 120 bytes, i.e. almost two cache lines,
with 32 bits still available.
The profile entry that corresponds to the current task/tasklet being
profiled is now stored into the thread's context. This will allow it
to be accessed from the tasks themselves. This is needed for an upcoming
fix.
When task profiling is enabled, the scheduler can measure and report
the cumulated time spent in each task and their respective latencies. But
this was wrong for tasks with few wakeups as well as for self-waking ones,
because the call date needed to measure how long it takes to process the
task is retrieved in the task itself (->wake_date was turned to the call
date), and we could face two conditions:
- a new wakeup while the task is executing would reset the ->wake_date
field before returning, causing abnormally low values to be reported;
that was likely the case for task_run_applet for self-waking applets;
- when the task dies, NULL is returned and the call date couldn't be
retrieved, so that CPU time was not being accounted for. This was
particularly visible with process_stream() which is usually called
only twice per request, and whose time was systematically halved.
The cleanest solution here is to keep in mind that the scheduler already
uses quite a bit of local context in th_ctx, and place the intermediary
values there so that they cannot vanish. The wake_date has to be reset
immediately once read, and only its copy is used throughout the function. Note
that this must be done both for tasks and tasklets, and that until recently
tasklets were also able to report wrong values due to their sole dependency
on TH_FL_TASK_PROFILING between tests.
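The pattern boils down to something like the following sketch (the field name on the thread context is an assumption for illustration):
    /* in the scheduler, just before calling the task/tasklet */
    uint32_t wake_date = t->wake_date;

    t->wake_date = 0;                    /* reset immediately once read */
    th_ctx->sched_wake_date = wake_date; /* copy that cannot vanish */
    /* latency/CPU accounting later relies on th_ctx->sched_wake_date,
     * even if the task rewrites ->wake_date or dies in its handler */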
One nice benefit for future improvements is that such information will now
be available from the task without having to be stored into the task itself
anymore.
Since the tasklet part was computed with wrapping 32-bit arithmetic and
the task one was on 64-bit, the values have now been consistently moved to
32 bits as it's already largely sufficient (4s spent in a task is more
than twice what the watchdog would tolerate). Some further cleanups might
be necessary, but the patch aimed at staying minimal.
Task profiling output after 1 million HTTP requests previously looked like
this:
  Tasks activity:
    function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
    h1_io_cb                    2012338    4.850s   2.410us    12.91s   6.417us
    process_stream              2000136    9.594s   4.796us    34.26s   17.13us
    sc_conn_io_cb               2000135    1.973s   986.0ns    30.24s   15.12us
    h1_timeout_task                 137         -         -   2.649ms   19.34us
    accept_queue_process             49   152.3us   3.107us   321.7yr   6.564yr
    main+0x146430                     7   5.250us   750.0ns   25.92us   3.702us
    srv_cleanup_idle_conns            1   559.0ns   559.0ns   918.0ns   918.0ns
    task_run_applet                   1         -         -   2.162us   2.162us
Now it looks like this:
  Tasks activity:
    function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
    h1_io_cb                    2014194    4.794s   2.380us    13.75s   6.826us
    process_stream              2000151    20.01s   10.00us    36.04s   18.02us
    sc_conn_io_cb               2000148    2.167s   1.083us    32.27s   16.13us
    h1_timeout_task                 198   54.24us   273.0ns   3.487ms   17.61us
    accept_queue_process             52   158.3us   3.044us   409.9us   7.882us
    main+0x1466e0                    18   16.77us   931.0ns   63.98us   3.554us
    srv_cleanup_toremove_conns        8   282.1us   35.26us   546.8us   68.35us
    srv_cleanup_idle_conns            3   149.2us   49.73us   8.131us   2.710us
    task_run_applet                   3   268.1us   89.38us   11.61us   3.871us
Note the two-fold difference on process_stream().
This feature is essentially used for debugging so it has extremely limited
impact. However it's used quite a bit more in bug reports and it would be
desirable that at least 2.6 gets this fix backported. It depends on at least
these two previous patches which will then also have to be backported:
MINOR: task: permanently enable latency measurement on tasklets
CLEANUP: task: rename ->call_date to ->wake_date
This field is misnamed because its real and important content is the
date the task was woken up, not the date it was called. It temporarily
holds the call date during execution but this remains confusing. In
fact before the latency measurements were possible it was indeed a call
date. Thus it will now be called wake_date.
This change is necessary because a subsequent fix will require the
introduction of the real call date in the thread ctx.
When tasklet latency measurement was enabled in 2.4 with commit b2285de04
("MINOR: tasks: also compute the tasklet latency when DEBUG_TASK is set"),
the feature was conditioned on DEBUG_TASK because the field would add 8
bytes to the struct tasklet.
This approach was not a very good idea because the struct ends on an int
anyway, thus it finishes with a 32-bit hole regardless of the presence
of this field. What is true however is that adding it turned a 64-byte
struct into a 72-byte one when caller debugging is enabled.
This patch revisits this with a minor change. Now only the lowest 32
bits of the call date are stored, so they always fit in the remaining
hole, and this allows the dependency on DEBUG_TASK to be removed. With
debugging off, we're now seeing a 48-byte struct, and with debugging
on it's exactly 64 bytes, thus still exactly one cache line. 32 bits
allow a latency of 4 seconds on a tasklet, which already indicates a
completely dead process, so there's no point storing the upper bits at
all. And even in the event it would happen once in a while, the lost
upper bits do not really add any value to the debug reports. Also, now
one tasklet wakeup every 4 billion will not be sampled due to the test
on the value itself. Similarly we just don't care, it's statistics and
the measurements are not 9-digit accurate anyway.
There's a subtle (harmless) bug in task_instant_wakeup(). As it uses
some tasklet code instead of some task code, the debug part also acts
on the tasklet equivalent, and the call_date is only set when DEBUG_TASK
is set instead of unconditionally like with tasks. As such, without this
debugging macro, call dates are not updated for tasks woken this way.
There isn't any impact yet because this function was introduced in 2.6 to
solve certain classes of issues and is not used yet, and in the worst case
it would only affect the reported latency time.
This may be backported to 2.6 in case a future fix would depend on it but
currently will not fix existing code.
The tasklet's call date was not reset, so if profiling was enabled while
some tasklets were in the run queue, their initial random value could be
used to preload a bogus initial latency value into the task profiling bin.
Let's just zero the initial value.
This should be backported to 2.4 as it was brought with initial commit
b2285de04 ("MINOR: tasks: also compute the tasklet latency when DEBUG_TASK
is set"). The impact is very low though.
To work, quic_pin_cid_to_tid() must set cid[0] to a value whose modulo
<global.nbthread> equals <target_tid>. For any integer n, (n - (n % m)) + d
always has d as modulo m (with d < m).
So, this statement seemed correct:
    cid[0] = cid[0] - (cid[0] % global.nbthread) + target_tid;
except when n wraps or when another modulo is applied to the addition result.
Here, with 8-bit modular arithmetic, if m does not divide 256, this cannot
work for values which wrap when incremented by d.
For instance with n=255, m=3 and d=1 the formula gives 0 (it should give d).
To fix this, we first limit cid[0] to 255 - <target_tid> to prevent cid[0] from wrapping.
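The fixed computation thus looks like this sketch, following the description above (variable names assumed from the statement quoted earlier):
    /* prevent the 8-bit wrap: after this, cid[0] + target_tid <= 255 */
    if (cid[0] > 255 - target_tid)
            cid[0] = 255 - target_tid;
    cid[0] = cid[0] - (cid[0] % global.nbthread) + target_tid;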
Thank you to @esb for having reported this issue in GH #1855.
Must be backported to 2.6
LibreSSL does not implement EVP_chacha20_poly1305() with EVP_CIPHER but
uses the EVP_AEAD API instead:
https://man.openbsd.org/EVP_AEAD_CTX_init
This patch disables this cipher for LibreSSL for now.
When building HAProxy with USE_QUIC and libressl 3.6.0, the
ssl_sock_switchtx_cbk symbol is not found because libressl does not
implement the client_hello_cb.
An ssl_sock_switchtx_cbk version for the servername callback is available
but wasn't exported correctly.
This helper will be called for muxes that provide it and will be used
to let the mux provide extra information about the stream attached to
a stream descriptor. A line prefix is passed in argument so that the
mux is free to break long lines without breaking indent. No prefix
means no line breaks should be produced (e.g. for short dumps).
Some recent traces started to show confusing stream pointers ending with
0xe. The reason was that the stream's obj_type was almost unused in the
past and was stuffed in a hole in the structure. But now it's present in
all "show sess all" outputs and having to mentally match this value against
another one that's 0x17e lower is painful. The solution here is to move the
obj_type at the top, like in almost every other structure, but without
breaking the efficient layout.
This patch moves a few fields around and manages to both plug some holes
(16 bytes saved, 976 to 960) and avoid channels needlessly crossing cache
boundaries (res was spread over 3 lines vs 2 now).
Nothing else was changed. It would be desirable to backport this to 2.6
since it's where dumps are currently being processed the most.
As outlined in commit f7ebe584d7 ("BUILD: debug: Add braces to if
statement calling only CHECK_IF()"), the BUG_ON() family of macros
is incorrectly defined to be empty when debugging is disabled, and
that can lead to trouble. Make sure they always fall back to the
usual "do { } while (0)". This may be backported to 2.6 if needed,
though no such issue was met there to date.
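In other words, when debugging is disabled the macros should still expand to a valid statement, along these lines (a simplified illustration, not the exact definitions):
    /* so that constructs like "if (cond) BUG_ON(x); else ..." keep
     * compiling and behaving the same with debugging disabled */
    #define BUG_ON(cond)   do { } while (0)
    #define CHECK_IF(cond) do { } while (0)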
This bug arrived with this commit:
"MINOR: quic: Add reusable cipher contexts for header protection"
haproxy could crash because of missing cipher context initializations for
the header protection and draft-v2 Initial secrets. This was due to the fact
that these initializations, both for RX and TX secrets, were done outside of
qc_new_isecs(). The role of this function is definitively to initialize these
cipher contexts in addition to the derived secrets. Indeed this function is called
by qc_new_conn() which initializes the connection, but also by qc_conn_finalize()
which also calls qc_new_isecs() in case a different QUIC version was negotiated
by the peers from the one used by the client for its first Initial packet.
This was reported by "v2" QUIC interop test with at least picoquic as client.
Must be backported to 2.6.
Shut the connect() warning of resolvers_finalize_config() when the
configuration was not emitted manually.
This shuts the warning for the "default" resolvers which is created
automatically for the httpclient.
Must be backported in 2.6.
It is only a real problem for agent-checks when there is no agent string to
send. The condition to disable TCP_QUICKACK was only based on the action
type following the connect one. But it is not always accurate. Indeed, for
agent-checks, there is always a SEND action. But if there is no "agent-send"
string defined, nothing is sent. In this case, this adds 200ms of latency
for no reason.
To fix the bug, a flag is now used on the CONNECT action to indicate that
data should be sent after the connect. For health-checks, this flag
is set if the action following the connect is a SEND action. For
agent-checks, it is set if an "agent-send" string is defined.
This patch should fix the issue #1836. It must be backported as far as 2.2.
Replace the ->rx.pqpkts quic_enc_level struct member MT_LIST by a LIST.
Same thing for the ->list quic_rx_packet struct member MT_LIST.
Update the code accordingly. This was a leftover from the multithreading
support (several threads per connection).
Must be backported to 2.6
Some clients send CONNECTION_CLOSE frame without acknowledging the STREAM
data haproxy has sent. In this case, when closing the connection if
there were remaining data in QUIC stream buffers, they were not released.
Add a <closing> boolean option to qc_stream_desc_free() to force the
stream buffer memory releasing upon closing connection.
Thank you to Tristan for having reported such a memory leak issue in GH #1801.
Must be backported to 2.6.
In ticket #1805 a user is impacted by the limitation of the size of the CLI
buffer when updating a ca-file.
This patch allows a user to append new certificates to a ca-file instead
of trying to put them all with "set ssl ca-file"
The implementation uses a new function ssl_store_dup_cafile_entry() which
duplicates a cafile_entry and its X509_STORE.
ssl_store_load_ca_from_buf() was modified to take an append parameter so
that the function could be shared between "set" and "add".
In order to be able to append new CA in a cafile_entry,
ssl_store_load_ca_from_buf() was reworked and a "append" parameter was
added.
The function is able to keep the previous X509_STORE which was already
present in the cafile_entry.
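Conceptually, the "append" path can be pictured like this (a simplified sketch using standard OpenSSL calls, not the actual function body; <bio> and <store> are assumed names): each PEM certificate from the payload is parsed and added to the entry's existing X509_STORE instead of rebuilding it from scratch.
    X509 *ca;

    while ((ca = PEM_read_bio_X509(bio, NULL, NULL, NULL)) != NULL) {
            /* keep the previously loaded CAs and just add the new one;
             * X509_STORE_add_cert() takes its own reference */
            if (!X509_STORE_add_cert(store, ca)) {
                    X509_free(ca);
                    break;
            }
            X509_free(ca);
    }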
Implement quic_tls_rx_hp_ctx_init() and quic_tls_tx_hp_ctx_init() to initialize
such header protection cipher contexts for each RX and TX part and for each
packet number space, only once per connection.
Make qc_new_isecs() call these two functions to initialize the cipher contexts
of the Initial secrets. Same thing for ha_quic_set_encryption_secrets() to
initialize the cipher contexts of the subsequent derived secrets (0RTT, 1RTT,
Handshake).
Modify qc_do_rm_hp() and quic_apply_header_protection() to reuse these
cipher contexts.
Note that there is no need to modify the key update for the header protection.
The header protection secrets are never updated.