Commit Graph

99 Commits

Author SHA1 Message Date
Willy Tarreau
3b7942a1c9 MINOR: check/activity: collect some per-thread check activity stats
We now count the number of times a check was started on each thread
and the number of times a check was adopted. This helps understand
better what is observed regarding checks.
2023-09-01 08:26:06 +02:00
Willy Tarreau
338431ecb6 MINOR: activity: report the current run queue size
While troubleshooting the causes of load spikes, it appeared that the
length of individual run queues was missing, let's add it to "show
activity".
2023-09-01 08:26:06 +02:00
Willy Tarreau
8b3e39e37b MINOR: activity: allow "show activity" to restart in the middle of a line
16kB buffers are not enough to dump 4096 threads with up to 10 bytes value
on each line. By storing the column number in the applet's context, we can
now restart from the last attempted column. This requires dumping all values
as they are produced, but it doesn't cost that much: a 4096-thread output
from a fresh process produces 300kB of output in ~8ms, or ~400us per call
(19*16kB), most of which is spent in vfprintf(). Given that we don't print
more than needed, it doesn't really change anything.

The main caveat is that when interrupted on such large lines, there's a
great possibility that the total or average on the first column no longer
matches the sum or average of all dumped values. In order to avoid
this whenever possible (typically less than ~1500 threads), we first try
to dump entire lines and only proceed one column at a time when we have
to retry a failed dump. This is already the same for other stats that are
dumped in an interruptible way anyway and there's little that can be done
about it at this point (and not much immediately perceived benefit in
doing this with extreme accuracy for >1500 threads).
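
The retry strategy described above can be modelled in a few lines. This is a minimal Python sketch, not HAProxy's C implementation: `dump_chunks`, the `retry` flag and the tiny 16-character buffer are hypothetical names and values chosen for illustration. It first tries to emit a whole line and only falls back to one column at a time after a whole-line attempt did not fit, resuming from the saved column.

```python
def dump_chunks(lines, bufsize):
    """Yield successive buffer fills, each at most `bufsize` characters.

    `lines` is a list of rows; each row is a list of fields (header, total,
    then one value per thread). The (line, col, retry) triple plays the role
    of the state an applet context would keep between calls.
    """
    line, col, retry = 0, 0, False
    while line < len(lines):
        buf = ""                      # a fresh, empty output buffer per call
        while line < len(lines):
            fields = [str(v) for v in lines[line]]
            if not retry:
                whole = " ".join(fields) + "\n"
                if len(buf) + len(whole) <= bufsize:
                    buf += whole      # fast path: the entire line fits
                    line += 1
                    continue
                retry = True          # next attempt goes column by column
                if buf:
                    break             # flush what we have, then retry
            while col < len(fields):
                sep = "\n" if col == len(fields) - 1 else " "
                piece = fields[col] + sep
                if len(piece) > bufsize:
                    raise ValueError("field larger than the buffer")
                if len(buf) + len(piece) > bufsize:
                    break             # buffer full: resume from `col` later
                buf += piece
                col += 1
            if col < len(fields):
                break
            line, col, retry = line + 1, 0, False
        yield buf


rows = [["cpu:", 10, 1, 2, 3, 4],
        ["ctx:", 100000, 25000, 25000, 25000, 25000]]
for chunk in dump_chunks(rows, 16):
    print(repr(chunk))
```

The first row fits in one buffer and is emitted whole; the second does not, so it is split at column boundaries across several buffers.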
2023-05-03 17:26:11 +02:00
Willy Tarreau
6ed0b9885d MINOR: activity: allow "show activity" to restart dumping on any line
When using many threads, it's difficult to see the end of "show activity"
due to the numerous columns which fill the buffer. For example a dump of
a 256-thread, freshly booted process yields around 15kB.

Here by arranging the dump in a loop around a switch/case block where
each case checks the code line number against the current dump position,
we have a restartable counter for free with a granularity of the line of
code, without having to maintain a matching between states and specific
lines. It just requires resetting the trash buffer for each line and
trying to dump it after each line.

Now dumping 256 threads after a few seconds of traffic happily emits 20kB.
2023-05-03 17:24:54 +02:00
Willy Tarreau
8ee0d11cb8 MINOR: activity: iterate over all fields in a main loop for dumping
Now each line of "show activity" will iterate over n+2 fields, one for
the line header, one for the total, and one per thread. This will soon
allow us to save the current state in a restartable way.
2023-05-03 17:24:54 +02:00
Willy Tarreau
a465b21516 MINOR: activity: show the line header inside the SHOW_VAL macro
Doing so will allow us to drop the extra chunk_appendf() dedicated to
the line header and simplify iteration over restartable columns.
2023-05-03 17:24:54 +02:00
Willy Tarreau
5ddf9bea09 MINOR: activity: use a single macro to iterate over all fields
Instead of having SHOW_AVG() and SHOW_TOT(), let's just have SHOW_VAL()
which iterates over all values.
2023-05-03 17:24:54 +02:00
Willy Tarreau
c05d30e9d8 MINOR: clock: replace the timeval start_time with start_time_ns
Now that "now" is no longer a timeval, there's no point keeping a copy
of it as a timeval, let's also switch start_time to nanoseconds, it
simplifies operations.
2023-04-28 16:08:08 +02:00
Willy Tarreau
69530f59ae MEDIUM: clock: replace timeval "now" with integer "now_ns"
This puts an end to the occasional confusion between the "now" date
that is internal, monotonic and not synchronized with the system's
date, and "date" which is the system's date and not necessarily
monotonic. Variable "now" was removed and replaced with a 64-bit
integer "now_ns" which is a counter of nanoseconds. It wraps every
585 years, so if all goes well (i.e. if humanity does not need
haproxy anymore in 500 years), it will just never wrap. This implies
that now_ns is never zero and that the zero value can reliably be used
as "not set yet" for a timestamp if needed. This will also simplify
date checks where it becomes possible again to do "date1<date2".
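
The wrapping figures can be checked with a little arithmetic. The sketch below assumes the counter is an unsigned 64-bit number of nanoseconds and uses a 365.25-day year for the estimate; the function names are hypothetical and exist only for this illustration.

```python
NS_PER_SEC = 10**9
SECONDS_PER_YEAR = 365.25 * 86400   # assumption: Julian year, for estimation


def wrap_years(bits=64):
    """Years before an unsigned nanosecond counter of `bits` bits wraps."""
    return 2**bits / NS_PER_SEC / SECONDS_PER_YEAR


def wrap_days_ms(bits=32):
    """Days before an unsigned millisecond counter of `bits` bits wraps."""
    return 2**bits / 1000 / 86400


print(f"64-bit ns counter wraps after about {wrap_years():.1f} years")
print(f"32-bit ms counter wraps after about {wrap_days_ms():.1f} days")
```

The first figure is just under 585 years, consistent with the message above. The second, about 49.7 days, is the natural period of a 32-bit millisecond counter.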

All occurrences of "tv_to_ns(&now)" were simply replaced by "now_ns".
Due to the intricacies between now, global_now and now_offset, all 3
had to be turned to nanoseconds at once. It's not a problem since all
of them were solely used in 3 functions in clock.c, but they make the
patch look bigger than it really is.

The clock_update_local_date() and clock_update_global_date() functions
are now much simpler as there's no need anymore to perform conversions
nor to round the timeval up or down.

The wrapping continues to happen by presetting the internal offset in
the short future so that the 32-bit now_ms continues to wrap 20 seconds
after boot.

The start_time used to calculate uptime can still be turned to
nanoseconds now. One open question concerns global_now_ms which is used
only for the freq counters. It's unclear whether there's more value in
using two variables that need to be synchronized sequentially like today
or to just use global_now_ns divided by 1 million. Both approaches will
work equally well on modern systems, the difference might come from
smaller ones. Better not change anything for now.

One benefit of the new approach is that we now have an internal date
with a resolution of the nanosecond and the precision of the microsecond,
which can be useful to extend some measurements given that timestamps
also have this resolution.
2023-04-28 16:08:08 +02:00
Willy Tarreau
b68d308aec MINOR: activity: use nanoseconds, not timeval to compute uptime
Now that we have the required functions, let's get rid of the timeval
in intermediary calculations.
2023-04-28 16:08:08 +02:00
Willy Tarreau
82bde18aa4 BUG/MINOR: activity: show wall-clock date, not internal date in show activity
Another case where "now" was used instead of "date" for a publicly visible
date that was already incorrect and became worse after commit 28360dc
("MEDIUM: clock: force internal time to wrap early after boot"). No
backport is needed.
2023-04-27 14:47:50 +02:00
Willy Tarreau
e6f5ab5afa MINOR: listener: make accept_queue index atomic
There has always been a race when checking the length of an accept queue
to determine which one is more loaded than another, because the head and
tail are read at two different moments. This is not required, we can merge
them as two 16-bit numbers inside a single 32-bit index that is always
accessed atomically. This way we read both values at once and always have
a consistent measurement.
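
The packing idea can be illustrated as follows. This is only a sketch: Python integers do not model atomic loads, the choice of which half holds the head and which the tail is an assumption, and `pack`, `unpack`, `queue_len` and the ring size of 1024 are hypothetical. The point is that both positions are derived from a single 32-bit value, so a reader that obtains that value in one load always sees a matching pair.

```python
MASK16 = 0xFFFF


def pack(head, tail):
    """Combine two 16-bit positions into one 32-bit index."""
    return ((head & MASK16) << 16) | (tail & MASK16)


def unpack(idx):
    """Split a 32-bit index back into (head, tail)."""
    return (idx >> 16) & MASK16, idx & MASK16


def queue_len(idx, size):
    """Number of queued entries in a ring of `size` slots."""
    head, tail = unpack(idx)
    return (tail - head) % size


idx = pack(1020, 3)              # the tail has wrapped past the end of the ring
print(unpack(idx), queue_len(idx, 1024))
```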
2023-04-21 17:41:26 +02:00
Christopher Faulet
208c712b40 MINOR: stconn: Rename SC_FL_SHUTW in SC_FL_SHUT_DONE
Here again, it is just a flag renaming. In SC flags, there is no longer a
shutdown for writes, only shutdowns.
2023-04-14 15:01:21 +02:00
Willy Tarreau
28f2a590f6 MINOR: activity: add a line reporting the average CPU usage to "show activity"
It was missing from the output but is sometimes convenient to observe
and understand how incoming connections are distributed. The CPU usage
is reported as the instant measurement of 100-idle_pct for each thread,
and the average value is shown for the aggregated value.
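
As a small illustration of the computation described here, the sketch below derives the per-thread values and the aggregated average from a list of idle percentages. The function name and the sample values are hypothetical.

```python
def cpu_usage_line(idle_pct):
    """Return (average, per-thread list) where usage = 100 - idle_pct."""
    per_thread = [100 - p for p in idle_pct]
    return sum(per_thread) / len(per_thread), per_thread


avg, per_thread = cpu_usage_line([90, 50, 100, 60])   # made-up idle values
print(avg, per_thread)
```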

This could be backported as it's helpful in certain troubleshooting
sessions.
2023-04-12 08:42:52 +02:00
Christopher Faulet
7faac7cf34 MINOR: tree-wide: Simplify some tests on SHUT flags by accessing SCs directly
In many places, we simplify the tests on SHUT flags to remove calls to
chn_prod() or chn_cons() function because the corresponding SC is available.
2023-04-05 08:57:06 +02:00
Christopher Faulet
87633c3a11 MEDIUM: tree-wide: Move flags about shut from the channel to the SC
The purpose of this patch is only a one-to-one replacement, as far as
possible.

CF_SHUTR(_NOW) and CF_SHUTW(_NOW) flags are now carried by the
stream-connector. The CF_ prefix is replaced by the SC_FL_ one. Of course, it is
not so simple because in many places, we were testing if a channel was shut for
reads and writes at the same time. To do the same, shut for reads must be tested
on one side on the SC and shut for writes on the other side on the opposite
SC. Special care was taken with process_stream(): the flags of the SCs must be
saved to be able to detect changes, just like for the channels.
2023-04-05 08:57:06 +02:00
Willy Tarreau
6093ba47c0 BUG/MINOR: clock: do not mix wall-clock and monotonic time in uptime calculation
We've had a start date even before the internal monotonic clock existed,
but once the monotonic clock was added, the start date was not updated
to distinguish the wall clock time units and the internal monotonic time
units. The distinction is important because both clocks do not necessarily
progress at the same speed. The very rare occurrences of the wall-clock
date are essentially for human consumption and communication with third
parties (e.g. report the start date in "show info" for monitoring
purposes). However currently this one is also used to measure the distance
to "now" as being the process' uptime. This is actually not correct. It
only works because for now the two dates are initialized at the exact
same instant at boot but could still be wrong if the system's date shows
a big jump backwards during startup for example. In addition the current
situation prevents us from enforcing an arbitrary offset at boot to reveal
some heisenbugs.

This patch adds a new "start_time" at boot that is set from "now" and is
used in uptime calculations. "start_date" instead is now set from "date"
and will always reflect the system date for human consumption (e.g. in
"show info"). This way we're now sure that any drift of the internal
clock relative to the system date will not impact the reported uptime.
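
The separation between the two start values can be sketched like this. It is a simplified model, assuming Python's `time.monotonic_ns()` as a stand-in for the internal monotonic clock (HAProxy maintains its own); the `ProcessClock` class and its injectable time sources are hypothetical and only serve the illustration.

```python
import time


class ProcessClock:
    """Keep a wall-clock start date and a separate monotonic start time."""

    def __init__(self, wall=time.time, mono_ns=time.monotonic_ns):
        self._mono_ns = mono_ns
        self.start_date = wall()          # for human consumption only
        self.start_time_ns = mono_ns()    # used for uptime calculations

    def uptime_ns(self):
        # Only monotonic values are subtracted, so a jump of the system
        # date cannot affect the result.
        return self._mono_ns() - self.start_time_ns


clock = ProcessClock()
print(clock.start_date, clock.uptime_ns())
```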

This could possibly be backported though it's unlikely that anyone has
ever noticed the problem.
2023-02-08 11:06:55 +01:00
Christopher Faulet
da89e9b95b MINOR: channel/applets: Stop to test CF_WRITE_ERROR flag if CF_SHUTW is enough
In applets, we stop processing when a write error (CF_WRITE_ERROR) or a shutdown
for writes (CF_SHUTW) is detected. However, any write error leads to an
immediate shutdown for writes. Thus, it is enough to only test if CF_SHUTW is
set.
2023-01-09 18:41:08 +01:00
Willy Tarreau
f9607f8b1f REORG: activity/cli: move the "show activity" handler to activity.c
Initially the code was placed into cli.c to keep activity.c small and
independent of the cli stuff, but that's no longer the case anyway and
keeping that code over there makes it harder to find. Let's move it to
its more natural place now.
2022-11-25 15:41:47 +01:00
Willy Tarreau
e86bc35672 MINOR: activity/cli: support sorting task profiling by total CPU time
The new "bytime" sorting criterion uses the reported CPU time instead of
the usage. This is convenient to spot tasks that are mostly responsible
for the CPU usage in a running process. It supports both the detailed
and the aggregated format. The output looks like this:

> show profiling tasks bytime
Tasks activity:
  function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
  qc_io_cb                     117739   1.961m    999.1us   37.45s    318.1us <- h3_snd_buf@src/h3.c:1084 tasklet_wakeup
  process_stream              7376273   1.384m    11.26us   1.013h    494.2us <- stream_new@src/stream.c:563 task_wakeup
  process_stream              8104400   1.133m    8.389us   1.130h    502.0us <- sc_notify@src/stconn.c:1209 task_wakeup
  qc_io_cb                      43280   45.76s    1.057ms   13.95s    322.3us <- qc_stream_desc_ack@src/quic_stream.c:128 tasklet_wakeup
  h1_io_cb                   11025715   24.82s    2.251us   5.406m    29.42us <- sock_conn_iocb@src/sock.c:869 tasklet_wakeup
  quic_conn_app_io_cb          312861   23.86s    76.27us   2.373s    7.584us <- qc_lstnr_pkt_rcv@src/xprt_quic.c:6184 tasklet_wakeup_after
  qc_io_cb                      37063   12.65s    341.4us   6.409s    172.9us <- qc_treat_acked_tx_frm@src/xprt_quic.c:1695 tasklet_wakeup
  h1_io_cb                    4783520   11.79s    2.463us   1.419h    1.068ms <- conn_subscribe@src/connection.c:732 tasklet_wakeup
  sc_conn_io_cb              12269693   11.51s    938.0ns   2.117h    621.2us <- sc_app_chk_rcv_conn@src/stconn.c:762 tasklet_wakeup
  sc_conn_io_cb               6479006   10.94s    1.689us   7.984m    73.93us <- h1_wake_stream_for_recv@src/mux_h1.c:2600 tasklet_wakeup
  qc_io_cb                      12011   10.72s    892.5us   2.120s    176.5us <- qcc_release_remote_stream@src/mux_quic.c:1200 tasklet_wakeup
  h2_io_cb                     246423   6.225s    25.26us   56.52s    229.4us <- h2_snd_buf@src/mux_h2.c:6712 tasklet_wakeup
  h2_io_cb                     137744   6.076s    44.11us   16.59s    120.4us <- sock_conn_iocb@src/sock.c:869 tasklet_wakeup
  quic_lstnr_dghdlr            323575   3.062s    9.462us   3.424m    634.9us <- quic_lstnr_dgram_dispatch@src/quic_sock.c:255 tasklet_wakeup
  sc_conn_io_cb               1206939   1.616s    1.338us   27.62m    1.373ms <- qcs_notify_send@src/mux_quic.c:529 tasklet_wakeup
  h2_io_cb                     212370   251.2ms   1.182us   6.476s    30.49us <- h2c_restart_reading@src/mux_h2.c:856 tasklet_wakeup
  h1_io_cb                      44109   197.0ms   4.466us   31.89s    723.0us <- h1_takeover@src/mux_h1.c:4085 tasklet_wakeup
  quic_conn_app_io_cb            3029   87.59ms   28.92us   999.0ms   329.8us <- qc_process_timer@src/xprt_quic.c:4635 tasklet_wakeup
  task_run_applet                  40   35.77ms   894.3us   4.407ms   110.2us <- sc_applet_create@src/stconn.c:489 appctx_wakeup
  task_run_applet                  18   27.36ms   1.520ms   19.56us   1.086us <- sc_app_chk_snd_applet@src/stconn.c:996 appctx_wakeup
  sc_conn_io_cb                  2186   11.76ms   5.377us   963.0ms   440.5us <- h1_wake_stream_for_send@src/mux_h1.c:2610 tasklet_wakeup
  qc_io_cb                          8   9.880ms   1.235ms   5.871ms   733.9us <- qcs_consume@src/mux_quic.c:800 tasklet_wakeup
  quic_conn_io_cb                   4   5.951ms   1.488ms   38.85us   9.713us <- qc_lstnr_pkt_rcv@src/xprt_quic.c:6184 tasklet_wakeup_after
  qc_io_cb                        101   4.975ms   49.26us   13.91ms   137.8us <- qc_process_timer@src/xprt_quic.c:4602 tasklet_wakeup
  h1_io_cb                       2186   1.809ms   827.0ns   720.2ms   329.5us <- sock_conn_iocb@src/sock.c:849 tasklet_wakeup
  qc_process_timer               3031   1.735ms   572.0ns   1.153s    380.3us <- wake_expired_tasks@src/task.c:344 task_wakeup
  accept_queue_process            359   1.362ms   3.793us   80.32ms   223.7us <- listener_accept@src/listener.c:1099 tasklet_wakeup
  quic_conn_app_io_cb               2   921.1us   460.6us   203.1us   101.5us <- qc_xprt_start@src/xprt_quic.c:7122 tasklet_wakeup
  h1_timeout_task                2618   526.8us   201.0ns   1.121s    428.4us <- h1_release@src/mux_h1.c:1087 task_wakeup
  process_resolvers               316   283.3us   896.0ns   14.96ms   47.33us <- wake_expired_tasks@src/task.c:429 task_drop_running
  sc_conn_io_cb                   420   235.6us   560.0ns   116.7ms   277.8us <- h2s_notify_recv@src/mux_h2.c:1298 tasklet_wakeup
  qc_idle_timer_task                1   225.5us   225.5us   506.0ns   506.0ns <- wake_expired_tasks@src/task.c:344 task_wakeup
  accept_queue_process             36   153.0us   4.250us   5.834ms   162.1us <- accept_queue_process@src/listener.c:165 tasklet_wakeup
  sc_conn_io_cb                    18   54.05us   3.003us   11.50us   638.0ns <- sock_conn_iocb@src/sock.c:869 tasklet_wakeup
  h2_io_cb                          6   38.88us   6.480us   2.089ms   348.2us <- h2_do_shutw@src/mux_h2.c:4656 tasklet_wakeup
  srv_cleanup_idle_conns           54   37.72us   698.0ns   14.21ms   263.1us <- wake_expired_tasks@src/task.c:429 task_drop_running
  sc_conn_io_cb                    50   32.86us   657.0ns   28.83ms   576.5us <- qcs_notify_recv@src/mux_quic.c:519 tasklet_wakeup
  qc_io_cb                          2   30.25us   15.12us   6.093us   3.046us <- qc_init@src/mux_quic.c:2057 tasklet_wakeup
  srv_cleanup_toremove_conns        1   27.16us   27.16us   905.6us   905.6us <- srv_cleanup_idle_conns@src/server.c:5948 task_wakeup
  task_run_applet                  39   19.61us   502.0ns   818.7us   20.99us <- run_tasks_from_lists@src/task.c:652 task_drop_running
  quic_accept_run                   2   15.46us   7.727us   305.5us   152.8us <- quic_accept_push_qc@src/quic_sock.c:458 tasklet_wakeup
  h2_timeout_task                  32   12.91us   403.0ns   4.207ms   131.5us <- h2_release@src/mux_h2.c:1191 task_wakeup
  quic_conn_app_io_cb               1   9.645us   9.645us   1.445us   1.445us <- qc_process_timer@src/xprt_quic.c:4589 tasklet_wakeup

> show profiling tasks bytime aggr
Tasks activity:
  function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
  qc_io_cb                     212301   3.147m    889.5us   1.009m    285.2us
  process_stream             15503573   2.519m    9.747us   2.148h    498.7us
  h1_io_cb                   15916733   36.95s    2.321us   1.535h    347.1us
  quic_conn_app_io_cb          318845   24.21s    75.92us   3.410s    10.70us
  sc_conn_io_cb              20037058   24.19s    1.207us   2.737h    491.8us
  h2_io_cb                     596543   12.55s    21.04us   1.326m    133.4us
  quic_lstnr_dghdlr            326624   3.094s    9.473us   3.462m    635.9us
  task_run_applet                 100   64.43ms   644.3us   5.285ms   52.85us
  quic_conn_io_cb                   4   5.951ms   1.488ms   38.85us   9.713us
  qc_process_timer               3061   1.750ms   571.0ns   1.162s    379.5us
  accept_queue_process            396   1.521ms   3.840us   86.16ms   217.6us
  h1_timeout_task                2618   526.8us   201.0ns   1.121s    428.4us
  process_resolvers               319   286.0us   896.0ns   16.82ms   52.73us
  qc_idle_timer_task                1   225.5us   225.5us   506.0ns   506.0ns
  srv_cleanup_idle_conns           54   37.72us   698.0ns   14.21ms   263.1us
  srv_cleanup_toremove_conns        1   27.16us   27.16us   905.6us   905.6us
  quic_accept_run                   2   15.46us   7.727us   305.5us   152.8us
  h2_timeout_task                  32   12.91us   403.0ns   4.207ms   131.5us
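
For post-processing such dumps offline, the cpu_tot column can be converted to seconds and used as a sort key. The sketch below is not part of HAProxy: it assumes the unit suffixes visible in the output above (ns, us, ms, s, m, h), and `to_seconds` and `sort_bytime` are hypothetical helper names.

```python
UNITS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}


def to_seconds(text):
    """Convert a value such as '36.95s' or '3.147m' to seconds."""
    # Two-letter suffixes are tested first so that 'ms' is not read as 's'.
    for suffix in ("ns", "us", "ms", "s", "m", "h"):
        if text.endswith(suffix):
            return float(text[:-len(suffix)]) * UNITS[suffix]
    raise ValueError(f"unknown unit in {text!r}")


def sort_bytime(rows):
    """Sort (function, cpu_tot) pairs by decreasing total CPU time."""
    return sorted(rows, key=lambda row: to_seconds(row[1]), reverse=True)


# A few rows taken from the aggregated dump above, deliberately shuffled.
rows = [("task_run_applet", "64.43ms"), ("h1_io_cb", "36.95s"),
        ("qc_io_cb", "3.147m"), ("process_stream", "2.519m")]
for name, cpu_tot in sort_bytime(rows):
    print(f"{name:<20} {cpu_tot}")
```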
2022-09-08 16:38:10 +02:00
Willy Tarreau
dc89b1806c MINOR: activity/cli: support aggregating task profiling outputs
By default we now dump stats between caller and callee, but by
specifying "aggr" on the command line, stats get aggregated by
callee again as it used to be before the feature was available.
It may sometimes be helpful when comparing total call counts,
though that's about all.
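
Aggregating by callee amounts to summing the per-caller entries that share the same function. The following sketch shows the idea with made-up numbers; the tuple layout and the `aggregate_by_callee` name are assumptions for illustration, not HAProxy's internal structures.

```python
from collections import defaultdict


def aggregate_by_callee(entries):
    """Sum calls, CPU time and latency over all callers of each function.

    `entries` holds (function, caller, calls, cpu_ns, lat_ns) tuples.
    """
    totals = defaultdict(lambda: [0, 0, 0])
    for func, _caller, calls, cpu_ns, lat_ns in entries:
        acc = totals[func]
        acc[0] += calls
        acc[1] += cpu_ns
        acc[2] += lat_ns
    return {func: tuple(acc) for func, acc in totals.items()}


# Illustrative values only, not taken from a real dump.
entries = [
    ("h1_io_cb", "sock_conn_iocb", 100, 2_000, 30_000),
    ("h1_io_cb", "conn_subscribe", 10, 500, 9_000),
    ("h2_io_cb", "h2_snd_buf", 5, 700, 1_000),
]
print(aggregate_by_callee(entries))
```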
2022-09-08 16:32:17 +02:00
Willy Tarreau
64435aaa85 MINOR: tasks/activity: improve the caller-callee activity hash
The previous dump already showed that the "other" category was getting
a few entries. Let's proceed like for the memory profiling, by scanning
a limited range of adjacent slots to find a spare one (16 max). That's
pretty fast since the slots are close and likely prefetched and the comparison is
cheap. The new dump now shows up to 45 entries below without "other":

Now:
Tasks activity:
  function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
  task_run_applet                  22   34.56ms   1.571ms   1.145ms   52.04us <- sc_applet_create@src/stconn.c:489 appctx_wakeup
  task_run_applet                  21   11.11us   529.0ns   2.590ms   123.3us <- run_tasks_from_lists@src/task.c:652 task_drop_running
  task_run_applet                   5   7.715ms   1.543ms   2.186us   437.0ns <- sc_app_chk_snd_applet@src/stconn.c:996 appctx_wakeup
  accept_queue_process            345   3.129ms   9.068us   72.84ms   211.1us <- listener_accept@src/listener.c:1099 tasklet_wakeup
  accept_queue_process             32   113.0us   3.529us   3.070ms   95.94us <- accept_queue_process@src/listener.c:165 tasklet_wakeup
  sc_conn_io_cb               5026032   3.037s    604.0ns   17.47m    208.5us <- sc_app_chk_rcv_conn@src/stconn.c:762 tasklet_wakeup
  sc_conn_io_cb               4361192   7.626s    1.748us   3.179m    43.74us <- h1_wake_stream_for_recv@src/mux_h1.c:2600 tasklet_wakeup
  sc_conn_io_cb                178293   275.4ms   1.544us   2.740m    922.0us <- qcs_notify_send@src/mux_quic.c:529 tasklet_wakeup
  sc_conn_io_cb                  2561   15.84ms   6.185us   1.036s    404.4us <- h1_wake_stream_for_send@src/mux_h1.c:2610 tasklet_wakeup
  sc_conn_io_cb                   453   261.4us   577.0ns   86.79ms   191.6us <- h2s_notify_recv@src/mux_h2.c:1298 tasklet_wakeup
  sc_conn_io_cb                    89   50.05us   562.0ns   100.7ms   1.131ms <- qcs_notify_recv@src/mux_quic.c:519 tasklet_wakeup
  sc_conn_io_cb                     8   19.04us   2.379us   472.5us   59.06us <- sock_conn_iocb@src/sock.c:869 tasklet_wakeup
  process_resolvers                50   57.50us   1.149us   1.116ms   22.32us <- wake_expired_tasks@src/task.c:429 task_drop_running
  srv_cleanup_idle_conns            8   5.669us   708.0ns   216.6us   27.08us <- wake_expired_tasks@src/task.c:429 task_drop_running
  process_stream              4599847   48.79s    10.61us   16.92m    220.7us <- sc_notify@src/stconn.c:1209 task_wakeup
  process_stream              4530081   52.82s    11.66us   14.92m    197.6us <- stream_new@src/stream.c:563 task_wakeup
  process_stream                   15   201.7us   13.45us   31.58ms   2.105ms <- sc_app_chk_snd_conn@src/stconn.c:857 task_wakeup
  h1_io_cb                    7861205   18.22s    2.317us   2.408m    18.38us <- sock_conn_iocb@src/sock.c:869 tasklet_wakeup
  h1_io_cb                     474763   1.379s    2.905us   6.578m    831.4us <- conn_subscribe@src/connection.c:732 tasklet_wakeup
  h1_io_cb                      34830   38.64ms   1.109us   18.85s    541.2us <- h1_takeover@src/mux_h1.c:4085 tasklet_wakeup
  h1_io_cb                       2561   2.150ms   839.0ns   674.4ms   263.3us <- sock_conn_iocb@src/sock.c:849 tasklet_wakeup
  h1_timeout_task                2634   588.5us   223.0ns   890.5ms   338.1us <- h1_release@src/mux_h1.c:1087 task_wakeup
  h2_timeout_task                  16   7.519us   469.0ns   1.146ms   71.63us <- h2_release@src/mux_h2.c:1191 task_wakeup
  h2_io_cb                      99601   2.212s    22.21us   19.33s    194.1us <- h2_snd_buf@src/mux_h2.c:6712 tasklet_wakeup
  h2_io_cb                      79777   146.6ms   1.837us   3.529s    44.24us <- h2c_restart_reading@src/mux_h2.c:856 tasklet_wakeup
  h2_io_cb                      60698   2.259s    37.21us   4.704s    77.50us <- sock_conn_iocb@src/sock.c:869 tasklet_wakeup
  h2_io_cb                          5   36.90us   7.380us   2.045ms   409.0us <- h2_do_shutw@src/mux_h2.c:4656 tasklet_wakeup
  qc_io_cb                      26595   8.007s    301.1us   4.261s    160.2us <- qc_treat_acked_tx_frm@src/xprt_quic.c:1695 tasklet_wakeup
  qc_io_cb                       7921   5.284s    667.1us   2.171s    274.1us <- qc_stream_desc_ack@src/quic_stream.c:128 tasklet_wakeup
  qc_io_cb                       6229   5.851s    939.3us   1.856s    297.9us <- h3_snd_buf@src/h3.c:1084 tasklet_wakeup
  qc_io_cb                        994   699.1ms   703.3us   174.9ms   176.0us <- qcc_release_remote_stream@src/mux_quic.c:1200 tasklet_wakeup
  qc_io_cb                         65   9.883ms   152.0us   13.33ms   205.1us <- qc_process_timer@src/xprt_quic.c:4602 tasklet_wakeup
  qc_io_cb                          1   293.5us   293.5us   105.9us   105.9us <- qcs_consume@src/mux_quic.c:800 tasklet_wakeup
  qc_io_cb                          1   10.87us   10.87us   3.307us   3.307us <- qc_init@src/mux_quic.c:2057 tasklet_wakeup
  quic_conn_io_cb                   2   2.531ms   1.265ms   2.839us   1.419us <- qc_lstnr_pkt_rcv@src/xprt_quic.c:6184 tasklet_wakeup_after
  quic_conn_app_io_cb           61392   2.620s    42.67us   268.0ms   4.365us <- qc_lstnr_pkt_rcv@src/xprt_quic.c:6184 tasklet_wakeup_after
  quic_conn_app_io_cb             408   10.56ms   25.88us   124.0ms   303.8us <- qc_process_timer@src/xprt_quic.c:4635 tasklet_wakeup
  quic_conn_app_io_cb               2   15.61us   7.806us   103.2us   51.59us <- qc_process_timer@src/xprt_quic.c:4589 tasklet_wakeup
  quic_conn_app_io_cb               1   410.6us   410.6us   11.52us   11.52us <- qc_xprt_start@src/xprt_quic.c:7122 tasklet_wakeup
  quic_lstnr_dghdlr             62716   409.2ms   6.523us   21.81s    347.8us <- quic_lstnr_dgram_dispatch@src/quic_sock.c:255 tasklet_wakeup
  qc_process_timer                410   245.4us   598.0ns   238.5ms   581.7us <- wake_expired_tasks@src/task.c:344 task_wakeup
  quic_accept_run                   1   7.711us   7.711us   82.28us   82.28us <- quic_accept_push_qc@src/quic_sock.c:458 tasklet_wakeup
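
The slot search described in this message resembles bounded linear probing. Below is a minimal sketch of that general technique, assuming a fixed-size table, wrap-around at the end of the table, and a fallback handled by the caller; `find_slot`, `hash_fn` and the table size are hypothetical and do not reflect the actual C code.

```python
def find_slot(table, key, max_probe=16, hash_fn=hash):
    """Return the index holding `key`, claiming a free adjacent slot if needed.

    At most `max_probe` adjacent slots are scanned. None is returned when
    they are all taken by other keys, in which case a caller would account
    the entry under a catch-all such as "other".
    """
    size = len(table)
    start = hash_fn(key) % size
    for i in range(max_probe):
        idx = (start + i) % size
        if table[idx] is None:
            table[idx] = key          # spare slot found: claim it
            return idx
        if table[idx] == key:
            return idx                # existing entry for this key
    return None


table = [None] * 8
print(find_slot(table, 3), find_slot(table, 11), find_slot(table, 3))
```

With integer keys, 3 and 11 collide on slot 3 in an 8-slot table, so the second key lands in the adjacent slot 4 instead of being lost.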
2022-09-08 16:25:36 +02:00
Willy Tarreau
3d4cdb198c MEDIUM: tasks/activity: combine the called function with the caller
Now instead of getting aggregate stats per called function, we have
them per function AND per call place. The "byaddr" sort considers
the function pointer first, then the call count, so that dominant
callers of a given callee are instantly spotted. This makes it possible to get
sorted outputs like this:

Tasks activity:
  function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
  h1_io_cb                   17357952   40.91s    2.357us   4.849m    16.76us <- sock_conn_iocb@src/sock.c:869 tasklet_wakeup
  sc_conn_io_cb              10357182   6.297s    607.0ns   27.93m    161.8us <- sc_app_chk_rcv_conn@src/stconn.c:762 tasklet_wakeup
  process_stream              9891131   1.809m    10.97us   53.61m    325.2us <- sc_notify@src/stconn.c:1209 task_wakeup
  process_stream              9823934   1.887m    11.52us   48.31m    295.1us <- stream_new@src/stream.c:563 task_wakeup
  sc_conn_io_cb               9347863   16.59s    1.774us   6.143m    39.43us <- h1_wake_stream_for_recv@src/mux_h1.c:2600 tasklet_wakeup
  h1_io_cb                     501344   1.848s    3.686us   6.544m    783.2us <- conn_subscribe@src/connection.c:732 tasklet_wakeup
  sc_conn_io_cb                239717   492.3ms   2.053us   3.213m    804.3us <- qcs_notify_send@src/mux_quic.c:529 tasklet_wakeup
  h2_io_cb                     173019   4.204s    24.30us   40.95s    236.7us <- h2_snd_buf@src/mux_h2.c:6712 tasklet_wakeup
  h2_io_cb                     149487   424.3ms   2.838us   14.63s    97.87us <- h2c_restart_reading@src/mux_h2.c:856 tasklet_wakeup
  other                        101893   4.626s    45.40us   14.84s    145.7us
  quic_lstnr_dghdlr             94389   614.0ms   6.504us   30.54s    323.6us <- quic_lstnr_dgram_dispatch@src/quic_sock.c:255 tasklet_wakeup
  quic_conn_app_io_cb           92205   3.735s    40.51us   390.9ms   4.239us <- qc_lstnr_pkt_rcv@src/xprt_quic.c:6184 tasklet_wakeup_after
  qc_io_cb                      50355   19.01s    377.5us   10.65s    211.4us <- qc_treat_acked_tx_frm@src/xprt_quic.c:1695 tasklet_wakeup
  h1_io_cb                      44427   155.0ms   3.489us   21.50s    484.0us <- h1_takeover@src/mux_h1.c:4085 tasklet_wakeup
  qc_io_cb                       9018   4.924s    546.0us   3.084s    342.0us <- qc_stream_desc_ack@src/quic_stream.c:128 tasklet_wakeup
  h1_timeout_task                3236   1.172ms   362.0ns   1.119s    345.9us <- h1_release@src/mux_h1.c:1087 task_wakeup
  h1_io_cb                       2804   7.974ms   2.843us   1.980s    706.0us <- sock_conn_iocb@src/sock.c:849 tasklet_wakeup
  sc_conn_io_cb                  2804   33.44ms   11.92us   2.597s    926.2us <- h1_wake_stream_for_send@src/mux_h1.c:2610 tasklet_wakeup
  qc_io_cb                       2623   2.669s    1.017ms   1.347s    513.5us <- h3_snd_buf@src/h3.c:1084 tasklet_wakeup
  qc_process_timer                662   526.4us   795.0ns   1.081s    1.633ms <- wake_expired_tasks@src/task.c:344 task_wakeup
  quic_conn_app_io_cb             648   12.62ms   19.47us   225.7ms   348.2us <- qc_process_timer@src/xprt_quic.c:4635 tasklet_wakeup
  accept_queue_process            286   1.571ms   5.494us   72.55ms   253.7us <- listener_accept@src/listener.c:1099 tasklet_wakeup
  process_resolvers               176   157.8us   896.0ns   7.835ms   44.52us <- wake_expired_tasks@src/task.c:429 task_drop_running
  qc_io_cb                        167   10.71ms   64.12us   32.47ms   194.4us <- qc_process_timer@src/xprt_quic.c:4602 tasklet_wakeup
  sc_conn_io_cb                   123   80.05us   650.0ns   50.35ms   409.4us <- qcs_notify_recv@src/mux_quic.c:519 tasklet_wakeup
  h2_timeout_task                  32   30.69us   958.0ns   9.038ms   282.4us <- h2_release@src/mux_h2.c:1191 task_wakeup
  task_run_applet                  24   33.79ms   1.408ms   5.838ms   243.3us <- sc_applet_create@src/stconn.c:489 appctx_wakeup
  accept_queue_process             17   56.34us   3.314us   7.505ms   441.5us <- accept_queue_process@src/listener.c:165 tasklet_wakeup
  srv_cleanup_toremove_conns       16   1.133ms   70.81us   5.685ms   355.3us <- srv_cleanup_idle_conns@src/server.c:5948 task_wakeup
  srv_cleanup_idle_conns           16   74.57us   4.660us   2.797ms   174.8us <- wake_expired_tasks@src/task.c:429 task_drop_running
  quic_conn_app_io_cb              12   786.9us   65.58us   2.042ms   170.1us <- qc_process_timer@src/xprt_quic.c:4589 tasklet_wakeup
  sc_conn_io_cb                     9   20.55us   2.283us   2.475ms   275.0us <- sock_conn_iocb@src/sock.c:869 tasklet_wakeup
  h2_io_cb                          8   34.12us   4.265us   1.784ms   223.0us <- h2_do_shutw@src/mux_h2.c:4656 tasklet_wakeup
  task_run_applet                   4   6.615ms   1.654ms   2.306us   576.0ns <- sc_app_chk_snd_applet@src/stconn.c:996 appctx_wakeup
  quic_conn_io_cb                   4   4.278ms   1.069ms   6.469us   1.617us <- qc_lstnr_pkt_rcv@src/xprt_quic.c:6184 tasklet_wakeup_after
  qc_io_cb                          2   20.81us   10.40us   4.943us   2.471us <- qc_init@src/mux_quic.c:2057 tasklet_wakeup
  quic_conn_app_io_cb               2   752.9us   376.4us   63.97us   31.99us <- qc_xprt_start@src/xprt_quic.c:7122 tasklet_wakeup
  quic_accept_run                   2   13.84us   6.920us   172.8us   86.42us <- quic_accept_push_qc@src/quic_sock.c:458 tasklet_wakeup
  qc_idle_timer_task                2   295.0us   147.5us   8.761us   4.380us <- wake_expired_tasks@src/task.c:344 task_wakeup
  qc_io_cb                          1   867.1us   867.1us   812.8us   812.8us <- qcs_consume@src/mux_quic.c:800 tasklet_wakeup

... and calls sorted by address like this:

Tasks activity:
  function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
  task_run_applet                  23   32.73ms   1.423ms   5.837ms   253.8us <- sc_applet_create@src/stconn.c:489 appctx_wakeup
  task_run_applet                   4   6.615ms   1.654ms   2.306us   576.0ns <- sc_app_chk_snd_applet@src/stconn.c:996 appctx_wakeup
  accept_queue_process            285   1.566ms   5.495us   72.49ms   254.3us <- listener_accept@src/listener.c:1099 tasklet_wakeup
  accept_queue_process             17   56.34us   3.314us   7.505ms   441.5us <- accept_queue_process@src/listener.c:165 tasklet_wakeup
  sc_conn_io_cb              10357182   6.297s    607.0ns   27.93m    161.8us <- sc_app_chk_rcv_conn@src/stconn.c:762 tasklet_wakeup
  sc_conn_io_cb               9347863   16.59s    1.774us   6.143m    39.43us <- h1_wake_stream_for_recv@src/mux_h1.c:2600 tasklet_wakeup
  sc_conn_io_cb                239717   492.3ms   2.053us   3.213m    804.3us <- qcs_notify_send@src/mux_quic.c:529 tasklet_wakeup
  sc_conn_io_cb                  2804   33.44ms   11.92us   2.597s    926.2us <- h1_wake_stream_for_send@src/mux_h1.c:2610 tasklet_wakeup
  sc_conn_io_cb                   123   80.05us   650.0ns   50.35ms   409.4us <- qcs_notify_recv@src/mux_quic.c:519 tasklet_wakeup
  sc_conn_io_cb                     9   20.55us   2.283us   2.475ms   275.0us <- sock_conn_iocb@src/sock.c:869 tasklet_wakeup
  process_resolvers               159   145.9us   917.0ns   7.823ms   49.20us <- wake_expired_tasks@src/task.c:429 task_drop_running
  srv_cleanup_idle_conns           16   74.57us   4.660us   2.797ms   174.8us <- wake_expired_tasks@src/task.c:429 task_drop_running
  srv_cleanup_toremove_conns       16   1.133ms   70.81us   5.685ms   355.3us <- srv_cleanup_idle_conns@src/server.c:5948 task_wakeup
  process_stream              9891130   1.809m    10.97us   53.61m    325.2us <- sc_notify@src/stconn.c:1209 task_wakeup
  process_stream              9823933   1.887m    11.52us   48.31m    295.1us <- stream_new@src/stream.c:563 task_wakeup
  h1_io_cb                   17357952   40.91s    2.357us   4.849m    16.76us <- sock_conn_iocb@src/sock.c:869 tasklet_wakeup
  h1_io_cb                     501344   1.848s    3.686us   6.544m    783.2us <- conn_subscribe@src/connection.c:732 tasklet_wakeup
  h1_io_cb                      44427   155.0ms   3.489us   21.50s    484.0us <- h1_takeover@src/mux_h1.c:4085 tasklet_wakeup
  h1_io_cb                       2804   7.974ms   2.843us   1.980s    706.0us <- sock_conn_iocb@src/sock.c:849 tasklet_wakeup
  h1_timeout_task                3236   1.172ms   362.0ns   1.119s    345.9us <- h1_release@src/mux_h1.c:1087 task_wakeup
  h2_timeout_task                  32   30.69us   958.0ns   9.038ms   282.4us <- h2_release@src/mux_h2.c:1191 task_wakeup
  h2_io_cb                     173019   4.204s    24.30us   40.95s    236.7us <- h2_snd_buf@src/mux_h2.c:6712 tasklet_wakeup
  h2_io_cb                     149487   424.3ms   2.838us   14.63s    97.87us <- h2c_restart_reading@src/mux_h2.c:856 tasklet_wakeup
  h2_io_cb                          8   34.12us   4.265us   1.784ms   223.0us <- h2_do_shutw@src/mux_h2.c:4656 tasklet_wakeup
  qc_io_cb                      50355   19.01s    377.5us   10.65s    211.4us <- qc_treat_acked_tx_frm@src/xprt_quic.c:1695 tasklet_wakeup
  qc_io_cb                       9018   4.924s    546.0us   3.084s    342.0us <- qc_stream_desc_ack@src/quic_stream.c:128 tasklet_wakeup
  qc_io_cb                       2623   2.669s    1.017ms   1.347s    513.5us <- h3_snd_buf@src/h3.c:1084 tasklet_wakeup
  qc_io_cb                        167   10.71ms   64.12us   32.47ms   194.4us <- qc_process_timer@src/xprt_quic.c:4602 tasklet_wakeup
  qc_io_cb                          2   20.81us   10.40us   4.943us   2.471us <- qc_init@src/mux_quic.c:2057 tasklet_wakeup
  qc_io_cb                          1   867.1us   867.1us   812.8us   812.8us <- qcs_consume@src/mux_quic.c:800 tasklet_wakeup
  qc_idle_timer_task                2   295.0us   147.5us   8.761us   4.380us <- wake_expired_tasks@src/task.c:344 task_wakeup
  quic_conn_io_cb                   4   4.278ms   1.069ms   6.469us   1.617us <- qc_lstnr_pkt_rcv@src/xprt_quic.c:6184 tasklet_wakeup_after
  quic_conn_app_io_cb           92205   3.735s    40.51us   390.9ms   4.239us <- qc_lstnr_pkt_rcv@src/xprt_quic.c:6184 tasklet_wakeup_after
  quic_conn_app_io_cb             648   12.62ms   19.47us   225.7ms   348.2us <- qc_process_timer@src/xprt_quic.c:4635 tasklet_wakeup
  quic_conn_app_io_cb              12   786.9us   65.58us   2.042ms   170.1us <- qc_process_timer@src/xprt_quic.c:4589 tasklet_wakeup
  quic_conn_app_io_cb               2   752.9us   376.4us   63.97us   31.99us <- qc_xprt_start@src/xprt_quic.c:7122 tasklet_wakeup
  quic_lstnr_dghdlr             94389   614.0ms   6.504us   30.54s    323.6us <- quic_lstnr_dgram_dispatch@src/quic_sock.c:255 tasklet_wakeup
  qc_process_timer                662   526.4us   795.0ns   1.081s    1.633ms <- wake_expired_tasks@src/task.c:344 task_wakeup
  quic_accept_run                   2   13.84us   6.920us   172.8us   86.42us <- quic_accept_push_qc@src/quic_sock.c:458 tasklet_wakeup
  other                        101892   4.626s    45.40us   14.84s    145.7us

It already becomes visible that some tasks have very different costs
depending on where they're called (e.g. process_stream). The method used
to wake them up is also shown. Applets are handled specially and shown
as appctx_wakeup.
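
As a rough illustration of how the averaged columns relate to the totals, the sketch below divides a total by the call count. The helper name avg_ns and the nanosecond representation are assumptions made for this example, not HAProxy code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: average time per call in nanoseconds, or 0 when
 * the entry was never called (avoids a division by zero).
 */
static uint64_t avg_ns(uint64_t total_ns, uint64_t calls)
{
	return calls ? total_ns / calls : 0;
}
```

For the first accept_queue_process line above, 1.566ms over 285 calls gives avg_ns(1566000, 285) == 5494, which is close to the 5.495us reported in cpu_avg (the displayed total is itself rounded).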
2022-09-08 16:21:22 +02:00
Willy Tarreau
a3423873fe CLEANUP: activity: make the number of sched activity entries more configurable
This removes all the hard-coded 8-bit and 256 entries to use a pair of
macros instead so that we can more easily experiment with larger table
sizes if needed.
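
A minimal sketch of the idea, assuming macro names and a hash function that are purely illustrative (this is not the actual ptr_hash() implementation): the bucket count is derived from a single bit-width macro so that the table can be resized by changing one value.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed macro names: the bucket count is derived from the bit width so
 * that changing one value resizes the whole table consistently.
 */
#define ACT_HASH_BITS    8
#define ACT_HASH_BUCKETS (1U << ACT_HASH_BITS)

/* Hypothetical pointer hash: multiply by an odd 64-bit constant and keep
 * the top ACT_HASH_BITS bits as the bucket index.
 */
static unsigned int act_bucket(const void *ptr)
{
	uint64_t x = (uint64_t)(uintptr_t)ptr * 0x9E3779B97F4A7C15ULL;

	return (unsigned int)(x >> (64 - ACT_HASH_BITS));
}
```

With this layout, experimenting with a larger table would only require changing ACT_HASH_BITS, as long as it stays between 1 and 63.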
2022-09-08 14:55:09 +02:00
Willy Tarreau
4c1bc01f31 CLEANUP: activity: make taskprof use ptr_hash()
There's no more point using a different hash function here, xxh64 is
of course better distributed but we really don't care so let's unify
the code.
2022-09-08 14:19:15 +02:00
Willy Tarreau
245d32fe8f CLEANUP: activity: make memprof use the generic ptr_hash() function
There's no need to keep a local version of that function anymore.
2022-09-08 14:19:15 +02:00
Willy Tarreau
04e50b3d32 CLEANUP: task: rename ->call_date to ->wake_date
This field is misnamed because its real and important content is the
date the task was woken up, not the date it was called. It temporarily
holds the call date during execution but this remains confusing. In
fact before the latency measurements were possible it was indeed a call
date. Thus it will now be called wake_date.

This change is necessary because a subsequent fix will require the
introduction of the real call date in the thread ctx.
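
To make the distinction concrete, here is a simplified sketch of how a wakeup date can be turned into a scheduling latency. The struct and function names are hypothetical and only the timestamp field is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified task carrying only the profiling timestamp. */
struct demo_task {
	uint64_t wake_date;   /* ns timestamp set when the task is woken up */
};

/* Scheduling latency: time spent waiting between wakeup and execution. */
static uint64_t demo_latency(const struct demo_task *t, uint64_t call_date)
{
	return call_date - t->wake_date;
}
```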
2022-09-08 14:19:15 +02:00
Willy Tarreau
42b180dcdb MINOR: pools/memprof: store and report the pool's name in each bin
Storing the pointer to the pool along with the stats is quite useful as
it allows to report the name. That's what we're doing here. We could
store it in place of another field but that's not convenient as it would
require to change all functions that manipulate counters. Thus here we
store one extra field, as well as some padding because the struct turns
56 bytes long, thus better go to 64 directly. Example of output from
"show profiling memory":

      2      0       48         0|  0x4bfb2c ha_quic_set_encryption_secrets+0xcc/0xb5e p_alloc(24) [pool=quic_tls_iv]
      0  55252        0  10608384|  0x4bed32 main+0x2beb2 free(-192)
     15      0     2760         0|  0x4be855 main+0x2b9d5 p_alloc(184) [pool=quic_frame]
      1      0     1048         0|  0x4be266 ha_quic_add_handshake_data+0x2b6/0x66d p_alloc(1048) [pool=quic_crypto]
      3      0      552         0|  0x4be142 ha_quic_add_handshake_data+0x192/0x66d p_alloc(184) [pool=quic_frame]
  31276      0  6755616         0|  0x4bb8f9 quic_sock_fd_iocb+0x689/0x69b p_alloc(216) [pool=quic_dgram]
      0  31424        0   6787584|  0x4bb7f3 quic_sock_fd_iocb+0x583/0x69b p_free(-216) [pool=quic_dgram]
    152      0    32832         0|  0x4bb4d9 quic_sock_fd_iocb+0x269/0x69b p_alloc(216) [pool=quic_dgram]
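
The size arithmetic described above can be sketched as follows. The field names and the layout are assumptions chosen to reproduce the 56-to-64 byte reasoning on a 64-bit target; they are not guaranteed to match the real structure.

```c
#include <assert.h>

/* Hypothetical bin layout, loosely modelled on the description above.
 * Sizes in comments assume a 64-bit (LP64) target.
 */
struct demo_bin {
	const void *caller;              /* 8: call site of the allocation */
	int method;                      /* 4, plus 4 of alignment padding */
	unsigned long long alloc_calls;  /* 8 */
	unsigned long long free_calls;   /* 8 */
	unsigned long long alloc_tot;    /* 8 */
	unsigned long long free_tot;     /* 8 */
	const void *info;                /* 8: extra field (e.g. the pool), 56 so far */
	void *pad;                       /* 8: explicit padding up to 64 */
};
```

Rounding up to 64 bytes keeps each bin on a power-of-two size, which is typically friendlier to cache lines than a 56-byte stride.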
2022-08-17 10:34:00 +02:00
Willy Tarreau
facfad2b64 MINOR: pool/memprof: report pool alloc/free in memory profiling
Pools are being used so well that it becomes difficult to profile their
usage via the regular memory profiling. Let's add new entries for pools
there, named "p_alloc" and "p_free" that correspond to pool_alloc() and
pool_free(). Ideally it would be nice to only report those that fail
cache lookups but that's complicated, particularly on the free() path
since free lists are released in clusters to the shared pools.

It's worth noting that the alloc_tot/free_tot fields can easily be
determined by multiplying alloc_calls/free_calls by the pool's size, and
could be better used to store a pointer to the pool itself. However it
would require significant changes down the code that sorts output.
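
As a quick check of that relationship, the sketch below recomputes a byte total from a call count and an object size. The helper name is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: total bytes for a pool entry, derived from the
 * number of calls and the pool's fixed object size.
 */
static unsigned long long demo_pool_bytes(unsigned long long calls,
                                          unsigned long long obj_size)
{
	return calls * obj_size;
}
```

For example, 31276 p_alloc calls on a pool of 216-byte objects give demo_pool_bytes(31276, 216) == 6755616 bytes.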

If this were to cause a measurable slowdown, an alternate approach could
consist in using a different value of USE_MEMORY_PROFILING to enable pools
profiling. Also, this profiler doesn't depend on intercepting regular malloc
functions, so we could also imagine enabling it alone or the other one alone
or both.

Tests show that the CPU overhead on QUIC (which is already an extremely
intensive user of pools) jumps from ~7% to ~10%. This is quite acceptable
in most deployments.
2022-08-17 09:38:05 +02:00
Willy Tarreau
219afa2ca8 MINOR: memprof: export the minimum definitions for memory profiling
Right now it's not possible to feed memory profiling info from outside
activity.c, so let's export the function and move the enum and struct
to the include file.
2022-08-17 09:03:57 +02:00
Willy Tarreau
bdcd32598f MINOR: thread: only use atomic ops to touch the flags
The thread flags are touched a little bit by other threads, e.g. the STUCK
flag may be set by other ones, and they're watched a little bit. As such
we need to use only atomic ops to manipulate them. Most places were already
using them, but here we generalize the practice. Only ha_thread_dump() does
not change because it's run under isolation.
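
The practice can be sketched with standard C11 atomics, which are used here in place of the project's own atomic macros. The flag values, the struct and the helper names are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical flag bits and per-thread context for illustration. */
#define DEMO_FL_STUCK     0x01u
#define DEMO_FL_PROFILING 0x02u

struct demo_thread_ctx {
	atomic_uint flags;   /* may be read or written by other threads */
};

/* Set bits atomically; returns the previous flags value. */
static unsigned int demo_flags_set(struct demo_thread_ctx *ctx, unsigned int bits)
{
	return atomic_fetch_or(&ctx->flags, bits);
}

/* Clear bits atomically; returns the previous flags value. */
static unsigned int demo_flags_clear(struct demo_thread_ctx *ctx, unsigned int bits)
{
	return atomic_fetch_and(&ctx->flags, ~bits);
}

/* Read a consistent snapshot of the flags. */
static unsigned int demo_flags_get(struct demo_thread_ctx *ctx)
{
	return atomic_load(&ctx->flags);
}
```

A plain read-modify-write such as flags |= bit could lose an update made concurrently by another thread, which is what the atomic variants avoid.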
2022-07-01 19:15:14 +02:00
Willy Tarreau
319d136ff9 MEDIUM: task: use regular eb32 trees for the run queues
Since we don't mix tasks from different threads in the run queues
anymore, we don't need to use the eb32sc_ trees and we can switch
to the regular eb32 ones. This uses cheaper lookup and insert code,
and a 16-thread test on the queues shows a performance increase
from 570k RPS to 585k RPS.
2022-07-01 19:15:14 +02:00
Willy Tarreau
6f78038d72 MEDIUM: task: move the shared runqueue to one per thread
Since we use the shared runqueue only for tasks that are assigned to
known threads, let's move that runqueue to each of these threads. The
goal will be to arrange an N*(N-1) mesh instead of a central contention
point.

The global_rqueue_ticks had to be dropped (for good) since we'll now
use the per-thread rqueue_ticks counter for both trees.

A few points to note:
  - the rq_lock still remains the global one for now so there should not
    be any gain in doing this, but should this trigger any regression, it
    is important to detect whether it's related to the lock or to the tree.

  - there's no more reason for using the scope-based version of the ebtree
    now, we could switch back to the regular eb32_tree.

  - it's worth checking if we still need TASK_GLOBAL (probably only to
    delete a task in one's own shared queue maybe).
2022-07-01 19:15:14 +02:00
Willy Tarreau
680ed5f28b MINOR: task: move profiling bit to per-thread
Instead of having a global mask of all the profiled threads, let's have
one flag per thread in each thread's flags. They are never accessed more
than one at a time and are better located inside the threads' contexts for
both performance and scalability.
2022-06-14 10:38:03 +02:00
Willy Tarreau
c12b321661 CLEANUP: applet: rename appctx_cs() to appctx_sc()
It returns a stream connector, not a conn_stream anymore, so let's
fix its name.
2022-05-27 19:33:35 +02:00
Willy Tarreau
475e4636bc CLEANUP: cli: rename all occurrences of stconn "cs" to "sc"
Function arguments and local variables called "cs" were renamed to "sc"
in the various keyword handlers.
2022-05-27 19:33:35 +02:00
Willy Tarreau
cb086c6de1 REORG: stconn: rename conn_stream.{c,h} to stconn.{c,h}
There's no more reason for keeping the code and definitions in conn_stream,
let's move all that to stconn. The alphabetical ordering of include files
was adjusted.
2022-05-27 19:33:35 +02:00
Willy Tarreau
5edca2f0e1 REORG: rename cs_utils.h to sc_strm.h
This file contains all the stream-connector functions that are specific
to application layers of type stream. So let's name it accordingly so
that it's easier to figure what's located there.

The alphabetical ordering of include files was preserved.
2022-05-27 19:33:35 +02:00
Willy Tarreau
40a9c32e3a CLEANUP: stconn: rename cs_{i,o}{b,c} to sc_{i,o}{b,c}
We're starting to propagate the stream connector's new name through the
API. Most call places of these functions that retrieve the channel or its
buffer are in applets. The local variable names are not changed in order
to keep the changes small and reviewable. There were ~92 uses of cs_ic(),
~96 of cs_oc() (due to co_get*() being less factorizable than ci_put*),
and ~5 accesses to the buffer itself.
2022-05-27 19:33:34 +02:00
Willy Tarreau
d0a06d52f4 CLEANUP: applet: use applet_put*() everywhere possible
This applies the change so that the applet code stops using ci_putchk()
and friends everywhere possible, for the much safer applet_put*() instead.
The change is mechanical but large. Two or three functions used to have no
appctx and a cs derived from the appctx instead, which was a reminiscence
of old times' stream_interface. These were simply changed to directly take
the appctx. No sensitive change was performed, and the old (more complex)
API is still usable when needed (e.g. the channel is already known).

The change touched roughly a hundred locations, with no less than 124
lines removed.

It's worth noting that the stats applet, the oldest of the series, could
get a serious lifting, as it's still very channel-centric instead of
propagating the appctx along the chain. Given that this code doesn't
change often, there's no emergency to clean it up but it would look
better.
2022-05-27 19:33:34 +02:00
Willy Tarreau
4596fe20d9 CLEANUP: conn_stream: tree-wide rename to stconn (stream connector)
This renames the "struct conn_stream" to "struct stconn" and updates
the descriptions in all comments (and the rare help descriptions) to
"stream connector" or "connector". This touches a lot of files but
the change is minimal. The local variables were not even renamed, so
there's still a lot of "cs" everywhere.
2022-05-27 19:33:34 +02:00
Willy Tarreau
0698c80a58 CLEANUP: applet: remove the unneeded appctx->owner
This one is the pointer to the conn_stream which is always in the
endpoint that is always present in the appctx, thus it's not needed.
This patch removes it and replaces it with appctx_cs() instead. A
few occurrences that were using __cs_strm(appctx->owner) were moved
directly to appctx_strm() which does the equivalent.
2022-05-13 14:28:48 +02:00
Willy Tarreau
e8d006a79a CLEANUP: activity/cli: make "show profiling" not use ctx.cli anymore
The I/O handler was using ctx.cli.i0/i1/o0/o1. Let's put all that into
a locally-defined context and use it instead.
2022-05-06 18:13:36 +02:00
Christopher Faulet
6b0a0fb2f9 CLEANUP: tree-wide: Remove any ref to stream-interfaces
Stream-interfaces are gone. Corresponding files can safely be removed. In
addition, comments are updated accordingly.
2022-04-13 15:10:16 +02:00
Christopher Faulet
a0bdec350f MEDIUM: stream-int/conn-stream: Move blocking flags from SI to CS
Remaining flags and associated functions are moved to the conn-stream
scope. These flags are added on the endpoint and not the conn-stream
itself. This way it will be possible to get them from the mux or the
applet. The functions to get or set these flags are renamed accordingly with
the "cs_" prefix and updated to manipulate a conn-stream instead of a
stream-interface.
2022-04-13 15:10:15 +02:00
Christopher Faulet
908628c4c0 MEDIUM: tree-wide: Use CS util functions instead of SI ones
At many places, we now use the new CS functions to get a stream or a channel
from a conn-stream instead of using the stream-interface API. It is the
first step to reduce the scope of the stream-interfaces. The main change
here is about the applet I/O callback functions. Before the refactoring, the
stream-interface was the appctx owner. Thus, it was heavily used. Now, as
far as possible, the conn-stream is used. Of course, many calls to
the stream-interface API remain.
2022-04-13 15:10:14 +02:00
Christopher Faulet
86e1c3381b MEDIUM: applet: Set the conn-stream as appctx owner instead of the stream-int
Because appctx is now an endpoint of the conn-stream, there is no reason to
still have the stream-interface as appctx owner. Thus, the conn-stream is
now the appctx owner.
2022-02-24 11:00:02 +01:00
Willy Tarreau
1de51eb727 MINOR: memprof: add one pointer size to the size of allocations
The current model causes an issue when trying to spot memory leaks,
because malloc(0) or realloc(0) do not count as allocations since we only
account for the application-usable size. This is the problem that kept
issue #1406 from appearing as a leak.

What we're doing now is to account for one extra pointer (the one that
memory allocators usually place before the returned area), so that a
malloc(0) will properly account for 4 or 8 bytes. We don't need something
exact, we just need something non-zero so that a realloc(X) followed by a
realloc(0) without a free() gives a small non-zero result.
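
A minimal sketch of that accounting rule, assuming a hypothetical helper that receives the usable size reported by the allocator:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accounting helper: charge the usable size plus one pointer,
 * approximating the header most allocators keep before the returned area.
 */
static size_t demo_accounted_size(size_t usable_size)
{
	return usable_size + sizeof(void *);
}
```

With this rule a zero-sized allocation is still charged sizeof(void *) bytes, so an unmatched malloc(0) or realloc(0) leaves a small non-zero balance instead of disappearing from the totals.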

It was verified that the results are stable including in the presence
of lots of malloc/realloc/free as happens when stressing Lua.

It would make sense to backport this to 2.4 as it helps in bug reports.
2021-10-22 16:40:09 +02:00
Willy Tarreau
8cce4d79ff MINOR: memprof: report the delta between alloc and free on realloc()
realloc() calls are painful to analyse because they have two non-zero
columns and trying to spot a leaking one requires a bit of scripting.
Let's simply append the delta at the end of the line when alloc and
free are non-zero.

It would be useful to backport this to 2.4 to help with bug reports.
2021-10-22 16:40:09 +02:00
Willy Tarreau
1a9c922b53 REORG: thread/sched: move the task_per_thread stuff to thread_ctx
The scheduler contains a lot of stuff that is thread-local and not
exclusively tied to the scheduler. Other parts (namely thread_info)
contain similar thread-local context that ought to be merged with
it but that is even less related to the scheduler. However moving
more data into this structure isn't possible since task.h is high
level and cannot be included everywhere (e.g. activity) without
causing include loops.

In the end, it appears that the task_per_thread represents most of
the per-thread context defined with generic types and should simply
move to tinfo.h so that everyone can use them.

The struct was renamed to thread_ctx and the variable "sched" was
renamed to "th_ctx". "sched" used to be initialized manually from
run_thread_poll_loop(), now it's initialized by ha_set_tid() just
like ti, tid, tid_bit.

The memset() in init_task() was removed in favor of a bss initialization
of the array, so that other subsystems can put their stuff in this array.

Since the tasklet array has TL_CLASSES elements, the TL_* definitions
were moved there as well, but it's not a problem.

The vast majority of the change in this patch is caused by the
renaming of the structures.
2021-10-08 17:22:26 +02:00