haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-18 04:56:56 +02:00

Author	SHA1	Message	Date
Christopher Faulet	c91745e3a4	BUG/MINOR: mux-h1: Fix conditions on pipe in some COUNT_IF() The previous commit contains a bug in some COUNT_IF() relying on the pipe inside the IOBUF. We must take care to have a pipe before checking its size. No backport needed.	2024-10-24 09:50:16 +02:00
Christopher Faulet	7e60928c9c	DEBUG: mux-h1: Add debug counters to track errors with in/out pending data Debug counters were added on all connection errors when pending data remain blocked in the input or output buffers. The same is performed when the H1C is released, when the connection is closed and when a timeout is reached. The idea is to be able to count all cases where data are lost, especially the outgoing ones.	2024-10-24 08:18:55 +02:00
Willy Tarreau	1eb31d30fe	Revert "OPTIM: mux-h2: make h2_send() report more accurate wake up conditions" This reverts commit `9fbc01710a`. In 3.1-dev10, commit `9fbc01710a` ("OPTIM: mux-h2: make h2_send() report more accurate wake up conditions") leveraged the more accurate distinction between demux and recv to decide when to wake the tasklet up after a send. But other cases need a wake-up as well, for example when we just need to wake the processing task up so that it itself wakes up other streams that are blocked. Indeed, a temporarily blocked stream may block other ones, which will never be woken up if the demux has nothing to do. In an ideal world we would check all cases where blocking flags were dropped. However it looks like this case after a send is probably the only one that deserves waking up the connection again. It's likely that in practice the MUX_MFULL flag was dropped and that it was that one that was blocking the send. In addition, dealing with these cases was not sufficient, as one case was encountered where dbuf was empty, subs=0, short_read still present while in FRH state... and the timeouts were still there (easily found with halog -tcn cD at a rate of 1-2 every 2 minutes roughly). Interestingly, in a dump, some MBUF_HAS_DATA were seen on an empty mbuf, so it means that certain conditions must be taken very carefully in the wakeup conditions. So overall this indicates that there remain subtle inconsistencies that this optimization is sensitive to. It may have to be revisited later but for now better revert it. No backport is needed.
Annex: - first dump showing a dependency on WAIT_INLIST after h2_send(): 0x6dc2800: [23/Oct/2024:18:07:22.861247] id=1696 proto=tcpv4 flags=0x100c4a, conn_retries=0, conn_exp=<NEVER> conn_et=0x000 srv_conn=0x597a900, pend_pos=(nil) waiting=0 epoch=0 frontend=public (id=2 mode=http), listener=SSL (id=5) backend=gitweb-haproxy (id=6 mode=http) task=0x6e1d090 (state=0x00 nice=0 calls=23 rate=0 exp=2s tid=0(1/0) age=57s) txn=0x6e3f7c0 flags=0x43000 meth=1 status=200 req.st=MSG_DONE rsp.st=MSG_DATA req.f=0x4c rsp.f=0x2e scf=0x6dc33a0 flags=0x00002482 ioto=1m state=EST endp=CONN,0x6dc6c20,0x40405001 sub=3 rex=<NEVER> wex=3s rto=3s wto=3s iobuf.flags=0x00000000 .pipe=0 .buf=0@(nil)+0/0 h2s=0x6dc6c20 h2s.id=59 .st=HCR .flg=0x7001 .rxwin=32712 .rxbuf.c=0 .t=0@(nil)+0/0 .h=0@(nil)+0/0 .sc=0x6dc33a0(.flg=0x00002482 .app=0x6dc2800) .sd=0x6e83fd0(.flg=0x40405001) .subs=0x6dc33b8(ev=3 tl=0x6e22a20 tl.calls=10 tl.ctx=0x6dc33a0 tl.fct=sc_conn_io_cb) h2c=0x6e66570 h2c.st0=FRH .err=0 .maxid=77 .lastid=-1 .flg=0x2000e00 .nbst=2 .nbsc=2 .nbrcv=0 .glitches=0 .fctl_cnt=0 .send_cnt=2 .tree_cnt=2 .orph_cnt=0 .sub=1 .dsi=77 .dbuf=0@(nil)+0/0 .mbuf=[4..4\|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] .task=0x6dbdc60 .exp=<NEVER> co0=0x7f84881614b0 ctrl=tcpv4 xprt=SSL mux=H2 data=STRM target=LISTENER:0x2acb7c0 flags=0x80000300 fd=19 fd.state=121 updt=0 fd.tmask=0x1 scb=0x2a8da90 flags=0x00001211 ioto=1m state=EST endp=CONN,0x6e5a530,0x106c0001 sub=0 rex=<NEVER> wex=<NEVER> rto=3s wto=<NEVER> iobuf.flags=0x00000000 .pipe=0 .buf=0@(nil)+0/0 h1s=0x6e5a530 h1s.flg=0x14094 .sd.flg=0x106c0001 .req.state=MSG_DONE .res.state=MSG_DATA .meth=GET status=200 .sd.flg=0x106c0001 .sc.flg=0x00001211 .sc.app=0x6dc2800 .subs=(nil) h1c=0x7f84880f5f40 h1c.flg=0x80000020 .sub=0 .ibuf=32704@0x6ddef30+16262/32768 .obuf=0@(nil)+0/0 .task=0x6e131d0 .exp=<NEVER> co1=0x7f8488172b70 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM target=SERVER:0x597a900 flags=0x00000300 fd=31 fd.state=10122 updt=0 fd.tmask=0x1 filters={0x6e49f30="cache 
store filter", 0x6e67ad0="compression filter"} req=0x6dc2828 (f=0x21840000 an=0x48000 tofwd=0 total=224) an_exp=<NEVER> buf=0x6dc2830 data=(nil) o=0 p=0 i=0 size=0 htx=0x104d2c0 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0 res=0x6dc2870 (f=0xa0040000 an=0x24000000 tofwd=0 total=309982) an_exp=<NEVER> buf=0x6dc2878 data=0x6dceef0 o=16333 p=16333 i=16435 size=32768 htx=0x6dceef0 flags=0x0 size=32720 data=16333 used=1 wrap=NO extra=0 ----------------------------------- strm.flg 0x100c4a SF_SRV_REUSED SF_HTX SF_REDIRECTABLE SF_CURR_SESS SF_BE_ASSIGNED SF_ASSIGNED task.state 0 0 txn.meth 1 GET txn.flg 0x43000 TX_NOT_FIRST TX_CACHE_COOK TX_CACHEABLE txn.req.flg 0x4c HTTP_MSGF_BODYLESS HTTP_MSGF_VER_11 HTTP_MSGF_XFER_LEN txn.rsp.flg 0x2e HTTP_MSGF_COMPRESSING HTTP_MSGF_VER_11 HTTP_MSGF_XFER_LEN HTTP_MSGF_TE_CHNK f.sc.flg 0x2482 SC_FL_SND_EXP_MORE SC_FL_RCV_ONCE SC_FL_WONT_READ SC_FL_EOI f.sc.sd.flg 0x40405001 SE_FL_HAVE_NO_DATA SE_FL_MAY_FASTFWD_CONS SE_FL_EOI SE_FL_NOT_FIRST SE_FL_T_MUX f.h2s.flg 0x7001 H2_SF_HEADERS_RCVD H2_SF_OUTGOING_DATA H2_SF_HEADERS_SENT H2_SF_ES_RCVD f.h2s.sd.flg 0x40405001 SE_FL_HAVE_NO_DATA SE_FL_MAY_FASTFWD_CONS SE_FL_EOI SE_FL_NOT_FIRST SE_FL_T_MUX f.h2c.flg 0x2000e00 H2_CF_MBUF_HAS_DATA H2_CF_DEM_IN_PROGRESS H2_CF_DEM_SHORT_READ H2_CF_WAIT_INLIST f.co.flg 0x80000300 CO_FL_XPRT_TRACKED CO_FL_XPRT_READY CO_FL_CTRL_READY f.co.fd.st 0x121 FD_POLL_IN FD_EV_READY_W FD_EV_ACTIVE_R b.sc.flg 0x1211 SC_FL_SND_NEVERWAIT SC_FL_NEED_ROOM SC_FL_NOHALF SC_FL_ISBACK b.sc.sd.flg 0x106c0001 SE_FL_WAIT_DATA SE_FL_MAY_FASTFWD_CONS SE_FL_MAY_FASTFWD_PROD SE_FL_WANT_ROOM SE_FL_RCV_MORE SE_FL_T_MUX b.h1s.sd.flg 0x106c0001 SE_FL_WAIT_DATA SE_FL_MAY_FASTFWD_CONS SE_FL_MAY_FASTFWD_PROD SE_FL_WANT_ROOM SE_FL_RCV_MORE SE_FL_T_MUX b.h1s.flg 0x14094 H1S_F_HAVE_CLEN H1S_F_HAVE_O_CONN H1S_F_NOT_FIRST H1S_F_WANT_KAL H1S_F_RX_CONGESTED b.h1c.flg 0x80000020 H1C_F_IS_BACK H1C_F_IN_FULL b.co.flg 0x300 CO_FL_XPRT_READY CO_FL_CTRL_READY b.co.fd.st 0x278a FD_POLL_OUT FD_POLL_PRI 
FD_POLL_IN FD_EV_ERR_RW FD_EV_READY_R 0x2008 req.flg 0x21840000 CF_FLT_ANALYZE CF_DONT_READ CF_AUTO_CONNECT CF_WROTE_DATA req.ana 0x48000 AN_REQ_FLT_END AN_REQ_HTTP_XFER_BODY req.htx.flg 0 0 res.flg 0xa0040000 CF_ISRESP CF_FLT_ANALYZE CF_WROTE_DATA res.ana 0x24000000 AN_RES_FLT_END AN_RES_HTTP_XFER_BODY res.htx.flg 0 0 ----------------------------------- - second example of stuck connection after properly checking for WAIT_INLIST as well: 0x73438d0: [23/Oct/2024:18:46:57.235709] id=3963 proto=tcpv4 flags=0x100c4a, conn_retries=0, conn_exp=<NEVER> conn_et=0x000 srv_conn=0x5dd3f50, pend_pos=(nil) waiting=0 epoch=0x13 p_stc=25 p_req=29 p_res=29 p_prp=29 frontend=public (id=2 mode=http), listener=SSL (id=5) backend=gitweb-haproxy (id=6 mode=http) task=0x72a13e0 (state=0x00 nice=0 calls=24 rate=0 exp=7s tid=0(1/0) age=53s) txn=0x7287260 flags=0x43000 meth=1 status=200 req.st=MSG_DONE rsp.st=MSG_DATA req.f=0x4c rsp.f=0x2e scf=0x729e520 flags=0x00042082 ioto=1m state=EST endp=CONN,0x737ffd0,0x4040d001 sub=2 rex=<NEVER> wex=46s rto=46s wto=46s iobuf.flags=0x00000000 .pipe=0 .buf=0@(nil)+0/0 h2s=0x737ffd0 h2s.id=57 .st=HCR .flg=0x7001 .rxwin=32712 .rxbuf.c=0 .t=0@(nil)+0/0 .h=0@(nil)+0/0 .sc=0x729e520(.flg=0x00042082 .app=0x73438d0) .sd=0x72afd50(.flg=0x4040d001) .subs=0x729e538(ev=2 tl=0x72af760 tl.calls=10 tl.ctx=0x729e520 tl.fct=sc_conn_io_cb) h2c=0x72555a0 h2c.st0=FRH .err=0 .maxid=77 .lastid=-1 .flg=0x60e00 .nbst=1 .nbsc=1 .nbrcv=0 .glitches=0 .fctl_cnt=0 .send_cnt=1 .tree_cnt=1 .orph_cnt=0 .sub=0 .dsi=77 .dbuf=0@(nil)+0/0 .mbuf=[2..2\|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] .task=0x725e660 .exp=<NEVER> co0=0x7378e00 ctrl=tcpv4 xprt=SSL mux=H2 data=STRM target=LISTENER:0x2f24800 flags=0x80040300 fd=23 fd.state=1122 updt=0 fd.tmask=0x1 scb=0x2ee74c0 flags=0x00001211 ioto=1m state=EST endp=CONN,0x7287190,0x106c0001 sub=0 rex=<NEVER> wex=<NEVER> rto=46s wto=<NEVER> iobuf.flags=0x00000000 .pipe=0 .buf=0@(nil)+0/0 h1s=0x7287190 h1s.flg=0x14094 .sd.flg=0x106c0001 
.req.state=MSG_DONE .res.state=MSG_DATA .meth=GET status=200 .sd.flg=0x106c0001 .sc.flg=0x00001211 .sc.app=0x73438d0 .subs=(nil) h1c=0x7373920 h1c.flg=0x80000020 .sub=0 .ibuf=32704@0x7272700+318/32768 .obuf=0@(nil)+0/0 .task=0x729e700 .exp=<NEVER> co1=0x72f5290 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM target=SERVER:0x5dd3f50 flags=0x00000300 fd=19 fd.state=10122 updt=0 fd.tmask=0x1 filters={0x728f1f0="cache store filter" [3], 0x728fea0="compression filter" [28]} req=0x73438f8 (f=0x21840000 an=0x48000 tofwd=0 total=224) an_exp=<NEVER> buf=0x7343900 data=(nil) o=0 p=0 i=0 size=0 htx=0x105f440 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0 res=0x7343940 (f=0xa0040000 an=0x24000000 tofwd=0 total=359574) an_exp=<NEVER> buf=0x7343948 data=0x72b1b30 o=16333 p=16333 i=16435 size=32768 htx=0x72b1b30 flags=0x8 size=32720 data=16333 used=1 wrap=NO extra=0 ----------------------------------- strm.flg 0x100c4a SF_SRV_REUSED SF_HTX SF_REDIRECTABLE SF_CURR_SESS SF_BE_ASSIGNED SF_ASSIGNED task.state 0 0 txn.meth 1 GET txn.flg 0x43000 TX_NOT_FIRST TX_CACHE_COOK TX_CACHEABLE txn.req.flg 0x4c HTTP_MSGF_BODYLESS HTTP_MSGF_VER_11 HTTP_MSGF_XFER_LEN txn.rsp.flg 0x2e HTTP_MSGF_COMPRESSING HTTP_MSGF_VER_11 HTTP_MSGF_XFER_LEN HTTP_MSGF_TE_CHNK f.sc.flg 0x42082 SC_FL_EOS SC_FL_SND_EXP_MORE SC_FL_WONT_READ SC_FL_EOI f.sc.sd.flg 0x4040d001 SE_FL_HAVE_NO_DATA SE_FL_MAY_FASTFWD_CONS SE_FL_EOS SE_FL_EOI SE_FL_NOT_FIRST SE_FL_T_MUX f.h2s.flg 0x7001 H2_SF_HEADERS_RCVD H2_SF_OUTGOING_DATA H2_SF_HEADERS_SENT H2_SF_ES_RCVD f.h2s.sd.flg 0x4040d001 SE_FL_HAVE_NO_DATA SE_FL_MAY_FASTFWD_CONS SE_FL_EOS SE_FL_EOI SE_FL_NOT_FIRST SE_FL_T_MUX f.h2c.flg 0x60e00 H2_CF_END_REACHED H2_CF_RCVD_SHUT H2_CF_MBUF_HAS_DATA H2_CF_DEM_IN_PROGRESS H2_CF_DEM_SHORT_READ f.co.flg 0x80040300 CO_FL_XPRT_TRACKED CO_FL_SOCK_RD_SH CO_FL_XPRT_READY CO_FL_CTRL_READY f.co.fd.st 0x1122 FD_POLL_HUP FD_POLL_IN FD_EV_READY_W FD_EV_READY_R b.sc.flg 0x1211 SC_FL_SND_NEVERWAIT SC_FL_NEED_ROOM SC_FL_NOHALF SC_FL_ISBACK b.sc.sd.flg 0x106c0001 
SE_FL_WAIT_DATA SE_FL_MAY_FASTFWD_CONS SE_FL_MAY_FASTFWD_PROD SE_FL_WANT_ROOM SE_FL_RCV_MORE SE_FL_T_MUX	2024-10-23 19:17:10 +02:00
Willy Tarreau	a1d0e58b06	BUILD: spoe: fix build warning on older gcc around sub-struct initialization gcc-4.8 is unhappy with the cfg_file initialization: src/flt_spoe.c: In function 'parse_spoe_flt': src/flt_spoe.c:2202:9: warning: missing braces around initializer [-Wmissing-braces] struct cfgfile cfg_file = {0}; ^ src/flt_spoe.c:2202:9: warning: (near initialization for 'cfg_file.list') [-Wmissing-braces] This is due to the embedded list member. Initializing it to empty like we do almost everywhere else makes it happy. No backport is needed as this was changed in 3.1-dev5 only.	2024-10-23 15:12:59 +02:00
Aurelien DARRAGON	b5b40a9843	BUG/MEDIUM: connection/http-reuse: fix address collision on unhandled address families As described in GH #2765, there were situations where http connections would be re-used for requests to different endpoints, which is obviously unexpected. In GH #2765, this occurred with httpclient and UNIX socket combination, but later code analysis revealed that while disabling http reuse on httpclient proxy helped, it didn't fix the underlying issue since it was found that conn_calculate_hash_sockaddr() didn't take into account families such as AF_UNIX or AF_CUST_SOCKPAIR, and because of that the sock_addr part of the connection wasn't hashed. To properly fix the issue, let's explicitly handle UNIX (both regular and ABNS) and AF_CUST_SOCKPAIR families, so that the destination address is properly hashed. To prevent this bug from re-appearing: when the family isn't known, instead of doing nothing like before, let's fall back to a generic (suboptimal) hashing which hashes the whole sockaddr_storage struct. As a workaround, http-reuse may be disabled on impacted proxies. (Unfortunately this doesn't help for httpclient since reuse policy defaults to safe and cannot be modified from the config.) It should be backported to all stable versions. Shout out to @christopherhibbert for having reported the issue and provided a trivial reproducer. [ada: prior to 3.0, ctx adjt is required because conn_hash_update()'s prototype is slightly different]	2024-10-23 11:48:16 +02:00
Willy Tarreau	b74fb1325e	MINOR: sample: add the "when" converter to condition some expressions Sometimes it would be desirable to include some debugging output only under certain conditions, but the end of the transfer is too late to apply some rules. Here we take the approach of making a converter ("when") that takes a condition among an arbitrary list, and decides whether or not to let the input sample pass through based on the condition. This allows for example to log debugging information only when an error was encountered during the processing (sort of an extension of dontlog-normal). The conditions are quite limited (stopping, error, normal, toapplet, forwarded, processed) and can be negated. The converter can also be chained to use more complex conditions. A suggested example will be: # log "dbg={-}" when fine, or "dbg={... debug info ...}" on error: log-format "$HAPROXY_HTTP_LOG_FMT dbg={%[bs.debug_str,when(!normal)]}"	2024-10-22 20:13:00 +02:00
Willy Tarreau	19e4ec43b9	MINOR: filters: add per-filter call counters The idea here is to record how many times a filter is being called on a stream. We're incrementing the same counter all along, regardless of the type of event, since the purpose is essentially to detect one that might be misbehaving. The number of calls is reported in "show sess all" next to the filter name. It may also help detect suboptimal processing. For example compressing 1GB shows 138k calls to the compression filter, which is roughly two calls per buffer. Maybe we wake up with incomplete buffers and compress less. That's left for a future analysis.	2024-10-22 20:13:00 +02:00
Willy Tarreau	37d5c6fe3a	MINOR: stream: maintain per-stream counters of the number of passes on code process_stream() is a complex function and a few times some loops were either witnessed or suspected. Each time this happens it's extremely difficult to figure why because it involves combinations of analysers, filters, errors etc. Let's at least maintain a set of 4 counters per stream that report the number of times we've been through each of the 4 most important blocks (stconn changes, request analysers, response analysers, and propagation of changes down). These ones are stored in the stream and reported in "show sess all", just like they will be reported in panic dumps.	2024-10-22 20:13:00 +02:00
Christopher Faulet	ce314cfb39	MINOR: mux-h1: Add support of the debug string for logs Now it is possible to have info about the front and back H1 multiplexers. For instance: <134>Oct 22 18:10:46 haproxy[3841864]: 127.0.0.1:44280 [22/Oct/2024:18:10:43.265] front-http back-http/www 0/0/-1/-1/3082 503 217 - - SC-- 1/1/0/0/3 0/0 "GET / HTTP/1.1" fs=< h1s=0x13b6f10 h1s.flg=0x14010 .sd.flg=0x50404601 .req.state=MSG_DONE .res.state=MSG_DONE .meth=GET status=503 .sd.flg=0x50404601 .sc.flg=0x00034482 .sc.app=0x11e4c30 .subs=(nil) h1c.flg=0x0 .sub=0 .ibuf=0@(nil)+0/0 .obuf=0@(nil)+0/0 .task=0x1337d10 .exp=<NEVER> conn.flg=0x80000300> bs=< h1s=0x13bb400 h1s.flg=0x100010 .sd.flg=0x10400001 .req.state=MSG_RQBEFORE .res.state=MSG_RPBEFORE .meth=UNKNOWN status=0 .sd.flg=0x10400001 .sc.flg=0x0003c007 .sc.app=0x11e4c30 .subs=(nil) h1c.flg=0x80000000 .sub=0 .ibuf=0@(nil)+0/0 .obuf=0@(nil)+0/0 .task=0x12ba610 .exp=<NEVER> conn.flg=0x5c0300> To have this log message, the log-format must be set to: log-format "$HAPROXY_HTTP_LOG_FMT fs=<%[fs.debug_str]> bs=<%[bs.debug_str]>"	2024-10-22 18:21:28 +02:00
Christopher Faulet	35ab9b8c6d	DEBUG: mux-h1: Add debug counters to track some errors Debug counters are added to track errors about a wrong payload length during the message formatting (on the sending path). Aborts are also concerned. Connection shutdowns and errors occurring before the end of the message was reached are now tracked. On the sending path, shutdowns performed before the whole message was forwarded are tracked too.	2024-10-22 17:39:32 +02:00
Christopher Faulet	c8aecc393b	DEBUG: stream: Add debug counters to track some client/server aborts Not all aborts are tracked for now but only those a bit ambiguous. Mainly, aborts during the data forwarding are concerned. Those triggered during the request or the response analysis are easier to analyze with the stream termination state.	2024-10-22 16:46:37 +02:00
Christopher Faulet	19b736a5fb	CLEANUP: stream: remove outdated comments Comments added during a refactoring session were still there while they are now totally useless. So let's remove them.	2024-10-22 16:14:15 +02:00
Christopher Faulet	7dc930d231	BUG/MINOR: stconn: Pretend the SE have more data to deliver on abortonclose When abortonclose option is enabled on the backend, at the SC level, we must still pretend the SE has more data to deliver to be able to receive the EOS. It must be performed at 2 places: * When the backend is set and the connection is requested. It is when the option is seen for the first time. * After a receive attempt, if the EOI flag is set on the sedesc. Otherwise, when an abort is detected by the mux, the SC is not notified. This patch should fix the issue #2764. This bug probably exists in all stable versions but is only visible since `bca5e1423` ("OPTIM: stconn: Don't pretend mux have more data to deliver on EOI/EOS/ERROR"). So I suggest not backporting it for now, unless the commit above is backported.	2024-10-22 11:16:24 +02:00
Christopher Faulet	ded28f6e5c	BUG/MEDIUM: mux-h2: Remove H2S from send list if data are sent via 0-copy FF When data are sent via the zero-copy data forwarding, in h2_done_ff, we must be sure to remove the H2 stream from the send list if something is sent. It was only performed if no blocking condition was encountered. But we must also do it if something is sent. Otherwise the transfer may be blocked till timeout. This patch must be backported as far as 2.9.	2024-10-22 08:00:32 +02:00
Christopher Faulet	529e4f36a3	BUG/MEDIUM: stats-html: Never dump more data than expected during 0-copy FF During the zero-copy data forwarding, the caller specifies the maximum amount of data the producer may push. However, the HTML stats applet does not use it and can fill all the free space in the buffer. It is especially an issue when the consumer is limited by a flow control, like the H2, because we may emit a too large DATA frame in this case. It is especially visible with big buffers (for instance 32k). In the early age of zero-copy data forwarding, the caller was responsible for passing a properly resized buffer. During the different refactoring steps, this has changed but the HTML stats applet was not updated accordingly. To fix the bug, the buffer used to dump the HTML page is resized to be sure not too much data are dumped. This patch should solve the issue #2757. It must be backported to 3.0.	2024-10-22 08:00:32 +02:00
Willy Tarreau	f2c415cec1	MINOR: debug: add "debug dev counters" to list code counters Issuing "debug dev counters" on the CLI will now scan all existing counters, and report their count, type, location, function name, the condition and an optional comment passed to the macro. The command takes a number of arguments: - "show": this is the default, it will just list the counters - "reset": will reset the matching counters instead of listing them - "all": by default, only non-zero counters are listed. With "all", they are all listed - "bug": restrict the reset or dump to counters of type "BUG" (BUG_ON usually) - "chk": restrict the reset or dump to counters of type "CHK" (CHECK_IF) - "cnt": restrict the reset or dump to counters of type "CNT" (COUNT_IF) The types may be cumulated, and the option entered in any order. Here's an example of the output of "debug dev counters show all bug": Count Type Location function(): "condition" [comment] 0 BUG ring.h:114 ring_dup(): "max > ring_size(dst)" 0 BUG vecpair.h:223 vp_getblk_ofs(): "ofs >= v1->len + v2->len" 0 BUG buf.h:395 b_add(): "b->data + count > b->size" 0 BUG buf.h:106 b_room(): "b->data > b->size" 0 BUG task.h:328 _task_queue(): "(ulong)caller & 1" 0 BUG task.h:324 _task_queue(): "task->tid != tid" 0 BUG task.h:313 _task_queue(): "(ulong)caller & 1" (...) This is expected to be convenient combined with the use and abuse of COUNT_IF() at select locations.	2024-10-21 19:17:55 +02:00
Willy Tarreau	da66c42f65	MINOR: debug: add a new debug macro COUNT_IF() This macro works exactly like BUG_ON() except that it never logs anything nor crashes, it only implements an atomic counter that is incremented on every call. This can be used to count a number of unlikely events that are worth checking at run time on setups showing unusual and unreproducible behaviors.	2024-10-21 19:14:07 +02:00
Willy Tarreau	776fd03509	MEDIUM: debug: add match counters for BUG_ON/WARN_ON/CHECK_IF These macros do not always kill the process, and sometimes it would be nice to know if some match or not, and how many times (especially for the CHECK_IF one). This commit adds a new section "dbg_cnt" made of structs that contain function name, file name, line number, check type, condition and match count. A new macro __DBG_COUNT() adds one to the counter, and is placed inside _BUG_ON() and _BUG_ON_ONCE(). It's worth noting that the exact type of the check is not very precise but in practice we don't care, as most checks will cause the process to die anyway unless they're of type _BUG_ON_ONCE() (used by CHECK_IF by default). All of this is limited to !defined(USE_OBSOLETE_LINKER) because we're creating a section, thus we need a modern linker to be able to scan this section later. Doing so adds ~50kB to the executable due to the ~1266 BUG_ON() and others placed there. That's not huge in comparison to the visibility it can provide.	2024-10-21 19:14:07 +02:00
Willy Tarreau	8844ed2009	CLEANUP: debug: make the BUG_ON() macros check the condition in the outer one The BUG_ON() macros are made of two levels so as to resolve the condition to a string. However this doesn't offer much flexibility for performing other operations when the condition is validated, so let's adjust them so that the condition is checked in the outer macro and the operations are performed in the inner one.	2024-10-21 18:17:25 +02:00
Amaury Denoyelle	68c8c91023	BUG/MINOR: mux-quic: do not close STREAM with empty FIN if no data sent A stream may be shut without any HTX EOM reported to report a proper closure. This is the case for QCS instances flagged with QC_SF_UNKNOWN_PL_LENGTH. Shut is performed with an empty FIN emission instead of a RESET_STREAM. This has been implemented since the following patch : `24962dd178` BUG/MEDIUM: mux-quic: do not emit RESET_STREAM for unknown length However, in case of HTTP/3, an empty FIN should only be done after a full message is emitted, which requires at least a HEADERS frame. If an empty FIN is emitted without it, client may interpret this as invalid and close the connection. To prevent this, fallback to a RESET_STREAM emission if no data were emitted on the stream. This was reproduced using ngtcp2-client with 10% loss (-r 0.1) on a remote host, with httpterm request "/?s=100k&C=1&b=0&P=400". An error ERR_H3_FRAME_UNEXPECTED is returned by ngtcp2-client when the bug occurs. Note that this change is incomplete. The message validity depends solely on the application protocol in use. As such, a new app_ops callback should be implemented to ensure the stream is closed accordingly. However, this first patch ensures that at least HTTP/3 case is valid while keeping a minimal backport process. This should be backported up to 2.8.	2024-10-21 11:24:38 +02:00
Amaury Denoyelle	b200d3d80b	MINOR: mux-quic: simplify sending of empty STREAM FIN An empty STREAM frame can be emitted by QUIC MUX to notify about a delayed FIN when there is no data left to transmit. This requires a tedious comparison on stream offset in qmux_ctrl_send() to ensure an empty stream frame is not always considered as retransmitted, which is necessary to locally close the QCS instance. Simplify this by unsubscribing from the streamdesc layer when the QCS is locally closed on FIN transmission notification. This prevents all future retransmitted frames from being reported to the QCS instance, especially any potentially retransmitted empty FIN.	2024-10-21 11:21:07 +02:00
Valentine Krasnobaeva	af1d170122	BUG/MINOR: mworker: fix mworker-max-reloads parser Before this patch, when a wrong argument was provided in the configuration for the mworker-max-reloads keyword, the parser showed the errors below on stderr: [WARNING] (1820317) : config : parsing [haproxy.cfg:154] : (null)parsing [haproxy.cfg:154] : 'mworker-max-reloads' expects an integer argument. When two arguments were provided by mistake instead of one, this also triggered a buggy error message: [ALERT] (1820668) : config : parsing [haproxy.cfg:154] : 'mworker-max-reloads' cannot handle unexpected argument '45'. [WARNING] (1820668) : config : parsing [haproxy.cfg:154] : (null) So, as 'mworker-max-reloads' is parsed in discovery mode by the master process, let's now align its parser with all the others that may be called in this mode. This way, when there are too many args or the argument isn't a valid integer, we return proper error codes to the global section parser and messages are formatted properly. This fix should be backported to all stable versions.	2024-10-21 10:46:58 +02:00
Ilya Shipitsin	8a1aabb133	CI: modernize macos builds to macos-15 macos-15 support was announced a few months ago: https://github.com/github/roadmap/issues/986	2024-10-21 07:54:38 +02:00
Ilya Shipitsin	50cf89ad5c	CI: bump development builds explicitely to Ubuntu 24.04 Initially we agreed to split builds into "latest" for the development branch and fixed 22.04 for stable branches. It got broken when the "latest" label migrated from ubuntu-22 to ubuntu-24 ... because of the build cache. The cache key is built using the runner label, and it was not prepared to use the same "latest" cache from ubuntu 22 on ubuntu 24. To make things clear, let's stick explicitly to ubuntu 24.	2024-10-21 07:54:35 +02:00
Ilya Shipitsin	b6491ab19f	CI: prepare Coverity build for Ubuntu 24 PCRE2 is recommended, PCRE was chosen for no reason. GHA Ubuntu 22 images include both libs, but recent Ubuntu 24 does not. Let us prepare for Ubuntu 24	2024-10-21 07:54:32 +02:00
Willy Tarreau	9aa86b9dbd	BUILD: mux-h2/traces: fix build on 32-bit due to size of the DATA frame Commit `cf3fe1eed` ("MINOR: mux-h2/traces: print the size of the DATA frames") added the size of the DATA frame to the traces. Unfortunately it uses ullong instead of ulong to cast a pointer, which breaks the build on 32-bit platforms. Let's just switch it to ulong which works on both.	2024-10-21 04:17:59 +02:00
Willy Tarreau	278b9613a3	MEDIUM: debug: on panic, make the target thread automatically allocate its buf One main problem with panic dumps is that they're filling the dumping thread's trash, and that the global thread_dump_buffer is too small to catch enough of them. Here we're proceeding differently. When dumping threads for a panic, we're passing the magic value 0x2 as the buffer, and it will instruct the target thread to allocate its own buffer using get_trash_chunk() (which is signal safe), so that each thread dumps into its own buffer. Then the thread will wait for the buffer to be consumed, and will assign its own thread_dump_buffer to it. This way we can simply dump all threads' buffers from gdb like this: (gdb) set $t=0 while ($t < global.nbthread) printf "%s\n", ha_thread_ctx[$t].thread_dump_buffer.area set $t=$t+1 end For now we make it wait forever since it's only called on panic and we want to make sure the thread doesn't leave and continues to use that trash buffer or do other nasty stuff. That way the dumping thread will make all of them die. This would be useful to backport to the most recent branches to help troubleshooting. It backports well to 2.9, except for some trivial context in tinfo-t.h for an updated comment. 2.8 and older would also require TAINTED_PANIC. The following previous patches are required: MINOR: debug: make mark_tainted() return the previous value MINOR: chunk: drop the global thread_dump_buffer MINOR: debug: split ha_thread_dump() in two parts MINOR: debug: slightly change the thread_dump_pointer signification MINOR: debug: make ha_thread_dump_done() take the pointer to be used MINOR: debug: replace ha_thread_dump() with its two components	2024-10-19 16:01:52 +02:00
Willy Tarreau	afeac4bc02	MINOR: debug: replace ha_thread_dump() with its two components At the few places we were calling ha_thread_dump(), now we're calling separately ha_thread_dump_fill() and ha_thread_dump_done() once the data are consumed.	2024-10-19 15:42:34 +02:00
Willy Tarreau	d7c34ba479	MINOR: debug: make ha_thread_dump_done() take the pointer to be used This will allow the caller to decide whether to definitely clear the pointer and release the thread, or to leave it unlocked so that it's easy to analyse from the struct (the goal will be to use that in panic() so that cores are easy to analyse).	2024-10-19 15:42:07 +02:00
Willy Tarreau	091de0f9b2	MINOR: debug: slightly change the thread_dump_pointer signification Now the thread_dump_pointer is returned ORed with 1 once done, or NULL when cancelled (for now no one cancels). The goal will be to permit the callee to provide its own pointer. The ha_thread_dump_fill() function now returns the buffer pointer that was used (without OR 1) or NULL, for ease of use from the caller.	2024-10-19 15:42:07 +02:00
Willy Tarreau	2036f5bba1	MINOR: debug: split ha_thread_dump() in two parts We want to have a function to trigger the dump and another one to wait for it to be completed. This will be important to permit panic dumps to be done on local threads. For now this does not change anything, as the function still calls the two new functions one after the other.	2024-10-19 15:42:07 +02:00
Willy Tarreau	a6698304e0	MINOR: chunk: drop the global thread_dump_buffer This variable is not very useful and is confusing anyway. It was mostly used to detect that a panic dump was still in progress, but we can now check mark_tainted() for this. The pointer was set to one of the dumping thread's trash chunks. Let's temporarily continue to copy the dumps to that trash, we'll remove it later.	2024-10-19 15:42:00 +02:00
Willy Tarreau	8e048603d1	MINOR: debug: make mark_tainted() return the previous value Since mark_tainted() uses atomic ops to update the tainted status, let's make it return the prior value, which will allow the caller to detect if it's the first one to set it or not.	2024-10-19 15:13:47 +02:00
Willy Tarreau	84340d108b	OPTIM: buffers: avoid a useless wrapping check for ofs == 0 As mentioned in previous commit, b_peek_ofs() performs a wrapping check but is often called with ofs == 0 as a constant. We can detect this case with __builtin_constant_p() so it makes sense to use it. A test shows a size reduction of about 320 bytes, which is not much, but it happens in hot code paths, and each 16 bytes reduction indicates an eliminated conditional branch. Some clear winners are ci_getblk_nc() (-48 bytes), h2c_dec_hdrs (-141B), h1_copy_msg_data (-124B), tcpcheck_spop_expect_hello (-80B), h1_parse_msg_data (-44B). These ones will definitely benefit from doing less conditional jumps.	2024-10-18 18:42:47 +02:00
Willy Tarreau	fca212292a	CLEANUP: buffers: simplify b_get_varint() The function is an exact copy of b_peek_varint() with ofs==0 and doing a b_del() at the end. We can simply call that other one and delete the contents. It turns out that the code is bigger with this change because b_peek_varint() passes its offset to b_peek() which performs a wrapping check. When ofs==0 the wrapping cannot happen, but there's no real way to tell that to the compiler. Instead conditioning the if() in b_peek() with (!__builtin_constant_p(ofs) || ofs) does the job, but it's not worth it at the moment since we have no users of b_get_varint() for now. Let's just stick to the simple normal code.	2024-10-18 18:28:39 +02:00
Willy Tarreau	8b5a1fd1fc	BUILD: buffers: keep b_getblk_nc() and b_peek_varint() in buf.h Some large functions were moved to buf.c by commit `ac66df4e2` ("REORG: buffers: move some of the heavy functions from buf.h to buf.c"). However, as found by Amaury, haring doesn't build anymore. Upon close inspection, b_getblk_nc() isn't that big since it's very much inlinable, and a part of its apparently large size comes from the BUG_ON_HOT() that were implemented. Regarding b_peek_varint(), it doesn't have any dependency and is used only at 4 places in the DNS code, so its loop will not have big impacts, and the rest around can be optimised away by the compiler so it remains relevant to keep it inlined. Also it can serve as a base to deduplicate the code in b_get_varint(). No backport needed.	2024-10-18 17:53:25 +02:00
Dragan Dosen	f33e9079a9	MINOR: arg: add an argument type for identifier The ARGT_ID argument type may now be used to set a custom resolve function in order to help resolve the argument string value. If the custom resolve function is not set, the behavior is the same as of type ARGT_STR.	2024-10-18 14:30:24 +02:00
Dragan Dosen	40ab88899c	BUG/MINOR: sample: free err2 in smp_resolve_args for type ARGT_REG The err2 may be leaking memory in case an error occurred as a result of regex_comp() call.	2024-10-18 14:29:56 +02:00
Aurelien DARRAGON	9262b7109e	CLEANUP: http_ext: remove useless BUG_ON() in http_handle_xot_header() A useless BUG_ON() statement was left in a conditional block that already checks that the condition cannot be met within the block. Remove the useless BUG_ON().	2024-10-17 17:25:06 +02:00
Aurelien DARRAGON	d28d016f43	MINOR: http_ext: implement rfc7239_{nn,np} converters "option forwarded" provides a convenient way to automatically insert rfc7239 forwarded header to requests sent to servers. On the other hand, manually crafting the header is quite complicated due to specific formatting rules that must be followed as per rfc7239. However, sometimes it may be necessary to craft the header manually, for instance if it has to be conditional or based on parameters that "option forwarded" doesn't provide. To ease this task, in this patch we implement rfc7239_nn and rfc7239_np which are respectively meant to craft nodename: nodeport values, specifically intended to manually build rfc7239 'for' and 'by' header fields while ensuring rfc7239 compliancy. Example: # build RFC-compliant 7239 header: http-request set-var-fmt(txn.forwarded) "for=\"%[ipv6(::1),rfc7239_nn]:%[str(8888),rfc7239_np]\";host=\"haproxy.org\";proto=http" # check RFC-compliancy: http-request set-var(txn.test) "var(txn.forwarded),debug(ok,stderr),rfc7239_is_valid,debug(ok,stderr)" # stderr output: # [debug] ok: type=str <for="[::1]:_8888";host="haproxy.org";proto=http> # [debug] ok: type=bool <1> See documentation for more info and examples.	2024-10-17 17:24:58 +02:00
Aurelien DARRAGON	45cbbdc845	DOC: config: fix rfc7239 forwarded typo in desc replace specicy with specify in rfc7239 forwarded option description. Multiple occurrences were found. May be backported in 2.8.	2024-10-17 17:24:51 +02:00
Frederic Lecaille	b1af5dabf0	BUG/MEDIUM: quic: avoid freezing 0RTT connections This issue came with this commit: `f627b92` BUG/MEDIUM: quic: always validate sender address on 0-RTT and could be easily reproduced with picoquic QUIC client with -Q option which splits a big ClientHello TLS message into two Initial datagrams. A second condition must be fulfilled to reproduce this issue: picoquic must not send the token provided by haproxy (NEW_TOKEN). To do that, haproxy must be patched to prevent it from sending such tokens. Under these conditions, if haproxy has enough time to reply to the first Initial datagram, when it receives the second Initial datagram it sends a Retry packet. Then the client ignores the Retry packet as mentioned by RFC 9000: 17.2.5.2. Handling a Retry Packet A client MUST accept and process at most one Retry packet for each connection attempt. After the client has received and processed an Initial or Retry packet from the server, it MUST discard any subsequent Retry packets that it receives. On its side, haproxy has closed the connection. When it receives the second Initial datagram, it opens a new connection but with Initial packets it cannot decrypt (wrong ODCID) leaving the client without response. To fix this, as the aim of the token (NEW_TOKEN) sent by haproxy is to validate the peer address, in place of closing the connection when no token was received for a 0RTT connection, one leaves this validation to the handshake process. Indeed, the peer address is validated during the handshake when a valid handshake packet is received by the listener. But as one does not want haproxy to process 0RTT data when no token was received, one does not accept the connection before the successful handshake completion. In addition to this, the 0RTT packets are not released after successful handshake completion when no token was received to leave a chance to haproxy to process these 0RTT data in such case (see quic_conn_io_cb()). Must be backported as far as 2.9.	2024-10-17 15:04:06 +02:00
Frederic Lecaille	c7f14a38f5	MINOR: quic: send new tokens (NEW_TOKEN) even for 1RTT sessions Tokens are sent when opening a connection, just after the handshake, to be possibly reused by the peer for the next connection. They are used to validate the peer address during the 0RTT connection openings. But there is no reason to reserve this feature to 0RTT connections. This patch modifies quic_build_post_handshake_frames() to do so.	2024-10-17 15:04:06 +02:00
Frederic Lecaille	19aa320f64	BUG/MINOR: quic: avoid leaking post handshake frames This bug came with this commit: `f627b92` BUG/MEDIUM: quic: always validate sender address on 0-RTT If an error happens in quic_build_post_handshake_frames() during the code executed for the NEW_TOKEN frame allocation, some frames could leak because of the wrong label used to interrupt this function asap. Replace the "goto leave" by "goto err" to deallocate such frames to fix this issue. Must be backported as far as 2.9.	2024-10-17 15:04:06 +02:00
Christopher Faulet	e7be13da87	REGTESTS: Never reuse server connection in http-messaging/truncated.vtc A "Connection: close" header is added to responses to avoid any connection reuse. This should avoid errors on the client side.	2024-10-17 14:44:01 +02:00
Christopher Faulet	52a3d807fc	BUG/MAJOR: filters/htx: Add a flag to state the payload is altered by a filter When a filter is registered on the data, it means it may change the payload length by rewriting data. It means consumers of the message cannot trust the expected length of payload as announced by the producer. The commit `8bd835b2d2` ("MEDIUM: filters/htx: Don't rely on HTX extra field if payload is filtered") was pushed to solve this issue. When the HTTP payload of a message is filtered, the extra field is set to 0 to be sure it will never be used by error by any consumer. However, it is not enough. Indeed, the filters must be called before forwarding some data. They cannot be by-passed. But if a consumer is unable to flush the HTX message, some outgoing data can remain blocked in the channel's buffer. If some new data are then pushed because there is some room in the channel's buffer, the producer will set the HTX extra field. At this stage, if the consumer is unblocked and can send again data, it is possible to call it to forward outgoing data blocked in the channel's buffer before waking the stream up to filter new input data. It is the purpose of the data fast-forwarding. In this case, the HTX extra field will be seen by the consumer. It is unexpected and leads to undefined behavior. One consequence of this bug is to perform a wrong chunking on compressed messages, leading to processing errors at the end of the message, reported as "ID--" in logs. To fix the bug, a HTX flag is added to state the payload of the current HTX message is altered. When this flag is set (HTX_FL_ALTERED_PAYLOAD), the HTX extra field must not be trusted. And to keep things simple, when this flag is set, the HTX extra field is automatically set to 0 when the HTX message is loaded, in htxbuf() function. It is probably the least intrusive way to fix the bug for now. But this part must be reviewed to save meta-info of the HTX message outside of the message itself. This commit should solve the issue #2741. It must be backported as far as 2.9.	2024-10-17 13:54:54 +02:00
Christopher Faulet	0fcfed9e23	BUG/MEDIUM: stconn: Check FF data of SC to perform a shutdown in sc_notify() In sc_notify() function, the consumer side of the SC is tested to verify if we must perform a shutdown on the endpoint. To do so, no output data must be present in the buffer and in the iobuf. However, there is a bug here, the iobuf of the opposite SC is tested instead of the one of the current SC. So a shutdown can be performed on the endpoint while there are still output data in the iobuf that must be sent. Concretely, it can only be data blocked in a pipe. Because of this bug, data blocked in the pipe will never be sent. I've not tested but I guess this may block the stream during the client or server timeout. This patch must be backported as far as 2.9.	2024-10-17 13:53:40 +02:00
Christopher Faulet	6790067e79	BUG/MINOR: http-ana: Don't report a server abort if response payload is invalid If a parsing error is reported by the mux on the response payload, a proxy error (PRXCOND) must be reported instead of a server abort (SRVCL). Because of this bug, invalid responses may be reported as "SD--" or "SL--" in logs instead of "PD--" or "PL--". This patch must be backported to all stable versions.	2024-10-17 13:53:40 +02:00
Christopher Faulet	f98feda53f	MINOR: mux-h1: Add a trace on shutdown when keep-alive is not possible When the stream is shut down, some tests are performed to know if the connection must also be closed or not. There are trace messages for all cases, except for the default one: Abort or close-mode. Thanks to this patch, there is now a message too in this case.	2024-10-17 13:53:40 +02:00
Christopher Faulet	2c82ca60c6	MINOR: mux-h1: Show the SD iobuf in trace messages on stream send events Info about the SD iobuf are now dumped in trace messages when a stream send event is processed. It is a useful information to debug zero-copy forwarding issues.	2024-10-17 13:53:40 +02:00


24126 Commits