Move the QUIC conn (struct quic_conn) initialization from quic_sock_accept_conn()
to qc_lstnr_pkt_rcv() as this is done for the server part.
Move the timer initialization to ->start xprt callback to ensure the connection
context is done : it is initialized by the ->accept callback which may be run
by another thread than the one for the I/O handler which also run ->start.
Move the call to SSL_set_quic_transport_params() from the listener I/O dgram
handler to the ->init() callback of the xprt (qc_conn_init()) which initializes
its context where is stored the SSL context itself, needed by
SSL_set_quic_transport_params(). Furthermore this is already what is done for the
server counterpart of ->init() QUIC xprt callback. As the ->init() may be run
by another thread than the one for the I/O handler, the xprt context could
not be potentially already initialized before calling SSL_set_quic_transport_params()
from the I/O handler.
The name the maximum packet size transport parameter was ambiguous and replaced
by maximum UDP payload size. Our code would be also ambiguous if it does not
reflect this change.
Deactivate the action of this callback at this time. I am not sure
we will keep it for QUIC as it does not really make sense for QUIC:
the QUIC packet are already recvfrom()'ed by the low level I/O handler
used for all the connections.
This file has been derived from mux_h2.c removing all h2 parts. At
QUIC mux layer, there must not be any reference to http. This will be the
responsability of the application layer (h3) to open streams handled by the mux.
We move ->params transport parameters to ->rx.params. They are the
transport parameters which will be sent to the peer, and used for
the endpoint flow control. So, they will be used to received packets
from the peer (RX part).
Also move ->rx_tps transport parameters to ->tx.params. They are the
transport parameter which are sent by the peer, and used to respect
its flow control limits. So, they will be used when sending packets
to the peer (TX part).
This bug may occur when displaying streams traces. It came with this commit:
242fb1b63 ("MINOR: quic: Drop packets with STREAM frames with wrong direction.").
The current "ADD" vs "ADDQ" is confusing because when thinking in terms
of appending at the end of a list, "ADD" naturally comes to mind, but
here it does the opposite, it inserts. Several times already it's been
incorrectly used where ADDQ was expected, the latest of which was a
fortunate accident explained in 6fa922562 ("CLEANUP: stream: explain
why we queue the stream at the head of the server list").
Let's use more explicit (but slightly longer) names now:
LIST_ADD -> LIST_INSERT
LIST_ADDQ -> LIST_APPEND
LIST_ADDED -> LIST_INLIST
LIST_DEL -> LIST_DELETE
The same is true for MT_LISTs, including their "TRY" variant.
LIST_DEL_INIT keeps its short name to encourage to use it instead of the
lazier LIST_DELETE which is often less safe.
The change is large (~674 non-comment entries) but is mechanical enough
to remain safe. No permutation was performed, so any out-of-tree code
can easily map older names to new ones.
The list doc was updated.
No need to keep this flag apart any more, let's merge it into the global
state. The CLI's output state was extended to 6 digits and the linger/cloned
flags moved inside the parenthesis.
For a long time we've had fdtab[].ev and fdtab[].state which contain two
arbitrary sets of information, one is mostly the configuration plus some
shutdown reports and the other one is the latest polling status report
which also contains some sticky error and shutdown reports.
These ones used to be stored into distinct chars, complicating certain
operations and not even allowing to clearly see concurrent accesses (e.g.
fd_delete_orphan() would set the state to zero while fd_insert() would
only set the event to zero).
This patch creates a single uint with the two sets in it, still delimited
at the byte level for better readability. The original FD_EV_* values
remained at the lowest bit levels as they are also known by their bit
value. The next step will consist in merging the remaining bits into it.
The whole bits are now cleared both in fd_insert() and _fd_delete_orphan()
because after a complete check, it is certain that in both cases these
functions are the only ones touching these areas. Indeed, for
_fd_delete_orphan(), the thread_mask has already been zeroed before a
poller can call fd_update_event() which would touch the state, so it
is certain that _fd_delete_orphan() is alone. Regarding fd_insert(),
only one thread will get an FD at any moment, and it as this FD has
already been released by _fd_delete_orphan() by definition it is certain
that previous users have definitely stopped touching it.
Strictly speaking there's no need for clearing the state again in
fd_insert() but it's cheap and will remove some doubts during some
troubleshooting sessions.
Introduce a new XPRT method, start(). The init() method will now only
initialize whatever is needed for the XPRT to run, but any action the XPRT
has to do before being ready, such as handshakes, will be done in the new
start() method. That way, we will be sure the full stack of xprt will be
initialized before attempting to do anything.
The init() call is also moved to conn_prepare(). There's no longer any reason
to wait for the ctrl to be ready, any action will be deferred until start(),
anyway. This means conn_xprt_init() is no longer needed.
When tasklets were derived from tasks, there was no immediate need for
the scheduler to know their status after execution, and in a spirit of
simplicity they just started to always return NULL. The problem is that
it simply prevents the scheduler from 1) accounting their execution time,
and 2) keeping track of their current execution status. Indeed, a remote
wake-up could very well end up manipulating a tasklet that's currently
being executed. And this is the reason why those handlers have to take
the idle lock before checking their context.
In 2.5 we'll take care of making tasklets and tasks work more similarly,
but trouble is to be expected if we continue to propagate the trend of
returning NULL everywhere, especially if some fixes relying on a stricter
model later need to be backported. For this reason this patch updates all
known tasklet handlers to make them return NULL only when the tasklet was
freed. It has no effect for now and isn't even guaranteed to always be
100% safe but it puts the code into the right direction for this.
It's been too short for quite a while now and is now full. It's still
time to extend it to 32-bits since we have room for this without
wasting any space, so we now gained 16 new bits for future flags.
The values were not reassigned just in case there would be a few
hidden u16 or short somewhere in which these flags are placed (as
it used to be the case with stream->pending_events).
The patch is tagged MEDIUM because this required to update the task's
process() prototype to use an int instead of a short, that's quite a
bunch of places.
This makes the code more readable and less prone to copy-paste errors.
In addition, it allows to place some __builtin_constant_p() predicates
to trigger a link-time error in case the compiler knows that the freed
area is constant. It will also produce compile-time error if trying to
free something that is not a regular pointer (e.g. a function).
The DEBUG_MEM_STATS macro now also defines an instance for ha_free()
so that all these calls can be checked.
178 occurrences were converted. The vast majority of them were handled
by the following Coccinelle script, some slightly refined to better deal
with "&*x" or with long lines:
@ rule @
expression E;
@@
- free(E);
- E = NULL;
+ ha_free(&E);
It was verified that the resulting code is the same, more or less a
handful of cases where the compiler optimized slightly differently
the temporary variable that holds the copy of the pointer.
A non-negligible amount of {free(str);str=NULL;str_len=0;} are still
present in the config part (mostly header names in proxies). These
ones should also be cleaned for the same reasons, and probably be
turned into ist strings.
In FD dumps it's often very important to figure what upper layer function
is going to be called. Let's export the few I/O callbacks that appear as
tasklet functions so that "show fd" can resolve them instead of printing
a pointer relative to main. For example:
1028 : st=0x21(R:rA W:Ra) ev=0x01(heopI) [lc] tmask=0x2 umask=0x2 owner=0x7f00b889b200 iocb=0x65b638(sock_conn_iocb) back=0 cflg=0x00001300 fe=recv mux=H2 ctx=0x7f00c8824de0 h2c.st0=FRH .err=0 .maxid=795 .lastid=-1 .flg=0x0000 .nbst=0 .nbcs=0 .fctl_cnt=0 .send_cnt=0 .tree_cnt=0 .orph_cnt=0 .sub=1 .dsi=795 .dbuf=0@(nil)+0/0 .msi=-1 .mbuf=[1..1|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] xprt=SSL xprt_ctx=0x7f00c86d0750 xctx.st=0 .xprt=RAW .wait.ev=1 .subs=0x7f00c88252e0(ev=1 tl=0x7f00a07d1aa0 tl.calls=1047 tl.ctx=0x7f00c8824de0 tl.fct=h2_io_cb) .sent_early=0 .early_in=0
This is issue is due to the fact that when we call the function
responsible of building CRYPTO frames to fill a buffer, the Length
field of this packet did not take into an account the trailing 16 bytes for
the AEAD tag. Furthermore, the remaining <room> available in this buffer
was not decremented by the CRYPTO frame length, but only by the CRYPTO data length
of this frame.
Add traces to have an idea why this function may fail. In fact
in never fails when the passed parameters are correct, especially the
lengths. This is not the case when a packet is not correctly built
before being encrypted.
Even if the size of frames built by qc_build_frm() are computed so that
not to overflow a buffer, do not rely on this and always makes a packet
build fails if we could not build a frame.
Also add traces to have an idea where qc_build_frm() fails.
Fixes a memory leak in qc_build_phdshk_apkt().
At least displays the SSL alert error code passed to ->ssl_send_alert()
QUIC BIO method and the SSL encryption level. This function is newly called
when using picoquic client with a recent version of BoringSSL (Nov 19 2020).
This is not the case with OpenSSL with 32 as QUIC draft implementation.
Reorder by increasing type the switch/case in qc_parse_pkt_frms()
which is the high level frame parser.
Add new STREAM_X frame types to support some tests with ngtcp2 client.
Add ->flags to the QUIC frame parser as this has been done for the builder so
that to flag RX packets as ack-eliciting at low level. This should also be
helpful to maintain the code if we have to add new flags to RX packets.
Remove the statements which does the same thing as higher level in
qc_parse_pkt_frms().
Remove ->ifcdata which was there to control the CRYPTO data sent to the
peer so that not to saturate its reception buffer. This was a sort
of flow control.
Add ->prep_in_flight counter to the QUIC path struct to control the
number of bytes prepared to be sent so that not to saturare the
congestion control window. This counter is increased each time a
packet was built. This has nothing to see with ->in_flight which
is the real in flight number of bytes which have really been sent.
We are olbiged to maintain two such counters to know how many bytes
of data we can prepared before sending them.
Modify traces consequently which were useful to diagnose issues about
the congestion control window usage.
As there is a lot of information in this protocol, this is not
easy to make the traces readable. We remove here a few of them and
shorten some line shortening the variable names.
This patch imports all the C files for QUIC protocol implementation with few
modifications from 20200720-quic branch of quic-dev repository found at
https://github.com/haproxytech/quic-dev.
Traces were implemented to help with the development.