With variable connection limits, it's not possible to accurately determine
whether the mux is still in use by comparing usage and max to be equal due
to the fact that one determines the capacity and the other one takes care
of the context. This can cause some connections to be dropped before they
reach their stream ID limit.
It seems it could also cause some connections to be terminated with
streams still alive if the limit was reduced to match the newly computed
avail_streams() value, though this cannot yet happen with existing muxes.
Instead let's switch to usage reports and simply check whether connections
are both unused and available before adding them to the idle list.
This should be backported to 1.9.
The new flag SI_FL_KILL_CONN is now set by the rare actions which
deliberately want the whole connection (and not just the stream) to be
killed. This is only used for "tcp-request content reject",
"tcp-response content reject", "tcp-response content close" and
"http-request reject". The purpose is to desambiguate the close from
a regular shutdown. This will be used by the next patches.
If we're adding a connection to the server orphan idle list, don't forget
to remove the CO_FL_SESS_IDLE flag, or we will assume later it's still
attached to a session.
This should be backported to 1.9.
There's a very small but existing uncertainty window when waking another
thread up where it is possible for task_wakeup() not to wake the other
task up because it's still running while this once is in the process of
finishing and loses its TASK_RUNNING flag. In this case the wakeup will
be missed.
The problem is that we have a single flag to store 3 states, since the
transition from running to sleeping isn't atomic. Thus we need to have
another flag to cover this part. This patch introduces TASK_QUEUED to
mention that the task is already in the run queue, running or not. This
bit will be removed while TASK_RUNNING is kept once dequeued, and will
be used when removing TASK_RUNNING to check if the task has been requeued.
It might be possible to slightly improve this but the occurrence rate
is quite low and we don't really need to complexify the scheduler to
optimize for a rare case.
The impact with the current code is very low since we have few inter-
thread wakeups. Most of them are caused by checks killing sessions.
This must be backported to 1.9.
Before the first send() attempt, we should be in SI_ST_CON, not
SI_ST_EST, since we have not yet attempted to send and we are
allowed to retry. This is particularly important with complex
outgoing muxes which can fail during the first send attempt (e.g.
failed stream ID allocation).
It only requires that sess_update_st_con_tcp() knows about this
possibility, as we must not forcefully close a reused connection
when facing an error in this case, this will be handled later.
This may be backported to 1.9 with care after some observation period.
Make "bind" keywork be supported in "peers" sections.
All "bind" settings are supported on this line.
Add "default-bind" option to parse the binding options excepted the bind address.
Do not parse anymore the bind address for local peers on "server" lines.
Do not use anymore list_for_each_entry() to set the "peers" section
listener parameters because there is only one listener by "peers" section.
May be backported to 1.5 and newer.
This patch adds pointer to a struct server to peer structure which
is initialized after having parsed a remote "peer" line.
After having parsed all peers section we run ->prepare_srv to initialize
all SSL/TLS stuff of remote perr (or server).
Remaining thing to do to completely support peer protocol over SSL/TLS:
make "bind" keyword be supported in "peers" sections to make SSL/TLS
incoming connections to local peers work.
May be backported to 1.5 and newer.
When using the peers feature a race condition could prevent
a connection from being properly counted. When this connection
exits it is being "uncounted" nonetheless, leading to a possible
underflow (-1) of the conn_curr stick table entry in the following
scenario :
- Connect to peer A (A=1, B=0)
- Peer A sends 1 to B (A=1, B=1)
- Kill connection to A (A=0, B=1)
- Connect to peer B (A=0, B=2)
- Peer A sends 0 to B (A=0, B=0)
- Peer B sends 0/2 to A (A=?, B=0)
- Kill connection to B (A=?, B=-1)
- Peer B sends -1 to A (A=-1, B=-1)
This fix may be backported to all supported branches.
Openssl switched from aes128 to aes256 since may 2016 to compute
tls ticket secrets used by default. But Haproxy still handled only
128 bits keys for both tls key file and CLI.
This patch permit the user to set aes256 keys throught CLI or
the key file (80 bytes encoded in base64) in the same way that
aes128 keys were handled (48 bytes encoded in base64):
- first 16 bytes for the key name
- next 16/32 bytes for aes 128/256 key bits key
- last 16/32 bytes for hmac 128/256 bits
Both sizes are now supported (but keys from same file must be
of the same size and can but updated via CLI only using a key of
the same size).
Note: This feature need the fix "dec func ignores padding for output
size checking."
When mux->init() fails, session_free() will call it again to unregister
it while it was already done, resulting in null derefs or use-after-free.
This typically happens on out-of-memory conditions during H1 or H2 connection
or stream allocation.
This fix must be backported to 1.9.
The function channel_htx_truncate() can now be used on HTX buffer to truncate
all incoming data, keeping outgoing one intact. This function relies on the
function channel_htx_erase() and htx_truncate().
This patch may be backported to 1.9. If so, the patch "MINOR: channel/htx: Add
the HTX version of channel_truncate()" must also be backported.
HTX versions for functions to test the free space in input against the reserve
have been added. Now, on HTX streams, following functions can be used:
* channel_htx_may_recv
* channel_htx_recv_limit
* channel_htx_recv_max
* channel_htx_full
This patch must be backported in 1.9 because it will be used by a futher patch
to fix a bug.
This function must be called when new incoming data are pushed in the channel's
buffer. It updates the channel state and take care of the fast forwarding by
consuming right amount of data and decrementing "->to_forward" accordingly when
necessary. In fact, this patch just moves a part of ci_putblk in a dedicated
function.
This patch must be backported to 1.9.
Instead of keeping track of the number of connections we're responsible for,
keep track of the number of connections we're responsible for that we are
currently considering idling (ie that we are not using, they may be in use
by other sessions), that way we can actually reuse connections when we have
more connections than the max configured.
When a session adds a connection to its connection list, we used to remove
connections for an another server if there were not enough room for our
server. This can't work, because those lists are now the list of connections
we're responsible for, not just the idle connections.
To fix this, allow for an unlimited number of servers, instead of using
an array, we're now using a linked list.
In si_release_endpoint(), if the end point is a connection, because we don't
know which mux to use it, make sure we close the connection before freeing it,
or else, we'd have a fd left for polling, which would point to a now free'd
connection.
This should be backported to 1.9.
As long-time changes have accumulated over time, the exported functions
of the stream-interface were almost all prefixed "si_<something>" while
most private ones (mostly callbacks) were called "stream_int_<something>".
There were still a few confusing exceptions, which were addressed to
follow this shcme :
- stream_sock_read0(), only used internally, was renamed stream_int_read0()
and made static
- stream_int_notify() is only private and was made static
- stream_int_{check_timeouts,report_error,retnclose,register_handler,update}
were renamed si_<something>.
Now it is clearer when checking one of these if it risks to be used outside
or not.
There was a reference to struct stream in conn_free() for the case
where we're freeing a connection that doesn't have a mux attached.
For now we know it's always a stream, and we only need to do it to
put a NULL in s->si[1].end.
Let's do it better by storing the pointer to si[1].end in the context
and specifying that this pointer is always nulled if the mux is null.
This way it allows a connection to detach itself from wherever it's
being used. Maybe we could even get rid of the condition on the mux.
We most often store the mux context there but it can also be something
else while setting up the connection. Better call it "ctx" and know
that it's the owner's context than misleadingly call it mux_ctx and
get caught doing suspicious tricks.
The SUB_CAN_SEND/SUB_CAN_RECV enum values have been confusing a few
times, especially when checking them on reading. After some discussion,
it appears that calling them SUB_RETRY_SEND/SUB_RETRY_RECV more
accurately reflects their purpose since these events may only appear
after a first attempt to perform the I/O operation has failed or was
not completed.
In addition the wait_reason field in struct wait_event which carries
them makes one think that a single reason may happen at once while
it is in fact a set of events. Since the struct is called wait_event
it makes sense that this field is called "events" to indicate it's the
list of events we're subscribed to.
Last, the values for SUB_RETRY_RECV/SEND were swapped so that value
1 corresponds to recv and 2 to send, as is done almost everywhere else
in the code an in the shutdown() call.
Types DNS_SRVRQ and CS were not referenced in the type to string
conversions, causing possibly misleading outputs in session dumps.
Now instead of showing "NONE" for unknown invalid types names, we
display "!INVAL!" to clear the confusion that may exist in case of
memory corruption for example.
In session, don't keep an infinite number of connection that can idle.
Add a new frontend parameter, "max-session-srv-conns" to set a max number,
with a default value of 5.
Instead of trying to get the session from the connection, which is not
always there, and of course there could be multiple sessions per connection,
provide it with the init() and attach() methods, so that we know the
session for each outgoing stream.
Add a new command, "pool-max-conn" that sets the maximum number of connections
waiting in the orphan idling connections list (as activated with idle-timeout).
Using "-1" means unlimited. Using pools is now dependant on this.
Now that h1 and legacy HTTP are two distinct things, there's no need
to keep the legacy HTTP parsers in h1.c since they're only used by
the legacy code in proto_http.c, and h1.h doesn't need to include
hdr_idx anymore. This concerns the following functions :
- http_parse_reqline();
- http_parse_stsline();
- http_msg_analyzer();
- http_forward_trailers();
All of these were moved to http_msg.c.
Lots of HTTP code still uses struct http_msg. Not only this code is
still huge, but it's part of the legacy interface. Let's move most
of these functions to a separate file http_msg.c to make it more
visible which file relies on what. It's mostly symmetrical with
what is present in http_htx.c.
The function http_transform_header_str() which used to rely on two
function pointers to look up a header was simplified to rely on
two variants http_legacy_replace_{,full_}header(), making both
sides of the function much simpler.
No code was changed beyond these moves.
All the HTX definition is self-contained and doesn't really depend on
anything external since it's a mostly protocol. In addition, some
external similar files (like h2) also placed in common used to rely
on it, making it a bit awkward.
This patch moves the two htx.h files into a single self-contained one.
The historical dependency on sample.h could be also removed since it
used to be there only for http_meth_t which is now in http.h.
There were a number of ugly setsockopt() calls spread all over
proto_http.c, proto_htx.c and hlua.c just to manipulate the front
connection's TOS, mark or TCP quick-ack. These ones entirely relied
on the connection, its existence, its control layer's presence, and
its addresses. Worse, inet_set_tos() was placed in proto_http.c,
exported and used from the two other ones, surrounded in #ifdefs.
This patch moves this code to connection.h and makes the other ones
rely on it without ifdefs.
If we try to receive before the connection is established, we lose the
send event and are not woken up anymore once the connection is established.
This was diagnosed by Olivier.
No backport is needed.
There are some situations where we need to wait for the other side to
be connected. None of the current blocking flags support this. It used
to work more or less by accident using the old flags. Let's add a new
flag to mention we're blocking on this, it's removed by si_chk_rcv()
when a connection is established. It should be enough for now.
The master is not supposed to run (at the moment) any task before the
polling loop, the created tasks should be run only in the workers but in
the master they should be disabled or removed.
No backport needed.
To ease the fast forwarding and the infinte forwarding on HTX proxies, 2
functions have been added to let the channel be almost aware of the way data are
stored in its buffer. By calling these functions instead of legacy ones, we are
sure to forward the right amount of data.
Now, the function htx_from_buf() will set the buffer's length to its size
automatically. In return, the caller should call htx_to_buf() at the end to be
sure to leave the buffer hosting the HTX message in the right state. When the
caller can use the function htxbuf() to get the HTX message without any update
on the underlying buffer.
The small HTX overhead is enough to make the system perform multiple
reads and unaligned memory copies. Here we provide a function whose
purpose is to reduce the apparent room in a buffer by the size of the
overhead for DATA blocks, which is the struct htx plus 2 blocks (one
for DATA, one for the end of message so that small blocks can fit at
once). The muxes using HTX will be encouraged to use this one instead
of b_room() to compute the available buffer room and avoid filling
their demux buf with more data than can fit at once into the HTX
buffer.
This one is used a lot during transfers, let's avoid resetting its
size when there are already data in the buffer since it implies the
size is correct.