Till now we were using si_done_put() upon shutr, but these flags could
be reset upon next activity. Now let's switch to SI_FL_RXBLK_SHUT which
doesn't go away. It's also set in stream_int_update() in case a shutr
condition is detected.
The now unused si_done_put() was removed.
A number of calls to si_cant_put() were used in fact to request being
called back once a buffer is available. These ones are not needed anymore
since si_alloc_ibuf() already sets the SI_FL_RXBLK_BUFF flag when called
in appctx context. Those called with a foreign stream-int are simply turned
to si_rx_buff_blk().
A number of si_cant_put() calls were still present which in fact
indicated that the endpoint is ready; these were thus turned to
si_rx_endp_more().
One other call in the Lua handler indicated that the endpoint wanted to
be blocked until some room is made in the Rx buffer in order to detect
that the connection happened. This is in fact an indication that it
wants to be called once the endpoint is ready, which is the default case
for an applet, so this call was removed.
A useless call to si_cant_put() before appctx_wakeup() in the Lua
applet wakeup call was removed as well, since the first thing that will
be done there is to set the ENDP blocking flag.
If an applet reports being blocked due to any of the channel-side flags,
it is implicitly ready to deliver incoming data. It's better to mark it
so after the return from the applet handler, so that applet developers
don't have to worry about details related to flag ordering.
Instead of first indicating that there's more data to read from the
conn_stream then re-adjusting this info along the function, we now
instead set the status according to the subscription status at the
end. It's easier, more accurate, and less sensitive to intermediary
changes.
This will soon allow us to remove all the si_cant_put() calls that were
placed in the middle to force a subsequent callback and prevent the
function from subscribing to the mux layer.
The stream interface used to conflate a missing buffer and lack of
buffer space into SI_FL_WAIT_ROOM but this causes difficulties as
these cannot be checked at the same moment and are not resolved at
the same moment either. Now we instead mark the buffer as presumably
available using si_rx_buff_rdy() and mark it as unavailable+requested
using si_rx_buff_blk().
The call to si_alloc_buf() was moved after si_stop_put(). This makes
sure that the SI_FL_RX_WAIT_EP flag is cleared on allocation failure so
that the function is called again if the callee fails to do its work.
The SI_FL_WANT_PUT flag is used in an awkward way, sometimes it's
set by the stream-interface to mean "I have something to deliver",
sometimes it's cleared by the channel to say "I don't want you to
send what you have", and it has to be set back once CF_DONT_READ
is cleared. This will have to be split between SI_FL_RX_WAIT_EP
and SI_FL_RXBLK_CHAN. This patch only replaces all uses of the
flag with its natural (but negated) replacement SI_FL_RX_WAIT_EP.
The code is expected to be strictly equivalent. The now unused flag
was completely removed.
This flag is not enough to describe all blocking situations, as can be
seen in each case where we remove it. The muxes have taught us that using
multiple blocking flags in parallel will be much easier, so let's start
to do this now. This patch only renames the flag in order to make the
next changes more readable.
There currently is an optimization in stream_int_notify() consisting
in not trying to forward small bits of data if extra data remain to be
processed. The purpose is to avoid forwarding one chunk at a time if
multiple chunks are available to be parsed at once. It consists in
avoiding sending pending output data if there are still data to be
parsed in the channel's buffer, since process_stream() will have the
opportunity to deal with them all at once.
Not only is this optimization less useful with the new way the connections
work, but it even causes problems like lost events since WAIT_ROOM will
not be removed. And with HTX, it will never be able to update the input
buffer after the first read.
Let's relax the rules now, by always sending if we don't have the
CF_EXPECT_MORE flag (used to group writes), or if the buffer is
already full.
The function used to abuse the internals of mux_pt to retrieve a
conn_stream, which will not work anymore after the idle connection
changes. Let's make it rely on the more reliable cs_get_first()
instead.
This method is used to retrieve the first known good conn_stream from
the mux. It will be used to find the other end of a connection when
dealing with the proxy protocol for example.
Commit d4dd22d ("MINOR: h2: Let user of h2_recv() and h2_send() know xfer
has been done") changed the API without documenting the expected returned
values which appear to come out of nowhere in the code :-( Please don't
do that anymore! The description was recovered from the commit message.
To be used, error messages declared in a default section must be copied when the
parsing of a proxy section starts. But this was only done for frontends.
This patch may be backported to older versions.
There are still some unwelcome synchronous calls to si_cs_recv() in
process_stream(). Let's have a new function si_sync_recv() to perform
a synchronous receive call on a stream interface regardless of the type
of its endpoint, and move these calls there. For now it only implements
conn_streams since it doesn't seem useful to support applets there. The
function implements an extra check for the stream interface to be in an
established state before attempting anything.
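A minimal sketch of the idea (illustrative only, derived from the
description above; the real function may differ) :

    static void si_sync_recv(struct stream_interface *si)
    {
        struct conn_stream *cs;

        if (si->state != SI_ST_EST)
            return;              /* only act on established interfaces */

        cs = objt_cs(si->end);
        if (!cs)
            return;              /* applets are not handled here */

        si_cs_recv(cs);
    }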
In commit f26c26c ("BUG/MEDIUM: stream-int: change the way buffer room
is requested by a stream-int") we used to call si_want_put() at the
end of sess_update_st_con_tcp(), when switching to SI_ST_EST state.
But this is incorrect as there are a few other situations where we
can switch to this state, such as in si_connect() where a connection
reuse is detected, or when directly calling an applet (in which case
that was already covered anyway). For now it doesn't have any side
effect but it could impact connection reuse after the stream-int
changes by stalling an immediately reused connection.
Let's move this flag change to sess_establish() instead, which is the
only place which is always called exactly once on connection setup.
No backport is needed, this is purely 1.9.
In master-worker mode, the socketpair CLI listener of the worker is now
marked unstoppable, which allows connecting to the CLI of an old process
which is in a leaving state, making it possible to debug it.
An unstoppable listener is a listener which won't be stopped during a
soft stop. The unstoppable_jobs variable is incremented and the listener
won't prevent the process from leaving properly.
It is not a good idea to use this feature (the LI_O_NOSTOP flag) with a
listener that needs to be bound again on another process during a soft
reload.
This patch allows a process to properly quit when some jobs are still
active, this feature is handled by the unstoppable_jobs variable, which
must be atomically incremented.
During each new iteration of run_poll_loop() the break condition of the
loop is now (jobs - unstoppable_jobs) == 0.
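In other words, the loop now breaks roughly like this :

    /* in run_poll_loop(): leave only when every remaining job is
     * unstoppable */
    if (!(jobs - unstoppable_jobs))
        break;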
The only usage of this at the moment is to handle the socketpair CLI
of the worker during the stopping of the process. During the soft
stop, we can mark the CLI listener as an unstoppable job and still
handle new connections until every other job is stopped.
The previous commit fedceaf33 ("MINOR: http: Regroup return statements of
http_req_get_intercept_rule at the end") partly fixes the problem, but not
entirely. Because HTTP 103 responses were sent line by line, it was possible
to mix them with others. For instance, an early-hint rule followed by a
redirect rule left the response buffer totally messed up. Furthermore, if we
failed to add the last CRLF finishing the HTTP 103 response because there was
no more space in the buffer, the buffer was left with an unfinished and
invalid message.
This patch fixes the bug by creating a fully formed HTTP 103 response before
trying to push it into the response buffer. If an error occurs during the
copy or if another response was already sent, the HTTP 103 response is
ignored. However, the last point should never happen because, for redirects
and authentication errors, we first try to copy any pending HTTP 103
response.
Instead of having multiple return statements spread here and there in the
middle of the function, we just exit from the loop after setting the right
return code. This leaves a chance to do some work before leaving the
function and is also less error-prone.
This patch fixes a bug introduced in the commit 6b952c810 ("REORG: http: move
http_get_path() to http.c"). In the reorg, the code responsible to skip the
version to only extract the path in the HTTP request was dropped.
No backport is needed, this only affects 1.9.
When splice() reports a pipe full condition, we go through the common
code used to release a possibly empty pipe (which we don't have) and which
immediately tries to allocate a buffer that will never be used. Further,
it may even subscribe to get this buffer if the resources are low. Let's
simply get out of this path when the pipe is full.
This fix could be backported to 1.8 though the code is a bit different
over there.
Since we don't necessarily pass through conn_fd_handler() when reading,
conn_refresh_polling_flags() is not necessarily called when performing
a recv() operation, thus flags like CO_FL_WAIT_ROOM are not cleared.
It happens that si_cs_recv() checks CO_FL_WAIT_ROOM before deciding to
receive into a buffer, to see if the previous rcv_pipe() call failed by
lack of pipe room. The combined effect of these two statements is that
at the end of a file transmission, when there's too little data to
warrant the use of a pipe and the pipe is empty, we refrain from using
rcv_pipe() for the last few bytes, but since CO_FL_WAIT_ROOM is still
present, we don't use rcv_buf() either, and the connection remains
frozen in this state with si_cs_recv() called in loops.
In order to fix this we can simply manually clear CO_FL_WAIT_ROOM when
not using pipe so that the next check sees the result of the previous
operation and not an old one. We could equally call
conn_refresh_polling_flags() but that would be overkill and dangerous
given that it would manipulate the connection's flags under the mux.
Ideally the mux should report this flag into the conn_stream for
cleaner manipulation.
No backport is needed as this is only post 1.9-dev2.
As part of the changes that went into 1.9-dev2 regarding the polling
modifications, the changes consecutive to the removal of the wait_list
from the conn_streams (commit 71384551a) made si_cs_recv() occasionally
return without subscribing to receive events, causing spliced transfers
to randomly fail if the client was at least as fast as the server. This
may remain unnoticed on most deployments since servers are usually close
to haproxy with higher bandwidth than clients have, resulting in buffers
always being full.
In order to reproduce this effect, it is better to do it on the local
machine and to transfer very large objects (hundreds of gigs) over a
single connection, to see it suddenly stall after a few tens of gigs.
Now with this fix it's fine even after 3 TB over a single connection.
No backport is needed.
When we allocate struct stksess, we also allocate memory to store the
associated data before the struct itself.
As the data can be of different types, they can have different sizes. However,
we need the struct stksess to be properly aligned, as it can do 64-bit
loads/stores (including atomic load/stores) on 64-bit platforms, and some of
them don't support unaligned access.
So, when allocating the struct stksess, round the size up to the next
multiple of sizeof(void *), and make sure the struct stksess itself is
properly aligned.
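A minimal sketch of the rounding (illustrative only; the helper name is
hypothetical) :

    #include <stddef.h>

    /* round <sz> up to the next multiple of sizeof(void *) so that the
     * struct stksess stored right after the data area stays aligned */
    static inline size_t round_ptr_size(size_t sz)
    {
        return (sz + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
    }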
Many thanks to Paul Martin for investigating and reporting that bug.
This should be backported to earlier releases.
When configuring the logs with an FD and using the master-worker mode, the
FD was closed upon a reload because it was configured with CLOEXEC. This
led to using the wrong FD for the logs and to closing them, which is
unfortunate since the master relies on the FD being left open across a
reload. The fix is to stop setting CLOEXEC when the FD is inherited.
No backport needed.
This patch implements the http_apply_early_hint_rule() function, responsible
for building HTTP 103 Early Hints responses each time an "early-hint" rule
is matched.
This patch adds an "early_hint" struct to the "arg" union of the "act_rule"
struct and parses the "early-hint" http-request keyword with it, using the
same code as the "(add|set)-header" parser.
When namespaces are disabled, support is still reported because the file
is built with almost nothing in it but built anyway. Instead of extending
the scope of the numerous ifdefs in this file, better avoid building it
when namespaces are disabled. In this case we define my_socketat() as an
inline function mapping directly to socket(). The struct netns_entry
still needs to be defined because it's used by various other functions
in the code.
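The fallback might look like this (hedged sketch; the real code may
differ) :

    #include <sys/socket.h>

    struct netns_entry;   /* still needed by other functions */

    #ifndef USE_NS
    /* without namespace support, ignore <ns> and create a plain socket */
    static inline int my_socketat(const struct netns_entry *ns,
                                  int domain, int type, int protocol)
    {
        return socket(domain, type, protocol);
    }
    #endif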
Splicing was in great part broken over the last few development versions
due to the use of co_data() to detect if data are available in the channel.
But co_data() only looks at buffered data, not spliced data.
channel_is_empty() takes care of both and should be used instead. With this,
splicing works again, but there are still a few cases where transfers
may stall.
No backport is needed.
Subsequent to the recent stream-int updates, we started to consider that
SI_FL_WANT_PUT needs to be set when receipt is enabled, but this is wrong
and results in 100% CPU when an HTTP client stays idle after a keep-alive
request because the stream-int has nothing to provide and nothing to send.
In fact just like for applets this flag should reflect the continuation
of an attempt. So it's si_cs_recv() which should set the flag, and clear
it if it has nothing more to provide. This function is called the first
time in process_stream(), and called again during transfers, so it will
always be up to date during stream_int_update() and stream_int_notify().
As a special case, it should also be set when a connection switches to
the established state. And we should absolutely refrain from calling
si_cs_recv() to re-enable reading, normally just setting this flag
(from within the stream-int's handler or prior to calling si_chk_rcv())
is expected to be OK.
A corner case remains where it was observed that in stream_int_notify() we
can sometimes be called with an empty output channel with SI_FL_WAIT_ROOM
and no CF_WRITE_PARTIAL, so there's no way to detect that we should
re-enable receiving. It's easy to also take care of this condition
there for the time it takes to figure if this situation is expected
or not.
Now it becomes more obvious that relying on a single flag to request
room (or on two flags to arbiter activity) is not workable given the
autonomy of both sides. The mux_h2 has taught us that blocking flags
are much more reliable, require far fewer conditions and are much easier
to deal with. That's probably something to consider quickly in this
area.
No backport is needed.
This format is pretty similar to the previous "short" format except
that it also removes the severity level. Thus only the raw message is
sent. This is suitable for use in containers, where only the raw
information is expected and where the severity is supposed to come
from the file descriptor used.
This format is meant to be used with local file descriptors. It emits
messages only prefixed with a level, removing all the process name,
system name, date and so on. It is similar to the printk() format used
on Linux. It's suitable to be sent to a local logger compatible with
systemd's output format.
Note that the facility is still required but not used, hence it is
suggested to use "daemon" to remind that it's a local logger.
Example :
log stdout format short daemon # send everything to stdout
log stderr format short daemon notice # send important events to stderr
In certain situations it would be desirable to log to an existing file
descriptor, the most common case being a pipe between containers or
processes. The main issue with pipes is that using write() on them will
randomly truncate messages. But there is a trick. By using writev(), we
can atomically deliver or drop a message, which perfectly fits the
purpose. The only caveat is that large messages (4096 bytes on modern
operating systems) may be interleaved with messages from other processes
if using nbproc for example. In practice such messages are rare and most
of the time when users need such type of logging, the load is low enough
for a single process to be running so this is not really a problem.
This logging method thus uses unbuffered writev() calls and uses more
CPU than if it used its own buffer with large writes at once, though this
is not a problem for moderate loads.
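A minimal sketch of the writev()-based emission (illustrative only; the
helper name is hypothetical and the real code builds the full log line
differently) :

    #include <sys/uio.h>
    #include <string.h>

    /* emit one log line atomically on <fd>; returns 0 on success,
     * -1 if the message had to be dropped (e.g. EAGAIN on a pipe) */
    static int fd_log_emit(int fd, const char *msg)
    {
        struct iovec iov[2];

        iov[0].iov_base = (void *)msg;
        iov[0].iov_len  = strlen(msg);
        iov[1].iov_base = (void *)"\n";
        iov[1].iov_len  = 1;

        /* a single writev() on a pipe either delivers the whole message
         * (up to PIPE_BUF bytes) or nothing, so within one process the
         * lines never get truncated nor interleaved */
        if (writev(fd, iov, 2) < 0)
            return -1;
        return 0;
    }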
Logging to a file descriptor attached to a file also works with the side
effect that the process is significantly slowed down during disk accesses
and that it's not possible to rotate the file without restarting the
process. For this reason this option is not offered as a configuration
option, since it would confuse most users, but one could decide to
redirect haproxy's output to a file during debugging sessions. Two aliases
"stdout" and "stderr" are provided, but keep in mind that these are closed
by default in daemon mode.
When logging to a pipe or socket at a high enough rate, some logs will be
dropped and the number of dropped messages is reported in "show info".
It's easy to detect when logs on some paths are lost as sendmsg() will
return EAGAIN. This is particularly true when sending to /dev/log, which
often doesn't support a big logging capacity. Let's keep track of these
and report the total number of dropped messages in "show info".
The error messages used to say something along the lines of "socket logger
2 failed" or "sendmsg logger 2 failed", which is confusing. Let's rephrase
this as "sendmsg() failed for logger 2".
Building on 32 bit gives this :
src/cache.c: In function 'http_action_store_cache':
src/cache.c:466:4: warning: this decimal constant is unsigned only in ISO C90 [enabled by default]
src/cache.c:467:5: warning: this decimal constant is unsigned only in ISO C90 [enabled by default]
src/cache.c: In function 'cache_channel_append_age_header':
src/cache.c:578:2: warning: this decimal constant is unsigned only in ISO C90 [enabled by default]
src/cache.c:579:3: warning: this decimal constant is unsigned only in ISO C90 [enabled by default]
It's because of the definition below added in commit e7a770c ("MINOR:
cache: Add "Age" header.") :
#define CACHE_ENTRY_MAX_AGE 2147483648
Just appending "U" to mark it unsigned is enough to fix it. This only
affects 1.9, no backport is needed.
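The corrected definition thus reads :

    #define CACHE_ENTRY_MAX_AGE 2147483648U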
In 1.8, commit 45a66cc ("MEDIUM: config: ensure that tune.bufsize is at
least 16384 when using HTTP/2") tried to avoid an annoying issue making
H2 fail when haproxy is built with default buffer sizes smaller than 16kB,
which used to be the case for a very long time. Sadly, the test only sees
when NPN/ALPN exactly match "h2" and not when it's combined like
"h2,http/1.1" nor "http/1.1,h2". We can safely use strstr() there because
the string is prefixed by the token's length (0x02) which is unambiguous
as it cannot be part of any other token.
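The check may thus be sketched like this (illustrative; the function name
is hypothetical) :

    #include <string.h>

    /* in the wire-format ALPN/NPN string each token is preceded by its
     * one-byte length, so the two bytes "\x02h2" can only match the
     * exact "h2" token and never a substring of another token */
    static int alpn_has_h2(const char *alpn_str)
    {
        return alpn_str && strstr(alpn_str, "\x02h2") != NULL;
    }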
This fix should be backported to 1.8 as a safety guard against bad
configurations.
Before calling the mux to get incoming data, we get the amount of space
available at the input of the buffer. If there is no space, we don't try to read
more data. This is good enough when raw data are stored in the buffer. But
this info has no meaning when structured data are stored. Because, with the
HTTP refactoring, such data will be stored in buffers, this is a bit annoying.
So, to avoid any problems, we now always call the mux. It is the mux's
responsibility to notify the stream interface that it needs more space to
store more data. This must be done by setting the flag CS_FL_RCV_MORE on the
conn_stream.
This is exactly what we do in the pass-through mux when <count> is null.
This flag is set on the stream interface when we should wait for more space in
the channel's buffer to store more incoming data. This means we should wait
until some outgoing data are sent before retrying to receive more data.
But in stream interface functions, at many places, instead of checking this
flag, we use the function channel_may_recv to know if we can (re)start
reading. This currently works but it is not really consistent. And it works
only because raw data are stored in buffers. It will become a problem when
we start to store structured data in buffers.
So to avoid any problems with future implementations, we now rely only on
SI_FL_WAIT_ROOM. The function channel_may_recv can still be called, but only
when we are sure to handle raw data (for instance in functions ci_put*). To do
so, among other things, we must be sure to unset SI_FL_WAIT_ROOM and offer an
opportunity to call chk_rcv() on a stream interface when some data are sent
on the other end, which is now granted by the previous patch series.
We exclusively use stream_int_update() now, the lower layers are not
called anymore so let's remove them, as well as si_update() which used
to be their wrapper.
It's far from being clean, but at least it allows to resync both CS and
applets from the same place, taking into account the fact that CS are
processed synchronously for the send side while applets are processed
outside of the process_stream() loop. The arrangement is optimised to
minimize the number of iterations by handling send first, then updating
the SI_FL_WAIT_ROOM flags and only then dealing with si_chk_rcv() on
both sides. The SI_FL_WANT_PUT flag is set if needed before calling
si_chk_rcv() since this is done prior to calling stream_int_update().
Now there's no risk that stream_int_notify() is called anymore during
such operations, thus we cannot have any spurious wake-up anymore. The
case where a successful send() could complete a pending connect() is
handled by taking any stream-int state changes into account at the
call place, which is normal since process_stream() is designed to
iterate till stabilisation.
Doing this solves most of the remaining inconsistencies between CS and
applets.
The function used to be called in turn for each side of the stream, but
since it's called exclusively from process_stream(), it prevents us from
making use of the knowledge we have of the operations in progress for
each side, resulting in having to go all the way through functions like
stream_int_notify() which are not appropriate there.
This patch creates a new function, si_update_both(), which takes two
stream interfaces expected to belong to the same stream, and processes
their flags in a more suitable order, but for now doesn't change the
logic at all.
The next step will consist in trying to reinsert the rest of the socket
layer-specific update code to ultimately update the flags correctly at
the end of the operation.
Instead of clearing the SI_FL_WAIT_ROOM flag and losing the information
about the need from the producer to be woken up, we now call si_chk_rcv()
immediately. This is cheap to do and it could possibly be further improved
by only doing it when SI_FL_WAIT_ROOM was still set, though this will
require some extra auditing of the code paths.
The only remaining place where the flag was cleared without a call to
si_chk_rcv() is si_alloc_ibuf(), but since this one is called from a
receive path woken up from si_chk_rcv() or not having failed, the
clearing was not necessary anymore either.
And there was one place in stream_int_notify() where si_chk_rcv() was
called with SI_FL_WAIT_ROOM still explicitly set so this place was
adjusted in order to clear the flag prior to calling si_chk_rcv().
Now we don't have any situation where we randomly clear SI_FL_WAIT_ROOM
without trying to wake the other side up, nor where we call si_chk_rcv()
with the flag set, so this flag should accurately represent a failed
attempt at putting data into the buffer.
When CF_DONT_READ is set, till now we used to set SI_FL_WAIT_ROOM, which
is not appropriate since it would lose the subscribe status. Instead let's
clear SI_FL_WANT_PUT (just like applets do), and set the flag only when
CF_DONT_READ is cleared.
We have to do this in stream_int_update(), and in si_cs_io_cb() after
returning from si_cs_recv() since it would be a bit invasive to hack
this one for now. It must not be done in stream_int_notify() otherwise
it would re-enable blocked applets.
Last, when si_chk_rcv() is called, it immediately clears the flag before
calling ->chk_rcv() so that we are not tempted to uselessly loop on the
same call until the receive function is called. This is the same principle
as what is done with the applet scheduler.
This flag should already be cleared before calling the *chk_rcv() functions.
Before adapting all call places, let's first make sure si_chk_rcv() clears
it before calling them so that these functions do not have to check it again
and so that they do not adjust it. This function will only call the lower
layers if the SI_FL_WANT_PUT flag is present so that the endpoint can decide
not to be called (as done with applets).
We now do this on the si_cs_recv() path so that we always have
SI_FL_WANT_PUT properly set when there's a need to receive and
SI_FL_WAIT_ROOM upon failure.
It doesn't make sense to limit this code to applets, as any stream
interface can use it. Let's rename it by simply dropping the "applet_"
part of the name. No other change was made except updating the comments.
The buffer allocation callback appctx_res_wakeup() used to rely on old
tricks to detect if a buffer was already granted to an appctx, namely
by checking the task's state. Not only this test is not valid anymore,
but it's inaccurate.
Let's rely solely on SI_FL_WAIT_ROOM, which is now set on allocation failure
by the functions trying to allocate a buffer. The buffer is now allocated on
the fly and the flag removed so that the consistency between the two remains
guaranteed. The patch also fixes minor issues such as the function being
improperly declared inline(!) and the fact that using appctx_wakeup() sets
the wakeup reason to TASK_WOKEN_OTHER while we try to use TASK_WOKEN_RES
when waking up consecutively to a resource allocation such as a buffer.
This function replaces stream_res_available(), which is used as a callback
for the buffer allocator. It now carefully checks which stream interface
was blocked on a buffer allocation, tries to allocate the input buffer to
this stream interface, and wakes the task up once such a buffer was found.
It will automatically remove the SI_FL_WAIT_ROOM flag upon success since
the info this flag indicates becomes wrong as soon as the buffer is
allocated.
The code is still far from being perfect because if a call to si_cs_recv()
fails to allocate a buffer, we'll still end up passing via process_stream()
again, but this could be improved in the future by using finer-grained
wake-up notifications.
When using the CLI proxy of the master and trying to access a worker
with the @ prefix, the worker just crashes.
Commit 7216032 ("MEDIUM: mworker: leave when the master die")
reintroduced the old code of the pipe, which was not trying to access
the pointers before. The owner of the FD was modified to a different
value; this is a problem since we now call listener_accept() in most
cases from mworker_accept_wrapper() and it casts the owner variable to
get the listener.
This patch fixes the issue by setting back the previous owner of the FD.
Commit eafd8ebcf ("MEDIUM: stream-int: call si_cs_process() in
stream_int_update_conn") uncovered a sleeping bug. By calling
si_cs_process() within si_update(), we end up calling stream_int_notify().
We rely on it to update the stream-int before quitting as a hack, but
it happens to immediately wake the task up while the stream int's
state is still SI_ST_CON (during the connection establishment). The
observable effect is that an unreachable server causes haproxy to
use 100% CPU until the connection timeout strikes.
This patch fixes this by not causing the wake up for the SI_ST_CON
state. It would equally be possible to check for states higher than
SI_ST_EST as is done in other places, but for now better stay on the
safe side by covering the only issue that can be triggered. It's
suspected that this issue slightly affects older versions by causing
one extra call to process_stream() during the connection setup for
each activity change on the other side, but this should not have
any observable effect.
No backport is needed.
The process was aborting with nbthread > 1.
The mworker_pipe_register() function could be called several times in
multithread mode; we don't want to abort() there.
It took me 17 minutes this morning to figure out where si->wait_event was
set (it's in si_reset() which should now probably be renamed since it
doesn't just perform a reset anymore but also an allocation) and what
its task was assigned to (si_cs_io_cb() even for applets and empty SI).
This is too confusing and not intuitive enough, let's at least add a
few comments for now to help figure how this stuff works next time.
When the master dies, the worker should exit too. This is achieved by
checking if the FD of the socketpair/pipe between the master and the
worker was closed.
In the former architecture of the master-worker, there was only a pipe
between the master and the workers, and it was easy to check for an EOF
on the pipe FD to exit() the worker.
With the new architecture, we use one socketpair per process, and this
socketpair is also used to accept new connections with the
listener_accept() callback.
This accept callback can't handle the EOF and the exit of the process,
because that is very specific to the master-worker. This is why we
transformed the mworker_pipe_handler() function into a wrapper which
checks for an EOF and exits the process, and otherwise calls
listener_accept() to perform the accept.
The former behavior was to exit() the master process with the latest
status code known, which was the one of the last process to exit.
The problem is that the master process was not exiting with the status
code which provoked the exit-on-failure.
The active peers output indicates both the number of established peers
connections and the number of peers connection attempts. The new counter
"ConnectedPeers" also indicates the number of currently connected peers.
This helps detect that some peers cannot be reached for example. It's
worth mentioning that this value changes over time because unused peers
are often disconnected and reconnected. Most of the time it should be
equal to ActivePeers.
Peers are the last type of activity which can maintain a job present, so
it's important to report that such an entity is still active to explain
why the job count may be higher than zero. Here by "ActivePeers" we report
peers sessions, which include both established connections and outgoing
connection attempts.
When an haproxy process doesn't stop after a reload, it's because it
still has some active "jobs", which mainly are active sessions, listeners,
peers or other specific activities. Sometimes it's difficult to troubleshoot
the cause of these issues (which generally are the result of a bug) only
because some indicators are missing.
This patch adds the number of listeners, the number of jobs, and the stopping
status to the output of "show info". This way it becomes a bit easier to try
to narrow down the cause of such an issue should it happen. A typical use
case is to connect to the CLI before reloading, then issuing the "show info"
command to see what happens. In the normal situation, stopping should equal
1, jobs should equal 1 (meaning only the CLI is still active) and listeners
should equal zero.
The patch is so trivial that it could make sense to backport it to 1.8 in
order to help with troubleshooting.
The tasks API was changed in 1.9-dev1 with commit 9f6af3322 ("MINOR: tasks:
Change the task API so that the callback takes 3 arguments."), causing the
task's state not to be usable anymore and to have been replaced with an
explicit argument in the callee. The task's state doesn't contain any trace
of the wakeup cause anymore. But there were two places where the old task's
state remained in use :
- sessions, used to more accurately report timeouts in logs when seeing
TASK_WOKEN_TIMEOUT ;
- peers, used to finish resynchronization when seeing TASK_WOKEN_SIGNAL
This commit fixes both occurrences by making sure we don't access task->state
directly (should we rename it by the way ?).
No backport is needed.
This one causes some events to be lost. It has already been tested in
an experimental branch but was not merged until being certain it was
needed. Fred figured that requesting /?k=1&s=447392 from httpterm through
haproxy-master was enough to stall the transfer.
No backport is needed, this only affects 1.9-dev5.
It was reported here that authentication may fail when threads are
enabled :
https://bugzilla.redhat.com/show_bug.cgi?id=1643941
While I couldn't reproduce the issue, it's obvious that there is a
problem with the use of the non-reentrant crypt() function there.
On Linux systems there's crypt_r() but not on the vast majority of
other ones. Thus a first approach consists in placing a lock around
this crypt() call. Another patch may relax it when crypt_r() is
available.
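A minimal sketch of the locked variant, using a plain pthread mutex for
illustration (the real patch relies on haproxy's own locking primitives,
and the helper name is hypothetical) :

    #include <pthread.h>
    #include <string.h>
    #include <unistd.h>   /* crypt() here or in <crypt.h> per platform */

    static pthread_mutex_t crypt_lock = PTHREAD_MUTEX_INITIALIZER;

    static int check_crypt_password(const char *pass, const char *stored)
    {
        char *hash;
        int ret;

        pthread_mutex_lock(&crypt_lock);
        hash = crypt(pass, stored);    /* non-reentrant: serialize it */
        ret = hash && strcmp(hash, stored) == 0;
        pthread_mutex_unlock(&crypt_lock);
        return ret;
    }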
This fix must be backported to 1.8. Thanks to Ryan O'Hara for the
quick notification.
A bug occurred when the CLI proxy of the master received a command
prefixed by some spaces but without a routing prefix (@).
In this case pcli_parse_request() was returning a wrong amount of
data to forward.
The response analyzer was called twice and the prompt displayed twice.
Commit 85b73e9 ("BUG/MEDIUM: stream: Make sure polling is right on retry.")
introduced a possible null dereference on the error path detected by gcc-7.
Let's simply assign srv_conn after checking the error and not before.
No backport is needed.
When the "path" sample fetch function is called without any path, the
function doesn't check that the request buffer is allocated. While this
doesn't happen with the request during processing, it can definitely
happen when mistakenly trying to reference a path from the response
since the request channel is not allocated anymore.
It's certain that this bug was emphasized by the buffer changes that
went in 1.9 and the HTTP refactoring, but at first glance, 1.8 doesn't
seem 100% safe either so it's possible that older version are affected
as well.
Thanks to PiBa-NL for reporting this bug with a reproducer.
This patch makes the cache capable of adding an "Age" header as defined by
rfc7234.
During the storage of new HTTP objects we memorize the ->eoh value and
the value of the "Age" header coming from the origin server.
This information may then be reused to return the cached HTTP objects
with a new "Age" header.
May be backported to 1.8.
This patch implements analysers for parsing the CLI and extra features
for the master's CLI.
For each command (sent alone, or separated by ; or \n) the request
analyser will determine to which server it should send the request.
The 'mode cli' proxy is able to parse a prefix for each command, which is
used to select the appropriate server. The prefix starts with @ and is
followed by "master", the PID preceded by ! or the relative PID (e.g.
@master, @1, @!1234). The servers are not round-robined anymore.
The command is sent with a SHUTW which forces the server to close the
connection after sending its response. However the proxy allows a
keepalive connection on the client side and does not close.
The response analyser does not do much; it only reinits the
connection when it receives a close from the server, and forwards the
response. It does not analyze the response data.
The only guarantee of the end of the response is the close from the
server; we can't rely on the double \n since it's not sent by every
command.
This could be reimplemented later as a filter.
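For illustration, a session on the master CLI might look like this (the
socket path is hypothetical) :

    $ socat /tmp/master-cli.sock -
    @1 show info
    @!1234 show stat
    @master show info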
Add a struct server pointer in the mworker_proc struct so we can easily
use it as a target for the mworker proxy.
pcli_prefix_to_pid() is used to find the right PID of the worker
when using a prefix in the CLI (@master, @#<relative pid>, @<pid>).
pcli_pid_to_server() is used to find the right target server for the
CLI proxy.
The master process does not need all the keywords of the CLI, so add 2
flags to choose which keywords to use.
It might be useful to activate some of them in a debug mode later...
This patch introduces mworker_cli_proxy_new_listener() which allows the
creation of new listeners for the CLI proxy.
Using this function it is possible to create new listeners from the
program arguments with -Sa <unix_socket>. It is allowed to create
multiple listeners with several -Sa.
This patch implements a listen proxy within the master. It uses the
sockpair of all the workers as servers.
In the current state of the code, the proxy is only doing round robin on
the CLI of the workers. A CLI mode will be needed to know which CLI to
send the requests to.
The init code of the mworker_proc structs has been moved before the
init of the listeners.
Each socketpair is now connected to a CLI within the workers, which
allows the master to access their CLI.
The inherited flag of the worker-side socketpair is removed so the
socket can be closed in the master.
There's a call there to si_cs_send() while we're supposed to come from
si_cs_io_cb() which has just done it. But in fact we can also come here
as a lower layer callback from ->wake() after a connection is established.
Since most of the time we'll end up here with either no data in the buffer
or a blocked output, let's simply check if we're already subscribed to send
events before calling si_cs_send().
stream_int_notify() is I/O agnostic and should not wake up the tasklet,
it's up to si_cs_process() to do that, just like si_applet_wake_cb()
does it for the applet.
This one was added by commit 53216e7db ("MEDIUM: connections: Don't
directly mess with the polling from the upper layers.") after the
removal of the conditional cs_want_send() call. But after analysis
it turned out that it's not needed since the si_cs_send() call will
either succeed or subscribe.
Calling si_cs_send() alone is always dangerous because it can result
in the loss of an event if it manages to empty the buffer. Indeed, in
this case it's critical to call si_chk_rcv() on the opposite stream-int.
Given that si_cs_process() takes care of all this, let's call it instead.
All this code could possibly be refined soon to avoid redoing the whole
stream_int_notify() and do it only after a send(), but at the moment it's
not important.
With the new synchronous si_cs_send() at the end of process_stream(),
we're seeing re-appear the I/O layer specific part of the stream interface
which is supposed to deal with I/O event subscription. The only difference
is that now we subscribe to I/Os only after having attempted (and failed)
them.
This patch brings a cleanup in this by reintroducing stream_int_update_conn()
with the send code from process_stream(). However this alone would not be
enough because the flags which are cleared afterwards would result in the
loss of the possible events (write events only at the moment). So the flags
clearing and stream-int state updates are also performed inside si_update()
between the generic code and the I/O specific code. This definitely makes
sense as after this call we can simply check again for channel and SI flag
changes and decide to loop once again or not.
The rationale here is that we should never need to try to send() at the
beginning of process_stream() because :
- if something was pending, it's very unlikely that it was unblocked
and not sent just between the last poll() and the wakeup instant.
- if something pending was recently sent, then we don't have anything
to send anymore.
So at first glance it doesn't seem like there could be any valid case
where trying to send before entering the function brings any benefit.
If a buffer allocation failed, we have SI_FL_WAIT_ROOM set and c_size(buf)
being zero. It's the only moment where we have a new opportunity to try to
allocate this buffer. However we don't want to waste our time trying this
if both are non-null since it indicates missing room without any changed
condition.
Well that's only 3 places (applet.c, stream_interface.c, hlua.c). This
ensures we always clear SI_FL_WAIT_ROOM before setting it on failure,
so that it is granted that SI_FL_WAIT_ROOM always indicates a lack of
room for doing an operation, including the inability to allocate a
buffer for this.
The vars_prune() and vars_init() functions involve locking while most of
the time there is no variable at all in streams nor sessions. Let's check
for emptiness before calling these functions. Simply doing this has
increased the multithreaded performance from 1.5 to 5% depending on the
workload.
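A minimal sketch of the check, assuming the stream's variable lists
(names are illustrative) :

    /* skip the lock-heavy pruning entirely when no variables exist */
    if (!LIST_ISEMPTY(&s->vars_txn.head))
        vars_prune(&s->vars_txn, s->sess, s);
    if (!LIST_ISEMPTY(&s->vars_reqres.head))
        vars_prune(&s->vars_reqres, s->sess, s);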
While "option prefer-last-server" only applies to non-deterministic load
balancing algorithms, 401/407 responses actually caused haproxy to prefer
the last server unconditionally.
As this breaks deterministic load balancing algorithms like uri, this
patch applies the same condition here.
Should be backported to 1.8 (together with "BUG/MINOR: only mark
connections private if NTLM is detected").
Instead of marking all connections that see a 401/407 response private
(for connection reuse), this patch detects an RFC4559/NTLM authentication
scheme and restricts the private setting to those connections.
This is so we can reuse connections with 401/407 responses with
deterministic load balancing algorithms later (which requires another fix).
This fixes the problem reported here by Elliot Barlas :
https://discourse.haproxy.org/t/unable-to-configure-load-balancing-per-request-over-persistent-connection/3144
Should be backported to 1.8.
The behaviour of the flag CF_WRITE_PARTIAL was modified by commit
95fad5ba4 ("BUG/MAJOR: stream-int: don't re-arm recv if send fails") due
to a situation where it could trigger an immediate wake up of the other
side, both acting in loops via the FD cache. This loss caused the need
to introduce CF_WRITE_EVENT in commit c5a9d5bf to replace it, but
both flags express more or less the same thing and this distinction
creates a lot of confusion and complexity in the code.
Since the FD cache now acts via tasklets, the issue worked around in the
first patch no longer exists, so it's more than time to kill this hack
and to restore CF_WRITE_PARTIAL's semantics (i.e.: there has been some
write activity since we last left process_stream).
This patch mostly reverts the two commits above. Only the part making
use of CF_WROTE_DATA instead of CF_WRITE_PARTIAL to detect the loss of
data upon connection setup was kept because it's more accurate and
better suited.
With this patch we avoid parsing "max-object-size" with atoi() and we store
its value as an unsigned int to prevent bad implicit conversion issues,
especially when we compare it with other unsigned values (content length).
With this patch we check that shctx_init() does not return 0.
This is possible if the maxblocks argument, which is passed as an
int, is negative due to an implicit conversion.
Must be backported to 1.8.
With this patch we support cache sizes larger than 2047 MB and prevent
haproxy from crashing when "total-max-size" is parsed into a negative value
by atoi(). The limit at parsing time is 4095 MB (UINT_MAX >> 20).
May be backported to 1.8.
This patch adds "max-object-size" option to the cache to limit
the size in bytes of the HTTP objects to be cached. When not provided,
the maximum size of an HTTP object is a 256th of the cache size.
This patch makes the cache capable of storing HTTP objects larger than a
buffer. It makes use of the new "block by block shared object allocation"
shctx API.
A new pointer to struct shared_block has been added to the cache applet
context to memorize the next block to be used by the HTTP cache I/O handler
http_cache_io_handler() to emit the data. Another member, named "sent",
memorizes the number of bytes already sent by this handler. So, to send an
object from the cache, http_cache_io_handler() must be called until the
"sent" counter reaches the size of this object.
This patch makes shctx capable of storing objects in several parts,
each part being made of several blocks. There is no more need to
walk through until reaching the end of a row to append new blocks.
A new struct shared_block pointer member, named "last_reserved", has been
added to struct shared_block to memorize the last block which was reserved
by shctx_row_reserve_hot(). Same thing for the "last_append" pointer, which
is used to memorize the last block used by shctx_row_data_append() to store
the data.
Fred reported a random crash related to the pools. This was introduced
by commit e18db9e98 ("MEDIUM: pools: implement a thread-local cache for
pool entries") because the minimum pool item size should have been
increased to 32 bytes to accommodate the 2 double-linked lists.
No backport is needed.
This option makes a proxy use only HTX-compatible muxes instead of the
HTTP-compatible ones for HTTP modes. It must be set on both ends, this
is checked at parsing time.
With the previous connection model, when we purposely decided to stop
receiving in order to avoid polling after a complete request was received
for example, it was needed to set SI_FL_WAIT_ROOM to prevent receive
polling from being re-armed. Now with the new subscription-based model
there is no such thing anymore and there is no one to remove this flag
either. Thus if a request takes more than one packet to come in or
spans over too many packets, this flag will cause it to wait forever.
Let's simply remove this flag now.
This patch should not be backported since older versions still need
that this flag is set here to stop receiving.
We wake up all the streams waiting to send data when we have space available
in the mux buffer. Doing so means we probably wake way too many streams,
because after a few the buffer will probably be full instead. So keep a
list of all the streams that are about to send data, and if we detect that
the buffer is full, unschedule the tasks and put the streams back to the
send_list.
Make sure we call tasklet_free() only after si_release_endpoint(), when the
unsubscribe() method has been called, so that we're sure the mux won't
attempt to access the tasklet.
Avoid using conn_xprt_want_send/recv, and totally nuke cs_want_send/recv,
from the upper layers. The polling is now directly handled by the connection
layer; it is activated on subscribe(), and deactivated once we've got the
event and woken the related task.
In h2_recv(), return 1 if we have data available, or if h2_recv_allowed()
failed, to be sure h2_process() is called.
Also don't subscribe if our buffer is full.
When retrying to connect to a server because the previous connection failed,
make sure that if we subscribed to the previous connection, the polling flags
will be set for the new fd.
No backport is needed.
When we're closing a stream, if there's no stream left and a goaway was
sent, close the connection; there's no reason to keep it open.
[wt: it's likely that this is needed in 1.8 as well, though it's unclear
how to trigger this issue, some tests are needed]
Similarly to what's been done in 7a6ad88b02, take into account that
free_list is a void **, and so use a void ** too when attempting to do
a CAS.
The purpose is to detect if threads or processes are competing for the
same CPU. This can happen when threads are incorrectly bound, or after a
reload if the previous process still has an important activity. With
threads this situation is problematic because a preempted thread holding
a lock will block other ones waiting for this lock to be released.
A first attempt consisted in measuring the cumulated lost time more
precisely but the system's scheduler is smart enough to try to limit the
thread preemption rate by mostly context switching during poll()'s blank
periods, so most of the time lost is not seen. In essence this is good
because it means a thread is not preempted with a lock held, and even
regarding the rendez-vous point it cannot prevent the other ones from
making progress. But still it happens tens to hundreds of times per
second that a thread might be preempted, so it's still possible to detect
that the situation is happening, thus it's interesting to measure and
report its frequency.
Each time we enter the poller, we check the CPU time spent working and
see if we've lost time doing something else. To limit false positives,
we're only interested in losses of 500 microseconds or more (i.e. half
a clock tick on a 1 kHz system). If so, it indicates that some time was
stolen by another thread or process. Note that we purposely store some
sub-millisecond counters so that under heavy traffic with a 1 kHz clock,
it's still possible to measure something without being subject to the
risk of rounding errors (i.e. if exactly 1 ms is stolen it's possible
that the time difference could often be slightly lower).
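The measurement can be sketched as follows (illustrative only; the
function and variable names are hypothetical) :

    #include <stdint.h>
    #include <time.h>

    static uint64_t now_ns(clockid_t clk)
    {
        struct timespec ts;
        clock_gettime(clk, &ts);
        return (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
    }

    /* called when entering the poller: if wall-clock time advanced by
     * at least 500us more than this thread's CPU time, some time was
     * likely stolen by another thread or process */
    void report_stolen_time(uint64_t *prev_mono, uint64_t *prev_cpu,
                            uint64_t *stolen_ns)
    {
        uint64_t mono = now_ns(CLOCK_MONOTONIC);
        uint64_t cpu  = now_ns(CLOCK_THREAD_CPUTIME_ID);
        uint64_t elapsed  = mono - *prev_mono;
        uint64_t consumed = cpu  - *prev_cpu;

        if (elapsed > consumed && elapsed - consumed >= 500000ULL)
            *stolen_ns += elapsed - consumed;

        *prev_mono = mono;
        *prev_cpu  = cpu;
    }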
This counter of lost CPU time slots is reported in "show activity"
in numbers of milliseconds of CPU lost per second, per 15s, and total
over the process' life. By definition, the per-second counter cannot
report values larger than 1000 per thread per second and the 15s one
will be limited to 15000/s in the worst case, but it's possible that
peak values exceed such thresholds after long pauses.
The calls to HA_ATOMIC_CAS() on the lockfree version of the pool allocator
were mistakenly done on (void*) for the old value instead of (void **).
While this has no impact on "recent" gcc, it does have one for gcc < 4.7
since the CAS was open coded and it's not possible to assign a temporary
variable of type "void".
No backport is needed, this only affects 1.9.
By placing this code into time.h (tv_entering_poll() and tv_leaving_poll())
we can remove the logic from the pollers and prepare for extending this to
offer more accurate time measurements.
The 4 pollers all contain the same code used to compute the poll timeout.
This is pointless, let's centralize this into fd.h. This also gets rid of
the useless SCHEDULER_RESOLUTION macro which used to work around a very old
Linux 2.2 bug causing select() to wake up slightly before the timeout.
There are as many ways to build the globalfilepathlen variable as branches
in the if/then/else, creating lots of confusion. Address the most obvious
parts, but some polishing definitely is still needed.
These ones are on error paths that are properly handled by luaL_error()
which does a longjmp() but the compiler cannot know it. By adding an
__unreachable() statement in WILL_LJMP(), there is no ambiguity anymore.
This may be backported to 1.8 but these previous patches are needed first :
- BUILD: compiler: add a new statement "__unreachable()"
- MINOR: lua: all functions calling lua_yieldk() may return
- BUILD: lua: silence some compiler warnings about potential null derefs (#2)
There was a mistake when tagging functions which always use longjmp and
those which may use it in that all those supposed to call lua_yieldk()
may return without calling longjmp. Thus they must not use WILL_LJMP()
but MAY_LJMP(). It has zero impact on the code emitted as such, but
prevents other fixes from being properly implemented : this was the
cause of the previous failure with the __unreachable() calls.
This may be backported to older versions. It may or may not apply
well depending on the context, though the change simply consists in
replacing "WILL_LJMP(hlua_yieldk" with "MAY_LJMP(hlua_yieldk", and
same with the single call to lua_yieldk() in hlua_yieldk().
Here we make sure that appctx is always taken from the unchecked value
since we know it's an appctx, which explains why it's immediately
dereferenced. A missing test was added to ensure that task_new() does
not return a NULL.
This may be backported to 1.8.
This reverts commit f1ffb39b61.
It breaks Lua causing some timeouts. Removing the __unreachable() statement
from WILL_LJMP() fixes it. It's very strange and unclear whether it's an
issue with WILL_LJMP() not fulfilling its promise of not returning, if
the code emitted with __unreachable() gets broken, or anything else. Let's
revert this for now.
There is a bug in this function used to release other threads. It leaves
the current thread marked as harmless. If after this another thread does
a thread_isolate(), but before the first one reaches poll(), the second
thread will believe it's alone while it's not.
This must be backported to 1.8 since the rendez-vous point was merged
into 1.8.14.
Each thread now keeps the last ~512 kB of freed objects into a local
cache. There are some heuristics involved so that a specific pool cannot
use more than 1/8 of the total cache in number of objects. Tests have
shown that 512 kB is an optimal size on a 24-thread test running on a
dual-socket machine, resulting in an overall 7.5% performance increase
and a cache miss ratio reducing from 19.2 to 17.7%. Anyway it seems
pointless to keep more than an L2 cache, which probably explains why
sizes between 256 and 512 kB are optimal.
Cached objects appear in two lists, one per pool and one LRU to help
with fair eviction. Currently there is no way to check each thread's
cache state nor to flush it. This cache cannot be disabled and is
enabled as soon as the lockless pools are enabled (i.e.: threads are
enabled, no pool debugging is in use and the CPU supports a double word
CAS).
For caching it will be convenient to have indexes associated with pools,
without having to dereference the pool itself. One solution could consist
in replacing all pool pointers with integers but this would limit the
number of allocatable pools. Instead here we allocate the 32 first pools
from a pre-allocated array whose base address is known so that it's trivial
to convert a pool to an index in this array. Pools that cannot fit there
will be allocated normally.
Currently we have per-thread arrays of trees and counts, but these
ones unfortunately share cache lines and are accessed very often. This
patch moves the task-specific stuff into a structure taking a multiple
of a cache line, and has one such per thread. Just doing this has
reduced the cache miss ratio from 19.2% to 18.7% and increased the
12-thread test performance by 3%.
It starts to become visible that we really need a process-wide per-thread
storage area that would cover more than just these parts of the tasks.
The code was arranged so that it's easy to move the pieces elsewhere if
needed.
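A hedged sketch of the layout (field names are illustrative) :

    /* group each thread's scheduler data and pad the structure to a
     * multiple of a cache line to avoid false sharing between threads */
    struct task_per_thread {
        struct eb_root timers;       /* thread-local wait queue */
        struct eb_root rqueue;       /* thread-local run queue */
        unsigned int rqueue_size;    /* local run queue size */
    } __attribute__((aligned(64)));

    static struct task_per_thread task_per_thread[MAX_THREADS];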
Now we still have a main contention point with the timers in the main
wait queue, but the vast majority of the tasks are pinned to a single
thread. This patch creates a per-thread wait queue and queues a task
to the local wait queue without any locking if the task is bound to a
single thread (the current one) otherwise to the shared queue using
locking. This significantly reduces contention on the wait queue. A
test with 12 threads showed 11 ms spent in the WQ lock compared to
4.7 seconds in the same test without this change. The cache miss ratio
decreased from 19.7% to 19.2% on the 12-thread test, and its performance
increased by 1.5%.
Another indirect benefit is that the average queue size is divided
by the number of threads, which roughly removes log(nbthreads) levels
in the tree and further speeds up lookups.
The vast majority of FDs are only seen by one thread. Currently the lock
on FDs costs a lot because it's touched often, though there should be very
little contention. This patch ensures that the lock is only grabbed if the
FD is shared by more than one thread, since otherwise the situation is safe.
Doing so resulted in a 15% performance boost on a 12-thread test.
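The test can be as simple as checking whether the FD's thread mask has
more than one bit set (sketch; names taken from the fd table, the helper
is hypothetical) :

    /* true if at least two threads may use this FD */
    static inline int fd_is_shared(unsigned long thread_mask)
    {
        return (thread_mask & (thread_mask - 1)) != 0;
    }

    /* grab the lock only when the FD is really shared */
    if (fd_is_shared(fdtab[fd].thread_mask))
        HA_SPIN_LOCK(FD_LOCK, &fdtab[fd].lock);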
peers_init_sync() doesn't check task_new()'s return value and doesn't
return any result to indicate success or failure. Let's make it return
an int and check it from the caller.
This can be backported as far as 1.6.
Gcc reports a potential null-deref error in the stick-table init code.
While not critical there, it's trivial to fix. This check has been
missing since 1.4 so this fix can be backported to all supported versions.
This null-deref cannot happen either as there necessarily is a listener
where this function is called. Let's use __objt_listener() to address
this.
This may be backported to 1.8.
Gcc 6.4 detects a potential null-deref warning in smp_fetch_ssl_fc_cl_str().
This one is not real since already addressed a few lines above. Let's use
__objt_conn() instead of objt_conn() to avoid the extra test that confuses
it.
This could be backported to 1.8.
These ones are on error paths that are properly handled by luaL_error()
which does a longjmp() but the compiler cannot know it. By adding an
__unreachable() statement in WILL_LJMP(), there is no ambiguity anymore.
This may be backported to 1.8 but the previous patch (BUILD: compiler:
add a new statement "__unreachable()") is needed for this.
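A minimal sketch of the pattern (the exact haproxy macros differ);
__builtin_unreachable() is a GCC builtin available since gcc 4.5:

    /* tell the compiler the code past this point cannot be reached */
    #if defined(__GNUC__)
    #define __unreachable() __builtin_unreachable()
    #else
    #define __unreachable() do { } while (0)
    #endif

    /* luaL_error() performs a longjmp() and never returns; marking the
     * spot unreachable silences spurious warnings on the error paths
     */
    #define WILL_LJMP(func) do { (func); __unreachable(); } while (0)
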
In case pool_alloc() fails in stream_new(), we try to detach the stream
from the list before it has been added, dereferencing a NULL. In order
to fix it, simply move the LIST_DEL call upwards.
This must be backported to 1.8.
The listeners with the LI_O_INHERITED flag were deleted but not unbound,
which is a problem since we have a polling loop in the master.
This patch unbinds every listener that is not required by the master,
but does not close the FD of those that have the LI_O_INHERITED flag.
We will need to know if a mux was created for a front or a back
connection, and once it's established it's much harder to tell, so let's
introduce H2_CF_IS_BACK for this.
For backend connections we'll have to initialize streams but not allocate
conn_streams since they'll already be there. Thus this patch splits the
h2c_stream_new() function into one dedicated to allocation of a new stream
and another one supposed to attach this stream to an existing frontend
connection.
Till now in order to figure the timeouts, we used to retrieve the proxy
from the session's owner, but the new API provides it so it's better to
simply take it from the caller at init time. We take this opportunity to
store the pointer to the proxy into the h2 connection so that we can
reuse it later when needed.
The init function was split into the mux init and the front init, but it
appears that most of the code will be common between the two sides when
implementing the backend init. Thus let's simply make this a unique
h2_init() function.
h2_snd_buf() must not accept to send data if the preface was not yet
received nor sent. At the moment it doesn't happen but it can with
server-side H2.
At a few places we check these states to detect if a stream has valid
data/errcode or is one of the two dummy streams (idle or closed). It
will become problematic for outgoing streams as it will not be possible
to report errors for example since the stream will switch from IDLE
state only after sending a HEADERS frame.
There is a safer solution consisting in checking the stream ID, which
may only be zero in the dummy streams. This patch changes the test to
only rely on the stream ID.
At many places in muxes we'll have to add tests to check if the
connection is front or back before deciding to log. Instead let's
centralize this test in sess_log() to simply do nothing when sess=NULL.
Some pseudo-headers are added during headers parsing, mainly for the H2
mux. With this flag, it is possible to not add them. This avoids some
tedious filtering in the H1 mux.
Instead of using offsets relative to the parsed buffer to store the start
line info, we now use indirect strings. These infos thus remain valid only
as long as the origin buffer is untouched, but that's not a real problem
because this union is only used during parsing and never stored for later
use.
When headers parsing ends, a pseudo header with an empty name and an empty
value is added to the array of parsed headers to mark its end. It is
convenient to loop on this array, but not really useful if we want to remove
the last header or add a new one, because we don't really know where the
last CRLF is (the empty line ending the headers block). So now, the name of
this pseudo header points to this last CRLF instead. Its length is still 0
and its value is still empty, so loops on the array remain unchanged.
Since keep-alive mode is the default mode, the passive close has disappeared,
and in the code, httpclose and forceclose options are handled the same way:
connections with the client and the server are closed as soon as the request
and the response are received, and a missing "Connection: close" header is
added in each direction.
So to make things clearer, forceclose is now an alias for httpclose. And
httpclose is explicitly an active close. So the old passive close does not exist
anymore. Internally, the flag PR_O_HTTP_PCL has been removed and PR_O_HTTP_FCL
has been replaced by PR_O_HTTP_CLO. In HTTP analyzers, the checks done to find
the right mode to use, depending on proxies options and "Connection: " header
value, have been simplified.
This should only be a cleanup and no changes are expected.
This option is frontend-specific, so there is no reason to support it on
backends. So now, it is ignored if it is set on a backend and a warning is
emitted during startup. The change is quite trivial, but the commit is
tagged as MEDIUM because it is a small breakage with previous versions:
configurations using this option may now emit a warning.
This option is backend-specific, so there is no reason to support it on
frontends. So now, it is ignored if it is set on a frontend and a warning is
emitted during startup. The change is quite trivial, but the commit is
tagged as MEDIUM because it is a small breakage with previous versions:
configurations using this option may now emit a warning.
To ease the refactoring, the function "http_header_add_tail" has been
removed. Now, "http_header_add_tail2" is always used. And the function
"capture_headers" has been renamed to "http_capture_headers". Finally, some
functions have been exported.
Make sure we unsubscribe from events before si_release_endpoint() destroys
the conn_stream, or it will never be called. To do so, move the call to
unsubscribe into si_release_endpoint() directly.
This is 1.9-specific and shouldn't be backported.
This bug appeared only if nbthread > 1. Handling the pipe with the
master, multiple threads of the same worker could process the deinit().
In addition, deinit() was called while some other threads were still
performing some tasks.
This patch assigns the handler of the pipe with the master to the first
thread only and removes the call to deinit() before exiting with an error.
This patch should be backported to 1.8.
If we can't send data for a stream because of its flow control, make sure
not to put it in the send_list, until the flow control lets it send again.
This is specific to 1.9, and should not be backported.
When subscribing, we don't need to provide a list element, only the h2 mux
needs it. So instead, add a list element to struct h2s, and use it when a
list is needed.
This forces us to use the unsubscribe method, since we can't just unsubscribe
by using LIST_DEL anymore.
This patch is larger than it should be because it includes some renaming.
As we don't know how subscriptions are handled, we can't just assume we can
use LIST_DEL() to unsubscribe, so introduce a new method to mux and connections
to do so.
CurSslConns inc/dec operations are not thread-safe. The unsigned CurSslConns
counter can wrap to a negative value, so we could observe connection rejects
because the MaxSslConns limit was artificially exceeded.
CumSslConns inc operations are also not thread-safe, so we could miss
some connections and show inconsistent values compared to CumConns.
This fix should be backported to v1.8.
The run queue is designed to perform a single tree lookup and to
use multiple passes to eb32sc_next(). The scheduler rework took a
conservative approach first but this is not needed anymore and it
increases the processing cost of process_runnable_tasks() and even
the time during which the RQ lock is held if the global queue is
heavily loaded. Let's simply move the initial lookup to the entry
of the loop like the previous scheduler used to do. This has reduced
by a factor of 5.5 the number of calls to eb32sc_lookup_get() there.
In the past this condition contained multiple tests, which is why the
additional parentheses were needed. The condition was simplified but
the duplicate parentheses were not cleaned up.
OpenSSL released support for TLSv1.3. It also added a separate function
SSL_CTX_set_ciphersuites that is used to set the ciphers used in the
TLS 1.3 handshake. This change adds support for that new configuration
option by adding a ciphersuites configuration variable that works
essentially the same as the existing ciphers setting.
Note that it should likely be backported to 1.8 in order to ease usage
of the now released openssl-1.1.1.
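At the OpenSSL level the two settings map to two distinct calls, roughly
like this (the cipher strings are only examples):

    #include <openssl/ssl.h>

    /* TLS <= 1.2 ciphers and TLS 1.3 ciphersuites are configured through
     * two separate calls; the strings below are examples only
     */
    static int configure_ssl_ctx(SSL_CTX *ctx)
    {
        if (!SSL_CTX_set_cipher_list(ctx, "ECDHE-ECDSA-AES128-GCM-SHA256"))
            return 0;
    #if OPENSSL_VERSION_NUMBER >= 0x10101000L
        if (!SSL_CTX_set_ciphersuites(ctx,
                 "TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256"))
            return 0;
    #endif
        return 1;
    }
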
In ci_insert_line2() and b_rep_blk(), we can't afford to wrap, so don't use
b_tail() to check if we do, use __b_tail() instead.
This should be backported to previous versions.
For generate-certificates, X509V3_EXT_conf is used but it's an old API
call: X509V3_EXT_nconf must be preferred. OpenSSL compatibility is OK
because it's inside #ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME, introduced 5
years after X509V3_EXT_nconf.
Commit 8ae735da0 ("MEDIUM: mux_h2: Revamp the send path when blocking.")
added a tasklet allocation in h2_stream_new(), however the error exit path
fails to reset h2s in case the tasklet cannot be allocated, resulting in
the h2s pointer to be returned as valid to the caller. Let's readjust the
exit path to always return NULL on error and to always log as well (since
there is no reason for not logging on such important errors).
No backport is needed, this is strictly 1.9-dev.
Since commit 7505f94f9 ("MEDIUM: h2: Don't use a wake() method anymore."),
the H2 mux's init() calls h2_process(). But this last one may detect an
early error and call h2_release(), destroying the connection, and return
-1. At this point we're screwed because the caller will still dereference
the connection for various things ranging from the configuration of the
proxy protocol header to the retries. We could simply return -1 here upon
failure but that's not enough since the stream layer really needs to keep
its connection structure allocated (to clean it up in session_kill_embryonic
or for example because it holds the destination address to reconnect to
when the connection goes to the backend). Thus the correct solution here is
to only schedule a wakeup of the I/O callback so that the init succeeds,
and that the connection is only handled later.
No backport is needed, this is 1.9-specific.
The return value from conn_install_mux() was not checked, so if an
inconsistency happens in the code, or a memory allocation fails while
initializing the mux, we can crash while using an uninitialized mux.
In practice the code inconsistency does not really happen since we
cannot configure such a situation, except during development, but
the out of memory condition could definitely happen.
This should be backported to 1.8 (the code is a bit different there,
there are two calls to conn_install_mux()).
The prototypes of functions find_hdr_value_end(), extract_cookie_value()
and http_header_match2() were still in proto_http.h while some of them
don't exist anymore and the others were just moved. Let's remove them.
In addition, da.c was updated to use http_extract_cookie_value() which
is the correct one.
These ones are mostly called from cfgparse.c for the parsing and do
not depend on the HTTP representation. The functions's prototypes
were moved to proto/http_rules.h, making this file work exactly like
tcp_rules. Ideally we should stop calling these functions directly
from cfgparse and register keywords, but there are a few cases where
that wouldn't work (stats http-request) so it's probably not worth
trying to go this far.
The current proto_http.c file is huge and contains different processing
domains making it very difficult to work on an alternative representation.
This commit moves some parts to other files :
- ACL registration code => http_acl.c
This code only creates some ACL mappings and doesn't know anything
about HTTP nor about the representation. This code could even have
moved to acl.c but it was not worth polluting it again.
- HTTP sample conversion => http_conv.c
This code doesn't depend on the internal representation but definitely
manipulates some HTTP elements, such as dates. It also has access to
captures.
- HTTP sample fetching => http_fetch.c
This code does depend entirely on the internal representation but is
totally independent on the analysers. Placing it into a different
file will ease the transition to the new representation and the
creation of a wrapper if required. An include file was created due
to CHECK_HTTP_MESSAGE_FIRST() being used at various places.
- HTTP action registration => http_act.c
This code doesn't directly interact with the messages nor the
transaction but it does so via some exported http functions like
http_replace_req_line() or http_set_status() so it will be easier
to change only this after the conversion.
- a few very generic parts were found and moved to http.{c,h} as
relevant.
It is worth noting that the functions moved to these new files are not
referenced anywhere outside of the files and are only called as registered
callbacks, so these files do not even require associated include files.
Found by Coverity.
[wt: this bug was introduced by commit 404d978 ("MINOR: add ALPN
information to send-proxy-v2"). It might be triggered by a health
check on a server using ppv2 or by an applet making use of such a
server, if at all configurable].
This needs to be backported to 1.8.
This adds support for accessing stick tables from Lua. The supported
operations are reading general table info, lookup by string/IP key, and
dumping the table.
Similar to "show table", a data filter is available during dump, and as
an improvement over "show table" it's possible to use up to 4 filter
expressions instead of just one (with implicit AND clause binding the
expressions). Dumping with/without filters can take a long time for
large tables, and should be used sparingly.
At the end of process_stream(), we wake the task if there's something in
the input buffer, after attempting a recv. However this is wrong, and we
should only do so if we received new data. Just check the CF_READ_PARTIAL
flag.
This is 1.9-specific and should not be backported.
Instead of using si_cs_io_cb() in process_stream() use si_cs_send/si_cs_recv
instead, as si_cs_io_cb() may lead to process_stream being woken up when it
shouldn't be, and thus timeout would never get triggered.
With recent modifications on the buffers API, when a buffer is released (calling
b_free), we replace it by BUF_NULL where the area pointer is NULL. So many
operations, like b_peek, must be avoided on a released or unallocated
buffer. These changes were mainly made in the commit c9fa048 ("MAJOR: buffer:
finalize buffer detachment").
Since this commit, HAProxy can crash during the body parsing of chunked HTTP
messages because there is no check on the channel's buffer in HTTP analyzers
(http_request_forward_body and http_response_forward_body) nor in H1 functions
responsible for parsing chunked content (h1_skip_chunk_crlf & co). If a stream is
woken up after all input data were forwarded, its input channel's buffer is
released (so set to BUF_NULL). In this case, if we resume the parsing of a
chunk, HAProxy crashes.
To fix this issue, we just skip the parsing of chunks if there is no input
data for the corresponding channel. This is only done if the message state
is strictly lower than HTTP_MSG_ENDING.
Tim Düsterhus found using afl-fuzz that some parts of the HPACK decoder
use incorrect bounds checking which does not catch negative values after
a type cast. The first culprit is hpack_valid_idx() which takes a signed
int and is fed with an unsigned one, but a few others are affected as
well due to being designed to work with an uint16_t as in the table
header, thus not being able to detect the high offset bits, though they
are not exposed if hpack_valid_idx() is fixed.
The impact is that the HPACK decoder can be crashed by an out-of-bounds
read. The only work-around without this patch is to disable H2 in the
configuration.
CVE-2018-14645 was assigned to this bug.
This patch addresses all of these issues at once. It must be backported
to 1.8.
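The bug class can be illustrated with a simplified check (the real function
differs):

    /* Simplified illustration: <idx> arrives as an unsigned wire value
     * but is handled as a signed int; after the cast, a huge unsigned
     * value becomes negative and slips past a naive upper-bound check.
     * Rejecting negative values (or using unsigned arithmetic) is the
     * fix.
     */
    static inline int hpack_idx_is_valid(int idx, int entries)
    {
        return idx >= 0 && idx <= entries;   /* "idx >= 0" is the fix */
    }
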
An invalid null-deref warning is emitted because cmsg is not checked,
though it definitely is valid given the test performed 10 lines above,
but the compiler cannot necessarily guess this. Adding a null test to
the problematic condition is enough to get rid of it and cheap enough.
Like for the other checks, the type is being tested just before calling
objt_{server,dns_srvrq}() so let's use the unguarded version instead to
silence the warning.
When building with -Wnull-dereferences, gcc sees some cases where a
pointer is dereferenced after a check may set it to null. While all of
these are already guarded by either a preliminary test or the code's
construction (eg: listeners code being called only on listeners), it
cannot be blamed for not "seeing" this, so better use the unguarded
calls everywhere this happens, particularly after checks. This is a
step towards building with -Wextra.
Theoretically nothing would prevent a front applet from connecting to a
stats socket, and if a "getsock" command were issued, it would cause a
crash. Right now nothing in the code does this, so in its current form
there is no impact.
It may or may not be backported to 1.8.
In h1_headers_to_hdr_list, when an incomplete message is parsed, all updates
must be skipped until the end of the message is found. Then the parsing is
restarted from the beginning. But not all updates were skipped, leading to
invalid rewriting or a segfault.
No backport is needed.
A null pointer assignment was missing after free() in function
pat_ref_reload() which can lead to segfault.
This bug was introduced in commit b5997f7 ("MAJOR: threads/map: Make
acls/maps thread safe").
Must be backported to 1.8.
Just like we used to do in proto_http, we now check that each and every
occurrence of the content-length header field and each of its values are
exactly identical, and we normalize the header to return the last value
of the first header with spaces trimmed.
The transfer-encoding header processing was a bit lenient in this part
because it was made to read messages already validated by haproxy. We
absolutely need to reinstate the strict processing defined in RFC7230
as is currently being done in proto_http.c. That is, transfer-encoding
presence alone is enough to cancel content-length, and must be
terminated by the "chunked" token, except in the response where we
can fall back to the close mode if it's not last.
For this we now use a specific parsing function which updates the
flags and we introduce a new flag H1_MF_XFER_ENC indicating that the
transfer-encoding header is present.
Last, if such a header is found, we delete all content-length header
fields found in the message.
The new function h1_parse_connection_header() is called when facing a
connection header in the generic parser, and it will set up to 3 bits
in h1m->flags indicating if at least one "close", "keep-alive" or "upgrade"
tokens was seen.
This will be needed for the mux to know how to process the Connection
header, and will save it from having to re-parse the request line since
it's captured on the fly.
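A simplified sketch of the token scan (flag values and helper names are
illustrative):

    #include <string.h>
    #include <strings.h>

    #define H1_MF_CONN_CLO 0x0010   /* "close" token seen */
    #define H1_MF_CONN_KAL 0x0020   /* "keep-alive" token seen */
    #define H1_MF_CONN_UPG 0x0040   /* "upgrade" token seen */

    /* scan a Connection header value such as "keep-alive, upgrade" and
     * return the matching flags; simplified tokenizer for illustration
     */
    static unsigned int parse_connection_value(const char *value)
    {
        unsigned int flags = 0;
        char copy[256], *tok, *save;

        strncpy(copy, value, sizeof(copy) - 1);
        copy[sizeof(copy) - 1] = '\0';
        for (tok = strtok_r(copy, ", \t", &save); tok;
             tok = strtok_r(NULL, ", \t", &save)) {
            if (!strcasecmp(tok, "close"))
                flags |= H1_MF_CONN_CLO;
            else if (!strcasecmp(tok, "keep-alive"))
                flags |= H1_MF_CONN_KAL;
            else if (!strcasecmp(tok, "upgrade"))
                flags |= H1_MF_CONN_UPG;
        }
        return flags;
    }
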
While it was possible to consider the status before parsing response
headers, it's wrong to do it for request headers and could lead to
random behaviours due to this status matching other fields instead.
Additionally there is little to no value in doing this for each and
every new header field. It's much better to reset the content-length
at once in the caller when seeing such statuses (which currently is only
the H2 mux).
No backport is needed, this is purely 1.9.
The h2 parser has this specificity that if it cannot send the headers
frame resulting from the headers it just parsed, it needs to drop it
and parse it again later. Since commit 8852850 ("MEDIUM: h1: let the
caller pass the initial parser's state"), when this happens the parser
remains in the data state and the headers are not parsed again next
time, resulting in a parse error. Let's reset the parser on exit there.
No backport is needed.
If we're detaching the conn_stream, and it was subscribed to be waken up
when more data was available to receive, unsubscribe it.
No backport is needed.
Empty both send_list and fctl_list when destroying the h2 context, so that
if we're freeing the stream after, it doesn't try to remove itself from the
now-deleted list.
No backport is needed.
Till now it was very difficult for a mux to know what proxy it was
working for. Let's pass the proxy when the mux is instantiated at
init() time. It's not yet used but the H1 mux will definitely need
it, just like the H2 mux when dealing with backend connections.
The h1 parser used to systematically turn header field names to lower
case because it was designed for H2. Let's add a flag which is off by
default to condition this behaviour so that when using it from an H1
parser it will not affect the message.
The original H1 request parsing code was reintroduced into the generic
H1 parser so that it can be used regardless of the direction. If the
parser is interrupted and restarts, it makes use of the H1_MF_RESP
flag to decide whether to re-parse a request or a response. While
parsing the request, the method is decoded and set into the start line
structure.
The HTTP status is not relevant to the H1 message but to the H2 stream
itself. It used to be placed there by pure convenience but better move
it before it's too hard to remove.
This state was only a delimiter between headers and body but it now
causes more harm than good because it requires someone to change it.
Since the H1 parser knows if we're in DATA or CHUNK_SIZE, simply let
it set the right next state so that h1m->state constantly matches
what is expected afterwards.
While it was not needed in the H2 mux which was reading full H1 messages
from the channel, it is mandatory for the H1 mux reading contents from
outside to be able to restart on a message. The problem is that the
headers are indexed on the fly, and it's not fun to have to store
everything between calls.
The solution here is to complete the first pass doing a partial restart,
and only once the end of message was found, to start over it again at
once, filling entries. This way there is a bounded number of passes on
the contents and no need to store an intermediary result anymore. Later
this principle could even be used to decide to completely drop an output
buffer to save memory.
This will allow the parser to fill some extra fields like the method or
status without having to store them permanently in the HTTP message. At
this point however the parser cannot restart from an interrupted read.
Till now the H1 parser made for H2 used to be lenient on invalid header
field names because they were supposed to be produced by haproxy. Now
instead we'll rely on err_pos to know how to act (ie: -2 == must block).
There's no reason to have the two sides in H1 format since we only use
one at a time (the response at the moment). While completely removing
the request declaration, let's rename the response to "h1m" to clarify
that it's the unique h1 message there.
This is the *parsing* state of an HTTP/1 message. Currently the h1_state
is composite as it's made both of parsing and control (100SENT, BODY,
DONE, TUNNEL, ENDING etc). The purpose here is to have a purely H1 state
that can be used by H1 parsers. For now it's equivalent to h1_state.
Instead of waiting for the connection layer to let us know we can read,
attempt to receive as soon as process_stream() is called, and subscribe
to receive events if we can't receive yet.
Now, except for idle connections, the recv(), send() and wake() methods are
no more; all the lower layers do is wake a tasklet for whoever is waiting
for I/O events.
Change fctl_list and send_list to be lists of struct wait_list, and nuke
send_wait_list, as it's now redundant.
Make the code responsible for shutr/shutw subscribe to those lists.
Instead of having our wake() method called each time a fd event happens,
just subscribe to recv/send events, and get our tasklet called when that
happens. If any recv/send was possible, the equivalent of what h2_wake_cb()
will be done.
Let the connection layer know we're always interested in getting more data,
so that we get scheduled as soon as data is available, instead of relying
on the wake() method.
Make h2_recv() and h2_send() return 1 if data has been received/sent, or 0
otherwise. That way the caller will be able to know if more work may
have to be done.
Remove the recv() method from mux and conn_stream.
The goal is to always receive from the upper layers, instead of waiting
for the connection later. For now, recv() is still called from the wake()
method, but that should change soon.
For struct connection, struct conn_stream, and for the h2 mux, add 2 new
lists, one that handles waiters for recv, and one that handles waiters for
recv and send. That way we can ask to subscribe for either recv or send.
Resetting the polling flags at the end of conn_fd_handler() shouldn't be
needed anymore, and it will create problems when we no longer handle send/recv
from conn_fd_handler() anymore.
Cyril Bonté reported that commit f9cc07c25b broke the build without
threads.
We don't need to initialise tid = 0 in mworker_loop, so we can
remove it completely.
Christopher noticed that the CS_FL_EOS to CS_FL_REOS conversion was
incomplete : when the connection is closed, we mark the streams with EOS
instead of REOS, causing the loss of any possibly pending data. At the
moment it's not an issue since H2 is used only with a client, but with
servers it could be a real problem if servers close the connection right
after sending their response.
This patch should be backported to 1.8.
This patch ensures that a DNS resolution may be launched before
setting a server FQDN via the CLI. In particular, it checks that
a resolvers section was set.
A LEVEL 4 reg testing file is provided.
Thanks to Lukas Tribus for having reported this issue.
Must be backported to 1.8.
This protocol is based on the uxst one, but it uses socketpair and FD
passing instead of connect()/accept().
The "sockpair@" prefix has been implemented for both bind and server
keywords.
When HAProxy wants to connect through a sockpair@, it creates 2 new
sockets using the socketpair() syscall and passes one of them
through the FD specified on the server line.
On the bind side, haproxy will receive the FD and use it as if it
were the FD of an accept() syscall.
This protocol was designed for internal communication within HAProxy
between the master and the workers, but it's possible to use it
externally with a wrapper and pass the FD through environment variables.
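The FD-passing part relies on the standard SCM_RIGHTS mechanism; a minimal
sketch of the sending side:

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    /* send <fd_to_pass> over the already-connected unix socket <sock> as
     * SCM_RIGHTS ancillary data; the receiver can then use the FD as if
     * it had been returned by accept()
     */
    static int send_fd(int sock, int fd_to_pass)
    {
        char byte = 0;
        struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
        union {
            struct cmsghdr hdr;
            char buf[CMSG_SPACE(sizeof(int))];
        } u;
        struct msghdr msg = {
            .msg_iov = &iov, .msg_iovlen = 1,
            .msg_control = u.buf, .msg_controllen = sizeof(u.buf),
        };
        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type  = SCM_RIGHTS;
        cmsg->cmsg_len   = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd_to_pass, sizeof(int));
        return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
    }
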
It's possible to have several protocols per family which is a problem
with the current way the protocols are stored.
This allows registering in HAProxy a new protocol which is not a
protocol in the strict socket definition. It will be used to register a
SOCK_STREAM protocol using socketpair().
In _update_fd(), if the fd wasn't polled, and we don't want it to be polled,
we just returned 0, however, we should return changes instead, or all previous
changes will be lost.
This should be backported to 1.8.
The following functions only deal with header field values and are agnostic
to the HTTP version so they were moved to http.c :
http_header_match2(), find_hdr_value_end(), find_cookie_value_end(),
extract_cookie_value(), parse_qvalue(), http_find_url_param_pos(),
http_find_next_url_param().
Those lacking the "http_" prefix were modified to have it.
There are 3 tables in proto_http which are used exclusively by logs :
hdr_encode_map[], url_encode_map[] and http_encode_map[]. They indicate
what characters are safe to be emitted in logs depending on the part of
the message where they are placed. Let's move this to log.c, as well as
its initialization. It's worth noting that the rfc5424 map was already
initialized there.
These error codes and messages are agnostic to the version, even if
they are represented as HTTP/1.0 messages. Ultimately they will have
to be transformed into internal HTTP messages to be used everywhere.
The HTTP/1.1 100 Continue message was turned to an IST and the local
copy in the Lua code was removed.
This function is purely HTTP once http_txn is put aside. So the original
one was renamed to http_txn_get_path() and it extracts the relevant offsets
from the txn to pass them to http_get_path(). One benefit of the new version
is that it returns the length at the same time, which allowed us to slightly
simplify http_get_path_from_string(): it previously had to look up the end
pointer, which is not needed anymore.
It's a bit painful to have to deal with HTTP semantics for each protocol
version (H1 and H2), and working on the version-agnostic code further
emphasizes the problem.
This patch creates http.h and http.c which are agnostic to the version
in use, and which borrow a few parts from proto_http and from h1. For
example the once thought h1-specific h1_char_classes array is in fact
dictated by RFC7231 and is used to parse HTTP headers. A few changes
were made to a few files which were including proto_http.h while they
only needed http.h.
Certain string definitions pre-dated the introduction of indirect
strings (ist) so some were used to simplify the definition of the known
HTTP methods. The current lookup code saves 2 kB of a heavily used table
and is faster than the previous table based lookup (typ. 14 ns vs 16
before).
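Conceptually the lookup dispatches on the method length first and then
compares bytes (simplified sketch, not the exact code):

    #include <string.h>

    enum http_meth { HTTP_METH_GET, HTTP_METH_PUT, HTTP_METH_POST,
                     HTTP_METH_OTHER };

    /* simplified: dispatch on length first, then a cheap memcmp,
     * instead of walking a large lookup table
     */
    static enum http_meth find_http_meth(const char *str, size_t len)
    {
        switch (len) {
        case 3:
            if (memcmp(str, "GET", 3) == 0) return HTTP_METH_GET;
            if (memcmp(str, "PUT", 3) == 0) return HTTP_METH_PUT;
            break;
        case 4:
            if (memcmp(str, "POST", 4) == 0) return HTTP_METH_POST;
            break;
        }
        return HTTP_METH_OTHER;
    }
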
We need to clean the FDs registered manually in the poller to avoid FD
leaks during a reload of the master.
This patch calls the per-thread deinit function, which closes the thread
waker pipe.
In order to communicate with the workers, the master pipe has been
replaced by a socketpair() per worker.
The goal is to use these sockets as stats sockets and be able to access
them from the master.
When reloading, the master serializes the information of the workers and
puts it in an environment variable. Once the master has been re-executed,
it deserializes that information and is capable of closing the FDs of
the leaving children.
The master now uses a poll loop, which should be initialized even in wait
mode. We need to init some variables if we failed to load the
configuration file.
If haproxy fails to load its configuration, the process is re-executed
without having initialized the poller. So we must not try to deinit the
poller before the exec().
With the new way of handling the signals in the master worker, we are
not staying in a waitpid() loop anymore, which means that we need to catch
SIGCHLD signals to call waitpid().
The problem is that when the master is reloading, this signal is neither
registered nor blocked, so we lose all signals delivered between the restart
and the call to mworker_loop().
This patch blocks SIGCHLD before the reload and ensures
it's not unblocked before the master has registered the SIGCHLD handler.
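The blocking itself is plain sigprocmask() usage, along these lines:

    #include <signal.h>

    /* block SIGCHLD across the re-exec window so that a child exiting
     * before the handler is registered again is not lost; the signal is
     * delivered once it is unblocked
     */
    static void block_sigchld(void)
    {
        sigset_t set;

        sigemptyset(&set);
        sigaddset(&set, SIGCHLD);
        sigprocmask(SIG_BLOCK, &set, NULL);
    }
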
In order to reorganize the code of the master worker, the mworker_wait()
function, which was the main function, was split. It used to handle a
wait() loop, but this is not needed anymore since the code will use
haproxy's poll loop instead.
The function was split into several functions:
- mworker_catch_sigterm() which is a signal handler for SIGTERM and
SIGUSR1 that sends the signals to the workers
- mworker_catch_sigchld() which is the code handling the leaving of a
child
- mworker_catch_sighup() which basically calls the mworker_restart()
function
- mworker_loop() which is the function calling the main poll loop in the
master
Instead of having a separate area for the captured data, we now have a
contiguous block made of the descriptor and the data. At the moment, since
the area is dynamically allocated, we can adjust its size to what is
needed, but the idea is to quickly switch to a pool and an LRU list.
Now upon error we dynamically allocate the snapshot instead of overwriting
it. This way there is no more memory wasted in the proxy to hold the two
error snapshot descriptors. Also an appreciable side effect of this is that
the proxy's lock is only taken during the pointer swap, no longer while
copying the buffer's contents. This saves 480 bytes of memory per proxy.
The proxy's lock is held while filling the error but not while dumping
it, so it's possible to dereference pointers being replaced, typically
server pointers. The risk is very low and unlikely but not nonexistent.
Since "show errors" is rarely used in parallel, let's simply grab the
proxy's lock while dumping. Ideally we should use an R/W lock here but
it will not make any difference.
This patch must be backported to 1.8, but the code is in proto_http.c
there, though mostly similar.
Now that we have a generic error capture function, let's simplify
http_capture_bad_message() to make use of it. At this point the API
is not changed at all, but it could be further simplified.
This function now captures an error regardless of its side and protocol.
The caller must pass a number of elements and may pass a protocol-specific
structure and a callback to display it. Later this function may deal with
more advanced allocation techniques to avoid allocating as many buffers
as proxies.
The HTTP dumps are now configurable in the code : "show errors" now
calls a protocol-specific function to emit the decoded output. For
now only HTTP is implemented.
The output of "show errors" was slightly reordered to split the HTTP part
in a single chunk_appendf() call. The useless buffer total input was
replaced to report the buffer's start offset, which is the offset in the
stream of the first input byte (thus not counting output). Also it was
the opportunity to stop calling the stream "session".
The idea will be to make the error snapshot feature accessible to other
protocols than just HTTP. This patch only introduces an "http_snapshot"
structure and renames a few fields to make things more explicit. The
HTTP part was installed inside a union so that we can easily add more
protocols in the future.
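Schematically (field names are illustrative):

    /* illustrative only: protocol-agnostic snapshot with the
     * protocol-specific details carried in a union
     */
    struct http_snapshot {
        unsigned int sid;       /* stream id of the faulty message */
        /* ... other HTTP-specific fields ... */
    };

    struct error_snapshot {
        /* ... common fields: timestamp, event id, buffer copy ... */
        union {
            struct http_snapshot http;
            /* room for other protocols later */
        } ctx;
    };
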
The snapshots have the ability to restart a partial dump and they use
the stream ID as the restart point. Since it's purely HTTP, let's use
the event ID instead.
Let's use an atomic increment for the error snapshot, as we'd rather
not assign the same ID to two errors happening in parallel. It's very
unlikely that it will ever happen though.
This patch must be backported to 1.8 with the other one it relies on
("MINOR: thread: implement HA_ATOMIC_XADD()").
On the Mailing list, Marcos Moreno reported that haproxy configuration
validation (through "haproxy -c cfgfile") does not detect when a
resolvers section does not exist for a server.
Indeed, this checking is only done after HAProxy has started up.
The problem is that this can create production issues, since an init
script can't detect the problem before starting / reloading HAProxy.
To fix this issue, this patch registers the function which validates the
DNS configuration and runs it right after configuration parsing is
finished (through cfg_register_postparser()).
Thanks to it, now "haproxy -c cfgfile" will fail when a server
points to a non-existing resolvers section (or any other validation made
by the function above).
Backport status: 1.8
Sometimes a connection is prepared before the target is set, sometimes
after. There's no real rule since the few functions involved operate on
different and independent fields. Soon we'll benefit from knowing the
target at the connection layer, in order to figure the associated proxy
and retrieve the various parameters (timeouts etc). This patch slightly
reorders a few calls to conn_prepare() so that we can make sure that the
target is always known to the mux.
Commit 5e74b0b ("MEDIUM: h1: port to new buffer API.") introduced a
minor bug by which a buffer's head could stay shifted by the amount
of removed CRLF if it started with empty lines. This would cause the
second request (or response) not to work until it would receive a few
extra characters. This mostly impacts requests sent by hand, though.
This is purely 1.9, no backport is needed.
The h2 mux currently lacks some basic transparency. Some errors cause the
connection to be aborted but they couldn't be reported. With this patch,
almost all situations where an error will cause a stream or connection to
be aborted without the ability for an existing stream to report it will be
reported in the logs. This at least provides a solution to monitor the
activity and abnormal traffic.
The new function sess_log() only needs a session to emit a log. It will
ignore the parts that depend on the stream. It is usable to emit a log
to report early errors in muxes. These ones will typically mention
"<BADREQ>" for the request and 0 for the HTTP status code.
Till now it was impossible to emit logs from the lower layers only because
a stream was mandatory. From now on it will at least be possible to emit a
log to report a bad request or some timings for example. When the stream
is null, sess_build_logline() will use default values and will extract the
timing information from the session just like stream_new() does, so the
resulting log line is perfectly valid.
The termination state will indicate a proxy error during the request phase
since it is the only realistic use for such a call with no stream.
When s==NULL we don't have any assigned request counter. Ideally we
should proceed exactly like when a stream is initialized and assign
a unique value. For now we only place it into a local variable.