While the SI_ST_DIS state is set *after* doing the close on a connection,
it was set *before* calling release on an applet. Applets have no internal
flags contrary to connections, so they have no way to detect they were
already released. Because of this it happened that applets were closed
twice, once via si_applet_release() and once via si_release_endpoint() at
the end of a transaction. The CLI applet could perform a double free in
this case, though the situation to cause it is quite hard because it
requires that the applet is stuck on output in states that produce very
few data.
In order to solve this, we now assign the SI_ST_DIS state *after* calling
->release, and we refrain from doing so if the state is already assigned.
This makes applets work much more like connections and definitely avoids
this double release.
In the future it might be worth making applets have their own flags like
connections to carry their own state regardless of the stream interface's
state, especially when dealing with connection reuse.
No backport is needed since this issue was caused by the rearchitecture
in 1.6.
It's dangerous to call this on internal state error, because it risks
to perform a double-free. This can only happen when a state is not
handled. Note that the switch/case currently doesn't offer any option
for missed states since they're all declared. Better fix this anyway.
The fix was tested by commenting out some entries in the switch/case.
Due to the code inherited from the early CLI mode with the non-interactive
code, the SHUTR status was only considered while waiting for a request,
which prevents the connection from properly being closed during a dump,
and the connection used to remain established. This issue didn't happen
in 1.5 because while this part was missed, the resynchronization performed
in process_session() would detect the situation and handle the cleanup.
No backport is needed.
If an error happens during a dump on the CLI, an explicit call to
cli_release_handler() is performed. This is not needed anymore since
we introduced ->release() in the applet which is called upon error.
Let's remove this confusing call which can even be risky in some
situations.
If an applet tries to write to a closed connection, it hangs forever.
This results in some "get map" commands on the CLI to leave orphaned
connections alive.
Now the applet wakeup function detects that the applet still wants to
write while the channel is closed for reads, which is the equivalent
to the common "broken pipe" situation. In this case, an error is
reported on the stream interface, just as it happens with connections
trying to perform a send() in a similar situation.
With this fix the stats socket is properly released.
This function is a callback made only for calls from the applet handler.
Rename it to remove confusion. It's currently called from the Lua code
but that's not correct, we should call the notify and update functions
instead otherwise it will not enable the applet again.
This one is not needed anymore as what it used to do is either
completely covered by the new stream_int_notify() function, or undesired
and inherited from the past as a side effect of introducing the
connections.
This update is theorically never called since it's assigned only when
nothing is connected to the stream interface. However a test has been
added to si_update() to stay safe if some foreign code decides to call
si_update() in unsafe situations.
The code to report completion after a connection update or an applet update
was almost the same since applets stole it from the connection. But the
differences made them hard to maintain and prevented the creation of new
functions doing only one part of the work.
This patch replaces the common code from the si_conn_wake_cb() and
si_applet_wake_cb() with a single call to stream_int_notify() which only
notifies the stream (si+channels+task) from the outside.
No functional change was made beyond this.
stream_int_notify() was taken from the common part between si_conn_wake_cb()
and si_applet_done(). It is designed to report activity to a stream from
outside its handler. It'll generally be used by lower layers to report I/O
completion but may also be used by remote streams if the buffer processing
is shared.
The condition to release the SI_FL_WAIT_ROOM flag was abnormally
complicated because it was inherited from 6 years ago before we used
to check for the buffer's emptiness. The CF_READ_PARTIAL flag had to be
removed, and the complex test was replaced with a simpler one checking
if *some* data were moved out or not.
The reason behind this change is to have a condition compatible with
both connections and applets, as applets currently don't work very
well in this area. Specifically, some optimizations on the applet
side cause them not to release the flag above until the buffer is
empty, which may prevent applets from taking together (eg: peers
over large haproxy buffers and small kernel buffers).
Now the call to stream_int_update() is moved to si_update(), which
is exclusively called from the stream, so that the socket layer may
be updated without updating the stream layer. This will later permit
to call it individually from other places (other tasks or applets for
example).
Now that we have a generic stream_int_update() function, we can
replace the equivalent part in stream_int_update_conn() and
stream_int_update_applet() to avoid code duplication.
There is no functional change, as the code is the same but split
in two functions for each call.
This function is designed to be called from within the stream handler to
update the channels' expiration timers and the stream interface's flags
based on the channels' flags. It needs to be called only once after the
channels' flags have settled down, and before they are cleared, though it
doesn't harm to call it as often as desired (it just slightly hurts
performance). It must not be called from outside of the stream handler,
as what it does will be used to compute the stream task's expiration.
The code was taken directly from stream_int_update_applet() and
stream_int_update_conn() which had exactly the same one except for
applet-specific or connection-specific status update.
The purpose is to separate the connection-specific parts so that the
stream-int specific one can be factored out. There's no functional
change here, only code displacement.
If an applet wakes up and causes the next one to sleep, the active list
is corrupted and cannot be scanned anymore, as the process then loops
over the next element.
In order to avoid this problem, we move the active applet list to a run
queue and reinit the active list. Only the first element of this queue
is checked, and if the element is not removed, it is removed and requeued
into the active list.
Since we're using a distinct list, if an applet wants to requeue another
applet into the active list, it properly gets added to the active list
and not to the run queue.
This stops the infinite loop issue that could be caused with Lua applets,
and in any future configuration where two applets could be attached
together.
This is not a real run queue and we're facing ugly bugs because
if this : if a an applet removes another applet from the queue,
typically the next one after itself, the list iterator loops
forever because the list's backup pointer is not valid anymore.
Before creating a run queue, let's rename this list.
The pattern match "found" fails to parse on binary type samples. The
reason is that it presents itself as an integer type which bin cannot
be cast into. We must always accept this match since it validates
anything on input regardless of the type. Let's just relax the parser
to accept it.
This problem might also exist in 1.5.
(cherry picked from commit 91cc2368a73198bddc3e140d267cce4ee08cf20e)
Due to a check between offset+len and buf->size, an empty buffer returns
"will never match". Check against tune.bufsize instead.
(cherry picked from commit 43e4039fd5d208fd9d32157d20de93d3ddf9bc0d)
The current Lua action are not registered. The executed function is
selected according with a function name writed in the HAProxy configuration.
This patch add an action registration function. The configuration mode
described above disappear.
This change make some incompatibilities with existing configuration files for
HAProxy 1.6-dev.
function 'peer_prepare_ackmsg' is designed to use the argument 'msg'
instead of 'trash.str'.
There is currently no bug because the caller passes 'trash.str' in
the 'msg' argument.
Some updates are pushed using an incremental update message after a
re-connection whereas the origin is forgotten by the peer.
These updates are never correctly acknowledged. So they are regularly
re-pushed after an idle timeout and a re-connect.
The fix consists to use an absolute update message in some cases.
If an entry is still not present in the update tree, we could miss to schedule
for a push depending of an un-initialized value (upd.key remains un-initialized
for new sessions or isn't re-initalized for reused ones).
In the same way, if an entry is present in the tree, but its update's tick
is far in the past (> 2^31). We could consider it's still scheduled even if
it is not the case.
The fix consist to force the re-scheduling of an update if it was not present in
the updates tree or if the update is not in the scheduling window of every peers.
PiBANL reported that HAProxy's DNS resolver can't "connect" its socker
on FreeBSD.
Remi Gacogne reported that we should use the function 'get_addr_len' to
get the addr structure size instead of sizeof.
Mailing list participant "mlist" reported negative conn_cur values in
stick tables as the result of "tcp-request connection track-sc". The
reason is that after the stick entry it copied from the session to the
stream, both the session and the stream grab a reference to the entry
and when the stream ends, it decrements one reference and one connection,
then the same is done for the session.
In fact this problem was already encountered slightly differently in the
past and addressed by Thierry using the patch below as it was believed by
then to be only a refcount issue since it was the observable symptom :
827752e "BUG/MEDIUM: stick-tables: refcount error after copying SC..."
In reality the problem is that the stream must touch neither the refcount
nor the connection count for entries it inherits from the session. While
we have no way to tell whether a track entry was inherited from the session
(since they're simply memcpy'd), it is possible to prevent the stream from
touching an entry that already exists in the session because that's a
guarantee that it was inherited from it.
Note that it may be a temporary fix. Maybe in the future when a session
gives birth to multiple streams we'll face a situation where a session may
be updated to add more tracked entries after instanciating some streams.
The correct long-term fix is to mark some tracked entries as shared or
private (or RO/RW). That will allow the session to track more entries
even after the same trackers are being used by early streams.
No backport is needed, this is only caused by the session/stream split in 1.6.
Pradeep Jindal reported and troubleshooted a bug causing haproxy to die
during startup on all processes not making use of a peers section. It
only happens with nbproc > 1 when peers are declared. Technically it's
when the peers task is stopped on processes that don't use it that the
crash occurred (a task_free() called on a NULL task pointer).
This only affects peers v2 in the dev branch, no backport is needed.
Trie device detection doesn't benefit from caching compared to Pattern.
As such the LRU cache has been removed from the Trie method.
A new fetch method has been added named 51d.all which uses all the
available HTTP headers for device device detection. The previous 51d
conv method has been changed to 51d.single where one HTTP header,
typically User-Agent, is used for detection. This method is marginally
faster but less accurate.
Three new properties are available with the Pattern method called
Method, Difference and Rank which provide insight into the validity of
the results returned.
A pool of worksets is used to avoid needing to create a new workset for
every request. The workset pool is thread safe ready to support a future
multi threaded version of HAProxy.
Added the definition of CHECK_HTTP_MESSAGE_FIRST and the declaration of
smp_prefetch_http to the header.
Changed smp_prefetch_http implementation to remove the static qualifier.
This patch uses the start up of the health check task to also start
the warmup task when required.
This is executed only once: when HAProxy has just started up and can
be started only if the load-server-state-from-file feature is enabled
and the server was in the warmup state before a reload occurs.
This directive gives HAProxy the ability to use the either the global
server-state-file directive or a local one using server-state-file-name to
load server states.
The state can be saved right before the reload by the init script, using
the "show servers state" command on the stats socket redirecting output into
a file.
This new global section directive is used to store the path to the file
where HAProxy will be able to retrieve server states across reloads.
The file pointed by this path is used to store a file which can contains
state of all servers from all backends.
This new global directive can be used to provide a base directory where
all the server state files could be loaded.
If a server state file name starts with a slash '/', then this directive
must not be applied.
new command 'show servers state' which dumps all variable parameters
of a server during an HAProxy process life.
Purpose is to dump current server state at current run time in order to
read them right after the reload.
The format of the output is versionned and we support version 1 for now.
When a server is disabled in the configuration using the "disabled"
keyword, a single flag is positionned: SRV_ADMF_CMAINT (use to be
SRV_ADMF_FMAINT)..
That said, when providing the first version of this code, we also
changed the SRV_ADMF_MAINT mask to match any of the possible MAINT
cases: SRV_ADMF_FMAINT, SRV_ADMF_IMAINT, SRV_ADMF_CMAINT
Since SRV_ADMF_CMAINT is never (and is not supposed to be) altered at
run time, once a server has this flag set up, it can never ever be
enabled again using the stats socket.
In order to fix this, we should:
- consider SRV_ADMF_CMAINT as a simple flag to report the state in the
old configuration file (will be used after a reload to deduce the
state of the server in a new running process)
- enabling both SRV_ADMF_CMAINT and SRV_ADMF_FMAINT when the keyword
"disabled" is in use in the configuration
- update the mask SRV_ADMF_MAINT as it was before, to only match
SRV_ADMF_FMAINT and SRV_ADMF_IMAINT.
The following patch perform the changes above.
It allows fixing the regression without breaking the way the up coming
feature (seamless server state accross reloads) is going to work.
Note: this is 1.6-only, no backport needed.
This couple of function executes securely some Lua calls outside of
the lua runtime environment. Each Lua call can return a longjmp
if it encounter a memory error.
Lua documentation extract:
If an error happens outside any protected environment, Lua calls
a panic function (see lua_atpanic) and then calls abort, thus
exiting the host application. Your panic function can avoid this
exit by never returning (e.g., doing a long jump to your own
recovery point outside Lua).
The panic function runs as if it were a message handler (see
§2.3); in particular, the error message is at the top of the
stack. However, there is no guarantee about stack space. To push
anything on the stack, the panic function must first check the
available space (see §4.2).
We must check all the Lua entry point. This includes:
- The include/proto/hlua.h exported functions
- the task wrapper function
- The action wrapper function
- The converters wrapper function
- The sample-fetch wrapper functions
It is tolerated that the initilisation function returns an abort.
Before each Lua abort, an error message is writed on stderr.
The macro SET_SAFE_LJMP initialise the longjmp. The Macro
RESET_SAFE_LJMP reset the longjmp. These function must be macro
because they must be exists in the program stack when the longjmp
is called
All the code which emits error log have the same pattern. Its:
Send log with syslog system, and if it is allowed, display error
log on screen.
This patch replace this pattern by a macro. This reduces the number
of lines.