For now, the lua scripts are not compatible with the new HTX internal
representation of HTTP messages. Thus, for a given proxy, when the option
"http-use-htx" is enabled, an error is triggered if any lua's
action/service/sample-fetch/converter is also configured.
Remaining calls to si_cant_put() were all for lack of room and were
turned to si_rx_room_blk(). A few places where SI_FL_RXBLK_ROOM was
cleared by hand were converted to si_rx_room_rdy().
The now unused si_cant_put() function was removed.
A number of calls to si_cant_put() were used in fact to request being
called back once a buffer is available. These ones are not needed anymore
since si_alloc_ibuf() already sets the SI_FL_RXBLK_BUFF flag when called
in appctx context. Those called with a foreign stream-int are simply turned
to si_rx_buff_blk().
A number of si_cant_put() calls were still present to in fact indicate
that the end point is ready (thus should be turned to si_rx_endp_more()).
One other call in the Lua handler indicates that the endpoint wanted to
be blocked until some room is made in the Rx buffer in order to detect
that the connection happened, which is in fact an indication that it
wants to be called once the endpoint is ready, this is the default case
for an applet so this call was removed.
A useless call to si_cant_put() before appctx_wakeup() in the Lua
applet wakeup call was removed as well since the first thing that will
be done there will be to set end ENDP blocking flag.
It doesn't make sense to limit this code to applets, as any stream
interface can use it. Let's rename it by simply dropping the "applet_"
part of the name. No other change was made except updating the comments.
Well that's only 3 places (applet.c, stream_interface.c, hlua.c). This
ensures we always clear SI_FL_WAIT_ROOM before setting it on failure,
so that it is granted that SI_FL_WAIT_ROOM always indicates a lack of
room for doing an operation, including the inability to allocate a
buffer for this.
These ones are on error paths that are properly handled by luaL_error()
which does a longjmp() but the compiler cannot know it. By adding an
__unreachable() statement in WILL_LJMP(), there is no ambiguity anymore.
This may be backported to 1.8 but these previous patches are needed first :
- BUILD: compiler: add a new statement "__unreachable()"
- MINOR: lua: all functions calling lua_yieldk() may return
- BUILD: lua: silence some compiler warnings about potential null derefs (#2)
There was a mistake when tagging functions which always use longjmp and
those which may use it in that all those supposed to call lua_yieldk()
may return without calling longjmp. Thus they must not use WILL_LJMP()
but MAY_LJMP(). It has zero impact on the code emitted as such, but
prevents other fixes from being properly implemented : this was the
cause of the previous failure with the __unreachable() calls.
This may be backported to older versions. It may or may not apply
well depending on the context, though the change simply consists in
replacing "WILL_LJMP(hlua_yieldk" with "MAY_LJMP(hlua_yieldk", and
same with the single call to lua_yieldk() in hlua_yieldk().
Here we make sure that appctx is always taken from the unchecked value
since we know it's an appctx, which explains why it's immediately
dereferenced. A missing test was added to ensure that task_new() does
not return a NULL.
This may be backported to 1.8.
This reverts commit f1ffb39b61.
It breaks Lua causing some timeouts. Removing the __unreachable() statement
from WILL_LJMP() fixes it. It's very strange and unclear whether it's an
issue with WILL_LJMP() not fullfilling its promise of not returning, if
the code emitted with __unreachable() gets broken, or anything else. Let's
revert this for now.
These ones are on error paths that are properly handled by luaL_error()
which does a longjmp() but the compiler cannot know it. By adding an
__unreachable() statement in WILL_LJMP(), there is no ambiguity anymore.
This may be backported to 1.8 but the previous patch (BUILD: compiler:
add a new statement "__unreachable()") is needed for this.
These ones are mostly called from cfgparse.c for the parsing and do
not depend on the HTTP representation. The functions's prototypes
were moved to proto/http_rules.h, making this file work exactly like
tcp_rules. Ideally we should stop calling these functions directly
from cfgparse and register keywords, but there are a few cases where
that wouldn't work (stats http-request) so it's probably not worth
trying to go this far.
The current proto_http.c file is huge and contains different processing
domains making it very difficult to work on an alternative representation.
This commit moves some parts to other files :
- ACL registration code => http_acl.c
This code only creates some ACL mappings and doesn't know anything
about HTTP nor about the representation. This code could even have
moved to acl.c but it was not worth polluting it again.
- HTTP sample conversion => http_conv.c
This code doesn't depend on the internal representation but definitely
manipulates some HTTP elements, such as dates. It also has access to
captures.
- HTTP sample fetching => http_fetch.c
This code does depend entirely on the internal representation but is
totally independent on the analysers. Placing it into a different
file will ease the transition to the new representation and the
creation of a wrapper if required. An include file was created due
to CHECK_HTTP_MESSAGE_FIRST() being used at various places.
- HTTP action registration => http_act.c
This code doesn't directly interact with the messages nor the
transaction but it does so via some exported http functions like
http_replace_req_line() or http_set_status() so it will be easier
to change only this after the conversion.
- a few very generic parts were found and moved to http.{c,h} as
relevant.
It is worth noting that the functions moved to these new files are not
referenced anywhere outside of the files and are only called as registered
callbacks, so these files do not even require associated include files.
These error codes and messages are agnostic to the version, even if
they are represented as HTTP/1.0 messages. Ultimately they will have
to be transformed into internal HTTP messages to be used everywhere.
The HTTP/1.1 100 Continue message was turned to an IST and the local
copy in the Lua code was removed.
This function is purely HTTP once http_txn is put aside. So the original
one was renamed to http_txn_get_path() and it extracts the relevant offsets
from the txn to pass them to http_get_path(). One benefit of the new version
is that it returns the length at the same time so that allowed to slightly
simplify http_get_path_from_string() which had to look up the end pointer
previously and which is not needed anymore.
If SET_SAFE_LJMP returns 0, the spinlock is already unlocked, and lua_atpanic
is already set back to hlua_panic_safe, so there's no need to call
RESET_SAFE_LJMP.
This should be MFC'd into 1.8.
When calling ->prepare_srv() callback for SSL server which
depends on global "nbthread" value, this latter was not already parsed,
so equal to 1 default value. This lead to bad memory accesses.
Thank you to Pieter (PiBa-NL) for having reported this issue and
for having provided a very helpful reg testing file to reproduce
this issue (reg-test/lua/b00002.*).
Must be backported to 1.8.
In hlua_applet_tcp_fct(), drain the output buffer when the applet is done
running, every time we're called.
Overwise, there's a race condition, and the output buffer could be filled
after the applet ran, and as it is never cleared, the stream interface
will never be destroyed.
This should be backported to 1.8 and 1.7.
HTTP LUA applet callback should not update the date on which the HTTP client requests
arrive. This was done just after the LUA applet has completed its job.
This patch simply removes the affected statement. The same fixe has been applied
to TCP LUA applet callback.
To reproduce this issue, as reported by Patrick Hemmer, implement an HTTP LUA applet
which sleeps a bit before replying:
core.register_service("foo", "http", function(applet)
core.msleep(100)
applet:set_status(200)
applet:start_response()
end)
This had as a consequence to log %TR field with approximatively the same value as
the LUA sleep time.
Thank you to Patrick Hemmer for having reported this issue.
Must be backported to 1.8, 1.7 and 1.6.
Since commit #56cc12509, haproxy accepts double values for timeouts. The
value is then converted to milliseconds before being rounded up and cast
to int. The issue is that to round up the value, a constant value of 0.5
is added to it, but too early in the conversion, resulting in an
additional 500ms to the value. We are talking about a precision of 1ms,
so we can safely get rid of this rounding trick and adjust resulting
timeouts equal to 0 to a minimum of 1ms.
This patch is specific to the 1.9 branch and doesn't require to be
backported.
Sachin Shetty reported that socket timeouts set in LUA code have no effect.
Indeed, connect timeout is never modified and is always set to its default,
set to 5 seconds. Currently, this patch will apply the specified timeout
value to the connect timeout.
For the read and write timeouts, the issue is that the timeout is updated but
the expiration dates were not updated.
This patch should be backported up to the 1.6 branch.
This adds the set-priority-class and set-priority-offset actions to
http-request and tcp-request content. At this point they are not used
yet, which is the purpose of the next commit, but all the logic to
set and clear the values is there.
We'll need trees to manage the queues by priorities. This change replaces
the list with a tree based on a single key. It's effectively a list but
allows us to get rid of the list management right now.
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.
Most of the changes were made with spatch using this coccinelle script :
@rule_d1@
typedef chunk;
struct chunk chunk;
@@
- chunk.str
+ chunk.area
@rule_d2@
typedef chunk;
struct chunk chunk;
@@
- chunk.len
+ chunk.data
@rule_i1@
typedef chunk;
struct chunk *chunk;
@@
- chunk->str
+ chunk->area
@rule_i2@
typedef chunk;
struct chunk *chunk;
@@
- chunk->len
+ chunk->data
Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
Now the buffers only contain the header and a pointer to the storage
area which can be anywhere. This will significantly simplify buffer
swapping and will make it possible to map chunks on buffers as well.
The buf_empty variable was removed, as now it's enough to have size==0
and area==NULL to designate the empty buffer (thus a non-allocated head
is the empty buffer by default). buf_wanted for now is indicated by
size==0 and area==(void *)1.
The channels and the checks now embed the buffer's head, and the only
pointer is to the storage area. This slightly increases the unallocated
buffer size (3 extra ints for the empty buffer) but considerably
simplifies dynamic buffer management. It will also later permit to
detach unused checks.
The way the struct buffer is arranged has proven quite efficient on a
number of tests, which makes sense given that size is always accessed
and often first, followed by the othe ones.
This one is more generic and designed to work on a random block. It
may later get a b_rep_ist() variant since many strings are already
available as (ptr,len).
There's no distinction between in and out data now. The latter covers
the needs of the former and supports wrapping. The extra cost is
negligible given the locations where it's used.
This replaces chn->buf->p with ci_head(chn), chn->buf->o with co_data(chn)
and chn->buf->i with ci_data(chn). This is in order to help porting to the
new buffer API.
We used to have variations around buffer_total_space() and
size-buffer_len() or size-b_data(). Let's simplify all this. buffer_len()
was also removed as not used anymore.
Now that there are no more users requiring to modify the buffer anymore,
switch these ones to const char and const buffer. This will make it more
obvious next time send functions are tempted to modify the buffer's output
count. Minor adaptations were necessary at a few call places which were
using char due to the function's previous prototype.
These ones manipulate the output data count which will be specific to
the channel soon, so prepare the call points to use the channel only.
The b_* functions are now unused and were removed.
The few call places where it's used can use the trash as a swap buffer,
which is made for this exact purpose. This way we can rely on the
generic b_slow_realign() call.
Where relevant, the channel version is used instead. The buffer version
was ported to be more generic and now takes a swap buffer and the output
byte count to know where to set the alignment point. The H2 mux still
uses buffer_slow_realign() with buf->o but it will change later.
The Lua parser doesn't takes in account end-of-headers containing
only '\n'. It expects always '\r\n'. If a '\n' is processes the Lua
parser considers it miss 1 byte, and wait indefinitely for new data.
When the client reaches their timeout, it closes the connection.
This close is not detected and the connection keep in CLOSE-WAIT
state.
I guess that this patch fix only a visible part of the problem.
If the Lua HTTP parser wait for data, the timeout server or the
connectio closed by the client may stop the applet.
How reproduce the problem:
HAProxy conf:
global
lua-load bug38.lua
frontend frt
timeout client 2s
timeout server 2s
mode http
bind *:8080
http-request use-service lua.donothing
Lua conf
core.register_service("donothing", "http", function(applet) end)
Client request:
echo -ne 'GET / HTTP/1.1\n\n' | nc 127.0.0.1 8080
Look for CLOSE-WAIT in the connection with "netstat" or "ss". I
use this script:
while sleep 1; do ss | grep CLOSE-WAIT; done
This patch must be backported in 1.6, 1.7 and 1.8
Workaround: enable the "hard-stop-after" directive, and perform
periodic reload.
Patrick reported that this simple configuration made haproxy segfaults:
global
lua-load /tmp/haproxy.lua
frontend f1
mode http
bind :8000
default_backend b1
http-request lua.foo
backend b1
mode http
server s1 127.0.0.1:8080
with this '/tmp/haproxy.lua' script:
core.register_action("foo", { "http-req" }, function(txn)
txn.sc:ipmask(txn.f:src(), 24, 112)
end)
This is due to missing initialization of the array of arguments
passed to hlua_lua2arg_check() which makes it enter code with
corrupted arguments.
Thanks a lot to Patrick Hemmer for having reported this issue.
Must be backported to 1.8, 1.7 and 1.6.
When an unrecoverable error raises, the user receive poor information
for the trouble shooting. For example:
[ALERT] 157/143755 (21212) : Lua function 'hello-world': runtime error: memory allocation error: block too big.
Unfortunately, the memory allocation error can be throwed by many
function, and we have no informatio to reach the original cause.
This patch add the list of function called from the entry point to
the function in error, like this:
[ALERT] 157/143755 (21212) : Lua function 'hello-world': runtime error: memory allocation error: block too big from [C] method 'req_get_headers', bug35.lua:2 global 'ee', bug35.lua:6 global 'ff', bug35.lua:10 C function line 9.
The buffer pointer is already updated. It is again updated
when it is given to the function ci_putblk().
This patch must be backported in 1.6, 1.7 and 1.8
When we write data, we risk to encounter a dead-loack. The
function "stream_int_notify()" cannot be called the the
cosocket because the caller acquire a lock and when the socket
is closed, the cleanup function try to acquire the same lock.,
so a dead-lock raises.
In other way, the function stream_int_update_applet() can't
be called because it schedumes the applet only if some activity
in the buffers were detected. It is not always the case. We
replace this function by appctx_wakeup() which wake up the
applet inconditionnaly.
The last part of the fix is setting right signals. the applet
call the stream_int_update() function if the output buffer si
not empty, and ask for put data if some rite signals are
registered.
This patch must be backported in 1.6, 1.7 and 1.8. Note that it requires
patch "MINOR: task/notification: Is notifications registered" to be
applied.
Each time the send function yields, a notification must be registered.
Without this notification, the task is never wakeup when data arrives.
Today, the notification is registered only if the buffer is not available.
Other cases like the buffer is too small for all data are not processed.
This patch must be backported in 1.6, 1.7 and 1.8
In some cases, when we are waiting for data and the socket
timeout expires, we have a dead lock. The Lua socket locks
the applet socket, and call for a notify. The notify
immediately executes code and try to acquire the same lock,
so ... dead lock.
stream_int_notify() cant be used because it wakeup the applet
task only if the stream have changes. The changes are forces
by Lua, but not repported on the stream.
stream_int_update_applet() cant be used because the deadlock.
So, I inconditionnaly wakeup the applet. This wake is performed
asynchronously, and will call a stream_int_notify().
This patch must be backported in 1.6, 1.7 and 1.8
The appctx pointer is given from any variable which are wrong.
This implies the wakeup of wrong applet, and the socket are no
longer responsive.
This behavior is hidden by another inherited error which is
fixed in the next patch.
This patch remove all wrong appctx affectations.
This patch must be backported in 1.6, 1.7 and 1.8
In preparation for thread-specific runqueues, change the task API so that
the callback takes 3 arguments, the task itself, the context, and the state,
those were retrieved from the task before. This will allow these elements to
change atomically in the scheduler while the application uses the copied
value, and even to have NULL tasks later.
The limit of data read works only if all the data is in the
input buffer. Otherwise (if the data arrive in chunks), the
total amount of data is not taken in acount.
Only the current read data are compared to the expected amout
of data.
This patch must be backported from 1.9 to 1.6
The function hlua_ctx_resume return less text message and more error
code. These error code allow the caller to return appropriate
message to the user.
Since commit 36d1374 ("BUG/MINOR: lua: Fix SSL initialisation") in 1.6, the
Lua code always initializes an SSL server. It caused a small visible side
effect which is that by calling ssl_sock_prepare_srv_ctx(), it forces
global.ssl_used_backend to 1 and makes the initialization code believe that
there are some SSL servers in certain backends. This detection is used to
figure how to set the global maxconn value when only the memory usage is
limited. As such, even a configuration with no SSL at all will have a very
conservative maxconn.
The configuration below exhibits this :
global
ssl-server-verify none
stats socket /tmp/sock1 mode 666 level admin
tune.bufsize 16384
listen px
timeout client 5s
timeout server 5s
timeout connect 5s
bind :4445
#bind :4443 ssl crt rsa+dh2048.pem
#server s1 127.0.0.1:8003 ssl
Starting it with "-m 200" to limit it to 200 MB of RAM reports 1500 for
Maxconn, the same when uncommenting the "server" line, and 1300 when
uncommenting the "bind" line, regardless of the "server" line's status.
In practice it doesn't make sense to consider that Lua's server template
counts for one regular SSL server, because even if used for SSL, it will
not take large connection counts, compared to a backend relaying traffic.
Thus the solution consists in resetting the ssl_used_backend to its
previous value after creating the server_ctx from the Lua code. With the
fix, the same config with the same parameters now show :
- maxconn=5700 when neither side uses SSL
- maxconn=1500 when only one side uses SSL
- maxconn=1300 when both sides use SSL
This fix can be backported to versions 1.6 and beyond.
Function `hlua_socket_close` expected exactly one argument on the Lua stack.
But when `hlua_socket_close` was called from `hlua_socket_write_yield`,
Lua stack had 3 arguments. So `hlua_socket_close` threw the exception with
message "'close' needs 1 arguments".
Introduced new helper function `hlua_socket_close_helper`, which removed the
Lua stack argument count check and only checked if the first argument was
a socket.
This fix should be backported to 1.8, 1.7 and 1.6.
The parameters like server-address, port and timeout should be set before
process_stream task is called to avoid the stream being 'closed' before it
got initialized properly. This is most clearly visible when running with
tune.lua.forced-yield=1.. So scheduling the task should not be done when
creating the lua socket, but when connect is called. The error
"socket: not yet initialised, you can't set timeouts." would then appear.
Below code for example also shows this issue, as the sleep will
yield the lua code:
local con = core.tcp()
core.sleep(1)
con:settimeout(10)
If a lua socket is waiting for data it currently spins at 100% cpu usage.
This because the TICK_ETERNITY returned by the socket is ignored when
setting the 'expire' time of the task.
Fixed by removing the check for yields that return TICK_ETERNITY.
This should be backported to at least 1.8.
PiBa-NL reported a bug with tasks registered in lua when HAProxy is started with
serveral threads. These tasks have not specific affinity with threads so they
can be woken up on any threads. So, it is impossbile for these tasks to handled
cosockets or applets, because cosockets and applets are sticky on the thread
which created them. It is forbbiden to manipulate a cosocket from another
thread.
So to fix the bug, tasks registered in lua are now sticky to the current
thread. Because these tasks can be registered before threads creation, the
affinity is set the first time a lua's task is processed.
This patch must be backported in HAProxy 1.8.
In order to use arbitrary data in the CLI (multiple lines or group of words
that must be considered as a whole, for example), it is now possible to add a
payload to the commands. To do so, the first line needs to end with a special
pattern: <<\n. Everything that follows will be left untouched by the CLI parser
and will be passed to the commands parsers.
Per-command support will need to be added to take advantage of this
feature.
Signed-off-by: Aurlien Nephtali <aurelien.nephtali@corp.ovh.com>
PiBa-NL reported that haproxy crashes with a segmentation fault
if a function registered using `core.register_task` returns.
An example Lua script that reproduces the bug is:
mytask = function()
core.Info("Stopping task")
end
core.register_task(mytask)
The Valgrind output is as follows:
==6759== Process terminating with default action of signal 11 (SIGSEGV)
==6759== Access not within mapped region at address 0x20
==6759== at 0x5B60AA9: lua_sethook (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==6759== by 0x430264: hlua_ctx_resume (hlua.c:1009)
==6759== by 0x43BB68: hlua_process_task (hlua.c:5525)
==6759== by 0x4FED0A: process_runnable_tasks (task.c:231)
==6759== by 0x4B2256: run_poll_loop (haproxy.c:2397)
==6759== by 0x4B2256: run_thread_poll_loop (haproxy.c:2459)
==6759== by 0x41A7E4: main (haproxy.c:3049)
Add the missing `task = NULL` for the `HLUA_E_OK` case. The error cases
have been fixed as of 253e53e661 which
first was included in haproxy v1.8-dev3. This bugfix should be backported
to haproxy 1.8.
Instead of hlua_socket_settimeout() accepting only integers, allow user
to specify float and double as well. Convert to milliseconds much like
cli_parse_set_timeout but also sanity check the value.
http://w3.impa.br/~diego/software/luasocket/tcp.html#settimeout
T. Fournier edit:
The main goal is to keep compatibility with the LuaSocket API. This
API only accept seconds, so using a float to specify milliseconds is
an acceptable way.
Update doc.
Negatives timeouts doesn't have sense. A negative timeout doesn't cause
a crash, but the connection expires before the system try to extablish it.
This patch should be backported in all versions from 1.6
The output of these function indicates that one element is pushed in
the stack, but no element is set in the stack. Actually, if anyone
read the value returned by this function, is gets "something"
present in the stack.
This patch is a complement of these one: 119a5f10e4
The LuaSocket documentation tell anything about the returned value,
but the effective code set an integer of value one.
316a9455b9/src/timeout.c (L172)
Thanks to Tim for the bug report.
This patch should be backported in all version from 1.6
The `socket.tcp.settimeout` method of Lua returns `1` in all cases,
while the `Socket.settimeout` method of haproxy returns `0` in all
cases. This breaks the `socket.http` module, because it validates
the return value of `settimeout`.
This bug was introduced in commit 7e7ac32dad
(which is the very first commit adding the Socket class to Lua). This
bugfix should be backported to every branch containing that commit:
- 1.6
- 1.7
- 1.8
A test case for this bug is as follows:
The 'Test' response header will contain an HTTP status code with the
patch applied and will be zero (nil) without the patch applied.
http.lua:
http = require("socket.http")
core.register_action("bug", { "http-req" }, function(txn)
local b, c, h = http.request {
url = "http://93.184.216.34",
headers = {
Host = "example.com"
},
create = core.tcp,
redirect = false
}
txn:set_var("txn.foo", c)
end)
haproxy.cfg:
global
lua-load /scratch/haproxy/http.lua
frontend fe
bind 127.0.0.1:8080
http-request lua.bug
http-response set-header Test %[var(txn.foo)]
default_backend be
backend be
server s example.com:80
The `socket.tcp.connect` method of Lua requires at least two parameters:
The host and the port. The `Socket.connect` method of haproxy requires
only one when a host with a combined port is provided. This stems from
the fact that `str2sa_range` is used internally in `hlua_socket_connect`.
This very fact unfortunately causes a diversion in the behaviour of
Lua's socket class and haproxy's for IPv6 addresses:
sock:connect("::1", "80")
works fine with Lua, but fails with:
connect: cannot parse destination address '::1'
in haproxy, because `str2sa_range` parses the trailing `:1` as the port.
This patch forcefully adds a `:` to the end of the address iff a port
number greater than `0` is given as the second parameter.
Technically this breaks backwards compatibility, because the docs state:
> The syntax "127.0.0.1:1234" is valid. in this case, the
> parameter *port* is ignored.
But: The connect() call can only succeed if the second parameter is left
out (which causes no breakage) or if the second parameter is an integer
or a numeric string.
It seems unlikely that someone would provide an address with a port number
and would also provide a second parameter containing a number other than
zero. Thus I feel this breakage is warranted to fix the mismatch between
haproxy's socket class and Lua's one.
This commit should be backported to haproxy 1.8 only, because of the
possible breakage of existing Lua scripts.
The default value of the pattern in `Socket.receive` is `*l` according
to the documentation and in the `socket.tcp.receive` method of Lua.
The default value of `wanted` in `int hlua_socket_receive(struct lua_State *)`
reflects this requirement, but the function fails to ensure this
nonetheless:
If no parameter is given the top of the Lua stack will have the index 1.
`lua_pushinteger(L, wanted);` then pushes the default value onto the stack
(with index 2).
The following `lua_replace(L, 2);` then pops the top index (2) and tries to
replace the index 2 with it.
I am not sure why exactly that happens (possibly, because one cannot replace
non-existent stack indicies), but this causes the stack index to be lost.
`hlua_socket_receive_yield` then tries to read the stack index 2, to
determine what to read and get the value `0`, instead of the correct
HLSR_READ_LINE, thus taking the wrong branch.
Fix this by ensuring that the top of the stack is not replaced by itself.
This bug was introduced in commit 7e7ac32dad
(which is the very first commit adding the Socket class to Lua). This
bugfix should be backported to every branch containing that commit:
- 1.6
- 1.7
- 1.8
A test case for this bug is as follows:
The 'Test' response header will contain an HTTP status line with the
patch applied and will be empty without the patch applied. Replacing
the `sock:receive()` with `sock:receive("*l")` will cause the status
line to appear with and without the patch
http.lua:
core.register_action("bug", { "http-req" }, function(txn)
local sock = core.tcp()
sock:settimeout(60)
sock:connect("127.0.0.1:80")
sock:send("GET / HTTP/1.0\r\n\r\n")
response = sock:receive()
sock:close()
txn:set_var("txn.foo", response)
end)
haproxy.cfg (bits omitted for brevity):
global
lua-load /scratch/haproxy/http.lua
frontend fe
bind 127.0.0.1:8080
http-request lua.bug
http-response set-header Test %[var(txn.foo)]
default_backend be
backend be
server s 127.0.0.1:80
When using an incorrect 'mode' as 2nd argument of core.register_service(),
HAProxy crashes while displaying the error message.
To be backported to 1.8, 1.7 and 1.6.
The thread patches adds refcount for notifications. The notifications are
used with the Lua cosocket. These refcount free the notifications when
the session is cleared. In the Lua task case, it not have sessions, so
the nofications are never cleraed.
This patch adds a garbage collector for signals. The garbage collector
just clean the notifications for which the end point is disconnected.
This patch should be backported in 1.8
During the migration to the second version of the pools, the new
functions and pool pointers were all called "pool_something2()" and
"pool2_something". Now there's no more pool v1 code and it's a real
pain to still have to deal with this. Let's clean this up now by
removing the "2" everywhere, and by renaming the pool heads
"pool_head_something".
This macro should be used to declare variables or struct members depending on
the USE_THREAD compile option. It avoids the encapsulation of such declarations
between #ifdef/#endif. It is used to declare all lock variables.
All the references to connections in the data path from streams and
stream_interfaces were changed to use conn_streams. Most functions named
"something_conn" were renamed to "something_cs" for this. Sometimes the
connection still is what matters (eg during a connection establishment)
and were not always renamed. The change is significant and minimal at the
same time, and was quite thoroughly tested now. As of this patch, all
accesses to the connection from upper layers go through the pass-through
mux.
We have two y for nsuring that the data is not concurently manipulated:
- locks
- running task on the same thread.
locks are expensives, it is better to avoid it.
This patch cecks that the Lua task run on the same thread that
the stream associated to the coprocess.
TODO: in a next version, the error should be replaced by a yield
and thread migration request.
The applet manipulates the session and its buffers. We have two methods for
ensuring that the memory of the session will not change during its manipulation
by the task:
1 - adding mutex
2 - running on the same threads than the task.
The second point is smart because it cannot lock the execution of another thread.
Note that the Lua processing is not really thread safe. It provides
heavy system which consists to add our own lock function in the Lua
code and recompile the library. This system will probably not accepted
by maintainers of various distribs.
Our main excution point of the Lua is the function lua_resume(). A
quick looking on the Lua sources displays a lua_lock() a the start
of function and a lua_unlock() at the end of the function. So I
conclude that the Lua thread safe mode just perform a mutex around
all execution. So I prefer to do this in the HAProxy code, it will be
easier for distro maintainers.
Note that the HAProxy lua functions rounded by the macro SET_SAFE_LJMP
and RESET_SAFE_LJMP manipulates the Lua stack, so it will be careful
to set mutex around these functions.
The jmpbuf contains pointer on the stack memory address currently use
when the jmpbuf is set. So the information is local to each thread.
The struct field is too big to put it in the stack, but it is used
as buffer for retriving stats values. So, this buffer si local to each
threads. Each function using this buffer, use it whithout break (yield)
so, the consistency of local buffer is ensured.
A global lock has been added to protect accesses to the list of active
applets. A process mask has also been added on each applet. Like for FDs and
tasks, it is used to know which threads are allowed to process an
applet. Because applets are, most of time, linked to a session, it should be
sticky on the same thread. But in all cases, it is the responsibility of the
applet handler to lock what have to be protected in the applet context.
For now, we have a list of each type per thread. So there is no need to lock
them. This is the easiest solution for now, but not the best one because there
is no sharing between threads. An idle connection on a thread will not be able
be used by a stream on another thread. So it could be a good idea to rework this
patch later.
2 global locks have been added to protect, respectively, the run queue and the
wait queue. And a process mask has been added on each task. Like for FDs, this
mask is used to know which threads are allowed to process a task.
For many tasks, all threads are granted. And this must be your first intension
when you create a new task, else you have a good reason to make a task sticky on
some threads. This is then the responsibility to the process callback to lock
what have to be locked in the task context.
Nevertheless, all tasks linked to a session must be sticky on the thread
creating the session. It is important that I/O handlers processing session FDs
and these tasks run on the same thread to avoid conflicts.
For HTTP/2 we'll need some buffer-only equivalent functions to some of
the ones applying to channels and still squatting the bi_* / bo_*
namespace. Since these names have kept being misleading for quite some
time now and are really getting annoying, it's time to rename them. This
commit will use "ci/co" as the prefix (for "channel in", "channel out")
instead of "bi/bo". The following ones were renamed :
bi_getblk_nc, bi_getline_nc, bi_putblk, bi_putchr,
bo_getblk, bo_getblk_nc, bo_getline, bo_getline_nc, bo_inject,
bi_putchk, bi_putstr, bo_getchr, bo_skip, bi_swpbuf
Since commit 'MAJOR: task: task scheduler rework'
0194897e54. LUA's
scheduling tasks are freezing.
A running task should not handle the scheduling itself
but let the task scheduler to handle it based on the
'expire' field.
[wt: no backport needed]
There are several places where we see feconn++, feconn--, totalconn++ and
an increment on the frontend's number of connections and connection rate.
This is done exactly once per session in each direction, so better take
care of this counter in the session and simplify the callers. At least it
ensures a better symmetry. It also ensures consistency as till now the
lua/spoe/peers frontend didn't have these counters properly set, which can
be useful at least for troubleshooting.
Each user of a session increments/decrements the jobs variable at its
own place, resulting in a real mess and inconsistencies between them.
Let's have session_new() increment jobs and session_free() decrement
it.
These notification management function and structs are generic and
it will be better to move in common parts.
The notification management functions and structs have names
containing some "lua" references because it was written for
the Lua. This patch removes also these references.
When we try to access to other proxy context, we must check
its existence because haproxy can kill it between the creation
and the usage.
This patch should be backported in 1.6 and 1.7
The server state and weight was reworked to handle
"pending" values updated by checks/CLI/LUA/agent.
These values are commited to be propagated to the
LB stack.
In further dev related to multi-thread, the commit
will be handled into a sync point.
Pending values are named using the prefix 'next_'
Current values used by the LB stack are named 'cur_'
Currently a task is allocated in session_new() and serves two purposes :
- either the handshake is complete and it is offered to the stream via
the second arg of stream_new()
- or the handshake is not complete and it's diverted to be used as a
timeout handler for the embryonic session and repurposed once we land
into conn_complete_session()
Furthermore, the task's process() function was taken from the listener's
handler in conn_complete_session() prior to being replaced by a call to
stream_new(). This will become a serious mess with the mux.
Since it's impossible to have a stream without a task, this patch removes
the second arg from stream_new() and make this function allocate its own
task. In session_accept_fd(), we now only allocate the task if needed for
the embryonic session and delete it later.
Till now connections used to rely exclusively on file descriptors. It
was planned in the past that alternative solutions would be implemented,
leading to member "union t" presenting sock.fd only for now.
With QUIC, the connection will need to continue to exist but will not
rely on a file descriptor but a connection ID.
So this patch introduces a "connection handle" which is either a file
descriptor or a connection ID, to replace the existing "union t". We've
now removed the intermediate "struct sock" which was never used. There
is no functional change at all, though the struct connection was inflated
by 32 bits on 64-bit platforms due to alignment.
Haproxy doesn't need this anymore, we're wasting cycles checking for
a Connection header in order to add "Connection: close" only in the
1.1 case so that haproxy sees it and removes it. All tests were run
in 1.0 and 1.1, with/without the request header, and in the various
keep-alive/close modes, with/without compression, and everything works
fine. It's worth noting that this header was inherited from the stats
applet and that the same cleanup probably ought to be done there as
well.
In the HTTP applet, we have to parse the response headers provided by
the application and to produce a response. strcasecmp() is expensive,
and chunk_append() even more as it uses a format string.
Here we check the string length before calling strcasecmp(), which
results in strcasecmp() being called only on the relevant header in
practise due to very few collisions on the name lengths, effectively
dividing the number of calls by 3, and we replace chunk_appendf()
with memcpy() as we already know the string lengths.
Doing just this makes the "hello-world" applet 5% faster, reaching
41400 requests/s on a core i5-3320M.
The header's value was parsed with atoi() then compared against -1,
meaning that all the unparsable stuff returning zero was not considered
and that all multiples of 2^32 + 0xFFFFFFFF would continue to emit a
chunk.
Now instead we parse the value using a long long, only accept positive
values and consider all unparsable values as incorrect and switch to
either close or chunked encoding. This is more in line with what a
client (including haproxy's parser) would expect.
This may be backported as a cleanup to stable versions, though it's
really unlikely that Lua applications are facing side effects of this.
The following Lua code causes emission of a final chunk after the body,
which is wrong :
core.register_service("send204", "http", function(applet)
applet:set_status(204)
applet:start_response()
end)
Indeed, responses with status codes 1xx, 204 and 304 do not contain any
body and the message ends immediately after the empty header (cf RFC7230)
so by emitting a 0<CR><LF> we're disturbing keep-alive responses. There's
a workaround against this for now which consists in always emitting
"Content-length: 0" but it may not be cool with 304 when clients use
the headers to update their cache.
This fix must be backported to stable versions back to 1.6.
Commit d1aa41f ("BUG/MAJOR: lua: properly dequeue hlua_applet_wakeup()
for new scheduler") tried to address the side effects of the scheduler
changes on Lua, but it was not enough. Having some Lua code send data
in chunks separated by one second each clearly shows busy polling being
done.
The issue was tracked down to hlua_applet_wakeup() being woken up on
timer expiration, and returning itself without clearing the timeout,
causing the task to be re-inserted with an expiration date in the past,
thus firing again. In the past it was not a problem, as returning NULL
was enough to clear the timer. Now we can't rely on this anymore so
it's important to clear this timeout.
No backport is needed, this issue is specific to 1.8-dev and results
from an incomplete fix in the commit above.
The recent scheduler change broke the Lua co-sockets due to
hlua_applet_wakeup() returning NULL after waking the applet up. With the
previous scheduler, returning NULL was a way to do nothing on return.
With the new one it keeps TASK_RUNNING set, causing all new notifications
to end up into t->pending_state instead of t->state, and prevents the
task from being added into the run queue again, so and it's never woken
up anymore.
The applet keeps waking up, causing hlua_socket_handler() to do nothing
new, then si_applet_wake_cb() calling stream_int_notify() to try to wake
the task up, which it can't do due to the TASK_RUNNING flag, then decide
that since the associated task is not in the run queue, it needs to call
stream_int_update_applet() to propagate the update. This last one finds
that the applet needs to be woken up to deal with the last reported events
and calling appctx_wakeup() again. Previously, this situation didn't exist
because the task was always added in the run queue despite the TASK_RUNNING
flag.
By returning the task instead in hlua_applet_wakeup(), we can ensure its
flag is properly cleared and the task is requeued if needed or just sits
waiting for new events to happen.
This fix requires the previous ones ("BUG/MINOR: lua: always detach the
tcp/http tasks before freeing them") and MINOR: task: always preinitialize
the task's timeout in task_init().
Thanks to Thierry, Christopher and Emeric for the long head-scratching
session!
No backport is needed as the bug doesn't appear in older versions and
it's unsure whether we'll not break something by backporting it.
In hlua_{http,tcp}_applet_release(), a call to task_free() is performed
to release the task, but no task_delete() is made on these tasks. Till
now it wasn't much of a problem because this was normally not done with
the task in the run queue, and the task was never put into the wait queue
since it doesn't have any timer. But with threading it will become an
issue. And not having this already prevents another bug from being fixed.
Thanks to Christopher for spotting this one. A backport to 1.7 and 1.6 is
preferred for safety.
We cannot perform garbage collection on unreferenced thread.
This memory is now free and another Lua process can use it for
other things.
HAProxy is monothread, so this bug doesn't cause crash.
This patch must be backported in 1.6 and 1.7
In some cases, the socket is misused. The user can open socket and never
close it, or open the socket and close it without sending data. This
causes resources leak on all resources associated to the stream (buffer,
spoe, ...)
This is caused by the stream_shutdown function which is called outside
of the stream execution process. Sometimes, the shtudown is required
while the stream is not started, so the cleanup is ignored.
This patch change the shutdown mode of the session. Now if the session is
no longer used and the Lua want to destroy it, it just set a destroy flag
and the session kill itself.
This patch should be backported in 1.6 and 1.7
When we destroy the Lua session, we manipulates Lua stack,
so errors can raises. It will be better to catch these errors.
This patch should be backported in 1.6 and 1.7
Just forgot of reset the safe mode. This have not consequences
the safe mode just set a pointer on fucntion which is called only
and initialises a longjmp.
Out of lua execution, this longjmp is never executed and the
function is never called.
This patch should be backported in 1.6 and 1.7
The task_wakeup was called on stream_new, but the task/stream
wasn't fully initialized yet. The task_wakeup must be called
explicitly by the caller once the task/stream is initialized.
In the case of a Lua sample-fetch or converter doesn't return any
value, an acces outside the Lua stack can be performed. This patch
check the stack size before converting the top value to a HAProxy
internal sample.
A workaround consist to check that a value value is always returned
with sample fetches and converters.
This patch should be backported in the version 1.6 and 1.7
The priv context is not cleaned when we set a new priv context.
This is caused by a stupid swap between two parameter of the
luaL_unref() function.
workaround: use set_priv only once when we process a stream.
This patch should be backported in version 1.7 and 1.6
luaL_setstate() uses malloc() to initialize the first objects, and only
after this we replace the allocator. This creates trouble when replacing
the standard memory allocators during debugging sessions since the new
allocator is used to realloc() an area previously allocated using the
default malloc().
Lua provides lua_newstate() in addition to luaL_newstate(), which takes
an allocator for the initial malloc. This is exactly what we need, and
this patch does this and fixes the problem. The now useless call to
lua_setallocf() could be removed.
This has no impact outside of debugging sessions and there's no need to
backport this.
Error in the HTTP parser. The function http_get_path() can
return NULL and this case is not catched in the code. So, we
try to dereference NULL pointer, and a segfault occurs.
These two lines are useful to prevent the bug.
acl prevent_bug path_beg /
http-request deny if !prevent_bug
This bug fix should be backported in 1.6 and 1.7
This patch change the names prefixing it by a "_". So "end" becomes "_end".
The backward compatibility with names without the prefix "_" is assured.
In other way, another the keyword "end" can be used like this: Map['end'].
Thanks Robin H. Johnson for the bug repport
This should be backported in version 1.6 and 1.7
The older 'rsprep' directive allows modification of the status reason.
Extend 'http-response set-status' to take an optional string of the new
status reason.
http-response set-status 418 reason "I'm a coffeepot"
Matching updates in Lua code:
- AppletHTTP.set_status
- HTTP.res_set_status
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Now we can simply check the transport layer at run time and decide
whether or not to initialize or destroy these entries. This removes
other ifdefs and includes from cfgparse.c, haproxy.c and hlua.c.
Now we exclusively use xprt_get(XPRT_RAW) instead of &raw_sock or
xprt_get(XPRT_SSL) for &ssl_sock. This removes a bunch of #ifdef and
include spread over a number of location including backend, cfgparse,
checks, cli, hlua, log, server and session.
Most of the SSL functions used to have a proxy argument which was mostly
used to be able to emit clean errors using Alert(). First, many of them
were converted to memprintf() and don't require this pointer anymore.
Second, the rare which still need it also have either a bind_conf argument
or a server argument, both of which carry a pointer to the relevant proxy.
So let's now get rid of it, it needlessly complicates the API and certain
functions already have many arguments.
If the memory allocator fails, it return a bad code, and the execution
continue. If the Lua/cli initializer fails, the allocated struct is not
released.
If the lua/cli fails during initialization, it returns an ok
status, an the execution continue. This will probably occur a
segfault.
Thiw patch should be backported in 1.7
This is a regression from the commit a73e59b690.
When data are sent from a cosocket, the action is done in the context of the
applet running a lua stack and not in the context of the applet owning the
cosocket. So we must take care, explicitly, that this last applet have a buffer
(the req buffer of the cosocket).
This patch must be backported in 1.7
In the case of applet, the Lua context is taken from session
when we get the private values. This patch just update comments
associated to this action because it is not obvious.
The signals system embedded in Lua can be tranformed in general purpose
signals code. To reach this goal, this path removes the Lua part of the
signals.
This is an easy job, because Lua is useles with signal. I change just two
prototypes.
The struct hlua size is 128 bytes. The size is the biggest of all the elements
of the union embedded in the appctx struct. With HTTP2, it is possible that this
appctx struct will be use many times for each connection, so the 128 bytes are
a little bit heavy for the global memory consomation.
This patch replace the embbeded hlua struct by a pointer and an associated memory
pool. Now, the memory for lua is allocated only if it is required.
[wt: the appctx is now down to 160 bytes]
We have very few users of the appctx's private field which was introduced
prior to the split of the CLI. Unfortunately it was not removed after the
end. This commit simply introduces hlua_cli->fcn which is the pointer to
the Lua function that the Lua code used to store in this private pointer.
This problem is already detected here:
8dc7316a6f
Another case raises. Now HAProxy sends a final message (typically
with "http-request deny"). Once the the message is sent, the response
channel flags are not modified.
HAProxy executes a Lua sample-fecthes for building logs, and the
result is ignored because the response flag remains set to the value
HTTP_MSG_RPBEFORE. So the Lua function hlua_check_proto() want to
guarantee the valid state of the buffer and ask for aborting the
request.
The function check_proto() is not the good way to ensure request
consistency. The real question is not "Are the message valid ?", but
"Are the validity of message unchanged ?"
This patch memorize the parser state before entering int the Lua
code, and perform a check when it go out of the Lua code. If the parser
state change for down, the request is aborted because the HTTP message
is degraded.
This patch should be backported in version 1.6 and 1.7
When an entity tries to get a buffer, if it cannot be allocted, for example
because the number of buffers which may be allocated per process is limited,
this entity is added in a list (called <buffer_wq>) and wait for an available
buffer.
Historically, the <buffer_wq> list was logically attached to streams because it
were the only entities likely to be added in it. Now, applets can also be
waiting for a free buffer. And with filters, we could imagine to have more other
entities waiting for a buffer. So it make sense to have a generic list.
Anyway, with the current design there is a bug. When an applet failed to get a
buffer, it will wait. But we add the stream attached to the applet in
<buffer_wq>, instead of the applet itself. So when a buffer is available, we
wake up the stream and not the waiting applet. So, it is possible to have
waiting applets and never awakened.
So, now, <buffer_wq> is independant from streams. And we really add the waiting
entity in <buffer_wq>. To be generic, the entity is responsible to define the
callback used to awaken it.
In addition, applets will still request an input buffer when they become
active. But they will not be sleeped anymore if no buffer are available. So this
is the responsibility to the applet I/O handler to check if this buffer is
allocated or not. This way, an applet can decide if this buffer is required or
not and can do additional processing if not.
[wt: backport to 1.7 and 1.6]
(http|tcp)-(request|response) action cannot take arguments from the
configuration file. Arguments are useful for executing the action with
a special context.
This patch adds the possibility of passing arguments to an action. It
runs exactly like sample fetches and other Lua wrappers.
Note that this patch implements a 'TODO'.
There's no more reason to keep tcp rules processing inside proto_tcp.c
given that there is nothing in common there except these 3 letters : tcp.
The tcp rules are in fact connection, session and content processing rules.
Let's move them to "tcp-rules" and let them live their life there.
proto/dumpstats.h has been split in 4 files:
* proto/cli.h contains protypes for the CLI
* proto/stats.h contains prototypes for the stats
* types/cli.h contains definition for the CLI
* types/stats.h contains definition for the stats
When:
- A Lua action return data and close the channel. The request status
is set to HTTP_MSG_CLOSED for the request and HTTP_MSG_DONE for the
response.
- HAProxy sets the state HTTP_MSG_ERROR. I don't known why, because
there are many line which sets this state.
- A Lua sample-fetch is executed, typically for building the log
line.
- When the Lua sample fetch exits, a control of the data is
executed. If HAProxy is currently parsing the request, the request
is aborted in order to prevent a segfault or sending corrupted
data.
This ast control is executed comparing the state HTTP_MSG_BODY. When
this state is reached, the request is parsed and no error are
possible. When the state is < than HTTP_MSG_BODY, the parser is
running.
Unfortunately, the code HTTP_MSG_ERROR is just < HTTP_MSG_BODY. When
we are in error, we want to terminate the execution of Lua without
error.
This patch changes the comparaison level.
This patch must be backported in 1.6
It does the opposite of 'set-var' action/converter. It is really useful for
per-process variables. But, it can be used for any scope.
The lua function 'unset_var' has also been added.
Somme HTTP manipulation functions are executed without valid and parsed
requests or responses. This causes a segmentation fault when the executed
code tries to change an empty buffer.
This patch must be backported in the 1.6 version
If an action wrapper stops the processing of the transaction
with a txn_done() function, the return code of the action is
"continue". So the continue can implies the processing of other
like adding headers. However, the HTTP content is flushed and
a segfault occurs.
This patchs add a flag indicating that the Lua code want to
stop the processing, ths flags is forwarded to the haproxy core,
and other actions are ignored.
Must be backported in 1.6
The function txn_done() ends a transaction. It does not make
sense to call this function from a lua sample-fetch wrapper,
because the role of a sample-fetch is not to terminate a
transaction.
This patch modify the role of the fucntion txn_done() if it
is called from a sample-fetch wrapper, now it just ends the
execution of the Lua code like the done() function.
Must be backported in 1.6
The number of arguments pushed in the stack are false, so we try to execute a
function out of the stack. This function is always a nil pointer, so the
following message is displayed.
Lua converter 'testconv': runtime error: attempt to call a nil value.
Thanks Michael Ezzell for the repporting.
This patch must be backported in the 1.6 version.
When compiled with GCC 6, the IP address specified for a frontend was
ignored and HAProxy was listening on all addresses instead. This is
caused by an incomplete copy of a "struct sockaddr_storage".
With the GNU Libc, "struct sockaddr_storage" is defined as this:
struct sockaddr_storage
{
sa_family_t ss_family;
unsigned long int __ss_align;
char __ss_padding[(128 - (2 * sizeof (unsigned long int)))];
};
Doing an aggregate copy (ss1 = ss2) is different than using memcpy():
only members of the aggregate have to be copied. Notably, padding can be
or not be copied. In GCC 6, some optimizations use this fact and if a
"struct sockaddr_storage" contains a "struct sockaddr_in", the port and
the address are part of the padding (between sa_family and __ss_align)
and can be not copied over.
Therefore, we replace any aggregate copy by a memcpy(). There is another
place using the same pattern. We also fix a function receiving a "struct
sockaddr_storage" by copy instead of by reference. Since it only needs a
read-only copy, the function is converted to request a reference.
When a converter or sample is called from within a Lua code, there is a risk
of invalid argument string data usage when the upper boundary is reached.
Would be kind to port to 1.6 if possible.
In C89, "void *" is automatically promoted to any pointer type. Casting
the result of malloc/calloc to the type of the LHS variable is therefore
unneeded.
Most of this patch was built using this Coccinelle patch:
@@
type T;
@@
- (T *)
(\(lua_touserdata\|malloc\|calloc\|SSL_get_app_data\|hlua_checkudata\|lua_newuserdata\)(...))
@@
type T;
T *x;
void *data;
@@
x =
- (T *)
data
@@
type T;
T *x;
T *data;
@@
x =
- (T *)
data
Unfortunately, either Coccinelle or I is too limited to detect situation
where a complex RHS expression is of type "void *" and therefore casting
is not needed. Those cases were manually examined and corrected.
This patch adds a Lua post initialisation wrapper. It already exists for
pure Lua function, now it executes also C. It is useful for doing things
when the configuration is ready to use. For example we can can browse and
register all the proxies.
All the HAProxy Lua object are declared with the same pattern:
- Add the function __tosting which dumps the object name
- Register the name in the Lua REGISTRY
- Register the reference ID
These action are refactored in on function. This remove some
lines of code.
The functions
- hlua_class_const_int()
- hlua_class_const_str()
- hlua_class_function()
are use for common class registration actions.
The function 'hlua_dump_object()' is generic dump name function.
These functions can be used by all the HAProxy objects, so I move
it into the safe functions file.
This is the continuation of previous patch called "BUG/MAJOR: samples:
check smp->strm before using it".
It happens that variables may have a session-wide scope, and that their
session is retrieved by dereferencing the stream. But nothing prevents them
from being used from a streamless context such as tcp-request connection,
thus crashing the process. Example :
tcp-request connection accept if { src,set-var(sess.foo) -m found }
In order to fix this, we have to always ensure that variable manipulation
only happens via the sample, which contains the correct owner and context,
and that we never use one from a different source. This results in quite a
large change since a lot of functions are inderctly involved in the call
chain, but the change is easy to follow.
This fix must be backported to 1.6, and requires the last two patches.
Some functions like sample_conv_var2smp(), var_get_byname(), and
var_set_byname() directly or indirectly need to access the current
stream and/or session and must find it in the sample itself and not
as a distinct argument. Thus we first need to call smp_set_owner()
prior to each such calls.
Since commit 6879ad3 ("MEDIUM: sample: fill the struct sample with the
session, proxy and stream pointers") merged in 1.6-dev2, the sample
contains the pointer to the stream and sample fetch functions as well
as converters use it heavily. This requires from a lot of call places
to initialize 4 fields, and it was even forgotten at a few places.
This patch provides a convenient helper to initialize all these fields
at once, making it easy to prepare a new sample from a previous one for
example.
A few call places were cleaned up to make use of it. It will be needed
by further fixes.
At one place in the Lua code, it was moved earlier because we used to
call sample casts with a non completely initialized sample, which is
not clean eventhough at the moment there are no consequences.
Since commit 6879ad3 ("MEDIUM: sample: fill the struct sample with the
session, proxy and stream pointers") merged in 1.6-dev2, the sample
contains the pointer to the stream and sample fetch functions as well
as converters use it heavily.
The problem is that earlier commit 87b0966 ("REORG/MAJOR: session:
rename the "session" entity to "stream"") had split the session and
stream resulting in the possibility for smp->strm to be NULL before
the stream was initialized. This is what happens in tcp-request
connection rulesets, as discovered by Baptiste.
The sample fetch functions must now check that smp->strm is valid
before using it. An alternative could consist in using a dummy stream
with nothing in it to avoid some checks but it would only result in
deferring them to the next step anyway, and making it harder to detect
that a stream is valid or the dummy one.
There is still an issue with variables which requires a complete
independant fix. They use strm->sess to find the session with strm
possibly NULL and passed as an argument. All call places indirectly
use smp->strm to build strm. So the problem is there but the API needs
to be changed to remove this duplicate argument that makes it much
harder to know what pointer to use.
This fix must be backported to 1.6, as well as the next one fixing
variables.
A value is copied two time in teh stack, but only one is usefull. The
second copy leaves unused in the stack and take some room for noting.
This path removes the second copy.
Must be backported in 1.6
This patch must be backported in 1.6
hlua_yield() function returns the required sleep time. The Lua core must
be resume the execution after the required time. The core dedicated to
the http and tcp applet doesn't implement the wake up function. It is a
miss.
This patch fix this.
This patch moves the function hlua_checkudata which check that
an object contains the expected class_reference as metatable.
This function is commonly used by all the lua functions.
The function hlua_metatype is also moved.
When Lua executes functions from its API, these can throws an error.
These function must be executed in a special environment which catch
these error, otherwise a critical error (like segfault) can raise.
This patch add a c file called "hlua_fcn.c" which collect all the
Lua/c function needing safe environment for its execution.
The applet can't have access to the session private data. This patch
fix this problem. Now an applet can use private data stored by actions
and fecthes.
INNER and XFERBODY analyzer were set in order to support HTTP applets
from TCP rulesets, but this does not work (cf previous patch).
Other cases already provides theses analyzers, so their addition is
not needed. Furthermore if INNER was set it could cause some headers
to be rewritten (ex: connection) after headers were already forwarded,
resulting in a crash in buffer_insert_line2().
Special thanks to Bernd Helm for providing very detailed information,
captures and stack traces making it possible to spot the root cause
here.
This fix must be backported to 1.6.
HTTP applets request requires everything initilized by
"http_process_request" (analyzer flag AN_REQ_HTTP_INNER).
The applet will be immediately initilized, but its before
the call of this analyzer.
Due to this problem HTTP applets could be called with uncompletely
initialized http_txn.
This fix must be backported to 1.6.
In certain circumstances (eg: Lua HTTP applet called from a
TCP ruleset before http_process_request()), the HTTP TXN is not
yet fully initialized so some information it contains cannot be
relied on. Such information include the HTTP version, the state
of the expect: 100-continue header, the connection header and
the transfer-encoding header.
Here the bug only turns something which already doesn't work
into something wrong, but better avoid any references to the
http_txn from the Lua code to avoid future mistakes.
This patch should be backported into 1.6 for code consistency.
If a sample fetch needing http_txn is called from an HTTP Lua applet,
the result will be invalid and may even cause a crash because some HTTP
data can be forwarded and the HTTP txn is no longer valid.
Here the solution is to ensure that a fetch called from Lua never
needs http_txn. This is done thanks to a new flag HLUA_F_MAY_USE_HTTP
which indicates whether or not it is safe to call a fetch which needs
HTTP.
This fix needs to be backported to 1.6.
This patch converts a boolean "int" to a bitfiled. The main
reason is to save space in the struct if another flag may will
be require.
Note that this patch is required for next fix and will need to be
backported to 1.6.
When a POST is processed by a Lua service, the HTTP header are
potentially gone. So, we cannot retrieve their content using
the standard "hdr" sample fetchs (which will soon become invalid
anyway) from an applet.
This patch add an entry "headers" to the object applet_http. This
entry is an array containing all the headers. It permits to use the
HTTP headers during the processing of the service.
Many thanks to Jan Bruder for reporting this issue with enough
details to reproduce it.
This patch will have to be backported to 1.6 since it will be the
only way to access headers from Lua applets.
Sander Klein reported an error messages about SSLv3 not
being supported on Debian 8, although he didn't force-sslv3.
Vincent Bernat tracked this down to the LUA initialization, which
actually does force-sslv3.
This patch removes force-sslv3 from the LUA initialization, so
the LUA SSL socket can actually use TLS and doesn't trigger
warnings when SSLv3 is not supported by libssl (such as in
Debian 8).
This should be backported to 1.6.
When the txn.done() fiunction is called, the ouput buffer is cleaned,
but the associated relative pointer on the HTTP requests elements
is not reseted.
This patch remove this cleanup, because the output buffer may contain
data to forward.
The direction (request or response) is not propagated in the
sample fecthes called throught Lua. This patch adds the direction
status in some structs (hlua_txn and hlua_smp) to make sure that
the sample fetches will be called with all the information.
The converters can not access to a TXN object, so there are not
impacted the direction. However, the samples used as input of the
Lua converter wrapper are initiliazed with the direction. Thereby,
the struct smp stay consistent.
[wt: needs to be backported to 1.6]
This patch cleanups the direction names. It replaces numeric values,
by the associated defines. It ensure the compliance with values found
somwhere else in HAProxy.
It is required by the bugfix patch which is following.
[wt: needs to be backported to 1.6]
Lua needs to known the direction of the http data processed (request or
response). It checks the flag SMP_OPT_DIR_REQ, buf this flag is 0. This patch
correctly checks the flags after applying the SMP_OPT_DIR mask.
Thierry reported that keep-alive still didn't cope well with Lua
services. The reason is that for now applets have to be closed at
the end of a transaction so we want to work in server-close mode,
which isn't noticeable by the client since it still sees keep-alive.
Additionally we want to enable the request body transfer analyser
which will be needed to synchronize with the response analyser to
indicate the end of the transfer.
Only the main execution function can set the run flag, because it is
the last function before the execution time.
This patch removes the flag set by another function. It will be used
by the new lua timeout counter.
The garbage collector is a little bit heavy to run, and it was added
only for cosockets. This patch prevent useless executions when no
cosockets are used.
When the channel is down, the applet is waked up. Before this patch,
the applet closes the stream and the unread data are discarded.
After this patch, the stream is not closed except if the buffers are
empty.
It will be closed later by the close function, or by the garbage collector.
The HAProxy Lua socket respects the Lua Socket tcp specs. these specs
are a little bit limited, it not permits to connect to unix socket.
this patch extends a little it the specs. It permit to accept a
syntax that begin with "unix@", "ipv4@", "ipv6@", "fd@" or "abns@".
Actions may yield but must not do it during the final call from a ruleset
because it indicates there will be no more opportunity to complete or
clean up. This is indicated by ACT_FLAG_FINAL in the action's flags,
which must be passed to hlua_resume().
Thanks to this, an action called from a TCP ruleset is properly woken
up and possibly finished when the client disconnects.
Since commit bc4c1ac ("MEDIUM: http/tcp: permit to resume http and tcp
custom actions"), some actions may yield and be called back when new
information are available. Unfortunately some of them may continue to
yield because they simply don't know that it's the last call from the
rule set. For this reason we'll need to pass a flag to the custom
action to pass such information and possibly other at the same time.
When a thread stops this patch forces a garbage collection which
cleanup and free the memory.
The HAProxy Lua Socket class contains an HAProxy session, so it
uses a big amount of memory, in other this Socket class uses a
filedescriptor and maintain a connection open.
If the class Socket is stored in a global variable, the Socket stay
alive along of the life of the process (except if it is closed by the
other size, or if a timeout is reached). If the class Socket is stored
in a local variable, it must die with this variable.
The socket is closed by a call from the garbage collector. And the
portability and use of a variable is known by the same garbage collector.
so, running the GC just after the end of all Lua code ensure that the
heavy resources like the socket class are freed quickly.
Not having the target set on the connection causes it to be released
at the last moment, and the destination address to randomly be valid
depending on the data found in the memory at this moment. In practice
it works as long as memory poisonning is disabled. The deep reason is
that connect_server() doesn't expect to be called with SF_ADDR_SET and
an existing connection with !reuse. This causes the release of the
connection, its reallocation (!reuse), and taking the address from the
newly allocated connection. This should certainly be improved.
Commit d75cb0f ("BUG/MAJOR: lua: segfault after the channel data is
modified by some Lua action.") introduced a regression causing an
action run from a TCP rule in an HTTP proxy to end in HTTP error
if it terminated cleanly, because it didn't parse the HTTP request!
Relax the test so that it takes into account the opportunity for the
analysers to parse the message.
The lua code must the the appropriate wakeup mechanism for cosockets.
It wakes them up from outside the stream and outside the applet, so it
shouldn't use the applet's wakeup callback which is designed only for
use from within the applet itself. For now it didn't cause any trouble
(yet).
When an action or a fetch modify the channel data, the http request parser
pointer become inconsistent. This patch detects the modification and call
again the parser.
The function lua_settable uses the metatable for acessing data,
so in the C part, we want to index value in the real table and
not throught the meta-methods. It's much faster this way.
This function is a callback made only for calls from the applet handler.
Rename it to remove confusion. It's currently called from the Lua code
but that's not correct, we should call the notify and update functions
instead otherwise it will not enable the applet again.
The current Lua action are not registered. The executed function is
selected according with a function name writed in the HAProxy configuration.
This patch add an action registration function. The configuration mode
described above disappear.
This change make some incompatibilities with existing configuration files for
HAProxy 1.6-dev.
This couple of function executes securely some Lua calls outside of
the lua runtime environment. Each Lua call can return a longjmp
if it encounter a memory error.
Lua documentation extract:
If an error happens outside any protected environment, Lua calls
a panic function (see lua_atpanic) and then calls abort, thus
exiting the host application. Your panic function can avoid this
exit by never returning (e.g., doing a long jump to your own
recovery point outside Lua).
The panic function runs as if it were a message handler (see
2.3); in particular, the error message is at the top of the
stack. However, there is no guarantee about stack space. To push
anything on the stack, the panic function must first check the
available space (see 4.2).
We must check all the Lua entry point. This includes:
- The include/proto/hlua.h exported functions
- the task wrapper function
- The action wrapper function
- The converters wrapper function
- The sample-fetch wrapper functions
It is tolerated that the initilisation function returns an abort.
Before each Lua abort, an error message is writed on stderr.
The macro SET_SAFE_LJMP initialise the longjmp. The Macro
RESET_SAFE_LJMP reset the longjmp. These function must be macro
because they must be exists in the program stack when the longjmp
is called
All the code which emits error log have the same pattern. Its:
Send log with syslog system, and if it is allowed, display error
log on screen.
This patch replace this pattern by a macro. This reduces the number
of lines.
The send_log function needs a final \n.
This bug is repported by Michael Ezzell.
Minor bug: when writing to syslog from Lua scripts, the last character from
each log entry is truncated.
core.Alert("this is truncated");
Sep 7 15:07:56 localhost haproxy[7055]: this is truncate
This issue appears to be related to the fact that send_log() (in src/log.c)
is expecting a newline at the end of the message's format string:
/*
* This function adds a header to the message and sends the syslog message
* using a printf format string. It expects an LF-terminated message.
*/
void send_log(struct proxy *p, int level, const char *format, ...)
I believe the fix would be in in src/hlua.c at line 760
<http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/hlua.c;h=1e4d47c31e66c16c837ff2aa5ef577f6cafdc7e7;hb=316e3196285b89a917c7d84794ced59a6a5b4eba#l760>,
where this...
send_log(px, level, "%s", trash.str);
...should be adding a newline into the format string to accommodate what
the code expects.
send_log(px, level, "%s\n", trash.str);
This change provides what seems to be the correct behavior:
Sep 7 15:08:30 localhost haproxy[7150]: this is truncated
All other uses of send_log() in hlua.c have a trailing dot "." in the
message that is masking the truncation issue because the output message
stops on a clean word boundary. I suspect these would also benefit from
"\n" appended to their format strings as well, since this appears to be the
pattern seen throughout the rest of the code base.
Reported-by: Michael Ezzell <michael@ezzell.net>
See commit id bdc97a8795
Michael Ezzell reported that the following Lua code fails in
dev4 when the TCP is not established immediately (due to a little
bit of latency):
function tricky_socket()
local sock = core.tcp();
sock:settimeout(3);
core.log(core.alert,"calling connect()\n");
local connected, con_err = sock:connect("x.x.x.x",80);
core.log(core.alert,"returned from connect()\n");
if con_err ~= nil then
core.log(core.alert,"connect() failed with error: '" .. con_err .. "'\n");
end
The problem is that the flags who want to wake up the applet are
resetted before each applet call, so the applet must set again the
flags if the connection is not established.
Before this patch, two type of custom actions exists: ACT_ACTION_CONT and
ACT_ACTION_STOP. ACT_ACTION_CONT is a non terminal action and ACT_ACTION_STOP is
a terminal action.
Note that ACT_ACTION_STOP is not used in HAProxy.
This patch remove this behavior. Only type type of custom action exists, and it
is called ACT_CUSTOM. Now, the custion action can return a code indicating the
required behavior. ACT_RET_CONT wants that HAProxy continue the current rule
list evaluation, and ACT_RET_STOP wants that HAPRoxy stops the the current rule
list evaluation.
When called from an http ruleset, txn:done() can still crash the process
because it closes the stream without consuming pending data resulting in
the transaction's buffer representation to differ from the real buffer.
This patch also adjusts the transaction's state to indicate that it's
closed to be consistent with what's already done in redirect rules.
When the Lua execution flow endswith the command done (core.done or txn.done())
an error is detourned, and the stack is no longer usable. This patch juste
reinitilize the stack if this case is detected.
The function txn:close() must be terminal because it demands the session
destruction. This patch renames this function to "done()" to be much
clearer about the fact that it is a final operation.
This patch is inspired by Bowen Ni's proposal and it is based on his first
implementation:
With Lua integration in HAProxy 1.6, one can change the request method,
path, uri, header, response header etc except response line.
I'd like to contribute the following methods to allow modification of the
response line.
[...]
There are two new keywords in 'http-response' that allows you to rewrite
them in the native HAProxy config. There are also two new APIs in Lua that
allows you to do the same rewriting in your Lua script.
Example:
Use it in HAProxy config:
*http-response set-code 404*
Or use it in Lua script:
*txn.http:res_set_reason("Redirect")*
I dont take the full patch because the manipulation of the "reason" is useless.
standard reason are associated with each returned code, and unknown code can
take generic reason.
So, this patch can set the status code, and the reason is automatically adapted.
The function dont remove remaineing analysers and dont update response
channel timeout.
The fix is a copy of the behavior of the functions http_apply_redirect_rule()
and stream_int_retnclose().
Tsvetan Tsvetanov reported that the following Lua code fails in
dev2 and dev3 :
function hello(txn)
local request_msg = txn.req:dup()
local tsm_sock = core.tcp()
tsm_sock:connect("127.0.0.1", 7777)
local res = tsm_sock:send(request_msg)
local response = tsm_sock:receive('*l')
txn.res:send(response)
txn:close()
end
Thierry diagnosed that it was caused by commit 563cc37 ("MAJOR: stream:
use a regular ->update for all stream interfaces"). It broke lua's
ability to establish outgoing connections.
The reason is that the applet used to be notified about established
connections just after the stream analyser loop, and that's not the
case anymore. In peers, this issue didn't happen because peers use
a handshake so after sending data, the response is received and wakes
the applet up again. Here we have to indicate that we want to send or
receive data, this will cause the notification to happen as soon as
the connection is ready. This is similar to pretending that we're
working on a full buffer after all. In theory subscribing for reads
is not needed, but it's added here for completeness.
Reported-By: Tsvetan Tsvetanov <cpi.cecko@gmail.com>
Now the prototype for each action from each section are the same, and
a discriminant for determining for each section we are called are added.
So, this patch removes the wrappers for the action functions called from
more than one section.
This patch removes 132 lines of useless code.
This patch normalize the return code of the configuration parsers. Before
these changes, the tcp action parser returned -1 if fail and 0 for the
succes. The http action returned 0 if fail and 1 if succes.
The normalisation does:
- ACT_RET_PRS_OK for succes
- ACT_RET_PRS_ERR for failure
This patch merges the conguration keyword struct. Each declared configuration
keyword struct are similar with the others. This patch simplify the code.
Action function can return 3 status:
- error if the action encounter fatal error (like out of memory)
- yield if the action must terminate his work later
- continue in other cases
For performances considerations, some actions are not processed by remote
function. They are directly processed by the function. Some of these actions
does the same things but for different processing part (request / response).
This patch give the same name for the same actions, and change the normalization
of the other actions names.
This patch is ONLY a rename, it doesn't modify the code.
This patch group the action name in one file. Some action are called
many times and need an action embedded in the action caller. The main
goal is to have only one header file grouping all definitions.
The (http|tcp)-(request|response) action rules use common
opaque type. For the HAProxy embbedded feature, types are know,
it better to add this types in the action union and use it.
This patch is the first of a serie which merge all the action structs. The
function "tcp-request content", "tcp-response-content", "http-request" and
"http-response" have the same values and the same process for some defined
actions, but the struct and the prototype of the declared function are
different.
This patch try to unify all of these entries.
The union name "data" is a little bit heavy while we read the source
code because we can read "data.data.sint". The rename from "data" to "u"
makes the read easiest like "data.u.sint".
This patch remove the struct information stored both in the struct
sample_data and in the striuct sample. Now, only thestruct sample_data
contains data, and the struct sample use the struct sample_data for storing
his own data.
test conf:
global
tune.lua.session-timeout 0
lua-load lol.lua
debug
maxconn 4096
listen test
bind 0.0.0.0:10010
mode tcp
tcp-request content lua act_test
balance roundrobin
server test 127.0.0.1:3304
lua test:
function act_test(txn)
while true do
core.Alert("TEST")
end
end
The function "act_test()" is not executed because a zero timeout is not
considered as TICK_ETERNITY, but is considered as 0.
This path fix this behavior. This is the same problem than the bugfix
685c014e99.
I've been trying out 1.6 dev3 with lua support, and trying to start
lua tasks seems to not be working.
Using this configuration
global
lua-load /lua/lol.lua
debug
maxconn 4096
backend shard_b
server db01 mysql_shard_b:3306
backend shard_a
server db01 mysql_shard_a:3306
listen mysql-cluster
bind 0.0.0.0:8001
mode tcp
balance roundrobin
use_backend shard_b
And this lua function
core.register_task(function()
while true do
core.Alert("LOLOLOLOLOL")
end
end)
I'd always get a timeout error starting the registered function.
The problem lies as far as I can tell in the fact that is possible for
now_ms to not change (is this maybe a problem on my config/system?)
until the expiration check happens, in the resume function that
actually kickstarts the lua task, making HAProxy think that expiration
time for the task is up, if I understand correctly tasks are meant to
never really timeout.
These ones are considered safe as they have already been reused.
They will be useful in "aggressive" and "always" http-reuse modes
in order to place the first request of a connection with the least
risk.
For now it's not populated but we have the list entry. It will carry
all idle connections that sessions don't want to share. They may be
used later to reclaim connections upon socket shortage for example.
Since we now always call this function with the reuse parameter cleared,
let's simplify the function's logic as it cannot return the existing
connection anymore. The savings on this inline function are appreciable
(240 bytes) :
$ size haproxy.old haproxy.new
text data bss dec hex filename
1020383 40816 36928 1098127 10c18f haproxy.old
1020143 40816 36928 1097887 10c09f haproxy.new
This patch removes the 32 bits unsigned integer and the 32 bit signed
integer. It replaces these types by a unique type 64 bit signed.
This makes easy the usage of integer and clarify signed and unsigned use.
With the previous version, signed and unsigned are used ones in place of
others, and sometimes the converter loose the sign. For example, divisions
are processed with "unsigned", if one entry is negative, the result is
wrong.
Note that the integer pattern matching and dotted version pattern matching
are already working with signed 64 bits integer values.
There is one user-visible change : the "uint()" and "sint()" sample fetch
functions which used to return a constant integer have been replaced with
a new more natural, unified "int()" function. These functions were only
introduced in the latest 1.6-dev2 so there's no impact on regular
deployments.