Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.
Most of the changes were made with spatch using this coccinelle script :
@rule_d1@
typedef chunk;
struct chunk chunk;
@@
- chunk.str
+ chunk.area
@rule_d2@
typedef chunk;
struct chunk chunk;
@@
- chunk.len
+ chunk.data
@rule_i1@
typedef chunk;
struct chunk *chunk;
@@
- chunk->str
+ chunk->area
@rule_i2@
typedef chunk;
struct chunk *chunk;
@@
- chunk->len
+ chunk->data
Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
Now the buffers only contain the header and a pointer to the storage
area which can be anywhere. This will significantly simplify buffer
swapping and will make it possible to map chunks on buffers as well.
The buf_empty variable was removed, as now it's enough to have size==0
and area==NULL to designate the empty buffer (thus a non-allocated head
is the empty buffer by default). buf_wanted for now is indicated by
size==0 and area==(void *)1.
The channels and the checks now embed the buffer's head, and the only
pointer is to the storage area. This slightly increases the unallocated
buffer size (3 extra ints for the empty buffer) but considerably
simplifies dynamic buffer management. It will also later permit to
detach unused checks.
The way the struct buffer is arranged has proven quite efficient on a
number of tests, which makes sense given that size is always accessed
and often first, followed by the othe ones.
This one is more generic and designed to work on a random block. It
may later get a b_rep_ist() variant since many strings are already
available as (ptr,len).
There's no distinction between in and out data now. The latter covers
the needs of the former and supports wrapping. The extra cost is
negligible given the locations where it's used.
This replaces chn->buf->p with ci_head(chn), chn->buf->o with co_data(chn)
and chn->buf->i with ci_data(chn). This is in order to help porting to the
new buffer API.
We used to have variations around buffer_total_space() and
size-buffer_len() or size-b_data(). Let's simplify all this. buffer_len()
was also removed as not used anymore.
Now that there are no more users requiring to modify the buffer anymore,
switch these ones to const char and const buffer. This will make it more
obvious next time send functions are tempted to modify the buffer's output
count. Minor adaptations were necessary at a few call places which were
using char due to the function's previous prototype.
These ones manipulate the output data count which will be specific to
the channel soon, so prepare the call points to use the channel only.
The b_* functions are now unused and were removed.
The few call places where it's used can use the trash as a swap buffer,
which is made for this exact purpose. This way we can rely on the
generic b_slow_realign() call.
Where relevant, the channel version is used instead. The buffer version
was ported to be more generic and now takes a swap buffer and the output
byte count to know where to set the alignment point. The H2 mux still
uses buffer_slow_realign() with buf->o but it will change later.
The Lua parser doesn't takes in account end-of-headers containing
only '\n'. It expects always '\r\n'. If a '\n' is processes the Lua
parser considers it miss 1 byte, and wait indefinitely for new data.
When the client reaches their timeout, it closes the connection.
This close is not detected and the connection keep in CLOSE-WAIT
state.
I guess that this patch fix only a visible part of the problem.
If the Lua HTTP parser wait for data, the timeout server or the
connectio closed by the client may stop the applet.
How reproduce the problem:
HAProxy conf:
global
lua-load bug38.lua
frontend frt
timeout client 2s
timeout server 2s
mode http
bind *:8080
http-request use-service lua.donothing
Lua conf
core.register_service("donothing", "http", function(applet) end)
Client request:
echo -ne 'GET / HTTP/1.1\n\n' | nc 127.0.0.1 8080
Look for CLOSE-WAIT in the connection with "netstat" or "ss". I
use this script:
while sleep 1; do ss | grep CLOSE-WAIT; done
This patch must be backported in 1.6, 1.7 and 1.8
Workaround: enable the "hard-stop-after" directive, and perform
periodic reload.
Patrick reported that this simple configuration made haproxy segfaults:
global
lua-load /tmp/haproxy.lua
frontend f1
mode http
bind :8000
default_backend b1
http-request lua.foo
backend b1
mode http
server s1 127.0.0.1:8080
with this '/tmp/haproxy.lua' script:
core.register_action("foo", { "http-req" }, function(txn)
txn.sc:ipmask(txn.f:src(), 24, 112)
end)
This is due to missing initialization of the array of arguments
passed to hlua_lua2arg_check() which makes it enter code with
corrupted arguments.
Thanks a lot to Patrick Hemmer for having reported this issue.
Must be backported to 1.8, 1.7 and 1.6.
When an unrecoverable error raises, the user receive poor information
for the trouble shooting. For example:
[ALERT] 157/143755 (21212) : Lua function 'hello-world': runtime error: memory allocation error: block too big.
Unfortunately, the memory allocation error can be throwed by many
function, and we have no informatio to reach the original cause.
This patch add the list of function called from the entry point to
the function in error, like this:
[ALERT] 157/143755 (21212) : Lua function 'hello-world': runtime error: memory allocation error: block too big from [C] method 'req_get_headers', bug35.lua:2 global 'ee', bug35.lua:6 global 'ff', bug35.lua:10 C function line 9.
The buffer pointer is already updated. It is again updated
when it is given to the function ci_putblk().
This patch must be backported in 1.6, 1.7 and 1.8
When we write data, we risk to encounter a dead-loack. The
function "stream_int_notify()" cannot be called the the
cosocket because the caller acquire a lock and when the socket
is closed, the cleanup function try to acquire the same lock.,
so a dead-lock raises.
In other way, the function stream_int_update_applet() can't
be called because it schedumes the applet only if some activity
in the buffers were detected. It is not always the case. We
replace this function by appctx_wakeup() which wake up the
applet inconditionnaly.
The last part of the fix is setting right signals. the applet
call the stream_int_update() function if the output buffer si
not empty, and ask for put data if some rite signals are
registered.
This patch must be backported in 1.6, 1.7 and 1.8. Note that it requires
patch "MINOR: task/notification: Is notifications registered" to be
applied.
Each time the send function yields, a notification must be registered.
Without this notification, the task is never wakeup when data arrives.
Today, the notification is registered only if the buffer is not available.
Other cases like the buffer is too small for all data are not processed.
This patch must be backported in 1.6, 1.7 and 1.8
In some cases, when we are waiting for data and the socket
timeout expires, we have a dead lock. The Lua socket locks
the applet socket, and call for a notify. The notify
immediately executes code and try to acquire the same lock,
so ... dead lock.
stream_int_notify() cant be used because it wakeup the applet
task only if the stream have changes. The changes are forces
by Lua, but not repported on the stream.
stream_int_update_applet() cant be used because the deadlock.
So, I inconditionnaly wakeup the applet. This wake is performed
asynchronously, and will call a stream_int_notify().
This patch must be backported in 1.6, 1.7 and 1.8
The appctx pointer is given from any variable which are wrong.
This implies the wakeup of wrong applet, and the socket are no
longer responsive.
This behavior is hidden by another inherited error which is
fixed in the next patch.
This patch remove all wrong appctx affectations.
This patch must be backported in 1.6, 1.7 and 1.8
In preparation for thread-specific runqueues, change the task API so that
the callback takes 3 arguments, the task itself, the context, and the state,
those were retrieved from the task before. This will allow these elements to
change atomically in the scheduler while the application uses the copied
value, and even to have NULL tasks later.
The limit of data read works only if all the data is in the
input buffer. Otherwise (if the data arrive in chunks), the
total amount of data is not taken in acount.
Only the current read data are compared to the expected amout
of data.
This patch must be backported from 1.9 to 1.6
The function hlua_ctx_resume return less text message and more error
code. These error code allow the caller to return appropriate
message to the user.
Since commit 36d1374 ("BUG/MINOR: lua: Fix SSL initialisation") in 1.6, the
Lua code always initializes an SSL server. It caused a small visible side
effect which is that by calling ssl_sock_prepare_srv_ctx(), it forces
global.ssl_used_backend to 1 and makes the initialization code believe that
there are some SSL servers in certain backends. This detection is used to
figure how to set the global maxconn value when only the memory usage is
limited. As such, even a configuration with no SSL at all will have a very
conservative maxconn.
The configuration below exhibits this :
global
ssl-server-verify none
stats socket /tmp/sock1 mode 666 level admin
tune.bufsize 16384
listen px
timeout client 5s
timeout server 5s
timeout connect 5s
bind :4445
#bind :4443 ssl crt rsa+dh2048.pem
#server s1 127.0.0.1:8003 ssl
Starting it with "-m 200" to limit it to 200 MB of RAM reports 1500 for
Maxconn, the same when uncommenting the "server" line, and 1300 when
uncommenting the "bind" line, regardless of the "server" line's status.
In practice it doesn't make sense to consider that Lua's server template
counts for one regular SSL server, because even if used for SSL, it will
not take large connection counts, compared to a backend relaying traffic.
Thus the solution consists in resetting the ssl_used_backend to its
previous value after creating the server_ctx from the Lua code. With the
fix, the same config with the same parameters now show :
- maxconn=5700 when neither side uses SSL
- maxconn=1500 when only one side uses SSL
- maxconn=1300 when both sides use SSL
This fix can be backported to versions 1.6 and beyond.
Function `hlua_socket_close` expected exactly one argument on the Lua stack.
But when `hlua_socket_close` was called from `hlua_socket_write_yield`,
Lua stack had 3 arguments. So `hlua_socket_close` threw the exception with
message "'close' needs 1 arguments".
Introduced new helper function `hlua_socket_close_helper`, which removed the
Lua stack argument count check and only checked if the first argument was
a socket.
This fix should be backported to 1.8, 1.7 and 1.6.
The parameters like server-address, port and timeout should be set before
process_stream task is called to avoid the stream being 'closed' before it
got initialized properly. This is most clearly visible when running with
tune.lua.forced-yield=1.. So scheduling the task should not be done when
creating the lua socket, but when connect is called. The error
"socket: not yet initialised, you can't set timeouts." would then appear.
Below code for example also shows this issue, as the sleep will
yield the lua code:
local con = core.tcp()
core.sleep(1)
con:settimeout(10)
If a lua socket is waiting for data it currently spins at 100% cpu usage.
This because the TICK_ETERNITY returned by the socket is ignored when
setting the 'expire' time of the task.
Fixed by removing the check for yields that return TICK_ETERNITY.
This should be backported to at least 1.8.
PiBa-NL reported a bug with tasks registered in lua when HAProxy is started with
serveral threads. These tasks have not specific affinity with threads so they
can be woken up on any threads. So, it is impossbile for these tasks to handled
cosockets or applets, because cosockets and applets are sticky on the thread
which created them. It is forbbiden to manipulate a cosocket from another
thread.
So to fix the bug, tasks registered in lua are now sticky to the current
thread. Because these tasks can be registered before threads creation, the
affinity is set the first time a lua's task is processed.
This patch must be backported in HAProxy 1.8.
In order to use arbitrary data in the CLI (multiple lines or group of words
that must be considered as a whole, for example), it is now possible to add a
payload to the commands. To do so, the first line needs to end with a special
pattern: <<\n. Everything that follows will be left untouched by the CLI parser
and will be passed to the commands parsers.
Per-command support will need to be added to take advantage of this
feature.
Signed-off-by: Aurlien Nephtali <aurelien.nephtali@corp.ovh.com>
PiBa-NL reported that haproxy crashes with a segmentation fault
if a function registered using `core.register_task` returns.
An example Lua script that reproduces the bug is:
mytask = function()
core.Info("Stopping task")
end
core.register_task(mytask)
The Valgrind output is as follows:
==6759== Process terminating with default action of signal 11 (SIGSEGV)
==6759== Access not within mapped region at address 0x20
==6759== at 0x5B60AA9: lua_sethook (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
==6759== by 0x430264: hlua_ctx_resume (hlua.c:1009)
==6759== by 0x43BB68: hlua_process_task (hlua.c:5525)
==6759== by 0x4FED0A: process_runnable_tasks (task.c:231)
==6759== by 0x4B2256: run_poll_loop (haproxy.c:2397)
==6759== by 0x4B2256: run_thread_poll_loop (haproxy.c:2459)
==6759== by 0x41A7E4: main (haproxy.c:3049)
Add the missing `task = NULL` for the `HLUA_E_OK` case. The error cases
have been fixed as of 253e53e661 which
first was included in haproxy v1.8-dev3. This bugfix should be backported
to haproxy 1.8.
Instead of hlua_socket_settimeout() accepting only integers, allow user
to specify float and double as well. Convert to milliseconds much like
cli_parse_set_timeout but also sanity check the value.
http://w3.impa.br/~diego/software/luasocket/tcp.html#settimeout
T. Fournier edit:
The main goal is to keep compatibility with the LuaSocket API. This
API only accept seconds, so using a float to specify milliseconds is
an acceptable way.
Update doc.
Negatives timeouts doesn't have sense. A negative timeout doesn't cause
a crash, but the connection expires before the system try to extablish it.
This patch should be backported in all versions from 1.6
The output of these function indicates that one element is pushed in
the stack, but no element is set in the stack. Actually, if anyone
read the value returned by this function, is gets "something"
present in the stack.
This patch is a complement of these one: 119a5f10e4
The LuaSocket documentation tell anything about the returned value,
but the effective code set an integer of value one.
316a9455b9/src/timeout.c (L172)
Thanks to Tim for the bug report.
This patch should be backported in all version from 1.6
The `socket.tcp.settimeout` method of Lua returns `1` in all cases,
while the `Socket.settimeout` method of haproxy returns `0` in all
cases. This breaks the `socket.http` module, because it validates
the return value of `settimeout`.
This bug was introduced in commit 7e7ac32dad
(which is the very first commit adding the Socket class to Lua). This
bugfix should be backported to every branch containing that commit:
- 1.6
- 1.7
- 1.8
A test case for this bug is as follows:
The 'Test' response header will contain an HTTP status code with the
patch applied and will be zero (nil) without the patch applied.
http.lua:
http = require("socket.http")
core.register_action("bug", { "http-req" }, function(txn)
local b, c, h = http.request {
url = "http://93.184.216.34",
headers = {
Host = "example.com"
},
create = core.tcp,
redirect = false
}
txn:set_var("txn.foo", c)
end)
haproxy.cfg:
global
lua-load /scratch/haproxy/http.lua
frontend fe
bind 127.0.0.1:8080
http-request lua.bug
http-response set-header Test %[var(txn.foo)]
default_backend be
backend be
server s example.com:80
The `socket.tcp.connect` method of Lua requires at least two parameters:
The host and the port. The `Socket.connect` method of haproxy requires
only one when a host with a combined port is provided. This stems from
the fact that `str2sa_range` is used internally in `hlua_socket_connect`.
This very fact unfortunately causes a diversion in the behaviour of
Lua's socket class and haproxy's for IPv6 addresses:
sock:connect("::1", "80")
works fine with Lua, but fails with:
connect: cannot parse destination address '::1'
in haproxy, because `str2sa_range` parses the trailing `:1` as the port.
This patch forcefully adds a `:` to the end of the address iff a port
number greater than `0` is given as the second parameter.
Technically this breaks backwards compatibility, because the docs state:
> The syntax "127.0.0.1:1234" is valid. in this case, the
> parameter *port* is ignored.
But: The connect() call can only succeed if the second parameter is left
out (which causes no breakage) or if the second parameter is an integer
or a numeric string.
It seems unlikely that someone would provide an address with a port number
and would also provide a second parameter containing a number other than
zero. Thus I feel this breakage is warranted to fix the mismatch between
haproxy's socket class and Lua's one.
This commit should be backported to haproxy 1.8 only, because of the
possible breakage of existing Lua scripts.
The default value of the pattern in `Socket.receive` is `*l` according
to the documentation and in the `socket.tcp.receive` method of Lua.
The default value of `wanted` in `int hlua_socket_receive(struct lua_State *)`
reflects this requirement, but the function fails to ensure this
nonetheless:
If no parameter is given the top of the Lua stack will have the index 1.
`lua_pushinteger(L, wanted);` then pushes the default value onto the stack
(with index 2).
The following `lua_replace(L, 2);` then pops the top index (2) and tries to
replace the index 2 with it.
I am not sure why exactly that happens (possibly, because one cannot replace
non-existent stack indicies), but this causes the stack index to be lost.
`hlua_socket_receive_yield` then tries to read the stack index 2, to
determine what to read and get the value `0`, instead of the correct
HLSR_READ_LINE, thus taking the wrong branch.
Fix this by ensuring that the top of the stack is not replaced by itself.
This bug was introduced in commit 7e7ac32dad
(which is the very first commit adding the Socket class to Lua). This
bugfix should be backported to every branch containing that commit:
- 1.6
- 1.7
- 1.8
A test case for this bug is as follows:
The 'Test' response header will contain an HTTP status line with the
patch applied and will be empty without the patch applied. Replacing
the `sock:receive()` with `sock:receive("*l")` will cause the status
line to appear with and without the patch
http.lua:
core.register_action("bug", { "http-req" }, function(txn)
local sock = core.tcp()
sock:settimeout(60)
sock:connect("127.0.0.1:80")
sock:send("GET / HTTP/1.0\r\n\r\n")
response = sock:receive()
sock:close()
txn:set_var("txn.foo", response)
end)
haproxy.cfg (bits omitted for brevity):
global
lua-load /scratch/haproxy/http.lua
frontend fe
bind 127.0.0.1:8080
http-request lua.bug
http-response set-header Test %[var(txn.foo)]
default_backend be
backend be
server s 127.0.0.1:80
When using an incorrect 'mode' as 2nd argument of core.register_service(),
HAProxy crashes while displaying the error message.
To be backported to 1.8, 1.7 and 1.6.
The thread patches adds refcount for notifications. The notifications are
used with the Lua cosocket. These refcount free the notifications when
the session is cleared. In the Lua task case, it not have sessions, so
the nofications are never cleraed.
This patch adds a garbage collector for signals. The garbage collector
just clean the notifications for which the end point is disconnected.
This patch should be backported in 1.8
During the migration to the second version of the pools, the new
functions and pool pointers were all called "pool_something2()" and
"pool2_something". Now there's no more pool v1 code and it's a real
pain to still have to deal with this. Let's clean this up now by
removing the "2" everywhere, and by renaming the pool heads
"pool_head_something".
This macro should be used to declare variables or struct members depending on
the USE_THREAD compile option. It avoids the encapsulation of such declarations
between #ifdef/#endif. It is used to declare all lock variables.
All the references to connections in the data path from streams and
stream_interfaces were changed to use conn_streams. Most functions named
"something_conn" were renamed to "something_cs" for this. Sometimes the
connection still is what matters (eg during a connection establishment)
and were not always renamed. The change is significant and minimal at the
same time, and was quite thoroughly tested now. As of this patch, all
accesses to the connection from upper layers go through the pass-through
mux.