This will be needed to later postpone server address resolution. We need the
FQDN even when it doesn't resolve. The caller then needs to check if fqdn was
set when resolve is null to detect that the address couldn't be parsed and
needs later resolution.
ipcpy() is used to replace an IP address with another one, but it
doesn't preserve the original port so all callers have to do it
manually while it's trivial to do there. Better do it inside the
function.
Often we need to call str2ip2() on an address which already contains a
port without replacing it, so let's ensure we preserve it even if the
family changes.
The function ipcpy() simply duplicates the IP address found in one
struct sockaddr_storage into an other struct sockaddr_storage.
It also update the family on the destination structure.
Memory of destination structure must be allocated and cleared by the
caller.
It is sometimes needed in application server environments to easily tell
if a source is local to the machine or a remote one, without necessarily
knowing all the local addresses (dhcp, vrrp, etc). Similarly in transparent
proxy configurations it is sometimes desired to tell the difference between
local and remote destination addresses.
This patch adds two new sample fetch functions for this :
dst_is_local : boolean
Returns true if the destination address of the incoming connection is local
to the system, or false if the address doesn't exist on the system, meaning
that it was intercepted in transparent mode. It can be useful to apply
certain rules by default to forwarded traffic and other rules to the traffic
targetting the real address of the machine. For example the stats page could
be delivered only on this address, or SSH access could be locally redirected.
Please note that the check involves a few system calls, so it's better to do
it only once per connection.
src_is_local : boolean
Returns true if the source address of the incoming connection is local to the
system, or false if the address doesn't exist on the system, meaning that it
comes from a remote machine. Note that UNIX addresses are considered local.
It can be useful to apply certain access restrictions based on where the
client comes from (eg: require auth or https for remote machines). Please
note that the check involves a few system calls, so it's better to do it only
once per connection.
Similar to "escape_chunk", this function tries to prefix all characters
tagged in the <map> with the <escape> character. The specified <string>
contains the input to be escaped.
Alexander Lebedev reported that the DNS parser crashes in 1.6 with a bus
error on Sparc when it receives a response. This is obviously caused by
some alignment issues. The issue can also be reproduced on ARMv5 when
setting /proc/cpu/alignment to 4 (which helps debugging).
Two places cause this crash in turn, the first one is when the IP address
from the packet is compared to the current one, and the second place is
when the address is assigned because an unaligned address is passed to
update_server_addr().
This patch modifies these places to properly use memcpy() and memcmp()
to manipulate the unaligned data.
Nenad Merdanovic found another set of places specific to 1.7 in functions
in_net_ipv4() and in_net_ipv6(), which are used to compare networks. 1.6
has the functions but does not use them. There we perform a temporary copy
to a local variable to fix the problem. The type of the function's argument
is wrong since it's not necessarily aligned, so we change it for a const
void * instead.
This fix must be backported to 1.6. Note that in 1.6 the code is slightly
different, there's no rec[] array, the pointer is used directly from the
buffer.
Changed all the cases where the pointer passed to realloc is overwritten
by the pointer returned by realloc. The new function my_realloc2 has
been used except in function register_name. If register_name fails to
add a new variable because of an "out of memory" error, all the existing
variables remain valid. If we had used my_realloc2, the array of variables
would have been freed.
int list_append_word(struct list *li, const char *str, char **err)
Append a copy of string <str> (inside a wordlist) at the end of
the list <li>.
The caller is responsible for freeing the <err> and <str> copy memory
area using free().
On failure : return 0 and <err> filled with an error message.
In C89, "void *" is automatically promoted to any pointer type. Casting
the result of malloc/calloc to the type of the LHS variable is therefore
unneeded.
Most of this patch was built using this Coccinelle patch:
@@
type T;
@@
- (T *)
(\(lua_touserdata\|malloc\|calloc\|SSL_get_app_data\|hlua_checkudata\|lua_newuserdata\)(...))
@@
type T;
T *x;
void *data;
@@
x =
- (T *)
data
@@
type T;
T *x;
T *data;
@@
x =
- (T *)
data
Unfortunately, either Coccinelle or I is too limited to detect situation
where a complex RHS expression is of type "void *" and therefore casting
is not needed. Those cases were manually examined and corrected.
The strftime() function can call tzset() internally on some platforms.
When haproxy is chrooted, the /etc/localtime file is not found, and some
implementations will clobber the content of the current timezone.
The GMT offset is computed by diffing the times returned by gmtime_r() and
localtime_r(). These variants are guaranteed to not call tzset() and were
already used in haproxy while chrooted, so they should be safe.
This patch must be backported to 1.6 and 1.5.
GMT offset used in local time formats was computed at startup, but was not updated when DST status changed while running.
For example these two RFC5424 syslog traces where emitted 5 seconds apart, just before and after DST changed:
<14>1 2016-03-27T01:59:58+01:00 bunch-VirtualBox haproxy 2098 - - Connect ...
<14>1 2016-03-27T03:00:03+01:00 bunch-VirtualBox haproxy 2098 - - Connect ...
It looked like they were emitted more than 1 hour apart, unlike with the fix:
<14>1 2016-03-27T01:59:58+01:00 bunch-VirtualBox haproxy 3381 - - Connect ...
<14>1 2016-03-27T03:00:03+02:00 bunch-VirtualBox haproxy 3381 - - Connect ...
This patch should be backported to 1.6 and partially to 1.5 (no fix needed in log.c).
The original author forgot to dereference the argument to free in
parse_binary. This may result in a crash on reading bad input from
the configuration file instead of a proper error message.
Found in HAProxy 1.5.14.
This parser takes a string containing an HTTP date. It returns
a broken-down time struct. We must considers considers this
time as GMT. Maybe later the timezone will be taken in account.
csv_enc_append() returns a pointer to the beginning of the encoded
string, which makes it convenient to use in printf(). However it's not
convenient for use in chunks as it may leave an unused byte at the
beginning depending on the automatic quoting. Let's modify it to work
in two passes. First it looks for a character that requires escaping
using strpbrk(), and second it encodes the string. This way it
guarantees to always start at the first available byte of the chunk.
Additionally it made the code quite simpler.
We have csv_enc() but there's no way to append some CSV-encoded data
to an existing chunk, so here we modify the existing function for this
and create an inlined version of csv_enc() which first resets the output
chunk. It will be handy to append data to an existing chunk without
having to use an extra temporary chunk, or to encode multiple strings
into a single chunk with chunk_newstr().
The patch is quite small, in fact most changes are typo fixes in the
comments.
When first parameter to getaddrinfo() is not NULL (it is always not NULL
in str2ip()), on Linux AI_PASSIVE value for ai_flags is ignored. On
FreeBSD, when AI_PASSIVE is specified and hostname parameter is not NULL,
getaddrinfo() ignores local address selection policy, always returning
AAAA record. Pass zero ai_flags to behave correctly on FreeBSD, this
change should be no-op for Linux.
This fix should be backported to 1.5 as well, after some observation
period.
If an environment variable is used in an address, and is not set, it's
silently considered as ":" or "0.0.0.0:0" which is not correct as it
can hide environment issues and lead to unexpected behaviours. Let's
report this case when it happens.
This fix should be backported to 1.5.
The function does a bunch of things among which resolving environment
variables, skipping address family specifiers and trimming port ranges.
It is the only one which sees the complete host name before trying to
resolve it. The DNS resolving code needs to know the original hostname,
so we modify this function to optionally provide it to the caller.
Note that the function itself doesn't know if the host part was a host
or an address, but str2ip() knows that and can be asked not to try to
resolve. So we first try to parse the address without resolving and
try again with resolving enabled. This way we know if the address is
explicit or needs some kind of resolution.
This patch adds 3 functions for 64 bit integer conversion.
* lltoa_r : converts signed 64 bit integer to string
* read_uint64 : converts from string to signed 64 bits integer with capping
* read_int64 : converts from string to unsigned 64 bits integer with capping
This function checks a string for using it in a CSV output format. If
the string contains one of the following four char <">, <,>, CR or LF,
the string is encapsulated between <"> and the <"> are escaped by a <"">
sequence.
The rounding by <"> is optionnal. It can be canceled, forced or the
function choose automatically the right way.
If a stick table is defined as below:
stick-table type ip size 50ka expire 300s
HAProxy will stop parsing size after passing through "50k" and return the value
directly. But such format string of size should not be valid. The patch checks
the next character to report error if any.
Signed-off-by: Godbach <nylzhaowei@gmail.com>
This converter escapes string to use it as json/ascii escaped string.
It can read UTF-8 with differents behavior on errors and encode it in
json/ascii.
json([<input-code>])
Escapes the input string and produces an ASCII ouput string ready to use as a
JSON string. The converter tries to decode the input string according to the
<input-code> parameter. It can be "ascii", "utf8", "utf8s", "utf8"" or
"utf8ps". The "ascii" decoder never fails. The "utf8" decoder detects 3 types
of errors:
- bad UTF-8 sequence (lone continuation byte, bad number of continuation
bytes, ...)
- invalid range (the decoded value is within a UTF-8 prohibited range),
- code overlong (the value is encoded with more bytes than necessary).
The UTF-8 JSON encoding can produce a "too long value" error when the UTF-8
character is greater than 0xffff because the JSON string escape specification
only authorizes 4 hex digits for the value encoding. The UTF-8 decoder exists
in 4 variants designated by a combination of two suffix letters : "p" for
"permissive" and "s" for "silently ignore". The behaviors of the decoders
are :
- "ascii" : never fails ;
- "utf8" : fails on any detected errors ;
- "utf8s" : never fails, but removes characters corresponding to errors ;
- "utf8p" : accepts and fixes the overlong errors, but fails on any other
error ;
- "utf8ps" : never fails, accepts and fixes the overlong errors, but removes
characters corresponding to the other errors.
This converter is particularly useful for building properly escaped JSON for
logging to servers which consume JSON-formated traffic logs.
Example:
capture request header user-agent len 150
capture request header Host len 15
log-format {"ip":"%[src]","user-agent":"%[capture.req.hdr(1),json]"}
Input request from client 127.0.0.1:
GET / HTTP/1.0
User-Agent: Very "Ugly" UA 1/2
Output log:
{"ip":"127.0.0.1","user-agent":"Very \"Ugly\" UA 1\/2"}
qstr() and cstr() will be used to quote-encode strings. The first one
does it unconditionally. The second one is aimed at CSV files where the
quote-encoding is only needed when the field contains a quote or a comma.
This helper is similar to addr_to_str but
tries to convert the port rather than the address
of a struct sockaddr_storage.
This is in preparation for supporting
an external agent check.
Signed-off-by: Simon Horman <horms@verge.net.au>
Dmitry Sivachenko reported that "uint" doesn't build on FreeBSD 10.
On Linux it's defined in sys/types.h and indicated as "old". Just
get rid of the very few occurrences.
These sockets are the same as Unix sockets except that there's no need
for any filesystem access. The address may be whatever string both sides
agree upon. This can be really convenient for inter-process communications
as well as for chaining backends to frontends.
These addresses are forced by prepending their address with "abns@" for
"abstract namespace".
When compiled with USE_GETADDRINFO, make sure we use getaddrinfo(3) to
perform name lookups. On default dual-stack setups this will change the
behavior of using IPv6 first. Global configuration option
'nogetaddrinfo' can be used to revert to deprecated gethostbyname(3).
OpenBSD complains about the use of sprintf in human_time() :
src/standard.o(.text+0x1c40): In function `human_time':
src/standard.c:2067: warning: sprintf() is often misused, please use snprintf()
We can easily get around this by having a pointer to the end of the string and
using snprintf() instead.
OpenBSD complains about our use of strcpy() in standard.c. The checks
were OK and we didn't fall into the category of "almost always misused",
but it's very simple to fix it so better do it before a problem happens.
src/standard.o(.text+0x26ab): In function `str2sa_range':
src/standard.c:718: warning: strcpy() is almost always misused, please use strlcpy()
The function url2sa() converts faster url like http://<ip>:<port> in a
struct sockaddr_storage. This patch add:
- the https support
- permit to return the length parsed
- support IPv6
- support DNS synchronous resolution only during start of haproxy.
The faster IPv4 convertion way is keeped. IPv6 is slower, because I use
the standard IPv6 parser function.
The function str2net runs DNS resolution if valid ip cannot be parsed.
The DNS function used is the standard function of the libc and it
performs asynchronous request.
The asynchronous request is not compatible with the haproxy
archictecture.
str2net() is used during the runtime throught the "socket".
This patch remove the DNS resolution during the runtime.
The goal of these patch is to simplify the prototype of
"pat_pattern_*()" functions. I want to replace the argument "char
**args" by a simple "char *arg" and remove the "opaque" argument.
"pat_parse_int()" and "pat_parse_dotted_ver()" are the unique pattern
parser using the "opaque" argument and using more than one string
argument of the char **args. These specificities are only used with ACL.
Other systems using this pattern parser (MAP and CLI) just use one
string for describing a range.
This two functions can read a range, but the min and the max must y
specified. This patch extends the syntax to describe a range with
implicit min and max. This is used for operators like "lt", "le", "gt",
and "ge". the syntax is the following:
":x" -> no min to "x"
"x:" -> "x" to no max
This patch moves the parsing of the comparison operator from the
functions "pat_parse_int()" and "pat_parse_dotted_ver()" to the acl
parser. The acl parser read the operator and the values and build a
volatile string readable by the functions "pat_parse_int()" and
"pat_parse_dotted_ver()". The transformation is done with these rules:
If the parser is "pat_parse_int()":
"eq x" -> "x"
"le x" -> ":x"
"lt x" -> ":y" (with y = x - 1)
"ge x" -> "x:"
"gt x" -> "y:" (with y = x + 1)
If the parser is "pat_parse_dotted_ver()":
"eq x.y" -> "x.y"
"le x.y" -> ":x.y"
"lt x.y" -> ":w.z" (with w.z = x.y - 1)
"ge x.y" -> "x.y:"
"gt x.y" -> "w.z:" (with w.z = x.y + 1)
Note that, if "y" is not present, assume that is "0".
Now "pat_parse_int()" and "pat_parse_dotted_ver()" accept only one
pattern and the variable "opaque" is no longer used. The prototype of
the pattern parsers can be changed.
Actually the values returned by this function is never used. All the
callers just check if the resultat is non-zero. Before this patch, the
function returns the length of the produced content. This value is not
useful because is returned twice: the first time in the return value and
the second time in the <binstrlen> argument. Now the function returns
the number of bytes consumed from <source>.