This patch adds support for extracting captured header fields to halog. A field
can be extracted by passing the `-hdr <block>:<field>` output filter.
Both `<block>` and `<field>` are 1-indexed.
`<block>` refers to the index of the brace-delimited list of headers. If both
request and response headers are captured, then request headers are referenced
by `<block> = 1`, response headers are `2`. If only one direction is captured,
there will only be a single block `1`.
`<field>` refers to a single field within the selected block.
The output will contain one line, possibly empty, per log line processed.
Passing a non-existent `<block>` or `<field>` will result in an empty line.
Example:
capture request header a len 50
capture request header b len 50
capture request header c len 50
capture response header d len 50
capture response header e len 50
capture response header f len 50
`-hdr 1:1` will extract request header `a`
`-hdr 1:2` will extract request header `b`
`-hdr 1:3` will extract request header `c`
`-hdr 2:3` will extract response header `f`
This resolves GitHub issue #1146.
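The `<block>:<field>` lookup described above can be sketched as follows. This is a minimal illustration, not halog's actual code: `extract_hdr()` is a hypothetical helper, and it assumes the fields inside a block are separated by `|` (as in `{a|b|c} {d|e|f}`) and that no other braces precede the captured headers on the line.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not halog's implementation: locate field <field> of
 * the brace-delimited block <block> (both 1-indexed) in <line>.
 * Assumption: fields inside a block are separated by '|'.
 * Returns a pointer to the field and stores its length in *len, or returns
 * NULL when the requested block or field does not exist.
 */
const char *extract_hdr(const char *line, int block, int field, size_t *len)
{
	const char *p = line;
	const char *end;
	const char *sep;

	if (block < 1 || field < 1)
		return NULL;

	/* walk over the "{...}" blocks until the requested one is reached */
	while (1) {
		p = strchr(p, '{');
		if (!p)
			return NULL;
		end = strchr(p, '}');
		if (!end)
			return NULL;
		if (--block == 0)
			break;
		p = end + 1;
	}

	p++; /* first byte after '{' */

	/* skip the fields that precede the requested one */
	while (--field > 0) {
		sep = memchr(p, '|', end - p);
		if (!sep)
			return NULL;
		p = sep + 1;
	}

	/* the field ends at the next separator or at the closing brace */
	sep = memchr(p, '|', end - p);
	*len = (size_t)((sep ? sep : end) - p);
	return p;
}
```

A caller would print `len` bytes starting at the returned pointer, or an empty line when the result is NULL, which matches the "one line, possibly empty, per log line" behaviour described above.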
This is a modifier, not an output filter. In particular, the "only one may be
used at a time" restriction does not apply to it.
See commit 24b8d693b202b01b649f64ed878d8f9dd1b242e4.
Our use case for this is a dynamic application that performs routing based on
the query string. Without this option, all URLs point to the central
entry point of this location, which makes the output useless.
Dmitry reported this warning on FreeBSD since the introduction of -Wundef:
admin/halog/fgets2.c:38:30: warning: '__GLIBC__' is not defined, evaluates to 0 [-Wundef]
#if defined(__x86_64__) && (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 15))
^
A defined() check on __GLIBC__ was missing.
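A guarded version of the condition could look like the sketch below. It is not necessarily the exact line committed to fgets2.c, and `HAVE_FAST_MEMCHR` and `have_fast_memchr()` are hypothetical names used only for this illustration. Because `defined(__GLIBC__)` is tested first, the version comparison is not evaluated on systems without glibc, so -Wundef has nothing to report.

```c
#include <string.h> /* any libc header; glibc defines __GLIBC__ through it */

/*
 * Sketch: check that __GLIBC__ exists before comparing its value.
 * HAVE_FAST_MEMCHR is a hypothetical macro name for this example.
 */
#if defined(__x86_64__) && defined(__GLIBC__) && \
    (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 15))
#define HAVE_FAST_MEMCHR 1
#else
#define HAVE_FAST_MEMCHR 0
#endif

/* returns 1 when the build targets x86_64 with glibc >= 2.15, else 0 */
int have_fast_memchr(void)
{
	return HAVE_FAST_MEMCHR;
}
```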
halog currently emits lots of warnings because it does not benefit from
the default flags. Let's update the main makefile to build it by itself
and remove the other one. The sub-project's makefile is replaced with
a readme indicating how to build it.
There has been a USE_MEMCHR option for ages that was almost never enabled
because it was unclear when glibc's memchr() became faster. A quick look at
the code indicates that this came with the SSE implementation of memchr(),
which landed in commit 093ecf92998de2 between 2.14 and 2.15, so let's
automatically enable it on x86_64 with glibc >= 2.15.
This results in ~6 GB/s of logs read (20 million lines per second), ~2.5 GB/s
(8 million lines) parsed for error or status code classification, and 1 GB/s
(3 million lines) for time percentiles.
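To illustrate why memchr() matters here, the sketch below finds line boundaries in a buffer by delegating the search for `\n` to memchr() instead of testing one byte at a time. It is a simplified stand-in, not the code from fgets2.c, and `count_lines()` is a hypothetical helper.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count the newline-terminated lines in <buf>.
 * Each search for '\n' is handed to memchr(), which an optimized libc
 * may implement with SIMD instructions.
 */
size_t count_lines(const char *buf, size_t len)
{
	const char *p = buf;
	const char *end = buf + len;
	size_t lines = 0;

	while (p < end) {
		const char *nl = memchr(p, '\n', (size_t)(end - p));

		if (!nl)
			break; /* trailing data without a newline is not counted */
		lines++;
		p = nl + 1;
	}
	return lines;
}
```

Whether this beats a hand-written loop depends on the libc in use, which is why the option is only turned on automatically where memchr() is known to be fast.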