The isalnum(), isalpha(), isdigit() etc functions from ctype.h are
supposed to take an int in argument which must either reflect an
unsigned char or EOF. In practice on some platforms they're implemented
as macros referencing an array, and when passed a char, they either cause
a warning "array subscript has type 'char'" when lucky, or cause random
segfaults when unlucky. It's quite unconvenient by the way since none of
them may return true for negative values. The recent introduction of
cygwin to the list of regularly tested build platforms revealed a lot
of breakage there due to the same issues again.
So this patch addresses the problem all over the code at once. It adds
unsigned char casts to every valid use case, and also drops the unneeded
double cast to int that was sometimes added on top of it.
It may be backported by dropping irrelevant changes if that helps better
support uncommon platforms. It's unlikely to fix bugs on platforms which
would already not emit any warning though.
This change make the payload filtering uniform between TCP and HTTP
filters. Now, in TCP, like in HTTP, there is only one callback responsible to
forward data. Thus, old callbacks, tcp_data() and tcp_forward_data(), are
replaced by a single callback function, tcp_payload(). This new callback gets
the offset in the payload to (re)start the filtering and the maximum amount of
data it can forward. It is the filter's responsibility to be compatible with HTX
streams. If not, it must not set the flag FLT_CFG_FL_HTX.
Because of this change, nxt and fwd offsets are no longer needed. Thus they are
removed from the filter structure with their update functions,
flt_change_next_size() and flt_change_forward_size(). Moreover, the trace filter
has been updated accordingly.
This patch breaks the compatibility with the old API. Thus it should probably
not be backported. But, AFAIK, there is no TCP filter, thus the breakage is very
limited.
Names of these macros may enter in conflict with the macros of the runtime
tracing mechanism. So the prefix "FLT_" has been added to avoid any ambiguities.
The old module proto_http does not exist anymore. All code dedicated to the HTTP
analysis is now grouped in the file proto_htx.c. So, to finish the polishing
after removing the legacy HTTP code, proto_htx.{c,h} files have been moved in
http_ana.{c,h} files.
In addition, all HTX analyzers and related functions prefixed with "htx_" have
been renamed to start with "http_" instead.
First of all, all legacy HTTP analyzers and all functions exclusively used by
them were removed. So the most of the functions in proto_http.{c,h} were
removed. Only functions to deal with the HTTP transaction have been kept. Then,
http_msg and hdr_idx modules were entirely removed. And finally the structure
http_msg was lightened of all its useless information about the legacy HTTP. The
structure hdr_ctx was also removed because unused now, just like unused states
in the enum h1_state. Note that the memory pool "hdr_idx" was removed and
"http_txn" is now smaller.
The legacy HTTP callbacks were removed (trace_http_data,
trace_http_chunk_trailers and trace_http_forward_data). And the loop on the HTTP
headers was updated to only handle HTX messages.
Previous fix about the random forwarding on the message body was not enough to
fix the bug in all cases. Among others, when there is no data but only the EOM,
we must forward everything.
This patch must be backported to 1.9 if the patch 0bdeeaacb ("BUG/MINOR:
flt_trace/htx: Only apply the random forwarding on the message body.") is also
backported.
There is no bug here, but this patch improves the debug message reported during
the random forwarding. The original offset is kept untouched so its value may be
used to format the message. Before, 0 was always reported.
In the function trace_http_payload(), when the random forwarding is enabled,
only blocks of type HTX_BLK_DATA must be considered. Because other blocks must
be forwarding in one time.
This patch must be backported to 1.9. But it will have to be adapted. Because
several changes on the HTX in the 2.0 are missing in the 1.9.
HTTP trailers are now parsed in the same way headers are. It means trailers are
converted to K/V blocks followed by an end-of-trailer marker. For now, to make
things simple, the type for trailer blocks are not the same than for header
blocks. But the aim is to make no difference between headers and trailers by
using the same type. Probably for the end-of marker too.
The filters filtering HTX body, in the callback http_payload, must now loop on
an HTX message starting from the first block position. The offset passed as
parameter is relative to this position and not the head one. It is mandatory
because once filtered, data are now forwarded using the function
channel_htx_fwd_payload(). So the first block position is always updated.
The first block is the start-line, if defined. Otherwise it the head of the HTX
message. So now, during HTTP analysis, lookup are all done using the first block
instead of the head. Concretely, for now, it is the same because only one HTTP
message is stored at a time in an HTX message. 1xx informational messages are
handled separatly from the final reponse and from each other. But it will make
sense when the 1xx informational messages and the associated final reponse will
be stored in the same HTX message.
Now, we only return the start-line. If not found, NULL is returned. No lookup is
performed and the HTX message is no more updated. It is now the caller
responsibility to update the position of the start-line to the right value. So
when it is not found, i.e sl_pos is set to -1, it means the last start-line has
been already processed and the next one has not been inserted yet.
It is mandatory to rely on this kind of warranty to store 1xx informational
responses and final reponse in the same HTX message.
All the HTX definition is self-contained and doesn't really depend on
anything external since it's a mostly protocol. In addition, some
external similar files (like h2) also placed in common used to rely
on it, making it a bit awkward.
This patch moves the two htx.h files into a single self-contained one.
The historical dependency on sample.h could be also removed since it
used to be there only for http_meth_t which is now in http.h.
Now, the function htx_from_buf() will set the buffer's length to its size
automatically. In return, the caller should call htx_to_buf() at the end to be
sure to leave the buffer hosting the HTX message in the right state. When the
caller can use the function htxbuf() to get the HTX message without any update
on the underlying buffer.
Of course, the flag FLT_CFG_FL_HTX must be used and not
STRM_FLT_FL_HAS_FILTERS. "Fortunately", these 2 flags have the same value, so
everything worked as expected.
The callback http_headers has been updated to dump HTX headers when the HTX
internal representation is in use. And the callback http_payload has been
implemented with its hexdump function.
This switches explicit calls to various trivial registration methods for
keywords, muxes or protocols from constructors to INITCALL1 at stage
STG_REGISTER. All these calls have in common to consume a single pointer
and return void. Doing this removes 26 constructors. The following calls
were addressed :
- acl_register_keywords
- bind_register_keywords
- cfg_register_keywords
- cli_register_kw
- flt_register_keywords
- http_req_keywords_register
- http_res_keywords_register
- protocol_register
- register_mux_proto
- sample_register_convs
- sample_register_fetches
- srv_register_keywords
- tcp_req_conn_keywords_register
- tcp_req_cont_keywords_register
- tcp_req_sess_keywords_register
- tcp_res_cont_keywords_register
- flt_register_keywords
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
Now the buffers only contain the header and a pointer to the storage
area which can be anywhere. This will significantly simplify buffer
swapping and will make it possible to map chunks on buffers as well.
The buf_empty variable was removed, as now it's enough to have size==0
and area==NULL to designate the empty buffer (thus a non-allocated head
is the empty buffer by default). buf_wanted for now is indicated by
size==0 and area==(void *)1.
The channels and the checks now embed the buffer's head, and the only
pointer is to the storage area. This slightly increases the unallocated
buffer size (3 extra ints for the empty buffer) but considerably
simplifies dynamic buffer management. It will also later permit to
detach unused checks.
The way the struct buffer is arranged has proven quite efficient on a
number of tests, which makes sense given that size is always accessed
and often first, followed by the othe ones.
This function was sometimes used from a channel and sometimes from a buffer.
In both cases it requires knowledge of the size of the output data (to skip
them). Here the split ensures the channel can deal with this point, and that
other places not having output data can continue to work.
These ones manipulate the output data count which will be specific to
the channel soon, so prepare the call points to use the channel only.
The b_* functions are now unused and were removed.
New callbacks have been added to handle creation and destruction of filter
instances:
* 'attach' callback is called after a filter instance creation, when it is
attached to a stream. This happens when the stream is started for filters
defined on the stream's frontend and when the backend is set for filters
declared on the stream's backend. It is possible to ignore the filter, if
needed, by returning 0. This could be useful to have conditional filtering.
* 'detach' callback is called when a filter instance is detached from a stream,
before its destruction. This happens when the stream is stopped for filters
defined on the stream's frontend and when the analyze ends for filters defined
on the stream's backend.
In addition, the callback 'stream_set_backend' has been added to know when a
backend is set for a stream. It is only called when the frontend and the backend
are not the same. And it is called for all filters attached to a stream
(frontend and backend).
Finally, the TRACE filter has been updated.
'channel_analyze' callback has been removed. Now, there are 2 callbacks to
surround calls to analyzers:
* channel_pre_analyze: Called BEFORE all filterable analyzers. it can be
called many times for the same analyzer, once at each loop until the
analyzer finishes its processing. This callback is resumable, it returns a
negative value if an error occurs, 0 if it needs to wait, any other value
otherwise.
* channel_post_analyze: Called AFTER all filterable analyzers. Here, AFTER
means when an analyzer finishes its processing. This callback is NOT
resumable, it returns a negative value if an error occurs, any other value
otherwise.
Pre and post analyzer callbacks are not automatically called. 'pre_analyzers'
and 'post_analyzers' bit fields in the filter structure must be set to the right
value using AN_* flags (see include/types/channel.h).
The flag AN_RES_ALL has been added (AN_REQ_ALL already exists) to ease the life
of filter developers. AN_REQ_ALL and AN_RES_ALL include all filterable
analyzers.
Instead of calling 'channel_analyze' callback with the flag AN_FLT_HTTP_HDRS,
now we use the new callback 'http_headers'. This change is done because
'channel_analyze' callback will be removed in a next commit.
Add opaque data between the filter keyword registrering and the parsing
function. This opaque data allow to use the same parser with differents
registered keywords. The opaque data is used for giving data which mainly
makes difference between the two keywords.
It will be used with Lua keywords registering.
Now, filter's configuration (.id, .conf and .ops fields) is stored in the
structure 'flt_conf'. So proxies own a flt_conf list instead of a filter
list. When a filter is attached to a stream, it gets a pointer on its
configuration. This avoids mixing the filter's context (owns by a stream) and
its configuration (owns by a proxy). It also saves 2 pointers per filter
instance.
The "trace" filter has been added. It defines all available callbacks and for
each one it prints a trace message. To enable it:
listener test
...
filter trace
...