Commit Graph

296 Commits

Author SHA1 Message Date
Willy Tarreau
199ad24661 MINOR: stats: report the number of active peers in "show info"
Peers are the last type of activity which can maintain a job present, so
it's important to report that such an entity is still active to explain
why the job count may be higher than zero. Here by "ActivePeers" we report
peers sessions, which include both established connections and outgoing
connection attempts.
2018-11-05 17:15:21 +01:00
Willy Tarreau
00098ea034 MINOR: stats: report the number of active jobs and listeners in "show info"
When an haproxy process doesn't stop after a reload, it's because it
still has some active "jobs", which mainly are active sessions, listeners,
peers or other specific activities. Sometimes it's difficult to troubleshoot
the cause of these issues (which generally are the result of a bug) only
because some indicators are missing.

This patch add the number of listeners, the number of jobs, and the stopping
status to the output of "show info". This way it becomes a bit easier to try
to narrow down the cause of such an issue should it happen. A typical use
case is to connect to the CLI before reloading, then issuing the "show info"
command to see what happens. In the normal situation, stopping should equal
1, jobs should equal 1 (meaning only the CLI is still active) and listeners
should equal zero.

The patch is so trivial that it could make sense to backport it to 1.8 in
order to help with troubleshooting.
2018-11-05 17:15:21 +01:00
Willy Tarreau
21ff2c46b7 BUILD: stats: remove build warnings on potential null-derefs
A couple of objt_appctx() could be replaced with their unchecked
equivalent since the pointer is guaranteed and not checked there.
2018-09-20 11:42:15 +02:00
Willy Tarreau
35b51c6e5b REORG: http: move the HTTP semantics definitions to http.h/http.c
It's a bit painful to have to deal with HTTP semantics for each protocol
version (H1 and H2), and working on the version-agnostic code further
emphasizes the problem.

This patch creates http.h and http.c which are agnostic to the version
in use, and which borrow a few parts from proto_http and from h1. For
example the once thought h1-specific h1_char_classes array is in fact
dictated by RFC7231 and is used to parse HTTP headers. A few changes
were made to a few files which were including proto_http.h while they
only needed http.h.

Certain string definitions pre-dated the introduction of indirect
strings (ist) so some were used to simplify the definition of the known
HTTP methods. The current lookup code saves 2 kB of a heavily used table
and is faster than the previous table based lookup (typ. 14 ns vs 16
before).
2018-09-11 10:30:25 +02:00
Willy Tarreau
055ba4f505 BUG/MEDIUM: stats: don't ask for more data as long as we're responding
The stats applet is still a bit hackish. It uses the HTTP txn to parse
the POST contents. Due to this it pretends not having parsed the request
from the buffer so that the HTTP parser continues to work fine on these
data. This comes with a side effect : the request lies pending in the
channel's buffer, and because of this, stream_int_update_applet() always
wakes the applet up. It's very visible when retrieving a large stats page
over a slow link as haproxy eats 100% of the CPU waiting for the data to
leave.

While the proper long term solution definitely is to consume these data
and parse the body from the applet, changing this is not suitable for a
fix.

What this patch does instead is to disable request polling as long as there
are pending data in the response buffer. Given that for almost all cases,
the applet remains busy sending data, this is at least enough to ensure
that we don't wake up for the pending request data while we're waiting for
the client to receive these data. Now a 5k backend stats page is dumped at
1% CPU over a 10 Mbps link instead of 100%, using 1500 epoll_wait() calls
instead of 80000.

Note that the previous fix (BUG/MEDIUM: stream-int: don't immediately
enable reading when the buffer was reportedly full) is necessary for the
effects of the fix to be noticed since both bugs have the exact same
effect.

This fix must be backported at least as far as 1.5.
2018-07-24 17:13:32 +02:00
Willy Tarreau
83061a820e MAJOR: chunks: replace struct chunk with struct buffer
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
2018-07-19 16:23:43 +02:00
Willy Tarreau
843b7cbe9d MEDIUM: chunks: make the chunk struct's fields match the buffer struct
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.

Most of the changes were made with spatch using this coccinelle script :

  @rule_d1@
  typedef chunk;
  struct chunk chunk;
  @@
  - chunk.str
  + chunk.area

  @rule_d2@
  typedef chunk;
  struct chunk chunk;
  @@
  - chunk.len
  + chunk.data

  @rule_i1@
  typedef chunk;
  struct chunk *chunk;
  @@
  - chunk->str
  + chunk->area

  @rule_i2@
  typedef chunk;
  struct chunk *chunk;
  @@
  - chunk->len
  + chunk->data

Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
2018-07-19 16:23:43 +02:00
Willy Tarreau
c9fa0480af MAJOR: buffer: finalize buffer detachment
Now the buffers only contain the header and a pointer to the storage
area which can be anywhere. This will significantly simplify buffer
swapping and will make it possible to map chunks on buffers as well.

The buf_empty variable was removed, as now it's enough to have size==0
and area==NULL to designate the empty buffer (thus a non-allocated head
is the empty buffer by default). buf_wanted for now is indicated by
size==0 and area==(void *)1.

The channels and the checks now embed the buffer's head, and the only
pointer is to the storage area. This slightly increases the unallocated
buffer size (3 extra ints for the empty buffer) but considerably
simplifies dynamic buffer management. It will also later permit to
detach unused checks.

The way the struct buffer is arranged has proven quite efficient on a
number of tests, which makes sense given that size is always accessed
and often first, followed by the othe ones.
2018-07-19 16:23:43 +02:00
Willy Tarreau
97f538b895 MINOR: stats: adapt to the new buffers API
The changes are fairly straightforward. Some places require to trim
the length. Maybe we'd need a b_extend() or b_adjust() for this.
2018-07-19 16:23:42 +02:00
Willy Tarreau
89faf5d7c3 MINOR: buffer: remove bo_ptr()
It was replaced by co_head() when a channel was known, otherwise b_head().
2018-07-19 16:23:40 +02:00
Willy Tarreau
1b0f85e47f MINOR: stats: also report the failed header rewrites warnings on the stats page
These ones concern the warnings detected during header addition/insertion.
They are visible in the tooltip reporting the per-status codes stats. The
frontend and backend contain a total of request+response warnings, while
server only has the response warnings.
2018-05-28 15:16:23 +02:00
Tim Duesterhus
3fd1973d37 MINOR: http: Log warning if (add|set)-header fails
This patch adds a warning if an http-(request|reponse) (add|set)-header
rewrite fails to change the respective header in a request or response.

This usually happens when tune.maxrewrite is not sufficient to hold all
the headers that should be added.
2018-05-28 14:53:59 +02:00
Aurélien Nephtali
abbf607105 MEDIUM: cli: Add payload support
In order to use arbitrary data in the CLI (multiple lines or group of words
that must be considered as a whole, for example), it is now possible to add a
payload to the commands. To do so, the first line needs to end with a special
pattern: <<\n. Everything that follows will be left untouched by the CLI parser
and will be passed to the commands parsers.

Per-command support will need to be added to take advantage of this
feature.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-26 14:19:33 +02:00
Yves Lafon
95317289e9 MINOR: stats: display the number of threads in the statistics.
Add the nbthread global variable to the output, matching nbproc.

This may be backported to 1.8
2018-02-26 11:53:46 +01:00
Willy Tarreau
d80cb4ee13 MINOR: global: add some global activity counters to help debugging
A number of counters have been added at special places helping better
understanding certain bug reports. These counters are maintained per
thread and are shown using "show activity" on the CLI. The "clear
counters" commands also reset these counters. The output is sent as a
single write(), which currently produces up to about 7 kB of data for
64 threads. If more counters are added, it may be necessary to write
into multiple buffers, or to reset the counters.

To backport to 1.8 to help collect more detailed bug reports.
2018-01-23 15:38:33 +01:00
Olivier Houchard
fbc74e8556 MINOR/CLEANUP: proxy: rename "proxy" to "proxies_list"
Rename the global variable "proxy" to "proxies_list".
There's been multiple proxies in haproxy for quite some time, and "proxy"
is a potential source of bugs, a number of functions have a "proxy" argument,
and some code used "proxy" when it really meant "px" or "curproxy". It worked
by pure luck, because it usually happened while parsing the config, and thus
"proxy" pointed to the currently parsed proxy, but we should probably not
rely on this.

[wt: some of these are definitely fixes that are worth backporting]
2017-11-24 17:21:27 +01:00
Willy Tarreau
ee8269e84d BUG/MINOR: stream: fix tv_request calculation for applets
When the stats code was moved to an applet, it wasn't completely
cleaned of its usage of the HTTP transaction and it used to store
the HTTP status in txn->status and to set the HTTP request date to
<now> from within the applet. This is totally wrong because the
applet is seen as a server from the HTTP engine, which parses its
response, so the http_txn must not be touched there.

This was made visible by the cache which would always exhibit a
negative TR log, indicating that nowhere in the code we took care of
setting s->logs.tv_request while the code above used to continue to
hide this. Another side effect of this issue is that under load, if
the stats applet call risks to be delayed, the reported t_queue can
appear negative by being below tv_request-tv_accept.

This patch removes the assignment of tv_request and txn->status from
the applet code and instead sets the tv_request if still unset when
connecting to the applet. This ensures that all applets report correct
request timers now.
2017-11-23 17:34:29 +01:00
Christopher Faulet
2a944ee16b BUILD: threads: Rename SPIN/RWLOCK macros using HA_ prefix
This remove any name conflicts, especially on Solaris.
2017-11-07 11:10:24 +01:00
Christopher Faulet
1bc04c7664 BUG/MINOR: threads: Add missing THREAD_LOCAL on static here and there 2017-10-31 13:58:33 +01:00
Emeric Brun
9f0b458525 MEDIUM: threads/server: Use the server lock to protect health check and cli concurrency 2017-10-31 13:58:33 +01:00
Christopher Faulet
29f77e846b MEDIUM: threads/server: Add a lock per server and atomically update server vars
The server's lock is use, among other things, to lock acces to the active
connection list of a server.
2017-10-31 13:58:31 +01:00
Emeric Brun
5a1335110c BUG/MEDIUM: log: check result details truncated.
Fix regression introduced by commit:
'MAJOR: servers: propagate server status changes asynchronously.'

The building of the log line was re-worked to be done at the
postponed point without lack of data.

[wt: this only affects 1.8-dev, no backport needed]
2017-10-19 18:51:32 +02:00
Willy Tarreau
06d80a9a9c REORG: channel: finally rename the last bi_* / bo_* functions
For HTTP/2 we'll need some buffer-only equivalent functions to some of
the ones applying to channels and still squatting the bi_* / bo_*
namespace. Since these names have kept being misleading for quite some
time now and are really getting annoying, it's time to rename them. This
commit will use "ci/co" as the prefix (for "channel in", "channel out")
instead of "bi/bo". The following ones were renamed :

  bi_getblk_nc, bi_getline_nc, bi_putblk, bi_putchr,
  bo_getblk, bo_getblk_nc, bo_getline, bo_getline_nc, bo_inject,
  bi_putchk, bi_putstr, bo_getchr, bo_skip, bi_swpbuf
2017-10-19 15:01:08 +02:00
Olivier Houchard
00bc3cb59f BUG/MINOR: stats: Clear a bit more counters with in cli_parse_clear_counters().
Clear MaxSslRate, SslFrontendMaxKeyRate and SslBackendMaxKeyRate when
clear counters is used, it was probably forgotten when those counters were
added.

[wt: this can probably be backported as far as 1.5 in dumpstats.c]
2017-10-18 18:36:08 +02:00
Willy Tarreau
31794892af MINOR: unix: remove the now unused proto_uxst.h file
Since everything is self contained in proto_uxst.c there's no need to
export anything. The same should be done for proto_tcp.c but the file
contains other stuff that's not related to the TCP protocol itself
and which should first be moved somewhere else.
2017-09-15 11:49:52 +02:00
Andjelko Iharos
c3680ecdf8 MINOR: add severity information to cli feedback messages 2017-09-13 13:38:32 +02:00
Emeric Brun
52a91d3d48 MEDIUM: check: server states and weight propagation re-work
The server state and weight was reworked to handle
"pending" values updated by checks/CLI/LUA/agent.
These values are commited to be propagated to the
LB stack.

In further dev related to multi-thread, the commit
will be handled into a sync point.

Pending values are named using the prefix 'next_'
Current values used by the LB stack are named 'cur_'
2017-09-05 15:23:16 +02:00
William Lallemand
07a62f7a7e MINOR: cli: add ACCESS_LVL_MASK to store the access level
The current level variable use only 2 bits for storing the 3 access
level (user, oper and admin).

This patch add a bitmask which allows to use the remaining bits for
other usage.
2017-05-27 07:02:06 +02:00
Willy Tarreau
9d7fb63e33 BUILD/MINOR: stats: remove unexpected argument to stats_dump_json_header()
Commit 05ee213 ("MEDIUM: stats: Add JSON output option to show (info|stat)")
used to pass argument "uri" to the aforementionned function which doesn't
take any. It's probably a leftover from multiple iterations of the same
patchset. Spotted by Dmitry Sivachenko. No backport is needed.
2017-04-11 07:54:45 +02:00
Simon Horman
6f6bb380ef MEDIUM: stats: Add show json schema
This may be used to output the JSON schema which describes the output of
show info json and show stats json.

The JSON output is without any extra whitespace in order to reduce the
volume of output. For human consumption passing the output through a
pretty printer may be helpful.

e.g.:
$ echo "show schema json" | socat /var/run/haproxy.stat stdio | \
     python -m json.tool

The implementation does not generate the schema. Some consideration could
be given to integrating the output of the schema with the output of
typed and json info and stats. In particular the types (u32, s64, etc...)
and tags.

A sample verification of show info json and show stats json using
the schema is as follows. It uses the jsonschema python module:

cat > jschema.py <<  __EOF__
import json

from jsonschema import validate
from jsonschema.validators import Draft3Validator

with open('schema.txt', 'r') as f:
    schema = json.load(f)
    Draft3Validator.check_schema(schema)

    with open('instance.txt', 'r') as f:
        instance = json.load(f)
	validate(instance, schema, Draft3Validator)
__EOF__

$ echo "show schema json" | socat /var/run/haproxy.stat stdio > schema.txt
$ echo "show info json" | socat /var/run/haproxy.stat stdio > instance.txt
python ./jschema.py
$ echo "show stats json" | socat /var/run/haproxy.stat stdio > instance.txt
python ./jschema.py

Signed-off-by: Simon Horman <horms@verge.net.au>
2017-03-14 11:14:03 +01:00
Simon Horman
05ee213f8b MEDIUM: stats: Add JSON output option to show (info|stat)
Add a json parameter to show (info|stat) which will output information
in JSON format. A follow-up patch will add a JSON schema which describes
the format of the JSON output of these commands.

The JSON output is without any extra whitespace in order to reduce the
volume of output. For human consumption passing the output through a
pretty printer may be helpful.

e.g.:
$ echo "show info json" | socat /var/run/haproxy.stat stdio | \
     python -m json.tool

STAT_STARTED has bee added in order to track if show output has begun or
not. This is used in order to allow the JSON output routines to only insert
a "," between elements when needed. I would value any feedback on how this
might be done better.

Signed-off-by: Simon Horman <horms@verge.net.au>
2017-03-14 11:14:03 +01:00
Willy Tarreau
04276f3d6e MEDIUM: server: split the address and the port into two different fields
Keeping the address and the port in the same field causes a lot of problems,
specifically on the DNS part where we're forced to cheat on the family to be
able to keep the port. This causes some issues such as some families not being
resolvable anymore.

This patch first moves the service port to a new field "svc_port" so that the
port field is never used anymore in the "addr" field (struct sockaddr_storage).
All call places were adapted (there aren't that many).
2017-01-06 19:29:33 +01:00
David Harrigan
d3db35a1d1 MINOR: stats: Support "select all" for backend actions
Allow the user to quickly select all servers within a group before invoking an
action.
2017-01-02 16:56:33 +01:00
Thierry FOURNIER
0ff98a4758 BUG/MINOR: stats: fix be/sessions/current out in typed stats
"scur" was typed as "limit" (FO_CONFIG) and "config value" (FN_LIMIT).
The real types of "scur" are "metric" (FO_METRIC) and "gauge"
(FN_GAUGE). FO_METRIC and FN_GAUGE are the value 0.
2016-12-22 23:20:00 +01:00
Willy Tarreau
d25fc79d72 CLEANUP: stats: move a misplaced stats context initialization
This is a leftover from the cleanup campaign, the stats scope was still
initialized by the CLI instead of being initialized by the stats keyword
parsers. This should probably be backported to 1.7 to make the code more
consistent.
2016-12-16 19:40:13 +01:00
Christopher Faulet
a73e59b690 BUG/MAJOR: Fix how the list of entities waiting for a buffer is handled
When an entity tries to get a buffer, if it cannot be allocted, for example
because the number of buffers which may be allocated per process is limited,
this entity is added in a list (called <buffer_wq>) and wait for an available
buffer.

Historically, the <buffer_wq> list was logically attached to streams because it
were the only entities likely to be added in it. Now, applets can also be
waiting for a free buffer. And with filters, we could imagine to have more other
entities waiting for a buffer. So it make sense to have a generic list.

Anyway, with the current design there is a bug. When an applet failed to get a
buffer, it will wait. But we add the stream attached to the applet in
<buffer_wq>, instead of the applet itself. So when a buffer is available, we
wake up the stream and not the waiting applet. So, it is possible to have
waiting applets and never awakened.

So, now, <buffer_wq> is independant from streams. And we really add the waiting
entity in <buffer_wq>. To be generic, the entity is responsible to define the
callback used to awaken it.

In addition, applets will still request an input buffer when they become
active. But they will not be sleeped anymore if no buffer are available. So this
is the responsibility to the applet I/O handler to check if this buffer is
allocated or not. This way, an applet can decide if this buffer is required or
not and can do additional processing if not.

[wt: backport to 1.7 and 1.6]
2016-12-12 19:11:04 +01:00
Christopher Faulet
34c5cc98da MINOR: task: Rename run_queue and run_queue_cur counters
<run_queue> is used to track the number of task in the run queue and
<run_queue_cur> is a copy used for the reporting purpose. These counters has
been renamed, respectively, <tasks_run_queue> and <tasks_run_queue_cur>. So the
naming is consistent between tasks and applets.

[wt: needed for next fixes, backport to 1.7 and 1.6]
2016-12-12 19:10:54 +01:00
Willy Tarreau
8e0f17543e BUG/MINOR: stats: fix be/sessions/max output in html stats
"Tadas / XtGem" reported that the max value was wrong and would report
the current value instead. This needs to be backported to 1.7.
2016-12-12 15:07:29 +01:00
Willy Tarreau
ae9bea0591 CLEANUP: counters: move from 3 types to 2 types
We used to have 3 types of counters with a huge overlap :
  - listener counters : stats collected for each bind line
  - proxy counters : union of the frontend and backend counters
  - server counters : stats collected per server

It happens that quite a good part was common between listeners and
proxies due to the frontend counters being updated at the two locations,
and that similarly the server and proxy counters were overlapping and
being updated together.

This patch cleans this up to propose only two types of counters :
  - fe_counters: used by frontends and listeners, related to
    incoming connections activity
  - be_counters: used by backends and servers, related to outgoing
    connections activity

This allowed to remove some non-sensical counters from both parts. For
frontends, the following entries were removed :

  cum_lbconn, last_sess, nbpend_max, failed_conns, failed_resp,
  retries, redispatches, q_time, c_time, d_time, t_time

For backends, this ones was removed : intercepted_req.

While doing this it was discovered that we used to incorrectly report
intercepted_req for backends in the HTML stats, which was always zero
since it's never updated.

Also it revealed a few inconsistencies (which were not fixed as they
are harmless). For example, backends count connections (cum_conn)
instead of sessions while servers count sessions and not connections.

Over the long term, some extra cleanups may be performed by having
some counters update functions touching both the server and backend
at the same time, as well as both the frontend and listener, to
ensure that all sides have all their stats properly filled. The stats
dump will also be able to factor the dump functions by counter types.
2016-11-25 15:03:12 +01:00
Willy Tarreau
a1b1ed53e7 MINOR: cli: make "show stat" support a proxy name
Till now it was needed to know the proxy's ID while we do have the
ability to look up a proxy by its name now.
2016-11-25 08:55:25 +01:00
Willy Tarreau
30e5e18bbb CLEANUP: cli: remove assignments to st0 and st2 in keyword parsers
Now it's not needed anymore to set STAT_ST_INIT nor CLI_ST_CALLBACK
in the parsers, remove it in the various places.
2016-11-24 16:59:28 +01:00
Willy Tarreau
89d467c105 REORG: cli: move "clear counters" to stats.c
This command is only used to clear stats. It now relies on cli_has_level()
to validate the permissions.
2016-11-24 16:59:28 +01:00
Willy Tarreau
0baac8cf1f REORG: cli: move "show info" to stats.c
Move the "show info" command to stats.c using the CLI keyword API
to register it on the CLI. The stats_dump_info_to_buffer() function
is now static again. Note, we don't need proto_ssl anymore in cli.c.
2016-11-24 16:59:27 +01:00
Willy Tarreau
2b812e29f6 REORG: cli: move "show stat" to stats.c
Move the "show stat" command to stats.c using the CLI keyword API
to register it on the CLI. The stats_dump_stat_to_buffer() function
is now static again.
2016-11-24 16:59:27 +01:00
William Lallemand
9ed6203aef REORG: cli: split dumpstats.h in stats.h and cli.h
proto/dumpstats.h has been split in 4 files:

  * proto/cli.h  contains protypes for the CLI
  * proto/stats.h contains prototypes for the stats
  * types/cli.h contains definition for the CLI
  * types/stats.h contains definition for the stats
2016-11-24 16:59:27 +01:00
William Lallemand
74c24fb071 REORG: cli: split dumpstats.c in src/cli.c and src/stats.c
dumpstats.c was containing either the stats code and the CLI code.
The cli code has been moved to cli.c and the stats code to stats.c
2016-11-24 16:59:27 +01:00