mirror of
				https://github.com/matrix-org/synapse.git
				synced 2025-10-27 06:11:36 +01:00 
			
		
		
		
	Converting some of the rst documentation to markdown. Attempted to preserve whitespace and line breaks to minimize cosmetic change.
		
			
				
	
	
		
			218 lines
		
	
	
		
			15 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
			
		
		
	
	
			218 lines
		
	
	
		
			15 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
| # How to monitor Synapse metrics using Prometheus
 | |
| 
 | |
| 1.  Install Prometheus:
 | |
| 
 | |
|     Follow instructions at
 | |
|     <http://prometheus.io/docs/introduction/install/>
 | |
| 
 | |
| 1.  Enable Synapse metrics:
 | |
| 
 | |
|     There are two methods of enabling metrics in Synapse.
 | |
| 
 | |
|     The first serves the metrics as a part of the usual web server and
 | |
|     can be enabled by adding the \"metrics\" resource to the existing
 | |
|     listener as such:
 | |
| 
 | |
|         resources:
 | |
|           - names:
 | |
|             - client
 | |
|             - metrics
 | |
| 
 | |
|     This provides a simple way of adding metrics to your Synapse
 | |
|     installation, and serves under `/_synapse/metrics`. If you do not
 | |
|     wish your metrics be publicly exposed, you will need to either
 | |
|     filter it out at your load balancer, or use the second method.
 | |
| 
 | |
|     The second method runs the metrics server on a different port, in a
 | |
|     different thread to Synapse. This can make it more resilient to
 | |
|     heavy load meaning metrics cannot be retrieved, and can be exposed
 | |
|     to just internal networks easier. The served metrics are available
 | |
|     over HTTP only, and will be available at `/`.
 | |
| 
 | |
|     Add a new listener to homeserver.yaml:
 | |
| 
 | |
|         listeners:
 | |
|           - type: metrics
 | |
|             port: 9000
 | |
|             bind_addresses:
 | |
|               - '0.0.0.0'
 | |
| 
 | |
|     For both options, you will need to ensure that `enable_metrics` is
 | |
|     set to `True`.
 | |
| 
 | |
| 1.  Restart Synapse.
 | |
| 
 | |
| 1.  Add a Prometheus target for Synapse.
 | |
| 
 | |
|     It needs to set the `metrics_path` to a non-default value (under
 | |
|     `scrape_configs`):
 | |
| 
 | |
|         - job_name: "synapse"
 | |
|           metrics_path: "/_synapse/metrics"
 | |
|           static_configs:
 | |
|             - targets: ["my.server.here:port"]
 | |
| 
 | |
|     where `my.server.here` is the IP address of Synapse, and `port` is
 | |
|     the listener port configured with the `metrics` resource.
 | |
| 
 | |
|     If your prometheus is older than 1.5.2, you will need to replace
 | |
|     `static_configs` in the above with `target_groups`.
 | |
| 
 | |
| 1.  Restart Prometheus.
 | |
| 
 | |
| ## Renaming of metrics & deprecation of old names in 1.2
 | |
| 
 | |
| Synapse 1.2 updates the Prometheus metrics to match the naming
 | |
| convention of the upstream `prometheus_client`. The old names are
 | |
| considered deprecated and will be removed in a future version of
 | |
| Synapse.
 | |
| 
 | |
| | New Name                                                                     | Old Name                                                               |
 | |
| | ---------------------------------------------------------------------------- | ---------------------------------------------------------------------- |
 | |
| | python_gc_objects_collected_total                                            | python_gc_objects_collected                                            |
 | |
| | python_gc_objects_uncollectable_total                                        | python_gc_objects_uncollectable                                        |
 | |
| | python_gc_collections_total                                                  | python_gc_collections                                                  |
 | |
| | process_cpu_seconds_total                                                    | process_cpu_seconds                                                    |
 | |
| | synapse_federation_client_sent_transactions_total                            | synapse_federation_client_sent_transactions                            |
 | |
| | synapse_federation_client_events_processed_total                             | synapse_federation_client_events_processed                             |
 | |
| | synapse_event_processing_loop_count_total                                    | synapse_event_processing_loop_count                                    |
 | |
| | synapse_event_processing_loop_room_count_total                               | synapse_event_processing_loop_room_count                               |
 | |
| | synapse_util_metrics_block_count_total                                       | synapse_util_metrics_block_count                                       |
 | |
| | synapse_util_metrics_block_time_seconds_total                                | synapse_util_metrics_block_time_seconds                                |
 | |
| | synapse_util_metrics_block_ru_utime_seconds_total                            | synapse_util_metrics_block_ru_utime_seconds                            |
 | |
| | synapse_util_metrics_block_ru_stime_seconds_total                            | synapse_util_metrics_block_ru_stime_seconds                            |
 | |
| | synapse_util_metrics_block_db_txn_count_total                                | synapse_util_metrics_block_db_txn_count                                |
 | |
| | synapse_util_metrics_block_db_txn_duration_seconds_total                     | synapse_util_metrics_block_db_txn_duration_seconds                     |
 | |
| | synapse_util_metrics_block_db_sched_duration_seconds_total                   | synapse_util_metrics_block_db_sched_duration_seconds                   |
 | |
| | synapse_background_process_start_count_total                                 | synapse_background_process_start_count                                 |
 | |
| | synapse_background_process_ru_utime_seconds_total                            | synapse_background_process_ru_utime_seconds                            |
 | |
| | synapse_background_process_ru_stime_seconds_total                            | synapse_background_process_ru_stime_seconds                            |
 | |
| | synapse_background_process_db_txn_count_total                                | synapse_background_process_db_txn_count                                |
 | |
| | synapse_background_process_db_txn_duration_seconds_total                     | synapse_background_process_db_txn_duration_seconds                     |
 | |
| | synapse_background_process_db_sched_duration_seconds_total                   | synapse_background_process_db_sched_duration_seconds                   |
 | |
| | synapse_storage_events_persisted_events_total                                | synapse_storage_events_persisted_events                                |
 | |
| | synapse_storage_events_persisted_events_sep_total                            | synapse_storage_events_persisted_events_sep                            |
 | |
| | synapse_storage_events_state_delta_total                                     | synapse_storage_events_state_delta                                     |
 | |
| | synapse_storage_events_state_delta_single_event_total                        | synapse_storage_events_state_delta_single_event                        |
 | |
| | synapse_storage_events_state_delta_reuse_delta_total                         | synapse_storage_events_state_delta_reuse_delta                         |
 | |
| | synapse_federation_server_received_pdus_total                                | synapse_federation_server_received_pdus                                |
 | |
| | synapse_federation_server_received_edus_total                                | synapse_federation_server_received_edus                                |
 | |
| | synapse_handler_presence_notified_presence_total                             | synapse_handler_presence_notified_presence                             |
 | |
| | synapse_handler_presence_federation_presence_out_total                       | synapse_handler_presence_federation_presence_out                       |
 | |
| | synapse_handler_presence_presence_updates_total                              | synapse_handler_presence_presence_updates                              |
 | |
| | synapse_handler_presence_timers_fired_total                                  | synapse_handler_presence_timers_fired                                  |
 | |
| | synapse_handler_presence_federation_presence_total                           | synapse_handler_presence_federation_presence                           |
 | |
| | synapse_handler_presence_bump_active_time_total                              | synapse_handler_presence_bump_active_time                              |
 | |
| | synapse_federation_client_sent_edus_total                                    | synapse_federation_client_sent_edus                                    |
 | |
| | synapse_federation_client_sent_pdu_destinations_count_total                  | synapse_federation_client_sent_pdu_destinations:count                  |
 | |
| | synapse_federation_client_sent_pdu_destinations_total                        | synapse_federation_client_sent_pdu_destinations:total                  |
 | |
| | synapse_handlers_appservice_events_processed_total                           | synapse_handlers_appservice_events_processed                           |
 | |
| | synapse_notifier_notified_events_total                                       | synapse_notifier_notified_events                                       |
 | |
| | synapse_push_bulk_push_rule_evaluator_push_rules_invalidation_counter_total  | synapse_push_bulk_push_rule_evaluator_push_rules_invalidation_counter  |
 | |
| | synapse_push_bulk_push_rule_evaluator_push_rules_state_size_counter_total    | synapse_push_bulk_push_rule_evaluator_push_rules_state_size_counter    |
 | |
| | synapse_http_httppusher_http_pushes_processed_total                          | synapse_http_httppusher_http_pushes_processed                          |
 | |
| | synapse_http_httppusher_http_pushes_failed_total                             | synapse_http_httppusher_http_pushes_failed                             |
 | |
| | synapse_http_httppusher_badge_updates_processed_total                        | synapse_http_httppusher_badge_updates_processed                        |
 | |
| | synapse_http_httppusher_badge_updates_failed_total                           | synapse_http_httppusher_badge_updates_failed                           |
 | |
| 
 | |
| Removal of deprecated metrics & time based counters becoming histograms in 0.31.0
 | |
| ---------------------------------------------------------------------------------
 | |
| 
 | |
| The duplicated metrics deprecated in Synapse 0.27.0 have been removed.
 | |
| 
 | |
| All time duration-based metrics have been changed to be seconds. This
 | |
| affects:
 | |
| 
 | |
| | msec -> sec metrics                    |
 | |
| | -------------------------------------- |
 | |
| | python_gc_time                         |
 | |
| | python_twisted_reactor_tick_time       |
 | |
| | synapse_storage_query_time             |
 | |
| | synapse_storage_schedule_time          |
 | |
| | synapse_storage_transaction_time       |
 | |
| 
 | |
| Several metrics have been changed to be histograms, which sort entries
 | |
| into buckets and allow better analysis. The following metrics are now
 | |
| histograms:
 | |
| 
 | |
| | Altered metrics                                  |
 | |
| | ------------------------------------------------ |
 | |
| | python_gc_time                                   |
 | |
| | python_twisted_reactor_pending_calls             |
 | |
| | python_twisted_reactor_tick_time                 |
 | |
| | synapse_http_server_response_time_seconds        |
 | |
| | synapse_storage_query_time                       |
 | |
| | synapse_storage_schedule_time                    |
 | |
| | synapse_storage_transaction_time                 |
 | |
| 
 | |
| Block and response metrics renamed for 0.27.0
 | |
| ---------------------------------------------
 | |
| 
 | |
| Synapse 0.27.0 begins the process of rationalising the duplicate
 | |
| `*:count` metrics reported for the resource tracking for code blocks and
 | |
| HTTP requests.
 | |
| 
 | |
| At the same time, the corresponding `*:total` metrics are being renamed,
 | |
| as the `:total` suffix no longer makes sense in the absence of a
 | |
| corresponding `:count` metric.
 | |
| 
 | |
| To enable a graceful migration path, this release just adds new names
 | |
| for the metrics being renamed. A future release will remove the old
 | |
| ones.
 | |
| 
 | |
| The following table shows the new metrics, and the old metrics which
 | |
| they are replacing.
 | |
| 
 | |
| | New name                                                      | Old name                                                   |
 | |
| | ------------------------------------------------------------- | ---------------------------------------------------------- |
 | |
| | synapse_util_metrics_block_count                              | synapse_util_metrics_block_timer:count                     |
 | |
| | synapse_util_metrics_block_count                              | synapse_util_metrics_block_ru_utime:count                  |
 | |
| | synapse_util_metrics_block_count                              | synapse_util_metrics_block_ru_stime:count                  |
 | |
| | synapse_util_metrics_block_count                              | synapse_util_metrics_block_db_txn_count:count              |
 | |
| | synapse_util_metrics_block_count                              | synapse_util_metrics_block_db_txn_duration:count           |
 | |
| | synapse_util_metrics_block_time_seconds                       | synapse_util_metrics_block_timer:total                     |
 | |
| | synapse_util_metrics_block_ru_utime_seconds                   | synapse_util_metrics_block_ru_utime:total                  |
 | |
| | synapse_util_metrics_block_ru_stime_seconds                   | synapse_util_metrics_block_ru_stime:total                  |
 | |
| | synapse_util_metrics_block_db_txn_count                       | synapse_util_metrics_block_db_txn_count:total              |
 | |
| | synapse_util_metrics_block_db_txn_duration_seconds            | synapse_util_metrics_block_db_txn_duration:total           |
 | |
| | synapse_http_server_response_count                            | synapse_http_server_requests                               |
 | |
| | synapse_http_server_response_count                            | synapse_http_server_response_time:count                    |
 | |
| | synapse_http_server_response_count                            | synapse_http_server_response_ru_utime:count                |
 | |
| | synapse_http_server_response_count                            | synapse_http_server_response_ru_stime:count                |
 | |
| | synapse_http_server_response_count                            | synapse_http_server_response_db_txn_count:count            |
 | |
| | synapse_http_server_response_count                            | synapse_http_server_response_db_txn_duration:count         |
 | |
| | synapse_http_server_response_time_seconds                     | synapse_http_server_response_time:total                    |
 | |
| | synapse_http_server_response_ru_utime_seconds                 | synapse_http_server_response_ru_utime:total                |
 | |
| | synapse_http_server_response_ru_stime_seconds                 | synapse_http_server_response_ru_stime:total                |
 | |
| | synapse_http_server_response_db_txn_count                     | synapse_http_server_response_db_txn_count:total            |
 | |
| | synapse_http_server_response_db_txn_duration_seconds          | synapse_http_server_response_db_txn_duration:total         |
 | |
| 
 | |
| Standard Metric Names
 | |
| ---------------------
 | |
| 
 | |
| As of synapse version 0.18.2, the format of the process-wide metrics has
 | |
| been changed to fit prometheus standard naming conventions. Additionally
 | |
| the units have been changed to seconds, from miliseconds.
 | |
| 
 | |
| | New name                                 | Old name                          |
 | |
| | ---------------------------------------- | --------------------------------- |
 | |
| | process_cpu_user_seconds_total           | process_resource_utime / 1000     |
 | |
| | process_cpu_system_seconds_total         | process_resource_stime / 1000     |
 | |
| | process_open_fds (no \'type\' label)     | process_fds                       |
 | |
| 
 | |
| The python-specific counts of garbage collector performance have been
 | |
| renamed.
 | |
| 
 | |
| | New name                         | Old name                   |
 | |
| | -------------------------------- | -------------------------- |
 | |
| | python_gc_time                   | reactor_gc_time            |
 | |
| | python_gc_unreachable_total      | reactor_gc_unreachable     |
 | |
| | python_gc_counts                 | reactor_gc_counts          |
 | |
| 
 | |
| The twisted-specific reactor metrics have been renamed.
 | |
| 
 | |
| | New name                               | Old name                |
 | |
| | -------------------------------------- | ----------------------- |
 | |
| | python_twisted_reactor_pending_calls   | reactor_pending_calls   |
 | |
| | python_twisted_reactor_tick_time       | reactor_tick_time       |
 |