Prometheus

From Bondix Wiki
Latest revision as of 15:38, 11 March 2025

Enabling Prometheus Support

Prometheus support must be explicitly enabled by modifying the configuration file located at /etc/bondixserver.json or /etc/saneserver.json. You can either:

  • Serve Prometheus metrics over the default HTTPS port, or
  • Configure a dedicated listener for Prometheus metrics.

Once enabled, Prometheus metrics will be accessible at:

https://<server-ip>/metrics

Serving Prometheus Over the Default HTTPS Port

To enable Prometheus on the default HTTPS port, update your bondixserver.json configuration file.

Modify the Configuration

Locate the following line:

{"target": "server", "action": "add-https", "host": "0.0.0.0", "port": "443", "allowMonitor": true},

Add the following properties:

Property            Value       Description
allowPrometheus     true        Enables Prometheus metric reporting.
prometheusUser      <username>  (Optional) If set, metrics require authentication.
prometheusPassword  <password>  (Optional) Password for authentication. Both username and password must be provided, otherwise metrics remain unprotected.

Example Configuration

Your modified configuration should look something like this:

{"target": "server", "action": "add-https", "host": "0.0.0.0", "port": "443", "allowMonitor": true, "allowPrometheus": true, "prometheusUser": "prometheus", "prometheusPassword": "a-good-password"}
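The rules from the property table can be checked before restarting the server. The following is a minimal sketch, not part of Bondix: the function name check_prometheus_listener is hypothetical, and it only inspects a single listener entry that has already been parsed as JSON. It assumes nothing about the overall layout of bondixserver.json.

```python
import json


def check_prometheus_listener(entry):
    """Return a list of problems found in one listener entry (hypothetical helper)."""
    problems = []
    if entry.get("allowPrometheus") is not True:
        problems.append("allowPrometheus is not true, so no metrics are served")
    # Both credentials must be present; a single one leaves metrics unprotected.
    has_user = bool(entry.get("prometheusUser"))
    has_password = bool(entry.get("prometheusPassword"))
    if has_user != has_password:
        problems.append(
            "only one of prometheusUser/prometheusPassword is set, "
            "so metrics remain unprotected"
        )
    # The examples store the port as a string; TCP ports range from 1 to 65535.
    try:
        port = int(entry.get("port", ""))
    except (TypeError, ValueError):
        port = 0
    if not 1 <= port <= 65535:
        problems.append("port must be between 1 and 65535")
    return problems


example = json.loads(
    '{"target": "server", "action": "add-https", "host": "0.0.0.0", '
    '"port": "443", "allowMonitor": true, "allowPrometheus": true, '
    '"prometheusUser": "prometheus", "prometheusPassword": "a-good-password"}'
)
print(check_prometheus_listener(example))
```

An empty list means the entry passed all three checks; each string in a non-empty list names one issue to fix.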

Serving Prometheus Over a Dedicated Port

To serve Prometheus metrics on a dedicated port, add a new listener entry to bondixserver.json. With allowMonitor and allowTunnel set to false, this listener serves only Prometheus metrics. You can choose between HTTP and HTTPS.

Example Configuration

The following configuration allows Prometheus metrics on a dedicated port while restricting other services:

{"target": "server", "action": "add-http", "host": "127.0.0.1", "port": "18181", "allowMonitor": false, "allowTunnel": false, "allowPrometheus": true, "prometheusUser": "prometheus", "prometheusPassword": "a-good-password"}

Configuration Notes:

  • Use "action": "add-http" for HTTP or "action": "add-https" for HTTPS.
  • Adjust the host and port as per your requirements.
  • If other security measures already restrict access to the port, authentication (prometheusUser and prometheusPassword) can be omitted.
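To verify a listener by hand before pointing Prometheus at it, a request with the same credentials can be built using only the Python standard library. This is a rough sketch: build_metrics_request is a hypothetical helper, and it assumes the server expects HTTP Basic authentication, which is what the Prometheus basic_auth setting sends.

```python
import base64
import urllib.request


def build_metrics_request(base_url, user=None, password=None):
    """Build a GET request for <base_url>/metrics (hypothetical helper).

    The Authorization header is added only when both credentials are given.
    """
    request = urllib.request.Request(base_url.rstrip("/") + "/metrics")
    if user and password:
        raw = f"{user}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


# Usage against a reachable server (not executed here):
#   request = build_metrics_request("https://your_endpoint_server.com",
#                                   "prometheus", "a-good-password")
#   with urllib.request.urlopen(request, timeout=5) as response:
#       print(response.read().decode("utf-8")[:500])
```

If the server uses a self-signed certificate, urlopen will reject it unless a suitable SSL context is passed; that part is left out of the sketch.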

Adding Bondix Server to Prometheus

To configure Prometheus to scrape metrics from the Bondix server, add the following job to the scrape_configs section of your Prometheus configuration file:

  - job_name: 'endpoints'
    static_configs:
      - targets: ['your_endpoint_server.com']
    scheme: https
    basic_auth:
      username: 'prometheus'
      password: 'a-good-password'
    tls_config:
      insecure_skip_verify: true
    enable_http2: false

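This configuration lets Prometheus scrape metrics over HTTPS with basic authentication. Note that insecure_skip_verify: true disables certificate verification; remove it if the server presents a certificate that Prometheus can validate.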


Available Metrics

Here is an overview of the available Prometheus metrics, grouped by category.


1. Tunnel Metrics

These metrics provide information about tunnel status, traffic, and processing.

  • bondix_server_connected_tunnels (gauge) – Total number of connected tunnels per environment.

Labels: environment_interface, environment_name

  • bondix_server_tunnel_status (gauge) – Tunnel status. Possible values:
    • 0 = Disconnected
    • 1 = Connected
    • 2 = No channel connected, waiting for reconnect
  • bondix_server_tunnel_process_time (gauge) – Time taken to process a tunnel (nanoseconds).
  • bondix_server_tunnel_environment_index (gauge) – Environment index associated with a tunnel.
  • bondix_server_tunnel_proxy_connections (gauge) – Current number of proxy connections.
  • bondix_server_tunnel_qos_rule_count (gauge) – Number of active QoS (Quality of Service) rules.
  • Traffic Metrics:
    • bondix_server_tunnel_total_rx_bytes – Total bytes received.
    • bondix_server_tunnel_total_tx_bytes – Total bytes transmitted.
    • bondix_server_tunnel_uptime – Tunnel uptime (seconds).

Labels: tunnel
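The numeric tunnel status values above can be translated back into their names when reading the /metrics output directly. The sketch below is illustrative only: the parser is deliberately simplified (it does not handle label values containing a closing brace), and the tunnel names in the sample text are made up.

```python
import re

# Status values as documented for bondix_server_tunnel_status.
TUNNEL_STATUS = {
    0: "Disconnected",
    1: "Connected",
    2: "No channel connected, waiting for reconnect",
}

SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def tunnel_states(text):
    """Map each tunnel label to the name of its current status."""
    states = {}
    for name, labels, value in parse_samples(text):
        if name == "bondix_server_tunnel_status":
            code = int(value)
            states[labels.get("tunnel", "")] = TUNNEL_STATUS.get(code, f"unknown ({code})")
    return states


# Hypothetical scrape output; the tunnel names are invented for illustration.
SAMPLE_TEXT = """\
# TYPE bondix_server_tunnel_status gauge
bondix_server_tunnel_status{tunnel="office-a"} 1
bondix_server_tunnel_status{tunnel="office-b"} 2
bondix_server_tunnel_uptime{tunnel="office-a"} 3600
"""
print(tunnel_states(SAMPLE_TEXT))
```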


2. Environment Metrics

These metrics provide details about packet cache usage and processing time for environments.

  • bondix_server_environment_packet_cache_size (gauge) – Packet cache size.
  • bondix_server_environment_packet_cache_use (gauge) – Number of packet cache entries in use.
  • bondix_server_environment_process_time (gauge) – Time taken to process all tunnels in an environment (nanoseconds).

Labels: environment_name
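Two derived values are often easier to read than the raw gauges: the share of the packet cache in use and the processing time in milliseconds. The helpers below are a small sketch with hypothetical names. They assume that packet_cache_size and packet_cache_use are expressed in the same unit (cache entries), which the metric descriptions suggest but do not state outright.

```python
def cache_utilization(in_use, size):
    """Fraction of the packet cache in use; 0.0 when the cache size is zero."""
    return in_use / size if size else 0.0


def nanoseconds_to_milliseconds(nanoseconds):
    """Convert a process_time reading from nanoseconds to milliseconds."""
    return nanoseconds / 1_000_000


# Example readings (made-up values).
print(cache_utilization(256, 1024))            # 0.25
print(nanoseconds_to_milliseconds(2_500_000))  # 2.5
```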


3. Tunnel Channel Metrics

These metrics track channel performance, connectivity, and traffic.

Channel Connection and Status

  • bondix_server_tunnel_channel_connected_count (gauge) – Number of connected channels per tunnel.
  • bondix_server_tunnel_channel_count (gauge) – Total number of channels per tunnel.
  • bondix_server_tunnel_channel_status (gauge) – Status of a tunnel channel. Possible values:
    • 0 = Disconnected
    • 1 = Control Connect (TCP Handshake)
    • 2 = Data/UDP Handshake I
    • 3 = Data/UDP Handshake II
    • 4 = Connected
    • 5 = Connected, Stalled
    • 6 = Connected, Standby

Labels: channel, tunnel
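When processing channel status values in scripts, an enumeration keeps the codes readable. The following sketch mirrors the list above; the member names and the is_connected grouping (values 4 to 6) are choices made for this example, derived from the status names rather than from any Bondix API.

```python
from enum import IntEnum


class ChannelStatus(IntEnum):
    """Values of bondix_server_tunnel_channel_status as listed above."""
    DISCONNECTED = 0
    CONTROL_CONNECT = 1      # TCP handshake
    DATA_HANDSHAKE_1 = 2     # Data/UDP handshake I
    DATA_HANDSHAKE_2 = 3     # Data/UDP handshake II
    CONNECTED = 4
    CONNECTED_STALLED = 5
    CONNECTED_STANDBY = 6

    @property
    def is_connected(self):
        # Assumption: every "Connected, ..." state counts as connected.
        return self >= ChannelStatus.CONNECTED


def count_connected(status_by_channel):
    """Count channels whose status value is one of the connected states.

    Raises ValueError if a value is not one of the documented codes.
    """
    return sum(
        1 for value in status_by_channel.values()
        if ChannelStatus(int(value)).is_connected
    )


# Channel names are invented for illustration.
print(count_connected({"wan1": 4, "lte1": 6, "lte2": 2, "sat": 0}))  # 2
```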

Latency and Performance

  • bondix_server_tunnel_channel_latency (gauge) – Current latency (milliseconds) of a tunnel channel.
  • bondix_server_tunnel_channel_idle_latency (gauge) – Idle latency of a tunnel channel.
  • bondix_server_tunnel_channel_latency_ms (histogram) – Distribution of channel latency (milliseconds).
    • _bucket: Observations within different latency ranges.
    • _sum: Total latency sum.
    • _count: Number of observations.

Labels: channel, tunnel

  • bondix_server_tunnel_channel_loss (gauge) – Packet loss percentage. Labels: tunnel
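The _sum, _count and _bucket series of the latency histogram can be combined into a mean and an approximate quantile. The sketch below uses linear interpolation inside the matching bucket, the same general approach as the PromQL histogram_quantile function. The function names are hypothetical, and the bucket boundaries in the example are invented; the actual boundaries exported by the server may differ.

```python
import math


def mean_latency(latency_sum, count):
    """Average latency in milliseconds: _sum divided by _count."""
    return latency_sum / count if count else math.nan


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) pairs.

    `buckets` must include the final (math.inf, total) bucket, as exposed
    by the _bucket series with le="+Inf".
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, cumulative in buckets:
        if cumulative >= rank:
            if math.isinf(bound):
                return prev_bound  # fall back to the highest finite bound
            if cumulative == prev_count:
                return bound
            # Interpolate linearly between the bucket's lower and upper bound.
            fraction = (rank - prev_count) / (cumulative - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, cumulative
    return prev_bound


# Hypothetical cumulative bucket counts (upper bound in ms, observations).
example_buckets = [(10, 50), (50, 90), (100, 100), (math.inf, 100)]
print(mean_latency(2500, 100))                   # 25.0
print(estimate_quantile(0.75, example_buckets))  # 35.0
```

The estimate is only as precise as the bucket layout: all observations inside a bucket are assumed to be spread evenly across it.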

Traffic Metrics

  • Received Data:
    • bondix_server_tunnel_channel_total_rx_bytes – Total received bytes.
    • bondix_server_tunnel_channel_total_rx_packets – Total received packets.
    • bondix_server_tunnel_channel_total_rx_lost_packets – Total lost packets on the receiving side.
  • Transmitted Data:
    • bondix_server_tunnel_channel_total_tx_bytes – Total transmitted bytes.
    • bondix_server_tunnel_channel_total_tx_packets – Total transmitted packets.
    • bondix_server_tunnel_channel_total_tx_lost_packets – Total lost packets on the transmitting side.

Labels: channel, tunnel
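The total_* series are most useful as rates. Assuming they behave like counters that only increase until the server or channel restarts (the metric type is not stated above), throughput can be derived from two successive readings. The helper names below are hypothetical.

```python
def rate_per_second(prev_value, prev_time, cur_value, cur_time):
    """Average increase per second between two counter readings.

    Times are in seconds. If the counter went backwards, it is assumed to
    have been reset to zero between the two readings.
    """
    elapsed = cur_time - prev_time
    if elapsed <= 0:
        raise ValueError("cur_time must be later than prev_time")
    delta = cur_value - prev_value
    if delta < 0:
        delta = cur_value  # counter reset: count only what accumulated since
    return delta / elapsed


def bytes_to_megabits(byte_rate):
    """Convert a rate in bytes per second to megabits per second."""
    return byte_rate * 8 / 1_000_000


# Two made-up readings of a total_rx_bytes series taken 15 seconds apart.
rx_rate = rate_per_second(1_000_000, 0, 4_750_000, 15)
print(rx_rate)                     # 250000.0 bytes per second
print(bytes_to_megabits(rx_rate))  # 2.0 Mbit/s
```

Inside Prometheus itself, the rate() function performs the same calculation, including reset handling, so this helper is mainly useful for ad hoc scripts.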

Other

  • bondix_server_tunnel_channel_uptime (gauge) – Tunnel channel uptime (seconds).

Labels: channel, tunnel