|
| 1 | +.. _metrics-section: |
| 2 | + |
| 3 | +================== |
| 4 | +Metrics and alerts |
| 5 | +================== |
| 6 | + |
| 7 | +The monitoring stack is automatically installed on the leader node. |
| 8 | + |
| 9 | +All nodes run `Node exporter <https://prometheus.io/docs/guides/node-exporter/>`_, which provides the node metrics endpoint. |
| 10 | + |
| 11 | +The leader node will run: |
| 12 | + |
| 13 | +- `Prometheus <https://prometheus.io/>`_ scrapes the node_exporter metrics endpoints of all nodes and stores the collected metrics on a local disk |
| 14 | +- `Alertmanager <https://prometheus.io/docs/alerting/latest/alertmanager/>`_ sends alerts to the configured receivers |
| 15 | +- `Grafana <https://grafana.com/>`_ visualizes the collected metrics; it is disabled by default |
| 16 | + |
| 17 | +The monitoring stack does not require any configuration: it automatically reconfigures itself when |
| 18 | +nodes are added to or removed from the cluster. |
| 19 | +When a node is promoted to leader, the monitoring stack is automatically installed on the new leader node |
| 20 | +and removed from the old one. |
| 21 | + |
| 22 | +.. note:: Metrics and alerts are not preserved when the leader node is switched. |
| 23 | + |
| 24 | +Alerts |
| 25 | +====== |
| 26 | + |
| 27 | +Prometheus will automatically send alerts to the Alertmanager when a rule is triggered. |
| 28 | +Current rules will send alerts for: |
| 29 | + |
| 30 | +- No SWAP is configured |
| 31 | +- SWAP space is nearly full |
| 32 | +- One or more backups have failed |
| 33 | +- Disk partitions are nearly full |
| 34 | + |
| 35 | +If the machine has a valid subscription, the alerts will be forwarded to a Nethesis portal, such as `my.nethesis.it <https://my.nethesis.it>`_ |
| 36 | +or `my.nethserver.com <https://my.nethserver.com>`_. |
| 37 | + |
| 38 | +If the machine does not have a valid subscription, the alerts will be visible only in the Grafana dashboard. |
| 39 | +You can still configure the alerts to be sent to specific email addresses. See the :ref:`mail-notifications` section. |
| 40 | + |
| 41 | +Enable Grafana |
| 42 | +============== |
| 43 | + |
| 44 | +Grafana is an open-source platform for monitoring and observability. It allows you to query, visualize, alert on, |
| 45 | +and understand your metrics no matter where they are stored. |
| 46 | +Grafana provides you with tools to turn your time-series into insightful graphs and visualizations. |
| 47 | + |
| 48 | +By default, Grafana is not enabled. |
| 49 | + |
| 50 | +You can enable it by exposing it on a path of your choice. |
| 51 | +To enable Grafana access, run the following command on the leader node: :: |
| 52 | + |
| 53 | + api-cli run module/metrics1/configure-module --data '{"prometheus_path": "", "grafana_path": "grafana"}' |
| 54 | + |
| 55 | +Grafana will then be accessible at the following URL: ``https://<leader-node>/grafana``. |
| 56 | +If you switched the leader, note that you may have to replace ``metrics1`` with the actual name of the metrics module instance. |
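Because the instance name may differ after a leader switch, it can help to build the command from variables and review it before running it. The following is a minimal sketch, not part of the product: ``build_payload`` and ``MODULE_ID`` are hypothetical names introduced here, and the command is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the product): build the JSON
# payload passed to configure-module.
#   $1: Grafana path (for example "grafana")
#   $2: Prometheus path; leave empty to keep Prometheus unexposed
build_payload() {
    printf '{"prometheus_path": "%s", "grafana_path": "%s"}' "${2:-}" "$1"
}

# Assumption: metrics1 is the instance name; replace it if it differs
# on your cluster (for example after a leader switch).
MODULE_ID="${MODULE_ID:-metrics1}"
PAYLOAD="$(build_payload grafana)"

# Print the command instead of running it, so it can be reviewed first.
echo "api-cli run module/${MODULE_ID}/configure-module --data '${PAYLOAD}'"
```

With the default values, the printed command matches the one shown above. Setting ``MODULE_ID`` in the environment changes only the module instance part of the command.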
| 57 | + |
| 58 | +Default Grafana credentials are: |
| 59 | + |
| 60 | +- username: ``admin`` |
| 61 | +- password: ``admin`` |
| 62 | + |
| 63 | +During the first login, you will be asked to change the password. |
| 64 | + |
| 65 | +Grafana will automatically display: |
| 66 | + |
| 67 | +- a dashboard with the metrics of all nodes, such as CPU load, memory usage, and disk space |
| 68 | +- a dashboard for fired alerts |
| 69 | + |
| 70 | +.. warning:: |
| 71 | + If the leader node is switched, Grafana will be accessible on the new leader node but its configuration will be lost: |
| 72 | + you will need to set the admin password again and reapply any dashboard customizations. |
| 73 | + |
| 74 | +Access Prometheus web interface |
| 75 | +=============================== |
| 76 | + |
| 77 | +By default, the Prometheus web interface is not exposed to the public network. |
| 78 | + |
| 79 | +If you need to troubleshoot the Prometheus configuration, you can expose the Prometheus web interface on a path of your choice. |
| 80 | + |
| 81 | +To enable access to the Prometheus web interface, run the following command on the leader node: :: |
| 82 | + |
| 83 | + api-cli run module/metrics1/configure-module --data '{"prometheus_path": "prometheus", "grafana_path": "grafana"}' |
| 84 | + |
| 85 | +Prometheus will then be accessible at the following URL: ``https://<leader-node>/prometheus``. |
| 86 | + |
| 87 | +.. note:: The Prometheus web interface will be accessible from any IP address without authentication. Use with caution. |
| 88 | + |
| 89 | +.. _mail-notifications: |
| 90 | + |
| 91 | +Mail notifications |
| 92 | +================== |
| 93 | + |
| 94 | +Mail notifications can be sent to users when an alert is fired or resolved. |
| 95 | +The cluster needs an SMTP server to send the notifications. First, make sure the :ref:`email-notifications` feature is enabled. |
| 96 | +If mail notifications are not enabled, the alerts will be visible only in the Grafana dashboard and not sent to any email address. |
| 97 | + |
| 98 | +Then, configure the mail notifications by running the following command on the leader node: :: |
| 99 | + |
| 100 | + api-cli run module/metrics1/configure-module --data '{"prometheus_path": "", "grafana_path": "grafana", "mail_to": ["[email protected]"], "mail_from": "[email protected]"}' |
| 101 | + |
| 102 | +The ``mail_to`` parameter is a list of email addresses that will receive the alerts. |
| 103 | +The ``mail_from`` parameter is the email address that will be used as the sender. |
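Since ``mail_to`` is a list, several recipients can be given in one call. The sketch below shows one way to assemble such a payload in a shell script. It is an illustration under assumptions: ``json_array`` is a hypothetical helper, the ``example.com`` addresses are placeholders, and the helper does no JSON escaping, so it assumes that addresses contain no quotes or backslashes. The command is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper: turn its arguments into a JSON array of strings.
# No escaping is done: addresses must not contain quotes or backslashes.
json_array() {
    out=""
    for addr in "$@"; do
        # Add a ", " separator before every element except the first.
        out="${out:+$out, }\"$addr\""
    done
    printf '[%s]' "$out"
}

# Placeholder values: replace them with real addresses.
MAIL_FROM="alerts@example.com"
MAIL_TO="$(json_array admin@example.com oncall@example.com)"

PAYLOAD="$(printf '{"prometheus_path": "", "grafana_path": "grafana", "mail_to": %s, "mail_from": "%s"}' "$MAIL_TO" "$MAIL_FROM")"

# Print the command for review; metrics1 may need to be replaced with
# the actual module instance name.
echo "api-cli run module/metrics1/configure-module --data '${PAYLOAD}'"
```

Printing the command first makes it easy to check the quoting of the JSON document before it is applied on the leader node.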