Monitoring
Installs the following monitoring tools on a single server:
- Prometheus
- Prometheus Node Exporter
- Grafana
- Prometheus Alert Manager
- BlackBox Exporter
- PostgreSQL Exporter
OS:
- Ubuntu/Debian based
Applications:
- ansible [core 2.12.x]
- python version = 3.8.x
- docker version 20.10.x
- docker-compose version 1.29.x
Deb packages:
- software-properties-common
- python3-pip
- virtualenv
- python3-setuptools
Pip packages:
- docker
- docker-compose
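
If the deb and pip prerequisites listed above are not already present on the monitoring host, they could be installed with a short pre-task play. This is only a sketch: the `monitoring_manager` host group is taken from the role's default `monitoring_server_inventory_group`, and the role itself may already handle some of these steps.

```yaml
# Hypothetical prerequisite play; adjust the host group to your inventory.
- hosts: monitoring_manager
  become: true
  tasks:
    - name: Install required deb packages
      ansible.builtin.apt:
        name:
          - software-properties-common
          - python3-pip
          - virtualenv
          - python3-setuptools
        state: present
        update_cache: true

    - name: Install required pip packages
      ansible.builtin.pip:
        name:
          - docker
          - docker-compose
```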
Variable | Default value | Description |
---|---|---|
monitoring_server_inventory_group | monitoring_manager | Group of hosts in your inventory where monitoring will be hosted |
monitoring_targets_inventory_group | monitoring_node | Group of hosts in your inventory that will be monitored |
monitoring_docker_network_name | monitoring | Name for monitoring docker network |
monitoring_docker_network_subnet | 172.20.0.0/16 | Subnet address for monitoring docker network |
monitoring_config_dir | /etc/monitoring | Directory for storing configuration files and running docker-compose.yml |
monitoring_keep_prometheus_logs | 200h | Retention period for Prometheus logs |
monitoring_domain | 127.0.0.1 | Domain for Prometheus service hosting |
monitoring_alertmanager_domain | http://{{ monitoring_domain }}:{{ monitoring_alertmanager_port }} | Alertmanager web path |
monitoring_cvm_endpoint | http://127.0.0.1:8090 | CVM server DNS name or IP address, for example https://somename.com or http://1.2.3.4 |
monitoring_datagrok_endpoint | http://127.0.0.1:8080 | Datagrok server DNS name or IP address, for example https://somename.com or http://1.2.3.4 |
monitoring_blackbox_endpoints | - "{{ monitoring_cvm_endpoint }}/grok_compute/info" - "{{ monitoring_cvm_endpoint }}/jupyter/helper/info" - "{{ monitoring_cvm_endpoint }}/notebook/helper/info" - "{{ monitoring_cvm_endpoint }}/notebook/api" - "{{ monitoring_cvm_endpoint }}/jupyter/api/swagger.yaml" - "{{ monitoring_cvm_endpoint }}:5005/helper/info" - "{{ monitoring_cvm_endpoint }}:54321/3/About" - "{{ monitoring_datagrok_endpoint }}/api/admin/health" | Endpoints for health-check monitoring of the CVM and Datagrok environment. Use the defaults or set your own values |
monitoring_grafana_enabled | true | Install Grafana, which displays monitoring statistics as graphs. Accepts true or false |
monitoring_alertmanager_enabled | true | Install Alert Manager, which sends monitoring alerts via e-mail and Slack. Accepts true or false |
monitoring_blackbox_enabled | true | Install BlackBox Exporter, which monitors platform health checks |
monitoring_cadvisor_enabled | true | Install cAdvisor exporter, which monitors Docker containers |
monitoring_pgsql_exporter_enabled | true | Install PostgreSQL Exporter, which monitors PostgreSQL databases |
prometheus_image | prom/prometheus:v2.40.4 | Docker image used for Prometheus |
grafana_image | grafana/grafana-oss:9.2.6 | Docker image used for Grafana |
alertmanager_image | quay.io/prometheus/alertmanager:v0.24.0 | Docker image used for Alert Manager |
blackbox_image | prom/blackbox-exporter:master | Docker image used for BlackBox Exporter |
cadvisor_image | gcr.io/cadvisor/cadvisor:v0.38.6 | Docker image used for cAdvisor exporter |
monitoring_node_exporter_port | 9100 | External port for Prometheus Node exporter |
monitoring_cadvisor_port | 8181 | External port for Cadvisor exporter |
monitoring_prometheus_port | 9090 | External port for Prometheus |
monitoring_grafana_port | 3000 | External port for Grafana. The initial login/password is admin/admin |
monitoring_alertmanager_port | 9093 | External port for Alert Manager |
monitoring_blackbox_port | 9115 | External port for BlackBox Exporter |
monitoring_pgsql_exporter_port | 9187 | External port for PostgreSql Exporter |
monitoring_alerts_repeat_message_interval | 30m | How often to repeat an alert message while the alert is still active |
monitoring_alerts_group_message_interval | 5m | How long to wait before sending a message about new alerts added to a group for which an initial notification has already been sent |
monitoring_alerts_send_resolved | true | Notify about resolved alerts |
monitoring_alert_ssl_expire_enabled | true | Enable alerting on SSL certificate expiry |
monitoring_alert_ssl_expire_time | 120 | How many hours before the certificate expires the alert is raised |
monitoring_swarm_services_enabled | true | Enable alerting on Docker Swarm service failures |
monitoring_slack_alerts | false | Enable sending monitoring alerts via Slack |
monitoring_slack_channel | monitoring | Name of the Slack channel that receives alerts |
monitoring_slack_webhook_url | " " | Webhook URL for your Slack |
monitoring_email_alerts | true | Enable sending monitoring alerts via e-mail |
monitoring_smtp_host | 127.0.0.1 | The SMTP host through which emails are sent |
monitoring_smtp_port | 25 | Mail sending port |
monitoring_smtp_from | alertmanager@{{ inventory_hostname }} | The default SMTP From header field |
monitoring_smtp_login | "" | Login for the SMTP mail server. Example: [email protected] |
monitoring_smtp_pwd | "" | Password for the SMTP mail server |
monitoring_note_receiver | monitoring_note_receiver: - '[email protected]' | E-mail address that receives alerts. Can have several values |
monitoring_smtp_require_tls | true | Require TLS when sending alert messages |
monitoring_pgsql_dbs | - login: 'postgres' password: 'postgres' host: '127.0.0.1' port: 5432 dbname: 'datagrok' sslmode: 'disable' name: 'local' inventory_hostname: 'localhost' | Monitored database credentials |
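
As an illustration of how the defaults in the table might be overridden, the sketch below switches alerting from e-mail to Slack and points the role at remote Datagrok, CVM, and PostgreSQL hosts. The file location, all host names, the webhook URL, and the `vault_pgsql_password` variable are placeholders, not values shipped with the role.

```yaml
# group_vars/all.yml (hypothetical location for the overrides)

# Endpoints checked by BlackBox Exporter (placeholder host names)
monitoring_datagrok_endpoint: https://datagrok.example.com
monitoring_cvm_endpoint: https://cvm.example.com

# Send alerts to Slack instead of e-mail
monitoring_slack_alerts: true
monitoring_slack_channel: monitoring
monitoring_slack_webhook_url: "https://hooks.slack.com/services/REPLACE/WITH/YOURS"
monitoring_email_alerts: false

# Databases scraped by PostgreSQL Exporter; keys mirror the default entry
monitoring_pgsql_dbs:
  - login: 'postgres'
    password: "{{ vault_pgsql_password }}"  # hypothetical vaulted variable
    host: '10.0.0.5'
    port: 5432
    dbname: 'datagrok'
    sslmode: 'disable'
    name: 'datagrok-db'
    inventory_hostname: 'db01.example.com'
```

Keeping the database password in a vaulted variable rather than in plain text is a design choice of this sketch, not a requirement of the role.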
We recommend importing these Grafana dashboards for metrics visualization:
Service | Dashboard number |
---|---|
Prometheus Node Exporter | 1860 |
Cadvisor Exporter | 14282 |
Blackbox Exporter | 7587 |
PostgreSQL Exporter | 6742 |
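
The dashboards can be imported by number through the Grafana web interface. As an alternative, the sketch below automates the import with the `community.grafana.grafana_dashboard` module. It assumes that the `community.grafana` collection is installed, that Grafana listens on the default `monitoring_grafana_port` (3000), and that the initial admin/admin credentials are still in place. Imported dashboards may still need their data source adjusted by hand.

```yaml
# Hypothetical play; requires the community.grafana collection.
- hosts: monitoring_manager
  tasks:
    - name: Import recommended dashboards from grafana.com
      community.grafana.grafana_dashboard:
        grafana_url: "http://127.0.0.1:3000"
        url_username: admin   # initial login; change after first use
        url_password: admin
        dashboard_id: "{{ item }}"
        state: present
        overwrite: true
      loop:
        - 1860   # Prometheus Node Exporter
        - 14282  # cAdvisor Exporter
        - 7587   # Blackbox Exporter
        - 6742   # PostgreSQL Exporter
```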
- hosts: servers
  roles:
    - role: monitoring
      vars:
        monitoring_grafana_enabled: false
        monitoring_note_receiver:
          - '[email protected]'
          - '[email protected]'
        monitoring_server_inventory_group: server
        monitoring_targets_inventory_group: targets
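
A matching inventory for the playbook above might look like the following sketch in Ansible's YAML inventory format. The host names are placeholders, and nesting the `server` and `targets` groups under `servers` is an assumption about how the play and the role select hosts.

```yaml
# inventory.yml (hypothetical file name and hosts)
all:
  children:
    servers:                  # group targeted by the play
      children:
        server:               # monitoring_server_inventory_group
          hosts:
            monitor01.example.com:
        targets:              # monitoring_targets_inventory_group
          hosts:
            app01.example.com:
            db01.example.com:
```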
Datagrok
Dmytro Nahovskyi, E-mail: [email protected]