Admin: Tutorial about CrateDB monitoring with Prometheus and Grafana #302
base: main
Walkthrough
Adds a new Prometheus/Grafana monitoring tutorial and links to it from admin and integration docs. Inserts a toctree entry under docs/admin/index.md and adds See also grid-item cards on Grafana and Prometheus integration pages.
Sequence Diagram(s)
sequenceDiagram
autonumber
actor Admin as Admin
participant Crate as CrateDB Cluster
participant JMX as Crate JMX HTTP Exporter
participant SQLExp as Prometheus SQL Exporter
participant NodeExp as Prometheus Node Exporter
participant Prom as Prometheus
participant Graf as Grafana
Admin->>Crate: Bootstrap cluster & configure memory/network
Admin->>JMX: Enable JMX HTTP Exporter on nodes
Admin->>NodeExp: Install/configure Node Exporter
Admin->>SQLExp: Install/configure SQL Exporter (create sql_exporter user)
Admin->>Prom: Add scrape targets (node, cratedb_jmx, sql_exporter)
Admin->>Graf: Add Prometheus datasource, import dashboard
Prom->>JMX: Scrape JMX metrics
Prom->>NodeExp: Scrape node metrics
Prom->>SQLExp: Scrape SQL-derived metrics
SQLExp->>Crate: Run queries to produce metrics
Graf->>Prom: Query metrics for dashboards
Prom-->>Graf: Return time-series
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
2900d54 to 4eb547b
4eb547b to ca9d644
ca9d644 to ba78343
@coderabbitai review
When suggesting changes, please use active voice instead of first-person or passive voice, thanks!
@amotl I'll review the PR and ensure I use active voice when suggesting changes.
✏️ Learnings added
✅ Actions performed
Review triggered.
Actionable comments posted: 3
🧹 Nitpick comments (12)
docs/admin/monitoring/prometheus-grafana.md (11)
6-8: Use direct, active voice; remove “we/I” phrasing.
Tighten the intro and switch to imperative tone.
-If you are running CrateDB in a production environment, you have probably wondered what would be the best way to monitor the servers to identify issues before they become problematic and to collect statistics that you can use for capacity planning.
+In production, monitor CrateDB proactively to catch issues early and collect statistics for capacity planning.
-We recommend pairing two well-known OSS solutions, [Prometheus](https://prometheus.io/) which is a system that collects and stores performance metrics, and [Grafana](https://grafana.com/) which is a system to create dashboards.
+Pair two OSS tools: use [Prometheus](https://prometheus.io/) to collect and store metrics, and [Grafana](https://grafana.com/) to build dashboards.
16-22: Avoid first-person and conversational tone; state scope crisply.
-Things are a bit different of course if you are using containers, or if you are using the fully-managed cloud-hosted [CrateDB Cloud](https://cratedb.com/products/cratedb-cloud), but let’s see how all this works on an on-premises installation by setting all this up together.
+Containerized and [CrateDB Cloud](https://cratedb.com/products/cratedb-cloud) setups differ. This tutorial targets on‑premises installations.
24-35: Remove first‑person; clarify required steps and why.
-In my case, I am using Ubuntu and I did it like this, first I ssh to the first machine and run:
+On Ubuntu, start on the first node and run:
@@
-This is a configuration file that will be used by CrateDB, we only need one line to configure memory settings here (this is a required step otherwise we will fail bootstrap checks):
+This configuration file sets the JVM heap. Configure it to satisfy bootstrap checks:
112-116: Append the javaagent; don’t overwrite existing CRATE_JAVA_OPTS.
Overwriting may drop other required JVM flags. Instruct to append.
-CRATE_JAVA_OPTS="-javaagent:/usr/share/crate/lib/crate-jmx-exporter-1.0.0.jar=8080"
+# Append to existing options (preserve other flags)
+CRATE_JAVA_OPTS="${CRATE_JAVA_OPTS:-} -javaagent:/usr/share/crate/lib/crate-jmx-exporter-1.0.0.jar=8080"
Also advise restricting network access to the exporter port via firewall/security groups.
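To illustrate what the suggested `${CRATE_JAVA_OPTS:-}` expansion does, here is a minimal shell sketch. It assumes the setting is evaluated by a POSIX-compatible shell (whether the service actually reads the file that way is not confirmed here), and the `-Xlog:gc` value is only a hypothetical pre-existing flag chosen for illustration.

```shell
# The agent flag from the suggestion above.
AGENT="-javaagent:/usr/share/crate/lib/crate-jmx-exporter-1.0.0.jar=8080"

# Case 1: CRATE_JAVA_OPTS is unset, so ${CRATE_JAVA_OPTS:-} expands to nothing.
unset CRATE_JAVA_OPTS
CRATE_JAVA_OPTS="${CRATE_JAVA_OPTS:-} ${AGENT}"
opts_when_unset="${CRATE_JAVA_OPTS}"

# Case 2: an existing flag (assumed value) is kept and the agent is appended.
CRATE_JAVA_OPTS="-Xlog:gc"
CRATE_JAVA_OPTS="${CRATE_JAVA_OPTS:-} ${AGENT}"
opts_when_set="${CRATE_JAVA_OPTS}"

echo "unset: [${opts_when_unset}]"
echo "set:   [${opts_when_set}]"
```

In the unset case the result carries a leading space, which is harmless when the value is later word-split into JVM arguments.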
126-131: Note default listen address; suggest binding/ACL.
Node Exporter typically listens on all interfaces. Add a note to bind to loopback or firewall the port in production.
143-147: Creating users via HTTP: call out security.
Transmitting credentials over HTTP even on localhost can leak via proxies/logs. Recommend HTTPS when available or running from the node over the Postgres protocol with a local client.
248-259: Fix duplicated wording; keep jobs concise.
Tighten the sentence and ensure job list reflects intended targets.
-We replace this with the below configuration, which reflects port 8080 (Crate JMX Exporter), port 9100 (Prometheus Node Exporter), port 9237 (Prometheus SQL Exporter), as well as port 9100 (Prometheus Node Exporter).
+Replace it with the following jobs: port 9100 (Node Exporter), port 8080 (Crate JMX Exporter), and port 9237 (SQL Exporter).
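A sketch of what the three jobs could look like, written from shell to stay in the tutorial's own language. The job names follow the walkthrough above (`node`, `cratedb_jmx`, `sql_exporter`); the `localhost` targets and the scratch path `./prometheus-scrape.yml` are assumptions for illustration, not the tutorial's actual values.

```shell
# Write a minimal scrape_configs fragment to a scratch file (assumed path).
SCRAPE_FILE="./prometheus-scrape.yml"

cat > "${SCRAPE_FILE}" <<'EOF'
scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]   # Prometheus Node Exporter
  - job_name: "cratedb_jmx"
    static_configs:
      - targets: ["localhost:8080"]   # Crate JMX HTTP Exporter
  - job_name: "sql_exporter"
    static_configs:
      - targets: ["localhost:9237"]   # Prometheus SQL Exporter
EOF

# Count the defined jobs as a quick sanity check.
job_count="$(grep -c 'job_name:' "${SCRAPE_FILE}")"
echo "jobs defined: ${job_count}"
```

Listing each port exactly once makes the duplicated 9100 entry from the original sentence easy to spot.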
275-281: Avoid bare URLs; satisfy markdownlint MD034.
Render endpoints as links or code, not bare URLs.
-If you now point your browser to *http://<Grafana host>:3000* you will be welcomed by the Grafana login screen, the first time you can log in with admin as both the username and password, make sure to change this password right away.
+Open `http://<grafana-host>:3000` to access the Grafana login screen. The default credentials are `admin`/`admin`; change the password immediately.
@@
-then click on "Prometheus", and enter the URL *http://\<Prometheus host>:9090*.
+then click "Prometheus" and set the URL to `http://<prometheus-host>:9090`.
283-285: Avoid hotlinking external images in docs.
Store the dashboard screenshot in the repo’s static assets to ensure offline builds and reproducibility.
291-303: Deduplicate “Thread pool queue size” metric.
The bullet appears twice. Keep one entry.
 * Thread pool queue size: `sum(crate_threadpools{property="queueSize"}) by (name)`
@@
- * Thread pool queue size: `crate_threadpools{property="queueSize"}`
102-116: Update pinned crate-jmx-exporter version and document host binding
File: docs/admin/monitoring/prometheus-grafana.md (lines 102–116)
- Replace hardcoded crate-jmx-exporter-1.0.0.jar with the latest stable (1.2.0, released Aug 20, 2024) or use a version variable and document how to find the current release.
- Document that the agent accepts host:port and update the CRATE_JAVA_OPTS example to show binding a specific IP (e.g. CRATE_JAVA_OPTS="-javaagent:/usr/share/crate/lib/crate-jmx-exporter-1.2.0.jar=127.0.0.1:8080").
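One way to follow the version-variable suggestion, as a hedged sketch: the variable names (`JMX_EXPORTER_VERSION`, `JMX_EXPORTER_BIND`) are hypothetical, the download URL pattern is taken from the Maven Central link quoted later in this thread, and nothing is downloaded or installed here.

```shell
# Hypothetical variables; adjust the version when a newer release appears.
JMX_EXPORTER_VERSION="1.2.0"
JMX_EXPORTER_BIND="127.0.0.1:8080"   # host:port, binds the exporter to loopback

# Derive file name, download URL, and install path from the version.
JAR_NAME="crate-jmx-exporter-${JMX_EXPORTER_VERSION}.jar"
JAR_URL="https://repo1.maven.org/maven2/io/crate/crate-jmx-exporter/${JMX_EXPORTER_VERSION}/${JAR_NAME}"
JAR_PATH="/usr/share/crate/lib/${JAR_NAME}"

# Compose the agent option that would be appended to CRATE_JAVA_OPTS.
AGENT_OPT="-javaagent:${JAR_PATH}=${JMX_EXPORTER_BIND}"

echo "download from: ${JAR_URL}"
echo "agent option:  ${AGENT_OPT}"
```

Keeping the version in one place means the download URL and the agent path cannot drift apart when the pin changes.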
docs/integrate/prometheus/index.md (1)
124-138: Resolve potential MD053: confirm link reference usage.
Ensure the `[CrateDB and Prometheus]` reference is actually rendered by Sphinx in this page; otherwise remove the definition or link inline.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
docs/admin/index.md (1 hunks)
docs/admin/monitoring/prometheus-grafana.md (1 hunks)
docs/integrate/grafana/index.md (1 hunks)
docs/integrate/prometheus/index.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-09T16:19:43.966Z
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.966Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.
Applied to files:
docs/integrate/grafana/index.md
🪛 markdownlint-cli2 (0.17.2)
docs/admin/monitoring/prometheus-grafana.md
61-61: Bare URL used (MD034, no-bare-urls)
137-137: Link and image reference definitions should be needed. Unused link or image reference definition: "cratedb and prometheus" (MD053, link-image-reference-definitions)
🔇 Additional comments (3)
docs/admin/index.md (1)
35-35: LGTM: good placement under Cluster.
The “Monitoring” entry fits between sharding/partitioning and performance.
docs/integrate/prometheus/index.md (1)
110-121: LGTM: solid cross-link from Prometheus integration.
The See also card improves discoverability of the monitoring tutorial.
docs/integrate/grafana/index.md (1)
54-66: LGTM: See also card aligns with integration pattern.
The cross-link to the admin tutorial is helpful and follows the integrations’ index.md anchor practice.
ba78343 to 4f59beb
Containerized and [CrateDB Cloud] setups differ. This tutorial targets
standalone and on‑premises installations.

## First we need a CrateDB cluster
Isn't this topic already covered elsewhere? Could we link to the existing "install a cluster" content instead? This would avoid repetition and also avoid touching a lot of places if we need to change anything in the setup guide.
That's true. I think the unique thing here is that the fundamental installation is followed up by educating users about the installation of the Crate JMX HTTP Exporter, which requires editing CrateDB's `/etc/default/crate` configuration file.
Setup of the Crate JMX HTTP Exporter
This is very simple, on each node run the following:
cd /usr/share/crate/lib
wget https://repo1.maven.org/maven2/io/crate/crate-jmx-exporter/1.2.0/crate-jmx-exporter-1.2.0.jar
nano /etc/default/crate
then uncomment the `CRATE_JAVA_OPTS` line and change its value to:
# Append to existing options (preserve other flags).
CRATE_JAVA_OPTS="${CRATE_JAVA_OPTS:-} -javaagent:/usr/share/crate/lib/crate-jmx-exporter-1.2.0.jar=8080"
and restart the crate daemon:
systemctl restart crate
Of course, we can use the opportunity to break out and refactor those ingredients to a dedicated place and then refer to them, as you've suggested.
Let's use the chance right away? If you agree, just signal 👍.
```shell
echo "deb https://packages.grafana.com/oss/deb stable main" | tee -a /etc/apt/sources.list.d/grafana.list
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
apt update
apt install grafana
systemctl start grafana-server
```
Maybe link to the Grafana docs instead (or in addition)? https://grafana.com/docs/grafana/latest/setup-grafana/installation/debian/
Same point as before: if something changes in the installation documentation, we do not have to adjust it here.
I guess this article was meant to be written as a tutorial. According to its accepted definition, a tutorial apparently needs to lay out each step safely.
Because I think it is not too practical from a maintenance perspective, I agree with your approach to refer to Grafana's documentation instead.
See e93ebd7.
* Set up Docker Compose to run CrateDB, Prometheus, and the CrateDB Prometheus Adapter
* Run the applications with Docker Compose

*Note: this blog post uses CrateDB 4.7.0, Prometheus 2.33.3 and CrateDB Prometheus Adapter 0.4.0*
I think the whole content needs some adjustments, like removing ^^ and changing the style from "I show you" into impersonal phrasing.
Another option would be to just link to the post instead of copying it. Not sure about this. I think one idea behind publishing this as a post instead of adding it to the documentation was that a post can age, while documentation should be up to date and therefore requires maintenance.
The content has been converted to use active voice, thanks.
3691767 to 5c66096
e69661e to d8baa8c
e27df98 to 77abeea
b9722a3 to 0fe4e0d
0fe4e0d to e93ebd7
Actionable comments posted: 0
🧹 Nitpick comments (4)
docs/admin/monitoring/prometheus-grafana.md (4)
32-45: Add language hint to the heap-size fence.
Add a language to the fence so markdownlint stops complaining and readers get syntax highlighting.
-```
-CRATE_HEAP_SIZE=4G
-```
+```ini
+CRATE_HEAP_SIZE=4G
+```
Based on static analysis hints.
40-44: Label the command block.
Add `shell` to the fence so lint passes and the commands highlight correctly.
-```
-mkdir /etc/crate
-nano /etc/crate/crate.yml
-```
+```shell
+mkdir /etc/crate
+nano /etc/crate/crate.yml
+```
Based on static analysis hints.
131-134: Mark the Node Exporter command block.
Add `shell` to this fence to satisfy markdownlint and improve readability.
-```
-apt install prometheus-node-exporter
-```
+```shell
+apt install prometheus-node-exporter
+```
Based on static analysis hints.
239-246: Convert the indented examples into fenced blocks.
Switch these indented snippets to fenced code blocks so markdownlint stops flagging them and the examples stay crisp.
-    basic_auth_users:
-      admin: <bcrypt hash>
+```yaml
+basic_auth_users:
+  admin: <bcrypt hash>
+```
-    ARGS="--web.config.file=/etc/prometheus/web.yml --web.enable-lifecycle"
+```shell
+ARGS="--web.config.file=/etc/prometheus/web.yml --web.enable-lifecycle"
+```
Based on static analysis hints.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
docs/admin/index.md (1 hunks)
docs/admin/monitoring/prometheus-grafana.md (1 hunks)
docs/integrate/grafana/index.md (1 hunks)
docs/integrate/prometheus/index.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- docs/integrate/prometheus/index.md
- docs/admin/index.md
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-22T18:11:12.776Z
Learnt from: amotl
PR: crate/cratedb-guide#263
File: docs/integrate/kafka/docker-python.md:112-120
Timestamp: 2025-08-22T18:11:12.776Z
Learning: In documentation and tutorial repositories like cratedb-guide, code examples should prioritize simplicity, clarity, and educational value over production-ready features. Comprehensive error handling, extensive validation, and other production concerns can distract from the main learning objectives and make examples harder to follow. Review suggestions should focus on correctness and clarity rather than production hardening.
Applied to files:
docs/admin/monitoring/prometheus-grafana.md
📚 Learning: 2025-08-14T12:42:12.744Z
Learnt from: amotl
PR: crate/cratedb-guide#248
File: docs/integrate/amqp/tutorial.md:36-41
Timestamp: 2025-08-14T12:42:12.744Z
Learning: In the cratedb-guide repository, code examples and tutorials prioritize simplicity and ease of following over security hardening. Security recommendations should be more lenient in tutorial/demo contexts to avoid distracting from the main learning objectives.
Applied to files:
docs/admin/monitoring/prometheus-grafana.md
🪛 markdownlint-cli2 (0.18.1)
docs/admin/monitoring/prometheus-grafana.md
33-33: Fenced code blocks should have a language specified (MD040, fenced-code-language)
131-131: Fenced code blocks should have a language specified (MD040, fenced-code-language)
239-239: Code block style. Expected: fenced; Actual: indented (MD046, code-block-style)
244-244: Code block style. Expected: fenced; Actual: indented (MD046, code-block-style)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build docs
About
Continue adding integration guides from the community forum.
Preview
References
/cc @hlcianfagna, @hammerhead