@amotl amotl commented Sep 20, 2025

About

In order to expand the Ingest / Telemetry section, the new subsection about the venerable collectd accompanies subsections about integrations with Prometheus, rsyslog, and Telegraf.

Preview

References

Trivia

@coderabbitai: Please review and convert to active voice.

/cc @karynzv, @hammerhead, @surister

@amotl amotl added the "new content" label (New content being added.) Sep 20, 2025

coderabbitai bot commented Sep 20, 2025

Walkthrough

Adds Collectd integration docs and examples: new integration page, configuration snippets for CrateDB and Telegraf, a Dockerfile to run collectd, two usage tutorials, and an entry in the telemetry integrations index. No existing content removed; all additions are documentation and config assets.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Telemetry index entry | docs/ingest/telemetry/index.md | Inserted a new grid-item-card for Collectd in the Integrations section with link, type, and description. |
| Collectd integration page | docs/integrate/collectd/index.md | New integration landing page describing Collectd, showing config includes for CrateDB (PostgreSQL writer) and Telegraf, links to tutorials, and external references. |
| Collectd containerization | docs/integrate/collectd/Dockerfile | New Dockerfile based on debian:13-slim; installs collectd and runs it in foreground via entrypoint. |
| Collectd → CrateDB config | docs/integrate/collectd/collectd-cratedb.conf | New Collectd config using LoadPlugin postgresql; defines SQL writer and database to insert metrics into doc.collectd_data on CrateDB. |
| Collectd → Telegraf config | docs/integrate/collectd/collectd-telegraf.conf | New Collectd network plugin config sending metrics to telegraf server; optional port/security commented. |
| Telegraf config for Collectd input and CrateDB output | docs/integrate/collectd/telegraf.conf | New Telegraf config: [[inputs.socket_listener]] with collectd format on UDP 25826; CrateDB output via Postgres URL; optional auth/typesdb guidance. |
| Usage tutorial: Collectd → CrateDB | docs/integrate/collectd/usage-collectd.md | New tutorial covering prerequisites, table creation, building/running Collectd, config, and sample queries. |
| Usage tutorial: Collectd → Telegraf → CrateDB | docs/integrate/collectd/usage-telegraf.md | New tutorial detailing setup for Telegraf bridge, platform-specific start commands, and example query. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Collectd
  participant CrateDB as CrateDB (PostgreSQL wire)
  Note over Collectd,CrateDB: Direct ingestion path
  User->>Collectd: Start collectd with postgresql plugin
  Collectd->>CrateDB: INSERT metrics into doc.collectd_data
  CrateDB-->>Collectd: ACK/Result
  User->>CrateDB: Query collected metrics
  CrateDB-->>User: Result rows
sequenceDiagram
  autonumber
  actor User
  participant Collectd
  participant Telegraf
  participant CrateDB as CrateDB (PostgreSQL wire)
  rect rgba(230,245,255,0.5)
    Note over Collectd,Telegraf: Network plugin (UDP 25826)
    User->>Collectd: Start collectd (network plugin)
    Collectd--)Telegraf: Metrics (collectd protocol)
  end
  rect rgba(240,255,230,0.5)
    Note over Telegraf,CrateDB: Output plugin (Postgres)
    Telegraf->>CrateDB: Write metrics (table_create=true)
    CrateDB-->>Telegraf: ACK/Result
  end
  User->>CrateDB: Query metrics
  CrateDB-->>User: Result rows

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

guidance, cross linking

Suggested reviewers

  • kneth
  • karynzv

Poem

I twitch my whiskers, ears alert and keen,
New metrics hop to CrateDB—so clean!
From Collectd’s burrow, straight or via stream,
Telegraf ferries bytes like a gentle beam.
Carrots of data, rows in a row—
I thump with joy: “Ship, store, query—go!” 🥕🐇

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The title "collectd: Index page and two usage guides" concisely and accurately summarizes the primary changes in the PR (adding a collectd index page and two tutorial/usage guides) and aligns with the added files in the changeset; it is specific, readable, and not vague or noisy. |
| Description Check | ✅ Passed | The PR description explains the addition of a collectd subsection, provides preview links, references, and a reviewer note, all of which relate directly to the files and objectives in the changeset and give useful context for reviewers. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

@amotl amotl marked this pull request as ready for review September 20, 2025 16:39
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
docs/integrate/collectd/Dockerfile (1)

7-8: Consider combining apt commands for better layer efficiency.

Combining the apt update and apt install commands reduces Docker layers and ensures the package index is always fresh when installing packages.

-RUN apt update --yes
-RUN apt install --yes collectd
+RUN apt update --yes && apt install --yes collectd
docs/integrate/collectd/collectd-telegraf.conf (1)

4-6: Add network plugin configuration details.

The configuration is minimal but functional. However, users may benefit from additional configuration options like port specification and security settings for production use.

Consider adding a comment about common optional settings:

 <Plugin "network">
   Server "telegraf"
+  # Optional: ServerPort 25826
+  # Optional: SecurityLevel "sign" or "encrypt"
 </Plugin>
docs/integrate/collectd/collectd-cratedb.conf (1)

13-13: Consider making CommitInterval configurable.

A 30-second commit interval may be too aggressive for some deployments. Consider documenting this as a tunable parameter.

Add a comment explaining the CommitInterval setting:

         <Database cratedb>
                 Writer sqlstore
+                # Commit every 30 seconds (adjust based on write volume)
                 CommitInterval 30
docs/integrate/collectd/tutorial-telegraf.md (1)

61-61: Fix typo in Dockerfile filename.

Line 61 refers to "Dockername" instead of "Dockerfile".

-or use these instructions for building an OCI to invoke on Docker or Podman.
-Store this file under the name `Dockername`, then invoke the command
+or use these instructions for building an OCI to invoke on Docker or Podman.
+Store this file under the name `Dockerfile`, then invoke the command
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 26db150 and 2d49faf.

📒 Files selected for processing (8)
  • docs/ingest/telemetry/index.md (1 hunks)
  • docs/integrate/collectd/Dockerfile (1 hunks)
  • docs/integrate/collectd/collectd-cratedb.conf (1 hunks)
  • docs/integrate/collectd/collectd-telegraf.conf (1 hunks)
  • docs/integrate/collectd/index.md (1 hunks)
  • docs/integrate/collectd/telegraf.conf (1 hunks)
  • docs/integrate/collectd/tutorial-collectd.md (1 hunks)
  • docs/integrate/collectd/tutorial-telegraf.md (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs
🔇 Additional comments (18)
docs/integrate/collectd/Dockerfile (2)

1-1: The debian:13-slim base image is suitable.

Debian 13 "trixie" is the current stable distribution, released on August 9th, 2025, and collectd 5.12.0-23 is available in trixie. The debian:13-slim tag should work correctly for this use case.


11-11: LGTM! Correct entrypoint for collectd foreground execution.

The -f flag correctly runs collectd in foreground mode, which is essential for containerized deployments.

docs/ingest/telemetry/index.md (1)

55-59: LGTM! Consistent with existing grid entries.

The new collectd grid item follows the same format as existing entries (Prometheus, rsyslog, Telegraf) and provides an appropriate description.

docs/integrate/collectd/collectd-cratedb.conf (2)

14-16: LGTM! Database connection parameters are appropriate.

The connection settings use standard defaults that align with the tutorial Docker setup using the cratedb-demo network.


8-8: No schema issues — INSERT parameters match CREATE TABLE. The INSERT's 9 parameters map to doc.collectd_data columns (p_time TIMESTAMP WITH TIME ZONE; p_host, p_plugin, p_plugin_instance, p_type, p_type_instance, p_value_names, p_type_names, p_values as TEXT); month is GENERATED ALWAYS AS date_trunc('month', p_time). See docs/integrate/collectd/tutorial-collectd.md.

docs/integrate/collectd/telegraf.conf (3)

2-2: Verify UDP port 25826 consistency.

Port 25826 is the standard collectd network plugin port, which should match the collectd network configuration.


28-28: LGTM! Correct CrateDB connection string format.

The PostgreSQL-compatible URL format with SSL disabled is appropriate for the Docker demo environment.


37-37: Good practice: Enable automatic table creation.

Setting table_create = true is helpful for getting started quickly, though users should consider creating tables manually in production for better control.

docs/integrate/collectd/tutorial-collectd.md (4)

64-76: LGTM! Table schema matches collectd configuration.

The table schema with 9 columns correctly corresponds to the INSERT statement parameters in collectd-cratedb.conf. The monthly partitioning is a good practice for time-series data.


86-87: LGTM! Correct literalinclude directive.

The literalinclude correctly references the Dockerfile in the same directory, which will be rendered properly in the documentation.


98-99: LGTM! Configuration file reference is correct.

The literalinclude correctly references the collectd-cratedb.conf file, providing readers with the exact configuration needed.


143-151: Sample output looks realistic and helpful.

The sample psql output demonstrates the expected data structure and shows realistic collectd metrics, helping users verify their setup.

docs/integrate/collectd/tutorial-telegraf.md (2)

166-174: Sample output demonstrates Telegraf's different data format.

The JSON-structured fields show how Telegraf transforms collectd data differently compared to the direct PostgreSQL approach, which is valuable for users to understand.


163-163: Verify table name consistency with Telegraf configuration.
telegraf.conf shows the default table as '# table = "metrics"' and has table_create = true — the verification query SELECT * FROM doc.metrics is correct.

docs/integrate/collectd/index.md (4)

5-5: LGTM! Image link and formatting are appropriate.

The right-aligned collectd graph image with lazy loading provides good visual context while maintaining page performance.


13-21: LGTM! Accurate description of collectd functionality.

The description correctly explains collectd's purpose, data sources, and use cases for performance analysis and capacity planning.


28-34: LGTM! Configuration examples provide clear integration paths.

Both PostgreSQL direct integration and Telegraf forwarding options are clearly presented with literalinclude directives that reference the actual configuration files.


42-52: LGTM! Tutorial links are properly structured.

The grid-item-cards provide clear navigation to both integration approaches with descriptive text and correct reference links.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (15)
docs/integrate/collectd/tutorial-collectd.md (15)

4-6: Tighten intro; use active voice.

Prefer a concise, active sentence.

-This tutorial walks you through configuring and starting the [collectd]
-agent and daemon, and CrateDB, to submit and store system metrics into
-CrateDB.
+This tutorial shows how to configure and start [collectd] and CrateDB so that collectd sends system metrics and CrateDB stores them.

20-26: Pin the psql image for reproducibility.

Unpinned images drift. Pin a stable major.

-alias psql="docker run --rm -i --network=cratedb-demo docker.io/postgres psql"
+alias psql="docker run --rm -i --network=cratedb-demo docker.io/postgres:16 psql"

27-33: Pin the psql image on PowerShell as well.

-function psql { docker run --rm -i --network=cratedb-demo docker.io/postgres psql @args }
+function psql { docker run --rm -i --network=cratedb-demo docker.io/postgres:16 psql @args }

35-39: Use the correct code fence and pin the image in DOSKEY example.

Prefer bat for Windows cmd highlighting; pin image.

-```shell
-doskey psql=docker run --rm -i --network=cratedb-demo docker.io/postgres psql $*
+```bat
+doskey psql=docker run --rm -i --network=cratedb-demo docker.io/postgres:16 psql $*

45-49: Make network creation idempotent.

Avoid errors if the network already exists.

-docker network create cratedb-demo
+docker network inspect cratedb-demo >/dev/null 2>&1 || docker network create cratedb-demo

50-55: Run CrateDB detached and pin the image.

Detached improves UX for multi‑terminal steps; pinning avoids drift.

-docker run --name=cratedb --rm -it --network=cratedb-demo \
-  --publish=4200:4200 --publish=5432:5432 \
-  --env=CRATE_HEAP_SIZE=2g docker.io/crate -Cdiscovery.type=single-node
+docker run --name=cratedb --rm -d --network=cratedb-demo \
+  -p 4200:4200 -p 5432:5432 \
+  -e CRATE_HEAP_SIZE=2g docker.io/crate:5 -Cdiscovery.type=single-node

63-76: Align table types with arrays and avoid TIMESTAMPTZ ambiguity.

CrateDB handles TIMESTAMP in UTC; storing arrays as arrays eases querying (unnest/ANY). Also keep the generated monthly partition.

 CREATE TABLE doc.collectd_data (
-   p_time timestamp with time zone,
-   p_host TEXT,
-   p_plugin TEXT,
-   p_plugin_instance TEXT,
-   p_type TEXT,
-   p_type_instance TEXT,
-   p_value_names TEXT,
-   p_type_names TEXT,
-   p_values TEXT,
-   month GENERATED ALWAYS AS date_trunc('month',p_time)
+   p_time TIMESTAMP,
+   p_host TEXT,
+   p_plugin TEXT,
+   p_plugin_instance TEXT,
+   p_type TEXT,
+   p_type_instance TEXT,
+   p_value_names ARRAY(TEXT),
+   p_type_names ARRAY(TEXT),
+   p_values ARRAY(DOUBLE PRECISION),
+   month GENERATED ALWAYS AS date_trunc('month', p_time)
 ) PARTITIONED BY (month);

Please confirm that collectd-cratedb.conf maps fields to these column names and array types.


81-85: Fix phrasing: “as an OCI image”; “OCI image to run”.

Small editorial cleanup.

-collectd is not available per OCI image, so either install standalone,
-or use these instructions for building an OCI to invoke on Docker or Podman.
+collectd is not available as an OCI image, so either install it standalone,
+or use these instructions to build an OCI image to run on Docker or Podman.

106-113: Mount config read‑only.

Protects against accidental writes from the container.

-  --volume ${PWD}/collectd-cratedb.conf:/etc/collectd/collectd.conf.d/collectd-cratedb.conf \
+  --volume ${PWD}/collectd-cratedb.conf:/etc/collectd/collectd.conf.d/collectd-cratedb.conf:ro \

116-120: Read‑only config on PowerShell.

-  --volume "${PWD}\collectd-cratedb.conf:/etc/collectd/collectd.conf.d/collectd-cratedb.conf" `
+  --volume "${PWD}\collectd-cratedb.conf:/etc/collectd/collectd.conf.d/collectd-cratedb.conf:ro" `

124-128: Read‑only config on cmd.exe.

-  --volume "%cd%\collectd-cratedb.conf:/etc/collectd/collectd.conf.d/collectd-cratedb.conf" ^
+  --volume "%cd%\collectd-cratedb.conf:/etc/collectd/collectd.conf.d/collectd-cratedb.conf:ro" ^

132-138: Show most recent samples first.

More useful for validation.

-psql "postgresql://crate:crate@cratedb:5432/" -c "SELECT * FROM doc.collectd_data LIMIT 5;"
+psql "postgresql://crate:crate@cratedb:5432/" -c "SELECT * FROM doc.collectd_data ORDER BY p_time DESC LIMIT 5;"

140-147: Add a brief note about NaN samples.

Some interfaces report NaN; call it out to avoid confusion in screenshots.

Consider appending: “Note: Some interfaces may report NaN for counters that are not applicable.”


50-55: Optional: add a quick cleanup section.

Quality‑of‑life for users finishing the tutorial.

You can add:

docker rm -f collectd cratedb 2>/dev/null || true
docker network rm cratedb-demo 2>/dev/null || true

10-12: Mention Podman substitution explicitly.

Small clarity win.

-Docker is used for running all components. This approach works consistently
-across Linux, macOS, and Windows. Alternatively, you can use Podman.
+Docker runs all components consistently across Linux, macOS, and Windows.
+If you use Podman, substitute podman for docker in the commands.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2d49faf and 757d4d5.

📒 Files selected for processing (6)
  • docs/integrate/collectd/Dockerfile (1 hunks)
  • docs/integrate/collectd/collectd-cratedb.conf (1 hunks)
  • docs/integrate/collectd/collectd-telegraf.conf (1 hunks)
  • docs/integrate/collectd/telegraf.conf (1 hunks)
  • docs/integrate/collectd/tutorial-collectd.md (1 hunks)
  • docs/integrate/collectd/tutorial-telegraf.md (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • docs/integrate/collectd/collectd-telegraf.conf
🚧 Files skipped from review as they are similar to previous changes (4)
  • docs/integrate/collectd/telegraf.conf
  • docs/integrate/collectd/tutorial-telegraf.md
  • docs/integrate/collectd/Dockerfile
  • docs/integrate/collectd/collectd-cratedb.conf
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs
🔇 Additional comments (1)
docs/integrate/collectd/tutorial-collectd.md (1)

63-76: Verify CrateDB column types via information_schema

psql was not available in the verification environment (psql: command not found). Run this query against your CrateDB instance and paste the output to confirm column types — especially p_time (TIMESTAMP WITH TIME ZONE) and the *_names / *_values array mappings:

SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema='doc' AND table_name='collectd_data'
ORDER BY ordinal_position;

@amotl amotl changed the title from "collectd: Index page and two tutorials" to "collectd: Index page and two usage guides" Sep 23, 2025

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (6)
docs/integrate/collectd/usage-telegraf.md (6)

4-7: Tighten phrasing; plural agreement.

Use “them in” (metrics, plural) and a slightly more direct opener.

Apply this diff:

-This usage guide shows how to configure and start [collectd], [Telegraf]
-and CrateDB so that collectd sends system metrics to Telegraf, which
-stores it into CrateDB using the built-in [CrateDB Output Plugin for Telegraf].
+This guide shows how to configure and start [collectd], [Telegraf],
+and CrateDB so that collectd sends system metrics to Telegraf, which
+stores them in CrateDB using the built-in [CrateDB Output Plugin for Telegraf].

24-25: Allocate a TTY for interactive psql.

Add -t to match the other run commands and improve interactive UX.

Apply this diff:

-alias psql="docker run --rm -i --network=cratedb-demo docker.io/postgres:16 psql"
+alias psql="docker run --rm -it --network=cratedb-demo docker.io/postgres:16 psql"
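
One caveat to weigh before applying this change, offered as a sketch rather than a required fix: `docker run -it` fails with "the input device is not a TTY" when standard input is piped or redirected, so an alias that always passes `-t` cannot be used in pipelines. The helper below picks the flags at call time. The names `tty_flags` and `psql_run` are hypothetical and not part of the PR.

```shell
# Hypothetical helper: emit "-it" when stdin is a terminal, otherwise "-i".
tty_flags() {
  if [ -t 0 ]; then
    echo "-it"
  else
    echo "-i"
  fi
}

# Hypothetical wrapper replacing the alias; it forwards all arguments to psql.
# $(tty_flags) is left unquoted on purpose: it expands to a single flag word.
psql_run() {
  docker run --rm $(tty_flags) --network=cratedb-demo docker.io/postgres:16 psql "$@"
}
```

With this wrapper, an interactive call gets a TTY, while a call with redirected input falls back to `-i` only.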

47-48: Make network creation idempotent (optional).

Avoid “already exists” errors on repeat runs.

Use this instead:

docker network inspect cratedb-demo >/dev/null 2>&1 || docker network create cratedb-demo
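
The same check-then-create pattern can be wrapped in a small function so the guide's other commands can reuse it. This is a minimal sketch, assuming a POSIX shell; the function name `ensure_network` is hypothetical, and only `docker network inspect` and `docker network create` are actual Docker commands.

```shell
# Hypothetical helper: create a Docker network only if it does not exist yet.
ensure_network() {
  if docker network inspect "$1" >/dev/null 2>&1; then
    # inspect exits with status 0 when the network is already present
    echo "network $1 already exists"
  else
    docker network create "$1" >/dev/null && echo "network $1 created"
  fi
}
```

Calling `ensure_network cratedb-demo` twice in a row then succeeds both times instead of failing on the second run.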

133-136: Quote PWD in Linux path to handle spaces.

Unquoted ${PWD} can break when the path contains spaces; align with earlier usage of "$(pwd)".

Apply this diff:

-docker run --name=collectd --rm -it --network=cratedb-demo \
-  --volume ${PWD}/collectd-telegraf.conf:/etc/collectd/collectd.conf.d/collectd-telegraf.conf \
-  local/collectd
+docker run --name=collectd --rm -it --network=cratedb-demo \
+  --volume "$(pwd)"/collectd-telegraf.conf:/etc/collectd/collectd.conf.d/collectd-telegraf.conf \
+  local/collectd
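
To illustrate why the quoting matters, the sketch below counts the arguments a command receives in sh or bash when the working directory contains a space. The directory name and the `count_args` helper are made up for this demonstration; no container is started.

```shell
# Hypothetical directory name containing a space, for illustration only.
dir="/tmp/my project"

# Print the number of arguments received.
count_args() { echo "$#"; }

# Unquoted: the shell splits the expansion at the space, yielding an extra argument.
unquoted=$(count_args --volume ${dir}/collectd-telegraf.conf:/etc/collectd/collectd.conf.d/collectd-telegraf.conf)

# Quoted: the path stays a single argument after --volume.
quoted=$(count_args --volume "${dir}"/collectd-telegraf.conf:/etc/collectd/collectd.conf.d/collectd-telegraf.conf)

echo "unquoted: $unquoted arguments"   # prints "unquoted: 3 arguments"
echo "quoted: $quoted arguments"       # prints "quoted: 2 arguments"
```

In the unquoted case, `docker run` would receive a truncated volume specification followed by a stray argument, which is the failure the suggestion avoids.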

84-87: Avoid backticks in reference-style link labels.

Backticks in the label can cause resolution issues. Keep code formatting in link text but use a simple label ID.

Apply these diffs:

- To send the collected data to Telegraf, configure collectd by loading its
-[`network` plugin] and supplying settings. Store this file under
+ To send the collected data to Telegraf, configure collectd by loading its
+[`network` plugin][collectd-network-plugin] and supplying settings. Store this file under

-[`network` plugin]: https://collectd.org/documentation/manpages/collectd.conf.html#plugin-network
+[collectd-network-plugin]: https://collectd.org/documentation/manpages/collectd.conf.html#plugin-network

Also applies to: 175-178


52-55: Pin Docker image tags for reproducibility.

Use stable image tags in the docs to avoid silent breakages: docker.io/crate:5.10.10 and influxdata/telegraf:1.35.4.

Locations: docs/integrate/collectd/usage-telegraf.md — lines 52-55 (also apply to 102-105, 118-121)

Suggested replacements:

  • --env=CRATE_HEAP_SIZE=2g docker.io/crate -> --env=CRATE_HEAP_SIZE=2g docker.io/crate:5.10.10
  • docker.io/telegraf -> influxdata/telegraf:1.35.4
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ad56396 and ea5e730.

📒 Files selected for processing (8)
  • docs/ingest/telemetry/index.md (1 hunks)
  • docs/integrate/collectd/Dockerfile (1 hunks)
  • docs/integrate/collectd/collectd-cratedb.conf (1 hunks)
  • docs/integrate/collectd/collectd-telegraf.conf (1 hunks)
  • docs/integrate/collectd/index.md (1 hunks)
  • docs/integrate/collectd/telegraf.conf (1 hunks)
  • docs/integrate/collectd/usage-collectd.md (1 hunks)
  • docs/integrate/collectd/usage-telegraf.md (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • docs/integrate/collectd/usage-collectd.md
  • docs/integrate/collectd/index.md
🚧 Files skipped from review as they are similar to previous changes (5)
  • docs/integrate/collectd/Dockerfile
  • docs/ingest/telemetry/index.md
  • docs/integrate/collectd/collectd-telegraf.conf
  • docs/integrate/collectd/collectd-cratedb.conf
  • docs/integrate/collectd/telegraf.conf
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs
🔇 Additional comments (1)
docs/integrate/collectd/usage-telegraf.md (1)

1-3: Solid structure and Sphinx/MyST usage.

Tabs, literalinclude, and anchors look consistent and should build cleanly.

@amotl amotl merged commit 3fe5d7f into main Sep 30, 2025
3 checks passed
@amotl amotl deleted the collectd branch September 30, 2025 14:51