@amotl amotl commented Aug 9, 2025

About

Following up on the previous layout patch GH-238, this one adds more guidance to the ETL and CDC category pages in the form of reasonable groupings that try to strike a balance, using card design elements to contain related items.

To this end, the category pages etl and cdc of the new “Ingest” section no longer contain any “walls of links”, and instead offer better guidance in the form of labeled cards.

Details

Along the way, this patch also adds the missing integration items for aws-lambda and n8n.

Preview

References

@amotl added the sanding-1200 (Fine sanding.) and guidance (Matters of layout, shape, and structure.) labels on Aug 9, 2025

coderabbitai bot commented Aug 9, 2025

Warning

Rate limit exceeded

@amotl has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 14 minutes and 40 seconds before requesting another review.

📥 Commits

Reviewing files that changed from the base of the PR and between 7264bf9 and 176cc1e.

📒 Files selected for processing (16)
  • docs/_include/links.md (2 hunks)
  • docs/_include/styles.html (1 hunks)
  • docs/connect/drivers.md (1 hunks)
  • docs/connect/index.md (3 hunks)
  • docs/ingest/cdc/index.md (1 hunks)
  • docs/ingest/etl/index.md (2 hunks)
  • docs/ingest/index.md (1 hunks)
  • docs/ingest/telemetry/index.md (1 hunks)
  • docs/integrate/apache-airflow/index.md (2 hunks)
  • docs/integrate/apache-iceberg/index.md (2 hunks)
  • docs/integrate/aws-lambda/index.md (1 hunks)
  • docs/integrate/azure-functions/index.md (1 hunks)
  • docs/integrate/azure-functions/learn.rst (3 hunks)
  • docs/integrate/index.md (2 hunks)
  • docs/integrate/marquez/index.md (1 hunks)
  • docs/integrate/n8n/index.md (1 hunks)

Walkthrough

This update restructures and expands the CrateDB documentation related to integrations, connectors, and ETL tools. It introduces new documentation files for AWS Lambda and n8n, reorganizes and categorizes ETL and CDC integration pages with grid layouts, updates styling with new CSS, and adds or adjusts external link references. Minor textual and markup refinements are included throughout.

Changes

| Cohort | File(s) | Change Summary |
|---|---|---|
| Link References Update | docs/_include/links.md | Added new markdown link references for Apache Airflow, Astronomer, AWS Lambda, and DynamoDB; replaced some direct URLs with internal inventory references. |
| Styling Enhancements | docs/_include/styles.html | Added .ul-li-wide CSS rule for vertical spacing between list items in tiled link overview pages. |
| Connect Page Update | docs/connect/index.md | Included styles snippet, applied new CSS class to a grid card, changed card icon, and rephrased descriptions. |
| CDC Integrations Restructure | docs/ingest/cdc/index.md | Reorganized CDC integrations into "Connectors" and "Platforms" with cards, added descriptions and categorized grids. |
| ETL Integrations Overhaul | docs/ingest/etl/index.md | Expanded and categorized ETL integrations, introduced grid layouts, updated references, and included new tools. |
| Grid Styling Adjustment | docs/ingest/index.md | Added a class attribute to the grid container for styling. |
| Telemetry Text Edit | docs/ingest/telemetry/index.md | Reworded introduction to use active voice. |
| Airflow Integration Structure | docs/integrate/apache-airflow/index.md | Added external links include, wrapped intro in a div, and removed redundant link reference. |
| Iceberg Integration Intro | docs/integrate/apache-iceberg/index.md | Added introductory sentence and reference link for Apache Iceberg. |
| AWS Lambda Integration (New) | docs/integrate/aws-lambda/index.md | Added new documentation for AWS Lambda integration, with description and resource links. |
| Azure Functions Tagline Move | docs/integrate/azure-functions/index.md | Moved tagline outside of the "About" rubric for improved structure. |
| Azure Functions Capitalization | docs/integrate/azure-functions/learn.rst | Standardized capitalization of "IoT Hub" throughout the document. |
| Integration Index Update | docs/integrate/index.md | Reordered and added entries for aws-lambda and n8n integrations. |
| Marquez Integration Link Removal | docs/integrate/marquez/index.md | Removed "Apache Airflow" link reference at the end. |
| n8n Integration (New) | docs/integrate/n8n/index.md | Added new documentation for n8n integration, with description and resource links. |
| StreamSets Minor Edit | docs/integrate/streamsets/index.md | Added a blank line before external link reference for formatting. |
| Drivers Doc Link Fix | docs/connect/drivers.md | Replaced markdown anchor link with Sphinx cross-reference for build status link. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Docs
    participant ExternalResource

    User->>Docs: Access integration documentation
    Docs->>Docs: Render categorized grids (e.g., ETL, CDC, integrations)
    Docs->>ExternalResource: Provide external links (e.g., AWS Lambda, n8n, Airflow)
    User->>ExternalResource: Follow links for more details

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

Suggested labels

enhancement, new content

Suggested reviewers

  • hammerhead
  • bmunkholm

Poem

In burrows deep, I hop and write,
With grids and cards, I bring delight.
New links and docs for all to see,
AWS Lambda, n8n—hoppily!
With styles refreshed and pages neat,
This rabbit’s work is quite a treat.
🐇✨

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🔭 Outside diff range comments (1)
docs/integrate/apache-airflow/index.md (1)

18-31: Remove duplicate reference labels for Apache Airflow and Astronomer

The local definitions at the bottom of docs/integrate/apache-airflow/index.md overlap with those in docs/_include/links.md, which will trigger Sphinx redefinition warnings. Please remove the duplicated labels:

  • Apache Airflow
  • Astronomer

Suggested diff:

@@ docs/integrate/apache-airflow/index.md:92-103
-[Apache Airflow]: https://airflow.apache.org
-[Astronomer]: https://www.astronomer.io
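
To spot such overlaps before the docs build, a small check can compare the reference definitions of a page with those of the shared include. The following is a minimal sketch, not part of the PR or its tooling; the helper names `link_labels` and `duplicate_labels` and the sample excerpts are hypothetical, and labels are compared case-insensitively as an assumption about how the Markdown parser treats them.

```python
import re

# Matches a Markdown link reference definition, e.g. `[Label]: https://example.org`.
LINK_DEF = re.compile(r"^\[([^\]]+)\]:\s*(\S+)", re.MULTILINE)


def link_labels(markdown_text: str) -> dict[str, str]:
    """Map each reference label (lowercased) to its link target."""
    return {label.lower(): target for label, target in LINK_DEF.findall(markdown_text)}


def duplicate_labels(page_text: str, include_text: str) -> list[str]:
    """Return the labels defined in both the page and the shared include."""
    return sorted(link_labels(page_text).keys() & link_labels(include_text).keys())


# Hypothetical excerpts standing in for docs/_include/links.md and the Airflow page.
shared_links = (
    "[Apache Airflow]: https://airflow.apache.org/\n"
    "[Astronomer]: https://www.astronomer.io/\n"
)
airflow_page = (
    "Intro text.\n\n"
    "[Apache Airflow]: https://airflow.apache.org\n"
    "[Astronomer]: https://www.astronomer.io\n"
    "[Other]: https://example.org\n"
)

print(duplicate_labels(airflow_page, shared_links))
```

In a real checkout, the two strings would be read from the files named above, for example with `pathlib.Path(...).read_text()`.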
🧹 Nitpick comments (12)
docs/_include/links.md (2)

3-4: Centralized link refs are fine; silence MD053 for this include file.

This file intentionally hosts shared reference definitions. markdownlint MD053 false-positives can be silenced locally.

Apply a local rule disable at the top:

+<!-- markdownlint-disable MD053 -->
 [Amazon DynamoDB Streams]: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html
 [Amazon Kinesis Data Streams]: https://docs.aws.amazon.com/streams/latest/dev/introduction.html
 [Apache Airflow]: https://airflow.apache.org/
 [Astronomer]: https://www.astronomer.io/
 [AWS Database Migration Service (AWS DMS)]: https://aws.amazon.com/dms/
 [AWS DMS Integration with CrateDB]: https://cratedb-toolkit.readthedocs.io/io/dms/
 [AWS Lambda]: https://aws.amazon.com/lambda/
 ...
 [DynamoDB]: https://aws.amazon.com/dynamodb/

Also applies to: 7-7, 23-23


24-26: Consider using intersphinx refs for Toolkit pages (if labels exist).

For consistency with MongoDB links using inv:ctk, consider switching these to inv:ctk labels, provided they exist in cratedb-toolkit.

If labels exist, suggested change:

-[DynamoDB CDC Relay]: https://cratedb-toolkit.readthedocs.io/io/dynamodb/cdc.html
-[DynamoDB CDC Relay with AWS Lambda]: https://cratedb-toolkit.readthedocs.io/io/dynamodb/cdc-lambda.html
-[DynamoDB Table Loader]: https://cratedb-toolkit.readthedocs.io/io/dynamodb/loader.html
+[DynamoDB CDC Relay]: inv:ctk:*:label#dynamodb-cdc-relay
+[DynamoDB CDC Relay with AWS Lambda]: inv:ctk:*:label#dynamodb-cdc-lambda
+[DynamoDB Table Loader]: inv:ctk:*:label#dynamodb-loader
docs/integrate/azure-functions/learn.rst (1)

366-368: Good addition: local testing path via VSCode IoT Hub extension

Nice, actionable guidance. Optionally, consider adding a brief note that D2C messages must match the example schema (id/type/location/timestamp) to exercise the function end-to-end.

docs/integrate/aws-lambda/index.md (2)

11-14: Minor copy edit for clarity

Consider tightening the last sentence.

-[AWS Lambda] is a serverless compute service that runs your code in response to
-events and automatically manages the underlying compute resources for you. These
-events may include changes in state or an update.
+[AWS Lambda] is a serverless compute service that runs your code in response to
+events and automatically manages the underlying compute resources for you.
+Events can include state changes and updates.

20-23: Verify link reference labels exist in _include/links.md

Ensure the following reference labels are defined in /_include/links.md, otherwise Sphinx will emit warnings:

  • [AWS Lambda]
  • [DynamoDB CDC Relay with AWS Lambda]
  • [Replicating CDC events from DynamoDB to CrateDB]

I can add any missing entries to /_include/links.md if you share the target URLs.
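
The check described above can be approximated with a short script. This is a rough sketch under stated assumptions: it only recognizes shortcut references such as `[AWS Lambda]` and full references such as `[text][label]`, the helper name `undefined_labels` and the sample text are hypothetical, and Markdown task-list boxes like `[ ]` would be misread as references.

```python
import re

# A link reference definition, e.g. `[AWS Lambda]: https://aws.amazon.com/lambda/`.
LINK_DEF = re.compile(r"^\[([^\]]+)\]:\s*\S+", re.MULTILINE)

# A bracketed label that is not followed by `(` (inline link), `[` (link text of
# a full reference), or `:` (a definition line).
LINK_USE = re.compile(r"\[([^\]\[]+)\](?![\(\[:])")


def undefined_labels(page_text: str, include_text: str) -> list[str]:
    """Return labels used on the page that neither the page nor the include defines."""
    defined = {label.lower() for label in LINK_DEF.findall(page_text + "\n" + include_text)}
    used = {label.lower() for label in LINK_USE.findall(page_text)}
    return sorted(used - defined)


# Hypothetical excerpts; the real include file defines many more labels.
links_include = "[AWS Lambda]: https://aws.amazon.com/lambda/\n"
lambda_page = (
    "[AWS Lambda] is a serverless compute service.\n\n"
    "- [DynamoDB CDC Relay with AWS Lambda]\n"
    "- [Replicating CDC events from DynamoDB to CrateDB]\n"
)

print(undefined_labels(lambda_page, links_include))
```

An empty result would indicate that every label used on the page resolves, at least within the limits of these two patterns.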

docs/integrate/apache-airflow/index.md (1)

12-17: Accessibility nit: add alt text for logos

The two images use empty alt text. Consider adding concise alt attributes for better accessibility.

-[![](https://.../Apache-Airflow-Logo-392x140.png){w=180px}](https://airflow.apache.org/)
+[![Apache Airflow logo](https://.../Apache-Airflow-Logo-392x140.png){w=180px}](https://airflow.apache.org/)

-[![](https://logowik.com/content/uploads/images/astronomer2824.jpg){w=180px}](https://www.astronomer.io/)
+[![Astronomer logo](https://logowik.com/content/uploads/images/astronomer2824.jpg){w=180px}](https://www.astronomer.io/)

Also applies to: 18-31
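
For pages with many logos, images with empty alt text can be listed mechanically. A minimal sketch, assuming images use the inline `![alt](url)` form shown in the diff; the function name `images_without_alt` and the example URLs are hypothetical.

```python
import re

# An inline image whose alt text is empty or whitespace only: `![](url ...)`.
EMPTY_ALT_IMAGE = re.compile(r"!\[\s*\]\(([^)\s]+)")


def images_without_alt(markdown_text: str) -> list[str]:
    """Return the URLs of inline images that carry no alt text."""
    return EMPTY_ALT_IMAGE.findall(markdown_text)


# Hypothetical sample mirroring the linked-logo markup from the suggestion above.
sample = (
    "[![](https://example.org/logo-a.png){w=180px}](https://example.org/)\n"
    "[![Example logo](https://example.org/logo-b.png){w=180px}](https://example.org/)\n"
)

print(images_without_alt(sample))
```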

docs/integrate/n8n/index.md (3)

1-6: Consider including shared links/styles for consistency

Other integration pages include /_include/links.md (and sometimes /_include/styles.html) at the top. Adding them keeps pages consistent and enables shared references as content grows.

(n8n)=
 # n8n

+:::{include} /_include/links.md
+:::
+
+:::{include} /_include/styles.html
+:::

7-9: Tighten wording

“connect any app with an API with any other” reads awkwardly. Suggest rephrase.

-[n8n] is a fair-code licensed workflow automation tool that combines AI capabilities
-with business process automation. It helps you to connect any app with an API with
-any other, and manipulate its data with little or no code.
+[n8n] is a fair-code licensed workflow automation tool that combines AI capabilities
+with business process automation. It helps you connect any API‑enabled app to another
+and manipulate data with little or no code.

14-16: Optional: convert bare URLs to named references

To match the rest of the guide, consider using named references (defined locally or in /_include/links.md) instead of bare URLs.

docs/ingest/cdc/index.md (1)

13-17: Minor wording tweak for concision

Replace “Please also take a look …” with a more concise alternative.

-Please also take a look at support for {ref}`generic ETL <etl>` solutions.
+Additionally, see {ref}`generic ETL <etl>` solutions.
docs/ingest/etl/index.md (2)

17-21: Minor wording tweak for concision

Shorten “Please also take a look …” per writing style guidance.

-Please also take a look at support for {ref}`cdc` solutions.
+Additionally, see {ref}`cdc` solutions.

162-167: Concise phrasing

“A variety of sources” can be shortened without loss of meaning.

-  to build streaming, batch and change-data-capture (CDC) pipelines that can ingest
-  and transform data from a variety of sources.
+  to build streaming, batch, and change‑data‑capture (CDC) pipelines that can ingest
+  and transform data from many sources.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9efca51 and 1b5d8c6.

📒 Files selected for processing (15)
  • docs/_include/links.md (2 hunks)
  • docs/_include/styles.html (1 hunks)
  • docs/connect/index.md (3 hunks)
  • docs/ingest/cdc/index.md (1 hunks)
  • docs/ingest/etl/index.md (2 hunks)
  • docs/ingest/index.md (1 hunks)
  • docs/ingest/telemetry/index.md (1 hunks)
  • docs/integrate/apache-airflow/index.md (3 hunks)
  • docs/integrate/apache-iceberg/index.md (2 hunks)
  • docs/integrate/aws-lambda/index.md (1 hunks)
  • docs/integrate/azure-functions/index.md (1 hunks)
  • docs/integrate/azure-functions/learn.rst (3 hunks)
  • docs/integrate/index.md (2 hunks)
  • docs/integrate/marquez/index.md (1 hunks)
  • docs/integrate/n8n/index.md (1 hunks)
🧰 Additional context used
🧠 Learnings (12)
📚 Learning: 2025-08-09T16:19:43.947Z
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.947Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.

Applied to files:

  • docs/ingest/telemetry/index.md
  • docs/integrate/azure-functions/index.md
  • docs/_include/links.md
  • docs/ingest/index.md
  • docs/integrate/index.md
  • docs/integrate/aws-lambda/index.md
  • docs/integrate/apache-airflow/index.md
  • docs/integrate/n8n/index.md
  • docs/connect/index.md
  • docs/ingest/cdc/index.md
  • docs/integrate/azure-functions/learn.rst
  • docs/ingest/etl/index.md
📚 Learning: 2025-08-09T10:59:42.048Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.048Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.

Applied to files:

  • docs/ingest/telemetry/index.md
  • docs/ingest/index.md
  • docs/integrate/apache-iceberg/index.md
  • docs/connect/index.md
  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
📚 Learning: 2025-08-09T11:00:03.537Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/index.md:312-316
Timestamp: 2025-08-09T11:00:03.537Z
Learning: In the CrateDB Guide documentation, locally stale references (such as `(reference-architectures)=`, `(domain)=`, and similar anchor labels) should be retained even when they appear unused within the repository, as external projects might still reference them. This ensures backward compatibility for external documentation and projects.

Applied to files:

  • docs/ingest/telemetry/index.md
  • docs/integrate/marquez/index.md
  • docs/ingest/index.md
  • docs/integrate/apache-iceberg/index.md
  • docs/integrate/apache-airflow/index.md
  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
📚 Learning: 2025-05-18T12:50:36.393Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:22-33
Timestamp: 2025-05-18T12:50:36.393Z
Learning: In the CrateDB Guide repository, the prefix `ctk:` in documentation links (like `ctk:query/mcp/server`) is an intersphinx reference that points to external content in the CrateDB Toolkit documentation at https://cratedb-toolkit.readthedocs.io/. These references are intentionally not pointing to local files within the repository.

Applied to files:

  • docs/ingest/telemetry/index.md
  • docs/_include/links.md
  • docs/connect/index.md
  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
📚 Learning: 2025-05-18T12:50:38.681Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:8-20
Timestamp: 2025-05-18T12:50:38.681Z
Learning: In the CrateDB guide repository, references with the `ctk:` prefix (like `ctk:query/mcp/landscape`) are intersphinx references that link to resources in the cratedb-toolkit repository (https://github.com/crate/cratedb-toolkit/tree/main/doc), which are rendered at https://cratedb-toolkit.readthedocs.io/. These are valid cross-references between separate Sphinx documentation sets, not local file references.

Applied to files:

  • docs/_include/links.md
  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
📚 Learning: 2025-05-18T13:25:11.786Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:6-8
Timestamp: 2025-05-18T13:25:11.786Z
Learning: In the CrateDB Guide documentation, particularly for MCP-related pages, the author prefers a 1-column grid layout for multiple cards to stack them vertically rather than side by side, as this is an intentional design choice.

Applied to files:

  • docs/ingest/index.md
  • docs/connect/index.md
  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
📚 Learning: 2025-08-09T11:03:33.164Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/ingest/etl/index.md:22-23
Timestamp: 2025-08-09T11:03:33.164Z
Learning: In the CrateDB Guide documentation, MyST Markdown files can use `{ref}` directives to reference labels defined in reStructuredText files using the RST syntax `.. _label:`. This cross-format referencing between MyST and RST is supported by Sphinx.

Applied to files:

  • docs/ingest/index.md
📚 Learning: 2025-08-05T07:14:57.416Z
Learnt from: hammerhead
PR: crate/cratedb-guide#221
File: docs/connect/configure.md:58-66
Timestamp: 2025-08-05T07:14:57.416Z
Learning: In CrateDB connection strings, the user:password@ syntax is valid for HTTP Basic authentication on port 4200, but PostgreSQL JDBC drivers do not support this format and require credentials as query parameters (?user=<user>&password=<password>) instead.

Applied to files:

  • docs/connect/index.md
📚 Learning: 2025-06-05T14:29:15.512Z
Learnt from: amotl
PR: crate/cratedb-guide#207
File: docs/integrate/etl/iceberg-risingwave.md:205-207
Timestamp: 2025-06-05T14:29:15.512Z
Learning: The `records.Database("crate://", echo=True)` connection string for CrateDB works with defaults: localhost as host, "crate" as user, and blank password. This is valid and functional code in the records library.

Applied to files:

  • docs/connect/index.md
📚 Learning: 2025-08-08T16:50:14.965Z
Learnt from: amotl
PR: crate/cratedb-guide#234
File: docs/home/index.md:47-50
Timestamp: 2025-08-08T16:50:14.965Z
Learning: In the CrateDB Guide docs (MyST), the correct intersphinx target for the CrateDB Cloud documentation homepage is `cloud:index` (not `cloud:docs-index` or `cloud-docs-index`). Use `:link: cloud:index` on cards/links. The `cloud` mapping is inherited via crate-docs-theme.

Applied to files:

  • docs/connect/index.md
  • docs/ingest/cdc/index.md
📚 Learning: 2025-08-06T02:11:46.794Z
Learnt from: amotl
PR: crate/cratedb-guide#222
File: docs/integrate/powerbi/index.md:16-23
Timestamp: 2025-08-06T02:11:46.794Z
Learning: The `project:` role in MyST Markdown syntax like `[](project:#powerbi-desktop)` and `[](project:#powerbi-service)` is valid cross-reference syntax in the CrateDB Guide repository's MyST Markdown setup.

Applied to files:

  • docs/ingest/etl/index.md
🪛 markdownlint-cli2 (0.17.2)
docs/_include/links.md

3-3: Link and image reference definitions should be needed
Unused link or image reference definition: "apache airflow"

(MD053, link-image-reference-definitions)


4-4: Link and image reference definitions should be needed
Unused link or image reference definition: "astronomer"

(MD053, link-image-reference-definitions)


5-5: Link and image reference definitions should be needed
Unused link or image reference definition: "aws database migration service (aws dms)"

(MD053, link-image-reference-definitions)


6-6: Link and image reference definitions should be needed
Unused link or image reference definition: "aws dms integration with cratedb"

(MD053, link-image-reference-definitions)


7-7: Link and image reference definitions should be needed
Unused link or image reference definition: "aws lambda"

(MD053, link-image-reference-definitions)


23-23: Link and image reference definitions should be needed
Unused link or image reference definition: "dynamodb"

(MD053, link-image-reference-definitions)

🪛 LanguageTool
docs/ingest/cdc/index.md

[style] ~15-~15: The word “also” tends to be overused. Consider using a formal alternative to strengthen your wording.
Context: ...and outlines how to use them optimally. Please also take a look at support for {ref}`generi...

(PLEASE_ALSO_CHECK)


[style] ~92-~92: Consider using a more concise synonym.
Context: ...gine can ingest and transform data from a variety of sources. It runs on-premises or in any ...

(A_VARIETY_OF)

docs/ingest/etl/index.md

[style] ~19-~19: The word “also” tends to be overused. Consider using a formal alternative to strengthen your wording.
Context: ...and outlines how to use them optimally. Please also take a look at support for {ref}cdc s...

(PLEASE_ALSO_CHECK)


[style] ~166-~166: Consider using a more concise synonym.
Context: ...at can ingest and transform data from a variety of sources. +++ Load data from streaming ...

(A_VARIETY_OF)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs
🔇 Additional comments (18)
docs/integrate/marquez/index.md (1)

260-260: AI summary inconsistency: no visible removal of [Apache Airflow] def in this file.

The summary mentions removing the Apache Airflow link def here; the diff only tweaks the Docker docs link line. If centralizing link refs is intended, this page already includes /_include/links.md at the top, so all good.

Likely an incorrect or invalid review comment.

docs/integrate/azure-functions/index.md (1)

7-8: Tagline relocation LGTM.

Improves scannability while keeping the canonical anchor in index.md per the restructuring pattern.

docs/ingest/index.md (1)

13-13: Class addition is correct; depends on CSS fix to take effect.

The :class-container: ul-li-wide will work once the ul-li-wide selector is fixed to non-nested CSS.

After adjusting styles.html as suggested, verify visually that list items get extra spacing in this grid.

docs/integrate/azure-functions/learn.rst (2)

3-5: Title capitalization looks correct and consistent with Azure terminology

“IoT Hub” capitalization is now standardized; aligns with style used elsewhere. Anchor suffix “-learn” matches the documented pattern.


234-234: Terminology fix LGTM

“IoT Hub” wording in the Built-in endpoints reference looks correct.

docs/integrate/apache-iceberg/index.md (2)

7-8: Concise intro improves context

The new one-liner clearly sets scope without redundancy.


22-22: Link reference addition is correct

Reference label resolves neatly; consistent with style elsewhere.

docs/ingest/telemetry/index.md (1)

7-7: Improved phrasing

Switching to active voice reads better and is clearer.

docs/integrate/index.md (2)

29-31: AWS and Azure ordering: confirm intended grouping

Placing aws-lambda before azure-functions groups AWS items together after other aws-* entries. If alphabetical ordering is the goal, azure-functions would come earlier; if ecosystem grouping is intended, this is fine—just confirm consistency across the list.


48-49: n8n entry added in the right spot

Insertion between mysql and node-red matches alphabetical ordering; ensure the new page exposes the (n8n)= anchor in its index.md for cross-refs.

docs/integrate/aws-lambda/index.md (1)

1-5: Anchor and shared links include are correct

Top-level anchor (aws-lambda)= and inclusion of shared link refs follow established patterns.

docs/integrate/apache-airflow/index.md (1)

6-8: Good move: centralizing link references

Including /_include/links.md improves consistency across integration pages. No blocking issues here.

docs/connect/index.md (3)

6-7: Consistent styling include

Including /_include/styles.html here aligns with changes across ingest pages and enables .ul-li-wide. Good change.


75-88: Correct usage of card body class

Applying :class-body: ul-li-wide is the right place to widen list spacing inside this card body.


90-99: Icon change to link makes sense

“How to connect” semantically fits the link icon. No issues.

docs/ingest/cdc/index.md (1)

20-96: All connector and platform anchors are present

Each {ref} target used on this page has a corresponding anchor defined in its integration docs (line 1 of each file), so there are no missing anchors:

  • docs/integrate/aws-dynamodb/index.md:1 (aws-dynamodb)=
  • docs/integrate/aws-kinesis/index.md:1 (aws-kinesis)=
  • docs/integrate/mongodb/index.md:1 (mongodb)=
  • docs/integrate/aws-dms/index.md:1 (aws-dms)=
  • docs/integrate/debezium/index.md:1 (debezium)=
  • docs/integrate/estuary/index.md:1 (estuary)=
  • docs/integrate/risingwave/index.md:1 (risingwave)=
  • docs/integrate/streamsets/index.md:1 (streamsets)=

No further action required.

docs/ingest/etl/index.md (2)

8-9: Consistent styling include

Including /_include/styles.html is aligned with the new grid layout and spacing conventions.


27-201: All {ref} targets verified
A verification script confirmed that every {ref} label in docs/ingest/etl/index.md has a corresponding anchor defined in its integration index file (e.g., docs/integrate/apache-airflow/index.md, docs/integrate/dbt/index.md, etc.). No missing anchors were detected—no changes required.
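
The verification script itself is not shown in this comment. A comparable check could look like the sketch below, which is an assumption about the approach rather than the script that was run: it collects `{ref}` targets from a page and compares them with MyST anchors of the form `(label)=` and RST anchors of the form `.. _label:`. All function names and sample contents are hypothetical.

```python
import re

# `{ref}` role in MyST, either {ref}`label` or {ref}`link text <label>`.
REF_ROLE = re.compile(r"\{ref\}`(?:[^`<]*<)?([^`>]+)>?`")
# MyST anchor on its own line: `(label)=`.
MYST_ANCHOR = re.compile(r"^\(([^)]+)\)=\s*$", re.MULTILINE)
# reStructuredText anchor on its own line: `.. _label:`.
RST_ANCHOR = re.compile(r"^\.\. _([^:]+):\s*$", re.MULTILINE)


def defined_labels(sources: dict[str, str]) -> set[str]:
    """Collect all anchor labels from a mapping of file path to file content."""
    labels: set[str] = set()
    for text in sources.values():
        labels.update(label.lower() for label in MYST_ANCHOR.findall(text))
        labels.update(label.lower() for label in RST_ANCHOR.findall(text))
    return labels


def missing_ref_targets(page_text: str, sources: dict[str, str]) -> list[str]:
    """Return `{ref}` targets used on the page that no source file defines."""
    used = {label.strip().lower() for label in REF_ROLE.findall(page_text)}
    return sorted(used - defined_labels(sources))


# Hypothetical file contents; a real run would read them from the docs/ tree.
sources = {
    "docs/ingest/etl/index.md": "(etl)=\n# ETL\n",
    "docs/integrate/debezium/index.md": "(debezium)=\n# Debezium\n",
    "docs/example.rst": ".. _estuary:\n\nEstuary\n=======\n",
}
page = "Use {ref}`generic ETL <etl>`, {ref}`debezium`, {ref}`estuary`, or {ref}`no-such-label`."

print(missing_ref_targets(page, sources))
```

This sketch ignores labels provided by intersphinx inventories and implicit section labels, so a non-empty result is a prompt for manual inspection rather than proof of a broken reference.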

@amotl force-pushed the etl-cdc-grouping branch from 1b5d8c6 to 997c81a on August 9, 2025 21:03

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (8)
docs/integrate/n8n/index.md (2)

8-9: Grammar: “connect any app … with any other” → “connect any app … to any other”

Small wording fix improves readability.

- with business process automation. It helps you to connect any app with an API with
- any other, and manipulate its data with little or no code.
+ with business process automation. It helps you connect any app with an API to
+ any other, and manipulate its data with little or no code.

14-15: Fix markdownlint MD034 (no bare URLs) by adding link text

Wrap the URLs in explicit links to satisfy the linter and improve accessibility.

-- https://cratedb.com/integrations/cratedb-and-n8n
-- https://n8n.io/integrations/cratedb/
+- [CrateDB and n8n integration](https://cratedb.com/integrations/cratedb-and-n8n)
+- [n8n CrateDB integration](https://n8n.io/integrations/cratedb/)
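
The rule behind this nitpick can be approximated in a few lines of Python. This is only a rough sketch, not markdownlint's implementation of MD034: the name `find_bare_urls` is hypothetical, and the pattern does not handle code spans or link reference definitions.

```python
import re

# A URL counts as "bare" here when it is not directly preceded by "(", "<" or "[",
# that is, when it is not already part of [text](url) or <url>.
BARE_URL = re.compile(r"(?<![(<\[])\bhttps?://[^\s)>\]]+")


def find_bare_urls(markdown: str) -> list[tuple[int, str]]:
    """Return (line number, URL) pairs for URLs that lack link markup."""
    hits = []
    for number, line in enumerate(markdown.splitlines(), start=1):
        for match in BARE_URL.finditer(line):
            hits.append((number, match.group(0)))
    return hits


before = "- https://n8n.io/integrations/cratedb/\n"
after = "- [n8n CrateDB integration](https://n8n.io/integrations/cratedb/)\n"
print(find_bare_urls(before))  # one hit on line 1
print(find_bare_urls(after))   # no hits
```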
docs/ingest/etl/index.md (3)

17-21: Tone/style: tighten “Please also take a look …”

Minor tightening to a more neutral, concise phrasing.

-Please also take a look at support for {ref}`cdc` solutions.
+Also see support for {ref}`cdc` solutions.

104-137: Reconsider category placement of AWS DMS under “Databases”

AWS DMS is a migration/replication service rather than a database. Consider moving it to a “Platforms”/“Data integration” section (like on the CDC page) or adjust the category label to avoid confusion.

Would you like me to propose a concrete rearrangement mirroring the CDC “Platforms” grouping for consistency?


164-166: Wording: avoid “a variety of”

More concise phrasing.

- to build streaming, batch and change-data-capture (CDC) pipelines that can ingest
- and transform data from a variety of sources.
+ to build streaming, batch, and CDC pipelines that can ingest
+ and transform data from many sources.
docs/ingest/cdc/index.md (3)

13-17: Tone/style: tighten “Please also take a look …”

Minor tightening for clarity and consistency with the ETL page.

-Please also take a look at support for {ref}`generic ETL <etl>` solutions.
+Also see support for {ref}`generic ETL <etl>` solutions.

67-73: Grammar/punctuation: Debezium description

Remove the stray comma after “Use” and hyphenate “open-source” as an adjective.

-Use, Debezium an open source distributed platform for change data capture for
-loading data into CrateDB.
-It is used as a building block by a number of downstream third-party projects and products.
+Use Debezium, an open-source distributed platform for change data capture,
+to load data into CrateDB.
+It is used as a building block by a number of downstream third-party projects and products.

95-96: Wording: avoid “a variety of”

Concise phrasing.

-Use the StreamSets Data Collector Engine to ingest and transform data from a variety
-of sources into CrateDB. It runs on-premises or in any cloud.
+Use the StreamSets Data Collector Engine to ingest and transform data from many
+sources into CrateDB. It runs on-premises or in any cloud.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1b5d8c6 and 997c81a.

📒 Files selected for processing (15)
  • docs/_include/links.md (2 hunks)
  • docs/_include/styles.html (1 hunks)
  • docs/connect/index.md (3 hunks)
  • docs/ingest/cdc/index.md (1 hunks)
  • docs/ingest/etl/index.md (2 hunks)
  • docs/ingest/index.md (1 hunks)
  • docs/ingest/telemetry/index.md (1 hunks)
  • docs/integrate/apache-airflow/index.md (3 hunks)
  • docs/integrate/apache-iceberg/index.md (2 hunks)
  • docs/integrate/aws-lambda/index.md (1 hunks)
  • docs/integrate/azure-functions/index.md (1 hunks)
  • docs/integrate/azure-functions/learn.rst (3 hunks)
  • docs/integrate/index.md (2 hunks)
  • docs/integrate/marquez/index.md (1 hunks)
  • docs/integrate/n8n/index.md (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • docs/integrate/azure-functions/learn.rst
🚧 Files skipped from review as they are similar to previous changes (10)
  • docs/integrate/azure-functions/index.md
  • docs/integrate/apache-iceberg/index.md
  • docs/integrate/index.md
  • docs/_include/styles.html
  • docs/ingest/index.md
  • docs/integrate/marquez/index.md
  • docs/integrate/aws-lambda/index.md
  • docs/integrate/apache-airflow/index.md
  • docs/ingest/telemetry/index.md
  • docs/connect/index.md
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2025-08-09T16:19:43.947Z
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.947Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.

Applied to files:

  • docs/ingest/etl/index.md
  • docs/ingest/cdc/index.md
  • docs/integrate/n8n/index.md
📚 Learning: 2025-08-09T10:59:42.048Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.048Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.

Applied to files:

  • docs/ingest/etl/index.md
  • docs/ingest/cdc/index.md
📚 Learning: 2025-08-09T11:00:03.537Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/index.md:312-316
Timestamp: 2025-08-09T11:00:03.537Z
Learning: In the CrateDB Guide documentation, locally stale references (such as `(reference-architectures)=`, `(domain)=`, and similar anchor labels) should be retained even when they appear unused within the repository, as external projects might still reference them. This ensures backward compatibility for external documentation and projects.

Applied to files:

  • docs/ingest/etl/index.md
  • docs/ingest/cdc/index.md
📚 Learning: 2025-05-18T12:50:36.393Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:22-33
Timestamp: 2025-05-18T12:50:36.393Z
Learning: In the CrateDB Guide repository, the prefix `ctk:` in documentation links (like `ctk:query/mcp/server`) is an intersphinx reference that points to external content in the CrateDB Toolkit documentation at https://cratedb-toolkit.readthedocs.io/. These references are intentionally not pointing to local files within the repository.

Applied to files:

  • docs/ingest/etl/index.md
  • docs/_include/links.md
  • docs/ingest/cdc/index.md
📚 Learning: 2025-05-18T12:50:38.681Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:8-20
Timestamp: 2025-05-18T12:50:38.681Z
Learning: In the CrateDB guide repository, references with the `ctk:` prefix (like `ctk:query/mcp/landscape`) are intersphinx references that link to resources in the cratedb-toolkit repository (https://github.com/crate/cratedb-toolkit/tree/main/doc), which are rendered at https://cratedb-toolkit.readthedocs.io/. These are valid cross-references between separate Sphinx documentation sets, not local file references.

Applied to files:

  • docs/ingest/etl/index.md
  • docs/_include/links.md
  • docs/ingest/cdc/index.md
📚 Learning: 2025-05-18T13:25:11.786Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:6-8
Timestamp: 2025-05-18T13:25:11.786Z
Learning: In the CrateDB Guide documentation, particularly for MCP-related pages, the author prefers a 1-column grid layout for multiple cards to stack them vertically rather than side by side, as this is an intentional design choice.

Applied to files:

  • docs/ingest/etl/index.md
  • docs/ingest/cdc/index.md
📚 Learning: 2025-08-06T02:11:46.794Z
Learnt from: amotl
PR: crate/cratedb-guide#222
File: docs/integrate/powerbi/index.md:16-23
Timestamp: 2025-08-06T02:11:46.794Z
Learning: The `project:` role in MyST Markdown syntax like `[](project:#powerbi-desktop)` and `[](project:#powerbi-service)` is valid cross-reference syntax in the CrateDB Guide repository's MyST Markdown setup.

Applied to files:

  • docs/ingest/etl/index.md
🪛 LanguageTool
docs/ingest/etl/index.md

[style] ~19-~19: The word “also” tends to be overused. Consider using a formal alternative to strengthen your wording.
Context: ...and outlines how to use them optimally. Please also take a look at support for {ref}cdc s...

(PLEASE_ALSO_CHECK)


[style] ~166-~166: Consider using a more concise synonym.
Context: ...at can ingest and transform data from a variety of sources. +++ Load data from streaming ...

(A_VARIETY_OF)

docs/ingest/cdc/index.md

[style] ~15-~15: The word “also” tends to be overused. Consider using a formal alternative to strengthen your wording.
Context: ...and outlines how to use them optimally. Please also take a look at support for {ref}`generi...

(PLEASE_ALSO_CHECK)


[style] ~95-~95: Consider using a more concise synonym.
Context: ...ngine to ingest and transform data from a variety of sources into CrateDB. It runs on-premis...

(A_VARIETY_OF)

🪛 markdownlint-cli2 (0.17.2)
docs/_include/links.md

3-3: Link and image reference definitions should be needed
Unused link or image reference definition: "apache airflow"

(MD053, link-image-reference-definitions)


4-4: Link and image reference definitions should be needed
Unused link or image reference definition: "astronomer"

(MD053, link-image-reference-definitions)


5-5: Link and image reference definitions should be needed
Unused link or image reference definition: "aws database migration service (aws dms)"

(MD053, link-image-reference-definitions)


6-6: Link and image reference definitions should be needed
Unused link or image reference definition: "aws dms integration with cratedb"

(MD053, link-image-reference-definitions)


7-7: Link and image reference definitions should be needed
Unused link or image reference definition: "aws lambda"

(MD053, link-image-reference-definitions)


23-23: Link and image reference definitions should be needed
Unused link or image reference definition: "dynamodb"

(MD053, link-image-reference-definitions)
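
The MD053 findings above come from checking docs/_include/links.md on its own. A rough Python sketch of such a check follows; it is not markdownlint's implementation, the names `normalize` and `unused_definitions` are hypothetical, and the example URLs are placeholders. It also illustrates that a definition only counts as used if the text that uses it is part of the input, so a per-file check cannot see usages in other pages.

```python
import re

# A reference definition looks like: [label]: https://example.org
DEFINITION = re.compile(r"^\[([^\]]+)\]:\s+\S+", re.MULTILINE)
# Usages: [text][label], or the shortcut form [label].
# The lookahead skips definitions ("[label]:") and inline links ("[text](url)").
USAGE = re.compile(r"\[([^\]]+)\](?:\[([^\]]*)\])?(?!:|\()")


def normalize(label: str) -> str:
    """Labels match case-insensitively, with whitespace collapsed."""
    return " ".join(label.lower().split())


def unused_definitions(definitions_text: str, usage_texts: list[str]) -> set[str]:
    """Return normalized labels that are defined but never used in `usage_texts`."""
    defined = {normalize(label) for label in DEFINITION.findall(definitions_text)}
    used = set()
    for text in usage_texts:
        for first, second in USAGE.findall(text):
            # For [text][label] the label is the second bracket pair.
            used.add(normalize(second or first))
    return defined - used


# Hypothetical file contents with placeholder URLs.
links = "[AWS Lambda]: https://example.org/lambda\n[DynamoDB]: https://example.org/dynamodb\n"
page = "Use [AWS Lambda] to run code without provisioning servers.\n"

print(unused_definitions(links, [links]))        # both labels look unused
print(unused_definitions(links, [links, page]))  # only "dynamodb" remains
```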

docs/integrate/n8n/index.md

14-14: Bare URL used

(MD034, no-bare-urls)


15-15: Bare URL used

(MD034, no-bare-urls)

🔇 Additional comments (3)
docs/ingest/etl/index.md (2)

8-9: Including styles for grids/cards looks good

The styles include aligns with the new card/grid layout and the use of the ul-li-wide class.


206-233: All {ref} targets are correctly defined (no action required)

The verification confirms that all 24 reference targets used in the alphabetically sorted list are properly defined with their corresponding anchor labels in the documentation:

  • All anchor labels follow the pattern (label-name)= and are located in their respective integration guide files
  • Each reference corresponds to a file in docs/integrate/[service-name]/index.md
  • The cross-references will resolve correctly when the documentation is built

The code is properly structured and all references are valid.

docs/ingest/cdc/index.md (1)

24-49: All referenced anchors exist
Verified that anchors for aws-dynamodb, aws-kinesis, mongodb, aws-dms, debezium, estuary, risingwave, and streamsets are defined in their respective docs/integrate/.../index.md files. No missing cross-reference targets.

@amotl amotl force-pushed the etl-cdc-grouping branch from 83c3bf5 to 89b423e Compare August 9, 2025 21:28
@amotl amotl force-pushed the etl-cdc-grouping branch from 89b423e to fd7285e Compare August 9, 2025 21:46
@amotl amotl marked this pull request as ready for review August 9, 2025 21:56
@amotl amotl changed the title ETL/CDC: Guidance, Layout ETL/CDC: Guidance, Layout, dissolve walls of links Aug 10, 2025
@amotl amotl changed the title ETL/CDC: Guidance, Layout, dissolve walls of links ETL/CDC: Guidance, Layout / Dissolve walls of links Aug 10, 2025
@amotl amotl added the refactoring Changing shape or layout, or moving content around. label Aug 10, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (1)
docs/ingest/etl/index.md (1)

206-233: Fix alphabetical order under “Alphabetically sorted”.

AWS items are out of order. Reorder to maintain strict alphabetical sorting.

-- {ref}`aws-dynamodb`
-- {ref}`aws-kinesis`
-- {ref}`aws-dms`
-- {ref}`aws-lambda`
+- {ref}`aws-dms`
+- {ref}`aws-dynamodb`
+- {ref}`aws-kinesis`
+- {ref}`aws-lambda`
 - {ref}`azure-functions`
🧹 Nitpick comments (2)
docs/ingest/etl/index.md (2)

37-39: Hyphenate “open-source” as a compound adjective.

Use “open-source” before a noun.

-  Apache Airflow is an open source software platform to programmatically author,
+  Apache Airflow is an open-source software platform to programmatically author,

60-64: Hyphenate “open-source” as a compound adjective.

Same here for Kestra.

-  Kestra is an open source workflow automation and orchestration toolkit with a rich
+  Kestra is an open-source workflow automation and orchestration toolkit with a rich
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 83c3bf5 and fd7285e.

📒 Files selected for processing (16)
  • docs/_include/links.md (2 hunks)
  • docs/_include/styles.html (1 hunks)
  • docs/connect/index.md (3 hunks)
  • docs/ingest/cdc/index.md (1 hunks)
  • docs/ingest/etl/index.md (2 hunks)
  • docs/ingest/index.md (1 hunks)
  • docs/ingest/telemetry/index.md (1 hunks)
  • docs/integrate/apache-airflow/index.md (2 hunks)
  • docs/integrate/apache-iceberg/index.md (2 hunks)
  • docs/integrate/aws-lambda/index.md (1 hunks)
  • docs/integrate/azure-functions/index.md (1 hunks)
  • docs/integrate/azure-functions/learn.rst (3 hunks)
  • docs/integrate/index.md (2 hunks)
  • docs/integrate/marquez/index.md (1 hunks)
  • docs/integrate/n8n/index.md (1 hunks)
  • docs/integrate/streamsets/index.md (1 hunks)
✅ Files skipped from review due to trivial changes (4)
  • docs/integrate/streamsets/index.md
  • docs/integrate/apache-iceberg/index.md
  • docs/integrate/azure-functions/learn.rst
  • docs/integrate/marquez/index.md
🚧 Files skipped from review as they are similar to previous changes (11)
  • docs/integrate/azure-functions/index.md
  • docs/integrate/index.md
  • docs/integrate/aws-lambda/index.md
  • docs/connect/index.md
  • docs/ingest/index.md
  • docs/integrate/apache-airflow/index.md
  • docs/_include/links.md
  • docs/ingest/telemetry/index.md
  • docs/_include/styles.html
  • docs/integrate/n8n/index.md
  • docs/ingest/cdc/index.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-09T16:19:43.947Z
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.947Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.

Applied to files:

  • docs/ingest/etl/index.md
🪛 LanguageTool
docs/ingest/etl/index.md

[uncategorized] ~37-~37: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...apache-airflow` Apache Airflow is an open source software platform to programmatically a...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[uncategorized] ~60-~60: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... data. - {ref}kestra Kestra is an open source workflow automation and orchestration t...

(EN_COMPOUND_ADJECTIVE_INTERNAL)

🔇 Additional comments (3)
docs/ingest/etl/index.md (3)

27-33: Structured grid/card layout looks solid.

Good use of grid-item-card, gutters, and container class to dissolve “walls of links.”


8-9: Styles include and .ul-li-wide class verified

  • docs/_include/styles.html exists and is included via :::{include} /_include/styles.html
  • The file defines the .ul-li-wide ul li selector on line 50, ensuring the styles are scoped correctly

206-233: All {ref} anchors validated – no missing links

I’ve verified that every {ref} target in docs/ingest/etl/index.md (including the newly added aws-lambda, n8n and the cdc anchor) has a corresponding definition or integration index under docs/integrate. No broken references remain.

@amotl amotl force-pushed the etl-cdc-grouping branch from fd7285e to e880a64 Compare August 10, 2025 18:28
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🔭 Outside diff range comments (1)
docs/ingest/etl/index.md (1)

206-234: Fix ordering in “Alphabetically sorted” list (violates stated rubric).

aws-dms and aws-lambda are out of order. The section claims alphabetical sorting, so please reorder the AWS entries.

 :::{div}
 - {ref}`apache-airflow`
 - {ref}`apache-flink`
 - {ref}`apache-hop`
 - {ref}`apache-iceberg`
 - {ref}`apache-kafka`
 - {ref}`apache-nifi`
-- {ref}`aws-dynamodb`
-- {ref}`aws-kinesis`
-- {ref}`aws-dms`
-- {ref}`aws-lambda`
+- {ref}`aws-dms`
+- {ref}`aws-dynamodb`
+- {ref}`aws-kinesis`
+- {ref}`aws-lambda`
 - {ref}`azure-functions`
 - {ref}`dbt`
 - {ref}`estuary`
 - {ref}`influxdb`
 - {ref}`kestra`
 - {ref}`meltano`
 - {ref}`mongodb`
 - {ref}`mysql`
 - {ref}`n8n`
 - {ref}`node-red`
 - {ref}`risingwave`
 - {ref}`sql-server`
 - {ref}`streamsets`
 :::
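
The ordering problem can be reproduced with a short Python sketch. It assumes plain string comparison of the lowercase labels is the intended sort order; the name `out_of_order` is hypothetical.

```python
def out_of_order(labels: list[str]) -> list[str]:
    """Return every label that sorts before the label directly preceding it."""
    return [cur for prev, cur in zip(labels, labels[1:]) if cur < prev]


# The AWS entries in the order they currently appear on the page.
aws_block = ["aws-dynamodb", "aws-kinesis", "aws-dms", "aws-lambda"]

print(out_of_order(aws_block))  # flags aws-dms, which sorts before aws-kinesis
print(sorted(aws_block))        # the order proposed in the diff above
```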
🧹 Nitpick comments (2)
docs/ingest/etl/index.md (2)

12-21: Tighten intro wording for clarity.

Minor copyedits to improve flow and consistency.

-Options to connect and integrate CrateDB with third-party
-ETL applications, mostly using [CrateDB's PostgreSQL interface].
-CrateDB also provides native adapter components to leverage advanced
-features.
-
-This documentation section lists ETL applications and
-frameworks which can be used together with CrateDB, and outlines how
-to use them optimally.
-Additionally, see support for {ref}`cdc` solutions.
+Options to integrate CrateDB with third‑party ETL applications, typically via
+[CrateDB's PostgreSQL interface]. CrateDB also provides native adapter components
+to leverage advanced features.
+
+This section lists ETL applications and frameworks that work with CrateDB, and
+outlines how to use them effectively. Additionally, see support for {ref}`cdc` solutions.

71-73: Unify card footers’ tone/style.

Make the two footers imperative like the others (“Load…”, “Use…”).

-Data pipeline programming frameworks and platforms.
+Use data pipeline programming frameworks and platforms.
-Visual data flow and integration frameworks and platforms.
+Use visual data flow and integration frameworks and platforms.

Also applies to: 99-101

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd7285e and e880a64.

📒 Files selected for processing (7)
  • docs/_include/links.md (2 hunks)
  • docs/_include/styles.html (1 hunks)
  • docs/ingest/cdc/index.md (1 hunks)
  • docs/ingest/etl/index.md (2 hunks)
  • docs/integrate/apache-airflow/index.md (2 hunks)
  • docs/integrate/aws-lambda/index.md (1 hunks)
  • docs/integrate/n8n/index.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • docs/_include/styles.html
  • docs/integrate/aws-lambda/index.md
  • docs/integrate/n8n/index.md
  • docs/_include/links.md
  • docs/integrate/apache-airflow/index.md
  • docs/ingest/cdc/index.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-09T16:19:43.947Z
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.947Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.

Applied to files:

  • docs/ingest/etl/index.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs
🔇 Additional comments (1)
docs/ingest/etl/index.md (1)

88-92: All references resolve correctly
The anchors (n8n)=, (aws-lambda)=, and (aws-dms)= are defined in their respective docs/integrate/.../index.md files, and the “CrateDB’s PostgreSQL interface” link exists in docs/_include/links.md. No further changes required.

Comment on lines +8 to +9
:::{include} /_include/styles.html
:::

💡 Verification agent

🧩 Analysis chain

Confirm styling and icon support are globally configured.

Including styles at page level is fine, but consider centralizing if reused broadly. Also ensure the Sphinx extensions for grids/cards and material icons are enabled.

Also applies to: 34-35


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Checking styles include exists =="
test -f docs/_include/styles.html && echo "Found: docs/_include/styles.html" || { echo "Missing styles include"; exit 1; }

echo "== Checking Sphinx extensions for design/cards/icons =="
# Look for sphinx_design and material/icon support in conf.py
fd -a conf.py | xargs rg -n "sphinx_design|material|icons|sphinx_panels" -A2 || {
  echo "Could not find expected extensions in conf.py. Verify grids/cards/icons are enabled."; exit 1;
}

echo "Checks completed."

Length of output: 601


Add missing Sphinx extensions for design components and icons

The styles.html include is present (docs/_include/styles.html), but your Sphinx configuration isn’t loading the extensions needed to render grids/cards or Material icons. Please update your docs/conf.py:

  • In the extensions list, add one or more of:
    • sphinx_design (for grid and card layouts)
    • sphinx_panels (if you’re using panels)
    • the appropriate Material-icon extension you’ve installed (e.g. sphinx_materialdesign_icon or similar)

Also, since you include these styles in multiple spots (lines 8–9 and 34–35 of docs/ingest/etl/index.md), consider moving the include into a global template or your main TOC page so it’s applied consistently across all docs.

🤖 Prompt for AI Agents
In docs/ingest/etl/index.md around lines 8 to 9, the styles.html include is used
but the Sphinx configuration in docs/conf.py lacks the necessary extensions to
render design components and Material icons properly. To fix this, update
docs/conf.py by adding 'sphinx_design', 'sphinx_panels' (if panels are used),
and the appropriate Material icon extension (e.g., 'sphinx_materialdesign_icon')
to the extensions list. Additionally, to avoid repeated includes, move the
styles.html include from this file and line 34-35 to a global template or the
main TOC page so the styles apply consistently across all documentation pages.

Member Author

@amotl amotl Aug 10, 2025

@coderabbitai: All relevant details are loaded by the theme which is used here. It is crate-docs-theme.

@amotl amotl force-pushed the etl-cdc-grouping branch from e880a64 to 7264bf9 Compare August 10, 2025 20:28
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🔭 Outside diff range comments (1)
docs/ingest/etl/index.md (1)

206-233: Add the missing “cdc” entry and fix the AWS section ordering

The alphabetical list in docs/ingest/etl/index.md currently omits {ref}`cdc`, and the AWS entries are out of order. Update the block to:

  • Insert {ref}`cdc` immediately after {ref}`azure-functions`
  • Reorder the AWS lines so “aws-dms” comes before “aws-dynamodb” and “aws-kinesis”

Suggested diff (lines 206–233):

 :::{div}
 - {ref}`apache-airflow`
  …  
-- {ref}`aws-dynamodb`
-- {ref}`aws-kinesis`
-- {ref}`aws-dms`
+- {ref}`aws-dms`
+- {ref}`aws-dynamodb`
+- {ref}`aws-kinesis`
 - {ref}`aws-lambda`
 - {ref}`azure-functions`
+- {ref}`cdc`
 - {ref}`dbt`
  …  
 :::

Also, please confirm that each {ref} target (including cdc) has a corresponding Sphinx/MyST label (.. _<label>: or (label)=) elsewhere in the docs.

♻️ Duplicate comments (1)
docs/ingest/etl/index.md (1)

8-9: Theme-provided extensions acknowledged; include looks good

Acknowledging the earlier thread: if crate-docs-theme provides the directives/icons, no extra Sphinx extensions are needed. The local styles include is fine here.

🧹 Nitpick comments (2)
docs/ingest/etl/index.md (2)

12-21: Consider cross-linking “native adapter components”

If there’s a canonical page for CrateDB’s native adapters, add a {ref} link to improve discoverability. Otherwise, keeping it as-is is fine.


192-201: Datasets section OK; consider future additions

Optional: add Apache Hudi and Delta Lake when/if relevant to CrateDB workflows.

Happy to prepare a follow-up patch adding these entries if desired.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e880a64 and 7264bf9.

📒 Files selected for processing (8)
  • docs/_include/links.md (2 hunks)
  • docs/_include/styles.html (1 hunks)
  • docs/connect/drivers.md (1 hunks)
  • docs/ingest/cdc/index.md (1 hunks)
  • docs/ingest/etl/index.md (2 hunks)
  • docs/integrate/apache-airflow/index.md (2 hunks)
  • docs/integrate/aws-lambda/index.md (1 hunks)
  • docs/integrate/n8n/index.md (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • docs/connect/drivers.md
🚧 Files skipped from review as they are similar to previous changes (6)
  • docs/integrate/aws-lambda/index.md
  • docs/_include/styles.html
  • docs/integrate/n8n/index.md
  • docs/_include/links.md
  • docs/ingest/cdc/index.md
  • docs/integrate/apache-airflow/index.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-09T16:19:43.947Z
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.947Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.

Applied to files:

  • docs/ingest/etl/index.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs
🔇 Additional comments (8)
docs/ingest/etl/index.md (8)

34-73: Dataflow/Pipeline section reads well and is consistent

Content, structure, and footer summary are clear.


76-101: Low-code/Visual section looks consistent; new n8n entry included

Good addition and concise descriptions.


104-136: Databases section is clear and balanced

Descriptions are accurate and scoped correctly.


139-171: Streams section is coherent

Coverage and wording look good.


173-189: Serverless compute section is solid; aws-lambda added

Descriptions fit the audience and intent.


206-207: Alphabetical rubric is clear

Good to have both grouped and alphabetical views.


228-228: n8n entry added in correct alphabetical position

Looks good. Ensure the anchor exists.

Use the comprehensive script below (next comment) to verify all anchors referenced on this page exist.
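
The script mentioned here is not part of this excerpt. As a rough illustration of what such a check might look like, the following sketch collects the labels used in `{ref}` roles on a page and compares them with the MyST (`(label)=`) and reStructuredText (`.. _label:`) targets found in other sources. The function names and regular expressions are assumptions made for this example, not the reviewer's script and not part of the documentation build.

```python
import re

# Matches MyST cross-reference roles such as {ref}`aws-dms`
# or {ref}`Some title <aws-dms>`; the captured group is the label.
REF_ROLE = re.compile(r"\{ref\}`(?:[^`<]*<)?([^`>]+)>?`")

# Matches MyST targets "(label)=" and reStructuredText targets ".. _label:".
MYST_TARGET = re.compile(r"^\(([^)]+)\)=\s*$", re.MULTILINE)
RST_TARGET = re.compile(r"^\.\. _([^:]+):\s*$", re.MULTILINE)


def referenced_labels(page_text):
    """Return the set of labels referenced via {ref} roles in one page."""
    return {label.strip() for label in REF_ROLE.findall(page_text)}


def defined_labels(sources):
    """Return the set of labels defined across an iterable of source texts."""
    labels = set()
    for text in sources:
        labels.update(MYST_TARGET.findall(text))
        labels.update(RST_TARGET.findall(text))
    return labels


def missing_anchors(page_text, sources):
    """Return referenced labels that have no matching target, sorted."""
    return sorted(referenced_labels(page_text) - defined_labels(sources))
```

In practice the page text and the sources would be read from the `docs/` tree, for example with `pathlib.Path.rglob`. A Sphinx build with warnings enabled reports undefined labels as well, so this is only a quick pre-check.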


27-33: ul-li-wide class confirmed

The CSS class .ul-li-wide is defined in docs/_include/styles.html at line 50, so the layout will render as intended.

Comment on lines 216 to 221
- {ref}`aws-dynamodb`
- {ref}`aws-kinesis`
- {ref}`aws-dms`
- {ref}`aws-lambda`
- {ref}`azure-functions`
- {ref}`dbt`

⚠️ Potential issue

Fix alphabetical order for AWS entries

Under “Alphabetically sorted,” aws-dms should come before aws-dynamodb and aws-kinesis.

Apply within this block:

- - {ref}`aws-dynamodb`
- - {ref}`aws-kinesis`
- - {ref}`aws-dms`
+ - {ref}`aws-dms`
+ - {ref}`aws-dynamodb`
+ - {ref}`aws-kinesis`
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- - {ref}`aws-dynamodb`
- - {ref}`aws-kinesis`
- - {ref}`aws-dms`
- - {ref}`aws-lambda`
- - {ref}`azure-functions`
- - {ref}`dbt`
+ - {ref}`aws-dms`
+ - {ref}`aws-dynamodb`
+ - {ref}`aws-kinesis`
+ - {ref}`aws-lambda`
+ - {ref}`azure-functions`
+ - {ref}`dbt`
🤖 Prompt for AI Agents
In docs/ingest/etl/index.md around lines 216 to 221, the AWS entries are not in
alphabetical order. Rearrange the list so that aws-dms appears before
aws-dynamodb and aws-kinesis, ensuring the AWS entries are sorted alphabetically
within this block.
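
An ordering problem like this one can also be caught mechanically. The sketch below is a minimal example that assumes each list item consists of a single `{ref}` role, as in the block quoted above. It extracts the labels and reports adjacent pairs that are out of order. The helper names are hypothetical and not part of the repository.

```python
import re

# Matches a list item that consists of one {ref} role, e.g. "- {ref}`aws-dms`".
REF_ITEM = re.compile(r"^\s*-\s*\{ref\}`([^`]+)`\s*$")


def ref_labels(block):
    """Return the labels of all {ref} list items in a block, in document order."""
    labels = []
    for line in block.splitlines():
        match = REF_ITEM.match(line)
        if match:
            labels.append(match.group(1))
    return labels


def out_of_order(labels):
    """Return adjacent (previous, current) pairs where current sorts before previous."""
    return [
        (prev, cur)
        for prev, cur in zip(labels, labels[1:])
        if cur.casefold() < prev.casefold()
    ]


block = """\
- {ref}`aws-dynamodb`
- {ref}`aws-kinesis`
- {ref}`aws-dms`
- {ref}`aws-lambda`
- {ref}`azure-functions`
- {ref}`dbt`
"""

labels = ref_labels(block)
print(out_of_order(labels))               # pairs that violate the ordering
print(sorted(labels, key=str.casefold))   # the proposed alphabetical order
```

The comparison is case-insensitive plain string ordering, so hyphens and digits sort by code point. Whether that matches the ordering intended for the page is a judgment call.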

Comment on lines +24 to +31
:::{rubric} Grouped by category
:::

:::::{grid} 1 2 2 2
:margin: 4 4 0 0
:padding: 0
:gutter: 2
:class-container: ul-li-wide
Member Author

@amotl amotl Aug 11, 2025


Observations

I think we need to come up with a different layout than the proposed two-column thing. It felt good when starting it, but then got out of control, and now certainly adds a design smell.

image

-- https://cratedb-guide--239.org.readthedocs.build/ingest/etl/

Outlook

Well, at least we wordsmithed concise teaser texts which can certainly be reused. On the other hand, more often than not, it is just a matter of adjusting grid parameters to easily get a better presentation -- thanks to the magic of responsive design and grid layouts! ;]

If we are merging early, we will tackle this on a subsequent iteration. 🤞

Base automatically changed from layout-etl-cdc to main August 12, 2025 02:53
@amotl amotl force-pushed the etl-cdc-grouping branch from 7264bf9 to 176cc1e Compare August 12, 2025 03:04

Labels

guidance Matters of layout, shape, and structure. refactoring Changing shape or layout, or moving content around. sanding-1200 Fine sanding.
