Naming things: Simplify directory layout at "Integrate" section #249

amotl · 2025-08-13T19:51:29Z

About

The flat-list directory layout at docs/integrate used vendor prefixes for certain product names. In the spirit of better navigation (less is more), let's strip the prefixes to reduce confusion when exploring, navigating, and authoring. The sections below elaborate on our slightly negative experience with them.

Preview

https://cratedb-guide--249.org.readthedocs.build/integrate/ (you don't "see" too much here, only when navigating)

Rationale

Names for Apache and Amazon/AWS products were previously prefixed with apache- and aws-. While this initially seemed like a good idea, it implicitly added an overlay subgroup to the flat list, which skewed the core idea.

Although the grouping was intended, it turned out to be cumbersome to navigate, both when authoring and when browsing or exploring the list through the directory tree, URL layout, or other interfaces.

So, let's strip the prefixes wherever the product name works well on its own.

Thoughts

Beyond the general feeling of a smell, these specific conflicts and copy-editing experiences contributed to this change, sorted by increasing gravity:

- Kafka is not only an Apache product, but also a technology offered by different implementations, both unmanaged and managed. I'd like to elaborate on this on the relevant page in a quick statement, for name-dropping reasons.
- aws-dynamodb and aws-kinesis had also been named "wrong", because the real product names are "Amazon DynamoDB" and "Amazon Kinesis", while aws-dms and aws-lambda are correct (AWS DMS, AWS Lambda). That would have been another conflict zone.
- When doing actual work on the new tree yesterday, navigating to the "Kafka" or "Kinesis" slot needed one additional thought step to find it under apache-kafka or aws-kinesis. That left a slightly bad taste, more than once.

coderabbitai · 2025-08-13T19:51:36Z

Walkthrough

Normalized integration slugs and simplified many integration page headings by removing vendor prefixes in the integrations ToC and individual integration index pages; added/promoted numerous top-level integration index pages and made small formatting/navigation edits. All changes are documentation-only.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Integrations ToC `docs/integrate/index.md` | Replaced vendor-prefixed toctree entries with simplified top-level slugs (e.g., `apache-airflow/index` → `airflow/index`), reordered entries, and added many top-level integration indices. Kept `maxdepth: 1`. |
| New / promoted integration indices `docs/integrate/airflow/index.md`, `docs/integrate/...`* | Added or promoted many top-level integration index pages and updated references in the main integrate ToC. Examples: `airflow`, `aws-lambda`, `azure-functions`, `cluvio`, `datagrip`, `dbeaver`, `dbt`, `debezium`, `django`, `dms`, `dynamodb`, `estuary`, `explo`, `flink`, `gradio`, `grafana`, `hop`, `iceberg`, `influxdb`, `kafka`, `kestra`, `kinesis`, `langchain`, `llamaindex`, `marquez`, `meltano`, `metabase`, `mongodb`, `mysql`, `n8n`, `nifi`, `node-red`, `plotly`, `powerbi`, `prometheus`, `pyviz`, `rill`, `risingwave`, `sql-server`, `streamlit`, `streamsets`, `superset`, `tableau`, `telegraf`. |
| Integration page heading / anchor updates `docs/integrate/airflow/index.md`, `docs/integrate/dms/index.md`, `docs/integrate/dynamodb/index.md`, `docs/integrate/flink/index.md`, `docs/integrate/hop/index.md`, `docs/integrate/iceberg/index.md`, `docs/integrate/kafka/index.md`, `docs/integrate/kinesis/index.md`, `docs/integrate/nifi/index.md`, `docs/integrate/superset/index.md` | Simplified H1 headings and front-matter anchors by removing vendor prefixes (e.g., "Apache Flink" → "Flink", `(apache-kafka)=` → `(kafka)=`). No other content or includes changed. |
| Minor formatting / navigation edits `docs/integrate/estuary/index.md`, `docs/integrate/influxdb/index.md`, `docs/ingest/cdc/index.md`, `docs/ingest/etl/index.md` | Small formatting fixes (removed blank line), removed two InfluxDB toctree entries, and updated internal cross-reference anchors from vendor-prefixed identifiers to simplified slugs (e.g., `aws-dynamodb`→`dynamodb`, `aws-kinesis`→`kinesis`, `aws-dms`→`dms`, and multiple `apache-` → short names). |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

Integrate: Migrate pages from crate-clients-tools, to be retired soon #222 — Direct overlap: renames and promotes many integration pages, removing vendor prefixes and adding top-level indices.
Layout NG: Folder structure and naming things #236 — Related edits to the integrate ToC and normalization of integration entries.
Layout NG: Folder structure and naming things, focusing on ETL and CDC #238 — Overlapping reorganization of integration pages and anchor/toctree adjustments.

Suggested labels

sanding-500, cross linking

Suggested reviewers

hammerhead
karynzv
surister

Poem

I hopped through headings, nibbled prefixes small,
Cleared burrows in the ToC, lined up every hall.
New names sprout like clover under morning light,
A tidy rabbit shuffle — docs all set and bright. 🐇

amotl · 2025-08-13T19:53:33Z

docs/integrate/airflow/index.md

Sure, that needs some rewrite rules.

rewrite ^/docs/guide/integrate/apache-(.*)/(.*)$ /docs/guide/integrate/$1/$2 redirect;
rewrite ^/docs/guide/integrate/aws-(.*)/(.*)$ /docs/guide/integrate/$1/$2 redirect;
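
To see what the two rules do to a given path, here is a minimal Python sketch that mirrors their regular expressions. It only emulates the pattern matching, not nginx itself; `REWRITE_RULES` and `rewrite` are hypothetical names used for illustration.

```python
import re

# Python equivalents of the two rewrite patterns quoted above.
# nginx's $1/$2 backreferences become \1/\2 in the replacement template.
REWRITE_RULES = [
    (re.compile(r"^/docs/guide/integrate/apache-(.*)/(.*)$"),
     r"/docs/guide/integrate/\1/\2"),
    (re.compile(r"^/docs/guide/integrate/aws-(.*)/(.*)$"),
     r"/docs/guide/integrate/\1/\2"),
]


def rewrite(path):
    """Return the redirect target for `path`, or `path` unchanged if no rule matches."""
    for pattern, template in REWRITE_RULES:
        if pattern.match(path):
            return pattern.sub(template, path)
    return path


print(rewrite("/docs/guide/integrate/apache-kafka/index.html"))
print(rewrite("/docs/guide/integrate/aws-lambda/index.html"))
```

As written, the `aws-` pattern also matches `aws-lambda`, which the ToC in this PR still lists with its prefix, so that slug may need an exception. Whether the deployed rule set handles it is not visible in this thread.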

Added to the set of redirects for the current iteration.

https://github.com/crate/infrastructure/commit/365782e83

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

docs/integrate/index.md (2)

16-63: Consider adopting a strictly alphabetical toctree for maintainability.

Keeping entries alphabetically sorted reduces diff noise and makes additions simpler. If intentional ordering is desired for a few items, consider annotating that in a short comment above the list.
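
A check like this could be automated. The following is a minimal sketch, assuming toctree entries are available as a list of strings and that titled entries use the `Title <target>` form; `toctree_target` and `unsorted_pairs` are hypothetical helper names, not part of Sphinx.

```python
import re


def toctree_target(entry):
    """Return the document target of a toctree entry, unwrapping 'Title <target>'."""
    entry = entry.strip()
    match = re.fullmatch(r".*<(.+)>", entry)
    return match.group(1) if match else entry


def unsorted_pairs(entries):
    """Return adjacent (previous, next) target pairs that break alphabetical order."""
    targets = [toctree_target(entry) for entry in entries]
    return [(a, b) for a, b in zip(targets, targets[1:]) if a > b]


entries = ["kafka/index", "kinesis/index", "kestra/index", "langchain/index"]
print(unsorted_pairs(entries))
```

An empty result means the list is sorted by target path. Entries that are ordered on purpose would need an allow-list, which this sketch does not include.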

28-29: Disambiguate vendor-less product names in page titles.

Using “dms” and “dynamodb” without “AWS” may be ambiguous in isolation. Ensure the target pages’ H1 titles clearly read “AWS Database Migration Service (DMS)” and “Amazon DynamoDB,” to avoid confusion in search and breadcrumbs.

Also applies to: 39-39

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ac6ad86 and fd39a22.

📒 Files selected for processing (1)

docs/integrate/index.md (4 hunks)

🧰 Additional context used

🧠 Learnings (1)

📓 Common learnings

Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.966Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.

🪛 LanguageTool

docs/integrate/index.md

[grammar] ~19-~19: There might be a mistake here.
Context: ...:::{toctree} :maxdepth: 1 airflow/index aws-lambda/index azure-functions/index c...

(QB_NEW_EN)

[grammar] ~28-~28: There might be a mistake here.
Context: ...ex debezium/index django/index dms/index dynamodb/index estuary/index explo/index...

(QB_NEW_EN)

[grammar] ~29-~29: There might be a mistake here.
Context: ...ex django/index dms/index dynamodb/index estuary/index explo/index flink/index gr...

(QB_NEW_EN)

[grammar] ~30-~30: There might be a mistake here.
Context: ...x dms/index dynamodb/index estuary/index explo/index flink/index gradio/index gra...

(QB_NEW_EN)

[grammar] ~31-~31: There might be a mistake here.
Context: ...dynamodb/index estuary/index explo/index flink/index gradio/index grafana/index h...

(QB_NEW_EN)

[grammar] ~32-~32: There might be a mistake here.
Context: ...ex estuary/index explo/index flink/index gradio/index grafana/index hop/index ice...

(QB_NEW_EN)

[grammar] ~33-~33: There might be a mistake here.
Context: ...dex explo/index flink/index gradio/index grafana/index hop/index iceberg/index in...

(QB_NEW_EN)

[grammar] ~34-~34: There might be a mistake here.
Context: ...x flink/index gradio/index grafana/index hop/index iceberg/index influxdb/index k...

(QB_NEW_EN)

[grammar] ~35-~35: There might be a mistake here.
Context: ...dex gradio/index grafana/index hop/index iceberg/index influxdb/index kafka/index...

(QB_NEW_EN)

[grammar] ~36-~36: There might be a mistake here.
Context: ...ex grafana/index hop/index iceberg/index influxdb/index kafka/index kinesis/index...

(QB_NEW_EN)

[grammar] ~37-~37: There might be a mistake here.
Context: ...x hop/index iceberg/index influxdb/index kafka/index kinesis/index kestra/index l...

(QB_NEW_EN)

[grammar] ~38-~38: There might be a mistake here.
Context: ...iceberg/index influxdb/index kafka/index kinesis/index kestra/index langchain/ind...

(QB_NEW_EN)

[grammar] ~39-~39: There might be a mistake here.
Context: ...influxdb/index kafka/index kinesis/index kestra/index langchain/index llamaindex/...

(QB_NEW_EN)

[grammar] ~49-~49: There might be a mistake here.
Context: ...b/index mysql/index n8n/index nifi/index node-red/index plotly/index Power BI <po...

(QB_NEW_EN)

[grammar] ~60-~60: There might be a mistake here.
Context: ...it/index streamsets/index superset/index tableau/index telegraf/index ::: [Crat...

(QB_NEW_EN)

🔇 Additional comments (3)

docs/integrate/index.md (3)

19-19: Good simplification of slugs; aligns with the PR goal.

Removing vendor prefixes (e.g., apache-airflow → airflow) improves readability and URL hygiene. Nice.

32-32: All listed integrations have learn pages — no action required.

I ran the provided script; for each slug (airflow, dms, dynamodb, flink, hop, iceberg, kafka, kinesis, nifi, superset) the script found a learn.(md|rst) or *-learn.(md|rst) file under docs/integrate/.

19-19: Check complete — new integrate targets exist; old vendor-prefixed refs preserved

Ran the supplied script.

New index pages verified (missing count: 0): docs/integrate/{airflow,dms,dynamodb,flink,hop,iceberg,kafka,kinesis,nifi,superset}.
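
The script itself is not shown in this comment. A minimal Python sketch of such a check, assuming each integration lives in `docs/integrate/<slug>/` with an `index.md` or `index.rst` file, could look like this; `missing_index_pages` is a hypothetical name.

```python
from pathlib import Path


def missing_index_pages(integrate_root, slugs):
    """Return the slugs that have neither index.md nor index.rst below integrate_root."""
    root = Path(integrate_root)
    missing = []
    for slug in slugs:
        candidates = [root / slug / "index.md", root / slug / "index.rst"]
        if not any(candidate.is_file() for candidate in candidates):
            missing.append(slug)
    return missing


if __name__ == "__main__":
    slugs = ["airflow", "dms", "dynamodb", "flink", "hop",
             "iceberg", "kafka", "kinesis", "nifi", "superset"]
    # The path below is an assumption about where the script is run from.
    print(missing_index_pages("docs/integrate", slugs))
```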

Old vendor-prefixed slugs are present in the repo but appear to be preserved as anchor labels in the new integrate index files (so internal {ref} links should continue to resolve). Anchor examples:

docs/integrate/airflow/index.md — (apache-airflow)=

docs/integrate/flink/index.md — (apache-flink)=

docs/integrate/kafka/index.md — (apache-kafka)=

docs/integrate/nifi/index.md — (apache-nifi)=

docs/integrate/iceberg/index.md — (apache-iceberg)=

docs/integrate/superset/index.md — (apache-superset)=

docs/integrate/dms/index.md — (aws-dms)=

docs/integrate/dynamodb/index.md — (aws-dynamodb)=

docs/integrate/kinesis/index.md — (aws-kinesis)=
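
Anchor labels of this kind use the MyST target syntax `(label)=` on a line of its own. As a minimal sketch, assuming the page text is already loaded into a string, they can be listed with a regular expression; `myst_labels` is a hypothetical helper name.

```python
import re

# A MyST target is a line consisting solely of "(label)=".
LABEL_RE = re.compile(r"^\(([^()\s]+)\)=[ \t]*$", re.MULTILINE)


def myst_labels(text):
    """Return all MyST target labels defined in `text`, in order of appearance."""
    return LABEL_RE.findall(text)


page = "(apache-kafka)=\n(kafka)=\n# Kafka\n\nSome prose mentioning (kafka)= inline.\n"
print(myst_labels(page))
```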

Files that reference the old slugs (examples; see full rg output for all lines): docs/ingest/etl/index.md, docs/ingest/cdc/index.md, docs/integrate/marquez/learn.md, docs/integrate/risingwave/index.md, docs/integrate/status.md, docs/_include/links.md.

No Sphinx redirection config (rediraffe) found in the repo.

Conclusion / recommendation:

Internal cross-references look safe (anchors preserved) — no immediate changes required.

If you need to preserve old path-level URLs for external links, add redirects (e.g., sphinxext-rediraffe or redirect stub pages). I can generate a rediraffe mapping or redirect stubs if you want.

Apache and Amazon/AWS products were previously prefixed. While this initially seemed like a good idea, it implicitly added an overlay subgroup to the flat list. Although the grouping was intended, it turned out to be cumbersome to navigate, both when authoring and when browsing or exploring the list through the URL layout or other interfaces. So, let's strip the prefixes wherever the product name works well on its own.

coderabbitai

Actionable comments posted: 0

🔭 Outside diff range comments (1)

docs/integrate/index.md (1)

16-63: Fix broken ToC links in docs/integrate/index.md — referenced index pages are missing

The verification script reports the following referenced targets are missing (no docs//index.md or .rst):

docs/airflow/index

docs/aws-lambda/index

docs/azure-functions/index

docs/cluvio/index

docs/datagrip/index

docs/dbeaver/index

docs/dbt/index

docs/debezium/index

docs/django/index

docs/dms/index

docs/dynamodb/index

docs/estuary/index

docs/explo/index

docs/flink/index

docs/gradio/index

docs/grafana/index

docs/hop/index

docs/iceberg/index

docs/influxdb/index

docs/kafka/index

docs/kinesis/index

docs/kestra/index

docs/langchain/index

docs/llamaindex/index

docs/marquez/index

docs/meltano/index

docs/metabase/index

docs/mongodb/index

docs/mysql/index

docs/n8n/index

docs/nifi/index

docs/node-red/index

docs/plotly/index

docs/powerbi/index

docs/prometheus/index

docs/pyviz/index

docs/rill/index

docs/risingwave/index

docs/sql-server/index

docs/streamlit/index

docs/streamsets/index

docs/superset/index

docs/tableau/index

docs/telegraf/index

Location to edit: docs/integrate/index.md (around lines 16–63). Action: restore the missing index.md/.rst files under the listed paths or update the toctree entries to the correct, current slugs.

🧹 Nitpick comments (2)

docs/integrate/index.md (2)
38-41: Nit: keep K entries alphabetically ordered (move kestra before kinesis)

Minor consistency tweak to maintain the otherwise alphabetical list.

Apply:
- kinesis/index
- kestra/index
+ kestra/index
+ kinesis/index
19-19: Preserve external links: add redirects for renamed slugs and update internal refs if any

I couldn't confirm automatically: the search for a redirects extension failed because conf.py / pyproject.toml weren't found in the checkout, and a scan for vendor-prefixed integrate paths returned no matches. Please run the checks below locally and paste results so I can re-verify.

Quick checks to run locally

Find configured redirect extensions:
rg -n "(sphinx-reredirects|rediraffe)" -S .

Search for old vendor-prefixed slugs across Markdown and reST:
rg -nS "integrate/(apache-|aws-|amazon-)" -g "**/*.{md,rst}" .

Search for specific old names:
rg -nS "apache-airflow|apache-flink|apache-hop|apache-iceberg|apache-kafka|apache-nifi|apache-superset|amazon-kinesis|amazon-dynamodb|aws-dms" -g "**/*.{md,rst}" .
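
Where ripgrep is not available, the path-based search can be approximated in Python. This is a minimal sketch under the same assumptions as the commands above (Markdown and reST files, `apache-`, `aws-`, or `amazon-` prefixes below `integrate/`); `find_legacy_paths` is a hypothetical name.

```python
import re
from pathlib import Path

# Matches vendor-prefixed integrate paths such as "integrate/apache-kafka".
LEGACY_RE = re.compile(r"integrate/(?:apache-|aws-|amazon-)[\w-]+")


def find_legacy_paths(docs_root):
    """Return (relative_file, line_number, match) for each legacy path in .md/.rst files."""
    root = Path(docs_root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in {".md", ".rst"}:
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            for match in LEGACY_RE.finditer(line):
                hits.append((path.relative_to(root).as_posix(), lineno, match.group(0)))
    return hits
```

Calling `find_legacy_paths("docs")` from the repository root returns an empty list when no vendor-prefixed paths remain.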

Checks to make by hand

Inspect docs/conf.py or docs/source/conf.py (or pyproject.toml) for an extensions list containing sphinx_reredirects / rediraffe and any redirect mapping config.

Scan docs/ (both .md and .rst) for internal links that still reference vendor-prefixed slugs and update them or add redirects.

Suggested redirect mappings (example)

docs/integrate/apache-airflow/index -> docs/integrate/airflow/index

docs/integrate/apache-flink/index -> docs/integrate/flink/index

docs/integrate/apache-hop/index -> docs/integrate/hop/index

docs/integrate/apache-iceberg/index -> docs/integrate/iceberg/index

docs/integrate/apache-kafka/index -> docs/integrate/kafka/index

docs/integrate/apache-nifi/index -> docs/integrate/nifi/index

docs/integrate/apache-superset/index -> docs/integrate/superset/index

docs/integrate/amazon-kinesis/index -> docs/integrate/kinesis/index

docs/integrate/amazon-dynamodb/index -> docs/integrate/dynamodb/index

docs/integrate/aws-dms/index -> docs/integrate/dms/index

Note: I used the stored guidance to search both .md and .rst files for cross-references. Please paste the output of the commands above (or let me re-run if you can provide access to the repo root), and I will update/resolve the comment.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd39a22 and 1390024.

📒 Files selected for processing (11)

docs/integrate/airflow/index.md (1 hunks)
docs/integrate/dms/index.md (1 hunks)
docs/integrate/dynamodb/index.md (1 hunks)
docs/integrate/flink/index.md (1 hunks)
docs/integrate/hop/index.md (1 hunks)
docs/integrate/iceberg/index.md (1 hunks)
docs/integrate/index.md (4 hunks)
docs/integrate/kafka/index.md (1 hunks)
docs/integrate/kinesis/index.md (1 hunks)
docs/integrate/nifi/index.md (1 hunks)
docs/integrate/superset/index.md (1 hunks)

✅ Files skipped from review due to trivial changes (10)

docs/integrate/airflow/index.md
docs/integrate/kinesis/index.md
docs/integrate/dynamodb/index.md
docs/integrate/iceberg/index.md
docs/integrate/dms/index.md
docs/integrate/flink/index.md
docs/integrate/nifi/index.md
docs/integrate/hop/index.md
docs/integrate/kafka/index.md
docs/integrate/superset/index.md

🔇 Additional comments (2)

docs/integrate/index.md (2)

19-19: Simplified slugs improve navigation; ToC changes LGTM

The vendor-prefix removal makes the list flatter and easier to scan. The new entries look consistent with the repository’s integration structure and with the learned pattern of linking main integration anchors via their index pages.

Also applies to: 28-29, 32-32, 35-36, 38-39, 49-49, 60-60

16-63: Ignore LanguageTool “grammar” flags for toctree entries

These lines are Sphinx directive entries, not prose; the warnings are false positives.

coderabbitai

Actionable comments posted: 0

🔭 Outside diff range comments (1)

docs/integrate/index.md (1)

19-63: All toctree targets resolve — one prefixed directory remains (action required)

Summary: I verified docs/integrate/index.md (44 toctree entries) and all targets resolve to index.md/.rst; a repo scan found no other integrate/(apache|aws)-... references. However, a prefixed directory remains:

docs/integrate/aws-lambda/ — present (likely stale if aws-* prefixes were supposed to be removed)

Please remove or rename that directory (and update any links) or confirm it is intentionally kept.

🧹 Nitpick comments (2)

docs/integrate/index.md (2)
20-20: Confirm intent to keep “aws-lambda” prefixed (exception) vs. de-prefixing to “lambda”.

Most AWS-related entries (e.g., DMS, Kinesis, DynamoDB) are de-prefixed; “aws-lambda/index” remains. If this is an intentional exception for clarity (generic “Lambda” ambiguity), consider documenting this convention; otherwise, rename to “lambda/index” and update refs.

If you opt to de-prefix now:
-aws-lambda/index
+lambda/index
19-63: Add redirects to preserve existing inbound links and SEO after slug changes.

External links to old apache-/aws- paths will 404 after deploy. Recommend adding redirects (e.g., via sphinxext-rediraffe or site-level redirects) from old slugs to new ones.

Examples:

apache-kafka/index -> kafka/index

aws-dynamodb/index -> dynamodb/index

aws-dms/index -> dms/index

If you share the old->new mapping, I can generate a rediraffe redirects file or Netlify/GitHub Pages redirects for you.
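
The old-to-new mapping follows mechanically from the prefix rule. Here is a minimal sketch, assuming the only prefixes are `apache-` and `aws-` and that `aws-lambda` is kept as-is, as in the current ToC. The names `strip_vendor_prefix` and `redirect_map` are hypothetical, and the output is a plain dictionary rather than any particular redirect tool's format.

```python
VENDOR_PREFIXES = ("apache-", "aws-")
# Assumption: slugs listed here intentionally keep their prefix.
KEEP_PREFIXED = frozenset({"aws-lambda"})


def strip_vendor_prefix(slug):
    """Return `slug` without its vendor prefix, unless it is explicitly kept."""
    if slug in KEEP_PREFIXED:
        return slug
    for prefix in VENDOR_PREFIXES:
        if slug.startswith(prefix):
            return slug[len(prefix):]
    return slug


def redirect_map(old_slugs):
    """Map 'old/index' to 'new/index' for every slug whose name changes."""
    mapping = {}
    for old in old_slugs:
        new = strip_vendor_prefix(old)
        if new != old:
            mapping[f"{old}/index"] = f"{new}/index"
    return mapping


print(redirect_map(["apache-kafka", "aws-dynamodb", "aws-dms", "aws-lambda", "grafana"]))
```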

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1390024 and ea49305.

📒 Files selected for processing (1)

docs/integrate/index.md (4 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Build docs

🔇 Additional comments (2)

docs/integrate/index.md (2)

16-63: Restructure aligns with PR objective; toctree looks coherent and alphabetized.

The simplified slugs and flat list meet the stated goal. The entries appear alphabetically ordered and consistent with the new naming scheme.

21-21: Pattern check: keep “-learn” pages separate from anchors (as per doc architecture).

This toctree links only to top-level integration indices, which matches the established pattern where detailed tutorials use the “-learn” suffix in learn.{md|rst}. Looks consistent with the documented approach.

bmunkholm · 2025-08-14T09:18:36Z

docs/integrate/kafka/index.md

@@ -1,5 +1,5 @@
 (apache-kafka)=


How about these references (not just here, but all the similar ones)?

Reference labels are orthogonal to any other structure (physical/folder/files/toc), but haven't been touched here yet. It could be a good idea to also use the canonical labels going forward, thanks!

2426269 adjusts the reference labels accordingly. Let's do the proper spring cleaning right away instead of deferring it to a future iteration.

bmunkholm

Assuming build error gets fixed :-)

amotl · 2025-08-14T10:20:50Z

Assuming anything without any crew? ;]

Let's also use canonical, non-prefixed variants for the reference labels. This patch aggressively prunes previous nomenclature to create less confusion for future authors. If anything breaks on this, it is now the right time to fix forward.

coderabbitai

Actionable comments posted: 9

🔭 Outside diff range comments (2)

docs/ingest/etl/index.md (2)
205-228: Fix legacy prefixed refs (aws-lambda) and verify anchors

Alphabetical order looks fine, but there are lingering old-style refs that will break anchors. Update the refs to the canonical labels (or add matching label definitions) and consider generating this list from a single source to avoid future drift.

Files/locations to fix:

docs/ingest/etl/index.md:169: - {ref}`aws-lambda`

docs/ingest/etl/index.md:206: - {ref}`aws-lambda`
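
Dangling references of this kind can be found by comparing the targets of `{ref}` roles against the set of defined labels. The following is a minimal sketch, assuming the page text and the label list are already available; `ref_targets` and `unresolved_refs` are hypothetical helper names, and the label list in the example is made up for illustration.

```python
import re

# Matches the MyST role syntax {ref}`target` and {ref}`Title <target>`.
REF_RE = re.compile(r"\{ref\}`([^`]+)`")


def ref_targets(text):
    """Return the label targeted by each {ref} role in `text`."""
    targets = []
    for raw in REF_RE.findall(text):
        titled = re.fullmatch(r".*<([^<>]+)>\s*", raw)
        targets.append(titled.group(1) if titled else raw.strip())
    return targets


def unresolved_refs(text, labels):
    """Return the {ref} targets in `text` that are not among the known `labels`."""
    known = set(labels)
    return [target for target in ref_targets(text) if target not in known]


sample = "- {ref}`aws-lambda`\n- {ref}`Kafka <kafka>`\n"
print(unresolved_refs(sample, ["kafka", "dms"]))
```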

135-147: Use the generic "Kafka" label in ETL docs and confirm reference targets

Replace the "Apache Kafka" phrasing in the ETL docs to match the unprefixed {ref}`kafka` usage.

Files to update:

docs/ingest/etl/index.md — lines ~135–147: change the paragraph starting "Apache Kafka is…" to use "Kafka".

docs/integrate/kafka/index.md: already defines the `(kafka)=` target (line 1), so {ref}`kafka` resolves. No `(apache-kafka)=` alias or {ref}`apache-kafka` usages were found; add the alias only if you need to preserve legacy references.

Apply this diff in docs/ingest/etl/index.md:
-  Apache Kafka is an open-source distributed event streaming platform
-  for high-performance data pipelines, streaming analytics, data integration,
-  and mission-critical applications.
+  Kafka is an open-source distributed event streaming platform for
+  high-performance data pipelines, streaming analytics, data integration,
+  and mission-critical applications.
Optional (only if you have legacy {ref}`apache-kafka` links): add `(apache-kafka)=` directly below the existing `(kafka)=` in docs/integrate/kafka/index.md.

🧹 Nitpick comments (2)

docs/integrate/dbt/index.md (1)
33-33: Minor grammar/style: add comma after “for example” and simplify link phrasing.

Improves readability and resolves the LanguageTool hint. Also consider explicit link text for proper casing inline.

-dbt projects run, for example with {ref}`debezium` or with {ref}`airflow`.
+dbt projects run, for example, with {ref}`Debezium <debezium>` or {ref}`Airflow <airflow>`.

Optional: If your docs follow American English, consider “Afterward” instead of “Afterwards” in the next sentence.

docs/ingest/etl/index.md (1)

100-105: Naming decision: Confirm dropping the “aws-” prefix for DMS is intentional.

The PR description suggests aws-dms was considered a correct use of the prefix, yet this change references {ref}`dms`. If the rename is intended, ensure (dms)= exists and add (aws-dms)= as an alias on the DMS page to preserve existing links.
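
If the rename is intended, stale references elsewhere can be rewritten mechanically. The following is a minimal sketch under stated assumptions: the `RENAMES` mapping lists only renames mentioned in this PR's discussion, `rewrite_refs` is a hypothetical helper, and only the bare {ref} form without explicit link text is handled.

```python
import re

# Renames taken from this PR's discussion; the mapping itself is illustrative.
RENAMES = {
    "apache-kafka": "kafka",
    "aws-dms": "dms",
    "aws-dynamodb": "dynamodb",
    "aws-kinesis": "kinesis",
}

# Bare roles only, e.g. {ref}`aws-dms`; "Title <label>" targets are skipped.
REF_ROLE = re.compile(r"\{ref\}`([^`<>]+)`")


def rewrite_refs(text, renames=RENAMES):
    """Rewrite bare {ref} roles to their new labels; leave unknown labels alone."""

    def replace(match):
        label = match.group(1)
        return "{ref}`" + renames.get(label, label) + "`"

    return REF_ROLE.sub(replace, text)
```

Labels that are not in the mapping pass through unchanged, so the function can be run over a whole page without touching unrelated references.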

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled
Linear integration is disabled

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 96ab2bb and 2426269.

📒 Files selected for processing (15)

docs/ingest/cdc/index.md (2 hunks)
docs/ingest/etl/index.md (6 hunks)
docs/integrate/airflow/index.md (1 hunks)
docs/integrate/dbt/index.md (1 hunks)
docs/integrate/dms/index.md (1 hunks)
docs/integrate/dynamodb/index.md (2 hunks)
docs/integrate/estuary/index.md (0 hunks)
docs/integrate/flink/index.md (1 hunks)
docs/integrate/hop/index.md (1 hunks)
docs/integrate/iceberg/index.md (1 hunks)
docs/integrate/influxdb/index.md (0 hunks)
docs/integrate/kafka/index.md (1 hunks)
docs/integrate/kinesis/index.md (2 hunks)
docs/integrate/nifi/index.md (1 hunks)
docs/integrate/superset/index.md (1 hunks)

💤 Files with no reviewable changes (2)

docs/integrate/estuary/index.md
docs/integrate/influxdb/index.md

🚧 Files skipped from review as they are similar to previous changes (6)

docs/integrate/kafka/index.md
docs/integrate/flink/index.md
docs/integrate/dms/index.md
docs/integrate/airflow/index.md
docs/ingest/cdc/index.md
docs/integrate/superset/index.md

🧰 Additional context used

🪛 LanguageTool

docs/ingest/etl/index.md

[grammar] ~71-~71: There might be a mistake here.
Context: ...clusive;2em Low-code / No-code / Visual - {ref}estuary` Estuary provides real-...

(QB_NEW_EN)

[grammar] ~205-~205: There might be a mistake here.
Context: ...habetically sorted ::: :::{div} - {ref}airflow - {ref}aws-lambda - {ref}`azure-function...

(QB_NEW_EN)

[grammar] ~206-~206: There might be a mistake here.
Context: ...d ::: :::{div} - {ref}airflow - {ref}aws-lambda - {ref}azure-functions - {ref}dbt - {r...

(QB_NEW_EN)

[grammar] ~207-~207: There might be a mistake here.
Context: ...ef}airflow - {ref}aws-lambda - {ref}azure-functions - {ref}dbt - {ref}dms - {ref}`dynamodb...

(QB_NEW_EN)

[grammar] ~208-~208: There might be a mistake here.
Context: ...lambda - {ref}azure-functions - {ref}dbt - {ref}dms - {ref}dynamodb - {ref}est...

(QB_NEW_EN)

[grammar] ~209-~209: There might be a mistake here.
Context: ...f}azure-functions - {ref}dbt - {ref}dms - {ref}dynamodb - {ref}estuary - {ref}...

(QB_NEW_EN)

[grammar] ~210-~210: There might be a mistake here.
Context: ...tions - {ref}dbt - {ref}dms - {ref}dynamodb - {ref}estuary - {ref}flink - {ref}ho...

(QB_NEW_EN)

[grammar] ~211-~211: There might be a mistake here.
Context: ... - {ref}dms - {ref}dynamodb - {ref}estuary - {ref}flink - {ref}hop - {ref}iceber...

(QB_NEW_EN)

[grammar] ~212-~212: There might be a mistake here.
Context: ...{ref}dynamodb - {ref}estuary - {ref}flink - {ref}hop - {ref}iceberg - {ref}`infl...

(QB_NEW_EN)

[grammar] ~213-~213: There might be a mistake here.
Context: ... - {ref}estuary - {ref}flink - {ref}hop - {ref}iceberg - {ref}influxdb - {ref}...

(QB_NEW_EN)

[grammar] ~214-~214: There might be a mistake here.
Context: ...ary - {ref}flink - {ref}hop - {ref}iceberg - {ref}influxdb - {ref}kafka - {ref}k...

(QB_NEW_EN)

[grammar] ~215-~215: There might be a mistake here.
Context: ...k - {ref}hop - {ref}iceberg - {ref}influxdb - {ref}kafka - {ref}kestra - {ref}kin...

(QB_NEW_EN)

[grammar] ~216-~216: There might be a mistake here.
Context: ...{ref}iceberg - {ref}influxdb - {ref}kafka - {ref}kestra - {ref}kinesis - {ref}`m...

(QB_NEW_EN)

[grammar] ~217-~217: There might be a mistake here.
Context: ...- {ref}influxdb - {ref}kafka - {ref}kestra - {ref}kinesis - {ref}meltano - {ref}`...

(QB_NEW_EN)

[grammar] ~218-~218: There might be a mistake here.
Context: ... - {ref}kafka - {ref}kestra - {ref}kinesis - {ref}meltano - {ref}mongodb - {ref}...

(QB_NEW_EN)

[grammar] ~219-~219: There might be a mistake here.
Context: ...- {ref}kestra - {ref}kinesis - {ref}meltano - {ref}mongodb - {ref}mysql - {ref}`n8...

(QB_NEW_EN)

[grammar] ~220-~220: There might be a mistake here.
Context: ... {ref}kinesis - {ref}meltano - {ref}mongodb - {ref}mysql - {ref}n8n - {ref}nifi ...

(QB_NEW_EN)

[grammar] ~221-~221: There might be a mistake here.
Context: ... {ref}meltano - {ref}mongodb - {ref}mysql - {ref}n8n - {ref}nifi - {ref}`node-re...

(QB_NEW_EN)

[grammar] ~222-~222: There might be a mistake here.
Context: ... - {ref}mongodb - {ref}mysql - {ref}n8n - {ref}nifi - {ref}node-red - {ref}`ri...

(QB_NEW_EN)

[grammar] ~223-~223: There might be a mistake here.
Context: ...odb - {ref}mysql - {ref}n8n - {ref}nifi - {ref}node-red - {ref}risingwave` - {r...

(QB_NEW_EN)

docs/integrate/dbt/index.md

[grammar] ~33-~33: There might be a mistake here.
Context: ... {ref}debezium or with {ref}airflow. Afterwards, data analysts can run their ...

(QB_NEW_EN)

[locale-violation] ~33-~33: In American English, ‘afterward’ is the preferred variant. ‘Afterwards’ is more commonly used in British English and other dialects.
Context: ...{ref}debezium or with {ref}airflow. Afterwards, data analysts can run their dbt projec...

(AFTERWARDS_US)

docs/integrate/dynamodb/index.md

[grammar] ~1-~1: There might be a mistake here.
Context: (dynamodb)= # DynamoDB :::{include} /_include/links.m...

(QB_NEW_EN)

docs/integrate/hop/index.md

[grammar] ~1-~1: There might be a mistake here.
Context: (hop)= # Hop ```{div} .float-right [![Apache Hop...

(QB_NEW_EN)

docs/integrate/iceberg/index.md

[grammar] ~1-~1: There might be a mistake here.
Context: (iceberg)= # Iceberg ```{div} .float-right [![Apache...

(QB_NEW_EN)

docs/integrate/kinesis/index.md

[grammar] ~1-~1: There might be a mistake here.
Context: (kinesis)= # Kinesis :::{include} /_include/links.md...

(QB_NEW_EN)

[grammar] ~39-~39: There might be a mistake here.
Context: ...o 3 3 3 :::{rubric} Related ::: - {ref}dynamodb - [Amazon DynamoDB Streams] :::: ::::: :...

(QB_NEW_EN)

docs/integrate/nifi/index.md

[grammar] ~1-~1: There might be a mistake here.
Context: (nifi)= # NiFi ```{div} .float-right [![Apache Ni...

(QB_NEW_EN)

🔇 Additional comments (6)

docs/integrate/dbt/index.md (2)

33-33: Good move: Using Sphinx cross-references aligns with the new integration structure.

Switching to {ref} keeps links resilient to path changes and matches the PR goal of normalizing integration slugs/pages.

33-33: No action needed — labels for airflow and debezium exist

Labels found:

docs/integrate/airflow/index.md:1 — (airflow)=

docs/integrate/debezium/index.md:1 — (debezium)=

Refs found:

docs/integrate/dbt/index.md:33 contains {ref}`debezium` and {ref}`airflow`

docs/ingest/etl/index.md:30,205 contain {ref}`airflow`

Conclusion: the {ref} links are valid; no changes required.

docs/integrate/kinesis/index.md (1)

39-41: Cross-ref looks correct; keep consistency across the suite

The switch to {ref}`dynamodb` matches the new slugging. No changes needed here.

docs/integrate/dynamodb/index.md (1)

39-41: Related links now use the short slug; LGTM

The change to {ref}`kinesis` aligns with the new naming scheme.

docs/ingest/etl/index.md (2)

106-109: DynamoDB: unprefixed label is fine; ensure target label exists and reflect official name in text.

Please verify (dynamodb)= exists. The description already uses the official product name (“DynamoDB … by Amazon Web Services”), which is consistent.

189-195: Iceberg label change looks correct.

Unprefixed {ref}`iceberg` aligns with the new scheme. Ensure (iceberg)= exists on the target page and consider adding (apache-iceberg)= as an alias.

coderabbitai · 2025-08-14T11:36:06Z

docs/ingest/etl/index.md

+- {ref}`airflow`



💡 Verification agent

🧩 Analysis chain

Unprefixed {ref}`airflow` looks good; please confirm target label exists and keep a legacy alias.

This page now references {ref}`airflow`. Verify that the target page defines (airflow)= and, to avoid breaking incoming links, also defines an alias for the old label (e.g., (apache-airflow)=) on the same page.

Run this script to check label targets and spot any leftover old-style refs:

🏁 Script executed:

    #!/bin/bash
    set -euo pipefail

    # Extract all ref labels used in this file
    labels=$(rg -No '\{ref}`([a-z0-9\-]+)`' docs/ingest/etl/index.md | sed 's/.*`//; s/`$//' | sort -u)

    echo "Checking label definitions for:"
    printf ' - %s\n' $labels
    echo

    missing=0
    for lbl in $labels; do
      if ! rg -n "^\($lbl\)="; then
        echo "MISSING label: ($lbl)="
        missing=1
      fi
    done

    echo
    echo "Scanning for legacy prefixed refs still used elsewhere:"
    rg -n '\{ref}`(apache-|aws-)[^`]+`' || true

    exit $missing

Length of output: 514

🏁 Script executed:

    #!/bin/bash
    set -euo pipefail

    file="docs/ingest/etl/index.md"

    echo "Listing all {ref}`...` occurrences in $file (line:number -> matched):"
    rg -n -oP '\{ref\}`([^`]+)`' "$file" || true
    echo

    echo "Extracting unique labels from those refs:"
    labels=$(rg -n -oP '\{ref\}`([^`]+)`' "$file" | sed -E 's/^[0-9]+://' | sed -E 's/\{ref\}`(.*)`/\1/' | sort -u || true)
    printf ' - %s\n' $labels
    echo

    missing=0
    echo "Checking for anchor definitions for each label across the docs/ tree:"
    for lbl in $labels; do
      if rg -n -F "($lbl)=" docs || true; then
        true
      else
        echo "MISSING label definition: ($lbl)="
        missing=1
      fi
    done
    echo

    echo "Scanning repo for legacy-prefixed refs (apache- or aws-):"
    rg -n '\{ref\}`(apache-|aws-)[^`]+`' docs || true
    echo

    exit $missing

Length of output: 5025
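
The check these scripts attempt (every label used by a {ref} role must have a matching label definition) can also be sketched in Python without ripgrep. This is an illustrative sketch, not the code that was executed: the helper names are hypothetical, it works on page text passed in as strings, and it assumes labels are defined MyST-style as `(label)=` on a line of their own.

```python
import re

# Matches MyST roles such as {ref}`airflow` or {ref}`Airflow <airflow>`.
REF_ROLE = re.compile(r"\{ref\}`([^`]+)`")
# Matches a MyST label definition such as (airflow)= on its own line.
LABEL_DEF = re.compile(r"^\(([A-Za-z0-9_-]+)\)=\s*$", re.MULTILINE)


def referenced_labels(text):
    """Return the set of labels used by {ref} roles in `text`."""
    labels = set()
    for target in REF_ROLE.findall(text):
        # "Title <label>" form: the label is inside the angle brackets.
        explicit = re.search(r"<([^<>]+)>\s*$", target)
        labels.add(explicit.group(1) if explicit else target.strip())
    return labels


def defined_labels(pages):
    """Return the set of labels defined across an iterable of page texts."""
    found = set()
    for page in pages:
        found.update(LABEL_DEF.findall(page))
    return found


def missing_labels(source_text, pages):
    """Labels referenced in `source_text` that none of the pages define."""
    return sorted(referenced_labels(source_text) - defined_labels(pages))
```

In a real run, `pages` would be the contents of the Markdown files under docs/, read however the caller prefers.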

Add legacy anchor alias (apache-airflow) to the airflow page

Confirmed: {ref}`airflow` points to docs/integrate/airflow/index.md, which defines (airflow)=, but a legacy alias (apache-airflow)= is not present; please add it to avoid breaking external links.

Files to check/update:

docs/ingest/etl/index.md: references {ref}`airflow` at lines 30 and 205 (no change needed here).

docs/integrate/airflow/index.md — add legacy alias on the same page (anchor currently at line 1).

Suggested change (add the alias next to the existing anchor):
docs/integrate/airflow/index.md
@@
(airflow)=
+(apache-airflow)=

🤖 Prompt for AI Agents

In docs/integrate/airflow/index.md around line 1, the page defines the anchor (airflow)= but is missing the legacy alias (apache-airflow)= which external links rely on; add a second anchor (apache-airflow)= immediately alongside the existing (airflow)= so both aliases are available; no changes needed to docs/ingest/etl/index.md (it references {ref}`airflow` at lines 30 and 205).

@coderabbitai: We decided to prune ALL legacy aliases previously kept for compatibility reasons, but fix forward instead if anything breaks. Please resolve all relevant comments.

@amotl, understood! I'll note this policy change for future reviews. The "fix forward" approach makes sense for maintaining cleaner documentation structure.

✏️ Learnings added

Learnt from: amotl
PR: crate/cratedb-guide#249
File: docs/ingest/etl/index.md:30-31
Timestamp: 2025-08-14T11:40:39.557Z
Learning: In the CrateDB Guide documentation, the project policy is to prune ALL legacy aliases (like `(apache-airflow)=`, `(aws-*)=`) and fix forward if anything breaks, rather than maintaining backward compatibility with dual anchors.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

⛔ Learnings removed

Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.066Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.

🧠 Learnings used

Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.066Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.
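
Under the fix-forward policy, stacked alias labels could be pruned mechanically. The following is a minimal sketch under stated assumptions: aliases are taken to be label lines with an `apache-` or `aws-` prefix that sit directly next to an unprefixed label, and `prune_legacy_aliases` is a hypothetical helper, not part of the repository.

```python
import re

LABEL_LINE = re.compile(r"^\(([a-z0-9-]+)\)=\s*$")
LEGACY_PREFIXES = ("apache-", "aws-")  # assumption: the prefixes stripped in this PR


def prune_legacy_aliases(page_text):
    """Remove prefixed alias labels that sit next to an unprefixed label."""
    lines = page_text.splitlines(keepends=True)
    result = []
    i = 0
    while i < len(lines):
        # Collect a run of consecutive label lines, e.g. "(apache-nifi)=" + "(nifi)=".
        j = i
        while j < len(lines) and LABEL_LINE.match(lines[j]):
            j += 1
        run = lines[i:j]
        if not run:
            result.append(lines[i])
            i += 1
            continue
        keep = [
            line for line in run
            if not LABEL_LINE.match(line).group(1).startswith(LEGACY_PREFIXES)
        ]
        # Only prune when a canonical label remains; otherwise leave the run alone.
        result.extend(keep if keep else run)
        i = j
    return "".join(result)
```

A page whose only label is prefixed, such as (aws-lambda)=, is left untouched so that its target does not disappear.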

coderabbitai · 2025-08-14T11:36:06Z

docs/ingest/etl/index.md

+- {ref}`flink`
+
+  Apache Flink is a programming framework and distributed processing engine for
+  stateful computations over unbounded and bounded data streams, written in Java.
+


💡 Verification agent

🧩 Analysis chain

Flink: label rename is consistent; ensure alias retained to prevent broken links.

Confirm (flink)= exists on the target page and consider adding (apache-flink)= there as a compatibility alias.

If needed, I can propose a PR-wide alias list for all renamed integration labels.

🏁 Script executed:

    #!/bin/bash
    set -euo pipefail

    echo "Looking for anchor labels and references for 'flink'..."
    echo

    echo "1) Literal anchor occurrences '(flink)=':"
    rg -n -F "(flink)=" || true
    echo

    echo "2) Literal anchor occurrences '(apache-flink)=':"
    rg -n -F "(apache-flink)=" || true
    echo

    echo "3) Inline refs using backticks (e.g. ref`flink`):"
    rg -n "ref\`flink\`" || true
    rg -n "ref\`apache-flink\`" || true
    echo

    echo "4) All occurrences of the word 'flink' under docs/:"
    rg -n "flink" docs || true
    echo

    echo "5) Show docs/ingest/etl/index.md (first 200 lines) to inspect the snippet referenced in the review:"
    sed -n '1,200p' docs/ingest/etl/index.md || true

Length of output: 11616

Action: add compatibility alias (apache-flink)= to the Flink page

Confirmed (flink)= is present at docs/integrate/flink/index.md:1; there is no (apache-flink)= alias in the repo — add it to avoid breaking external links.

Files to update:

docs/integrate/flink/index.md — add alias next to the existing anchor.

(no changes required) docs/ingest/etl/index.md: it already references {ref}`flink` (lines 42 and 212).

Suggested tiny diff:

docs/integrate/flink/index.md
@@
-(flink)=
+(flink)=
+(apache-flink)=

🤖 Prompt for AI Agents

In docs/integrate/flink/index.md around line 1, add the compatibility alias (apache-flink)= next to the existing (flink)= anchor so both aliases are defined at the top of the file; no changes needed in docs/ingest/etl/index.md since it already references {ref}`flink`. Ensure the new alias uses the same syntax and is placed on the same anchor block as (flink)= to avoid breaking external links.

coderabbitai · 2025-08-14T11:36:06Z

docs/ingest/etl/index.md

+- {ref}`nifi`
+
+  Apache NiFi is a dataflow system based on the concepts of flow-based programming.
+  It supports powerful and scalable directed graphs of data routing, transformation,
+  and system mediation logic.
+


💡 Verification agent

🧩 Analysis chain

NiFi: label rename is consistent; ensure alias retained.

Same note: ensure (nifi)= exists and add (apache-nifi)= on the NiFi page to keep old inbound links working.

🏁 Script executed:

    #!/bin/bash
    set -euo pipefail

    echo "---- Showing docs/ingest/etl/index.md (first 200 lines) ----"
    if [ -f docs/ingest/etl/index.md ]; then
      sed -n '1,200p' docs/ingest/etl/index.md || true
    else
      echo "File not found: docs/ingest/etl/index.md"
    fi
    echo

    echo "---- Searching repo for anchor labels '(^\\(nifi\\)=)' or '(^\\(apache-nifi\\)=)' ----"
    rg -n --hidden --no-ignore-vcs '^\(nifi\)=|^\(apache-nifi\)=' || true
    echo

    echo "---- Searching for files with 'nifi' in the docs path ----"
    rg -n --hidden --no-ignore-vcs 'nifi' docs || true
    echo

    echo "---- Listing Markdown files with 'nifi' in filename ----"
    fd -e md -HI -g '*nifi*' || true

Length of output: 10013

Add (apache-nifi)= alias to the NiFi page; keep existing (nifi)=

Verified: docs/integrate/nifi/index.md contains (nifi)= at line 1 but there is no (apache-nifi)=. Please add the alias to preserve old inbound links.

Files to update / check:

Edit: docs/integrate/nifi/index.md — add (apache-nifi)= alongside the existing (nifi)= at the top of the file.

References (for awareness): docs/ingest/etl/index.md (lines ~60, 223), docs/integrate/index.md (line ~49).

Suggested top-of-file snippet:
(apache-nifi)=
(nifi)=

🤖 Prompt for AI Agents

In docs/integrate/nifi/index.md around line 1, add the alias "(apache-nifi)=" directly above or alongside the existing "(nifi)=" so the top of the file contains both aliases (apache-nifi)= and (nifi)=; update/save the file and then verify inbound references in docs/ingest/etl/index.md (around lines ~60 and ~223) and docs/integrate/index.md (around line ~49) still resolve correctly.

coderabbitai · 2025-08-14T11:36:06Z

docs/ingest/etl/index.md

+- {ref}`hop`
+
+  Apache Hop aims to be the future of data integration. Visual development enables
+  developers to be more productive than they can be through code.
+


💡 Verification agent

🧩 Analysis chain

Hop: label rename is consistent; ensure alias retained.

Please verify (hop)= exists and consider (apache-hop)= as an alias on the Hop page.

🏁 Script executed:

    #!/bin/bash
    set -euo pipefail

    echo "Searching for literal anchor patterns..."
    rg -nF "(hop)=" || true
    rg -nF "(apache-hop)=" || true
    echo

    echo "Searching for reStructuredText labels (e.g. '.. _hop:')..."
    rg -n --hidden --no-ignore-vcs '\.\.\s*_hop\s*:' -S || true
    rg -n --hidden --no-ignore-vcs '\.\.\s*_apache-hop\s*:' -S || true
    echo

    echo "Searching for 'Hop' mentions under docs/ to locate the Hop page(s)..."
    rg -n --hidden --no-ignore-vcs -S "^\s*(#|=|-){1,6}\s*Hop" docs || true
    rg -n --hidden --no-ignore-vcs -S "Hop" docs | sed -n '1,200p' || true
    echo

    echo "Looking for files with 'hop' in the filename (md/rst/adoc)..."
    fd -HI -e md -e rst -e adoc hop || true

    # Print top of the first matching file that mentions "Hop" to inspect anchors/labels
    first=$(rg -n --hidden --no-ignore-vcs -l -S "Hop" docs | head -n1 || true)
    echo
    echo "First file with a 'Hop' mention: $first"
    if [ -n "$first" ]; then
      echo "----- head of $first -----"
      sed -n '1,200p' "$first" || true
      echo "----- end head -----"
    fi

Length of output: 9417

Hop label present — add (apache-hop)= alias on Hop page

Verified: (hop)= exists in docs/integrate/hop/index.md (line 1). docs/ingest/etl/index.md references Hop via {ref}`hop` (≈lines 78–82).

Files to update:

docs/integrate/hop/index.md — add alias next to the existing label.

docs/ingest/etl/index.md — (no change required; reference is correct)

Suggested change (small diff):

(hop)=
(apache-hop)=

🤖 Prompt for AI Agents

In docs/integrate/hop/index.md around line 1, the existing label "(hop)=" needs an alias so legacy refs like {ref}`apache-hop` keep resolving; add the alias "(apache-hop)=" on the next line or beside the existing label so the file contains both "(hop)=" and "(apache-hop)="; no changes required in docs/ingest/etl/index.md.

coderabbitai · 2025-08-14T11:36:06Z

docs/integrate/dynamodb/index.md

+(dynamodb)=
+# DynamoDB


🛠️ Refactor suggestion

Add aliases for both historical anchors

This page historically exposed (aws-dynamodb) and also appears to have been referenced as (cdc-dynamodb) in CDC-related sections. Add both as aliases to avoid breaking inbound links.

+(aws-dynamodb)=
+(cdc-dynamodb)=
 (dynamodb)=
 # DynamoDB

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-(dynamodb)=
-# DynamoDB
+(aws-dynamodb)=
+(cdc-dynamodb)=
+(dynamodb)=
+# DynamoDB

🧰 Tools

🪛 LanguageTool

[grammar] ~1-~1: There might be a mistake here.
Context: (dynamodb)= # DynamoDB :::{include} /_include/links.m...

(QB_NEW_EN)

🤖 Prompt for AI Agents

In docs/integrate/dynamodb/index.md around lines 1 to 2, the page currently defines the DynamoDB heading but does not declare the historical anchor aliases; add aliases for both historical anchors "(aws-dynamodb)" and "(cdc-dynamodb)" so inbound links don't break. Update the top of the file to include explicit alias anchors for those names (e.g., add HTML anchor tags or front-matter/alias metadata supported by the docs site) pointing to this page, ensuring both aws-dynamodb and cdc-dynamodb resolve here.

coderabbitai · 2025-08-14T11:36:06Z

docs/integrate/hop/index.md

+(hop)=
+# Hop


🛠️ Refactor suggestion

Preserve old anchor via alias (apache-hop)

Add an alias label so existing {ref}`apache-hop` links keep resolving.

+(apache-hop)=
 (hop)=
 # Hop

📝 Committable suggestion

Suggested change

-(hop)=
-# Hop
+(apache-hop)=
+(hop)=
+# Hop

🧰 Tools

🪛 LanguageTool

[grammar] ~1-~1: There might be a mistake here.
Context: (hop)= # Hop ```{div} .float-right [![Apache Hop...

(QB_NEW_EN)

🤖 Prompt for AI Agents

In docs/integrate/hop/index.md around lines 1 to 2, add an alias anchor for the old name so existing {ref}`apache-hop` links continue to resolve; place a new anchor line "(apache-hop)=" immediately above the existing "(hop)=" line (so the top of the file becomes "(apache-hop)=\n(hop)=\n# Hop"), ensuring both anchors precede the heading.

coderabbitai · 2025-08-14T11:36:07Z

docs/integrate/iceberg/index.md

+(iceberg)=
+# Iceberg


🛠️ Refactor suggestion

Alias the former (apache-iceberg) anchor

Add an alias to preserve existing deep links and references.

+(apache-iceberg)=
 (iceberg)=
 # Iceberg

📝 Committable suggestion

Suggested change

-(iceberg)=
-# Iceberg
+(apache-iceberg)=
+(iceberg)=
+# Iceberg

🧰 Tools

🪛 LanguageTool

[grammar] ~1-~1: There might be a mistake here.
Context: (iceberg)= # Iceberg ```{div} .float-right [![Apache...

(QB_NEW_EN)

🤖 Prompt for AI Agents

In docs/integrate/iceberg/index.md around lines 1 to 2, add an alias anchor for the former name so existing deep links to (apache-iceberg) keep working; add a second anchor line for the old id (apache-iceberg)= above the heading (so both (apache-iceberg)= and (iceberg)= appear before “# Iceberg”) to preserve references.

coderabbitai · 2025-08-14T11:36:07Z

docs/integrate/kinesis/index.md

+(kinesis)=
+# Kinesis


🛠️ Refactor suggestion

Add backward-compatible alias for the old slug

Recommend adding (aws-kinesis) as an alias to keep old links working.

+(aws-kinesis)=
 (kinesis)=
 # Kinesis

📝 Committable suggestion

Suggested change

-(kinesis)=
-# Kinesis
+(aws-kinesis)=
+(kinesis)=
+# Kinesis

🧰 Tools

🪛 LanguageTool

[grammar] ~1-~1: There might be a mistake here.
Context: (kinesis)= # Kinesis :::{include} /_include/links.md...

(QB_NEW_EN)

🤖 Prompt for AI Agents

In docs/integrate/kinesis/index.md around lines 1 to 2, add a backward-compatible alias anchor for the old slug by declaring the old anchor (aws-kinesis)= alongside the existing (kinesis)= before the heading so both anchors point to the same "Kinesis" section, ensuring old links using the aws-kinesis slug continue to work.

coderabbitai · 2025-08-14T11:36:07Z

docs/integrate/nifi/index.md

+(nifi)=
+# NiFi


💡 Verification agent

🧩 Analysis chain

Add alias for old anchor to preserve existing inbound links

Renaming the anchor to (nifi) is good. To avoid breaking external bookmarks or internal refs that might still target (apache-nifi), add an alias anchor that points to the same page.

Apply this diff:

+(apache-nifi)=
 (nifi)=
 # NiFi

Run this script to verify no stale refs/paths remain and that new anchors are unique:

🏁 Script executed:

    #!/bin/bash
    set -euo pipefail

    echo "1) Search for old-style {ref} usages:"
    rg -n '\{ref\}`(apache|aws)-[a-z0-9-]+`' -g 'docs/**' || true
    echo

    echo "2) Search for old integrate paths still referenced:"
    rg -n 'integrate/(apache|aws)-' -g 'docs/**' || true
    echo

    echo "3) Search for pages still defining old anchors:"
    rg -n '^\((apache|aws)-[a-z0-9-]+\)=' -g 'docs/**' || true
    echo

    echo "4) Ensure new anchors exist exactly once:"
    for a in nifi kinesis hop dynamodb iceberg; do
      echo -n " - $a: "
      rg -n "^\($a\)=" -g 'docs/**' | wc -l | tr -d ' '
    done
    echo

Length of output: 1446
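
Step 4 of the script (each new anchor should be defined exactly once) can be expressed as a small Python sketch. This is illustrative only: `label_counts` and `not_exactly_once` are hypothetical helpers, page contents are passed in as strings, and labels are assumed to be MyST-style `(label)=` lines.

```python
import re
from collections import Counter

# Matches a MyST label definition such as (nifi)= on its own line.
LABEL_DEF = re.compile(r"^\(([a-z0-9-]+)\)=\s*$", re.MULTILINE)


def label_counts(pages):
    """Count how often each label is defined across the given page texts."""
    counts = Counter()
    for text in pages:
        counts.update(LABEL_DEF.findall(text))
    return counts


def not_exactly_once(expected, pages):
    """Return {label: count} for expected labels not defined exactly once."""
    counts = label_counts(pages)
    # Counter returns 0 for labels that were never seen.
    return {label: counts[label] for label in expected if counts[label] != 1}
```

An empty result means every expected label has a single definition; a count of 0 flags a missing target and a count above 1 flags a duplicate.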

Add alias for old anchor to preserve existing inbound links

Renaming the anchor to (nifi) is fine — add an alias for (apache-nifi) to avoid breaking external/internal links. I ran your verification script: no refs to apache-nifi were found, but there are remaining old-style refs/anchors for aws-lambda that you may want to handle similarly.

Files to change / review:

docs/integrate/nifi/index.md (lines 1-2) — add alias for apache-nifi.

Verification findings:

docs/ingest/etl/index.md:169 and :206 contain {ref}`aws-lambda`

docs/integrate/aws-lambda/index.md:1 defines (aws-lambda)=

Apply this diff:

+(apache-nifi)=
 (nifi)=
 # NiFi

Consider adding alias anchors or updating refs for other old anchors (e.g., aws-lambda) to avoid broken links.

📝 Committable suggestion

Suggested change

-(nifi)=
-# NiFi
+(apache-nifi)=
+(nifi)=
+# NiFi

🧰 Tools

🪛 LanguageTool

[grammar] ~1-~1: There might be a mistake here.
Context: (nifi)= # NiFi ```{div} .float-right [![Apache Ni...

(QB_NEW_EN)

🤖 Prompt for AI Agents

In docs/integrate/nifi/index.md around lines 1-2, add an alias anchor for the old name by inserting a second anchor line for (apache-nifi) alongside the existing (nifi) anchor before the heading so existing refs to apache-nifi continue to work; e.g., add a line with the old anchor immediately above the "# NiFi" heading, and optionally mirror this pattern for other renamed anchors like aws-lambda referenced elsewhere.

amotl · 2025-08-14T11:59:16Z

Assuming build error gets fixed :-)

Assuming anything without any crew? ;]

That's more background about this particular detail. We need to remove a compatibility layer we have been carrying forward up to today due to ~~laziness~~ understaffing.

[RTD] Backward-compatibility code trips documentation builds on downstream projects crate-docs-theme#553 (comment)

amotl commented Aug 13, 2025

coderabbitai bot reviewed Aug 13, 2025

amotl force-pushed the layout-integrate-strip-prefix branch from fd39a22 to 1390024 Compare August 13, 2025 20:06

amotl changed the title ~~Integrate: Simplify directory layout~~ Integrate: Naming things. Simplify directory layout. Aug 13, 2025

amotl added sanding-1200 Fine sanding. refactoring Changing shape or layout, or moving content around. labels Aug 13, 2025

amotl requested review from bmunkholm and kneth August 13, 2025 20:08

coderabbitai bot reviewed Aug 13, 2025

amotl marked this pull request as ready for review August 13, 2025 20:15

coderabbitai bot reviewed Aug 13, 2025

amotl changed the title ~~Integrate: Naming things. Simplify directory layout.~~ Naming things: Simplify directory layout at "Integrate" section Aug 13, 2025

amotl force-pushed the layout-integrate-strip-prefix branch from ea49305 to f6223a3 Compare August 14, 2025 00:38

bmunkholm reviewed Aug 14, 2025

bmunkholm self-requested a review August 14, 2025 10:17

bmunkholm approved these changes Aug 14, 2025

amotl force-pushed the layout-integrate-strip-prefix branch from 1e0a1a0 to 96ab2bb Compare August 14, 2025 11:13

amotl force-pushed the layout-integrate-strip-prefix branch from 96ab2bb to 2426269 Compare August 14, 2025 11:23

amotl added the guidance Matters of layout, shape, and structure. label Aug 14, 2025

coderabbitai bot reviewed Aug 14, 2025

amotl merged commit 796f993 into main Aug 14, 2025
3 checks passed

amotl deleted the layout-integrate-strip-prefix branch August 14, 2025 11:41

coderabbitai bot mentioned this pull request Aug 14, 2025

Naming things: {learn,usage}.md -> tutorial.md #250

Merged

coderabbitai bot mentioned this pull request Aug 14, 2025

Integrate/AMQP: Add entry point page, micro tutorial, and category item #248

Merged

This was referenced Aug 14, 2025

Integrate/InfluxDB: Simplify starter tutorial #255

Merged

Integrate/PostgreSQL: Add section with starter tutorial #257

Merged

This was referenced Aug 22, 2025

Integrate/Kafka: Rework section. Add Python tutorial. #263

Merged

Integrate: Add sections about dlt and ingestr #268

Merged

This was referenced Sep 30, 2025

Debezium: Tutorial about replicating data from MSSQL #289

Merged

Superset: Usage guides #282

Merged
