Member

@amotl amotl commented Aug 13, 2025

About

The flat-list directory layout at docs/integrate used prefixes for certain product names. To improve navigation (less is more), let's strip the prefixes, so that exploring, navigating, and authoring become less confusing. The sections below describe the slightly negative experience that motivated this change.

Preview

https://cratedb-guide--249.org.readthedocs.build/integrate/ (you don't "see" too much here, only when navigating)

Rationale

Names for Apache and Amazon/AWS products have previously been prefixed with apache- and aws-. While this seemed like a good idea initially, it implicitly added an overlay subgroup to this flat list, which skewed the core idea.

Although the grouping was intended, it turned out to be cumbersome to navigate, both when authoring and when browsing or exploring the list through the directory tree, the URL layout, or other interfaces.

So, let's strip all the prefixes where the product name works better standalone.

Thoughts

Beyond the general feeling of a smell, the following specific conflicts and copy-editing experiences contributed to this change, sorted by increasing gravity:

  • Kafka is not only an Apache product, but also a technology provided by different implementations, both unmanaged and managed. I'd like to mention this in a quick statement on the relevant page, for name-dropping reasons.

  • aws-dynamodb and aws-kinesis had also been named "wrong", because the real product names are "Amazon DynamoDB" and "Amazon Kinesis", while aws-dms and aws-lambda are correct (AWS DMS, AWS Lambda). That would have been another source of conflict.

  • When doing actual work on the new tree yesterday, navigating to the "Kafka" or "Kinesis" slot took me one additional thought step to find them under apache-kafka or aws-kinesis. That left a slightly bad taste, more than once.

coderabbitai bot commented Aug 13, 2025

Walkthrough

Normalized integration slugs and simplified many integration page headings by removing vendor prefixes in the integrations ToC and individual integration index pages; added/promoted numerous top-level integration index pages and made small formatting/navigation edits. All changes are documentation-only.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Integrations ToC: docs/integrate/index.md | Replaced vendor-prefixed toctree entries with simplified top-level slugs (e.g., apache-airflow/index → airflow/index), reordered entries, and added many top-level integration indices. Kept maxdepth: 1. |
| New / promoted integration indices: docs/integrate/airflow/index.md, docs/integrate/...* | Added or promoted many top-level integration index pages and updated references in the main integrate ToC. Examples: airflow, aws-lambda, azure-functions, cluvio, datagrip, dbeaver, dbt, debezium, django, dms, dynamodb, estuary, explo, flink, gradio, grafana, hop, iceberg, influxdb, kafka, kestra, kinesis, langchain, llamaindex, marquez, meltano, metabase, mongodb, mysql, n8n, nifi, node-red, plotly, powerbi, prometheus, pyviz, rill, risingwave, sql-server, streamlit, streamsets, superset, tableau, telegraf. |
| Integration page heading / anchor updates: docs/integrate/airflow/index.md, docs/integrate/dms/index.md, docs/integrate/dynamodb/index.md, docs/integrate/flink/index.md, docs/integrate/hop/index.md, docs/integrate/iceberg/index.md, docs/integrate/kafka/index.md, docs/integrate/kinesis/index.md, docs/integrate/nifi/index.md, docs/integrate/superset/index.md | Simplified H1 headings and front-matter anchors by removing vendor prefixes (e.g., "Apache Flink" → "Flink", (apache-kafka)= → (kafka)=). No other content or includes changed. |
| Minor formatting / navigation edits: docs/integrate/estuary/index.md, docs/integrate/influxdb/index.md, docs/ingest/cdc/index.md, docs/ingest/etl/index.md | Small formatting fixes (removed blank line), removed two InfluxDB toctree entries, and updated internal cross-reference anchors from vendor-prefixed identifiers to simplified slugs (e.g., aws-dynamodb → dynamodb, aws-kinesis → kinesis, aws-dms → dms, and multiple apache- → short names). |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

Suggested labels

sanding-500, cross linking

Suggested reviewers

  • hammerhead
  • karynzv
  • surister

Poem

I hopped through headings, nibbled prefixes small,
Cleared burrows in the ToC, lined up every hall.
New names sprout like clover under morning light,
A tidy rabbit shuffle — docs all set and bright. 🐇

Member Author

Sure, that needs some rewrite rules.

rewrite ^/docs/guide/integrate/apache-(.*)/(.*)$                     /docs/guide/integrate/$1/$2           redirect;
rewrite ^/docs/guide/integrate/aws-(.*)/(.*)$                        /docs/guide/integrate/$1/$2           redirect;
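
To see what these two rules do to concrete paths, the following Python sketch replays them with `re.sub`. It only emulates the regular expressions, not nginx itself, and the sample paths are assumptions chosen for illustration.

```python
import re

# The two patterns from the rewrite rules above, translated to Python.
# nginx's $1/$2 backreferences become \1/\2 here.
RULES = [
    (re.compile(r"^/docs/guide/integrate/apache-(.*)/(.*)$"), r"/docs/guide/integrate/\1/\2"),
    (re.compile(r"^/docs/guide/integrate/aws-(.*)/(.*)$"), r"/docs/guide/integrate/\1/\2"),
]


def rewrite(path: str) -> str:
    """Return the redirect target for `path`, or `path` unchanged if no rule matches."""
    for pattern, replacement in RULES:
        if pattern.match(path):
            return pattern.sub(replacement, path)
    return path


# Hypothetical sample paths, for illustration only.
for sample in [
    "/docs/guide/integrate/apache-kafka/index.html",
    "/docs/guide/integrate/aws-dynamodb/index.html",
    "/docs/guide/integrate/aws-lambda/index.html",
    "/docs/guide/integrate/grafana/index.html",
]:
    print(sample, "->", rewrite(sample))
```

Note that the second pattern also matches `aws-lambda`, which the toctree in this PR still lists under its prefixed slug. If that directory is meant to stay, a rule like this would need an exception for it.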

Member Author

Added to the set of redirects for the current iteration.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/integrate/index.md (2)

16-63: Consider adopting a strictly alphabetical toctree for maintainability.

Keeping entries alphabetically sorted reduces diff noise and makes additions simpler. If intentional ordering is desired for a few items, consider annotating that in a short comment above the list.


28-29: Disambiguate vendor-less product names in page titles.

Using “dms” and “dynamodb” without “AWS” may be ambiguous in isolation. Ensure the target pages’ H1 titles clearly read “AWS Database Migration Service (DMS)” and “Amazon DynamoDB,” to avoid confusion in search and breadcrumbs.

Also applies to: 39-39

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ac6ad86 and fd39a22.

📒 Files selected for processing (1)
  • docs/integrate/index.md (4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.966Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.
🪛 LanguageTool
docs/integrate/index.md

[grammar] ~19-~19: There might be a mistake here.
Context: ...:::{toctree} :maxdepth: 1 airflow/index aws-lambda/index azure-functions/index c...

(QB_NEW_EN)


[grammar] ~28-~28: There might be a mistake here.
Context: ...ex debezium/index django/index dms/index dynamodb/index estuary/index explo/index...

(QB_NEW_EN)


[grammar] ~29-~29: There might be a mistake here.
Context: ...ex django/index dms/index dynamodb/index estuary/index explo/index flink/index gr...

(QB_NEW_EN)


[grammar] ~30-~30: There might be a mistake here.
Context: ...x dms/index dynamodb/index estuary/index explo/index flink/index gradio/index gra...

(QB_NEW_EN)


[grammar] ~31-~31: There might be a mistake here.
Context: ...dynamodb/index estuary/index explo/index flink/index gradio/index grafana/index h...

(QB_NEW_EN)


[grammar] ~32-~32: There might be a mistake here.
Context: ...ex estuary/index explo/index flink/index gradio/index grafana/index hop/index ice...

(QB_NEW_EN)


[grammar] ~33-~33: There might be a mistake here.
Context: ...dex explo/index flink/index gradio/index grafana/index hop/index iceberg/index in...

(QB_NEW_EN)


[grammar] ~34-~34: There might be a mistake here.
Context: ...x flink/index gradio/index grafana/index hop/index iceberg/index influxdb/index k...

(QB_NEW_EN)


[grammar] ~35-~35: There might be a mistake here.
Context: ...dex gradio/index grafana/index hop/index iceberg/index influxdb/index kafka/index...

(QB_NEW_EN)


[grammar] ~36-~36: There might be a mistake here.
Context: ...ex grafana/index hop/index iceberg/index influxdb/index kafka/index kinesis/index...

(QB_NEW_EN)


[grammar] ~37-~37: There might be a mistake here.
Context: ...x hop/index iceberg/index influxdb/index kafka/index kinesis/index kestra/index l...

(QB_NEW_EN)


[grammar] ~38-~38: There might be a mistake here.
Context: ...iceberg/index influxdb/index kafka/index kinesis/index kestra/index langchain/ind...

(QB_NEW_EN)


[grammar] ~39-~39: There might be a mistake here.
Context: ...influxdb/index kafka/index kinesis/index kestra/index langchain/index llamaindex/...

(QB_NEW_EN)


[grammar] ~49-~49: There might be a mistake here.
Context: ...b/index mysql/index n8n/index nifi/index node-red/index plotly/index Power BI <po...

(QB_NEW_EN)


[grammar] ~60-~60: There might be a mistake here.
Context: ...it/index streamsets/index superset/index tableau/index telegraf/index ::: [Crat...

(QB_NEW_EN)

🔇 Additional comments (3)
docs/integrate/index.md (3)

19-19: Good simplification of slugs; aligns with the PR goal.

Removing vendor prefixes (e.g., apache-airflow → airflow) improves readability and URL hygiene. Nice.


32-32: All listed integrations have learn pages — no action required.

I ran the provided script; for each slug (airflow, dms, dynamodb, flink, hop, iceberg, kafka, kinesis, nifi, superset) the script found a learn.(md|rst) or *-learn.(md|rst) file under docs/integrate/.


19-19: Check complete — new integrate targets exist; old vendor-prefixed refs preserved

Ran the supplied script.

  • New index pages verified (missing count: 0): docs/integrate/{airflow,dms,dynamodb,flink,hop,iceberg,kafka,kinesis,nifi,superset}.
  • Old vendor-prefixed slugs are present in the repo but appear to be preserved as anchor labels in the new integrate index files (so internal {ref} links should continue to resolve). Anchor examples:
    • docs/integrate/airflow/index.md — (apache-airflow)=
    • docs/integrate/flink/index.md — (apache-flink)=
    • docs/integrate/kafka/index.md — (apache-kafka)=
    • docs/integrate/nifi/index.md — (apache-nifi)=
    • docs/integrate/iceberg/index.md — (apache-iceberg)=
    • docs/integrate/superset/index.md — (apache-superset)=
    • docs/integrate/dms/index.md — (aws-dms)=
    • docs/integrate/dynamodb/index.md — (aws-dynamodb)=
    • docs/integrate/kinesis/index.md — (aws-kinesis)=
  • Files that reference the old slugs (examples; see full rg output for all lines): docs/ingest/etl/index.md, docs/ingest/cdc/index.md, docs/integrate/marquez/learn.md, docs/integrate/risingwave/index.md, docs/integrate/status.md, docs/_include/links.md.
  • No Sphinx redirection config (rediraffe) found in the repo.

Conclusion / recommendation:

  • Internal cross-references look safe (anchors preserved) — no immediate changes required.
  • If you need to preserve old path-level URLs for external links, add redirects (e.g., sphinxext-rediraffe or redirect stub pages). I can generate a rediraffe mapping or redirect stubs if you want.

Apache and Amazon/AWS products have been prefixed before. However, while
initially planned to be a good idea, it implicitly added an overlay
subgroup to this flat list.

On the one hand intended, it turned out to be cumbersome to navigate,
both when authoring, and when browsing / exploring the list on behalf
of the URL layout, or other interfaces.

So, let's strip all the prefixes, where the product name is feasible to
be better used standalone.
@amotl amotl force-pushed the layout-integrate-strip-prefix branch from fd39a22 to 1390024 Compare August 13, 2025 20:06
@amotl amotl changed the title Integrate: Simplify directory layout Integrate: Naming things. Simplify directory layout. Aug 13, 2025
@amotl amotl added sanding-1200 Fine sanding. refactoring Changing shape or layout, or moving content around. labels Aug 13, 2025
@amotl amotl requested review from bmunkholm and kneth August 13, 2025 20:08
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (1)
docs/integrate/index.md (1)

16-63: Fix broken ToC links in docs/integrate/index.md — referenced index pages are missing

The verification script reports the following referenced targets are missing (no docs//index.md or .rst):

  • docs/airflow/index
  • docs/aws-lambda/index
  • docs/azure-functions/index
  • docs/cluvio/index
  • docs/datagrip/index
  • docs/dbeaver/index
  • docs/dbt/index
  • docs/debezium/index
  • docs/django/index
  • docs/dms/index
  • docs/dynamodb/index
  • docs/estuary/index
  • docs/explo/index
  • docs/flink/index
  • docs/gradio/index
  • docs/grafana/index
  • docs/hop/index
  • docs/iceberg/index
  • docs/influxdb/index
  • docs/kafka/index
  • docs/kinesis/index
  • docs/kestra/index
  • docs/langchain/index
  • docs/llamaindex/index
  • docs/marquez/index
  • docs/meltano/index
  • docs/metabase/index
  • docs/mongodb/index
  • docs/mysql/index
  • docs/n8n/index
  • docs/nifi/index
  • docs/node-red/index
  • docs/plotly/index
  • docs/powerbi/index
  • docs/prometheus/index
  • docs/pyviz/index
  • docs/rill/index
  • docs/risingwave/index
  • docs/sql-server/index
  • docs/streamlit/index
  • docs/streamsets/index
  • docs/superset/index
  • docs/tableau/index
  • docs/telegraf/index

Location to edit: docs/integrate/index.md (around lines 16–63). Action: restore the missing index.md/.rst files under the listed paths or update the toctree entries to the correct, current slugs.
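
The paths reported above lack the `integrate/` segment, which suggests the script resolved the entries against `docs/` rather than against the directory of `docs/integrate/index.md`. A minimal sketch of a check that resolves entries relative to the file containing the toctree, assuming plain `slug/index` entries and `.md` or `.rst` sources; the function name and the temporary tree are hypothetical:

```python
from pathlib import Path
import tempfile


def missing_targets(index_file: Path, entries: list[str]) -> list[str]:
    """Return toctree entries with no matching .md or .rst file.

    Entries are resolved relative to the directory of `index_file`,
    which is how Sphinx treats relative toctree entries.
    """
    base = index_file.parent
    return [
        entry
        for entry in entries
        if not any((base / f"{entry}{suffix}").is_file() for suffix in (".md", ".rst"))
    ]


# Build a tiny throwaway tree to demonstrate the check.
with tempfile.TemporaryDirectory() as tmp:
    integrate = Path(tmp) / "docs" / "integrate"
    for slug in ("airflow", "kafka"):
        (integrate / slug).mkdir(parents=True)
        (integrate / slug / "index.md").write_text(f"# {slug}\n")
    index_file = integrate / "index.md"
    index_file.write_text("")

    # "nifi/index" has no file in the throwaway tree, so it is reported.
    result = missing_targets(index_file, ["airflow/index", "kafka/index", "nifi/index"])
    print(result)
```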

🧹 Nitpick comments (2)
docs/integrate/index.md (2)

38-41: Nit: keep K entries alphabetically ordered (move kestra before kinesis)

Minor consistency tweak to maintain the otherwise alphabetical list.

Apply:

- kinesis/index
- kestra/index
+ kestra/index
+ kinesis/index
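
An ordering slip like this can be caught mechanically. A minimal sketch, assuming the toctree entries have already been extracted into a list of strings; the helper name is hypothetical:

```python
from typing import Optional


def first_unsorted(entries: list[str]) -> Optional[tuple[str, str]]:
    """Return the first adjacent pair that is out of alphabetical order, if any."""
    for previous, current in zip(entries, entries[1:]):
        if previous.lower() > current.lower():
            return previous, current
    return None


entries = ["kafka/index", "kinesis/index", "kestra/index", "langchain/index"]
print(first_unsorted(entries))          # the kinesis/kestra pair
print(first_unsorted(sorted(entries)))  # None
```

Entries that carry an explicit title would need the target extracted first, because the comparison here works on the raw strings.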

19-19: Preserve external links: add redirects for renamed slugs and update internal refs if any

I couldn't confirm automatically: the search for a redirects extension failed because conf.py / pyproject.toml weren't found in the checkout, and a scan for vendor-prefixed integrate paths returned no matches. Please run the checks below locally and paste results so I can re-verify.

Quick checks to run locally

  • Find configured redirect extensions:
    rg -n "(sphinx-reredirects|rediraffe)" -S .
  • Search for old vendor-prefixed slugs across Markdown and reST:
    rg -nS "integrate/(apache-|aws-|amazon-)" -g "**/*.{md,rst}" .
  • Search for specific old names:
    rg -nS "apache-airflow|apache-flink|apache-hop|apache-iceberg|apache-kafka|apache-nifi|apache-superset|amazon-kinesis|amazon-dynamodb|aws-dms" -g "**/*.{md,rst}" .

Checks to make by hand

  • Inspect docs/conf.py or docs/source/conf.py (or pyproject.toml) for an extensions list containing sphinx_reredirects / rediraffe and any redirect mapping config.
  • Scan docs/ (both .md and .rst) for internal links that still reference vendor-prefixed slugs and update them or add redirects.

Suggested redirect mappings (example)

  • docs/integrate/apache-airflow/index -> docs/integrate/airflow/index
  • docs/integrate/apache-flink/index -> docs/integrate/flink/index
  • docs/integrate/apache-hop/index -> docs/integrate/hop/index
  • docs/integrate/apache-iceberg/index -> docs/integrate/iceberg/index
  • docs/integrate/apache-kafka/index -> docs/integrate/kafka/index
  • docs/integrate/apache-nifi/index -> docs/integrate/nifi/index
  • docs/integrate/apache-superset/index -> docs/integrate/superset/index
  • docs/integrate/amazon-kinesis/index -> docs/integrate/kinesis/index
  • docs/integrate/amazon-dynamodb/index -> docs/integrate/dynamodb/index
  • docs/integrate/aws-dms/index -> docs/integrate/dms/index
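
If sphinxext-rediraffe is chosen, a mapping like this can be generated rather than typed by hand. A minimal sketch, assuming a plain-text redirects file with one `old new` pair per line, paths relative to the documentation source root, and `.md` sources. The slug list is taken from the anchor labels quoted earlier in this thread (`apache-` and `aws-` prefixes), not from the repository history, and the helper names are hypothetical:

```python
# Old prefixed slugs, taken from the anchor labels quoted earlier in this thread.
OLD_SLUGS = [
    "apache-airflow", "apache-flink", "apache-iceberg", "apache-kafka",
    "apache-nifi", "apache-superset", "aws-dms", "aws-dynamodb", "aws-kinesis",
]
PREFIXES = ("apache-", "aws-")


def strip_prefix(slug: str) -> str:
    """Drop a known vendor prefix from `slug`, if present."""
    for prefix in PREFIXES:
        if slug.startswith(prefix):
            return slug[len(prefix):]
    return slug


def redirect_lines(slugs: list[str], page: str = "index.md") -> list[str]:
    """Build `old new` lines for a redirects file (the file format is an assumption)."""
    return [f"integrate/{old}/{page} integrate/{strip_prefix(old)}/{page}" for old in slugs]


lines = redirect_lines(OLD_SLUGS)
print("\n".join(lines))
```

Deriving the new slug from the old one keeps the mapping consistent with the directory rename and avoids typing each pair twice.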

Note: I used the stored guidance to search both .md and .rst files for cross-references. Please paste the output of the commands above (or let me re-run if you can provide access to the repo root), and I will update/resolve the comment.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd39a22 and 1390024.

📒 Files selected for processing (11)
  • docs/integrate/airflow/index.md (1 hunks)
  • docs/integrate/dms/index.md (1 hunks)
  • docs/integrate/dynamodb/index.md (1 hunks)
  • docs/integrate/flink/index.md (1 hunks)
  • docs/integrate/hop/index.md (1 hunks)
  • docs/integrate/iceberg/index.md (1 hunks)
  • docs/integrate/index.md (4 hunks)
  • docs/integrate/kafka/index.md (1 hunks)
  • docs/integrate/kinesis/index.md (1 hunks)
  • docs/integrate/nifi/index.md (1 hunks)
  • docs/integrate/superset/index.md (1 hunks)
✅ Files skipped from review due to trivial changes (10)
  • docs/integrate/airflow/index.md
  • docs/integrate/kinesis/index.md
  • docs/integrate/dynamodb/index.md
  • docs/integrate/iceberg/index.md
  • docs/integrate/dms/index.md
  • docs/integrate/flink/index.md
  • docs/integrate/nifi/index.md
  • docs/integrate/hop/index.md
  • docs/integrate/kafka/index.md
  • docs/integrate/superset/index.md
🔇 Additional comments (2)
docs/integrate/index.md (2)

19-19: Simplified slugs improve navigation; ToC changes LGTM

The vendor-prefix removal makes the list flatter and easier to scan. The new entries look consistent with the repository’s integration structure and with the learned pattern of linking main integration anchors via their index pages.

Also applies to: 28-29, 32-32, 35-36, 38-39, 49-49, 60-60


16-63: Ignore LanguageTool “grammar” flags for toctree entries

These lines are Sphinx directive entries, not prose; the warnings are false positives.

@amotl amotl marked this pull request as ready for review August 13, 2025 20:15
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (1)
docs/integrate/index.md (1)

19-63: All toctree targets resolve — one prefixed directory remains (action required)

Summary: I verified docs/integrate/index.md (44 toctree entries) and all targets resolve to index.md/.rst; a repo scan found no other integrate/(apache|aws)-... references. However, a prefixed directory remains:

  • docs/integrate/aws-lambda/ — present (likely stale if aws-* prefixes were supposed to be removed)

Please remove or rename that directory (and update any links) or confirm it is intentionally kept.

🧹 Nitpick comments (2)
docs/integrate/index.md (2)

20-20: Confirm intent to keep “aws-lambda” prefixed (exception) vs. de-prefixing to “lambda”.

Most AWS-related entries (e.g., DMS, Kinesis, DynamoDB) are de-prefixed; “aws-lambda/index” remains. If this is an intentional exception for clarity (generic “Lambda” ambiguity), consider documenting this convention; otherwise, rename to “lambda/index” and update refs.

If you opt to de-prefix now:

-aws-lambda/index
+lambda/index

19-63: Add redirects to preserve existing inbound links and SEO after slug changes.

External links to old apache-/aws- paths will 404 after deploy. Recommend adding redirects (e.g., via sphinxext-rediraffe or site-level redirects) from old slugs to new ones.

Examples:

  • apache-kafka/index -> kafka/index
  • aws-dynamodb/index -> dynamodb/index
  • aws-dms/index -> dms/index

If you share the old->new mapping, I can generate a rediraffe redirects file or Netlify/GitHub Pages redirects for you.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1390024 and ea49305.

📒 Files selected for processing (1)
  • docs/integrate/index.md (4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.966Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.
🪛 LanguageTool
docs/integrate/index.md

[grammar] ~19-~19: There might be a mistake here.
Context: ...:::{toctree} :maxdepth: 1 airflow/index aws-lambda/index azure-functions/index c...

(QB_NEW_EN)


[grammar] ~28-~28: There might be a mistake here.
Context: ...ex debezium/index django/index dms/index dynamodb/index estuary/index explo/index...

(QB_NEW_EN)


[grammar] ~29-~29: There might be a mistake here.
Context: ...ex django/index dms/index dynamodb/index estuary/index explo/index flink/index gr...

(QB_NEW_EN)


[grammar] ~30-~30: There might be a mistake here.
Context: ...x dms/index dynamodb/index estuary/index explo/index flink/index gradio/index gra...

(QB_NEW_EN)


[grammar] ~31-~31: There might be a mistake here.
Context: ...dynamodb/index estuary/index explo/index flink/index gradio/index grafana/index h...

(QB_NEW_EN)


[grammar] ~32-~32: There might be a mistake here.
Context: ...ex estuary/index explo/index flink/index gradio/index grafana/index hop/index ice...

(QB_NEW_EN)


[grammar] ~33-~33: There might be a mistake here.
Context: ...dex explo/index flink/index gradio/index grafana/index hop/index iceberg/index in...

(QB_NEW_EN)


[grammar] ~34-~34: There might be a mistake here.
Context: ...x flink/index gradio/index grafana/index hop/index iceberg/index influxdb/index k...

(QB_NEW_EN)


[grammar] ~35-~35: There might be a mistake here.
Context: ...dex gradio/index grafana/index hop/index iceberg/index influxdb/index kafka/index...

(QB_NEW_EN)


[grammar] ~36-~36: There might be a mistake here.
Context: ...ex grafana/index hop/index iceberg/index influxdb/index kafka/index kestra/index ...

(QB_NEW_EN)


[grammar] ~37-~37: There might be a mistake here.
Context: ...x hop/index iceberg/index influxdb/index kafka/index kestra/index kinesis/index l...

(QB_NEW_EN)


[grammar] ~38-~38: There might be a mistake here.
Context: ...iceberg/index influxdb/index kafka/index kestra/index kinesis/index langchain/ind...

(QB_NEW_EN)


[grammar] ~39-~39: There might be a mistake here.
Context: ... influxdb/index kafka/index kestra/index kinesis/index langchain/index llamaindex...

(QB_NEW_EN)


[grammar] ~40-~40: There might be a mistake here.
Context: ...x kafka/index kestra/index kinesis/index langchain/index llamaindex/index marquez...

(QB_NEW_EN)


[grammar] ~49-~49: There might be a mistake here.
Context: ...b/index mysql/index n8n/index nifi/index node-red/index plotly/index Power BI <po...

(QB_NEW_EN)


[grammar] ~60-~60: There might be a mistake here.
Context: ...it/index streamsets/index superset/index tableau/index telegraf/index ::: [Crat...

(QB_NEW_EN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs
🔇 Additional comments (2)
docs/integrate/index.md (2)

16-63: Restructure aligns with PR objective; toctree looks coherent and alphabetized.

The simplified slugs and flat list meet the stated goal. The entries appear alphabetically ordered and consistent with the new naming scheme.


21-21: Pattern check: keep “-learn” pages separate from anchors (as per doc architecture).

This toctree links only to top-level integration indices, which matches the established pattern where detailed tutorials use the “-learn” suffix in learn.{md|rst}. Looks consistent with the documented approach.

@amotl amotl changed the title Integrate: Naming things. Simplify directory layout. Naming things: Simplify directory layout at "Integrate" section Aug 13, 2025
@amotl amotl force-pushed the layout-integrate-strip-prefix branch from ea49305 to f6223a3 Compare August 14, 2025 00:38
@@ -1,5 +1,5 @@
(apache-kafka)=
Contributor

How about these references (not just here, but all the similar ones)?

Member Author

@amotl amotl Aug 14, 2025

Reference labels are orthogonal to any other structure (physical/folder/files/toc), but haven't been touched here yet. It could be a good idea to also use the canonical labels going forward, thanks!

Member Author

@amotl amotl Aug 14, 2025

2426269 adjusts the reference labels accordingly. Let's do the proper spring cleaning right away instead of deferring it to a future iteration.
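
A label change of this kind touches two things: the MyST target definitions, written `(label)=`, and the {ref} roles pointing at them. A minimal sketch of such a rewrite with regular expressions, assuming only the `apache-` and `aws-` prefixes and roles without an explicit title. It is an illustration, not the procedure used for 2426269, and the function name is hypothetical:

```python
import re

# Matches a MyST target such as "(apache-kafka)=" on its own line.
TARGET = re.compile(r"^\((apache|aws)-([a-z0-9-]+)\)=$", re.MULTILINE)
# Matches a role such as "{ref}`apache-kafka`" (no explicit title).
ROLE = re.compile(r"\{ref\}`(apache|aws)-([a-z0-9-]+)`")


def strip_label_prefixes(text: str) -> str:
    """Rewrite prefixed targets and {ref} roles to their short form."""
    text = TARGET.sub(r"(\2)=", text)
    return ROLE.sub(r"{ref}`\2`", text)


sample = "(apache-kafka)=\n# Kafka\n\nSee also {ref}`aws-kinesis`.\n"
print(strip_label_prefixes(sample))
```

As written, the sketch would also shorten `aws-lambda`, so any label meant to keep its prefix would have to be excluded.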

@bmunkholm bmunkholm self-requested a review August 14, 2025 10:17
Contributor

@bmunkholm bmunkholm left a comment

Assuming build error gets fixed :-)

Member Author

amotl commented Aug 14, 2025

Assuming anything without any crew? ;]

@amotl amotl force-pushed the layout-integrate-strip-prefix branch from 1e0a1a0 to 96ab2bb Compare August 14, 2025 11:13
Let's also use canonical, non-prefixed variants for the reference
labels. This patch aggressively prunes previous nomenclature to create
less confusion for future authors. If anything breaks on this, it is
now the right time to fix forward.
@amotl amotl force-pushed the layout-integrate-strip-prefix branch from 96ab2bb to 2426269 Compare August 14, 2025 11:23
@amotl amotl added the guidance Matters of layout, shape, and structure. label Aug 14, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 9

🔭 Outside diff range comments (2)
docs/ingest/etl/index.md (2)

205-228: Fix legacy prefixed refs (aws-lambda) and verify anchors

Alphabetical order looks fine, but there are lingering old-style refs that will break anchors. Update the refs to the canonical labels (or add matching label definitions) and consider generating this list from a single source to avoid future drift.

Files/locations to fix:

  • docs/ingest/etl/index.md:169 — - {ref}`aws-lambda`
  • docs/ingest/etl/index.md:206 — - {ref}`aws-lambda`
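
The idea of generating the list from a single source could look roughly like the following shell sketch. It assumes, hypothetically, that every integration lives at `<root>/<name>/index.md` and declares its canonical label as `(label)=` on the first line, as the pages touched in this PR do. The function name `generate_ref_list` and the demo directory names are made up for illustration and are not part of the repository.

```shell
set -eu

# Hypothetical helper: emit one "- {ref}`label`" bullet per integration page,
# taking the label from a "(label)=" first line and sorting alphabetically.
generate_ref_list() {
  root="$1"
  for page in "$root"/*/index.md; do
    [ -f "$page" ] || continue
    # Pages whose first line is not a label definition are skipped silently.
    head -n 1 "$page" | sed -nE 's/^\(([a-z0-9-]+)\)=$/- {ref}`\1`/p'
  done | LC_ALL=C sort
}

# Demo on a throwaway tree; the directory names are examples only.
demo=$(mktemp -d)
mkdir -p "$demo/kafka" "$demo/airflow" "$demo/dms"
printf '(kafka)=\n# Kafka\n' > "$demo/kafka/index.md"
printf '(airflow)=\n# Airflow\n' > "$demo/airflow/index.md"
printf '(dms)=\n# DMS\n' > "$demo/dms/index.md"
generate_ref_list "$demo"
rm -rf "$demo"
```

On the demo tree this prints the three bullets for `airflow`, `dms`, and `kafka` in that order. Reading the label from the page itself, rather than from the directory name, keeps the list correct even if a directory and its label ever diverge.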

135-147: Use the generic "Kafka" label in ETL docs and confirm reference targets

Replace the "Apache Kafka" phrasing in the ETL docs to match the unprefixed {ref}`kafka` usage.

  • Files to update:
    • docs/ingest/etl/index.md — lines ~135–147: change the paragraph starting "Apache Kafka is…" to use "Kafka".
    • docs/integrate/kafka/index.md — already defines the (kafka)= target (line 1), so {ref}`kafka` resolves. No (apache-kafka)= alias or {ref}`apache-kafka` usages were found; add the alias only if you need to preserve legacy references.

Apply this diff in docs/ingest/etl/index.md:

-  Apache Kafka is an open-source distributed event streaming platform
-  for high-performance data pipelines, streaming analytics, data integration,
-  and mission-critical applications.
+  Kafka is an open-source distributed event streaming platform for
+  high-performance data pipelines, streaming analytics, data integration,
+  and mission-critical applications.

Optional (only if you have legacy {ref}`apache-kafka` links): add (apache-kafka)= directly below the existing (kafka)= in docs/integrate/kafka/index.md.

🧹 Nitpick comments (2)
docs/integrate/dbt/index.md (1)

33-33: Minor grammar/style: add comma after “for example” and simplify link phrasing.

Improves readability and resolves the LanguageTool hint. Also consider explicit link text for proper casing inline.

-dbt projects run, for example with {ref}`debezium` or with {ref}`airflow`.
+dbt projects run, for example, with {ref}`Debezium <debezium>` or {ref}`Airflow <airflow>`.

Optional: If your docs follow American English, consider “Afterward” instead of “Afterwards” in the next sentence.

docs/ingest/etl/index.md (1)

100-105: Naming decision: Confirm dropping the “aws-” prefix for DMS is intentional.

The PR description suggests aws-dms was considered a correct use of the prefix, yet this change references {ref}`dms`. If the rename is intended, ensure (dms)= exists and add (aws-dms)= as an alias on the DMS page to preserve existing links.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled
  • Linear integration is disabled

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 96ab2bb and 2426269.

📒 Files selected for processing (15)
  • docs/ingest/cdc/index.md (2 hunks)
  • docs/ingest/etl/index.md (6 hunks)
  • docs/integrate/airflow/index.md (1 hunks)
  • docs/integrate/dbt/index.md (1 hunks)
  • docs/integrate/dms/index.md (1 hunks)
  • docs/integrate/dynamodb/index.md (2 hunks)
  • docs/integrate/estuary/index.md (0 hunks)
  • docs/integrate/flink/index.md (1 hunks)
  • docs/integrate/hop/index.md (1 hunks)
  • docs/integrate/iceberg/index.md (1 hunks)
  • docs/integrate/influxdb/index.md (0 hunks)
  • docs/integrate/kafka/index.md (1 hunks)
  • docs/integrate/kinesis/index.md (2 hunks)
  • docs/integrate/nifi/index.md (1 hunks)
  • docs/integrate/superset/index.md (1 hunks)
💤 Files with no reviewable changes (2)
  • docs/integrate/estuary/index.md
  • docs/integrate/influxdb/index.md
🚧 Files skipped from review as they are similar to previous changes (6)
  • docs/integrate/kafka/index.md
  • docs/integrate/flink/index.md
  • docs/integrate/dms/index.md
  • docs/integrate/airflow/index.md
  • docs/ingest/cdc/index.md
  • docs/integrate/superset/index.md
🧰 Additional context used
🪛 LanguageTool
docs/ingest/etl/index.md

[grammar] ~71-~71: There might be a mistake here.
Context: ...clusive;2em Low-code / No-code / Visual - {ref}estuary` Estuary provides real-...

(QB_NEW_EN)


[grammar] ~205-~205: There might be a mistake here.
Context: ...habetically sorted ::: :::{div} - {ref}airflow - {ref}aws-lambda - {ref}`azure-function...

(QB_NEW_EN)


[grammar] ~206-~206: There might be a mistake here.
Context: ...d ::: :::{div} - {ref}airflow - {ref}aws-lambda - {ref}azure-functions - {ref}dbt - {r...

(QB_NEW_EN)


[grammar] ~207-~207: There might be a mistake here.
Context: ...ef}airflow - {ref}aws-lambda - {ref}azure-functions - {ref}dbt - {ref}dms - {ref}`dynamodb...

(QB_NEW_EN)


[grammar] ~208-~208: There might be a mistake here.
Context: ...lambda - {ref}azure-functions - {ref}dbt - {ref}dms - {ref}dynamodb - {ref}est...

(QB_NEW_EN)


[grammar] ~209-~209: There might be a mistake here.
Context: ...f}azure-functions - {ref}dbt - {ref}dms - {ref}dynamodb - {ref}estuary - {ref}...

(QB_NEW_EN)


[grammar] ~210-~210: There might be a mistake here.
Context: ...tions - {ref}dbt - {ref}dms - {ref}dynamodb - {ref}estuary - {ref}flink - {ref}ho...

(QB_NEW_EN)


[grammar] ~211-~211: There might be a mistake here.
Context: ... - {ref}dms - {ref}dynamodb - {ref}estuary - {ref}flink - {ref}hop - {ref}iceber...

(QB_NEW_EN)


[grammar] ~212-~212: There might be a mistake here.
Context: ...{ref}dynamodb - {ref}estuary - {ref}flink - {ref}hop - {ref}iceberg - {ref}`infl...

(QB_NEW_EN)


[grammar] ~213-~213: There might be a mistake here.
Context: ... - {ref}estuary - {ref}flink - {ref}hop - {ref}iceberg - {ref}influxdb - {ref}...

(QB_NEW_EN)


[grammar] ~214-~214: There might be a mistake here.
Context: ...ary - {ref}flink - {ref}hop - {ref}iceberg - {ref}influxdb - {ref}kafka - {ref}k...

(QB_NEW_EN)


[grammar] ~215-~215: There might be a mistake here.
Context: ...k - {ref}hop - {ref}iceberg - {ref}influxdb - {ref}kafka - {ref}kestra - {ref}kin...

(QB_NEW_EN)


[grammar] ~216-~216: There might be a mistake here.
Context: ...{ref}iceberg - {ref}influxdb - {ref}kafka - {ref}kestra - {ref}kinesis - {ref}`m...

(QB_NEW_EN)


[grammar] ~217-~217: There might be a mistake here.
Context: ...- {ref}influxdb - {ref}kafka - {ref}kestra - {ref}kinesis - {ref}meltano - {ref}`...

(QB_NEW_EN)


[grammar] ~218-~218: There might be a mistake here.
Context: ... - {ref}kafka - {ref}kestra - {ref}kinesis - {ref}meltano - {ref}mongodb - {ref}...

(QB_NEW_EN)


[grammar] ~219-~219: There might be a mistake here.
Context: ...- {ref}kestra - {ref}kinesis - {ref}meltano - {ref}mongodb - {ref}mysql - {ref}`n8...

(QB_NEW_EN)


[grammar] ~220-~220: There might be a mistake here.
Context: ... {ref}kinesis - {ref}meltano - {ref}mongodb - {ref}mysql - {ref}n8n - {ref}nifi ...

(QB_NEW_EN)


[grammar] ~221-~221: There might be a mistake here.
Context: ... {ref}meltano - {ref}mongodb - {ref}mysql - {ref}n8n - {ref}nifi - {ref}`node-re...

(QB_NEW_EN)


[grammar] ~222-~222: There might be a mistake here.
Context: ... - {ref}mongodb - {ref}mysql - {ref}n8n - {ref}nifi - {ref}node-red - {ref}`ri...

(QB_NEW_EN)


[grammar] ~223-~223: There might be a mistake here.
Context: ...odb - {ref}mysql - {ref}n8n - {ref}nifi - {ref}node-red - {ref}risingwave` - {r...

(QB_NEW_EN)

docs/integrate/dbt/index.md

[grammar] ~33-~33: There might be a mistake here.
Context: ... {ref}debezium or with {ref}airflow. Afterwards, data analysts can run their ...

(QB_NEW_EN)


[locale-violation] ~33-~33: In American English, ‘afterward’ is the preferred variant. ‘Afterwards’ is more commonly used in British English and other dialects.
Context: ...{ref}debezium or with {ref}airflow. Afterwards, data analysts can run their dbt projec...

(AFTERWARDS_US)

docs/integrate/dynamodb/index.md

[grammar] ~1-~1: There might be a mistake here.
Context: (dynamodb)= # DynamoDB :::{include} /_include/links.m...

(QB_NEW_EN)

docs/integrate/hop/index.md

[grammar] ~1-~1: There might be a mistake here.
Context: (hop)= # Hop ```{div} .float-right [![Apache Hop...

(QB_NEW_EN)

docs/integrate/iceberg/index.md

[grammar] ~1-~1: There might be a mistake here.
Context: (iceberg)= # Iceberg ```{div} .float-right [![Apache...

(QB_NEW_EN)

docs/integrate/kinesis/index.md

[grammar] ~1-~1: There might be a mistake here.
Context: (kinesis)= # Kinesis :::{include} /_include/links.md...

(QB_NEW_EN)


[grammar] ~39-~39: There might be a mistake here.
Context: ...o 3 3 3 :::{rubric} Related ::: - {ref}dynamodb - [Amazon DynamoDB Streams] :::: ::::: :...

(QB_NEW_EN)

docs/integrate/nifi/index.md

[grammar] ~1-~1: There might be a mistake here.
Context: (nifi)= # NiFi ```{div} .float-right [![Apache Ni...

(QB_NEW_EN)

🔇 Additional comments (6)
docs/integrate/dbt/index.md (2)

33-33: Good move: Using Sphinx cross-references aligns with the new integration structure.

Switching to {ref} keeps links resilient to path changes and matches the PR goal of normalizing integration slugs/pages.


33-33: No action needed — labels for airflow and debezium exist

Labels found:

  • docs/integrate/airflow/index.md:1 — (airflow)=
  • docs/integrate/debezium/index.md:1 — (debezium)=

Refs found:

  • docs/integrate/dbt/index.md:33 — contains {ref}`debezium` and {ref}`airflow`
  • docs/ingest/etl/index.md:30,205 — contains {ref}`airflow`

Conclusion: the {ref} links are valid; no changes required.

docs/integrate/kinesis/index.md (1)

39-41: Cross-ref looks correct; keep consistency across the suite

The switch to {ref}`dynamodb` matches the new slugging. No changes needed here.

docs/integrate/dynamodb/index.md (1)

39-41: Related links now use the short slug; LGTM

The change to {ref}`kinesis` aligns with the new naming scheme.

docs/ingest/etl/index.md (2)

106-109: DynamoDB: unprefixed label is fine; ensure target label exists and reflect official name in text.

Please verify (dynamodb)= exists. The description already uses the official product name (“DynamoDB … by Amazon Web Services”), which is consistent.


189-195: Iceberg label change looks correct.

Unprefixed {ref}`iceberg` aligns with the new scheme. Ensure (iceberg)= exists on the target page and consider adding (apache-iceberg)= as an alias.

Comment on lines +30 to 31
- {ref}`airflow`

@coderabbitai coderabbitai bot Aug 14, 2025

💡 Verification agent

🧩 Analysis chain

Unprefixed {ref}`airflow` looks good; please confirm target label exists and keep a legacy alias.

This page now references {ref}`airflow`. Verify that the target page defines (airflow)= and, to avoid breaking incoming links, also defines an alias for the old label (e.g., (apache-airflow)=) on the same page.

Run this script to check label targets and spot any leftover old-style refs:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Extract all ref labels used in this file
labels=$(rg -No '\{ref}`([a-z0-9\-]+)`' docs/ingest/etl/index.md | sed 's/^{ref}`//; s/`$//' | sort -u)

echo "Checking label definitions for:"
printf ' - %s\n' $labels
echo

missing=0
for lbl in $labels; do
  if ! rg -n "^\($lbl\)="; then
    echo "MISSING label: ($lbl)="
    missing=1
  fi
done

echo
echo "Scanning for legacy prefixed refs still used elsewhere:"
rg -n '\{ref}`(apache-|aws-)[^`]+`' || true

exit $missing

Length of output: 514


🏁 Script executed:

#!/bin/bash
set -euo pipefail

file="docs/ingest/etl/index.md"

echo "Listing all {ref}\`...\` occurrences in $file (line:number -> matched):"
rg -n -oP '\{ref\}`([^`]+)`' "$file" || true
echo

echo "Extracting unique labels from those refs:"
labels=$(rg -n -oP '\{ref\}`([^`]+)`' "$file" | sed -E 's/^[0-9]+://' | sed -E 's/\{ref\}`(.*)`/\1/' | sort -u || true)
printf ' - %s\n' $labels
echo

missing=0
echo "Checking for anchor definitions for each label across the docs/ tree:"
for lbl in $labels; do
  if ! rg -n -F "($lbl)=" docs; then
    echo "MISSING label definition: ($lbl)="
    missing=1
  fi
done
echo

echo "Scanning repo for legacy-prefixed refs (apache- or aws-):"
rg -n '\{ref\}`(apache-|aws-)[^`]+`' docs || true
echo

exit $missing

Length of output: 5025


Add legacy anchor alias (apache-airflow) to the airflow page

Confirmed: {ref}`airflow` points to docs/integrate/airflow/index.md which defines (airflow)=, but a legacy alias (apache-airflow)= is not present; please add it to avoid breaking external links.

Files to check/update:

  • docs/ingest/etl/index.md — references {ref}`airflow` at lines 30 and 205 (no change needed here).
  • docs/integrate/airflow/index.md — add legacy alias on the same page (anchor currently at line 1).

Suggested change (add the alias next to the existing anchor):
docs/integrate/airflow/index.md
@@
(airflow)=
+(apache-airflow)=

🤖 Prompt for AI Agents
In docs/integrate/airflow/index.md around line 1, the page defines the anchor
(airflow)= but is missing the legacy alias (apache-airflow)= which external
links rely on; add a second anchor (apache-airflow)= immediately alongside the
existing (airflow)= so both aliases are available; no changes needed to
docs/ingest/etl/index.md (it references {ref}`airflow` at lines 30 and 205).

Member Author

@amotl amotl Aug 14, 2025

@coderabbitai: We decided to prune ALL legacy aliases previously kept for compatibility reasons, but fix forward instead if anything breaks. Please resolve all relevant comments.
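
Fixing forward presupposes a way to notice what broke. As a rough sketch (not the project's tooling), the following shell function lists every label used via the plain {ref}`label` form that has no matching `(label)=` definition under a docs tree. The function name `find_undefined_refs` and the demo file layout are hypothetical; the {ref}`Text <label>` form and labels defined outside Markdown files are not covered.

```shell
set -eu

# Hypothetical helper: print labels referenced as {ref}`label` under a tree
# that lack a "(label)=" definition line anywhere in that tree.
find_undefined_refs() {
  root="$1"
  work=$(mktemp -d)
  # Collect referenced labels. "|| true" keeps going if pipefail is enabled
  # in the calling shell and grep finds nothing.
  grep -rhoE --include='*.md' '\{ref\}`[a-z0-9-]+`' "$root" \
    | sed -E 's/^\{ref\}`//; s/`$//' | LC_ALL=C sort -u > "$work/used" || true
  # Collect defined labels from lines of the form "(label)=".
  grep -rhoE --include='*.md' '^\([a-z0-9-]+\)=' "$root" \
    | sed -E 's/^\(//; s/\)=$//' | LC_ALL=C sort -u > "$work/defined" || true
  # comm -23 keeps entries that occur only in the first (used) list.
  LC_ALL=C comm -23 "$work/used" "$work/defined"
  rm -rf "$work"
}

# Demo on a throwaway tree: one page defines (kafka)=, another page still
# references a pruned legacy label.
demo=$(mktemp -d)
mkdir -p "$demo/integrate/kafka" "$demo/ingest"
printf '(kafka)=\n# Kafka\n' > "$demo/integrate/kafka/index.md"
printf '%s\n' '- {ref}`kafka`' '- {ref}`apache-kafka`' > "$demo/ingest/index.md"
find_undefined_refs "$demo"
rm -rf "$demo"
```

On the demo tree the only line printed is `apache-kafka`, the reference left dangling once its alias is gone.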

@amotl, understood! I'll note this policy change for future reviews. The "fix forward" approach makes sense for maintaining cleaner documentation structure.


✏️ Learnings added
Learnt from: amotl
PR: crate/cratedb-guide#249
File: docs/ingest/etl/index.md:30-31
Timestamp: 2025-08-14T11:40:39.557Z
Learning: In the CrateDB Guide documentation, the project policy is to prune ALL legacy aliases (like `(apache-airflow)=`, `(aws-*)=`) and fix forward if anything breaks, rather than maintaining backward compatibility with dual anchors.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


⛔ Learnings removed
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.066Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.

🧠 Learnings used
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.066Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.

Comment on lines +42 to +46
- {ref}`flink`

Apache Flink is a programming framework and distributed processing engine for
stateful computations over unbounded and bounded data streams, written in Java.

💡 Verification agent

🧩 Analysis chain

Flink: label rename is consistent; ensure alias retained to prevent broken links.

Confirm (flink)= exists on the target page and consider adding (apache-flink)= there as a compatibility alias.

If needed, I can propose a PR-wide alias list for all renamed integration labels.


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Looking for anchor labels and references for 'flink'..."

echo
echo "1) Literal anchor occurrences '(flink)=':"
rg -n -F "(flink)=" || true

echo
echo "2) Literal anchor occurrences '(apache-flink)=':"
rg -n -F "(apache-flink)=" || true

echo
echo "3) Inline refs using backticks (e.g. ref\`flink\`):"
rg -n "ref\`flink\`" || true
rg -n "ref\`apache-flink\`" || true

echo
echo "4) All occurrences of the word 'flink' under docs/:"
rg -n "flink" docs || true

echo
echo "5) Show docs/ingest/etl/index.md (first 200 lines) to inspect the snippet referenced in the review:"
sed -n '1,200p' docs/ingest/etl/index.md || true

Length of output: 11616


Action: add compatibility alias (apache-flink)= to the Flink page

Confirmed (flink)= is present at docs/integrate/flink/index.md:1; there is no (apache-flink)= alias in the repo — add it to avoid breaking external links.

Files to update:

  • docs/integrate/flink/index.md — add alias next to the existing anchor.
  • (no changes required) docs/ingest/etl/index.md — it already references {ref}`flink` (lines 42 and 212).

Suggested tiny diff:

docs/integrate/flink/index.md
@@
-(flink)=
+(flink)=
+(apache-flink)=
🤖 Prompt for AI Agents
In docs/integrate/flink/index.md around line 1, add the compatibility alias
(apache-flink)= next to the existing (flink)= anchor so both aliases are defined
at the top of the file; no changes needed in docs/ingest/etl/index.md since it
already references {ref}`flink`. Ensure the new alias uses the same syntax and
is placed on the same anchor block as (flink)= to avoid breaking external links.

Comment on lines +60 to +65
- {ref}`nifi`

Apache NiFi is a dataflow system based on the concepts of flow-based programming.
It supports powerful and scalable directed graphs of data routing, transformation,
and system mediation logic.

💡 Verification agent

🧩 Analysis chain

NiFi: label rename is consistent; ensure alias retained.

Same note: ensure (nifi)= exists and add (apache-nifi)= on the NiFi page to keep old inbound links working.


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- Showing docs/ingest/etl/index.md (first 200 lines) ----"
if [ -f docs/ingest/etl/index.md ]; then
  sed -n '1,200p' docs/ingest/etl/index.md || true
else
  echo "File not found: docs/ingest/etl/index.md"
fi

echo
echo "---- Searching repo for anchor labels '(^\\(nifi\\)=)' or '(^\\(apache-nifi\\)=)' ----"
rg -n --hidden --no-ignore-vcs '^\(nifi\)=|^\(apache-nifi\)=' || true

echo
echo "---- Searching for files with 'nifi' in the docs path ----"
rg -n --hidden --no-ignore-vcs 'nifi' docs || true

echo
echo "---- Listing Markdown files with 'nifi' in filename ----"
fd -e md -HI -g '*nifi*' || true

Length of output: 10013


Add (apache-nifi)= alias to the NiFi page; keep existing (nifi)=

Verified: docs/integrate/nifi/index.md contains (nifi)= at line 1 but there is no (apache-nifi)=. Please add the alias to preserve old inbound links.

Files to update / check:

  • Edit: docs/integrate/nifi/index.md — add (apache-nifi)= alongside the existing (nifi)= at the top of the file.
  • References (for awareness): docs/ingest/etl/index.md (lines ~60, 223), docs/integrate/index.md (line ~49).

Suggested top-of-file snippet:
(apache-nifi)=
(nifi)=

🤖 Prompt for AI Agents
In docs/integrate/nifi/index.md around line 1, add the alias "(apache-nifi)="
directly above or alongside the existing "(nifi)=" so the top of the file
contains both aliases (apache-nifi)= and (nifi)=; update/save the file and then
verify inbound references in docs/ingest/etl/index.md (around lines ~60 and
~223) and docs/integrate/index.md (around line ~49) still resolve correctly.

Comment on lines +78 to +82
- {ref}`hop`

Apache Hop aims to be the future of data integration. Visual development enables
developers to be more productive than they can be through code.

💡 Verification agent

🧩 Analysis chain

Hop: label rename is consistent; ensure alias retained.

Please verify (hop)= exists and consider (apache-hop)= as an alias on the Hop page.


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Searching for literal anchor patterns..."
rg -nF "(hop)=" || true
rg -nF "(apache-hop)=" || true

echo
echo "Searching for reStructuredText labels (e.g. '.. _hop:')..."
rg -n --hidden --no-ignore-vcs '\.\.\s*_hop\s*:' -S || true
rg -n --hidden --no-ignore-vcs '\.\.\s*_apache-hop\s*:' -S || true

echo
echo "Searching for 'Hop' mentions under docs/ to locate the Hop page(s)..."
rg -n --hidden --no-ignore-vcs -S "^\s*(#|=|-){1,6}\s*Hop" docs || true
rg -n --hidden --no-ignore-vcs -S "Hop" docs | sed -n '1,200p' || true

echo
echo "Looking for files with 'hop' in the filename (md/rst/adoc)..."
fd -HI -e md -e rst -e adoc hop || true

# Print top of the first matching file that mentions "Hop" to inspect anchors/labels
first=$(rg -n --hidden --no-ignore-vcs -l -S "Hop" docs | head -n1 || true)
echo
echo "First file with a 'Hop' mention: $first"
if [ -n "$first" ]; then
  echo "----- head of $first -----"
  sed -n '1,200p' "$first" || true
  echo "----- end head -----"
fi

Length of output: 9417


Hop label present — add (apache-hop)= alias on Hop page

Verified: (hop)= exists in docs/integrate/hop/index.md (line 1). docs/ingest/etl/index.md references Hop via {ref}`hop` (≈lines 78–82).

Files to update:

  • docs/integrate/hop/index.md — add alias next to the existing label.
  • docs/ingest/etl/index.md — (no change required; reference is correct)

Suggested change (small diff):

(hop)=
(apache-hop)=
🤖 Prompt for AI Agents
In docs/integrate/hop/index.md around line 1, the existing label "(hop)=" needs
an alias so legacy refs like {ref}`apache-hop` keep resolving; add the alias
"(apache-hop)=" on the next line or beside the existing label so the file
contains both "(hop)=" and "(apache-hop)="; no changes required in
docs/ingest/etl/index.md.

Comment on lines +1 to +2
(dynamodb)=
# DynamoDB

🛠️ Refactor suggestion

Add aliases for both historical anchors

This page historically exposed (aws-dynamodb) and also appears to have been referenced as (cdc-dynamodb) in CDC-related sections. Add both as aliases to avoid breaking inbound links.

+ (aws-dynamodb)=
+ (cdc-dynamodb)=
 (dynamodb)=
 # DynamoDB
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-(dynamodb)=
-# DynamoDB
+(aws-dynamodb)=
+(cdc-dynamodb)=
+(dynamodb)=
+# DynamoDB
🧰 Tools
🪛 LanguageTool

[grammar] ~1-~1: There might be a mistake here.
Context: (dynamodb)= # DynamoDB :::{include} /_include/links.m...

(QB_NEW_EN)

🤖 Prompt for AI Agents
In docs/integrate/dynamodb/index.md around lines 1 to 2, the page currently
defines the DynamoDB heading but does not declare the historical anchor aliases;
add aliases for both historical anchors "(aws-dynamodb)" and "(cdc-dynamodb)" so
inbound links don't break. Update the top of the file to include explicit alias
anchors for those names (e.g., add HTML anchor tags or front-matter/alias
metadata supported by the docs site) pointing to this page, ensuring both
aws-dynamodb and cdc-dynamodb resolve here.

Comment on lines +1 to +2
(hop)=
# Hop

🛠️ Refactor suggestion

Preserve old anchor via alias (apache-hop)

Add an alias label so existing {ref}`apache-hop` links keep resolving.

+ (apache-hop)=
 (hop)=
 # Hop
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-(hop)=
-# Hop
+(apache-hop)=
+(hop)=
+# Hop
🧰 Tools
🪛 LanguageTool

[grammar] ~1-~1: There might be a mistake here.
Context: (hop)= # Hop ```{div} .float-right [![Apache Hop...

(QB_NEW_EN)

🤖 Prompt for AI Agents
In docs/integrate/hop/index.md around lines 1 to 2, add an alias anchor for the
old name so existing {ref}`apache-hop` links continue to resolve; place a new
anchor line "(apache-hop)=" immediately above the existing "(hop)=" line (so the
top of the file becomes "(apache-hop)=\n(hop)=\n# Hop"), ensuring both anchors
precede the heading.

Comment on lines +1 to +2
(iceberg)=
# Iceberg

🛠️ Refactor suggestion

Alias the former (apache-iceberg) anchor

Add an alias to preserve existing deep links and references.

+ (apache-iceberg)=
 (iceberg)=
 # Iceberg
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-(iceberg)=
-# Iceberg
+(apache-iceberg)=
+(iceberg)=
+# Iceberg
🧰 Tools
🪛 LanguageTool

[grammar] ~1-~1: There might be a mistake here.
Context: (iceberg)= # Iceberg ```{div} .float-right [![Apache...

(QB_NEW_EN)

🤖 Prompt for AI Agents
In docs/integrate/iceberg/index.md around lines 1 to 2, add an alias anchor for
the former name so existing deep links to (apache-iceberg) keep working; add a
second anchor line for the old id (apache-iceberg)= above the heading (so both
(apache-iceberg)= and (iceberg)= appear before “# Iceberg”) to preserve
references.

Comment on lines +1 to +2
(kinesis)=
# Kinesis

🛠️ Refactor suggestion

Add backward-compatible alias for the old slug

Recommend adding (aws-kinesis) as an alias to keep old links working.

+ (aws-kinesis)=
 (kinesis)=
 # Kinesis
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-(kinesis)=
-# Kinesis
+(aws-kinesis)=
+(kinesis)=
+# Kinesis
🧰 Tools
🪛 LanguageTool

[grammar] ~1-~1: There might be a mistake here.
Context: (kinesis)= # Kinesis :::{include} /_include/links.md...

(QB_NEW_EN)

🤖 Prompt for AI Agents
In docs/integrate/kinesis/index.md around lines 1 to 2, add a
backward-compatible alias anchor for the old slug by declaring the old anchor
(aws-kinesis)= alongside the existing (kinesis)= before the heading so both
anchors point to the same "Kinesis" section, ensuring old links using the
aws-kinesis slug continue to work.

Comment on lines +1 to +2
(nifi)=
# NiFi

💡 Verification agent

🧩 Analysis chain

Add alias for old anchor to preserve existing inbound links

Renaming the anchor to (nifi) is good. To avoid breaking external bookmarks or internal refs that might still target (apache-nifi), add an alias anchor that points to the same page.

Apply this diff:

+ (apache-nifi)=
 (nifi)=
 # NiFi

Run this script to verify no stale refs/paths remain and that new anchors are unique:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "1) Search for old-style {ref} usages:"
rg -n '\{ref\}`(apache|aws)-[a-z0-9-]+`' -g 'docs/**' || true

echo
echo "2) Search for old integrate paths still referenced:"
rg -n 'integrate/(apache|aws)-' -g 'docs/**' || true

echo
echo "3) Search for pages still defining old anchors:"
rg -n '^\((apache|aws)-[a-z0-9-]+\)=' -g 'docs/**' || true

echo
echo "4) Ensure new anchors exist exactly once:"
for a in nifi kinesis hop dynamodb iceberg; do
  echo -n " - $a: "
  rg -n "^\($a\)=" -g 'docs/**' | wc -l | tr -d ' '
done
echo

Length of output: 1446


Add alias for old anchor to preserve existing inbound links

Renaming the anchor to (nifi) is fine — add an alias for (apache-nifi) to avoid breaking external/internal links. I ran your verification script: no refs to apache-nifi were found, but there are remaining old-style refs/anchors for aws-lambda that you may want to handle similarly.

Files to change / review:

  • docs/integrate/nifi/index.md (lines 1-2) — add alias for apache-nifi.
  • Verification findings:
    • docs/ingest/etl/index.md:169 and :206 contain {ref}`aws-lambda`
    • docs/integrate/aws-lambda/index.md:1 defines (aws-lambda)=

Apply this diff:

+ (apache-nifi)=
 (nifi)=
 # NiFi

Consider adding alias anchors or updating refs for other old anchors (e.g., aws-lambda) to avoid broken links.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-(nifi)=
-# NiFi
+(apache-nifi)=
+(nifi)=
+# NiFi
🧰 Tools
🪛 LanguageTool

[grammar] ~1-~1: There might be a mistake here.
Context: (nifi)= # NiFi ```{div} .float-right [![Apache Ni...

(QB_NEW_EN)

🤖 Prompt for AI Agents
In docs/integrate/nifi/index.md around lines 1-2, add an alias anchor for the
old name by inserting a second anchor line for (apache-nifi) alongside the
existing (nifi) anchor before the heading so existing refs to apache-nifi
continue to work; e.g., add a line with the old anchor immediately above the "#
NiFi" heading, and optionally mirror this pattern for other renamed anchors like
aws-lambda referenced elsewhere.

@amotl amotl merged commit 796f993 into main Aug 14, 2025
3 checks passed
@amotl amotl deleted the layout-integrate-strip-prefix branch August 14, 2025 11:41
Member Author

amotl commented Aug 14, 2025

Assuming build error gets fixed :-)

Assuming anything without any crew? ;]

That's more background about this particular detail. We need to remove a compatibility layer we have been carrying forward up to today due to ~~laziness~~ understaffing.

Labels

  • guidance: Matters of layout, shape, and structure.
  • refactoring: Changing shape or layout, or moving content around.
  • sanding-1200: Fine sanding.
