Naming things: Simplify directory layout at "Integrate" section #249
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -27,29 +27,23 @@ outlines how to use them effectively. Additionally, see support for {ref}`cdc` s | |
|
||
|
||
::::{grid-item-card} {material-outlined}`air;2em` Dataflow / Pipeline / Code-first | ||
- {ref}`apache-airflow` | ||
- {ref}`airflow` | ||
|
||
Apache Airflow is an open-source software platform to programmatically author, | ||
schedule, and monitor workflows. Pipelines are defined in Python, allowing for | ||
dynamic pipeline generation and on-demand, code-driven pipeline invocation. | ||
|
||
- {ref}`apache-flink` | ||
|
||
Apache Flink is a programming framework and distributed processing engine for | ||
stateful computations over unbounded and bounded data streams, written in Java. | ||
|
||
- {ref}`apache-nifi` | ||
|
||
Apache NiFi is a dataflow system based on the concepts of flow-based programming. | ||
It supports powerful and scalable directed graphs of data routing, transformation, | ||
and system mediation logic. | ||
|
||
- {ref}`dbt` | ||
|
||
dbt is an SQL-first platform for transforming data in data warehouses using | ||
Python and SQL. The data abstraction layer provided by dbt-core allows the | ||
decoupling of the models on which reports and dashboards rely from the source data. | ||
|
||
- {ref}`flink` | ||
|
||
Apache Flink is a programming framework and distributed processing engine for | ||
stateful computations over unbounded and bounded data streams, written in Java. | ||
|
||
Comment on lines +42 to +46
💡 Verification agent
🧩 Analysis chain
Flink: label rename is consistent; ensure alias retained to prevent broken links.
Confirm
If needed, I can propose a PR-wide alias list for all renamed integration labels.
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Looking for anchor labels and references for 'flink'..."
echo
echo "1) Literal anchor occurrences '(flink)=':"
rg -n -F "(flink)=" || true
echo
echo "2) Literal anchor occurrences '(apache-flink)=':"
rg -n -F "(apache-flink)=" || true
echo
echo "3) Inline refs using backticks (e.g. ref`flink`):"
rg -n "ref\`flink\`" || true
rg -n "ref\`apache-flink\`" || true
echo
echo "4) All occurrences of the word 'flink' under docs/:"
rg -n "flink" docs || true
echo
echo "5) Show docs/ingest/etl/index.md (first 200 lines) to inspect the snippet referenced in the review:"
sed -n '1,200p' docs/ingest/etl/index.md || true
Length of output: 11616
Action: add compatibility alias (apache-flink)= to the Flink page
Confirmed (flink)= is present at docs/integrate/flink/index.md:1; there is no (apache-flink)= alias in the repo. Add it to avoid breaking external links.
Files to update:
Suggested tiny diff:
|
||
- {ref}`kestra` | ||
|
||
Kestra is an open-source workflow automation and orchestration toolkit with a rich | ||
|
@@ -63,23 +57,29 @@ outlines how to use them effectively. Additionally, see support for {ref}`cdc` s | |
the Singer specification. Singer is a composable open-source ETL framework and | ||
specification, including powerful data extraction and consolidation elements. | ||
|
||
- {ref}`nifi` | ||
|
||
Apache NiFi is a dataflow system based on the concepts of flow-based programming. | ||
It supports powerful and scalable directed graphs of data routing, transformation, | ||
and system mediation logic. | ||
|
||
Comment on lines +60 to +65
💡 Verification agent
🧩 Analysis chain
NiFi: label rename is consistent; ensure alias retained.
Same note: ensure
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "---- Showing docs/ingest/etl/index.md (first 200 lines) ----"
if [ -f docs/ingest/etl/index.md ]; then
sed -n '1,200p' docs/ingest/etl/index.md || true
else
echo "File not found: docs/ingest/etl/index.md"
fi
echo
echo "---- Searching repo for anchor labels '(^\\(nifi\\)=)' or '(^\\(apache-nifi\\)=)' ----"
rg -n --hidden --no-ignore-vcs '^\(nifi\)=|^\(apache-nifi\)=' || true
echo
echo "---- Searching for files with 'nifi' in the docs path ----"
rg -n --hidden --no-ignore-vcs 'nifi' docs || true
echo
echo "---- Listing Markdown files with 'nifi' in filename ----"
fd -e md -HI -g '*nifi*' || true
Length of output: 10013
Add (apache-nifi)= alias to the NiFi page; keep existing (nifi)=
Verified: docs/integrate/nifi/index.md contains (nifi)= at line 1 but there is no (apache-nifi)=. Please add the alias to preserve old inbound links.
Files to update / check:
Suggested top-of-file snippet:
|
||
+++ | ||
Use data pipeline programming frameworks and platforms. | ||
:::: | ||
|
||
|
||
::::{grid-item-card} {material-outlined}`all_inclusive;2em` Low-code / No-code / Visual | ||
- {ref}`apache-hop` | ||
|
||
Apache Hop aims to be the future of data integration. Visual development enables | ||
developers to be more productive than they can be through code. | ||
|
||
- {ref}`estuary` | ||
|
||
Estuary provides real-time data integration and modern ETL and ELT data pipelines | ||
as a fully managed solution. Estuary Flow is a real-time, reliable change data | ||
capture (CDC) solution. | ||
|
||
- {ref}`hop` | ||
|
||
Apache Hop aims to be the future of data integration. Visual development enables | ||
developers to be more productive than they can be through code. | ||
|
||
Comment on lines +78 to +82
💡 Verification agent
🧩 Analysis chain
Hop: label rename is consistent; ensure alias retained.
Please verify
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Searching for literal anchor patterns..."
rg -nF "(hop)=" || true
rg -nF "(apache-hop)=" || true
echo
echo "Searching for reStructuredText labels (e.g. '.. _hop:')..."
rg -n --hidden --no-ignore-vcs '\.\.\s*_hop\s*:' -S || true
rg -n --hidden --no-ignore-vcs '\.\.\s*_apache-hop\s*:' -S || true
echo
echo "Searching for 'Hop' mentions under docs/ to locate the Hop page(s)..."
rg -n --hidden --no-ignore-vcs -S "^\s*(#|=|-){1,6}\s*Hop" docs || true
rg -n --hidden --no-ignore-vcs -S "Hop" docs | sed -n '1,200p' || true
echo
echo "Looking for files with 'hop' in the filename (md/rst/adoc)..."
fd -HI -e md -e rst -e adoc hop || true
# Print top of the first matching file that mentions "Hop" to inspect anchors/labels
first=$(rg -n --hidden --no-ignore-vcs -l -S "Hop" docs | head -n1 || true)
echo
echo "First file with a 'Hop' mention: $first"
if [ -n "$first" ]; then
echo "----- head of $first -----"
sed -n '1,200p' "$first" || true
echo "----- end head -----"
fi
Length of output: 9417
Hop label present; add (apache-hop)= alias on Hop page
Verified: (hop)= exists in docs/integrate/hop/index.md (line 1). docs/ingest/etl/index.md references Hop via {ref}
Files to update:
Suggested change (small diff):
|
||
- {ref}`n8n` | ||
|
||
n8n is a workflow automation tool that helps you to connect any app with an API with | ||
|
@@ -97,13 +97,13 @@ Use visual data flow and integration frameworks and platforms. | |
|
||
|
||
::::{grid-item-card} {material-outlined}`storage;2em` Databases | ||
- {ref}`aws-dms` | ||
- {ref}`dms` | ||
|
||
AWS DMS is a managed migration and replication service that helps move your | ||
database and analytics workloads between different kinds of databases quickly, | ||
securely, and with minimal downtime and zero data loss. | ||
|
||
- {ref}`aws-dynamodb` | ||
- {ref}`dynamodb` | ||
|
||
DynamoDB is a fully managed NoSQL database service provided by Amazon Web Services (AWS). | ||
|
||
|
@@ -132,13 +132,13 @@ Load data from database systems. | |
|
||
|
||
::::{grid-item-card} {material-outlined}`fast_forward;2em` Streams | ||
- {ref}`apache-kafka` | ||
- {ref}`kafka` | ||
|
||
Apache Kafka is an open-source distributed event streaming platform | ||
for high-performance data pipelines, streaming analytics, data integration, | ||
and mission-critical applications. | ||
|
||
- {ref}`aws-kinesis` | ||
- {ref}`kinesis` | ||
|
||
Amazon Kinesis Data Streams is a serverless streaming data service that simplifies | ||
the capture, processing, and storage of data streams at any scale, such as | ||
|
@@ -186,7 +186,7 @@ Use serverless compute units for custom import tasks. | |
|
||
::::{grid-item-card} {material-outlined}`dataset;2em` Datasets | ||
|
||
- {ref}`apache-iceberg` | ||
- {ref}`iceberg` | ||
|
||
Apache Iceberg is an open table format for analytic datasets. | ||
|
||
|
@@ -202,25 +202,25 @@ Load data from datasets and open table formats. | |
::: | ||
|
||
:::{div} | ||
- {ref}`apache-airflow` | ||
- {ref}`apache-flink` | ||
- {ref}`apache-hop` | ||
- {ref}`apache-iceberg` | ||
- {ref}`apache-kafka` | ||
- {ref}`apache-nifi` | ||
- {ref}`aws-dynamodb` | ||
- {ref}`aws-kinesis` | ||
- {ref}`aws-dms` | ||
- {ref}`airflow` | ||
- {ref}`aws-lambda` | ||
- {ref}`azure-functions` | ||
- {ref}`dbt` | ||
- {ref}`dms` | ||
- {ref}`dynamodb` | ||
- {ref}`estuary` | ||
- {ref}`flink` | ||
- {ref}`hop` | ||
- {ref}`iceberg` | ||
- {ref}`influxdb` | ||
- {ref}`kafka` | ||
- {ref}`kestra` | ||
- {ref}`kinesis` | ||
- {ref}`meltano` | ||
- {ref}`mongodb` | ||
- {ref}`mysql` | ||
- {ref}`n8n` | ||
- {ref}`nifi` | ||
- {ref}`node-red` | ||
- {ref}`risingwave` | ||
- {ref}`sql-server` | ||
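The label renames in this hunk follow a fixed old-to-new mapping (`apache-*` and `aws-*` prefixes dropped for the listed integrations, while `aws-lambda` and `azure-functions` stay as they are). A minimal sketch of applying that mapping mechanically, assuming references use the plain {ref}`label` form; the function name `rewrite_refs` is hypothetical and not part of this PR:

```shell
#!/bin/sh
# Sketch: rewrite old-style {ref} targets to the unprefixed labels used in this PR.
# Reads Markdown on stdin, writes the rewritten text to stdout.
# Only the labels renamed in the diff are listed, so aws-lambda is left untouched.
rewrite_refs() {
    sed -E \
        -e 's/\{ref\}`apache-(airflow|flink|hop|iceberg|kafka|nifi)`/{ref}`\1`/g' \
        -e 's/\{ref\}`aws-(dms|dynamodb|kinesis)`/{ref}`\1`/g'
}
```

For example, `rewrite_refs < docs/ingest/etl/index.md` would print the rewritten index without modifying the file. An explicit list is used instead of a blanket prefix strip so that labels the PR keeps are not changed by accident.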
|
Original file line number | Diff line number | Diff line change | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -1,6 +1,5 @@ | ||||||||||||||
(aws-dynamodb)= | ||||||||||||||
(cdc-dynamodb)= | ||||||||||||||
# Amazon DynamoDB | ||||||||||||||
(dynamodb)= | ||||||||||||||
# DynamoDB | ||||||||||||||
Comment on lines +1 to +2
🛠️ Refactor suggestion
Add aliases for both historical anchors
This page historically exposed (aws-dynamodb) and also appears to have been referenced as (cdc-dynamodb) in CDC-related sections. Add both as aliases to avoid breaking inbound links.
+ (aws-dynamodb)=
+ (cdc-dynamodb)=
(dynamodb)=
# DynamoDB
🧰 Tools
🪛 LanguageTool
[grammar] ~1-~1: There might be a mistake here. (QB_NEW_EN)
|
||||||||||||||
|
||||||||||||||
:::{include} /_include/links.md | ||||||||||||||
::: | ||||||||||||||
|
@@ -38,7 +37,7 @@ servers or infrastructure. | |||||||||||||
:::{rubric} Related | ||||||||||||||
::: | ||||||||||||||
- [Amazon DynamoDB Streams] | ||||||||||||||
- {ref}`aws-kinesis` | ||||||||||||||
- {ref}`kinesis` | ||||||||||||||
- [Amazon Kinesis Data Streams] | ||||||||||||||
:::: | ||||||||||||||
|
||||||||||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,4 @@ | ||
(estuary)= | ||
|
||
# Estuary | ||
|
||
```{div} .float-right | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,5 @@ | ||
(apache-flink)= | ||
(flink)= | ||
# Apache Flink | ||
# Flink | ||
|
||
:::{include} /_include/links.md | ||
::: | ||
|
Original file line number | Diff line number | Diff line change | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -1,5 +1,5 @@ | ||||||||||||
(apache-hop)= | ||||||||||||
# Apache Hop | ||||||||||||
(hop)= | ||||||||||||
# Hop | ||||||||||||
Comment on lines +1 to +2
🛠️ Refactor suggestion
Preserve old anchor via alias (apache-hop)
Add an alias label so existing {ref}
+ (apache-hop)=
(hop)=
# Hop
🧰 Tools
🪛 LanguageTool
[grammar] ~1-~1: There might be a mistake here. (QB_NEW_EN)
|
||||||||||||
|
||||||||||||
```{div} .float-right | ||||||||||||
[{height=60px loading=lazy}][Apache Hop] | ||||||||||||
|
Original file line number | Diff line number | Diff line change | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -1,5 +1,5 @@ | ||||||||||||
(apache-iceberg)= | ||||||||||||
# Apache Iceberg | ||||||||||||
(iceberg)= | ||||||||||||
# Iceberg | ||||||||||||
Comment on lines +1 to +2
🛠️ Refactor suggestion
Alias the former (apache-iceberg) anchor
Add an alias to preserve existing deep links and references.
+ (apache-iceberg)=
(iceberg)=
# Iceberg
🧰 Tools
🪛 LanguageTool
[grammar] ~1-~1: There might be a mistake here. (QB_NEW_EN)
|
||||||||||||
|
||||||||||||
```{div} .float-right | ||||||||||||
[{height=60px loading=lazy}][Apache Iceberg] | ||||||||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,4 @@ | ||
(influxdb)= | ||
(integrate-influxdb)= | ||
(integrate-influxdb-quickstart)= | ||
# InfluxDB | ||
|
||
:::{include} /_include/links.md | ||
|
Original file line number | Diff line number | Diff line change | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -1,5 +1,5 @@ | ||||||||||||
(aws-kinesis)= | ||||||||||||
# Amazon Kinesis | ||||||||||||
(kinesis)= | ||||||||||||
# Kinesis | ||||||||||||
Comment on lines +1 to +2
🛠️ Refactor suggestion
Add backward-compatible alias for the old slug
Recommend adding (aws-kinesis) as an alias to keep old links working.
+ (aws-kinesis)=
(kinesis)=
# Kinesis
🧰 Tools
🪛 LanguageTool
[grammar] ~1-~1: There might be a mistake here. (QB_NEW_EN)
|
||||||||||||
|
||||||||||||
:::{include} /_include/links.md | ||||||||||||
::: | ||||||||||||
|
@@ -36,7 +36,7 @@ records. | |||||||||||
|
||||||||||||
:::{rubric} Related | ||||||||||||
::: | ||||||||||||
- {ref}`aws-dynamodb` | ||||||||||||
- {ref}`dynamodb` | ||||||||||||
- [Amazon DynamoDB Streams] | ||||||||||||
:::: | ||||||||||||
|
||||||||||||
|
Original file line number | Diff line number | Diff line change | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -1,5 +1,5 @@ | ||||||||||||
(apache-nifi)= | ||||||||||||
# Apache NiFi | ||||||||||||
(nifi)= | ||||||||||||
# NiFi | ||||||||||||
Comment on lines +1 to +2
💡 Verification agent
🧩 Analysis chain
Add alias for old anchor to preserve existing inbound links
Renaming the anchor to (nifi) is good. To avoid breaking external bookmarks or internal refs that might still target (apache-nifi), add an alias anchor that points to the same page.
Apply this diff:
+ (apache-nifi)=
(nifi)=
# NiFi
Run this script to verify no stale refs/paths remain and that new anchors are unique:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "1) Search for old-style {ref} usages:"
rg -n '\{ref\}`(apache|aws)-[a-z0-9-]+`' -g 'docs/**' || true
echo
echo "2) Search for old integrate paths still referenced:"
rg -n 'integrate/(apache|aws)-' -g 'docs/**' || true
echo
echo "3) Search for pages still defining old anchors:"
rg -n '^\((apache|aws)-[a-z0-9-]+\)=' -g 'docs/**' || true
echo
echo "4) Ensure new anchors exist exactly once:"
for a in nifi kinesis hop dynamodb iceberg; do
echo -n " - $a: "
rg -n "^\($a\)=" -g 'docs/**' | wc -l | tr -d ' '
done
echo
Length of output: 1446
Add alias for old anchor to preserve existing inbound links
Renaming the anchor to (nifi) is fine; add an alias for (apache-nifi) to avoid breaking external/internal links. I ran your verification script: no refs to apache-nifi were found, but there are remaining old-style refs/anchors for aws-lambda that you may want to handle similarly.
Files to change / review:
Apply this diff:
+ (apache-nifi)=
(nifi)=
# NiFi
Consider adding alias anchors or updating refs for other old anchors (e.g., aws-lambda) to avoid broken links.
🧰 Tools
🪛 LanguageTool
[grammar] ~1-~1: There might be a mistake here. (QB_NEW_EN)
|
||||||||||||
|
||||||||||||
```{div} .float-right | ||||||||||||
[{height=60px loading=lazy}][Apache NiFi] | ||||||||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,8 +1,6 @@ | ||
(apache-superset)= | ||
(preset)= | ||
(superset)= | ||
|
||
# Apache Superset / Preset | ||
(preset)= | ||
# Superset / Preset | ||
|
||
```{div} .float-right .text-right | ||
[{height=60px loading=lazy}][Apache Superset] | ||
|
💡 Verification agent
🧩 Analysis chain
Unprefixed {ref}`airflow` looks good; please confirm target label exists and keep a legacy alias.
This page now references {ref}`airflow`. Verify that the target page defines `(airflow)=` and, to avoid breaking incoming links, also defines an alias for the old label (e.g., `(apache-airflow)=`) on the same page.
Run this script to check label targets and spot any leftover old-style refs:
🏁 Script executed:
Length of output: 514
🏁 Script executed:
Length of output: 5025
Add legacy anchor alias (apache-airflow) to the airflow page
Confirmed: {ref}`airflow` points to docs/integrate/airflow/index.md which defines (airflow)=, but a legacy alias (apache-airflow)= is not present. Please add it to avoid breaking external links.
Files to check/update:
airflow at lines 30 and 205 (no change needed here).
Suggested change (add the alias next to the existing anchor):
docs/integrate/airflow/index.md
@@
(airflow)=
+(apache-airflow)=
@coderabbitai: We decided to prune ALL legacy aliases previously kept for compatibility reasons, but fix forward instead if anything breaks. Please resolve all relevant comments.
@amotl, understood! I'll note this policy change for future reviews. The "fix forward" approach makes sense for maintaining cleaner documentation structure.
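With legacy aliases pruned, a "fix forward" workflow depends on spotting references whose target label no longer exists. A minimal sketch of such a check, assuming MyST Markdown sources where labels are written as `(label)=` at the start of a line and references use the plain {ref}`label` form (references with an explicit title are not covered); the function name `find_dangling_refs` is hypothetical and not part of this repository:

```shell
#!/bin/sh
# Sketch: print every {ref}`label` target that has no "(label)=" definition
# anywhere under the given docs directory. Requires a grep that supports
# -r, -o and --include (GNU or BSD grep).
find_dangling_refs() {
    docs_dir="$1"
    # Labels defined as "(label)=" at the start of a line.
    defined=$(grep -rhoE --include='*.md' '^\([A-Za-z0-9_-]+\)=' "$docs_dir" \
        | sed -E 's/^\((.*)\)=$/\1/' | sort -u)
    # Labels used as {ref}`label`.
    used=$(grep -rhoE --include='*.md' '\{ref\}`[A-Za-z0-9_-]+`' "$docs_dir" \
        | sed -E 's/^\{ref\}`(.*)`$/\1/' | sort -u)
    # Report labels that are used but never defined.
    for label in $used; do
        printf '%s\n' "$defined" | grep -qxF -- "$label" || printf '%s\n' "$label"
    done
}
```

Running `find_dangling_refs docs` prints one label per line and prints nothing when every reference resolves. A Sphinx build reports unresolved references as well; this sketch is only a quick pre-build check.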