Airflow: Add tutorials from community forum #279
Conversation
Walkthrough

Adds multiple new integration guides for Apache Airflow and Apache Arrow, introduces hot→cold and delete-based data-retention examples, updates Airflow/Arrow index pages and toctrees, and provides example DAGs, SQL, and Docker configs for orchestration and storage allocation.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant Airflow as Airflow DAG
    participant Catalog as retention_policies
    participant CrateDB as CrateDB
    Note over Airflow,Catalog: Hot→Cold reallocation + eventual deletion
    User->>Airflow: Trigger scheduled DAG
    Airflow->>Catalog: SELECT active policies
    Catalog-->>Airflow: policy rows
    loop per policy
        Airflow->>CrateDB: ALTER TABLE ... PARTITION ... SET ("routing.allocation.require.storage"='cold')
        CrateDB-->>Airflow: async shard relocation started/logs
    end
    alt deletion stage reached
        Airflow->>CrateDB: DELETE FROM table WHERE partition_column = value
        CrateDB-->>Airflow: deletion result/logs
    end
```
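A minimal sketch of the SQL this first flow issues against CrateDB; the table name, partition column, and value below are illustrative placeholders, not names taken from the PR:

```sql
-- Reallocate one partition to nodes tagged as "cold" storage (placeholder names).
ALTER TABLE doc.raw_metrics PARTITION (part_day = '2024-01-01')
  SET ("routing.allocation.require.storage" = 'cold');

-- Later retention stage: drop the partition's data entirely.
DELETE FROM doc.raw_metrics WHERE part_day = '2024-01-01';
```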
```mermaid
sequenceDiagram
    autonumber
    actor User
    participant Airflow as Airflow DAG
    participant CrateDB as CrateDB
    participant S3 as Amazon S3
    Note over Airflow,CrateDB: Daily export to S3 using COPY TO
    User->>Airflow: Schedule run (logical date)
    Airflow->>CrateDB: COPY (SELECT ... WHERE ts in previous day) TO 's3://bucket/prefix/{{ ds }}/...'
    CrateDB->>S3: upload JSON lines
    S3-->>Airflow: write complete
```
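The export step in this second flow comes down to a `COPY ... TO` statement of roughly the following shape; the table, timestamp column, and bucket are assumed for illustration:

```sql
-- Export one day's rows to S3; CrateDB writes JSON lines by default.
-- Credentials can be embedded in the URI, e.g. s3://access_key:secret_key@bucket/path.
COPY doc.raw_metrics
  WHERE ts >= '2024-01-01' AND ts < '2024-01-02'
  TO DIRECTORY 's3://my-bucket/exports/2024-01-01/';
```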
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
@coderabbitai review and please converge to active voice.
@amotl I'll review the PR with a focus on converting the documentation to active voice as requested. This aligns with the learning I have about using active voice instead of first-person or passive voice in documentation reviews.
✅ Actions performed: Review triggered.
Actionable comments posted: 2
🧹 Nitpick comments (17)
docs/integrate/airflow/import-parquet.md (5)
4-10: Tighten intro; use active voice and remove filler

Streamline, avoid first‑person, and clarify the CSV fallback.

```diff
-## Introduction
-Using Airflow to import the NYC Taxi and Limousine dataset in Parquet format.
-
-CrateDB does not support `COPY FROM` for Parquet. It supports CSV and JSON. Therefore, this tutorial uses an alternative approach rather than switching the previous CSV workflow to Parquet.
-
-First and foremost, keep in mind the strategy presented here for importing Parquet files into CrateDB, we have already covered this topic in a previous tutorial using a different approach from the one introduced in this tutorial, so feel free to have a look at the tutorial about {ref}`arrow-import-parquet` and explore with the different possibilities out there.
+## Introduction
+Use Airflow to import the NYC Taxi and Limousine dataset provided in Parquet format.
+
+CrateDB supports `COPY FROM` for CSV and JSON, not Parquet. This tutorial converts Parquet to CSV before loading.
+
+For an alternative Parquet ingestion approach, see {ref}`arrow-import-parquet`.
```
13-18: Remove passive voice; simplify prerequisites and staging explanation

Make instructions direct and consistent.

```diff
-Before getting started, you need to have some knowledge of Airflow and an instance of Airflow already running. Besides that, a CrateDB instance should already be set up before moving on with this tutorial. This SQL is also available in the setup folder in our [GitHub repository](https://github.com/crate/crate-airflow-tutorial).
+Before you start, have Airflow and CrateDB running. The SQL shown below also resides in the setup folder of the [GitHub repository](https://github.com/crate/crate-airflow-tutorial).
-We start by creating the two tables in CrateDB: A temporary staging table (`nyc_taxi.load_trips_staging`) and the final destination table (`nyc_taxi.trips`).
+Create two tables in CrateDB: a temporary staging table (`nyc_taxi.load_trips_staging`) and the final table (`nyc_taxi.trips`).
-In this case, the staging table is a primary insertion point, which was later used to cast data to their final types. For example, the `passenger_count` column is defined as `REAL` in the staging table, while it is defined as `INTEGER` in the `nyc_taxi.trips` table.
+Insert into the staging table first, then cast values into their final types when inserting into `nyc_taxi.trips`. For example, `passenger_count` is `REAL` in staging and `INTEGER` in `nyc_taxi.trips`.
```
77-78: Drop filler phrases

Avoid “Ok!” and conversational tone.

```diff
-Ok! So, once the tools are already set up with the corresponding tables created, we should be good to go.
+With the tools set up and tables created, proceed to the DAG.
```
88-93: Fix list indentation and `parquet-tools` formatting (markdownlint MD007)

Indent sublist items by two spaces and keep `parquet-tools` as code, not a link.

```diff
-* **process_parquet:** afterward, the name is used to download the file to local storage and then transform it from Parquet to CSV using [`parquet-tools`] (Apache Parquet CLI, see [Apache Arrow])
-   * `curl -o "<LOCAL-PARQUET-FILE-PATH>" "<REMOTE-PARQUET-FILE>"`
-   * `parquet-tools csv <LOCAL-PARQUET-FILE-PATH> > <CSV-FILE-PATH>`
-  Both tasks are executed within one Bash Operator.
+* **process_parquet:** afterward, use the name to download the file to local storage and convert it from Parquet to CSV using `parquet-tools` (Apache Parquet CLI; see [Apache Arrow]).
+  * `curl -o "<LOCAL-PARQUET-FILE-PATH>" "<REMOTE-PARQUET-FILE>"`
+  * `parquet-tools csv <LOCAL-PARQUET-FILE-PATH> > <CSV-FILE-PATH>`
+  Both commands run within one `BashOperator`.
```
104-106: Clarify the “two-month delay” explanation

The paragraph is hard to parse. Tighten and use explicit dates.

```diff
-In this case, according to the NYC TLC website “Trip data is published monthly (with two months delay)”. So, the DAG is set up to run monthly, and given the first file was made available in January 2009, the start date was set to March 2009. But why March and not January? As previously mentioned, the files are made available with 2 months of delay, so the first DAG instance, which has a logical execution date equal to "March 2009" will retrieve March as the current month minus 2, corresponding to January 2009, the very first file ever published.
+The NYC TLC publishes trip data monthly with a two‑month delay. Set the DAG to run monthly with a start date of March 2009. The first run (logical date March 2009) downloads the file for January 2009 (logical date minus two months), which is the first available dataset.
```

docs/integrate/airflow/import-stock-market-data.md (6)
4-7: Avoid bare URL and broken image; add proper link and asset
- Convert the YouTube URL to proper markdown.
- The “upload://…” image path won’t resolve in Sphinx. Replace with a public URL or remove.
```diff
-Watch this tutorial on Youtube: https://www.youtube.com/watch?v=YTTUzeaYUgQ&t=685s
-
-
+Watch this tutorial on YouTube: [Automating stock data with Airflow and CrateDB](https://www.youtube.com/watch?v=YTTUzeaYUgQ&t=685s).
+
+<!-- Replace with a valid image URL or remove -->
+<!--  -->
```
13-17: Use active voice; remove narration

Tighten the overview and avoid “you can… you want to…”.

```diff
-You have a goal: regularly update stock market data.
-To achieve your goal, you can divide it into tasks: download, prepare, and store data. You want to turn these tasks into a workflow, run it and observe the results; in other words, you want to orchestrate your workflow, and Airflow is the tool for that.
-
-So the first thing to do is to start CrateDB and set up a table to store your data. Then, to orchestrate the process of regular data updates, you will create an Airflow project and establish the connection to CrateDB. Once you set up your Airflow project, you will write your tasks in Python as an Airflow DAG workflow (more details later). Finally, you will set a schedule for your workflow, and it's done!
+Goal: update stock market data regularly.
+Approach: define tasks to download, prepare, and store data; orchestrate them with Airflow.
+
+Steps: start CrateDB and create a table; create an Airflow project and set the CrateDB connection; implement the DAG in Python; schedule it.
```
24-31: Simplify tone and use active voice

Drop “It’s easy” and conversational phrasing.

```diff
-The first to do is to run CrateDB with Docker. It's easy: once you have Docker Desktop running, copy the Docker command from the CrateDB installation page and run it in your terminal.
+First, run CrateDB with Docker. With Docker Desktop running, copy the command from the CrateDB installation page and run it:
```
32-41: Avoid first-person and clarify PK

Use neutral tone and consistent terms (“S&P 500”).

```diff
-Let’s now create a table to store your financial data. I'm particularly interested in the "adjusted-close" value for the stocks, so I will create a table that stores the date, the stock ticker, and the adjusted-close value. I will set the `closing_date` and `ticker` as primary keys. The final statement looks like this:
+Create a table to store financial data. Focus on the adjusted close value (“adjusted_close”) per ticker per day. Use a composite primary key on (`closing_date`, `ticker`):
```
77-81: Don’t hardcode credentials; avoid “null” as a password

Using `crate:null@...` sets the literal password “null”. Prefer no password (default), environment‑specific secrets, or Docker secrets. Also call out non‑production use.

```diff
-AIRFLOW_CONN_CRATEDB_CONNECTION=postgresql://crate:[email protected]:5433/doc?sslmode=disable
+# For local development only; do not commit real credentials
+AIRFLOW_CONN_CRATEDB_CONNECTION=postgresql://[email protected]:5433/doc?sslmode=disable
```
41-41: Fix markdownlint MD052 (missing reference definition)

A reference label “astronomer” is reported missing. Either add the definition at the bottom or convert to an inline link.
docs/integrate/airflow/data-retention-policy.md (1)
111-144: Keep `parameters` consistently across the full DAG snippet

Mirror the change above in the final DAG code block for accuracy.

```diff
-SQLExecuteQueryOperator.partial(
+SQLExecuteQueryOperator.partial(
     task_id="delete_partition",
     conn_id="cratedb_connection",
-    sql="DELETE FROM {{params.table_fqn}} WHERE {{params.column}} = {{params.value}};",
-).expand(params=get_policies().map(map_policy))
+    sql="DELETE FROM {{ params.table_fqn }} WHERE {{ params.column }} = %(value)s;",
+).expand(parameters=get_policies().map(map_policy))
```
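For orientation, once Jinja renders the identifiers and the driver binds the value, the executed statement looks roughly like this (placeholder names and value):

```sql
-- Identifiers come from Jinja rendering; the literal value is bound by the DB-API driver.
DELETE FROM doc.raw_metrics WHERE part_day = '2024-01-01';
```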
docs/integrate/airflow/index.md (1)

67-73: Card copy fits, minor nit: broaden “Getting started” description

If the Getting Started page isn’t limited to stock data, consider a neutral description.

```diff
-Define an Airflow DAG that downloads, processes, and stores stock market data in CrateDB.
+Define an Airflow DAG that downloads, processes, and stores data in CrateDB.
```

docs/integrate/airflow/getting-started.md (4)
62-71: Fix list style (markdownlint MD004) and keep consistent with dashes

Switch asterisks to dashes.

```diff
-* Create a project directory:
+- Create a project directory:
@@
-* Initialize the project with the following command:
+- Initialize the project with the following command:
@@
-* This will create a skeleton project directory as follows:
+- This creates a skeleton project directory:
```
84-89: Unify bullet style and clarify Triggerer description

Consistent dashes and small grammar tweak.

```diff
-* PostgreSQL server (for configuration/runtime data)
-* Airflow scheduler
-* Web server for rendering Airflow UI
-* Triggerer (running an event loop for deferrable tasks)
+- PostgreSQL server (for configuration and runtime metadata)
+- Airflow scheduler
+- Web server for rendering the Airflow UI
+- Triggerer (event loop for deferrable tasks)
```
89-91: Avoid bare URLs (markdownlint MD034) and keep active voice

Wrap URLs in angle brackets and tighten text.

```diff
-The PostgreSQL server is configured to listen on port 5432. The web server is listening on port 8080 and can be accessed via http://localhost:8080/ with `admin` for both username and password.
+The PostgreSQL server listens on port 5432. The web server listens on port 8080 and is available at <http://localhost:8080/> with `admin`/`admin`.
-Start the project with `astro dev start`. After the containers start, access the Airflow UI at `http://localhost:8081`:
+Start the project with `astro dev start`. After the containers start, access the Airflow UI at <http://localhost:8081>:
```

Also applies to: 101-106
128-141: Credentials guidance: avoid committing secrets; show TLS variant

Slightly tighten phrasing; keep active voice.

```diff
-To configure the connection to CrateDB we need to set up a corresponding environment variable. On Astronomer the environment variable can be set up via the Astronomer UI, via `Dockerfile`, or via a `.env` file which is automatically generated during project initialization.
+To configure the CrateDB connection, set an environment variable. On Astronomer, set it via the UI, `Dockerfile`, or the `.env` file (generated during initialization).
@@
-In case a TLS connection is required, change `sslmode=require`. To confirm that a new variable is applied, first, start the Airflow project and then create a bash session in the scheduler container by running `docker exec -it <scheduler_container_name> /bin/bash`.
+For TLS, set `sslmode=require`. To confirm that the variable is applied, start the project and open a bash session in the scheduler container: `docker exec -it <scheduler_container_name> /bin/bash`.
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- docs/integrate/airflow/data-retention-hot-cold.md (1 hunks)
- docs/integrate/airflow/data-retention-policy.md (1 hunks)
- docs/integrate/airflow/export-s3.md (1 hunks)
- docs/integrate/airflow/getting-started.md (1 hunks)
- docs/integrate/airflow/import-parquet.md (1 hunks)
- docs/integrate/airflow/import-stock-market-data.md (1 hunks)
- docs/integrate/airflow/index.md (3 hunks)
- docs/integrate/arrow/import-parquet.md (1 hunks)
- docs/integrate/arrow/index.md (1 hunks)
- docs/integrate/index.md (1 hunks)
✅ Files skipped from review due to trivial changes (2)
- docs/integrate/index.md
- docs/integrate/arrow/index.md
🚧 Files skipped from review as they are similar to previous changes (3)
- docs/integrate/airflow/export-s3.md
- docs/integrate/arrow/import-parquet.md
- docs/integrate/airflow/data-retention-hot-cold.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-14T11:40:39.584Z
Learnt from: amotl
PR: crate/cratedb-guide#249
File: docs/ingest/etl/index.md:30-31
Timestamp: 2025-08-14T11:40:39.584Z
Learning: In the CrateDB Guide documentation, the project policy is to prune ALL legacy aliases (like `(apache-airflow)=`, `(aws-*)=`) and fix forward if anything breaks, rather than maintaining backward compatibility with dual anchors.
Applied to files:
docs/integrate/airflow/data-retention-policy.md
🪛 markdownlint-cli2 (0.17.2)
docs/integrate/airflow/getting-started.md
62-62: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
66-66: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
70-70: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
84-84: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
85-85: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
86-86: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
87-87: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
89-89: Bare URL used
(MD034, no-bare-urls)
docs/integrate/airflow/import-parquet.md
90-90: Unordered list indentation
Expected: 2; Actual: 3
(MD007, ul-indent)
91-91: Unordered list indentation
Expected: 2; Actual: 3
(MD007, ul-indent)
docs/integrate/airflow/import-stock-market-data.md
41-41: Reference links and images should use a label that is defined
Missing link or image reference definition: "astronomer"
(MD052, reference-links-images)
143-143: Link and image reference definitions should be needed
Unused link or image reference definition: "cratedb and apache airflow"
(MD053, link-image-reference-definitions)
144-144: Link and image reference definitions should be needed
Unused link or image reference definition: "etl pipeline using apache airflow with cratedb (source)"
(MD053, link-image-reference-definitions)
145-145: Link and image reference definitions should be needed
Unused link or image reference definition: "etl with astro and cratedb cloud in 30min - fully up in the cloud"
(MD053, link-image-reference-definitions)
146-146: Link and image reference definitions should be needed
Unused link or image reference definition: "run an etl pipeline with cratedb and data quality checks"
(MD053, link-image-reference-definitions)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build docs
🔇 Additional comments (4)
docs/integrate/airflow/import-parquet.md (1)
47-73: Request validation: generated partition column types

Partitioning by a generated `TIMESTAMP WITH TIME ZONE` column should work on current CrateDB versions, but confirm against the version you target.

Would you like me to run a quick check against your ReadTheDocs preview or a local CrateDB to confirm the DDL executes as written?
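For reference, a generated partition column of the kind discussed here looks roughly as follows; the table and column names are illustrative, not the guide's exact DDL:

```sql
CREATE TABLE nyc_taxi.trips_example (
  pickup_datetime TIMESTAMP WITH TIME ZONE,
  fare_amount REAL,
  -- Month bucket derived from the pickup time, used as the partition key.
  pickup_month TIMESTAMP WITH TIME ZONE
    GENERATED ALWAYS AS DATE_TRUNC('month', pickup_datetime)
) PARTITIONED BY (pickup_month);
```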
docs/integrate/airflow/import-stock-market-data.md (2)
121-130: Pin dependencies thoughtfully; verify current versions

`yfinance==0.1.87` is dated. Recommend updating to a current, tested version and aligning provider versions with your Airflow version.

Would you like me to check the latest compatible versions for Airflow 2.x and propose a minimal, tested set?
189-197: CrateDB supports INSERT ... ON CONFLICT ... DO UPDATE (since CrateDB 3.0.0) — verify runtime version & PK

This syntax was added in CrateDB 3.0.0 and is valid; keep as‑is only if your deployed CrateDB is >= 3.0.0 and (closing_date, ticker) are declared as the table PRIMARY KEY or a unique index; otherwise use an idempotent INSERT+UPDATE pattern.

Applies to: docs/integrate/airflow/import-stock-market-data.md lines 189–197 and 213–217
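For context, an upsert of the shape under discussion, assuming the composite primary key named above (table name and values are illustrative):

```sql
INSERT INTO doc.financial_data (closing_date, ticker, adjusted_close)
VALUES ('2024-01-02', 'AAPL', 185.64)
ON CONFLICT (closing_date, ticker)
DO UPDATE SET adjusted_close = excluded.adjusted_close;
```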
docs/integrate/airflow/index.md (1)
119-129: Good use of See also block and internal refs

Links and references look consistent.
```python
SQLExecuteQueryOperator.partial(
    task_id="delete_partition",
    conn_id="cratedb_connection",
    sql="DELETE FROM {{params.table_fqn}} WHERE {{params.column}} = {{params.value}};",
).expand(params=get_policies().map(map_policy))
```
Prefer DB‑API parameters over Jinja for values (quoting/injection safety)
Use `parameters=` (DB binding) while keeping table and column names via Jinja.
```diff
-SQLExecuteQueryOperator.partial(
-    task_id="delete_partition",
-    conn_id="cratedb_connection",
-    sql="DELETE FROM {{params.table_fqn}} WHERE {{params.column}} = {{params.value}};",
-).expand(params=get_policies().map(map_policy))
+SQLExecuteQueryOperator.partial(
+    task_id="delete_partition",
+    conn_id="cratedb_connection",
+    sql="DELETE FROM {{ params.table_fqn }} WHERE {{ params.column }} = %(value)s;",
+).expand(parameters=get_policies().map(map_policy))
```
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In docs/integrate/airflow/data-retention-policy.md around lines 100 to 105, the
SQL inlines the value into the Jinja template, which risks quoting/injection
issues; change the operator to pass the value via DB-API parameter binding (use
the operator's parameters= argument) while keeping table and column names
produced by Jinja; update the SQL string to use a parameter placeholder
(DB-specific, e.g. %s or ?), and pass the corresponding value(s) via
parameters=get_policies().map(map_policy) or equivalent so the value is bound by
the DB driver instead of being interpolated into the SQL.
Can someone review/validate this suggestion by @coderabbitai?
About
Continue adding integration guides from the community forum.
Preview
Outlook
Further improvements can happen in subsequent iterations. Let us know what you would like to see fixed or improved immediately, or later when needed.
References