Commit

Document deployment of Edge Worker on Windows (#45403)
* Document deployment of Edge Worker on Windows

* Spelling

* Review
jscheffl authored Jan 5, 2025
1 parent 8059a57 commit 03659e4
Showing 8 changed files with 540 additions and 11 deletions.
11 changes: 11 additions & 0 deletions dev/breeze/src/airflow_breeze/params/shell_params.py
@@ -512,6 +512,17 @@ def env_variables_for_docker_commands(self) -> dict[str, str]:
_env, "AIRFLOW__CORE__EXECUTOR", "airflow.providers.edge.executors.edge_executor.EdgeExecutor"
)
_set_var(_env, "AIRFLOW__EDGE__API_ENABLED", "true")

# For testing the Edge Worker on Windows: the default Run ID contains a colon (":") taken from the time, which
# is rendered into the log path template and is not a valid character in Windows paths. So we replace it with a dash.
_set_var(
_env,
"AIRFLOW__LOGGING__LOG_FILENAME_TEMPLATE",
"dag_id={{ ti.dag_id }}/run_id={{ ti.run_id|replace(':', '-') }}/task_id={{ ti.task_id }}/"
"{% if ti.map_index >= 0 %}map_index={{ ti.map_index }}/{% endif %}"
"attempt={{ try_number|default(ti.try_number) }}.log",
)

# Dev Airflow 3 runs API on FastAPI transitional
port = 9091
if self.use_airflow_version and self.use_airflow_version.startswith("2."):
16 changes: 8 additions & 8 deletions docs/apache-airflow-providers-edge/edge_executor.rst
@@ -20,19 +20,19 @@ Edge Executor

.. note::

The Edge Provider Package is an experimental preview. Features and stability is limited
and needs to be improved over time. Target is to have full support in Airflow 3.
Once Airflow 3 support contains Edge Provider, maintenance of the Airflow 2 package will
be dis-continued.
The Edge Provider Package is an experimental preview. Features and stability are limited,
and need to be improved over time. The target is to achieve full support in Airflow 3.
Once the Edge Provider is fully supported in Airflow 3, maintenance of the Airflow 2 package will
be discontinued.


.. note::

As of Airflow 2.10.3, the ``edge`` provider package is not included in normal release cycle.
Thus you can not directly install it via: ``pip install 'apache-airflow[edge]'`` as the dependency
can not be downloaded.
As of Airflow 2.10.4, the ``edge`` provider package is not included in the normal release cycle.
Thus, it cannot be directly installed using: ``pip install 'apache-airflow[edge]'`` as the dependency
cannot be downloaded.

While it is in not-ready state, a wheel release package must be manually built from source tree
While it is in a not-ready state, a wheel release package must be manually built from source tree
via ``breeze release-management prepare-provider-packages --include-not-ready-providers edge``
and then installed via pip or uv from the generated wheel file, like:
``pip install apache_airflow_providers_edge-<version>-py3-none-any.whl``.
1 change: 1 addition & 0 deletions docs/apache-airflow-providers-edge/index.rst
@@ -28,6 +28,7 @@
Home <self>
Changelog <changelog>
Security <security>
Installation on Windows <install_on_windows>


.. toctree::
94 changes: 94 additions & 0 deletions docs/apache-airflow-providers-edge/install_on_windows.rst
@@ -0,0 +1,94 @@
.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
.. http://www.apache.org/licenses/LICENSE-2.0
.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
Install Edge Worker on Windows
==============================

.. note::

The Edge Provider Package is an experimental preview. Features and stability are limited,
and need to be improved over time. The target is to achieve full support in Airflow 3.
Once the Edge Provider is fully supported in Airflow 3, maintenance of the Airflow 2 package will
be discontinued.

This applies especially to Windows. The Edge Worker has only been manually tested on Windows,
and the setup is not validated in CI. It is recommended to use Linux for the Edge Worker. The
Windows-based setup is intended solely for testing at your own risk. It is technically limited
due to Python OS restrictions and is currently of Proof-of-Concept quality.


.. note::

As of Airflow 2.10.4, the ``edge`` provider package is not included in the normal release cycle.
Thus, it cannot be directly installed using: ``pip install 'apache-airflow[edge]'`` as the dependency
cannot be downloaded.

While it is in a not-ready state, a wheel release package must be manually built from source tree
via ``breeze release-management prepare-provider-packages --include-not-ready-providers edge``.


The setup was tested on Windows 10 with Python 3.12.8, 64-bit.
To set up an instance of the Edge Worker on Windows, follow the steps below:

1. Install Python 3.9 or higher.
2. Create an empty folder as a base to start with. In our example it is ``C:\\Airflow``.
3. Start Shell/Command Line in ``C:\\Airflow`` and create a new virtual environment via: ``python -m venv venv``
4. Activate the virtual environment via: ``venv\\Scripts\\activate.bat``
5. Copy the manually built wheel of the edge provider to the folder ``C:\\Airflow``.
This document used ``apache_airflow_providers_edge-0.9.7rc0-py3-none-any.whl``.
6. Install the wheel file with the Airflow constraints matching your Airflow and Python version:
``pip install apache_airflow_providers_edge-0.9.7rc0-py3-none-any.whl apache-airflow==2.10.4 virtualenv --constraint https://raw.githubusercontent.com/apache/airflow/constraints-2.10.4/constraints-3.12.txt``
7. Create a new folder ``dags`` in ``C:\\Airflow`` and copy the relevant DAG files into it.
(At least the DAG files which should be executed on the edge, alongside their dependencies. For testing purposes,
the example DAGs from the ``apache-airflow`` repository located at
<https://github.com/apache/airflow/tree/main/providers/src/airflow/providers/edge/example_dags> can be used.)
8. Collect needed parameters from your running Airflow backend, at least the following:

- ``edge`` / ``api_url``: The HTTP(S) endpoint to which the Edge Worker connects
- ``core`` / ``internal_api_secret_key``: The shared secret key between the webserver and the Edge Worker.
Note: This only applies to Airflow 2.10; for Airflow 3 the authentication method might change before release.
- Any proxy details, if applicable in your environment.

9. Create a worker start script to avoid repeated typing. Create a new file ``start_worker.bat`` in
``C:\\Airflow`` with the following content, replacing the values with your settings:

.. code-block:: bat

@echo off
set AIRFLOW__CORE__DAGS_FOLDER=dags
set AIRFLOW__LOGGING__BASE_LOG_FOLDER=edge_logs
set AIRFLOW__EDGE__API_URL=https://your-hostname-and-port/edge_worker/v1/rpcapi
set AIRFLOW__CORE__EXECUTOR=airflow.providers.edge.executors.edge_executor.EdgeExecutor
set AIRFLOW__CORE__INTERNAL_API_SECRET_KEY=<steal this from your deployment...>
set AIRFLOW__CORE__LOAD_EXAMPLES=False
set AIRFLOW_ENABLE_AIP_44=true
@REM Add if needed: set http_proxy=http://my-company-proxy.com:3128
@REM Add if needed: set https_proxy=http://my-company-proxy.com:3128
airflow edge worker --concurrency 4 -q windows
10. Note on logs: By default the DAG Run ID is used as part of the log path, and by default the Run ID contains
the date and time. Windows does not allow a colon (":") in file or folder names, so writing task logs to such a
path makes the Edge Worker fail.
Therefore you might consider changing the config ``AIRFLOW__LOGGING__LOG_FILENAME_TEMPLATE`` to avoid the colon.
For example, you could add the Jinja2 filter ``| replace(":", "-")`` to substitute another character; an example
is shown after this list.
Note that the log filename template is resolved on the server side and not on the worker side, so this needs to
be a global change.
Alternatively, for testing purposes only, you can use Run IDs that do not contain a colon, e.g. by setting the
Run ID manually when starting a DAG run.
11. Start the worker via: ``start_worker.bat``
Watch the console for errors.
12. Run a DAG as a test and check that the result is as expected.
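
A minimal sketch of the log filename template change mentioned in step 10, mirroring the template used by the
Breeze test setup in this commit: the value below is only an example and can be adapted. It has to be configured
on the machine running the scheduler and webserver (shown here as an environment variable; it can equally be set
in ``airflow.cfg``):

.. code-block:: bash

    export AIRFLOW__LOGGING__LOG_FILENAME_TEMPLATE="dag_id={{ ti.dag_id }}/run_id={{ ti.run_id|replace(':', '-') }}/task_id={{ ti.task_id }}/{% if ti.map_index >= 0 %}map_index={{ ti.map_index }}/{% endif %}attempt={{ try_number|default(ti.try_number) }}.log"

For step 12, any DAG routed to the ``windows`` queue can serve as a test, for example the ``win_notepad``
example DAG shipped with this provider. If you keep the default log filename template, a colon-free Run ID can
be set manually when triggering the run; the Run ID below is an arbitrary example:

.. code-block:: bash

    airflow dags trigger win_notepad --run-id manual_windows_test_1
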
@@ -47,8 +47,8 @@
description=__doc__.partition(".")[0],
doc_md=__doc__,
schedule=None,
start_date=datetime(2024, 7, 1),
tags=["example", "params", "integration test"],
start_date=datetime(2025, 1, 1),
tags=["example", "edge", "integration test"],
params={
"mapping_count": Param(
4,
83 changes: 83 additions & 0 deletions providers/src/airflow/providers/edge/example_dags/win_notepad.py
@@ -0,0 +1,83 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""
This DAG is a demonstrator of how to interact with a Windows worker via Notepad.
The DAG is created in conjunction with the documentation in
https://github.com/apache/airflow/blob/main/docs/apache-airflow-providers-edge/install_on_windows.rst
and serves as a PoC test for the Windows worker.
"""

from __future__ import annotations

from collections.abc import Sequence
from datetime import datetime
from pathlib import Path
from subprocess import check_call
from tempfile import gettempdir
from typing import TYPE_CHECKING, Any

from airflow.models import BaseOperator
from airflow.models.dag import DAG
from airflow.models.param import Param

if TYPE_CHECKING:
from airflow.utils.context import Context


class NotepadOperator(BaseOperator):
    """Example operator implementation which starts Notepad.exe on Windows."""

    template_fields: Sequence[str] = ("text",)  # Render the "text" argument as a Jinja template (e.g. from params).

def __init__(self, text: str, **kwargs):
self.text = text
super().__init__(**kwargs)

def execute(self, context: Context) -> Any:
tmp_file = Path(gettempdir()) / "airflow_test.txt"
with open(tmp_file, "w", encoding="utf8") as textfile:
textfile.write(self.text)
check_call(["notepad.exe", tmp_file])
with open(tmp_file, encoding="utf8") as textfile:
return textfile.read()


with DAG(
dag_id="win_notepad",
dag_display_name="Windows Notepad",
description=__doc__.partition(".")[0],
doc_md=__doc__,
schedule=None,
start_date=datetime(2024, 7, 1),
tags=["edge", "Windows"],
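    # Route all tasks of this DAG to the "windows" queue, which the Edge Worker from the
    # installation guide serves when started with "-q windows".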
default_args={"queue": "windows"},
    params={
        "notepad_text": Param(
            "This is a text proposal generated by the Airflow DAG. Change it and save, and it will be pushed to XCom.",
title="Notepad Text",
description="Add some text that should be filled into Notepad at start.",
type="string",
format="multiline",
),
},
) as dag:
npo = NotepadOperator(
task_id="notepad",
text="{{ params.notepad_text }}",
)