Replies: 6 comments 1 reply
-
Thanks for opening your first issue here! Be sure to follow the issue template!
-
As an additional benefit, this would let devs write plugins in other languages as well, which could increase the number of potential maintainers and give a boost to the plugin ecosystem around Airflow.
-
This would be a complete change in how the plugin mechanism works, because plugins are essentially executed in the same interpreter as the "main part" of the process. For example, plugins create Python objects that are then used to instantiate views in the UI, or plugins create Timetable objects that the scheduler uses to calculate when to schedule the next DAG run. If we were to isolate plugins and run them in a separate process, we would have to execute them separately, have a separate virtualenv for plugins, and serialize the objects back and forth between independently running Python interpreters. This could be done either with multiprocessing (with a different Python executable) or with a separate plugin server that would have to be used for this purpose and reused between different components.

It's not impossible, of course, but it is rather complex when it comes to deployment, and it requires very, very careful thinking not only about the APIs and interfaces used but also about performance implications. This is a complex change that would require an Airflow Improvement Proposal to be drafted, discussed and finalized on the devlist: reviewing and iterating over the docs, reaching consensus and passing a successful vote. If you are up to the task (just be aware that it will likely take months just to discuss it, so this is a marathon, not a sprint), feel free to start it.

Just a watch-out: it's not needed in 9x% of the cases where Airflow is used, so this is a niche case, which is why we have not invested in it. So there is a risk that we might even decide that we do not want to implement it, if it comes out of the discussion that the complexity of the implementation, the number of changes, potential backwards incompatibilities or performance implications are too much of a problem. Simply put, the gains might be much lower than the cost of maintenance and the risk of breaking existing installations. So any proposal must take into account and discuss all the aspects that might lead to a higher cost of maintenance, more complexity or worse performance.
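To make the serialization cost described above concrete, here is a minimal sketch of what "run the plugin in a separate interpreter and pass objects back and forth" could look like. It is not how Airflow works today: the function `next_run_via_subprocess`, the `PLUGIN_SOURCE` toy plugin and the JSON message shape are all hypothetical, and the child reuses the current interpreter only so that the example runs anywhere. In a real setup `python_executable` would point at a separate virtualenv.

```python
import json
import subprocess
import sys
from datetime import datetime

# Hypothetical plugin code, executed in its own interpreter. It only sees
# JSON on stdin and answers with JSON on stdout, so no Python objects are
# shared with the host process.
PLUGIN_SOURCE = """
import json, sys
from datetime import datetime, timedelta

request = json.load(sys.stdin)
last_run = datetime.fromisoformat(request["last_run"])
# Toy "timetable": schedule the next run one day after the last one.
next_run = last_run + timedelta(days=1)
json.dump({"next_run": next_run.isoformat()}, sys.stdout)
"""


def next_run_via_subprocess(
    last_run: datetime, python_executable: str = sys.executable
) -> datetime:
    """Ask the isolated plugin process for the next run time."""
    # Serialize the request, spawn the child, and wait for its answer.
    request = json.dumps({"last_run": last_run.isoformat()})
    completed = subprocess.run(
        [python_executable, "-c", PLUGIN_SOURCE],
        input=request,
        capture_output=True,
        text=True,
        check=True,   # raise if the plugin process exits with an error
        timeout=30,   # do not let a hung plugin block the caller forever
    )
    # Deserialize the response back into a host-side object.
    response = json.loads(completed.stdout)
    return datetime.fromisoformat(response["next_run"])
```

Even in this toy form, every call pays for a process start plus two rounds of serialization, which is the kind of performance implication a proposal would have to address (for example with a long-lived plugin server instead of one process per call).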
-
Converting it into a discussion, as this is the stage it is in at the moment.
-
One thing I’m wondering is what kind of complexity exactly is in the plugin? Plugin code is run in the scheduler or webserver, so there’s really limited complexity in what it can do without clogging up the entire Airflow process.
-
Thinking it over, I think there's a possible benefit in something like this in Airflow's case, but it might be less intrusive if it were an extension to Airflow in and of itself. Rather than ham-fistedly going in and refactoring large swathes of the Airflow plugin and provider code, we could add some additional hooks or interfaces to core and then create a plugin that acts as an extension manager and marshaller of some sort. Some of the benefits I could see (keeping in mind I haven't dug into the internals enough yet to have a real sense of scale here):
As a first go-around, creating a plugin that manages this stuff and hooks into Airflow internals might at least help to decide whether it's feasible and whether the assumed benefits above are tangible. Major concerns?
As you said, it might not be worth the effort, and it might not even align with the goals of the project overall. I'm going to pick around a bit more in Airflow's internals over the weekend and see whether the approach of integrating a plugin to manage plugins is reasonable as a test.
-
Description
Because of the complexity of some of the plugins that are starting to come out, I'd like to propose a long-term goal / feature: finding a way to isolate plugins to prevent dependency issues and to stabilize integration points over the longer term.
We're starting to run into issues where plugins and extensions to Airflow have really complex dependency trees, which makes upgrades and installs much more complex. In addition, major versions can break previously working plugins, triggering rewrites as well as version checking to decide which interface to work against.
I know I'm light on details, but I'm thinking of something similar to other tools where plugins are isolated in a separate process / environment, like LSP implementations in an editor, or how Sublime handles Python-based extensions (I think they run as a background service when needed): plugins would be spawned as a secondary process and integrate with Airflow through gRPC or something similar.
Use case/motivation
This would make developing plugins and extensions easier, as there could be a contractual interface that developers work against, which would insulate them from changes in Airflow's code and classes. Right now we're dealing with an issue in upgrading to 2.4.x, where a plugin we rely on heavily has been a bit of a horror show:
a) Because its own dependencies now conflict with Airflow's and some provider packages
b) Because the interface it worked against changed (albeit the interface it worked against was probably not the correct one in the first place).
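As a rough illustration of the "contractual interface" idea, here is a minimal sketch of a versioned contract that a host could check before calling into a plugin. Everything in it is hypothetical and not an existing Airflow API: `PluginContract`, `HOST_CONTRACT_VERSION`, `EchoPlugin` and `call_plugin` are names made up for this example, and a real design would likely put this boundary across a process or gRPC connection rather than inside one interpreter.

```python
from dataclasses import dataclass
from typing import Protocol

# Hypothetical: the contract version the host understands.
HOST_CONTRACT_VERSION = 1


class PluginContract(Protocol):
    """The only surface a plugin author would program against."""

    contract_version: int

    def handle(self, method: str, params: dict) -> dict:
        ...


@dataclass
class EchoPlugin:
    """Toy plugin that satisfies the contract structurally."""

    contract_version: int = 1

    def handle(self, method: str, params: dict) -> dict:
        if method == "ping":
            return {"result": "pong"}
        return {"error": f"unknown method: {method}"}


def call_plugin(plugin: PluginContract, method: str, params: dict) -> dict:
    """Call a plugin only if it speaks the contract version the host expects."""
    if plugin.contract_version != HOST_CONTRACT_VERSION:
        # Fail with a clear message instead of breaking on changed internals.
        return {
            "error": (
                f"plugin speaks contract v{plugin.contract_version}, "
                f"host expects v{HOST_CONTRACT_VERSION}"
            )
        }
    return plugin.handle(method, params)
```

The point of the sketch is only that messages are plain dictionaries and the version check happens in one place, so a change in host internals would not have to ripple into plugin code as long as the contract version stays the same.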
Related issues
No response
Are you willing to submit a PR?
Code of Conduct