Remote Ibis backend #7704
Replies: 3 comments 8 replies
-
Hi @tokoko, just a quick follow-up: this is very interesting! I don't think we have an answer for you yet on whether this should go in the main Ibis repo or what the next steps would be. We're approaching the holidays and will probably revisit this in the new year.
-
Personally, I think this would be very useful. For large data transformations where the output is just another enriched dataset written to disk or the network (for example), being able to command and control an execution engine without ever bringing that data back to the client code would be fantastic. That said, not using Substrait does (in my opinion) limit the effectiveness of this strategy.
-
Quick note: for this use case, FlightSQL might actually be a good fit, much better than pure Flight itself. While there isn't a standard FlightSQL server already available for Python, using FlightSQL would let clients connect to the remote Ibis with a plain ADBC connection, as you would for many other backends.
-
Hi there,
Not sure how valuable this might be, but I am experimenting with Ibis for a personal project of mine and have started building a new remote backend for Ibis. See example usage here: https://github.com/tokoko/ibis-connect/blob/main/example.py

The idea is to split the Ibis workflow into a frontend Python process and a backend Arrow Flight server, similar to what Spark accomplishes with Spark Connect (hence the name of the backend). The frontend generates Ibis expressions, serializes them, and sends instructions to the backend, which either executes commands or, if necessary, returns the resulting datasets over Flight. I was initially planning to use Substrait for serialization, but found out that `ibis-substrait` recently dropped its decompiler, so I ended up switching to pickle for now. The repo above implements the `create_table`, `list_tables`, `to_pyarrow` and `execute` methods for the backend.

If you agree that this feature can be valuable for users, I will start working on a PR to the main repo. Thanks.