Replies: 2 comments 4 replies
cc @gabotechs
  
One core difference is that this project does not ship an executable, service, or binary that you configure and deploy to an environment to run distributed queries. Instead, it provides a library with the building blocks for adding distributed capabilities to DataFusion.

Another difference is the execution model, which stays as close as possible to DataFusion's: a pull-based approach very similar to vanilla DataFusion, except that data is streamed over the network. The project also does not materialize intermediate results across network boundaries. The idea is to stream data between workers as efficiently as possible, in a zero-copy manner, even though that makes features like checkpointing or subplan retries much harder to implement.
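To make the pull-based, non-materializing idea concrete, here is a minimal sketch using only the Rust standard library. It is not this project's API or DataFusion's: `Batch`, `spawn_remote_stage`, and `sum_streamed` are hypothetical names, a `Vec<i64>` stands in for an Arrow record batch, and a zero-capacity channel stands in for the network boundary.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical stand-in for an Arrow record batch.
type Batch = Vec<i64>;

/// Hypothetical "remote" stage: produces batches on another thread and hands
/// them over a channel that plays the role of the network boundary.
fn spawn_remote_stage(batches: Vec<Batch>) -> Receiver<Batch> {
    // Capacity 0 makes this a rendezvous channel: each `send` blocks until the
    // consumer pulls, so nothing is buffered or materialized at the boundary.
    let (tx, rx) = sync_channel::<Batch>(0);
    thread::spawn(move || {
        for batch in batches {
            // Sending moves ownership of the Vec; its heap buffer is not copied.
            if tx.send(batch).is_err() {
                break; // the consumer went away, so stop producing
            }
        }
    });
    rx
}

/// Downstream operator: pulls one batch at a time and folds it into a running
/// sum, without ever collecting the whole intermediate result.
fn sum_streamed(rx: Receiver<Batch>) -> i64 {
    rx.into_iter().map(|batch| batch.iter().sum::<i64>()).sum()
}
```

Calling `sum_streamed(spawn_remote_stage(vec![vec![1, 2, 3], vec![4, 5], vec![6]]))` drives the producer one batch per pull. The sketch also hints at the trade-off mentioned above: because no intermediate result is kept at the boundary, a failed consumer cannot simply re-read it, so a retry would have to re-run the upstream stage.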
  
Thanks for your great work on this project. It’s impressive to see how rapidly it’s evolving.
Could you also clarify how this project differs from datafusion-ray and datafusion-ballista?