[Feat] Runtime optimizations #20

Draft
e-strauss wants to merge 11 commits into main from Runtime-Optimizations

@e-strauss
Collaborator

Adding:

  • ParallelScheduler
  • More DataframeOps
  • Intermediate Clean Up for Scheduler
  • Minor fixes

e-strauss and others added 10 commits March 9, 2026 13:37
This commit introduces physical planning to the logical optimizer, enabling parallel execution of independent estimator tasks. Key changes include:

*   **Physical Planning Optimization**: A new optimization pass identifies independent estimator operations in the DAG and groups them into a `ParallelBlockOp` for concurrent execution.
*   **Parallel Execution Engine**: The implementation of `ParallelBlockOp` uses `joblib` to run estimator tasks in parallel worker processes, which sidesteps the Python Global Interpreter Lock (GIL).
*   **Scheduler Refactoring**: The `Scheduler` has been refactored into a base class with a `SequentialScheduler` implementation, and the internal logic was updated to handle optimized DAG sinks rather than flat lists.
*   **Configurable Activation**: A new `physical_planning` flag was added to the configuration and environment variables to toggle this optimization.
*   **Bug Fixes and Improvements**: Updates were made to estimator processing to ensure data is picklable for multiprocessing and to improve the display of performance statistics.

This follow-up commit refactors the parallel execution implementation, moving it from a compile-time DAG transformation to a runtime scheduling strategy.

*   **Parallel Scheduler Implementation**: A fully-featured `ParallelScheduler` was added that can execute independent operations concurrently using either thread-based or process-based parallelism, configurable via a `backend` parameter.
*   **Configuration Change**: The boolean `physical_planning` flag was replaced with a more flexible `scheduler_parallelism` option that accepts `"threading"`, `"process"`, or `"auto"` to select the parallel execution backend.
*   **Architecture Relocation**: The physical planning logic was moved from the logical optimizer to the runtime layer, making it a runtime concern rather than a compile-time DAG rewrite; it now marks ops with a `parallel_group` ID instead of restructuring the graph.
*   **Estimator/Transformer Distinction**: Operations are now explicitly categorized into `EstimatorOp` (predictors) and `TransformerOp`, each with dedicated processing functions that handle their specific fit/predict vs fit_transform/transform semantics.
*   **DAG Linearization**: The parallel scheduler linearizes the DAG into sequential blocks where some blocks are lists of independent ops that can be executed in parallel.
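
The linearization step described above can be sketched with the standard library alone. This is a minimal illustration, not the PR's `ParallelScheduler`: the function names `linearize` and `run_blocks`, the dependency-dict representation, and the toy ops are all assumptions made for the example, and it only shows the thread-based case.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9+


def linearize(deps):
    """Turn {op: set of prerequisite ops} into an ordered list of blocks.

    Ops inside one block do not depend on each other, so a block with
    more than one op is a candidate for parallel execution.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    blocks = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        blocks.append(ready)
        sorter.done(*ready)
    return blocks


def run_blocks(blocks, funcs, deps, max_workers=4):
    """Run blocks in order; ops within a multi-op block run on threads."""
    results = {}

    def run(op):
        # Inputs are the results of the op's prerequisites, in name order.
        inputs = [results[d] for d in sorted(deps.get(op, ()))]
        return funcs[op](*inputs)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for block in blocks:
            if len(block) == 1:
                results[block[0]] = run(block[0])
            else:
                values = list(pool.map(run, block))
                results.update(zip(block, values))
    return results


# Hypothetical four-op pipeline: two independent encoders feed one join.
deps = {
    "load": set(),
    "encode_a": {"load"},
    "encode_b": {"load"},
    "join": {"encode_a", "encode_b"},
}
funcs = {
    "load": lambda: [1, 2, 3],
    "encode_a": lambda xs: [x * 2 for x in xs],
    "encode_b": lambda xs: [x + 1 for x in xs],
    "join": lambda a, b: list(zip(a, b)),
}
blocks = linearize(deps)
results = run_blocks(blocks, funcs, deps)
```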

- Made Polars support configurable at runtime via `FLAGS.force_polars`
- Added automatic sin/cos UDF rewriting to native Polars ops
- Added compatibility layer to convert Polars→Pandas for unsupported estimators
- Fixed estimator cloning for repeated fits (e.g., cross-validation)
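
The cloning fix matters because refitting one shared estimator object across folds overwrites its learned state. The sketch below shows the general idea with `copy.deepcopy` on a toy estimator; `MeanPredictor` and `fit_on_folds` are hypothetical names, and the PR may well use a different mechanism (scikit-learn provides `sklearn.base.clone` for its own estimators).

```python
import copy


class MeanPredictor:
    """Toy estimator (hypothetical) that memorizes the training mean."""

    def __init__(self):
        self.mean_ = None  # set by fit()

    def fit(self, ys):
        self.mean_ = sum(ys) / len(ys)
        return self

    def predict(self, n):
        return [self.mean_] * n


def fit_on_folds(template, folds):
    """Fit a fresh copy of `template` per fold; the template stays unfitted."""
    fitted = []
    for ys in folds:
        est = copy.deepcopy(template)  # independent state for each fit
        fitted.append(est.fit(ys))
    return fitted


template = MeanPredictor()
models = fit_on_folds(template, [[1.0, 3.0], [10.0, 20.0]])
```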

Add memory estimation function for transformer operations that returns
size multipliers based on estimator type (`TableVectorizer`: 10x,
`StringEncoder`: 3x). Filter parallelization candidates to only include
operations with known memory estimates, focusing on transformer
operations while temporarily disabling estimator parallelization.
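
A rough sketch of this filtering follows. Only the two multipliers come from the commit message; the function names, the op dictionaries, and the assumption that the multiplier applies to the input size in bytes are illustrative guesses rather than the PR's actual code.

```python
# Multipliers quoted in the commit message; everything else is illustrative.
SIZE_MULTIPLIERS = {"TableVectorizer": 10, "StringEncoder": 3}


def estimate_output_bytes(estimator_name, input_bytes):
    """Return an estimated output size, or None if the type is unknown.

    Assumes the multiplier scales the input size (not stated in the source).
    """
    multiplier = SIZE_MULTIPLIERS.get(estimator_name)
    if multiplier is None:
        return None
    return multiplier * input_bytes


def parallel_candidates(ops):
    """Keep only ops whose memory footprint can be estimated."""
    return [
        op
        for op in ops
        if estimate_output_bytes(op["estimator"], op["input_bytes"]) is not None
    ]


# Hypothetical ops; StandardScaler has no known multiplier and is filtered out.
ops = [
    {"name": "vectorize", "estimator": "TableVectorizer", "input_bytes": 1000},
    {"name": "encode", "estimator": "StringEncoder", "input_bytes": 2000},
    {"name": "scale", "estimator": "StandardScaler", "input_bytes": 500},
]
candidates = parallel_candidates(ops)
```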

@e-strauss force-pushed the Runtime-Optimizations branch from 7c07432 to 6b52268 on March 9, 2026 17:56