Closed
Description
One option: add a `batch_apply` method.
This would be a shortcut for split-apply-combine with groupby/apply if the grouping over a dimension is only being done for efficiency reasons.
This function should take several parameters:
- The `dimension` to group over.
- The `batchsize` to group over on this dimension (defaulting to `1`).
- The `func` to apply to each group.
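To make the proposal concrete, here is a minimal sketch of the split-apply-combine loop described above. It is not an existing API: it assumes a plain Python sequence stands in for the data along the chosen dimension (so the `dimension` argument is dropped), and that `func` returns a sequence per batch so the results can be concatenated.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batch_apply(
    data: Sequence[T],
    func: Callable[[Sequence[T]], Sequence[R]],
    batchsize: int = 1,
) -> List[R]:
    """Hypothetical sketch: apply func to consecutive batches and concatenate.

    Assumption: `data` is a flat sequence, so there is only one dimension
    to batch over; a real implementation would also take `dimension`.
    """
    if batchsize < 1:
        raise ValueError("batchsize must be a positive integer")

    combined: List[R] = []
    for start in range(0, len(data), batchsize):
        batch = data[start:start + batchsize]  # split: at most batchsize items
        combined.extend(func(batch))           # apply, then combine in order
    return combined


def double(batch: Sequence[int]) -> List[int]:
    """Example func: operates on one batch at a time."""
    return [2 * x for x in batch]


result = batch_apply(list(range(5)), double, batchsize=2)
```

Only one batch of input is handed to `func` at a time, which is where the memory saving would come from; the last batch may be shorter than `batchsize`.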
At first, this function would be useful just to avoid memory issues. Eventually, it would be nice to add an `n_jobs` parameter which would automatically dispatch to multiprocessing/joblib. We would need to get pickling (issue #24) working first to be able to do this.
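The pickling requirement comes from how the standard `multiprocessing` module hands work to other processes: the function and each batch are serialized with `pickle` before a worker receives them. As a rough illustration only, the helper below (a hypothetical name, not part of any library) checks whether an object survives a pickle round trip.

```python
import pickle
from typing import Any


def survives_pickle(obj: Any) -> bool:
    """Hypothetical helper: True if obj can be pickled and unpickled."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except Exception:
        # pickle raises different exception types depending on the object
        return False


# Built-in functions are pickled by reference, so this works.
builtin_ok = survives_pickle(len)

# A lambda has no importable name, so the standard pickler rejects it.
lambda_ok = survives_pickle(lambda batch: batch)
```

Under these assumptions, both `func` and the batched objects would need to pass a check like this before an `n_jobs` option could dispatch them to worker processes.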