Allow UDFs to take all columns in reduce() #354

@hombit

Feature request

Currently, reduce() requires the user to list every input column explicitly, for the sake of performance. Let's allow a special case: when no columns are specified, each row is passed to the UDF as a single dictionary keyed by column name:

def udf1(base_col, sub_col):
    # base_col is a single value, sub_col is a numpy array
    ...

# Existing behavior:
nf.reduce(udf1, "ra", "lightcurve.time")

def udf2(row):
    assert isinstance(row, dict)
    assert "ra" in row and "lightcurve.time" in row
    # row is a dictionary: base columns are single values,
    # nested columns are numpy arrays
    ...

# New behavior:
nf.reduce(udf2)
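
To make the requested dispatch concrete, here is a minimal pure-Python sketch. It is not the nested-pandas implementation: `reduce_rows` is a hypothetical stand-in for reduce(), rows are plain dicts, and lists stand in for the numpy arrays of nested columns. It only illustrates the rule "columns given: pass them positionally; no columns given: pass the whole row as one dict".

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def reduce_rows(rows: List[Row], func: Callable[..., Any], *columns: str) -> List[Any]:
    """Hypothetical stand-in for reduce(); assumes each row is already a dict."""
    results = []
    for row in rows:
        if columns:
            # Existing behavior: pass only the requested columns, positionally.
            results.append(func(*(row[name] for name in columns)))
        else:
            # Proposed behavior: pass a copy of the full row as a single dict.
            results.append(func(dict(row)))
    return results


# Toy data: "ra" is a base column, "lightcurve.time" is a nested column.
rows = [
    {"ra": 10.0, "lightcurve.time": [1.0, 2.0, 4.0]},
    {"ra": 20.0, "lightcurve.time": [3.0, 5.0]},
]


def time_span(ra, time):
    # Explicit-column UDF: receives one argument per requested column.
    return max(time) - min(time)


def time_span_from_row(row):
    # Whole-row UDF: receives a dict keyed by column name.
    assert isinstance(row, dict)
    time = row["lightcurve.time"]
    return max(time) - min(time)


explicit = reduce_rows(rows, time_span, "ra", "lightcurve.time")
implicit = reduce_rows(rows, time_span_from_row)
```

Both calls produce the same per-row results here; the whole-row form trades the performance benefit of selecting columns up front for not having to list them.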

Before submitting
Please check the following:

  • I have described the purpose of the suggested change, specifying what I need the enhancement to accomplish, i.e. what problem it solves.
  • I have included any relevant links, screenshots, environment information, and data relevant to implementing the requested feature, as well as pseudocode for how I want to access the new functionality.
  • If I have ideas for how the new feature could be implemented, I have provided explanations and/or pseudocode and/or task lists for the steps.

Labels

enhancement (New feature or request)
