lincc-frameworks
diff --git a/README.md b/README.md
Lines changed: 6 additions & 1 deletion
diff --git a/docs/gettingstarted/quickstart.ipynb b/docs/gettingstarted/quickstart.ipynb
Lines changed: 30 additions & 8 deletions
diff --git a/docs/pre_executed/nested_spectra.ipynb b/docs/pre_executed/nested_spectra.ipynb
Lines changed: 2 additions & 2 deletions
diff --git a/docs/pre_executed/performance.ipynb b/docs/pre_executed/performance.ipynb
Lines changed: 1 addition & 1 deletion
diff --git a/docs/reference/accessor.rst b/docs/reference/accessor.rst
Lines changed: 6 additions & 9 deletions
diff --git a/docs/reference/nesteddtype.rst b/docs/reference/nesteddtype.rst
Lines changed: 1 addition & 1 deletion
diff --git a/docs/reference/nestedframe.rst b/docs/reference/nestedframe.rst
Lines changed: 14 additions & 3 deletions
diff --git a/docs/reference/nestedseries.rst b/docs/reference/nestedseries.rst
Lines changed: 1 addition & 1 deletion
diff --git a/docs/tutorials/data_loading_notebook.ipynb b/docs/tutorials/data_loading_notebook.ipynb
Lines changed: 4 additions & 4 deletions
diff --git a/docs/tutorials/data_manipulation.ipynb b/docs/tutorials/data_manipulation.ipynb
Lines changed: 4 additions & 11 deletions
@@ -46,7 +46,12 @@ Allowing powerful and straightforward operations, like:
 ```python
    # Compute the mean flux for each row of "object_nf"
    import numpy as np
-   object_nf.reduce(np.mean, "nested_sources.flux")
+
+   def mean_flux(row):
+       """Calculates the mean flux for each object"""
+       return np.mean(row["nested_sources.flux"])
+
+   object_nf.map_rows(mean_flux, output_names="mean_flux")
 ```
 
 <p align="center">
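
The README change above switches from passing `np.mean` directly to passing a function that receives one row at a time. As a minimal sketch of that call pattern in plain Python (the `rows` list and its values are hypothetical stand-ins for a `NestedFrame`, and the list comprehension only emulates what `map_rows` is described as doing; it is not the library's implementation):

```python
from statistics import fmean

# Hypothetical stand-in for a NestedFrame: each row is a dict keyed by column
# name, with nested columns exposed as "nest.field" lists.
rows = [
    {"ra": 10.0, "nested_sources.flux": [1.0, 2.0, 3.0]},
    {"ra": 20.0, "nested_sources.flux": [4.0, 6.0]},
]


def mean_flux(row):
    """Calculates the mean flux for one object (one row)."""
    return fmean(row["nested_sources.flux"])


# Emulates the row-by-row application; the real call in the diff is
# object_nf.map_rows(mean_flux, output_names="mean_flux").
mean_fluxes = [mean_flux(row) for row in rows]
print(mean_fluxes)  # [2.0, 5.0]
```

The point of the sketch is only the shape of the user function: it takes a single row and looks up the nested field by its hierarchical name.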
 
@@ -282,9 +282,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Reduce Function\n",
+    "## The `map_rows` Function\n",
     "\n",
-    "Finally, we'll end with the flexible `reduce` function. `reduce` functions similarly to pandas' `apply` but flattens (reduces) the inputs from nested layers into array inputs to the given apply function. For example, let's find the mean flux for each dataframe in \"nested\":"
+    "Finally, we'll end with the flexible `map_rows` function. `map_rows` works similarly to pandas' `apply`, but it operates row by row and flattens the inputs from nested layers into arrays that are passed to the given function. For example, let's find the mean flux for each dataframe in \"nested\":"
    ]
   },
   {
@@ -297,7 +297,8 @@
     "\n",
     "# use hierarchical column names to access the flux column\n",
     "# passed as an array to np.mean\n",
-    "nf.reduce(np.mean, \"lightcurve.brightness\")"
+    "# row_container signals how to pass the data to the function, in this case as direct arguments\n",
+    "nf.map_rows(np.mean, \"lightcurve.brightness\", row_container=\"args\")"
    ]
   },
   {
@@ -313,15 +314,15 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "def show_inputs(*args):\n",
-    "    return args"
+    "def show_inputs(row):\n",
+    "    return row"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Applying some inputs via reduce, we see how it sends inputs to a given function.  The output frame `nf_inputs` consists of two columns containing the output of the “ra” column and the “lightcurve.time” column."
+    "Applying some inputs via `map_rows`, we see how it sends inputs to a given function.  The output frame `nf_inputs` consists of two columns containing the output of the “ra” column and the “lightcurve.time” column."
    ]
   },
   {
@@ -330,8 +331,12 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "nf_inputs = nf.reduce(show_inputs, \"ra\", \"lightcurve.time\")\n",
-    "nf_inputs"
+    "# row_container=\"dict\" passes the data as a dictionary to the function\n",
+    "nf_inputs = nf.map_rows(show_inputs, columns=[\"ra\", \"lightcurve.time\"], row_container=\"dict\")\n",
+    "nf_inputs\n",
+    "\n",
+    "# map_rows returns a dataframe view of the dicts; inside show_inputs, the two columns are available as\n",
+    "# row[\"ra\"] and row[\"lightcurve.time\"]"
    ]
   },
   {
@@ -343,6 +348,23 @@
     "nf_inputs.loc[0]"
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# row_container=\"args\" passes the data as arguments to the function\n",
+    "\n",
+    "\n",
+    "def show_inputs(*args):\n",
+    "    return args\n",
+    "\n",
+    "\n",
+    "nf_inputs = nf.map_rows(show_inputs, columns=[\"ra\", \"lightcurve.time\"], row_container=\"args\")\n",
+    "nf_inputs"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
 
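The quickstart cells above use two `row_container` values, `"dict"` and `"args"`. The difference can be sketched in plain Python as follows. Everything here is an assumption for illustration: `map_rows_sketch`, `show_row`, `show_args`, and the sample `rows` are hypothetical, and the real `map_rows` returns a frame rather than a list.

```python
def map_rows_sketch(rows, func, columns, row_container):
    """Hypothetical emulation of the dispatch; not the nested-pandas code."""
    results = []
    for row in rows:
        # Keep only the requested columns, in the order they were requested.
        selected = {name: row[name] for name in columns}
        if row_container == "dict":
            # The function receives one mapping and indexes it by column name.
            results.append(func(selected))
        elif row_container == "args":
            # The function receives each column as a positional argument.
            results.append(func(*selected.values()))
        else:
            raise ValueError(f"unknown row_container: {row_container!r}")
    return results


# Hypothetical sample data: one base column and one nested field per row.
rows = [
    {"ra": 10.0, "dec": -5.0, "lightcurve.time": [1.0, 2.0]},
    {"ra": 20.0, "dec": 5.0, "lightcurve.time": [3.0]},
]


def show_row(row):
    return row


def show_args(*args):
    return args


cols = ["ra", "lightcurve.time"]
as_dicts = map_rows_sketch(rows, show_row, cols, row_container="dict")
as_args = map_rows_sketch(rows, show_args, cols, row_container="args")
print(as_dicts[0])  # {'ra': 10.0, 'lightcurve.time': [1.0, 2.0]}
print(as_args[0])   # (10.0, [1.0, 2.0])
```

This is why `np.mean` needs `row_container="args"` in the earlier cell: it expects the array itself as its argument, not a mapping that contains it.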
@@ -280,7 +280,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": null,
    "metadata": {},
    "outputs": [
     {
@@ -452,7 +452,7 @@
     }
    ],
    "source": [
-    "spec_ndf = xid_ndf.add_nested(flat_spec, \"coadd_spectrum\").set_index(\"objid\")\n",
+    "spec_ndf = xid_ndf.join_nested(flat_spec, \"coadd_spectrum\").set_index(\"objid\")\n",
     "spec_ndf"
    ]
   },
 
@@ -98,7 +98,7 @@
     "# Read in parquet data\n",
     "# nesting sources into objects\n",
     "nf = npd.read_parquet(\"objects.parquet\")\n",
-    "nf = nf.add_nested(npd.read_parquet(\"ztf_sources.parquet\"), \"ztf_sources\")\n",
+    "nf = nf.join_nested(npd.read_parquet(\"ztf_sources.parquet\"), \"ztf_sources\")\n",
     "\n",
     "# Filter on object\n",
     "nf = nf.query(\"ra > 10.0\")\n",
 
@@ -18,12 +18,9 @@ Functions
     NestSeriesAccessor.to_lists
     NestSeriesAccessor.to_flat
     NestSeriesAccessor.to_flatten_inner
-    NestSeriesAccessor.with_field
-    NestSeriesAccessor.with_flat_field
-    NestSeriesAccessor.with_list_field
-    NestSeriesAccessor.with_filled_field
-    NestSeriesAccessor.without_field
-    NestSeriesAccessor.query_flat
-    NestSeriesAccessor.get_flat_index
-    NestSeriesAccessor.get_flat_series
-    NestSeriesAccessor.get_list_series
+    NestSeriesAccessor.set_column
+    NestSeriesAccessor.set_flat_column
+    NestSeriesAccessor.set_list_column
+    NestSeriesAccessor.set_filled_column
+    NestSeriesAccessor.drop
+    NestSeriesAccessor.query
@@ -17,6 +17,6 @@ Functions
 
     NestedDtype.construct_array_type
     NestedDtype.construct_from_string
-    NestedDtype.from_fields
+    NestedDtype.from_columns
     NestedDtype.from_pandas_arrow_dtype
     NestedDtype.to_pandas_arrow_dtype
@@ -10,12 +10,21 @@ Constructor
 
    NestedFrame
 
+Helpful Properties
+~~~~~~~~~~~~~~~~~~
+.. autosummary::
+    :toctree: api/
+
+    NestedFrame.nested_columns
+    NestedFrame.base_columns
+    NestedFrame.all_columns
+
 Nesting
 ~~~~~~~~~
 .. autosummary::
     :toctree: api/
 
-    NestedFrame.add_nested
+    NestedFrame.join_nested
     NestedFrame.nest_lists
     NestedFrame.from_flat
     NestedFrame.from_lists
@@ -25,19 +34,21 @@ Extended Pandas.DataFrame Interface
 
 .. note:: 
    The NestedFrame extends the Pandas.DataFrame interface, so all methods
-   of Pandas.DataFrame are available. The following methods are extended
+   of Pandas.DataFrame are available. The following methods are a mix of
+   newly added methods and methods extended from Pandas.DataFrame
    to support NestedFrame functionality. Please reference the Pandas
    documentation for more information.
    https://pandas.pydata.org/docs/reference/frame.html
 
 .. autosummary::
     :toctree: api/
 
+    NestedFrame.get_subcolumns
     NestedFrame.eval
     NestedFrame.query
     NestedFrame.dropna
     NestedFrame.sort_values
-    NestedFrame.reduce
+    NestedFrame.map_rows
     NestedFrame.drop
     NestedFrame.min
     NestedFrame.max
 
@@ -16,4 +16,4 @@ Functions
     :toctree: api/
 
     NestedSeries.to_lists
-    NestedSeries.to_flat
+    NestedSeries.explode
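
Because the reference changes above rename public methods, downstream notebooks and scripts may still call the old names. A minimal sketch for flagging such calls follows; it assumes only the one-for-one substitutions visible in this diff. `RENAMES`, `find_old_calls`, and `sample` are hypothetical names, and the accessor renames are left out because the diff does not state how the removed names pair with the added ones.

```python
import re

# One-for-one substitutions that appear in this diff.
RENAMES = {
    "add_nested": "join_nested",
    "reduce": "map_rows",
    "from_fields": "from_columns",
}

# Matches ".old_name(" so that only method-style calls are flagged.
_PATTERN = re.compile(r"\.(" + "|".join(map(re.escape, RENAMES)) + r")\(")


def find_old_calls(source):
    """Return (line_number, old_name, new_name) for each call using an old name."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, RENAMES[old]))
    return hits


sample = 'nf = nf.add_nested(nested, "nested")\nnf.reduce(np.mean, "nested.flux")\n'
hits = find_old_calls(sample)
print(hits)  # [(1, 'add_nested', 'join_nested'), (2, 'reduce', 'map_rows')]
```

The sketch only reports candidates for review. It would also flag unrelated methods named `reduce`, and the quickstart diff shows that moving from `reduce` to `map_rows` changes the call arguments as well (for example, adding `row_container="args"`), so a blind text replacement would not be enough.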
@@ -141,7 +141,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We can then create an additional pandas dataframes for the nested columns and pack them into our `NestedFrame` with `NestedFrame.add_nested()` function. `add_nested` will align the nest based on the index by default (a column may be selected instead via the `on` kwarg), as we see the `nested` `DataFrame` has a repeated index corresponding to the `nf` `NestedFrame`."
+    "We can then create additional pandas dataframes for the nested columns and pack them into our `NestedFrame` with the `NestedFrame.join_nested()` function. `join_nested` aligns the nest on the index by default (a column may be selected instead via the `on` kwarg); here, the `nested` `DataFrame` has a repeated index corresponding to the `nf` `NestedFrame`."
    ]
   },
   {
@@ -158,7 +158,7 @@
     "    index=[0, 0, 0, 1, 1, 1, 2, 2, 2, 2],\n",
     ")\n",
     "\n",
-    "nf = nf.add_nested(nested, \"nested\")\n",
+    "nf = nf.join_nested(nested, \"nested\")\n",
     "nf"
    ]
   },
@@ -182,7 +182,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We could add other nested columns by creating new sub-tables and adding them with `add_nested()`. Note that while the tables added with each `add_nested()` must be rectangular, they do not need to have the same dimensions between calls. We could add another nested row with a different number of observations."
+    "We could add other nested columns by creating new sub-tables and adding them with `join_nested()`. Note that while the tables added with each `join_nested()` must be rectangular, they do not need to have the same dimensions between calls. We could add another nested column with a different number of observations."
    ]
   },
   {
@@ -199,7 +199,7 @@
     "    index=[0, 0, 1, 1, 1, 2],\n",
     ")\n",
     "\n",
-    "nf = nf.add_nested(nested, \"nested2\")\n",
+    "nf = nf.join_nested(nested, \"nested2\")\n",
     "nf"
    ]
   },
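
The notebook text above says that `join_nested` aligns a flat table to the base frame through a repeated index, and that separate calls may bring different numbers of observations. A minimal plain-Python sketch of that alignment follows. `nest_by_index`, the `base` rows, and the nested values are hypothetical; the two index lists are the ones shown in the diff, and filling unmatched rows with an empty list is an assumption of this sketch, not documented library behavior.

```python
from collections import defaultdict


def nest_by_index(base_rows, flat_index, flat_values, name):
    """Group flat values by their index and attach each group to its base row."""
    grouped = defaultdict(list)
    for key, value in zip(flat_index, flat_values):
        grouped[key].append(value)
    # Assumption: base rows with no matching flat rows get an empty list.
    return {key: {**row, name: grouped.get(key, [])} for key, row in base_rows.items()}


# Hypothetical base frame with three rows, indexed 0..2.
base = {0: {"a": 0.1}, 1: {"a": 0.2}, 2: {"a": 0.3}}

# Index lists taken from the notebook cells in the diff; values are made up.
nf = nest_by_index(base, [0, 0, 0, 1, 1, 1, 2, 2, 2, 2], list(range(10)), "nested.t")
nf = nest_by_index(nf, [0, 0, 1, 1, 1, 2], list(range(6)), "nested2.t")

lengths = {key: (len(row["nested.t"]), len(row["nested2.t"])) for key, row in nf.items()}
print(lengths)  # {0: (3, 2), 1: (3, 3), 2: (4, 1)}
```

Each base row ends up with its own variable-length list per nested field, which is the sense in which the two nested tables need not share dimensions.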
 
@@ -105,13 +105,6 @@
     "## Adding or Replacing Nested Columns"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "> *A Note on Performance: These operations involve full reconstruction of the nested columns so expect impacted performance when doing this at scale. It may be appropriate to do these operations within reduce functions directly (e.g. subtracting a value from a column) if performance is key.*"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -210,7 +203,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "This is functionally equivalent to using `add_nested`:"
+    "This is functionally equivalent to using `join_nested`:"
    ]
   },
   {
@@ -224,7 +217,7 @@
    },
    "outputs": [],
    "source": [
-    "ndf.add_nested(ndf[\"nested.band\"].to_frame(), \"bands_from_add_nested\")"
+    "ndf.join_nested(ndf[\"nested.band\"].to_frame(), \"bands_from_add_nested\")"
    ]
   },
   {
@@ -254,7 +247,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The above again being shorthand for the following `add_nested` call:"
+    "The above again being shorthand for the following `join_nested` call:"
    ]
   },
   {
@@ -263,7 +256,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "ndf.add_nested(flat_df, \"example_from_add_nested\")"
+    "ndf.join_nested(flat_df, \"example_from_add_nested\")"
    ]
   },
   {