pivot_wider() failing with unclear error: KeyError: None #1509

@wjandrea

Brief Description

I'm trying to do what I thought was a basic usage of pivot_wider(), but I'm getting this error and I can't figure out what's causing it.

janitor.pivot_wider(df, index=['subject', 'date'], names_from='strength')
KeyError: None

System Information

  • Operating system: Linux
  • OS details (optional): Debian 12 (Bookworm)
  • Python version (required): 3.11

Minimally Reproducible Code

df is:

   subject  pills        date  strength
0        1      4  10/10/2012       250
1        1      4  10/11/2012       250
2        1      2  10/12/2012       500
3        2      1    1/6/2014      1000
4        2      1    1/7/2014       250
5        2      1    1/7/2014       500
6        2      3    1/8/2014       250

import pandas as pd
import janitor

df = pd.DataFrame({
    'subject': [1, 1, 1, 2, 2, 2, 2],
    'pills': [4, 4, 2, 1, 1, 1, 3],
    'date': ['10/10/2012', '10/11/2012', '10/12/2012', '1/6/2014', '1/7/2014', '1/7/2014', '1/8/2014'],
    'strength': [250, 250, 500, 1000, 250, 500, 250]})

janitor.pivot_wider(df, index=['subject', 'date'], names_from='strength')
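
For comparison, here is a minimal sketch of the wide table the call above appears to be aiming for, written with plain pandas only. It assumes `pills` is the column meant to fill the cells, which the original call never states (`values_from` is left unset), so treat that choice as a guess rather than the intended behavior of `pivot_wider()`.

```python
import pandas as pd

df = pd.DataFrame({
    'subject': [1, 1, 1, 2, 2, 2, 2],
    'pills': [4, 4, 2, 1, 1, 1, 3],
    'date': ['10/10/2012', '10/11/2012', '10/12/2012', '1/6/2014', '1/7/2014', '1/7/2014', '1/8/2014'],
    'strength': [250, 250, 500, 1000, 250, 500, 250]})

# Assumption: 'pills' is the values column; the report does not say so.
wide = df.pivot(index=['subject', 'date'], columns='strength', values='pills')

# Move 'subject' and 'date' back into ordinary columns.
wide = wide.reset_index()
print(wide)
```

Each (`subject`, `date`) pair becomes one row and each distinct `strength` becomes one column, with `NaN` where a pair has no entry for that strength.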

Error Messages

Traceback (most recent call last):
  File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/pandas/core/indexes/base.py:3805 in get_loc
    return self._engine.get_loc(casted_key)
  File index.pyx:167 in pandas._libs.index.IndexEngine.get_loc
  File index.pyx:196 in pandas._libs.index.IndexEngine.get_loc
  File pandas/_libs/hashtable_class_helper.pxi:7081 in pandas._libs.hashtable.PyObjectHashTable.get_item
  File pandas/_libs/hashtable_class_helper.pxi:7089 in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: None


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  Cell In[11], line 4
    janitor.pivot_wider(df, index=['subject', 'date'], names_from='strength')
  File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/janitor/utils.py:335 in emit_warning
    return func(*args, **kwargs)
  File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/janitor/functions/pivot.py:2052 in pivot_wider
    return _computations_pivot_wider(
  File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/janitor/functions/pivot.py:2112 in _computations_pivot_wider
    out = df.pivot(  # noqa: PD010
  File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/pandas/core/frame.py:9339 in pivot
    return pivot(self, index=index, columns=columns, values=values)
  File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/pandas/core/reshape/pivot.py:566 in pivot
    indexed = data._constructor_sliced(data[values]._values, index=multiindex)
  File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/pandas/core/frame.py:4102 in __getitem__
    indexer = self.columns.get_loc(key)
  File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/pandas/core/indexes/base.py:3812 in get_loc
    raise KeyError(key) from err
KeyError: None

Verbose xmode:
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/pandas/core/indexes/base.py:3805, in Index.get_loc(self=Index(['subject', 'pills', 'date', 'strength'], dtype='object'), key=None)
   3804 try:
-> 3805     return self._engine.get_loc(casted_key)
        casted_key = None
        self = Index(['subject', 'pills', 'date', 'strength'], dtype='object')
   3806 except KeyError as err:

File index.pyx:167, in pandas._libs.index.IndexEngine.get_loc()

File index.pyx:196, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:7081, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:7089, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: None

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Cell In[11], line 4
      1 # %%
      2 import janitor
----> 4 janitor.pivot_wider(df, index=['subject', 'date'], names_from='strength')
        df =    subject  pills        date  strength
0        1      4  10/10/2012       250
1        1      4  10/11/2012       250
2        1      2  10/12/2012       500
3        2      1    1/6/2014      1000
4        2      1    1/7/2014       250
5        2      1    1/7/2014       500
6        2      3    1/8/2014       250

File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/janitor/utils.py:335, in refactored_function.<locals>.decorator.<locals>.emit_warning(*args=(   subject  pills        date  strength
0       ...      500
6        2      3    1/8/2014       250,), **kwargs={'index': ['subject', 'date'], 'names_from': 'strength'})
    332 @wraps(func)
    333 def emit_warning(*args, **kwargs):
    334     warn(message, category, stacklevel=find_stack_level())
--> 335     return func(*args, **kwargs)
        func = <function pivot_wider at 0x7f3660866480>
        args = (   subject  pills        date  strength
0        1      4  10/10/2012       250
1        1      4  10/11/2012       250
2        1      2  10/12/2012       500
3        2      1    1/6/2014      1000
4        2      1    1/7/2014       250
5        2      1    1/7/2014       500
6        2      3    1/8/2014       250,)
        kwargs = {'index': ['subject', 'date'], 'names_from': 'strength'}

File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/janitor/functions/pivot.py:2052, in pivot_wider(df=   subject  pills        date  strength
0       ...      500
6        2      3    1/8/2014       250, index=['subject', 'date'], names_from='strength', values_from=None, flatten_levels=True, names_sep='_', names_glue=None, reset_index=True, names_expand=False, index_expand=False)
   1877 """Reshapes data from *long* to *wide* form.
   1878 
   1879 !!!note
   (...)
   2047     A pandas DataFrame that has been unpivoted from long to wide form.
   2048 """  # noqa: E501
   2050 # no need for an explicit copy --> df = df.copy()
   2051 # `pd.pivot` creates one
-> 2052 return _computations_pivot_wider(
        df =    subject  pills        date  strength
0        1      4  10/10/2012       250
1        1      4  10/11/2012       250
2        1      2  10/12/2012       500
3        2      1    1/6/2014      1000
4        2      1    1/7/2014       250
5        2      1    1/7/2014       500
6        2      3    1/8/2014       250
        index = ['subject', 'date']
        names_from = 'strength'
        values_from = None
        flatten_levels = True
        names_sep = '_'
        names_glue = None
        reset_index = True
        names_expand = False
        index_expand = False
   2053     df,
   2054     index,
   2055     names_from,
   2056     values_from,
   2057     flatten_levels,
   2058     names_sep,
   2059     names_glue,
   2060     reset_index,
   2061     names_expand,
   2062     index_expand,
   2063 )

File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/janitor/functions/pivot.py:2112, in _computations_pivot_wider(df=   subject  pills        date  strength
0       ...      500
6        2      3    1/8/2014       250, index=['subject', 'date'], names_from=['strength'], values_from=None, flatten_levels=True, names_sep='_', names_glue=None, reset_index=True, names_expand=False, index_expand=False)
   2078 """
   2079 This is the main workhorse of the `pivot_wider` function.
   2080 
   (...)
   2085 A dataframe pivoted from long to wide form is returned.
   2086 """
   2088 (
   2089     df,
   2090     index,
   (...)
   2109     index_expand,
   2110 )
-> 2112 out = df.pivot(  # noqa: PD010
        df =    subject  pills        date  strength
0        1      4  10/10/2012       250
1        1      4  10/11/2012       250
2        1      2  10/12/2012       500
3        2      1    1/6/2014      1000
4        2      1    1/7/2014       250
5        2      1    1/7/2014       500
6        2      3    1/8/2014       250
        index = ['subject', 'date']
        names_from = ['strength']
        values_from = None
   2113     index=index, columns=names_from, values=values_from
   2114 )
   2116 indexer = out.index
   2117 if index_expand and index:

File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/pandas/core/frame.py:9339, in DataFrame.pivot(self=   subject  pills        date  strength
0       ...      500
6        2      3    1/8/2014       250, columns=['strength'], index=['subject', 'date'], values=None)
   9332 @Substitution("")
   9333 @Appender(_shared_docs["pivot"])
   9334 def pivot(
   9335     self, *, columns, index=lib.no_default, values=lib.no_default
   9336 ) -> DataFrame:
   9337     from pandas.core.reshape.pivot import pivot
-> 9339     return pivot(self, index=index, columns=columns, values=values)
        self =    subject  pills        date  strength
0        1      4  10/10/2012       250
1        1      4  10/11/2012       250
2        1      2  10/12/2012       500
3        2      1    1/6/2014      1000
4        2      1    1/7/2014       250
5        2      1    1/7/2014       500
6        2      3    1/8/2014       250
        index = ['subject', 'date']
        columns = ['strength']
        values = None

File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/pandas/core/reshape/pivot.py:566, in pivot(data=                       subject  pills        dat...                   2      3    1/8/2014       250, columns=['strength'], index=['subject', 'date'], values=None)
    562         indexed = data._constructor(
    563             data[values]._values, index=multiindex, columns=values
    564         )
    565     else:
--> 566         indexed = data._constructor_sliced(data[values]._values, index=multiindex)
        data =                        subject  pills        date  strength
_NoDefault.no_default                                      
0                            1      4  10/10/2012       250
1                            1      4  10/11/2012       250
2                            1      2  10/12/2012       500
3                            2      1    1/6/2014      1000
4                            2      1    1/7/2014       250
5                            2      1    1/7/2014       500
6                            2      3    1/8/2014       250
        values = None
        multiindex = MultiIndex([(1, '10/10/2012',  250),
            (1, '10/11/2012',  250),
            (1, '10/12/2012',  500),
            (2,   '1/6/2014', 1000),
            (2,   '1/7/2014',  250),
            (2,   '1/7/2014',  500),
            (2,   '1/8/2014',  250)],
           names=['subject', 'date', 'strength'])
        data._constructor_sliced = <class 'pandas.core.series.Series'>
    567 # error: Argument 1 to "unstack" of "DataFrame" has incompatible type "Union
    568 # [List[Any], ExtensionArray, ndarray[Any, Any], Index, Series]"; expected
    569 # "Hashable"
    570 result = indexed.unstack(columns_listlike)  # type: ignore[arg-type]

File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/pandas/core/frame.py:4102, in DataFrame.__getitem__(self=                       subject  pills        dat...                   2      3    1/8/2014       250, key=None)
   4100 if self.columns.nlevels > 1:
   4101     return self._getitem_multilevel(key)
-> 4102 indexer = self.columns.get_loc(key)
        key = None
        self =                        subject  pills        date  strength
_NoDefault.no_default                                      
0                            1      4  10/10/2012       250
1                            1      4  10/11/2012       250
2                            1      2  10/12/2012       500
3                            2      1    1/6/2014      1000
4                            2      1    1/7/2014       250
5                            2      1    1/7/2014       500
6                            2      3    1/8/2014       250
   4103 if is_integer(indexer):
   4104     indexer = [indexer]

File ~/micromamba/envs/anaconda3_11_plus/lib/python3.11/site-packages/pandas/core/indexes/base.py:3812, in Index.get_loc(self=Index(['subject', 'pills', 'date', 'strength'], dtype='object'), key=None)
   3807     if isinstance(casted_key, slice) or (
   3808         isinstance(casted_key, abc.Iterable)
   3809         and any(isinstance(x, slice) for x in casted_key)
   3810     ):
   3811         raise InvalidIndexError(key)
-> 3812     raise KeyError(key) from err
        key = None
   3813 except TypeError:
   3814     # If we have a listlike key, _check_indexing_error will raise
   3815     #  InvalidIndexError. Otherwise we fall through and re-raise
   3816     #  the TypeError.
   3817     self._check_indexing_error(key)

KeyError: None
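
Reading the verbose traceback, the failure seems to come from `values=None` being forwarded to `DataFrame.pivot()`, whose default in this pandas version is `lib.no_default` rather than `None`, so `None` ends up being looked up as a column label. The sketch below tries to show that with pandas alone. It is a guess at the cause based on the frames above, not a confirmed diagnosis, and the outcome of the second call may differ on pandas versions where `None` is still the default.

```python
import pandas as pd

df = pd.DataFrame({
    'subject': [1, 1, 1, 2, 2, 2, 2],
    'pills': [4, 4, 2, 1, 1, 1, 3],
    'date': ['10/10/2012', '10/11/2012', '10/12/2012', '1/6/2014', '1/7/2014', '1/7/2014', '1/8/2014'],
    'strength': [250, 250, 500, 1000, 250, 500, 250]})

# Omitting `values` lets pandas pivot every remaining column ('pills' here).
ok = df.pivot(index=['subject', 'date'], columns=['strength'])

# Passing values=None explicitly mirrors the arguments shown in the traceback.
# Where the default is lib.no_default, None is treated as a column label.
try:
    df.pivot(index=['subject', 'date'], columns=['strength'], values=None)
    outcome = 'no error'
except KeyError as err:
    outcome = f'KeyError: {err}'

print(outcome)
```

If the second call fails while the first succeeds, that would point at the forwarded `None` rather than at anything in the data.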

I'm also getting a warning which I don't believe is relevant to the problem (reverted in #1464):

FutureWarning: This function will be deprecated in a 1.x release. Please use pd.DataFrame.pivot instead.
janitor.pivot_wider(df, index=['subject', 'date'], names_from='strength')
