Commit 5fabd51

committed
Merge branch 'master' into eat
2 parents f560ea1 + dc45fba commit 5fabd51

25 files changed

+657
-83
lines changed

doc/source/extending.rst

+53-6
Original file line numberDiff line numberDiff line change
@@ -61,7 +61,7 @@ Extension Types
6161

6262
.. warning::
6363

64-
The :class:`pandas.api.extension.ExtensionDtype` and :class:`pandas.api.extension.ExtensionArray` APIs are new and
64+
The :class:`pandas.api.extensions.ExtensionDtype` and :class:`pandas.api.extensions.ExtensionArray` APIs are new and
6565
experimental. They may change between versions without warning.
6666

6767
Pandas defines an interface for implementing data types and arrays that *extend*
@@ -79,10 +79,10 @@ on :ref:`ecosystem.extensions`.
7979

8080
The interface consists of two classes.
8181

82-
:class:`~pandas.api.extension.ExtensionDtype`
82+
:class:`~pandas.api.extensions.ExtensionDtype`
8383
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8484

85-
A :class:`pandas.api.extension.ExtensionDtype` is similar to a ``numpy.dtype`` object. It describes the
85+
A :class:`pandas.api.extensions.ExtensionDtype` is similar to a ``numpy.dtype`` object. It describes the
8686
data type. Implementors are responsible for a few unique items like the name.
8787

8888
One particularly important item is the ``type`` property. This should be the
@@ -99,9 +99,8 @@ example ``'category'`` is a registered string accessor for the ``CategoricalDtyp
9999

100100
See the `extension dtype dtypes`_ for more on how to register dtypes.
101101

102-
103-
:class:`~pandas.api.extension.ExtensionArray`
104-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
102+
:class:`~pandas.api.extensions.ExtensionArray`
103+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
105104

106105
This class provides all the array-like functionality. ExtensionArrays are
107106
limited to 1 dimension. An ExtensionArray is linked to an ExtensionDtype via the
@@ -122,6 +121,54 @@ by some other storage type, like Python lists.
122121
See the `extension array source`_ for the interface definition. The docstrings
123122
and comments contain guidance for properly implementing the interface.
124123

124+
.. _extending.extension.operator:
125+
126+
:class:`~pandas.api.extensions.ExtensionArray` Operator Support
127+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
128+
129+
.. versionadded:: 0.24.0
130+
131+
By default, there are no operators defined for the class :class:`~pandas.api.extensions.ExtensionArray`.
132+
There are two approaches for providing operator support for your ExtensionArray:
133+
134+
1. Define each of the operators on your ``ExtensionArray`` subclass.
135+
2. Use an operator implementation from pandas that depends on operators that are already defined
136+
on the underlying elements (scalars) of the ExtensionArray.
137+
138+
For the first approach, you define selected operators, e.g., ``__add__``, ``__le__``, etc. that
139+
you want your ``ExtensionArray`` subclass to support.
140+
141+
The second approach assumes that the underlying elements (i.e., scalar type) of the ``ExtensionArray``
142+
have the individual operators already defined. In other words, if your ``ExtensionArray``
143+
named ``MyExtensionArray`` is implemented so that each element is an instance
144+
of the class ``MyExtensionElement``, then if the operators are defined
145+
for ``MyExtensionElement``, the second approach will automatically
146+
define the operators for ``MyExtensionArray``.
147+
148+
A mixin class, :class:`~pandas.api.extensions.ExtensionScalarOpsMixin`, supports this second
149+
approach. If you are developing an ``ExtensionArray`` subclass, for example ``MyExtensionArray``,
150+
you can simply include ``ExtensionScalarOpsMixin`` as a parent class of ``MyExtensionArray``,
151+
and then call the methods :meth:`~MyExtensionArray._add_arithmetic_ops` and/or
152+
:meth:`~MyExtensionArray._add_comparison_ops` to hook the operators into
153+
your ``MyExtensionArray`` class, as follows:
154+
155+
.. code-block:: python
156+
157+
    class MyExtensionArray(ExtensionArray, ExtensionScalarOpsMixin):
158+
        pass
159+
160+
    MyExtensionArray._add_arithmetic_ops()
161+
    MyExtensionArray._add_comparison_ops()
162+
163+
Note that since ``pandas`` automatically calls the underlying operator on each
164+
element one-by-one, this might not be as performant as implementing your own
165+
version of the associated operators directly on the ``ExtensionArray``.
166+
167+
.. _extending.extension.testing:
168+
169+
Testing Extension Arrays
170+
^^^^^^^^^^^^^^^^^^^^^^^^
171+
125172
We provide a test suite for ensuring that your extension arrays satisfy the expected
126173
behavior. To use the test suite, you must provide several pytest fixtures and inherit
127174
from the base test class. The required fixtures are found in
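
Reviewer note: the new documentation section shows code only for the second approach. For comparison, here is a minimal sketch of the first approach, where the operators are written directly on the array class. `CentsArray` and its `_aligned` helper are hypothetical names made up for this illustration; the class is a plain Python container, not a real `ExtensionArray` subclass, and a real one would also need the rest of the interface.

```python
class CentsArray:
    """Hypothetical 1-D container of integer cents (not a pandas class)."""

    def __init__(self, cents):
        self._cents = [int(c) for c in cents]

    def __len__(self):
        return len(self._cents)

    def _aligned(self, other):
        # Accept another CentsArray of equal length, or a single int
        # that is broadcast to the length of this array.
        if isinstance(other, CentsArray):
            if len(other) != len(self):
                raise ValueError("lengths must match")
            return other._cents
        if isinstance(other, int):
            return [other] * len(self)
        return None

    def __add__(self, other):
        rhs = self._aligned(other)
        if rhs is None:
            return NotImplemented  # let Python try the reflected operator
        return CentsArray(a + b for a, b in zip(self._cents, rhs))

    def __le__(self, other):
        rhs = self._aligned(other)
        if rhs is None:
            return NotImplemented
        return [a <= b for a, b in zip(self._cents, rhs)]


prices = CentsArray([199, 250])
shipped = prices + 50
cheap = prices <= CentsArray([200, 200])
```

Writing the operators by hand gives the class full control over operand validation and return types, which is the trade-off against the convenience of the mixin.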

doc/source/whatsnew/v0.23.2.txt

+1
Original file line numberDiff line numberDiff line change
@@ -57,6 +57,7 @@ Fixed Regressions
5757
- Bug in both :meth:`DataFrame.first_valid_index` and :meth:`Series.first_valid_index` raised for a row index having duplicate values (:issue:`21441`)
5858
- Fixed regression in unary negative operations with object dtype (:issue:`21380`)
5959
- Bug in :meth:`Timestamp.ceil` and :meth:`Timestamp.floor` when timestamp is a multiple of the rounding frequency (:issue:`21262`)
60+
- Fixed regression in :func:`to_clipboard` that defaulted to copying dataframes with space delimited instead of tab delimited (:issue:`21104`)
6061

6162
.. _whatsnew_0232.performance:
6263

doc/source/whatsnew/v0.24.0.txt

+18
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,22 @@ New features
1010

1111
- ``ExcelWriter`` now accepts ``mode`` as a keyword argument, enabling append to existing workbooks when using the ``openpyxl`` engine (:issue:`3441`)
1212

13+
.. _whatsnew_0240.enhancements.extension_array_operators:
14+
15+
``ExtensionArray`` operator support
16+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17+
18+
A ``Series`` based on an ``ExtensionArray`` now supports arithmetic and comparison
19+
operators (:issue:`19577`). There are two approaches for providing operator support for an ``ExtensionArray``:
20+
21+
1. Define each of the operators on your ``ExtensionArray`` subclass.
22+
2. Use an operator implementation from pandas that depends on operators that are already defined
23+
on the underlying elements (scalars) of the ``ExtensionArray``.
24+
25+
See the :ref:`ExtensionArray Operator Support
26+
<extending.extension.operator>` documentation section for details on both
27+
ways of adding operator support.
28+
1329
.. _whatsnew_0240.enhancements.other:
1430

1531
Other Enhancements
@@ -118,6 +134,7 @@ Datetimelike API Changes
118134

119135
- For :class:`DatetimeIndex` and :class:`TimedeltaIndex` with non-``None`` ``freq`` attribute, addition or subtraction of integer-dtyped array or ``Index`` will return an object of the same class (:issue:`19959`)
120136
- :class:`DateOffset` objects are now immutable. Attempting to alter one of these will now raise ``AttributeError`` (:issue:`21341`)
137+
- :class:`PeriodIndex` subtraction of another ``PeriodIndex`` will now return an object-dtype :class:`Index` of :class:`DateOffset` objects instead of raising a ``TypeError`` (:issue:`20049`)
121138

122139
.. _whatsnew_0240.api.extension:
123140

@@ -127,6 +144,7 @@ ExtensionType Changes
127144
- ``ExtensionDtype`` has gained the ability to instantiate from string dtypes, e.g. ``decimal`` would instantiate a registered ``DecimalDtype``; furthermore
128145
the ``ExtensionDtype`` has gained the method ``construct_array_type`` (:issue:`21185`)
129146
- The ``ExtensionArray`` constructor, ``_from_sequence``, now takes the keyword arg ``copy=False`` (:issue:`21185`)
147+
- Bug in :meth:`Series.get` for ``Series`` using ``ExtensionArray`` and integer index (:issue:`21257`)
130148
- :meth:`Series.combine()` works correctly with :class:`~pandas.api.extensions.ExtensionArray` inside of :class:`Series` (:issue:`20825`)
131149
- :meth:`Series.combine()` with scalar argument now works for any function type (:issue:`21248`)
132150
-

pandas/api/extensions/__init__.py

+2-1
Original file line numberDiff line numberDiff line change
@@ -3,5 +3,6 @@
33
                                  register_index_accessor,
44
                                  register_series_accessor)
55
from pandas.core.algorithms import take # noqa
6-
from pandas.core.arrays.base import ExtensionArray # noqa
6+
from pandas.core.arrays.base import (ExtensionArray, # noqa
7+
                                     ExtensionScalarOpsMixin)
78
from pandas.core.dtypes.dtypes import ExtensionDtype # noqa

pandas/conftest.py

+18-18
Original file line numberDiff line numberDiff line change
@@ -89,7 +89,8 @@ def observed(request):
8989
                             '__mul__', '__rmul__',
9090
                             '__floordiv__', '__rfloordiv__',
9191
                             '__truediv__', '__rtruediv__',
92-
                             '__pow__', '__rpow__']
92+
                             '__pow__', '__rpow__',
93+
                             '__mod__', '__rmod__']
9394
if not PY3:
9495
    _all_arithmetic_operators.extend(['__div__', '__rdiv__'])
9596

@@ -102,6 +103,22 @@ def all_arithmetic_operators(request):
102103
    return request.param
103104

104105

106+
@pytest.fixture(params=['__eq__', '__ne__', '__le__',
107+
                        '__lt__', '__ge__', '__gt__'])
108+
def all_compare_operators(request):
109+
    """
110+
    Fixture for dunder names for common compare operations
111+
112+
    * >=
113+
    * >
114+
    * ==
115+
    * !=
116+
    * <
117+
    * <=
118+
    """
119+
    return request.param
120+
121+
105122
@pytest.fixture(params=[None, 'gzip', 'bz2', 'zip',
106123
                        pytest.param('xz', marks=td.skip_if_no_lzma)])
107124
def compression(request):
@@ -320,20 +337,3 @@ def mock():
320337
        return importlib.import_module("unittest.mock")
321338
    else:
322339
        return pytest.importorskip("mock")
323-
324-
325-
@pytest.fixture(params=['__eq__', '__ne__', '__le__',
326-
                        '__lt__', '__ge__', '__gt__'])
327-
def all_compare_operators(request):
328-
    """
329-
    Fixture for dunder names for common compare operations
330-
331-
    * >=
332-
    * >
333-
    * ==
334-
    * !=
335-
    * <
336-
    * <=
337-
    """
338-
339-
    return request.param
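
Reviewer note: this hunk moves the `all_compare_operators` fixture higher in `conftest.py`; its parameter list is unchanged. A test that requests the fixture receives one dunder name per run. The sketch below shows, without pytest, how such a name can be turned back into a callable and checked. `resolve` and `check_against_scalars` are hypothetical helpers written for this note, not part of the pandas test suite.

```python
import operator

# The same dunder names that the fixture parametrizes over.
COMPARE_DUNDERS = ['__eq__', '__ne__', '__le__',
                   '__lt__', '__ge__', '__gt__']


def resolve(dunder_name):
    """Map a dunder name such as '__le__' to the operator-module function."""
    return getattr(operator, dunder_name)


def check_against_scalars(dunder_name, left, right):
    """Hypothetical test body: compute the element-wise result two ways."""
    op = resolve(dunder_name)
    expected = [op(a, b) for a, b in zip(left, right)]
    via_method = [getattr(a, dunder_name)(b) for a, b in zip(left, right)]
    return expected, via_method


# pytest would run one test per name; a plain loop stands in for that here.
results = {name: check_against_scalars(name, [1, 2, 3], [2, 2, 2])
           for name in COMPARE_DUNDERS}
```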

pandas/core/arrays/__init__.py

+2-1
Original file line numberDiff line numberDiff line change
@@ -1,2 +1,3 @@
1-
from .base import ExtensionArray # noqa
1+
from .base import (ExtensionArray, # noqa
2+
                   ExtensionScalarOpsMixin)
23
from .categorical import Categorical # noqa

pandas/core/arrays/base.py

+127
Original file line numberDiff line numberDiff line change
@@ -7,8 +7,13 @@
77
"""
88
import numpy as np
99

10+
import operator
11+
1012
from pandas.errors import AbstractMethodError
1113
from pandas.compat.numpy import function as nv
14+
from pandas.compat import set_function_name, PY3
15+
from pandas.core.dtypes.common import is_list_like
16+
from pandas.core import ops
1217

1318
_not_implemented_message = "{} does not implement {}."
1419

@@ -623,3 +628,125 @@ def _ndarray_values(self):
623628
        used for interacting with our indexers.
624629
        """
625630
        return np.array(self)
631+
632+
633+
class ExtensionOpsMixin(object):
634+
    """
635+
    A base class for linking the operators to their dunder names
636+
    """
637+
    @classmethod
638+
    def _add_arithmetic_ops(cls):
639+
        cls.__add__ = cls._create_arithmetic_method(operator.add)
640+
        cls.__radd__ = cls._create_arithmetic_method(ops.radd)
641+
        cls.__sub__ = cls._create_arithmetic_method(operator.sub)
642+
        cls.__rsub__ = cls._create_arithmetic_method(ops.rsub)
643+
        cls.__mul__ = cls._create_arithmetic_method(operator.mul)
644+
        cls.__rmul__ = cls._create_arithmetic_method(ops.rmul)
645+
        cls.__pow__ = cls._create_arithmetic_method(operator.pow)
646+
        cls.__rpow__ = cls._create_arithmetic_method(ops.rpow)
647+
        cls.__mod__ = cls._create_arithmetic_method(operator.mod)
648+
        cls.__rmod__ = cls._create_arithmetic_method(ops.rmod)
649+
        cls.__floordiv__ = cls._create_arithmetic_method(operator.floordiv)
650+
        cls.__rfloordiv__ = cls._create_arithmetic_method(ops.rfloordiv)
651+
        cls.__truediv__ = cls._create_arithmetic_method(operator.truediv)
652+
        cls.__rtruediv__ = cls._create_arithmetic_method(ops.rtruediv)
653+
        if not PY3:
654+
            cls.__div__ = cls._create_arithmetic_method(operator.div)
655+
            cls.__rdiv__ = cls._create_arithmetic_method(ops.rdiv)
656+
657+
        cls.__divmod__ = cls._create_arithmetic_method(divmod)
658+
        cls.__rdivmod__ = cls._create_arithmetic_method(ops.rdivmod)
659+
660+
    @classmethod
661+
    def _add_comparison_ops(cls):
662+
        cls.__eq__ = cls._create_comparison_method(operator.eq)
663+
        cls.__ne__ = cls._create_comparison_method(operator.ne)
664+
        cls.__lt__ = cls._create_comparison_method(operator.lt)
665+
        cls.__gt__ = cls._create_comparison_method(operator.gt)
666+
        cls.__le__ = cls._create_comparison_method(operator.le)
667+
        cls.__ge__ = cls._create_comparison_method(operator.ge)
668+
669+
670+
class ExtensionScalarOpsMixin(ExtensionOpsMixin):
671+
    """A mixin for defining the arithmetic and logical operations on
672+
    an ExtensionArray class, where it is assumed that the underlying objects
673+
    have the operators already defined.
674+
675+
    Usage
676+
    ------
677+
    If you have defined a subclass MyExtensionArray(ExtensionArray), then
678+
    use MyExtensionArray(ExtensionArray, ExtensionScalarOpsMixin) to
679+
    get the arithmetic operators. After the definition of MyExtensionArray,
680+
    insert the lines
681+
682+
    MyExtensionArray._add_arithmetic_ops()
683+
    MyExtensionArray._add_comparison_ops()
684+
685+
    to link the operators to your class.
686+
    """
687+
688+
    @classmethod
689+
    def _create_method(cls, op, coerce_to_dtype=True):
690+
        """
691+
        A class method that returns a method that will correspond to an
692+
        operator for an ExtensionArray subclass, by dispatching to the
693+
        relevant operator defined on the individual elements of the
694+
        ExtensionArray.
695+
696+
        Parameters
697+
        ----------
698+
        op : function
699+
            An operator that takes arguments op(a, b)
700+
        coerce_to_dtype : bool
701+
            boolean indicating whether to attempt to convert
702+
            the result to the underlying ExtensionArray dtype
703+
            (default True)
704+
705+
        Returns
706+
        -------
707+
        A method that can be bound to a method of a class
708+
709+
        Example
710+
        -------
711+
        Given an ExtensionArray subclass called MyExtensionArray, use
712+
713+
        >>> __add__ = cls._create_method(operator.add)
714+
715+
        in the class definition of MyExtensionArray to create the operator
716+
        for addition, that will be based on the operator implementation
717+
        of the underlying elements of the ExtensionArray
718+
719+
        """
720+
721+
        def _binop(self, other):
722+
            def convert_values(param):
723+
                if isinstance(param, ExtensionArray) or is_list_like(param):
724+
                    ovalues = param
725+
                else: # Assume it's an object
726+
                    ovalues = [param] * len(self)
727+
                return ovalues
728+
            lvalues = self
729+
            rvalues = convert_values(other)
730+
731+
            # If the operator is not defined for the underlying objects,
732+
            # a TypeError should be raised
733+
            res = [op(a, b) for (a, b) in zip(lvalues, rvalues)]
734+
735+
            if coerce_to_dtype:
736+
                try:
737+
                    res = self._from_sequence(res)
738+
                except TypeError:
739+
                    pass
740+
741+
            return res
742+
743+
        op_name = ops._get_op_name(op, True)
744+
        return set_function_name(_binop, op_name, cls)
745+
746+
    @classmethod
747+
    def _create_arithmetic_method(cls, op):
748+
        return cls._create_method(op)
749+
750+
    @classmethod
751+
    def _create_comparison_method(cls, op):
752+
        return cls._create_method(op, coerce_to_dtype=False)
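
Reviewer note: the steps inside `_binop` (broadcast a scalar operand, apply the operator pairwise, then optionally coerce the result) can be tried without pandas. The sketch below is a simplified stand-in, not the committed code: `DecimalList` and the free function `binop` are hypothetical, and the list-like check is reduced to an `isinstance` test where the hunk uses `is_list_like`.

```python
import operator
from decimal import Decimal


class DecimalList:
    """Hypothetical list-backed container; stands in for an ExtensionArray."""

    def __init__(self, values):
        self._data = list(values)

    @classmethod
    def _from_sequence(cls, scalars):
        # Reject anything that is not a Decimal, so that coercion can fail.
        if not all(isinstance(x, Decimal) for x in scalars):
            raise TypeError("expected Decimal elements")
        return cls(scalars)

    def __len__(self):
        return len(self._data)

    def __iter__(self):
        return iter(self._data)


def binop(op, left, right, coerce_to_dtype=True):
    """Apply op element by element, following the steps of _binop."""
    if isinstance(right, (DecimalList, list, tuple)):
        rvalues = right
    else:
        rvalues = [right] * len(left)  # broadcast a scalar operand

    res = [op(a, b) for a, b in zip(left, rvalues)]

    if coerce_to_dtype:
        try:
            res = left._from_sequence(res)
        except TypeError:
            pass  # coercion failed: keep the plain list
    return res


arr = DecimalList([Decimal("1.5"), Decimal("2.5")])
summed = binop(operator.add, arr, Decimal("1"))
flags = binop(operator.gt, arr, Decimal("2"), coerce_to_dtype=False)
fallback = binop(operator.gt, arr, Decimal("2"))
short = binop(operator.add, arr, [Decimal("1")])
```

In this sketch, arithmetic comes back as a `DecimalList`, while a comparison comes back as a plain list whether coercion is skipped or attempted and rejected. The last call shows that `zip` stops at the shorter operand. The hunk uses the same `zip` call, so mismatched lengths appear to be truncated rather than rejected there too, which may be worth a follow-up check.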

pandas/core/indexes/base.py

+7-3
Original file line numberDiff line numberDiff line change
@@ -2988,16 +2988,20 @@ def get_value(self, series, key):
29882988
        # use this, e.g. DatetimeIndex
29892989
        s = getattr(series, '_values', None)
29902990
        if isinstance(s, (ExtensionArray, Index)) and is_scalar(key):
2991-
            # GH 20825
2991+
            # GH 20882, 21257
29922992
            # Unify Index and ExtensionArray treatment
29932993
            # First try to convert the key to a location
2994-
            # If that fails, see if key is an integer, and
2994+
            # If that fails, raise a KeyError if an integer
2995+
            # index, otherwise, see if key is an integer, and
29952996
            # try that
29962997
            try:
29972998
                iloc = self.get_loc(key)
29982999
                return s[iloc]
29993000
            except KeyError:
3000-
                if is_integer(key):
3001+
                if (len(self) > 0 and
3002+
                        self.inferred_type in ['integer', 'boolean']):
3003+
                    raise
3004+
                elif is_integer(key):
30013005
                    return s[key]
30023006

30033007
        s = com._values_from_object(series)
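
Reviewer note: as I read this hunk, a missing integer key on a non-empty integer or boolean index now raises `KeyError` instead of falling back to positional access. The sketch below models that lookup order with plain lists. `get_value` here is a hypothetical stand-in with an invented signature, and it simplifies the final case: the real method continues to further code when the key is not an integer, whereas this sketch raises `KeyError`.

```python
def get_value(labels, values, key, inferred_type):
    """Hypothetical model of the lookup order in the hunk above.

    labels        -- list of index labels
    values        -- list of array values, same length as labels
    inferred_type -- e.g. 'integer', 'boolean' or 'string'
    """
    try:
        pos = labels.index(key)  # label-based lookup first
    except ValueError:
        # The label is absent. On a non-empty integer or boolean index,
        # do not reinterpret the key as a position.
        if len(labels) > 0 and inferred_type in ('integer', 'boolean'):
            raise KeyError(key)
        if isinstance(key, int):
            return values[key]  # positional fallback
        raise KeyError(key)  # simplification, see the note above
    return values[pos]


by_label = get_value(['a', 'b'], [10, 20], 'b', 'string')
by_position = get_value(['a', 'b'], [10, 20], 1, 'string')
int_label = get_value([5, 6], [10, 20], 6, 'integer')
```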
