How should we type Series with a categorical dtype? #1415

@cmp0xff

After #1383 and #1391, pd.Series([1], dtype="category") gives Series[CategoricalDtype].

Is this the best approach? Typically we use the element type as the generic parameter of a Series or Index, for example Series[int] rather than Series[Int64Dtype]. The current approach seems inconsistent to me.

This is also related to #1395, which covers these two cases:

  • Series([1], dtype=int): the dtype is np.dtype("int64") and the element type is np.int64. Currently this is Series[int] in pandas-stubs.
  • Series([1], dtype="Int64"): the dtype is pd.Int64Dtype and the element type is (built-in) int. Currently this is Series[Any] in pandas-stubs.

We will discuss that part in #1395; here it serves only as an illustrative example.
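To make the dtype versus element type distinction concrete, here is a minimal sketch that prints both for the three Series mentioned in this issue. The helper `dtype_and_element` is a hypothetical name introduced only for illustration, and the element types are printed rather than asserted because they may depend on the pandas version and on how the element is accessed.

```python
import numpy as np
import pandas as pd


def dtype_and_element(series: pd.Series) -> tuple[object, type]:
    """Return the runtime dtype object and the type of the first element.

    Hypothetical helper for illustration; it is not part of pandas or pandas-stubs.
    """
    return series.dtype, type(series.iloc[0])


# The three constructions discussed in this issue.
examples = {
    "dtype=int": pd.Series([1], dtype=int),
    'dtype="Int64"': pd.Series([1], dtype="Int64"),
    'dtype="category"': pd.Series([1], dtype="category"),
}

for label, series in examples.items():
    dtype, element_type = dtype_and_element(series)
    # Show the dtype (a candidate generic parameter) next to the element type
    # (the other candidate) so the two conventions can be compared.
    print(f"{label:18} dtype={dtype!r:30} element={element_type.__name__}")
```

This only shows what exists at runtime. Which of the two should parameterize Series in pandas-stubs is the open question here.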

Metadata

    Labels

    Categorical (Categorical Data Type), Needs Discussion (Requires discussion from core team before further action), Needs Triage (Issue that has not been reviewed by a pandas team member), Series (Series data structure), help wanted (Extra attention is needed)