BUG: InvalidIndexError raised in Categorical.isin for categorical backed by interval with overlapping intervals #34974

@TomAugspurger

Code Sample, a copy-pastable example

In [1]: import pandas as pd
In [2]: idx = pd.IntervalIndex([pd.Interval(0, 2), pd.Interval(0, 1)])

In [3]: pd.Categorical(idx).isin(idx)
---------------------------------------------------------------------------
InvalidIndexError                         Traceback (most recent call last)
<ipython-input-3-801ab88eb4d0> in <module>
----> 1 pd.Categorical(idx).isin(idx)

~/sandbox/pandas/pandas/core/arrays/categorical.py in isin(self, values)
   2359         values = sanitize_array(values, None, None)
   2360         null_mask = np.asarray(isna(values))
-> 2361         code_values = self.categories.get_indexer(values)
   2362         code_values = code_values[null_mask | (code_values >= 0)]
   2363         return algorithms.isin(self.codes, code_values)

~/sandbox/pandas/pandas/core/indexes/interval.py in get_indexer(self, target, method, limit, tolerance)
    764         if self.is_overlapping:
    765             raise InvalidIndexError(
--> 766                 "cannot handle overlapping indices; "
    767                 "use IntervalIndex.get_indexer_non_unique"
    768             )

InvalidIndexError: cannot handle overlapping indices; use IntervalIndex.get_indexer_non_unique

Problem description

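The result is unambiguous: every element of idx is itself one of the categories, so isin can match each value exactly even though the intervals overlap. Raising InvalidIndexError here is unnecessary.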

Expected Output

array([ True,  True])

I think the call to self.categories.get_indexer in Categorical.isin could be self.categories.get_indexer_for instead, since get_indexer_for handles an overlapping IntervalIndex rather than raising.
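As a rough illustration of that suggestion, the sketch below repeats the three steps from the traceback outside of Categorical.isin, swapping get_indexer for get_indexer_for. It is not the pandas implementation: np.isin and pd.isna stand in for the internal algorithms.isin and isna helpers, and the variable names are only for this example.

```python
import numpy as np
import pandas as pd

# Same overlapping intervals as in the report.
idx = pd.IntervalIndex([pd.Interval(0, 2), pd.Interval(0, 1)])
cat = pd.Categorical(idx)

# The categories form an overlapping IntervalIndex, which is why
# get_indexer refuses to handle them.
assert cat.categories.is_overlapping

# get_indexer_for dispatches to get_indexer_non_unique when the index
# cannot be treated as unique, so it does not raise here.
code_values = cat.categories.get_indexer_for(idx)

# Mirror the remaining lines of isin: keep codes for matched values
# (and for nulls), then test membership of each element's code.
null_mask = np.asarray(pd.isna(idx))
code_values = code_values[null_mask | (code_values >= 0)]
result = np.isin(cat.codes, code_values)  # stand-in for algorithms.isin

print(result)
```

If this behaves as assumed, the printed array should agree with the expected output above, because each interval in idx is matched exactly against a category.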
