Closed
Labels
Indexing: Related to indexing on series/frames, not to indexes themselves
MultiIndex
Performance: Memory or execution speed performance
Regression: Functionality that used to work in a prior pandas version
Milestone
Description
Indexing a MultiIndex seemingly went from O(1) to O(N).
I did a bisect, and found this was caused by the _shallow_copy
here: b0f33b3#diff-4ffd1c69d47e0ac9f2de4f9e3e4a118cR643.
Code Sample
from time import perf_counter as time

import pandas as pd

for N in [1000, 2000, 4000, 8000, 16000, 32000]:
    values = list(range(N))
    df = pd.DataFrame({'a': values})
    df['b'] = 1
    df.set_index(['a', 'b'], inplace=True)

    t = time()
    df.loc[values]
    t = time() - t
    print(N, t)
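To read the (N, t) pairs printed by the script above, it may help to estimate the growth exponent from a log-log fit. The following is only a sketch using the standard library; `scaling_exponent` is a hypothetical helper name, not part of pandas, and the timings in the usage lines are synthetic placeholders rather than measured results.

```python
import math


def scaling_exponent(sizes, timings):
    """Least-squares slope of log(time) against log(size).

    A slope near 1 suggests total time linear in N; a slope near 2
    suggests quadratic total time.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in timings]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


# Synthetic timings for illustration only (not measurements from pandas).
sizes = [1000, 2000, 4000, 8000, 16000, 32000]
linear_timings = [n * 1e-6 for n in sizes]
quadratic_timings = [n * n * 1e-9 for n in sizes]

linear_slope = scaling_exponent(sizes, linear_timings)
quadratic_slope = scaling_exponent(sizes, quadratic_timings)
```

Since `df.loc[values]` looks up N keys in one call, a constant cost per key would give a slope close to 1, while a per-key cost that grows with N would push the slope toward 2. Real timings are noisy, so the fitted slope is only a rough indicator.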