
PERF: slowdown in groupby/resample mean() method #39622

@jorisvandenbossche

See https://pandas.pydata.org/speed/pandas/#timeseries.ResampleSeries.time_resample?python=3.8&Cython=0.29.21&p-index='datetime'&p-freq='1D'&p-method='mean'&commits=812c3012-71a4cb69

The regression falls in a period when no benchmark runs were recorded, so there is no clear indication of which commit (or commit range) is responsible.

Reproducer:

import numpy as np
import pandas as pd

idx = pd.date_range(start="1/1/2000", end="1/1/2001", freq="T")
s = pd.Series(np.random.randn(len(idx)), index=idx)
%timeit s.resample("1D").mean()

Last release:

In [2]: %timeit s.resample("1D").mean()
4.45 ms ± 507 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [3]: pd.__version__
Out[3]: '1.2.1'

On master:

In [2]: %timeit s.resample("1D").mean()
6.33 ms ± 430 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

So that is roughly a 40% slowdown (4.45 ms to 6.33 ms).
It also seems somewhat specific to mean (e.g. I don't see a similar slowdown for max).
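For anyone who wants to check this outside IPython, here is a minimal sketch of a standalone script that times `mean` next to `max` on the same kind of data. The helper name `time_resample` and its parameters are made up for illustration and are not part of pandas or the asv benchmark suite. It uses the `"min"` frequency alias rather than `"T"`, on the assumption that the script may also be run on newer pandas versions where `"T"` is deprecated.

```python
import timeit

import numpy as np
import pandas as pd


def time_resample(method, periods=527_041, repeat=5, number=10):
    """Return the best per-call time (seconds) of s.resample("1D").<method>().

    Hypothetical helper for this issue; the default `periods` matches the
    length of the minute-frequency index in the reproducer above.
    """
    idx = pd.date_range(start="2000-01-01", periods=periods, freq="min")
    s = pd.Series(np.random.randn(len(idx)), index=idx)

    def call():
        # Look up the aggregation by name so mean/max share one code path.
        return getattr(s.resample("1D"), method)()

    # Take the minimum over repeats to reduce noise from other processes.
    timings = timeit.repeat(call, repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    print("pandas", pd.__version__)
    for method in ("mean", "max"):
        print(f"{method}: {time_resample(method) * 1e3:.2f} ms per call")
```

Running the same script under the last release and under master should make it easier to see whether only `mean` moves while `max` stays flat.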

Labels: Performance (memory or execution speed performance), Regression (functionality that used to work in a prior pandas version), Resample (resample method)
