Closed
Labels
Performance: Memory or execution speed performance
Description
NumPy 1.17 introduced a new random module with faster PRNGs. It also drops the guarantee of strictly reproducible random streams, which allows some algorithmic improvements. In particular, the choice method is now a lot faster in the replace=False case. Would it make sense for random_state (https://github.com/pandas-dev/pandas/blob/master/pandas/core/common.py#L408) to return a np.random.Generator instead of a np.random.RandomState when the NumPy version is >= 1.17? This would automatically speed up the DataFrame.sample method, for instance.
I can write a PR, but I'm not sure how to handle the different NumPy versions. Should I just do the version checks inside random_state, or is this something that needs to go inside numpy.compat?
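
To make the question concrete, here is a minimal sketch of what doing the check inside the helper could look like. It is not the actual pandas implementation: the names has_generator and random_state_sketch are hypothetical, and it detects np.random.Generator by feature rather than by parsing the version string, which is only one of several reasonable options.

```python
import numpy as np


def has_generator():
    # np.random.Generator was added in NumPy 1.17, so checking for the
    # attribute avoids parsing np.__version__ (an assumption of this sketch).
    return hasattr(np.random, "Generator")


def random_state_sketch(state=None):
    """Hypothetical stand-in for a random_state-style helper.

    Returns a np.random.Generator when NumPy provides one, otherwise a
    np.random.RandomState. Existing generator objects are passed through.
    """
    # Always pass an existing RandomState through unchanged.
    if isinstance(state, np.random.RandomState):
        return state
    if has_generator():
        if isinstance(state, np.random.Generator):
            return state
        # default_rng accepts None, an int seed, or an array-like of ints.
        return np.random.default_rng(state)
    # Older NumPy: fall back to the legacy interface.
    return np.random.RandomState(state)


rng = random_state_sketch(0)
# Both Generator and RandomState expose choice(..., replace=False).
sample = rng.choice(100, size=10, replace=False)
```

One consequence of this design is that the same integer seed produces a different stream depending on the installed NumPy version, which is the reproducibility trade-off mentioned above.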