I observed that Python's built-in random.sample is much faster than NumPy's np.random.choice. Taking a small sample from a list of 1 million elements, random.sample is more than 1000x faster than its NumPy counterpart.
In [1]: import numpy as np
In [2]: import random
In [3]: arr = [x for x in range(1000000)]
In [4]: nparr = np.array(arr)
In [5]: %timeit random.sample(arr, 5)
The slowest run took 5.25 times longer than the fastest. This could mean that an intermediate result is being cached
100000 loops, best of 3: 4.54 µs per loop
In [6]: %timeit np.random.choice(arr, 5)
10 loops, best of 3: 47.7 ms per loop
In [7]: %timeit np.random.choice(nparr, 5)
The slowest run took 6.79 times longer than the fastest. This could mean that an intermediate result is being cached
100000 loops, best of 3: 7.79 µs per loop
Sampling from a NumPy array with np.random.choice was reasonably fast, but it was still slower than random.sample on the plain list.
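For anyone who wants to reproduce this outside IPython, here is a rough sketch of the same three measurements using the standard timeit module. The function name time_sampling and its parameters are my own, chosen for illustration, and the absolute numbers will vary by machine and NumPy version.

```python
import random
import timeit

import numpy as np


def time_sampling(n=1_000_000, k=5, repeats=10):
    """Return the average seconds per call for the three cases timed above.

    `time_sampling`, `n`, `k` and `repeats` are illustrative names,
    not part of either library.
    """
    arr = list(range(n))   # plain Python list
    nparr = np.array(arr)  # the same data as a NumPy array
    cases = {
        "random.sample(list)": lambda: random.sample(arr, k),
        "np.random.choice(list)": lambda: np.random.choice(arr, k),
        "np.random.choice(ndarray)": lambda: np.random.choice(nparr, k),
    }
    # timeit.timeit returns the total time for `number` calls.
    return {
        name: timeit.timeit(fn, number=repeats) / repeats
        for name, fn in cases.items()
    }


if __name__ == "__main__":
    for name, seconds in time_sampling().items():
        print(f"{name}: {seconds * 1e6:.2f} us per call")
```

Building the list and the array once, outside the timed lambdas, keeps the setup cost out of the measurement, as in the session above.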
Is the observation above correct, or am I missing the difference between what random.sample and np.random.choice compute?
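To make the question concrete, here is a minimal sketch of the one difference I could confirm from the documented defaults: random.sample draws without replacement, while np.random.choice draws with replacement unless replace=False is passed. The helper names below are hypothetical and only serve the illustration.

```python
import random

import numpy as np


def sample_stdlib(population, k):
    """random.sample: always without replacement, returns a list."""
    return random.sample(population, k)


def sample_numpy(population, k, replace=True):
    """np.random.choice: replace=True is the default, so repeats are allowed.

    A list argument is converted to an ndarray inside the call.
    """
    return np.random.choice(population, k, replace=replace).tolist()


population = list(range(10))

# Drawing every element without replacement yields a permutation.
assert sorted(sample_stdlib(population, 10)) == population
assert sorted(sample_numpy(population, 10, replace=False)) == population

# With the default replace=True, duplicates may appear, so the result
# is not guaranteed to be a permutation of the population.
with_repeats = sample_numpy(population, 10)
assert len(with_repeats) == 10
```

So the two calls in my timings are not computing quite the same thing, and a like-for-like comparison would presumably need replace=False on the NumPy side.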