I need to sample, without replacement, from a certain array a. I tried two approaches (see the MCVE below), using random.sample() and np.random.choice().

I assumed the numpy function would be faster, but it turns out it is not. In my tests, random.sample() is ~15% faster than np.random.choice().

Is this correct, or am I doing something wrong in my example below? If it is correct, why?
import numpy as np
import random
import time
from contextlib import contextmanager


@contextmanager
def timeblock(label):
    # time.clock() was removed in Python 3.8; perf_counter() replaces it
    start = time.perf_counter()
    try:
        yield
    finally:
        end = time.perf_counter()
        print('{} elapsed: {}'.format(label, end - start))


def f1(a, n_sample):
    # Sample indexes of a with the standard library
    return random.sample(range(len(a)), n_sample)


def f2(a, n_sample):
    # Sample indexes of a with numpy
    return np.random.choice(len(a), n_sample, replace=False)


# Generate random array
a = np.random.uniform(1., 100., 10000)
# Number of samples' indexes to randomly take from a
n_sample = 100
# Number of times to repeat functions f1 and f2
N = 100000

with timeblock("random.sample"):
    for _ in range(N):
        f1(a, n_sample)

with timeblock("np.random.choice"):
    for _ in range(N):
        f2(a, n_sample)
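
As a cross-check on the timings above, here is a minimal sketch that repeats the comparison with the standard timeit module and adds a third candidate, the newer np.random.default_rng() Generator API (available since NumPy 1.17). The helper names sample_stdlib, sample_legacy, make_sample_generator and compare are hypothetical, introduced only for this illustration. One assumption worth testing with it: as far as I understand, the legacy np.random.choice with replace=False permutes the whole population before slicing, so its cost may depend on len(a) rather than on n_sample. The sketch does not prove that, it only makes the comparison easier to repeat.

```python
import random
import timeit

import numpy as np


def sample_stdlib(n_pop, n_sample):
    # Standard-library sampling of indexes without replacement (same as f1)
    return random.sample(range(n_pop), n_sample)


def sample_legacy(n_pop, n_sample):
    # Legacy NumPy global-state API (same call as f2)
    return np.random.choice(n_pop, n_sample, replace=False)


def make_sample_generator(seed=None):
    # Build a sampler on top of the newer Generator API;
    # the Generator is created once and reused across calls
    rng = np.random.default_rng(seed)

    def sample_generator(n_pop, n_sample):
        return rng.choice(n_pop, n_sample, replace=False)

    return sample_generator


def compare(n_pop=10000, n_sample=100, repeats=1000):
    # Return total elapsed seconds per candidate over `repeats` calls
    candidates = {
        "random.sample": sample_stdlib,
        "np.random.choice": sample_legacy,
        "Generator.choice": make_sample_generator(),
    }
    return {
        name: timeit.timeit(lambda fn=fn: fn(n_pop, n_sample), number=repeats)
        for name, fn in candidates.items()
    }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print("{} elapsed: {:.4f} s".format(name, seconds))
```

Calling compare() with different n_pop values (for example 1000 versus 1000000, keeping n_sample at 100) should show whether each candidate's cost follows the population size or the sample size. The actual numbers will depend on the machine and on the NumPy version.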