I have a 1d NumPy array a of length l, and I want to sample int(np.log(l)) instances from it, but I want the samples to be:
1. uniformly distributed, and
2. random.
- By 1, I mean I want to avoid having two samples with distance less than int(l/int(np.log(l))).
- By 2, I mean I don't want to get the same instances as the sample each time.
- I also need to stress that I can't change the randomness seed.
One way is to split the array into int(np.log(l)) sub-arrays and then randomly sample one element from each sub-array, but I am looking for a more efficient implementation, since I need to run it many times on a considerable amount of data.
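For reference, the split-based idea can be written without a Python loop. This is only a minimal sketch: the function name stratified_sample_indices and the use of np.random.default_rng() (which draws fresh entropy and leaves the global seed untouched) are my own assumptions. It also does not strictly enforce the minimum-distance condition, since picks from neighbouring sub-arrays can land next to each other.

```python
import numpy as np

def stratified_sample_indices(l, k, rng=None):
    """Pick one random index from each of k near-equal consecutive bins of range(l).

    Hypothetical helper for illustration; assumes l >= k so that no bin is empty.
    """
    rng = np.random.default_rng() if rng is None else rng
    # k + 1 integer bin edges from 0 to l; bin i covers [edges[i], edges[i + 1]).
    edges = np.linspace(0, l, k + 1).astype(int)
    # One uniform draw per bin in a single vectorized call (array-valued low/high).
    return rng.integers(edges[:-1], edges[1:])

a = np.sort(np.random.randint(1000, size=1000))
k = int(np.log(len(a)))
idx = stratified_sample_indices(len(a), k)
samples = a[idx]  # idx is already increasing, so samples stay sorted
```

Each call returns a different set of indices, one per sub-array, without touching the global NumPy seed.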
import numpy as np
# Current attempt: plain random indices, with no spacing guarantee
a = np.random.randint(1000, size=1000)
a = thresholds = np.sort(a)
l = len(a)
# Draw int(np.log(l)) indices uniformly at random (repeats are possible)
random_indices = np.random.randint(0, l, int(np.log(l)))
samples = a[random_indices]
samples = np.sort(samples)
samples
# array([183, 536, 644, 791, 925, 999])
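If the spacing condition has to hold strictly, one possible sketch is to reserve the mandatory gaps first and randomize only the leftover room. The function name spaced_sample_indices, its min_gap parameter, and the reading of "distance" as distance between indices are assumptions for illustration, not an established recipe.

```python
import numpy as np

def spaced_sample_indices(l, k, min_gap, rng=None):
    """Return k increasing indices in range(l) whose consecutive gaps are >= min_gap.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Room left over once the (k - 1) mandatory gaps have been reserved.
    slack = l - (k - 1) * min_gap
    if slack <= 0:
        raise ValueError("l is too small for k samples with this min_gap")
    # k sorted offsets in [0, slack); ties are fine because the gaps are added next.
    offsets = np.sort(rng.integers(0, slack, size=k))
    # Shift the i-th offset by i * min_gap so neighbours differ by at least min_gap.
    return offsets + min_gap * np.arange(k)

l = 1000
k = int(np.log(l))
spaced_idx = spaced_sample_indices(l, k, l // k)
```

With min_gap = l // k the leftover room is only about l / k positions, so the result is close to evenly spaced with a small random jitter. A smaller min_gap leaves more room for randomness.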
I appreciate any comments, suggestions, and help.