Monday, September 27, 2021

Sampling a 1D NumPy array with controlled randomness

I have a 1D NumPy array a of length l, and I want to draw int(np.log(l)) samples from it. The samples should be:

  1. uniformly distributed, and

  2. random.

  • By 1, I mean that I want to avoid having two samples at a distance of less than int(l/int(np.log(l))).
  • By 2, I mean that I don't want to get the same elements every time I sample.
  • I also need to stress that I can't change the random seed.
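One possible reading of these three constraints can be sketched as systematic sampling: pick a single random starting index, then take every step-th element. This is only a sketch under assumptions: "distance" is taken to mean distance between indices, l is assumed to be at least 3 so that int(np.log(l)) is at least 1, and the function name systematic_sample_indices is made up for illustration. It uses a fresh np.random.default_rng(), which draws its own entropy and leaves the global seed untouched.

```python
import numpy as np

def systematic_sample_indices(l, rng=None):
    """Return k = int(log(l)) indices spaced exactly l // k apart, with a random start.

    Hypothetical helper; assumes l >= 3 so that k >= 1.
    """
    # A fresh Generator does not read or modify the global np.random seed.
    rng = np.random.default_rng() if rng is None else rng
    k = int(np.log(l))
    step = l // k
    # Choose the start so that the last index, start + (k - 1) * step, stays below l.
    start = rng.integers(0, l - (k - 1) * step)
    return start + step * np.arange(k)

a = np.sort(np.random.randint(1000, size=1000))
idx = systematic_sample_indices(len(a))
samples = a[idx]  # already in increasing order because idx is increasing
```

The trade-off is that only the offset is random: consecutive samples are always exactly step indices apart, so the minimum spacing holds, but the samples are not independent of each other.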

One way is to split the array into int(np.log(l)) sub-arrays and then randomly sample one element from each sub-array, but I am looking for a more efficient implementation, since I need to run it many times on a large amount of data.
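The sub-array idea described above does not necessarily need an explicit split or a Python loop. A minimal sketch, assuming it is acceptable to work with index ranges instead of materialized sub-arrays: compute the bin boundaries once and draw one index per bin in a single vectorized call. The name stratified_sample_indices is hypothetical, and l is assumed to be at least 3.

```python
import numpy as np

def stratified_sample_indices(l, rng=None):
    """Draw one random index from each of k = int(log(l)) contiguous bins of range(l).

    Hypothetical helper; assumes l >= 3 so that k >= 1.
    """
    # A fresh Generator leaves the global np.random seed untouched.
    rng = np.random.default_rng() if rng is None else rng
    k = int(np.log(l))
    # k + 1 integer boundaries that split [0, l) into k non-empty bins.
    edges = np.linspace(0, l, k + 1).astype(int)
    # Generator.integers broadcasts array bounds: one draw in [low, high) per bin.
    return rng.integers(edges[:-1], edges[1:])

a = np.sort(np.random.randint(1000, size=1000))
idx = stratified_sample_indices(len(a))
samples = a[idx]
```

Note that this spreads the samples across the array but does not strictly enforce a minimum gap: two draws from adjacent bins can land next to each other near the shared boundary.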

import numpy as np

# Example data: 1000 random integers in [0, 1000), sorted
a = np.sort(np.random.randint(1000, size=1000))
l = len(a)

# Current approach: plain uniform random indices, with no minimum-spacing guarantee
random_indices = np.random.randint(0, l, int(np.log(l)))
samples = np.sort(a[random_indices])
samples
# e.g. array([183, 536, 644, 791, 925, 999])

I would appreciate any comments, suggestions, or help.



