Monday, April 17, 2023

Repeatedly drawing indices with numpy random choice with weights

What I am doing at the moment:

import numpy as np

d = 60000
n_choice = 1000
n_samples = 5000

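# some probability vector of length d (positive weights, normalized to sum to 1)
p = np.random.randint(1, d, size=d)
p = p/np.sum(p)

rng = np.random.default_rng(123)
# integer dtype, since the array stores indices
samples = np.empty((n_choice, n_samples), dtype=np.int64)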

for i in range(n_samples):
    samples[:, i] = rng.choice(d, size=n_choice, replace=False, p=p, shuffle=False)

This is a bit slow for my taste. Is there a way to speed this up? E.g., by replacing the loop with a trick or using some other form of simulation?

I skimmed through similar questions on Stack Overflow, but only found one where the weights are uniform and d = n_choice, and another where weights are given but only the rows (columns) of the samples array have to be unique.
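One direction that might be worth trying, sketched here under assumptions rather than as a tested answer: the exponential-race (Gumbel top-k) construction. Give each index j a key E_j / p_j with E_j ~ Exp(1) and keep the n_choice indices with the smallest keys. This has the same distribution as drawing indices one at a time without replacement, with probabilities proportional to the remaining weights, which is what I understand `rng.choice(..., replace=False, p=p)` to target. The function name `weighted_choice_no_replace` and the `batch` parameter are made up for this illustration and are not part of NumPy. Batching is there because a full (n_samples, d) key matrix would need a few GB for the sizes above.

```python
import numpy as np

def weighted_choice_no_replace(p, n_choice, n_samples, rng, batch=100):
    """Return an (n_choice, n_samples) int array; each column holds
    n_choice distinct indices drawn with weights p, in arbitrary order.

    Hypothetical helper (not a NumPy function). Assumes p has at least
    n_choice strictly positive entries.
    """
    d = p.shape[0]
    out = np.empty((n_choice, n_samples), dtype=np.int64)
    for start in range(0, n_samples, batch):
        stop = min(start + batch, n_samples)
        # One row of keys per sample in this batch: key_j = E_j / p_j.
        # Entries with p_j == 0 get an infinite key and are never selected.
        with np.errstate(divide="ignore"):
            keys = rng.standard_exponential((stop - start, d)) / p
        # Indices of the n_choice smallest keys in each row (unordered).
        idx = np.argpartition(keys, n_choice - 1, axis=1)[:, :n_choice]
        out[:, start:stop] = idx.T
    return out
```

With the variables from the question this would be called as `samples = weighted_choice_no_replace(p, n_choice, n_samples, rng)`. Whether it actually beats the loop depends on d, n_choice and the batch size, since every sample still costs d exponential draws plus a partial sort, so it should be timed before relying on it.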



