Wednesday, 20 October 2021

Vectorised np.random.choice with varying probabilities

I've trained a machine learning model using sklearn and want to simulate the result by sampling the predictions according to the predict_proba probabilities. So I want to do something like

samples = np.random.choice(a = possible_outcomes, size = (n_data, n_samples), p = probabilities)

Where probabilities would be an (n_data, n_possible_outcomes) array.

But np.random.choice only accepts a 1-D array for the p argument. I currently work around this with a for loop like the following:

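import numpy as np
from tqdm import trange
sample_outcomes = np.zeros((len(probs), n_samples))  # one row per data point
for i in trange(len(probs)):
    # draw n_samples outcomes for row i from its own distribution
    sample_outcomes[i] = np.random.choice(outcomes, size=n_samples, p=probs[i])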

but that's relatively slow. Any suggestions to speed this up would be much appreciated!
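
For reference, one possible way to drop the Python loop is inverse-CDF sampling: take a row-wise cumulative sum of the probabilities, draw uniform numbers, and count how many cumulative bins each draw has passed. The following is only a minimal sketch under assumptions. The function name sample_rows and its rng parameter are hypothetical, and it assumes every row of probs is non-negative with a positive sum.

```python
import numpy as np

def sample_rows(outcomes, probs, n_samples, rng=None):
    """Draw n_samples outcomes per row of probs (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    outcomes = np.asarray(outcomes)
    probs = np.asarray(probs, dtype=float)
    # Row-wise cumulative distribution, shape (n_data, n_outcomes).
    cdf = np.cumsum(probs, axis=1)
    # Normalise so the last bin is exactly 1 even if a row is slightly off.
    cdf /= cdf[:, -1:]
    # Uniform draws in [0, 1), shape (n_data, n_samples).
    u = rng.random((probs.shape[0], n_samples))
    # For each draw, count the bins whose upper edge it has reached;
    # that count is the index of the sampled outcome.
    idx = (u[:, :, None] >= cdf[:, None, :]).sum(axis=2)
    return outcomes[idx]

# Example call with made-up probabilities: 2 data points, 3 outcomes.
samples = sample_rows(["a", "b", "c"], [[0.2, 0.5, 0.3], [1.0, 0.0, 0.0]], 5)
print(samples.shape)
```

The comparison builds a temporary boolean array of shape (n_data, n_samples, n_outcomes), so for very large inputs it may be necessary to process the rows in chunks. Whether this is actually faster than the loop depends on the sizes involved and should be measured.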



