I've trained a machine learning model using sklearn and want to simulate outcomes by sampling predictions according to the predict_proba probabilities. I'd like to do something like:
samples = np.random.choice(a = possible_outcomes, size = (n_data, n_samples), p = probabilities)
where probabilities is an (n_data, n_possible_outcomes) array.
But np.random.choice only accepts a 1-D array for the p argument. I've currently gotten around this with a for-loop like the following:
import numpy as np
from tqdm import trange

# One row of n_samples draws per data point
sample_outcomes = np.zeros((len(probs), n_samples))
for i in trange(len(probs)):
    sample_outcomes[i] = np.random.choice(outcomes, size=n_samples, p=probs[i])
but that's relatively slow. Any suggestions to speed this up would be much appreciated!
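For reference, one direction that might avoid the Python loop is inverse-CDF sampling: take the cumulative sum of each row of probabilities, draw uniform numbers, and count how many cumulative values each draw has passed. The following is only a sketch under assumptions: the function name sample_rows and its rng parameter are made up for illustration, and it builds an (n_data, n_samples, n_possible_outcomes) boolean intermediate, so it may need chunking for large inputs.

```python
import numpy as np

def sample_rows(outcomes, probs, n_samples, rng=None):
    """Draw n_samples outcomes per row of probs (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    outcomes = np.asarray(outcomes)
    probs = np.asarray(probs, dtype=float)

    # Cumulative distribution per row, shape (n_data, n_outcomes).
    cdf = np.cumsum(probs, axis=1)
    # Normalise so each row ends at exactly 1.0 despite rounding error.
    cdf /= cdf[:, -1:]

    # One uniform draw in [0, 1) per (row, sample), shape (n_data, n_samples).
    u = rng.random((probs.shape[0], n_samples))

    # For each draw, count the CDF entries it has reached or passed;
    # that count is the index of the sampled outcome.
    idx = (u[:, :, None] >= cdf[:, None, :]).sum(axis=2)

    return outcomes[idx]
```

It would be called as samples = sample_rows(outcomes, probs, n_samples), giving an (n_data, n_samples) array. I haven't benchmarked it against the loop version.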