Thursday, July 30, 2015

Fill bins according to probability distribution in numpy

I have an integer that needs to be split into bins according to a probability distribution. For example, if I had N=100 objects going into bins with probabilities [0.02, 0.08, 0.16, 0.29, 0.45], then I might get [1, 10, 20, 25, 44].

import numpy as np
# sample distribution
d = np.array([x ** 2 for x in range(1,6)], dtype=float)
d = d / d.sum()
dcs = d.cumsum()
bins = np.zeros(d.shape)
N = 100
for roll in np.random.rand(N):
    # grab the first bin whose cumulative probability exceeds the roll
    i = np.where(roll < dcs)[0][0]
    bins[i] += 1

In reality, N and my number of bins are very large, so looping isn't really a viable option. Is there any way I can vectorize this operation to speed it up?
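For illustration, here is one possible way the loop above could be vectorized. This is a sketch, not a benchmarked solution: it assumes the same sample distribution `d` as in the snippet above, and the names `rolls`, `idx`, and `bins_multinomial` are introduced only for this example. The first option draws all N rolls at once and looks them up in the cumulative distribution with `np.searchsorted`, then counts them with `np.bincount`. The second option swaps in a different technique and lets `np.random.multinomial` draw the bin counts directly.

```python
import numpy as np

# Same sample distribution as in the question.
d = np.array([x ** 2 for x in range(1, 6)], dtype=float)
d = d / d.sum()
N = 100

# Option 1: draw all rolls at once and locate each one in the
# cumulative distribution with a binary search.
dcs = d.cumsum()
dcs[-1] = 1.0  # assumption: pin the last edge to 1 to guard against round-off
rolls = np.random.rand(N)
# side="right" returns the first index i with roll < dcs[i],
# matching np.where(roll < dcs)[0][0] in the loop version.
idx = np.searchsorted(dcs, rolls, side="right")
bins = np.bincount(idx, minlength=d.size)

# Option 2: sample the bin counts directly from a multinomial distribution.
bins_multinomial = np.random.multinomial(N, d)
```

Both results are integer arrays with one count per bin that sum to N. If only the counts matter and the individual rolls are not needed, the multinomial call is likely the simpler of the two.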



