Sunday, 24 June 2018

Efficient random sampling

My problem is simple: I have an array of 20 million floats. Each float in that array has a probability p of being randomly altered.

The simplest way to do this is to walk through the array and, for each element, do if (rand(0,1) < p) then modify.
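For reference, the per-element test can be written as a single vectorized pass. This is only a sketch, assuming NumPy is available; the function name alter_per_element and the "add Gaussian noise" alteration are placeholders I made up, since the question does not say how a float gets altered.

```python
import numpy as np

def alter_per_element(values, p, rng):
    """Alter each entry independently with probability p; returns a copy and the mask."""
    out = values.copy()
    # One uniform draw in [0, 1) per element; True marks an element to alter.
    mask = rng.random(out.shape[0]) < p
    # Hypothetical alteration (an assumption): add Gaussian noise to the selected entries.
    out[mask] += rng.normal(size=int(mask.sum()))
    return out, mask

rng = np.random.default_rng(0)
values = np.zeros(1_000_000)
altered, mask = alter_per_element(values, 0.01, rng)
```

This still draws one random number per element, so the cost grows with n rather than with the number of elements actually changed.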

However, even when parallelized, it's very slow, and I was wondering whether there's a faster way to randomly pick the indexes to modify.

My first thought was to pick p * n random indexes, where n is the total number of floats in the array. However, that doesn't exactly reproduce the probability distribution: in the first approach the number of modified floats is itself random, and nothing guarantees that exactly p * n floats will be modified.
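To make the difference concrete, here is a small sketch, again assuming NumPy. The first function is the fixed-count idea described above. The second is one possible way to reconcile it with the per-element test: since the number of altered elements under independent tests follows a Binomial(n, p) distribution, the count can be drawn first and that many distinct indexes picked afterwards. Both function names are made up for illustration.

```python
import numpy as np

def sample_indices_fixed(n, p, rng):
    """Fixed-count idea: always returns exactly round(p * n) distinct indexes."""
    k = int(round(p * n))
    return rng.choice(n, size=k, replace=False)

def sample_indices_binomial(n, p, rng):
    """Draw the count from Binomial(n, p), then pick that many distinct indexes."""
    k = rng.binomial(n, p)
    return rng.choice(n, size=k, replace=False)

rng = np.random.default_rng(0)
fixed = sample_indices_fixed(10_000, 0.01, rng)
varied = sample_indices_binomial(10_000, 0.01, rng)
```

With the fixed version the count never varies between runs; with the binomial version it fluctuates around p * n the same way it would with the per-element loop, while only needing random draws on the order of the number of selected indexes.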

Ideas?

PS: I'm using Python for the implementation. Someone has probably had this problem before and implemented something in a library, but I can't find it.



