lundi 29 février 2016

Looking for a demonstration of the need for Matsumoto's DCMT algorithm in parallel PRNG

In “Dynamic Creation of Pseudorandom Number Generators” Matsumoto and Nishimura cautioned against careless initialization of MT PRNGs for the purposes of parallel simulations which assume the number streams from the different generators are independent of each other:

The usual scheme for PRNG in parallel machines is to use one and the same PRNG for every process, with different intial seeds. However, this procedure may yield bad collision, in particular if the generator is based on a linear recurrence, because in this case the sum of two pseudorandom sequences satisfies the same linear recurrence, and may appear in the third sequence. The danger becomes non-negligible if the number of parallel streams becomes large compared to the size of the state space.

How big does the number of streams have to get for this to be a serious problem? In the case of the standard MT, MT19937, the state space is rather large... I could definitely see there being a linear relationship modulo 219937−1 among 20,000 sequences, but how about a birthday-paradox-style relationship among 400 sequences?

It appears that this is actually a serious problem, because parallel PRNG implementations do include DCMT, but it would be nice to have some examples of failure, and some sense for when this becomes a problem.




Aucun commentaire:

Enregistrer un commentaire