Saturday, 15 September 2018

Haskell: RandomGen drops half of the values

I am writing a simple deterministic random number generator based on xorshift. The goal here is not to get a cryptographically secure or statistically perfect (pseudo-)random number generator, but to be able to achieve the same deterministic sequence of semi-random numbers across programming languages.

My Haskell program looks as follows:

{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module SimpleRNG where

import Data.Word (Word32)
import Data.Bits (xor, shiftL, shiftR)
import System.Random (RandomGen(..))
import Control.Arrow (first)

newtype SeedState = SeedState Word32
  deriving (Eq, Show, Enum, Bounded)

seed :: Integral a => a -> SeedState
seed = SeedState . fromIntegral

rand_r :: SeedState -> (Word32, SeedState)
rand_r (SeedState a) = (d, SeedState d)
  where
    b = a `xor` (shiftL a 13)
    c = b `xor` (shiftR b 17)
    d = c `xor` (shiftL c 5)

instance RandomGen SeedState where
  -- rand_r yields a Word32; widen it to the Int that `next` must return.
  next seed_state = first fromIntegral (rand_r seed_state)
  genRange seed_state = (fromEnum (minBound `asTypeOf` seed_state),
                         fromEnum (maxBound `asTypeOf` seed_state))

  split seed_state@(SeedState num) =  (seed_state', inverted_seed_state')
    where
      (_, seed_state') = next seed_state
      (_, inverted_seed_state') = next inverted_seed_state
      inverted_seed_state = SeedState (maxBound - num)

Now, for some reason, when running

take 10 $ System.Random.randoms (seed 42) :: [Word32]

it returns only every second value of the sequence (the 'odd' results), compared to the output of the following Python program:

class SeedState(object):
    def __init__(self, seed = 42):
        self.data = seed

def rand_r(rng_state):
    num = rng_state.data
    num ^= (num << 13) % (2 ** 32)
    num ^= (num >> 17) % (2 ** 32)
    num ^= (num << 5) % (2 ** 32)
    rng_state.data = num
    return num


__global_rng_state = SeedState(42)

def rand():
    global __global_rng_state
    return rand_r(__global_rng_state)

def seed(seed):
    global __global_rng_state
    __global_rng_state = SeedState(seed)

if __name__ == '__main__':
    for x in range(0, 10):
        print(rand())

It seems that the System.Random module internally post-processes the values returned by the generator (calling

map fst $ take 10 $ iterate (\(_, rng) -> rand_r rng) (rand_r $ seed 42)

gives the result I'd expect).
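To make the symptom concrete, here is a minimal Python sketch of the same xorshift steps (shifts 13, 17 and 5, truncated to 32 bits) that lists the full sequence and marks every second value. It assumes that 'odd' above means the zero-based odd positions; the helper names `xorshift32` and `sequence` are illustrative and not part of either program.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate result within 32 bits


def xorshift32(state):
    """Advance the 32-bit xorshift state once (shifts 13, 17, 5)."""
    state ^= (state << 13) & MASK32
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state


def sequence(seed, count):
    """Return the first `count` outputs; each output is also the next state."""
    values = []
    state = seed & MASK32
    for _ in range(count):
        state = xorshift32(state)
        values.append(state)
    return values


full = sequence(42, 20)
every_second = full[1::2]  # assumption: the values that randoms appears to keep

if __name__ == "__main__":
    for index, value in enumerate(full):
        marker = "kept" if index % 2 == 1 else "dropped"
        print(index, value, marker)
```

Comparing `every_second` with the ten values printed by the Haskell expression would confirm or refute the "every other value" reading.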

This is odd, since rand_r already returns a Word32, so the value could (and arguably should) simply be passed on unaltered, without any remapping.
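One possible explanation, stated here as an assumption rather than something the code above confirms: a range-mapping routine may draw from the generator repeatedly until it has collected noticeably more entropy than the target range needs, and only then reduce the accumulated number modulo the range size. The sketch below models that idea in Python. The function `draw_in_range`, the safety factor `q = 1000` and the generator interface are illustrative choices, not the System.Random API.

```python
MASK32 = 0xFFFFFFFF


def xorshift32(state):
    """Advance the 32-bit xorshift state once (shifts 13, 17, 5)."""
    state ^= (state << 13) & MASK32
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state


def draw_in_range(low, high, state, gen_low=0, gen_high=MASK32, q=1000):
    """Hypothetical range mapper: oversample, then reduce modulo the range size."""
    base = gen_high - gen_low + 1   # number of distinct generator outputs
    size = high - low + 1           # number of distinct target values
    target = size * q               # magnitude demanded before stopping
    magnitude, accumulated, draws = 1, 0, 0
    while magnitude < target:
        state = xorshift32(state)
        accumulated = accumulated * base + (state - gen_low)
        magnitude *= base
        draws += 1
    return low + accumulated % size, state, draws


state = 42
results = []
for _ in range(5):
    value, state, draws = draw_in_range(0, MASK32, state)
    results.append(value)
```

In this model, when the generator range and the target range both span 2^32 values, each result consumes two generator outputs and the modulo keeps only the second one, which would match the pattern described in the question.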

What is going on here, and is there a way to plug this xorshift-generator into System.Random in a way that returns the same results?



