Wednesday, August 17, 2022

Lagged auto-correlations of numpy.random.normal are not zero

I am struggling with unexpected/unwanted behavior of numpy's random.normal function.

When generating vectors of T elements with this function, I find that, on average, the lagged auto-correlation of those vectors is not 0 at lags different from 0. Instead, the auto-correlation value tends to -1/(T-1).

See for example this simple code:

import numpy as np

N        = 10000000
T        = 100
invT     = -1./(T-1.)
sd       = 1
av       = 0
mxlag    = 10

X        = np.random.normal(av, sd, size=(N, T))
acf      = np.empty((N, mxlag+1))  # allocate separately; slicing X would alias a view of X
for i in range(N):
    acf[i,:] = [1. if l==0 else np.corrcoef(X[i,l:],X[i,:-l])[0][1] for l in range(mxlag+1)]

acf_mean = np.average(acf, axis=0)

print('mean auto-correlation of random_normal vector of length T=',T,' : ',acf_mean)
print('to be compared with -1/(T-1) = ',invT)
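For reference, the same statistic can be computed without the Python loop. The sketch below is a vectorized re-implementation of the per-row corrcoef, restricted to lag 1 and with N reduced for speed (both are my own choices, not part of the original setup); it reproduces the negative bias:

```python
import numpy as np

N, T = 100000, 100   # fewer series than above, for speed
X = np.random.normal(0., 1., size=(N, T))

# Lag-1 Pearson correlation of each row, vectorized:
# correlate X[i, 1:] with X[i, :-1], centering each segment by its own mean,
# which is exactly what np.corrcoef does per row.
A = X[:, 1:] - X[:, 1:].mean(axis=1, keepdims=True)
B = X[:, :-1] - X[:, :-1].mean(axis=1, keepdims=True)
r1 = (A * B).sum(axis=1) / np.sqrt((A**2).sum(axis=1) * (B**2).sum(axis=1))

print(r1.mean())     # negative, on the order of -1/(T-1)
```

The mean over rows comes out negative and of magnitude roughly 1/(T-1), consistent with the looped version above.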

I see this behavior with both Python v2.7.9 and v3.7.4. Coding the same test in NCL also gives me the same result.

The problem I am referring to might seem tiny. However, it leads to larger biases when those vectors are used as seeds to generate auto-regressive time series. It is also problematic when this function is used to build bootstrap statistical tests.

Does anyone have an explanation for this? Am I doing something obviously wrong?

Many thanks!



