I am struggling with unexpected/unwanted behavior of numpy's random.normal function.
When I generate vectors of T elements with this function, I find that, on average, the lagged auto-correlation of those vectors is not 0 at lags other than 0: the auto-correlation value tends towards -1/(T-1).
See for example this simple code:
import numpy as np
N = 10000000
T = 100
invT = -1./(T-1.)
sd = 1
av = 0
mxlag = 10
X = np.random.normal(av, sd, size=(N, T))
acf = np.empty((N, mxlag + 1))   # separate array: slicing X here would only give a view into X
for i in range(N):
    # lag 0 is 1 by definition; otherwise the Pearson correlation of the shifted sub-series
    acf[i, :] = [1. if l == 0 else np.corrcoef(X[i, l:], X[i, :-l])[0][1]
                 for l in range(mxlag + 1)]
acf_mean = np.average(acf, axis=0)
print('mean auto-correlation of random_normal vector of length T=',T,' : ',acf_mean)
print('to be compared with -1/(T-1) = ',invT)
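The same bias also shows up with a fully vectorized check. Here is a minimal sketch (a simplification of the loop above, not my original script) that computes only the lag-1 Pearson correlation between X[i,1:] and X[i,:-1] for all rows at once; the smaller N is only there to keep the runtime short, and the mean again drifts towards -1/(T-1):
import numpy as np
# Vectorized cross-check for lag 1 only: the same Pearson correlation as
# np.corrcoef(X[i,1:], X[i,:-1])[0][1], computed for every row at once.
N, T = 100000, 100
X = np.random.normal(0., 1., size=(N, T))
a = X[:, 1:] - X[:, 1:].mean(axis=1, keepdims=True)     # X[i,1:] centred on its own mean
b = X[:, :-1] - X[:, :-1].mean(axis=1, keepdims=True)   # X[i,:-1] centred on its own mean
r1 = (a * b).sum(axis=1) / np.sqrt((a * a).sum(axis=1) * (b * b).sum(axis=1))
print('mean lag-1 auto-correlation:', r1.mean(), ' vs -1/(T-1) =', -1. / (T - 1.))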
I see this behavior with both Python 2.7.9 and Python 3.7.4. Coding the same thing in NCL also gives me the same result.
The problem I am referring to might seem tiny. However, it leads to larger biases when those vectors are used as seeds to generate auto-regressive time series, and it is also problematic when this function is used to build bootstrap statistical tests.
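To make the auto-regressive use case concrete, here is a hypothetical sketch of what I mean (the AR(1) form and phi = 0.5 are just an illustrative choice, not my actual setup): the normal vector is used as the innovations of the series, so any spurious auto-correlation in it leaks into the persistence of the generated series.
import numpy as np
# Hypothetical illustration only: an AR(1) series driven by a normal "seed" vector.
phi = 0.5                                 # arbitrary illustrative AR coefficient
T = 100
eps = np.random.normal(0., 1., size=T)    # the white-noise seed vector
y = np.empty(T)
y[0] = eps[0]
for t in range(1, T):
    y[t] = phi * y[t - 1] + eps[t]        # AR(1) recursion driven by eps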
Does someone have an explanation for this? Am I doing something obviously wrong?
Many thanks!