Wednesday, 6 January 2016

Generate correlated variables in Stata

I want to generate five correlated variables in Stata. Four should be normally distributed with specific means and standard deviations, and one should follow a Bernoulli distribution with probability 0.60.

I tried to follow the advice given in the post "How to generate correlated Uniform[0,1] variables".
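For context, the 2*sin(.*_pi/6) terms in my code come from the standard relation between the correlation of two jointly normal variables and the correlation of their uniform transforms. The symbols r (latent normal correlation) and rho (target correlation of the uniforms) are my own notation, and the relation is specific to uniform margins:

```latex
% (Z_1, Z_2): standard bivariate normal with correlation r
% U_i = \Phi(Z_i): Uniform[0,1] margins
\operatorname{corr}(U_1, U_2) = \frac{6}{\pi}\arcsin\!\left(\frac{r}{2}\right)
\quad\Longrightarrow\quad
r = 2\sin\!\left(\frac{\pi\rho}{6}\right)
\ \text{to obtain}\ \operatorname{corr}(U_1, U_2) = \rho .
```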

My code is the following:

matrix C = (1,                                                                      ///
    2*sin(0.05*_pi/6), 1,                                                           ///
    2*sin(-0.45*_pi/6), 2*sin(0.44*_pi/6), 1,                                       ///
    2*sin(0.22*_pi/6), 2*sin(0.33*_pi/6), 2*sin(-0.54*_pi/6), 1,                    ///
    2*sin(0.45*_pi/6), 2*sin(0.32*_pi/6), 2*sin(-0.22*_pi/6), 2*sin(-0.13*_pi/6), 1)

matrix B = (40, 26, 13, 146, 0.35) 
matrix A = (9, 11, 5, 2, 1)

corr2data var1 var2 var3 var4 var5, n(10000) corr(C) means(B) sds(A) cstorage(lower)

replace var1 = rnormal(var1)
replace var2 = rnormal(var2)
replace var3 = rnormal(var3)
replace var4 = rnormal(var4)

replace var5 = normal(var5)
replace var5 = rbinomial(1,var5)

I got more or less what I wanted, in the sense that the generated variables take the values I expected.

However, is my approach correct? If not, how would you amend the code to properly give the desired results while being scientifically sound?
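For comparison, one possible alternative is a latent-normal construction. This is a minimal sketch, not a verified solution. It draws correlated standard normals with drawnorm, rescales some of them to get the normal margins, and thresholds one to get the Bernoulli margin. It uses only three variables, and the latent correlations in R are illustrative placeholders rather than my full target matrix. The names z1, z2, z5 and R are introduced just for this example.

```stata
* Sketch only: latent-normal (Gaussian copula) construction.
* R holds illustrative placeholder correlations on the latent scale.
clear
set seed 20160106
matrix R = (1, 0.05, 0.45 \ 0.05, 1, 0.32 \ 0.45, 0.32, 1)

* Draw three correlated standard normal variables
drawnorm z1 z2 z5, n(10000) corr(R)

* Normal margins: shift and scale the latent draws (means and SDs from B and A)
generate var1 = 40 + 9*z1
generate var2 = 26 + 11*z2

* Bernoulli(0.60) margin: threshold the latent draw so that P(var5 == 1) = 0.60
generate byte var5 = z5 > invnormal(1 - 0.60)

* Check the resulting margins and correlations
summarize var1 var2 var5
correlate var1 var2 var5
```

In this construction the correlation between var1 and var2 equals the latent one, because linear rescaling does not change a Pearson correlation. The correlation between a normal variable and the thresholded variable is attenuated by the factor phi(tau)/sqrt(p*(1-p)), where tau is the threshold and phi the standard normal density. That factor is about 0.79 for p = 0.60, so the latent correlations with z5 would have to be scaled up to hit a given target.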



