Thursday, November 3, 2016

Structure of the random effects in glmmLasso

I want to perform model selection among ~150 fixed-effect and 7 random-effect variables on a set of 360 observations. I decided to use the Lasso procedure for mixed models, with glmmLasso. I searched extensively for examples of comparable models, without success. Here is a sample of my data:

    > str(RHI_12)
    'data.frame':   350 obs. of  164 variables:
     $ RHI_counts_12   : int  0 14 1 3 2 2 2 0 0 1 ...
     $ Passage         : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
     $ Site            : Factor w/ 6 levels "14_metzerlen",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ Location        : Factor w/ 30 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
     $ Dist_roost      : num  0.985 0.88 0.908 0.888 0.89 ...
     $ Natural_light   : num  -0.194 -0.194 -0.194 -0.194 -0.194 ...
     $ Mean_wind       : num  0.836 0.836 0.836 0.836 0.836 ...
     $ Mean_temp       : num  -0.427 -0.427 -0.427 -0.427 -0.427 ...
     $ Day             : num  -0.993 -0.993 -0.993 -0.993 -0.993 ...
     $ Artificial_light: num  -0.2016 -0.2016 0.0772 -0.2016 -0.2016 ...
     $ WBdi            : num  1.14 1.14 1.14 1.14 1.14 ...
     $ WCdi            : num  1.49 1.49 1.49 1.49 1.47 ...
     ... (many more fixed-effect variables)

The response variable is counts (RHI_counts_12).

My question is about the structure of the random-effect variables in the model. I have 2 categorical random-effect variables ("Site" and "Location"; "Location" is nested in "Site") and 5 numerical random-effect variables. I have structured my model like this (using only a sample of the fixed-effect variables):

    library(glmmLasso)
    lasso1 <- glmmLasso(RHI_counts_12 ~ Artificial_light + WBdi + WCdi + BUdi + FOdi + TIdi,
                        rnd = list(Site = ~1, Location = ~1 + Dist_roost + Natural_light + Mean_wind + Mean_temp + Day),
                        lambda = 500, family = poisson(link = log), data = RHI_12)

I am not at all convinced that this is the right way to structure the random effects, given these 2 nested categorical random effects. I want a model with Location nested in Site, and I do not think that this is what I get. Here is my output for the random effects (in this output, "Loc" stands for Location and "siteName" for Site):

    Random Effects:

    StdDev:
    [[1]]
             siteName
    siteName 1.180514

    [[2]]
                              Loc Loc:Dist_roost Loc:Natural_light Loc:Mean_wind
    Loc                1.15105859    -0.66317669       -0.35354821   -0.10805268
    Loc:Dist_roost    -0.66317669     1.42601945        0.46004662   -0.42795987
    Loc:Natural_light -0.35354821     0.46004662        0.49532786   -0.15485395
    Loc:Mean_wind     -0.10805268    -0.42795987       -0.15485395    0.76175417
    Loc:Mean_temp      0.02677276     0.03961902       -0.01431360   -0.03649499
    Loc:Day            0.03756960    -0.02081360        0.02520654   -0.12082652
                      Loc:Mean_temp     Loc:Day
    Loc                  0.02677276  0.03756960
    Loc:Dist_roost       0.03961902 -0.02081360
    Loc:Natural_light   -0.01431360  0.02520654
    Loc:Mean_wind       -0.03649499 -0.12082652
    Loc:Mean_temp        0.36923939 -0.08311209
    Loc:Day             -0.08311209  0.56876662

Do you think this is right? I have not been able to build this model with "Location" nested in "Site" (with all the other random effects also nested in "Site"). I have tried many different ways without success.
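
One check that may help with the nesting, sketched on toy data rather than on the real RHI_12 (the data frame `toy` and the column `SiteLoc` are hypothetical names): if the Location labels are reused across sites, Location alone does not identify a site-specific location. Base R's `interaction()` builds one level per Site/Location combination, which makes the nesting explicit in the factor itself.

```r
# Toy stand-in for RHI_12 (hypothetical values, not the real data):
# 2 sites, 3 location labels reused in each site, 2 passages.
toy <- expand.grid(Location = factor(1:3),
                   Site     = factor(c("A", "B")),
                   Passage  = factor(1:2))

# Cross-tabulation: if every location label appears under several sites,
# the labels are reused and the nesting is only implicit.
print(table(toy$Site, toy$Location))

# Location alone has only 3 levels here.
print(nlevels(toy$Location))

# One level per Site/Location combination encodes the nesting explicitly.
toy$SiteLoc <- interaction(toy$Site, toy$Location, drop = TRUE)
print(nlevels(toy$SiteLoc))
```

Assuming glmmLasso treats each entry of the random-effects list as a separate grouping factor, a combined factor like this could take the place of Location in that list. Whether this yields the intended nested structure would still need to be verified against the fitted output.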

Thank you in advance for reading and for any advice on structuring the random effects in glmmLasso! :-)

Thomas



