I want to perform model selection among ~150 fixed-effect and 7 random-effect variables, on a set of 360 observations. I decided to use the Lasso procedure for mixed models, with the glmmLasso package. I did a lot of research to find examples of comparable models, without success. Here is a sample of my data:
> str(RHI_12)
'data.frame': 350 obs. of 164 variables:
$ RHI_counts_12 : int 0 14 1 3 2 2 2 0 0 1 ...
$ Passage : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ Site : Factor w/ 6 levels "14_metzerlen",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Location : Factor w/ 30 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Dist_roost : num 0.985 0.88 0.908 0.888 0.89 ...
$ Natural_light : num -0.194 -0.194 -0.194 -0.194 -0.194 ...
$ Mean_wind : num 0.836 0.836 0.836 0.836 0.836 ...
$ Mean_temp : num -0.427 -0.427 -0.427 -0.427 -0.427 ...
$ Day : num -0.993 -0.993 -0.993 -0.993 -0.993 ...
$ Artificial_light: num -0.2016 -0.2016 0.0772 -0.2016 -0.2016 ...
$ WBdi : num 1.14 1.14 1.14 1.14 1.14 ...
$ WCdi : num 1.49 1.49 1.49 1.49 1.47 ...
... (many more fixed-effect variables)
The response variable is counts (RHI_counts_12).
My question is about the structure of the random-effect variables in the model. I have 2 categorical random-effect variables ("Site" and "Location"; "Location" is nested in "Site") and 5 numerical random-effect variables. I have structured my model like this (using only a sample of the fixed-effect variables):
lasso1 <- glmmLasso(RHI_counts_12 ~ Artificial_light + WBdi + WCdi + BUdi + FOdi + TIdi,
                    rnd = list(Site = ~1,
                               Location = ~1 + Dist_roost + Natural_light + Mean_wind + Mean_temp + Day),
                    lambda = 500, family = poisson(link = log), data = RHI_12)
I am not at all sure that this is the right way to structure the random effects when the 2 categorical random effects are nested. I want a model with Location nested in Site, and I do not think that this is what I get. Here is my output for the random effects (in this output, "Loc" stands for Location and "siteName" for Site):
Random Effects:
StdDev:
[[1]]
siteName
siteName 1.180514
[[2]]
Loc Loc:Dist_roost Loc:Natural_light Loc:Mean_wind
Loc 1.15105859 -0.66317669 -0.35354821 -0.10805268
Loc:Dist_roost -0.66317669 1.42601945 0.46004662 -0.42795987
Loc:Natural_light -0.35354821 0.46004662 0.49532786 -0.15485395
Loc:Mean_wind -0.10805268 -0.42795987 -0.15485395 0.76175417
Loc:Mean_temp 0.02677276 0.03961902 -0.01431360 -0.03649499
Loc:Day 0.03756960 -0.02081360 0.02520654 -0.12082652
Loc:Mean_temp Loc:Day
Loc 0.02677276 0.03756960
Loc:Dist_roost 0.03961902 -0.02081360
Loc:Natural_light -0.01431360 0.02520654
Loc:Mean_wind -0.03649499 -0.12082652
Loc:Mean_temp 0.36923939 -0.08311209
Loc:Day -0.08311209 0.56876662
Does this look right? I was not able to build this model with "Location" nested in "Site" (with all the other random effects also nested in "Site"). I have tried many different ways without success.
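For reference, a common way to make nesting explicit in R is to build a single grouping factor with one level per Site/Location combination. The sketch below is only an illustration under assumptions: it assumes that the Location labels ("1" to "30") are reused across sites, and both the toy data frame and the `Site_Location` column are hypothetical names, not part of my data. Whether glmmLasso then treats this as a properly nested structure has not been verified here.

```r
# Toy stand-in for RHI_12 (hypothetical): Location labels 1..3 are reused in each Site
toy <- data.frame(
  Site     = factor(rep(c("A", "B"), each = 6)),
  Location = factor(rep(rep(1:3, each = 2), times = 2))
)

# One level per Site/Location combination, so that location "1" in site A
# is no longer pooled with location "1" in site B
toy$Site_Location <- interaction(toy$Site, toy$Location, drop = TRUE)

nlevels(toy$Location)       # 3 labels shared across sites
nlevels(toy$Site_Location)  # 6 distinct site-by-location groups

# Untested sketch of the corresponding call on the real data, assuming
# RHI_12$Site_Location has been built the same way:
# lasso_nested <- glmmLasso(RHI_counts_12 ~ Artificial_light + WBdi + WCdi,
#                           rnd = list(Site = ~1,
#                                      Site_Location = ~1 + Dist_roost + Day),
#                           lambda = 500, family = poisson(link = log),
#                           data = RHI_12)
```

If the Location labels are already unique across sites, the extra factor changes nothing and the two separate grouping terms already describe the nesting implicitly.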
Thank you in advance for reading this and for any advice on structuring the random effects in glmmLasso! :-)
Thomas