Monday, 22 May 2017

Randomly drop a column selected from a group, excluding one

I have the following data frame, which will be used as input to a logit regression:

my_frame <- data.frame(y = c(1, 0, 1), A = c(0, 1, 1), B = c(1, 0, 0), C = c(0, 0, 0), t = c(1, 1, 1), x = c(1, 0, 0), z = c(1, 0, 1))

Because the dummy variables A, B and C are linearly dependent (A + B + C = 1 in every row), I need to drop one of the three before fitting the model.

 y A B C t x z
 1 0 1 0 1 1 1
 0 1 0 0 1 0 0
 1 1 0 0 1 0 1
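
The linear dependence can be checked directly on the data. This is only a minimal sketch on the example frame above; the name group_sums is introduced here for illustration.

```r
# Rebuild the example frame so the snippet is self-contained
my_frame <- data.frame(y = c(1, 0, 1), A = c(0, 1, 1), B = c(1, 0, 0),
                       C = c(0, 0, 0), t = c(1, 1, 1), x = c(1, 0, 0),
                       z = c(1, 0, 1))

# Row-wise sum of the dummy group that is present in this frame
group_sums <- rowSums(my_frame[, c("A", "B", "C")])

# TRUE when every row satisfies A + B + C = 1
all(group_sums == 1)
```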

Now, here is the difficult part. I want to randomly drop one column from the group consisting of A, B, C and D, but never the column that has the value 1 in the last row of the data frame. In my example, A is 1 in the last row, so either B or C should be dropped at random.

Column D is not present because it would always be 0 in this particular data frame, but it is still part of the same group of variables.
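
One possible way to express this requirement in base R is sketched below. It is not a definitive solution: the helper name drop_random_dummy and its group argument are hypothetical, and the sketch assumes that group members missing from the frame (here D) can simply be ignored, and that the frame is returned unchanged if no column is eligible.

```r
# Rebuild the example frame so the snippet is self-contained
my_frame <- data.frame(y = c(1, 0, 1), A = c(0, 1, 1), B = c(1, 0, 0),
                       C = c(0, 0, 0), t = c(1, 1, 1), x = c(1, 0, 0),
                       z = c(1, 0, 1))

# Hypothetical helper: drop one randomly chosen column of `group`,
# never the one whose value in the last row is 1.
drop_random_dummy <- function(df, group = c("A", "B", "C", "D")) {
  # Keep only the group members that actually exist in df (D is absent here)
  present <- intersect(group, names(df))

  # Values of the group columns in the last row, as a named vector
  last_row <- unlist(df[nrow(df), present, drop = FALSE])

  # Columns eligible for removal: those not equal to 1 in the last row
  candidates <- present[last_row != 1]
  if (length(candidates) == 0) return(df)

  # Pick one candidate by index, which also works for a single candidate
  to_drop <- candidates[sample.int(length(candidates), 1)]

  # Return the frame without the chosen column, preserving column order
  df[, setdiff(names(df), to_drop), drop = FALSE]
}

set.seed(1)  # only to make the random choice reproducible
reduced <- drop_random_dummy(my_frame)
names(reduced)
```

On the example frame, A is always kept and exactly one of B and C is removed; which one depends on the random draw.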



