Wednesday, 12 August 2020

R: generate a column of random 1s and 0s with restrictions

I have a data set with 500 observations. I would like to generate a column of 1s and 0s at random under two scenarios.

Current Dataset

  Id     Age    Category   
  1      23     1
  2      24     1
  3      21     2
  .      .      .
  .      .      .
  .      .      .
500      27     3

Scenario 1

  • The total number of 1s should be 200, assigned to random rows. The remaining 300 rows should be 0s.
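
One possible way to approach Scenario 1 is to build a vector with exactly 200 1s and 300 0s and then shuffle it. This is only a sketch: the data frame `df` below is made-up toy data standing in for the real data set, and the seed is there purely so the example is reproducible.

```r
# Toy stand-in for the real data set (assumption: 500 rows, columns as in the question)
set.seed(42)  # fixed seed only for reproducibility
n <- 500
df <- data.frame(
  Id       = seq_len(n),
  Age      = sample(20:30, n, replace = TRUE),
  Category = sample(1:3, n, replace = TRUE)
)

# Build exactly 200 ones and 300 zeros, then shuffle their order with sample()
df$Indicator <- sample(rep(c(1, 0), times = c(200, 300)))

table(df$Indicator)  # counts of 0s and 1s
```

Because the vector is fixed before shuffling, the count of 1s is always exactly 200; only their positions are random.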

Scenario 2

  • The total number of 1s should be 200. The remaining 300 should be 0s.
    • 40% of the 1s should be in Category1. That is, 80 1s should be in Category1.
    • 40% of the 1s should be in Category2. That is, 80 1s should be in Category2.
    • 20% of the 1s should be in Category3. That is, 40 1s should be in Category3.
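
Scenario 2 could be sketched by sampling a fixed number of rows within each category. The toy data frame `df`, the helper vector `ones_per_cat`, and the category sizes (200, 200, 100) are assumptions made for illustration; the real data only needs at least 80, 80 and 40 rows in Categories 1, 2 and 3 respectively.

```r
# Toy stand-in for the real data set (assumption: category sizes 200/200/100)
set.seed(42)  # fixed seed only for reproducibility
n <- 500
df <- data.frame(
  Id       = seq_len(n),
  Age      = sample(20:30, n, replace = TRUE),
  Category = sample(rep(1:3, times = c(200, 200, 100)))
)

# Number of 1s wanted in each category (hypothetical helper, named by category)
ones_per_cat <- c("1" = 80, "2" = 80, "3" = 40)

df$Indicator <- 0
for (k in names(ones_per_cat)) {
  rows <- which(df$Category == as.integer(k))        # row indices in this category
  stopifnot(length(rows) >= ones_per_cat[[k]])       # enough rows to hold the 1s
  chosen <- rows[sample.int(length(rows), ones_per_cat[[k]])]  # draw without replacement
  df$Indicator[chosen] <- 1
}

tapply(df$Indicator, df$Category, sum)  # number of 1s per category
```

Indexing with `sample.int(length(rows), ...)` rather than calling `sample(rows, ...)` directly sidesteps the case where `rows` has length one and `sample()` would treat it as a range.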

Expected Output

  Id     Age    Category  Indicator  
  1      23     1         1
  2      24     1         0
  3      21     2         1
  .      .      .         .
  .      .      .         .
  .      .      .         .
500      27     3         1

I know that sample(c(0,1), 500, replace = TRUE) will generate random 0s and 1s, but I don't know how to make it generate exactly 200 1s at random. I am also not sure how to generate 80 1s at random in Category1, 80 1s in Category2, and 40 1s in Category3.



