I am generating a data set in which I first want to randomly draw a value for each observation from a discrete distribution and store it in var1.
Next, I want to draw another value from the same distribution for each row, with the catch that the value already in var1
for that observation is no longer eligible to be drawn. I want to repeat this a relatively large number of times.
To make this more concrete, suppose that I start with:
id
1
2
3
...
999
1000
Suppose that the possible values are ["A", "B", "C", "D", "E"], which occur with probabilities [.2, .3, .1, .15, .25].
I would first like to randomly draw from this distribution to fill in var1. Suppose that the result of this is:
id var1
1 E
2 E
3 C
...
999 B
1000 A
Now E is not eligible to be drawn for observations 1 and 2. C, B, and A are ineligible for observations 3, 999, and 1000, respectively.
After all the columns are filled in, we may end up with this:
id var1 var2 var3 var4 var5
1 E C B A D
2 E A B D C
3 C B A E D
...
999 B D C A E
1000 A E B C D
I am not sure how to approach this in Stata, but one way to fill in var1 is to do something like:
gen random1 = runiform()
* var1 must exist as a string variable before it can be replaced
gen str1 var1 = ""
replace var1 = "A" if random1<.2
replace var1 = "B" if random1>=.2 & random1<.5
etc....
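For what it is worth, here is a rough sketch of how the same cumulative-probability idea might extend to all five columns. It rests on an assumption I have not stated above: that each later draw uses the original probabilities rescaled over the values still eligible for that row. The seed and the helper variable names (mass, u, cum, used1 to used5) are made up for illustration, so treat this as one possible approach rather than a definitive solution.

```stata
clear
set seed 12345                      // arbitrary seed, only for reproducibility
set obs 1000
gen id = _n

local letters A B C D E
local probs   .2 .3 .1 .15 .25
local k : word count `letters'

forvalues j = 1/`k' {
    gen str1 var`j' = ""
    gen double mass = 0             // probability mass still eligible in this row

    forvalues i = 1/`k' {
        local l : word `i' of `letters'
        local p : word `i' of `probs'
        * flag letter i as used if it appears in an earlier column of this row
        gen byte used`i' = 0
        forvalues m = 1/`=`j'-1' {
            replace used`i' = 1 if var`m' == "`l'"
        }
        replace mass = mass + `p' if !used`i'
    }

    * assumption: rescale the draw to the remaining mass, then walk the
    * cumulative probabilities of the letters that are still eligible
    gen double u   = runiform() * mass
    gen double cum = 0
    forvalues i = 1/`k' {
        local l : word `i' of `letters'
        local p : word `i' of `probs'
        replace cum = cum + `p' if !used`i'
        replace var`j' = "`l'" if var`j' == "" & !used`i' & u < cum
    }

    drop mass u cum used*
}

list id var1-var5 in 1/5
```

With five values and five columns, every row ends up as a permutation of A to E. The inner loops run once per value and column, so the cost grows with the square of the number of values, which may matter if "a relatively large number of times" means many more columns than this.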