I'm using Walker's alias method to adjust randomly rounded data (rounded to base 3). I have already assigned an alias column value to each value of 3 in the dataframe, stored in AliasColumn. The values in AliasColumn are integers in the range 1 through 5. I've used the Alias Method from here. The dataframe looks like this (it has 64 rows):
Industry AliasColumn
1 5
2 5
3 4
4 2
5 3
6 1
7 2
8 2
9 3
10 5
11 4
12 4
13 4
14 2
15 2
16 1
17 4
18 3
19 5
20 5
Based on the AliasColumn value, I need to toss a loaded coin to create the "real" business count (NumBusinesses), which is between 1 and 5. The loaded coin table is:
AliasColumn 1 2 3 4 5
"Heads prob" 8/12 11/12 1 10/12 5/12
"Alias prob" 4/12 1/12 - 2/12 7/12
Alias value 2 3 - 3 1
For example, if the AliasColumn value is 1, then 8/12 of the time the NumBusinesses value will be 1 and 4/12 of the time the NumBusinesses value will be 2. For the AliasColumn value of 3, that is the only value that can be assigned to NumBusinesses.
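The loaded coin described above can be sketched in R as one toss per row. This is only an illustration under assumptions: heads_prob, alias_value and toss_alias are names made up here, the data frame reproduces just the 20 rows shown above, and the unused alias entry for AliasColumn value 3 is filled with 3 as a placeholder.

```r
# Loaded-coin table from above, indexed by AliasColumn value 1..5.
heads_prob  <- c(8, 11, 12, 10, 5) / 12  # chance of keeping the AliasColumn value
alias_value <- c(2, 3, 3, 3, 1)          # value used on "tails"; entry 3 is a placeholder

# Hypothetical helper: one independent coin toss per element of alias_col.
toss_alias <- function(alias_col) {
  heads <- runif(length(alias_col)) < heads_prob[alias_col]
  ifelse(heads, alias_col, alias_value[alias_col])
}

set.seed(1)  # only so the illustration is repeatable
foo <- data.frame(Industry = 1:20,
                  AliasColumn = c(5, 5, 4, 2, 3, 1, 2, 2, 3, 5,
                                  4, 4, 4, 2, 2, 1, 4, 3, 5, 5))
foo$NumBusinesses <- toss_alias(foo$AliasColumn)
```

Because runif() never returns exactly 1, rows with a heads probability of 1 always keep their AliasColumn value, so the placeholder alias is never used.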
Thus, NumBusinesses receives one of two values, with a probability conditional on the specific value in AliasColumn. Because the NumBusinesses column can only take one of two values, and because these are integers that differ depending on the value in AliasColumn, I was hoping to use the sample() function in R. I have been unable to get this to work.
I have tried the following. (I've just noticed that my code for AliasColumn value 4 is written differently from that for 1 and 2, but the output didn't seem any different from when I initially ran it with 1:2 and 2:3 instead of 1,2 and 2,3, respectively.)
foo$NumBusinesses[AliasCol==1] <-sample(c(1,2),1, replace=TRUE,prob=c(8,4))
foo$NumBusinesses[AliasCol==2] <-sample(c(2,3),1, replace=TRUE,prob=c(11,1))
foo$NumBusinesses[AliasCol==3] <- 3
foo$NumBusinesses[AliasCol==4] <-sample(c(3:4),1, replace=TRUE,prob=c(2,10))
foo$NumBusinesses[AliasCol==5] <-sample(c(1,5),1, replace=TRUE,prob=c(7,5))
This seems to set the NumBusinesses value to be the same as that in AliasColumn, apart from when the AliasColumn value is 5, in which case the NumBusinesses value is being set to 1.
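One possible reading of this result, offered as a sketch rather than a diagnosis: sample(x, 1, ...) returns a single value, and R recycles that one value across every row selected by the logical index, so each group gets one shared draw. Asking for as many draws as there are matching rows gives each row its own toss. The data frame below is made-up stand-in data, and idx is a name introduced only for this example.

```r
set.seed(42)  # for a repeatable illustration only
foo <- data.frame(AliasColumn = rep(1:5, times = 4))  # made-up stand-in data
foo$NumBusinesses <- NA_real_

idx <- foo$AliasColumn == 1
# size = sum(idx) draws one value per matching row,
# instead of a single draw recycled across all of them
foo$NumBusinesses[idx] <- sample(c(1, 2), sum(idx), replace = TRUE, prob = c(8, 4))
```

The same pattern would be repeated for the other AliasColumn values, each with its own pair of outcomes and weights.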
I considered an ifelse loop, and attempted one:
ifelse(foo$AliasCol==1, foo$NumBusinesses<- sample(c(1,2),1, replace=TRUE,prob=c(8,4)),
ifelse(foo$AliasCol==2),
foo$NumBusinesses<- sample(c(2,3),1, replace=TRUE,prob=c(11,1)),
ifelse(foo$AliasCol==3), foo$NumBusinesses<- 3,
ifelse(foo$AliasCol==4),
foo$NumBusinesses <- sample(c(3:4),1, replace=TRUE,prob=c(2,10)),
foo$NumBusinesses <- sample(c(1,5),1, replace=TRUE,prob=c(7,5)))
And I received this error (which makes me believe I am overthinking the loop):
Error in ifelse(foo$AliasCol == 1, foo$NumBusinesses <- sample(c(1, : unused arguments (foo3$NumBusinesses <- sample(c(2, 3), 1, replace = TRUE, prob = c(11, 1)), ifelse(foo$AliasCol == 3), foo$NumBusinesses <- 3, ifelse(foo$AliasCol == 4), foo$NumBusinesses <- sample(c(3:4), 1, replace = TRUE, prob = c(2, 10)), foo$NumBusinesses <- sample(c(1, 5), 1, replace = TRUE, prob = c(7, 5)))
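For reference, ifelse() takes exactly three arguments (test, yes, no), so a closing parenthesis placed straight after an inner test ends that call early and leaves the remaining pieces as extra arguments to the outer call. A minimal sketch of a correctly nested version follows, assuming a made-up stand-in data frame and a hypothetical n; each sample() call draws n values so that every row has its own toss available, and the assignment happens once, outside the ifelse().

```r
set.seed(7)  # for a repeatable illustration only
foo <- data.frame(AliasCol = rep(1:5, times = 3))  # made-up stand-in data
n <- nrow(foo)

# Each inner ifelse() is the "no" argument of the one before it.
foo$NumBusinesses <-
  ifelse(foo$AliasCol == 1, sample(c(1, 2), n, replace = TRUE, prob = c(8, 4)),
  ifelse(foo$AliasCol == 2, sample(c(2, 3), n, replace = TRUE, prob = c(11, 1)),
  ifelse(foo$AliasCol == 3, 3,
  ifelse(foo$AliasCol == 4, sample(c(3, 4), n, replace = TRUE, prob = c(2, 10)),
                            sample(c(1, 5), n, replace = TRUE, prob = c(7, 5))))))
```

This draws more random values than are strictly needed, which keeps the sketch short at the cost of some wasted work.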
How can I generate my conditional output in one step, or one set of steps?