Saturday, January 30, 2021

Randomly generating numbers in R with condition for the total sum AND with restrictions for specific members of the generated vector

I am looking to randomly generate a vector of numbers in R that has a specific sum and also respects a restriction on some specific members of the vector, e.g. that the 4th number (say, in a vector of 5) cannot exceed 50.

I am doing this within a for loop with millions of iterations in order to simulate election vote changes, where I am adding votes to one party and taking them away from other parties with equal probability. However, my issue is that in many iterations, votes turn out to be negative, which is illogical. I have figured out how to do the "sums up to X" part from other answers here, and I have made a workaround for the second restriction as follows:

    library(data.table)

    parties <- data.table(party = c("red", "green", "blue", "brown", "yellow"),
                          votes = c(657, 359, 250, 80, 7))
    votes_to_reallocate <- 350
    immune_party <- "green"

    parties_simulation <- copy(parties)

    # Take the reallocated votes away from every party except the immune one,
    # with equal probability for each of them.
    parties_simulation[party != immune_party,
                       votes := votes - as.vector(rmultinom(1, size = votes_to_reallocate, prob = rep(1, nrow(parties) - 1)))
                       ]
    # Most likely there are negative votes for the last party, perhaps even the last two.
    # The while loop is supposed to correct this.

    while (any(parties_simulation[, votes] < 0)) {
        negative_parties <- parties_simulation[votes < 0, party]
        for (i in seq_along(negative_parties)) {
            # Size of the deficit for this party
            votes_to_correct <- parties_simulation[party == negative_parties[i], abs(votes)]
            parties_to_change <- parties_simulation[party != immune_party & !party %in% negative_parties, .N]
            # Spread the deficit over the remaining non-immune, non-negative parties
            parties_simulation[party != immune_party & !party %in% negative_parties,
                               votes := votes - as.vector(rmultinom(1, size = votes_to_correct, prob = rep(1, parties_to_change)))
                               ]
            # Bring the negative party back up to zero
            parties_simulation[party == negative_parties[i], votes := votes + votes_to_correct]
        }
    }

However, this seems to be a huge bottleneck as each simulation has to be corrected by the while loop. I am curious as to whether there is a solution for this that would generate the random numbers with the restriction already imposed (for instance, generate 4 random numbers, adding up to 350, and with the fourth number not exceeding 7). If not, perhaps there is a more efficient way to solve this?
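
To make the target concrete, here is a minimal sketch of one possible direction. It does the same cap-and-redistribute idea as the while loop above, but on plain numeric vectors instead of data.table subsets. The helper name `draw_capped` and its arguments are hypothetical, not from any package. Note the assumption: the result always sums to `size` and respects the caps, but its distribution is not the same as a multinomial conditioned on the caps, so it may or may not be acceptable for the simulation.

```r
# Hypothetical helper: draw non-negative counts that sum to `size`,
# with counts[i] never exceeding caps[i].
# Assumption: every party that still has room is equally likely to lose a vote.
draw_capped <- function(size, caps) {
  stopifnot(size <= sum(caps))
  counts <- numeric(length(caps))
  remaining <- size
  while (remaining > 0) {
    room <- caps - counts
    open <- room > 0
    # Spread the remaining votes uniformly over the parties with room left;
    # parties at their cap get probability zero.
    draw <- as.vector(rmultinom(1, size = remaining, prob = as.numeric(open)))
    # Accept at most the available room; any overflow is redrawn on the next pass.
    accepted <- pmin(draw, room)
    counts <- counts + accepted
    remaining <- remaining - sum(accepted)
  }
  counts
}

set.seed(1)
caps <- c(657, 250, 80, 7)        # current votes of the non-immune parties
removed <- draw_capped(350, caps) # votes to take away from each of them
```

With the example data, `removed` would be subtracted from the votes of red, blue, brown and yellow, and no party can end up below zero because each entry is bounded by that party's current votes. Each pass accepts at least one vote, so the loop terminates whenever `size` does not exceed `sum(caps)`.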



