I have a data.table with probabilities for a discrete distribution stored in columns, one distribution per row.
For example, dt <- data.table(p1 = c(0.5, 0.25, 0.1), p2 = c(0.25, 0.5, 0.1), p3 = c(0.25, 0.25, 0.8))
I'd like to create a new column of a random variable sampled using the probabilities in the same row. In data.table syntax I imagine it working like this:
dt[, sample := sample(1:3, 1, prob = c(p1, p2, p3))]
If there were a 'psample' function similar to 'pmin' and 'pmax', this would work. I was able to make this work using apply, but with my real data set it takes longer than I would like. Is there a faster way to do this using data.table? The apply solution is given below.
dt[, sample := apply(dt, 1, function(x) sample(1:3, 1, prob = x[c('p1', 'p2', 'p3')]))]
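One possible direction is to avoid the per-row call to sample() altogether and use inverse-CDF sampling on whole columns: accumulate the probability columns, draw one uniform number per row, and count how many cumulative thresholds it passes. The sketch below is only an illustration under that assumption. The name psample is hypothetical (it is not a base R or data.table function), and the code has not been benchmarked against the apply version.

```r
library(data.table)

dt <- data.table(p1 = c(0.5, 0.25, 0.1),
                 p2 = c(0.25, 0.5, 0.1),
                 p3 = c(0.25, 0.25, 0.8))

# Hypothetical helper (not part of base R or data.table).
# Each argument is a vector of probabilities for one category;
# element i of every argument belongs to row i.
psample <- function(...) {
  # Running sums across categories: cum[[j]] = p1 + ... + pj, row by row.
  cum <- Reduce(`+`, list(...), accumulate = TRUE)
  k <- length(cum)
  # One uniform draw per row, scaled by the row total so that rows
  # which do not sum exactly to 1 are still handled.
  u <- runif(length(cum[[1]])) * cum[[k]]
  # The sampled category is 1 plus the number of thresholds at or below u.
  idx <- rep(1L, length(u))
  for (j in seq_len(k - 1L)) idx <- idx + (u >= cum[[j]])
  idx
}

dt[, sample := psample(p1, p2, p3)]
```

The loop runs over the categories (three here), not over the rows, so each step is a vectorized comparison on a full column. If a row-by-row call to sample() is preferred, grouping by row, for example dt[, sample := sample(1:3, 1, prob = c(p1, p2, p3)), by = seq_len(nrow(dt))], is closer to the syntax imagined above, though it still calls sample() once per row.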