I am trying to randomly sample rows from a matrix (b below), but I want the resulting matrix of samples to have the same proportion of zeros in each column as another matrix (a below). I am trying to use the sample() function to do this, but I'm not having much joy. Some reproducible code is below, which will hopefully explain my problem:
set.seed(1234)
# matrix a is the matrix that holds the distribution of zeros I want to match
a <- matrix(as.integer(rexp(200, rate=.1)), ncol=20)
# matrix b is the matrix to be sampled from
b <- matrix(as.integer(rexp(2000, rate=.1)), ncol=20)
a looks like:
     [,1] [,2] [,3] [,4] [,5]
[1,]    6    0    6    1   22
[2,]   19    6    0   23   19
[3,]    8   22    8    5    0
[4,]   24   17   28    3    0
b looks like:
      [,1] [,2] [,3] [,4] [,5]
 [1,]    1    1   10    5    9
 [2,]   26    1    3    2    2
 [3,]    4    8    3    0    0
 [4,]    2   10   35    3   11
 [5,]    1    3   16    0    6
 [6,]    2    4    2   16    2
 [7,]    3   18   13    6   17
 [8,]    0    2    9    0   13
 [9,]    2   15    6   27   30
[10,]    1    2    7    9   15
[11,]   13    0    5    1    2
[12,]   18   12    9   27   33
[13,]    0   20    3   18    1
[14,]    5    7    7   16    4
[15,]    5    6    4    5    2
[16,]    0    7    5   10    7
[17,]    3   20    5   14   34
[18,]   28    0   10    5    8
[19,]   33    0    2    6   13
[20,]    7   28    0   11    8
I extract the proportion of non-zero values in each column of a (that is, one minus the proportion of zeros) to use in the sampling:
dist <- apply(a, 2, function(x) sum(x != 0) / length(x))
dist
[1] 1.00 0.75 0.75 1.00 0.50
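As a side note, the same per-column proportions can be computed more directly with colMeans(). This is only a minimal sketch; the names p_zero and p_nonzero are illustrative and not part of the code above.

```r
set.seed(1234)
a <- matrix(as.integer(rexp(200, rate = 0.1)), ncol = 20)

# a == 0 is a logical matrix; the column mean of a logical matrix is the
# proportion of TRUE values in that column.
p_zero <- colMeans(a == 0)      # proportion of zeros per column
p_nonzero <- colMeans(a != 0)   # proportion of non-zeros per column

# p_nonzero is the quantity stored in `dist` in the question.
dist_check <- apply(a, 2, function(x) sum(x != 0) / length(x))
```

Using a name other than dist also avoids masking the base function stats::dist().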
I then try to sample rows from b so that the sample has the same number of rows as a:
b_sample <- b[sample(x = nrow(b), size = 4, replace = FALSE), ]
This works, but I want b_sample to have the same proportion of zeros in each column as a. I have tried to do this:
b_sample <- b[sample(x = nrow(b), size = 4, replace = FALSE, prob = dist), ]
but I get an error:
Error in sample.int(x, size, replace, prob) :
incorrect number of probabilities
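For context, the prob argument of sample() expects one weight per element being sampled. Here that means one weight per row of b, whereas dist holds one value per column of a, so the lengths do not match. The sketch below illustrates this; row_weights and bad_weights are hypothetical names used only for the illustration, and equal weights are an arbitrary choice.

```r
set.seed(1234)
b <- matrix(as.integer(rexp(2000, rate = 0.1)), ncol = 20)

# One weight per row of b: this is the length sample() expects for prob.
row_weights <- rep(1, nrow(b))
idx <- sample(x = nrow(b), size = 4, replace = FALSE, prob = row_weights)
b_sample <- b[idx, ]

# A weight vector of any other length is rejected; the error message is
# captured here instead of stopping the script.
bad_weights <- rep(0.5, 5)
res <- tryCatch(
  sample(x = nrow(b), size = 4, replace = FALSE, prob = bad_weights),
  error = function(e) conditionMessage(e)
)
```

Note that even a correctly sized prob only weights whole rows, so on its own it cannot control the share of zeros in each column separately.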
I am not sure whether I have the format wrong, or whether sample() is simply not the correct function to use. Any help would be greatly appreciated!
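One possible approach, offered as a sketch under stated assumptions rather than a definitive answer: sample each column of b separately, fixing the number of zeros to match the corresponding column of a and drawing the remaining values from the non-zero entries of b. This assumes that columns may be sampled independently (rows of b are not kept intact) and that a and b have the same number of columns. The helper sample_col is hypothetical.

```r
set.seed(1234)
a <- matrix(as.integer(rexp(200, rate = 0.1)), ncol = 20)
b <- matrix(as.integer(rexp(2000, rate = 0.1)), ncol = 20)

# Hypothetical helper: build a length-n sample from one column of b in which
# round(p_zero * n) entries are zero and the rest are non-zero values of b.
sample_col <- function(col, n, p_zero) {
  n_zero <- round(p_zero * n)
  nonzero <- col[col != 0]
  if (n - n_zero > length(nonzero)) {
    stop("not enough non-zero values in this column of b")
  }
  # Index via sample.int() because sample(x, ...) treats a single number x
  # as the range 1:x.
  picked <- nonzero[sample.int(length(nonzero), n - n_zero)]
  # Zeros are interchangeable, so they are simply inserted; this assumes it
  # is acceptable even if b holds fewer zeros than requested.
  out <- c(rep(0L, n_zero), picked)
  out[sample.int(n)]  # shuffle so the zeros are not all at the top
}

n <- nrow(a)
p_zero <- colMeans(a == 0)
b_sample <- sapply(seq_len(ncol(b)), function(j) sample_col(b[, j], n, p_zero[j]))
```

If whole rows of b must be kept together, this sketch does not apply, because matching the zero proportion in every column at once with intact rows is a much more constrained problem.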