Wednesday, 14 July 2021

Subsampling without replacement in R

There's a very similar question to mine on Stack Overflow, but it doesn't directly answer my question.

I have abundance data for 250 species across 1000 sites. Species are columns, sites are rows. My abundance data look something like the data in the linked post above.

0    0    3    0    0    201  0    0    0    82
0    23   5    0    0    0    0    0    0    0
9    0    0    0    0    12   0    0    0    913
0    7    91   0    8    0    0    92   9    0
131  12   0    410  0    0    0    3    0    0

If I wanted to sample 50 individuals from each site, without replacement, how could I do this? I'm focusing on code for a single site for now.

This code: samples <- sample(1:ncol(abundances), 50, rep=FALSE, prob=abundances[1,]) doesn't work unless I change to rep=TRUE. However, I need sampling WITHOUT replacement.
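A small sketch of why that call cannot do what is wanted, using the first row of the example data as a stand-in for one site (the vector name site1 is only for illustration). Without replacement, sample() can return each column index at most once, so it draws distinct species rather than individuals, and it errors when more draws are requested than there are species available.

```r
# Toy site with 10 species, only 3 of them present (first row of the example data).
site1 <- c(0, 0, 3, 0, 0, 201, 0, 0, 0, 82)

# Asking for 50 distinct column indices out of 10 cannot succeed without replacement.
res <- tryCatch(
  sample(seq_along(site1), 50, replace = FALSE, prob = site1),
  error = function(e) e
)
inherits(res, "error")  # TRUE

# Even a feasible size draws distinct species, not individuals:
picked <- sample(seq_along(site1), 3, replace = FALSE, prob = site1)
anyDuplicated(picked)   # 0, because each species index appears at most once
```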

I don't want to use sample(abundances[1,], 50, rep=FALSE) because that samples species rather than individuals and returns the whole count for each sampled species (e.g. if species 6 occurs 201 times at site 1, it returns 201 rather than 1 individual of that species), so the final subsample ends up with more than 50 individuals.

I essentially want output identical to what user Dinre gave in their answer to the post above, but without the bootstrapping. I just want to sample without replacement. This process will ultimately go into a for loop that takes a subsample from each site.
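One possible approach, sketched under assumptions rather than taken from the linked answer: expand each site into a pool with one entry per individual (labelled by species column), draw 50 entries from that pool without replacement, and count the draws per species. The helper name subsample_site and the small abundances matrix below are hypothetical; the matrix just reuses the five example rows.

```r
# Toy abundance matrix: rows are sites, columns are species (values from the example).
abundances <- matrix(
  c(  0,  0,  3,   0, 0, 201, 0,  0, 0,  82,
      0, 23,  5,   0, 0,   0, 0,  0, 0,   0,
      9,  0,  0,   0, 0,  12, 0,  0, 0, 913,
      0,  7, 91,   0, 8,   0, 0, 92, 9,   0,
    131, 12,  0, 410, 0,   0, 0,  3, 0,   0),
  nrow = 5, byrow = TRUE
)

# Hypothetical helper: draw n individuals from one site without replacement.
subsample_site <- function(counts, n) {
  # One entry per individual, labelled by its species column index.
  individuals <- rep(seq_along(counts), times = counts)
  if (length(individuals) < n) {
    stop("site has fewer than n individuals")
  }
  # Index into the pool via sample.int() so a length-one pool is handled safely.
  drawn <- individuals[sample.int(length(individuals), n)]
  # Count draws per species, keeping zeros for species that were not drawn.
  tabulate(drawn, nbins = length(counts))
}

set.seed(1)

# Single site: 50 individuals from site 1.
site1_sub <- subsample_site(abundances[1, ], 50)

# All sites at once; n = 20 here because toy site 2 has only 28 individuals.
sub20 <- t(apply(abundances, 1, subsample_site, n = 20))
```

Each row of the result sums to the requested sample size and never exceeds the original count for any species. Sites with fewer individuals than the requested size make the helper stop, so those would need to be dropped or handled separately before looping over all 1000 sites. If the vegan package is available, its rrarefy() function appears to cover the same use case and may be worth checking.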



