I am trying to sample my dataset according to a specific rule: within each labeled id, the sample should contain fixed proportions of each category. Is there an option for this in the sample() function in R?
A simple description of my dataset is:
        id mode OD_ID
 1:  50909    1     1
 2:  62024    1     1
 3:  82812    1     1
 4: 100593    1     1
 5: 150391    2     1
 6: 159413    2     1
 7: 132134    2     1
 8: 111111    2     1
 9:  78524    3     1
10: 802212    3     1
.
.
.
I would like to sample this data so that the values of the column "mode" appear in a given ratio within each value of the id column "OD_ID".
For example, for the rows with OD_ID = 1, I would like the sample to contain specific proportions of each "mode".
I would like the sampled dataset to have mode=1 in 71% of the rows, mode=2 in 21%, and mode=3 in 8%. My full dataset has enough rows for this, and I want the sample to contain 10 rows for each OD_ID. The number of rows for each mode should be rounded to the nearest integer.
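To make the arithmetic concrete, here is a small sketch of the per-mode row counts I am after for one OD_ID. The names `shares`, `n_per_od` and `counts` are just placeholders I made up for this illustration.

```r
# Target share of each mode within one OD_ID (the ratios stated above)
shares <- c("1" = 0.71, "2" = 0.21, "3" = 0.08)

# Number of sampled rows wanted per OD_ID
n_per_od <- 10

# Rows per mode, rounded to the nearest integer
counts <- round(shares * n_per_od)
print(counts)

# Total rows per OD_ID after rounding
print(sum(counts))
```

With these particular ratios the rounded counts happen to add up to 10 again, but I realize that rounding will not always preserve the total for other ratios or sample sizes.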
So an example of my output would be
id mode OD_ID
some id 1 1
some id 1 1
some id 1 1
some id 1 1
some id 1 1
some id 1 1
some id 1 1
some id 2 1
some id 2 1
some id 3 1
.
.
.
some id 1 2
.
.
.
That is, within each OD_ID the sampled rows should be 71% mode 1, 21% mode 2, and 8% mode 3.
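Something along these lines is roughly what I am picturing, written in base R, though I do not know whether there is a built-in option that does it directly. This is only a sketch under assumptions: `dat` is a made-up toy data frame standing in for my real table (assumed to be a plain data.frame), and `shares`, `n_per_od`, `counts` and `sampled` are placeholder names.

```r
set.seed(1)  # only so the illustration is reproducible

# Hypothetical toy data: 2 OD_IDs, 3 modes, 100 rows per (OD_ID, mode) pair
dat <- data.frame(
  id    = sample(10000:999999, 600),
  mode  = rep(rep(1:3, each = 100), times = 2),
  OD_ID = rep(1:2, each = 300)
)

shares   <- c("1" = 0.71, "2" = 0.21, "3" = 0.08)
n_per_od <- 10
counts   <- round(shares * n_per_od)  # rows wanted per mode within each OD_ID

# One block per (OD_ID, mode) pair; drop = TRUE skips empty combinations
blocks <- split(dat, list(dat$OD_ID, dat$mode), drop = TRUE)

# Draw the required number of rows from each block, without replacement.
# min() guards against blocks that have fewer rows than requested.
sampled <- do.call(rbind, lapply(blocks, function(b) {
  k <- counts[[as.character(b$mode[1])]]
  b[sample(nrow(b), min(k, nrow(b))), , drop = FALSE]
}))

# Tidy up: order by OD_ID then mode, and reset the row names
sampled <- sampled[order(sampled$OD_ID, sampled$mode), ]
rownames(sampled) <- NULL

# Check the number of sampled rows per OD_ID (rows) and mode (columns)
print(table(sampled$OD_ID, sampled$mode))
```

The idea is to sample separately within each (OD_ID, mode) block instead of calling sample() once on the whole table, since a single call with the `prob` argument would only give the ratios on average and not exactly.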
I would appreciate some help.