I have a data frame of 750 rows and 3 columns.
- One column is "IDs" (e.g. 001, 002, 003, 004, 005, etc.). There are a total of 250 levels of IDs.
- One column is "type" (e.g. OO, AA, AP). There are a total of 3 levels of types.
- One column is "face" (e.g. CFD, OFD, RAD). There are a total of 3 levels of faces.
Each "ID" is repeated for each of the three levels of "type" for a total of 750 rows.
I'd like to randomly split my data frame into 25 subsets of 30 IDs each, where each subset is made up of:
- 10 IDs at the "OO" level of type (1 to 2 with face RAD, 3 to 4 with face OFD, 4 to 5 with face CFD)
- 10 IDs at the "AA" level of type (1 to 2 with face RAD, 3 to 4 with face OFD, 4 to 5 with face CFD)
- 10 IDs at the "AP" level of type (1 to 2 with face RAD, 3 to 4 with face OFD, 4 to 5 with face CFD)
No ID should appear more than once within a subset.
I have tried combinations of split(), unique(), sample() but nothing is working. Any clue? Thanks in advance.
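For reference, here is a minimal sketch of one possible approach, not a definitive solution. It assumes that "face" is constant within an ID and that the face counts divide evenly enough for the quotas to be reachable; the toy data (50 RAD, 100 OFD, 100 CFD IDs) and the names `id_tab`, `block`, `shift` and `subset` are made up for illustration. The idea is to deal the shuffled IDs into 25 face-balanced blocks of 10, then let subset k take its "OO" rows from block k, its "AA" rows from block k+1 and its "AP" rows from block k+2 (wrapping around), so that no ID can repeat within a subset.

```r
set.seed(1)  # only to make the example reproducible

# Hypothetical toy data: face is assumed constant within an ID,
# and the face counts (50 / 100 / 100) are invented for this sketch.
ids  <- sprintf("%03d", 1:250)
face <- sample(rep(c("RAD", "OFD", "CFD"), times = c(50, 100, 100)))
df <- data.frame(
  ID   = rep(ids, each = 3),
  type = rep(c("OO", "AA", "AP"), times = 250),
  face = rep(face, each = 3),
  stringsAsFactors = FALSE
)

n_sub <- 25
types <- c("OO", "AA", "AP")

# 1. One row per ID, shuffled, then ordered by face (order() keeps the
#    shuffled order within each face). Dealing round-robin then gives
#    25 blocks of 10 IDs with the faces spread as evenly as possible.
id_tab <- unique(df[, c("ID", "face")])
id_tab <- id_tab[sample(nrow(id_tab)), ]
id_tab <- id_tab[order(id_tab$face), ]
id_tab$block <- (seq_len(nrow(id_tab)) - 1) %% n_sub + 1
id_tab$block <- sample(n_sub)[id_tab$block]  # randomly relabel the blocks

# 2. Rotate the blocks by type: subset k gets OO from block k,
#    AA from block k + 1 and AP from block k + 2 (modulo 25).
df$block  <- id_tab$block[match(df$ID, id_tab$ID)]
shift     <- match(df$type, types) - 1
df$subset <- (df$block - 1 - shift) %% n_sub + 1

subsets <- split(df, df$subset)

# 3. Inspect one subset: counts of face within each type
table(subsets[[1]]$type, subsets[[1]]$face)
```

With real data the face counts per block depend on how many IDs each face has, so it is worth checking `table(df$subset, df$type, df$face)` against the 1 to 2 / 3 to 4 / 4 to 5 quotas before using the split. Note also that the rotation makes the split random but structured: the three types of a given ID always land in neighbouring blocks.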