Tuesday, 16 June 2020

In R, how to match with multiple conditions?

I need to divide the data in DF1 into groups based on class. In some cases everything in a class goes into the same group. Otherwise the class must be split into groups at random, but not in equal shares. DF2 holds the shares that determine how the data is divided. I import DF2 from Excel. I maintain this file myself, so the structure of the data can be changed if needed. This is the file I will use to divide the classes into groups. The Share column tells me what proportion of a class must go into each group. For example, 50% of the rows in DF1 with class 1 must go into Apples, 25% into Hammer and 25% into Car. NB! The assignment needs to be random: it can't be that the first 50% of rows are Apples, the next 25% Hammer, etc.

My solution is to give every row in DF1 a random number, which I save every time I generate it, so I can go back and use the seed I got before. NB! It's important to me that I can go back to the previous random numbers if a colleague or I run the code by mistake and generate a new random seed. I have that part covered as far as the random number is concerned.
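One way the reproducible random column could be sketched in base R, assuming a fixed seed is stored alongside the script. The seed value 20200616 is a made-up placeholder, and the ID and Class values are the ones from the DF1 example in this post:

```r
# Hypothetical seed; in practice this would be the value saved earlier.
seed <- 20200616

# ID and Class as in the DF1 example of the post.
DF1 <- data.frame(ID = 1:7, Class = c(1, 1, 2, 1, 2, 3, 1))

# Draw one uniform random number per row after fixing the seed.
set.seed(seed)
DF1$Random <- runif(nrow(DF1))

# Resetting the same seed reproduces exactly the same numbers,
# so an accidental re-run does not lose the earlier assignment.
set.seed(seed)
again <- runif(nrow(DF1))
print(identical(DF1$Random, again))
```

Saving the seed (or the whole DF1 with `saveRDS()`) outside the script is what protects against an accidental re-run with a new seed.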

      DF1 (base data)          
ID   Class   Random     
1      1      0,65
2      1      0,23
3      2      0,45
4      1      0,11
5      2      0,89
6      3      0,12
7      1      0,9 

My solution is to make a Share_2 column in which I divide the interval from 0 to 1 into ranges based on the Share column. In Excel logic I would like to do the following:

IF Class = 1 then
IF Random < 0,5; Apples; if not then
IF Random < 0,75; Hammer if not then
IF Random < 1; Car

     DF2 (classification file made by me)
Class   Group          Share      Share_2
1       Apples        50%*        0,5
1       Hammer        25%         0,75
1       Car           25%         1
2       Building      100%**      1
3       Computer      50%         0,5
3       Hammer        50%         1

*This means that 50% of class 1 needs to be "Apples". The shares within a class add up to 100% in total.
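The Share_2 column does not have to be typed by hand. Here is a minimal sketch, assuming Share has been read in as a proportion (0.5 rather than "50%") and that the rows of each class are in the desired order, that derives it as a running total within each class:

```r
# DF2 as in the table above, with Share stored as a proportion (assumption).
DF2 <- data.frame(
  Class = c(1, 1, 1, 2, 3, 3),
  Group = c("Apples", "Hammer", "Car", "Building", "Computer", "Hammer"),
  Share = c(0.50, 0.25, 0.25, 1.00, 0.50, 0.50),
  stringsAsFactors = FALSE
)

# Sanity check: the shares within every class must add up to 1.
stopifnot(all(abs(tapply(DF2$Share, DF2$Class, sum) - 1) < 1e-9))

# Cumulative sum of Share within each Class gives the upper bound
# of each group's slice of the 0-1 interval.
DF2$Share_2 <- ave(DF2$Share, DF2$Class, FUN = cumsum)
print(DF2)
```

With shares that are not exactly representable in binary, the last cumulative value of a class may differ from 1 by a rounding error, so it can be safer to set the last bound of each class to 1 explicitly.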

The result I need:

    DF3
ID   Class   Random      Group    
1      1      0,65      Hammer
2      1      0,23      Apples
3      2      0,45      Building
4      1      0,11      Apples
5      2      0,89      Building
6      3      0,12      Computer
7      1      0,9       Car
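One possible way to get from DF1 and DF2 to DF3 in base R is sketched below. It is only a sketch under stated assumptions: the helper `assign_group()` is a hypothetical name, the decimal commas of the tables are written as decimal points, and each row receives the first group of its class whose Share_2 bound is greater than its Random value.

```r
# Example data from the post (decimal commas written as decimal points).
DF1 <- data.frame(
  ID     = 1:7,
  Class  = c(1, 1, 2, 1, 2, 3, 1),
  Random = c(0.65, 0.23, 0.45, 0.11, 0.89, 0.12, 0.90)
)
DF2 <- data.frame(
  Class   = c(1, 1, 1, 2, 3, 3),
  Group   = c("Apples", "Hammer", "Car", "Building", "Computer", "Hammer"),
  Share_2 = c(0.50, 0.75, 1.00, 1.00, 0.50, 1.00),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for one row, look up the groups of its class and
# return the first group whose upper bound exceeds the random number.
assign_group <- function(cls, rnd, lookup) {
  rows <- lookup[lookup$Class == cls, ]
  rows <- rows[order(rows$Share_2), ]
  rows$Group[which(rnd < rows$Share_2)[1]]
}

DF3 <- DF1
DF3$Group <- mapply(assign_group, DF1$Class, DF1$Random,
                    MoreArgs = list(lookup = DF2), USE.NAMES = FALSE)
print(DF3)
```

Because the group depends only on Class and the stored Random value, re-running this step with the same DF1 always gives the same DF3. For large tables a join-based approach would likely be faster than the row-by-row `mapply()` call.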

My problem is that I don't know how to write this in R. Can you please help me? NB! Please feel free to offer other methods of solving my problem as well, as long as the classes are divided at random and I can save the randomness to replicate it.



