I need to divide the data in DF1 into groups based on their class. In some cases everything in a class will go into the same group. Each class needs to be divided into groups at random, but not in equal shares. DF2 holds the shares that say how the data needs to be divided. I import DF2 from Excel. I maintain this file myself, so the structure of the data can be changed if needed. This is the file I will use to divide the classes into groups. The Share column tells me what proportion of the class must go into each group. For example, 50% of the rows in DF1 with class 1 must be assigned to Apples, 25% to Hammer and 25% to Car. NB! It needs to be random: it can't be that the first 50% of rows are Apples, the next 25% Hammer, etc.
My solution is to give every row in DF1 a random number, and to save the seed every time I generate the numbers so that I can go back and reuse a seed I got before. NB! It’s important to me that I can go back to the previous random numbers if a colleague or I run the code by mistake and create a new random seed. I already have that part covered for the random number.
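For readers following along, the reproducibility idea can be sketched roughly as below. This is only an illustration, not the code I actually use: the seed value `20240101` and the variable name `saved_seed` are made up for the example.

```r
# Hypothetical seed; in practice this would be stored next to the data
# (for example in a small text file) so the draw can be repeated later.
saved_seed <- 20240101

# Example base data with the same IDs and classes as DF1.
DF1 <- data.frame(ID = 1:7, Class = c(1, 1, 2, 1, 2, 3, 1))

# Draw one uniform random number per row, after fixing the seed.
set.seed(saved_seed)
DF1$Random <- runif(nrow(DF1))

# Re-running with the same seed reproduces exactly the same numbers.
set.seed(saved_seed)
Random_again <- runif(nrow(DF1))
same_draw <- identical(DF1$Random, Random_again)
```

As long as the seed and the row order of DF1 stay the same, the Random column comes out identical on every run.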
DF1 (base data)

ID  Class  Random
1   1      0,65
2   1      0,23
3   2      0,45
4   1      0,11
5   2      0,89
6   3      0,12
7   1      0,9
My solution is to make a Share_2 column in which I divide the interval 0-1 into ranges based on the Share column, i.e. the cumulative share within each class. In Excel logic I would like to do the following:
IF Class = 1 then
IF Random < 0,5; Apples; if not then
IF Random < 0,75; Hammer; if not then
IF Random < 1; Car
DF2 (Classification file made by me)

Class  Group     Share   Share_2
1      Apples    50%*    0,5
1      Hammer    25%     0,75
1      Car       25%     1
2      Building  100%**  1
3      Computer  50%     0,5
3      Hammer    50%     1

*This means that 50% of class 1 need to be "Apples". Shares in a class give 100% in total.
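Share_2 is just the running total of Share within each class, so it would not have to be typed by hand in Excel. A minimal sketch in base R, assuming the shares are stored as fractions (0.50 rather than "50%") and that R's decimal point is used instead of the decimal comma:

```r
# DF2 rebuilt by hand for the example; Share is given as a fraction.
DF2 <- data.frame(
  Class = c(1, 1, 1, 2, 3, 3),
  Group = c("Apples", "Hammer", "Car", "Building", "Computer", "Hammer"),
  Share = c(0.50, 0.25, 0.25, 1.00, 0.50, 0.50),
  stringsAsFactors = FALSE
)

# Cumulative share within each class: the upper cut-point of each group.
DF2$Share_2 <- ave(DF2$Share, DF2$Class, FUN = cumsum)

# Sanity check: the shares of every class should add up to 1.
class_totals <- tapply(DF2$Share, DF2$Class, sum)
shares_ok <- all(abs(class_totals - 1) < 1e-9)
```

The order of the rows within a class decides which range of 0-1 each group gets, so that order has to stay fixed if the result is to be replicated.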
I need
DF3

ID  Class  Random  Group
1   1      0,65    Hammer
2   1      0,23    Apples
3   2      0,45    Building
4   1      0,11    Apples
5   2      0,89    Building
6   3      0,12    Computer
7   1      0,9     Car
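One possible way to get from DF1 and DF2 to DF3 is sketched below in base R. It is a sketch under assumptions rather than a definitive solution: the helper name `assign_group` is invented for the example, the numbers are typed with decimal points, and Share_2 is assumed to be sorted in increasing order within each class.

```r
DF1 <- data.frame(
  ID     = 1:7,
  Class  = c(1, 1, 2, 1, 2, 3, 1),
  Random = c(0.65, 0.23, 0.45, 0.11, 0.89, 0.12, 0.90)
)

DF2 <- data.frame(
  Class   = c(1, 1, 1, 2, 3, 3),
  Group   = c("Apples", "Hammer", "Car", "Building", "Computer", "Hammer"),
  Share_2 = c(0.50, 0.75, 1.00, 1.00, 0.50, 1.00),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pick the first group of the class whose upper
# cut-point (Share_2) is strictly greater than the random number.
assign_group <- function(class, random, lookup) {
  rows <- lookup[lookup$Class == class, ]
  # findInterval() counts how many cut-points are <= random.
  rows$Group[findInterval(random, rows$Share_2) + 1]
}

DF3 <- DF1
DF3$Group <- mapply(assign_group, DF1$Class, DF1$Random,
                    MoreArgs = list(lookup = DF2), USE.NAMES = FALSE)
```

With the example numbers this gives Hammer, Apples, Building, Apples, Building, Computer, Car, the same as the DF3 table. Because every row gets an independent random number, the realised shares in a class are only approximately equal to the target shares, especially in small classes.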
My problem is that I don’t know how to write this in R. Can you please help me? NB! Please feel free to suggest other methods of solving my problem as well, as long as the classes are divided at random and I can save the randomness in order to replicate it.