I'll try to describe my problem as a list to make it easier to follow:
- I have a MATLAB table T of size 1000x30.
- The last column of the table, called 'Class', holds integer values ranging from 1 to 20.
- So some rows have the value 1, which means they belong to Class 1, some have the value 2, some have the value 20, and so on.
- The number of rows is not the same for every class: there may be 100 rows of Class 1, but only 10 rows of Class 2, 500 rows of Class 3, and so on.
This is what I want to do:
- I want to get the number of rows in the class that has the fewest rows assigned to it. Say Class 10 has the fewest rows, with count == 3, while every other class has more than 3 rows.
- I will then add a new column called YesNo that holds only the values 0 or 1.
- All rows of the class with the lowest count (Class 10 in this example) get the value 1.
- From every other class, I want to randomly select as many rows as the smallest class has (3 in this example).
- For these randomly selected rows, the value in the new column YesNo will be 1, and for all rows that were not chosen it will be 0.
- So in this example I end up with a new column of 1000 values, where 3*20 = 60 of them are 1 (3 is the number of rows in the class with the lowest count, and 20 is the number of classes) and the rest are 0.
How can this be done in MATLAB R2015b? I know that I can create a new column in the table using T.YesNo = newArr; where newArr is a 1000x1 double containing only 0 and 1 values.
As a small example, if T is 10x3 and has only 3 classes (1, 2, 3), this is what T looks like:
ID Name Class
0 'a' 3
1 'b' 2
2 'a' 2
3 'b' 2
4 'a' 3
5 'a' 1
6 'a' 1
7 'b' 2
8 'b' 1
9 'a' 2
As shown above, Class 3 is the one with the lowest count, with only 2 rows. So I want to randomly select two rows from each of Class 1 and Class 2 and set the new column to 1 for these randomly selected rows (and for both rows of Class 3), while the rest will be 0, as shown below:
ID Name Class YesNo
0 'a' 3 1
1 'b' 2 0
2 'a' 2 1
3 'b' 2 0
4 'a' 3 1
5 'a' 1 0
6 'a' 1 1
7 'b' 2 0
8 'b' 1 1
9 'a' 2 1
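
To make the steps concrete, here is a minimal sketch of one way this might be done on the small example, assuming only functions that should be available in R2015b (unique, accumarray, randperm). The variable names classes, idx, counts, n, rows and pick are made up for illustration, and which rows get picked is random, so the result will generally differ from the table above.

```matlab
% Rebuild the 10x3 example table from the question
ID    = (0:9)';
Name  = {'a';'b';'a';'b';'a';'a';'a';'b';'b';'a'};
Class = [3;2;2;2;3;1;1;2;1;2];
T = table(ID, Name, Class);

% idx(i) is the position of row i's class in the sorted list of classes
[classes, ~, idx] = unique(T.Class);
counts = accumarray(idx, 1);   % number of rows in each class
n = min(counts);               % size of the smallest class

YesNo = zeros(height(T), 1);
for k = 1:numel(classes)
    rows = find(idx == k);                  % row numbers of class k
    pick = rows(randperm(numel(rows), n));  % n distinct rows, chosen at random
    YesNo(pick) = 1;
end
T.YesNo = YesNo;
```

For the smallest class, randperm(n, n) returns all of its rows, so it needs no special case. Here sum(T.YesNo) should be 6 (2 rows for each of the 3 classes), and the same loop should work unchanged on the 1000x30 table as long as the column is named Class.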