I'll try to describe my problem as a list to make it easier to follow:
- I have a MATLAB table `T` of size 1000x30.
- The last column of the table, called `Class`, contains integer values ranging from 1 to 20.
- So some rows have the value 1, which means they belong to class 1, some have the value 2, some have the value 20, and so on.
- The classes do not all have the same number of rows: there may be 100 rows of class 1, 10 rows of class 2, 500 rows of class 3, and so on.
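To make the "rows per class" idea concrete, here is a minimal sketch of counting how many rows each class has. The vector `classLabels` is a made-up stand-in for `T.Class`, not data from the actual table; `unique` and `accumarray` are standard MATLAB functions that should be available in R2015b.

```matlab
% Hypothetical stand-in for T.Class (made-up labels, not the real data)
classLabels = [1; 1; 3; 2; 3; 3; 1; 3];

% classIds: the distinct class values, sorted
% idx:      for every row, the position of its class within classIds
[classIds, ~, idx] = unique(classLabels);

% Add up a 1 for every row, grouped by class position
counts = accumarray(idx, 1);

% Print one line per class
for k = 1:numel(classIds)
    fprintf('class %d: %d rows\n', classIds(k), counts(k));
end
```

With the real table, `classLabels` would simply be replaced by `T.Class`.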
This is what I want to do:
- I want to find the class with the fewest rows and get its row count. Say class 10 has the fewest rows, with `count == 3`, while every other class has more than 3 rows.
- I will then add a new column called `YesNo` that contains only the values 0 or 1.
- All rows of the class with the lowest count (class 10 in this example) get the value 1.
- From every other class, I want to randomly select as many rows as the smallest class has (3 in this example).
- These randomly selected rows get the value 1 in the new `YesNo` column, and all rows that were not chosen get 0.
- So in this example, I end up with a new column of 1000 values, where 3*20 = 60 of them are 1 (3 is the number of rows in the smallest class and 20 is the number of classes) and the rest are 0.
How can this be done in MATLAB R2015b? I know that I can create a new column in the table using `T.YesNo = newArr;`, where `newArr` is a 1000x1 double containing 0 and 1 values.
As a small example, if `T` is 10x3 and has only 3 classes (1, 2, 3), this is what `T` looks like:
ID Name Class
0 'a' 3
1 'b' 2
2 'a' 2
3 'b' 2
4 'a' 3
5 'a' 1
6 'a' 1
7 'b' 2
8 'b' 1
9 'a' 2
As shown above, class 3 has the lowest count, with only 2 rows. So I want to randomly select two rows from each of class 1 and class 2, set the new column to 1 for these randomly selected rows (and for both rows of class 3), and set it to 0 for the rest, as shown below:
ID Name Class YesNo
0 'a' 3 1
1 'b' 2 0
2 'a' 2 1
3 'b' 2 0
4 'a' 3 1
5 'a' 1 0
6 'a' 1 1
7 'b' 2 0
8 'b' 1 1
9 'a' 2 1
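One possible way to build such a column is sketched below on the 10-row example. This is only a sketch under some assumptions: the variable names `classIds`, `idx`, `counts`, `minCount`, `rows` and `pick` are made up for illustration, and `Name` is assumed to be stored as a cell array of strings. It relies on `unique`, `accumarray` and `randperm(n, k)`, which should all be available in R2015b. Because the selection is random, the rows marked with 1 in classes 1 and 2 will generally differ from the table above.

```matlab
% Rebuild the 10x3 example table (Name assumed to be a cell array of strings)
ID    = (0:9)';
Name  = {'a';'b';'a';'b';'a';'a';'a';'b';'b';'a'};
Class = [3; 2; 2; 2; 3; 1; 1; 2; 1; 2];
T = table(ID, Name, Class);

% Count the rows of every class and take the smallest count
[classIds, ~, idx] = unique(T.Class);
counts   = accumarray(idx, 1);
minCount = min(counts);

% Start with all zeros, then mark minCount random rows in every class.
% For the smallest class this selects all of its rows.
newArr = zeros(height(T), 1);
for k = 1:numel(classIds)
    rows = find(idx == k);                         % row numbers of class k
    pick = rows(randperm(numel(rows), minCount));  % minCount of them, no repeats
    newArr(pick) = 1;
end

T.YesNo = newArr;
disp(T)
```

The same loop should work unchanged on the 1000x30 table, since it only reads `T.Class`. Calling `rng` with a fixed seed beforehand would make the random selection reproducible. If two classes tie for the lowest count, all rows of both are marked with 1.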