How to sample (without replacement) from several folders of image files, where each image belongs to the class named by the folder in which it's stored, such that the relative proportion of images per class is maintained.
For example, you have 4 classes: dog, cat, bird, turtle. There are 1000 dogs, 200 cats, 200 birds, 1400 turtles.
dogs
|--img3487.png
|--img2764.png
...
|--img5773.png

cats
|--img7701.png
|--img5429.png
...
|--img2716.png

birds
|--img5232.png
|--img6705.png

turtles
|--img2601.png
|--img7748.png
You want to ensure that when you split the dataset into, say, a 70/10/20 train/validation/test set, the correct proportion of images is sampled from each animal's folder. For instance, the training set should then contain 700 dogs, 140 cats, 140 birds, and 980 turtles.
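One way to get such a split is to shuffle and slice each class separately, which is often called a stratified split. The following is only a minimal sketch using the Python standard library. The function names `list_images_by_class` and `stratified_split`, the split labels, and the synthetic file names in the demo are assumptions made for illustration. They are not part of the original question.

```python
import random
from pathlib import Path


def list_images_by_class(root):
    """Map each sub-folder name (the class) to a sorted list of its file paths."""
    return {
        d.name: sorted(p for p in d.iterdir() if p.is_file())
        for d in sorted(Path(root).iterdir())
        if d.is_dir()
    }


def stratified_split(files_by_class, fractions=(0.7, 0.1, 0.2), seed=0):
    """Split every class on its own so each subset keeps the class proportions.

    Only the train and validation fractions are used for the cut points;
    the test set receives whatever remains of each class.
    """
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label, files in files_by_class.items():
        files = list(files)
        rng.shuffle(files)  # shuffle once, then slice: sampling without replacement
        n_train = round(fractions[0] * len(files))
        n_val = round(fractions[1] * len(files))
        splits["train"] += [(f, label) for f in files[:n_train]]
        splits["val"] += [(f, label) for f in files[n_train:n_train + n_val]]
        splits["test"] += [(f, label) for f in files[n_train + n_val:]]
    return splits


# Synthetic file names standing in for the four folders in the example.
counts = {"dogs": 1000, "cats": 200, "birds": 200, "turtles": 1400}
demo = {c: [f"{c}/img{i}.png" for i in range(n)] for c, n in counts.items()}
splits = stratified_split(demo)
for name, items in splits.items():
    print(name, len(items))
```

With real data, `stratified_split(list_images_by_class("path/to/dataset"))` would be the entry point, where the path is a placeholder for the directory holding the class folders. Because every file lands in exactly one slice of its class, no image appears in more than one subset. For very small classes the rounding can leave a subset empty, so it is worth checking the per-class counts afterwards.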