Friday, September 13, 2019

Split data into train and test sets, randomized by ID, in MySQL

I want to split my data into 70% training data and 30% test data. I tried this query:

select A.*,
    case
        when rand() < 0.7 then 'training'
        else 'test'
    end as split
from costumer A
order by user_id

But when I count the distinct user_id values in each group, the training:test proportion is not 70%:30%. How can I get a 70%:30% split that is randomized by user_id?
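
A likely cause is that rand() is evaluated once per row, so a user_id that appears on several rows can land in both groups, and even with one row per user the ratio is only 70:30 on average. Below is a minimal sketch of one way around this, assuming MySQL 8.0 or later (window functions) and that costumer may hold several rows per user_id. The aliases d, u, rn and total are illustrative names, not part of the original query. It shuffles the distinct user_id values once, labels the first 70% of them as training, and joins that label back to every row.

```sql
-- Assumption: MySQL 8.0+; costumer may contain several rows per user_id.
SELECT A.*,
       CASE
           WHEN u.rn <= 0.7 * u.total THEN 'training'
           ELSE 'test'
       END AS split
FROM costumer A
JOIN (
    -- One row per user: a random position (rn) and the number of users (total).
    SELECT d.user_id,
           ROW_NUMBER() OVER (ORDER BY RAND()) AS rn,
           COUNT(*) OVER () AS total
    FROM (SELECT DISTINCT user_id FROM costumer) AS d
) AS u ON u.user_id = A.user_id
ORDER BY A.user_id;
```

With this approach the 70:30 ratio holds for distinct users (rounded down to a whole number of users), not necessarily for rows, since users can have different numbers of rows. Rows with a NULL user_id would be dropped by the join, and the assignment changes on every run because RAND() is not seeded. If a repeatable but only approximate split is acceptable, a condition such as MOD(CRC32(user_id), 10) < 7 in the CASE expression is a possible alternative.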



