Tuesday, 13 June 2023

Randomly sample data based on other columns using Python

I have a dataframe with 1 lakh (100,000) rows containing Country, State, bill_ID, item_id, dates, and other columns. I want to randomly sample 5k rows out of the 1 lakh rows, and the sample should include at least one bill_ID from every country and state. In short, it should cover all countries and states with at least one bill_ID each.

Note: each bill_ID contains multiple item_id values.

I am testing on sampled data, which should cover all unique countries and states with their bill_IDs.
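One possible approach, sketched here with pandas, is to first draw one random row for every (Country, State) pair so that coverage is guaranteed, and then fill the rest of the sample at random from the remaining rows. The function name `sample_covering_groups` and its parameters are hypothetical, and the sketch assumes the dataframe has a unique index and that sampling is done at row level rather than keeping every item of a bill together.

```python
import pandas as pd


def sample_covering_groups(df, n, group_cols=("Country", "State"), seed=0):
    """Return n random rows that include at least one row per group.

    Hypothetical helper: assumes df has a unique index.
    """
    group_cols = list(group_cols)

    # Step 1: one random row per (Country, State) pair guarantees coverage.
    guaranteed = df.groupby(group_cols, group_keys=False).sample(
        n=1, random_state=seed
    )
    if len(guaranteed) > n:
        raise ValueError("n is smaller than the number of distinct groups")

    # Step 2: fill the rest of the sample from the rows not yet chosen.
    remaining = df.drop(guaranteed.index)
    n_extra = min(n - len(guaranteed), len(remaining))
    extra = remaining.sample(n=n_extra, random_state=seed)

    return pd.concat([guaranteed, extra]).sort_index()
```

For the dataframe described above this would be called as `sample_covering_groups(df, n=5000)`. If whole bills need to stay together, the same idea could be applied to the unique bill_ID values instead of to individual rows, although the final row count would then only be approximately 5k.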



