I need to get a random sample of my sales dataset, and I have done that effectively using the CAST, CHECKSUM and NEWID() method.
However, my dataset is a sales table in which the same transaction ID can appear on roughly 1 to 10 rows, because one transaction can include multiple products. My current method samples individual rows, so it doesn't pull all the rows that share a transaction ID. E.g. it will only pull 1 row for transaction ID 17381 instead of all 6 rows.
I want to create a random sample that includes every row for each sampled transaction ID, so that I have complete information for each transaction in the sample.
How can I do this?
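
One possible approach, offered as a sketch rather than a definitive answer: sample at the transaction level instead of the row level. That is, pick random values from the distinct transaction IDs first, then join back to the table to pull every row for the picked IDs. The example below uses Python's built-in sqlite3 module only so that it runs anywhere. The table name `sales` and the columns `transaction_id` and `product` are hypothetical stand-ins for the real schema. In SQL Server the same query shape should carry over, with `TOP (n)` and `ORDER BY NEWID()` taking the place of SQLite's `LIMIT` and `ORDER BY RANDOM()`.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with 20 transactions of 1 to 6 rows each.

    The schema is made up for illustration; it is not the real sales table.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (transaction_id INTEGER, product TEXT)")
    rows = [
        (tid, f"product_{i}")
        for tid in range(1, 21)
        for i in range(1 + tid % 6)
    ]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def sample_whole_transactions(conn, n_transactions):
    """Return every row belonging to n randomly chosen transaction IDs."""
    query = """
        SELECT s.transaction_id, s.product
        FROM sales AS s
        JOIN (
            -- Sample from the distinct IDs, so each transaction is
            -- either fully in the sample or fully out of it.
            SELECT transaction_id
            FROM (SELECT DISTINCT transaction_id FROM sales)
            ORDER BY RANDOM()
            LIMIT ?
        ) AS picked
          ON picked.transaction_id = s.transaction_id
        ORDER BY s.transaction_id, s.product
    """
    return conn.execute(query, (n_transactions,)).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for row in sample_whole_transactions(demo, 3):
        print(row)
```

Because the random ordering is applied to distinct IDs rather than to rows, a transaction with 6 rows has the same chance of being picked as a transaction with 1 row, and once picked it always comes back with all of its rows. The sample size is therefore controlled in transactions, not in rows, so the number of rows returned will vary from run to run.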