Monday, 24 August 2020

Generating random synthetic payment data

I have seen people create fake data sets, but they usually contain no repeated people whose bills can be matched up.

For example, I want 100 people's names, and among those 100 people some have 2 or more telephone accounts while others have 1, and the bill is constant throughout the year for each account.

Basically, for a given name in a random list of names:

  • an ID number that matches that specific name
  • 1 or more telcos per name
  • a bill amount that matches the name and telco
  • a date for every month of the bill

For example:

name, id_number, telco, bill amount, date of bill
john, X123, vodafone, 40, 1/1/2020
john, X123, telecomz, 25, 1/1/2020
john, X123, vodafone, 40, 1/2/2020
Gerald, T124, vodafone, 22, 1/1/2020
Gerald, T124, vodafone, 22, 1/2/2020
tim, A555, TPG, 100, 1/1/2020
tim, A555, TPG, 100, 1/2/2020

How can I create such a synthetic dataset?
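One possible approach, sketched in Python with only the standard library, is to build the data in two steps: first fix each person's ID, telcos and bill amounts, then expand every account into one row per month. Everything here other than the column layout and the three telco names from the example is an assumption made for illustration: the function names `make_people` and `make_rows`, the placeholder names `person_001` and so on, the ID pattern (one letter plus three digits), the bill range of 20 to 100, the cap of 3 accounts per person, and reading the example dates as day/month/year.

```python
import csv
import random
import string
import sys

TELCOS = ["vodafone", "telecomz", "TPG"]  # telco names taken from the example rows


def make_people(n_people, rng, max_accounts=3):
    """Assign each person one unique ID and 1..max_accounts telcos with a fixed bill."""
    people = []
    used_ids = set()
    for i in range(n_people):
        name = f"person_{i + 1:03d}"  # placeholder; substitute any list of names
        # Redraw until the ID is unused, so an ID never maps to two people.
        while True:
            id_number = rng.choice(string.ascii_uppercase) + f"{rng.randint(0, 999):03d}"
            if id_number not in used_ids:
                used_ids.add(id_number)
                break
        n_accounts = rng.randint(1, min(max_accounts, len(TELCOS)))
        # sample() picks distinct telcos, so each (person, telco) pair appears once.
        accounts = [(telco, rng.randint(20, 100)) for telco in rng.sample(TELCOS, n_accounts)]
        people.append((name, id_number, accounts))
    return people


def make_rows(people, year=2020):
    """Expand every account into 12 monthly rows with the same bill amount."""
    rows = []
    for name, id_number, accounts in people:
        for month in range(1, 13):
            for telco, amount in accounts:
                # Assumed day/month/year, matching "1/1/2020", "1/2/2020" in the example.
                rows.append((name, id_number, telco, amount, f"1/{month}/{year}"))
    return rows


rng = random.Random(42)  # fixed seed so the same data is generated on every run
rows = make_rows(make_people(100, rng))

writer = csv.writer(sys.stdout)
writer.writerow(["name", "id_number", "telco", "bill_amount", "date_of_bill"])
writer.writerows(rows[:5])  # preview only the first few rows
```

Because the ID and the bill amounts are drawn once per person and per account, before the monthly expansion, they stay constant across all twelve rows of an account. To write the full data set to a file instead of previewing it, the same `csv.writer` can be pointed at a file opened with `newline=""`.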



