Friday, 4 May 2018

How to generate mixed data and keep the relationship (correlation) between columns?

I'm working on clustering for mixed data. To test my algorithm, I need to run some simulations on generated data. I know how to generate a numerical attribute using rnorm, and for a categorical attribute I could perhaps use sample on letters. The problem is how to build a relationship between the columns (numerical and categorical attributes). I can't just generate random values for each attribute independently, because then the attributes have no relationship to each other, and the relationship must make sense. For example, if I just generate random values for a product variable and a price variable, I could get:

product  price
pen      $500
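
To make the problem concrete, here is a minimal sketch in R of the naive generation described above, where each column is drawn on its own. The product names, the sample size and the rnorm parameters are all made up for illustration.

```r
set.seed(42)  # fixed seed so the simulation is reproducible
n <- 200      # assumed sample size

# Hypothetical product names, only for illustration
products <- c("pen", "notebook", "laptop")

# Categorical column: sampled uniformly, with replacement
product <- sample(products, n, replace = TRUE)

# Numerical column: drawn with no reference to `product`
price <- rnorm(n, mean = 250, sd = 100)

naive <- data.frame(product = product, price = price)

# Mean price per product: every group is drawn from the same
# distribution, so price carries no information about product
aggregate(price ~ product, data = naive, FUN = mean)
```

Because price ignores product, a row pairing a pen with a price near $500 is as likely as one pairing a laptop with it.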

That doesn't make sense, right? The relationship would be messed up. Any suggestions?
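
One direction that might work, offered only as a rough sketch and not a definitive answer, is to draw the categorical column first and then draw the numerical column conditional on it, with a separate mean and standard deviation per category. The `params` table and all of its values are assumptions invented for this example.

```r
set.seed(42)  # fixed seed so the simulation is reproducible
n <- 200      # assumed sample size

# Hypothetical per-product price parameters (assumed values, not real data)
params <- data.frame(
  product = c("pen", "notebook", "laptop"),
  mean    = c(2, 5, 800),
  sd      = c(0.5, 1, 150),
  stringsAsFactors = FALSE
)

# Step 1: draw the categorical column
product <- sample(params$product, n, replace = TRUE)

# Step 2: look up each row's parameters, then draw the price
# conditional on that row's product (rnorm is vectorised over mean and sd)
idx   <- match(product, params$product)
price <- rnorm(n, mean = params$mean[idx], sd = params$sd[idx])

linked <- data.frame(product = product, price = price)

# Mean price per product now differs clearly between groups
aggregate(price ~ product, data = linked, FUN = mean)
```

With this construction the clusters are defined by the category, so it is only suitable if that is the structure the clustering algorithm is supposed to recover.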



