I'm working on clustering for mixed data. To test my algorithm, I need to run some simulations on generated data. I know how to generate a numerical attribute using rnorm, and for a categorical attribute I could perhaps use sample on letters. The problem is creating a relationship between the columns (numerical and categorical attributes). I can't just generate random values for attributes that have no relationship to each other; the relationship must make sense. For example, if I just generate random values for a product variable and a price variable, I might get:
product price
pen $500
That doesn't make sense, right? The relationship between the columns would be meaningless. Any suggestions?
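
One possible approach, sketched here under assumptions rather than offered as a definitive answer: draw the categorical column first, then draw the numerical column from a distribution whose parameters depend on the category. The product names, means, standard deviations, and sampling probabilities below are all made up for illustration, and `price_params` and `sim` are hypothetical names.

```r
set.seed(42)  # make the simulation reproducible

# Hypothetical per-product price parameters (illustrative values, not real data)
price_params <- data.frame(
  product = c("pen", "notebook", "backpack"),
  mean    = c(2, 5, 40),
  sd      = c(0.5, 1, 8),
  stringsAsFactors = FALSE
)

n <- 300

# Step 1: draw the categorical attribute (assumed, unequal category frequencies)
product <- sample(price_params$product, size = n, replace = TRUE,
                  prob = c(0.5, 0.3, 0.2))

# Step 2: look up each row's parameters and draw the price conditional on product
idx   <- match(product, price_params$product)
price <- rnorm(n, mean = price_params$mean[idx], sd = price_params$sd[idx])
price <- pmax(price, 0.01)  # guard against non-positive prices

sim <- data.frame(product = factor(product), price = round(price, 2))

# Sanity check: the group means should sit near the chosen means
aggregate(price ~ product, data = sim, FUN = mean)
```

Because the price is generated conditionally on the product, a pen ends up with a pen-like price instead of $500. For a clustering test, the same idea can be applied one level up: draw a hidden cluster label first, then let that label set both the category probabilities and the numerical means, so the true cluster of each row is known.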