There are two columns in the dataset, user_id and site_name respectively. It records every site name that every user browsed. Now I want to reconstruct random network and meanwhile ensure a person with n sites in the observed network will have also have n sites in the randomized network. The means of numpy.random.shuffle in python is of low efficiency because of the large amount of data. I am wondering if we can use means of a Monte Carlo algorithm in Python and reduce computational burden?
Aucun commentaire:
Enregistrer un commentaire