Tuesday, May 8, 2018

Randomly Sample Data from TFIDF Vectorizer with reduced dimensions

`import logging
import random

logging.basicConfig(level=logging.DEBUG)

tf_sample = random.sample(list(tf_idf_vect), 10000) `

`tf_idf_vect` is a 364K x 291K matrix, where 291K is the dimensionality of each vector. I want to take only a random subset of the rows (10,000 in the attempt above). How can I do this?
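One possible approach, sketched here under the assumption that `tf_idf_vect` is a SciPy sparse matrix (which is what scikit-learn's `TfidfVectorizer.fit_transform` returns): draw random row indices with NumPy and index the matrix with them, rather than converting it to a Python list. The helper name `sample_rows` and the small `demo` matrix are hypothetical stand-ins for illustration, not part of the original code.

```python
import numpy as np
from scipy import sparse


def sample_rows(matrix, n_rows, seed=None):
    """Return n_rows randomly chosen rows of a sparse matrix, plus their indices.

    Hypothetical helper: assumes `matrix` is a SciPy sparse matrix.
    """
    # CSR format supports efficient row indexing.
    matrix = matrix.tocsr()
    rng = np.random.default_rng(seed)
    # Sample row indices without replacement, then sort them so the
    # sampled rows keep their original relative order.
    idx = rng.choice(matrix.shape[0], size=n_rows, replace=False)
    idx.sort()
    return matrix[idx], idx


# Small stand-in for the real 364K x 291K TF-IDF matrix (assumption).
demo = sparse.random(1000, 500, density=0.01, format="csr", random_state=0)
sample, idx = sample_rows(demo, 100, seed=42)
```

With the real data, the call would presumably look like `tf_sample, rows = sample_rows(tf_idf_vect, 10000)`. The result stays sparse, so the full matrix is never densified.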



