I am trying to select which lines to read from a CSV file. The selection is based on:
import random
n = 123456789
s = n // 10
skip = sorted(random.sample(range(1, n + 1), k=n - s))  # to be used with skiprows in pd.read_csv
This statement freezes the computer because of the large amount of RAM it consumes.
Is there a more memory-efficient way to select the rows to be skipped?
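One possible direction, as a rough sketch rather than a tested solution: sample the s rows to keep instead of the n - s rows to skip, and hand pd.read_csv a callable for skiprows. The helper name make_skip_predicate and the seed parameter are made up for illustration, and row 0 is assumed to be the header.

```python
import random

def make_skip_predicate(n, s, seed=None):
    """Return a function that is True for row indices that should be skipped.

    Hypothetical helper: it samples the s rows to KEEP (the small set)
    instead of building a sorted list of the n - s rows to skip.
    """
    rng = random.Random(seed)
    keep = set(rng.sample(range(1, n + 1), k=s))
    # Row 0 (assumed header) and rows beyond n are never skipped,
    # matching a skip list drawn from range(1, n + 1).
    return lambda i: 1 <= i <= n and i not in keep

# Small demonstration with n = 100, s = 10
pred = make_skip_predicate(100, 10, seed=0)
kept_rows = [i for i in range(1, 101) if not pred(i)]
print(len(kept_rows))
```

With pandas this would be used as pd.read_csv(path, skiprows=make_skip_predicate(n, s)), where path stands for the CSV file. The set holds about one ninth as many indices as the skip list and needs no sorting, but the callable runs once per row in Python, so reading may be slower. If an approximate 10% sample is acceptable, a predicate such as lambda i: i > 0 and random.random() > 0.1 avoids storing any indices at all.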