I'm facing a problem in reading random rows from a large csv file and moving it to another csv file using 0.18.1 pandas and 2.7.10 Python on Windows.
This is the code i used:
import random
file_size = 100
f = open("customers.csv",'r')
o = open("train_select.csv", 'w')
for i in range(0,50):
offset = random.randrange(file_size)
f.seek(offset)
f.readline()
random_line = f.readline()
o.write(random_line)
The current output looks something like this:
2;flhxu-name;tum-firstname; 17520;buo-city;1966/04/24;wfyz-street; 96;GA;GEORGIA
1;jwcdf-name;fsj-firstname; 13520;oem-city;1954/02/07;amrb-street; 145;AK;ALASKA
1;jwcdf-name;fsj-firstname; 13520;oem-city;1954/02/07;amrb-street; 145;AK;ALASKA
My problems are 2 fold: 1. I want to see the header also in the second csv and not just the rows. 2. A row should be selected by random function only once.
The output should be something like this:
id;name;firstname;zip;city;birthdate;street;housenr;stateCode;state
2;flhxu-name;tum-firstname; 17520;buo-city;1966/04/24;wfyz-street; 96;GA;GEORGIA
1;jwcdf-name;fsj-firstname; 13520;oem-city;1954/02/07;amrb-street; 145;AK;ALASKA
Some help would be much appreciated.
Aucun commentaire:
Enregistrer un commentaire