Friday, 21 October 2016

Read random rows from a CSV file and write only those rows to another CSV in Python

I'm having trouble reading random rows from a large CSV file and writing them to another CSV file, using pandas 0.18.1 and Python 2.7.10 on Windows.

This is the code I used:

import random

file_size = 100

f = open("customers.csv", 'r')
o = open("train_select.csv", 'w')

for i in range(0, 50):
    offset = random.randrange(file_size)
    f.seek(offset)
    f.readline()
    random_line = f.readline()
    o.write(random_line)

The current output looks something like this:

2;flhxu-name;tum-firstname; 17520;buo-city;1966/04/24;wfyz-street;   96;GA;GEORGIA
1;jwcdf-name;fsj-firstname; 13520;oem-city;1954/02/07;amrb-street; 145;AK;ALASKA
1;jwcdf-name;fsj-firstname; 13520;oem-city;1954/02/07;amrb-street; 145;AK;ALASKA
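A possible reason the same two rows keep coming back: `file_size` is hard-coded to 100, so every random offset lands in the header or the first data row, and the `readline()` that follows can only ever return the first or second data row. The sketch below illustrates this with an in-memory stand-in for `customers.csv`. The helper name `lines_reachable` and the third data row are made up for illustration; the real file is assumed to look like the rows shown above.

```python
import io

# Stand-in for customers.csv (assumption: the third data row is invented).
data = (
    b"id;name;firstname;zip;city;birthdate;street;housenr;stateCode;state\n"
    b"2;flhxu-name;tum-firstname; 17520;buo-city;1966/04/24;wfyz-street;   96;GA;GEORGIA\n"
    b"1;jwcdf-name;fsj-firstname; 13520;oem-city;1954/02/07;amrb-street; 145;AK;ALASKA\n"
    b"3;abcde-name;xyz-firstname; 10001;qrs-city;1970/01/01;lmno-street;  12;NY;NEW YORK\n"
)

def lines_reachable(raw, file_size):
    """Return every line the seek/readline approach can produce
    when offsets are drawn from range(file_size)."""
    f = io.BytesIO(raw)
    seen = set()
    for offset in range(file_size):
        f.seek(offset)
        f.readline()            # discard the (possibly partial) line at the offset
        seen.add(f.readline())  # this is the line that would be written out
    return seen

reachable = lines_reachable(data, 100)
for line in sorted(reachable):
    print(line)
```

If this is what is happening, setting `file_size` to the real size of the file (for example via `os.path.getsize`) would widen the range of rows, but it would still allow duplicates and would favour rows that follow long lines.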

My problem is twofold: 1. I want the header to appear in the second CSV as well, not just the rows. 2. Each row should be selected by the random function only once.

The output should be something like this:

id;name;firstname;zip;city;birthdate;street;housenr;stateCode;state
2;flhxu-name;tum-firstname; 17520;buo-city;1966/04/24;wfyz-street;   96;GA;GEORGIA
1;jwcdf-name;fsj-firstname; 13520;oem-city;1954/02/07;amrb-street; 145;AK;ALASKA
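One way the desired output could be produced, as a minimal sketch rather than a definitive solution: read the header separately, then pick rows with `random.sample`, which samples without replacement. The function name `sample_rows` and its `seed` parameter are hypothetical, and the sketch assumes the file fits in memory; for a file too large for that, a streaming technique such as reservoir sampling would be needed instead.

```python
import random

def sample_rows(src_path, dst_path, n, seed=None):
    """Copy the header plus n distinct random rows from src_path to dst_path."""
    rng = random.Random(seed)
    with open(src_path, "r") as src:
        header = src.readline()   # first line is assumed to be the header
        rows = src.readlines()    # assumption: remaining rows fit in memory
    # Make sure every row ends with a newline so sampled rows never run together.
    rows = [r if r.endswith("\n") else r + "\n" for r in rows]
    # random.sample draws without replacement, so no row position is picked twice.
    chosen = rng.sample(rows, min(n, len(rows)))
    with open(dst_path, "w") as dst:
        dst.write(header)
        dst.writelines(chosen)
    return chosen
```

With the file names from the question, the call would be `sample_rows("customers.csv", "train_select.csv", 50)`. Note that two rows with identical content at different positions in the source file can still both be selected, since uniqueness here is by position.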

Some help would be much appreciated.
