I have a fairly large (about 1 GB) CSV file compressed with bz2. I want to sample a specific number of lines (not just a percentage), and if at all possible I'd like to do it without decompressing the bz2 file first. I'm not sure this is possible: I looked through the bz2 module, but it mostly seemed intended for reading the entire file, and I don't think BZ2File.readlines() would serve my purpose. I also don't know how many lines the CSV contains, and I couldn't find a way to get that from the compressed file. Could someone point me in the right direction?
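For reference, one possible approach is reservoir sampling over a streamed read. The sketch below assumes Python 3, a file whose first line is a header, and rows that contain no embedded newlines inside quoted fields. The function name `sample_lines` and its parameters are made up for illustration. `bz2.open` decompresses incrementally as the file is iterated, so nothing is written to disk and the total line count does not need to be known in advance, although the whole file is still read once.

```python
import bz2
import random


def sample_lines(path, k, seed=None, skip_header=True):
    """Return (header, sample), where sample holds up to k uniformly chosen lines.

    Hypothetical helper: the name and parameters are illustrative only.
    Uses reservoir sampling (Algorithm R), so only k lines are kept in memory.
    """
    rng = random.Random(seed)
    reservoir = []
    # Text mode streams the decompressed data line by line.
    with bz2.open(path, "rt", newline="") as f:
        header = next(f, None) if skip_header else None
        for i, line in enumerate(f):
            if i < k:
                # Fill the reservoir with the first k data lines.
                reservoir.append(line)
            else:
                # Keep the new line with probability k / (i + 1),
                # replacing a uniformly chosen existing entry.
                j = rng.randint(0, i)
                if j < k:
                    reservoir[j] = line
    return header, reservoir
```

Calling `sample_lines("data.csv.bz2", 1000)` (with a placeholder file name) would return the header line and up to 1000 sampled lines, which could then be passed to `csv.reader`. If the file has fewer than `k` data lines, all of them are returned. The sampled lines are not in their original file order once replacements have happened.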