I have several lists of data in Python:
a = [2,45,1,3]
b = [4,6,3,6,7,1,37,48,19]
c = [45,122]
total = [a,b,c]
I want to get n random indexes from them:
n = 7
# some code
result = [[1,3], [2,6,8], [0,1]]
This means I take the elements at indexes 1 and 3 from the first list, 2, 6 and 8 from the second list, and 0 and 1 from the third list.
So my idea is to generate random indexes over all the lists together:
import random
n = 7
total_len = sum(len(el) for el in total)
inds = random.sample(range(total_len), n)
But how do I then get the per-list indexes from these? I thought about using np.cumsum() and shifting the indexes after that, but I can't find an elegant solution...
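One way the cumulative-sum idea could look, as a minimal sketch: the running totals of the list lengths act as boundaries, and a binary search finds which list each global index falls into. It uses itertools.accumulate and bisect from the standard library in place of np.cumsum and np.searchsorted; the name split_indexes is made up for illustration.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def split_indexes(lengths, inds):
    """Map global indexes into per-list local indexes."""
    # bounds[k] is the total number of elements in lists 0..k
    bounds = list(accumulate(lengths))
    result = [[] for _ in lengths]
    for g in sorted(inds):
        k = bisect_right(bounds, g)          # which list g falls into
        start = bounds[k - 1] if k else 0    # global index where list k begins
        result[k].append(g - start)          # shift to a local index
    return result


a = [2, 45, 1, 3]
b = [4, 6, 3, 6, 7, 1, 37, 48, 19]
c = [45, 122]
total = [a, b, c]

n = 7
lengths = [len(el) for el in total]
inds = random.sample(range(sum(lengths)), n)
result = split_indexes(lengths, inds)
```

For instance, split_indexes([4, 9, 2], [1, 3, 6, 10, 12, 13, 14]) gives [[1, 3], [2, 6, 8], [0, 1]], which is the example result from the question.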
P.S. What I actually need this for is loading data from several CSV files using the skiprows option of pd.read_csv. The idea is to get the indexes for every file, which lets me load only the necessary rows from each file. So my real task: I have several CSV files of different lengths and need to get n random rows from them. My idea:
lengths = my_func_to_get_lengths_for_every_csv(paths) # list of lengths
# generate a random subsample of indexes
skip = ...
for ind, fil in enumerate(files):
    pd.read_csv(fil, skiprows=skip[ind])
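A rough sketch of how the skip lists might be built, under two assumptions that are not stated in the question: every file has exactly one header line, and lengths counts data rows only. Since skiprows takes the line numbers to skip, each file's list is the complement of its sampled rows. The function name skiprows_per_file and the example lengths are made up.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def skiprows_per_file(lengths, n, header_lines=1, rng=random):
    """Return, for each file, the line numbers to pass as skiprows."""
    bounds = list(accumulate(lengths))       # running totals of data rows
    keep = [set() for _ in lengths]
    for g in rng.sample(range(bounds[-1]), n):
        k = bisect_right(bounds, g)          # file that global row g belongs to
        keep[k].add(g - (bounds[k - 1] if k else 0))
    # Data row r sits on file line r + header_lines; skip every row not sampled
    return [
        [r + header_lines for r in range(length) if r not in kept]
        for length, kept in zip(lengths, keep)
    ]


lengths = [4, 9, 2]   # made-up data-row counts, one per CSV file
skip = skiprows_per_file(lengths, n=7)

# Hypothetical usage, assuming pandas is imported as pd and files lists the paths:
# frames = [pd.read_csv(fil, skiprows=skip[ind]) for ind, fil in enumerate(files)]
```

Sampling over the combined row count means longer files contribute proportionally more rows, and a short file may end up with none selected, in which case all of its data lines are skipped.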