Friday, 24 September 2021

Randomly selecting subsamples from a dataframe using Python

I have a dataframe df. I would like to randomly select 6 subsamples, each of a fixed size (=100) and with no repeated values. So far, I have written the following code:

df_ = df.sample(n=6000)
n = 6  # number of subsamples needed
size = int(df_.shape[0]/n)
chunks = list()
for i in range(0, df.shape[0], size):
    chunks.append(df.iloc[i:i+size])

But when I select a sample, say subsample_1 = chunks[1], the results are not random but in order. Any advice on how to select 6 random subsamples from the given df that contain no repeated data?
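
A likely cause of the ordered output is that the loop slices the original df rather than the shuffled df_, so each chunk is just a consecutive block of the original rows. A minimal sketch of one possible fix follows: draw all the needed rows in a single call to sample without replacement, then cut that shuffled frame into equal pieces. The example dataframe, the column name value, and the variable names n_subsamples, subsample_size, shuffled and subsamples are assumptions for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's df (any dataframe with >= 600 rows works)
df = pd.DataFrame({"value": np.arange(10_000)})

n_subsamples = 6      # number of subsamples wanted
subsample_size = 100  # fixed number of rows per subsample

# A single draw without replacement means no row can land in two subsamples.
# random_state is fixed here only to make the sketch reproducible.
shuffled = df.sample(n=n_subsamples * subsample_size, replace=False, random_state=0)

# Slice the shuffled frame (not the original df) into equal, disjoint pieces
subsamples = [
    shuffled.iloc[i * subsample_size:(i + 1) * subsample_size]
    for i in range(n_subsamples)
]

subsample_1 = subsamples[0]
```

Each element of subsamples then holds 100 rows in random order, and no row index is shared between them. Note that this guarantees distinct rows, not distinct values: if df itself contains duplicated values, they would need to be dropped first, for example with drop_duplicates.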



