Sunday, December 20, 2020

R: How to randomly sample a dataframe so that the result is ANOTHER dataframe

This code produces row indices (pointers) into a dataframe:

index <- sample(1:nrow(Data),round(0.70*nrow(Data)))
Train <- Data[index,]
Test <- Data[-index,]

How do I draw a random sample from a dataframe so that the result is itself a dataframe?

For example I could run the above code on that new dataframe, so I would be doing a 70/30 split on, say, 40% of the original data.
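A minimal sketch of that two-step idea, assuming base R only. The `Data` built here is a hypothetical stand-in for the real dataframe, and `sub_index` and `Subset` are illustrative names that do not appear in the question:

```r
# Hypothetical stand-in for Data; any data frame should behave the same way.
set.seed(42)  # only so that the sketch is reproducible
Data <- data.frame(id = 1:100, x = rnorm(100))

# Step 1: draw 40% of the rows at random.
# Subsetting by row index returns complete rows as a data frame.
sub_index <- sample(nrow(Data), round(0.40 * nrow(Data)))
Subset <- Data[sub_index, ]

# Step 2: apply the same 70/30 split to that subset.
index <- sample(nrow(Subset), round(0.70 * nrow(Subset)))
Train <- Subset[index, ]
Test  <- Subset[-index, ]
```

One caveat: for a base `data.frame` with a single column, `Data[sub_index, ]` drops to a plain vector unless `drop = FALSE` is added. Tibbles do not drop.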

In response to the comment, I am looking for a random sample of complete rows.

In the above code, index is a vector of integers (not a dataframe), and Data is a dataframe:

class(index)
[1] "integer"
class(Data)
[1] "spec_tbl_df" "tbl_df"      "tbl"         "data.frame" 
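As a quick check, the sketch below runs the same split on the built-in `mtcars` dataset, used here only as an assumed stand-in for `Data`. It suggests that the subsets are data frames and that only `index` is an integer vector:

```r
# mtcars ships with base R and stands in for the poster's Data here.
Data <- mtcars
index <- sample(nrow(Data), round(0.70 * nrow(Data)))
Train <- Data[index, ]
Test  <- Data[-index, ]

class(index)          # the row positions, not the rows themselves
is.data.frame(Train)  # the selected rows
is.data.frame(Test)   # the remaining rows
```

With a tibble such as the `spec_tbl_df` shown above, the subsets should still inherit from `data.frame`, although the exact class vector may differ from the base R case.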

Thanks for any help.



