Thursday, September 15, 2016

Add an exact proportion of random missing values to a data.frame

I would like to add random NA values to a data.frame in R. So far I've looked into these questions:

R: Randomly insert NAs into dataframe proportionaly

How do I add random NAs into a data frame

add random missing values to a complete data frame (in R)

Many solutions were provided there, but I couldn't find one that complies with all 5 of these conditions:

  • Add NAs truly at random, not the same number in each row or column.
  • Work with every class of variable that one can encounter in a data.frame (numeric, character, factor, logical, ts, ...), so that the output has the same format as the input data.frame or matrix.
  • Guarantee an exact number or proportion of NAs in the output (many solutions produce fewer NAs than requested because several are generated at the same position).
  • Be computationally efficient for big datasets.
  • Add the requested proportion/number of NAs independently of any NAs already present in the input.
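
To make the five conditions concrete, here is a minimal sketch of one way they might be met. It is an illustration only, not a method taken from the linked questions. The function name `add_exact_na` and its arguments `prop` and `n` are hypothetical. It also rests on two assumptions: `prop` is read as a fraction of all cells in the input, and the new NAs are drawn only from cells that are not already NA, so the requested number is always added on top of the existing ones.

```r
# Hypothetical helper: the name and arguments are illustrative only.
# Assumption: `prop` is a fraction of ALL cells; new NAs are placed only
# in cells that are not already NA, so exactly `n` NAs are added.
add_exact_na <- function(x, prop = NULL, n = NULL) {
  stopifnot(is.data.frame(x) || is.matrix(x))
  stopifnot(!is.null(prop) || !is.null(n))
  nr <- nrow(x)

  # Linear (column-major) indices of the cells that are currently not NA.
  candidates <- which(!is.na(x))

  if (is.null(n)) n <- round(prop * nr * ncol(x))
  stopifnot(n <= length(candidates))

  # Sampling without replacement guarantees exactly n distinct cells.
  picked <- candidates[sample.int(length(candidates), n)]

  if (is.matrix(x)) {
    x[picked] <- NA
    return(x)
  }

  # For a data.frame, convert linear indices back to (row, column) and
  # assign column by column so that each column keeps its class.
  col_of <- (picked - 1L) %/% nr + 1L
  row_of <- (picked - 1L) %% nr + 1L
  rows_by_col <- split(row_of, col_of)
  for (j in names(rows_by_col)) {
    jj <- as.integer(j)
    x[[jj]][rows_by_col[[j]]] <- NA
  }
  x
}
```

For example, `add_exact_na(df, prop = 0.1)` or `add_exact_na(df, n = 50)` would return an object of the same shape and column classes as `df`. Because the cells are sampled over the whole table rather than per row or per column, the NAs are not spread evenly, and the cost is dominated by one pass over the data plus one assignment per affected column.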

Does anyone have an idea? Thanks.



