Sunday, 25 June 2017

Sampling a large number of data frames and writing them directly to the hard drive

Following up on a question I asked earlier, I want to sample a large number of data frames that together would exceed the working memory (RAM) of my machine (16 GB). To get around the memory limit, I have heard that it is possible to write directly to the hard drive, but I have not the faintest idea how to do this in R.

Here is an example:

## !!CAUTION: THIS CODE COULD CRASH YOUR MACHINE!!

## Number of random data frames to create:
n <- 1e5

## Sample vector of seeds:
initSeed <- 1234
set.seed(initSeed)
seedVec <- sample.int(n = 1e8, size = n, replace = FALSE)

## loop:
lst <- lapply(1:n, function(i){
  set.seed(seedVec[i])
  a <- rnorm(5e4, .9, .05)
  b <- sample(8:200, 5e4, replace = TRUE)
  c <- rnorm(5e4, 80, 30)
  d <- c^2
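  ## size added so e and f are drawn per row rather than recycled from 2 values
  e <- sample(0:1, size = 5e4, prob = c(1 - .33, .33), replace = TRUE)
  f <- sample(0:1, size = 5e4, prob = c(.33, 1 - .33), replace = TRUE)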
  data.frame(a, b, c, d, e, f)
})
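
For scale: each data frame holds 5e4 rows of three double and three integer columns, roughly 1.8 MB, so 1e5 of them come to something on the order of 180 GB, far beyond 16 GB of RAM. One possible way to avoid building `lst` in memory is to save each data frame to its own file as soon as it is created and keep only the file paths. The following is a minimal sketch of that idea, not a definitive solution: `outDir`, `makeDf`, `rows` and the `df_%06d.rds` file naming are hypothetical choices made for illustration, `n` is reduced to 3 so the sketch runs quickly, and it assumes the disk has enough free space for the real run.

```r
## Sketch (assumption): write each data frame to disk instead of keeping it in a list.
## outDir, makeDf, rows and the file naming scheme are illustrative, not from the question.
outDir <- file.path(tempdir(), "sampled_dfs")
dir.create(outDir, showWarnings = FALSE, recursive = TRUE)

n    <- 3      # tiny value for demonstration; the question uses 1e5
rows <- 5e4    # rows per data frame, as in the question

## Sample vector of seeds, as in the question:
set.seed(1234)
seedVec <- sample.int(n = 1e8, size = n, replace = FALSE)

## Build one data frame from one seed.
makeDf <- function(seed, rows) {
  set.seed(seed)
  a <- rnorm(rows, .9, .05)
  b <- sample(8:200, rows, replace = TRUE)
  c <- rnorm(rows, 80, 30)
  d <- c^2
  e <- sample(0:1, rows, prob = c(1 - .33, .33), replace = TRUE)
  f <- sample(0:1, rows, prob = c(.33, 1 - .33), replace = TRUE)
  data.frame(a, b, c, d, e, f)
}

## Create, save and discard one data frame per iteration; only the paths are kept.
files <- vapply(seq_len(n), function(i) {
  path <- file.path(outDir, sprintf("df_%06d.rds", i))
  saveRDS(makeDf(seedVec[i], rows), path)
  path
}, character(1))

## Read a single data frame back only when it is needed:
df1 <- readRDS(files[1])
```

Because every data frame is generated from its own seed, any single file can also be regenerated later from `seedVec[i]` without touching the others. Only one data frame is held in memory at a time, so peak memory use stays near the size of a single data frame.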



