I have a data set with 98 variables and nearly 1.1 million observations. I want to compute the correlations between the variables, but because the data set is so large, R fails with a memory allocation error.
I then tried to draw a stratified sample from the data set so that I could compute the correlations on the sample instead. That failed with the same memory error: "Error: cannot allocate vector of size ... Mb".
So, how can I compute the correlation matrix of either the whole data set or a sample of it?
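One possible direction, as a rough sketch rather than a definitive answer: the Pearson correlation matrix only needs the column sums and the matrix of cross-products, and both can be accumulated a block of rows at a time, so no full-size copy of the data is ever made. The function name `chunked_cor` and the `chunk_size` parameter below are hypothetical names chosen for illustration. The sketch assumes every column is numeric and there are no missing values.

```r
# Hypothetical helper: Pearson correlation from chunk-wise sufficient statistics.
# Assumes x is a numeric matrix or an all-numeric data frame with no NAs.
chunked_cor <- function(x, chunk_size = 100000L) {
  p  <- ncol(x)
  n  <- 0                   # rows processed so far
  s  <- numeric(p)          # running column sums
  ss <- matrix(0, p, p)     # running cross-products, i.e. t(X) %*% X

  for (start in seq(1L, nrow(x), by = chunk_size)) {
    end   <- min(start + chunk_size - 1L, nrow(x))
    block <- as.matrix(x[start:end, , drop = FALSE])
    n  <- n + nrow(block)
    s  <- s + colSums(block)
    ss <- ss + crossprod(block)
  }

  # Covariance from the accumulated sums, then rescale to correlations.
  covmat <- (ss - tcrossprod(s) / n) / (n - 1)
  cov2cor(covmat)
}

# Small simulated check against the built-in cor()
set.seed(1)
x <- matrix(rnorm(5000 * 4), ncol = 4)
all.equal(chunked_cor(x, chunk_size = 700L), cor(x))
```

With 98 numeric columns, a 100,000-row block is roughly 78 MB of doubles, and the accumulated matrix is only 98 by 98. The sums-of-products formula can lose precision when column means are very large relative to their spread, so centering the columns first may be worth considering. If an approximate answer is enough, a simple random sample such as `cor(x[sample.int(nrow(x), 50000), ])` is a lighter alternative to a stratified one.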