Saturday, 2 May 2020

R: how to detect whether gaps in numeric sequences are random or contiguous?

I have many vectors of numeric data, some of which contain gaps (NAs). I need to detect whether those gaps are contiguous or scattered more or less at random within each vector. The following example illustrates the two cases:

# Let's create a couple of data vectors
x <- runif(1000)
y <- runif(1000)

# Let's add 100 NAs at random positions in x
x[sample(1:1000, 100, replace = FALSE)] <- NA
# Let's add 100 contiguous NAs to y
y[251:350] <- NA

# And get the respective summaries
summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
0.00294 0.24446 0.51441 0.50535 0.76200 0.99850     100 
summary(y)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
0.00325 0.22178 0.47765 0.48207 0.73380 0.99969     100

That is, both x and y have the same number of gaps, but in x they are scattered at random along the vector, while in y they form a single contiguous block. summary() cannot tell the two cases apart. How can I detect the difference?
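
One possible approach, offered as a minimal sketch rather than a definitive answer, is to look at the runs of consecutive NAs with base R's rle(). The helper name na_run_summary is hypothetical, and the z-score it returns is the usual normal approximation of the Wald-Wolfowitz runs test, which is an assumption about what "at random" should mean here. It also assumes the vector contains both NA and non-NA values.

```r
# Hypothetical helper: describe how the NAs in a vector are grouped.
na_run_summary <- function(v) {
  r <- rle(is.na(v))               # runs of consecutive NA / non-NA values
  na_runs <- r$lengths[r$values]   # keep the lengths of the NA runs only

  n1 <- as.numeric(sum(is.na(v)))  # number of NAs
  n2 <- as.numeric(sum(!is.na(v))) # number of observed values
  n_total_runs <- length(r$lengths)

  # Wald-Wolfowitz runs test (normal approximation): expected number of
  # runs and its variance if NAs were placed at random.
  mu <- 2 * n1 * n2 / (n1 + n2) + 1
  s2 <- 2 * n1 * n2 * (2 * n1 * n2 - n1 - n2) /
    ((n1 + n2)^2 * (n1 + n2 - 1))

  c(n_na    = n1,
    n_runs  = length(na_runs),                        # number of NA blocks
    longest = if (length(na_runs)) max(na_runs) else 0,
    z       = (n_total_runs - mu) / sqrt(s2))         # << 0 means clustered
}

x <- runif(1000)
y <- runif(1000)
x[sample(1:1000, 100, replace = FALSE)] <- NA
y[251:350] <- NA

na_run_summary(x)  # many short NA runs, z close to 0
na_run_summary(y)  # one NA run of length 100, z strongly negative
```

A strongly negative z means far fewer runs than random placement would produce, that is, the NAs are aggregated. The number of NA runs and the length of the longest run are often enough on their own to separate the two cases.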



