I have a numeric vector of integers and want to do two things with it:
- Transform the values into "bins".
- Use these bins as sampling frames from which I can then sample again, uniformly.
So far I can do both using findInterval, but I am looking for a way to do it with cut. Let's consider a random vector of integers that will be split into equally sized intervals of length 10:
df = sample(1:100,10)
df
[1] 81 11 38 95 45 14 10 61 96 88
Using findInterval I get the bins and an approximate way of sampling:
breaks <- seq(1, max(df + 1), by = 10)  # bin edges for the df above: 1, 11, ..., 91
b <- findInterval(df, breaks)
b
[1] 9 2 4 10 5 2 1 7 10 9
# If b is equal to 1 or 10, use ifelse() to prevent leaking outside [1,100]
sam <- round(runif(10,ifelse(b==1,10*b-9,10*b-10),ifelse(b==10,10*b,10*b+10)))
sam
[1] 85 14 39 94 50 16 7 63 93 85
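As a side note on why this is only approximate: rounding a continuous uniform draw gives the two end integers of the range half the weight of the interior ones. The sketch below illustrates this on a hypothetical range 0 to 10 (not the bins above) and compares it with a floor-based draw; the seed and sample size are arbitrary assumptions.

```r
set.seed(1)  # assumption: seed chosen only so the sketch is reproducible
n <- 1e5     # assumption: arbitrary sample size

# round() sends roughly [0, 0.5) to 0 and (9.5, 10] to 10,
# so both end values cover only half a unit of the range
rounded <- table(round(runif(n, 0, 10)))

# floor() + 1 gives every integer 1..10 a full unit-width slice
floored <- table(floor(runif(n, 0, 10)) + 1)

prop_rounded <- as.numeric(rounded) / n  # 11 values: 0, 1, ..., 10
prop_floored <- as.numeric(floored) / n  # 10 values: 1, ..., 10
```

Comparing prop_rounded with prop_floored should show the first and last entries of prop_rounded at about half the size of the others, while prop_floored stays roughly flat.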
Using cut I get the intervals:
breaks = seq(1,max(df+1),by=10)
cut(df,breaks,right=TRUE)
[1] (71,81] (1,11] (31,41] <NA> (41,51] (11,21] (1,11] (51,61] <NA> (81,91]
Levels: (1,11] (11,21] (21,31] (31,41] (41,51] (51,61] (61,71] (71,81] (81,91]
But I don't know how to use those values as intervals from which to sample.
If there is another approach, I would be interested to know!
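One possible direction, as a minimal sketch rather than a definitive answer: the factor returned by cut can be converted to integer codes, and those codes index back into breaks to recover each bin's bounds. The sketch assumes breaks running from 0 to 100 (unlike the breaks above, so that no value becomes NA) and draws integers with floor(runif(...)) + 1; the seed is an arbitrary assumption.

```r
set.seed(42)  # assumption: fixed seed only so the sketch is reproducible
df <- c(81, 11, 38, 95, 45, 14, 10, 61, 96, 88)

# Assumption: breaks cover 0..100 so every value in 1..100 falls in a bin
breaks <- seq(0, 100, by = 10)
bins <- cut(df, breaks, right = TRUE)

# The integer code of each factor value is the index of its bin
idx <- as.integer(bins)
lower <- breaks[idx]      # open lower bound of (lower, upper]
upper <- breaks[idx + 1]  # closed upper bound

# runif() is vectorised over min and max; floor() + 1 maps each draw
# onto the integers lower + 1, ..., upper with equal probability
sam <- floor(runif(length(df), lower, upper)) + 1
```

Each element of sam then lies in the same (lower, upper] interval as the corresponding element of df. Indexing breaks by the factor codes avoids having to parse the "(a,b]" labels as text.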