I've been asked to introduce simulated random variation into a set of continuous biomarkers, to see which of them are more robust to possible analytical variation.
Let's say we have the following initial values of biomk:
df = structure(list(biomk = c(4.97673374242057, 4.9600435079802, 4.73707525686803,
4.6737629774537, 5.12038615805537, 5.16421438202456, 5.94437293957413,
5.33464929579543, 5.12871458216186, 4.50424426739813)), row.names = c(NA,
-10L), class = "data.frame")
> df
biomk
1 4.976734
2 4.960044
3 4.737075
4 4.673763
5 5.120386
6 5.164214
7 5.944373
8 5.334649
9 5.128715
10 4.504244
Let's say we want 15% variation, stored in biomk15. The person doing this before me had coded:
set.seed(20)
seq_15 <- seq(from=-15, to=15, by=.01)
df$factor15<-sample(seq_15, size=10, replace=TRUE)
df$biomk15 <- df$biomk+((df$factor15*df$biomk)/100)
> df
biomk factor15 biomk15
1 4.976734 -13.35 4.312340
2 4.960044 -2.86 4.818186
3 4.737075 3.98 4.925611
4 4.673763 4.11 4.865855
5 5.120386 1.65 5.204873
6 5.164214 13.08 5.839694
7 5.944373 3.89 6.175609
8 5.334649 -9.60 4.822523
9 5.128715 -5.11 4.866637
10 4.504244 3.36 4.655587
This is a simple approach to simulate some random variations.
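For reference, the formula above is the same as multiplying each value by (1 + factor15/100), with the factor drawn from a grid on [-15, 15]. A minimal sketch of the continuous equivalent follows; `perturb_uniform` and its `pct` argument are hypothetical names, not part of the original code. Note that a uniform draw on ±15% has a standard deviation of 15/sqrt(3), about 8.66%, so it is not the same thing as a 15% CV.

```r
# Hypothetical helper (not from the original code): perturb each value by a
# relative error drawn uniformly from [-pct, +pct] percent.
perturb_uniform <- function(x, pct) {
  rel_err <- runif(length(x), min = -pct, max = pct) / 100
  x * (1 + rel_err)
}

set.seed(20)
biomk <- c(4.976734, 4.960044, 4.737075, 4.673763, 5.120386)
biomk15 <- perturb_uniform(biomk, pct = 15)

# Every perturbed value stays within 15% of its original value.
within_bounds <- all(abs(biomk15 / biomk - 1) <= 0.15)
within_bounds
```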
This approach comes from the idea of simulating some sort of "inter-assay" coefficient of variation (CV). Note that I compute it below as mean/sd, which is the inverse of the usual sd/mean definition. The issue is that the approach is limited to a (-15, +15) range and ignores the initial "intra-assay" CV:
# Original biomk CV
> (mean(df$biomk)/sd(df$biomk))
[1] 12.56766
# New biomk CV
> (mean(df$biomk15)/sd(df$biomk15))
[1] 9.047034
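As a small sketch, the two conventions can be kept apart with a helper; `cv` is a hypothetical name introduced here for illustration. The usual definition is sd/mean, and its reciprocal is the mean/sd quantity printed above.

```r
# Hypothetical helper: coefficient of variation in the usual sd/mean form.
cv <- function(x) sd(x) / mean(x)

biomk <- c(4.97673374242057, 4.9600435079802, 4.73707525686803,
           4.6737629774537, 5.12038615805537, 5.16421438202456,
           5.94437293957413, 5.33464929579543, 5.12871458216186,
           4.50424426739813)

cv_usual    <- cv(biomk)      # sd/mean, as a proportion
cv_inverted <- 1 / cv(biomk)  # mean/sd, the quantity reported in the post
c(sd_over_mean = cv_usual, mean_over_sd = cv_inverted)
```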
Of course, ideally these CV simulations should not be done in silico but with inter-lab data and so on, but I have to do it this way.
QUESTION: Do you see a way that this could be done better? Or a way to introduce this random variation while leaving the "intra-assay" CV unchanged, so that the new values would still have mean/sd = 12.56766?
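One possible sketch of what "leaving the CV unchanged" could mean in code: add Gaussian relative error, then linearly rescale the noisy values so that their sample mean and sd match the originals, which keeps mean/sd identical. The names `cv_assay` and `match_mean_sd` are assumptions made for illustration, and so is the choice of Gaussian error. Rescaling also shrinks or stretches the individual perturbations, so whether the result still represents a realistic analytical variation is a judgment call.

```r
set.seed(20)
biomk <- c(4.97673374242057, 4.9600435079802, 4.73707525686803,
           4.6737629774537, 5.12038615805537, 5.16421438202456,
           5.94437293957413, 5.33464929579543, 5.12871458216186,
           4.50424426739813)

# Assumed analytical CV of 15%, applied as Gaussian relative error.
cv_assay <- 0.15
biomk_noisy <- biomk * (1 + rnorm(length(biomk), mean = 0, sd = cv_assay))

# Hypothetical helper: standardise x_new, then give it the mean and sd of x_ref.
match_mean_sd <- function(x_new, x_ref) {
  z <- (x_new - mean(x_new)) / sd(x_new)
  mean(x_ref) + z * sd(x_ref)
}

biomk_matched <- match_mean_sd(biomk_noisy, biomk)

# mean/sd before and after: identical up to floating-point error.
c(original = mean(biomk) / sd(biomk),
  matched  = mean(biomk_matched) / sd(biomk_matched))
```

The values are still shuffled relative to each other by the noise; only the overall location and spread are forced back to those of the original sample.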
I'm not sure if it makes sense, but thanks anyway.