Thursday, 21 June 2018

R: replacing values in randomly selected fractions of observations

I am relatively new to R, and the solution to this problem is probably rather simple.

Let's imagine that I have a nest dataset for two bird species (a and b) like this:

df
year nestid sp  egg chick
2013    a1  a   2   1
2013    a2  a   NA  1
2013    a3  a   NA  0
2013    a4  a   NA  1
2013    a5  a   NA  0
2013    b1  b   2   0
2013    b2  b   NA  1
2013    b3  b   NA  2
2013    b4  b   NA  1
2014    a1  a   NA  1
2014    a2  a   NA  1
2014    a3  a   1   1
2014    a4  a   NA  1
2014    a5  a   NA  1
2014    b1  b   NA  1
2014    b2  b   NA  2
2014    b3  b   NA  2
2014    b4  b   NA  1

I want to infer the number of eggs for the NA entries from the number of chicks. It makes sense to replace NA with 2 where there were 2 chicks, as these birds lay at most 2 eggs.

For nests with 1 chick, however, I want to replace NA with 2 in a randomly selected 80% of the nests and with 1 in the remaining 20%, for species "a" in 2013. For species "a" in 2014, the split would be 40% and 60% for clutch sizes of 2 and 1, respectively.

I tried the following but could not work out how to code it properly.

df %>% mutate(egg = ifelse(egg == 0 & chick == 2, 2, egg))

df %>%
  mutate(egg = ifelse(egg == 0 & chick == 1 & year == 2013, sample_frac(.8) == 2, egg))

Any help would be greatly appreciated!

Many thanks
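
One possible way to do what is described above, as a minimal sketch in base R rather than dplyr. The helper name fill_one_chick and its arguments are made up for illustration; it assumes the missing values are real NA (so they are tested with is.na() rather than egg == 0) and that the requested fraction is rounded to a whole number of nests. Species "b" and nests with 0 chicks are left as NA because no rule is given for them.

```r
# Example data copied from the table in the question
df <- data.frame(
  year   = rep(c(2013, 2014), each = 9),
  nestid = rep(c(paste0("a", 1:5), paste0("b", 1:4)), times = 2),
  sp     = rep(rep(c("a", "b"), times = c(5, 4)), times = 2),
  egg    = c(2, NA, NA, NA, NA, 2, NA, NA, NA,
             NA, NA, 1, NA, NA, NA, NA, NA, NA),
  chick  = c(1, 1, 0, 1, 0, 0, 1, 2, 1,
             1, 1, 1, 1, 1, 1, 2, 2, 1)
)

# Hypothetical helper: among nests of one species and year with a missing
# egg count and exactly 1 chick, set a random fraction `frac_two` to 2 eggs
# and the rest to 1 egg.
fill_one_chick <- function(data, species, yr, frac_two) {
  idx <- which(is.na(data$egg) & data$chick == 1 &
                 data$sp == species & data$year == yr)
  # Assumption: the fraction is rounded to the nearest whole nest
  n_two <- round(frac_two * length(idx))
  # sample.int() on positions avoids the sample(x) surprise when idx has length 1
  pick <- idx[sample.int(length(idx), n_two)]
  data$egg[idx] <- 1
  data$egg[pick] <- 2
  data
}

set.seed(1)  # only to make the random choice reproducible

out <- df
# Deterministic rule: 2 chicks imply 2 eggs (the stated maximum clutch size)
out$egg[is.na(out$egg) & out$chick == 2] <- 2
# Random rules for species "a": 80% in 2013, 40% in 2014
out <- fill_one_chick(out, "a", 2013, 0.8)
out <- fill_one_chick(out, "a", 2014, 0.4)
out
```

With so few nests the rounding matters: in 2013 only two species "a" nests qualify, so 80% rounds to both of them, while in 2014 four nests qualify and two of them get 2 eggs. On a larger dataset the realised proportions should get closer to the requested ones.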



