Sunday, April 19, 2020

What's the correct way to answer this question in the edX "Statistics and R" course?

I'm sorry for asking this here, but the course website has no discussion page and it points to Stack Overflow for questions. The question below comes from the edX "Statistics and R" course.

Q1: Using the following dataset:

'''
library(downloader)  # provides download()
url <- "https://raw.githubusercontent.com/genomicsclass/dagdata/master/inst/extdata/babies.txt"
filename <- basename(url)
download(url, destfile=filename)
babies <- read.table(filename, header=TRUE)
'''

splitting into two groups (non-smoking and smoking):

library(dplyr)  # provides filter(), select() and %>%
bwt.nonsmoke <- filter(babies, smoke==0) %>% select(bwt) %>% unlist
bwt.smoke <- filter(babies, smoke==1) %>% select(bwt) %>% unlist

Set the seed at 1 and obtain a sample from the non-smoking mothers (dat.ns) of size N=25. Then, without resetting the seed, take a sample of the same size from the smoking mothers (dat.s). Compute the t-statistic (call it tval).

What is the absolute value of the t-statistic?

Here's how I did it:

set.seed(1)
dat.ns <- sample(bwt.nonsmoke,25)
dat.s <- sample(bwt.smoke,25)
tval <- t.test(dat.ns,dat.s)$statistic
tval

This gives the value 2.120904, which is apparently wrong. I also tried setting the seed to 1 before each sample, as follows:

set.seed(1)
dat.ns <- sample(bwt.nonsmoke,25)
set.seed(1)
dat.s <- sample(bwt.smoke,25)
tval <- t.test(dat.ns,dat.s)$statistic
tval

which gives a t value of 1.573627, which is also wrong. I'm not sure what I'm doing wrong and would appreciate some help.
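
One possible cause, offered as a guess rather than a confirmed diagnosis: R 3.6.0 changed the default algorithm that sample() uses, so the same seed selects different elements than it did in earlier versions. If the course's expected answer was computed with an older R (an assumption, nothing in the question states this), the original behaviour can be requested with sample.kind = "Rounding". The sketch below uses a hypothetical vector x in place of the birth weights so that it runs without downloading anything; it needs R 3.6.0 or later, since older versions do not accept the sample.kind argument.

```r
# Hypothetical stand-in for bwt.nonsmoke / bwt.smoke (not the course data).
x <- 1:1000

# Default sampler in R >= 3.6.0 ("Rejection").
set.seed(1)
new_draw <- sample(x, 25)

# Same seed with the pre-3.6.0 sampler. R warns that this sampler is
# non-uniform, so the warning is suppressed here.
suppressWarnings(set.seed(1, sample.kind = "Rounding"))
old_draw <- sample(x, 25)

# Compare the two draws: the samplers generally pick different elements.
identical(new_draw, old_draw)

# Restore the current default sampler for the rest of the session.
RNGkind(sample.kind = "Rejection")
```

Under that assumption, replacing set.seed(1) with set.seed(1, sample.kind = "Rounding") in the first attempt above, and leaving the rest unchanged, would be the thing to try.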



