Tuesday, 20 June 2017

Dynamic Random Forest

I have implemented a dynamic random forest model that predicts one step ahead at each iteration.

When I plot the predictions against the actual values, I get a strange result.

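# Packages used below (Lag() is assumed to come from quantmod; the post does not say)
library(zoo)
library(dyn)
library(randomForest)
library(quantmod)
library(ggplot2)

# y, x and traindata (index of the first test row) are defined elsewhere
poc <- zoo(cbind(Y=y, y=y, x=x))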
poc.orig <- poc


# impute missing predictor values; columns 4 and 5 of the result are the lagged terms
df.imputed <- data.frame(rfImpute(Y ~  . +Lag(y,1)+Lag(x,1), data=poc))
colnames(df.imputed)[4] <- "lagY"
colnames(df.imputed)[5] <- "lagX"

# NA out Y's that are to be predicted
df.imputed[traindata:nrow(df.imputed), "Y"] <- NA

df.imputed <- zoo(df.imputed)

# predict 1 ahead each iteration
for(i in traindata:nrow(df.imputed)) {
  # fit based on first i-1 values
  fit <- dyn$randomForest(Y ~  lagY + lagX, data=df.imputed, subset = seq_len(i-1))
  # get prediction for ith value
  df.imputed[i, "Y"] <- tail(predict(fit, df.imputed[1:i,]), 1)
  print(i)
}
results <- data.frame(cbind(pred = df.imputed[traindata:nrow(df.imputed), "Y"], act = poc.orig[traindata:nrow(df.imputed), "Y"]))

# rsq() is not a base R function; it is a helper that is not shown in this post
rsq(results$pred,results$act)
summary(fit)

# plot predictions (blue) against actual values (black) over the test period
g <- ggplot(results, aes(x = seq_len(nrow(results))))
g <- g + geom_line(aes(y = pred), colour = "blue")
g <- g + geom_line(aes(y = act), colour = "black")
g <- g + xlab("") + ylab("")
g
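
For reference, the rsq() call above depends on a helper that is not included in the post. A minimal sketch of what such a helper might look like, assuming it computes R^2 as 1 - SS_res / SS_tot and drops pairs where either value is NA (the name matches the call above, but the body is an assumption, not the original definition):

```r
# Hypothetical rsq() helper: the post calls rsq() but never defines it.
# Assumes R^2 = 1 - SS_res / SS_tot, computed on complete pairs only.
rsq <- function(pred, act) {
  pred <- as.numeric(pred)
  act <- as.numeric(act)
  ok <- complete.cases(pred, act)              # keep pairs with no NA
  ss_res <- sum((act[ok] - pred[ok])^2)        # residual sum of squares
  ss_tot <- sum((act[ok] - mean(act[ok]))^2)   # total sum of squares
  1 - ss_res / ss_tot
}

# Toy usage with made-up numbers (not data from the post)
act  <- c(1, 2, 3, 4)
pred <- c(1.1, 1.9, 3.2, 3.8)
rsq(pred, act)
```

With this definition a value near 1 means the predictions track the actual series closely, and a negative value means they do worse than simply predicting the mean of the test data.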

Click here to see the image with the results.

The blue line shows the random forest model's predictions on the test data, and the black line shows the response variable on the test data.

What is happening? Can anyone help me?

Thanks :)



