I have implemented a dynamic random forest model.
When I plot the predictions, I get a strange result.
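library(zoo)
library(dyn)
library(randomForest)  # rfImpute(), randomForest()
library(quantmod)      # assumed source of Lag(); Hmisc also exports a Lag()
library(ggplot2)
# x, y and traindata (index of the first row to predict) are not shown here
poc <- zoo(cbind(Y = y, y = y, x = x))
poc.orig <- poc
# impute missing predictor values; columns 4 and 5 are the lag-1 terms
df.imputed <- data.frame(rfImpute(Y ~ . + Lag(y, 1) + Lag(x, 1), data = poc))
colnames(df.imputed)[4] <- "lagY"
colnames(df.imputed)[5] <- "lagX"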
# NA out Y's that are to be predicted
df.imputed[traindata:nrow(df.imputed), "Y"] <- NA
df.imputed <- zoo(df.imputed)
# predict 1 ahead each iteration
for (i in traindata:nrow(df.imputed)) {
  # fit based on first i-1 values
  fit <- dyn$randomForest(Y ~ lagY + lagX, data = df.imputed, subset = seq_len(i - 1))
  # get prediction for ith value
  df.imputed[i, "Y"] <- tail(predict(fit, df.imputed[1:i, ]), 1)
  print(i)
}
# collect predicted and actual values for the test rows
results <- data.frame(cbind(pred = df.imputed[traindata:nrow(df.imputed), "Y"], act = poc.orig[traindata:nrow(df.imputed), "Y"]))
# rsq() is a helper of mine that is not shown here
rsq(results$pred, results$act)
summary(fit)
# refer to columns by name inside aes() instead of results$...
g <- ggplot(results, aes(x = seq_len(nrow(results))))
g <- g + geom_line(aes(y = pred), colour = "blue")
g <- g + geom_line(aes(y = act), colour = "black")
g <- g + xlab("") + ylab("")
g
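For anyone who wants to try the same one-step-ahead loop without my data, here is a minimal sketch under stated assumptions. The data are synthetic, base R's lm() is swapped in for dyn$randomForest so it runs without extra packages, the lags are built by hand instead of with Lag(), the split point is arbitrary, and rsq() is a hypothetical stand-in defined as the squared correlation (not necessarily the helper used above).

```r
# Minimal sketch of one-step-ahead (walk-forward) prediction.
# Assumptions: synthetic data, lm() instead of dyn$randomForest,
# and a hypothetical rsq() defined as squared correlation.
set.seed(42)
n <- 120
x <- cumsum(rnorm(n))
y <- numeric(n)
for (k in 2:n) y[k] <- 0.6 * y[k - 1] + 0.3 * x[k - 1] + rnorm(1, sd = 0.5)

# Build the lag-1 predictors by hand; row 1 has no lag, so it is NA.
d <- data.frame(Y = y,
                lagY = c(NA, y[-n]),
                lagX = c(NA, x[-n]))

traindata <- 81                 # first row to predict (arbitrary split)
pred <- rep(NA_real_, n)

for (i in traindata:n) {
  # fit on rows 1..(i-1) only; lm() drops the NA row by default
  fit <- lm(Y ~ lagY + lagX, data = d[seq_len(i - 1), ])
  # predict row i from its lagged predictors
  pred[i] <- predict(fit, newdata = d[i, ])
}

rsq <- function(p, a) cor(p, a)^2

results <- data.frame(pred = pred[traindata:n], act = y[traindata:n])
print(rsq(results$pred, results$act))
```

Running this and plotting results$pred against results$act gives a baseline to compare with the random forest plot, since the loop structure is the same and only the model differs.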
[Image: plot of the results]
The blue line shows the random forest model's predictions on the test data, and the black line shows the response variable on the test data.
What is happening? Can anyone help me?
Thanks :)