Tuesday, August 2, 2016

MLlib RandomForest (Spark 2.0): predict a single vector

After training a RandomForestRegressor inside a PipelineModel using the DataFrame-based MLlib API (Spark 2.0), I loaded the saved model into my real-time (RT) environment to serve predictions. Each request is handled by calling transform on the loaded PipelineModel, but to do that I first have to convert the single request vector into a one-row DataFrame using spark.createDataFrame. All of this takes around 700 ms per request.

By comparison, the same prediction takes about 2.5 ms if I use the RDD-based mllib RandomForestRegressor.predict(VECTOR). Is there any way to use the new DataFrame-based API to predict a single vector without converting it to a DataFrame, or is there something else I can do to speed things up?
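
The two code paths being compared can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code from the real-time service: the toy data, the object name `SingleVectorPredict`, the local master, and the hyperparameters are all hypothetical, and both models are trained in-process instead of being loaded from disk. The timings it prints will depend on the machine and on JVM warm-up.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.regression.RandomForestRegressor
import org.apache.spark.ml.linalg.{Vectors => MLVectors}
import org.apache.spark.mllib.linalg.{Vectors => OldVectors}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.tree.RandomForest

// Hypothetical object name; the data below is toy data (label = 2 * x).
object SingleVectorPredict {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local run for illustration only
      .appName("single-vector-predict")
      .getOrCreate()

    // DataFrame-based API: train a pipeline with a single regressor stage.
    val rows = (0 until 100).map(i => (2.0 * i, MLVectors.dense(i.toDouble)))
    val train = spark.createDataFrame(rows).toDF("label", "features")
    val pipelineModel = new Pipeline()
      .setStages(Array(new RandomForestRegressor()))
      .fit(train)

    // Path 1: wrap the single request vector in a one-row DataFrame,
    // then run it through the pipeline's transform.
    val t0 = System.nanoTime()
    val oneRow = spark
      .createDataFrame(Seq(Tuple1(MLVectors.dense(42.0))))
      .toDF("features")
    val dfPrediction =
      pipelineModel.transform(oneRow).select("prediction").head().getDouble(0)
    val dfMillis = (System.nanoTime() - t0) / 1e6

    // RDD-based API: train an equivalent forest on the same toy data.
    val points = spark.sparkContext.parallelize(
      (0 until 100).map(i => LabeledPoint(2.0 * i, OldVectors.dense(i.toDouble))))
    val rddModel = RandomForest.trainRegressor(
      points, Map[Int, Int](), 20, "auto", "variance", 5, 32)

    // Path 2: predict directly on a single vector, with no DataFrame involved.
    val t1 = System.nanoTime()
    val rddPrediction = rddModel.predict(OldVectors.dense(42.0))
    val rddMillis = (System.nanoTime() - t1) / 1e6

    println(s"DataFrame path: prediction=$dfPrediction, time=$dfMillis ms")
    println(s"RDD path:       prediction=$rddPrediction, time=$rddMillis ms")

    spark.stop()
  }
}
```

Note that the two APIs use different vector classes (`org.apache.spark.ml.linalg` versus `org.apache.spark.mllib.linalg`), so a request vector built for one path cannot be passed to the other without conversion.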



