After training a RandomForestRegressor inside a PipelineModel with the DataFrame-based MLlib API (spark.ml, Spark 2.0), I loaded the saved model into my real-time environment to serve predictions. Each request is passed through the loaded PipelineModel with transform, but to do that I have to convert the single request vector into a one-row DataFrame using spark.createDataFrame. All of this takes around 700 ms per request,
compared to about 2.5 ms if I use the RDD-based mllib random forest and call predict(vector) on it directly. Is there any way to use the new spark.ml API to predict on a single vector without converting it to a DataFrame, or is there something else I can do to speed things up?
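
For reference, here is a minimal sketch of the two code paths being compared. It is not my exact code: the model paths, the three-element feature vector, and the column names "features" and "prediction" (the spark.ml defaults) are placeholders, and it assumes the pipeline takes an already assembled vector column as input and that a separate RDD-based model was saved for the second path.

```scala
import org.apache.spark.ml.PipelineModel
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.mllib.linalg.{Vectors => OldVectors}
import org.apache.spark.mllib.tree.model.RandomForestModel
import org.apache.spark.sql.SparkSession

object SingleRowPredict {
  // Small timing helper: runs the block and returns (result, elapsed milliseconds).
  def timed[A](block: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = block
    (result, (System.nanoTime() - start) / 1e6)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("single-row-predict")
      .master("local[1]")
      .getOrCreate()

    // Hypothetical paths: replace with wherever the two models were saved.
    val pipeline = PipelineModel.load("/path/to/saved/pipeline")
    val rddModel = RandomForestModel.load(spark.sparkContext, "/path/to/saved/mllib-model")

    // Placeholder request; a real request has as many values as the model was trained on.
    val values = Array(0.1, 0.2, 0.3)

    // Path 1: spark.ml. Wrap the vector in a one-row DataFrame, then transform.
    val features: Vector = Vectors.dense(values)
    val (mlPrediction, mlMs) = timed {
      val df = spark.createDataFrame(Seq(Tuple1(features))).toDF("features")
      pipeline.transform(df).select("prediction").head().getDouble(0)
    }

    // Path 2: RDD-based mllib. Predict directly on a single vector.
    val (rddPrediction, rddMs) = timed {
      rddModel.predict(OldVectors.dense(values))
    }

    println(s"spark.ml: prediction=$mlPrediction, elapsed=$mlMs ms")
    println(s"mllib:    prediction=$rddPrediction, elapsed=$rddMs ms")

    spark.stop()
  }
}
```

The first call on each path probably includes some one-time warm-up cost, so the timing is more meaningful when each path is run in a loop and the first few iterations are ignored.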