I'm working with gensim’s word2vec model, but different runs on the same dataset produce different models. I tried fixing the seed, setting PYTHONHASHSEED, and setting the number of workers to one, but none of these worked.
I included my code here:
import gensim

def word2vec_model(data):
    model = gensim.models.Word2Vec(data, size=300, window=20, workers=4, min_count=1)
    model.wv.save("word2vec.wordvectors")
    embed = gensim.models.KeyedVectors.load("word2vec.wordvectors", mmap='r')
    return embed
I checked the following output:
Cooking.similar_by_vector(Cooking['apple'], topn=10, restrict_vocab=None)
Example output:
[('apple', 0.9999999403953552),
('charcoal', 0.2554503381252289),
('response', 0.25395694375038147),
('boring', 0.2537640631198883),
('healthy', 0.24807702004909515),
('wrong', 0.24783077836036682),
('juice', 0.24270494282245636),
('lacta', 0.2373320758342743),
('saw', 0.2359238862991333),
('insufferable', 0.23015251755714417)]
On each run, I get different similar words.
Does anyone know how to solve this? I would appreciate any code or documentation. Thank you in advance!
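For reference, gensim's documentation says a fully deterministic run needs a fixed `seed`, a single worker thread (`workers=1`; the code above passes `workers=4`), and, on Python 3, `PYTHONHASHSEED` set in the environment before the interpreter launches. One possible pitfall is setting the variable from inside the already-running script, which does not change that process's string hashing. The sketch below uses only the standard library (no gensim) to show the effect; the helper name `string_hash_in_fresh_interpreter` and the probe word are made up for illustration.

```python
import os
import subprocess
import sys


def string_hash_in_fresh_interpreter(hash_seed=None):
    """Return hash('apple') as computed by a newly started Python process.

    hash_seed=None leaves hash randomization on (the Python 3 default);
    an integer is passed to the child process as PYTHONHASHSEED.
    """
    env = dict(os.environ)
    env.pop("PYTHONHASHSEED", None)  # start from a clean state
    if hash_seed is not None:
        env["PYTHONHASHSEED"] = str(hash_seed)
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('apple'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


# Two launches with the variable fixed in the environment: same hash.
fixed_a = string_hash_in_fresh_interpreter(hash_seed=0)
fixed_b = string_hash_in_fresh_interpreter(hash_seed=0)

# Two launches without it: the hashes will almost certainly differ.
random_a = string_hash_in_fresh_interpreter()
random_b = string_hash_in_fresh_interpreter()

print("seeded launches agree:  ", fixed_a == fixed_b)
print("unseeded launches agree:", random_a == random_b)
```

If the seeded launches agree while the unseeded ones do not, it may be worth starting the training script with the variable already exported (for example `PYTHONHASHSEED=0 python train.py`, where `train.py` is a hypothetical script name) and passing both `seed=<some fixed integer>` and `workers=1` to `Word2Vec`.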