I have fine-tuned a PyTorch transformer model with Hugging Face Transformers, and I'm running inference on a GPU. However, even after calling model.eval(), I still get slightly different outputs when I run inference multiple times on the same data.
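For reference, this is roughly what my inference code looks like (the checkpoint path, model class, and input text are placeholders; my actual model and data differ, but the loading and forward-pass pattern is the same):

    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Placeholder path; my real fine-tuned checkpoint is loaded the same way
    model = AutoModelForSequenceClassification.from_pretrained("./my-finetuned-model")
    tokenizer = AutoTokenizer.from_pretrained("./my-finetuned-model")

    model.eval()
    model.to("cuda")

    inputs = tokenizer("the same input text every time", return_tensors="pt").to("cuda")

    with torch.no_grad():
        out1 = model(**inputs).logits
        out2 = model(**inputs).logits

    # I would expect these to be identical, but the outputs vary slightly across runs
    print(torch.equal(out1, out2))
    print((out1 - out2).abs().max())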
I have tried a number of things, and after some ablation I found that the only way to get deterministic outputs is to also set torch.cuda.manual_seed_all(42) (or any other seed number) before running inference.
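Concretely, this is the workaround that makes the outputs identical across runs:

    import torch

    # Calling this before inference is the only thing that makes my outputs
    # reproducible across runs; the specific seed value does not matter
    torch.cuda.manual_seed_all(42)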
Why is this the case? The model's weights are fixed, and there are no missing or randomly initialized weights (when I load the trained model I get the "All keys matched successfully" message), so where is the randomness coming from if I don't set the CUDA seed manually? Is this behavior expected?