I am trying to implement a multi-step player for the 2048 game in Python that chooses the best move by calculating the maximum score for each move over multiple steps. In each step, the game randomly spawns a new tile on the board. I use np.random.RandomState for this.
Problem: after the player simulates a few steps to find the best move, the RandomState has advanced, so the tiles spawned in the real game differ from the simulated ones and the actual score differs from the predicted one.
My idea was to store the current RandomState before simulating the next steps and restore it afterwards so that the spawning tiles are identical. From what I understand, the RandomState is just a pointer but I haven't found any way to copy or store it.
Any ideas? Thank you!
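For reference, the store-and-restore idea described above might look roughly like the sketch below. It relies on the get_state() and set_state() methods of np.random.RandomState. The spawn_tile function, the 90/10 split between 2 and 4 tiles, the seed, and the board layout are assumptions made for illustration only, not the actual game code.

```python
import numpy as np


def spawn_tile(board, rng):
    """Hypothetical spawn rule: put a 2 (90%) or a 4 (10%) on a random empty cell."""
    empty = np.argwhere(board == 0)            # coordinates of all empty cells
    row, col = empty[rng.randint(len(empty))]  # pick one empty cell uniformly
    board[row, col] = 2 if rng.random_sample() < 0.9 else 4
    return board


rng = np.random.RandomState(42)                # seed chosen arbitrarily
board = np.zeros((4, 4), dtype=int)

# Snapshot the generator's internal state before simulating.
saved_state = rng.get_state()

# Simulation: these draws advance the generator.
sim_board = board.copy()
for _ in range(3):
    spawn_tile(sim_board, rng)

# Rewind the generator so the real game sees the same random sequence.
rng.set_state(saved_state)

real_board = board.copy()
for _ in range(3):
    spawn_tile(real_board, rng)

# Simulated and real spawns now match.
assert np.array_equal(sim_board, real_board)
```

A variation that avoids rewinding the game's generator at all is to create a second RandomState for the simulation and call set_state(rng.get_state()) on it, so the lookahead never touches the real one.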