Wednesday, July 31, 2019

How to choose between nodes with the same value in Monte Carlo tree search?

I am implementing Monte Carlo tree search for a two-player strategy game (where you can win, lose, or draw).

I search through the tree by following the node with the highest UCB (upper confidence bound for trees) value. If I reach a node with no children, I add all possible moves to it as children, select one, and run a simulation from it. I have three questions:

  1. How do I choose a child when multiple child nodes have the same UCB value? Should I select one at random, or should I select the first one encountered in the for loop (the max search)? Does it even matter?

  2. Which values should I use for backpropagation? For example, if I win a simulation, should I backpropagate 10 or 1? If I draw, I backpropagate 0 (only the visit count increases). Which value should I backpropagate if I lose a simulation: 0 (as for a draw), or -1/-10?
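
To make the two questions concrete, here is a minimal Python sketch of the options being compared. It is an illustration under stated assumptions, not my actual implementation: the `Node` class, the function names, the exploration constant `c = sqrt(2)`, and the +1 / 0 / -1 reward scale are all hypothetical choices. It shows random tie-breaking among children with equal UCB values, and a backpropagation that adds the result with its sign flipped at each level, which assumes a zero-sum game with alternating moves.

```python
import math
import random


class Node:
    """Hypothetical tree node; the field names are assumptions for this sketch."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_reward = 0.0


def ucb1(child, parent_visits, c=math.sqrt(2)):
    """UCB1 score of a child; unvisited children score +inf so they are tried first."""
    if child.visits == 0:
        return math.inf
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def select_child(node, rng=random):
    """Return a child with the highest UCB1 score, breaking exact ties at random."""
    scores = [ucb1(child, node.visits) for child in node.children]
    best = max(scores)
    tied = [child for child, s in zip(node.children, scores) if s == best]
    # Returning tied[0] instead would be the "first one in the for loop" option.
    return rng.choice(tied)


def backpropagate(leaf, reward):
    """Add the result along the path from leaf to root.

    Assumption: reward is +1 (win), 0 (draw) or -1 (loss) from the point of
    view of the player who moved into `leaf`, and the sign flips at each
    level because the players alternate.
    """
    node = leaf
    while node is not None:
        node.visits += 1
        node.total_reward += reward
        reward = -reward
        node = node.parent
```

With this layout, backpropagating 10 instead of 1 makes the exploitation term ten times larger while the exploration term stays the same, so the choice of scale interacts with `c`. Counting a loss as 0 instead of -1 makes it indistinguishable from a draw in `total_reward`.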



