Tuesday, 21 September 2021

Select the best random variable

Suppose I am offered the chance to keep one of three slot machines, each with its own unknown reward distribution. Each machine outputs -1, 0, or 1 on every try. The following data has been collected:

Slot machine 1 data: Attempts: 100, Average reward: 0.3

Slot machine 2 data: Attempts: 10, Average reward: 0.4

Slot machine 3 data: Attempts: 4, Average reward: 0.5

If I want to keep the slot machine that maximizes the expected reward, which one should it be, and why?

Some context: I understand that more attempts make me more certain about a machine's expected reward, which is desirable. For example, the 3rd machine has the highest average reward but has been tried the fewest times, so its estimate carries the most uncertainty. Is there a statistical formula that helps with this decision?

This is not a Multi-Armed Bandit problem: I don't get to try the slot machines again before deciding. The question is about making a decision now, given only the data above.
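To make the point about sample size concrete, here is a minimal sketch of how the uncertainty shrinks with the number of attempts. It relies only on the fact that rewards lie in {-1, 0, 1}, so the variance of a single reward is at most 1 and the standard error of the average is at most 1/sqrt(n). The function names and the choice of two standard errors for the rough interval are my own assumptions for illustration; this is not a decision rule, and the normal-style interval is very crude for 4 or 10 attempts.

```python
import math

# Data from the question: machine -> (attempts, average reward).
machines = {1: (100, 0.3), 2: (10, 0.4), 3: (4, 0.5)}


def se_bound(n):
    """Upper bound on the standard error of the average reward.

    Each reward is in {-1, 0, 1}, so its variance is at most 1,
    and the variance of the average of n tries is at most 1 / n.
    """
    return 1.0 / math.sqrt(n)


def rough_interval(n, mean, z=2.0):
    """Crude interval mean +/- z * se_bound(n), clipped to [-1, 1].

    z = 2.0 is an illustrative assumption, not a calibrated value.
    """
    half_width = z * se_bound(n)
    return max(-1.0, mean - half_width), min(1.0, mean + half_width)


for name, (n, mean) in machines.items():
    low, high = rough_interval(n, mean)
    print(f"machine {name}: n={n}, mean={mean}, "
          f"se<={se_bound(n):.3f}, rough interval=({low:.2f}, {high:.2f})")
```

Under these assumptions the three rough intervals overlap heavily, which is exactly the difficulty: the averages alone do not settle which machine is best.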



