# Differences

This shows you the differences between two versions of the page.

 en:dydaktyka:ggp:game_tree_2 [2019/01/08 20:40]msl Translates to english en:dydaktyka:ggp:game_tree_2 [2019/06/27 15:49] (current) Both sides previous revision Previous revision 2019/01/08 20:40 msl Translates to english2019/01/07 13:20 msl 2019/01/07 13:20 msl created1970/01/01 01:00 msl 1970/01/01 01:00 msl 2019/01/08 20:40 msl Translates to english2019/01/07 13:20 msl 2019/01/07 13:20 msl created Line 37: Line 37: Monte Carlo wasn't the smartest approach to the problem - we can be more intelligent than some random sampling. The idea behind a more advanced Monte Carlo Tree Search is to choose moves, according to a simple scoring formula: Monte Carlo wasn't the smartest approach to the problem - we can be more intelligent than some random sampling. The idea behind a more advanced Monte Carlo Tree Search is to choose moves, according to a simple scoring formula: - $UCT = \overline{score_i} + c*\sqrt{{n}\over{n_i}$ + $UCT = \overline{score_i} + c*\sqrt{{n}\over{n_i}$ where: where: - * $\overline{score_i}$ the average score of the $i$-th move + * $\overline{score_i}$ the average score of the $i$-th move - * $n$ is total number of playouts + * $n$ is total number of playouts - * $n_i$ is number of playouts after choosing the $i$-th move + * $n_i$ is number of playouts after choosing the $i$-th move * $c$ is a constant parameter, from theoretical reasons set traditionally to $\sqrt{2}$ * $c$ is a constant parameter, from theoretical reasons set traditionally to $\sqrt{2}$ As you see, the UCT assigns high scores to machines that: As you see, the UCT assigns high scores to machines that: - * have high average score ($\overline{score_i}$ is high) + * have high average score ($\overline{score_i}$ is high) * haven'​t been tried too many times (low $\over{n_i}$). * haven'​t been tried too many times (low $\over{n_i}$). Line 66: Line 66: - gray moves belong to our opponent, so during backpropagation we store our loss as his win. - gray moves belong to our opponent, so during backpropagation we store our loss as his win. - {{:​en:​dydaktyka:​ggp:​mcts-phases.png | https://​en.wikipedia.org/​wiki/​Monte_Carlo_tree_search#/​media/​File:​MCTS_(English)_-_Updated_2017-11-19.svg}} + {{:​en:​dydaktyka:​ggp:​mcts-phases.png?700 | https://​en.wikipedia.org/​wiki/​Monte_Carlo_tree_search#/​media/​File:​MCTS_(English)_-_Updated_2017-11-19.svg}}
en/dydaktyka/ggp/game_tree_2.txt · Last modified: 2019/06/27 15:49 (external edit) 