The core technique of AlphaGo is using tree search as a "policy improvement oper...

habitue · on Oct 18, 2017

This version explicitly does not use tree search.

panic · on Oct 18, 2017

MCTS means "Monte-Carlo Tree Search". It's the core of the algorithm. The big difference is that it doesn't use rollouts, or random play: it chooses where to expand the tree based only on the neural network.
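
A minimal sketch of what "tree search without rollouts" can look like, assuming a PUCT-style selection rule of the kind the paper describes. All names here (`Node`, `select_child`, `simulate`, `evaluate`, `step`) and the `c_puct` value are hypothetical, chosen for illustration; this is not DeepMind's code, and terminal-position handling is left out.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    prior: float                  # policy probability the network gave this move
    visits: int = 0               # how many simulations passed through this node
    value_sum: float = 0.0        # summed values, from the view of the player who moved here
    children: dict = field(default_factory=dict)

    def q(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(node: Node, c_puct: float = 1.5):
    """Pick the (action, child) pair maximizing Q + U; c_puct is an assumed constant."""
    def score(item):
        _, child = item
        u = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return child.q() + u
    return max(node.children.items(), key=score)


def simulate(root: Node, state, evaluate, step):
    """One simulation: descend by Q + U, expand the leaf with the network, back up."""
    path, node = [root], root
    while node.children:
        action, node = select_child(node)
        state = step(state, action)
        path.append(node)
    # The network call stands in for a rollout: no random playout to the end of the game.
    # `evaluate` is assumed to return (move -> prior, value for the player to move).
    priors, value = evaluate(state)
    node.children = {a: Node(prior=p) for a, p in priors.items()}
    for n in reversed(path):
        value = -value            # flip perspective at each ply
        n.visits += 1
        n.value_sum += value
    return root
```

Here `evaluate(state)` plays the role of the single network (policy and value together), and `step(state, action)` is whatever applies a move in the game being searched.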

cjbprime · on Oct 18, 2017

No, habitue is correct. This new blog post says that the new software no longer does game readouts and just uses the neural net.

Tarq0n · on Oct 18, 2017

That's not what Monte Carlo Tree Search is. The new version is still one neural network + MCTS. There's no way to store enough information to judge the efficiency of every possible move in a neural network, so a second algorithm to simulate outcomes is necessary.

Twirrim · on Oct 18, 2017

Read the white paper. MCTS is still involved, right the way through.

ankeshanand · on Oct 18, 2017

The new version does use MCTS, you should read the paper again. :)

dastbe · on Oct 18, 2017

If you read the paper, they do in fact still use Monte Carlo tree search. They just simplify its usage while reducing the number of neural networks to one.

AlexCoventry · on Oct 18, 2017

It does, during training.

panic · on Oct 18, 2017

Tree search is also used during play. In the paper, they pit the pure neural net against other versions of the algorithm -- the pure net ends up slightly worse than the version that played Fan Hui, at about 3000 Elo.
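
To make the "pure neural net" versus "net plus search" distinction concrete, here is a rough sketch of the two ways a move could be picked at play time. The function names and the numbers are made up for illustration, assuming the raw network plays its highest-probability move and the search version plays the most-visited move.

```python
def raw_network_move(priors):
    """Pure network play: take the move with the highest policy probability."""
    return max(priors, key=priors.get)


def search_move(visit_counts):
    """Search play: take the move the tree search visited most often."""
    return max(visit_counts, key=visit_counts.get)


# Hypothetical numbers: the network's first instinct is D4, but after many
# simulations the search has shifted most of its visits to Q16.
priors = {"D4": 0.5, "Q16": 0.3, "K10": 0.2}
visit_counts = {"D4": 300, "Q16": 1200, "K10": 100}

network_choice = raw_network_move(priors)
search_choice = search_move(visit_counts)
```

The two can disagree, which is the sense in which the search acts as an improvement step on top of the network's own policy.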

AlexCoventry · on Oct 18, 2017

Oh, so it's just not using rollouts to estimate the board position? Thanks for the clarification.

mrec · on Oct 19, 2017

It doesn't use rollouts at all:

> AlphaGo Zero does not use “rollouts” - fast, random games used by other Go programs to predict which player will win from the current board position. Instead, it relies on its high quality neural networks to evaluate positions.