We've put it on KGS as "somebot", and it can play at somewhere from 3-5d level, although amusingly it still can't do ladders, and almost always plays 3-3. Our strength seems to have plateaued recently, and we suspect it's because of value net overfitting issues.
3-5d is about the strength of LeelaZero (which I presume you already knew about). Can you disclose how long Minigo has been training, how many games were generated, and which hardware was used?
Ladders sound hard for neural networks. IIRC the original AlphaGo had a hardcoded ladder solver that would determine whether a ladder position is winning or losing and feed that bit into the neural network.
I'm pretty sure it didn't, though I can't find it stated explicitly for AlphaGo. For AlphaGo Zero they specifically mention that it took self-play a long time to learn ladders well.
I've been super out of touch with playing Go for the past couple years. I would love to get back into it with all these new developments! Does anyone know if I can play against AlphaGo or a derivative (such as Minigo) online?
Various people have tried to incorporate AG-like techniques into their Go programs. One you might wish to play with is Leela Zero, which is low to mid dan (amateur) now.
To get this working:
* Acquire a GTP-capable GUI, such as Sabaki
* Acquire the latest Leela Zero release
* Acquire a recent Leela Zero neural net
* Set up Sabaki to use LZ with the net passed as an argument, e.g. "-t 1 -p 1600 --noponder --gtp -w d16fa4c3801e55ec21e0df7ead67980fe8d4ee49188a3516818207ad28b017a6"
It's a bit of work but nothing too hard. I should mention that this may require a semi-decent GPU (my old GTX 750 works fine).
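For the curious, the GUI-to-engine link in that last step is just GTP: plain text commands over the engine's stdin/stdout, where replies start with "=" (success) or "?" (error) and end with a blank line. A minimal sketch of parsing a reply (the function name is mine, not from Sabaki or Leela Zero):

```python
def parse_gtp_reply(raw):
    """Split a raw GTP reply into (ok, payload).

    GTP replies look like "= <payload>\n\n" on success or
    "? <message>\n\n" on failure.
    """
    line = raw.strip()
    if line.startswith("="):
        return True, line[1:].strip()
    if line.startswith("?"):
        return False, line[1:].strip()
    raise ValueError("not a GTP reply: %r" % raw)

# e.g. after writing "name\n" to the engine's stdin, you might read back:
ok, payload = parse_gtp_reply("= Leela Zero\n\n")
```

Sabaki does all of this for you; this is just to show there's no magic in the engine hookup.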
That's the awesome thing about Go! If you play against someone who is better than you, you can set a handicap, which makes the game fun for both sides.
Check out https://github.com/glinscott/leela-chess. We are getting close to kicking off the distributed version now that we have validated it's possible to get a strong network through supervised training.
Question: when you switch to self-play reinforcement learning, do you plan on starting from the network obtained in supervised learning, or tabula rasa? I understand starting tabula rasa will require more computing power/time, but if you start from the supervised learning network, isn't there a risk you inherit human biases in the game style? It would also defeat the purpose of having the system discover existing chess theory and possibly new theory.
You'd have to write a Chess implementation that very carefully respects all possible game-ending pathways, a translation of a chess board into an array that a NN could understand, and a schema by which to flatten the array of all possible moves (both legal and illegal) into a single vector. Then, the MCTS and RL portions would be identical.
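To make the encoding part concrete, here's a rough pure-Python sketch (illustrative names of my own, not from any repo): pieces become 12 binary 8x8 planes, and a (from, to) square pair flattens into one slot of a 64*64 = 4096-long policy vector. A real scheme also needs extra slots for underpromotions, plus planes for side to move, castling rights, etc.

```python
PIECES = "PNBRQKpnbrqk"  # 6 piece types x 2 colors -> 12 planes

def encode_board(fen_board):
    """Turn the piece-placement field of a FEN string into 12 8x8 binary planes."""
    planes = [[[0] * 8 for _ in range(8)] for _ in PIECES]
    for rank, row in enumerate(fen_board.split("/")):
        file = 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)  # a digit means that many empty squares
            else:
                planes[PIECES.index(ch)][rank][file] = 1
                file += 1
    return planes

def move_index(from_sq, to_sq):
    """Flatten a (from, to) move into one index of a 4096-long policy vector."""
    return from_sq * 64 + to_sq

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
planes = encode_board(start)
e2, e4 = 12, 28  # squares numbered 0-63 from a1
idx = move_index(e2, e4)
```

The network's policy head then outputs one probability per index, legal or not, and you mask out the illegal ones before feeding the MCTS.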
Not quite. python-chess is a wonderful library, and you would use it to do the things that your parent said. But python-chess on its own doesn't do those things.
With a lot of elbow grease, mayyyyybe. We're currently using 128 filters and 20 residual blocks, whereas LZ is using a smaller network size. Our networks would have had to define their batchnorm layers exactly the same way for the model files to be compatible, and it would also require a lot of op renaming for the model load operation to work correctly.
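A back-of-the-envelope sketch of why the weight files can't just be swapped: the tensor shapes, and hence the parameter counts, depend directly on the filter and block counts. Rough counting for the residual tower only (heads omitted; 17 input planes is the AlphaGo Zero paper's feature count and is just an assumption here):

```python
def tower_params(filters, blocks, input_planes=17):
    """Rough parameter count for an AlphaGo Zero-style residual tower.

    One 3x3 input conv, then `blocks` residual blocks of two 3x3 convs
    each; every conv is followed by a batchnorm with 2*filters params.
    Illustrative only -- real nets add policy/value heads and biases.
    """
    conv_in = 3 * 3 * input_planes * filters
    bn = 2 * filters
    res_conv = 3 * 3 * filters * filters
    return (conv_in + bn) + blocks * 2 * (res_conv + bn)

big = tower_params(128, 20)   # a 128x20 tower
small = tower_params(64, 5)   # a smaller tower for comparison
```

Since every weight tensor in the big net has a different shape from its counterpart in the small one, there's no way to load one checkpoint into the other architecture, independent of any op-naming issues.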
I am finding it quite useful. (I'm not the author, just a happy reader! :-))