The game of checkers is the Drosophila of AI experimentation, where any aspiring AI amateur will try to teach the machine to exceed its master in achievement. It was the first big success of artificial intelligence when A. Samuel developed his checkers player (1960s) in whatever they called computers back then. Since then it was treated untold number of times by scientists, culminating to its complete and final solution (claimed perfect play by Chinook in 2007).
Thus there's no point seeking merit by yet another exposition of the checkers AI tale, so I thought I'd take a different tack: use it to teach you, the accidental browser of this blog, the basics of reinforcement learning, which are much simpler than you would have thought — including the full source code of RL learning and playing checkers, which anyone can try without reliance on cloud mega-CPU resources or AI architectures.
You would have thought that AI takes exceptional intelligence, but instead it requires just high school mathematics, so it is within the grasp of most people with functioning brain cells. It is mostly art combined with large data sets and massive CPU power — none of which are needed for our checkers player. The basic idea of RL is to let the PC play with itself thousands of games, and keep track of those actions that lead to victories, and learn to evaluate board states that are most likely to bring reward rather than penalty. It is the ultimate carrot and stick game, and the computer can do it almost on its own, as long as we supply the following:
The 2 game opponents share the same value function V: the main player plays for maximum board value, whereas the "other" plays for minimum (maximum from its point of view), aka the minimax principle.
It's called semi-gradient euphemistically, because it is half-correct mathematically!
I won't say it is all too easy, as there are many core techniques to master, and solution parameters to tweak by guesswork (trial and error), but it isn't rocket science. There is a free book for reinforcement learning by the main man of RL Sutton, and having the complete source code will guide you through all the questions that may occur to you that theory books cannot answer.
Includes VS6 (C++) source code without dependencies, for training and an ANN-based player for your amusement and edification
FANN library is included for neural network training
Along with the source code you will find a simple DOS (text mode) self-taught player CHECKERS.EXE, that isn't much to look at, but can be adequately managed with the keyboard (kings show green). It became checkers-aware after just 15000 games. As I don't know to play checkers, it beats me every time. If you are a better draughts player, please let me know if it is any good or rubbish? Thank you!
The player can get harder to beat if you enable rollouts, where it will search a few positions in the future to decide the best move. Just pass the number of steps to look ahead as a command line argument, e.g. CHECKERS.EXE 4
Post a comment on this topic »