Artificial Intelligence and Chess
I love writing AI when I can watch them learn and grow more intelligent. Chess is a great game for learning AI because there is a lot of literature available to get you started. At the same time there are massive scalability challenges, so good design is critical to building a strong engine.
Being the ambitious sort, I decided I wanted to write a chess engine because I was learning Java. I always learn a new language or framework best with a project and at the time it made sense to combine a game I liked with a new programming project. Of course when you consider the massive complexity of some open source chess engines (and the entire online communities it takes to develop them) you can see why this was no small task.
Being new to Java and chess programming in general I learned a lot and re-wrote the engine several times before coming up with anything resembling an intelligent chess player.
That being said, I am rather proud of the engine I made. It beat me pretty consistently and my ELO rating was around 1400 at the time. It can also beat my friends kind enough to test it for me and the Windows chess engine Chess Titan. Unfortunately it would still make strange mistakes now and then due to heuristic tuning errors. Example: compromising on king safety for a pawn grab.
Even so it was a terrifically fun project. Watching the AI learn chess strategy and make increasingly intelligent moves was incredible. Eventually it started punishing my own mistakes pretty severely, and I have to play pretty carefully to beat it.
The source code for the project is on GitHub: https://github.com/mpvoss/Chess
Disclaimer: I was new to Java and programming in general when writing this. There are many opportunities for improvement, to say the least!
Let's write a Chess AI
The biggest take-away I can give on actually programming a chess AI is...
A Chess Artificial Intelligence knows absolutely nothing....that you don't teach it explicitlyWhat does that mean? Let's walk through an example writing a chess AI to see. I'm going to skip over writing all of the data structures/rule engines/UI because frankly it's extremely tedious and not the point of this write-up.
Step one: Teach the AI to value powerful pieces.
You can't win if you don't have any pieces. This is a good start - now the AI loves to capture stuff. Although it really has no problems putting its pieces in terrible places (including the king!) if that means winning a pawn. Depending on if you've put together Quiescent Search it might also leave itself in a really risky position after the capture.
I usually think of this tweaking process (update AI settings, play against AI to test, observe AI moves) as an interaction between programmer and AI so that's how I'll illustrate what I mean:
Let's add that feature to our algorithm to make it a little more balanced.
Step two: Consider number of available moves in a position
Now we are getting somewhere! Our AI likes to capture pieces and also manoeuvres towards the middle of the board and open files when possible to maximize the number of available moves. A good Alpha-Beta search will apply these heuristics to the opponent too, so we'll constrict their options wherever possible. Even better!
Except for....balancing concerns. Now that we have two heuristics we have to weight them relative to each other. Clearly material should be weighted more (even if you can move to many squares that doesn't do you good if you lost most of your pieces to achieve that), but by how much? This is the most fun and painful part of coding a chess engine, in my opinion. You get to see the AI use all the heuristics you just taught it, but it takes a while before it uses them in a balanced way.
The AI is still exhibiting a lot of what we call "non-human moves". For example, no one ever told the AI that capturing towards the middle of the board with your king in the opening of the game is risky...so the computer happily does it! All it knows is that it took a piece and now its king has 3 extra squares it can move to. Way better than being stuck on the back rank where he couldn't get anywhere, right?!
Well...not exactly. An unprotected king can be easily harassed and that makes it hard to make progress. Let's add another feature.
Step three: Implement king safety bonuses, early castling bonuses
Much better. Now the AI will castle in the opening when it can, and avoids disturbing his protective shield of pawns if you implemented it that way. Bonuses for good behaviours like these and penalties for unwise moves (moving the queen early, doubling pawns, etc) can help us tine tune the play style of the AI.
This both creates a more well-rounded engine and creates a mountain of new balancing problems. Yes it's good to castle early, but would you rather capture a pawn or castle? Would you rather open up more available moves or keep your queen on the home square? Even for the basic material heuristic, exactly how much more is a queen worth than a knight? All of these are knobs we can turn slightly and get totally different engine behaviour. It's a deep rabbit hole but with some of the automated training I'll discuss later we can let the AI find the best parameters instead of us.
Space-time tradeoff
Since I knew I’d be visiting the same positions many times over my search, it seemed wasteful to re-explore these branches of the game tree many times.
I was correct about this.
As such, I decided to store the entire game-tree in memory for better performance.
I was horribly, horribly incorrect about this.
The game tree is simply too big, it grows at a rate of bd, where b is the branching factor (~35) and d is the depth to which you are searching. My JVM usually crashed after a depth of 4 or 5 ply (35^5 is about 52 million Move objects that I was storing in my tree).
The better approach is using a hashtable to store good moves as you find them. This has huge savings because it’s very common to see the same positions many times over the course of your search. Usually a specialized hashtable called a Transposition Table is used and there’s plenty of business logic about when you can remove a position from the table and how you handle collisions and whatnot.
With a transposition table my engine would be able to look much deeper than the pitiful 4 or 5 ply I was doing before (the strongest engines have no problem looking 20 or more ply deep). This would make a huge difference with the strength of the engine, and it's on my TODO list if I ever rewrite the engine for the umpteenth time.
Readability vs performance
Since I was learning Java I designed the engine in a very Object Oriented way. I had a class for Pieces, Moves, etc. The game tree was a graph of moves, and to explore the board you had to make and unmake moves on a global board object (I was new to coding, I admit the design was horrendous). This made for mostly easy to read code, but it’s simply not how the top engines represent the game.
Most self-respecting engines use bitboards to represent the board. This is a 64 bit datatype (usually a long) that represent if a piece is on a square or not. You can have a white bishop bitboard, a black rook bitboard, etc. Each bit of the 64 is associated with a square on the board, and you can combine all the bitboards to see the entire position. The reason this is fast is because modern CPU’s use 64 bit registers, and when your entire board can be manipulated using bit shifting instructions on the CPU you can crunch positions insanely quickly.
Said another way, my Object Oriented design was me thinking like a human and making the computer play along. If I thought like a computer instead (bit maps for pieces and bitwise operations) then the computer could evaluate positions far more quickly.
I actually implemented this in an assignment in university where the goal was to solve a puzzle where you slid around tetris-looking pieces on a board. The code is available here. The performance gains are incredible. My implementation with bitboards solved the puzzle in under 4 seconds. Other java implementations representing the board as char arrays or similar took 4 to 5 minutes to run. That said, you also end up with code like this:
static long leftMoveBorder = 0xE0C088888080C0E0L;
static long rightMoveBorder = 0x0703214101010307L;
static long downMoveBorder = 0xC39130300081C3FFL;
static long upMoveBorder = 0xFFC39130300081C3L;
// Bitmaps for all pieces. If there is a 1 in the binary string for a spot,
// one of the piece's parts is on that square
private long[] pieces = {
0x0000C0C000000000L, // Red square
0x0C08000000000000L, // Orange L
0x00040C0000000000L, // Blue L
0x0000000080C00000L, // Bright green L
0x0000000060200000L, // Purple L
0x0000000818080000L, // Green Blue T
0x0000000406040000L, // Green T
0x0000000001030000L, // Weird bright Blue L
0x0000000000002030L, // Pink L
0x0000000000001808L, // Yellow L
0x0000000000000604L // Brown L
};
Great performance? Absolutely. Easy to read? Oh heavens no. That being said, I would love to see how strong my engine could be if it really leveraged the 64 bit CPU it was running on. This is also on the TODO list if the rewrite ever happens.
(Not) Re-inventing the wheel
Like most college student programmers, on this project I tended to write everything from scratch. This was both because I wanted to learn everything I could and because I simply didn't appreciate what a timesaver it is to leverage other people's code.
Example: The entire front end of my chess app
Since I was programming in Java I had to learn Swing and I spent so many hours debugging Swing code and learning how to use Layout Managers and trying to understand why my pieces weren't moving around like I expected. What a waste of time! If I had made my engine UCI (Universal Chess Interface) compliant I could have leveraged the chess GUI's that are open source and plugged my engine into any of them.
This has other pay-offs like the ability to programmatically play my engine against other engines. The top open source engine Stockfish not only supports this but crowdsources the tuning of the engine by mutating its meta-parameters and playing millions of games with different strains of the engine against itself, all on the computers of the community! The strain that wins the most game lives on and thus they are able to tune the engine exponentially faster than anyone could by hand.
Another item in my eventual rewrite is to make the engine UCI compatible so it can "play with other kids" and I can tune it faster as well.
Yet another advantage of UCI compatibility is it would create a programatic interface for my engine, meaning other programs could interact with it instead of just a human with a mouse. I would LOVE to make a Chrome plugin that scraped webpages for chess boards and replied with moves from my engine automatically.
Awesome resources I used on this project:
- https://chessprogramming.wikispaces.com/