Last time I mentioned wanting to write a chess move parser since my earliest days of programming. Hard-core coders often see things in terms of the software behind them. For instance, I sometimes wonder about the software running my microwave oven. Andy Warhol drew our attention to how an artist is behind even a mundane soup can label. Similarly, every computer-driven item in your growing collection of smart tools and toys has a programmer or many behind it.
Dedicated coders also look at problems in terms of the software to solve them. When my (ex-)wife complained about the difficulty of scheduling teachers, rooms, and classes, for the year, I began pondering scheduling software. I think a big part of it is the challenge of solving a double-puzzle. First you have to figure out the problem; then you have to figure out the software solution.
And one area that programmers find extremely attractive is games!
I suspect computer games go back as far as computers. Games, obviously, go back as far as humans, but computers add new levels. For one thing, computers can take on the tedious parts of many games.
As one example, computers do the scoring for you in many bowling alleys (although there is something about way bowling is scored that tickles my fancy).
Another leap in gaming comes from computer graphics, which have come a long way since Pong!
The long, alone hours a coder puts in at the keyboard makes for a large overlap between programmers and introverts. Therefore, another attraction is having the computer play games with you. Yet, some games are extremely complex and a challenge to implement on the computer. For example, chess programs go way back, but have only recently become capable of beating expert players.
Other games are simple enough that I’ve used them as classroom exercises for infant programmers. A favorite is a game called Mastermind. It’s a two-player game, but extremely boring for one of them. They have the boring task of making up a code consisting of several (usually four to six) colors (repeats allowed). For example, they might pick, “Red, Blue, Blue, White.”
The second player, the one with the exciting task, attempts to figure out the code by making guesses. The first player responds to guesses only by saying the number of exact matches and the number of color matches that don’t match positions.
Given the code above, if I guessed, “Green, Blue, Yellow, Red,” I would get one exact match point (the blue) and one color match point (the red). This tells me one of my guesses is exactly right and one other is right, but needs to be moved. By making clear guesses and analyzing the answers, you zero in on the code. Your score is how many guesses you took.
As you can imagine, it’s rather boring for the code maker, especially if the guesser takes their time. The computer, however, doesn’t mind waiting. (Or if it does, it’s never said anything. It doesn’t even tap its mouse impatiently.)
I’ve written a version of Mastermind in nearly language I’ve used (although not in Python (yet!)). As I said, it’s student programmer easy. It has a couple interesting “gotchas” that make it not completely trivial, but it’s definitely beginner stuff.
[Huh! We interrupt with news: Robinson Cano, fixture on the New York Yankees, and a free agent this year, just signed with the Seattle Mariners! That’s gotta be a shock to Yankee’s fans!]
On the other side of the spectrum was a nut I found very tough to crack: a parser for the “descriptive” style chess move; the old-fashioned “pawn to king four” kind of move you’ve heard in movies.
As I understand it, these days a simple grid notation is in vogue: “E2-E4” replaces “P-K4” as an opening move. I vastly prefer the tradition and humanity of the latter!
Thing is, it’s a pain to parse! On the other hand, the “E2-E4” move is very easy to parse. Break it in two parts, and you have clear and easy coordinates. In Python you could turn those coordinates into rows and columns for a matrix like this:
f = lambda c:(ord(c)-ord('A'), ord(c)-ord('1')) def parse_move (move): parts = move.split('-') return (f(parts), f(parts)) print parse_move('E2-E4')
There it is, pretty much the entire parser in Python.
My (old-fashioned) chess move parser, however, has over 800 lines!
But the truth is, it’s only a first cut at the problem. I only went as far as the first development rule:
- Make it work.
- Make it work right.
- Make it work fast!
I made it work (which was a huge victory after a number of failed attempts in the past) and then set it aside intending to return to it.
As such, the code is… embarrassing. To me, anyway. Not gonna share this one just yet!
This post, like the previous one introducing the Maze Maker, is just an introduction. It’s taken me this much to just set the table. I’ll try to make the meal not last too long. (Technical material requires precision and detail!)
Given the amount of material to cover, the overview (as it was yesterday) will be plenty long!
Let me start with a few basics about descriptive chess moves. First, on the board, the rows that run side to side are called ranks. The columns that run back and forth between the two players are called files. You may have heard the phrase “rank and file” before.
One of the most notable things about the chess moves is their notational ambiguity. In the move, “P-K4” (Pawn to King-4), the Pawn is not specified (there are eight, after all). The deal is that the state of the board must be such that only one Pawn is capable of moving to King-4. The move is illegal if more than one Pawn qualifies.
This means the parser has to include the current state of the game!
There are two basic kinds of moves (three, actually): moves and captures.
Moves contain a hyphen, which separates the piece designation and target square designation (as in “P-K4”).
Captures have the form, “PxP” (Pawn takes Pawn), the “x” indicates it’s a capture, not a move. Captures specify a piece and a target piece (not a square).
(The third kind of move is King-side or Queen-side castling.)
A piece designation consists of up to three terms: first a “K” or “Q” (King or Queen), which we’ll call the gender; second a “B”, “N” or “R” (Bishop, Knight or Rook), which combined with “K” and “Q” we’ll call type; and last a “P”, which specifies Pawn.
The actual identity of the Piece, King or Pawn, is always required. Other terms are qualifiers that narrow down the identity. For example, there are eight Pawns (“P”), but only two Rook Pawns (“RP”), and only one King Rook Pawn (“KRP”). Likewise, there are two Knights (“N”), but only one Queen Knight (“QN”). Obviously there is only one King and Queen (per side).
Target squares can also be ambiguous. King-4 (“K4”) is not, but Bishop-6 (“B6”) is (because there are two Bishops). If necessary, the unambiguous square would be King (or Queen) Bishop-6 (“KB6” or “QB6”).
Captures have the same ambiguity with pieces on both side, rather than a piece and a square. The rules are the same. Qualifiers are necessary if more than one piece qualifies for the capture.
What may have led to success this time is two things: separating the ambiguity resolution logic from the parsing logic, and using a recursive descent parser.
I kind of want to go on at length about recursive descent parsers. They’re something that every programmer should learn, and many of us old hands don’t feel one is really a programmer until you have written at least one. (There are a number of other coding milestones like that, actually. Writing a state engine, for example.) But given the length of the article so far, I’ll leave that for the future.
[It’s a pity that it takes so much explanation to set the table. It leaves so little time to eat the actual meal!]
The real key for me this time was writing a parser that identified, but didn’t otherwise care about, the left-out terms. The front-end parser takes the move text and returns a data structure populated with data like this:
.move_type // "" or "move" or "capture" .src_gender // -1 or KING, QUEEN .src_type // -1 or KING, QUEEN, BISHOP, KNIGHT, ROOK .src_pawn // 0 or PAWN .dst_gender // -1 or KING, QUEEN .dst_type // -1 or KING, QUEEN, BISHOP, KNIGHT, ROOK .dst_pawn // 0 or PAWN .dst_rank // -1 or 1-8
Each field provides a bit of crucial information that the back-end parser uses to see if the move, as is, applies to any piece (or pieces!) on the current board. (If you’re wondering, “src” is source (starting square) and “dst” is destination (target square or piece).
If the move is too ambiguous, or no piece can be found, the parser throws an exception. If the move is valid, it uses it to change the state of the board for the next move.
Essentially, the software models the chess board and game, letting you make moves by just typing them in. Add network capability, and you could play online with someone.
I realize that these days that’s a huge closet of old hats, but 30+ years ago it would have been a pretty neat thing. All I would have needed is a chess partner (and the ability to figure out then what I’ve finally figured out now).
Better late than never?