Nov 13 : Kalah and the Model/View/Controller Pattern

Kalah
Alpha-Beta Pruning
How Kalah is organized - the Model/View/Controller Pattern

Code discussed in lecture

and two concrete implementations: and . KalahViewGraphical uses

Kalah

The program in Kalah.zip demonstrates how a simple program can play a complex game surprisingly well. Kalah is a very old African game played on a board that looks like an egg carton with big depressions at either end. Each player has six "pits" on his or her side of the board, and one "kalah" at the left end (from that player's view). The board looks something like:

Player 2's pits ______________________________________ | 13 12 11 10 9 8 | Player 1's | 0 7 | Player 2's Kalah | 1 2 3 4 5 6 | Kalah ______________________________________ Player 1's pits The one with the most stones in his or her kalah at the end of the game wins.

A move consists of picking up all the stones in one of your pits and dropping them around the board in a clockwise manner. You drop stones into your pits, your kalah, and your opponent's pits until you run out of stones. You skip over your opponent's kalah. If the last stone dropped ends up in your kalah you move again.

If the last stone dropped ends up in an empty pit on your side of the board opposite a non-empty pit on your opponent's side, then you captured the stones in that pit. You move all of those stones plus the capturing stone immediately to your kalah.

If it is your turn and there are no stones left on your side of the board all stones on your opponent's side are moved to his or her kalah and the game ends.

Demo the game.

We look at the pickMove method from the class ComputerKalahPlayerGTS.

private KalahMove pickMove (int depth) // Returns the best move and its value for the player after looking // ahead depth moves. { KalahMove currentMove; // Hold current move and its value KalahMove bestMove; // Hold best move found and its value int playerKalah, opponentKalah; // Keep track of whose kalah is whose if (playerIsUser) playerKalah = 0; else playerKalah = HALFPITS; opponentKalah = HALFPITS - playerKalah; // Set current value to negative infinity so anything will beat it bestMove = new KalahMove(-BIGINT, 0); // Run through possible moves for (int pit = playerKalah + 1; pit <= playerKalah + SIDEPITS; pit++) if (board[pit] > 0) // See if legal move { // Make a scratch copy of state KalahGame copy = this.copyGame(); copy.makeMove(pit); // Make the move // Find the value of this board by evaluating if game over or looking ahead if not if (copy.finished()) // Evaluate the true score, and multiply it by a huge constant so that the // program will choose a sure win over a potentially larger speculative win // and a possible loss over a sure loss. currentMove = new KalahMove(100 * copy.score(), pit); else if (playerIsUser == copy.playerIsUser) // Did current player change? { currentMove = copy.pickMove(depth); // No, so no depth change currentMove.move = pit; // Remember move made } else if (depth > 0) // Player changed, so reduce search depth { currentMove = copy.pickMove(depth - 1); currentMove.value = -currentMove.value; // Good for opponent is bad for me currentMove.move = pit; // Remember move made } else // Depth exhausted, so estimate who is winning by comparing kalahs currentMove = new KalahMove(copy.board[playerKalah] - copy.board[opponentKalah], pit); if (currentMove.value > bestMove.value) // Found a new best move? bestMove = currentMove; } return bestMove; }

This program looks ahead a fixed number of moves, where a move is considered to be "the other person's turn". Therefore if you get to move again because the last stone dropped was in your kalah it does not count as a new turn.

The static evaluation function is very simple - the stones in your kalah minus the stones in your opponent's kalah. It is crude, but easy to do and works well.

The evaluation function is a bit more complex than in Chips, but uses the same idea. It goes through all the moves and picks the best, calling itself recursively if we are not yet at maximum search depth and doing a static evaluation if we are at maximum search depth. It always maximizes, but it negates the score returned by the opponent in the recursive call. (Good scores for my opponent are bad scores for me, and vise versa.)

A couple of points are worth noting. First, outcomes from games played to completion are multiplied by 100. This is to make the program pick a certain one-point win over a move that is estimated to win by more, but is not certain. Similarly a certain one-point loss will be avoided if another course of action is not certain to lose.

Another point is how the depth is adjusted. The depth only is decreased when the player who is moving changes. Therefore a whole series of "move again" moves where the last stone ends up in your kalah count as a single move.

Finally, note that a class KalahMove is used to return a (value, move) pair. This is necessary because Java allows us to only return one value from a method and all parameters are call by value.

Alpha-Beta Pruning

In the game tree search in Chips we were able to quit looking through moves as soon as we found one that gave us a win. This ends up pruning a large part of the game tree. However, we do not have the equivalent for the kalah game tree search above. This is because we don't have simple wins and losses. We have estimated value from static evaluation functions. Even when we do find a win when the game is over we don't know if a different move might not give us a bigger win, so we keep looking.

It would be useful if there were a way of pruning the game tree for kalah and similar games. In fact, there is. It goes by the somewhat cryptic name of alpha-beta pruning. It is based on the fact that as game tree search progresses the tree we can bound the range of the eventual result. As soon as one move has been completely explored there is a lower bound (called alpha) on what the maximizing player can get. As more moves get explored this lower bound can increase, but will never decrease. Similarly, at levels where the player is minimizing there is an upper bound (called beta) of what the minimizing player is forced to accept. These bounds can be used for pruning.

I will give an example of how pruning is possible using Kalah. Suppose I evaluate move 1 and learn that it has a value of 3. I now evalute move 2. In doing so I want to find my opponent's best response. Suppose that the first response my opponent looks at results in a large capture of my stones and evaluates to -15 for me. Do I care what my opponent finds on his or her other choices? Not really. Once I know that my opponent has a response that will have a value below 3 (the alpha value) I know that I prefer first move rather to my second. Therefore we can prune all of my opponents other responses to my move 2 and return -15. I will reject this and will go on to evaluate my move 3. (Draw game tree showing this.) In general, if the minimizing player can select a move with value less than alpha there is no sense in looking at other choices. Thus the lower bound on what the maximizing player can achieve is used to prune moves by the minimizing player.

The same reasoning occurs when the maximizing player is looking at moves further down in the tree. The minimizing player has an upper bound beta that can be achieved by some choice other than the one that I am exploring. If I find a move that is worth more than beta I am happy, but my opponent will reject the move that lead to this position and will choose some other move. Therefore there is no sense in me continuing to evaluate moves after I have found a value greater than beta.

Note that the choice of a different move can occur not only at the parent of the current node, but also higher up the tree. It may be the great-grandfather node of the minimizing player who has a better alternative, but no matter where the choice is made once the minimizing player has found a move whose value is less than the current lower bound (alpha) there is no reason to look at further moves.

Just as we saw for regular game tree search it is easier to write code where the player is always viewing the values from his or her point of view and maximizing, and the values are negated when passed up to the caller. In this case we have an upper and lower bound, and can quit exploring moves if we do better than the upper bound. (We also can improve the lower bound each time we find a better move that we have seen before.) When passing this upper and lower bound to the next level they are negated and reversed, so the upper bound becomes the lower bound and vise versa. We can see this in the code for pickMove in ComputerKalahPlayerAB.

private KalahMove pickMove (KalahGame state, int depth, int low, int high) { KalahMove currentMove; // Hold current move and its value KalahMove bestMove; // Hold best move found and its value int playerKalah, opponentKalah; // Keep track of whose kalah is whose int [] board = state.getBoard(); int playerToMove = state.getPlayerNum(); playerKalah = state.getPlayerNum() * KalahGame.HALFPITS; opponentKalah = KalahGame.HALFPITS - playerKalah; // A dummy move that will be replaced when a real move is evaluated, // so the pit number is irrelevant. bestMove = new KalahMove(Integer.MIN_VALUE, 0); // Run through possible moves for (int pit = playerKalah + 1; bestMove.value < high && pit <= playerKalah + KalahGame.SIDEPITS; pit++) if (board[pit] > 0) // See if legal move { // Make a scratch copy of state KalahGame copy = new KalahGame(playerToMove, state.getPlayers(), board); copy.makeMove(pit); // Make the move // Find the value of this board by evaluating if game over or looking ahead if not if (copy.gameIsOver()) // Evaluate the true score, and multiply it by a huge constant so that the // program will choose a sure win over a potentially larger speculative win // and a possible loss over a sure loss. currentMove = new KalahMove(100 * copy.score(), pit); else if (playerToMove == copy.getPlayerNum()) // Did current player change? { currentMove = pickMove(copy, depth, low, high); // No, so no depth change currentMove.move = pit; // Remember move made } else if (depth > 0) // Player changed, so reduce search depth { currentMove = pickMove(copy, depth - 1, -high, -low); currentMove.value = -currentMove.value; // Good for opponent is bad for me currentMove.move = pit; // Remember move made } else // Depth exhausted, so estimate who is winning by comparing kalahs currentMove = new KalahMove(copy.getBoard()[playerKalah] - copy.getBoard()[opponentKalah], pit); if (currentMove.value > bestMove.value) { // Found a new best move? bestMove = currentMove; low = Math.max(low, bestMove.value); // Update the low value, also } } return bestMove; } Note that the low value passed from above does not affect pruning at this level. This pruning takes place in the comparison with high in the for loop. However, it does get passed on through to become the high at the next level down.

It pays to narrow down the range as soon as possible. Therefore it pays to evaluate the good moves early in the search. It can be useful to use a static evaluation function, or even a shallow lookahead, to sort the moves in decreasing order of apparent value. Some game-playing programs do this to speed up their run time and allow them to look deeper into the game tree.

There is another way to speed up search. Instead of starting the search with low as negative infinity and high as infinity you can pick a narrow range of values around the answer that you expect. (Maybe you remember the value of the last move that you made and you assume that the value of moves doesn't change that fast, so you choose low a bit below that value and high a bit above it.) If the answer you get back is strictly between the low and high values that you chose then it is the correct answer. If not, the answer may not be correct and you have to start the search over again with a wider range (perhaps falling back to negative infinity to infinity).

How Kalah is Organized - the Model/View/Controller pattern

There is a design pattern called the Model/View/Controller pattern. It suggests separating the program into three interacting parts. The first is the model - the representation of the state of the system that you are dealing with. It has all of the instance variables needed to model the state of the system and all of the methods needed to update that state and return information about that state. The second is the view - how the program presents information and interacts with the outside world. The third is the controller - the part of the program that manipulates the model and interacts with the world via the view.

For Kalah the model consists of the KalahGame class, along with the KalahPlayer class and its concrete implementations HumanKalahPlayer, ComputerKalahPlayerGTS, and ComputerKalahPlayerAB. We have two computer players because they use different forms of game tree search. We saw the two algorithms for game tree search above.

The controller is the part of the program that controls the process and tells the model to update itself as things change. In Kalah the Kalah class does this, as the Chips class did in the Chips program. In fact, the two classes have almost exactly the same structure.

The view is something new. It controls how the model is presented to the world and how the controller interacts with the world. The Kalah program has an interface KalahView and two concrete implementations: KalahViewText and KalahViewGraphical. KalahViewGraphical uses KalahFrame to actually display the board. By isolating the view part of the program we are able to switch from a text-based interface to a GUI interface by picking which constructor to call in the Kalah class.

Note the flexibility that this offers. Changing the rules of the game (or even the game itself!) can be done by modifying the model. The controller would work for almost any turn-based game, given a model that implemented the methods that it calls. We have supplied two different views and could supply others. The rest of the program does not have to worry about how the information is presented to and received from the world. (This is not quite true for the Kalah program. The model has to have two different move methods, one for views that will present the state after the move and the other for views that want to interactively display changes as we go along. But we could still have a text view that presented each change as it occurred or a graphical view that just showed the final position, so in that sense the model and the view are independent, even though they interact.)