Introduction to Monte Carlo Tree Search


The subject of game AI generally begins with so-called perfect information games. These are turn-based games where the players have no information hidden from each other and there is no element of chance in the game mechanics (such as by rolling dice or drawing cards from a shuffled deck). Tic Tac Toe, Connect 4, Checkers, Reversi, Chess, and Go are all games of this type. Because everything in this type of game is fully determined, a tree can, in theory, be constructed that contains all possible outcomes, and a value assigned corresponding to a win or a loss for one of the players. Finding the best possible play, then, is a matter of doing a search on the tree, with the method of choice at each level alternating between picking the maximum value and picking the minimum value, matching the different players' conflicting goals, as the search proceeds down the tree. This algorithm is called Minimax.
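To make that concrete, here is a minimal sketch of Minimax for a generic two-player game. The helpers is_terminal, value, legal_moves, and apply are hypothetical stand-ins for the rules of whatever game is being played:

```python
def minimax(state, maximizing):
    # At the end of the game, score the position: e.g. +1 for a win
    # by the maximizing player, -1 for a loss, 0 for a draw.
    if is_terminal(state):
        return value(state)

    # Score every child position recursively, with the two players
    # alternating between taking the maximum and the minimum.
    scores = [minimax(apply(state, move), not maximizing)
              for move in legal_moves(state)]
    return max(scores) if maximizing else min(scores)
```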


The problem with Minimax, though, is that it can take an impractical amount of time to do a full search of the game tree. This is particularly true for games with a high branching factor, or high average number of available moves per turn. This is because the basic version of Minimax needs to search all of the nodes in the tree to find the optimal solution, and the number of nodes in the tree that must be checked grows exponentially with the branching factor. There are methods of mitigating this problem, such as searching only to a limited number of moves ahead (or ply) and then using an evaluation function to estimate the value of the position, or by pruning branches to be searched if they are unlikely to be worthwhile. Many of these techniques, though, require encoding domain knowledge about the game, which may be difficult to gather or formulate. And while such methods have produced Chess programs capable of defeating grandmasters, similar success in Go has been elusive, particularly for programs playing on the full 19x19 board.
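As an illustration of the first mitigation, a depth-limited Minimax might look like the sketch below, reusing the hypothetical helpers from above and adding an assumed game-specific evaluate function:

```python
def minimax_limited(state, depth, maximizing):
    if is_terminal(state):
        return value(state)
    # At the ply limit, fall back on a heuristic estimate of the
    # position instead of searching deeper.
    if depth == 0:
        return evaluate(state)

    scores = [minimax_limited(apply(state, move), depth - 1, not maximizing)
              for move in legal_moves(state)]
    return max(scores) if maximizing else min(scores)
```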

However, there is a game AI technique that does do well for games with a high branching factor and has come to dominate the field of Go playing programs. It is easy to create a basic implementation of this algorithm that will give good results for games with a smaller branching factor, and relatively simple adaptations can build on it and improve it for games like Chess or Go. It can be configured to stop after any desired amount of time, with longer times resulting in stronger game play. Since it doesn't necessarily require game-specific knowledge, it can be used for general game playing. It may even be adaptable to games that incorporate randomness in the rules. This technique is called Monte Carlo Tree Search. In this article I will describe how MCTS works, specifically a variant called Upper Confidence bound applied to Trees (UCT), and then will show you how to build a basic implementation in Python.

Imagine, if you will, that you are faced with a row of slot machines, each with different payout probabilities and amounts. As a rational person (if you are going to play them at all), you would prefer to use a strategy that will allow you to maximize your net gain. But how can you do that? For whatever reason, there is no one nearby, so you can't watch someone else play for a while to gain information about which is the best machine. Clearly, your strategy is going to have to balance playing all of the machines to gather that information yourself, with concentrating your plays on the observed best machine. One strategy, called UCB1, does this by constructing statistical confidence intervals for each machine

\begin{equation*}
\bar{x}_i \pm \sqrt{\frac{2 \ln n}{n_i}}
\end{equation*}

where:

  • \( \bar{x}_i \): the mean payout for machine \( i \)
  • \( n_i \): the number of plays of machine \( i \)
  • \( n \): the total number of plays

Then, your strategy is to pick the machine with the highest upper bound each time. As you do so, the observed mean value for that machine will shift and its confidence interval will become narrower, but all of the other machines' intervals will widen. Eventually, one of the other machines will have an upper bound that exceeds that of your current one, and you will switch to that one. This strategy has the property that your regret, the difference between what you would have won by playing solely on the actual best slot machine and your expected winnings under the strategy that you do use, grows only as \( \mathcal{O}(\ln n) \). This is the same big-O growth rate as the theoretical best for this problem (referred to as the multi-armed bandit problem), and has the additional benefit of being easy to calculate.
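As a small worked example, here is that formula in Python, applied to three machines with made-up statistics part of the way through a session:

```python
from math import log, sqrt

# Hypothetical observed statistics: machine -> (total payout, plays).
machines = {'A': (5.0, 10), 'B': (3.0, 4), 'C': (1.0, 2)}
n = sum(plays for _, plays in machines.values())  # total plays = 16

def ucb1(total, plays, n):
    # Mean payout plus the confidence-interval exploration term.
    return total / plays + sqrt(2 * log(n) / plays)

for name, (total, plays) in sorted(machines.items()):
    print(name, round(ucb1(total, plays, n), 3))
# A 1.245
# B 1.927
# C 2.165
```

Machine B has the best observed mean (0.75 per play), but C has been played so little that its wide interval gives it the highest upper bound, so UCB1 plays C next.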

And here's how Monte Carlo comes in. In a standard Monte Carlo process, a large number of random simulations are run, in this case, from the board position that you want to find the best move for. Statistics are kept for each possible move from this starting state, and then the move with the best overall results is returned. The downside to this method, though, is that for any given turn in the simulation, there may be many possible moves, but only one or two that are good. If a random move is chosen each turn, it becomes extremely unlikely that the simulation will hit upon the best path forward. So, UCT has been proposed as an enhancement. The idea is this: any given board position can be considered a multi-armed bandit problem, if statistics are available for all of the positions that are only one move away. So instead of doing many purely random simulations, UCT works by doing many multi-phase playouts.

Selection

[Figure: the positions and moves selected by the UCB1 algorithm at each step are marked in bold. A number of playouts have already been run to accumulate the statistics shown; each circle contains the number of wins / number of times played.]

The first phase, selection, lasts while you have the statistics necessary to treat each position you reach as a multi-armed bandit problem. The move to use, then, would be chosen by the UCB1 algorithm instead of randomly, and applied to obtain the next position to be considered. Selection would then proceed until you reach a position where not all of the child positions have statistics recorded.

Expansion

[Figure: the position marked 1/1 at the bottom of the tree has no further statistics records under it, so a random move is chosen and a new record for it (bold) is added, initialized to 0/0.]

The second phase, expansion, occurs when you can no longer apply UCB1. An unvisited child position is randomly chosen, and a new record node is added to the tree of statistics.

Simulation

[Figure: once the new record is added, the Monte Carlo simulation begins, depicted with a dashed arrow. Moves in the simulation may be completely random, or may use calculations to weight the randomness in favor of moves that may be better.]

After expansion occurs, the remainder of the playout is in phase 3, simulation. This is done as a typical Monte Carlo simulation, either purely random or with some simple weighting heuristics if a light playout is desired, or by using some computationally expensive heuristics and evaluations for a heavy playout. For games with a lower branching factor, a light playout can give good results.

Back-Propagation

[Figure: after the simulation reaches an end, all of the records in the path taken are updated. Each has its play count incremented by one, and each that matches the winner has its win count incremented by one, shown by the bolded numbers.]

Finally, the fourth phase is the update or back-propagation phase. This occurs when the playout reaches the end of the game. All of the positions visited during this playout have their play count incremented, and if the player for that position won the playout, the win count is also incremented.
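Put together, a single playout can be summarized in a short sketch. The node objects and their methods here are hypothetical placeholders for the bookkeeping that the Python implementation below builds out in full:

```python
def run_playout(root):
    node, path = root, [root]

    # Selection: descend by UCB1 for as long as statistics are
    # available for every child of the current node.
    while node.all_children_have_stats():
        node = node.best_child_by_ucb1()
        path.append(node)

    # Expansion: add a statistics record for one unvisited child.
    node = node.expand_random_child()
    path.append(node)

    # Simulation: play out the rest of the game, e.g. at random.
    winning_player = node.simulate_to_end()

    # Back-propagation: update every record visited on the way down.
    for node in path:
        node.record_result(winning_player)
```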

This algorithm may be configured to stop after any desired length of time, or on some other condition. As more and more playouts are run, the tree of statistics grows in memory and the move that will finally be chosen will converge towards the actual optimal play, though that may take a very long time, depending on the game.

For more details about the mathematics of UCB1 and UCT, see Finite-time Analysis of the Multiarmed Bandit Problem and Bandit based Monte-Carlo Planning.

Now let's see some code. To separate concerns, we're going to need a Board class, whose purpose is to encapsulate the rules of a game and which will care nothing about the AI, and a MonteCarlo class, which will only care about the AI algorithm and will query into the Board object in order to obtain information about the game. Let's assume a Board class supporting this interface:
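A minimal sketch of such an interface might look like this (the exact method names are assumptions of this sketch, chosen to match how they are used later in the article):

```python
class Board(object):
    def start(self):
        # Returns a representation of the starting state of the game.
        pass

    def current_player(self, state):
        # Takes the game state and returns the current player's number.
        pass

    def next_state(self, state, play):
        # Takes the game state and the move to be applied.
        # Returns the new game state.
        pass

    def legal_plays(self, state_history):
        # Takes a sequence of game states representing the full
        # game history, and returns the full list of moves that
        # are legal plays for the current player.
        pass

    def winner(self, state_history):
        # Takes a sequence of game states representing the full
        # game history.  If the game is now won, returns the player
        # number.  If the game is still ongoing, returns zero.  If
        # the game is tied, returns a different distinct value, e.g. -1.
        pass
```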

For the purposes of this article I'm not going to flesh this part out any further, but for example code you can find one of my implementations on github. However, it is important to note that we will require that the state data structure is hashable and equivalent states hash to the same value. I personally use flat tuples as my state data structures.
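For instance, a Tic Tac Toe state could be represented as a flat tuple along these lines (the exact layout is just an illustration):

```python
# Nine cells in row-major order (0 = empty, 1 and 2 = the players),
# followed by the number of the player whose turn it is.  Tuples are
# immutable and hashable, and equal tuples always hash alike.
state = (1, 2, 0,
         0, 1, 0,
         0, 0, 2,
         2)
```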

The AI class we will be constructing will support this interface:
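Based on the methods used in the rest of the article, that interface amounts to something like the following sketch (the update method is an assumption for how observed game states are fed in):

```python
class MonteCarlo(object):
    def __init__(self, board, **kwargs):
        # Takes an instance of a Board and optionally some keyword
        # arguments.  Initializes the list of game states and the
        # statistics tables.
        pass

    def update(self, state):
        # Takes a game state and appends it to the history.
        pass

    def get_play(self):
        # Calculates the best move from the current game state and
        # returns it.
        pass

    def run_simulation(self):
        # Plays out a "random" game from the current position, then
        # updates the statistics tables with the result.
        pass
```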

Let's begin with the initialization and bookkeeping. The board object is what the AI will be using to obtain information about where the game is going and what the AI is allowed to do, so we need to store it. Additionally, we need to keep track of the state data as we get it.
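A sketch of that bookkeeping (a reconstruction, kept deliberately minimal):

```python
class MonteCarlo(object):
    def __init__(self, board, **kwargs):
        # The board object encapsulates the rules of the game.
        self.board = board
        # The authoritative history of states seen so far.
        self.states = []

    def update(self, state):
        self.states.append(state)
```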


The UCT algorithm relies on playing out multiple games from the current state, so let's add that next.
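Sketched with a simple wall-clock loop (the use of datetime for timing is an assumption of this sketch):

```python
import datetime

class MonteCarlo(object):
    def __init__(self, board, **kwargs):
        self.board = board
        self.states = []
        # How long get_play may spend calculating, as a timedelta.
        seconds = kwargs.get('time', 30)
        self.calculation_time = datetime.timedelta(seconds=seconds)

    def update(self, state):
        self.states.append(state)

    def get_play(self):
        # Run simulations for the allotted time, then (eventually)
        # pick a move based on the gathered statistics.
        begin = datetime.datetime.utcnow()
        while datetime.datetime.utcnow() - begin < self.calculation_time:
            self.run_simulation()
```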

Here we've defined a configuration option for the amount of time to spend on a calculation, and get_play will repeatedly call run_simulation until that amount of time has passed. This code won't do anything particularly useful yet, because we still haven't defined run_simulation, so let's do that now.
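A first, purely random version might look like this sketch; the UCB1-based selection is layered on further below:

```python
import random

# New option in __init__, limiting how far ahead simulations play:
#     self.max_moves = kwargs.get('max_moves', 100)

def run_simulation(self):
    # Work on a copy of the history so that the speculative moves
    # made here never touch the authoritative record in self.states.
    states_copy = self.states[:]
    state = states_copy[-1]

    for t in range(self.max_moves):
        legal = self.board.legal_plays(states_copy)

        play = random.choice(legal)
        state = self.board.next_state(state, play)
        states_copy.append(state)

        winner = self.board.winner(states_copy)
        if winner:
            break
```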


This adds the beginnings of the run_simulation method, which for now simply chooses a random move from the set of legal moves each turn until the end of the game; selecting moves using UCB1 will come later. We have also introduced a configuration option for limiting the number of moves forward that the AI will play.

You may notice at this point that we are making a copy of self.states and adding new states to it, instead of adding directly to self.states. This is because self.states is the authoritative record of what has happened so far in the game, and we don't want to mess it up with these speculative moves from the simulations.

Now we need to start keeping statistics on the game states that the AI hits during each run of run_simulation. The AI should pick the first unknown game state it reaches to add to the tables.
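Sketched out (a reconstruction; keying the tables by a (player, state) pair is an assumption that matches the back-propagation rule described earlier):

```python
# Two new tables in __init__:
#     self.wins = {}
#     self.plays = {}

def run_simulation(self):
    # A bit of an optimization here, so we have local lookups.
    plays, wins = self.plays, self.wins

    visited_states = set()
    states_copy = self.states[:]
    state = states_copy[-1]
    player = self.board.current_player(state)

    expand = True
    for t in range(self.max_moves):
        legal = self.board.legal_plays(states_copy)

        play = random.choice(legal)
        state = self.board.next_state(state, play)
        states_copy.append(state)

        # Expansion: record the first state we have no entry for.
        if expand and (player, state) not in plays:
            expand = False
            plays[(player, state)] = 0
            wins[(player, state)] = 0

        visited_states.add((player, state))

        player = self.board.current_player(state)
        winner = self.board.winner(states_copy)
        if winner:
            break

    # Back-propagation: update the tracked states in this playout.
    for player, state in visited_states:
        if (player, state) not in plays:
            continue
        plays[(player, state)] += 1
        if player == winner:
            wins[(player, state)] += 1
```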

Here we've added two dictionaries to the AI, wins and plays, which will contain the counts for every game state that is being tracked. The run_simulation method now checks to see if the current state is the first new one it has encountered this call, and, if so, adds the state to both plays and wins, setting both values to zero. This method also adds every game state that it goes through to a set, and at the end updates plays and wins with those states in the set that are in the plays and wins dicts. We are now ready to base the AI's final decision on these statistics.
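A sketch of the finished get_play (the print formatting is illustrative):

```python
def get_play(self):
    self.max_depth = 0
    state = self.states[-1]
    player = self.board.current_player(state)
    legal = self.board.legal_plays(self.states[:])

    # Bail out early if there is no real choice to be made.
    if not legal:
        return
    if len(legal) == 1:
        return legal[0]

    games = 0
    begin = datetime.datetime.utcnow()
    while datetime.datetime.utcnow() - begin < self.calculation_time:
        self.run_simulation()
        games += 1

    moves_states = [(p, self.board.next_state(state, p)) for p in legal]

    # Display the number of simulations run and the time elapsed.
    print(games, datetime.datetime.utcnow() - begin)

    # Pick the move with the highest percentage of wins.
    percent_wins, move = max(
        (self.wins.get((player, S), 0) /
         self.plays.get((player, S), 1),
         p)
        for p, S in moves_states)

    # Display the stats for each possible play.
    for x in sorted(
        ((100 * self.wins.get((player, S), 0) /
          self.plays.get((player, S), 1),
          self.wins.get((player, S), 0),
          self.plays.get((player, S), 0),
          p)
         for p, S in moves_states),
        reverse=True
    ):
        print('{3}: {0:.2f}% ({1} / {2})'.format(*x))

    print('Maximum depth searched:', self.max_depth)

    return move
```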

We have added three things in this step. First, we allow get_play to return early if there are no choices or only one choice to make. Next, we've added output of some debugging information, including the statistics for the possible moves this turn and an attribute that will keep track of the maximum depth searched in the selection phase of the playouts. Finally, we've added code that picks out the move with the highest win percentage out of the possible moves, and returns it.

But we are not quite finished yet. Currently, our AI is using pure randomness for its playouts. We need to implement UCB1 for positions where the legal plays are all in the stats tables, so the next trial play is based on that information.
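Here is run_simulation with that selection logic sketched in (self.C is assumed to be set in __init__, e.g. self.C = kwargs.get('C', 1.4)):

```python
from math import log, sqrt

def run_simulation(self):
    plays, wins = self.plays, self.wins

    visited_states = set()
    states_copy = self.states[:]
    state = states_copy[-1]
    player = self.board.current_player(state)

    expand = True
    for t in range(1, self.max_moves + 1):
        legal = self.board.legal_plays(states_copy)
        moves_states = [(p, self.board.next_state(state, p))
                        for p in legal]

        if all(plays.get((player, S)) for p, S in moves_states):
            # Selection: we have stats on all of the legal moves,
            # so pick the one with the highest UCB1 score.
            log_total = log(
                sum(plays[(player, S)] for p, S in moves_states))
            value, play, state = max(
                (wins[(player, S)] / plays[(player, S)] +
                 self.C * sqrt(log_total / plays[(player, S)]),
                 p, S)
                for p, S in moves_states)
        else:
            # Otherwise, just make an arbitrary (random) decision.
            play, state = random.choice(moves_states)

        states_copy.append(state)

        # Expansion: record the first new state, and note its depth.
        if expand and (player, state) not in plays:
            expand = False
            plays[(player, state)] = 0
            wins[(player, state)] = 0
            if t > self.max_depth:
                self.max_depth = t

        visited_states.add((player, state))

        player = self.board.current_player(state)
        winner = self.board.winner(states_copy)
        if winner:
            break

    # Back-propagation.
    for player, state in visited_states:
        if (player, state) not in plays:
            continue
        plays[(player, state)] += 1
        if player == winner:
            wins[(player, state)] += 1
```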


The main addition here is the check to see if all of the results of the legal moves are in the plays dictionary. If they aren't available, it defaults to the original random choice. But if the statistics are all available, the move with the highest value according to the confidence interval formula is chosen. This formula adds together two parts. The first part is just the win ratio, but the second part is a term that grows slowly as a particular move remains neglected. Eventually, if a node with a poor win rate is neglected long enough, it will begin to be chosen again. This term can be tweaked using the configuration parameter C added to __init__ above. Larger values of C will encourage more exploration of the possibilities, and smaller values will cause the AI to prefer concentrating on known good moves. Also note that the self.max_depth attribute from the previous code block is now updated when a new node is added and its depth exceeds the previous self.max_depth.

So there we have it. If there are no mistakes, you should now have an AI that will make reasonable decisions for a variety of board games. I've left a suitable implementation of Board as an exercise for the reader, but one thing I've left out here is a way of actually allowing a user to play against the AI. A toy framework for this can be found at jbradberry/boardgame-socketserver and jbradberry/boardgame-socketplayer.


This version that we've just built uses light playouts. Next time, we'll explore improving our AI by using heavy playouts, by training some evaluation functions using machine learning techniques and hooking in the results.

UPDATE: The diagrams have been corrected to more accurately reflect the possible node values.

UPDATE: Achievement Unlocked! This article has been republished in Hacker Monthly, Issue 66, November 2015.