Introduction
Computer chess captured the world's attention a few years ago when IBM's
Deep Blue defeated World Chess Champion Garry Kasparov in a match.
Suddenly, what had been unthinkable was reality. Man had built a computer
capable of doing something many believe requires real intellect. But had
man really built an artificially intelligent machine capable of reasoning
and deduction, two of the skills “required” in chess?
The answer was no. Deep Blue, by its own developers’ admission,
is not intelligent. IBM took advantage of massively parallel, special-purpose
processing and powerful processors to generate and evaluate hundreds of
millions of positions per second.
However, the success of Deep Blue raises the bar for intelligent chess
applications. Despite Deep Blue’s superior power and amazing look-ahead
ability, winning was not easy. Obviously, the human opponent still
has something the computer does not. Can that something be simulated, or
simply overpowered, by a computer? This paper explains the current
state of the art in computer chess, addresses some of the questions related
to computer chess applications, and speculates on the future of intelligent
chess applications.
The History of Computer Chess
The idea of using computers to play games is certainly nothing new.
An apocryphal story claims that one of the first computer programs ever
written was a tic-tac-toe player. Whether or not that is true,
programming a computer to play simple games against human opponents is
a trivial task. Only a few such games hold any appeal from an artificial
intelligence standpoint. Games of chance such as roulette or backgammon
have proven to be popular programming targets, but the element of “luck”
makes them less suitable for long-term computer strategy programming.
In those games, winning is as much a matter of chance as of skill.
It is the skill, however, that we want to model, emulate, and eventually
defeat.
Other games, like checkers and tic-tac-toe, have been popular
programming targets as well. Today an average high school programmer
could probably build a tic-tac-toe or checkers program capable of winning
a large percentage of its games against a human. These are purely
tactical games with a limited number of possible moves.
Computer chess and the Asian game of Go are the games that have long
interested computer scientists. Chess and Go seem to be ideal problems
for computers to address. There are clearly defined goals, and it
is easy to determine whether an approach is good or bad, based upon established
world rankings. Go, however, presents an enormous search space and
has proven very difficult to program a computer to play well.
Chess, on the other hand, is not complicated from a mathematical perspective.
There are only 64 squares and a limited number of legal moves per piece.
Because a computer can calculate a large number of moves quickly, early
programmers saw the idea of building a chess program capable of beating
a human as very likely to succeed. In fact, as early as the 1950s,
computer scientists were trying to build chess players capable of
beating human beings.
Alan Turing, the famed computer pioneer, wrote what is believed to
be the first computer chess player in 1951. Turing’s “computer”,
which was nothing more than an algorithm he ran through on paper,
was a horrible player, incapable of beating even the rankest amateur.
Nevertheless, Turing proved that a computer could be “taught” to play chess.
Seven years later, for the first time ever recorded, a computer defeated
a human at chess. The opponent, however, was a secretary playing
chess for the very first time; she had been taught the game an hour earlier.
While this may seem highly unremarkable, it showed that it was
possible to program a computer to beat a human being playing at the
best of her ability.
By the end of the next decade, programmers had developed chess applications
that played at the level of a high school chess player, a far cry from
the heady predictions of the 1950s, when scientists were placing bets as
to when they would be able to build world champion chess players.
David Levy, an International Master chess player, made such a bet with
one of the fathers of the artificial intelligence community, John McCarthy.
McCarthy bet Levy a sum of 1000 pounds that by 1978 there would be a computer
capable of beating Levy. In one of the first major matches between
a computer and a chess Master, in 1978, the CHESS 4.7 program was defeated
by Levy 3-1-1 (three wins for Levy, one win for CHESS 4.7, and one draw).
Nevertheless, the match between CHESS 4.7 and Levy showed how far programmers
had come in a decade. In the late 1960s, a computer was playing at
the high school level. In the late 1970s, a computer had won a game
against a chess Master. Just over ten years later, the Deep Thought
chess computer, whose development team was later hired by IBM, easily
defeated Levy. Deep Thought ranked 100th in the world in 1988 among
ranked chess players.
Finally, in the 1990s, Deep Blue was able to defeat Kasparov, securing
its place as the best computer chess player ever.
Why Study Chess?
To play chess well, a computer must be able to recognize patterns and
crunch numbers at high speed. However, that alone is not enough
to make a chess program a Grandmaster. The best chess players
know how to make difficult decisions and know when to gamble. If
we can develop a chess-playing computer with these qualities, then we can
rely on machines for more crucial decision-making.
As was noted in the introduction, chess is a model problem for computer
programming. It is obvious that we can only go so far with a brute
force approach, like an exhaustive search. The great chess players
have developed “intelligent” techniques that help them make decisions about
good and bad moves. The goal is to teach computers to do the same
things.
Chess requires pattern matching, knowledge representation, heuristics,
positional estimates, and problem solving, all within defined time constraints.
Can you name a computer program that doesn’t use one of these techniques?
IBM had this rationale behind the millions of dollars it invested
in Deep Blue. According to its literature, “Massively parallel,
special purpose computing like that found in Deep Blue could certainly
be of great use to people if applied to finance, medicine, education, etc.
Imagine an evaluative capability like Deep Blue's, which could help an
investor manage a portfolio, a huge retailer manage inventory, or a government
deploy resources. These types of things justify the spending behind Deep
Blue. Chess and Kasparov are merely ways of benchmarking progress.
“The lessons learned from this research project may be used to further
develop IBM's parallel processing knowledge in applying customized versions
of the Deep Blue technology to everyday business problems. Among these
are highly complex, difficult problems such as molecular dynamic simulations
employed in the pharmaceutical industry, data mining, for example, in financial
markets, and traffic and cargo scheduling at the world's largest international
airports. By conquering the classical chess paradigm through parallel computing
systems, IBM will use the knowledge and technology that they have gained
to solve these and other intricate problems well into the next century.”
Of course, the cynic will claim that IBM’s real reason for building Deep
Blue and playing chess with it was to show how powerful its computers can
be on special-purpose applications, in the hope of selling more computers.
Perhaps the biggest question we want to be able to answer is, “Can
computers think?”
During Turing’s time, it was believed that playing Master-level chess
required some kind of highly intelligent thought. For that reason,
a lot of attention was focused on trying to build a computer that simulated
reasoning, derivation, and pattern recall. Many researchers now take the
view, however, that a computer is “thinking” if it is capable of playing
good chess, whether that thought comes from artificial intelligence or
from state-space searching.
Why Is Playing Chess So Difficult?
Computers can calculate possible moves with blinding speed. A
computer can store thousands of possible move combinations in memory at
a time. So why did it take decades for a computer to beat a
World Champion?
The fact is, chess is one of those games that is easy to learn
but difficult to master. It is true that there are only a finite
number of outcomes for the computer to analyze, but it is not just board
analysis that makes a good chess player. After all, Kasparov can
likely only analyze a handful of moves ahead, yet he has been one
of the world’s best players.
It is easy for a computer to calculate whether it has a material advantage.
That is pure number crunching. Likewise, positional strength is easy
to calculate. However, which is more important? Do you sacrifice
a piece to gain a positional advantage? Computer players must be
able to make these decisions. Likewise, when playing against a human,
the computer finds that the player does not always do what the computer
expects. Sometimes a player will make a gamble to try to
gain position. The computer must be careful in those situations not
to be “suckered in” by a player’s gambit. The question is whether
a computer can ever understand the subtleties of a game like chess.
How Computers Play Chess
The Basics
Developing a computer that can “play” chess is simple. Once the
rules of chess are encoded, the computer can just make random legal moves.
Against a beginner, the computer might even have modest success.
Against a chess Master, the computer would be toast in seconds.
McCarthy actually played such a machine himself; it lost in an average
of just five moves, and McCarthy was only an average chess player.
So obviously random playing is not going to get us very far.
Enter the ubiquitous state space search, the favored tool of most board
game programmers. In theory, a computer programmed to play chess
by the legal rules should always be capable of winning a game. All
it has to do is calculate each of its next possible moves and all the possible
consequences of those moves until it finds a sequence that results in a
won end game.
The question now, however, is how deep we go with our search.
Obviously, the further ahead we can look, the better our chances of finding
the best move. The problem is, the total number of possible move sequences
to be explored is approximately 10^44. (For reference purposes,
the number of particles in the universe is estimated to be 10^80.)
No computer will likely ever be capable of evaluating that many moves.
Obviously, we must limit the size of our search space. Even at a
depth of about 10 plies (five moves by each player), we need to study
quadrillions of positions on average. The “branching factor”, or number
of legal moves in a typical position, is about 35. In order to play master-level
chess, a search of depth eight appears necessary, which would involve a
tree of 35^8, or roughly 2 x 10^12, leaf nodes.
In 1950, Claude Shannon developed the basic tree search strategy, known
as the “Shannon Type A strategy”. The computer considers all the
possible moves up to a certain depth, which allows the machine to return
a response in a reasonable amount of time. A human chess player uses
this same strategy. He looks ahead a number of moves and makes the
best move he knows how. Human chess players like Kasparov have built
upon centuries of research and literature to develop game playing strategies,
opening gambits and mid-game scenarios. The computer programmer needs
to find a way to make his computer develop a game-playing strategy.
Virtually all competitive chess programs use this same underlying depth-first
alpha-beta search algorithm. What varies from program to program is the
length (or “depth”, to keep the layer analogy) of search assigned to
each of these stages. Ultimately the stage length is not fixed, but varies
by small amounts depending on the current sequence of moves being examined.
For example, a search path may be locally lengthened because one side has
attacked the King, leaving the opponent with only a few alternatives to
consider. There are so many options here that even programs using the same
basic model can achieve a radically different style and speed of play.
The Shannon Type B strategy instead chooses a representative sample
of moves to explore at a given level and searches only those. Type B
strategies have typically suited microcomputers rather than high-speed
machines: the chance of missing the best move increases under Type B,
although the computer can search deeper. Against Grandmasters,
missing the “perfect” move can mean defeat. Against other chess
players, the computer can afford to sacrifice the perfect move for a “good
enough” move that it examined further ahead.
Improving The Search
While the human method of analyzing alternatives seems to involve selecting
a few promising lines of play and exploring them, computers are necessarily
exhaustive rather than selective, so refinement techniques have been developed.
First I will provide an overview of the techniques, then look at them
in slightly more depth.
In a technique called "iterative deepening", instead of embarking on
a single search of a certain ply (which might not be completed in the given
time) the computer performs a series of increasingly deeper searches (N-ply,
then N+1, then N+2, etc.) until the allotted time runs out. Thus, it is
able to produce the best move that the time constraint allows--a computer-chess
situation that has many parallels in real-time applications. The computer
can combine iterative deepening with various memory functions, particularly
refutation and transposition tables, to reorder moves, so that at the next
iteration its selected "principal variation" (best sequence of moves found
during the previous iteration) is explored first.
Another move-reordering technique is to keep a short list of "killer"
moves, which are tried first. Killer moves are those that have successfully
"cut off" or pruned the search elsewhere. Often these killer moves are
captures, so a simplification involves considering capture moves before
all others. This technique is nicely generalized in the "history heuristic
table" that many programs use. In its most elementary form, a history table
has 64x64 entries, each containing a value that measures the frequency
with which the corresponding possible move has recently pruned the search.
Move-reordering mechanisms enhance the efficiency of the depth-first
alpha-beta search algorithm. Three other improvements--Pearl's Scout algorithm
and the related NegaScout and Principal Variation Search (PVS) methods--share
a common theme: once a principal variation has been found it is sufficient
to show that each alternative is inferior. Any that is not inferior must
be re-searched, since it now constitutes the preferred path. Another technique
for curtailing the search is called aspiration alpha-beta search. In this
approach the value of the tree from the current position is estimated and
a narrow search window (customarily plus and minus the value of half a
pawn around that estimate) is used. Aspiration searching is a popular and
better-understood alternative to the Principal Variation Search method,
although not as efficient.
It is difficult to be precise about the advantages that more searching
provides. The size of the chess tree for any position is highly variable.
In many endgames there are only about 8 moves for each side, while in complex
middle game positions each side might have close to 80 moves. With today's
technology, programs exhaustively search 7 to 10 plies in the middle game,
while at least one programmer claims to extend searches selectively to
40 ply! Selective extensions are based on heuristics devised by individual
programmers to explore the sphere of influence associated with a key move:
to examine the moves that might defend against a mate threat, or that might
provide a counter attack and thus indirectly avoid some imminent loss.
Selective extensions are not to be confused with singular extensions. The
latter technique re-examines any move that looks singularly good relative
to the others. The search depth is increased to determine whether the singular
move remains best. In some sense, this is a way of extending the principal
variation in the small. It is a potentially costly but interesting method.
More popular and more widely used is the null move heuristic, where
one side provisionally makes two successive moves. If the value of the
position remains poor even with the benefit of two moves in a row, then
the line of play is abandoned. This is one way to identify situations where
an inevitable loss is otherwise being pushed out of sight beyond the search
horizon. While many forward pruning methods fail too often to be useful,
null move forward pruning is usually beneficial.
Mini-max and Nega-max
Finding the best move for some position on the chessboard means searching
through a tree of positions. At the root of the tree we search for the
best successor position for the player to move, at the next level we search
for the best successor position from the standpoint of the opponent, and
so on. Chess tree search is an alternation between maximizing and minimizing
the value of the positions in the tree; this is often abbreviated to minimaxing.
To remove the distinction between own and opponent position, the value
of a position is always evaluated from the standpoint of the player to
move, i.e. by negating the value as seen by the opponent; this is called
negamaxing.
The number of positions that have to be searched by this algorithm is
W^D, where W is the width of the tree (the average number of moves possible
in each position) and D is the depth of the tree (^ indicates exponentiation).
This is extremely inefficient and would prevent even a supercomputer
from reaching great depths.
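The negamax idea is compact enough to sketch in code. The following C++
fragment is a minimal illustration only: Position, Move, generateMoves,
makeMove and evaluate are hypothetical stand-ins for a real engine's board
representation, not any particular program's API.

    #include <vector>

    struct Move { int from, to; };              // hypothetical move encoding
    struct Position { /* board state */ };      // hypothetical board type

    // Stub helpers standing in for a real move generator and evaluator.
    std::vector<Move> generateMoves(const Position&) { return {}; }
    Position makeMove(Position p, const Move&) { return p; }
    int evaluate(const Position&) { return 0; } // score for the side to move

    // Plain negamax: a position's value is the negation of the opponent's
    // best value one ply deeper, so one routine serves both players.
    int negamax(const Position& pos, int depth) {
        if (depth == 0) return evaluate(pos);
        int best = -1000000;                    // stands in for minus infinity
        for (const Move& m : generateMoves(pos)) {
            int score = -negamax(makeMove(pos, m), depth - 1);
            if (score > best) best = score;
        }
        return best;
    }

Every legal move is searched to the full depth, which is exactly the W^D
cost described above.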
Alpha-Beta Pruning
Alpha-Beta search is the first refinement for reducing the number of
positions that has to be searched and thus making greater depths possible
in the same amount of time. The idea is that in large parts of the tree
we are not interested in the exact value of a position, but are just interested
if it is better or worse than what we have found before. Only the value
of the position along the principal variation has to be determined exactly
(the principle variation is the alternation of best own moves and best
opponent moves from the root to the depth of the tree).
The Alpha-Beta search procedure gets two additional arguments that
indicate the bounds between which we are interested in exact values for
a position.
This idea relies on the basic premise that once a strong reply has
been found by your opponent in a given situation, the computer may stop
considering ways of improving this situation and focus on moves that avoid
such a position altogether.
The gain from Alpha-Beta comes from exiting the move loop early; a value
of best that equals or exceeds beta is called a cutoff. These
cutoffs are completely safe because they mean that this branch of the tree
is worse than the principal variation. The largest gain is reached when
at each level of the tree the best successor position is searched first,
because this position will either be part of the principal variation (which
we want to establish as early as possible) or it will cause a cutoff as
early as possible.
Under optimal circumstances Alpha-Beta still has to search W^((D+1)/2)
+ W^(D/2) - 1 positions. This is much less than Mini-max, but still exponential.
It allows us to reach about twice the depth in the same amount of time.
More positions will have to be searched if move ordering is not perfect.
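In code the change from plain negamax is small. A minimal sketch, reusing
the hypothetical Position, Move and helper stubs from the negamax fragment
above: the two bounds are passed down (negated and swapped for the opponent),
and the move loop exits as soon as best reaches beta.

    // Negamax with alpha-beta pruning. alpha is the best score already
    // guaranteed to the side to move elsewhere in the tree; beta is the
    // bound beyond which the opponent would never allow this position.
    int alphaBeta(const Position& pos, int depth, int alpha, int beta) {
        if (depth == 0) return evaluate(pos);
        int best = -1000000;
        for (const Move& m : generateMoves(pos)) {
            int score = -alphaBeta(makeMove(pos, m), depth - 1, -beta, -alpha);
            if (score > best) best = score;
            if (best > alpha) alpha = best;
            if (best >= beta) break;   // cutoff: opponent will avoid this line
        }
        return best;
    }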
Aspiration Search
Aspiration search is a small improvement on Alpha-Beta search. Normally
the top level call would be Alpha-Beta(pos, depth, -INFINITY, +INFINITY).
Aspiration search changes this to Alpha-Beta(pos, depth, value-window,
value+window), where value is an estimate for the expected result and window
is a measure for the deviations we expect from this value.
Aspiration search will search fewer positions because it uses alpha/beta
limits already at the root of the tree. The danger is that the search result
will fall outside the aspiration window, in which case a re-search has
to be done. A good choice of the window variable will still give an average
net gain.
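As a sketch (again assuming the alphaBeta routine above, with scores in
centipawns so that half a pawn is 50; both are assumptions):

    const int INF = 1000000;
    const int WINDOW = 50;   // half a pawn, in centipawns

    // Aspiration search: try a narrow window around an estimate first;
    // if the true score falls outside it, re-search with full bounds.
    int aspiration(const Position& root, int depth, int estimate) {
        int score = alphaBeta(root, depth, estimate - WINDOW, estimate + WINDOW);
        if (score <= estimate - WINDOW || score >= estimate + WINDOW)
            score = alphaBeta(root, depth, -INF, +INF);   // costly re-search
        return score;
    }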
Transposition Table
A transposition table serves as a cache memory and is used to store
information about positions that have been visited before, usually during
an earlier part of an iterative deepening search. It is so called because
it can be used to recognize transpositions in the order of moves. Stored
in the entry associated with a position are important items like the "value"
of the position, the best move from there, and the length of the previous
search.
"Value" is computed by applying an evaluation function at the terminal
nodes of the tree (the nodes on the horizon where the search is stopping).
This evaluation function often includes a quiescent search to help resolve
existing capture sequences and other uncertainties in the position, such
as pending pawn promotions.
Transposition tables are also invaluable as a means of extending search
in the endgame, where only a few new moves emerge at each node, the others
leading through transposition to positions that have been seen before.
These tables do not increase program complexity, since the total
space allocated to them is simply a matter of cost. Each transposition-table
entry requires about 10 bytes of memory, and most programs have tables
in the range from 32 thousand to 1 million entries, though in 1993 one
supercomputer program boasted a table with 1,000 million entries! This
wide range simply reflects the memory available to the programmer.
The transposition table is a hashing scheme to detect positions in different
branches of the search tree that are identical. If a search arrives at
a position that has been reached before and if the value obtained can be
used, the position does not have to be searched again. If the value cannot
be used, it is still possible to use the best move that was used previously
at that position to improve the move ordering.
A transposition table is a safe optimization that can save much time.
The only danger is that mistakes can be made with respect to draws by
repetition, because two identical positions will not necessarily share
the same move history.
A transposition table can save up to a factor of 4 on tree size, and thus
on search time. Because of the exponential nature of tree growth, this
means that perhaps one level deeper can be searched in the same amount of
time.
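A minimal sketch of such a table follows. The entry layout and always-replace
policy are illustrative assumptions; real programs differ in exactly these
details.

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    enum Bound : std::uint8_t { EXACT, LOWER, UPPER };  // meaning of "value"

    struct TTEntry {
        std::uint64_t key;    // Zobrist hash identifying the position
        std::int16_t  value;  // score from the side to move
        std::uint8_t  depth;  // depth of the search that produced it
        Bound         bound;  // exact score, or lower/upper bound
        std::uint16_t move;   // best move found, reused for move ordering
    };

    struct TranspositionTable {
        std::vector<TTEntry> slots;
        explicit TranspositionTable(std::size_t n) : slots(n) {}

        TTEntry* probe(std::uint64_t key) {
            TTEntry& e = slots[key % slots.size()];
            return e.key == key ? &e : nullptr;  // hit only on exact key match
        }
        void store(const TTEntry& e) {
            slots[e.key % slots.size()] = e;     // always-replace scheme
        }
    };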
Enhanced Transposition Cutoff
Move ordering is important in tree search because it increases the
chance of getting a cutoff on the first successor position searched. This
is not always optimal; there may be several successors causing a cutoff
and we want to use the one with the smallest search tree. One idea that
has been tried is to look at all successor positions and see if they are
in the transposition table and cause a cutoff. If one such position is
found, no further search has to be done. This can save about 20-25% in
total tree size.
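A sketch of the idea, building on the hypothetical alphaBeta and
TranspositionTable fragments above (hashOf is an assumed Zobrist-key helper):
every successor is probed before any is searched, and a stored bound that
is deep enough and refutes the node ends it immediately.

    std::uint64_t hashOf(const Position&) { return 0; }  // stub Zobrist key

    // Enhanced transposition cutoff: a successor whose stored score,
    // negated to our point of view, already reaches beta proves a cutoff
    // without growing any subtree at all.
    bool etcCutoff(const Position& pos, int depth, int beta,
                   TranspositionTable& tt) {
        for (const Move& m : generateMoves(pos)) {
            TTEntry* e = tt.probe(hashOf(makeMove(pos, m)));
            if (e && e->depth >= depth - 1 &&
                e->bound != LOWER &&     // value is an upper bound there,
                -e->value >= beta)       // so a lower bound >= beta here
                return true;
        }
        return false;
    }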
Iterative Deepening
Iterative deepening means repeatedly calling a fixed depth search routine
with increasing depth until a time limit is exceeded or maximum search
depth has been reached. The advantage of doing this is that you do not
have to choose a search depth in advance; you can always use the result
of the last completed search. Also because many position evaluations and
best moves are stored in the transposition table, the deeper search trees
can have a much better move ordering than when starting immediately searching
at a deep level. In addition, the values returned from each search could
be used to adjust the aspiration search window of the next search, if this
technique is used.
Additional criteria for modifying the depth of search in either strategy
A or B can be used. Generalizations of the concept of quiescence, as stability
in the score of the evaluation function or aggressiveness of the position,
may extend the search further.
Tournament chess is played under a strict time control, and a program
must make decisions about how much time to use for each move. Most chess
programs do not set out to search to a fixed depth, but use a technique
called iterative deepening. This means a program does a depth two search,
then a depth three search, then a depth four search, and so on until the
allotted time has run out. When the time is up, the program returns its
current best guess at the move to make.
Iterative deepening has the additional advantage that it facilitates
move ordering. The program knows which move was best at the previous level
of iterative deepening, and it searches this principal variation first
at each new level. The extra time spent searching early levels is more
than repaid by the gain due to accurate move ordering.
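The driving loop can be sketched as follows (rootSearch is a hypothetical
full-width search of the root returning its best move; the time check is
deliberately simplistic):

    #include <chrono>

    Move rootSearch(const Position&, int /*depth*/) { return Move{}; } // stub

    Move iterativeDeepening(const Position& pos, double seconds) {
        auto start = std::chrono::steady_clock::now();
        Move best{};                        // best move found so far
        for (int depth = 2; depth <= 64; ++depth) {
            best = rootSearch(pos, depth);  // deeper passes reuse stored info
            std::chrono::duration<double> used =
                std::chrono::steady_clock::now() - start;
            if (used.count() > seconds / 2) // assume the next pass costs as
                break;                      // much as everything so far
        }
        return best;                        // from the last completed pass
    }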
Selective Extensions
In the full width part of the tree, search depth can be increased in
forced variations. Different criteria can be used to decide if a variation
is forced; examples are check evasions, capturing a piece that has just
captured another piece, promotion threats. The danger if used carelessly
is an explosion in tree size.
Of course, it only makes sense to apply a static evaluation function
to a position that is quiescent, or tactically quiet. Consequently, the
tree is extended beyond leaf nodes until a quiescent position is reached,
where the static evaluator is actually applied.
We can think of the quiescence search as a dynamic evaluation function,
which takes into account tactical possibilities. At each leaf node, the
side to move has the choice of accepting the current static evaluation
or of trying to improve its position by tactics. Tactical moves that can
be tried include pawn promotions, most capture moves, some checks, and
some pawn promotion threats. At each newly generated position, the dynamic
evaluator is applied again. At the nominal leaf nodes, therefore, a narrow
(small branching factor) tactical search is done, with the static evaluator
applied at all terminal points of this search (which end up being the true
leaves).
Instead of calling Evaluate when the depth reaches zero, it is customary
to call a quiescence search routine. Its purpose is to prevent horizon effects,
where a bad move hides an even worse threat because the threat is pushed
beyond the search horizon. This is done by making sure that evaluations
are done at stable positions, i.e. positions where there are no direct
threats (e.g. hanging pieces, checkmate, promotion). A quiescence search
does not consider all possible moves, but restricts itself to, for example,
captures, checks, check evasions, and promotion threats. The art is to
restrict the quiescence search in such a way that it does not add too much
to the search time. Major debates are possible about whether it is better
to have one more level in the full-width search tree at the risk of
overlooking deeper threats in the quiescence search.
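A bare-bones quiescence routine might look like this (captures only;
generateCaptures is a hypothetical capture-only generator, and the other
helpers are the stubs introduced earlier):

    std::vector<Move> generateCaptures(const Position&) { return {}; } // stub

    // "Stand pat" on the static score, then try only tactical moves
    // until the position is quiet enough to be evaluated safely.
    int quiesce(const Position& pos, int alpha, int beta) {
        int standPat = evaluate(pos);     // accept the static evaluation...
        if (standPat >= beta) return standPat;
        if (standPat > alpha) alpha = standPat;
        for (const Move& m : generateCaptures(pos)) {  // ...or improve on it
            int score = -quiesce(makeMove(pos, m), -beta, -alpha);
            if (score >= beta) return score;           // cutoff
            if (score > alpha) alpha = score;
        }
        return alpha;
    }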
Principal Variation Search
Principal variation search is a variation of alpha-beta search where
all nodes outside the principal variation are searched with a minimal window
beta = alpha + 1. The idea is that with perfect move ordering, all moves
outside the principal variation will be worse than the principal variation;
this can be proven by the minimal window search failing low. If the move
ordering is imperfect, fail high may be encountered and in such a case,
a re-search has to be done with the full alpha-beta window. The expectation
is that the gain of the minimal window search is higher than the loss of
these occasional re-searches.
A further refinement of this is known as NegaScout.
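The principal variation search just described can be sketched as follows,
using the same hypothetical helpers as before; NegaScout differs mainly in
bookkeeping details. Note how only a fail high inside the window triggers
the expensive re-search.

    // Principal variation search: full window for the first move,
    // minimal window (alpha, alpha + 1) for every move after it.
    int pvs(const Position& pos, int depth, int alpha, int beta) {
        if (depth == 0) return evaluate(pos);
        bool first = true;
        for (const Move& m : generateMoves(pos)) {
            Position next = makeMove(pos, m);
            int score;
            if (first) {
                score = -pvs(next, depth - 1, -beta, -alpha);
                first = false;
            } else {
                score = -pvs(next, depth - 1, -alpha - 1, -alpha); // null window
                if (score > alpha && score < beta)                 // fail high:
                    score = -pvs(next, depth - 1, -beta, -alpha);  // re-search
            }
            if (score > alpha) alpha = score;
            if (alpha >= beta) break;                              // cutoff
        }
        return alpha;
    }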
Memory Enhanced Test
Memory enhanced test is a family of search algorithms that have in
common that at the top level an alpha-beta search is done with a minimal
window beta = alpha + 1. Differences can be found in the sequence of alpha-beta
values that is tried. Because the top-level search is called repeatedly
with different alpha-beta parameters and the same depth, it is important
to have a large transposition table in order to re-use partial search results
from previous searches.
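MTD(f) is the best-known member of this family. A minimal sketch, assuming
the alphaBeta fragment above, backed by a large transposition table as the
text recommends:

    // MTD(f): converge on the true minimax value using nothing but
    // minimal-window tests. firstGuess seeds the process, e.g. with
    // the value returned by the previous iterative-deepening pass.
    int mtdf(const Position& root, int depth, int firstGuess) {
        int g = firstGuess;
        int lower = -1000000, upper = 1000000;
        while (lower < upper) {
            int beta = (g == lower) ? g + 1 : g;
            g = alphaBeta(root, depth, beta - 1, beta); // null-window test
            if (g < beta) upper = g;                    // test failed low
            else          lower = g;                    // test failed high
        }
        return g;
    }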
Killer Heuristic
The killer heuristic is used to improve the move ordering. The idea
is that a good move in one branch of the tree is also good at another branch
at the same depth. For this purpose at each ply, we maintain one or two
killer moves that are searched before other moves are searched. A successful
cutoff by a non-killer move overwrites one of the killer moves for that
ply.
If a move has previously been found to be very good (or very bad), the
current evaluation will most likely still find it to be very good (or very
bad). In this case, the killer moves are tried first by the
alpha-beta search, with an expected reduction in the number of positions
to be considered.
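The bookkeeping is tiny, as this sketch shows (two killer slots per ply
is an arbitrary but common choice):

    const int MAX_PLY = 64;
    Move killers[MAX_PLY][2] = {};   // two killer slots per ply

    // Record a quiet move that just caused a cutoff at this ply.
    void storeKiller(int ply, const Move& m) {
        if (killers[ply][0].from != m.from || killers[ply][0].to != m.to) {
            killers[ply][1] = killers[ply][0];   // demote the older killer
            killers[ply][0] = m;                 // newest killer goes first
        }
    }
    // During move ordering, any move matching killers[ply][0] or
    // killers[ply][1] is tried before the remaining quiet moves.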
History Heuristic
The history heuristic is another improvement method for the move ordering.
In a table indexed by from and to square, statistics are maintained of
good cutoff moves. This table is used in the move-ordering sort (together
with other information such as capture gains/losses).
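In sketch form (the depth-squared bonus is one common convention, not the
only one):

    // History heuristic: 64x64 counters indexed by from- and to-square.
    int history[64][64] = {};

    // Credit a move that caused a cutoff; deeper cutoffs count for more.
    void updateHistory(const Move& m, int depth) {
        history[m.from][m.to] += depth * depth;
    }
    // Move ordering then sorts quiet moves by history[m.from][m.to],
    // typically after the hash move, good captures, and killer moves.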
Null Move Heuristic
The null move heuristic is a method of skipping searches in parts of
the tree where the position is good enough. This is tested by doing a null
move (i.e. passing, doing no move at all) and then searching with reduced
depth. If the result of this is higher than beta, no further search is
done; if the result is lower than beta, we do a normal search.
The null move heuristic has big dangers because it can fail to detect
deep combinations that could be either really good or really bad. On the
other hand, it can save a lot of time by skipping large parts of the search
tree.
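A sketch of the technique, using the alphaBeta fragment above; the reduction
R = 2 and the in-check guard are standard precautions, stated here as
assumptions:

    Position makeNullMove(Position p) { /* flip side to move */ return p; }
    bool inCheck(const Position&) { return false; }      // stub

    const int R = 2;   // extra depth reduction after the null move

    int nullMoveSearch(const Position& pos, int depth, int alpha, int beta) {
        // Let the opponent move twice: if we still reach beta with a
        // reduced-depth search, this node is good enough to prune.
        if (depth > R && !inCheck(pos)) {
            int score = -alphaBeta(makeNullMove(pos), depth - 1 - R,
                                   -beta, -beta + 1);
            if (score >= beta) return score;             // prune this node
        }
        return alphaBeta(pos, depth, alpha, beta);       // normal search
    }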
The Evaluation Function
Of course, now that we have all these boards, we have to evaluate the
positions to determine whether they are good or not. For this, we need
an evaluation function, which assigns each board a number based upon certain
characteristics. The idea is to assign a positive number to those
criteria that are advantageous to the player and a negative number to those
that are not. There are many ideas about what makes a proper evaluation
function; they are discussed below.
The most important term is material, which is a simple count of the
number of pieces on each side, modified by a factor which encourages the
side ahead in material to trade pieces but not pawns. The material evaluator
also recognizes known draws such as king and two knights versus king.
There are several types of positional terms, including pawn structure,
king safety, center control, king attack, and specialized bonuses for things
like putting rooks on the seventh rank.
The pawn structure evaluator knows about doubled, isolated, backward,
and passed pawns. It also has some notion of pawn chains and phalanxes.
Pawn structure computation is very expensive, so a hash table is used to
store the scores of recently evaluated pawn structures. Since pawn structure
changes slowly, this hash table usually saves us the work of pawn structure
evaluation.
King safety is evaluated by considering the positions of all pawns
on the file the king is occupying and both neighboring files. A penalty
is assessed if any of the king's covering pawns are missing or if there
are holes (squares which can never be attacked by a friendly pawn) in front
of the king. Additional penalties are imposed if the opponent has open
or half-open files near the king. The whole king safety score is multiplied
by the amount of material on the board, so the program will want to trade
pieces when its king is exposed, and avoid trades when the opponent's king
is exposed. As in pawn structure, king safety uses a hash table to avoid
recomputing the same information.
The center control term rewards the program for posting its pieces
safely in the center of the board. This term is crude since it does not
consider pieces attacking the center from a distance, but it can be computed
very quickly and it encourages the kind of straightforward play we want.
King attack gives a bonus for placing pieces near the opposing king.
Like center control, this term is crude but tends to lead to positions
in which attacking opportunities exist.
The evaluation function is rounded out by special bonuses to encourage
particular types of moves. These include a bonus for castling, a penalty
for giving up castling rights, rewards for placing rooks on open and half-open
files or on the seventh rank, and a penalty for a king on the back rank
with no “air” (no escape square).
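A toy version of such a function, in the spirit of the terms described
above, might look like this. The piece values and the five-point center
bonus are common conventions used purely for illustration; a real evaluator
layers on the pawn-structure, king-safety and special-bonus terms just
discussed.

    #include <array>

    enum Piece { EMPTY, PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING };
    struct Board {
        std::array<int, 64> piece{};   // piece type on each square
        std::array<int, 64> color{};   // +1 white, -1 black, 0 empty
    };

    int evaluateBoard(const Board& b, int sideToMove) {
        static const int value[] = {0, 100, 300, 300, 500, 900, 0};
        int score = 0;
        for (int sq = 0; sq < 64; ++sq) {
            if (b.piece[sq] == EMPTY) continue;
            score += b.color[sq] * value[b.piece[sq]];   // material count
            int file = sq % 8, rank = sq / 8;
            if (file >= 2 && file <= 5 && rank >= 2 && rank <= 5)
                score += b.color[sq] * 5;                // crude center control
        }
        return score * sideToMove;   // from the side to move's point of view
    }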
Other Useful Techniques
Some common techniques to improve both strategies have been developed.
These techniques have allowed computers with the same basic speed to play
significantly better through time.
Centuries of human experience in chess have resulted in a set of initial
moves (about the first 15 to 20) which have been shown to
be optimal strategies for both players. These moves are stored in
databases and made available to the chess program.
The opening is played by making use of the “opening book” of known
positions. Some programs know the theoretically “best” move in about
18,000 common opening positions. This information is stored as a hash table
on disk and can be looked up quickly. Such a hash table might resolve
collisions through chaining.
Similarly to openings, a wide collection of ending positions has been
accumulated, and strategies for each are known. In contrast to opening books,
computers have added a great deal of knowledge to endgame databases.
Endgames are handled by using special evaluation functions that contain
knowledge about endgame principles. For instance, an evaluator for king
and pawn endgames may be able to directly recognize a passed pawn which
can race to the last rank without being caught by the opposing king. Except
for the change in evaluation functions, the endgame is played in the same
fashion as the middle-game.
Deep Blue
How Does Deep Blue Think?
Deep Blue considers four basic chess values before deciding on a move:
material, position, king safety and tempo. For material scores, it
uses the rule of thumb that pawns are worth one point, minor pieces three
points, rooks five points and the queen nine points.
Positional scoring is more complex. It has long been believed
that control of the center was all that mattered. Nearly all Grandmaster
games before the 20th century began with Pawn to King 4 or Pawn to Queen
4 to try to control the center. While center control is still
very important, some Grandmasters have developed effective openings that
delay development of the center. The idea is that the opponent will
overextend his position toward the center and leave himself vulnerable
to attack.
The simplest way to understand position is by looking at your pieces
and counting the number of safe squares that they can attack. The more
squares they control, the stronger the position. Thus, a seemingly quiet
pawn move can be very strong if it opens many new squares for a more powerful
piece behind it. This is the standard positional score used
by Deep Blue.
The defensive aspect of position is the safety of the King. The computer
assigns a value to the safety of the King's position, in order to know
how to make a purely defensive move. This score is based mainly upon
the position of the King’s covering pawns. If one of those pawns
is missing, or if there is a hole in front of the king that a pawn cannot
cover, then points are deducted for that board.
Tempo is related to position, but focuses on the race to develop control
of the board. A player is said to "lose a tempo" if he dilly-dallies while
the opponent is making advances that are more productive. Deep Blue’s
programmers have defined how Deep Blue's program evaluates these factors.
The computer then searches through all the legal moves and chooses the
one that yields the highest value.
Deep Blue And Artificial Intelligence
Deep Blue’s programmers claim the program uses no artificial intelligence,
although their use of the term raises images of HAL in “2001”. IBM
has recognized, however, that the early computer designs that tried to
mimic human thinking were not very good at it because no formula exists
for intuition. Deep Blue relies more on computational power and a simple
search and evaluation function.
Its strengths are the power of its processors and its design.
Also important is Deep Blue’s knowledge base, which some experts would
classify as a kind of intelligence, that of stored knowledge. Deep
Blue has more chess information to work with than any other computer, and
more than all but a few chess Masters. However, unlike those Masters,
Deep Blue never forgets or gets distracted. In addition, its special-purpose
design is orders of magnitude better at processing all of that chess information
than anything else.
The most important intelligence, for a program like Deep Blue, is its
knowledge of the game. Almost any chess Master will tell you that
the way to learn chess is by playing against skilled opponents and
studying successful strategies. Deep Blue has, in a way, done that.
Think about the 100 years of Grandmaster games stored in its memory.
When Kasparov plays Deep Blue, he isn't just playing a computer, he is
playing the ghosts of Grandmasters past -- games played long before transistors
and chips. Deep Blue can organize such a storehouse of knowledge -- and
apply it on the fly to the chessboard.
Psychology As Intelligence
Psychology, any top chess player will tell you, is an important key
to winning chess. But Deep Blue and other computer programs have no psychological
perception, can neither intimidate nor be intimidated, and experience no
joy from winning, or sadness from losing.
"There is no psychology at work" in Deep Blue, says IBM research scientist
Murray Campbell. Nor does Deep Blue "learn" its opponent as it plays. Instead,
it operates much like a turbo-charged "expert system," drawing on vast
resources of stored information. It then uses that information to assist
it in calculating the most appropriate response to an opponent’s move.
According to Campbell, Deep Blue is stunningly effective at solving
chess problems, but it is less "intelligent" than the stupidest person.
It does not think; it reacts, he said. Moreover, that is where Kasparov
saw his advantage. Speaking of Deep Thought, which he defeated in 1989,
Kasparov said, "Chess gives us a chance to compare brute force with our
abilities."
This is the essential difference between a chess computer and a person.
Deep Blue will not look over the board and see the glare of the world champion.
Kasparov can growl, sneer, call Deep Blue names and engage in any intimidation
tactic he wants, and the computer will go right on crunching data in the
same impersonal way.
However, despite Campbell’s protestations to the contrary, experts
say there is a psychological residue in Deep Blue’s program for using the
clock. Given a total of 3.5 hours to make all its moves, it can ration
time in a variety of ways. It can average the number of moves and attempt
to deviate from that only by a small margin.
Alternatively, it can move very fast, forcing Kasparov to respond.
Or it can take an inordinate amount of time over one move, calculate many
trillions of possible games, forcing Kasparov to wait and possibly become
bored or agitated.
This sort of psychology is present in all computer chess games that
use an opening book. Many a player gets frustrated upon seeing
a computer rattle off 15 moves without thinking, while the player is trying
to develop a strategy. To the player, it seems as if the computer
is having no problem figuring out what it wants to do. This causes
players unfamiliar with computers to rush their moves.
A human player still has a psychological advantage over the computer.
The understanding of the game of chess that Kasparov has is entirely different
from Deep Blue's. Deep Blue does not mimic human thought -- it reaches
the same end by different means. The human uses intuition, judgement and
experience to make his moves.
The attempt to develop chess computers that play in similar fashion
to human players has often ended with poor results. Deep Blue has been
explicitly designed to take advantage of its differences from human opponents.
Although a human player is only capable of searching one or two possible
positions per second, his focus is much more selective. In addition,
humans use an extremely complex evaluation function that includes intuition,
experience, memory and pattern recognition. Conversely, Deep Blue searches
several hundreds of millions of positions per second using a simpler search
and evaluation function.
Kasparov himself adamantly believes that the very best players in the
world should be able to prepare themselves to exploit what he believes
to be the special weaknesses presented by machines. Kasparov has long maintained
that human creativity and imagination, especially his own, will always
triumph over silicon: "Chess gives us a chance to compare brute force with
our abilities. In serious, classical chess, computers do not have a chance
in this century. I will personally take any challenge."
Deep Blue does not take into consideration its opponent’s tendencies,
although it is possible for its handlers to add that information.
This is important, because traditionally, in man-machine games, computers
have failed to pick up the nuance of the competitor's favorite opening,
end game or a trademarked strategy. One game between Deep Thought and Karpov
could have been a draw forced by the computer, but the machine thought
it saw a win based on material advantage, and ended up losing to the number
two player in the world. It should have taken the draw, which at
the time would have been the highest achievement ever by a computer.
In the past, computers' lack of ability to understand a grandmaster's
tendencies favored the human. Kasparov used this to his advantage in his
games against Deep Blue. He steered the computer where possible into
end games with which he was intimately familiar, especially end games that
confound the computer into "thinking" it is ahead, even when it is actually
behind. Because the evaluation function of Deep Blue is very simple,
and widely known, it is likely Kasparov was capable of figuring out what
move Deep Blue would make next, which allowed him to adjust his own moves
accordingly. Humans, on the other hand, can confound a thinking
machine with irrational moves.
Perhaps this highlights the most important advantage a computer programmer
can give a computer chess application -- knowledge of the game and the
art behind it. Deep Blue was only as good as the game knowledge stored
in it. With better knowledge of common human moves, the computer
can only get better, perhaps to the point where it cannot be beaten
at all.
Chess Applications
Green Light
The following discussion of Green Light Chess, one of three Windows
chess applications I examined, comes directly from the author of the program.
I provide it essentially in his own words, since he obviously does a much
better job than I do at explaining his software. It provides an excellent overview
of how an author believes his software works in relation to the issues
discussed in this paper.
Green Light Chess ( v2.05 ) is a single-processor chess application
designed for Windows 95/98 and Windows NT. Like all chess programs,
it uses a mini-max search with alpha-beta pruning. Green Light uses
iterative deepening at all levels, which makes it slower than some of the
other applications. This is because it has to recalculate the tree
at each level.
The iterative search starts at depth 1 and continues until the maximum
number of seconds has been reached, the maximum search depth has been reached,
or a checkmate has been found.
This version of the program no longer uses killer moves. It uses a more
generalized method of move prioritization called the "history heuristic".
In effect this is a table with 4096 entries, denoting the from and to
position of each move. When a 'good' move is found, its value in the table
is incremented. When a value gets too big, all values in the table are
divided by 2 (using a right shift). [alternatively you could just clear
all the entries to zero]. When move sorting is performed these values are
used to influence the ordering.
QUIESCENCE SEARCH
The top level of quiescence search tests for check and extends the search
one more level if so found (this is so that a checkmate can be seen at
the leaf level). Otherwise only moves which attack other pieces are considered.
A static move evaluator is used to test each capturing move before a
search is made, and if the value returned is <= zero (i.e. a loss is
predicted) the move is skipped, otherwise the move is searched as normal.
Other enhancements to the bare q-search include:
-
Escaping before looking at any moves if a significant gain in material
(5 pawns or greater) has already been made.
-
Escaping if no apparent material loss has occurred and quiescence has already
searched 10 deep.
-
Before each move is considered, and just before the static exchange evaluator
is applied, the position is pruned if the current score is <= -3 pawns
below alpha and capturing the current piece wouldn't get the score back
above alpha. As all quiescence moves (except for ordering due to the transposition
table) are sorted to capture the largest piece first this works fairly
well.
VALID MOVE GENERATION
The original program tried to always generate only valid moves. This was
giving it quite a large performance hit. In subsequent research I found
a different way of dealing with the situation.
There are two parts to move generation in chess.
-
Generate available moves (valid except for check considerations)
-
Throw out moves that are not valid due to being in check or causing exposed
check.
Generating available moves
The new method relies on using 'bitboards' (stored in 64 bit integers)
to generate masks of possible moves so that certain situations can be tested
using combinations of simple bitwise logical operations (e.g. AND, OR, NOT).
Using this method a version of the chess board is stored as a 64bit
bitmap, containing 1's where there are pieces, and 0's for empty squares.
If you think about generating a rook's possible horizontal moves, you just
need to extract the correct 8 bits for the rank the rook is on and do a
table lookup to find another 64bit bitmap that gives the possible squares
the rook can move to.
If you store rotated versions of the original bitboard (45degs * 2
+ 90degs), you can use this method to generate all rook, bishop, queen
and king moves.
Knight moves become trivial. Pawn moves are still a little complicated.
Generating a bitmap of the possible moves for the queen simply consists
of array lookups (these are fast) and Boolean operations (these are really
fast!). The worst bit is probably the GetValidMovesFromBitmap() call, which
uses the bitmask to create a list of actual moves.
Removing the invalid moves
I now throw out moves in the tree search in two ways:
If the king is in check, I only generate strictly valid moves to get
out of check. This is fairly easy to do using the bitboard method, as all
you have to do is generate a bitmask of positions that pieces would have
to move to get the king out of check, and AND it with the bitmask produced
in the move generation code.
If the king is not in check, I make invalid exposing moves in the search
tree, but I catch them early in the next recursive level as I am generating
moves (see the '// king captured?' comment in the queen code).
This has resulted in code that is 2 to 3 times faster than my old code
in these areas. This is actually the method used by Crafty (this is a free,
very strong, chess program written by Robert Hyatt [he wrote Cray Blitz],
and the source is available too).
Transposition table
This part of the program can be quite tricky. You need to account for all
the different conditions under which a move is added to the table in each
record. I.e. you need to know which bound is stored in the table (e.g.
lower, upper, none). When positions are hit in the table the current bounds
for the position can be outside the range found in the table, and this
needs to be dealt with in the correct fashion.
My code is now able to cope with these complexities, so I will soon
be attempting to perform Principal Variation search, and once that is working
I'll give MTD(f) a go.
My current code has Transposition records that are made like this:
2 bytes - The score for this position
6 bytes - The top 48 bits of the hash key. The index to the hash entry
contains the other bits, so we don't need to store them here.
1 byte - The draught of this record
1 byte - Flags
2 bytes - A move to search first in this position. This may be null.
The flags item above stores 1) information about whether the hash table
record is in use or stale, 2) information about whether the score is a
fail-low, a fail-high, is exact, or not used, and 3) information about
whether the side to move is in check or checkmate in this position.
Q: Which colour is the move for? Does it matter?
A: It does matter. The values are stored for the internal nodes in
the tree, and it is perfectly possible for a position to have two completely
different scores depending on whose turn it is to move. e.g. in this (admittedly
simple) position:
FEN: 8/4P3/3k4/8/8/5K2/8/8
white to move wins, whereas black to move draws.
Q: How do you generate a good hash value from the board positions?
A: You use the Zobrist algorithm.
In this way a 64bit hash value is produced which can reliably be used
to represent the current board position in a hash table. This means that
the position itself doesn't need to be stored (which would probably take
more than 64 bits).
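To make the Zobrist idea concrete, here is a minimal sketch of my own (not
the Green Light code): one random 64-bit number per color, piece type and
square, XORed together; moving a piece then updates the key with two XORs
instead of a full rescan.

    #include <cstdint>
    #include <random>

    std::uint64_t zobrist[2][6][64];    // [color][piece type][square]

    void initZobrist() {
        std::mt19937_64 rng(20240101);  // fixed seed gives reproducible keys
        for (auto& byColor : zobrist)
            for (auto& byPiece : byColor)
                for (auto& square : byPiece)
                    square = rng();
    }

    // Incremental update when a piece moves: XOR it out of the old
    // square and into the new one (captures XOR out the victim too).
    void applyMove(std::uint64_t& key, int color, int piece, int from, int to) {
        key ^= zobrist[color][piece][from];
        key ^= zobrist[color][piece][to];
    }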
The Future For Computer Chess
Top chess computers are challenging Grand Master players and winning,
raising the spectre that someday man will develop a computer chess player
that even the World Champion is incapable of defeating. As processors continue
to advance, we will soon have a multi-processor computer capable of searching
3.75 billion positions per second. So we have, or soon will have,
computers that can out-search even the smartest of humans.
But will this be enough to compensate for the human's proven strength
in long-term planning? Human chess players have proven to be especially
skilled at simplifying complex situations and identifying the fundamental
issues. More important from a chess point of view, humans are adept at
provoking long-term weaknesses in their opponent's position, until it becomes
indefensible.
As of yet, man has been unable to create a computer that cannot be “suckered”.
Why is that? Part of it comes from one of the underlying assumptions
of the minimax algorithm, namely that the opponent will always make his
or her best move. But with humans, that is not always the case.
Humans are willing to sacrifice pieces for odd reasons. As Anatoly
Karpov showed against Deep Thought in 1990, it is possible to trick a computer
into believing it is winning, only to surprise it.
In addition, computers are programmed to win, meaning they pass up opportunities
to force a draw in order to find a win. However, sometimes humans
can make a winning game look like a draw by moving pieces into unfavorable
positions, but controlling key spots on the board. Deep Blue ignored
one such situation during one of its matches with Kasparov and the World
Champion turned the situation into a victory.
So obviously the computer needs to be taught how to adjust to a human.
Most programmers have focused on brute-force search methods to process
more boards. But now we can search so many plies at once that programmers
are turning their focus toward tuning methods. A group at IBM is
already working on the next iteration of Deep Blue and is trying to give
the program the ability to adjust its evaluation function as it plays,
to take advantage of the opponent’s perceived weaknesses.
Program Performance and Rating
Part of this project involved evaluating three chess programs for Windows.
Those programs were Green Light, Arasan and Chenard. I chose those
three because they were freely available, ran on 32-bit Windows platforms,
and had source code available. This was important because it allowed
me to examine their evaluation functions and search engines. Throughout
this paper, I have used that code as part of the information about how
chess programs are written.
Although I am only an average chess player (ELO rating of around 1350),
I was able to beat all three chess applications at least once at a depth
level of three, which of course is not a very high depth level. At
levels greater than three, in 50 games, I could not defeat any of the
programs. This is likely more an indication of my chess-playing ability
(or lack thereof) than of the strength of the applications.
The important point is, even at level three, I could not consistently
beat any of them, indicating that there is some level of intelligence in
each of the programs; otherwise, I should have been able to consistently
beat the programs, as would any halfway decent chess player.
At level one, where the computer only examines its next move, I could
beat all three of the programs easily, losing only a handful of games.
Of course, playing at this level is like playing against a rank amateur
– someone who doesn’t understand Chess and is making moves which only maximize
their current position.
Remarkably, even though the search methods and evaluation functions
are nearly identical in the three programs examined, there is a wide variation
in performance on my Dell Optiplex GX 233MHz Pentium II. Green Light
proved to be incredibly slow, even at the same depth as the other programs,
yet I was still able to defeat it, indicating that its sluggishness was
not improving its play. The reason for this is likely its
use of iterative deepening from the start. While Chenard uses iterative
deepening only in the middlegame, Arasan, like Green Light, uses it from
the beginning. This means that even on the early moves, when decision
making is easy, such a program is searching many boards trying to improve
its position, even though it has likely already found a good-enough move.
Chenard proved to run the fastest of all the applications under Windows
and its performance as a console application was almost 1.5 times faster,
indicating there is a severe penalty for drawing graphics on screen.
Deep Blue, for example, did not use graphical output, but rather just returned
a move in standard chess notation. Clock cycles were not wasted drawing
chess boards on screen.
In some cases, it seems the differences in speed versus performance are
a reflection of the effort put into development. All of the programs
use an opening book, but there is no standard. Arasan and Green Light
don’t use much of an opening book, while Chenard’s is more extensive.
None of them uses much of an end-game database, typically because most
amateur games are unlikely to reach one of the five- or six-piece end games.
Arasan was the most customizable of all the programs, which made it
much easier for me to beat. By default, it remembers two killer moves,
but this can be reduced to improve the chance of beating the game.
I didn't notice much of a speed penalty for turning this off, so that leads
me to believe the killer moves weren't actually being used to cut off the
search that often.
Arasan also allows you to control its history heuristic. Every time a move
causes a refutation, the score for that move is incremented. When
a new position is encountered, moves that have high history scores will
be tried before moves that have low history scores, if the feature is
turned on.
Three options for null moves are available. If you select 0 for
"null move depth", the null move feature is disabled. If you select
1, searching proceeds to a depth 1 less than normal after a null move is
made. If you select 2, depth is reduced by two after a null move.
Generally "2" will produce the fastest searches.
As mentioned above, the "null move heuristic" is a technique that can
sometimes reduce the number of positions (nodes) that need to be searched.
In a null move search, one of the players is given two moves in a row and
a minimal search is done. If the score for the side to move is very
low, even after having a "free" move, we can determine that the first move
was bad and stop searching that position. Again, I noticed very little
improvement in performance with null moves on.
The following search extensions can also be selected in Arasan:
-
Check extensions. If this box is checked, and the side to move is
in check, Arasan will search deeper.
-
Forced move extensions. If this box is checked, the search will be
extended following a forced move (a move that is the only legal move available).
-
Pawn push extensions. If this box is checked, moving a pawn to the
7th rank causes the search to be extended.
-
Recapture extensions. If the previous move was a capture and the
capture restores the material balance to where it was 2 moves ago (for
example, rook takes rook then pawn takes rook), then the search is extended.
The most distinctive feature of Arasan was the ability to control the size
of the hash table. If you want to override Arasan's automatic calculation,
check the "Fixed" button in the "Hash Table Size" box and enter a number
into the "Hash Table Size" edit field, which will then be enabled.
Entering zero disables the hash table. This is the one feature that
caused a dramatic difference in speed. With no hash table, Arasan
was incredibly slow. With a 32MB hash table, Arasan was 25-30 percent
faster than Chenard. Obviously, the use of a hashing function is important
in storing boards.
When it comes to speed of execution, these programs examine between
3,000 and 500,000 positions per second on a single processor. Much
of the speed difference depends on the sizes of the brute force, the selective
and the quiescent search stages. Extra time is required in the selective
stage to assess and identify which moves will be examined. The extent of
this slow, knowledge-based process accounts for much of the speed difference.
One other factor that influences the speed and strength of a program is
the size of its transposition table.
Although many chess programs are similar to each other, their relative
playing strength can still differ greatly. Determining that strength is
no easy matter, since programs can be tuned to perform well on any standard
test suite. For this reason the group who produce the Swedish Rating List
use a more traditional approach. All commercially available programs continually
and automatically play games against each other, leading to hundreds of
statistically valid results. From these data an ELO rating is computed,
much like the rating system used for chess-players in America and elsewhere.
In the US the average player has a rating over 1500, while experts are
in the range 2000-2200 and masters are rated 2200-2400. Above that come
the super elite players called Grand Masters, of whom about 300 are active
worldwide. At the Eighth World Computer Chess Championships most programs
have an ELO rating in the range 2100-2500. The current Swedish rating list
is published in each issue of the International Computer Chess Association
Journal.
Arasan is the only one of the three programs I tested that had an ELO
rating. It was rated at 2050 in the most recent ratings, which makes
it an expert chess player.