Chapter 5. Stacks, Recursion and Backtracking
Frederick Thulin, Knowledge Systems Institute, USA (fthulin@ksi.edu)
Chapter Outline
5.1. Stacks
5.1.1. The LIFO Nature of Stacks
5.1.2. Reversing with a Stack
5.1.3. Stack Operations
5.1.4. Contiguous Implementation
5.1.5. Linked Implementation
5.2. Recursion
5.2.1. Introduction
5.2.2. Procedure Calls and the Run-Time Stack
5.2.3. Sum of the First n Positive Integers
5.2.4. Factorials
5.2.5. Collatz 3x+1 Problem
5.2.6. Greatest Common Divisor
5.2.7. Towers of Hanoi
5.2.8. Reversing a Line of Input
5.3. Backtracking
5.3.1. Introduction
5.3.2. The Eight Queens Problem
5.3.3. Escape from a Maze
Exercises
5.1. Stacks
5.1.1. The LIFO Nature of Stacks
A stack is a data structure analogous to stacks encountered in everyday life. For example,
consider a stack of books on a desk. One may easily put a new book on top of the stack, and
one may easily remove the top book. Adding books to, or removing them from, the middle of
the stack may be perilous. In the stack data structure accessing items in the middle is
prohibited. Items may be added to or removed from only the top of the stack.
Saving an item on a stack is referred to as pushing the item, and retrieving an item is called
popping the item. Popping an item removes the item from the stack. Pushes and pops are done
at the top of the stack, which may be thought of as a distinguished end of the stack. This means
that if two items are pushed and then one popped, the second one pushed will be popped. This
order of data access is called last in, first out or LIFO access.
5.1.2. Reversing with a Stack
Suppose a sequence of items is presented and it is desired to reverse the sequence. Various
methods could be used and the beginning programmer usually will suggest a solution using an
array. A conceptually very simple solution, however, is based on using a stack. The LIFO
property of stack access guarantees the reversal.
Suppose the sequence ABCDEF is to be reversed. With a stack, one simply scans the
sequence, pushing each item on the stack as it is encountered, until the end of the sequence is
reached. Then the stack is popped repeatedly, with each popped item sent to output, until the
stack is empty. The following illustrates this algorithm.
input ==> ABCDEF
push A:
A <== top of stack
input ==> BCDEF
push B:
B <== top of stack
A
input ==> CDEF
push C:
C <== top of stack
B
A
input ==> DEF
push D:
D <== top of stack
C
B
A
input ==> EF
push E:
E <== top of stack
D
C
B
A
input ==> F
push F:
F <== top of stack
E
D
C
B
A
The end of the input has been reached, and so popping the stack begins.
pop F to output:
E <== top of stack
D
C
B
A
pop E to output:
D <== top of stack
C
B
A
pop D to output:
C <== top of stack
B
A
pop C to output:
B <== top of stack
A
pop B to output:
A <== top of stack
pop A to output:
stack empty; stop.
5.1.3. Stack Operations
Besides the push and pop operations on a stack, it is desirable to have at least two more: an
initialize operation, which prepares the stack for use and sets it to an empty state, and an
empty operation, which is simply a test to see whether the stack is empty. The empty operation
is useful to guard against an attempt to pop an empty stack, which is an error. Ideally stacks
should be unlimited in capacity so that another item can always be pushed no matter how many
are already on the stack. However, stacks as implemented on an actual computer will always
have some finite maximum capacity. Sometimes these stacks are referred to as stacks with a
roof. When the roof is reached, items can no longer be pushed on the stack. In view of this
fact, it is sometimes desirable to supply a full operation for a stack.
Other stack operations may be conceived, such as peek (examine the top item of the stack
without popping it), traverse (go through all the items on the stack, performing some action for
each item) and count (find the number of items on the stack). Here, however, attention will be
restricted to only the five mentioned in the preceding paragraph.
Stack Operations
push:
add an item to the top of the stack
pop:
remove an item from the top of the stack
empty:
test whether the stack is empty
full:
test whether the stack is full
initialize:
set up the stack in an empty condition
In C programs the above operations are often implemented as functions to provide a degree of
data hiding. A program which uses stacks would access the stacks only through these
functions, and not be concerned with the inner workings of the stack. It is also convenient to
access a stack through a pointer. The pointer can be considered a handle whereby the program
can make use of the stack. These techniques can provide implementation independent code, so
that a program using stacks would not need to be changed if the stack functions themselves
were changed. This will be illustrated in the following two sections.
5.1.4. Contiguous Implementation
A contiguous implementation means that each stack item is adjacent in memory to the next stack
item, and so arrays are the natural structures for contiguous implementations. For a contiguous
stack implementation the stack items are kept in an array and the top position in an integer field
of a structure which also contains the array. The stack will be accessed through a handle which
is a pointer to this structure. For flexibility the type itemtype will be used in the code for stack
items. Then the actual type used in a program can simply be specified with a typedef. The
constant N is the maximum size of the stack and may be redefined to a suitable value for
different applications.
Reversing an Input Line with a Contiguous Stack
#include <stdio.h>
#include <stdlib.h>
#define N 80
typedef char itemtype;
typedef struct{
    int top;
    itemtype items[N];} stackstruct;
typedef stackstruct * stack;
stack initialize(stack s){
    s = (stack)malloc(sizeof(stackstruct));
    s->top = -1;
    return s;}
void push(stack s, itemtype x){
    s->items[++s->top] = x;}
itemtype pop(stack s){
    return s->items[s->top--];}
int empty(stack s){
    return s->top == -1;}
int full(stack s){
    return s->top == N-1;}
void main(){
    stack s;
    char c;
    s = initialize(s);
    while((c = getchar()) != '\n' && !full(s))
        push(s, c);
    while(!empty(s))
        putchar(pop(s));
    getchar();}
Input
palindromes
Output
semordnilap
5.1.5. Linked Implementation
A stack can also be implemented as a linked structure, as illustrated in Figure 5.1. In such an
implementation the stack consists of a sequence of nodes. Each node is a record (structure in
C) containing a data item and a pointer to the next node if one exists. This pointer is called a
link (to the next node). The first node is considered to be the top of the stack, and will be
pointed to by a pointer called top. The last node is the bottom of the stack and its pointer field
is set to NULL. An empty stack will have top == NULL. A linked stack with elements C, B,
A in order (C on top) may be represented as below, where \ denotes a NULL pointer.
top --> [ C | ]--> [ B | ]--> [ A | \ ]
Figure 5.1. A linked structure.
Recall that it is desirable for a program to access a stack through a handle. To achieve this, we
simply enclose the top field in a structure, and then a pointer to this structure is the desired
handle. It may seem strange to have a structure with only one field, but this ensures that a
program can use a linked stack in exactly the same way it would use a contiguous stack.
The memory for each node is dynamically allocated on the heap using malloc. So when an item
is pushed, a node for it is created, and when an item is popped, its node is freed (using free).
The following program illustrates appropriate types and functions for a linked stack. The main
function below is identical to the main function above using a contiguous stack. Note that the
behavior of this program is the same as the one for a contiguous stack, at least for input lines of
80 characters or less. This is because the same function names are used for the linked stack
operations and they provide the same functionality for the linked stack as the previous ones do
for the contiguous stack. The only difference is that the capacity of a linked stack is generally
greater than that of a contiguous stack, since a linked stack does not become full until dynamic
memory is exhausted.
Reversing an Input Line with a Linked Stack
#include <stdio.h>
#include <stdlib.h>
typedef char itemtype;
typedef struct node{
    itemtype item;
    struct node * next;} Node;
typedef struct {Node * top;} stackstruct;
typedef stackstruct * stack;
void push(stack s, itemtype x){
    Node * p = (Node*)malloc(sizeof(Node));
    p->item = x;
    p->next = s->top;
    s->top = p;}
itemtype pop(stack s){
    Node * p = s->top;
    itemtype x = p->item;
    s->top = p->next;
    free(p);
    return x;}
int empty(stack s){
    return s->top==NULL;}
int full(stack s){
    Node * p = (Node*)malloc(sizeof(Node));
    if(p == NULL) return 1;
    else{
        free(p);
        return 0;}}
stack initialize(stack s){
    s = (stack)malloc(sizeof(stackstruct));
    s->top = NULL;
    return s;}
void main(){
    stack s;
    char c;
    s = initialize(s);
    while((c = getchar()) != '\n' && !full(s))
        push(s, c);
    while(!empty(s))
        putchar(pop(s));
    getchar();}
5.2. Recursion
5.2.1. Introduction
Recursion in computer science is generally identified with the use of recursive procedures; that
is, procedures which call themselves. The call (or calls) of the procedure to itself is an explicit
form of self-reference. These self-calls are referred to as recursive calls. Recursion also
includes indirect recursion, in which a procedure may call another, which may then call still
another, and so on, until one calls the first. An algorithm which uses recursion is called a
recursive algorithm.
Recursion is a form of repetition, one of the three basic control structures: sequence, selection
and repetition. Repetition is usually first studied as iteration or looping. Iteration is typically
more easily understood than recursion at first. Iteration occurs frequently in everyday life where
recursion is rare. As an example of iteration, most of us are familiar with seasoning food to
taste. We add a little of the seasoning, taste, add more if necessary and taste again, repeating
until the food is appropriately seasoned. Many other examples of iteration in daily life may
easily come to mind, but examples of recursion do not. The main feature of recursion, self-
reference, does not often play a part in ordinary life. However, programming with recursive
procedures may yield simple and elegant code that is easy to verify, primarily because of the
self-reference involved.
Recursive procedures suffer a performance penalty compared to equivalent iterative code
because of the overhead of repeated procedure calls. The reasons for this overhead will be
discussed in a later section, but note for now that the overhead includes both space and time.
That means that a recursive procedure generally uses more memory and takes more time than
an equivalent iterative one. If the clarity of a recursive solution is sufficiently great and the
performance penalty small, the recursive solution may be preferred to an equivalent iterative
one. It is interesting to note that in the early days of programming any performance penalty was
frequently thought to be unacceptable and so recursion was not much used. In fact, many early
programming languages did not support recursion, but nowadays almost all do.
Recursion does not need to be used in programming because any recursive procedure or set of
procedures may be translated into an equivalent iterative version using only loops for repetition
and no recursive calls (direct or indirect). A recursive algorithm thus has a corresponding
equivalent nonrecursive or iterative version. One may speak of the recursive and iterative
versions of an algorithm. Just as recursive algorithms can be translated into iterative ones,
iterative algorithms may be translated into recursive ones. That is, any code containing loops
may have the loops eliminated and replaced by appropriate recursive calls. The term "removing
recursion" is often used to describe the process of translating from a recursive version to an
iterative version. In the general case, removing recursion requires the explicit use in the code of
a stack data structure. The general case, however, will not be considered here. Most
programming systems provide a special stack, invisible to the programmer, called the run-time
stack.
5.2.2. Procedure Calls and the Run-Time Stack
When a procedure is called, a pointer to the position in the code following the call must be
saved so that when the procedure returns, it can return to the proper place. The run-time stack
is used for this purpose. The run-time stack is also used for passing parameters and for local
variable storage for the called procedure.
In the following diagrams the run-time stack will be pictured as growing downward toward the
bottom of the page. This means that the top of the stack actually appears at the bottom in the
diagrams. This reflects the orientation used in most computers, where the stack starts at a
particular address in memory and grows toward lower addresses in memory.
When a procedure is called, a number of items are pushed on the stack. Together these items
constitute what is known as a stack frame for the procedure. The calling procedure (the caller)
pushes the parameters and the return address. The called procedure (the callee) reserves space
on the stack for its local variables. The building of the stack frame is then a cooperative venture
between the caller and the callee. When a procedure returns, the stack frame is torn down via
reversing (roughly) the steps used to build it. A stack frame is called an activation record by
some authors.
As an example, suppose a procedure has parameters P0 and P1 and local variables L0, L1 and
L2. Its stack frame can be diagrammed as shown in Figure 5.2 (this is a simplification of the
actual frame but suffices for the main ideas of stack frames):
P1
P0
return address
L0
L1
L2   <== top of run-time stack
Figure 5.2. A stack frame.
As a further example, consider a program with a main procedure (M) and procedures N, O, P,
Q and R. These letters will stand for either the procedures or their stack frames in the following
discussion. Suppose that in a particular run of the program M calls N which then returns, M
calls O which calls P which then returns and then O returns, M calls Q which calls R which calls
itself and then calls itself again, and then all return. R is recursive, and the important thing to
note is that it has a different stack frame for each time it is called. The different stack frames for
R mean that there are multiple copies of its parameters and local variables, which may have
different values. The following diagrams show the frames on the run-time stack at different
points in the execution of the program.
Run-Time Stack Sequence
As illustrated in Figure 5.3, M calls N which then returns, M calls O which calls P which then
returns and then O returns, M calls Q which calls R which calls itself and then calls itself again,
and then all return.
(Each line below lists the frames on the run-time stack from the bottom, at the left, to the top, at the right.)
M starts:       M
M calls N:      M  N
N returns:      M
M calls O:      M  O
O calls P:      M  O  P
P returns:      M  O
O returns:      M
M calls Q:      M  Q
Q calls R:      M  Q  R
R calls R:      M  Q  R  R
R calls R:      M  Q  R  R  R
R returns:      M  Q  R  R
R returns:      M  Q  R
R returns:      M  Q
Q returns:      M
M returns:      (stack empty)
Figure 5.3. Run-time stack sequence.
A tree diagram called the procedure call tree is a succinct representation of the growth and
shrinkage of the stack as the program executes. The tree is conventionally drawn growing
downward as illustrated in Figure 5.4.
Procedure Call Tree
 _____M_____
 |    |    |
 N    O    Q
      |    |
      P    R
           |
           R
           |
           R
Figure 5.4. The procedure call tree.
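As a concrete illustration, the following sketch shows one possible set of C functions whose
calls produce exactly the call pattern of Figures 5.3 and 5.4. The function bodies and the depth
parameter are hypothetical and are not taken from a program in this text.
#include <stdio.h>
void N(){ printf("in N\n"); }
void P(){ printf("in P\n"); }
void O(){ P(); }                    /* O calls P */
void R(int depth){                  /* R calls itself until three R frames are on the stack */
    printf("in R at depth %d\n", depth);
    if(depth < 3) R(depth + 1);}
void Q(){ R(1); }                   /* Q calls R */
void main(){                        /* main plays the role of M */
    N();
    O();
    Q();}
Tracing this program by hand reproduces the sequence of stack frames shown in Figure 5.3.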
In order to promote the reader's understanding of recursion, a number of code examples in C
using recursive functions (recall that all procedures in C are called functions) will be presented
and discussed in the following sections. In some cases iterative versions will be presented also
for comparison and contrast. The earlier examples will be simple and mostly impractical since
the true power and usefulness of recursion is evident only in more advanced applications which
will be presented in later chapters.
5.2.3. Sum of the First n Positive Integers
As a first example, consider the problem of determining the sum of the first n positive integers (=
the sum of the first n+1 nonnegative integers, including 0). There is a simple formula for this
sum: n(n+1)/2, but let us ignore that for a while so as to clarify certain concepts needed to
understand recursive algorithms. Of course this means that the only practical use of this
example is in teaching and studying recursion. Denoting the sum by Sn, and noting that
(1) Sn = n + (n-1) + (n-2) + ... + 2 + 1 + 0,
it is easy to see that Sn satisfies a recurrence relation:
(2) Sn = n + Sn-1, for n > 0.
A recurrence relation may form part of a recursive definition in which something is defined in
terms of itself. A proper recursive definition contains two parts: one or more base cases in
which the item being defined is defined not in terms of itself, and one or more recursive cases
where the item is defined in terms of smaller or more basic versions of itself. Recursive
definitions frequently may be easily translated into recursive implementations in code. Recursive
definitions are often called inductive definitions, especially in the mathematical literature.
It is evident that in the recurrence relation (2), Sn, a sum of n+1 numbers (including 0), is
expressed in terms of Sn-1, a sum of n numbers. So Sn is expressed in terms of a smaller
version of itself and (2) may then be included in a recursive definition of Sn:
Sn = 0, if n = 0 (base case);
Sn = n + Sn-1, for n > 0 (recursive case).
Now this definition may be directly translated into a C function, where the type unsigned int is
used since only nonnegative arguments are of interest:
unsigned int S(unsigned int n){
    if(n == 0) return 0;        /*base case*/
    return n + S(n-1);}         /*recursive case*/
Consider invoking this function with n = 2. The call S(2) results in evaluating the expression n +
S(n-1) for the return, and n-1 = 1 so the call S(1) is made. This call then results similarly in the
call S(0). n = 0 is the base case, so 0 is returned. Then S(1) returns 1+0 (= 1), and then S(2)
returns 2+1 (= 3). This sequence of calls and returns may be conveniently represented in the
following manner:
S(2)            (return 3)
    S(1)        (return 1)
        S(0)    (return 0)
The return values may be omitted from some such diagrams later in the chapter.
The parameters in the stack frames of S may be represented in a manner similar to that of the
whole stack frames done earlier, as illustrated by Figure 5.5:
(Each line lists the parameters on the stack from the bottom, at the left, to the top, at the right.)
call of S(2):       2
S(2) calls S(1):    2  1
S(1) calls S(0):    2  1  0
return 0:           2  1
return 1:           2
return 3:           (stack empty)
Figure 5.5. Parameters in the stack frames.
For reference below, (1) and (2) are here repeated:
(1) Sn = n + (n-1) + (n-2) + ... + 2 + 1 + 0,
(2) Sn = n + Sn-1, for n > 0.
Now consider an iterative version of the function S. Just as the recursive version follows from
the recurrence relation (2), the iterative version follows from the definition (1). The variable s is
used to accumulate the sum that becomes the return value of the function.
unsigned int S(unsigned int n){
    unsigned int s = 0, i;
    for( i = n; i > 0; i--)
        s += i;
    return s;}
Note that the recursive version of S contains an if statement, and the iterative version contains a
for loop. In the recursive version, repetition is handled by recursion and the if statement serves
to stop the recursion at the right point. In the iterative version, repetition is handled by the for
loop and is stopped at the right point by the test of the i > 0 condition. Some students just
beginning to study recursion will incorrectly use a loop in a recursive function. Sometimes a
loop is required (the putqueen function in Section 5.3.2 uses one), but more frequently not.
Other things worthy to note are that the iterative version uses two local variables while the
recursive version uses none, and the recursive version has two fewer lines of code. Sometimes
the recursive version of an algorithm also requires more parameters than the iterative version,
though this is not the case in this example.
The comparisons above generalize to a fairly large proportion of algorithms in practical use. A
summary of these observations follows.
Recursion vs. Iteration
recursive versions frequently have:
more parameters
fewer local variables
if statement(s)
no loops
less code
more overhead
iterative versions frequently have:
fewer parameters
more local variables
loops (always)
more code
less overhead
Beginners sometimes forget to include the base case in a recursive function, which leads to a
condition known as infinite recursion. In this case the function simply calls itself again and again
until the system runs out of the memory required for the recursive calls. This is to be avoided
for the same reasons that infinite loops are to be avoided.
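Finally, the closed-form formula n(n+1)/2 mentioned at the beginning of this section can of
course be coded directly, with no repetition at all. The following one-liner is a sketch added
here only for comparison with the recursive and iterative versions above.
unsigned int S(unsigned int n){
    return n*(n+1)/2;}          /* closed form: no recursion and no loop */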
5.2.4. Factorials
An example similar to the sum of the first n positive integers is the product of the first n positive
integers. Unlike the case with the sum, there is no simple formula for this quantity, and so
functions that compute it after the manner of the sum example may be of some practical use.
There is a standard term for this product: n factorial. There is also a standard notation: n!.
The factorial function in mathematics is defined by f(n) = n!. A so-called "empty product," a
product of 0 numbers, is by convention equal to 1. So considering the product of the first 0
positive integers to be 0!, it is seen that 0! = 1. Iterative and recursive definitions of n! are then
evident:
iterative:
n! = n(n-1)(n-2)...2*1
recursive:
n! = 1, if n = 0 (base case);
n! = n(n-1)!, for n > 0 (recursive case).
These definitions lead to the following two versions of the function factorial.
Recursive Factorial
unsigned int factorial(unsigned int n){
    if(n == 0) return 1;            /*base case*/
    return n * factorial(n-1);}     /*recursive case*/
Iterative Factorial
unsigned int factorial(unsigned int n){
    unsigned int p = 1, i;
    for( i = n; i > 0; i--)
        p *= i;
    return p;}
Just as s was used in the iterative function S to accumulate the sum, p is used in the iterative
factorial function to accumulate the product. It is usually recommended to use the iterative
instead of the recursive factorial function in practical applications, but in fact the performance
penalty is quite small so that either version could be used unless it is required to compute a great
many factorials quickly. Also note that a practical factorial function would probably use type
double rather than unsigned int. Computing factorials with type unsigned int will produce
overflow for factorials greater than 8! in 16 bits and greater than 12! in 32 bits.
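A version using double might look like the following sketch; the name factorial_d and its
details are illustrative additions, not taken from the text. With double the results are exact only
up through about 18!, after which rounding occurs, but much larger factorials can be
approximated without overflow.
double factorial_d(unsigned int n){
    double p = 1.0;
    unsigned int i;
    for(i = n; i > 0; i--)      /* accumulate the product, as in the iterative version above */
        p *= i;
    return p;}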
5.2.5. Collatz 3x+1 Problem
An interesting example comes from the Collatz 3x+1 problem, which relates to generating a
sequence of numbers by either dividing the previous number by 2 (if it is even) or multiplying it
by 3 and adding 1 (if it is odd). In the late 1940's the mathematician Collatz, for whom the
problem is named, investigated sequences of integers satisfying the following:
xn = 3xn-1+1 if xn-1 is odd;
xn = xn-1/2 if xn-1 is even.
These are recurrence relations and so may be incorporated into a recursive definition. Note that
if the number 1 ever occurs in such a sequence, then the sequence proceeds:
1, 4, 2, 1, 4, 2, 1, ...
and so stopping such a sequence when 1 occurs is a common convention. These sequences are
called Collatz sequences. If the sequence starts with an integer x, it is called the Collatz
sequence of x. Collatz hypothesized that the Collatz sequence of any positive integer x
eventually contained 1 and so, with the stopping convention, was finite. Many mathematicians
have tried to prove this, but none has yet done so. The mathematician Ulam did some work
with Collatz sequences and the length of a number's Collatz sequence is named for him: the
Ulam length of x is the length of the Collatz sequence of x. Recursive definitions of the Collatz
sequence and Ulam length of x and corresponding recursive functions are easy to create, and so
are the iterative versions. Only Collatz sequences for positive integers are considered here.
Those for negative integers are also of interest (and for x = 0 of very little interest). For further
information on this topic one may consult Martin Gardner's book Wheels, Life and Other
Mathematical Amusements.
One difficulty with any recursive functions to compute Collatz sequences or Ulam lengths is the
lack of a mathematical proof that the sequence is always finite (terminates in 1). This means that
the recursion may be infinite, a situation to be avoided, as previously mentioned. The sequences
have, however, been tested for very many values of x and have always terminated in such tests.
So we have empirical evidence that the recursion will terminate but not actual knowledge that it
will. The problem occurs because the recurrence
xn = 3xn-1+1 if xn-1 is odd
apparently does not define xn in terms of smaller or more basic entities. Nonetheless let us
consider the recursive definitions and functions and also the iterative versions, whose loops may
possibly be infinite for some values of x just as the recursion may be infinite for those values.
Recursive Definition for the Collatz Sequence of a Positive Integer x
xn = x if n = 0,
xn undefined if xn-1 = 1 (sequence stops),
xn = 3xn-1+1 if n > 0 and xn-1 > 1 is odd;
xn = xn-1/2 if n > 0 and xn-1 > 1 is even.
Collatz Sequence Display (Recursive)
void collatz(unsigned int x){
    printf("%u\n", x);              /*%u is proper format for unsigned int*/
    if(x <= 1) return;              /*base case*/
    if(x%2 != 0) collatz(3*x+1);
    else collatz(x/2);}
Collatz Sequence Display (Iterative)
void collatz(unsigned int x){
    printf("%u\n", x);
    while(x > 1){
        if(x%2 != 0) x = 3*x+1;
        else x = x/2;
        printf("%u\n", x);}}
Recursive Definition for Ulam Length of Positive Integer x
Ulam length(x) = 1 if x = 1,
Ulam length(x) = 1 + Ulam length(3x+1) if x > 1 and odd,
Ulam length(x) = 1 + Ulam length(x/2) if x > 1 and even.
Recursive Ulam Length Function
unsigned int ulam_length(unsigned int x){
    if(x <= 1) return 1;            /*base case*/
    if(x%2 != 0) return 1 + ulam_length(3*x+1);
    else return 1 + ulam_length(x/2);}
Iterative Ulam Length Function
unsigned int ulam_length(unsigned int x){
    unsigned int ul = 1;
    while(x > 1){
        if(x%2 != 0) x = 3*x+1;
        else x = x/2;
        ul++;}
    return ul;}
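To show how the functions above might be used, here is a small driver. It is an illustrative
sketch, not part of the original text, and assumes that one collatz function and one ulam_length
function from above appear earlier in the same file.
#include <stdio.h>
void main(){
    unsigned int x;
    printf("Enter a positive integer: ");
    scanf("%u", &x);
    collatz(x);                                     /* display the Collatz sequence of x */
    printf("Ulam length of %u: %u\n", x, ulam_length(x));}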
5.2.6. Greatest Common Divisor
Another example is one of the first known algorithms to be described in an unambiguously
algorithmic manner. This algorithm was known to the ancient Greeks and is given in essentially
a recursive form in Euclid, and is often called Euclid's algorithm. It is the algorithm for finding
the greatest common divisor of two given positive integers. This number is the largest integer
that evenly divides the two given integers and is abbreviated GCD. It is sometimes called the
greatest common factor. The algorithm may be understood by studying the code below.
Recursive Greatest Common Divisor Function
unsigned int gcd(unsigned int x, unsigned int y){
    if(x==0) return y;
    return gcd(y%x, x);}
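For comparison with the recursive and iterative pairs shown for the earlier examples, an
iterative version of Euclid's algorithm may be written as follows. This is a sketch added for
illustration (the name gcd_iter is not used elsewhere in the chapter); it performs the same
sequence of remainder steps as the recursive function above.
unsigned int gcd_iter(unsigned int x, unsigned int y){
    unsigned int r;
    while(x != 0){              /* same test as the recursive base case */
        r = y % x;              /* the remainder becomes the new x */
        y = x;
        x = r;}
    return y;}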
It is instructive to consider a trace of gcd function calls for a variety of arguments. A modified
gcd with a printf to display a trace of the arguments is given in a complete program below.
GCD with Argument Trace
#include <stdio.h>
unsigned int gcd(unsigned int x, unsigned int y){
    printf("%u, %u\n", x, y);
    if(x==0) return y;
    return gcd(y%x, x);}
void main(){
    printf("%u\n", gcd(91, 156));
    getchar();}
Output:
91, 156
65, 91
26, 65
13, 26
0, 13
13
Using the indentation technique previously described to indicate function calls, the sequence of
recursive calls may be illustrated in Figure 5.6.
gcd(91, 156)
    gcd(65, 91)
        gcd(26, 65)
            gcd(13, 26)
                gcd(0, 13)
                (return 13)
Figure 5.6. Call sequence for gcd(91, 156).
5.2.7. Towers of Hanoi
The Towers of Hanoi problem is based upon a children's game popular in Europe in the 1800's.
The game consisted of several wooden disks of different sizes and three wooden pegs mounted
upright on a wooden base. The disks had holes in the middle to allow them to fit on the pegs.
To play the game, the disks were first stacked on one peg, largest first and so on in order of size
to the smallest. The object was to move all the disks to one of the other pegs, one disk at a
time, subject to the rule that no disk could rest on a disk smaller than itself. A manufactured
legend was supplied with the game saying that in a temple in Hanoi (or Benares, or some other
location) there were similar disks but 40 (or 64) of them, made of gold and stacked on diamond
needles. In this temple monks were transferring the disks one at a time, following the rule.
When they completed the task, the world would come to an end.
It is not immediately clear that a solution exists for the towers of Hanoi problem. Thinking
recursively, however, provides a demonstration that there is a solution. The idea is that
transferring the disks from one peg to another involves the use of the third peg to help the
transfer. This third peg is called the auxiliary peg. If the pegs are numbered 0, 1, 2, then the
problem may be phrased: transfer the disks from peg 0 to peg 2 using peg 1 as the auxiliary.
The problem may be generalized, where i, j and k are distinct integers in the range 0 to 2:
transfer the disks from peg i to peg k using peg j as auxiliary.
Now the power of recursive thinking becomes apparent if the original problem is reduced to the
following: to transfer n disks from peg 0 to peg 2 with peg 1 as auxiliary, first transfer n-1 disks
from peg 0 to peg 1 using peg 2 as auxiliary. Then move one disk from peg 0 to peg 2. After
this, transfer the n-1 disks from peg 1 to peg 2 using peg 0 as auxiliary.
There is the remaining question of how the n-1 disks are to be transferred. But using recursion,
only the base case needs to be addressed. The base case may be chosen as n = 0, so now only
the explicit method of transferring 0 disks need be considered. It is clear that to transfer 0 disks,
one does nothing. This results in the following recursive algorithm for towers of Hanoi, which
shows that a solution indeed exists. The word "by" indicates the auxiliary peg:
Pseudocode for Towers of Hanoi
to move n disks from i to k by j:
    if n = 0 do nothing; otherwise:
        move n-1 disks from i to j by k,
        move 1 disk from i to k,
        move n-1 disks from j to k by i.
Using the above pseudocode, a simple proof by mathematical induction shows that the minimum
number of moves to transfer n disks is 2^n - 1.
Proof that at Least 2^n - 1 Moves are Needed for n Disks
For n = 0, 2^0 - 1 = 1 - 1 = 0 and 0 moves are required.
For n > 0, the inductive assumption is that 2^(n-1) - 1 moves are needed for the transfer of the first
n-1 disks and 2^(n-1) - 1 moves required for the second n-1 disks. One move is required for the
lowest disk. Then the moving of all n disks requires (at least) 2^(n-1) - 1 + 1 + 2^(n-1) - 1 = 2*2^(n-1) - 1 =
2^n - 1 moves, QED.
Towers of Hanoi is an Exponential Algorithm (impractical for large n)
The preceding proof shows that the towers of Hanoi algorithm is of order 2^n or, using the big O
notation to be fully explained in a later chapter, O(2^n). Such algorithms are called exponential.
This implies that the algorithm takes twice as long to execute on n disks as on n-1 disks. So if it
takes 1 second to run on 20 disks, it will take about 2 seconds on 21, about 4 seconds on 22
and so on. 40 disks would require about 12 days. 64 disks would require about 557,845
years.
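Since the move count 2^n - 1 grows so quickly, it is easy to compute these numbers directly.
The following short sketch (not part of the towers program; the name hanoi_moves and the use
of unsigned long long are choices made here for illustration) evaluates 2^n - 1 for a given n.
#include <stdio.h>
unsigned long long hanoi_moves(int n){      /* number of moves for n disks, 0 <= n <= 64 */
    if(n == 64) return ~0ULL;               /* 2^64 - 1; shifting by 64 would be undefined */
    return (1ULL << n) - 1;}                /* 2^n - 1 */
void main(){
    printf("%llu moves for 20 disks\n", hanoi_moves(20));   /* 1048575 */
    printf("%llu moves for 64 disks\n", hanoi_moves(64));}  /* 18446744073709551615 */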
For certain practical problems sometimes the most obvious solution turns out to be an
exponential one like towers of Hanoi. In practice, these solutions are to be avoided wherever
possible. Sometimes algorithms can be found which give approximate solutions in much less
time. These algorithms are to be preferred over exact solutions when the approximation is
acceptable.
Below is a complete program for towers of Hanoi, with its output following. The output is a set
of instructions for moving 4 disks from peg 0 to peg 2 by peg 1. The function towers follows
the pseudocode given above. The move_disk function may be modified to show a graphic
display of disks moving from peg to peg.
Towers of Hanoi Program
#include <stdio.h>
void move_disk(int peg1, int peg2){
    printf("Move a disk from peg %d to peg %d.\n", peg1, peg2);}
void towers(int n, int i, int j, int k){
    if(n==0) return;
    towers(n-1, i, k, j);
    move_disk(i, k);
    towers(n-1, j, i, k);}
void main(){
    towers(4, 0, 1, 2);
    getchar();}
Output
Move a disk from peg 0 to peg 1.
Move a disk from peg 0 to peg 2.
Move a disk from peg 1 to peg 2.
Move a disk from peg 0 to peg 1.
Move a disk from peg 2 to peg 0.
Move a disk from peg 2 to peg 1.
Move a disk from peg 0 to peg 1.
Move a disk from peg 0 to peg 2.
Move a disk from peg 1 to peg 2.
Move a disk from peg 1 to peg 0.
Move a disk from peg 2 to peg 0.
Move a disk from peg 1 to peg 2.
Move a disk from peg 0 to peg 1.
Move a disk from peg 0 to peg 2.
Move a disk from peg 1 to peg 2.
5.2.8. Reversing a Line of Input
Recursion may be used to reverse a line of input without using an explicit stack or an array. The
idea is to get a character of the input line, save it, display the rest of the line in reverse
(recursively), and then display the saved character. The base case occurs at the end of the line
when the newline character is encountered. Here is an implementation as the function line_rev
included in a complete program.
#include <stdio.h>
void line_rev(){
    char c = getchar();
    if(c=='\n') return;
    line_rev();
    putchar(c);}
void main(){
    line_rev();
    getchar();}
Input:
abracadabra
Output:
arbadacarba
Displaying an Input Line in Order
The code for line_rev above, when given a slight modification, displays the line forward instead of
backward. If the character c is displayed before rather than after the recursive call, the line is
displayed forward. Calling the new function line_for, the following code results:
#include <stdio.h>
void line_for(){
    char c = getchar();
    if(c=='\n') return;
    putchar(c);
    line_for();}
void main(){
    line_for();
    getchar();}
Input:
abracadabra
Output:
abracadabra
5.3. Backtracking
5.3.1. Introduction
Backtracking is the exploration of a sequence of potential partial solutions to a problem until a
solution is found or a dead end is reached, followed by a regression to an earlier point in the
sequence and exploring a new sequence starting from there, and so on. This process may be
used to find one or all solutions to a problem, or to verify that no solution exists. Recursion is
often used to implement backtracking since the resulting code may be clearer and simpler than
in iterative (looping) implementations.
In the exploration of a sequence of potential partial solutions the regression to an earlier point in
the sequence and exploring a new sequence starting from there causes a branching in the search
for a solution. The sequences thus branch off from one another and form a tree structure.
Backtracking is essentially a depth-first search (DFS) of this tree, a concept to be explored
more fully in Chapter 11. This tree structure is much like a tree of procedure calls as discussed
previously. In fact, the tree is of exactly the same form as the tree of recursive function calls
when the backtracking is implemented with a recursive function.
Two examples will help to clarify the concepts used in backtracking. The first example is the
eight queens problem and some variants of it. The second example is the problem of escaping
from a maze.
5.3.2. The Eight Queens Problem
The game of chess is played on an 8 by 8 board with 64 squares. There are a number of pieces
used in the game. The most powerful one is called a queen. The eight queens problem refers to
a configuration which cannot occur in an actual game, but which is based on properties of the
queen. The problem is: place 8 queens on the board in such a way that no queen is attacking
any other.
The queen attacks any other piece in its row, column, or either of its two diagonals. In Figure
5.7 shown below, the queen (Q) is attacking all squares marked X; unattacked squares are shown as -.
- - - X - - - X
X - - X - - X -
- X - X - X - -
- - X X X - - -
X X X Q X X X X
- - X X X - - -
- X - X - X - -
X - - X - - X -
Figure 5.7. The eight queens problem.
The power of the queen is such that there is some difficulty in placing all eight queens. It may
not even be clear that such a placement is possible, but in fact it is possible, as will be seen later.
First consider an easier version of the problem: the four queens problem, in which four queens
are to be placed on a 4 by 4 board, none attacking any other. This problem will illustrate some
of the basic principles of backtracking.
The Four Queens Problem
Since it is evident that only one queen can occupy each row, the solution may proceed by
attempting to place one queen in each row. As a first try the queen is put in the first position in
the row that is not under attack. The following Figure 5.8 indicates this attempt.
Q - - -
- - Q -
- - - -
- - - -
Figure 5.8. Placing one queen in a row.
The above is a dead end since now no queen may be placed in the third row. But now we
backtrack to the second row and place the queen in the next unattacked position and continue,
as shown in Figure 5.9.
Q - - -
- - - Q
- Q - -
- - - -
Figure 5.9. Backtracking and placement.
But now there is no place to put a queen in the fourth row and a dead end is reached again.
Backtracking to the third row, we find no other position for the queen, and so we backtrack to
the second row. Here again there is no remaining position since the queen is already at the end
of the row. So backtracking to the first row, we place the queen in the next allowable position
and continue. This results in the following Figure 5.10.
- Q - -
- - - Q
Q - - -
- - Q -
Figure 5.10.
So eventually backtracking yields a solution in the four queens case. Further backtracking can
be used to give all solutions. Backtracking to the first row again and placing the queen in the
third position yields the solution shown in Figure 5.11.
- - Q -
Q - - -
- - - Q
- Q - -
Figure 5.11.
This solution is a mirror image of the first and in fact is the only other solution, as can be
determined by further backtracking.
So backtracking yields the two solutions of the four queens problem. Consider now the n
queens problem in which n queens are to be placed on an n-by-n board, none attacking any
other. The goal will be to implement the recursive backtracking algorithm described for the four
queens problem but generalized to n queens. The implementation may then be used to find
solutions for eight queens or any other number for which machine resources are adequate.
A dynamic array of n² characters will be used to represent the board and will be initialized to all
'-'. This array, named board, will be treated as an n-by-n array, but in the code we will actually
use a one-dimensional array, which is easier to allocate dynamically. Thus we will not be able
to write board[i][j] (since board is one-dimensional) and board will be treated as the pointer it
actually is and in place of board[i][j] will appear *(board+i*n+j).
A queen is placed by overwriting a board position with 'Q'. Before placing the queen, the code
must check whether the upper left diagonal, the column above, and the upper right diagonal are
free of other queens and only place the queen if so. Only positions above need to be checked
since the queens are placed starting in row 0 and proceeding down to row n-1 (success) or a
row less than n-1 (failure). A queen that was placed needs to be removed before backtracking.
The above ideas are used in the following complete program for the n queens problem.
Program for the n Queens Problem
#include <stdio.h>
#include <stdlib.h>
int solution;                                   /* number of solutions found so far */
void printboard(char * board, int n){
    int i, j;
    puts("\nPress Enter for next solution.\n");
    getchar();
    printf("\nSolution # %d:\n\n", ++solution);
    for(i = 0; i < n; i++){
        putchar('\t');
        for(j = 0; j < n; j++)
            printf("%c ", *(board + i*n + j));
        putchar('\n');}}
int aboveOK(char * board, int i, int j, int n){     /* is the column above (i, j) free of queens? */
    for(i--; i >= 0; i--)
        if(*(board + i*n + j) == 'Q')
            return 0;
    return 1;}
int upleftOK(char * board, int i, int j, int n){    /* is the upper left diagonal free of queens? */
    for(i--, j--; i >= 0 && j >= 0; i--, j--)
        if(*(board + i*n + j) == 'Q')
            return 0;
    return 1;}
int uprightOK(char * board, int i, int j, int n){   /* is the upper right diagonal free of queens? */
    for(i--, j++; i >= 0 && j < n; i--, j++)
        if(*(board + i*n + j) == 'Q')
            return 0;
    return 1;}
void putqueen(char * board, int row, int n){
    int j;
    if(row == n){                               /* all n queens placed: a solution */
        printboard(board, n);
        return;}
    for(j = 0; j < n; j++){                     /* try each column of this row */
        if(upleftOK(board, row, j, n)
           && aboveOK(board, row, j, n)
           && uprightOK(board, row, j, n)){
            *(board + row*n + j) = 'Q';         /* place a queen */
            putqueen(board, row+1, n);          /* recurse on the next row */
            *(board + row*n + j) = '-';}}}      /* remove the queen before backtracking */
void initboard(char * board, int n){
    int i;
    for(i = 0; i < n*n; i++)
        *(board + i) = '-';}
void main(){
    char * board;
    int n, c;
    do{
        solution = 0;
        puts("\nEnter size of board:");
        scanf("%d", &n);
        getchar();
        board = (char *)malloc(n*n*sizeof(char));
        initboard(board, n);
        putqueen(board, 0, n);
        free(board);
        printf("\n%d solutions total for %d queens problem.", solution, n);
        puts("\n\nContinue? (y/n):");
        while((c = getchar()) == '\n');
    }while(c == 'y' || c == 'Y');}
Partial Output
Enter size of board:
4
Press Enter for next solution.
Solution # 1:
- Q - -
- - - Q
Q - - -
- - Q -
Press Enter for next solution.
Solution # 2:
- - Q -
Q - - -
- - - Q
- Q - -
2 solutions total for 4 queens problem.
Continue? (y/n):
y
Enter size of board:
8
Press Enter for next solution.
Solution # 1:
Q - - - - - - -
- - - - Q - - -
- - - - - - - Q
- - - - - Q - -
- - Q - - - - -
- - - - - - Q -
- Q - - - - - -
- - - Q - - - -
Press Enter for next solution.
.......
Solution # 92:
- - - - - - - Q
- - - Q - - - -
Q - - - - - - -
- - Q - - - - -
- - - - - Q - -
- Q - - - - - -
- - - - - - Q -
- - - - Q - - -
92 solutions total for 8 queens problem.
Continue? (y/n):
n
5.3.3. Escape from a Maze
Another classic backtracking problem is that of escaping from a maze. The main idea is to
proceed into the maze until a dead end is reached and then backtrack to an earlier position and
continue from there, backtracking again if a dead end is reached, and so on. If an escape route
exists, it will eventually be found by this method. In order for the backtracking to be done, the
route so far taken must be remembered. This is most easily accomplished by marking the path.
For definiteness let us assume the maze is represented by an n-by-n character array with the
following conventions. A path is to be constructed as a sequence of moves from a designated
starting position to an exit position (assumed to be on one edge or corner of the array). The
only moves allowed are up 1 position, down 1, left 1 and right 1 (to an open position). No
diagonal moves are allowed. Open positions will be represented by blanks and blocked
positions (hedges) by X's. An E marks the exit position. The path will be marked by asterisks.
Figure 5.12 is an example of a maze represented in the above fashion.
Figure 5.12. A 9-by-9 maze drawn with the above conventions: open positions blank, hedges
marked X, and the exit marked E on an edge (diagram omitted).
Assuming a starting position of (0, 0), a successful path would be represented as shown in
Figure 5.13.
Figure 5.13. The same maze with a successful path from (0, 0) to the exit marked by asterisks
(diagram omitted).
The path above is a direct path to the maze exit, but a path constructed by backtracking would
generally include some branches into dead ends. The implementation of a recursive
backtracking solution is left as an exercise.
Exercises:
Beginners Exercises:
1. What operation adds an item to the top of a stack?
2. What operation removes an item from the top of a stack?
3. Which stack operation is prohibited on an empty stack?
4. What does LIFO mean?
5. What is a recursive function?
6. Recursion is a form of which of the three basic control structures?
7. For a recursive function to terminate, it must contain at least one ____ case.
8. If a recursive function is called by another function, then calls itself and then calls itself again,
how many stack frames for the function are now on the run-time stack?
9. Would a recursive function for a particular task tend to have more or fewer parameters than
an equivalent iterative function?
10. How many moves does the towers of Hanoi algorithm make for (a) 10, (b) 20, (c) 30
disks?
Intermediate Exercises:
11. In the following sequence a letter means push that letter on the stack and * means pop a
character from the stack. Indicate the sequence of characters resulting from the pops.
LAST***IN***FIRST***OUT***STACK***
12. Write a recursive function to compute the nth Fibonacci number Fn. Follow the
recurrence relation: F0 = 0, F1 = 1, Fn = Fn-1 + Fn-2 for n > 1. This will yield a highly
inefficient function that should not be used for practical computation.
13. Write an iterative function to compute the nth Fibonacci number.
14. Write a recursive function to compute the binomial coefficient C(n, k) (n and k
nonnegative) defined by the recurrence relation: C(n, 0) = 1, C(n, n) = 1, C(n, k) = C(n-1,
k) + C(n-1, k-1) for n > k > 0. This, as in exercise 12 above, will yield a highly inefficient function
that should not be used for practical computation.
15. Write an iterative function to compute the binomial coefficient C(n, k).
16. Write a recursive function to compute Ackermann's function, defined by: A(0, n) =
n+1, A(m, 0) = A(m-1, 1) for m > 0, A(m, n) = A(m-1, A(m, n-1)) for m, n > 0. This is a
function whose values can be extremely large for small arguments. It can be used to test
how well a particular compiler on a given computer implements recursion. Technically it is
a recursive function that is not primitive recursive. The definition of primitive recursive is
beyond the scope of this work, but Ackermann's function may informally be thought of as
an extremely recursive function.
Advanced Exercises:
17. Write a program which allows a user to push an integer on a stack, pop an integer and
display it, or quit. The user should be able to repeat these choices until quitting. Do not allow
the user to pop an empty stack.
18. Repeat problem 17 but with strings in place of integers.
19. Modify any of the recursive functions to display the recursive call sequence with proper
indentation, such as the following for gcd.
gcd(91, 156)
    gcd(65, 91)
        gcd(26, 65)
            gcd(13, 26)
                gcd(0, 13)
20. Write a recursive function queencount(int n) which returns the number of solutions for the n
queens problem.
21. Modify the towers of Hanoi program to allow user input of the number of disks and to
repeat until the user chooses to quit.
22. Modify the solution of problem 21 to allow suppression of printing and timing of the
execution of the towers function as called from main. Determine the number of disks that can
be processed in one second on your computer. Investigate timings for higher numbers of disks.
23. Write a program which uses either of the ulam_length functions to find and display the
number with the maximum Ulam length from 1 up to a user-supplied input.
24. Write a program which times the execution of finding the number with the maximum Ulam
length from 1 up to a user-supplied input using each ulam_length function. The program should
display the times for the recursive version and the times for the iterative version.
25. Implement a recursive solution to the maze escape problem. The maze should be read
from a text file that has the starting coordinates and size of the maze on the first line and the
character representation of the maze in subsequent lines. The maze array should be displayed
when a path is found or when it is determined that no path exists.
26. Modify the solution of problem 25 to display all successful paths.