Chapter 5. Stacks, Recursion and Backtracking
Frederick Thulin, Knowledge Systems Institute, USA (fthulin@ksi.edu)
Chapter Outline
5.1. Stacks
5.1.1. The LIFO Nature of Stacks
5.1.2. Reversing with a Stack
5.1.3. Stack Operations
5.1.4. Contiguous Implementation
5.1.5. Linked Implementation
5.2. Recursion
5.2.1. Introduction
5.2.2. Procedure Calls and the Run-Time Stack
5.2.3. Sum of the First n Positive Integers
5.2.4. Factorials
5.2.5. Collatz 3x+1 Problem
5.2.6. Greatest Common Divisor
5.2.7. Towers of Hanoi
5.2.8. Reversing a Line of Input
5.3. Backtracking
5.3.1. Introduction
5.3.2. The Eight Queens Problem
5.3.3. Escape from a Maze
Exercises
5.1. Stacks
5.1.1. The LIFO Nature of Stacks
A stack is a data structure analogous to stacks encountered in everyday life. For example,
consider a stack of books on a desk. One may easily put a new book on top of the stack, and
one may easily remove the top book. Adding books to, or removing them from, the middle of
the stack may be perilous. In the stack data structure accessing items in the middle is
prohibited. Items may be added to or removed from only the top of the stack.
Saving an item on a stack is referred to as pushing the item, and retrieving an item is called
popping the item. Popping an item removes the item from the stack. Pushes and pops are done
at the top of the stack, which may be thought of as a distinguished end of the stack. This means
that if two items are pushed and then one popped, the second one pushed will be popped. This
order of data access is called last in, first out or LIFO access.
5.1.2. Reversing with a Stack
Suppose a sequence of items is presented and it is desired to reverse the sequence. Various
methods could be used and the beginning programmer usually will suggest a solution using an
array. A conceptually very simple solution, however, is based on using a stack. The LIFO
property of stack access guarantees the reversal.
Suppose the sequence ABCDEF is to be reversed. With a stack, one simply scans the
sequence, pushing each item on the stack as it is encountered, until the end of the sequence is
reached. Then the stack is popped repeatedly, with each popped item sent to output, until the
stack is empty. The following illustrates this algorithm.
input ==> ABCDEF
push A:
A <== top of stack
input ==> BCDEF
push B:
B <== top of stack
A
input ==> CDEF
push C:
C <== top of stack
B
A
input ==> DEF
push D:
D <== top of stack
C
B
A
input ==> EF
push E:
E <== top of stack
D
C
B
A
input ==> F
push F:
F <== top of stack
E
D
C
B
A
The end of the input has been reached, and so popping the stack begins.
pop F to output:
E <== top of stack
D
C
B
A
pop E to output:
D <== top of stack
C
B
A
pop D to output:
C <== top of stack
B
A
pop C to output:
B <== top of stack
A
pop B to output:
A <== top of stack
pop A to output:
stack empty; stop.
5.1.3. Stack Operations
Besides the push and pop operations on a stack, it is desirable to have at least two more: an
initialize operation, which prepares the stack for use and sets it to an empty state, and an
empty operation, which is simply a test to see whether the stack is empty. The empty operation
is useful to guard against an attempt to pop an empty stack, which is an error. Ideally stacks
should be unlimited in capacity so that another item can always be pushed no matter how many
are already on the stack. However, stacks as implemented on an actual computer will always
have some finite maximum capacity. Sometimes these stacks are referred to as stacks with a
roof. When the roof is reached, items can no longer be pushed on the stack. In view of this
fact, it is sometimes desirable to supply a full operation for a stack.
Other stack operations may be conceived, such as peek (examine the top item of the stack
without popping it), traverse (go through all the items on the stack, performing some action for
each item) and count (find the number of items on the stack). Here, however, attention will be
restricted to only the five mentioned in the preceding paragraph.
Stack Operations
push:
add an item to the top of the stack
pop:
remove an item from the top of the stack
empty:
test whether the stack is empty
full:
test whether the stack is full
initialize:
set up the stack in an empty condition
In C programs the above operations are often implemented as functions to provide a degree of
data hiding. A program which uses stacks would access the stacks only through these
functions, and not be concerned with the inner workings of the stack. It is also convenient to
access a stack through a pointer. The pointer can be considered a handle whereby the program
can make use of the stack. These techniques can provide implementation independent code, so
that a program using stacks would not need to be changed if the stack functions themselves
were changed. This will be illustrated in the following two sections.
5.1.4. Contiguous Implementation
A contiguous implementation means that each stack item is adjacent in memory to the next stack
item, and so arrays are the natural structures for contiguous implementations. For a contiguous
stack implementation the stack items are kept in an array and the top position in an integer field
of a structure which also contains the array. The stack will be accessed through a handle which
is a pointer to this structure. For flexibility the type itemtype will be used in the code for stack
items. Then the actual type used in a program can simply be specified with a typedef. The
constant N is the maximum size of the stack and may be redefined to a suitable value for
different applications.
Reversing an Input Line with a Contiguous Stack
#include <stdio.h>
#include <stdlib.h>
#define N 80
typedef char itemtype;
typedef struct{
    int top;
    itemtype items[N];} stackstruct;
typedef stackstruct * stack;
stack initialize(stack s){
    s = (stack)malloc(sizeof(stackstruct));
    s->top = -1;
    return s;}
void push(stack s, itemtype x){
    s->items[++s->top] = x;}
itemtype pop(stack s){
    return s->items[s->top--];}
int empty(stack s){
    return s->top == -1;}
int full(stack s){
    return s->top == N-1;}
void main(){
    stack s;
    char c;
    s = initialize(s);
    while((c = getchar()) != '\n' && !full(s))
        push(s, c);
    while(!empty(s))
        putchar(pop(s));
    getchar();}
Input
palindromes
Output
semordnilap
5.1.5. Linked Implementation
A stack can also be implemented as a linked structure, as illustrated in Figure 5.1. In such an
implementation the stack consists of a sequence of nodes. Each node is a record (structure in
C) containing a data item and a pointer to the next node if one exists. This pointer is called a
link (to the next node). The first node is considered to be the top of the stack, and will be
pointed to by a pointer called top. The last node is the bottom of the stack and its pointer field
is set to NULL. An empty stack will have top == NULL. A linked stack with elements C, B,
A in order (C on top) may be represented as below, where \ denotes a NULL pointer.
top --> [ C | ]--> [ B | ]--> [ A | \ ]
Figure 5.1. A linked structure.
Recall that it is desirable for a program to access a stack through a handle. To achieve this, we
simply enclose the top field in a structure, and then a pointer to this structure is the desired
handle. It may seem strange to have a structure with only one field, but this ensures that a
program can use a linked stack in exactly the same way it would use a contiguous stack.
The memory for each node is dynamically allocated on the heap using malloc. So when an item
is pushed, a node for it is created, and when an item is popped, its node is freed (using free).
The following program illustrates appropriate types and functions for a linked stack. The main
function below is identical to the main function above using a contiguous stack. Note that the
behavior of this program is the same as the one for a contiguous stack, at least for input lines of
80 characters or less. This is because the same function names are used for the linked stack
operations and they provide the same functionality for the linked stack as the previous ones do
for the contiguous stack. The only difference is that the capacity of a linked stack is generally
greater than that of a contiguous stack, since a linked stack does not become full until dynamic
memory is exhausted.
Reversing an Input Line with a Linked Stack
#include <stdio.h>
#include <stdlib.h>
typedef char itemtype;
typedef struct node{
    itemtype item;
    struct node * next;} Node;
typedef struct {Node * top;} stackstruct;
typedef stackstruct * stack;
void push(stack s, itemtype x){
    Node * p = (Node*)malloc(sizeof(Node));
    p->item = x;
    p->next = s->top;
    s->top = p;}
itemtype pop(stack s){
    Node * p = s->top;
    itemtype x = p->item;
    s->top = p->next;
    free(p);
    return x;}
int empty(stack s){
    return s->top==NULL;}
int full(stack s){
    Node * p = (Node*)malloc(sizeof(Node));
    if(p == NULL) return 1;
    else{
        free(p);
        return 0;}}
stack initialize(stack s){
    s = (stack)malloc(sizeof(stackstruct));
    s->top = NULL;
    return s;}
void main(){
    stack s;
    char c;
    s = initialize(s);
    while((c = getchar()) != '\n' && !full(s))
        push(s, c);
    while(!empty(s))
        putchar(pop(s));
    getchar();}
5.2. Recursion
5.2.1. Introduction
Recursion in computer science is generally identified with the use of recursive procedures; that
is, procedures which call themselves. The call (or calls) of the procedure to itself is an explicit
form of self-reference. These self-calls are referred to as recursive calls. Recursion also
includes indirect recursion, in which a procedure may call another, which may then call still
another, and so on, until one calls the first. An algorithm which uses recursion is called a
recursive algorithm.
Recursion is a form of repetition, one of the three basic control structures: sequence, selection
and repetition. Repetition is usually first studied as iteration or looping. Iteration is typically
more easily understood than recursion at first. Iteration occurs frequently in everyday life where
recursion is rare. As an example of iteration, most of us are familiar with seasoning food to
taste. We add a little of the seasoning, taste, add more if necessary and taste again, repeating
until the food is appropriately seasoned. Many other examples of iteration in daily life may
easily come to mind, but examples of recursion do not. The main feature of recursion, self-
reference, does not often play a part in ordinary life. However, programming with recursive
procedures may yield simple and elegant code that is easy to verify, primarily because of the
self-reference involved.
Recursive procedures suffer a performance penalty compared to equivalent iterative code
because of the overhead of repeated procedure calls. The reasons for this overhead will be
discussed in a later section, but note for now that the overhead includes both space and time.
That means that a recursive procedure generally uses more memory and takes more time than
an equivalent iterative one. If the clarity of a recursive solution is sufficiently great and the
performance penalty small, the recursive solution may be preferred to an equivalent iterative
one. It is interesting to note that in the early days of programming any performance penalty was
frequently thought to be unacceptable and so recursion was not much used. In fact, many early
programming languages did not support recursion, but nowadays almost all do.
Recursion does not need to be used in programming because any recursive procedure or set of
procedures may be translated into an equivalent iterative version using only loops for repetition
and no recursive calls (direct or indirect). A recursive algorithm thus has a corresponding
equivalent nonrecursive or iterative version. One may speak of the recursive and iterative
versions of an algorithm. Just as recursive algorithms can be translated into iterative ones,
iterative algorithms may be translated into recursive ones. That is, any code containing loops
may have the loops eliminated and replaced by appropriate recursive calls. The term "removing
recursion" is often used to describe the process of translating from a recursive version to an
iterative version. In the general case, removing recursion requires the explicit use in the code of
a stack data structure. The general case, however, will not be considered here. Most
programming systems provide a special stack, invisible to the programmer, called the run-time
stack.
5.2.2. Procedure Calls and the Run-Time Stack
When a procedure is called, a pointer to the position in the code following the call must be
saved so that when the procedure returns, it can return to the proper place. The run-time stack
is used for this purpose. The run-time stack is also used for passing parameters and for local
variable storage for the called procedure.
In the following diagrams the run-time stack will be pictured as growing downward toward the
bottom of the page. This means that the top of the stack actually appears at the bottom in the
diagrams. This reflects the orientation used in most computers, where the stack starts at a
particular address in memory and grows toward lower addresses in memory.
When a procedure is called, a number of items are pushed on the stack. Together these items
constitute what is known as a stack frame for the procedure. The calling procedure (the caller)
pushes the parameters and the return address. The called procedure (the callee) reserves space
on the stack for its local variables. The building of the stack frame is then a cooperative venture
between the caller and the callee. When a procedure returns, the stack frame is torn down via
reversing (roughly) the steps used to build it. A stack frame is called an activation record by
some authors.
As an example, suppose a procedure has parameters P0 and P1 and local variables L0, L1 and
L2. Its stack frame can be diagrammed as shown in Figure 5.2 (this is a simplification of the
actual frame but suffices for the main ideas of stack frames):
P1
P0
return address
L0
L1
L2   <== top of run-time stack
Figure 5.2. A stack frame.
As a further example, consider a program with a main procedure (M) and procedures N, O, P,
Q and R. These letters will stand for either the procedures or their stack frames in the following
discussion. Suppose that in a particular run of the program M calls N which then returns, M
calls O which calls P which then returns and then O returns, M calls Q which calls R which calls
itself and then calls itself again, and then all return. R is recursive, and the important thing to
note is that it has a different stack frame for each time it is called. The different stack frames for
R mean that there are multiple copies of its parameters and local variables, which may have
different values. The following diagrams show the frames on the run-time stack at different
points in the execution of the program.
Run-Time Stack Sequence
As illustrated in Figure 5.3, M calls N which then returns, M calls O which calls P which then
returns and then O returns, M calls Q which calls R which calls itself and then calls itself again,
and then all return.
(Each line below lists the frames on the run-time stack from the bottom, at the left, to the top, at the right.)
M starts:       M
M calls N:      M  N
N returns:      M
M calls O:      M  O
O calls P:      M  O  P
P returns:      M  O
O returns:      M
M calls Q:      M  Q
Q calls R:      M  Q  R
R calls R:      M  Q  R  R
R calls R:      M  Q  R  R  R
R returns:      M  Q  R  R
R returns:      M  Q  R
R returns:      M  Q
Q returns:      M
M returns:      (stack empty)
Figure 5.3. Run-time stack sequence.
A tree diagram called the procedure call tree is a succinct representation of the growth and
shrinkage of the stack as the program executes. The tree is conventionally drawn growing
downward as illustrated in Figure 5.4.
Procedure Call Tree
 _____M_____
 |    |    |
 N    O    Q
      |    |
      P    R
           |
           R
           |
           R
Figure 5.4. The procedure call tree.
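As a concrete illustration, the following sketch shows one possible set of C functions whose
calls produce exactly the call pattern of Figures 5.3 and 5.4. The function bodies and the depth
parameter are hypothetical and are not taken from a program in this text.
#include <stdio.h>
void N(){ printf("in N\n"); }
void P(){ printf("in P\n"); }
void O(){ P(); }                    /* O calls P */
void R(int depth){                  /* R calls itself until three R frames are on the stack */
    printf("in R at depth %d\n", depth);
    if(depth < 3) R(depth + 1);}
void Q(){ R(1); }                   /* Q calls R */
void main(){                        /* main plays the role of M */
    N();
    O();
    Q();}
Tracing this program by hand reproduces the sequence of stack frames shown in Figure 5.3.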
In order to promote the reader's understanding of recursion, a number of code examples in C
using recursive functions (recall that all procedures in C are called functions) will be presented
and discussed in the following sections. In some cases iterative versions will be presented also
for comparison and contrast. The earlier examples will be simple and mostly impractical since
the true power and usefulness of recursion is evident only in more advanced applications which
will be presented in later chapters.
5.2.3. Sum of the First n Positive Integers
As a first example, consider the problem of determining the sum of the first n positive integers (=
the sum of the first n+1 nonnegative integers, including 0). There is a simple formula for this
sum: n(n+1)/2, but let us ignore that for a while so as to clarify certain concepts needed to
understand recursive algorithms. Of course this means that the only practical use of this
example is in teaching and studying recursion. Denoting the sum by Sn, and noting that
(1) Sn = n + (n-1) + (n-2) + ... + 2 + 1 + 0,
it is easy to see that Sn satisfies a recurrence relation:
(2) Sn = n + Sn-1, for n > 0.
A recurrence relation may form part of a recursive definition in which something is defined in
terms of itself. A proper recursive definition contains two parts: one or more base cases in
which the item being defined is defined not in terms of itself, and one or more recursive cases
where the item is defined in terms of smaller or more basic versions of itself. Recursive
definitions frequently may be easily translated into recursive implementations in code. Recursive
definitions are often called inductive definitions, especially in the mathematical literature.
It is evident that in the recurrence relation (2), Sn, a sum of n+1 numbers (including 0), is
expressed in terms of Sn-1, a sum of n numbers. So Sn is expressed in terms of a smaller
version of itself and (2) may then be included in a recursive definition of Sn:
Sn = 0, if n = 0 (base case);
Sn = n + Sn-1, for n > 0 (recursive case).
Now this definition may be directly translated into a C function, where the type unsigned int is
used since only nonnegative arguments are of interest:
unsigned int S(unsigned int n){
    if(n == 0) return 0;        /*base case*/
    return n + S(n-1);}         /*recursive case*/
Consider invoking this function with n = 2. The call S(2) results in evaluating the expression n +
S(n-1) for the return, and n-1 = 1 so the call S(1) is made. This call then results similarly in the
call S(0). n = 0 is the base case, so 0 is returned. Then S(1) returns 1+0 (= 1), and then S(2)
returns 2+1 (= 3). This sequence of calls and returns may be conveniently represented in the
following manner:
S(2)            (return 3)
    S(1)        (return 1)
        S(0)    (return 0)
The return values may be omitted from some such diagrams later in the chapter.
The parameters in the stack frames of S may be represented in a manner similar to that of the
whole stack frames done earlier, as illustrated by Figure 5.5:
(Each line lists the parameters on the stack from the bottom, at the left, to the top, at the right.)
call of S(2):       2
S(2) calls S(1):    2  1
S(1) calls S(0):    2  1  0
return 0:           2  1
return 1:           2
return 3:           (stack empty)
Figure 5.5. Parameters in the stack frames.
For reference below, (1) and (2) are here repeated:
(1) Sn = n + (n-1) + (n-2) + ... + 2 + 1 + 0,
(2) Sn = n + Sn-1, for n > 0.
Now consider an iterative version of the function S. Just as the recursive version follows from
the recurrence relation (2), the iterative version follows from the definition (1). The variable s is
used to accumulate the sum that becomes the return value of the function.
unsigned int S(unsigned int n){
    unsigned int s = 0, i;
    for( i = n; i > 0; i--)
        s += i;
    return s;}
Note that the recursive version of S contains an if statement, and the iterative version contains a
for loop. In the recursive version, repetition is handled by recursion and the if statement serves
to stop the recursion at the right point. In the iterative version, repetition is handled by the for
loop and is stopped at the right point by the test of the i > 0 condition. Some students just
beginning to study recursion will incorrectly use a loop in a recursive function. Sometimes a
loop is required (the putqueen function in Section 5.3.2 uses one), but more frequently not.
Other things worthy to note are that the iterative version uses two local variables while the
recursive version uses none, and the recursive version has two fewer lines of code. Sometimes
the recursive version of an algorithm also requires more parameters than the iterative version,
though this is not the case in this example.
The comparisons above generalize to a fairly large proportion of algorithms in practical use. A
summary of these observations follows.
Recursion vs. Iteration
recursive versions frequently have:
more parameters
fewer local variables
if statement(s)
no loops
less code
more overhead
iterative versions frequently have:
fewer parameters
more local variables
loops (always)
more code
less overhead
Beginners sometimes forget to include the base case in a recursive function, which leads to a
condition known as infinite recursion. In this case the function simply calls itself again and again
until the system runs out of the memory required for the recursive calls. This is to be avoided
for the same reasons that infinite loops are to be avoided.
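Finally, the closed-form formula n(n+1)/2 mentioned at the beginning of this section can of
course be coded directly, with no repetition at all. The following one-liner is a sketch added
here only for comparison with the recursive and iterative versions above.
unsigned int S(unsigned int n){
    return n*(n+1)/2;}          /* closed form: no recursion and no loop */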
5.2.4. Factorials
An example similar to the sum of the first n positive integers is the product of the first n positive
integers. Unlike the case with the sum, there is no simple formula for this quantity, and so
functions that compute it after the manner of the sum example may be of some practical use.
There is a standard term for this product: n factorial. There is also a standard notation: n!.
The factorial function in mathematics is defined by f(n) = n!. A so-called "empty product," a
product of 0 numbers, is by convention equal to 1. So considering the product of the first 0
positive integers to be 0!, it is seen that 0! = 1. Iterative and recursive definitions of n! are then
evident:
iterative:
n! = n(n-1)(n-2)...2*1
recursive:
n! = 1, if n = 0 (base case);
n! = n(n-1)!, for n > 0 (recursive case).
These definitions lead to the following two versions of the function factorial.
Recursive Factorial
unsigned int factorial(unsigned int n){
    if(n == 0) return 1;            /*base case*/
    return n * factorial(n-1);}     /*recursive case*/
Iterative Factorial
unsigned int factorial(unsigned int n){
    unsigned int p = 1, i;
    for( i = n; i > 0; i--)
        p *= i;
    return p;}
Just as s was used in the iterative function S to accumulate the sum, p is used in the iterative
factorial function to accumulate the product. It is usually recommended to use the iterative
instead of the recursive factorial function in practical applications, but in fact the performance
penalty is quite small so that either version could be used unless it is required to compute a great
many factorials quickly. Also note that a practical factorial function would probably use type
double rather than unsigned int. Computing factorials with type unsigned int will produce
overflow for factorials greater than 8! in 16 bits and greater than 12! in 32 bits.
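A version using double might look like the following sketch; the name factorial_d and its
details are illustrative additions, not taken from the text. With double the results are exact only
up through about 18!, after which rounding occurs, but much larger factorials can be
approximated without overflow.
double factorial_d(unsigned int n){
    double p = 1.0;
    unsigned int i;
    for(i = n; i > 0; i--)      /* accumulate the product, as in the iterative version above */
        p *= i;
    return p;}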
5.2.5. Collatz 3x+1 Problem
An interesting example comes from the Collatz 3x+1 problem, which relates to generating a
sequence of numbers by either dividing the previous number by 2 (if it is even) or multiplying it
by 3 and adding 1 (if it is odd). In the late 1940's the mathematician Collatz, for whom the
problem is named, investigated sequences of integers satisfying the following:
xn = 3xn-1+1 if xn-1 is odd;
xn = xn-1/2 if xn-1 is even.
These are recurrence relations and so may be incorporated into a recursive definition. Note that
if the number 1 ever occurs in such a sequence, then the sequence proceeds:
1, 4, 2, 1, 4, 2, 1, ...
and so stopping such a sequence when 1 occurs is a common convention. These sequences are
called Collatz sequences. If the sequence starts with an integer x, it is called the Collatz
sequence of x. Collatz hypothesized that the Collatz sequence of any positive integer x
eventually contained 1 and so, with the stopping convention, was finite. Many mathematicians
have tried to prove this, but none has yet done so. The mathematician Ulam did some work
with Collatz sequences and the length of a number's Collatz sequence is named for him: the
Ulam length of x is the length of the Collatz sequence of x. Recursive definitions of the Collatz
sequence and Ulam length of x and corresponding recursive functions are easy to create, and so
are the iterative versions. Only Collatz sequences for positive integers are considered here.
Those for negative integers are also of interest (and for x = 0 of very little interest). For further
information on this topic one may consult Martin Gardner's book Wheels, Life and Other
Mathematical Amusements.
One difficulty with any recursive functions to compute Collatz sequences or Ulam lengths is the
lack of a mathematical proof that the sequence is always finite (terminates in 1). This means that
the recursion may be infinite, a situation to be avoided, as previously mentioned. The sequences
have, however, been tested for very many values of x and have always terminated in such tests.
So we have empirical evidence that the recursion will terminate but not actual knowledge that it
will. The problem occurs because the recurrence
xn = 3xn-1+1 if xn-1 is odd
apparently does not define xn in terms of smaller or more basic entities. Nonetheless let us
consider the recursive definitions and functions and also the iterative versions, whose loops may
possibly be infinite for some values of x just as the recursion may be infinite for those values.
Recursive Definition for the Collatz Sequence of a Positive Integer x
xn = x if n = 0,
xn undefined if xn-1 = 1 (sequence stops),
xn = 3xn-1+1 if n > 0 and xn-1 > 1 is odd;
xn = xn-1/2 if n > 0 and xn-1 > 1 is even.
Collatz Sequence Display (Recursive)
void collatz(unsigned int x){
    printf("%u\n", x);              /*%u is proper format for unsigned int*/
    if(x <= 1) return;              /*base case*/
    if(x%2 != 0) collatz(3*x+1);
    else collatz(x/2);}
Collatz Sequence Display (Iterative)
void collatz(unsigned int x){
    printf("%u\n", x);
    while(x > 1){
        if(x%2 != 0) x = 3*x+1;
        else x = x/2;
        printf("%u\n", x);}}
Recursive Definition for Ulam Length of Positive Integer x
Ulam length(x) = 1 if x = 1,
Ulam length(x) = 1 + Ulam length(3x+1) if x > 1 and odd,
Ulam length(x) = 1 + Ulam length(x/2) if x > 1 and even.
Recursive Ulam Length Function
unsigned int ulam_length(unsigned int x){
    if(x <= 1) return 1;            /*base case*/
    if(x%2 != 0) return 1 + ulam_length(3*x+1);
    else return 1 + ulam_length(x/2);}
Iterative Ulam Length Function
unsigned int ulam_length(unsigned int x){
    unsigned int ul = 1;
    while(x > 1){
        if(x%2 != 0) x = 3*x+1;
        else x = x/2;
        ul++;}
    return ul;}
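To show how the functions above might be used, here is a small driver. It is an illustrative
sketch, not part of the original text, and assumes that one collatz function and one ulam_length
function from above appear earlier in the same file.
#include <stdio.h>
void main(){
    unsigned int x;
    printf("Enter a positive integer: ");
    scanf("%u", &x);
    collatz(x);                                     /* display the Collatz sequence of x */
    printf("Ulam length of %u: %u\n", x, ulam_length(x));}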
5.2.6. Greatest Common Divisor
Another example is one of the first known algorithms to be described in an unambiguously
algorithmic manner. This algorithm was known to the ancient Greeks and is given in essentially
a recursive form in Euclid, and is often called Euclid's algorithm. It is the algorithm for finding
the greatest common divisor of two given positive integers. This number is the largest integer
that evenly divides the two given integers and is abbreviated GCD. It is sometimes called the
greatest common factor. The algorithm may be understood by studying the code below.
Recursive Greatest Common Divisor Function
unsigned int gcd(unsigned int x, unsigned int y){
    if(x==0) return y;
    return gcd(y%x, x);}
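For comparison with the recursive and iterative pairs shown for the earlier examples, an
iterative version of Euclid's algorithm may be written as follows. This is a sketch added for
illustration (the name gcd_iter is not used elsewhere in the chapter); it performs the same
sequence of remainder steps as the recursive function above.
unsigned int gcd_iter(unsigned int x, unsigned int y){
    unsigned int r;
    while(x != 0){              /* same test as the recursive base case */
        r = y % x;              /* the remainder becomes the new x */
        y = x;
        x = r;}
    return y;}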
It is instructive to consider a trace of gcd function calls for a variety of arguments. A modified
gcd with a printf to display a trace of the arguments is given in a complete program below.
GCD with Argument Trace
#include <stdio.h>
unsigned int gcd(unsigned int x, unsigned int y){
    printf("%u, %u\n", x, y);
    if(x==0) return y;
    return gcd(y%x, x);}
void main(){
    printf("%u\n", gcd(91, 156));
    getchar();}
Output:
91, 156
65, 91
26, 65
13, 26
0, 13
13
Using the indentation technique previously described to indicate function calls, the sequence of
recursive calls may be illustrated in Figure 5.6.
gcd(91, 156)
    gcd(65, 91)
        gcd(26, 65)
            gcd(13, 26)
                gcd(0, 13)
                (return 13)
Figure 5.6. Call sequence for gcd(91, 156).
5.2.7. Towers of Hanoi
The Towers of Hanoi problem is based upon a children's game popular in Europe in the 1800's.
The game consisted of several wooden disks of different sizes and three wooden pegs mounted
upright on a wooden base. The disks had holes in the middle to allow them to fit on the pegs.
To play the game, the disks were first stacked on one peg, largest first and so on in order of size
to the smallest. The object was to move all the disks to one of the other pegs, one disk at a
time, subject to the rule that no disk could rest on a disk smaller than itself. A manufactured
legend was supplied with the game saying that in a temple in Hanoi (or Benares, or some other
location) there were similar disks but 40 (or 64) of them, made of gold and stacked on diamond
needles. In this temple monks were transferring the disks one at a time, following the rule.
When they completed the task, the world would come to an end.
It is not immediately clear that a solution exists for the towers of Hanoi problem. Thinking
recursively, however, provides a demonstration that there is a solution. The idea is that
transferring the disks from one peg to another involves the use of the third peg to help the
transfer. This third peg is called the auxiliary peg. If the pegs are numbered 0, 1, 2, then the
problem may be phrased: transfer the disks from peg 0 to peg 2 using peg 1 as the auxiliary.
The problem may be generalized, where i, j and k are distinct integers in the range 0 to 2:
transfer the disks from peg i to peg k using peg j as auxiliary.
Now the power of recursive thinking becomes apparent if the original problem is reduced to the
following: to transfer n disks from peg 0 to peg 2 with peg 1 as auxiliary, first transfer n-1 disks
from peg 0 to peg 1 using peg 2 as auxiliary. Then move one disk from peg 0 to peg 2. After
this, transfer the n-1 disks from peg 1 to peg 2 using peg 0 as auxiliary.
There is the remaining question of how the n-1 disks are to be transferred. But using recursion,
only the base case needs to be addressed. The base case may be chosen as n = 0, so now only
the explicit method of transferring 0 disks need be considered. It is clear that to transfer 0 disks,
one does nothing. This results in the following recursive algorithm for towers of Hanoi, which
shows that a solution indeed exists. The word "by" indicates the auxiliary peg:
Pseudocode for Towers of Hanoi
to move n disks from i to k by j:
    if n = 0 do nothing; otherwise:
        move n-1 disks from i to j by k,
        move 1 disk from i to k,
        move n-1 disks from j to k by i.
Using the above pseudocode, a simple proof by mathematical induction shows that the minimum
number of moves to transfer n disks is 2^n - 1.
Proof that at Least 2^n - 1 Moves are Needed for n Disks
For n = 0, 2^0 - 1 = 1 - 1 = 0 and 0 moves are required.
For n > 0, the inductive assumption is that 2^(n-1) - 1 moves are needed for the transfer of the first
n-1 disks and 2^(n-1) - 1 moves required for the second n-1 disks. One move is required for the
lowest disk. Then the moving of all n disks requires (at least) 2^(n-1) - 1 + 1 + 2^(n-1) - 1 = 2*2^(n-1) - 1 =
2^n - 1 moves, QED.
Towers of Hanoi is an Exponential Algorithm (impractical for large n)
The preceding proof shows that the towers of Hanoi algorithm is of order 2^n or, using the big O
notation to be fully explained in a later chapter, O(2^n). Such algorithms are called exponential.
This implies that the algorithm takes twice as long to execute on n disks as on n-1 disks. So if it
takes 1 second to run on 20 disks, it will take about 2 seconds on 21, about 4 seconds on 22
and so on. 40 disks would require about 12 days. 64 disks would require about 557,845
years.
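Since the move count 2^n - 1 grows so quickly, it is easy to compute these numbers directly.
The following short sketch (not part of the towers program; the name hanoi_moves and the use
of unsigned long long are choices made here for illustration) evaluates 2^n - 1 for a given n.
#include <stdio.h>
unsigned long long hanoi_moves(int n){      /* number of moves for n disks, 0 <= n <= 64 */
    if(n == 64) return ~0ULL;               /* 2^64 - 1; shifting by 64 would be undefined */
    return (1ULL << n) - 1;}                /* 2^n - 1 */
void main(){
    printf("%llu moves for 20 disks\n", hanoi_moves(20));   /* 1048575 */
    printf("%llu moves for 64 disks\n", hanoi_moves(64));}  /* 18446744073709551615 */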
For certain practical problems sometimes the most obvious solution turns out to be an
exponential one like towers of Hanoi. In practice, these solutions are to be avoided wherever
possible. Sometimes algorithms can be found which give approximate solutions in much less
time. These algorithms are to be preferred over exact solutions when the approximation is
acceptable.
Below is a complete program for towers of Hanoi, with its output following. The output is a set
of instructions for moving 4 disks from peg 0 to peg 2 by peg 1. The function towers follows
the pseudocode given above. The move_disk function may be modified to show a graphic
display of disks moving from peg to peg.
Towers of Hanoi Program
#include <stdio.h>
void move_disk(int peg1, int peg2){
    printf("Move a disk from peg %d to peg %d.\n", peg1, peg2);}
void towers(int n, int i, int j, int k){
    if(n==0) return;
    towers(n-1, i, k, j);
    move_disk(i, k);
    towers(n-1, j, i, k);}
void main(){
    towers(4, 0, 1, 2);
    getchar();}
Output
Move a disk from peg 0 to peg 1.
Move a disk from peg 0 to peg 2.
Move a disk from peg 1 to peg 2.
Move a disk from peg 0 to peg 1.
Move a disk from peg 2 to peg 0.
Move a disk from peg 2 to peg 1.
Move a disk from peg 0 to peg 1.
Move a disk from peg 0 to peg 2.
Move a disk from peg 1 to peg 2.
Move a disk from peg 1 to peg 0.
Move a disk from peg 2 to peg 0.
Move a disk from peg 1 to peg 2.
Move a disk from peg 0 to peg 1.
Move a disk from peg 0 to peg 2.
Move a disk from peg 1 to peg 2.
5.2.8. Reversing a Line of Input
Recursion may be used to reverse a line of input without using an explicit stack or an array. The
idea is to get a character of the input line, save it, display the rest of the line in reverse
(recursively), and then display the saved character. The base case occurs at the end of the line
when the newline character is encountered. Here is an implementation as the function line_rev
included in a complete program.
#include <stdio.h>
void line_rev(){
    char c = getchar();
    if(c=='\n') return;
    line_rev();
    putchar(c);}
void main(){
    line_rev();
    getchar();}
Input:
abracadabra
Output:
arbadacarba
Displaying an Input Line in Order
The code for line_rev above, when given a slight modification, displays the line forward instead of
backward. If the character c is displayed before rather than after the recursive call, the line is
displayed forward. Calling the new function line_for, the following code results:
#include <stdio.h>
void line_for(){
    char c = getchar();
    if(c=='\n') return;
    putchar(c);
    line_for();}
void main(){
    line_for();
    getchar();}
Input:
abracadabra
Output:
abracadabra
5.3. Backtracking
5.3.1. Introduction
Backtracking is the exploration of a sequence of potential partial solutions to a problem until a
solution is found or a dead end is reached, followed by a regression to an earlier point in the
sequence and exploring a new sequence starting from there, and so on. This process may be
used to find one or all solutions to a problem, or to verify that no solution exists. Recursion is
often used to implement backtracking since the resulting code may be clearer and simpler than
in iterative (looping) implementations.
In the exploration of a sequence of potential partial solutions the regression to an earlier point in
the sequence and exploring a new sequence starting from there causes a branching in the search
for a solution. The sequences thus branch off from one another and form a tree structure.
Backtracking is essentially a depth-first search (DFS) of this tree, a concept to be explored
more fully in Chapter 11. This tree structure is much like a tree of procedure calls as discussed
previously. In fact, the tree is of exactly the same form as the tree of recursive function calls
when the backtracking is implemented with a recursive function.
Two examples will help to clarify the concepts used in backtracking. The first example is the
eight queens problem and some variants of it. The second example is the problem of escaping
from a maze.
5.3.2. The Eight Queens Problem
The game of chess is played on an 8 by 8 board with 64 squares. There are a number of pieces
used in the game. The most powerful one is called a queen. The eight queens problem refers to
a configuration which cannot occur in an actual game, but which is based on properties of the
queen. The problem is: place 8 queens on the board in such a way that no queen is attacking
any other.
The queen attacks any other piece in its row, column, or either of its two diagonals. In Figure
5.7 shown below, the queen (Q) is attacking all squares marked X; unattacked squares are shown as -.
- - - X - - - X
X - - X - - X -
- X - X - X - -
- - X X X - - -
X X X Q X X X X
- - X X X - - -
- X - X - X - -
X - - X - - X -
Figure 5.7. The eight queens problem.
The power of the queen is such that there is some difficulty in placing all eight queens. It may
not even be clear that such a placement is possible, but in fact it is possible, as will be seen later.
First consider an easier version of the problem: the four queens problem, in which four queens
are to be placed on a 4 by 4 board, none attacking any other. This problem will illustrate some
of the basic principles of backtracking.
The Four Queens Problem
Since it is evident that only one queen can occupy each row, the solution may proceed by
attempting to place one queen in each row. As a first try the queen is put in the first position in
the row that is not under attack. The following Figure 5.8 indicates this attempt.
Q - - -
- - Q -
- - - -
- - - -
Figure 5.8. Placing one queen in a row.
The above is a dead end since now no queen may be placed in the third row. But now we
backtrack to the second row and place the queen in the next unattacked position and continue,
as shown in Figure 5.9.
Q - - -
- - - Q
- Q - -
- - - -
Figure 5.9. Backtracking and placement.
But now there is no place to put a queen in the fourth row and a dead end is reached again.
Backtracking to the third row, we find no other position for the queen, and so we backtrack to
the second row. Here again there is no remaining position since the queen is already at the end
of the row. So backtracking to the first row, we place the queen in the next allowable position
and continue. This results in the following Figure 5.10.
- Q - -
- - - Q
Q - - -
- - Q -
Figure 5.10.
So eventually backtracking yields a solution in the four queens case. Further backtracking can
be used to give all solutions. Backtracking to the first row again and placing the queen in the
third position yields the solution shown in Figure 5.11.
- - Q -
Q - - -
- - - Q
- Q - -
Figure 5.11.
This solution is a mirror image of the first and in fact is the only other solution, as can be
determined by further backtracking.
So backtracking yields the two solutions of the four queens problem. Consider now the n
queens problem in which n queens are to be placed on an n-by-n board, none attacking any
other. The goal will be to implement the recursive backtracking algorithm described for the four
queens problem but generalized to n queens. The implementation may then be used to find
solutions for eight queens or any other number for which machine resources are adequate.
A dynamic array of n² characters will be used to represent the board and will be initialized to all
'-'. This array, named board, will be treated as an n-by-n array, but in the code we will actually
use a one-dimensional array, which is easier to allocate dynamically. Thus we will not be able
to write board[i][j] (since board is one-dimensional) and board will be treated as the pointer it
actually is and in place of board[i][j] will appear *(board+i*n+j).
A queen is placed by overwriting a board position with 'Q'. Before placing the queen, the code
must check whether the upper left diagonal, the column above, and the upper right diagonal are
free of other queens and only place the queen if so. Only positions above need to be checked
since the queens are placed starting in row 0 and proceeding down to row n-1 (success) or a
row less than n-1 (failure). A queen that was placed needs to be removed before backtracking.
The above ideas are used in the following complete program for the n queens problem.
Program for the n Queens Problem
#include <stdio.h>
#include <stdlib.h>
int solution;                                   /* number of solutions found so far */
void printboard(char * board, int n){
    int i, j;
    puts("\nPress Enter for next solution.\n");
    getchar();
    printf("\nSolution # %d:\n\n", ++solution);
    for(i = 0; i < n; i++){
        putchar('\t');
        for(j = 0; j < n; j++)
            printf("%c ", *(board + i*n + j));
        putchar('\n');}}
int aboveOK(char * board, int i, int j, int n){     /* is the column above (i, j) free of queens? */
    for(i--; i >= 0; i--)
        if(*(board + i*n + j) == 'Q')
            return 0;
    return 1;}
int upleftOK(char * board, int i, int j, int n){    /* is the upper left diagonal free of queens? */
    for(i--, j--; i >= 0 && j >= 0; i--, j--)
        if(*(board + i*n + j) == 'Q')
            return 0;
    return 1;}
int uprightOK(char * board, int i, int j, int n){   /* is the upper right diagonal free of queens? */
    for(i--, j++; i >= 0 && j < n; i--, j++)
        if(*(board + i*n + j) == 'Q')
            return 0;
    return 1;}
void putqueen(char * board, int row, int n){
    int j;
    if(row == n){                               /* all n queens placed: a solution */
        printboard(board, n);
        return;}
    for(j = 0; j < n; j++){                     /* try each column of this row */
        if(upleftOK(board, row, j, n)
           && aboveOK(board, row, j, n)
           && uprightOK(board, row, j, n)){
            *(board + row*n + j) = 'Q';         /* place a queen */
            putqueen(board, row+1, n);          /* recurse on the next row */
            *(board + row*n + j) = '-';}}}      /* remove the queen before backtracking */
void initboard(char * board, int n){
    int i;
    for(i = 0; i < n*n; i++)
        *(board + i) = '-';}
void main(){
    char * board;
    int n, c;
    do{
        solution = 0;
        puts("\nEnter size of board:");
        scanf("%d", &n);
        getchar();
        board = (char *)malloc(n*n*sizeof(char));
        initboard(board, n);
        putqueen(board, 0, n);
        free(board);
        printf("\n%d solutions total for %d queens problem.", solution, n);
        puts("\n\nContinue? (y/n):");
        while((c = getchar()) == '\n');
    }while(c == 'y' || c == 'Y');}
Partial Output
Enter size of board:
4
Press Enter for next solution.
Solution # 1:
- Q - -
- - - Q
Q - - -
- - Q -
Press Enter for next solution.
Solution # 2:
- - Q -
Q - - -
- - - Q
- Q - -
2 solutions total for 4 queens problem.
Continue? (y/n):
y
Enter size of board:
8
Press Enter for next solution.
Solution # 1:
Q - - - - - - -
- - - - Q - - -
- - - - - - - Q
- - - - - Q - -
- - Q - - - - -
- - - - - - Q -
- Q - - - - - -
- - - Q - - - -
Press Enter for next solution.
.......
Solution # 92:
- - - - - - - Q
- - - Q - - - -
Q - - - - - - -
- - Q - - - - -
- - - - - Q - -
- Q - - - - - -
- - - - - - Q -
- - - - Q - - -
92 solutions total for 8 queens problem.
Continue? (y/n):
n
5.3.3. Escape from a Maze
Another classic backtracking problem is that of escaping from a maze. The main idea is to
proceed into the maze until a dead end is reached and then backtrack to an earlier position and
continue from there, backtracking again if a dead end is reached, and so on. If an escape route
exists, it will eventually be found by this method. In order for the backtracking to be done, the
route so far taken must be remembered. This is most easily accomplished by marking the path.
For definiteness let us assume the maze is represented by an n-by-n character array with the
following conventions. A path is to be constructed as a sequence of moves from a designated
starting position to an exit position (assumed to be on one edge or corner of the array). The
only moves allowed are up 1 position, down 1, left 1 and right 1 (to an open position). No
diagonal moves are allowed. Open positions will be represented by blanks and blocked
positions (hedges) by X's. An E marks the exit position. The path will be marked by asterisks.
Figure 5.12 is an example of a maze represented in the above fashion.
Figure 5.12. A 9-by-9 maze drawn with the above conventions: open positions blank, hedges
marked X, and the exit marked E on an edge (diagram omitted).
Assuming a starting position of (0, 0), a successful path would be represented as shown in
Figure 5.13.
Figure 5.13. The same maze with a successful path from (0, 0) to the exit marked by asterisks
(diagram omitted).
The path above is a direct path to the maze exit, but a path constructed by backtracking would
generally include some branches into dead ends. The implementation of a recursive
backtracking solution is left as an exercise.
Exercises:
Beginners Exercises:
1. What operation adds an item to the top of a stack?
2. What operation removes an item from the top of a stack?
3. Which stack operation is prohibited on an empty stack?
4. What does LIFO mean?
5. What is a recursive function?
6. Recursion is a form of which of the three basic control structures?
7. For a recursive function to terminate, it must contain at least one ____ case.
8. If a recursive function is called by another function, then calls itself and then calls itself again,
how many stack frames for the function are now on the run-time stack?
9. Would a recursive function for a particular task tend to have more or fewer parameters than
an equivalent iterative function?
10. How many moves does the towers of Hanoi algorithm make for (a) 10, (b) 20, (c) 30
disks?
Intermediate Exercises:
11. In the following sequence a letter means push that letter on the stack and * means pop a
character from the stack. Indicate the sequence of characters resulting from the pops.
LAST***IN***FIRST***OUT***STACK***
12. Write a recursive function to compute the nth Fibonacci number Fn. Follow the
recurrence relation: F0 = 0, F1 = 1, Fn = Fn-1 + Fn-2 for n > 1. This will yield a highly
inefficient function that should not be used for practical computation.
13. Write an iterative function to compute the nth Fibonacci number.
14. Write a recursive function to compute the binomial coefficient C(n, k) (n and k
nonnegative) defined by the recurrence relation: C(n, 0) = 1, C(n, n) = 1, C(n, k) = C(n-1,
k) + C(n-1, k-1) for n > k > 0. This, as in exercise 12 above, will yield a highly inefficient function
that should not be used for practical computation.
15. Write an iterative function to compute the binomial coefficient C(n, k).
16. Write a recursive function to compute Ackermann's function, defined by: A(0, n) =
n+1, A(m, 0) = A(m-1, 1) for m > 0, A(m, n) = A(m-1, A(m, n-1)) for m, n > 0. This is a
function whose values can be extremely large for small arguments. It can be used to test
how well a particular compiler on a given computer implements recursion. Technically it is
a recursive function that is not primitive recursive. The definition of primitive recursive is
beyond the scope of this work, but Ackermann's function may informally be thought of as
an extremely recursive function.
Advanced Exercises:
17. Write a program which allows a user to push an integer on a stack, pop an integer and
display it, or quit. The user should be able to repeat these choices until quitting. Do not allow
the user to pop an empty stack.
18. Repeat problem 17 but with strings in place of integers.
19. Modify any of the recursive functions to display the recursive call sequence with proper
indentation, such as the following for gcd.
gcd(91, 156)
    gcd(65, 91)
        gcd(26, 65)
            gcd(13, 26)
                gcd(0, 13)
20. Write a recursive function queencount(int n) which returns the number of solutions for the n
queens problem.
21. Modify the towers of Hanoi program to allow user input of the number of disks and to
repeat until the user chooses to quit.
22. Modify the solution of problem 21 to allow suppression of printing and timing of the
execution of the towers function as called from main. Determine the number of disks that can
be processed in one second on your computer. Investigate timings for higher numbers of disks.
23. Write a program which uses either of the ulam_length functions to find and display the
number with the maximum Ulam length from 1 up to a user-supplied input.
24. Write a program which times the execution of finding the number with the maximum Ulam
length from 1 up to a user-supplied input using each ulam_length function. The program should
display the times for the recursive version and the times for the iterative version.
25. Implement a recursive solution to the maze escape problem. The maze should be read
from a text file that has the starting coordinates and size of the maze on the first line and the
character representation of the maze in subsequent lines. The maze array should be displayed
when a path is found or when it is determined that no path exists.
26. Modify the solution of problem 25 to display all successful paths.