Introduction and Review

Goals of Course

  1. To learn, understand and be able to utilize many of the data structures that are fundamental to Computer Science
  2. To understand implementation issues related to these data structures, and to see how they can be implemented in the Java programming language
  3. To understand and utilize programming ideas and techniques utilized in data structure implementation
  4. To learn more of the Java programming language and its features, and to become more proficient at programming with it

Review

Java is a platform-independent programming language. This means you can compile your program on Windows and run it on Mac OS X, Linux, and Windows. How is that possible?

In CS 7 and CS/COE 401, you covered the basics of programming in Java. In this course, we will use these to build and use the common computer science data structures. The basic programming topics you should be comfortable with include:

If you are unsure about these topics, spend some time reviewing them so you are more comfortable with them. Some helpful resources include:

We will briefly review Appendices B, C, and D and review some of these concepts in the first few lectures. However, they will be a refresher only, not an in-depth teaching of them as you would have received in CS/COE 401.

Classes and Objects

Classes are basically blueprints for data. The code below defines a Person class, but there are no instances of that class yet; there are no people because we haven't created them.

class Person {
    private String name;
    private int age;
    private String address;
    
    //...
}

Classes allow us to encapsulate everything about a thing (its data and operations) together in one place (the class). These data and operations are referred to as "instance data" and "instance methods".

Access restrictions (i.e. data hiding through private declarations) allow the implementation details of the data type to be hidden from a user. The modifiers public, protected, and private allow various levels of accessibility. What does each modifier do? When should you use each one? The table below shows when a member is accessible given its visibility modifier.

Modifier      Class  Package  Subclass  World
public        y      y        y         y
protected     y      y        y         n
no modifier   y      y        n         n
private       y      n        n         n

Because of data hiding, the users of the class can only see and (directly) interact with the public members of the class. Through these public members, users can determine the nature of the data stored in the class, but not the implementation details. This is called data abstraction. The user does not need to know (and their code should not rely on) the hidden implementation details of the class. For example, we don't know how the ArrayList works, we just know that it does. Data abstraction is related to abstract data types, which we'll discuss later.

Therefore, Java classes define the structure of Java objects (i.e. they are the blueprints for Java objects). Public members (often methods) give the interface and functionality of the objects. Private members (often data) hide the implementation details. Objects are specific instances of a class. They fill in the blanks of the class blueprint.
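As a minimal sketch of these ideas, the version of Person below adds a constructor and a few public methods. Those members, and the EncapsulationDemo class, are assumptions made for illustration; they are not part of the Person class shown earlier. The sketch shows private fields hiding the implementation, public methods forming the interface, and two objects filling in the blanks of the same blueprint.

```java
// Sketch of data hiding: the fields are private, the methods are public.
class Person {
    private String name;      // hidden implementation details
    private int age;
    private String address;

    // Hypothetical constructor: fills in the blanks of the blueprint.
    Person(String name, int age, String address) {
        this.name = name;
        this.age = age;
        this.address = address;
    }

    public String getName() { return name; }
    public int getAge() { return age; }

    // The class itself controls how its data may change.
    public void haveBirthday() { age++; }
}

class EncapsulationDemo {
    public static void main(String[] args) {
        // Two separate instances built from the same class.
        Person alice = new Person("Alice", 28, "123 Orchard Street");
        Person bob = new Person("Bob", 25, "456 Willow Avenue");

        alice.haveBirthday();
        System.out.println(alice.getName() + " is " + alice.getAge()); // Alice is 29
        System.out.println(bob.getName() + " is " + bob.getAge());     // Bob is 25

        // alice.age = 30;  // would not compile: age is private in Person
    }
}
```

Code outside Person can only go through getName, getAge, and haveBirthday, so the fields could later be stored differently without breaking that code.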

In addition to the three visibility modifiers mentioned above, we also need to be aware of these two modifiers:

References, Pointers, and Memory

Java has two kinds of variables: primitive and reference. Primitive types are the basic building block data of the language and include ints, doubles, booleans, and chars (among others). Reference types refer to objects. It's important to know how the two types are similar and, more importantly, how they differ. In the table below, what does each statement do?

Primitive Type        Reference Type
int i;                String s;
i = 445;              s = new String("Hello, World!");
int j = i;            String t = s;
                      String u = new String(s);
if (i == j) ...       if (s == t) ...
                      if (s == u) ...

Remember that operators operate on reference variables, not on the objects they refer to. For example, know when you want to compare reference variables and when you want to compare objects. Java does not support operator overloading (unlike C++ and Python, for example), so to compare objects we must use methods, such as equals and compareTo.

These compare the contents of the object. What does each of these methods do? How do you write your own equals or compareTo methods for your own classes?
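The difference between comparing references and comparing objects can be sketched as follows. The Temperature class is a hypothetical example invented for this sketch; it shows one common way to write equals and compareTo, not the only one.

```java
// Sketch: comparing references (==) versus comparing objects (equals, compareTo).
class Temperature implements Comparable<Temperature> {
    private final int degrees;

    Temperature(int degrees) { this.degrees = degrees; }

    // Two Temperature objects are equal when they hold the same value.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;                     // same object
        if (!(other instanceof Temperature)) return false;  // null or wrong type
        return degrees == ((Temperature) other).degrees;
    }

    // Objects that are equal must report the same hash code.
    @Override
    public int hashCode() { return Integer.hashCode(degrees); }

    // Negative, zero, or positive when this is less than, equal to,
    // or greater than other.
    @Override
    public int compareTo(Temperature other) {
        return Integer.compare(degrees, other.degrees);
    }
}

class ComparisonDemo {
    public static void main(String[] args) {
        String s = new String("Hello, World!");
        String t = s;              // t refers to the same object as s
        String u = new String(s);  // u refers to a new object with the same contents

        System.out.println(s == t);       // true: same object
        System.out.println(s == u);       // false: different objects
        System.out.println(s.equals(u));  // true: same contents

        Temperature a = new Temperature(70);
        Temperature b = new Temperature(70);
        Temperature c = new Temperature(85);
        System.out.println(a == b);              // false
        System.out.println(a.equals(b));         // true
        System.out.println(a.compareTo(c) < 0);  // true
    }
}
```

Overriding hashCode alongside equals keeps the class usable in hash-based collections.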

Why do reference variables have this complication? Why aren't they as straightforward to use as primitive variables? To understand this, let's first look at pointers. Although Java does not have pointers, many programming languages (e.g. C, C++, Pascal) use them.

Pointers are variables that store addresses of other memory locations. They allow indirect access of the data in objects. In the example below, x and y are pointers. The pointers store memory locations. At those memory locations are the objects.

Since the variable holds an address, if you work with the variable, you are working with the address. So if we have the line below in our code, this would have x and y point to the same object. That is, x would no longer point to its object, it would now point to the one y points to. This is called aliasing: having two (or more) pointers pointing to the same object.

x = y;

If you want to work with the object at that address, you must dereference the pointer (i.e. explicitly tell the computer to go to that address). In C++, this is done with the * operator. So if we really wanted to have two separate objects, but want one of them to be a copy of the other, we could do (assuming the classes are set up properly):

*x = *y;

Here, x still points to its original location (as does y). What changes is the value stored at the location x points to:

There is a lot more to pointers, but since Java doesn't have them, let's turn our attention now to references. One complication with pointers is that it is very easy for programmers to forget to dereference pointers before using them, making it very easy to write bad code. Reference variables were created to try to reduce this problem. They still "point" to objects, but now the dereferencing is implicit. You can still assign addresses (just like in the example above), but you can't manipulate them (as you could in older or more low-level programming languages like C and C++).
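Aliasing works the same way with Java references. The sketch below uses StringBuilder only because it is a convenient mutable class from the standard library; the variable names mirror the pointer example.

```java
// Sketch: assigning one reference to another creates an alias, not a copy.
class AliasDemo {
    public static void main(String[] args) {
        StringBuilder x = new StringBuilder("first");
        StringBuilder y = new StringBuilder("second");

        x = y;           // both variables now refer to the same object
        y.append("!");   // a change made through y ...
        System.out.println(x);       // second!  (... is visible through x)
        System.out.println(x == y);  // true

        // To get a separate object with the same contents, copy explicitly.
        StringBuilder z = new StringBuilder(y);
        z.append("?");
        System.out.println(y);       // second!
        System.out.println(z);       // second!?
    }
}
```

After x = y, nothing refers to the "first" object any more. The explicit copy is similar in spirit to *x = *y, with the difference that it creates a new object instead of overwriting an existing one.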

All objects in Java are allocated dynamically. Memory is allocated using the new operator, e.g.:

Random rand = new Random();
Scanner keyboard = new Scanner(System.in);

Once allocated, objects exist for an indefinite period of time, as long as there is an active reference to the object. Once there are no more active references to the object, the object is no longer accessible to the program (as in the example above where no pointers pointed to the "ClassA object 1"). In Java, these objects are marked for Garbage Collection.

The Java garbage collector is a process that runs in the background during program execution. When the amount of available memory runs low, the garbage collector reclaims objects that have been marked for collection. A fairly sophisticated algorithm is used to determine which objects can be garbage collected. (If you take CS 1621 or CS 1622 you will likely discuss this algorithm in more detail). If plenty of memory is available, there is a good chance that the garbage collector will never run.

See Example1.java and MyRectangle.java.

Building New Classes

Java has many predefined classes in its standard library, which contains hundreds of classes. Each class is designed for a specific purpose. However, there are many situations where we may need a class that is not already defined, thus we must define it ourselves. There are two primary techniques for this:

Composition

With composition, we build a new class using components (instance variables) that are from previously-defined classes. That is, we compose the class from existing "pieces". Through composition, we define a "has-a" relationship between the new class and the old one(s). For example, looking at our Person class from above:

class Person {
    private String name;
    private int age;
    private String address;
    
    //methods for the class
}

the Person class:

With composition, the new class has no special access to its fields. It interacts with them the way you would interact with any variable of that type (e.g. Person interacting with its name field is the same as a programmer interacting with a String variable). Methods in the new class use methods from the field to interact with it. For example, in the class below, the setCharAt method must indirectly change the character at a particular position; it cannot directly access the character at that position in the string and change it.

public class CompositionClass {
    private String name;
    
    public CompositionClass(String n) {
        name = new String(n);
    }
    
    public void setCharAt(int i, char c) {
        StringBuilder b = new StringBuilder(name);
        b.setCharAt(i, c);
        name = b.toString();
    }
}

Composition is very important when creating a data structure. Since the data structure is responsible for holding data provided by the programmer, it must be composed of other classes. In the example above, we already knew the type of data being held, but a data structure will not know ahead of time what the data type will be. When we talk about Generics, we will see a way to deal with this issue (hint: you've already used at least one generic class in CS/COE 401).

Inheritance

With inheritance, we build a new class (subclass or child class) by extending a previously-defined class (superclass or parent class). The subclass has all of the properties/members (i.e. data and methods) defined in the superclass. Unlike composition and its "has-a" relationship, inheritance defines an is-a relationship between subclass and superclass. For example, our Person class from above can be the superclass for these two subclasses:

class HourlyEmployee extends Person {
    private int employeeNum;
    private double payRate;
    private double hoursWorked;
    
    //methods for the class
}

class SalariedEmployee extends Person {
    private int employeeNum;
    private double payAmount;
    
    //methods for the class
}

Here, we have

We are able to make use of the Person class to simplify the creation of the other two classes. We don't need to duplicate code to create each of the three classes. Another advantage is that HourlyEmployee and SalariedEmployee can each look like a Person object. This means we can store both HourlyEmployee and SalariedEmployee objects in a Person reference variable. For example:

Person alice = new SalariedEmployee("Alice", 28, "123 Orchard Street", 105, 4000);
Person bob = new HourlyEmployee("Bob", 25, "456 Willow Avenue", 113, 12);

This is possible because a subclass looks exactly like its superclass, but with some extra stuff in it. In other words, everything in the superclass is also in the subclass. The reverse is not true. The subclass can have more things than its superclass. For example:

Notice that the first three fields in both classes are the same. Why do you think that's the case?

Through inheritance, a subclass can have special access to its superclass's fields, but it depends on the visibility of those fields. If the superclass declares them as private, then the subclass cannot directly access those fields. However, if the fields are declared protected, then the subclass does have special access to those fields and can directly modify them.
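The effect of field visibility on a subclass can be sketched as follows. This is a pared-down variation on the Person and HourlyEmployee classes: the constructors and the getName, moveTo, and describe methods are assumptions made for illustration, and one field is deliberately private while another is protected so both cases appear side by side.

```java
// Sketch: how field visibility in the superclass affects the subclass.
class Person {
    private String name;       // private: subclasses must go through getName()
    protected String address;  // protected: subclasses may use it directly

    Person(String name, String address) {
        this.name = name;
        this.address = address;
    }

    public String getName() { return name; }
}

class HourlyEmployee extends Person {
    private double payRate;

    HourlyEmployee(String name, String address, double payRate) {
        super(name, address);  // the superclass constructor sets the inherited fields
        this.payRate = payRate;
    }

    public void moveTo(String newAddress) {
        address = newAddress;  // allowed: address is protected in Person
    }

    public String describe() {
        // name = "..." would not compile here: name is private in Person
        return getName() + " lives at " + address + " and earns " + payRate + " per hour";
    }
}

class VisibilityDemo {
    public static void main(String[] args) {
        HourlyEmployee bob = new HourlyEmployee("Bob", "456 Willow Avenue", 12.0);
        bob.moveTo("789 Sunny Road");
        System.out.println(bob.describe());
        // Bob lives at 789 Sunny Road and earns 12.0 per hour
    }
}
```

Keep in mind that protected members are also visible to every class in the same package, so protected hides less than private does.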

Inheritance is very useful with data structures. One case is when you want to store different kinds of data (but all similar, such as different kinds of employees) in one data structure. For example, storing different kinds of employees in a list (ArrayList or LinkedList) of employees. This is referred to as polymorphism. Another case is when there exists a data structure that already does most of what you want. Instead of writing an entirely new data structure, with only small differences from the original, you can inherit from that data structure and extend it in the way you need.

Polymorphism

In the last section, the idea of polymorphism was introduced. Polymorphism means "having many forms" and refers to reference variables being able to refer to different kinds of objects. Consider the code snippet below:

Person employee;

The employee reference variable could refer to a Person object, or it could refer to a SalariedEmployee or HourlyEmployee. It can refer to any one of these because, through inheritance, SalariedEmployee is a Person and HourlyEmployee is also a Person. Could employee refer to any other kinds of objects?

Let's say we have the following classes. Compared with the earlier definitions, Person's fields are now protected and each class has new methods. Why do you think Person's fields were changed to protected?

class Person {
    protected String name;
    protected int age;
    protected String address;
    
    //methods for the class
    
    public String getTitle() {
        return name;
    }
}

class HourlyEmployee extends Person {
    private int employeeNum;
    private double payRate;
    private double hoursWorked;
    
    //methods for the class
    
    public int getEmployeeNumber() {
        return employeeNum;
    }
    
    public String getTitle() {
        return name + ", hourly employee";
    }
}

class SalariedEmployee extends Person {
    private int employeeNum;
    private double payAmount;
    
    //methods for the class
    
    public int getEmployeeNumber() {
        return employeeNum;
    }
    
    public String getTitle() {
        return name + ", salaried employee";
    }
}

Here, both subclasses override the definition of the getTitle method. However, since there was a getTitle method in Person, all Person reference variables know about this method.

Person [] people = new Person[3];
people[0] = new SalariedEmployee("Alice", 28, "123 Orchard Street", 105, 4000);
people[1] = new HourlyEmployee("Bob", 25, "456 Willow Avenue", 113, 12);
people[2] = new Person("Carol", 22, "789 Sunny Road");

for (int i=0; i < people.length; i++) {
    System.out.println("The title is: "+people[i].getTitle());
}

When a method is called, it is the object's method that is called, not the reference variable's method. So the output of the code snippet above will be:

The title is: Alice, salaried employee
The title is: Bob, hourly employee
The title is: Carol
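For anyone who wants to run this example, here is one way the pieces could fit together. The constructors are assumptions inferred from the calls in the snippet (the course's actual constructors are not shown), and the getEmployeeNumber methods are left out to keep the sketch short.

```java
// Runnable sketch of the getTitle example. The constructors are assumptions.
class Person {
    protected String name;
    protected int age;
    protected String address;

    Person(String name, int age, String address) {
        this.name = name;
        this.age = age;
        this.address = address;
    }

    public String getTitle() {
        return name;
    }
}

class HourlyEmployee extends Person {
    private int employeeNum;
    private double payRate;
    private double hoursWorked;  // left at 0.0 in this sketch

    HourlyEmployee(String name, int age, String address, int employeeNum, double payRate) {
        super(name, age, address);
        this.employeeNum = employeeNum;
        this.payRate = payRate;
    }

    @Override
    public String getTitle() {
        return name + ", hourly employee";
    }
}

class SalariedEmployee extends Person {
    private int employeeNum;
    private double payAmount;

    SalariedEmployee(String name, int age, String address, int employeeNum, double payAmount) {
        super(name, age, address);
        this.employeeNum = employeeNum;
        this.payAmount = payAmount;
    }

    @Override
    public String getTitle() {
        return name + ", salaried employee";
    }
}

class TitleDemo {
    public static void main(String[] args) {
        Person[] people = new Person[3];
        people[0] = new SalariedEmployee("Alice", 28, "123 Orchard Street", 105, 4000);
        people[1] = new HourlyEmployee("Bob", 25, "456 Willow Avenue", 113, 12);
        people[2] = new Person("Carol", 22, "789 Sunny Road");

        // Each call runs the version of getTitle that belongs to the object.
        for (int i = 0; i < people.length; i++) {
            System.out.println("The title is: " + people[i].getTitle());
        }
    }
}
```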

Continuing from the code snippet above, what if we tried to do:

for (int i=0; i < people.length; i++) {
    System.out.println("The title is: "+people[i].getEmployeeNumber());
}
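That loop does not compile. The compiler checks each call against the type of the reference variable (Person), and Person declares no getEmployeeNumber method, even though some of the objects in the array have one. One possible workaround, sketched below with pared-down classes whose constructors are assumptions, is to test the object's type with instanceof and then cast the reference. The helper employeeNumberOf and its -1 result are also hypothetical choices made for this sketch.

```java
// Sketch: reaching a subclass-only method through a superclass reference.
class Person {
    protected String name;

    Person(String name) { this.name = name; }
}

class HourlyEmployee extends Person {
    private int employeeNum;

    HourlyEmployee(String name, int employeeNum) {
        super(name);
        this.employeeNum = employeeNum;
    }

    public int getEmployeeNumber() { return employeeNum; }
}

class CastDemo {
    // Returns the employee number, or -1 when p is not an HourlyEmployee.
    static int employeeNumberOf(Person p) {
        if (p instanceof HourlyEmployee) {
            HourlyEmployee h = (HourlyEmployee) p;  // safe: the type was just checked
            return h.getEmployeeNumber();
        }
        return -1;
    }

    public static void main(String[] args) {
        Person[] people = { new HourlyEmployee("Bob", 113), new Person("Carol") };
        for (int i = 0; i < people.length; i++) {
            // people[i].getEmployeeNumber();  // would not compile: not a Person method
            System.out.println(employeeNumberOf(people[i]));  // 113, then -1
        }
    }
}
```

Frequent instanceof checks are often a hint that the method belongs higher in the hierarchy, for example in a common superclass for all employees.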

Polymorphism is implemented utilizing two important ideas:

Method Overriding

When a parent's method is redefined in the child class, this is called method overriding. For method overriding, the method name and the method parameters (together, this is called the "method signature") must be identical between the two method headers. If the name is the same but the parameters differ, then the method is overloaded instead of overridden. For a subclass object, the definition in the subclass replaces the version in the superclass, even if a superclass reference is used to access the object. We saw an example of that in the for loop example above with the getTitle method.
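The difference between overriding and overloading can be sketched with pared-down versions of the classes. The constructors and the getTitle(String) overload are assumptions added for illustration.

```java
// Sketch: overriding (same signature) versus overloading (different parameters).
class Person {
    protected String name;

    Person(String name) { this.name = name; }

    public String getTitle() { return name; }
}

class HourlyEmployee extends Person {
    HourlyEmployee(String name) { super(name); }

    // Same name and same (empty) parameter list: this overrides Person's getTitle.
    @Override
    public String getTitle() { return name + ", hourly employee"; }

    // Same name but a different parameter list: this overloads getTitle.
    public String getTitle(String prefix) { return prefix + " " + getTitle(); }
}

class OverrideDemo {
    public static void main(String[] args) {
        Person p = new HourlyEmployee("Bob");
        System.out.println(p.getTitle());       // Bob, hourly employee
        // p.getTitle("Mr.");  // would not compile: Person has no getTitle(String)

        HourlyEmployee h = new HourlyEmployee("Bob");
        System.out.println(h.getTitle("Mr."));  // Mr. Bob, hourly employee
    }
}
```

The @Override annotation is optional, but it asks the compiler to reject the method if it does not actually override anything, which catches accidental overloads.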

If the subclass would like to access the superclass' version of the method, then the class must use the super reference. For example:

class Person {
    protected String name;
    protected int age;
    protected String address;
    
    //...
    
    public String toString() {
        return name+" ("+age+" years old), at address: "+address;
    }
}

class HourlyEmployee extends Person {
    private int employeeNum;
    private double payRate;
    private double hoursWorked;
    
    //...
    
    public int getEmployeeNumber() {
        return employeeNum;
    }
    
    public String toString() {
        return super.toString()+"\n"+" Hourly Employee (#"+employeeNum+")";
    }
}

Here, HourlyEmployee's toString method makes use of Person's toString method.

Dynamic (or late) Binding

Dynamic binding means that the code executed for a method call is associated with the call during run-time. The actual method executed is determined by the type of the object, not the type of the reference. We saw an example of that in the for loop example above with the getTitle method. There, the object's getTitle method was called, not the reference variable's version of the method.

Why is polymorphism useful?

Polymorphism is very useful if we want to access collections of mixed data types consistently. For example, we may want a list of all employees at a company and print out their title and addresses. This is easier to do with just one list for all employees rather than one list for salaried employees and another for hourly employees.

For data structures, polymorphism is useful when we want a data structure to hold many related, but different, kinds of objects. Because all child classes can look like their parent (or grandparent, ...) class, a data structure can be told it will hold instances of one class, but could actually hold instances of that class or instances of descendants of that class.

Class Hierarchies

With inheritance, we have class hierarchies, such as this one with the people/employee example above:

This diagram tells us that both HourlyEmployee and SalariedEmployee descend/inherit from Person.

Let's consider a larger example. In this company, there are two kinds of staff members: volunteers and employees. All staff members have names, addresses, and phone numbers. However, only employees are paid, meaning we need their social security number and their pay rate. Now, there are three kinds of employees: hourly, executive, and regular. Hourly employees have an hourly pay rate, whereas the other two are salaried. Executives are eligible for a bonus, but hourly and regular employees are not. To summarize, this is the information we need for each kind of staff member:

When designing a class hierarchy, it is important to reduce the duplication of code. Thus, common information should be pushed up the hierarchy as high as possible. However, it is also useful to have meaningful/sensical classes for programmers to use; this includes maintaining the is-a relationship.

Given the requirements above, one might design a class hierarchy as follows. Since all staff members have a name, address, and phone number, and since the volunteer only needs that information, we can make the top of the hierarchy a Volunteer class. From that could descend an HourlyEmployee class and an Employee class since both just add social security number and pay rates (one hourly, one salary). Finally, the Executive class could descend from Employee because executives are just regular employees who can get a bonus. So, the hierarchy could look like:

However, there are some problems with this hierarchy. What are they?

What would a better hierarchy look like?

Abstract Classes

In designing a better hierarchy, it might be useful to create a superclass that both the volunteer class and the employee classes could descend from. The problem with this new class is that it wouldn't represent an actual staff member, but would just be a class that holds things common to all staff members. We could make this class an abstract class. An abstract class is used to give cohesion to its subclasses. No instances of the abstract class will be created (in fact, none can be created). All fields and methods implemented in the abstract class are inherited by its child class(es).

To create an abstract class, use the class modifier abstract. One or more methods may be declared abstract by using the method modifier abstract. Abstract methods do not have method bodies (just like interface methods). Non-abstract child classes must implement those abstract methods or be abstract classes themselves. Since an abstract class does not have method bodies for its abstract methods, a programmer cannot create objects from abstract classes.

To correct one of the problems with the hierarchy above, we could create an abstract class called StaffMember. All other classes can then be descendants of that class.
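A minimal sketch of such a hierarchy might look like the following. It is not the contents of Payroll.zip: the abstract pay method, the summary method, the constructors, and the sample data are all assumptions made for illustration, and only two of the subclasses are shown.

```java
// Sketch: an abstract superclass holding what all staff members share.
abstract class StaffMember {
    protected String name;
    protected String address;
    protected String phone;

    StaffMember(String name, String address, String phone) {
        this.name = name;
        this.address = address;
        this.phone = phone;
    }

    // Every staff member can report pay, but only subclasses know the amount.
    public abstract double pay();

    // A concrete method inherited by all subclasses.
    public String summary() {
        return name + " is paid " + pay();
    }
}

class Volunteer extends StaffMember {
    Volunteer(String name, String address, String phone) {
        super(name, address, phone);
    }

    @Override
    public double pay() { return 0.0; }  // volunteers are not paid
}

class Employee extends StaffMember {
    protected String socialSecurityNumber;
    protected double payRate;

    Employee(String name, String address, String phone,
             String socialSecurityNumber, double payRate) {
        super(name, address, phone);
        this.socialSecurityNumber = socialSecurityNumber;
        this.payRate = payRate;
    }

    @Override
    public double pay() { return payRate; }
}

class PayrollDemo {
    public static void main(String[] args) {
        // new StaffMember(...) would not compile: StaffMember is abstract.
        StaffMember[] staff = {
            new Volunteer("Dana", "1 Elm Street", "555-0100"),
            new Employee("Eli", "2 Oak Street", "555-0101", "000-00-0000", 1500.0)
        };
        for (StaffMember s : staff) {
            System.out.println(s.summary());
        }
        // Dana is paid 0.0
        // Eli is paid 1500.0
    }
}
```

Volunteer no longer sits at the top of the hierarchy, so an Employee is a StaffMember without also being a Volunteer.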

An example of the final source code can be found here: Payroll.zip

Interfaces

Unlike some programming languages, Java allows only single inheritance. That is, one class cannot have more than one parent. Java language developers chose to offer only single inheritance for two basic reasons:

The Java language developers also looked at other programming languages that support multiple inheritance, such as C++, and found that multiple inheritance is rarely used. For a more thorough explanation of why multiple inheritance is not offered, read Why Multiple Inheritance is Not Supported in Java.

One big, powerful use of inheritance is polymorphism. Often, when a programmer wants multiple inheritance, they want it for polymorphism (i.e. they want a class to be able to look like two different other things). Java offers a way to do this with interfaces.

An interface is a named set of methods (i.e. method headers, but no bodies). Basically, an interface is like an abstract class with no instance fields and only abstract methods (although there are differences between abstract classes and interfaces). Static constants are allowed. Before Java 8, static methods and methods with bodies were not allowed in an interface; since Java 8, an interface may also declare static methods and default methods. Any Java class can implement an interface (regardless of whether it inherits from a class). In fact, any Java class can implement multiple interfaces and inherit from a class. To implement an interface, a class must declare so in the class header and implement all abstract methods in the interface.

An example of two interfaces:

public interface Laughable {
    public void laugh();
}

public interface Booable {
    public void boo();
}

We can then have a class implement these two interfaces. Any Java class can implement Laughable by implementing the method laugh(). Similarly, any Java class can implement Booable by implementing the method boo(). For example:

public class Ghost implements Booable {
    // various methods here (constructor, etc.)
    public void boo() {
        System.out.println("Boo!");
    }
}

public class Comedian implements Laughable, Booable {
    // various methods here (constructor, etc.)
    public void laugh() {
        System.out.println("Ha ha ha");
    }
    
    public void boo() {
        System.out.println("You stink!");
    }
}

All of the polymorphism behavior also applies to interfaces. The interface acts as a superclass and the implementing classes are like subclasses to it. An interface reference variable can be used to reference any object that implements that interface and only interface methods are accessible through that interface reference. For example:

Booable [] boo = new Booable[2];
boo[0] = new Ghost();
boo[1] = new Comedian();

for (int i = 0; i < boo.length; i++) {
    boo[i].boo();
}

Why do you think interfaces are useful for data structures?
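One possible answer is that code written against an interface works with any class that implements it, so the same data structure or algorithm can handle objects it knows nothing else about. The sketch below illustrates this with a hypothetical Measurable interface and two unrelated classes, Coin and Word; none of these names come from the course materials.

```java
// Sketch: one method that works for every class implementing an interface.
interface Measurable {
    double getMeasure();
}

class Coin implements Measurable {
    private final double value;

    Coin(double value) { this.value = value; }

    public double getMeasure() { return value; }
}

class Word implements Measurable {
    private final String text;

    Word(String text) { this.text = text; }

    public double getMeasure() { return text.length(); }
}

class InterfaceDemo {
    // Returns the item with the largest measure.
    // Assumes the array holds at least one item.
    static Measurable largest(Measurable[] items) {
        Measurable best = items[0];
        for (int i = 1; i < items.length; i++) {
            if (items[i].getMeasure() > best.getMeasure()) {
                best = items[i];
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Measurable[] items = { new Coin(0.25), new Word("structure"), new Coin(1.0) };
        System.out.println(largest(items).getMeasure());  // 9.0
    }
}
```

The largest method never mentions Coin or Word, so a new class only has to implement Measurable to be usable with it.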
