Classes are programming structures that hold data and methods. They allow you to bundle together all data related to a thing and all of the methods of interacting with that thing. This allows for data abstraction or data encapsulation. You are encapsulating all of the information related to a thing in one variable/object; you are abstracting away (or hiding) the details of how that information is stored. Basically, by creating a class, you are able to create a new kind of data type.
In the paragraph above, classes are described as representing "things". What kinds of things can classes represent? Anything, probably. When creating a class, you will need to answer these two questions:
The answers to these two questions depend on how the class will be used. Not all data for a thing is relevant for a program.
Some terminology:
A class has both a class header and a class body. The class header starts with the `class` keyword, then the class name, ending with a colon. The class header has some optional parts (such as inheritance) that we may cover later. The class body is indented from the class header and contains fields (variables) and methods (functions).
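A minimal sketch of such a class is shown below. The field defaults (`None` and `'cat'`) and the body of `set_var1` are assumptions inferred from the usage table later in these notes.

```python
class ClassEx1:                  # class header: keyword, name, colon
    # class body: everything indented under the header
    var1 = None                  # a field (assumed default)
    var2 = 'cat'                 # another field (assumed default)

    def set_var1(self, value):   # a method
        self.var1 = value        # store the value on this instance
```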
In the example above, the class is named "ClassEx1". Any valid identifier will work as a class name. Traditionally, class names capitalize the first letter in each word of the name (e.g. the 'C' in "Class" and the 'E' in "Ex").
In the body of the class, there are two fields. Notice that creating a field is very similar to creating a variable. There is also a method in the body. Creating a method is very similar to creating a function. Below, we'll cover some of the nuance in fields and methods (such as what `self` is doing in the method).
To use a class, you must create an instance of the class. With that instance, you can access fields and methods. In most cases, each instance of a class has its own set of fields. So, two instances of a class can each have their own values for the fields.
Code | Output |
---|---|
`a = ClassEx1()` | |
`print('a.var2 =', a.var2)  # this accesses var2, belonging to the instance 'a'` | `a.var2 = cat` |
`a.var2 = 'dog'  # changing a's var2's value` | |
`print('a.var2 =', a.var2)` | `a.var2 = dog` |
`print('a.var1 =', a.var1)` | `a.var1 = None` |
`a.var1 = 1` | |
`print('a.var1 =', a.var1)` | `a.var1 = 1` |
`a.set_var1('ant')  # here's how you call a method...more on this below` | |
`print('a.var1 =', a.var1)` | `a.var1 = ant` |
`print("notice a.var1's value changed")` | `notice a.var1's value changed` |
`b = ClassEx1()` | |
`print('b.var1 =', b.var1)` | `b.var1 = None` |
`print('b.var2 =', b.var2)` | `b.var2 = cat` |
`print("notice b's fields are separate from a's")` | `notice b's fields are separate from a's` |
`b.var1 = 'bird'` | |
`print('b.var1 =', b.var1)` | `b.var1 = bird` |
`print('a.var1 =', a.var1)` | `a.var1 = ant` |
`print("changing b's var1 does not affect a's var1")` | `changing b's var1 does not affect a's var1` |
self
A few additions make methods different from functions. In almost all cases, methods need access to the object's fields. In Python, this is accomplished with the first parameter in a method. This first parameter always refers to the object the method is being called on. Traditionally, this first parameter is called `self`; this is not required, but Python programmers would be confused if you do not follow this convention. Whenever you write a method, you must include at least one parameter (the `self` parameter).
Code | Output |
---|---|
`class ClassEx2:` | |
`    def method():` | |
`        print('method was called')` | |
`a = ClassEx2()` | |
`a.method()` | `Traceback (most recent call last):` |
 | `  File "<stdin>", line 1, in <module>` |
 | `TypeError: method() takes 0 positional arguments but 1 was given` |
When calling a method, Python automatically passes in a reference to the object you're dealing with. So even though no arguments were given to `method` in the example above, Python automatically included one.
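One way to see the automatic argument is to compare the usual call with the equivalent call made through the class. The sketch below uses a hypothetical class `Tracker` (not one of the examples in these notes) whose method returns `self` so the two calls can be compared.

```python
class Tracker:                      # hypothetical class, for illustration only
    def who_am_i(self):
        return self                 # hand back whatever object Python passed in

t = Tracker()

# Normal call: Python supplies t as the first argument behind the scenes.
implicit = t.who_am_i()

# Equivalent call through the class, passing the instance explicitly.
explicit = Tracker.who_am_i(t)

print(implicit is t)   # True
print(explicit is t)   # True
```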
This also means that if you want to pass an argument into a method call, you must have two parameters. The first parameter will be the self reference and the second will be the value to pass in.
Code | Output |
---|---|
`class ClassEx3:` | |
`    def methodBAD(value):  # "bad" because it only has one parameter, poorly-labeled` | |
`        print('the value was:', value)` | |
`    def methodGOOD(self, value):  # "good" because it has a self reference variable and a value parameter` | |
`        print('the value was:', value)` | |
`a = ClassEx3()` | |
`print('calling methodBAD')` | `calling methodBAD` |
`a.methodBAD('argument passed in')` | `Traceback (most recent call last):` |
 | `  File "<stdin>", line 1, in <module>` |
 | `TypeError: methodBAD() takes 1 positional argument but 2 were given` |
`print('calling methodGOOD')` | `calling methodGOOD` |
`a.methodGOOD('argument passed in')` | `the value was: argument passed in` |
`__init__` Method
When you create a new object (e.g. `a = ClassEx3()`), you're actually calling a constructor. A constructor is a method that "constructs" (also called "builds" or "initializes") the object, basically by setting up the object's fields. In Python, the constructor is called "`__init__`" (those are two underscores before "init" and two after), short for "initializing". For example:
Just like any other method, constructors can take arguments. For example, to initialize a Person object with their name, write the constructor as:
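A sketch of such a constructor, consistent with the `Person` code used in the table below:

```python
class Person:
    def __init__(self, name):   # name is supplied when the object is created
        self.name = name        # this line creates the name field

joel = Person('Joel')           # Python calls __init__ with name='Joel'
print(joel.name)                # Joel
```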
Notice that the field "name" doesn't exist until we create it in the constructor. This is ok. Often, the `__init__` method is the place where fields are created. Since this method is called whenever an object is created, fields created in this method will exist for the entire lifetime of the object. Now, when we create Person objects, we must pass in the person's name.
Code | Output |
---|---|
`class Person:` | |
`    def __init__(self, name):` | |
`        self.name = name` | |
`print('creating Joel')` | `creating Joel` |
`joel = Person('Joel')` | |
`print('creating baby')` | `creating baby` |
`baby = Person()` | `Traceback (most recent call last):` |
 | `  File "<stdin>", line 1, in <module>` |
 | `TypeError: __init__() missing 1 required positional argument: 'name'` |
Often, we can't trust users to do the right thing. Thus, with classes, we often engage in data hiding. Data hiding is when we hide fields from the user and force them to use methods to access and modify fields. These methods can then validate the values coming in from the user. They also help to ensure data abstraction and encapsulation. By default, all members are publicly-accessible, that is they are accessible both in the class and outside of the class. Private members are those that are only directly-accessible inside the class (i.e. the members are hidden).
To hide a field (or method) from the user, start the member's name with two underscores. This causes Python to secretly mangle the name so that it isn't easily accessible outside of the class. Inside the class, you can still access the member using its name (including the two underscores).
For example, we want an `age` field in our `Person` class, but it only makes sense for age to be a positive number (maybe even just a positive int). So, we don't want the age field to be publicly accessible (why not?). To make age private, we name it `__age`. See the example below:
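A minimal sketch of what this could look like, assuming the constructor takes both a name and an age (the parameter order and the `describe` helper are assumptions made for illustration):

```python
class Person:
    def __init__(self, name, age):
        self.name = name     # public field
        self.__age = age     # private field: note the two leading underscores

    def describe(self):      # hypothetical helper, only to show inside access
        # Inside the class, the private field is used by its plain name.
        return self.name + ' is ' + str(self.__age)

joel = Person('Joel', 30)
print(joel.describe())       # Joel is 30
```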
The user of the `Person` class can still access name directly but they can't access age:
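The following sketch shows the difference, again assuming a `Person` constructor that takes a name and an age. Reading `__age` from outside the class raises an `AttributeError`, because the mangled attribute is actually stored under the name `_Person__age`.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.__age = age     # stored on the object as _Person__age

joel = Person('Joel', 30)
print(joel.name)             # Joel

try:
    print(joel.__age)        # no attribute with this exact name exists
    blocked = False
except AttributeError:
    blocked = True
    print('cannot access __age from outside the class')

# The mangled name is visible in the object's attribute dictionary.
print(sorted(vars(joel)))    # ['_Person__age', 'name']
```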
If you want to allow the user to access your private fields, you need to write set and get methods (also called setters and getters or mutators and accessors). A get method gets the value of a field. A set method sets, modifies, or mutates the value of a field. Traditionally, accessors follow the naming convention "`get_field_name`" and mutators follow the naming convention "`set_field_name`". For example:
Now, if the user wants to access age, they can use the `set_age` and `get_age` methods. Notice that the `set_age` method also validates the age, ensuring that it is a positive int. Notice also that the `__init__` method now calls the `set_age` method instead of directly setting `__age`. Why do you think this change was made?
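A sketch of a `Person` class matching this description is given below. The notes do not say what the setter should do with an invalid value, so raising a `ValueError` here is an assumption; printing a message or ignoring the value would be other options.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.set_age(age)            # go through the setter so age is validated

    def get_age(self):               # accessor (getter)
        return self.__age

    def set_age(self, age):          # mutator (setter)
        # Validate before assigning: age must be a positive int.
        if not isinstance(age, int) or age <= 0:
            raise ValueError('age must be a positive int')  # assumed behavior
        self.__age = age

joel = Person('Joel', 30)
joel.set_age(31)
print(joel.get_age())                # 31
```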
Your setter should validate the value before doing the assignment. In many cases, your setter and getter methods should save and return copies of the fields and not the originals so that the field's value can only be modified through the setter. We'll talk more about this when we cover data aggregation.
In many other programming languages, it is strongly recommended to make all fields private. Python is more lax about that, but it is still a good idea to make your fields private.
With some fields public (e.g. `name` in the `Person` class) and other fields private (e.g. `age` in the `Person` class), accessing fields becomes inconsistent. The programmer will need to memorize which fields they can access directly and which fields they must use methods for.
One solution to this problem is to provide set/get methods for all fields. A lot of programming languages take this approach (and also say that almost all fields should be private). This is an option for Python as well. In our `Person` class, we could do:
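A sketch with both fields private. As before, rejecting a bad age with `ValueError` is an assumption, and `set_name` does no validation here.

```python
class Person:
    def __init__(self, name, age):
        self.set_name(name)      # both fields are now set through setters
        self.set_age(age)

    def get_name(self):
        return self.__name

    def set_name(self, name):
        self.__name = name

    def get_age(self):
        return self.__age

    def set_age(self, age):
        # Assumed behavior: reject anything that is not a positive int.
        if not isinstance(age, int) or age <= 0:
            raise ValueError('age must be a positive int')
        self.__age = age

joel = Person('Joel', 30)
print(joel.get_name(), joel.get_age())   # Joel 30
```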
In the code above, notice that `name` was changed to `__name`, set/get methods were written, and `set_name` is used in the `__init__` method.
While this solves our problem of inconsistent access to fields, it has the unfortunate effect of making the user type more to access a field. For example, now the user has to type:
instead of the shorter option of:
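As an illustration (the new name value is made up), the sketch below contrasts the two styles. The first uses set/get methods on a private field; the second is the direct access that is possible while `name` is public, shown on a separate hypothetical class `PublicPerson`.

```python
class Person:                        # field private, accessed via methods
    def __init__(self, name):
        self.set_name(name)

    def get_name(self):
        return self.__name

    def set_name(self, name):
        self.__name = name

class PublicPerson:                  # hypothetical: same data, public field
    def __init__(self, name):
        self.name = name

joel = Person('Joel')
joel.set_name('Joel B.')             # longer: method calls
print(joel.get_name())

pub = PublicPerson('Joel')
pub.name = 'Joel B.'                 # shorter: plain field access
print(pub.name)
```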
Python offers an alternative to this first solution: create properties. Properties are basically pseudo-fields. The user can treat them like a field, but secretly the user is interacting with methods. To create a property, first write set and get methods. Then, use the `property` function to create the property by passing in the get and set methods (do not call these methods when passing them in). The `property` function returns the pseudo-field, so store it into a field with the name you want the pseudo-field to have. For example:
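A sketch using the built-in `property` function with set/get methods like those described above. Raising `ValueError` in `set_age` remains an assumption.

```python
class Person:
    def __init__(self, name, age):
        self.set_name(name)
        self.set_age(age)

    def get_name(self):
        return self.__name

    def set_name(self, name):
        self.__name = name

    def get_age(self):
        return self.__age

    def set_age(self, age):
        if not isinstance(age, int) or age <= 0:
            raise ValueError('age must be a positive int')  # assumed behavior
        self.__age = age

    # Pass the methods themselves (no parentheses) to property().
    name = property(get_name, set_name)
    age = property(get_age, set_age)
```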
To now use these properties, treat them just like a field:
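For instance, with a cut-down `Person` that only has an `age` property (a sketch; the `ValueError` is an assumed way of rejecting bad values), reading and assigning look like ordinary field access while the methods still run:

```python
class Person:
    def __init__(self, age):
        self.age = age                   # runs set_age via the property

    def get_age(self):
        return self.__age

    def set_age(self, age):
        if not isinstance(age, int) or age <= 0:
            raise ValueError('age must be a positive int')
        self.__age = age

    age = property(get_age, set_age)

joel = Person(30)
joel.age = 31            # looks like a field, but calls set_age(31)
print(joel.age)          # calls get_age(); prints 31

try:
    joel.age = -5        # the setter's validation still applies
except ValueError as err:
    print('rejected:', err)

print(joel.age)          # still 31
```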
There are other things you can do with properties, such as provide a `del` method and provide documentation. We won't cover these features. However, the last thing to cover with properties is that you can use them to make constant fields. Constant fields are fields that do not change their values. To create a constant field, just create a get method, then pass only the get method to the `property` function. In the example below, two constant fields are created, `birthday` and `home_planet`, and some other changes are made to support those fields:
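A sketch of how this might look. How the birthday is stored is not specified in these notes, so taking it as a constructor argument and keeping it as a plain string are assumptions.

```python
class Person:
    def __init__(self, name, birthday):
        self.name = name
        self.__birthday = birthday       # set once here; no setter exists

    def get_birthday(self):
        return self.__birthday

    def get_home_planet(self):
        return 'Earth'                   # just a string literal

    # Only a getter is passed in, so these properties are read-only.
    birthday = property(get_birthday)
    home_planet = property(get_home_planet)

joel = Person('Joel', '1990-01-01')      # made-up date, for illustration
print(joel.birthday)       # 1990-01-01
print(joel.home_planet)    # Earth

try:
    joel.birthday = '2000-01-01'         # no setter, so this fails
except AttributeError:
    print('birthday cannot be changed')
```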
Notice that:
- The `birthday` property only has a getter, meaning it is not possible for the user to change the birthday.
- `home_planet` is a constant field whose value is just a string literal (`'Earth'`).