Lecture 10

Review

  1. What are the differences between a list and a tuple?
  2. What is the difference between writing to a file and appending to a file?
  3. What is the mode used for reading a file? For writing to a file? For appending to a file?

Files

Opening Non-Existent File

When you try to open a non-existent file to read from, you get a FileNotFoundError, which crashes your program. You can either handle the error (which we'll talk about soon) or check whether the file exists before trying to open it. If you plan on asking the user for a filename and using a loop to ensure the file exists, then checking whether it exists is easier than exception handling.

To check whether a file exists, you want to use the exists function from the os.path module. This function takes a string representing a filename and it returns True if the file exists and False otherwise:

>>> import os.path
>>> os.path.exists("missing.txt")
False
>>> infile = open("missing.txt", 'r')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'missing.txt'
>>>
>>> os.path.exists('writing_example.txt')
True
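
The loop mentioned above (keep asking until the user names a file that exists) could be sketched as follows. The function name ask_for_existing_file and its ask parameter are made up for this illustration; they are not part of Python. Note that os.path.exists also returns True for a folder, so this is only a minimal sketch.

```python
import os.path

def ask_for_existing_file(prompt="Enter a filename: ", ask=input):
    """Keep asking until the user gives a name that exists, then return it.

    'ask' is a hypothetical hook (defaulting to the built-in input)
    so the loop can be tried out without typing at the keyboard.
    """
    filename = ask(prompt)
    while not os.path.exists(filename):
        print("Cannot find", filename, "- please try again.")
        filename = ask(prompt)
    return filename
```

In a program you might then write infile = open(ask_for_existing_file(), 'r'), knowing the open call will not raise FileNotFoundError for a missing name.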

What do you think happens when you try to open a non-existent file to write to?

File Locations

In the examples so far, we have been working with files in the same location as the program. What if the file is somewhere else? One option is to give the full address of the file. Let's say your file is located at:

To open the file in your program, you could write:

infile = open("C:\\Users\\Michael\\Documents\\Programs\\Example\\data.txt", 'r') #Windows
infile = open("/Users/Michael/Documents/Programs/Example/data.txt", 'r') #Mac
infile = open("/home/michael/Documents/Programs/Example/data.txt", 'r') #Linux

Notice that the Windows address uses two backslashes to separate each part of the address instead of one. Why do you think there are two? You can also write a Windows address using forward slashes:

infile = open("C:/Users/Michael/Documents/Programs/Example/data.txt", 'r') #Windows

In fact, Python provides a function for joining together parts of a file's location: join from the os.path module. You can use it like this:

from os.path import join
#the drive needs its trailing backslash: join("C:", "Users") would give "C:Users"
infile = open(join("C:\\", "Users", "Michael", "Documents", "Programs", "Example", "data.txt"), 'r') #Windows
infile = open(join("/", "Users", "Michael", "Documents", "Programs", "Example", "data.txt"), 'r') #Mac
infile = open(join("/", "home", "michael", "Documents", "Programs", "Example", "data.txt"), 'r') #Linux
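
To see what join actually does, here is a small sketch comparing it with gluing the parts together by hand using os.sep, the separator for whatever system the program is running on. The names joined and by_hand are just for this illustration.

```python
import os
from os.path import join

# join inserts the separator for the current system between the parts
joined = join("Example", "data.txt")

# building the same path by hand with the system's separator
by_hand = "Example" + os.sep + "data.txt"

print(joined)
print(joined == by_hand)
```

The first line printed depends on the system the program runs on, which is the point: the same source code produces the right separator in each place.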

Often, you don't need to (and wouldn't want to) specify the absolute position of a file. Instead, you usually want to specify the relative position (relative to where the program is being run). For example, one common convention is to have separate locations for source code files and data files. Let's say your project is located at:

Within the Example folder, let's say there are two sub-folders:

If main.py tried to open input.txt using open('input.txt', 'r'), that would fail, since giving just the filename ('input.txt') tells Python to look in the folder the program is being run from (here, the same location as the source code file). Instead, we need to specify the relative location. There are two basic parts to specifying a relative location:

So if main.py is located in Documents/Programs/Example/src, to access input.txt, you can do:

infile = open('../data/input.txt', 'r')
#that says to go to src's parent folder, then into the data folder inside of it, and open input.txt.

Let's say there was a data file at Documents/Programs/AllData/collected_data-2015-06-10.csv and you wanted main.py (still in Documents/Programs/Example/src) to read it. To open the data file:

infile = open('../../AllData/collected_data-2015-06-10.csv', 'r')
#notice that we used ".." twice to indicate going into src's grandparent folder

Sometimes, it is nice to know where your program is being run from (since that isn't always where the source code file is saved). To find out, use the getcwd function in the os module to "get the current working directory" (historically, folders were called directories).
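
A minimal sketch of getcwd in use. It also calls os.path.abspath, which is not covered in these notes: it turns a relative location into a full address based on the current working directory. The filename input.txt is only an example and does not need to exist.

```python
import os

# where the program is being run from
current_folder = os.getcwd()
print("Running from:", current_folder)

# the full address Python would use for the relative name 'input.txt'
full_path = os.path.abspath("input.txt")
print("open('input.txt', 'r') would look for:", full_path)
```

If the second line printed is not where your file really is, that explains a FileNotFoundError even though the file "is right there" next to your source code.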

File Names

File names are case sensitive on many systems (Windows, and macOS with its default settings, are exceptions), so you must use the correct capitalization when opening files. You cannot write to a file named "name.txt" then try to read from it by opening "Name.txt". Even if you are using Windows, there is no guarantee that others using your program (such as the grader) are using Windows. So you must be consistent in the capitalization of filenames.

Sometimes, file extensions are hidden. In Python (and other programming languages), you must include the file extension. If you write to a file named "name.txt", you cannot try opening the file by using "name".

File Formats

In the writing examples earlier, we were just writing sentences to a file with no regard for how to make extracting the information as easy as possible. The reading examples earlier were just spitting out to the user the contents of the file. Often, you will want the information stored in a file to be easily extracted by your program. To accomplish this, you must think about the best way to store data in the file.

Usually, each line in a file represents a single record (i.e. everything related to a single data point, single person, or single element of interest) since you can use the readline and readlines methods or the for-loop iteration example to read in one line at a time and process it.

Within a line, it is a very good idea to put values in the same order on each line. That makes processing the data much easier. If you're storing names, birthdates, and favorite colors, don't sometimes store them as name, birthdate, color and other times as birthdate, color, name.

Decide how the fields in a record are separated. Commas are traditional, but if you're saving text from the user that could contain commas, then you might want a different separator. Tabs are also traditional and less likely to cause problems than commas. But, you can use any separator you want.

Which of the following would be easier for a program to read?

Alice	10/24/86	red
Bob	3/14/90	green
Carol	7/14/80	blue

or

Alice,10/24/86,Red
Bob	1990-03-14	green
Carol	blue	7/14/80

Once you've decided on a format for the file, you will likely be using readline, readlines, and writelines to read and write files.
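
As one possible sketch of working with a fixed format, the two functions below turn a record into a tab-separated line and back again. The names format_record and parse_record are invented for this example, and it assumes the values themselves never contain a tab or a newline.

```python
def format_record(name, birthdate, color):
    """Build one line of the file: three fields separated by tabs."""
    return name + "\t" + birthdate + "\t" + color + "\n"

def parse_record(line):
    """Split one line of the file back into its three fields."""
    # remove the newline at the end, then split on the tab separator
    name, birthdate, color = line.rstrip("\n").split("\t")
    return name, birthdate, color

line = format_record("Alice", "10/24/86", "red")
name, birthdate, color = parse_record(line)
print(name, "was born on", birthdate, "and likes", color)
```

Because every line has the same fields in the same order, parse_record can unpack them directly. With an inconsistent file like the second one above, the unpacking would either fail or put values in the wrong variables.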

Write a program that asks the user for their name, birthdate, and favorite color. Save this information to a file (have the user specify the file). Continue asking until the user wants to stop. Write another program that reads this information and prints it to the screen.
