Dictionaries and Sets

Dictionaries

Dictionaries are a lot like lists. From a usage perspective, the big difference is that list indexes are just non-negative integers (in order, starting at zero), whereas dictionary indexes, called keys, can be almost any value (the exact requirements are covered below). Because of this, the elements of a dictionary have no numbered positions the way list elements do. Older versions of Python made no guarantee at all about the order of the elements in a dictionary; since Python 3.7, a dictionary remembers the order in which its keys were inserted. Another side effect is that checking whether an index is valid is more complicated for a dictionary than for a list, but Python provides help with this. Python's online documentation for built-in types provides information on how to use dictionaries.

Creating Dictionaries

To create a dictionary in Python, you can either use the dict function or curly brackets ({}). In most cases, it's easier to use curly brackets. In the example below, both a and b are empty dictionaries.

>>> a = dict()
>>> b = {}

To create a dictionary with values already in it, you must specify the key and the value. You do this inside the curly brackets as key : value. You separate these key-value pairs using commas. The example below creates two key-value pairs, mapping common names of animals to their scientific names.

>>> animals = {'cat' : 'Felis catus', 'dog' : 'Canis familiaris'}

Getting, Setting, and Deleting Values

Getting values out of a dictionary is very similar to getting values out of a list (or tuple). You just provide the key inside square brackets:

>>> print('The cat\'s scientific name is:', animals['cat'])
The cat's scientific name is: Felis catus

Similarly, to add new values to a dictionary, you just use index notation to indicate the key and assign the value for that key, as in the example below:

>>> animals['ant'] = 'Formicidae'
>>> animals['bird'] = 'Aves'

Now when we print out the dictionary, we see all four keys and values:

>>> print(animals)
{'bird': 'Aves', 'cat': 'Felis catus', 'dog': 'Canis familiaris', 'ant': 'Formicidae'}

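Notice that the order in this output is seemingly random: it is not insertion order, alphabetical order, or length order. That is how older versions of Python behave. In Python 3.7 and later, the entries print in the order they were inserted.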

To delete an entry in the dictionary, use the del keyword:

>>> del animals['bird']
>>> print(animals)
{'cat': 'Felis catus', 'dog': 'Canis familiaris', 'ant': 'Formicidae'}

Keys in a Dictionary

If you try to access a key that does not exist, you will get the KeyError exception:

>>> bird = animals['bird']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'bird'

There are a couple of ways to check whether a key exists in a dictionary. The technique you pick will depend on the problem you are solving.

One option is to put a try-except block around the indexing above and catch KeyError. For example:

try:
    bird = animals['bird']
except KeyError:
    print('bird not in animals')

If all you care about is whether the key is in the dictionary, you can use key in dictionary (or key not in dictionary), e.g.:

if 'bird' not in animals:
    print('bird not in animals')

If you want to get the value associated with the key, then you can use the get method. It has an optional second parameter: the value to return if the key is not in the dictionary (it defaults to None). For example:

value = animals.get('bird')
if value is None:
    print('bird not in animals')
else:
    print('bird\'s value is:', value)

value = animals.get('bird', 'BIRD IS NOT IN THE DICTIONARY')
if value == 'BIRD IS NOT IN THE DICTIONARY':
    print('bird not in animals')
else:
    print('bird\'s value is:', value)
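
As a rough side-by-side sketch of the three techniques above, the snippet below wraps each one in a small helper. The helper names (lookup_with_try, has_key, lookup_with_get) and the 'unknown' default are made up for this illustration; they are not part of Python.

```python
animals = {'cat': 'Felis catus', 'dog': 'Canis familiaris', 'ant': 'Formicidae'}

def lookup_with_try(mapping, key):
    """Hypothetical helper: index directly and catch KeyError."""
    try:
        return mapping[key]
    except KeyError:
        return None

def has_key(mapping, key):
    """Hypothetical helper: only answers whether the key is present."""
    return key in mapping

def lookup_with_get(mapping, key, default='unknown'):
    """Hypothetical helper: let get() supply a fallback value."""
    return mapping.get(key, default)

print(lookup_with_try(animals, 'bird'))   # None
print(has_key(animals, 'cat'))            # True
print(lookup_with_get(animals, 'bird'))   # unknown
```

The try-except form suits cases where a missing key is unusual, the in test suits cases where you never need the value, and get suits cases where a sensible fallback exists.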

Keys Must be Hashable

Have you ever wondered why ints, floats, and strs are immutable? Or why tuples exist when lists seem more versatile? One important reason is how dictionaries store their keys. Dictionary keys must meet certain requirements:

keys are immutable (i.e. their values do not change while they are in the dictionary)
keys are hashable (i.e. calling the hash function on them returns a value)

ints, floats, strs, and tuples (as long as every element of the tuple is itself hashable) all meet these requirements (as does frozenset, more on that soon). If you try to use a key that is not hashable, you will get a TypeError exception:

>>> a = {}
>>> key = [1, 2, 3]
>>> a[key] = 'value'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

This also means that if you want an object from a class you create to be a key that is looked up by its contents, you should:

make your class immutable (i.e. no methods that change fields after the object is created)
implement the special methods __hash__ and __eq__ so that objects that compare equal also have the same hash
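
Here is a minimal sketch of such a class. Point is a hypothetical example class, not something from the standard library, and the leading underscores on its fields are only a convention signalling that they should not be changed after creation.

```python
class Point:
    """A hypothetical 2-D point whose fields are never changed after creation."""

    def __init__(self, x, y):
        self._x = x
        self._y = y

    def __eq__(self, other):
        # Two points are equal when their coordinates match.
        if not isinstance(other, Point):
            return NotImplemented
        return (self._x, self._y) == (other._x, other._y)

    def __hash__(self):
        # Hash the same fields that __eq__ compares, so equal points hash equally.
        return hash((self._x, self._y))

labels = {Point(0, 0): 'origin'}
labels[Point(1, 2)] = 'somewhere else'

# A brand-new but equal Point finds the existing entry.
print(labels[Point(0, 0)])   # origin
```

Reusing the hash of a tuple of the fields is a common shortcut, because tuples already know how to combine the hashes of their elements.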

Looping through Dictionaries

Just like other containers (lists and tuples), you will often want to loop through everything stored in a dictionary. There are a couple of ways to do this. The one you choose is up to you.

Before we talk about looping, let's first talk about three methods:

keys - Returns a view of the dictionary's keys. Its type is dict_keys, which behaves much like a set and automatically reflects any change to the dictionary's keys (i.e. keys/values being added or removed). For example:

>>> keys = animals.keys()
>>> print(keys)
dict_keys(['dog', 'ant', 'cat'])

values - Returns a view of the dictionary's values. Its type is dict_values, a read-only collection that automatically reflects any change to the dictionary's values (i.e. keys/values being added or removed, or the value for a key changing). For example:

>>> vals = animals.values()
>>> print(vals)
dict_values(['Canis familiaris', 'Formicidae', 'Felis catus'])

items - Returns a view of the dictionary's (key, value) pairs. Its type is dict_items, which behaves much like a set and automatically reflects any change to the dictionary's keys and/or values. Each element of dict_items is a (key, value) tuple. For example:

>>> items = animals.items()
>>> print(items)
dict_items([('dog', 'Canis familiaris'), ('ant', 'Formicidae'), ('cat', 'Felis catus')])

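Those methods return special types. You cannot index into the returned value, nor can you add/remove values with them. They only reflect what is in the dictionary. If the dictionary changes, then these will also change. What if you want to be able to index into them, or rearrange their order? In that case, convert the returned value to a list with the list function (or use sorted to get a sorted list). The list is a separate copy, so it no longer changes when the dictionary does.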
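
The short sketch below illustrates the difference between a live view and a list copied from it. The variable names are chosen only for this example.

```python
animals = {'cat': 'Felis catus', 'dog': 'Canis familiaris', 'ant': 'Formicidae'}

keys_view = animals.keys()      # live view: tracks the dictionary
key_list = list(keys_view)      # independent copy: can be indexed
sorted_keys = sorted(animals)   # sorted list of the keys

animals['bird'] = 'Aves'        # change the dictionary afterwards

print(len(keys_view))   # 4 (the view sees the new key)
print(len(key_list))    # 3 (the copy does not)
print(sorted_keys[0])   # ant
```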

With those methods out of the way, we can now look at looping through a dictionary. If you just want to loop through the values (and don't care about their keys), you can loop through the object returned by the values method.
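
As a small illustration of looping over just the values, the sketch below finds the longest scientific name. The task itself is invented for this example.

```python
animals = {'cat': 'Felis catus', 'dog': 'Canis familiaris', 'ant': 'Formicidae'}

longest = ''
for sci_name in animals.values():
    # Keep whichever scientific name has the most characters so far.
    if len(sci_name) > len(longest):
        longest = sci_name

print('Longest scientific name:', longest)   # Canis familiaris
```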

If you want to loop through the keys, you have two options. One is to loop through the object returned by the keys method. Another is to just loop through the dictionary, for example:

for key in animals:
    print('The scientific name of', key, 'is', animals[key])

Often, you will want both the keys and values when you loop through a dictionary. The above shows one way to get both the key and the value. However, it performs an extra dictionary lookup for every key, which can add up if the dictionary is large. Another way is to use the items method to get both the keys and values, for example:

for anim in animals.items():
    print('The scientific name of', anim[0], 'is', anim[1])

Remember that the objects stored in dict_items are tuples. Thanks to tuple unpacking, we can make the loop above easier to read:

for common_name, sci_name in animals.items():
    print('The scientific name of', common_name, 'is', sci_name)

`dict` and `zip`

Let's say you have a list/tuple of (key, value) pairs, where each pair is itself a list or tuple. Something like:

[('key1', 'val1'), ('key2', 'val2'), ('key3', 'val3')]

You can easily convert this into a dictionary by using the dict function:

>>> collection = [('key1', 'val1'), ('key2', 'val2'), ('key3', 'val3')]
>>> mapping = dict(collection)
>>> print(mapping)
{'key3': 'val3', 'key2': 'val2', 'key1': 'val1'}

How is this useful? It's not often that you already have a list/tuple of pairs. More often, you have a list/tuple of keys and another list/tuple of values, for example when reading values from a comma-separated values (csv) file. You can combine these two lists/tuples into a dictionary using zip and dict. The zip function pairs up the elements of the two lists/tuples, producing a zip object that yields (key, value) tuples, just like the list/tuple that dict takes. Here's an example of how to use them together:

>>> keys = ['key1', 'key2', 'key3']
>>> vals = ['val1', 'val2', 'val3']
>>> mapping = dict(zip(keys, vals))
>>> print(mapping)
{'key3': 'val3', 'key2': 'val2', 'key1': 'val1'}
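
To make the csv case concrete, here is a minimal sketch. It assumes a made-up header line and data line held in strings, and it splits them by hand rather than reading a real file.

```python
# Hypothetical lines, as they might appear in a csv file.
header_line = 'name,species,legs'
data_line = 'cat,Felis catus,4'

keys = header_line.split(',')   # ['name', 'species', 'legs']
vals = data_line.split(',')     # ['cat', 'Felis catus', '4']

record = dict(zip(keys, vals))
print(record['species'])   # Felis catus
print(record['legs'])      # 4 (still a string; convert with int() if needed)
```

Note that zip stops at the end of the shorter input, so a data line with missing fields silently produces a dictionary with fewer keys.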

Practice

The first question on the programming part of the midterm is shown below. Try implementing it using dictionaries.

Ask the user for a length of time and the unit of time used. Ask the user for the unit of time to convert the length into. Perform the calculation and display the result. The units to handle and the conversions between them are:

To ... \ Convert from ...	second	minute	hour	day
second	1	60	3,600	86,400
minute	1/60	1	60	1,440
hour	1/3,600	1/60	1	24
day	1/86,400	1/1,440	1/24	1

Sets

Like lists and tuples, sets contain data. However, unlike lists and tuples, sets never contain duplicates, and there is no ability to index into a set (i.e. there is no guaranteed order to the values in the set). The reason for the second point (no guaranteed order) is that Python stores the values in a way that makes it fast to guarantee there are no duplicates. There are many techniques to guarantee no duplicates, and the efficient ones (those using little space or time) generally sacrifice order.

So, sets are good if you are interested in just having a collection of items, don't want duplicates, and don't care about order. Maybe their most common use is to remove duplicates from a collection.
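
As a quick sketch of the duplicate-removal use case (using the set function described below), consider a made-up list of sightings:

```python
# Hypothetical data containing repeats.
sightings = ['cat', 'dog', 'cat', 'ant', 'dog', 'cat']

unique_sightings = set(sightings)
print(len(sightings))          # 6
print(len(unique_sightings))   # 3

# A set has no order, so sort it if you need a predictable list back.
print(sorted(unique_sightings))   # ['ant', 'cat', 'dog']
```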

To create a set, you can use either the set function or curly brackets:

set - This function can take zero or one arguments. If you don't give any argument, you create an empty set. If you give one argument, it must be an iterable (such as a list, tuple, set, string, etc). It will take each element from the iterable and add it to the set (if there are duplicates, the additional elements are ignored).

>>> a = set()
>>> print(a)
set()
>>> len(a)
0
>>> b = set(['ant', 'bird', 'cat', 'cat', 'cat'])
>>> print(b)
{'bird', 'ant', 'cat'}
>>> len(b)
3

Curly brackets - In curly brackets, provide a comma-delimited list of values to put in the set being created.

>>> a = {'ant', 'bird', 'cat', 'cat', 'cat'}
>>> print(a)
{'bird', 'ant', 'cat'}
>>> len(a)
3

If you want to create an empty set, you must use the set function because {} is how you create an empty dictionary. As you may guess, dictionaries are more common than sets.
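
A quick way to convince yourself of this is to check the types directly, as in this small sketch:

```python
empty_dict = {}      # curly brackets alone make a dictionary
empty_set = set()    # the set function makes an empty set

print(type(empty_dict))   # <class 'dict'>
print(type(empty_set))    # <class 'set'>

empty_set.add('ant')      # works: sets have an add method
print(empty_set)          # {'ant'}
```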

Python provides a lot of methods for working with sets (see Python's online documentation for set types). Below are some common operations.

Adding and Removing Values

`set.add(value)` - Adds value to set.

>>> collection = {'ant', 'bird', 'cat'}
>>> collection.add('dog')
>>> collection.add('cat')
>>> print(collection)
{'bird', 'dog', 'ant', 'cat'}

`set.remove(value)` - Removes value from set. It raises `KeyError` if value is not in set.

>>> collection = {'ant', 'bird', 'cat'}
>>> collection.remove('cat')
>>> print(collection)
{'bird', 'ant'}

`set.discard(value)` - Removes value from set, but does not raise any exceptions if value is not in set.

>>> collection = {'ant', 'bird', 'cat'}
>>> collection.discard('cat')
>>> print(collection)
{'bird', 'ant'}

Set Operations

`set.isdisjoint(other)` - Returns `True` if set has no elements in common with other. other can be a set, list, tuple, etc.

>>> collection = {'ant', 'bird', 'cat'}
>>> collection.isdisjoint({'bird', 'dog'})
False
>>> collection.isdisjoint(['dog', 'elk'])
True

`set.issubset(other)`, `set <= other` - Returns `True` if all elements of set are in other. other can be a set, list, tuple, etc. (but must be a set for the `<=` version).

>>> collection = {'ant', 'bird', 'cat'}
>>> collection.issubset({'bird', 'dog'})
False
>>> collection <= {'ant', 'bird', 'cat', 'dog', 'elk'}
True

`set < other` - Returns `True` if all elements of set are in other and set is not equal to other. other must be a set.

>>> collection = {'ant', 'bird', 'cat'}
>>> collection < {'ant', 'bird', 'cat'}
False
>>> collection < {'ant', 'bird', 'cat', 'dog', 'elk'}
True

`set.issuperset(other)`, `set >= other` - Returns `True` if all elements of other are in set. other can be a set, list, tuple, etc. (but must be a set for the `>=` version).

>>> collection = {'ant', 'bird', 'cat'}
>>> collection.issuperset({'bird', 'ant'})
True
>>> collection >= {'ant', 'bird', 'cat', 'dog', 'elk'}
False

`set > other` - Returns `True` if all elements of other are in set and set is not equal to other. other must be a set.

>>> collection = {'ant', 'bird', 'cat'}
>>> collection > {'ant', 'bird', 'cat'}
False
>>> collection > {'ant', 'bird'}
True

`set.union(other)`, `set | other` - Returns a new set containing the elements from set and other. other can be a set, list, tuple, etc. (but must be a set for the `|` version).

>>> collection = {'ant', 'bird', 'cat'}
>>> new_col = collection | {'bird', 'elk'}
>>> print(new_col)
{'bird', 'ant', 'cat', 'elk'}

`set.intersection(other)`, `set & other` - Returns a new set containing only the elements in both set and other. other can be a set, list, tuple, etc. (but must be a set for the `&` version).

>>> collection = {'ant', 'bird', 'cat'}
>>> new_col = collection & {'bird', 'elk'}
>>> print(new_col)
{'bird'}

`set.difference(other)`, `set - other` - Returns a new set containing only the elements in set that are not in other. other can be a set, list, tuple, etc. (but must be a set for the `-` version).

>>> collection = {'ant', 'bird', 'cat'}
>>> new_col = collection - {'bird', 'elk'}
>>> print(new_col)
{'ant', 'cat'}

`set.symmetric_difference(other)`, `set ^ other` - Returns a new set with the elements that are in either set or other, but not both. other can be a set, list, tuple, etc. (but must be a set for the `^` version).

>>> collection = {'ant', 'bird', 'cat'}
>>> new_col = collection ^ {'bird', 'elk'}
>>> print(new_col)
{'ant', 'elk', 'cat'}
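
The difference between the method form and the operator form can be seen in the sketch below, which tries both with a list argument. The variable names are only for illustration.

```python
collection = {'ant', 'bird', 'cat'}

# The method version accepts any iterable, including a list.
from_method = collection.union(['bird', 'elk'])
print(sorted(from_method))   # ['ant', 'bird', 'cat', 'elk']

# The operator version insists that both sides are sets.
operator_failed = False
try:
    collection | ['bird', 'elk']
except TypeError:
    operator_failed = True
    print('| does not accept a list')

# Converting the list to a set first makes the operator work.
from_operator = collection | set(['bird', 'elk'])
print(from_operator == from_method)   # True
```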

In addition to the operations above, you have the standard operations for collection data types, such as len and in. However, sets do not support indexing/slicing operations, so you cannot do something like: set[0] (because there is no ordering for sets). If you want to loop through every element in a set, just use the for loop, e.g.:

items = {1, 4, 3, 6, 7}
for val in items:
    print(val)

Values Must be Hashable

Just like with dictionary keys, the items you store in a set must be immutable and hashable. For more information, see the section above on dictionary keys being hashable.
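
The following sketch shows what happens when you try to store a hashable value (a tuple) and an unhashable value (a list) in a set:

```python
pairs = set()
pairs.add((1, 2))        # a tuple of ints is hashable, so this works

message = ''
try:
    pairs.add([3, 4])    # a list is not hashable
except TypeError as err:
    message = str(err)
    print('Cannot add a list:', message)

print(pairs)             # {(1, 2)}
```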

Frozen Sets

If you want to make a set immutable, you can make it a frozen set. Frozen sets can be used as elements of other sets or as keys in a dictionary. To create one, use the function frozenset to convert an iterable object into a frozen set.
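
Here is a minimal sketch of frozen sets in use. The team and score data are invented for this example.

```python
team_a = frozenset(['ant', 'bird'])
team_b = frozenset(['cat', 'dog'])

# Frozen sets are hashable, so they can go inside a set...
teams = {team_a, team_b}

# ...and they can be dictionary keys.
scores = {team_a: 3, team_b: 5}

# Order does not matter when comparing or looking up frozen sets.
print(scores[frozenset(['bird', 'ant'])])   # 3

# Frozen sets have no methods that change them.
try:
    team_a.add('elk')
except AttributeError:
    print('frozen sets cannot be changed')
```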
