DAY 14

Go over the stuff from the end of last class, a bit more slowly and in depth.

Another string method: split!
Splits a string into a *list* of strings.
By default, it splits the string on whitespace (spaces, tabs, newlines).
Things separated by spaces will be separate elements.
No spaces will be included in the list elements.

x = "hello there my friend "
x.split()
x = "My~Dog~ate~fleas"
x.split("~")
x = "23,45,54,72"
x.split(",")

>>> f = open("temp.txt",'w')
>>> f.write("3 5 55 1\n")
>>> f.close()
>>> f = open("temp.txt",'r')
>>> line = f.readline()
>>> line
'3 5 55 1\n'
>>> numStrings = line.split()
>>> numStrings
['3', '5', '55', '1']
>>> total = 0        # not "sum": that would hide the built-in sum function
>>> for i in numStrings:
...     total += int(i)
...
>>> total
64
>>>

A new list, intVals:

>>> f = open("temp.txt")
>>> line = f.readline()
>>> line
'3 5 55 1\n'
>>> numStrings = line.split()
>>> numStrings
['3', '5', '55', '1']
>>> intVals = []
>>> for n in numStrings:
...     intVals.append(int(n))
...
>>> intVals
[3, 5, 55, 1]
>>>

***append! A new method for lists!

>>> x = []
>>> x.append(1)
>>> x
[1]
>>> x.append(54)
>>> x
[1, 54]
>>>

It actually changes the list, as you can see!

So, above, the first time through the loop n is '3', and
    intVals.append(int('3'))
appends 3 on the end of intVals. Then 5 is appended, and so on.

---
Lists of lists. You worked with them some for lab 6. The lab said:

# Here is an example of a nested loop. TODO: Run and trace this until you are sure you understand it
for student in student_grades:
    print(student)
    for x in student:
        print(x)

This lab: put together reading lines of numbers from files, and working with nested lists.

So, let's work with nested lists a bit. This will also be on the Python Shell for today (I'll put it up before lab).
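Before the shell session on nested lists, here is a rough sketch of how the two ideas the lab combines might fit together: reading lines of numbers from a file, and collecting them into a list of lists. The function name read_number_rows and the sample file name are made up for this illustration (they are not from the lab), and the sketch uses a "with" block and a for loop over the file, which may not have been covered in class yet.

```python
import os
import tempfile


def read_number_rows(path):
    """Return a list of lists of ints, one inner list per line of the file.

    Hypothetical helper for illustration only.
    """
    rows = []
    with open(path) as f:
        for line in f:
            row = []
            # split() with no argument splits on whitespace and drops the '\n'
            for piece in line.split():
                row.append(int(piece))
            rows.append(row)
    return rows


# Write a small sample file in the system's temporary directory.
sample_path = os.path.join(tempfile.gettempdir(), "temp_rows_example.txt")
with open(sample_path, "w") as f:
    f.write("3 5 55 1\n")
    f.write("10 20\n")

rows = read_number_rows(sample_path)
print(rows)
for row in rows:
    print(row, "sum:", sum(row))

os.remove(sample_path)  # clean up the sample file
```

Each line of the file becomes one inner list, so rows[0][2] is the third number on the first line.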
>>> x = [1,2,3]
>>> y = [4,5,6]
>>> z = []
>>> z.append(x)
>>> z
[[1, 2, 3]]
>>> z.append(y)
>>> z
[[1, 2, 3], [4, 5, 6]]
>>> z[0]
[1, 2, 3]
>>> z[1]
[4, 5, 6]
>>> z[0][2]
3
>>> z[1][0]
4
>>> z[1][1]
5
>>> z[1][2]
6
>>>
>>> z.append([10])
>>> z
[[1, 2, 3], [4, 5, 6], [10]]
>>> z.append([])
>>> z
[[1, 2, 3], [4, 5, 6], [10], []]
>>> for i in z:
...     print(i)
...
[1, 2, 3]
[4, 5, 6]
[10]
[]
>>> len(z)
4
>>> len(z[0])
3
>>> for i in z:
...     print(i,"len:",len(i))
...
[1, 2, 3] len: 3
[4, 5, 6] len: 3
[10] len: 1
[] len: 0
>>>

================================
Lists, strings, and tuples (chapters 8-9).
So many similarities, covering lists and strings together. Then, tuples.

== Introduction ==

- We said that a variable is a name and a value associated with it, and that every variable has a type.
- There are two big categories of types: things you can change and things you can't ...
- This has a big impact on how programs behave, so we need to understand it.

== Mutable and immutable ==

- some types of Python objects are unchangeable
  Eg
    x = 3
    x = 5       # We can't change a 3;
                # it is what it is. We just make
                # x refer to a new integer object.
    x = "hello there"
    x = "bye"   # we can't change the string "hello there".
                # we can just make x refer to a new string
                # value
- but some are changeable. A list is an example!
    x = [0,1,2,3]
    x[2] = 4
    x
- doesn't work with strings, because they cannot be changed
    s = "hello"
    s[1] = "g"  # TypeError
- we say that objects are "mutable" or "immutable"
  (The word "mutable" comes from the same root as "mutate")

== Testing your understanding ==

DRAW PICTURE:
- QU: what is the output from this?
    a = 19
    b = a
    a = 42
    print(a, b)
- QU: what does pic2 look like after this?
    a = [0,1,2,3]
    b = a
    a[2] = 4
    print(a, b)
- Notice the similarity: In both cases we set b to a and then change a
  But the big difference:
    With the int variables, changing a has no effect on b
      QU: Why? (Because after a = 42, a and b refer to different int objects!)
    With the list variable, changing a affects b
      QU: Why?
      (Because a and b refer to the same list)

== Mutable and immutable parameters ==

- Remember that passing a parameter is like an assignment statement.
- So the issue we just saw with ordinary assignment comes up with parameter passing too.
- When you have a mutable parameter, you can make changes to the object it refers to.
  They will therefore be changes to the object that the argument refers to, since they both refer to the same thing.

def fun1(x):
    x = x + 1
    return x

def fun2(x):
    x.append(1)
    return x

y = 3
z = fun1(y)
y          # still 3
z          # 4

y = [1,2,3]
z = fun2(y)
y          # has been changed!
z          # same as y; z and y now point to the same thing
z[0] = 32
z
y

==== same idea does not work with an immutable parameter!

def double(x):
    x = x * 2

def main():
    x = 27
    double(x)
    print(x)

main()

- Doesn't work! QU: Why not?
  Double's x and main's x start out referring to the same object.
  But you can't change an int, so x = x * 2 creates a new int object.
  Now double's x and main's x refer to two different objects.
  Changing what double's x refers to had no effect on main's x.
- So how do we make this code work? I.e., so that x really is doubled?

def double_fixed(x):
    return x * 2

def main():
    x = 27
    x = double_fixed(x)
    print(x)

main()

=====
Note: when we are tracing with immutable objects, it makes sense to do this:

x = 5
x = 6
x = 7

x
5(crossout) 6(crossout) 7

We'll draw different pictures now, which are more faithful to how Python works, so that we can make the distinction. (I did that above without telling you I was doing this)

>>> degrees_celsius = 26.0
degrees_celsius --> 26.0
>>> difference = 20
difference --> 20
>>> double = 2 * difference
difference --> 20
double --> 40
>>> difference = 5
difference --> 5
double --> 40

ASIDE FOR ME &&&&&
[E.g. for a = 10; b = 10;
For immutable objects, variables associated with the same value may or may not point to exactly the same location. It doesn't matter, for immutable objects.
It seems that small ints all use the same memory location, but larger ints may not. (In CPython, ints from -5 through 256 are cached; this is an implementation detail.) I played around with, e.g.,
    x = 100
    y = 100
    id(x) == id(y)
It seems that right after, e.g., a = b, they refer to the same location. More efficient that way.
BUT for them: we don't want to think that the variables are tied together in any way. So, say: think about these as pointing to different int objects, different string objects, and so on.
***If they ask, it may or may not point to the same object - with immutable objects, the computer does what is convenient/efficient.

[[for me: this is using indirect addressing.
>>> x = 10
>>> id(x)
7213632
>>> y = 10
>>> id(y)
7213632
>>> help(id)
Help on built-in function id in module builtins:

id(...)
    id(object) -> integer

    Return the identity of an object. This is guaranteed to be unique among
    simultaneously existing objects. (Hint: it's the object's memory address.)

x and y both contain value 7213632, which is presumably the address of a location that contains the value 10. ]
[** It's better to think about them pointing to different objects; I should draw them that way. I think earlier in the course I drew actual and formal int parameters pointing to the same value. No reason for them not to think of this as the formal parameter getting a copy of the value. We will show variables pointing to the same value only when we have to - only when it matters.] ]]
&&&&&

===
== Lists ==

- measurements = [45.27, 45.26, 45.24]
    measurements
    measurements[0]
    measurements[1]
    measurements[2]
    measurements[3]   # this results in an error (IndexError)
- Lists can contain more than just numbers, and even be heterogeneous.
    instructors = ["Andrew", "Jennifer", "Michael"]
    instructors
    student = ["Jon Reed", "University of Pittsburgh", 812391236, 3.45]
- Lists are mutable.
    instructors[2] = "Jen"
    instructors
    instructors.append("Paul")
    instructors
- QU: What kind of thing is "append(...)"? Yay, method!
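The points in the Lists section can be checked with a short script. This is only a sketch: the variable names before and result are added here for illustration, and the name "Sam" is made up. It shows that indexing past the end raises an IndexError, that item assignment and append change the same list object (its id stays the same), and that append itself returns None.

```python
measurements = [45.27, 45.26, 45.24]

# Valid indices are 0, 1, 2, so index 3 is out of range.
try:
    measurements[3]
except IndexError as err:
    print("IndexError:", err)

instructors = ["Andrew", "Jennifer", "Michael"]
before = id(instructors)          # remember which object this is

instructors[2] = "Jen"            # change an element in place
instructors.append("Paul")        # grow the list in place
print(instructors)
print(id(instructors) == before)  # still the very same list object

# append changes the list and gives back nothing useful.
result = instructors.append("Sam")
print(result)
print(instructors)
```

Because append returns None, writing something like instructors = instructors.append("Sam") would throw the list away.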
== List functions ==

- Lists come with some useful functions
- len:
    len(instructors)
    instructors[len(instructors) - 1]   # Last item is at position len - 1
                                        # Note: brackets can contain expressions
- min, max, and sum:
    measurements
    max(measurements)    # What do min and max do?
    min(measurements)
    sum(measurements)
    min(instructors)     # What if the list is not a list of numbers?
    max(instructors)     # min and max work as long as <, >, == are defined.
    max("Hi, class!")    # Works for strings, too. Comparing characters.
    min("Hi, class!")

== List Methods ==

dir(list)            # Methods for lists
help(list.append)    # Note - does not say returns!!!
measurements.append(45.27)
measurements
help(list.insert)
measurements.insert(2, 44.99)
measurements
measurements.insert(len(measurements) - 1, 44.02)
measurements
help(list.sort)
measurements.sort()

===
Strings also have many methods defined for them.
These are functions that python strings "own"

dir(str)

"lower" - converts any letters in the string to lower case

>>> mystring = 'Hi, There, Bear!'
>>> mystring.lower()
>>> # mystring is not changed; but you knew that
    # strings are immutable!! If you do help on all the string
    # methods, you will see that they return an object; obviously
    # they don't change the original string.
>>> mystring
>>> help(str.lower)
>>> help(mystring.lower)
>>> help('abc'.lower)
>>> help(str.replace)
>>> s = "jjjkkkllmmm"
>>> y = s.replace('k','!')
>>> s
'jjjkkkllmmm'
>>> y
'jjj!!!llmmm'
>>>
>>> help(str.count)
>>> y.count("!")
3
>>> y.count("!!")
1
>>>
>>> help(str.find)
>>> y.find('!')
3
>>> y.find('w')
-1
>>>
>>> help(str.startswith)
>>> y.startswith('jjj')
True
>>> y.startswith('jj')
True
>>> y.startswith('j')
True
>>> y.startswith('jjjj')
False
>>>
>>> help(str.len)
** not a method of str
** a primitive function built into python

Designers of Python could have defined it either way - method or function
Also, programmers design their own kinds of objects (we'll see some of this; CS401 covers this)

Guidelines: method or function?
- if the operation is only relevant to one type of object, make it a method of that type
    lower, replace, and so on
- if it applies to different types of objects, make it a function, so it can be applied to those different types of objects
    len('hello')
    len([1,2,3,4,5])
    etc.
- YOU WON'T need to make this decision in this class. But the issue was probably bugging you.
- Your job: be able to recognize when we are calling something as a function and when we are calling something as a method.

More practice, now that we've seen this.

>>> "This is CS????".replace("????", "0008")
>>> "This is CS0008".count("i")
>>> "This is CS0008".count("j")

s = "yabababa daba do!"
s.count("aba")    # algorithm? after it finds a match, it starts
                  # looking in the string AFTER the end of the match.
s.count("abab")
s.count("bad")    # can't ignore spaces - they are chars too!
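To make the "starts looking AFTER the end of the match" idea concrete, here is a sketch of a counting function built from str.find. The name count_non_overlapping is made up for this illustration, and this is not how Python implements str.count internally; it is just one way to get the same answers for a non-empty pattern.

```python
def count_non_overlapping(text, pattern):
    """Count matches of pattern in text, resuming the search after each match.

    Hypothetical helper for illustration; assumes pattern is not empty.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    count = 0
    start = 0
    while True:
        position = text.find(pattern, start)   # -1 means no more matches
        if position == -1:
            return count
        count += 1
        start = position + len(pattern)        # skip past the whole match


s = "yabababa daba do!"
for pattern in ["aba", "abab", "bad"]:
    print(pattern, count_non_overlapping(s, pattern), s.count(pattern))
```

For "aba" the matches start at indices 1, 5, and 10. The overlapping "aba" starting at index 3 is skipped, because the search resumes at index 4.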
>>> "This is CS0008".find("is")      # first occurrence looking from left
>>> "This is CS0008".rfind("is")     # first occurrence looking from right
>>> "This is CS0008".find("is", 3)   # first occurrence looking from index 3 on from left
>>> "This is CS0008".find("is", 2)
>>> "This is CS0008".find("the")     # not found

====
Back to lists (want you to get used to their similarities and differences)

x = [4,5,6,2,3,4,5]
x.count(4)
help(list.extend)
iterable? range, list, string.

>>> x.extend(range(3))
>>> x
[4, 5, 6, 2, 3, 4, 5, 0, 1, 2]
>>> x.extend([55,66,77,88])
>>> x
[4, 5, 6, 2, 3, 4, 5, 0, 1, 2, 55, 66, 77, 88]
>>>
>>> x.extend("mom and dad")
>>> x
[4, 5, 6, 2, 3, 4, 5, 0, 1, 2, 55, 66, 77, 88, 'm', 'o', 'm', ' ', 'a', 'n', 'd', ' ', 'd', 'a', 'd']
>>>
>>> x.index(55)
10
>>>
>>> x.remove(55)
>>> x
[4, 5, 6, 2, 3, 4, 5, 0, 1, 2, 66, 77, 88, 'm', 'o', 'm', ' ', 'a', 'n', 'd', ' ', 'd', 'a', 'd']
>>> x.remove(4)
>>> x
[5, 6, 2, 3, 4, 5, 0, 1, 2, 66, 77, 88, 'm', 'o', 'm', ' ', 'a', 'n', 'd', ' ', 'd', 'a', 'd']
>>> x.remove('m')
>>> x
[5, 6, 2, 3, 4, 5, 0, 1, 2, 66, 77, 88, 'o', 'm', ' ', 'a', 'n', 'd', ' ', 'd', 'a', 'd']
>>>
>>> x.reverse()
>>> x
['d', 'a', 'd', ' ', 'd', 'n', 'a', ' ', 'm', 'o', 88, 77, 66, 2, 1, 0, 5, 4, 3, 2, 6, 5]
>>>
>>> x.sort()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: int() < str()
>>>
>>> x = [4,2,8,1]
>>> x.sort()
>>> x
[1, 2, 4, 8]
>>> x = ['h','2','A']
>>> x.sort()
>>> x
['2', 'A', 'h']
>>>

==
Quiz! Can you remember back to strings?

def are_same_string(str1, str2):
    '''Return True if string str1 and string str2 have the same
    contents (ignoring case), and False otherwise.'''
    # **** so, with one line! ADD:
    return str1.lower() == str2.lower()

grade(are_same_string("AbCdEF", 'abcdef'), True)

===
# ------ Slicing and dicing strings ------
s = "sliceofspam"
s[0]      # Grab a single character. First index is zero.
s[3]
s[-2]     # Negative indexes are for counting from the RHS
          # here's how to think of the numbering:
          #   "0 s 1 l 2 i...ceofspam"
          #   ...7 s -3 p -2 a -1 m 0   but 0 is the one on the left
s[2:5]    # Slice a string. From position 2 to 5 (not inclusive)
s[3:]     # If omit the second index, goes to the far RHS
s[:8]     # If omit the first index, goes from the far LHS
s[:]      # If omit both, goes from both extremes. Ie, you get the whole string
s[3:-2]   # You can use negative indices when slicing too
s[-5:]
s[5:2]    # ??

#----- Same, for lists!
s = [4,3,5,6,7,8,9,9,8]
s[0]      # Grab a single element. First index is zero.
s[3]
s[-2]     # Negative indexes are for counting from the RHS
          # Numbering is the same way.
s[2:5]
s[3:]
s[:8]
s[:]
s[-5:]
s[5:2]

------
====
Some quick programming exercises.

''' print the length of each string in the list 'instructors' '''
instructors = ["Andrew", "Jennifer", "Michael"]
instructors
student = ["Jon Reed", "University of Pittsburgh", 812391236, 3.45]

for s in instructors:
    print(len(s))

interesting test cases?
s = []

[listExercises.py]

def make_list_uppercase(original_list):
    '''Return a list of strings that contains the words from list
    original_list but in all uppercase letters.'''

    uppercase_list = []
    for s in original_list:
        create a version of s that is all uppercase
        add that to the end of uppercase_list
    return uppercase_list

example: ["My","Friendly","Kitten"]
    s: "My" --> "MY"               uppercase_list ["MY"]
    s: "Friendly" --> "FRIENDLY"   uppercase_list ["MY","FRIENDLY"]
    s: "Kitten" --> "KITTEN"       uppercase_list ["MY","FRIENDLY","KITTEN"]

In Python:

    uppercase_list = []
    for s in original_list:
        upperS = s.upper()
        uppercase_list.append(upperS)
    return uppercase_list

OR JUST:

    uppercase_list = []
    for s in original_list:
        uppercase_list.append(s.upper())
    return uppercase_list

def square_list(int_list):
    '''Return a list that contains the ints from list int_list squared'''
    Exactly the same structure!
    squared_list = []
    for i in int_list:
        squared_list.append(i**2)
    return squared_list

def grade(myAns,correctAnswer):
    if myAns == correctAnswer:
        print("Correct, the answer is",myAns)
    else:
        print("My answer is",myAns,"but the correct answer is",correctAnswer)

def main():
    instructors = ["Jen","David","Rover","Paul"]
    for s in instructors:
        print(len(s))

    grade(make_list_uppercase(["Hello", "My Friendly", "Kitten"]),
          ['HELLO', 'MY FRIENDLY', 'KITTEN'])
    grade(make_list_uppercase([]),[])
    grade(square_list([0,1,2,3,4]),[0, 1, 4, 9, 16])

main()

===
More exercises ...

- Eg: Write code to print the items in a list, until the value 8 is encountered.

''' write some code to print the items in a list, until the value 8 is encountered '''

data = [2, 6, 7, 8, 3, 8, 5, 9]

print("Let's try the first way - it isn't correct!")
for num in data:
    if num != 8:
        print(num)     # skips the 8s, but keeps going and prints 3, 5, 9 too

#("Now, let's improve the code")
#("Reminder, our list is", data)
#("We should see the numbers 2, 6, 7 printed, but no others")
i = 0
num = data[i]
while num != 8:
    print(num)
    i = i + 1
    num = data[i]

#("We can make it shorter")
i = 0
while data[i] != 8:
    print(data[i])
    i = i + 1

trace:
    i    data[i]
    0    2
    1    6
    2    7
    3    8  - don't go into the body of the loop

#("What if the list is empty? we would get an error?")
data = []
#("Our list is now", data)
i = 0
while data[i] != 8:
    print(data[i])
    i = i + 1
# data[0] doesn't exist!

# "Comment out the 4 lines above, so we can go on."
# "We need to do error checking: make sure the list isn't empty"
if len(data) > 0:
    i = 0
    while data[i] != 8:
        print(data[i])
        i = i + 1

# ("Ok, now our code can handle an empty list")
# ("")
# ("There is one case where our code is not yet right")
# ("What if the list does not contain an 8?")
data = [0,1,2,3,4]
if len(data) > 0:
    i = 0
    while data[i] != 8:
        print(data[i])
        i = i + 1

HMMMM:
    while (we've not already gone through all of data) and (data[i] != 8)
?
remember: data is length 5.
indices are: 0,1,2,3,4

#("Comment out the above code, so we can move on")
#("This version works in all cases:")
if len(data) > 0:
    i = 0
    while i < len(data) and data[i] != 8:
        print(data[i])
        i = i + 1

for data = [0,1,2,3,4]
i will become 5, the test will be 5 < 5, and i < len(data) will be False.

**** how come data[i] != 8 doesn't cause an error in this case? after all, data[5] does not exist?

Because of short-circuit (lazy) evaluation of conditionals!
False and *anything* is False.
Once Python determines the first part is False, it stops; it does not test data[i] != 8.

# With this fix to handle the case where 8 is not in the list,
# now we can simplify the code!
#
# We added the if-statement to handle the case of the empty
# list... Let's see if we need it:
data = []
i = 0
while i < len(data) and data[i] != 8:
    print(data[i])
    i = i + 1
# while 0 < 0   False! nothing is printed.

=== Aliasing ==

- We saw examples where two variables referred to the same object. This is called "aliasing"
- Recap:
  - If the object is mutable, changing the object through one variable changes what the other variable sees, too.
  - If it's immutable, you can't change it; you can only make a new object that has the changes. So you can't affect what one variable references by changing the other.

subjects = ["Computer Science", "Biology", "French", "History"]
subjects_copy = subjects        # Both refer to the same list object.
subjects
subjects_copy
subjects[0] = "Commerce"        # So changing one changes both.
subjects
subjects_copy

- New fact: Slicing creates a new object, even if you slice the whole list.

subjects_clone = subjects[:]    # A brand new object.
subjects                        # Since they are two different objects,
subjects_clone
subjects[0] = "Philosophy"      # changing one can't change the other.
subjects
subjects_clone

PICTURE: sc points to a different value than subjects does.

add to the code:
# try help(id) in the shell
# if two variables are pointing to the same object, their ids are the
# same.
# If they do not, their ids are different.
if id(subjects_copy) == id(subjects):
    print("ids of the copy and the original are the same")
if id(subjects_clone) == id(subjects):
    print("ids of the clone and the original are the same")

In the shell:
s = [0,1,2]
sc = s
s_clone = s[:]
s == sc
s == s_clone
id(s) == id(sc)
id(s) == id(s_clone)

===
one = [0,1,2,3,4,5,6,7]
two = one

one = [0,1,2,3,4,5,6,7]

one --> [0,1,2,3,4,5,6,7]

two = one

one --> [0,1,2,3,4,5,6,7]
           ^
two -------

one = one[1:2]
hmmmmm
a new object is created, [1]
"one" points to it
but "two" is left alone!

        [0,1,2,3,4,5,6,7]
           ^
two -------

one ---> [1]

Here it is in Python, for my reference:
>>> one = [0,1,2,3,4,5,6,7]
>>> two = one
>>> one = one[1:2]
>>> one
[1]
>>> two
[0, 1, 2, 3, 4, 5, 6, 7]
>>>

== More list practice ===

- Try this out in a while loop:

Problem:
'''Remove all instances of the letter i from the string text.'''

Let's figure out how to solve this, and then write a function to do it.

shell, to remind you of some string methods we have:
s = "Indiana Illinois"
s.count("i")
s.find("i")

# Ok, so how could we remove all the i's?
# note: 'remove' is a list method, not a string method

while there are still i's in the string:
    find the position of the leftmost i
    remove it, by concatenating the part of the string before that i,
        and the part of the string after that i.
    keep going!

Iteration 1: "Indiana Illinois"
s.find("i") --> 3
Think of the string as having three parts:
    "Ind" "i" "ana Illinois"
we want to put the first and third parts together
    "Ind" + "ana Illinois"
how do we do this?
    s = s[0:3] + s[4:]
concatenate the part of the string before the first i, and the part of the string after the first i!

now, we will have "Indana Illinois"
next iteration:
s.find("i") --> 10
    s = s[0:10] + s[11:]
s is now: "Indana Illnois"
next iteration:
s.find("i") --> 12
    s = s[0:12] + s[13:]
s is now "Indana Illnos"
s.find("i") --> -1 (not found), and s.count("i") --> 0
So, we are done!!!
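The hand trace above can be double-checked in Python one step at a time, without writing the loop yet. This is only a sketch: the helper name remove_at and the list named trace are made up for this illustration.

```python
def remove_at(text, position):
    """Return a new string: text without the character at index position.

    Hypothetical helper for illustration only.
    """
    return text[:position] + text[position + 1:]


s = "Indiana Illinois"
trace = []                      # the positions found, in order

for step in range(3):           # the hand trace needed three removals
    position = s.find("i")      # leftmost lowercase i
    trace.append(position)
    s = remove_at(s, position)  # s now refers to a brand new string
    print("found at", position, "-> s is now", repr(s))

print(trace)
print(s.find("i"))              # no lowercase i is left
```

Note that the capital "I" characters are never touched, because find("i") only matches the lowercase letter.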
Here is the pseudo code we had:

while there are still i's in the string:
    find the position of the leftmost i
    remove it, by concatenating the part of the string before that i,
        and the part of the string after that i.
    keep going!

refine:

while s.count("i") > 0:
    find the position of the leftmost i
    remove it, by concatenating the part of the string before that i,
        and the part of the string after that i.
    keep going!

refine:

while s.count("i") > 0:
    s.find("i")
    remove it, by concatenating the part of the string before that i,
        and the part of the string after that i.
    keep going!

But what's wrong with this? We didn't save the value we need!
Need to save it in a variable.

while s.count("i") > 0:
    position = s.find("i")
    remove it, by concatenating the part of the string before that i,
        and the part of the string after that i.
    keep going!

what is "position" on each of the iterations in our example? GO BACK AND LOOK.
A KEY to programming is to figure out your strategy, often using an example, and then generalize!!!

while s.count("i") > 0:
    position = s.find("i")
    s = s[0:position] + s[position+1:]
    keep going!  - no need for this - it will keep going, until there are no more 'i's in the string!

Ok, now let's turn this into a function.

======
[remove_i_broken.py]

# This function doesn't work because strings are immutable!
def remove_i_broken(s):
    '''Remove all instances of the letter i from the string s.'''
    while s.count("i") > 0:        # while there are more i's to remove
        position = s.find("i")     # find the next i
        s = s[0:position] + s[position + 1:]

def main():
    text = "Will this work?"
    print(text)
    remove_i_broken(text)
    print(text,"should be","Wll ths work?")

What's happening? Each time through the loop, s points to a new string object. The string itself is not changing, because strings are immutable!
AND - nothing is returned from the function.

So, how do we fix this so it works?
Need to return the value, and update text!

[remove_i_correct.py]

====
However, suppose this were a list function.
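Before moving on to the list version: the file remove_i_correct.py is not reproduced in these notes, so the following is only a guess at what such a fix might look like. It assumes the function is renamed remove_i, returns the new string, and that main assigns the result back to text.

```python
def remove_i(s):
    '''Return a copy of string s with all instances of the letter i removed.'''
    while s.count("i") > 0:        # while there are more i's to remove
        position = s.find("i")     # find the next i
        s = s[0:position] + s[position + 1:]
    return s                       # hand the new string back to the caller


def main():
    text = "Will this work?"
    print(text)
    text = remove_i(text)          # update text to refer to the new string
    print(text, "should be", "Wll ths work?")


main()
```

The two changes from the broken version are the return statement and the assignment text = remove_i(text). Without both, main's text keeps referring to the original string.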
First, let's look at the main program:

def main():
    nums = [0, 1, 2, 30, 2, 0, 1, 0, 0, 1]
    print(nums)
    remove_0(nums)
    print(nums,"should be",[1, 2, 30, 2, 1, 1])

    nums = []
    remove_0(nums)
    print(nums,"should be the empty list")

Because lists are mutable, if we write the function properly, remove_0 will change the value of the list that nums is pointing to.
We are testing a general case, and a special case (empty list).

Ok:

def remove_0(l):
    ''' Remove all instances of 0 from list l.'''

Here is what we had for the string version:

    while s.count("i") > 0:
        position = s.find("i")
        s = s[0:position] + s[position + 1:]

We don't need to return anything, because we will change the list that l points to.

    while l.count(0) > 0:

That still works! help(list.count)

but we don't want to do something like this:

        l = l[0:position] + l[position + 1:]

This makes "l" point to a different list.
**We don't want to make l point to a different list; we want to change the list that l points to!!! nums, from the main program, also points to it!!!!

check this out: help(list.remove)
We can use that:

[remove_0_from_list.py.txt]

What if we did this instead:

def main():
    nums = [0, 1, 2, 30, 2, 0, 1, 0, 0, 1]
    print(nums)
    nums = remove_0(nums)
    print(nums)

main()

As written, what would "nums" be after the call to remove_0? None!
remove_0 has no return statement, so it returns None.
We don't want to change where nums points. We do NOT want to assign something to nums!
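The file remove_0_from_list.py.txt is not included in these notes. Here is a minimal sketch of what it might contain, assuming it follows the outline above: keep the while loop from the string version, and use list.remove to change the list in place instead of building a new one. The variables alias and result are added here only to show the effect.

```python
def remove_0(l):
    ''' Remove all instances of 0 from list l. Changes l itself; returns nothing.'''
    while l.count(0) > 0:   # while there are more 0s to remove
        l.remove(0)         # removes the leftmost 0 from the same list object


def main():
    nums = [0, 1, 2, 30, 2, 0, 1, 0, 0, 1]
    alias = nums                      # a second name for the same list
    print(nums)
    remove_0(nums)
    print(nums, "should be", [1, 2, 30, 2, 1, 1])
    print(alias is nums)              # still one list; both names see the change

    nums = []
    remove_0(nums)
    print(nums, "should be the empty list")

    result = remove_0([0, 5, 0])      # no return statement in remove_0
    print(result)                     # so the call evaluates to None


main()
```

Since remove_0 never assigns to l, the parameter and the caller's variable keep referring to the same list the whole time, which is exactly why the caller sees the change.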