DAY 14

== Writing epics ==

Goal: practice problem solving
      we'll use dictionaries, tuples, and lists

Recap: do this slowly, so they realize what is going on.

- We'll develop a program that mimics the writing it has seen
- demo [wannabe.py]
- Even though the text doesn't make all that much sense, can you see
  the differences in the style?

(a) Remember what you saw so that you can mimic it

Type "help", "copyright", "credits" or "license" for more information.
>>> # A demo of the program to mimic the writing in a text:
>>>
Evaluating epic.py
upon the sky, So shall behold the humble-bees, And leads me come to be
not this cold fruitless moon. TITANIA. Thou shalt buy this advantage
take, An I have we grew together, More than arrow from my Thisby's face.
Where art thou aby it. What though I pick'd a day As Shafalus to be a
fine tragedy. And here together. THESEUS. True; and you were a little
voice: 'Thisne, Thisne!' [Then speaking small] 'Ah Pyramus, I'll apply
To greet me like to steal. HERMIA. Methinks she was in love you? HELENA.
Your vows are to perish on thee! [They sleep] Enter OBERON at the Duke
of doubt he That is too old device, and here's a day. The lover, that
from his scene of Athens here, But, in the wood Enter DEMETRIUS I'll
give a girdle round and lovers seek new ribbons to entreat your patience
well. Exit Enter a woman, God warrant us: She shall we come without
warning. HIPPOLYTA. How can be set aside. Away with thy brawls thou aby
it. Demetrius loves me have a serpent. HERMIA. I see. Dian's bud o'er
and not merit. Where is a knavery of man to form in mockery, set. The
>>>
>>> # The kind of dictionary we want to build:
>>> # for midsummernight.txt
>>> d = {"happy": ["days", "is", "fair!", "some", "is", "hour!"]}
>>> d["couples"] = ["shall","three"]
>>> d
[Go into midsummernight.txt to find these occurrences of the words]
>>>
>>> # To mimic the writing we've read: suppose we just wrote word w.
>>> # Now, randomly pick the next word from the list of follow-on words for w in
>>> # our dictionary. Separate issue: what word to start the whole thing with.
>>>
>>> # How to pick a random element of a list:
>>> import random
>>>
>>> l = [5,2,7,33,1,2,66,5]
>>> random.shuffle(l)
>>> l
[2, 66, 2, 1, 33, 7, 5, 5]
>>> random.shuffle(l)
>>> l
[1, 5, 2, 2, 5, 33, 66, 7]

Remember d?
d

So, here is the idea of what we will do:

    Read a text from the file, building the dictionary
    Generate a text according to the dictionary

FIRST VERSION: the keys of the dictionary will be individual words only

====
#Read a text from the file, building the dictionary as you go along

"Happy day and Happy year and sad minute"
   ^    ^
   |    |
   lw  word

Trace through on the board:

last_word = Happy
word = day
d['Happy'] = ['day']
    ^          ^
    |          |
   key       value, a list

last_word = day
word = and
d['day'] = ['and']

====
in pseudo-code:

initialize the word_dict and last_word
# type this in, referring to the trace on the board
for line in input_text:
    words = get the words in the line
    for w in words:
        if last_word is already a key in word_dict:
            append w to the list stored with last_word
        else:
            create a new entry for last_word
            initialize its value to [w]
        last_word = w                          !!
Generate a text according to the dictionary

====
initialize the word_dict and last_word
for line in input_text:
    words = line.split()                       !!!!!
    for w in words:
        if last_word in word_dict:             !!!!!
            word_dict[last_word].append(w)     !!!!!
        else:
            word_dict[last_word] = [w]         !!!
        last_word = w
Generate a text according to the dictionary

initialize?
word_dict = {}    # easy
last_word = ""

We will end up with an entry in word_dict:
    key: ''
    value: the first word in the text!
That will be fine. The key '' is not a real word from the text, but it
gives us a place to start when we generate.
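The single-word version above can be collected into one runnable function. This is a minimal sketch, not the contents of wannabe_starter.txt: the name build_word_dict and the use of a list of lines in place of an open file are assumptions made so the example is self-contained.

```python
def build_word_dict(lines):
    """Map each word to the list of words that follow it in lines.

    lines is any iterable of strings (an open file would work too).
    """
    word_dict = {}
    last_word = ""              # '' stands for "nothing seen yet"
    for line in lines:
        for w in line.split():
            if last_word in word_dict:
                word_dict[last_word].append(w)
            else:
                word_dict[last_word] = [w]
            last_word = w
    return word_dict


d = build_word_dict(["Happy day and Happy year and sad minute"])
print(d["Happy"])   # ['day', 'year']
print(d[""])        # ['Happy']
```

After this loop the very last word ('minute' here) has no entry of its own, because nothing follows it in the text.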
# Take care of the special case of the very last word
# only appearing once - at the end of the file
if last_word not in word_dict:
    word_dict[last_word] = [""]

[wannabe_starter.txt]
show/trace on simple.txt
also, go through the code referring to your trace on the board.

=====
Now that we have the dictionary, how shall we use it to generate text?

input to the function:
    the word dictionary
    how long we want the epic to be - num_words

General strategy:
    suppose current_word is the word we just added to the epic
    repeat the following num_words times
        get word_dict[current_word], the words that followed
            current_word in the original text
        randomly choose one of those words
        add the chosen word to the epic
        then, make THAT word the current word

For example:
[simple_txt_and_dict.txt has the following, so you don't need to type it in]

Happy day and Happy year and sad minute.

{'': ['Happy'], 'and': ['Happy', 'sad'], 'minute.': [''],
 'sad': ['minute.'], 'year': ['and'], 'day': ['and'],
 'Happy': ['day', 'year']}

TRACE THIS ON THE BOARD.

current_word is
choose from among ['Happy']
chosen word is Happy
epic is now Happy

current_word is Happy
choose from among ['day', 'year']
chosen word is day
epic is now Happy day

current_word is day
choose from among ['and']
chosen word is and
epic is now Happy day and

current_word is and
choose from among ['Happy', 'sad']
chosen word is sad
epic is now Happy day and sad

current_word is sad
choose from among ['minute.']
chosen word is minute.
epic is now Happy day and sad minute.

current_word is minute.
choose from among ['']
chosen word is
epic is now Happy day and sad minute.

current_word is
choose from among ['Happy']
chosen word is Happy
epic is now Happy day and sad minute.  Happy

Now, let's see our epic
Happy day and sad minute.  Happy

==
now, let's write the code.
    get word_dict[current_word], the words that followed
        current_word in the original text
    randomly choose one of those words
    add the chosen word to the epic
    then, make THAT word the current word

def write_epic(word_dict, num_words):
    epic = ''
    current_word = ""
    for i in range(num_words):
        values = word_dict[current_word]
        random.shuffle(values)
        new_word = values[0]
        epic += new_word + " "
        current_word = new_word
    return epic

[trace wannabe_simple2.py]

OK: our code only looks at a context of 1!
but, we can make the epic even better if it considers, say, the TWO
words before the current one. or the three!

Let's make the code general:
The two functions will take another argument:
    how much previous context, in number of words, to consider
    when generating our epic.

Suppose we want the context to be 2 words.
Keys of the dictionary? 2 words!
values? the words found in the text to follow those TWO words

what data structure should we use for the keys?
    list? NOPE! those are mutable, so they cannot be dictionary keys
    tuple!

e.g., for midsummernight.txt
{('knavish', 'lad,'): ['Thus'], ('good', 'mounsieur,'): ['get', 'bring', 'have', 'but'], ...

Look in midsummernight.txt to show exactly what this means.
Here is an example of using this:
    word_dict[('knavish','lad,')] is ['Thus']
just like:
    my = {1:45,2:43}
    my[1] is 45
Same thing! It is just that the keys and the values are more complicated.

So, for example for this text:
simple_1.txt
simple_1_txt_and_dict.txt
don't type in: just show them.

Happy day and Happy day but Sad minute so Sad minute drats!

{('Happy', 'day'): ['and', 'but'], ('', 'Happy'): ['day'],
 ('minute', 'so'): ['Sad'], ('Sad', 'minute'): ['so', 'drats!'],
 ('day', 'and'): ['Happy'], ('minute', 'drats!'): [''],
 ('', ''): ['Happy'], ('but', 'Sad'): ['minute'],
 ('and', 'Happy'): ['day'], ('day', 'but'): ['Sad'],
 ('so', 'Sad'): ['minute']}

start context off? ('','')! the two previous words - nothing.
Then, move the context over one word at a time.
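The way the context slides along the text can be tried out on its own. A minimal sketch with a context of 2 words; the variable names (seen, pair) are illustrative and are not taken from wannabe.py.

```python
words = "Happy day and Happy day but".split()

context = ("", "")          # the two previous words: nothing yet
seen = []                   # (context, word) pairs, in reading order
for w in words:
    seen.append((context, w))
    # Drop the oldest word and append the newest. This builds a NEW tuple,
    # because a tuple cannot be changed in place.
    context = context[1:] + (w,)

for pair in seen:
    print(pair)

# Tuples are hashable, so they work as dictionary keys. Lists do not.
word_dict = {("Happy", "day"): ["and", "but"]}
print(word_dict[("Happy", "day")])
try:
    word_dict[["Happy", "day"]] = ["and"]
except TypeError as err:
    print("a list as a key fails:", err)
```

Each printed pair is one step of the board trace: the context on the left, the word that followed it on the right.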
word_dict is {}

current word is Happy
context is ('', '')
word_dict is {('', ''): ['Happy']}

current word is day
context is ('', 'Happy')
word_dict is {('', ''): ['Happy'], ('', 'Happy'): ['day']}

current word is and
context is ('Happy', 'day')
word_dict is {('Happy', 'day'): ['and'], ('', ''): ['Happy'],
    ('', 'Happy'): ['day']}

current word is Happy
context is ('day', 'and')
word_dict is {('Happy', 'day'): ['and'], ('', ''): ['Happy'],
    ('day', 'and'): ['Happy'], ('', 'Happy'): ['day']}

current word is day
context is ('and', 'Happy')
word_dict is {('Happy', 'day'): ['and'], ('and', 'Happy'): ['day'],
    ('', ''): ['Happy'], ('day', 'and'): ['Happy'],
    ('', 'Happy'): ['day']}

current word is but
context is ('Happy', 'day')
word_dict is {('Happy', 'day'): ['and', 'but'], ('and', 'Happy'): ['day'],
    ('', ''): ['Happy'], ('day', 'and'): ['Happy'],
    ('', 'Happy'): ['day']}

current word is Sad
context is ('day', 'but')
word_dict is {('Happy', 'day'): ['and', 'but'], ('', 'Happy'): ['day'],
    ('day', 'and'): ['Happy'], ('', ''): ['Happy'],
    ('and', 'Happy'): ['day'], ('day', 'but'): ['Sad']}

current word is minute
context is ('but', 'Sad')
word_dict is {('Happy', 'day'): ['and', 'but'], ('', 'Happy'): ['day'],
    ('day', 'and'): ['Happy'], ('', ''): ['Happy'],
    ('but', 'Sad'): ['minute'], ('and', 'Happy'): ['day'],
    ('day', 'but'): ['Sad']}

current word is so
context is ('Sad', 'minute')
word_dict is {('Happy', 'day'): ['and', 'but'], ('', 'Happy'): ['day'],
    ('Sad', 'minute'): ['so'], ('day', 'and'): ['Happy'],
    ('', ''): ['Happy'], ('but', 'Sad'): ['minute'],
    ('and', 'Happy'): ['day'], ('day', 'but'): ['Sad']}

current word is Sad
context is ('minute', 'so')
word_dict is {('Happy', 'day'): ['and', 'but'], ('', 'Happy'): ['day'],
    ('minute', 'so'): ['Sad'], ('Sad', 'minute'): ['so'],
    ('day', 'and'): ['Happy'], ('', ''): ['Happy'],
    ('but', 'Sad'): ['minute'], ('and', 'Happy'): ['day'],
    ('day', 'but'): ['Sad']}

current word is minute
context is ('so', 'Sad')
word_dict is {('Happy', 'day'): ['and', 'but'], ('', 'Happy'): ['day'],
    ('minute', 'so'): ['Sad'], ('Sad', 'minute'): ['so'],
    ('day', 'and'): ['Happy'], ('', ''): ['Happy'],
    ('but', 'Sad'): ['minute'], ('and', 'Happy'): ['day'],
    ('day', 'but'): ['Sad'], ('so', 'Sad'): ['minute']}

current word is drats!
context is ('Sad', 'minute')
{('Happy', 'day'): ['and', 'but'], ('', 'Happy'): ['day'],
    ('minute', 'so'): ['Sad'], ('Sad', 'minute'): ['so', 'drats!'],
    ('day', 'and'): ['Happy'], ('minute', 'drats!'): [''],
    ('', ''): ['Happy'], ('but', 'Sad'): ['minute'],
    ('and', 'Happy'): ['day'], ('day', 'but'): ['Sad'],
    ('so', 'Sad'): ['minute']}

Now, to the code. Here is what we have:
[just change the file; keep a copy of the original]

def build_dict(infile, context_length):
    context = ??
    word_dict = {}
    for line in infile:
        words = line.split()
        for w in words:
            e.g., context is something like ('day','but')
            if CONTEXT in word_dict:
                word_dict[CONTEXT].append(w)
            else:
                word_dict[CONTEXT] = [w]
            UPDATE CONTEXT - how?

    # Take care of the special case of the very last word
    # only appearing once - at the end of the file
    if not last_word in word_dict:
        word_dict[last_word] = [""]        WHAT SHOULD THIS BE?
    return word_dict

def build_dict(infile, context_length):
    context = ??
    word_dict = {}
    for line in infile:
        words = line.split()
        for w in words:
            if CONTEXT in word_dict:
                word_dict[CONTEXT].append(w)
            else:
                word_dict[CONTEXT] = [w]
            UPDATE CONTEXT - how?              <---
            context = context[1:] + (w, )      <---

In our example:
Happy day and Happy day but Sad minute so Sad minute drats!

DRAW something on the board:

current word is and
context is ('Happy', 'day')
word_dict is {('Happy', 'day'): ['and'], ('', ''): ['Happy'],
    ('', 'Happy'): ['day']}

current word is Happy
context is ('day', 'and')

def build_dict(infile, context_length):
    context = ??
        <--- NEEDS to be context_length copies of ""
    We can use a for-loop for this:
    context = ('',)                            <--
    for i in range(context_length-1):
        context = context + ('',)
    word_dict = {}
    for line in infile:
        words = line.split()
        for w in words:
            if context in word_dict:
                word_dict[context].append(w)
            else:
                word_dict[context] = [w]
            context = context[1:] + (w, )
    if not context in word_dict:               <---
        word_dict[context] = [""]
    return word_dict

We'll probably be out of time for this. They should read and run the
code wannabe.py until they understand it.

Now, generating our text:

context ('', '')
chosen word Happy
our epic Happy

context ('', 'Happy')
chosen word day
our epic Happy day

context ('Happy', 'day')
chosen word but
our epic Happy day but

context ('day', 'but')
chosen word Sad
our epic Happy day but Sad

context ('but', 'Sad')
chosen word minute
our epic Happy day but Sad minute

context ('Sad', 'minute')
chosen word drats!
our epic Happy day but Sad minute drats!

context ('minute', 'drats!')
chosen word
our epic Happy day but Sad minute drats!

Now, let's see our epic
Happy day but Sad minute drats!

Here's the new code:

def write_epic(word_dict, num_words, context_length):
    '''Based on the word_dict dictionary, produce an epic of
    num_words words.'''
    epic = ''
    context = ('',)                            # DIFF
    for i in range(context_length-1):
        context = context + ('',)
    for i in range(num_words):
        values = word_dict[context]   # get words that follow current context
        random.shuffle(values)
        word = values[0]
        # To do chars rather than words, delete '+ " "'
        epic += word + " "
        context = context[1:] + (word, )       # next prefix   DIFF
    return epic
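For reference, the two functions can be put together into one runnable sketch. This is not the contents of wannabe.py, and it makes three assumptions that differ from the notes: it takes a list of lines rather than an open file; it uses random.choice from the standard library instead of shuffling and taking element 0, which avoids reordering the lists stored in the dictionary; and it stops early when it reaches a context that never occurred in the text, since word_dict[context] would otherwise raise KeyError.

```python
import random


def build_dict(lines, context_length):
    """Map each tuple of context_length words to the words that followed it."""
    context = ("",) * context_length   # same result as the for-loop above
    word_dict = {}
    for line in lines:
        for w in line.split():
            if context in word_dict:
                word_dict[context].append(w)
            else:
                word_dict[context] = [w]
            context = context[1:] + (w,)
    # The final context has no follower in the text; give it an empty one.
    if context not in word_dict:
        word_dict[context] = [""]
    return word_dict


def write_epic(word_dict, num_words, context_length):
    """Produce up to num_words words, each chosen from the current context."""
    epic = ""
    context = ("",) * context_length
    for _ in range(num_words):
        if context not in word_dict:   # assumption: stop rather than crash
            break
        word = random.choice(word_dict[context])
        epic += word + " "
        context = context[1:] + (word,)
    return epic


text = ["Happy day and Happy day but Sad minute so Sad minute drats!"]
word_dict = build_dict(text, 2)
print(word_dict[("Happy", "day")])    # ['and', 'but']
print(write_epic(word_dict, 10, 2))   # output varies from run to run
```

With a context of 2 every epic from this text begins "Happy day", because ('', '') and ('', 'Happy') each have only one follower; the first random choice happens at ('Happy', 'day').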