### Homework Spring 2015

• Wednesday January 14
• Join the class group https://groups.google.com/forum/#!forum/pitt-cs-2015-spring-2015
• Problem 3-3 part a. You need not provide justifications for your order. If you aren't able to find a group, you can do this individually. While I encourage you to use LaTeX, it is not strictly required for this assignment. You may handwrite these solutions.
• Friday January 16
• Problem 3-4. For conjectures that are untrue, explain which reasonable class of functions the statement is true for. For example, you might say "The statement is untrue in general, and here is an example. But the statement is true if the functions are strictly increasing, and here is a proof."
• Consider the following definitions for f(n, m) = O(g(n, m)). State which definitions are logically equivalent. Fully justify your answers. Argue about which definition you think is the best one in the context where f and g are run times of algorithms and the input size is monotonically growing in n and m (for example, n might be the number of vertices in a graph and m might be the number of edges in a graph).
• There exist positive constants c, n_0, m_0 such that 0 <= f(n, m) <= c * g(n, m) for all (n, m) such that n >= n_0 AND m >= m_0.
• There exist positive constants c, n_0, m_0 such that 0 <= f(n, m) <= c * g(n, m) for all (n, m) such that n >= n_0 OR m >= m_0.
• There exists a constant c> 0 such that lim sup_{n -> infinity} lim sup_{m -> infinity} f(n,m)/g(n, m) < c. Note that this means that you first take the limit superior with respect to m. The result will be a function of just n. You then take the limit superior of this function with respect to n. If you don't know what limit superior means, you can just assume that the limit exists, in which case the limit and limit superior are the same.
• There exists a constant c > 0 such that lim sup_{m -> infinity} lim sup_{n -> infinity} f(n,m)/g(n, m) < c. Note that on the surface this definition differs from the last one in that the order in which you take the limits is switched.
• There exists a constant c> 0 such that for all but finitely many pairs (m, n) it is the case that f(n,m) < c * g(n, m)
• Wednesday January 21
• 8.1-3. Give an adversarial strategy and prove that it is correct.
• 8.1-4. Give an adversarial strategy and prove that it is correct.
• Problem 8-6. Parts a and b are essentially asking you to consider the adversarial strategy that answers so as to maximize the number of the original ways of merging two sorted lists that are consistent with the answer. You will likely find Stirling's approximation for n! useful. For parts c and d, come up with a different adversarial strategy. Explain why the bound that you get using the method proposed in parts a and b isn't as good as the bound you get using an adversarial strategy. That is, in what way are you being too generous to the algorithm in parts a and b?
• Friday January 23
• Consider the problem of determining whether a collection of real numbers x_1 ... x_n is nice. A collection of numbers is nice iff the difference between consecutive numbers in the sorted order is at most 1. So (1.2, 2.7, 1.8) is nice, but (1.2, 2.9, 1.8) is not nice since the difference between 1.8 and 2.9 is more than 1. We want to show that every comparison based algorithm to determine if a collection of n numbers is nice requires Omega(n log n) comparisons of the form x_i - x_j <= c, where c is some constant that the algorithm can specify. So if c=0, this is a standard comparison.
• Hint: This is similar to the lower bound for element uniqueness.
• Another Hint: Consider the (n-1)! permutations pi of {1, ... n} where pi(1)=1, and the corresponding points in n dimensional space. Note that all (n-1)! of these points are nice. Show the midpoint of any pair of these nice points is not nice. Then explain how to use this fact to give an adversarial strategy showing the Omega(n log n) lower bound.
• Consider a distributed ring network of n computers where each node starts with a unique arbitrary O(log n) bit ID in the range [1, n]. Note the word "arbitrary" means you can not make any assumptions about the ordering of the IDs on the ring. In each synchronous round, each computer can send a message of unbounded size to its clockwise neighbor. The computers' goal is to assign a label to each node in such a way that no two adjacent computers are assigned the same label (call this a proper labeling). Formally a label is just a sequence of bits of some fixed length. The objective is to use as few rounds of communication as possible.
• Upper Bound
• As a super easy warm-up, give an algorithm that uses O(n) rounds to obtain a proper O(1) bit labeling.
• As a super easy warm-up, give an algorithm that in 0 rounds obtains a proper labeling with O(log n) bit labels.
• Give an algorithm that uses 1 round to obtain a proper labeling using O(log log n) bit labels. Hint: Each computer can only learn the ID of its counterclockwise neighbor. Consider the first bit on which these IDs differ. This is enough information to produce a label.
• Give an algorithm that uses 2 rounds to obtain a proper labeling using O(log log log n) bit labels. Hint: See the above hint.
• Extend this idea to give an algorithm that uses O(log^* n) rounds to obtain a proper labeling using O(1) bit labels.
• Lower Bound
• As an easy warm-up, give an adversarial argument to show that every algorithm that obtains a proper labeling in 0 rounds requires Omega(log n) bit labels. It's obvious that this is true. That isn't the point. The point is to understand what arguments you have to make to make this formally correct.
• Show that if there is an algorithm A that obtains a proper labeling in 1 round using c bits, then there is an algorithm B that obtains a proper labeling in 0 rounds using O(2^c) bits.
• Hint: This is conceptually a bit tricky. First understand why one can think of A as a function F that takes two IDs as inputs and outputs a label. Why is this statement trivial if F is injective?
• So for the moment consider the case that this function has the property that every label in the range has at least two different ID pairs that map to it. Now in a zero-round protocol, each computer only knows one of these two IDs, its own; it does not actually know the ID of the computer counterclockwise to itself. But it can compute the collection of possible labels that it might produce for the various possible IDs for its counterclockwise neighbor. Key Point: Why does the correctness of A mean that the collection of possible labels for two adjacent computers can not be identical? How should B use this insight to produce a label?
• Then explain how to accomplish this without any assumptions on F.
• Show that if there is an algorithm A that obtains a proper labeling in 2 rounds using c bits, then there is an algorithm B that obtains a proper labeling in 1 round using O(2^c) bits. Hint: Imagine that you are running A, but you stop it one round early. Now imagine that you are a computer, and have to produce a label at this point. Think about what information the computer would have obtained if A was run for one more round, and what collection of labels the computer might have produced. Key question: What can you say about the collection of labels for consecutive computers if A is correct? Ask yourself how B might produce a label from the collection of labels that the computer might have produced if A were allowed one more round. Then show that this labeling is proper.
• Show that if there is an algorithm A that obtains a proper labeling in k rounds using c bits, then there is an algorithm B that obtains a proper labeling in k-1 rounds using O(2^c) bits. Hint: If you can do it for k=1, it basically works the same in general.
• Conclude that any algorithm that obtains a proper labeling using O(1) bit labels takes Omega(log^* n) rounds.
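The one-round hint in the upper bound (look at the first bit on which your ID differs from your counterclockwise neighbor's ID) can be illustrated with a small sketch. The function names `reduce_label`, `one_round`, and `is_proper` are hypothetical, the ring is modeled as a Python list in which position v - 1 is the counterclockwise neighbor of position v, and "first bit" is taken to mean the lowest-order differing bit; these are assumptions for illustration, not part of the assignment.

```python
def reduce_label(own: int, ccw: int) -> int:
    """New label built from (index of lowest differing bit, own bit at that index)."""
    diff = own ^ ccw
    assert diff != 0, "adjacent labels must differ before the round"
    i = (diff & -diff).bit_length() - 1   # index of the lowest set bit of diff
    bit = (own >> i) & 1                  # this node's bit at that index
    return 2 * i + bit


def one_round(labels):
    """One synchronous round: every node learns the label of its counterclockwise neighbor."""
    return [reduce_label(labels[v], labels[v - 1]) for v in range(len(labels))]


def is_proper(labels):
    """True if no two adjacent nodes on the ring share a label."""
    return all(labels[v] != labels[v - 1] for v in range(len(labels)))
```

If the old labels fit in b bits, the new labels are smaller than 2b, so each round shrinks the label length roughly logarithmically. Proving that the new labeling is always proper is still the exercise.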
• Monday January 26
• Consider a setting where you have two computer networking routers A and B. Each router has collected a list L_A and L_B of IP source addresses for the packets that have passed through the router that day. An IP address is n bits, and thus there are 2^n possible IP addresses. Now the two routers want to communicate via a two-way channel to determine whether there was some source that sent a packet through one of the routers, but not the other. So more precisely, at the end of the protocol each router should commit to a bit specifying the answer to this question, and the bits for both routers should be correct. You can assume that a bit sent on the channel is guaranteed to arrive on the other end in one time unit. We want to consider protocols for accomplishing this goal.
• Consider the following protocol: A sends to B the list of all of the IP source addresses that it has seen, B compares A's list to its list, and then B sends A a 0 bit if the lists are identical and a 1 bit otherwise. Show that the protocol above uses n 2^n + 1 bits in the worst case. This is a trivial warm-up problem.
• Give a protocol that uses 2^n +O(1) bits in the worst case. Another trivial warmup problem.
• Show that there is no protocol that can solve this problem without exchanging any bits. It's obvious that this is true. That isn't the point. The point is to understand what arguments you have to make to make this formally correct.
• Show that there is no protocol that can solve this problem in which A sends one bit to B and no more bits are exchanged. Again it's obvious that this is true. Again that isn't the point. Again the point is to understand what arguments you have to make to make this formally correct. Hint: Ask yourself how the adversarial strategy should decide whether this first bit is a 0 or a 1.
• Show that there is no protocol that can solve this problem in which A sends one bit to B, B replies with one bit to A, and no more bits are exchanged. Again it's obvious that this is true. Again that isn't the point. Again the point is to understand what arguments you have to make to make this formally correct.
• Prove that every protocol for this problem must send 2^n bits for its worst case instance. Of course your argument should involve an adversarial argument.
• Assume that you have a computer networking router that sees a stream of k IP packets, each with a source IP address. The router sees a packet, optionally records some information in memory, and then passes the packet on. The router's goal is to always know the IP source address that it has seen most frequently to date. The most obvious way to accomplish this is to keep a count for each IP source address seen to date. Show that every algorithm must use Omega(k) bits of memory.
• Hint: This is an "easy" consequence of the previous subproblem, provided that you think about it the right way. Assume that you had a method that solved this problem using o(k) bits of memory. Explain how to use this method to get an algorithm for the previous subproblem that uses less than 2^k bits of communication.
• Wednesday January 28
• 8-4. For part b give an adversarial strategy. For part c, use linearity of expectations in your algorithm analysis.
• 4-3 except for part d.
• Apply the Master Theorem (Theorem 4.1) whenever applicable.
• For part f use induction; You will need 2 inductive proofs, one for the upper bound and one for the lower bound.
• Otherwise, draw the recursive call tree and sum up the costs level by level
• Friday January 30
• McDonald's is running a Monopoly promotion where every time you order a meal, you get a random ticket from one of m possible ticket types (there are essentially infinitely many tickets of each type). Assume (as is not true in the real promotion) that each of the m ticket types is equally likely. Calculate as accurately as possible (at least to within a multiplicative constant) how many McDonald's meals you would have to eat before you have n different ticket types for 1 <= n <= m. In particular, how many meals do you have to eat before you get all m different ticket types? HINT: Find the Bernoulli trials.
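As a way to sanity-check a closed-form answer to the ticket problem, the expected number of meals can be computed exactly for small m. This is a sketch under the problem's uniform assumption: while you hold j distinct types, each meal is a Bernoulli trial that succeeds (yields a new type) with probability (m - j)/m, so that phase takes m/(m - j) meals in expectation. The function name `expected_meals` is hypothetical.

```python
from fractions import Fraction


def expected_meals(m: int, n: int) -> Fraction:
    """Exact expected number of meals until n distinct ticket types (out of m) are held."""
    if not 1 <= n <= m:
        raise ValueError("need 1 <= n <= m")
    # Phase j (holding j distinct types) is a geometric wait with success probability (m - j)/m.
    return sum((Fraction(m, m - j) for j in range(n)), Fraction(0))
```

For example, `expected_meals(6, 6)` evaluates to 147/10. The asymptotic estimate as a function of n and m is left to the write-up.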
• Consider the problem of finding the largest i numbers in sorted order from a list of n numbers (see problem 9-1 in the text). Consider the following algorithm: you consider the numbers one by one, maintaining an auxiliary data structure of the largest i numbers seen to date. We get various algorithms depending on what the auxiliary data structure is and how one searches and updates it. For each of the following variations give the worst-case time complexity as a function of n and i. For each of the following variations give the average-case time complexity as a function of n and i under the assumption that each input permutation is equally likely. Hint: Use linearity of expectations. These are all similar and easy if you look at them the right way.
• The auxiliary data structure is an ordered list and you use linear search starting from the end that contains the largest number.
• The auxiliary data structure is an ordered list and you use linear search starting from the end that contains the smallest number.
• The auxiliary data structure is a balanced binary search tree and you use standard log time search, insert and delete operations.
• The auxiliary data structure is a balanced binary search tree and you use standard log time insert and delete operations, but you start your search from the smallest item in the tree.
• Consider the following problem. The input is n disjoint line segments contained in an L by L square S in the Euclidean plane. The goal is to partition S into convex polygons so that every polygon intersects at most one line segment. So it is ok for a line segment to be in multiple polygons, but each polygon can intersect at most one line segment.
• Consider the following algorithm that starts with the polygon S. Let pi be a random permutation of the line segments.
• While there is a polygon P that contains more than one line segment,
• let l be the first line segment in the pi order that intersects P.
• Now cut P into two polygons using the linear extension of l (so you extend the line segment l into a line and then use that to cut P).
• Show that the expected number of resulting polygons is O(n log n).
• Hint: Use linearity of expectations. First ask yourself how the number of polygons is related to the number of times that line segments get cut in the process. Consider two line segments u and v. Let C_{u,v} be a 0/1 random variable that is 1 if the linear extension of u cuts v. Let index(u, v) denote the number of line segments that the linear extension of u hits before hitting v. In other words, if you start walking from u on u's linear extension towards v, the index is how many line segments you cross before hitting v. If you don't hit v, then the index is positive infinity. What is the relationship between the probability that C_{u,v}=1 and index(u, v)?
• Monday February 2
• Assume you have a source of random bits. So in one time unit, this source will produce one random bit (that is 1 with probability 1/2 independent of other bits). Consider the problem of outputting a random permutation of the integers from 1 to n. So each of the n! permutations should be produced with probability exactly 1/n!.
• Give an algorithm to solve this problem and show that the expected time of the algorithm is O(n log n).  This includes both the time that your algorithm takes, plus 1 unit of time for each random bit used.
• Now assume that there is a limited source of at most n^2 random bits. Show that there is no algorithm that can solve the problem using expected time O(n^2).  Hint: Show the result for n=3. Why can't you produce a random permutation of 1, 2, 3 using 9 bits? Then generalize to an arbitrary n.
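One possible way to think about the random-bit model in the permutation problem is a Fisher-Yates shuffle whose index draws come from rejection sampling on fixed-width bit strings. This is only a sketch: `next_bit` is a hypothetical callable standing in for the bit source, `uniform_below` and `random_permutation` are illustrative names, and the analysis of the expected number of bits is not done here.

```python
def uniform_below(k, next_bit):
    """Uniform integer in [0, k) for k >= 2, using rejection sampling on random bits."""
    width = max(1, (k - 1).bit_length())   # enough bits to represent 0 .. k-1
    while True:
        value = 0
        for _ in range(width):
            value = (value << 1) | next_bit()
        if value < k:                      # otherwise reject and draw a fresh bit string
            return value


def random_permutation(n, next_bit):
    """Fisher-Yates shuffle of 1..n driven only by calls to next_bit()."""
    perm = list(range(1, n + 1))
    for i in range(n - 1, 0, -1):
        j = uniform_below(i + 1, next_bit)
        perm[i], perm[j] = perm[j], perm[i]
    return perm
```

Note that the rejection loop has no fixed upper bound on the number of bits it consumes, which is worth keeping in mind for the part of the problem with a limited supply of bits.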
• 11-1
• Wednesday February 4
• Assume that you had to solve the hiring problem at a large academic institution where effectively you couldn't fire anyone (note that this is a realistic assumption). Thus once you hire someone, the game is over.
• Consider the following strategy. Consider the applicants in random order. Interview but do not hire the first k candidates. So k is a parameter to the algorithm. After the first k candidates, hire the first one that is better than all of the first k applicants. You can assume that you can determine an underlying linear order among the candidates interviewed so far.
• Find the probability of hiring the best candidate as a function of n and k.
• Determine the k that maximizes this probability, and what this maximum probability is for large n
• HINT:  There are multiple ways to analyze this, some are much easier than others. So unless you are a bit lucky, the first thing you try may not work easily. 1/e is the best you can do in terms of the probability you hire the best person.
• HINT: See the discussion of hiring in section 5.1 in the text.
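A derived formula for the hiring strategy can be checked against brute force for small n. The sketch below enumerates every interview order, applies the skip-the-first-k rule described above, and counts how often the best candidate is hired. The names `hires_best` and `success_probability` are hypothetical, and candidates are modeled as distinct integer ranks where larger is better.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def hires_best(order, k):
    """Apply the strategy to one interview order; True if the overall best is hired."""
    threshold = max(order[:k]) if k > 0 else float("-inf")
    for rank in order[k:]:
        if rank > threshold:           # first candidate better than all of the first k
            return rank == max(order)
    return False                       # nobody beat the first k, so no one is hired


def success_probability(n, k):
    """Exact probability of hiring the best of n candidates, by exhaustive enumeration."""
    wins = sum(hires_best(p, k) for p in permutations(range(n)))
    return Fraction(wins, factorial(n))
```

Enumeration is only feasible for small n (it visits all n! orders), so it serves as a check on the closed form rather than a replacement for it.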
• 11-2
• Friday February 6
• You have a sorted array A containing n real numbers each selected independently and uniformly at random from the interval [0, 1]. You have a real x in [0, 1]. The problem is to find a subarray of size sqrt(n) that contains x.
• Show that the following algorithm solves this problem in O(1) average case time. HINT: Find the Bernoulli trials. Figure out how to think about the outcome of this algorithm in terms of the number of successes/failures in some Bernoulli trials. Use a Chernoff tail bound. See appendix C.5. HINT: You can use the result of exercise C.5-6 without proof.

• last= x*n
• if A[last] < x then
• next= last + sqrt(n)
• while A[next] < x do
• last=next
• next= next + sqrt(n)
• else if A[last] > x then
• next= last - sqrt(n)
• while A[next] > x do
• last=next
• next= next - sqrt(n)
• Return x lies between positions next and last
• Explain how to use the above algorithm to obtain an algorithm with O(log log n) average case running time for the searching problem (finding the exact location of x in A).
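The pseudocode above leaves array bounds and rounding implicit. A runnable sketch that makes those choices explicit follows; the function name `bracket`, the 0-based indexing, the use of the integer square root as the step, and the clamping at both ends of the array are all assumptions added for illustration.

```python
import math


def bracket(A, x):
    """Return indices (lo, hi), at most one step apart, whose values bracket x in sorted A.

    At the array ends the bracket is clamped, so A[lo] <= x may fail only when lo == 0
    and A[hi] >= x may fail only when hi == len(A) - 1.
    """
    n = len(A)
    step = max(1, math.isqrt(n))
    last = min(n - 1, int(x * n))        # initial guess from the uniform assumption
    if A[last] < x:
        nxt = min(n - 1, last + step)
        while A[nxt] < x and nxt < n - 1:    # jump right by sqrt(n) until we pass x
            last = nxt
            nxt = min(n - 1, nxt + step)
        return last, nxt
    if A[last] > x:
        nxt = max(0, last - step)
        while A[nxt] > x and nxt > 0:        # jump left by sqrt(n) until we pass x
            last = nxt
            nxt = max(0, nxt - step)
        return nxt, last
    return last, last
```

The average-case analysis then comes down to bounding how many jumps the loops make, which is where the Bernoulli trials enter.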
• Monday February 9
• The purpose of this problem is to develop a version of Yao's technique for Monte Carlo randomized algorithms, within the context of the jug problem. Consider the red and blue jug problem from problem 8-4 in the text.  Assume that if you sorted the jugs by volume, that each permutation is equally likely.
• Show that if a  deterministic algorithm A  always stops in o(n log n) steps, then  the probability that A is correct for large n is less than 1 percent.
• Show that if there is a distribution of the input on which no deterministic algorithm with running time A(n) is correct with probability > 1 percent, then there is no Monte Carlo algorithm with running time A(n) that can be correct with probability > 1 percent. Hint: Mimic the proof of Yao's technique/lemma for the case of Las Vegas algorithms. Consider a two dimensional table/matrix T, where entry T(A, I) is 1 if algorithm A is correct on input I, and 0 otherwise.
• Conclude that any Monte Carlo algorithm for this jug problem must have time complexity Omega(n log n).
• Wednesday February 11
• Consider the following online problem. You are given a sequence of bits b_1, ... b_n over time. Each bit is in an envelope. You first see the envelope for b_1, then the envelope for b_2, .... When you get the i^th envelope, you can either look inside to see the bit, or destroy the envelope (in which case you will never know what the bit is). You know a priori that at least n/2+1 of the bits are 1. Your goal is to find an envelope containing a 1 bit. You want to open as few envelopes as possible.
• Give a deterministic algorithm that will open at most n/2 + O(1) envelopes. HINT: This is completely straight-forward.
• Show that every deterministic algorithm must open at least n/2 - O(1) envelopes. HINT: This is completely straight-forward.
• Assume that each of the n! permutations of the inputs is equally likely. Show that there is a deterministic algorithm where the expected number of envelopes that it opens is O(1). HINT: This is a straight-forward consequence of some facts that we learned about Bernoulli trials.
• Give a Monte Carlo algorithm that opens O(log n) envelopes and has probability of error  < 1/n. You must show that the probability of error is small. HINT: This is a straight-forward consequence of some facts that we learned about Bernoulli trials.
• Using the version of Yao's technique for Monte Carlo algorithms that you developed in the last homework assignment, show that every Monte Carlo algorithm must open Omega(log n) envelopes if it is to be incorrect with probability < 1/n. HINT: This is a straight-forward application of the Yao's technique for Monte Carlo algorithms that you developed in the previous homework problem.
• Give a Las Vegas algorithm where the expected number of opened envelopes is O(n^(1/2)).
• Hint: Take some random guesses for the first half of the envelopes, and then if you don't find a 1 bit, give up and do the most obvious thing.
• Hint: See the discussion of the Birthday paradox in section 5.4.1. You may use facts from the analysis of the Birthday paradox in the CLRS text or from the wikipedia page without proof.
• Show that every Las Vegas algorithm for the previous envelope problem must open Omega(n^{1/2}) envelopes in expectation.
• Hint: Use Yao's technique and the following probability distribution.
• With probability one half, sqrt(n) uniformly distributed random bits in [1, n/2] are set to 1 and the remaining bits in that interval are 0, bits in the interval [n/2 + 1, n/2 + sqrt(n)] are all set to 0, and the remaining bits are 1.
• For k = 0, ..., sqrt(n) - 1, with probability 1/(2 sqrt(n)), bits [1, ..., n/2] contain a uniformly distributed random set of k 0's and the rest are 1's. Then sqrt(n) - k 0's are contained in uniformly distributed random bit positions in [n/2 + 1, n/2 + sqrt(n)], and the remaining k bits in positions [n/2 + 1, n/2 + sqrt(n)] are 1's. The remaining bits in the stream are 0.
• Friday February 13
• 16-5 part c
• 16-2. You don't need to give running times, just prove correctness. Hint: For part (a) the most obvious exchange works. For part (b), the most obvious exchange does not work.
• Monday February 16
• 16-4 part a
• 15-4
• 15-5
• Wednesday February 18
• 15-2
• 15-3
• 15-9
• Friday February 20
• Our goal is now to consider the Knapsack problem (input: n coins with positive integer weights and positive integer values and a positive integer weight limit L, output: maximum value collection of coins with weight less than the weight limit)
• Give a straight-forward O(nL) time and O(nL) space dynamic programming algorithm that actually computes the collection of coins. HINT: Compute the table/array in the obvious way, and then backtrack through the table to determine the value of the coins.
• Give a straight-forward O(nL) time and O(L) space dynamic programming algorithm that only computes the maximum value (not the actual coins you would take to obtain this value).
• Give an O(nL) time and O(L) space algorithm that actually computes the collection of coins. This is not straight-forward. You should use the following strategy:
• Consider the following problem. The input is the same as for the knapsack problem, a collection of n items I1,...,In with weights w1,...,wn, and values v1,...,vn, and a weight limit L. The output is in two parts. First you want to compute the maximum value of a subset S of the n items that has weight at most L, as well as the weight of this subset. Let us call this value and weight va and wa. Secondly for this subset S you want to compute the weight and value of the items in {I1, . . . , In/2} that are in S. Let us call this value and weight vb and wb. So your output will be two weights and two values. Give an algorithm for this problem that uses space O(L) and time O(nL).
• Explain how to use the algorithm from the previous subproblem to get a divide and conquer algorithm for finding the items in the Knapsack problem that uses space O(L) and time O(nL). HINT: First call the algorithm for the previous subproblem. What recursive call do you need to make to find the items in the final answer from the items in {I1 , . . . , In/2}? What recursive call do you need to make to find the items in the final answer from the items in {In/2+1, . . . , In}? Solve the resulting recurrence relation.
• Comment: Note that this method can be applied to most dynamic programs.
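For reference, the value-only version of the knapsack table can be kept in a single array of length L + 1 by scanning weights downward. The sketch below assumes the "weight at most L" reading used in the two-part subproblem, and the function name `knapsack_max_value` is hypothetical. It computes only the maximum value; recovering the chosen coins within O(L) space is the part that needs the divide and conquer strategy described above.

```python
def knapsack_max_value(weights, values, limit):
    """Maximum total value of a subset with total weight <= limit, using O(limit) space."""
    best = [0] * (limit + 1)            # best[w] = max value achievable with weight <= w
    for wt, val in zip(weights, values):
        # Scan weights downward so each coin is used at most once.
        for w in range(limit, wt - 1, -1):
            best[w] = max(best[w], best[w - wt] + val)
    return best[limit]
```

The downward scan is what keeps the single array correct: `best[w - wt]` still holds the value from before the current coin was considered.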
• There are three shortest path algorithms covered in chapter 24 (Bellman-Ford, Dijkstra, and the topological sort algorithm for directed acyclic graphs). For each of the following problems, pick the most appropriate of these three shortest path algorithms to apply to obtain an algorithm for the problem. This may or may not involve modifying the algorithm slightly. If you need to modify the algorithm, explain how. You may need to first briefly explain why the problem is indeed just a shortest path problem in disguise; that is, state how one obtains the graph, and why the shortest path in this graph corresponds to a solution to the problem. Give the running time of the resulting algorithm.
• The problem described in 24-2
• The problem described in 24-3
• The problem described in 24-6
• The problem of finding the path where the minimum edge weight is maximized. You need such an algorithm to implement one of the Karp-Edmonds variations on Ford-Fulkerson.
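To make the last objective concrete (maximize the minimum edge weight along a path), here is one way it could be computed, assuming a Dijkstra-style adaptation that always expands the vertex with the largest bottleneck found so far. This is a sketch and not necessarily the intended answer; the function name `widest_path_width` and the adjacency-list format are hypothetical.

```python
import heapq


def widest_path_width(graph, source, target):
    """Largest achievable minimum edge weight on a source-target path, or None if unreachable.

    graph maps each node to a list of (neighbor, weight) pairs.
    """
    best = {source: float("inf")}           # best bottleneck known so far per node
    heap = [(-float("inf"), source)]        # max-heap via negated widths
    done = set()
    while heap:
        neg_width, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            return -neg_width
        for v, w in graph.get(u, []):
            width = min(-neg_width, w)      # bottleneck of the path extended by edge (u, v)
            if width > best.get(v, float("-inf")):
                best[v] = width
                heapq.heappush(heap, (-width, v))
    return None
```

Justifying why the greedy choice is still safe under this objective, and stating the running time, remains part of the write-up.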
• Monday February 23
• Show how each of the problems described in 26-1, 26-2 and 26-3 can be efficiently reduced to network flow. Give the running time of the resulting algorithms for each problem assuming that you can solve network flow in time N(V, E), where N is some function of the number of vertices V and the number of edges E in the network.
• Wednesday February 25
• 26-5
• 26-6
• Friday February 27 Email write-ups to Mike Nugent (mpn1@pitt.edu) by noon.
• We consider  the minimum spanning tree problem defined in chapter 23 of the text.
• Give an integer linear programming formulation using the following intuition, and prove that your formulation is correct: There is an indicator 0/1 random variable for each edge. You must choose at least n-1 edges (n is the number of vertices in the graph). For each subset S of k vertices, you can choose at most k -1 edges connecting vertices in S. Explain why the size of this linear program can be exponential in the size of the graph.
• Give an integer linear programming formulation using the following intuition, and prove that your formulation is correct: There is an indicator 0/1 random variable for each edge. You must choose exactly n-1 edges. For each subset S of vertices (S not the empty set and not all the vertices), you must choose at least one edge with one endpoint in S and one endpoint not in S. Explain why the size of this linear program can be exponential in the size of the graph. HINT: Theorem 23.1 in the text may be useful.
• Give a polynomial sized integer linear programming formulation using the following intuition, and prove that your formulation is correct: Call an arbitrary vertex the root. Think of a spanning tree as routing flow away from r to the rest of the tree (but now you do not have flow conservation at the vertices). Explain why the size of this linear program is polynomially bounded in the size of the graph.
• Consider a relaxation of the integer linear program in the last subproblem in that now the flows on the edges may be rational (and not necessarily integer).
• Show how to express a feasible solution to the linear program as an affine combination of rooted spanning trees. HINT: The coefficient for the first tree will be the least flow on any edge. And then repeat this idea.
• Conclude that the minimum spanning tree is an optimal solution to this linear program. That is, explain how to take the minimum spanning tree and construct a solution to this linear program with objective value equal to the weight of the minimum spanning tree. Then show that every other feasible solution has weight at least the weight of the minimum spanning tree.
• Monday March 2 Email write-ups to Mike Nugent (mpn1@pitt.edu) by noon.
• Consider a two person game specified by an m by n payoff matrix P. The two players can be thought of as a row player and a column player. The number of possible moves for the row player is m and the number of possible moves for the column player is n. Each player picks one of its moves, and then money is exchanged. If the row player makes move r, and the column player makes move c, then the row player pays the column player P_{r,c} dollars. Note that P_{r,c} could be negative, in which case really the column player is paying money to the row player. We assume that the game is played sequentially, so that one player specifies his move, the other player sees that move, and then specifies a response move (we will assume that this player makes the best possible response). Obviously each player wants to be paid as much money as possible, and if this is not possible, to pay as little as possible. HINT: All of these subproblems are easy, so if you are heading toward a complicated answer, you might want to reevaluate.
• Trivial warm-up problems:
• Give an algorithm that will efficiently compute the best response for the column player given a specific move by the row player.
• Give an algorithm that will efficiently compute the best first move by the row player given that the column player will give its best response
• Either give an example of a payoff  matrix where it is strictly better for each player to go second, or argue that there is no such payoff matrix. HINT: Roshambo
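The payoff convention in the sequential game (the row player pays the column player P_{r,c}) can be pinned down with a tiny sketch. The 2 by 2 matrix in the usage note is a made-up example, and the function names `best_response_column` and `best_first_move_row` are hypothetical.

```python
def best_response_column(P, r):
    """Column that maximizes what the row player pays, given the row player's move r."""
    return max(range(len(P[r])), key=lambda c: P[r][c])


def best_first_move_row(P):
    """Row that minimizes the payment when the column player then responds optimally."""
    return min(range(len(P)), key=lambda r: max(P[r]))
```

With the made-up matrix `P = [[1, -2], [0, 3]]`, the row player's worst-case payments are 1 for row 0 and 3 for row 1, so `best_first_move_row(P)` returns 0 and the column player's best response to that move is column 0.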
• Now we change the problem so that each player  specifies a probability distribution over his moves, and then the row player pays the column player E[P_{r,c}], where the expectation is taken over the two probability distributions.
• Give an algorithm that will efficiently compute the best response (which is probability distribution over column moves) for the column player given a probability distribution specified by the row player.
• Give an algorithm that will efficiently compute the best first move (probability distribution over row moves) for the row player given that the column player makes the best response. Hint: Linear programming
• Show the linear program for this payoff matrix
•  3 2 11
   9 6 4
   1 14 7

• Give an algorithm that will efficiently compute the best first move (probability distribution over column moves) for the column player given that the row player makes the best response. Hint: Linear programming
• Show the linear program for this payoff matrix
•  3 2 11
   9 6 4
   1 14 7

• Either give an example of a payoff  matrix where it is strictly better for each player to go second, or argue that there is no such payoff matrix.
• Hint: Strong linear programming duality.
• Consider the envelope bit sequence problem that was due on February 11. Assume that the statement of the problem was correct, and that every Las Vegas algorithm will open Omega(sqrt(n)) envelopes in expectation for some inputs. Can this necessarily be proven by Yao's technique? That is, is there necessarily an input distribution that will cause every deterministic algorithm to open Omega(sqrt(n)) envelopes in expectation? Explain.
• Wednesday March 4 Email write-ups to Mike Nugent (mpn1@pitt.edu) by noon.
• Consider the problem of constructing a maximum cardinality bipartite matching. See section 26.3 in the book, or here is a brief description. The input is a bipartite graph, where one side of the bipartition is the girls, and the other side is the boys. There is an edge between a boy and a girl if they are willing to dance together. The problem is to match the boys and girls for one dance so that as many couples are dancing as possible.
• Construct an integer linear program for this problem
• Consider the relaxed linear program where the integrality requirements are dropped. Explain how to find an integer optimal solution from any rational optimal solution. Hint: Find cycles of edges whose associated variables are not integer.
• Construct the dual program.
• Give a natural English interpretation of the dual problem (e.g. similar to how we interpreted the dual of the diet problem as the pill problem).
• Explain how to give a simple proof that a graph doesn't have a matching of a particular size. You should be able to come up with a method that would convince someone who knows nothing about linear programming.
• Assume that you have a park (mathematically a 2D plane) containing k lights and n statues. In particular, you know for each light L and for each statue S, whether light L will illuminate statue S if light L is lit. Further you are told for each light, the cost C_L for turning on light L. The goal is to light all the statues while spending as little money as possible.
• Construct an integer linear program for this problem where there are binary indicator variables for each light signifying whether the light is lit or not.
• Consider the relaxed linear program where the variables are allowed to be any rational between 0 and 1. Give an English explanation of the problem that this models. HINT: Imagine the lights have a dimmer control.
• Show that the relaxed linear program where the variables are allowed to be any rational between 0 and 1 can have a strictly smaller objective than the optimal objective for the integer linear program for some instances.
• Construct the dual program for the relaxed linear program.
• Give a natural English interpretation of the dual problem (the problem modeled by the dual linear program).
• Explain how to give a simple proof that a certain cost is required for the problem modeled by the relaxed linear program (the one with dimmer controls) using this natural interpretation of the dual.
• Friday March 6 Email write-ups to Mike Nugent (mpn1@pitt.edu) by noon.
• Consider the problem of scheduling a collection of processes on one processor. Each process J_i has a size x_i, a release time r_i, and a deadline d_i. All these values are positive integers. The goal is to find the slowest possible speed that will allow you to finish each job between its release time and deadline. A job of size x_i that is run at speed s takes x_i/s units of time to complete. A processor can switch between processes arbitrarily. For example, the processor can run J_1 for a while, then switch to J_2, then back to J_1, then to J_3, etc.
• Express this problem as a linear program
• Construct the dual program.
• Give a natural English interpretation of the dual problem (e.g. similar to how we interpreted the dual of the max flow problem as the min cut problem).
• Explain how to give a simple proof that the input is infeasible for a particular speed. You should be able to come up with a method that would convince someone who knows nothing about linear programming.
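To make the speed model in the scheduling problem above concrete, here is a minimal sketch that computes one easy lower bound on the required speed. It relies only on the stated fact that a job of size x_i run at speed s takes x_i/s time: every job whose whole window [r_i, d_i] lies inside an interval [a, b] must be finished inside that interval, so s must be at least the total size of those jobs divided by b - a. The function name and the (x, r, d) tuple layout are assumptions made for this illustration, and the sketch is not presented as the intended linear program or its dual.

```python
from fractions import Fraction

def density_lower_bound(jobs):
    """jobs is a list of (size x, release r, deadline d) with r < d.
    Returns the largest ratio (work forced into [a, b]) / (b - a),
    taken over intervals that start at a release time and end at a deadline."""
    best = Fraction(0)
    releases = {r for _, r, _ in jobs}
    deadlines = {d for _, _, d in jobs}
    for a in releases:
        for b in deadlines:
            if b <= a:
                continue
            # Only jobs whose entire window fits in [a, b] are forced into it.
            work = sum(x for x, r, d in jobs if r >= a and d <= b)
            best = max(best, Fraction(work, b - a))
    return best

# Two jobs: size 4 in window [1, 5], size 3 in window [2, 3].
print(density_lower_bound([(4, 1, 5), (3, 2, 3)]))
```

Any speed below the returned value is infeasible for the input, since some interval would then contain more forced work than the processor can complete in it.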
• Monday March 16
• Prove that each of the problems defined in 34.5-1, 34.5-2, 34.5-3, 34.5-5, 34.5-6, 34.5-7, and 34.5-8 is NP-hard using a reduction from an NP-complete problem of your choice that is defined earlier in Chapter 34. So for each problem, you need to give one polynomial time reduction. The difficulty of finding the reductions ranges from trivial to reasonably straightforward.
• Wednesday March 18
• Show that the 3-COLOR problem is NP-hard by reduction from the 3-CNF-SAT problem. 3-COLOR is defined in problem 34-3 in the text, which also contains copious hints.
• In the disjoint paths problem the input is a directed graph G and pairs (s_1, t_1), ..., (s_k, t_k) of vertices. The problem is to determine if there exists a collection of vertex disjoint paths between the pairs of vertices (from each s_i to each t_i). Show that this problem is NP-hard by a reduction from the 3SAT problem.
• HINT: Construct one pair (s_i, t_i) for each variable x_i in your formula F. Intuitively there will be two possible paths between s_i and t_i depending on whether x_i is true or false. There will be a component/subgraph D_j of G for each clause C_j in F. There will be three possible paths between the (s_i, t_i) pairs for each D_j. You want it to be possible to route any two of these paths (but not all three) through D_j.
• Friday March 20
• The input to the triangle problem is a subset W of the Cartesian product X x Y x Z of sets X, Y, and Z, each of cardinality n. The problem is to determine if there is a subset U of W such that 1) every element of X is in exactly one element of U, 2) every element of Y is in exactly one element of U, and 3) every element of Z is in exactly one element of U. Here's a story version of the same problem. You have disjoint collections of n pilots, n copilots, and n flight engineers. For each possible triple of pilot, copilot, and flight engineer, you know if these three people are compatible or not. Your goal is to determine if you can assign these 3n people to n flights so that every flight has one pilot, one copilot, and one flight engineer that are compatible. Show that this problem is NP-hard using a reduction from 3SAT.
• Hint: Consider a cyclic collection of an even number of triangles, where consecutive triangles in this cycle share a single common element. These shared common elements are alternately X and Y. So to cover all these X and Y elements, you either need to pick all the odd triangles in the cycle or all the even triangles in the cycle. Now as a warmup, assume that you are reducing from the problem of deciding whether there is a truth assignment that makes exactly one literal per clause true, and that you know that the number of occurrences of each literal x is equal to the number of occurrences of the literal not x. Once you see this, you can try to figure out how to modify the construction to fix the issues that more than one literal per clause may be true, and that the number of occurrences of a literal and its negation may not be the same.
• Prove that the following problem is NP-hard by reduction from 3SAT. The input consists of a finite set S and a collection C of subsets of S. The problem is to determine if there is a partition of S into two subsets S_1 and S_2 such that no set D in C is entirely contained in either S_1 or S_2. No hints this time.
• Monday March 23
• 35-5 parts a, b and d
• 35-7
• Wednesday March 25
• 35.2-4 Use the minimum bottleneck spanning tree as your lower bound for the optimal bottleneck tour. Show using an exchange argument that Kruskal's algorithm computes the optimal minimum bottleneck spanning tree.
• Prove that if there is a polynomial time approximation algorithm for the maximum clique problem that has approximation ratio 1000 then there is a polynomial time approximation algorithm with approximation ratio 1.000000001. This is actually a slightly easier problem than problem 35-2 part b in the book, which I suggest that you look at for inspiration. Note that in some sense this can be viewed as a gap reduction.
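For the bottleneck spanning tree lower bound in 35.2-4 above, a minimal sketch of Kruskal's algorithm with a union-find structure is given here for reference. The function name, the (weight, u, v) edge format, and the vertex numbering 0..n-1 are assumptions made for this illustration. The code only runs the algorithm and reports the largest edge weight it keeps, and it does not replace the exchange argument the problem asks for.

```python
def kruskal_bottleneck(n, edges):
    """Run Kruskal on an undirected graph with vertices 0..n-1.
    edges is a list of (weight, u, v). Returns (tree_edges, bottleneck),
    where bottleneck is the largest weight among the kept edges."""
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the component root, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:            # edge joins two different components
            parent[ru] = rv
            tree.append((w, u, v))
    if len(tree) != n - 1:
        raise ValueError("graph is not connected")
    return tree, max((w for w, _, _ in tree), default=0)

edges = [(1, 0, 1), (2, 1, 2), (5, 2, 3), (4, 0, 3), (3, 0, 2)]
print(kruskal_bottleneck(4, edges))
```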
• Friday March 27
• 35-3. Use a feasible solution (defined using the greedy algorithm) to the dual of the obvious linear program as your lower bound.
• Consider the following problem. The input is a graph G = (V, E). Feasible solutions are subsets S of the vertices V. The objective is to maximize the number of edges with one endpoint in S and one endpoint in V-S.
• Give a simple polynomial-time randomized algorithm for this problem and show that it is 2-approximate. Hint: Flip a coin for each vertex and consider the analysis for MAX2SAT from class.
• Develop a deterministic polynomial-time 2-approximation algorithm for this problem using the method of conditional expectations, which considers the vertices one by one, but  instead of flipping a coin for each vertex v, puts v in the bipartition that would maximize the expected number of edges in the cut if coin flips were used for the remaining vertices. Give a simple greedy algorithm that ends up implementing this policy. Prove that this algorithm has approximation ratio at most 2.
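The two algorithms described above can be sketched as follows. This is one possible reading, not a reference solution: it assumes a simple graph (no self-loops) on vertices 0..n-1 given as a list of edge pairs, and the function names are hypothetical. The greedy rule places each vertex on the side opposite the majority of its already placed neighbors, which is one way to realize the conditional-expectation choice, since edges to unplaced vertices are cut with probability 1/2 whichever side is chosen.

```python
import random

def cut_size(edges, side):
    """Number of edges with endpoints on different sides."""
    return sum(1 for u, v in edges if side[u] != side[v])

def random_cut(n, rng):
    """Independent fair coin flip per vertex; True means the vertex is in S."""
    return [rng.random() < 0.5 for _ in range(n)]

def greedy_cut(n, edges):
    """Place vertices one by one, each opposite the majority of its
    already placed neighbors (ties go to S)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    side = [None] * n
    for v in range(n):
        in_s = sum(1 for u in adj[v] if side[u] is True)
        out_s = sum(1 for u in adj[v] if side[u] is False)
        # Joining S cuts out_s edges; staying out of S cuts in_s edges.
        side[v] = out_s >= in_s
    return side

edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(cut_size(edges, greedy_cut(4, edges)))
print(cut_size(edges, random_cut(4, random.Random(0))))
```

Each placement cuts at least half of the edges between the new vertex and the vertices placed before it, so the greedy cut contains at least half of all edges.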
• Monday March 30 Email write-ups to Mike Nugent (mpn1@pitt.edu) by noon
• Problem 17.3-6. You must use a potential function analysis to prove O(1) amortized time. Hint: The potential function for dynamic tables will be useful.
• 17-3
• Wednesday April 1  Email write-ups to Mike Nugent (mpn1@pitt.edu) by noon
• Assume that you have a collection of n boxes arriving online over time that must be loaded onto m trucks. When a box arrives, the online algorithm learns the weight of the box, and a list of trucks that that box can be loaded on. So not every box is allowed to be loaded on every truck. At the time that a box arrives, the online algorithm must pick a truck to load the box on. The objective is to minimize the weight of the most heavily loaded truck. Give an adversarial argument to show no deterministic online algorithm can achieve approximation ratio O(1).  Hint: In your adversarial strategy, later arriving boxes should be made only assignable to trucks that the online algorithm assigned boxes to earlier.
• Consider the paging problem. Consider the following randomized online algorithm.
• ALGORITHM DESCRIPTION: Each page P has an associated bit: FRESH or STALE. If the requested page P is in fast memory, then P's associated bit is set to FRESH. If the requested page P is not in fast memory, then a STALE page is selected uniformly at random from the STALE pages in fast memory and ejected, and P's associated bit is set to FRESH. If the requested page P is not in fast memory, and all pages in fast memory are FRESH, then make all pages in fast memory STALE, select a STALE page uniformly at random from the STALE pages in fast memory to evict, and set P's associated bit to FRESH.
• Show that this algorithm is O(log k) competitive/approximate using the following strategy (recall k is the size of the fast memory). Partition the input sequence into consecutive subsequences/phases where there are exactly k distinct pages requested in each subsequence/phase. The phase breaks are when all pages in fast memory are made STALE. Let m_i be the number of pages requested in phase i that were not requested in phase i-1.
• Show that the optimal number of page faults is Omega(sum_i m_i)
• Show that the expected number of page faults for the randomized algorithm on the page requests in phase i is O(m_i log k)
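The algorithm description above can be turned into a short simulation for experimenting with request sequences. This is a sketch under two assumptions that the description does not state: fast memory starts empty, and a fault that occurs while a slot is still free loads the page without evicting anything. The function name and the `rng` parameter (a `random.Random` instance) are hypothetical.

```python
import random

def fresh_stale_faults(requests, k, rng):
    """Count page faults of the FRESH/STALE algorithm with fast memory of size k."""
    fresh = {}      # page in fast memory -> True if FRESH, False if STALE
    faults = 0
    for page in requests:
        if page in fresh:
            fresh[page] = True              # hit: mark FRESH
            continue
        faults += 1
        if len(fresh) >= k:                 # memory full: must evict
            stale = [q for q, is_fresh in fresh.items() if not is_fresh]
            if not stale:                   # all FRESH: a new phase begins
                for q in fresh:
                    fresh[q] = False
                stale = list(fresh)
            victim = rng.choice(stale)      # uniform over STALE pages
            del fresh[victim]
        fresh[page] = True                  # load the page and mark FRESH
    return faults

requests = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3]
print(fresh_stale_faults(requests, 3, random.Random(1)))
```

Averaging the fault count over many seeds on a fixed request sequence gives an empirical estimate that can be compared against the sum of the m_i for that sequence.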
• Friday April 3 Email write-ups to Mike Nugent (mpn1@pitt.edu) by noon
• Consider an online or approximation problem where there are only finitely many possible algorithms and finitely many possible inputs. We generalize Yao's technique to approximation ratios. The correct answer is "yes" to three of the following four questions, and the correct answer is "no" for the remaining question. Identify the three questions where the answer is yes, and give a proof that the answer is yes. For extra credit, prove that the correct answer is no for the remaining question.
• Assume that the problem is a minimization problem
• Assume that you have an input distribution I, such that for all deterministic algorithms A it is the case that E[A(I)]/E[Opt(I)] > c. Can you logically conclude that the expected competitive ratio for every randomized algorithm is at least c?
• Assume that you have an input distribution I, such that for all deterministic algorithms A it is the case that E[A(I)/Opt(I)] > c. Can you logically conclude that the expected competitive ratio for every randomized algorithm is at least c?
• Assume that the problem is a maximization problem
• Assume that you have an input distribution I, such that for all deterministic algorithms A it is the case that E[Opt(I)]/E[A(I)] > c. Can you logically conclude that the expected competitive ratio for every randomized algorithm is at least c?
• Assume that you have an input distribution I, such that for all deterministic algorithms A it is the case that E[Opt(I)/A(I)] > c. Can you logically conclude that the expected competitive ratio for every randomized algorithm is at least c?
• Monday April 6
• Use a correct generalization of Yao's technique to show that the expected competitive ratio for every randomized paging algorithm is Omega(log k). Hint: Assume that the number of pages is one more than the size of fast memory, and use the most obvious input distribution.
• Monday April 13 (This is the last homework problem for the semester!)
• Consider the following online problem. There are two taxis on a line that initially start at the origin. At positive integer time t, a request point h_t on the line arrives. In response, each taxi can move to a different location on the line, or stay put at the current point. The path traveled by at least one of the two taxis must cross h_t. The objective is to minimize the total movement of the taxis.
• As a warmup show that if there is a c-competitive algorithm A for this problem, then there is a c-competitive algorithm B that only moves one taxi in response to each request, and that one taxi moves directly from its position to the request.
• Give an adversarial strategy to show that the competitive ratio of every deterministic algorithm is at least 2. Hint:  Come up with a request sequence that makes it hard to decide if one of the taxis should move.
• Consider the following algorithm A. If both taxis are to the left of h_t, then the rightmost taxi moves to h_t. If both taxis are to the right of h_t, then the leftmost taxi moves to h_t. If h_t is between the two taxis, then both taxis move toward h_t at the same rate until one of the taxis reaches h_t, at which point both taxis stop moving. Show that this algorithm is 2-competitive using the following potential function: Phi = (the distance between the leftmost taxi for A and the leftmost taxi for optimal) + (the distance between the rightmost taxi for A and the rightmost taxi for optimal) + (the distance between the leftmost and the rightmost taxis for A). So you need to show that for each request, the cost to A + the change in the potential Phi is at most 2 times the cost to optimal.
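Algorithm A above can be simulated in a few lines, which is handy for checking the potential function inequality on small request sequences by hand. The sketch assumes integer request points, and treats a request that lands exactly on a taxi as served at zero cost. The function name is hypothetical.

```python
def algorithm_a(requests):
    """Simulate algorithm A for two taxis starting at the origin.
    Returns (total movement, (left position, right position))."""
    left = right = 0
    cost = 0
    for h in requests:
        if h <= left:
            # Both taxis are at or to the right of h: the leftmost taxi moves.
            cost += left - h
            left = h
        elif h >= right:
            # Both taxis are at or to the left of h: the rightmost taxi moves.
            cost += h - right
            right = h
        else:
            # h is strictly between the taxis: both move toward h at the
            # same rate until the nearer one arrives.
            d = min(h - left, right - h)
            cost += 2 * d
            left += d
            right -= d
    return cost, (left, right)

print(algorithm_a([3, 1]))
```

On the requests 3 and then 1, the right taxi first moves to 3 at cost 3, then both taxis move one unit toward 1, so the total cost is 5 and the taxis end at positions 1 and 2.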