Approximation Algorithms:

Question: What kind of algorithm does an NP-hardness proof for an optimization problem presumably rule out?
Answer: One that both
* runs in poly time on all inputs
* is optimal on all inputs

Approximation algorithms give up on optimality, but try to give up on optimality as little as possible.

Definition: The input for (one variation of) the TSP problem is a complete graph with positive edge weights. The problem is to find the simple spanning cycle of minimum aggregate weight.

Definition: The input for the MAXSAT problem is a collection of clauses, where each clause is a disjunction of literals, and every literal is a variable or the negation of a variable. The problem is to find a truth assignment to the variables that satisfies as many clauses as possible.

Question: Are the following approximation guarantees plausibly achievable by a poly-time algorithm A?
TSP: A(I) <= Opt(I) + 1
MAXSAT: A(I) >= Opt(I) - 1
Here Opt(I) is the objective value of the optimal solution.
Answer: No. (For TSP, for instance, scaling all edge weights by a large factor would turn such an additive guarantee into an exact poly-time algorithm for an NP-hard problem.)

Thus the standard measure of the goodness of an algorithm is worst-case relative error.

Definition: The approximation ratio for a maximization problem is the benefit of the optimal solution divided by the benefit of the solution produced by the algorithm. More precisely, the approximation ratio of A is:
for a minimization problem, max_I A(I)/Opt(I)
for a maximization problem, max_I Opt(I)/A(I)

Definition that No One Uses: Approximation threshold for A = |value of A - Opt| / max(Opt, value of A). (Note 0 is good here and 1 is bad, and this works for both minimization and maximization problems.) An algorithm is said to be an epsilon-approximation algorithm if its approximation threshold is less than epsilon.

Definition: An algorithm is a (fully) poly-time approximation scheme if for each epsilon you can get within a factor 1+epsilon of optimal (or equivalently, achieve an approximation threshold less than epsilon) with an algorithm that runs in time polynomial in the input size (and, for "fully", polynomial in 1/epsilon).

Theorem: There is a polynomial-time 2-approximation for MAXSAT.
Proof: Flip a fair independent coin for each variable. A clause with k literals is then satisfied with probability 1 - 2^{-k} >= 1/2, so the expected number of satisfied clauses is at least half of all clauses, and hence at least Opt/2. This can be derandomized with the method of conditional expectations. (A code sketch appears at the end of this part.)

Theorem: There is no polynomial-time 2-approximation algorithm for (the above version of) TSP, unless P = NP.
Proof: Gap reduction from Hamiltonian Cycle.

Definition: A gap reduction from HC to TSP proving a factor-c lower bound on approximation maps
Yes instances of HC to instances of TSP where Opt <= k, and
No instances of HC to instances of TSP where Opt > c*k.
(For example, give the edges of the HC instance weight 1 and the non-edges weight c*n: a Hamiltonian cycle yields a tour of weight k = n, while without one every tour costs more than c*n. A c-approximation algorithm would then distinguish the two cases.)

Theorem folks wanted to prove around 1980: There is no poly-time 1.01-approximation algorithm for Max3SAT, or equivalently, it is NP-hard to approximate Max3SAT within a factor of 1.01.

Proof attempt: Gap reduction from the NP-complete language L = { (M, I, 0^{|I|^k}) : M is an n^k-time NP machine that accepts I }.
We know how to construct a formula F in CNF form such that all clauses of F are simultaneously satisfiable iff M accepts I. But what we need is a formula G in CNF form such that:
If M accepts I, then all clauses of G are satisfiable.
If M doesn't accept I, then it is not possible to satisfy more than 98% of the clauses of G.
Then we could use a 1.01-approximation algorithm for Max3SAT to determine whether M accepts I.
Unfortunately, using F for G doesn't work: even if M doesn't accept I, it is possible to satisfy all but one clause of F.
In 1980, nobody knew how to build such a formula G.

Intuition from 1980: Probably you can't build such a G. Turing machine computation is too fragile; just one fault can cause the computation to be wrong.
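To make the MAXSAT 2-approximation and its derandomization concrete, here is a minimal Python sketch. The representation of clauses as lists of signed 1-based variable indices, and all function names, are my own assumptions, not from the notes.

```python
import random

def count_satisfied(clauses, assignment):
    """Count clauses satisfied by assignment (dict var -> bool).
    A literal l > 0 means variable l; l < 0 means its negation."""
    return sum(any(assignment[abs(l)] == (l > 0) for l in clause)
               for clause in clauses)

def random_assignment(n):
    """Flip a fair independent coin for each variable. In expectation
    at least half of all clauses are satisfied, hence at least Opt/2."""
    return {v: random.random() < 0.5 for v in range(1, n + 1)}

def expected_satisfied(clauses, partial):
    """Conditional expected number of satisfied clauses when the
    variables not in `partial` are set uniformly at random."""
    total = 0.0
    for clause in clauses:
        p_all_false = 1.0
        satisfied = False
        for l in clause:
            v = abs(l)
            if v in partial:
                if partial[v] == (l > 0):
                    satisfied = True
                    break
                # falsified literal: false with probability 1
            else:
                p_all_false *= 0.5  # unset literal: false w.p. 1/2
        total += 1.0 if satisfied else 1.0 - p_all_false
    return total

def derandomized_assignment(clauses, n):
    """Method of conditional expectations: fix variables one at a time,
    never letting the conditional expectation decrease."""
    assignment = {}
    for v in range(1, n + 1):
        best_val, best_exp = None, -1.0
        for val in (False, True):
            assignment[v] = val
            exp = expected_satisfied(clauses, assignment)
            if exp > best_exp:
                best_val, best_exp = val, exp
        assignment[v] = best_val
    return assignment
```

Since the conditional expectation never decreases, derandomized_assignment ends with a deterministic assignment satisfying at least half of all clauses, and so at least Opt/2.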
What one needs is a more fault-tolerant model of computation that characterizes NP. But in 1980 nobody knew how to define NP using fault-tolerant computation.

TURN TIME FORWARD A BIT IN THE 1980'S TO INTERACTIVE PROOF RESEARCH

This explains why PCP developed out of the interactive proofs line of research.

Question: What does the prover have to write down ahead of time in a book so that the verifier can just look things up in the book instead of interacting? Assume the verifier uses r random bits and the prover in total never sends more than q bits.
Answer: It is sufficient to write down the at most q bits the prover would send for each of the 2^r choices of coin flips. So the book has to be at most q*2^r bits long.

Informal Definition: This book is a probabilistically checkable proof.

Definition: An (r, q)-restricted verifier is a randomized poly-time machine that can use only r random bits and look at only q bits of the certificate/book. If the input is in the language, then (for some book) the verifier accepts with probability 1; if the input is not in the language, then (for every book) the verifier rejects with probability at least 1/2.

Example: a (poly, poly)-restricted verifier for graph non-isomorphism.

Definition: PCP(r, q) is the class of languages that have an (r, q)-restricted verifier.

Question: IP = PCP(?, ?)?
Answer: IP = PCP(poly, poly). Note the certificate/book is exponentially long in this case.

Question: Why might PCP classes be related to approximation?
Answer: They are a model of computation that is fault tolerant.

Theorem 13.12 (circa 1990): NP = PCP(O(log n), O(1)).

Question: Which direction is trivial?
Answer: An (O(log n), O(1)) verifier implies membership in NP, since the NP machine could guess the polynomially many bits of the book that the verifier might look at (O(1) bits for each of the 2^{O(log n)} = poly(n) random strings) and then check that the verifier accepts for every random string.

To see why this characterization of NP is useful for approximation:

Theorem 13.13: There is some delta such that it is NP-hard to approximate MAXSAT within a factor of 1+delta.

Proof: Let L be a language in NP, and let V be an (O(log n), O(1)) verifier for L. To decide whether x is in L, we create an instance phi of MAX3SAT for which "a lot" of the clauses are satisfiable if x is in L and only "few" clauses are satisfiable if x is not in L. (This is a gap reduction.) The poly-time approximation algorithm for MAX3SAT will then tell us whether x is in L.

How to create phi: If we fix r, the c log n random bits used by V, we get a deterministic computation. Let y_{i_1(r)}, ..., y_{i_d(r)} be the bits accessed in the certificate y. The output of V (given a fixed r) can be thought of as a Boolean function of these d inputs. Therefore it has a circuit with at most K = exp(d) = O(1) gates, and so the CNF form phi_r of this circuit has only a constant number of clauses, at most K. (A code sketch of this circuit-to-CNF step follows this part.) Note that all but one of the clauses of phi_r can always be satisfied, and all of the clauses of phi_r can be satisfied iff V accepts.

We now make phi the "and" of the phi_r clauses over each of the 2^{c log n} = n^c choices of r. We get Kn^c clauses. Note the phi_r's share the y_{i_j} variables.

If x is in L, there is a truth assignment that satisfies all the clauses. If x is not in L, then any truth assignment must miss at least one clause of phi_r for at least half of the groups (V rejects with probability at least 1/2), i.e., at least n^c/2 = Kn^c/(2K) clauses are unsatisfied.

Now assume you can approximate MAX3SAT within a factor of 1+delta, for delta = 1/(4K); let S(phi) be the number of clauses the algorithm satisfies.

If x is in L, then OPT = Kn^c, and
|S(phi) - OPT| / max(OPT, S(phi)) = (OPT - S(phi)) / OPT = (Kn^c - S(phi)) / (Kn^c) < 1/(4K)
(by the definition of the approximation threshold of S), or equivalently S(phi) > (1 - 1/(4K)) Kn^c.

If x is not in L, then at most a (1 - 1/(2K)) fraction of the clauses can be satisfied.
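The notes' phi_r is built from the circuit for V's decision (which is what makes "all but one clause satisfiable" literally true). The underlying point, that any Boolean function of d = O(1) bits has a CNF with O(1) clauses over those bits, can be seen with this minimal truth-table sketch; the example predicate is hypothetical and the construction differs in detail from the circuit-based one in the proof.

```python
from itertools import product

def function_to_cnf(f, d):
    """Given a Boolean function f on d inputs (d = O(1)), return an
    equivalent CNF as a list of clauses, each a list of signed 1-based
    variable indices. One clause rules out each falsifying assignment,
    so there are at most 2^d = O(1) clauses."""
    clauses = []
    for bits in product([False, True], repeat=d):
        if not f(*bits):
            # This clause is false exactly on the assignment `bits`.
            clauses.append([-(i + 1) if b else (i + 1)
                            for i, b in enumerate(bits)])
    return clauses

# Example: a (hypothetical) accept predicate on d = 3 queried bits.
accept = lambda a, b, c: (a ^ b) and c
print(function_to_cnf(accept, 3))
```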
So we say x is in L iff S satisfies more than a (1 - 1/(3K)) fraction of the clauses of phi; this threshold lies strictly between the two cases.

It is too hard to prove NP subset PCP(O(log n), O(1)) here, but to get a feel for how such a proof might go, we sketch the proof of:

Theorem: NP subset PCP(poly, O(1)).

Proof: You need a way to spread any error out through the "proof". We give a (poly, O(1))-restricted verifier for the following NP-complete language QUADEQ: systems of quadratic equations over GF(2), the field of two elements, that have a simultaneous solution. For example, an instance might be (I didn't check whether this one has a simultaneous solution):

u_1 u_3 + u_1 u_2 = 1
u_2 u_4 + u_1 u_2 + u_2 u_3 = 0
u_1 u_3 + u_1 u_4 + u_2 u_3 = 1

So you can think of the k-th equation as sum_{i,j} A_{k,i,j} u_i u_j = b_k. Let m be the number of equations and n the number of variables.

CORRECT PROOF: The correct proof/book is the Walsh-Hadamard encoding f of the n-bit solution u, followed by the Walsh-Hadamard encoding g of the n^2-bit outer product u x u.

Definition: The Walsh-Hadamard encoding of an n-bit string u is the Boolean function from n bits to 1 bit defined by f_u(x) = x@u, where @ is inner product mod 2.

Question: How many bits does it take to write down a function f: {0,1}^n -> {0,1}?
Answer: 2^n (its truth table). Although note that most 2^n-bit strings are not valid Walsh-Hadamard codewords.

Question: How many bits does it take to write down a function g: {0,1}^{n^2} -> {0,1}?
Answer: 2^{n^2}.

Random Subsum Property: If u != v, then Prob_x[u@x = v@x] = 1/2.

VERIFIER ALGORITHM (attempt 1):
1. Check that the book encodes a bit string u in the right way, that is, there is a u such that the book is the Walsh-Hadamard encoding of u followed by the Walsh-Hadamard encoding of u x u.
2. Check that u is satisfying.

Let's first focus on the second step.

Question: How do you check that the k-th equation sum_{i,j} A_{k,i,j} u_i u_j = b_k is satisfied?
Answer: Let z be the n^2-bit vector whose entries are A_{k,i,j} for each of the n^2 values of (i,j). Then check whether the bit g(z) = (u x u)@z = sum_{i,j} A_{k,i,j} u_i u_j is equal to b_k.

Question: Why can't you check all the equations?
Answer: That would take m queries, not O(1).

Question: Would just checking a random subset of the equations be OK?
Answer: No; you want to verify that all equations are satisfied, and a random subset could miss the one violated equation. Instead: flip a coin for each equation to decide whether to include it, add up (mod 2) all the included equations to get a single new quadratic equation, and check whether this new equation is satisfied, which takes one query to g. If there is even one equation unsatisfied by u, you catch it with probability 1/2, by the random subsum property. (A code sketch of this check appears below.) NOTE: This is where you use poly random bits.

Now back to checking that the book encodes u in the right way.

Question: How could the prover/book be cheating?
1. f may not be the Walsh-Hadamard encoding of any string u. Fact: Walsh-Hadamard codewords are exactly the functions f from n bits to 1 bit that are linear, that is, f(x+y) = f(x) + f(y), where + is sum mod 2 (coordinatewise on strings). So this is equivalent to saying that f may not be a linear function.
2. g may not be the Walsh-Hadamard encoding of any n^2-bit string, or equivalently, g may not be a linear function.
3. g may be a Walsh-Hadamard codeword, but not the one for u x u.

Note that 1 and 2 are basically the same problem.

Question: How do you try to falsify the claim that a function is linear?
Answer: Pick hundreds of random pairs (x, y) and verify that f(x+y) = f(x) + f(y). (This test is sketched in code below.)

Problem: This only establishes that the function is nearly linear; the definition of a near-linear function is one that is likely to pass this test. We're going to have to live with this.
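To make the encoding and the aggregated equation check concrete, here is a minimal Python sketch. The truth-table-as-dict representation and all names are my own assumptions; everything is exponential-size, which is fine since the book itself is exponential.

```python
import random
from itertools import product

def inner(x, y):
    """Inner product mod 2 of two equal-length bit tuples."""
    return sum(a & b for a, b in zip(x, y)) % 2

def wh_encode(u):
    """Walsh-Hadamard encoding of the bit tuple u: the truth table of
    f_u(x) = x@u, one bit for each x, 2^len(u) bits in total."""
    return {x: inner(x, u) for x in product((0, 1), repeat=len(u))}

def outer(u):
    """The n^2-bit flattened outer product u x u, indexed by (i,j)."""
    return tuple(ui & uj for ui in u for uj in u)

def random_subset_check(A, b, g, n):
    """One round of the aggregated equation check. A[k] is the
    flattened n^2-bit tuple of coefficients A_{k,i,j}, b[k] is the
    right-hand-side bit, and g is the book's table on n^2-bit tuples.
    Add up a random subset of the equations mod 2, then check the one
    resulting equation with a single query to g."""
    subset = [k for k in range(len(A)) if random.random() < 0.5]
    z = tuple(sum(A[k][t] for k in subset) % 2 for t in range(n * n))
    rhs = sum(b[k] for k in subset) % 2
    return g[z] == rhs
```

With an honest book, f = wh_encode(u) and g = wh_encode(outer(u)); if u satisfies all the equations, every round passes, and if u violates even one equation, a round fails with probability 1/2 by the random subsum property.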
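And a sketch of the linearity test just described (in the literature this is the Blum-Luby-Rubinfeld test, though the notes don't name it); h is again a truth table given as a dict:

```python
import random

def linearity_test(h, n, rounds=100):
    """Pick random pairs (x, y) and verify h(x+y) = h(x) + h(y), where
    + on strings is coordinatewise XOR. Passing all rounds only
    establishes that h is NEAR-linear, i.e., close to some linear
    function."""
    for _ in range(rounds):
        x = tuple(random.randrange(2) for _ in range(n))
        y = tuple(random.randrange(2) for _ in range(n))
        xy = tuple(a ^ b for a, b in zip(x, y))
        if h[xy] != (h[x] + h[y]) % 2:
            return False
    return True
```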
Question: How many linear functions are there that agree with a near-linear function in 90% of the positions?
Answer: At most 1, since by the random subsum property two distinct linear functions disagree in half of the positions, so no function can agree with two of them in 90% of the positions.

Note: We want to use the function in the book, which may only be near-linear, to compute values of the unique linear function close to it, with high probability.

Question: Given a near-linear function h and an x, how do you compute f(x) with high probability, where f is the unique linear function that is close to h? Note we want the answer to be correct whp for every fixed x.
Wrong answer: h(x), since x could be one of the few positions on which h and f disagree.
Right answer: Let x' be a random n-bit string and let x'' = x + x', or equivalently x = x' + x''. Then output h(x') + h(x''). Both x' and x'' are uniformly distributed (though not independent), so whp h agrees with f on both, and then h(x') + h(x'') = f(x') + f(x'') = f(x' + x'') = f(x). (This self-correction step is sketched in code below.)

We now deal with 3, verifying that g encodes u x u: pick random n-bit strings r and r' and check that f(r) f(r') = g(r x r').

In a correct proof this always holds, since by definition
f(r) f(r') = (sum_{i=1}^n u_i r_i)(sum_{j=1}^n u_j r'_j) = sum_{i,j} u_i u_j r_i r'_j = (u x u)@(r x r') = g(r x r').

If g is a near-linear encoding of an n^2-bit string other than u x u, then the linear function close to g must differ from the real encoding of u x u in many bits, and so an incorrect proof would probably get caught. (A sketch of this check also appears below.)

Question: What do you need to prove NP subset PCP(O(log n), O(1))?
Answer: Better codes.
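A minimal sketch of this self-correction (local decoding) step, under the same truth-table-as-dict conventions as before; the majority vote over repeated rounds is my own addition, to drive the error probability down.

```python
import random

def self_correct(h, x, rounds=9):
    """Compute f(x) whp, where f is the unique linear function close
    to the near-linear table h. Each round picks a random x' and sets
    x'' = x + x' (so x = x' + x''); x' and x'' are each uniform, so
    whp h agrees with f on both and h(x') + h(x'') = f(x)."""
    n = len(x)
    votes = []
    for _ in range(rounds):
        xp = tuple(random.randrange(2) for _ in range(n))
        xpp = tuple(a ^ b for a, b in zip(x, xp))
        votes.append((h[xp] + h[xpp]) % 2)
    # Majority vote over independent rounds.
    return max(set(votes), key=votes.count)
```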
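And a sketch of the f-versus-g consistency check. For brevity it queries the tables directly; a full verifier would route every query through self_correct, since the tables are only guaranteed to be near-linear.

```python
import random

def tensor_check(f, g, n):
    """One round of the check that g encodes u x u: pick random n-bit
    r, r' and test f(r) f(r') == g(r x r'). For an honest book,
    f(r) f(r') = (sum_i u_i r_i)(sum_j u_j r'_j) = (u x u)@(r x r')
    = g(r x r'), so the check always passes."""
    r = tuple(random.randrange(2) for _ in range(n))
    rp = tuple(random.randrange(2) for _ in range(n))
    r_tensor = tuple(a & b for a in r for b in rp)  # r x r', flattened
    return (f[r] & f[rp]) == g[r_tensor]
```

If g encodes some n^2-bit string other than u x u, a round of tensor_check fails with constant probability (applying the random subsum property once for r and once for r' gives at least 1/4), so O(1) repetitions suffice.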