Cryptography: "Its all about the definitions." Setting: Alice wants to send a message to Bob. Carol can observe the message. But Alice and Bob want the coummication to be secure in that Carol shouldn't learn anything about the message. Alice uses an encryption algorithm E and Bob uses a decryption algorithm D, where if E(m, k) = s then D(s, k) = m. Or equivalently D(E(m, k), k) = m. Settings: PRIVATE KEY CRYPTO WITH LARGE KEYS Sender and receiver share a secret key k that is as large as the message. PRIVATE KEY CRYPTO WITH SMALL KEYS Sender and receiver share a secret key k that has less bits than as the message. PUBLIC KEY CRYPTO Sender and receiver share no secret key Diffie Hellman (1970's) Informal discussion of definition of security: What is wrong with the definition that Carol can't decode any message? What is wrong with the definition that there is exists some message that Carol can't decode? What is wrong with what the book calls computational security, that Carol can't guess any bit with probability more than 1/2 plus negligible? Very Informal Definition of semantic security: Secure means that a computationally bounded eaves dropper Carol can not learn/compute anything about the message that she could learn/compute without the message *************************************************************************************************************************************************************************************** PRIVATE KEY CRYPTO WITH LARGE KEYS Sender and receiver share a secret key k that is as large as the message. Theorem: Bitwise XOR is perfectly secret if the key is at least of size the message Definition (Shannon 1940's): Perfect Secrecy: Let (E, D) be encryption and decryption algorithms for messages of length m and a key of length n. (E,D) is perfectly secret iff for a uniform distribution on the possible keys, for every pair of messages x, and y, the distribution of E(x,k), is equivalent to the distribution of E(y,k) That is for all s, Prob_k[E(x, k) = s ] = Prob_k[ E[y, k]=s] Definition: Reasonable formal defintion of semantic security in the context of privat key crypto: For polynomial time computable functions f For all polynomial time algorithms C that Carol might use there exists a randomized polynomial time algorithm D such that Prob_{k,C}[C(E(m,k)) = f(m)] <= Prob[D(1^n) = f(m)] + negligible Question: What is negligible? Answer: What naturally arises mathematically is smaller than 1/p(n) for some/any polynomial p(n). Intuition: So Carol's goal is to compute f(m). To whatever extent Carol can compute f(m) from the encrypted message, there is another poly time algorithm D that can compute f(m) with almost the same probability only knowing the length of the message, and not the encoding. So intuitivley the encrypted message didn't provide Carol with any advantage in computing f(m). Note the probability Prob_{k,C}[C(E(m,k)) = f(m)] is over both the random selection of the key k and the randomness internal to Carol's algorithm. Obvious Theorem: Perfect secrecy implies semantic security *************************************************************************************************************************************************************************************** PUBLIC KEY CRYPTO Sender and receiver share no secret key Diffie Hellman (1970's) Question: What is wrong with the following proof? 
***************************************************************************************************************************************************************************************

PUBLIC KEY CRYPTO
Sender and receiver share no secret key. Diffie-Hellman (1970's).

Question: What is wrong with the following proof?

Theorem: Public key cryptography is impossible.
Proof: Whatever the sender sends to the receiver, the receiver can understand iff the eavesdropper can understand, since by assumption they have identical information.

Answer: The proof is correct assuming the protocol consists of only one message from the sender to the receiver. Public key cryptographic protocols require:
1. The receiver does some computation (and keeps the result secret; this is what breaks the symmetry with the eavesdropper).
2. The receiver sends a key k to the sender.
3. The sender uses key k to encrypt the message m and sends the encoded message s = E(m, k). Note E and k are publicly known.
4. The receiver determines the message m from s and the key k: D(E(m, k), k) = m.

***************************************************************************************************************************************************************************************

SUMMARY OF RESULTS:

secure public key crypto => existence of one way functions/permutations => existence of pseudo-random generators => computationally secure private key crypto with small keys
existence of pseudo-random generators => BPP is a subset of sub-exponential time
existence of one way functions/permutations => P not equal to NP
existence of one way functions/permutations => bit commitment protocol

___________________

Definition, one way function: F is a one way function if:
* F can be computed by a deterministic polynomial time machine.
* F^{-1} is a function. So generally the output can't be smaller than the input, and without any real loss of generality you can think of F: {0,1}^n -> {0,1}^n as a permutation.
* For every randomized polynomial time machine A, Prob_x[A(F(x)) = x] is negligible, where x is a uniformly random input.

Question: Why not use "= 0" or "< 1" instead of negligible?

Question: Can you give an example of a function that might plausibly be one way?
Answer: Multiplication. The inverse is factoring. The security of the RSA public key cryptosystem is based on the hardness of factoring (although there is no formal proof that breaking RSA requires factoring).

___________________

Theorem: secure public key crypto => existence of one way functions/permutations
Proof: The contrapositive is obvious.

___________________

Theorem: existence of one way functions/permutations => P not equal to NP
Proof: The contrapositive is obvious.

___________________

Definition, pseudo-random generator: A pseudo-random generator is an efficiently, deterministically computable function G that maps an n bit string to an n^c bit string such that for every probabilistic polynomial time algorithm C, the probability that C does something different on a uniformly random n^c bit string than on G(x), where x is a uniformly random n bit string, is negligible. That is,

    | Prob_x[C(G(x)) = 1] - Prob_{C,y}[C(y) = 1] |  is negligible,

where x is a uniformly random n bit string and y is a uniformly random n^c bit string. Note that the randomness in Prob_x[C(G(x)) = 1] is over x.
(Why can't G be a randomized algorithm?)

___________________

Theorem: existence of pseudo-random generators => computationally secure private key crypto with small keys
Proof: We construct a protocol:
Encryption E = apply G to the small truly random key and then XOR the result with the message to produce the sent message.
Decryption D = apply G to the small truly random key and then XOR the result with the received message.
Now assume that this protocol is not secure. Then there is something that distinguishes the output of G from truly random bits, namely being able to determine m, which contradicts the pseudo-randomness of G.
End proof.
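To illustrate the construction in this proof, here is a small Python sketch (the names G, encrypt, and decrypt are mine). SHAKE-256 is used only as a stand-in for the pseudo-random generator G; it is a heuristic, not a provable PRG, and the point is only the structure E(m, k) = G(k) XOR m.

    import hashlib
    import secrets

    def G(seed: bytes, out_len: int) -> bytes:
        # Stand-in for the pseudo-random generator: stretch a short seed to out_len bytes.
        # SHAKE-256 is only a heuristic substitute for a provable PRG.
        return hashlib.shake_256(seed).digest(out_len)

    def encrypt(message: bytes, key: bytes) -> bytes:
        # E: expand the short key with G, then XOR the result with the message.
        pad = G(key, len(message))
        return bytes(m ^ p for m, p in zip(message, pad))

    decrypt = encrypt  # D is the same operation: XORing with G(key) again undoes E.

    key = secrets.token_bytes(16)   # key is much shorter than the message
    message = b"the quick brown fox jumps over the lazy dog"
    s = encrypt(message, key)
    assert decrypt(s, key) == message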
Question: What does secrecy mean here?
Answer: For all but a negligible fraction of messages, Carol's chance of decoding is negligible.

___________________

Theorem: existence of one way functions/permutations => good pseudo-random generators
Proof: Too hard, so we are just going to get some intuition here.
Let f be a one way permutation. Then there is a generator G extending 2n random bits to 2n+1 pseudo-random bits that is unpredictable.
Let G(x, r) = (f(x), r, x@r), where x@r = sum_{i=1}^n (x_i AND r_i) mod 2. So G is a function from {0,1}^{2n} to {0,1}^{2n+1}.
Question: Why aren't the first 2n bits predictable?
So let us consider the last bit.
Assume, to reach a contradiction, that there is an algorithm A such that Prob[A(f(x), r) = x@r] > 1/2 + epsilon.
We now want to show how to find a polynomial time algorithm B that inverts f with non-negligible probability, contradicting the assumption that f is one way.
By an averaging argument, there is at least an epsilon/2 fraction of the inputs x (call them good inputs) on which A correctly computes x@r with probability at least 1/2 + epsilon/2 over the choice of r. We show how to invert f on these good inputs.
Note that because x@r is linear in r, if A were always right it would be sufficient to run A with each of the n possible unit vectors as r to recover x.
Algorithm B: Run A on inputs f(x) and a bunch of randomly chosen r's. Appeal to linear algebra and probability to argue that this gives you enough information to recover x.
End proof.

___________________

Theorem: existence of one way functions/permutations => bit commitment protocol
Proof:

BIT COMMITMENT PROTOCOL

Parties A and B want to flip a fair coin over the phone. Suggested protocols?

Theorem: A one way permutation f implies a bit commitment protocol.
Proof:
A: generates n bit random strings x and r. Sends B 2n bits: f(x), r
B: sends A one random bit b
A: sends B the value of x
They agree that the random bit is b XOR (sum_{i=1}^n x_i r_i mod 2).
Question: How can B verify that A sent the committed x in the third message?
Answer: Check that f of it equals what was sent in the first message.
So A is committed to the bit (sum_{i=1}^n x_i r_i mod 2) after the first message.
Question: Why can't B cheat?
Answer: We've seen that (sum_{i=1}^n x_i r_i mod 2) can't be predicted from f(x) and r, so B cannot choose b in a way that biases the resulting coin.
End proof.

___________________

Theorem: existence of pseudo-random generators => BPP is a subset of sub-exponential time
Proof: Assume a pseudo-random generator G exists. Let M be a BPP algorithm. We give a subexponential time algorithm A. But first we give the following algorithm B:
Get random bits r as input.
Create many pseudo-random bits G(r) by applying G to r.
Run M using G(r) as the random bits.
By the definition of a pseudo-random generator, B has to have behaviour very close (probabilistically) to M.
We then show how to construct algorithm A from algorithm B:
Algorithm A: For each of the 2^{|r|} choices for r, see what algorithm B does with these random bits, and output the majority answer. Since the seed r is much shorter than the pseudo-random string that M uses, enumerating all 2^{|r|} seeds takes subexponential time.
End proof.

Current thinking: P probably equals BPP, because pseudo-random generators that increase the number of random bits exponentially probably exist.
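A minimal sketch of algorithm A in Python, under the assumption that the BPP algorithm M and the generator G are handed to us as black boxes (derandomize, toy_G, and toy_M are hypothetical names of mine):

    from itertools import product

    def derandomize(M, G, seed_bits, x):
        # Run M(x, randomness) with G(seed) in place of true random bits, for every
        # possible seed, and output the majority answer.  If G fools M, the majority
        # agrees with M's usual answer; enumerating the seeds is what costs
        # exponential time in the (short) seed length.
        accepts = 0
        for seed in product((0, 1), repeat=seed_bits):
            accepts += 1 if M(x, G(seed)) else 0
        return 2 * accepts > 2 ** seed_bits

    # Toy usage with placeholder M and G (not a real BPP algorithm or PRG):
    toy_G = lambda seed: list(seed) * 4
    toy_M = lambda x, r: (x + sum(r)) % 2 == 0
    print(derandomize(toy_M, toy_G, seed_bits=8, x=3))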
***************************************************************************************************************************************************************************************

ZERO KNOWLEDGE PROTOCOLS

Assume you want to have some protocol that allows people to prove their identity to an ATM machine. The problem with a PIN number is that if the ATM machine is corrupted, you have given away your PIN to the machine, and the bad guys could then take money out of your account. So you want a protocol that will convince an honest ATM machine that you are you, but won't give away any information about the PIN (since you don't know whether the ATM is honest or not).

Question: Does this seem possible?

Warm-up: Public key digital signature of a random nonce.
Problem: Not provably secure, particularly against chosen text attacks.

Protocol for graph isomorphism:
Public input: graphs G_0 and G_1
Private input for the prover: a permutation pi such that G_1 = pi(G_0)
Note that the prover here is the ATM user. The verifier is the ATM machine. The graphs G_0 and G_1 are the claimed identity of the ATM user. The permutation pi is the private PIN/information that the prover needs to convince the ATM machine that it knows.

Protocol:
The prover generates a random permutation sigma.
The first message, from the prover, is H = sigma(G_1).
Implicitly the prover asks whether the verifier would like the prover to prove that H is really isomorphic to G_1, or whether the verifier would like the prover to give evidence that H is isomorphic to G_0.
The verifier flips a fair coin with outcome b and sends b to the prover.
If b = 1 then the verifier is asking the prover to prove that H is isomorphic to G_1.
Question: What should the prover return if it gets b = 1?
Answer: If b = 1 then the prover sends sigma to the verifier. The verifier can check that sigma is a permutation and that H = sigma(G_1).
Question: What should the prover return if it gets b = 0?
Answer: If b = 0 then the prover sends the permutation tau = sigma composed with pi (that is, first apply pi and then apply sigma).
Note that since sigma is a random permutation, so is tau.
Note that tau has the property that tau(G_0) = sigma(pi(G_0)) = sigma(G_1) = H.
This proves that G_1 and G_0 are isomorphic, provided the prover wasn't lying about H being a permutation of G_1.

Error analysis: If the graphs are isomorphic, then the prover can follow the protocol and the verifier will always accept. If the graphs are not isomorphic and the prover is trying to cheat, it either needs to lie about H being isomorphic to G_1 (in which case it gets caught with probability 1/2, when the verifier flips b = 1) or it can't supply a tau that will cause the verifier to accept (when b = 0). So in any case a cheating prover is caught with probability at least 1/2. This can be raised to 1 - 1/exponential by repeating the protocol in parallel polynomially many times.
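Here is a small Python sketch of one honest round of the protocol, with graphs encoded as sets of edges over vertices 0..n-1 and permutations as tuples; the helper names (apply_perm, compose, one_round) are mine, not part of the notes.

    import random

    def apply_perm(perm, graph):
        # Relabel each edge {u, v} of the graph as {perm[u], perm[v]}.
        return frozenset(frozenset(perm[v] for v in edge) for edge in graph)

    def compose(outer, inner):
        # (outer o inner)(v) = outer[inner[v]]: apply inner first, then outer.
        return tuple(outer[inner[v]] for v in range(len(outer)))

    def one_round(G0, G1, pi, n):
        # Honest prover knows pi with G1 = pi(G0).  Returns True iff the verifier accepts.
        sigma = list(range(n))
        random.shuffle(sigma)
        sigma = tuple(sigma)
        H = apply_perm(sigma, G1)               # prover's first message
        b = random.randrange(2)                 # verifier's fair coin
        if b == 1:
            return apply_perm(sigma, G1) == H   # prover reveals sigma
        tau = compose(sigma, pi)                # tau = sigma o pi, so tau(G0) = H
        return apply_perm(tau, G0) == H

    # Toy instance: G0 is the path 0-1-2 and G1 is a relabelling of it.
    n = 3
    G0 = frozenset({frozenset({0, 1}), frozenset({1, 2})})
    pi = (2, 0, 1)
    G1 = apply_perm(pi, G0)
    assert all(one_round(G0, G1, pi, n) for _ in range(20))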
Question: Now we want to argue that the verifier didn't get any information from this interaction, beyond confidence that the graphs are isomorphic. How can we formalize this?
Answer: There are several reasonable ways to do this. Here is one way:

Definition of perfect zero knowledge: A protocol P is perfect zero knowledge if there exists an efficient probabilistic algorithm S whose output, on a public input x in the language, has exactly the same distribution as the verifier's view of the protocol when the public input is x.

Intuition: Since there was already an efficient algorithm that computes everything the verifier sees in the protocol, the verifier (or an eavesdropper) could have learned just as much by ignoring the protocol.

Theorem: The protocol for graph isomorphism is perfect zero knowledge.

Pretty convincing informal proof: All the verifier sees are random permutations of the graph and the resulting graph when the permutation is applied. The verifier can generate that information by itself.

Formal proof: We need to produce the algorithm S.
The key insight is that if G_0 and G_1 are isomorphic, then a random permutation applied to G_0 and a random permutation applied to G_1 produce identically distributed graphs.
S flips a fair coin to get a bit c.
S generates a random permutation rho.
S simulates the verifier on input H = rho(G_c) to get a bit b.
If b = c = 1 (which happens with probability 1/4) then:
The simulated verifier initially saw a random permutation of G_1 and then asked to see that permutation (this happens with probability 1/2 to the real verifier).
Since c = 1, S knows this random permutation rho, and thus can give it to the verifier and continue the simulation.
Else if b = c = 0 (which happens with probability 1/4) then:
The simulated verifier initially saw a random permutation H of G_1 and then asked to see a permutation tau such that tau(G_0) = H (this happens with probability 1/2 to the real verifier).
Question: Are the H's that the verifier sees and the H's that S generates identically distributed?
Answer: Yes, both are random permutations of G_1, since G_0 and G_1 are isomorphic.
Question: What is tau in this case?
Answer: rho.
Hence S simulates the verifier with next message rho.
Else (b not equal to c) S throws the attempt away and tries again.
The try-again step doubles the probability of each of the cases b = c = 1 and b = c = 0, from 1/4 to 1/2, which matches the distribution in the real protocol.
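A sketch of the simulator S in the same toy encoding as above (simulate and honest_verifier are my names; verifier stands for any strategy mapping the first message H to a challenge bit):

    import random

    def apply_perm(perm, graph):
        # Relabel each edge {u, v} as {perm[u], perm[v]} (same helper as in the earlier sketch).
        return frozenset(frozenset(perm[v] for v in edge) for edge in graph)

    def simulate(G0, G1, n, verifier):
        # Produce a transcript (H, b, answer) distributed like the real verifier's view,
        # without knowing any isomorphism between G0 and G1.
        while True:
            c = random.randrange(2)            # guess which challenge will come
            rho = list(range(n))
            random.shuffle(rho)
            rho = tuple(rho)
            H = apply_perm(rho, G1 if c == 1 else G0)   # H = rho(G_c)
            b = verifier(H)
            if b == c:
                return H, b, rho               # rho itself answers challenge b
            # Wrong guess: discard and retry; each attempt succeeds with probability 1/2.

    honest_verifier = lambda H: random.randrange(2)     # the honest verifier is a fair coin

    # Toy check: the returned answer passes the verifier's test for challenge b.
    n = 3
    G0 = frozenset({frozenset({0, 1}), frozenset({1, 2})})
    G1 = apply_perm((2, 0, 1), G0)
    H, b, answer = simulate(G0, G1, n, honest_verifier)
    assert apply_perm(answer, G1 if b == 1 else G0) == H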