Textbooks invariably seem to carry the proof that uses Markov’s inequality, moment-generating functions, and Taylor approximations. Here’s an easier way.
For $latex p,q \in (0,1)$, let $latex K(p,q)$ be the KL divergence between a coin of bias $latex p$ and one of bias $latex q$: $latex K(p,q) = p \ln \frac{p}{q} + (1-p) \ln \frac{1-p}{1-q}.$
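For concreteness, here is a minimal Python rendering of this quantity (the name `kl_bernoulli` is mine, and it assumes $latex p, q$ strictly between 0 and 1):

```python
import math

def kl_bernoulli(p: float, q: float) -> float:
    """Binary KL divergence K(p, q) in nats, for p, q strictly in (0, 1)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))
```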
Theorem: Suppose you do $latex n$ independent tosses of a coin of bias $latex p$. The probability of seeing $latex qn$ heads or more, for $latex q > p$, is at most $latex \exp(-nK(q,p))$. So is the probability of seeing $latex qn$ heads or less, for $latex q < p$.
Remark: By Pinsker’s inequality, $latex K(q,p) \geq 2(p-q)^2$.
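To get a feel for how tight this is, here is a small numerical sanity check (my own sketch, not part of the argument; it assumes Python 3.8+ for `math.comb`, and the helper names are mine) comparing the exact binomial tail with the bound $latex \exp(-nK(q,p))$ and the weaker Pinsker-style bound $latex \exp(-2n(q-p)^2)$:

```python
import math

def kl_bernoulli(p, q):
    # Binary KL divergence K(p, q), as defined above.
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def exact_upper_tail(n, p, q):
    # Exact probability of seeing qn heads or more in n tosses of a bias-p coin.
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(math.ceil(q * n), n + 1))

n, p, q = 100, 0.5, 0.6
print("exact tail   ", exact_upper_tail(n, p, q))
print("KL bound     ", math.exp(-n * kl_bernoulli(q, p)))
print("Pinsker bound", math.exp(-2 * n * (q - p) ** 2))
```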
Proof: Let’s do the $latex q > p$ case; the other is identical.
Let $latex \theta_p$ be the distribution over $latex \{0,1\}^n$ induced by a coin of bias $latex p$, and likewise $latex \theta_q$ for a coin of bias $latex q$. Let $latex S$ be the set of all sequences of $latex n$ tosses which contain $latex qn$ heads or more. We’d like to show that $latex S$ is unlikely under $latex \theta_p$.
Pick any $latex \bar{x} \in S$, with say $latex k \geq qn$ heads. Then:
[latex size="2"] \frac{\theta_q(\bar{x})}{\theta_p(\bar{x})} = \frac{q^k(1-q)^{n-k}}{p^k(1-p)^{n-k}} \geq \frac{q^{qn}(1-q)^{n-qn}}{p^{qn}(1-p)^{n-qn}} = \left( \frac{q}{p} \right)^{qn} \left( \frac{1-q}{1-p}\right)^{(1-q)n} = e^{n K(q,p)}.[/latex]
Since $latex \theta_p(\bar{x}) \leq \exp(-nK(q,p)) \theta_q(\bar{x})$ for every $latex \bar{x} \in S$, we have [latex]\theta_p(S) \leq \exp(-nK(q,p)) \theta_q(S) \leq \exp(-nK(q,p))[/latex] and we’re done.
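If it helps to see the mechanism of the proof, here is a brute-force sketch (mine, purely illustrative) that enumerates every sequence in $latex S$ for a small $latex n$, checks the per-sequence likelihood-ratio inequality above, and then confirms $latex \theta_p(S) \leq \exp(-nK(q,p)) \theta_q(S) \leq \exp(-nK(q,p))$:

```python
import math
from itertools import product

def check_proof(n=12, p=0.3, q=0.5):
    # K(q, p): KL divergence between a coin of bias q and one of bias p.
    K = q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))
    theta_p_S = theta_q_S = 0.0
    for seq in product([0, 1], repeat=n):
        k = sum(seq)
        if k < q * n:          # keep only sequences in S (qn heads or more)
            continue
        prob_p = p**k * (1 - p)**(n - k)   # theta_p of this sequence
        prob_q = q**k * (1 - q)**(n - k)   # theta_q of this sequence
        # Per-sequence step of the proof: theta_q / theta_p >= exp(n K(q, p)).
        assert prob_q / prob_p >= math.exp(n * K) * (1 - 1e-12)
        theta_p_S += prob_p
        theta_q_S += prob_q
    # Conclusion: theta_p(S) <= exp(-n K) * theta_q(S) <= exp(-n K).
    assert theta_p_S <= math.exp(-n * K) * theta_q_S <= math.exp(-n * K)
    return theta_p_S, math.exp(-n * K)

print(check_proof())
```

For $latex n = 12$ this enumerates only $latex 2^{12}$ sequences, so it runs instantly; it is meant to mirror the two steps of the proof, not to be an efficient computation.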
How do you justify the step where you replace k with qn?
Since q/p > 1 and k >= qn, (q/p)^k >= (q/p)^(qn). The other factor is similar.
The other factor is similar, but doesn’t it give the wrong direction in inequality? I get 1-q ((1-q)/(1-p))^k<=((1-q)/(1-p))^qn
Broken formatting; it should read: 1-q < 1-p implies ((1-q)/(1-p))^k <= ((1-q)/(1-p))^(qn)
You dropped the minus sign on k: the exponent is n-k, not k.
John, this is really nice!
Sanjoy, actually 🙂
Very cool! Thank you, Sanjoy. My students would be so happy to know that they don’t need to go through the lengthy proof that was given last week. 🙂
Isn’t this what they call the method of types in info theory books?
Is there a way to get a Chernoff-type bound for more than 2 outcomes? Cover/Thomas use Sanov’s Theorem for that, but it seems much looser than a Chernoff-type bound.
This is similar to the idea used by Impagliazzo and Kabanets in a recent paper on a combinatorial version of the Chernoff bound. In that paper, they also generalize this argument (as Yaroslav asks). Nice!
Excellent exposition! Would you please tell me where you found this proof, since I assume that it predated the proof by Impagliazzo-Kabanets? Many thanks in advance!
I’m not sure which paper you are referring to.
My understanding is that this proof is original, although I can’t swear it’s the first time it’s been discovered.
@jl: That’s excellent! I have no doubt about its originality, but I thought it was officially published somewhere else, either by you or by others. Such a nice and simple proof should be better known!