Textbooks invariably seem to carry the proof that uses Markov’s inequality, moment-generating functions, and Taylor approximations. Here’s an easier way.
For $latex p,q \in (0,1)$, let $latex K(p,q)$ be the KL divergence between a coin of bias $latex p$ and one of bias $latex q$: $latex K(p,q) = p \ln \frac{p}{q} + (1-p) \ln \frac{1-p}{1-q}.$
Theorem: Suppose you do $latex n$ independent tosses of a coin of bias $latex p$. The probability of seeing $latex qn$ heads or more, for $latex q > p$, is at most $latex \exp(-nK(q,p))$. So is the probability of seeing $latex qn$ heads or less, for $latex q < p$.
Remark: By Pinsker’s inequality, $latex K(q,p) \geq 2(p-q)^2$.
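Combined with the theorem, this recovers the familiar Hoeffding-style form: for $latex q > p$, the probability of seeing $latex qn$ heads or more is at most $latex \exp(-2n(q-p)^2)$.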
Proof: Let’s do the $latex q > p$ case; the other is identical.
Let $latex \theta_p$ be the distribution over $latex \{0,1\}^n$ induced by a coin of bias $latex p$, and likewise $latex \theta_q$ for a coin of bias $latex q$. Let $latex S$ be the set of all sequences of $latex n$ tosses which contain $latex qn$ heads or more. We’d like to show that $latex S$ is unlikely under $latex \theta_p$.
Pick any $latex \bar{x} \in S$, with say $latex k \geq qn$ heads. Then:
[latex size="2"] \frac{\theta_q(\bar{x})}{\theta_p(\bar{x})} = \frac{q^k(1-q)^{n-k}}{p^k(1-p)^{n-k}} \geq \frac{q^{qn}(1-q)^{n-qn}}{p^{qn}(1-p)^{n-qn}} = \left( \frac{q}{p} \right)^{qn} \left( \frac{1-q}{1-p}\right)^{(1-q)n} = e^{n K(q,p)}.[/latex]
Since $latex \theta_p(\bar{x}) \leq \exp(-nK(q,p)) \theta_q(\bar{x})$ for every $latex \bar{x} \in S$, we have [latex]\theta_p(S) \leq \exp(-nK(q,p)) \theta_q(S) \leq \exp(-nK(q,p))[/latex] and we’re done.
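A quick numerical sanity check, not part of the original argument: the short Python sketch below compares the exact binomial tail with the bound $latex \exp(-nK(q,p))$. The helper functions and the particular values of $latex n, p, q$ are arbitrary choices of mine.

[sourcecode language="python"]
import math

def kl(q, p):
    # KL divergence K(q, p) between coins of bias q and bias p, as defined above.
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))

def tail(n, p, k):
    # Exact probability of seeing k or more heads in n tosses of a coin of bias p.
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Arbitrary example: n tosses of a coin of bias p, tail at q > p.
n, p, q = 100, 0.3, 0.5
exact = tail(n, p, math.ceil(q * n))
bound = math.exp(-n * kl(q, p))
print(f"exact tail: {exact:.3e}   bound exp(-nK(q,p)): {bound:.3e}")
# The theorem guarantees exact <= bound.
[/sourcecode]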
How do you justify the step where you replace k with qn?
Since q/p > 1 and k >= qn, (q/p)^k >= (q/p)^(qn). The other factor is similar.
The other factor is similar, but doesn’t it give the wrong direction of inequality? I get 1-q ((1-q)/(1-p))^k <= ((1-q)/(1-p))^(qn)
Broken formatting in my previous comment; it should read: 1-q less than 1-p implies ((1-q)/(1-p))^k <= ((1-q)/(1-p))^(qn).
You dropped the minus sign on k: the exponent is n-k, not k. Since (1-q)/(1-p) < 1 and n-k <= (1-q)n, the inequality goes the right way: ((1-q)/(1-p))^(n-k) >= ((1-q)/(1-p))^((1-q)n).
John, this is really nice!
Sanjoy, actually
Very cool! Thank you, Sanjoy. My students would be so happy to know that they don’t need to go through the lengthy proof that was given last week.
Isn’t this what they call the method of types in info theory books?
Is there a way to get a Chernoff-type bound for more than 2 outcomes? Cover/Thomas use Sanov’s Theorem for that, but it seems much looser than the Chernoff-type bound.
This is similar to the idea used by Impagliazzo and Kabanets in a recent paper on a combinatorial version of the Chernoff-Hoeffding bound. In that paper, they also generalize this argument (as Yaroslav asks). Nice!
Excellent exposition! Would you please tell me where you found this proof, since I assume it predated the proof by Impagliazzo-Kabanets? Many thanks in advance!
I’m not sure which paper you are referring to.
My understanding is that this proof is original, although I can’t swear it’s the first time it’s been discovered.
@jl: That’s excellent! I have no doubt about its originality, but I thought it was officially published somewhere else, either by you or others. Such a nice and simple proof should be better known!