<?xml version="1.0"?> 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "mathml.dtd"> 
<?xml-stylesheet type="text/css" href="thesis.css"?> 
<html  
xmlns="http://www.w3.org/1999/xhtml"  
><head>
   <title>Quantitatively Tight Sample Complexity Bounds</title> 
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> 
<meta name="generator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<meta name="originator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<!-- 3,early_,early^,xhtml,mozilla --> 
<meta name="src" content="thesis.tex" /> 
<meta name="date" content="2002-08-28 13:56:00" /> 
<link rel="stylesheet" type="text/css" href="thesis.css" /> 
</head><body 
>
   <div align="center" class="maketitle">                                                                 

                                                                     
                                                                     

                                                                     

<h2 class="titleHead">Quantitatively Tight Sample Complexity Bounds</h2>
<div class="authors"><div class="author"> <span 
class="ecrm-1440">John Langford</span></div></div>
                                                                     

                                                                     
<div class="submaketitle">
</div>
   </div>I present many new results on sample complexity bounds (bounds on the
future error rate of arbitrary learning algorithms). Of theoretical interest are
qualitative and quantitative improvements in these bounds, as well as
techniques and criteria for judging their tightness.
<!--l. 70--><p class="indent">   On the practical side, I show quantitative results (with true error rate bounds sometimes less
than <!--l. 71--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">        <mrow 
><mn>0.01</mn></mrow></math>)
for decision trees and neural networks, obtained by applying these sample complexity bounds to
real-world problems. I also present a technique for combining sample complexity bounds
with (more traditional) holdout techniques.
</p><!--l. 76--><p class="indent">   Together, the theoretical and practical results of this thesis provide a well-founded
practical method for evaluating learning algorithm performance based upon both
training and testing set performance.
</p><!--l. 80--><p class="indent">   Code for calculating these bounds is provided.
</p>
   <div class="tableofcontents"><span class="likechapterToc">&#x00A0;&#x00A0;<a 
href="thesisli1.xml#x2-1000" name="QQ2-2-1">Contents</a></span><br /><span class="partToc">Part&#x00A0;1.&#x00A0;&#x00A0;<a 
href="thesispa1.xml#x3-20001" name="QQ2-3-2">Introductory Learning Theory</a></span><br /><span class="chapterToc">Chapter&#x00A0;1.&#x00A0;&#x00A0;<a 
href="thesisch1.xml#x4-30001" name="QQ2-4-3">Informal Introduction</a></span><br /><span class="sectionToc">&#x00A0;1.1.&#x00A0;&#x00A0;<a 
href="thesisse1.xml#x5-40001.1" name="QQ2-5-4">The
learning problem</a></span><br /><span class="sectionToc">&#x00A0;1.2.&#x00A0;&#x00A0;<a 
href="thesisse2.xml#x6-50001.2" name="QQ2-6-5">The problem with the learning problem</a></span><br /><span class="sectionToc">&#x00A0;1.3.&#x00A0;&#x00A0;<a 
href="thesisse3.xml#x7-60001.3" name="QQ2-7-6">A plethora of
learning models</a></span><br /><span class="sectionToc">&#x00A0;1.4.&#x00A0;&#x00A0;<a 
href="thesisse4.xml#x8-70001.4" name="QQ2-8-7">The oblivious passive supervised learning model</a></span><br /><span class="sectionToc">&#x00A0;1.5.&#x00A0;&#x00A0;<a 
href="thesisse5.xml#x9-80001.5" name="QQ2-9-8">Questions we
can answer</a></span><br /><span class="chapterToc">Chapter&#x00A0;2.&#x00A0;&#x00A0;<a 
href="thesisch2.xml#x10-90002" name="QQ2-10-9">Formal Model and Context</a></span><br /><span class="sectionToc">&#x00A0;2.1.&#x00A0;&#x00A0;<a 
href="thesisse6.xml#x11-100002.1" name="QQ2-11-10">Formal Model</a></span><br /><span class="sectionToc">&#x00A0;2.2.&#x00A0;&#x00A0;<a 
href="thesisse7.xml#x12-110002.2" name="QQ2-12-11">Relationship
to Prior Work</a></span><br /><span class="sectionToc">&#x00A0;2.3.&#x00A0;&#x00A0;<a 
href="thesisse8.xml#x13-140002.3" name="QQ2-13-14">Overview of the document</a></span><br /><span class="chapterToc">Chapter&#x00A0;3.&#x00A0;&#x00A0;<a 
href="thesisch3.xml#x14-220003" name="QQ2-14-22">Basic Observations </a></span><br /><span class="sectionToc">&#x00A0;3.1.&#x00A0;&#x00A0;<a 
href="thesisse9.xml#x15-230003.1" name="QQ2-15-23">The
Basic Building Block</a></span><br /><span class="sectionToc">&#x00A0;3.2.&#x00A0;&#x00A0;<a 
href="thesisse10.xml#x16-240003.2" name="QQ2-16-24">Approximation techniques</a></span><br /><span class="sectionToc">&#x00A0;3.3.&#x00A0;&#x00A0;<a 
href="thesisse11.xml#x17-250003.3" name="QQ2-17-25">Binomial Tail calculation
techniques</a></span><br /><span class="sectionToc">&#x00A0;3.4.&#x00A0;&#x00A0;<a 
href="thesisse12.xml#x18-260003.4" name="QQ2-18-26">Converting to a P-value approach</a></span><br /><span class="sectionToc">&#x00A0;3.5.&#x00A0;&#x00A0;<a 
href="thesisse13.xml#x19-270003.5" name="QQ2-19-28">Bounding the Union</a></span><br /><span class="sectionToc">&#x00A0;3.6.&#x00A0;&#x00A0;<a 
href="thesisse14.xml#x20-280003.6" name="QQ2-20-29">Arbitrary
Loss functions</a></span><br /><span class="chapterToc">Chapter&#x00A0;4.&#x00A0;&#x00A0;<a 
href="thesisch4.xml#x21-290004" name="QQ2-21-30">Simple Sample Complexity bounds</a></span><br /><span class="sectionToc">&#x00A0;4.1.&#x00A0;&#x00A0;<a 
href="thesisse15.xml#x22-300004.1" name="QQ2-22-31">Simple Holdout</a></span><br /><span class="sectionToc">&#x00A0;4.2.&#x00A0;&#x00A0;<a 
href="thesisse16.xml#x23-320004.2" name="QQ2-23-34">The
basic training set bound</a></span><br /><span class="sectionToc">&#x00A0;4.3.&#x00A0;&#x00A0;<a 
href="thesisse17.xml#x24-330004.3" name="QQ2-24-35">Lower Bounds</a></span><br /><span class="sectionToc">&#x00A0;4.4.&#x00A0;&#x00A0;<a 
href="thesisse18.xml#x25-340004.4" name="QQ2-25-36">Lower Upper Bounds</a></span><br /><span class="sectionToc">&#x00A0;4.5.&#x00A0;&#x00A0;<a 
href="thesisse19.xml#x26-350004.5" name="QQ2-26-37">Structural Risk
Minimization</a></span><br /><span class="sectionToc">&#x00A0;4.6.&#x00A0;&#x00A0;<a 
href="thesisse20.xml#x27-360004.6" name="QQ2-27-38">Incorporating a &#x201C;Prior&#x201D;</a></span><br /><span class="partToc">Part&#x00A0;2.&#x00A0;&#x00A0;<a 
href="thesispa2.xml#x28-370002" name="QQ2-28-40">New Techniques</a></span><br /><span class="chapterToc">Chapter&#x00A0;5.&#x00A0;&#x00A0;<a 
href="thesisch5.xml#x29-380005" name="QQ2-29-41">Microchoice
Bounds (the algebra of choices)</a></span><br /><span class="sectionToc">&#x00A0;5.1.&#x00A0;&#x00A0;<a 
href="thesisse21.xml#x30-390005.1" name="QQ2-30-43">A Motivating Observation</a></span><br /><span class="sectionToc">&#x00A0;5.2.&#x00A0;&#x00A0;<a 
href="thesisse22.xml#x32-400005.2" name="QQ2-32-44">The Simple Microchoice
Bound</a></span><br /><span class="sectionToc">&#x00A0;5.3.&#x00A0;&#x00A0;<a 
href="thesisse23.xml#x33-460005.3" name="QQ2-33-50">Combining Microchoice with Freund&#x2019;s Query Tree approach</a></span><br /><span class="sectionToc">&#x00A0;5.4.&#x00A0;&#x00A0;<a 
href="thesisse24.xml#x36-560005.4" name="QQ2-36-60">Microchoice
discussion</a></span><br /><span class="chapterToc">Chapter&#x00A0;6.&#x00A0;&#x00A0;<a 
href="thesisch6.xml#x37-570006" name="QQ2-37-61">PAC-Bayes bounds</a></span><br /><span class="sectionToc">&#x00A0;6.1.&#x00A0;&#x00A0;<a 
href="thesisse25.xml#x38-580006.1" name="QQ2-38-63">PAC-Bayes Basics</a></span><br /><span class="sectionToc">&#x00A0;6.2.&#x00A0;&#x00A0;<a 
href="thesisse26.xml#x39-590006.2" name="QQ2-39-64">A Tighter
PAC-Bayes Bound</a></span><br /><span class="sectionToc">&#x00A0;6.3.&#x00A0;&#x00A0;<a 
href="thesisse27.xml#x41-600006.3" name="QQ2-41-65">PAC-Bayes Approximations</a></span><br /><span class="sectionToc">&#x00A0;6.4.&#x00A0;&#x00A0;<a 
href="thesisse28.xml#x42-630006.4" name="QQ2-42-68">Application of the PAC-Bayes
bound</a></span><br /><span class="chapterToc">Chapter&#x00A0;7.&#x00A0;&#x00A0;<a 
href="thesisch7.xml#x43-640007" name="QQ2-43-69">Averaging Bounds (Improved margin)</a></span><br /><span class="sectionToc">&#x00A0;7.1.&#x00A0;&#x00A0;<a 
href="thesisse29.xml#x44-650007.1" name="QQ2-44-71">Earlier Results</a></span><br /><span class="sectionToc">&#x00A0;7.2.&#x00A0;&#x00A0;<a 
href="thesisse30.xml#x45-660007.2" name="QQ2-45-72">A generalized
averaging bound</a></span><br /><span class="sectionToc">&#x00A0;7.3.&#x00A0;&#x00A0;<a 
href="thesisse31.xml#x46-670007.3" name="QQ2-46-73">Proof of main theorem</a></span><br /><span class="sectionToc">&#x00A0;7.4.&#x00A0;&#x00A0;<a 
href="thesisse32.xml#x47-700007.4" name="QQ2-47-76">Methods for tightening</a></span><br /><span class="sectionToc">&#x00A0;7.5.&#x00A0;&#x00A0;<a 
href="thesisse33.xml#x48-710007.5" name="QQ2-48-77">Final thoughts
for Averaging Bounds</a></span><br /><span class="chapterToc">Chapter&#x00A0;8.&#x00A0;&#x00A0;<a 
href="thesisch8.xml#x49-720008" name="QQ2-49-78">Computable Shell bounds </a></span><br /><span class="sectionToc">&#x00A0;8.1.&#x00A0;&#x00A0;<a 
href="thesisse34.xml#x50-730008.1" name="QQ2-50-80">The Discrete Shell
Bound</a></span><br /><span class="sectionToc">&#x00A0;8.2.&#x00A0;&#x00A0;<a 
href="thesisse35.xml#x51-760008.2" name="QQ2-51-83">Sampling Shell Bound</a></span><br /><span class="sectionToc">&#x00A0;8.3.&#x00A0;&#x00A0;<a 
href="thesisse36.xml#x52-770008.3" name="QQ2-52-84">Lower Bounds</a></span><br /><span class="sectionToc">&#x00A0;8.4.&#x00A0;&#x00A0;<a 
href="thesisse37.xml#x53-780008.4" name="QQ2-53-85">Shell Bounds for Continuous
Spaces</a></span><br /><span class="sectionToc">&#x00A0;8.5.&#x00A0;&#x00A0;<a 
href="thesisse38.xml#x54-790008.5" name="QQ2-54-86">Conclusion</a></span><br /><span class="chapterToc">Chapter&#x00A0;9.&#x00A0;&#x00A0;<a 
href="thesisch9.xml#x55-800009" name="QQ2-55-87">Tight covering number bounds</a></span><br /><span class="sectionToc">&#x00A0;9.1.&#x00A0;&#x00A0;<a 
href="thesisse39.xml#x56-810009.1" name="QQ2-56-88">Introduction</a></span><br /><span class="sectionToc">&#x00A0;9.2.&#x00A0;&#x00A0;<a 
href="thesisse40.xml#x57-820009.2" name="QQ2-57-89">The
Setting and Prior Results</a></span><br /><span class="sectionToc">&#x00A0;9.3.&#x00A0;&#x00A0;<a 
href="thesisse41.xml#x58-830009.3" name="QQ2-58-90">Bracketing Covering Number Bound </a></span><br /><span class="sectionToc">&#x00A0;9.4.&#x00A0;&#x00A0;<a 
href="thesisse42.xml#x59-840009.4" name="QQ2-59-91">Covering number
calculations</a></span><br /><span class="sectionToc">&#x00A0;9.5.&#x00A0;&#x00A0;<a 
href="thesisse43.xml#x60-860009.5" name="QQ2-60-93">Conclusion and Future Work</a></span><br /><span class="chapterToc">Chapter&#x00A0;10.&#x00A0;&#x00A0;<a 
href="thesisch10.xml#x61-8700010" name="QQ2-61-94">Holdout bounds: Progressive
Validation</a></span><br /><span class="sectionToc">&#x00A0;10.1.&#x00A0;&#x00A0;<a 
href="thesisse44.xml#x62-8800010.1" name="QQ2-62-95">Progressive Validation Technique</a></span><br /><span class="sectionToc">&#x00A0;10.2.&#x00A0;&#x00A0;<a 
href="thesisse45.xml#x63-8900010.2" name="QQ2-63-97">Variance Analysis</a></span><br /><span class="sectionToc">&#x00A0;10.3.&#x00A0;&#x00A0;<a 
href="thesisse46.xml#x64-9000010.3" name="QQ2-64-98">Deviation
Analysis</a></span><br /><span class="sectionToc">&#x00A0;10.4.&#x00A0;&#x00A0;<a 
href="thesisse47.xml#x65-9100010.4" name="QQ2-65-99">A Quick Experiment</a></span><br /><span class="sectionToc">&#x00A0;10.5.&#x00A0;&#x00A0;<a 
href="thesisse48.xml#x66-9200010.5" name="QQ2-66-101">Conclusion</a></span><br /><span class="chapterToc">Chapter&#x00A0;11.&#x00A0;&#x00A0;<a 
href="thesisch11.xml#x67-9300011" name="QQ2-67-102">Combining sample complexity
and holdout bounds</a></span><br /><span class="sectionToc">&#x00A0;11.1.&#x00A0;&#x00A0;<a 
href="thesisse49.xml#x68-9400011.1" name="QQ2-68-103">Combination Possibilities</a></span><br /><span class="sectionToc">&#x00A0;11.2.&#x00A0;&#x00A0;<a 
href="thesisse50.xml#x69-9500011.2" name="QQ2-69-105">General Approaches for Combined
Bounds</a></span><br /><span class="sectionToc">&#x00A0;11.3.&#x00A0;&#x00A0;<a 
href="thesisse51.xml#x70-9600011.3" name="QQ2-70-106">Approximations in Combinations</a></span><br /><span class="sectionToc">&#x00A0;11.4.&#x00A0;&#x00A0;<a 
href="thesisse52.xml#x71-9700011.4" name="QQ2-71-107">Conclusion</a></span><br /><span class="partToc">Part&#x00A0;3.&#x00A0;&#x00A0;<a 
href="thesispa3.xml#x72-980003" name="QQ2-72-108">Experimental
Results</a></span><br /><span class="chapterToc">Chapter&#x00A0;12.&#x00A0;&#x00A0;<a 
href="thesisch12.xml#x73-9900012" name="QQ2-73-109">Decision Trees</a></span><br /><span class="sectionToc">&#x00A0;12.1.&#x00A0;&#x00A0;<a 
href="thesisse53.xml#x74-10000012.1" name="QQ2-74-110">The Decision Tree Learning Algorithm</a></span><br /><span class="sectionToc">&#x00A0;12.2.&#x00A0;&#x00A0;<a 
href="thesisse54.xml#x75-10400012.2" name="QQ2-75-115">Bound
Application Details</a></span><br /><span class="sectionToc">&#x00A0;12.3.&#x00A0;&#x00A0;<a 
href="thesisse55.xml#x76-10700012.3" name="QQ2-76-118">Results &#x0026; Discussion</a></span><br /><span class="sectionToc">&#x00A0;12.4.&#x00A0;&#x00A0;<a 
href="thesisse56.xml#x77-11700012.4" name="QQ2-77-136">Discussion</a></span><br /><span class="chapterToc">Chapter&#x00A0;13.&#x00A0;&#x00A0;<a 
href="thesisch13.xml#x78-11800013" name="QQ2-78-137">Neural
Networks</a></span><br /><span class="sectionToc">&#x00A0;13.1.&#x00A0;&#x00A0;<a 
href="thesisse57.xml#x79-11900013.1" name="QQ2-79-138">Theoretical
setup</a></span><br /><span class="sectionToc">&#x00A0;13.2.&#x00A0;&#x00A0;<a 
href="thesisse58.xml#x80-12300013.2" name="QQ2-80-142">Experimental Results</a></span><br /><span class="sectionToc">&#x00A0;13.3.&#x00A0;&#x00A0;<a 
href="thesisse59.xml#x81-12400013.3" name="QQ2-81-145">Conclusion</a></span><br /><span class="chapterToc">Chapter&#x00A0;14.&#x00A0;&#x00A0;<a 
href="thesisch14.xml#x82-12500014" name="QQ2-82-146">Conclusion &#x0026;
Challenges</a></span><br /><span class="likechapterToc">&#x00A0;&#x00A0;<a 
href="thesisli2.xml#x83-12600014" name="QQ2-83-147">Bibliography</a></span><br /><span class="chapterToc">Chapter&#x00A0;15.&#x00A0;&#x00A0;<a 
href="thesisch15.xml#x84-12700015" name="QQ2-84-148">Appendix: Definitions</a></span><br /><span class="chapterToc">Chapter&#x00A0;16.&#x00A0;&#x00A0;<a 
href="thesisch16.xml#x85-12800016" name="QQ2-85-149">Appendix:
Manual</a></span><br /><span class="sectionToc">&#x00A0;16.1.&#x00A0;&#x00A0;<a 
href="thesisse60.xml#x86-12900016.1" name="QQ2-86-150">Test Error Bound Calculation</a></span><br /><span class="sectionToc">&#x00A0;16.2.&#x00A0;&#x00A0;<a 
href="thesisse61.xml#x87-13000016.2" name="QQ2-87-151">Training Set Bound Calculation</a></span><br /><span class="sectionToc">&#x00A0;16.3.&#x00A0;&#x00A0;<a 
href="thesisse62.xml#x88-13100016.3" name="QQ2-88-152">Shell
Bound Calculation</a></span><br /><span class="sectionToc">&#x00A0;16.4.&#x00A0;&#x00A0;<a 
href="thesisse63.xml#x89-13200016.4" name="QQ2-89-153">Combined Bound Calculation</a></span><br />
   </div>
                                                                     

                                                                     
<p class="indent"><a 
  name="x1-1001r1"></a><a 
  name="x1-36004r2"></a><a 
  name="x1-97005r40"></a></p> 
</body> 
</html> 

                                                                     


