Appr. Policy Iter.: Trace Model; ε∞-regression; ε∞-unmonotone; ε∞/T², T-optimal; ???
Policy Gradient: monotone; local optimal; Very Large
Cons. Policy Iter.: μ Trace Model; ε/T²-regression; μ = opt. dist; ε²-monotone; ε, T-optimal; O(T/ε²)
generative model results