Reset Model | ε∞-regression | | ε∞-unmonotone | ε∞/T²,T-optimal | Very Large | Appr. Policy Iter. |
| | | monotone | local optimal | Very Large | Policy Gradient |
μ Reset Model | ε/T²-regression | μ = opt. dist | ε²-monotone | ε,T-optimal | O(T/ε²) | Cons. Policy Iter. |