Information source        Assumptions and guarantees                                  Complexity                 Algorithm           Reference
------------------------  ----------------------------------------------------------  -------------------------  ------------------  -------------------
Direct Experience         MDP; ε,T-optimal                                            O(|S|^2 |A| T^3 / ε^3)     E3                  KeSi
                          ε-Approx. Planning; Factoring                               Poly(|Factoring|, 1/ε, T)  Factored-E3         KeKo
                          local model                                                 O(T |Cover| / ε)           Metric-E3           KaKeL
                          ε-optimal                                                   O(|A| |S| T^2 / ε^2)       Q-learning          W
Trace Model               ε_∞ regression; ε-unmonotone; ε_∞/T^2,T optimal             ???                        Appr. Policy Iter.  ???
                          monotone; local optimal                                     Very Large                 Policy Gradient     SuMcSiMa
μ Trace Model             ε/T^2 regression; μ = opt. dist; ε^2-monotone; ε,T-optimal  O(T / ε^2)                 Cons. Policy Iter.  KaL
Generative Model                                                                      (|A| T / ε)^O(T)           Sparse Sampling     KeMN
                          ε/T classification                                          |A|T                       RLGen               L
                          ε local-optimal                                             O(T^2)                     various             mLP, NKi, L, BKaNSc
                          μ = opt. dist; Tε,T-optimal                                                            PSDP                BKaNSc
Deterministic Gen. Model  ε local optima                                              T^3 log(1/ε) / ε^2         Pegasus             NJ
Precise Description       MDP; optimal                                                T|S||A|                    Value Iter.         ???