e-monotone: Each iteration improves the policy by at least e.
e-unmonotone: No iteration degrades the policy by more than e.
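
The two properties can be checked mechanically on a run of iterations. The sketch below is illustrative only and rests on assumptions not stated above: that the quality of each policy is summarized by a single scalar value (higher is better), and that "improves by e" means "by at least e". The function names `is_e_monotone` and `is_e_unmonotone` are hypothetical.

```python
from typing import Sequence


def is_e_monotone(values: Sequence[float], e: float) -> bool:
    """Return True if every iteration raises the policy value by at least e.

    `values[i]` is the (assumed scalar) value of the policy after iteration i.
    """
    return all(after - before >= e for before, after in zip(values, values[1:]))


def is_e_unmonotone(values: Sequence[float], e: float) -> bool:
    """Return True if no iteration lowers the policy value by more than e."""
    return all(before - after <= e for before, after in zip(values, values[1:]))


# Each step gains exactly 0.5, so the run is e-monotone for e = 0.5.
print(is_e_monotone([1.0, 1.5, 2.0], 0.5))
# One step loses 0.25 and none loses more, so the run is e-unmonotone for e = 0.25.
print(is_e_unmonotone([2.0, 1.75, 2.5], 0.25))
```

Under this reading, any e-monotone run with e >= 0 is also e-unmonotone, since a run that never loses value cannot lose more than e.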