The Reset Model

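In the reset model, information comes from a set of independent traces: each trace starts from a freshly drawn initial state and does not depend on the others.
Example 1: In a Markov decision process world, you observe several traces (s,a,r)^T, each a sequence of T (state, action, reward) triples. The initial state is drawn from P(s), each action is chosen by the learning algorithm, and each next state is drawn from P(s'|s,a).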
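
A minimal sketch of how such traces could be generated, assuming a tiny two-state MDP. The states, actions, transition probabilities, reward, and the names sample_trace and random_policy are all hypothetical, chosen only to illustrate the sampling order described above: initial state from P(s), action from the algorithm, next state from P(s'|s,a).

```python
import random

# Hypothetical two-state, two-action MDP used only for illustration.
STATES = [0, 1]
ACTIONS = [0, 1]
P_INIT = {0: 0.5, 1: 0.5}  # P(s): fixed by the world, not by the learner


def transition_probs(s, a):
    """P(s'|s,a): action 0 tends to keep the state, action 1 tends to flip it."""
    stay = 0.9 if a == 0 else 0.2
    return {s: stay, 1 - s: 1.0 - stay}


def reward(s, a):
    """Assumed reward: 1 for being in state 1, 0 otherwise."""
    return 1.0 if s == 1 else 0.0


def sample(dist, rng):
    """Draw one outcome from a {outcome: probability} dictionary."""
    outcomes = list(dist.keys())
    weights = list(dist.values())
    return rng.choices(outcomes, weights=weights, k=1)[0]


def sample_trace(policy, horizon, rng):
    """One independent trace of `horizon` (s, a, r) triples, starting from a reset."""
    s = sample(P_INIT, rng)           # reset: initial state drawn from P(s)
    trace = []
    for _ in range(horizon):
        a = policy(s, rng)            # action chosen by the algorithm
        trace.append((s, a, reward(s, a)))
        s = sample(transition_probs(s, a), rng)  # next state from P(s'|s,a)
    return trace


def random_policy(s, rng):
    """Placeholder for the learning algorithm: pick an action uniformly."""
    return rng.choice(ACTIONS)


rng = random.Random(0)
traces = [sample_trace(random_policy, horizon=5, rng=rng) for _ in range(3)]
for t in traces:
    print(t)
```

Each call to sample_trace begins with a fresh draw from P(s), which is what makes the traces independent of one another.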

Example 2: Robbie can learn from his 10000 clones.

The μ Reset Model

The same as the reset model, except that you choose the initial state distribution P(s) yourself.
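
A minimal sketch of the difference, assuming a hypothetical four-state chain where only the last state gives reward. The chain, the slip probability, and the names step, sample_trace, and mu are illustrative assumptions. The only point is that the learner supplies the initial state distribution (called mu here) instead of having it fixed by the world.

```python
import random

N_STATES = 4  # hypothetical chain of states: 0 - 1 - 2 - 3


def step(s, a, rng):
    """Draw s' from P(s'|s,a): move by a (-1 or +1), staying put with probability 0.1."""
    if rng.random() < 0.1:
        return s
    return min(max(s + a, 0), N_STATES - 1)


def sample_trace(mu, policy, horizon, rng):
    """One trace whose initial state is drawn from the learner-chosen distribution mu."""
    s = rng.choices(range(N_STATES), weights=mu, k=1)[0]
    trace = []
    for _ in range(horizon):
        a = policy(s, rng)
        r = 1.0 if s == N_STATES - 1 else 0.0  # assumed reward: only state 3 pays
        trace.append((s, a, r))
        s = step(s, a, rng)
    return trace


def random_policy(s, rng):
    """Placeholder for the learning algorithm: step left or right uniformly."""
    return rng.choice([-1, 1])


rng = random.Random(0)
mu = [0.0, 0.0, 0.0, 1.0]  # the learner's choice: always reset to state 3
mu_traces = [sample_trace(mu, random_policy, horizon=4, rng=rng) for _ in range(5)]
for t in mu_traces:
    print(t)
```

With this choice of mu every trace starts in the rewarding state, which a learner could use to gather information about states that a fixed P(s) might rarely reach.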