In reinforcement learning (and sometimes other settings), there is a notion of “state”. Based upon the state, various predictions are made, such as “Which action should be taken next?” or “How much cumulative reward do I expect if I take some action from this state?” Given how central state is, it is worth examining what the word actually means. There are several distinct definitions, and the differences between them turn out to be very important in motivating different pieces of work.
- Newtonian State. State is the physical pose of the world. Under this definition, there are very many states, often too many for explicit representation. This is also the definition typically used in games.
- Abstracted State. State is an abstracted physical state of the world. “Is the door open or closed?” “Are you in room A or not?” The number of states is much smaller here. A basic issue here is: “How do you compute the state from observations?”
- Mathematical State. State is a sufficient statistic of observations for making necessary predictions.
- Internal State. State is the internal belief, understanding, etc. that changes an agent’s actions in different circumstances. A natural question is: “How do you learn a state?” This is like the mathematical version of state, except that portions of the statistic which cannot be learned are irrelevant.
- There are no states. There are only observations (one of which might be a reward) and actions. This is more reasonable than it might sound because state is a derived quantity and the necessity of that derivation is unclear. Predictive state representations (PSRs) are an example.
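To make a few of these definitions concrete, here is a minimal sketch, assuming a hypothetical two-room world in which the agent stays put and reads a noisy room sensor. Everything in it is invented for illustration: the names `abstract_state`, `belief_update`, and `belief_from_history`, the wall position, and the 0.8 sensor accuracy. The pose plays the role of Newtonian state, the room label is an abstracted state, and the belief is a sufficient statistic of the observation history for predicting the room.

```python
def abstract_state(pose, wall_x=5.0):
    """Abstracted state: collapse a continuous (x, y) pose to a room label.

    wall_x is an assumed position for the wall separating rooms A and B.
    """
    x, _y = pose
    return "A" if x < wall_x else "B"


def belief_update(belief_a, observation, accuracy=0.8):
    """One Bayes update of P(in room A) from a noisy sensor reading.

    observation is "A" or "B". The sensor is assumed to report the true
    room with probability `accuracy` (an illustrative value).
    """
    like_a = accuracy if observation == "A" else 1.0 - accuracy
    like_b = 1.0 - accuracy if observation == "A" else accuracy
    numer = like_a * belief_a
    return numer / (numer + like_b * (1.0 - belief_a))


def belief_from_history(observations, prior_a=0.5, accuracy=0.8):
    """Mathematical state: fold a whole observation history into one number."""
    belief = prior_a
    for obs in observations:
        belief = belief_update(belief, obs, accuracy)
    return belief


pose = (2.3, 7.1)            # Newtonian state: a full physical pose
room = abstract_state(pose)  # abstracted state: just the room label

# Two different histories containing the same readings in a different order.
b1 = belief_from_history(["A", "A", "B"])
b2 = belief_from_history(["B", "A", "A"])
print(room, round(b1, 6), round(b2, 6))
```

In this toy setting the two histories differ but give the same belief (about 0.8), so nothing useful for predicting the room is lost by keeping only that one number. Note that `abstract_state` cheats by reading the pose directly, which is exactly the “How do you compute the state from observations?” issue, while the belief is computed from observations alone.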
The different questions these notions of state motivate can have large practical consequences for the algorithms used. It is not yet clear what the “right” notion of state is; we just don’t know what works well. I am most interested in “internal state” at the moment.