Studying behaviour in a decision-making task with multiple features and changing reward functions, Tomov et al. find that a strategy that combines successor features with generalized policy iteration predicts behaviour best.
- Momchil S. Tomov
- Eric Schulz
- Samuel J. Gershman