1 / (1 - gamma)
Effective Horizon.
Roughly how many future steps of reward still matter to the agent; it also governs how long value iteration takes to converge to a near-optimal policy.
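The formula above can be checked numerically. A minimal sketch (the specific gamma values below are illustrative choices, not from the notes):

```python
# Effective horizon 1 / (1 - gamma): the timescale over which
# discounted future rewards remain significant.
def effective_horizon(gamma: float) -> float:
    return 1.0 / (1.0 - gamma)

# Example gammas chosen for illustration.
for g in (0.5, 0.9, 0.99):
    print(f"gamma={g}: horizon ~ {effective_horizon(g):.0f}")
```

As gamma approaches 1, the horizon grows without bound, which matches the note that a larger gamma pushes the agent further into the future.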
What role does gamma play in value iteration?
Gamma tells you what your horizon is.
A smaller gamma keeps the agent focused on the present.
A larger gamma pushes the horizon out into the future.
What happens with a gamma close to 0?
The agent becomes short-sighted, valuing almost nothing beyond immediate rewards.
In what time does Value Iteration solve MDPs?
Worse than polynomial in general, but it is possible to get arbitrarily close to the optimal policy in polynomial time.
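The iteration behind these notes can be sketched on a toy problem. This is a minimal, self-contained version; the 2-state MDP, its representation (`P`, `R`), and the stopping threshold are all illustrative assumptions:

```python
# Value iteration sketch: repeatedly apply the Bellman optimality backup
# until successive value estimates differ by less than eps.
def value_iteration(P, R, gamma=0.9, eps=1e-6):
    """P[s][a] = list of (prob, next_state); R[s][a] = immediate reward."""
    n = len(P)
    V = [0.0] * n
    while True:
        V_new = [
            max(R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                for a in range(len(P[s])))
            for s in range(n)
        ]
        if max(abs(a - b) for a, b in zip(V, V_new)) < eps:
            return V_new
        V = V_new

# Toy MDP: state 0 can stay (reward 0) or move to state 1 (reward 1);
# state 1 deterministically returns to state 0 with reward 0.
P = [[[(1.0, 0)], [(1.0, 1)]], [[(1.0, 0)]]]
R = [[0.0, 1.0], [0.0]]
V = value_iteration(P, R, gamma=0.9)
```

The number of sweeps needed for a given accuracy grows roughly like the effective horizon 1 / (1 - gamma), which is why convergence to an epsilon-optimal policy is polynomial even though exact convergence is not.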