-
Value Estimation with Monte Carlo Methods: Learning from Experience
We now tackle, for the first time, the challenge of determining the value function when we don’t have access to a model. This will be useful as we move on to environments whose interactions are much more complex than the Corridor example we have been working with so far.…
-
Understanding the Value Function in Reinforcement Learning: A Corridor Example
Value functions are a fundamental concept in Reinforcement Learning (RL). A solid grasp of value functions is essential for understanding more advanced RL algorithms. In this post, we explore value functions through a simple, custom environment to make their core ideas intuitive and accessible. We also provide the Python code to replicate our results here.…
-
Finding a Reinforcement Learning Policy with a Markov Decision Process: Generalized Policy Iteration (GPI)
How do you find a policy when you have a model of your Markov Decision Process (MDP)? There are a number of methods for doing this, all of which fall under the umbrella of Generalized Policy Iteration (GPI). Here we will go through the two most notable methods – Policy Iteration and Value Iteration – and we…