Blog - toyproblems.xyz

Understanding the Value Function in Reinforcement Learning: A Corridor Example

May 1, 2025

Value functions are a fundamental concept in Reinforcement Learning (RL). A solid grasp of value functions is essential for understanding more advanced RL algorithms. In this post, we explore value functions through a simple, custom environment to make their core ideas intuitive and accessible. We also provide the Python code to replicate our results here.…
Finding a Reinforcement Learning Policy with a Markov Decision Process: Generalized Policy Iteration (GPI)

March 31, 2025

How do you find a policy when you have a model of your Markov Decision Process (MDP)? There are a number of methods for this, all of which fall under the umbrella of Generalized Policy Iteration (GPI). Here we will go through the two most notable methods – Policy Iteration and Value Iteration – and we…
Bayes’ Theorem and its Applications to Classification and Sequence Models

March 30, 2025

Bayes’ Theorem and the Naïve Bayes classifier are foundational concepts in probability and machine learning. But what do they mean in practical terms? Let’s break it down with a simple example. Imagine you’re walking through the streets of Berlin and a stranger smiles and says “ciao!”. Naturally, you might wonder: Is this person Italian? Or…
