A Markov Decision Process is one of the most fundamental knowledge in Reinforcement Learning. It’s used to represent decision making in optimization problems. The version we present here is the Finite MDP, which analyses discrete time, with discrete action problems, with some amount of stochasticity involved. …