University of Twente Student Theses
As of Friday, 8 August 2025, the current Student Theses repository is no longer available for thesis uploads. A new Student Theses repository will be available starting Friday, 15 August 2025.
Approximate Guarantee for Approximate Dynamic Programming
Lardinois, W.T.P. (2019) Approximate Guarantee for Approximate Dynamic Programming.
PDF
1MB |
Abstract: | A Markov decision process (MDP) is a common way to model stochastic decision problems. Finding the optimal policy for an MDP is a challenging task. Therefore, approximation methods are used to obtain decent policies. Approximate dynamic programming (ADP) is an approximation method to obtain policies for an MDP. No approximate guarantees for ADP related to MDP exist yet. This thesis searches for an approximate guarantee for ADP by using an optimal stopping problem. In Chen & Goldberg, 2018 [10], an approximation method and approximate guarantees were obtained for optimal stopping problems. A Markov chain over the policy space of the MDP is created to obtain an optimal stopping problem, denoted by OS-MDP. The method described by Chen & Goldberg applied on OSMDP yields error bounds for solution methods and approximation methods of MDP. The estimation relates the policy found after N iterations to the policy obtained after M iterations, where N < M. This estimation has an error bound of 1 k+1, where k is a parameter that determines the complexity of the computations. Solution methods discussed are; policy iteration, value iteration and ADP. A card-game, called The Game [36], is used as a running example. The Game is modelled as MDP, approximated using ADP, and an error bound of ADP is obtained by using OS-MDP. One small numerical test of OS-MDP is performed, where N = 5 and M = 10 yields an error bound of 0.25, hence no more than a 25% improvement can be achieved. |
Item Type: | Essay (Master) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 31 mathematics |
Programme: | Applied Mathematics MSc (60348) |
Link to this item: | https://purl.utwente.nl/essays/80296 |
Export this item as: | BibTeX EndNote HTML Citation Reference Manager |
Repository Staff Only: item control page