University of Twente Student Theses

Login

On the sample complexity of finding rewards above the arithmetic mean

Huitema, Tim (2024) On the sample complexity of finding rewards above the arithmetic mean.

[img] PDF
1MB
Abstract:This thesis covers the sample complexity of a dynamic-threshold problem: the problem to find the set of arms that have a reward that is higher than the arithmetic mean of all rewards. This problem configuration differs from existing literature because of the dynamic behaviour of the threshold which is dependent on all arms. The assumption is made that rewards are sampled from an underlying Gaussian distribution. The results are acquired by using the Karush–Kuhn–Tucker conditions for a generic problem description. The sample complexity is compared to similar algorithms made using the Track-and-Stop strategy. We see that the sample complexity of our algorithm for comparable objectives is higher. For our algorithm the δ-PAC property is empirically confirmed. Unfortunately, a component of the algorithm remains dependent on iterative testing for optimality, this however only increases the computational run time and does not disprove the validity of the algorithm. In this thesis a method for accelerating this iterative testing is presented.
Item Type:Essay (Master)
Faculty:EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject:31 mathematics
Programme:Applied Mathematics MSc (60348)
Link to this item:https://purl.utwente.nl/essays/103968
Export this item as:BibTeX
EndNote
HTML Citation
Reference Manager

 

Repository Staff Only: item control page