University of Twente Student Theses


Non-stationary preference learning in online human-robot interaction

Traykov, A. (2025) Non-stationary preference learning in online human-robot interaction.

Abstract: Dueling bandit algorithms are used for decision-making problems where feedback is qualitative or otherwise unavailable in a precise, discrete numerical format. A key limitation of standard dueling bandit algorithms is the assumption of a stationary environment, whereas in practice many applications are influenced by changing preferences. Non-stationary dueling bandits are a variation of the framework that allows preferences to change over time, adding a dynamic layer to the agent's decision process. They have received less research attention than other bandit problems, with most of the work done within the last five years. The field lacks a standardized method to evaluate and compare algorithms: researchers employ differing techniques and assumptions when developing new solutions, which makes direct comparison difficult. Through a literature review and a proof-of-concept experiment, this paper assesses the feasibility of creating a standardized evaluation protocol for non-stationary preference learning algorithms. We find that the current evaluation landscape is fragmented and that existing results may not be representative of real-world performance. We propose a novel usability metric, the annoyance metric, and conduct an experiment evaluating the feasibility of using an LLM as a proxy for human feedback; we conclude that it is not a practical evaluation solution.
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Business & IT BSc (56066)
Link to this item: https://purl.utwente.nl/essays/107865