University of Twente Student Theses
Generating Automatic Commentary in Video Games using Large Language and Vision Language Models
Stournaras, Georgios (2024) Generating Automatic Commentary in Video Games using Large Language and Vision Language Models.
Abstract: | This research explores the effectiveness of multimodal AI systems for generating accurate commentary in football video games by integrating visual information alongside text input. A prototype system, based on previous research, was developed to collect game data and generate commentary using both a Large Language Model (LLM) and a Vision Language Model (VLM). The study compared the commentary generated by the two models, analysing errors and determining the impact of visual data on accuracy. The results indicated that while the VLM hallucinated less, fabricating fewer data and events, it exhibited a higher overall error rate than the LLM. Additionally, the image analysis often resulted in very simple and superficial commentary. Further analysis suggests that significant improvements are required in both model training and hardware capabilities to achieve real-time, accurate commentary generation. Future work will focus on refining model training with specific game data and enhancing prompt engineering to address the identified limitations. |
Item Type: | Essay (Bachelor) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 54 computer science |
Programme: | Computer Science BSc (56964) |
Link to this item: | https://purl.utwente.nl/essays/101002 |