University of Twente Student Theses

Exploring robustness of image captioning in visual place recognition under appearance shifts

Meijer, Pepijn (2025) Exploring robustness of image captioning in visual place recognition under appearance shifts.

Abstract: Visual place recognition is one of the most important problems in computer vision. It is the process of identifying the location of a given image and retrieving images captured at the same place. It is an essential component of mobile robot navigation, visual question answering, and autonomous driving, so it is crucial that these models perform image retrieval successfully under different weather conditions. In this study, we investigate whether adding a captioning step to visual place recognition improves its resistance to appearance differences in the query image. We study visual place recognition at a lower level, using a pipeline that omits the retrieval step and outputs the image encodings that an actual visual place recognition pipeline would use for image retrieval. The pipeline uses the ExpansionNet LLM to caption the image and the CLIP VLM to encode both the image and its caption. We test the effectiveness of the captioning step by introducing corruptions in the query image. We find that adding a caption yields a consistent 40% increase in resistance to corruption.
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science BSc (56964)
Link to this item: https://purl.utwente.nl/essays/107689