University of Twente Student Theses


Video Text Matching : A Deep Learning Model for Video to Descriptive Text Matching

Anazo, M.A.C. (2024) Video Text Matching : A Deep Learning Model for Video to Descriptive Text Matching.

Abstract: The goal is to develop a functional multi-modal model that retrieves short videos from a text description provided by a user and, conversely, returns a textual description for a user-provided video. To do this, textual descriptions and video clips are processed and mapped into a feature space shared by both text and video, so that the two data types can be matched using contrastive learning. The model is trained and tested on the Microsoft Research Video Description Corpus (MSVD).
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science BSc (56964)
Link to this item: https://purl.utwente.nl/essays/100861
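
The abstract describes matching text and video in a shared feature space using contrastive learning. The following is a minimal sketch of that general idea, not the thesis's actual implementation: it assumes video and text encoders (not shown) have already produced fixed-size embeddings, and it uses a symmetric InfoNCE-style loss, which is one common choice of contrastive objective. The function names, the temperature value, and the toy data are all illustrative assumptions.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(video_emb, text_emb, temperature=0.07):
    """InfoNCE-style loss (assumed objective): row i of each matrix is a matched pair.

    Matched pairs sit on the diagonal of the similarity matrix; every other
    entry in the same row or column acts as a negative example.
    """
    v = l2_normalize(video_emb)
    t = l2_normalize(text_emb)
    logits = v @ t.T / temperature  # shape: (num_videos, num_texts)
    idx = np.arange(len(v))
    loss_video_to_text = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_text_to_video = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_video_to_text + loss_text_to_video)


def retrieve(query_emb, candidate_emb, k=1):
    """Return indices of the k candidates most similar to each query."""
    sims = l2_normalize(query_emb) @ l2_normalize(candidate_emb).T
    return np.argsort(-sims, axis=1)[:, :k]


if __name__ == "__main__":
    # Toy data standing in for encoder outputs (hypothetical, not MSVD features).
    rng = np.random.default_rng(0)
    video = rng.normal(size=(4, 8))
    text = video + 0.01 * rng.normal(size=(4, 8))  # nearly aligned pairs
    print("loss:", symmetric_contrastive_loss(video, text))
    print("text-to-video top-1:", retrieve(text, video, k=1).ravel())
```

Because both modalities live in the same space, the same `retrieve` helper covers both directions: pass text embeddings as queries to find videos, or video embeddings as queries to find descriptions.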