University of Twente Student Theses
Word Embeddings to Classify Types of Diachronic Semantic Shift
Koldenhof, Dylan (2021) Word Embeddings to Classify Types of Diachronic Semantic Shift.
Abstract: | Languages evolve constantly and in many ways. One of these is semantic: the meanings of words change over time. This presents an interesting challenge for automated Natural Language Processing (NLP), as a thorough manual inspection of this phenomenon is difficult. Much prior work has shown promising results in detecting semantic shift, but little of it investigates the nature of these shifts. This research aims to fill that gap by investigating whether different types of semantic shift can be classified using word embeddings trained with Word2Vec. Several machine learning classifiers are trained on two sets of embeddings: one learned from Project Gutenberg ebooks spanning the period 1800-1849, and one learned from Wikipedia. Results show promise, but with a top accuracy of 0.5 when validated on another time period, there is room for improvement in future work. |
Item Type: | Essay (Bachelor) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 17 linguistics and theory of literature, 54 computer science |
Programme: | Computer Science BSc (56964) |
Link to this item: | https://purl.utwente.nl/essays/87304 |