University of Twente Student Theses


Ontology Matching using Domain-specific knowledge and Semantic Similarity

Guru Rao, Sathvik (2022) Ontology Matching using Domain-specific knowledge and Semantic Similarity.

Abstract: In this thesis, we worked on techniques to improve the matching of two occupational ontologies: ESCO, which describes the European labor market, and O*NET, which describes the American labor market. The improvement is sought by incorporating domain-specific knowledge. A state-of-the-art language model, XLNet, is used to create contextual embeddings of the occupations' information, and these embeddings are then used to compute the semantic similarity between the occupations of ESCO and O*NET. This serves as a baseline for understanding the impact of domain-specific knowledge on the matching process. Domain-specific knowledge is used in two ways: i) by extending the XLNet model's vocabulary with domain knowledge, and ii) by using a domain-specific ontology as a helper ontology that can bridge the gap between the two ontologies. After finding the matches, the next stage was to establish a semantic relationship between the matched occupations to make the link between them more informative. These relations are based on the taxonomic structure of ESCO and the semantic similarity score between occupations. The evaluation showed that the generic XLNet model performed well in finding an accurate O*NET occupation match, with an accuracy of 69% on a sample of 200 ESCO occupations. A detailed analysis of the relationships established between the occupations provided insight into where an O*NET occupation can be positioned within the ESCO occupation categories, which can facilitate ontology merging.
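The matching step described in the abstract (embed each occupation, then pair it with the most semantically similar occupation in the other ontology) can be illustrated with a minimal sketch. This is not the thesis implementation: the three-dimensional vectors below are made-up stand-ins for XLNet embeddings, cosine similarity is assumed as the similarity measure (the abstract does not name one), and the function names `cosine_similarity` and `best_matches` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def best_matches(source_embeddings, target_embeddings):
    """For each source occupation, return the most similar target and its score."""
    matches = {}
    for source_name, source_vec in source_embeddings.items():
        scored = [
            (cosine_similarity(source_vec, target_vec), target_name)
            for target_name, target_vec in target_embeddings.items()
        ]
        score, target_name = max(scored)
        matches[source_name] = (target_name, score)
    return matches


# Toy vectors (assumption): real embeddings would come from a language model.
esco = {
    "software developer": [0.9, 0.1, 0.0],
    "nurse": [0.0, 0.2, 0.9],
}
onet = {
    "Software Developers": [0.8, 0.2, 0.1],
    "Registered Nurses": [0.1, 0.1, 0.95],
}

matches = best_matches(esco, onet)
for esco_name, (onet_name, score) in matches.items():
    print(f"{esco_name} -> {onet_name} (similarity {score:.2f})")
```

In a setting like the one the abstract describes, the similarity score returned alongside each match could also feed the later step of assigning a semantic relation between the paired occupations; how that is done in the thesis is not specified here.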
Item Type: Essay (Master)
TNO, Soesterberg, The Netherlands
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science MSc (60300)