University of Twente Student Theses


Knowledge graph for query enrichment in retrieval augmented generation in domain specific application

Perna, Massimo (2025) Knowledge graph for query enrichment in retrieval augmented generation in domain specific application.

Abstract: We propose and evaluate a hybrid approach that enhances Retrieval-Augmented Generation (RAG) systems through query enrichment with knowledge graphs. RAG systems, which combine retrieval mechanisms with generative models, are powerful tools for answering complex queries by incorporating external knowledge. However, these systems often struggle in domain-specific contexts, where embedding models may lack the precision required to retrieve relevant information. This limitation is particularly significant in specialized domains such as finance and biomedicine, where nuanced understanding is essential. Our approach addresses this gap by employing a Large Language Model (LLM) not only as the engine powering the RAG system but also as a tool for extracting structured triplets during both the ingestion and querying phases. These triplets, stored in a knowledge graph, are injected into queries during inference to produce enriched, contextually aware inputs, improving the precision of the retrieval process. We evaluate the proposed method on three datasets: a general-domain dataset of question-answer pairs from Wikipedia and two domain-specific datasets in finance and biomedicine, each comprising approximately 8,000 document chunks. Experimental results demonstrate significant improvements in retrieval, with gains of up to 5% in precision and 19% in recall, an average increase of 7% in the generative quality of the output, and enhanced relevance and coherence in the system-generated answers. These findings highlight the potential of knowledge graphs to bridge gaps in embedding precision and improve overall performance in both open- and closed-domain settings.
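The enrichment step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `TripletGraph` class, the `enrich_query` function, the `max_facts` parameter, the "Known facts" suffix format, and the sample triplets are all hypothetical. The entities are also passed in by hand here, whereas the described system uses an LLM to extract them from the query.

```python
from collections import defaultdict

# A triplet is (subject, relation, object), e.g. ("Metformin", "treats", "type 2 diabetes").
Triplet = tuple[str, str, str]


class TripletGraph:
    """Hypothetical in-memory stand-in for the knowledge graph built at ingestion."""

    def __init__(self) -> None:
        # Index triplets by every entity they mention (case-insensitive).
        self._by_entity: dict[str, list[Triplet]] = defaultdict(list)

    def add(self, subject: str, relation: str, obj: str) -> None:
        triplet = (subject, relation, obj)
        for entity in (subject, obj):
            bucket = self._by_entity[entity.lower()]
            if triplet not in bucket:
                bucket.append(triplet)

    def lookup(self, entity: str) -> list[Triplet]:
        # .get avoids creating empty buckets for unknown entities.
        return list(self._by_entity.get(entity.lower(), []))


def enrich_query(query: str, query_entities: list[str],
                 graph: TripletGraph, max_facts: int = 5) -> str:
    """Append graph facts about the query's entities to the query text.

    `query_entities` is assumed to come from an upstream extraction step
    (an LLM in the described approach); it is supplied manually here.
    """
    facts: list[str] = []
    for entity in query_entities:
        for s, r, o in graph.lookup(entity):
            fact = f"{s} {r} {o}"
            if fact not in facts:
                facts.append(fact)
    if not facts:
        return query  # nothing known: leave the query unchanged
    return query + " | Known facts: " + "; ".join(facts[:max_facts])


# Illustrative usage with made-up ingestion output.
graph = TripletGraph()
graph.add("Metformin", "treats", "type 2 diabetes")
graph.add("Metformin", "is a", "biguanide")
print(enrich_query("What is metformin used for?", ["metformin"], graph))
```

The enriched string would then be embedded and sent to the retriever in place of the raw query. Capping the number of injected facts is one simple way to keep the enriched query from drifting away from the original question.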
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Programme: Computer Science MSc (60300)
Link to this item: https://purl.utwente.nl/essays/106310