University of Twente Student Theses
Knowledge graph for query enrichment in retrieval augmented generation in domain specific application
Perna, Massimo (2025) Knowledge graph for query enrichment in retrieval augmented generation in domain specific application.
Abstract: | We propose and evaluate a hybrid approach to enhance Retrieval-Augmented Generation (RAG) systems through query enrichment with knowledge graphs. RAG systems, which combine retrieval mechanisms with generative models, are powerful tools for answering complex queries by incorporating external knowledge. However, these systems often struggle in domain-specific contexts, where embedding models may lack the precision required to retrieve relevant information. This limitation is particularly significant in specialized domains such as finance and biomedicine, where nuanced understanding is essential. Our approach addresses this gap by employing a Large Language Model (LLM) not only as the engine powering the RAG system but also as a tool for extracting structured triplets during both the ingestion and querying phases. These triplets, stored in a knowledge graph, are injected into queries during inference to produce enriched, contextually aware inputs, improving the precision of the retrieval process. We evaluate the proposed method on three datasets: a general-domain dataset with question-answer pairs from Wikipedia and two domain-specific datasets in finance and biomedicine, each comprising approximately 8,000 document chunks. Experimental results demonstrate significant improvements in retrieval, of up to 5% in precision and 19% in recall, together with an average increase of 7% in the generative quality of the output and enhanced relevance and coherence in the system-generated answers. These findings highlight the potential of knowledge graphs to bridge gaps in embedding precision and improve overall performance in both open- and closed-domain settings. |
Item Type: | Essay (Master) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Programme: | Computer Science MSc (60300) |
Link to this item: | https://purl.utwente.nl/essays/106310 |
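The query-enrichment step described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the `Triplet`, `KnowledgeGraph`, `extract_query_entities`, and `enrich_query` names are hypothetical, the LLM-based triplet extraction is replaced by a simple string match against entities already in the graph, and the example triplets are illustrative only.

```python
from collections import defaultdict
from typing import NamedTuple


class Triplet(NamedTuple):
    """A (subject, relation, object) fact, as an LLM might extract at ingestion."""
    subject: str
    relation: str
    obj: str


class KnowledgeGraph:
    """Toy in-memory triplet store indexed by lower-cased entity name."""

    def __init__(self) -> None:
        self._by_entity: dict[str, list[Triplet]] = defaultdict(list)

    def add(self, triplet: Triplet) -> None:
        # Index under both endpoints so either entity can trigger a match.
        for entity in {triplet.subject.lower(), triplet.obj.lower()}:
            self._by_entity[entity].append(triplet)

    def entities(self) -> list[str]:
        return list(self._by_entity)

    def facts_for(self, entity: str) -> list[Triplet]:
        return list(self._by_entity.get(entity.lower(), []))


def extract_query_entities(query: str, graph: KnowledgeGraph) -> list[str]:
    """Assumption: stands in for LLM extraction by matching known entity names."""
    lowered = query.lower()
    return sorted(e for e in graph.entities() if e in lowered)


def enrich_query(query: str, graph: KnowledgeGraph, max_facts: int = 3) -> str:
    """Append up to max_facts related triplets to the query before retrieval."""
    facts: list[Triplet] = []
    for entity in extract_query_entities(query, graph):
        for triplet in graph.facts_for(entity):
            if triplet not in facts:  # avoid repeating a fact
                facts.append(triplet)
    if not facts:
        return query  # nothing known about this query; leave it unchanged
    context = "; ".join(f"{t.subject} {t.relation} {t.obj}" for t in facts[:max_facts])
    return f"{query} [context: {context}]"


if __name__ == "__main__":
    kg = KnowledgeGraph()
    kg.add(Triplet("Metformin", "treats", "type 2 diabetes"))
    kg.add(Triplet("Metformin", "may cause", "lactic acidosis"))
    print(enrich_query("What are the risks of metformin?", kg))
```

In a full pipeline, the enriched string rather than the raw query would be embedded and sent to the retriever. How triplets are selected, ranked, and formatted in the thesis itself is not stated in the abstract, so those choices here are assumptions.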