University of Twente Student Theses


LLM-Assisted Triple Extraction from Historical Texts

Alexe, V. A. (2025) LLM-Assisted Triple Extraction from Historical Texts.

Abstract: Historical texts are rich sources of knowledge, but their unstructured nature makes it difficult to analyze them systematically. This research explores how Large Language Models (LLMs) can be used to extract subject-predicate-object (SPO) triples from historical narratives, enabling the construction of structured knowledge graphs (KGs). Focusing on three representative LLM-based frameworks (AiKG, Triplex, and GPT o3), we assess their effectiveness in handling the linguistic complexity, archaic vocabulary, and contextual nuance found in historical documents. The study includes both quantitative evaluations (e.g., precision, recall, F1) and qualitative assessments of factual accuracy, completeness, faithfulness, and entity alignment. Our findings highlight significant variation in performance across frameworks, with GPT o3 demonstrating the best balance of coverage and semantic accuracy. This work contributes to the growing field of digital humanities by showing how LLMs can support historical research through the automated extraction of structured information, while also identifying current limitations and areas for improvement in LLM-based extraction tools.
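The quantitative measures named in the abstract (precision, recall, F1) can be illustrated with a minimal sketch. It assumes exact matching of normalized triples against a gold-standard set; the abstract does not state the matching criterion the thesis actually uses, and the function names and sample triples here are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]


def normalize(triple: Triple) -> Triple:
    # Assumption: matching ignores case and collapses repeated whitespace.
    return tuple(" ".join(part.lower().split()) for part in triple)


def triple_prf(predicted: Iterable[Triple], gold: Iterable[Triple]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for extracted triples under exact match."""
    pred_set = {normalize(t) for t in predicted}
    gold_set = {normalize(t) for t in gold}
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Illustrative data only, not taken from the thesis.
gold = [
    ("Napoleon", "born in", "Ajaccio"),
    ("Napoleon", "crowned", "Emperor"),
    ("Napoleon", "exiled to", "Elba"),
]
predicted = [
    ("napoleon", "born in", "Ajaccio"),   # matches after normalization
    ("Napoleon", "ruled", "France"),      # not in the gold set
]

precision, recall, f1 = triple_prf(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is the strictest option: a triple that paraphrases the predicate counts as an error. A softer criterion such as fuzzy or embedding-based matching would give higher scores on the same output.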
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 02 science and culture in general, 15 history, 17 linguistics and theory of literature, 18 languages and literature, 50 technical science in general, 54 computer science
Programme: Computer Science BSc (56964)
Link to this item: https://purl.utwente.nl/essays/107762