University of Twente Student Theses
Syntactic Ambiguity in Legal Language: Automatic Classification and Interpretation
Elkady, Hamza (2025) Syntactic Ambiguity in Legal Language: Automatic Classification and Interpretation.
Abstract: | Syntactic ambiguity in legal texts creates a serious risk of misinterpretation, inconsistent enforcement, and legal disputes. This study investigates whether large language models (LLMs) can automatically detect, classify, and interpret syntactic ambiguity in legal sentences. A manually labeled dataset was used to evaluate the classification performance of GPT models and to fine-tune LegalBERT for cost-effective local classification. Classification worked best on sentences with coordination ambiguity, which were the most consistently recognized. For interpretation, Gemini was used to generate paired rewrites of ambiguous sentences, which were then used to fine-tune a T5 model. While the T5 model preserved the intent of most inputs and avoided hallucinations, it often failed to restructure sentences in a way that fully resolved the ambiguity. Overall, the study shows that LLMs are promising for ambiguity-related tasks, but it also highlights that high-quality data and expert guidance are essential for reliably training cost-effective models. |
Item Type: | Essay (Bachelor) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 17 linguistics and theory of literature, 54 computer science |
Programme: | Computer Science BSc (56964) |
Link to this item: | https://purl.utwente.nl/essays/107459 |