University of Twente Student Theses

Syntactic Ambiguity in Legal Language: Automatic Classification and Interpretation

Elkady, Hamza (2025) Syntactic Ambiguity in Legal Language: Automatic Classification and Interpretation.

Abstract: Syntactic ambiguity in legal texts poses a serious risk of misinterpretation, inconsistent enforcement, and legal disputes. This study investigates whether large language models (LLMs) can automatically detect, classify, and interpret syntactic ambiguity in legal sentences. A manually labeled dataset was used to evaluate the classification performance of GPT models and to fine-tune LegalBERT for cost-effective local classification. Classification worked best on sentences containing coordination ambiguity, which were the most consistently recognized. For interpretation, Gemini was used to generate paired rewrites of ambiguous sentences, which were then used to fine-tune a T5 model. While the T5 model preserved the intent of most inputs and avoided hallucinations, it often failed to restructure sentences in a way that fully resolved the ambiguity. Overall, the study shows that LLMs are promising for ambiguity-related tasks, but that high-quality data and expert guidance are essential for reliably training cost-effective models.
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 17 linguistics and theory of literature, 54 computer science
Programme: Computer Science BSc (56964)
Link to this item: https://purl.utwente.nl/essays/107459