University of Twente Student Theses
Evaluating Language Models for Low-Resource NLP: A Comparative Study of RoBERT and Large Multilingual LLMs
Berisha, E.C. (2025) Evaluating Language Models for Low-Resource NLP: A Comparative Study of RoBERT and Large Multilingual LLMs.
Abstract: | This thesis investigates the performance of RoBERT, a Romanian-language adaptation of the BERT model, in comparison with Gemini, a large language model (LLM) developed by Google, on several Romanian natural language processing (NLP) tasks. While LLMs have demonstrated impressive capabilities across many languages and tasks, their effectiveness in low-resource languages like Romanian remains underexplored. This study addresses this gap by evaluating both RoBERT and Gemini on five key Romanian NLP tasks: sentiment analysis, named entity recognition, topic identification, dialect identification, and offensive language detection. The models are tested using publicly available Romanian datasets, and their performance is compared using the F1-score as the evaluation metric. The results show that RoBERT outperforms Gemini on tasks that require detailed language-specific knowledge, particularly named entity recognition and dialect identification, while Gemini performs competitively on more general tasks such as sentiment analysis. These findings suggest that, despite the broad generalization abilities of large multilingual models, monolingual models like RoBERT continue to offer important advantages in low-resource language settings, especially when linguistic precision is critical. |
Item Type: | Essay (Bachelor) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Programme: | Computer Science BSc (56964) |
Link to this item: | https://purl.utwente.nl/essays/107333 |
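The abstract states that the models are compared using the F1-score. As a minimal sketch of how such a score can be computed for a classification task, the Python below derives per-class F1 from gold and predicted labels and averages it across classes. The macro averaging, the function names, and the toy sentiment labels are assumptions made for illustration; the abstract does not say which averaging scheme or tooling the thesis uses.

```python
from collections import Counter


def f1_per_class(gold, pred):
    """Return a dict mapping each label to its F1-score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for label g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true label but was missed
    scores = {}
    for label in set(gold) | set(pred):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 (an assumed averaging choice)."""
    scores = f1_per_class(gold, pred)
    return sum(scores.values()) / len(scores)


# Hypothetical toy labels, not data from the thesis.
gold = ["pos", "pos", "neg", "neg"]
pred = ["pos", "neg", "neg", "neg"]
print(f1_per_class(gold, pred))
print(macro_f1(gold, pred))
```

Macro averaging weights every class equally, which matters on imbalanced datasets; a micro or weighted average would give different numbers on the same predictions.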