University of Twente Student Theses
Comparing NER Performances of different LLMs on Darkweb Data
Ruijter, Job de (2025) Comparing NER Performances of different LLMs on Darkweb Data.
Abstract: | Traditional Named Entity Recognition (NER) models generally perform well at analysing text. However, when the input data is unstructured (e.g. darkweb-related data), their performance drops. An alternative approach is to use Large Language Models (LLMs), as they can be more adaptive. This exploratory research aims to determine the difference in NER performance between generic and cybersecurity-specific LLMs on darkweb data. First, a literature review is conducted to define a benchmark. Second, this benchmark is implemented, and six LLMs are tested on their NER performance on a darkweb dataset. The analysis shows that the LLMs’ performance is unsatisfactory, owing to limited optimisation. There are significant differences between individual models, but no clear distinction between generic and cybersecurity-specific LLMs in their NER performance on darkweb-related data. Although high performance was not achieved, this research shows the potential of LLMs for darkweb NER tasks and provides a basis for future research. |
Item Type: | Essay (Bachelor) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 54 computer science |
Programme: | Business & IT BSc (56066) |
Link to this item: | https://purl.utwente.nl/essays/107399 |