University of Twente Student Theses

From Descriptions to Decisions: Classifying Vulnerabilities by Information Sufficiency

Dominguez Adilova, Bejruz (2025) From Descriptions to Decisions: Classifying Vulnerabilities by Information Sufficiency.

Abstract: In cybersecurity, Common Vulnerabilities and Exposures (CVE) and Common Weakness Enumeration (CWE) are the industry standard for registering a vulnerability and categorizing a weakness, respectively. About 30% of CVEs are not labeled with a CWE and approximately 50% of that subset is unlabeled due to poor descriptions. A significant portion of CVEs are not labeled with all their relevant CWEs. The existing standard for CVE descriptions is generally not adhered to, causing issues for attempts to automate their CWE labeling. A binary classifier could potentially flag CVE entries with insufficient description information to properly label it to a CWE. This work proposes exploring different binary classifiers utilizing Natural Language Processing (NLP) models in order to assess the viability of an automated classifier for determining the labeling sufficiency of a vulnerability description given the available CVE datasets. BERT and Random Forest, two vastly different models, based on transformers and decision trees respectively, provided similarly promising results, between 83% and 87% accuracy in their best performing models.
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Business & IT BSc (56066)
Link to this item: https://purl.utwente.nl/essays/107580