University of Twente Student Theses


Profiling Users by Access Behaviour Using Data Available to a Security Operations Center

Sonneveld, J.J. (2023) Profiling Users by Access Behaviour Using Data Available to a Security Operations Center.

Abstract: Businesses face constant cyber security threats from both inside and outside their organisation. A Security Operations Center (SOC) monitors a company's digital infrastructure to protect against these threats by detecting suspicious events and taking mitigating action. To act on their objectives, adversaries commonly need to access resources illegitimately, and they do so via existing user accounts. We develop a method to identify suspicious access behaviour using non-intrusive data available to a SOC. For every user, a feature vector is constructed describing their access behaviour. This vector contains statistics, computed over a predefined period of time, on which resources a user accessed, at what time, and in what manner. We survey different Machine Learning profiling possibilities and select general-purpose K-means clustering to group users with similar behavioural characteristics. We apply this method to the Insider Threat logs from the Carnegie Mellon University Software Engineering Institute, a synthetic dataset that serves as a benchmark for insider threat detection research. Through access behaviour profiling, we were able to identify all users with the ITAdmin role. The consistency of clusters for benign users over time was quantified with the Adjusted Rand metric. Comparing consecutive months resulted in an average consistency score of 0.87. We were able to detect 80% of insider threats with the ITAdmin role by monitoring for changes in their cluster over time. Applying the same methodology to access data from real-world organisations yields significantly less consistent clustering, with average consecutive-month consistency scores of 0.41. We observe a slight similarity between our clusters and the groupings within the organisation's Active Directory. The clusters also show slight similarity with the groups inferred by Microsoft User Entity Behaviour and Analytics via Machine Learning.
We show how the granularity of the most relevant features of the synthetic dataset differs from that of real-world data, which we suspect causes this decrease in clustering consistency. Detecting suspicious changes in user access behaviour requires a consistent profiling method. Our privacy-preserving, general-purpose profiling methodology produced consistent results on the synthetic dataset. However, we have strong indications that in a production environment this data alone cannot produce results usable for detecting suspicious access behaviour.
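The month-to-month consistency score mentioned in the abstract can be illustrated with a minimal sketch of the Adjusted Rand metric. This is not the thesis implementation: the function names, the dictionary layout (user to cluster label), and the example users are assumptions made for illustration only, and the cluster labels would in practice come from a clustering step such as K-means.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: trivially identical

    # Count item pairs that share a cluster in both labelings, and in each one.
    both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    in_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = in_a * in_b / total_pairs  # agreement expected by chance
    maximum = (in_a + in_b) / 2
    if maximum == expected:
        return 1.0  # both labelings are trivial (one cluster or all singletons)
    return (both - expected) / (maximum - expected)


def consecutive_month_consistency(month_1, month_2):
    """Compare the clusters of users that appear in both months.

    month_1 and month_2 are hypothetical dicts mapping user id to cluster label.
    """
    shared = sorted(set(month_1) & set(month_2))
    return adjusted_rand_index(
        [month_1[u] for u in shared], [month_2[u] for u in shared]
    )


# Hypothetical example: the same grouping with renamed cluster labels.
january = {"u1": 0, "u2": 0, "u3": 1, "u4": 1, "u5": 2}
february = {"u1": 2, "u2": 2, "u3": 0, "u4": 0, "u5": 1}
score = consecutive_month_consistency(january, february)
```

Because the index compares pairs of users rather than label values, renaming the clusters between months does not lower the score, which matters when the clustering is re-run each month and cluster numbering is arbitrary.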
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science MSc (60300)