Evaluating Large Language Models for Automated Cyber Security Analysis Processes

Author(s): Jonkhout, Beau (2024)

Abstract:
Cybersecurity faces ever-increasing complexity and novel threats, with recent reports from CrowdStrike, IBM, and ENISA highlighting the emergence of 34 new adversaries and a 15% increase in cyberthreats. Traditional security tools and methodologies such as the NIST CSF 2.0 are under pressure to evolve rapidly to keep pace. The enormous volume of security alarms leads to "alert fatigue," a condition in which operators become desensitized and less responsive, which can cause significant oversights in threat detection. This thesis investigates the application of Natural Language Processing (NLP), particularly Natural Language Understanding (NLU) and Large Language Models (LLMs) such as OpenAI's GPT series, to streamline decision-making in cybersecurity alarm analysis. By delegating key decision-making aspects to LLMs, this approach seeks to mitigate alert fatigue. The research evaluates current state-of-the-art models, develops a methodology for assessing the efficacy of LLMs, and analyses their capabilities on specific security-analysis tasks. Results indicate that LLMs match human responses to a statistically non-random degree, suggesting that they can potentially support operators in reducing alert fatigue, but require further optimization. Data and findings are made publicly available to facilitate further research and verification.

Document(s):

Bachelor_s_Student_Conference_Proceedings_Paper_in_LaTeX_Template.pdf