University of Twente Student Theses

Automating User and Infrastructure Profiling from Cyber Leaks with LLMs

Lolis, D. (2025) Automating User and Infrastructure Profiling from Cyber Leaks with LLMs.

Abstract: Cyber leaks are an increasingly valuable but underutilized source of threat intelligence. As leaked datasets grow in complexity and language diversity, most existing tools still rely on manual workflows or limited automation. In this work, we introduce a modular framework that automates the full pipeline of leaked-data processing: extraction, translation, enrichment, and analysis. The system focuses on profiling users and mapping infrastructure from raw leaked data. Applied to the recently leaked conversation dataset from the Black Basta ransomware group, which contains 190,000 messages, it identifies 50 participants and over 5,500 IP addresses. These results highlight the potential of LLMs for scalable and structured cyber-leak analysis. The full implementation and source code are available at https://github.com/prioneto/ai-data-analysis-agent.
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science BSc (56964)
Link to this item: https://purl.utwente.nl/essays/107532
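
The abstract describes a staged pipeline (extraction, translation, enrichment, analysis) that profiles users and maps infrastructure. The following is only a minimal sketch of that idea and not the thesis implementation, which lives in the linked repository. The message schema (`sender`, `text`), the function names, and the no-op `translate` stub standing in for an LLM translation step are all assumptions made for illustration.

```python
import ipaddress
import re
from collections import defaultdict

# Candidate dotted quads that are not part of a longer dotted number.
# Range checking of each octet is left to the ipaddress module.
IPV4_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?!\.?\d)")


def extract_ips(text):
    """Return valid IPv4 addresses in text, in order of first appearance."""
    found = []
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            ip = str(ipaddress.IPv4Address(candidate))
        except ValueError:
            continue  # e.g. an octet above 255
        if ip not in found:
            found.append(ip)
    return found


def translate(text):
    """Hypothetical translation stage; a real pipeline might call an LLM here."""
    return text


def profile(messages):
    """Group message counts and extracted IPs by sender.

    messages: iterable of dicts with 'sender' and 'text' keys (assumed schema).
    """
    profiles = defaultdict(lambda: {"messages": 0, "ips": set()})
    for message in messages:
        text = translate(message["text"])
        entry = profiles[message["sender"]]
        entry["messages"] += 1
        entry["ips"].update(extract_ips(text))
    return dict(profiles)
```

Keeping each stage as a separate function mirrors the modular design the abstract mentions: the stub could be swapped for a real translation call without touching the extraction or profiling steps.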