University of Twente Student Theses
Generating Adversarial Prompts from Incidents and Guidelines
Salapare, Kurt (2024) Generating Adversarial Prompts from Incidents and Guidelines.
Abstract: | Rapid deployment of Large Language Models (LLMs) has introduced significant security vulnerabilities, yet the limited public availability of detailed incident reports regarding exact prompts or techniques used impedes comprehensive security analysis and the development of robust defenses. This research addresses this gap by designing and evaluating a novel AI agent capable of automatically generating adversarial prompts from existing security guidelines and reported incidents. The agent employs a two-phase workflow: first, processing unstructured text into a classified, metadata-rich dataset via LLM-driven paragraph classification, and second, utilizing these insights to generate executable adversarial prompts. We investigate the performance of various LLMs and prompt architectures (Descriptive, Concise, Few-Shot) within the agent, evaluating their computational efficiency, classification reliability, and the characteristics of the generated prompts. This systematic methodology offers a reproducible framework for improving proactive security analysis by providing a structured approach to adversarial prompt generation for law enforcement and developers. |
Item Type: | Essay (Bachelor) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 54 computer science |
Programme: | Computer Science BSc (56964) |
Link to this item: | https://purl.utwente.nl/essays/107517 |