University of Twente Student Theses
Adversarial attacks on neural text detectors
Fishchuk, Vitalii (2023) Adversarial attacks on neural text detectors.
Abstract: | This thesis explores the effectiveness of adversarial attack methods in evading AI-text detection. Experimenting with three attack categories (prompt engineering, parameter tweaking, and character-level mutations), this research employs a mixed-methods approach to examine the effectiveness of such attacks with the recently released GPT-3.5 model. The results reveal the low robustness of existing detectors against practical and resource-efficient attack methods. The findings demonstrate how prompt engineering, parameter tweaking, and character-level mutations can be exploited to evade detection effectively. Additionally, the study shows that detector algorithms struggle with the GPT-4 model, highlighting the urgent need to improve existing detectors. |
Item Type: | Essay (Bachelor) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 54 computer science, 81 education, teaching |
Programme: | Business & IT BSc (56066) |
Awards: | The Twente Student Conference on IT - best paper |
Link to this item: | https://purl.utwente.nl/essays/95835 |