University of Twente Student Theses
Stealing Part of a Production Language Model
Mitka, Krystof (2024) Stealing Part of a Production Language Model.
Abstract: | The rapid advancement of large language models (LLMs) has led to their widespread deployment in various applications, often as black-box systems accessible only through APIs. This paper investigates the vulnerabilities of such models to model-stealing attacks, specifically focusing on extracting the full logit distributions of next-token predictions. By leveraging the bias map feature provided by APIs, we introduce a novel algorithm that efficiently recovers the complete logit distribution. Our contributions include the formulation of a class of algorithms that rely solely on the bias map, theoretical insights into their convergence and lower bounds, and the identification and analysis of a new state-of-the-art attack. We demonstrate the effectiveness of our approach through theoretical analysis and numerical experiments, highlighting the potential risks and implications for the security of proprietary language models. |
Item Type: | Essay (Bachelor) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 31 mathematics, 54 computer science |
Programme: | Applied Mathematics BSc (56965) |
Link to this item: | https://purl.utwente.nl/essays/101813 |