University of Twente Student Theses

Stealing Part of a Production Language Model

Mitka, Krystof (2024) Stealing Part of a Production Language Model.

Abstract: The rapid advancement of large language models (LLMs) has led to their widespread deployment in various applications, often as black-box systems accessible only through APIs. This paper investigates the vulnerabilities of such models to model-stealing attacks, specifically focusing on extracting the full logit distributions of next-token predictions. By leveraging the bias map feature provided by APIs, we introduce a novel algorithm that efficiently recovers the complete logit distribution. Our contributions include the formulation of a class of algorithms that rely solely on the bias map, theoretical insights into their convergence and lower bounds, and the identification and analysis of a new state-of-the-art attack. We demonstrate the effectiveness of our approach through theoretical analysis and numerical experiments, highlighting the potential risks and implications for the security of proprietary language models.
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 31 mathematics, 54 computer science
Programme: Applied Mathematics BSc (56965)
Link to this item: https://purl.utwente.nl/essays/101813