University of Twente Student Theses


Generating Tool-Usage Tests from Minimal Developer Input for Custom Tool-Calling Agents

Calcaterra, Giacomo (2025) Generating Tool-Usage Tests from Minimal Developer Input for Custom Tool-Calling Agents.

Abstract: Recent advances in architecture and scale have enabled LLMs to invoke external tools, greatly expanding what an agent can do. Yet as the number and complexity of those tools grow, agents can still produce malformed or incorrect calls. Many existing studies test LLMs' ability to assemble such tool calls, mostly by comparing different foundation models. However, thanks to dedicated fine-tuning, most recently released models are increasingly able to do so for specific tasks, which shifts the cause of failures to custom prompting, tool complexity, and the combination and description of tools. Testing such specific situations currently requires manual tests or hand-curated testing scenarios. The aim of this paper is to show how, by combining existing generation and evaluation techniques, it is possible to generate and execute a set of tailored tests for custom tools, tool combinations, and prompts while requiring only minimal additional information and removing the need for manual intervention during the process. Our code is available at this URL.
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Business & IT BSc (56066)
Link to this item: https://purl.utwente.nl/essays/107557