University of Twente Student Theses


Exploring Large Language Models and Retrieval Augmented Generation for Automated Form Filling

Bucur, Matei (2023) Exploring Large Language Models and Retrieval Augmented Generation for Automated Form Filling.

Abstract: Large language models (LLMs) such as the GPT family have shown remarkable natural language processing capabilities across a variety of tasks without requiring retraining or fine-tuning. However, leveraging their potential for use cases beyond the traditional chatbot paradigm remains an open challenge. One potential application is automated form completion, which enables users to fill out online forms using natural language and leverages available data about the user and the form completion guidelines. This can benefit a broad range of processes, such as applying for a loan or grant, filing a tax statement, or requesting a service. However, automated form filling faces challenges such as understanding form layout, guidelines, and user intent, as well as reasoning over data, in order to generate accurate and coherent text. In this paper, I propose a general method for adapting LLMs to different form-filling domains and tasks. The method consists of three steps: (1) creating a knowledge base that contains facts and rules related to the form-filling task; (2) augmenting the LLM with the knowledge base using retrieval-augmented generation; and (3) using prompt engineering techniques to improve the outputs. I evaluated the effectiveness of the method and the impact of the techniques on the task of completing request forms for various incentives and services.
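The three steps named in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the `Fact` schema, the function names, the sample data, and the prompt wording are all hypothetical, and simple token overlap stands in for the embedding-based retrieval a real retrieval-augmented setup would typically use. The sketch stops at prompt construction and does not call any LLM.

```python
import re
from dataclasses import dataclass


@dataclass
class Fact:
    """One knowledge-base entry: a user fact or a form rule (hypothetical schema)."""
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, knowledge_base: list[Fact], k: int = 2) -> list[Fact]:
    """Return up to k entries that share the most tokens with the query.

    Token overlap is an assumption made for brevity; it replaces the
    vector similarity search normally used for retrieval.
    """
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(f.text)), f) for f in knowledge_base]
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [fact for _, fact in relevant[:k]]


def build_prompt(field_label: str, user_request: str, context: list[Fact]) -> str:
    """Assemble a prompt that combines instructions, retrieved context, and the task."""
    context_lines = "\n".join(f"- {fact.text}" for fact in context)
    return (
        "You fill in one form field at a time. Use only the context below.\n"
        f"Context:\n{context_lines}\n"
        f"User request: {user_request}\n"
        f"Field to fill: {field_label}\n"
        "Answer with the field value only."
    )


# Step 1: a toy knowledge base of facts and rules (invented sample data).
kb = [
    Fact("The applicant's company name is Acme Solar BV."),
    Fact("Rule: the subsidy amount field must be stated in euros."),
    Fact("The applicant's phone number is 555-0100."),
]

# Step 2: retrieve the entries most relevant to the field and the request.
field = "Company name"
request = "Apply for the solar subsidy for my company"
hits = retrieve(f"{field} {request}", kb, k=2)

# Step 3: build the prompt that would be sent to an LLM.
prompt = build_prompt(field, request, hits)
print(prompt)
```

In this sketch the knowledge base is an in-memory list, so swapping in a vector store or a different ranking function would only require replacing `retrieve`.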
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science BSc (56964)

